You are on page 1of 39

entropy

Article
Graphical Models in Reconstructability Analysis and
Bayesian Networks
Marcus Harris * and Martin Zwick

Systems Science Program, Portland State University, Portland, OR 97207, USA; zwick@pdx.edu
* Correspondence: maharris@pdx.edu

Abstract: Reconstructability Analysis (RA) and Bayesian Networks (BN) are both probabilistic
graphical modeling methodologies used in machine learning and artificial intelligence. There are
RA models that are statistically equivalent to BN models and there are also models unique to RA
and models unique to BN. The primary goal of this paper is to unify these two methodologies via
a lattice of structures that offers an expanded set of models to represent complex systems more
accurately or more simply. The conceptualization of this lattice also offers a framework for additional
innovations beyond what is presented here. Specifically, this paper integrates RA and BN by
developing and visualizing: (1) a BN neutral system lattice of general and specific graphs, (2) a joint
RA-BN neutral system lattice of general and specific graphs, (3) an augmented RA directed system
lattice of prediction graphs, and (4) a BN directed system lattice of prediction graphs. Additionally, it
(5) extends RA notation to encompass BN graphs and (6) offers an algorithm to search the joint RA-BN
neutral system lattice to find the best representation of system structure from underlying system
variables. All lattices shown in this paper are for four variables, but the theory and methodology
 presented in this paper are general and apply to any number of variables. These methodological

innovations are contributions to machine learning and artificial intelligence and more generally to
Citation: Harris, M.; Zwick, M.
complex systems analysis. The paper also reviews some relevant prior work of others so that the
Graphical Models in
innovations offered here can be understood in a self-contained way within the context of this paper.
Reconstructability Analysis and
Bayesian Networks. Entropy 2021, 23,
Keywords: probabilistic graphical models; Reconstructability Analysis; Bayesian networks; informa-
986. https://doi.org/10.3390/
e23080986
tion theory; maximum entropy; artificial intelligence; machine learning; lattice of general structures;
hypergraph; directed acyclic graph
Academic Editor: Christopher
J. Fonnesbeck

Received: 5 June 2021 1. Introduction


Accepted: 26 July 2021 Reconstructability Analysis (RA) and Bayesian Networks (BN) are both probabilistic
Published: 30 July 2021 graphical modeling methodologies. A probabilistic graphical model uses a graph (or
hypergraph) to encode independencies and dependencies between variables and proba-
Publisher’s Note: MDPI stays neutral bility theory to encode the precise nature of the relations between variables. Graphs are
with regard to jurisdictional claims in
either undirected or directed. RA graphs include undirected graphs (or hypergraphs) that
published maps and institutional affil-
have loops or do not have loops. BN graphs are directed graphs that do not have cycles.
iations.
(“Loops” here refer to undirected graphs; “cycles” refer to directed graphs.) RA and BN
graphs can represent independence structures that are unique to each methodology, and
also independence structures that are the same in both methodologies. For RA models
without loops and for all BN models, variable independencies can be represented in closed
Copyright: © 2021 by the authors. algebraic (factorized) form. For RA models with loops, solutions require iterative calcula-
Licensee MDPI, Basel, Switzerland. tions. The value of integrating these two methodologies lies in the fact that the RA lattice
This article is an open access article of structures offers potential models of complex systems not found in BNs, while BNs
distributed under the terms and
are a more widely used analytical approach than RA and also include unique models.
conditions of the Creative Commons
Combining the candidate models of the two methodologies thus offers a more expressive
Attribution (CC BY) license (https://
framework than either alone. It also does so in an organized and coherent way that allows
creativecommons.org/licenses/by/
for future possible extensions discussed in Section 6.
4.0/).

Entropy 2021, 23, 986. https://doi.org/10.3390/e23080986 https://www.mdpi.com/journal/entropy


Entropy 2021, 23, 986 2 of 39

RA is a data modeling approach developed in the systems community [1–17] that


combines graph theory and information theory. Its applications are diverse, including time-
series analysis, classification, decomposition, compression, pattern recognition, prediction,
control, and decision analysis [14]. It is designed especially for nominal variables, but
continuous variables can be accommodated if their values are discretized. RA could in
theory accommodate continuous variables; however, this extension of the methodology
has yet to be formalized. Graph theory specifies the structure of the model: if the relations
between the variables are all dyadic (pairwise), the structure is a graph; if some relations
have higher ordinality, the structure is a hypergraph. In speaking of RA, the word ‘graph’
will henceforth include the possibility that the structure is a hypergraph. The structure is
independent of the data except for specification of variable cardinalities. In RA, information
theory uses the data to characterize the precise nature and the strength of the relations.
Data applied to a graph structure yields a probabilistic graphical model of the data.
RA has three primary types of models: variable-based models without loops, variable-
based models with loops and state-based models (where individual states of variables
specify model constraints) that nearly always have loops. Models that do not have loops
have closed-form algebraic solutions; those that have loops require iterative proportional
fitting. In RA, graphs are undirected, although directions are implicit if one variable is
designated as the response variable (dependent variable or DV), while all other variables
are designated as explanatory variables (independent variables or IVs). In principle, there
could be more than one DV, but in the discussion that follows, a single DV is assumed. If
the IV-DV distinction is made, the system is ‘directed’ and the primary aim is prediction
of the DV given the IVs; if no IV-DV distinction is made, the system is ‘neutral’ and the
primary aim is to characterize the nature of relations among all variables.
RA models are undirected graphs that either have or do not have loops, where a
‘loop’ is the presence of circularity in a set of undirected links. We reserve the word ‘cycle’
and ‘acyclic’ for circularity or lack thereof in directed graphs, which are used in BN and
not in RA. An undirected graph having a loop can become an acyclic graph for certain
assignments of link directions. For example, an RA model that posits relations between A
and B, between B and C, and between A and C has a loop, but if directions are assigned in a
BN model so that these relations are A→B, B→C, and A→C, the resulting graph is acyclic.
Graphs are general or specific. A general graph identifies relations among variables
that are unlabeled, i.e., variables whose identity is not specified; a specific graph labels
(identifies) the variables. For example, for a system consisting of variables A, B, and C,
AB:BC is a specific graph where nodes A and B are linked and B and C are also linked.
Specific graphs AB:BC, BA:AC and AC:CB are all instances of the same general graph that
has a unique independence structure regardless of variable labels. In this notation, the
order of variables in any relation is arbitrary, as is the order of the relations. For example,
CB:BA is identical to AB:BC. Relations include all of their embedded relations. For example,
ABC includes embedded relations AB, AC and BC and the univariate margins A, B, and C.
The lattice of graphs for a neutral or a directed system with or without loops depends
upon the number of variables in the data. For a three-variable neutral system allowing loops
there are five general graphs and nine specific graphs; for four variables there are 20 general
graphs and 114 specific graphs. The number of graphs increases hyper-exponentially with
the number of variables. In the confirmatory mode, RA can test the significance of a single
model—a hypothesis being tested—relative to another model used as a reference. In the
exploratory mode, RA can search the lattice of graphs for models that are statistically
significant and best represent the data with maximal information captured and minimal
complexity.
Bayesian Networks (BN) are another probabilistic graphical modeling approach to
data modeling that is closely related to RA. Indeed, where BN overlaps RA the two methods
are equivalent, but with respect to neutral systems, RA and BN each has distinctive features
absent in the other methodology. For directed systems; however, where prediction of a
single dependent variable is the aim, RA encompasses all models found in BN under the
Entropy 2021, 23, 986 3 of 39

convention used in this paper that all nodes except for parent nodes within a V-structure
are allowed to be the variable being predicted; this inclusion of the BN directed system
lattice within the RA lattice will be shown later in this paper.
BNs have origins in the type of path model described by Wright [18,19], but it was
not until the 1980s that BNs became more formally established [20–23]. As does RA, BN
combines graph theory and probability theory: graph theory provides the structure and
probability theory characterizes the nature of relationships between variables. BNs are
represented by a single type of graph structure; a directed acyclic graph, which is a subset
of chain graphs, also known as block recursive models [24]. BNs can be represented more
generally by partially directed acyclic graphs (PDAG), a subset of chain graphs where
edge directions are removed when directionality has no effect on the underlying inde-
pendence structure. Discrete variables are most common in BNs, but BNs accommodate
continuous variables without discretization [25]. In principal RA could also accommodate
continuous variables but this feature has not yet been implemented. For a three variable
BN lattice, there are 5 general graphs and 11 specific graphs; for four variables there are
20 general graphs and 185 specific graphs with unique probability distributions. In the
confirmatory mode, BNs can test the significance of a model relative to another model
used as a reference [26]; in the exploratory mode, BNs can search for the best possible
model given a scoring metric. BNs are used to model expert knowledge about uncertainty
and causality [20,21] and are also used for exploratory data analysis with no use of expert
knowledge [27]. Like RA, BN applications in machine learning and artificial intelligence
are broad including classification, prediction, compression, pattern recognition, image
processing, time-series, decision analysis and many others.
The joint RA-BN lattice of neutral system general and specific graphs and the accom-
panying search algorithm developed in this paper expands both RA and BN beyond what
was previously available by either RA alone or BN alone, thus providing a more complete
ensemble of models for the representation of complex systems. When prediction of a single
dependent variable (DV) is the aim, the RA directed system lattice encompasses the BN
directed system lattice under the strict convention used in this paper that excludes a parent
node with a V-structure being the DV. However, we also show that when this constraint
is relaxed so that a parent node within a V-structure can be the DV, BN models can offer
predictions unique to BN. We also show that (under the above convention) the BN directed
system lattice reduces the size of the full BN neutral system lattice by retaining only graphs
that give unique predictions of the DV, significantly reducing the search space to find the
best BN when prediction of a single DV is the aim. Finally, this paper develops an aug-
mented RA directed system lattice which expands the conventional RA lattice of prediction
graphs to include naïve Bayes equivalent graphs. This augmented lattice encompasses
graphs in the BN directed system lattice and allows for models of complex systems which
are (i) more predictive and/or (ii) simpler and thus both more comprehensible and more
generalizable than models restricted to the conventional RA directed system lattice.

2. RA Lattice
2.1. RA Neutral Systems
All lattices shown in this paper are for four variables, but the theory and methodology
presented in this paper are general and apply to any number of variables. RA neutral
systems include only independent variables, i.e., there is no concept in such systems of a
dependent variable. A neutral system model thus represents the relationships, graphically
and probabilistically, between all the (independent) variables. The graphical representation
specifies the independencies among variables. When data are then applied, probabilities
represent the strength of the relationships between dependent variables. Neutral system
graphs are commonly used in applications where variable clustering is important, such
as computer vision and social and biological network analyses. Neutral system analysis
is more computationally demanding than directed system analysis, so when one is really
interested in predicting specific variables, directed system models are more convenient.
Entropy 2021, 23, 986 4 of 39

The four-variable RA lattice of neutral system general graphs (Figure 1), [7,9], rep-
resents all four-variable RA graphs with unique independence structures. Bold graphs
do not have loops while non-bold graphs have loops. In these graphs, lines (including
branching lines) are variables; boxes are relations. Where only two lines extend from a box,
the relation is dyadic. If more than two lines extend from a box, the graph is a hypergraph.
Where two or more specific graphs have the same independence structure, regardless of
variable labels, they are part of the same general graph equivalence class. For example,
the left-most and right-most variables in G7 are independent of one another given the
two central variables that connect both relations; this results in the general independence
structure (. ⊥.. | . . . , . . . .), where each different number of dots indicates a different
variable, but does not specify its actual identity. The expression says that the first variable
is independent (“⊥” is the symbol used in this paper for independence) of the second
variable given (“|” is the symbol used in this paper for “given”) the third and (the comma
“,” represents a logical “and”) fourth variables.
G1 is the most complex general graph, in which the variables are connected in a
tetradic relation. Graphs below G1 reflect increasingly less complex decompositions of G1,
ending with G20 which has complete independence among the variables. Arrows from one
general graph to another represent hierarchy such that going from the parent graph (the
source of the arrow) to the child graph (the terminus of the arrow) results from deleting
one relation from the parent graph.
In this paper, when the variables of a general graph are labeled in RA or BN, it is
called a specific graph, which is a unique probabilistic model given the data. For RA, given
data and after labeling all the variables, there is only one specific graph for any general
graph. By contrast, as explained in the Section 3, (beginning in the Section 3.2.1), two or
more topologically different BN general graphs can have the same probability distribution;
such equivalent graphs have the same underlying set of independencies even though they
are topologically different; they are said to constitute a ‘Markov equivalence class’ [28].
RA graphs can include pairwise and non-pairwise relations. For example, graph
G15 has four lines (variables) and three boxes (relations). One line connects to all three
boxes, meaning one variable is included in all three relations, and separately a single
line representing one of the other three variables extends from each box. Because only
two lines extend from any given box, all relations in G15 are pairwise (dyadic). Figure 2
shows G15 with labels (A, B, C, D) added for the variables, yielding a specific structure
having dyadic relations AD, BD, and CD. In RA notation, this graph is AD:BD:CD, the
colon represents independence among relations. The notation AD:BD:CD encodes the inde-
pendencies ( A ⊥ B, C | D ), ( B ⊥ C | D ) . The example in Figure 2 represents one of four
specific graphs for the general graph G15, the other possible permutations are AB:AC:AD,
BA:BC:BD, CA:CB:CD. These permutations have the same general independence struc-
ture (. ⊥.., . . . | . . . .), ( . . . ⊥ . . . | . . . .), but given data, produce different conditional
probability distributions.
In contrast to graph G15 which includes dyadic relations only, graph G13 in Figure 1
is a hypergraph, with three lines extending from one box and a single line extending from
the other. This could, for example, represent four variables A, B, C and D, where A, B, and
C label the three lines extending from one box, and D labels the single line extending from
the other box. Figure 3 shows this specific graph, which in RA notation is ABC:D with the
independence structure of ( D ⊥ A, B, C ). This example represents one of four specific
graphs for general graph G13, the other three being, ABD:C, ACD:B, and BCD:A with
independencies of (C ⊥ A, B, D ), ( B ⊥ A, C, D ), and ( A ⊥ B, C, D ), respectively. Given
data, each of these four specific graphs (ABC:D, ABD:C, ACD:B, and BCD:A) generates a
unique probability distribution.
Entropy 2021, 23, 986 5 of 39
Entropy 2021, 23, x FOR PEER REVIEW 5 of 42

G1

G2

G3

G4

G7 G5

G8 G6

G10 G9

G13 G11 G12

G14
G15 G16

G17 G18

G19

Entropy 2021, 23, x FOR PEER REVIEW 6 of 42


G20

Figure
Figure1.1.Lattice of of
four-variable RARA neutral system general graphs. Structures with with
bold bold
boxesboxes
CA:CB:CD. Lattice
These four-variable
permutations neutral
have the system
same general
general graphs. Structures
independence structure (. ⊥.., (re-
…|
lations) are loopless. All lattices in this paper are for four variables.
….), (… ⊥ are
(relations) … |loopless.
….), butAll lattices
given in this
data, paperdifferent
produce are for four variables. probability distributions.
conditional
RA graphs can include pairwise and non-pairwise relations. For example, graph G15
has four lines (variables) and three boxes (relations). One line connects to all three boxes,
meaning one variable is included in all three relations, and separately a single line repre-
senting one of the other three variables extends from each box. Because only two lines
extend from any given box, all relations in G15 are pairwise (dyadic). Figure 2 shows G15
with labels (A, B, C, D) added for the variables, yielding a specific structure having dyadic
relations
Figure 2.AD, BD, and
RA specific CD.G15,
graph In RA notation, this graph is AD:BD:CD, the colon represents
AD:BD:CD.
Figure 2. RA specific
independence among graph G15, AD:BD:CD.
relations. The notation AD:BD:CD encodes the independencies
(𝐴 ⊥ 𝐵, 𝐶 | 𝐷), (𝐵 ⊥ 𝐶 | 𝐷). The example in Figure 2 represents one of four specific graphs
for theIngeneral
contrastgraph
to graph G15
G15, thewhich
otherincludes
possibledyadic relationsare
permutations only, graph G13BA:BC:BD,
AB:AC:AD, in Figure 1
is a hypergraph, with three lines extending from one box and a single line extending from
the other. This could, for example, represent four variables A, B, C and D, where A, B, and
C label the three lines extending from one box, and D labels the single line extending from
the other box. Figure 3 shows this specific graph, which in RA notation is ABC:D with the
independence structure of (𝐷 ⊥ 𝐴, 𝐵, 𝐶). This example represents one of four specific
C label the three lines extending from one box, and D labels the single line extending from
the other box. Figure 3 shows this specific graph, which in RA notation is ABC:D with the
independence structure of (𝐷 ⊥ 𝐴, 𝐵, 𝐶). This example represents one of four specific
graphs for general graph G13, the other three being, ABD:C, ACD:B, and BCD:A with
independencies of (𝐶 ⊥ 𝐴, 𝐵, 𝐷), (𝐵 ⊥ 𝐴, 𝐶, 𝐷), and (𝐴 ⊥ 𝐵, 𝐶, 𝐷), respectively. Given
Entropy 2021, 23, 986 6 of 39
data, each of these four specific graphs (ABC:D, ABD:C, ACD:B, and BCD:A) generates a
unique probability distribution.

Entropy 2021, 23, x FOR PEER REVIEW 7 of 42

Figure 3. RA3.specific
Figure graphgraph
RA specific G13. G13.
to be more successful. This best characterizes a search upwards that (typically) starts from
G20.Figure
Figure For directed
4 shows allsystems
4 shows ofall where
theofgeneral
the prediction
general graphs
graphs offrom
from a single1DV
Figure
Figure 1isas
as wellthe
asaim,
well aall
allasof high information
ofspecific
the the specific
model
graphs
graphs is one with
associated that with
associated giveseach
each maximal
general reduction
general graph.
graph. of the
ThereThere
are 20 Shannon
are entropy
20 general
general graphs in(uncertainty)
graphs theinRA of the
thelattice
RA lattice
and
DV. 114 specific graphs.
and 114 specific graphs.

G1
Searching the RA Neutral System Lattice
ABCD
The data are the top of the lattice, i.e., G1 ABCD, and one searches the lattice to find
a good representation of the data. The G2
lattice can be searched from the top down or from
the bottom up or from some other starting model. Typically, a reference model, a specific
ABC:ABD:ACD:BCD
graph, is selected to begin the search. Commonly, it is the independence (bottom) model,
G20 A:B:C:D from Figure 4, that is selected as the reference model, and the lattice is
G3
searched upward to find the best model. TheABC:ABD:ACD
lattice may also be searched downward start-
ABC:ABD:BCD
ing from the saturated (top) model, G1, or from a reference model in-between the bottom
ABC:ACD:BCD
ABD:ACD:BCD
or top, searching up or down. The starting model does not have to be the reference model,
ABC:ABD:CD
but this is often the case. G4 ABC:ACD:BD
ABC:BCD:AD
Commonly, when the lattice is being searched, the goal is to find a model (a specific
ABD:ACD:BC
ABC:ABD ABD:BCD:AC
graph) that adequately
ABC:ACD
represents but is less complex than the data. This best characterizes
ACD:BCD:AB
a search downwards
ABC:BCD that G7(typically) starts from G5 G1. WhenABC:DA:DB:DC
searching down the lattice, the
ABD:ACD ABD:CA:CB:CD
ABC:AD:BD
goal is to search as far down the lattice as possible, resulting in
ACD:BA:BC:BD the greatest complexity
ABC:AD:CD ABD:BCD
ACD:BCD BCD:AB:AC:AD
reduction
ABC:BD:CD from the reference model, while incurring the least amount of information loss,
ABD:AC:BC
so ABD:AC:DC
that the model still adequately
G8 represents theG6data. Finding a simpler representation of
ABC:AD
ABD:BC:DC
ABC:BDthe data reduces the complexity of the system under AB:AC:AD:BC:BD:CD observation, allowing for greater
ACD:AB:CB
ABC:CD
ABD:AC
understanding
ACD:AB:DB of the most important underlying relations. Alternatively, the goal is to
ACD:CB:DB
ABD:BCfind a model, a specific graph, that captures as much of the information in the data as
BCD:AB:AC
ABD:DC
ACD:ABpossible, as long as its difference from mutual independence of the variables, i.e., G20 in
BCD:AB:AD
BCD:AC:AD G9 AB:AC:AD:BC:BD
ACD:CB
ACD:DB
Figure 4, is defensible, so G10 the model is not overfit, and its application AB:AC:AD:BC:CD
to new data is likely
BCD:BA AB:AC:AD:BD:CD
BCD:CA AB:AC:BC:BD:CD
BCD:DA AB:AD:BC:BD:CD
AC:AD:BC:BD:CD
AB:AC:AD:BC G13 G11 G12
AC:AD:BC:BD
AB:AC:AD:BD
ABC:D AB:AD:BC:CD
AB:AC:BC:BD
ABD:C AB:AC:BD:CD
AB:AD:BC:BD
ACD:B
AB:AC:AD:CD
BCD:A
AB:AC:BC:CD
AC:AD:BC:CD
AB:AC:AD G14 AB:BC:CD
AB:AD:BD:CD G15
BA:BC:BD G16 AB:BD:DC
AC:AD:BD:CD
CA:CB:CD AB:AC:BC:D AC:CB:BD
AB:BC:BD:CD
DA:DB:DC AB:AD:BD:C AC:CD:DB
AD:BC:BD:CD
AC:AD:CD:B AD:DB:BC
AC:BC:BD:CD
BC:BD:CD:A AD:DC:CB
CA:AB:BD
AB:AC:D DA:AB:BC
G17 G18 AB:CD
AB:BC:D BA:AC:CD
AC:BD
AC:BC:D DA:AC:CB
AD:BC
AB:AD:C BA:AD:DC
AB:BD:C CA:AD:DB
AD:BD:C
AC:AD:B G19 AB:C:D
AC:CD:B AC:B:D
AD:CD:B AD:B:C
BC:BD:A BC:A:D
BC:CD:A BD:A:C
BD:CD:A CD:A:B
G20

A:B:C:D

Figure4.4.Lattice
Figure Latticeof
ofRA
RAneutral
neutralsystem
systemgeneral
generaland
andspecific
specificgraphs.
graphs.

Given data, specific graphs can be tested for statistical significance. The Chi square
statistical test can be used to test the difference between any candidate model and a refer-
ence model, usually the data, G1, or the independence model, G20. As an alternative or in
addition to such a statistical test, the Bayesian Information Criteria (BIC) and the Akaike
Entropy 2021, 23, 986 7 of 39

Searching the RA Neutral System Lattice


The data are the top of the lattice, i.e., G1 ABCD, and one searches the lattice to find a
good representation of the data. The lattice can be searched from the top down or from the
bottom up or from some other starting model. Typically, a reference model, a specific graph,
is selected to begin the search. Commonly, it is the independence (bottom) model, G20
A:B:C:D from Figure 4, that is selected as the reference model, and the lattice is searched
upward to find the best model. The lattice may also be searched downward starting from
the saturated (top) model, G1, or from a reference model in-between the bottom or top,
searching up or down. The starting model does not have to be the reference model, but
this is often the case.
Commonly, when the lattice is being searched, the goal is to find a model (a specific
graph) that adequately represents but is less complex than the data. This best characterizes
a search downwards that (typically) starts from G1. When searching down the lattice, the
goal is to search as far down the lattice as possible, resulting in the greatest complexity
reduction from the reference model, while incurring the least amount of information loss,
so that the model still adequately represents the data. Finding a simpler representation
of the data reduces the complexity of the system under observation, allowing for greater
understanding of the most important underlying relations. Alternatively, the goal is to find
a model, a specific graph, that captures as much of the information in the data as possible,
as long as its difference from mutual independence of the variables, i.e., G20 in Figure 4, is
defensible, so the model is not overfit, and its application to new data is likely to be more
successful. This best characterizes a search upwards that (typically) starts from G20. For
directed systems where prediction of a single DV is the aim, a high information model is
one that gives maximal reduction of the Shannon entropy (uncertainty) of the DV.
Given data, specific graphs can be tested for statistical significance. The Chi square
statistical test can be used to test the difference between any candidate model and a
reference model, usually the data, G1, or the independence model, G20. As an alternative
or in addition to such a statistical test, the Bayesian Information Criteria (BIC) and the
Akaike Information Criteria (AIC) are among the other measures that can be used to decide
on the best model.

2.2. RA Directed Systems


2.2.1. Conventional Directed System Lattice
The RA lattice of directed systems shown in Figure 5 is a sub-lattice of the complete
neutral system lattice of Figure 1. The purpose of the directed system lattice is to organize
models that make an IV-DV (explanatory-response) distinction and where prediction of
the DV is the sole aim. In contrast, the neutral system lattice organizes models that do not
make any IV-DV distinction; these models do not focus on a single response variable. There
are fewer general graphs in the directed system lattice compared to the neutral system
lattice because, by convention, we care only about models whose predictions of the DV are
different and are not interested in identifying relations among the IVs. The word ‘directed’
in RA ‘directed systems’ has a meaning that is different from the meaning of the same
word in BN ‘directed acyclic graphs’. In RA ‘directed systems’, this word means that the
focus of modeling is on the relation of the dependent variable to the independent variables.
It does not imply directionality of edges from the IVs to the DV as this word means in BN
‘directed acyclic graphs’.
In the neutral system lattice of Figure 1, any of the variables can be part of any relation.
In contrast, in the standard directed system lattice, by convention, all of the IVs are always
included in one of the relations (the “IV relation”); the other relations in the model include
predictive IV-DV interactions (or the DV alone if there are no such interactions). In this
paper, the DV in directed system specific graphs is called “Z” and the IVs are called
A, B, C, and so on. For example, the first specific graph listed under G3 from Figure 5,
ABC:ABZ:ACZ, has all three IVs in the first relation, followed by two IV-DV relations.
Aside from allowing for the presence of relations among the IVs (without specifying any
Entropy 2021, 23, 986 8 of 39

such relations), the model says that there is a relation in which A and B might predict Z
Entropy 2021, 23, x FOR PEER REVIEW and another relation in which A and C might predict Z; the net predictive
9 of 42 relation between
A, B, C and Z is a maximum entropy fusion of these two predictive relations.

G1
ABCZ

G2

ABC:ABZ:ACZ:BCZ

G3
ABC:ABZ:ACZ
ABC:ABZ:BCZ
ABC:ACZ:BCZ

G4 ABC:ABZ:CZ
ABC:ACZ:BZ
ABC:BCZ:AZ

G7 G5
ABC:ABZ
ABC:ACZ ABC:AZ:BZ:CZ
ABC:BCZ

G8 G6

ABC:AZ:BZ
ABC:AZ:CZ
ABC:BZ:CZ

G10 G9
ABC:AZ
ABC:BZ
ABC:CZ

G13 G11 G12

ABC:Z

G14
G15 G16

G17 G18

G19

G20

Figure 5. Conventional RA directed system lattice. Structures with boxes (relations) in bold are loop-
Figure 5. Conventional RA directed system lattice. Structures with boxes (relations)
less. in bold are
loopless.
2.2.2. Augmented Directed System Lattice
General
Figure graph
6 augments theG13 (ABC:Z)directed
conventional from Figure 5 represents
system lattice independence
(on the left) of Figure 5 between the IVs
with a lattice
(ABC) andof additional
the DV (Z),predictive graphsis(on
thus there nothe right). These
relation additional
between graphs
the IVs andoffer
the DV, and graph G1
unique analytical results, but that are not typically included when searching the hierar-
(ABCZ, which is not written as ABC:ABCZ because ABC is embedded in ABCZ) represents
chically restricted directed system lattice.
complete dependence among the IVs and the DV. It should be noted; however, that the
directed system lattice of Figure 5 is not entirely exhaustive. What restricts this lattice is
that all models include the “IV relation”; this makes these models hierarchically nested, and
amenable to standard statistical tests. There are additional predictive graphs where this
restriction is dropped that produce different predictions of Z than the models of Figure 5;
these additional graphs are discussed in the following Section 2.2.2.
Figure 5 shows all directed system general and specific graphs for four variables. The
graphs that are greyed represent graphs from the neutral system lattice from Figure 5 that
Entropy 2021, 23, 986 9 of 39

are not part of the directed system lattice because they do not offer unique predictions of
the DV. There are nine directed system general graphs and 19 specific graphs in contrast to
the neutral system lattice, which has 20 general graphs and 114 specific graphs.

2.2.2. Augmented Directed System Lattice


Figure 6 augments the conventional directed system lattice (on the left) of Figure 5
with a lattice of additional predictive graphs (on the right). These additional graphs
offer unique analytical results, but that are not typically included when searching
Entropy 2021, 23, x FOR PEER REVIEW 10 of 42 the
hierarchically restricted directed system lattice.

6. Conventional
FigureFigure RA RA
6. Conventional directed system
directed lattice
system latticeand
andadditional predictivespecific
additional predictive specific graphs.
graphs. Structures
Structures with with bold boxes
bold boxes
(relations) are loopless.
(relations) are loopless.

In In Figure6,6,the
Figure thegraphs
graphs in thetheAdditional
Additional Predictive
Predictive Graphs
Graphs lattice are denoted
lattice are denotedby an by an
apostrophe
apostrophe totoidentify
identifythat
that the
the original
originalgraph
graphwas wasaltered by removing
altered by removing the IV
therelation. For For
IV relation.
example,
example, thethe bottom
bottom relationiningraph
relation graphG2, G2,interpreted
interpretedas asthe
theIV IVrelation,
relation, ABC,
ABC, waswas removed
re-
moved to produce an additional predictive graph0 G2′. This general
to produce an additional predictive graph G2 . This general graph has only one specific graph has only one
specific graph, ABZ:ACZ:BCZ, which is analytically different from G2
graph, ABZ:ACZ:BCZ, which is analytically different from G2 (ABC:ABZ:ACZ:BCZ) be-
(ABC:ABZ:ACZ:BCZ) because the ABC term in graph G2 imposes a constraint among the
cause the ABC term in graph G2 imposes a constraint among the IVs that is not imposed
IVs that is 0not imposed in0 graph G2′. Because G2′ does not follow the standard directed
in graph
system G2 . Because
convention G2 doesthe
of including notIV follow
terms inthe standard
a first relation,directed
it produces system convention
a different pre- of
including the IV terms in a first relation, it produces a different
diction of Z. The apostrophe-marked graphs are less complex than the graphs from which prediction of Z. The
apostrophe-marked
they are derived, and graphs are less
so should complex
also be consideredthaninthe graphs
searches forfrom
goodwhich theymodels.
predictive are derived,
andG5′
soand
should also be considered in searches for good predictive models. G5 0 and G80 from
G8′ from Figure 6 represent naïve Bayes equivalent RA graphs; G4′ is also a naïve
Figure 6 represent
Bayes-like graph. naïve
This isBayes equivalent
discussed in Section RA graphs; G40 is also a naïve Bayes-like graph.
3.3.
A merger of the conventional
This is discussed in Section 3.3. directed system lattice with the additional predictive
graphs of Figure 6 gives the augmented directed
A merger of the conventional directed system lattice system lattice
within Figure 7. The specific
the additional predictive
graphs of Figure 6 gives the augmented directed system lattice in FigureG7,
graphs from G2′ and G3′ from Figure 6 are members of general graphs G3 and 7. respec-
The specific
tively.from
graphs Three G2general
0 and G3 graphs
0 fromare added
Figure to the
6 are augmented
members lattice, namely
of general graphs G10,
G3 and G15G7,andrespec-
G17; these are G4′, G5′, and G8′ from Figure 6, the naïve Bayes or naïve Bayes-like equiv-
tively. Three general graphs are added to the augmented lattice, namely G10, G15 and G17;
alent RA general graphs. All of the specific structures that are added to the augmented
lattice are denoted in bold letters in Figure 7. G13 is the independence model for the con-
ventional directed system graphs. The augmented lattice also includes G20 A:B:C:D,
which is the natural independence model for the additional predictive graphs that do not
include the IV term (ABC). Including these additional predictive graphs in the directed
Entropy 2021, 23, 986 10 of 39

these are G40 , G50 , and G80 from Figure 6, the naïve Bayes or naïve Bayes-like equivalent
RA general graphs. All of the specific structures that are added to the augmented lattice
are denoted in bold letters in Figure 7. G13 is the independence model for the conventional
Entropy 2021, 23, x FOR PEER REVIEW 11 of 42
directed system graphs. The augmented lattice also includes G20 A:B:C:D, which is the
natural independence model for the additional predictive graphs that do not include the
IV term (ABC). Including these additional predictive graphs in the directed system lattice
system latticethe
increases increases
numberthe number of general
of predictive predictive general
graphs fromgraphs from
nine in the nine in the conven-
conventional directed
tional directed
system latticesystem
to 12 lattice
in the to 12 in the augmented
augmented lattice
lattice and 19 and graphs
specific 19 specific graphs
in the in the
conventional
conventional
lattice to 31lattice
in thetoaugmented
31 in the augmented
lattice. lattice.

Figure 7. Augmented
Figure RARA
7. Augmented directed system
directed lattice.
system Structures
lattice. with
Structures bold
with boxes
bold areare
boxes loopless; model
loopless; model
names in bold are augmentations.
names in bold are augmentations.

3. BN Lattice
3.1. BN Introduction
A Bayesian Network model, like an RA model, is a type of probabilistic graphical
Entropy 2021, 23, 986 11 of 39

3. BN Lattice
3.1. BN Introduction
A Bayesian Network model, like an RA model, is a type of probabilistic graphical
model. BN modeling originated from path models in the early 1900s [18,19] and was
expanded as a field of study in the late 1900s by Pearl [21], Neapolitan [20] and others.
BNs are directed graphs: nodes represent variables, and edges represent relations.
The graph structure or topology (variables, edges, orientations of edges) encodes inde-
pendencies, and thus also dependencies, among the variables identified in a particular
graph. Since BNs are directed graphs, edges typically have arrows or some form of notation
representing directionality: A→B means that variable B is dependent upon variable A.
(This dependency might be interpreted as a causal influence of A on B, but in this paper,
we will not address such causal interpretations of BNs.) A is the ‘parent’ of B, which means
that they are dependent. One variable is independent of all other variables given its parents.
For example, in the BN A→B→C, variable C is independent of A given B, since B is the
parent of C.
A BN graph provides the structure from which a probability expression can be derived
that describes the relation between variables. For example, the graph A→B provides the
structure identifying the dependence between A and B, and probability values define the
nature and strength of the relation between A and B. A unique feature of BNs versus other
graphical models is in the independencies that are encoded when two edges converge.
For example, in A→B←C the edges converge on variable B. If A and C are not directly
connected by an edge, this convergence is called a V-structure [29]. This V-structure is
interpreted as yielding the conditional distribution p(B|A,C)p(A)p(C), which encodes
dependence among A, B, and C, but marginal independence between A and C. The
interpretation is that together, but being independent of one another, A and C influence or
cause or allow one to predict B.
BNs are also acyclic graphs, meaning they have no closed paths following the arrows.
For example, graph A→B→C→A is disallowed because it contains a cycle. Because BNs
are acyclic, inference on all BN graphs can be performed in closed algebraic form.
The primary differences between RA and BN are two-fold: (1) BNs are directed and
acyclic whereas RA graphs are undirected and can have loops or not have loops and (2)
some BN graphs contain converging edges, that is one or more V-structures that encode
unique independence relations not found in RA graphs. The absence of a V-structure in a
BN graph results in this graph being equivalent to some (loopless) RA graph. The presence
of a V-structure results in the graph not having an RA equivalent and thus being unique to
BN. This is discussed below in Section 3.2.6, in connection with Table 3.

3.2. BN Neutral Systems


3.2.1. Lattice of BN General Graphs
As in RA, there are general BN graphs and specific BN graphs; in the BN literature gen-
eral graphs are referred to as maximally oriented graphs [30], essential graphs [31], equiva-
lence classes of directed acyclic graphs [32], and partially directed graphs (PDAG) [29].
In BN general graphs, the graph structure (variables, edges, and orientations of edges)
results in a unique independence structure, where specific identities are not assigned to
the variables. Figure 8, developed by Harris and Zwick [33], shows all BN general graphs
with four variables and their hierarchy. There are 20 BN general graphs in the lattice, i.e.,
20 unique independence structures. The procedure to generate this lattice is outlined in
Section 3.2.6.
Entropy
Entropy2021, 23,23,
2021, 986x FOR PEER REVIEW 12
13 ofof3942

Figure 8. Lattice of BN neutral system general graphs.


Figure 8. Lattice of BN neutral system general graphs.

In Figure 8, general graphs are labeled BN1, BN2 . . . BN19, BN20. Solid squares
represent variables; edges are represented by directed arrows from one square to another,
Entropy 2021, 23, x FOR PEER REVIEW 14 of 42

InFigure
In Figure8,8,general
generalgraphs
graphsare
arelabeled
labeledBN1,
BN1,BN2…BN19,
BN2…BN19,BN20.
BN20.Solid
Solidsquares
squaresrepre-
repre-
Entropy
Entropy 2021,
2021, 23,
23, xx FOR
FOR PEER
PEER REVIEW
REVIEWsent 14
14 of 42
ofrep-
42
sentvariables;
variables;edges
edgesare
arerepresented
representedby
bydirected
directedarrows
arrowsfrom
fromone
onesquare
squareto
toanother,
another, rep-
In Figure
resenting
resenting 8, general graphs
aaparent–child
parent–child dependency
dependency are labeled BN1, BN2…BN19,
relationship.
relationship. dashedBN20.
Thedashed
The lineswith
lines Solid
with squares
arrows
arrows repre-
from
from one
one
Entropy 2021, 23, 986 sent
general variables;
generalgraph graphto edges
toanother are represented
anotherrepresent representthe by directed
thehierarchy
hierarchyof arrows
ofgeneralfrom
generalgraphs, one square
graphs,with to
withparentanother,
parentgraphs
13 of rep-
graphs
39
resenting
being
being In above
Figure
In above
Figure a parent–child
child
8,
8, general
child graphs.
graphs.
general dependency
graphs
graphs Child
Child are
are graphs relationship.
labeled
graphs
labeled result
BN1,
resultBN1, from
from The the
BN2…BN19,
the
BN2…BN19, dashed
deletion
deletion lines
BN20.
of
BN20. with
ofone
one
Solid
Solid arrows
edge
edge from
squares
from
squares from one
thepar-
repre-
the
repre-par-
general
ent
sent
sent graph.graph
variables;
ent variables;
graph. Theedges The to
edges another
insert
insertare are on represent
the
represented
onrepresented bottom
the bottom right the
right
by hierarchy
indicates
directed
indicates
by directed of
arrows general
structures
from
structures
arrows from that graphs,
that
one are
square with to
are topologically
one square parent
topologically
another, graphs
to another,differ- differ-
rep-
rep-
being
ent from
resenting
ent
resenting above
from child graphs.
graphs
aagraphs
parent–child
parent–child inin the Child marked
thedependency
lattice
lattice
dependency graphs
marked result
withfrom
relationship.
with
relationship. asterisks
The
asterisks
Thethedashed
deletion
but have
but
dashed haveof identical
lines
lines one
with
with edge
identical arrows
arrows from the one
independence
from
independence
from par-
one
representing a parent–child dependency relationship. The dashed lines with arrows from
ent
generalgraph.
structures
structures graph The
tothese
to toinsert
these
totoanother on
marked
marked the bottom
graphsand
represent
graphs right
and
the indicates
thusare
hierarchy areMarkov structures
Markov
of general that
equivalent
graphs, are topologically
(the
with topological
parent differ-
differ-
graphs
general
one general graphgraph another
another represent
represent thethus
the hierarchy
hierarchy of general
of equivalent
general graphs,(the
graphs, with
with topological
parent
parent differ-
graphs
graphs
ent
ence
being
ence
being from cannot
above
cannot
above graphsbebe
child
child in
removed
removed the
graphs.
graphs. lattice
byby
Child
Child any
any marked
labeling
graphs
labeling
graphs with
result
of
result of asterisks
the
from
the
from variables).
the
variables). but
deletion have
For
For of identical
example,
one
example, edge independence
BN2b
from
BN2b and
the
and BN2c
par-
the BN2c
being above child graphs. Child graphs result from the the deletion
deletion of oneof one
edgeedge fromfrom the parent par-
structures
entin the
graph.
in the
ent graph. insert
insert to
The
The these
are
areinsert
insert marked
topologically
on
topologically the
onbottom graphs
bottom
the bottom differentand
different
right thus
but
but are
have
indicates
have Markov
the
the same
structures
same equivalent
independence
that
independence are(the topological
structure
topologically
structure differ-
as BN2*
differ-
as BN2*
graph. The insert on the rightright indicates indicates structures
structures that thattopologically
are are topologically differ-
different
ence
entin thecannot
from
in graphs
ent the
from lattice.
graphs
lattice.
graphsbe removed
These
These in
in thethe by
additional
lattice
additional any
lattice with labeling
representations
marked
representations
marked withof
with but the variables).
are
asterisks
are
asterisks discussed
but
discussed
but have For
have example,
below
below in
identical
in Section
identical BN2b
Section and
3.2.2.
independence
3.2.2.
independence BN2c
from in the lattice marked asterisks have identical independence structures
in theTable
structures
structures insert
Table to
to 1are
1
these topologically
summarizes
summarizes
these marked
marked the
the different
graphs
RA
graphs RA and
and
and but
BN
thus
BN
thus have
are
are the same
terminology
Markov
terminology
Markov andindependence
equivalent
and supports
supports
equivalent (the
(the the
the structure
discussion
topological
discussion
topological asdiffer-
BN2*
ofBN
of
differ-BN
to these marked graphs and thus are Markov equivalent (the topological difference cannot
in the
thatcannotlattice.
follows. These
Entries additional
inthe
the representations
table forRA RAgeneral
general are discussed
andspecific
specific below
graphs in Section
(thelattices
lattices 3.2.2.
of general
beence
that
ence
removed follows.
cannot bybebeany removed
Entries
removed in
labeling by any
byoftablethe labeling
any for
labeling
variables). of
of theFor variables).
the and
variables).
example, For
For
BN2b example,
graphs and(the
example, BN2c BN2b
BN2b
in the and
of
and BN2c
general
BN2c
insert
in and
the
and Table
specific
insert
specific
intopologically 1 summarizes
are
the insert aredifferent graphs
topologically
graphs
topologically from
from the RA
Figures
different
Figures
differentand 1 1BN and
but
and terminology
but havehave
4,4, respectively)
the same
respectively) and supports
have
independence
the same structure have
independence the
already
already discussion
been
structure
been
structure as of
as BN2*
discussed
BN2*BN
discussed
are but have the same independence as BN2* in the lattice.
that
in above.
the follows.
lattice. Entriesadditional
Thediscussion
discussion
These in that
the
thattable followsfor this
RAthisgeneral
representations tablebelow willand
are specific
explain
discussed the graphs (the
additional
below lattices of general
representations of
inabove.
These the The
lattice.
additional These additional
representations follows
representations
are discussed table will
are explain
discussed
in Section the 3.2.2. in
additional
below in Section
Section 3.2.2.
representations
3.2.2. of
and
BN
BN specific
general
Table
general
Table
Table 1 graphs
graphs
summarizes
1 graphs
summarizes
1 summarizes infrom
in thethe the Figures
insert
insert
thethe RA RA
RA of
and
ofand
and 1
Figure
Figure
BN BN
BNand 8,4, andrespectively)
terminology will
8, and will derive
terminology
terminology derive
andand
and have
the
supports
the lattice
supports
supports already
lattice the
thethe of been
specific
discussion
ofdiscussion
specific
discussion discussed
BNBN of
ofgraphs
BN BN
ofgraphs
BN
above.
that summarized
that
summarized
that TheEntries
follows.
follows.
follows. discussion
ininFigure
Entries
Entries Figure
ininin
the that
the14
14
thetable follows
presented
table forfor
presented
table for RA this
RA
RA belowtablein
below
general
general
general will
in explain
Section
and
Section
and and 3.2.5.
specific
specific the
3.2.5.
specific additional
graphs
graphs
graphs (the
(the
(the representations
lattices
lattices
lattices ofofof general
general
general of
BN
and
and
and general
specific
specific
specific graphs
graphs
graphs
graphs fromin the
from
from insert
Figures Figures
Figures of Figure
1 and11 4,and and 8, and
4, will
4, respectively)
respectively) derive
respectively) have alreadythe
have lattice
have already
alreadyof specific
been discussed been BN
been discussed graphs
discussed
above.
Thesummarized
Table
above.
Table
above. 1.
discussion 1.
The
RA RA andin
and
discussion
The discussion Figure
BN
BN
that follows 14
terminology.
that
terminology. presented
that thisfollows
followstable this
thisbelow
will table
table in
willSection
explain
will explain
explain 3.2.5.
the additional
the additional
the additional representations
representations
representations of BNof of
BN
general general
BN general graphs graphs
in Termi-
graphs in the
theininsert
the insert of
of Figure
insert of Figure
Figure8, and 8,
8, and
will
and will
derive
will derive
derive thethe the lattice
lattice of specific
of specific
lattice of specific BNBN BN graphs
graphs
graphs
Table 1. RAOur Our
and Termi-
BN terminology. Literature Termi-
Literature Termi- Lattice Name, RA- Lattice Name, RA-
summarized
summarized
summarized ininin Figure
Figure
Figure 1414 presented
presented
14 presented below
belowbelow in in Section
Section
in Section 3.2.5.
3.2.5.
3.2.5. Visuals
Visuals
nology
nology nology
nology likeNotation
like Notation
Our Termi- Literature Termi- Lattice Name, RA- G15
Table G15
Visuals
Table 1. 1. RA
RA and andTable
nology
BN
BN1. terminology.
RA and BN
terminology. terminology. like Notation
nology
GeneralRA
General RA
RA Our
RA Termi- G-structures
G-structures [7]Lattice
[7] G15Name,
G15 (Figure
(Figure 1)1)
Ourgraph graph Literature
Termi- Literature Termi-
Termi- Lattice
Lattice Name, Name, RA-LikeRA-
RA- G15
Visuals
Our Terminology Literature Terminology
General Visuals
Visuals
RA nologyRA G-structures
nology nology
nology [7] like
like Notation
Notation
G15 Notation
(Figure 1)
graph G15
G15G15
G15
General
General RA
RA BB
RA General RA graph RA
RA SpecificRA
G-structures
Specific G-structures
RA[7]Specific
SpecificRA
G-structures RAgraph [7]
graph
[7] G15
G15 G15
G15
G15 (Figure
(Figure
(Figure
(Figure
(Figure 1) 1)
1) 4),
4),
graph
graph G15 BD
graph
graph [15]
[15] AD:BD:CD
AD:BD:CD BD
B
Specific RA Specific RA graph G15 (Figure 4), AA AD D CDCD CC
AD

G15D
BD
graph [15] AD:BD:CD G15
Maximally ori- A B
B CD C
Specific RA Maximally
Specific RA ori- G15 (Figure 4),
graph
AD
D
Specific RA graph Specific
Specific RA graph [15]RA Specific
entedgraphs,RA
graphs,graph G15 G15 (Figure
es-(Figure 4), AD:BD:CD 4),
graph ented [15] es- AD:BD:CD BD
BD
graph Maximally [15]graphs,
sential ori- AD:BD:CD
sential graphs, A
A AD AD
D
CD
CD C
C
GeneralBN ented
BN equivalence graphs,
equivalenceclas- es-
clas- BN11* BN11*&&BN11b BN11b D
BN General
BN graphgraphs, sential
Maximally
sesof
Maximally graphs,
ofdirected ori-
directed
ori- (Figure8)8)
Maximally oriented graph ses
essential (Figure
graphs, General BN
equivalence classes equivalence
ented
ented
of graphs,
acyclic
graphs,
directed
acyclic graphs, clas-
graphs, es-
es- BN11* & BN11b
BN General BN graph BN BN11* & BN11b (Figure 8)
acyclic graphs, graph partiallypartiallyses
sential
directed of
partially
sential directed
graphs,
directed
graphs, (Figure 8)
directed
graphs [29–32]
General BN acyclic
equivalence graphs, clas- BN11* & BN11b
BN General BN equivalence graphs[29–32]
graphs [29–32]
clas- BN11* & BN11b
BN graph partially
ses directed
graph ses of of directed
directed (Figure
(Figure 8) 8)
Specific BN
Specific BN acyclic graphs, Labeled
graphs
acyclic
Labeled maxi-
[29–32]
graphs,
maxi- BN11*,
BN11*, BN11b BN11b
Specific BN graph graph(no-V- (no-V-partially mallyoriented
mally oriented
directed BN11*, BN11b (Figure (Figure14),14),
graph partially directed (Figure 14),
(no-V-structure) Specific BN Labeled maxi- AD:BD:CD
BN11*, BN11b
Labeled maximally structure)
structure) oriented graphs,
graphs
graphs,
graphsgraphs, essential
[29–32]
essential
[29–32] AD:BD:CD
AD:BD:CD
essential graph graphs,(no-V- equivalence mally
graphs,
classes
graphs, oriented
equiva-
equiva- (Figure 14),
of directed structure)
Specific
acyclicBN
Specific graphs,
BN graphs,
Labeled
lencepartially
lence
Labeled essential
classes
classes maxi-
maxi- ofofdi-di- BN11*, AD:BD:CD
BN11*, BN11b
BN11b
graph Specific
directed
Specific (no-V-
graph (no-V- mally BNBN
graphs graphs,
mally rected equiva-
oriented
acyclic BN17
oriented (Figure
(Figure 14),
14), 14),
Specific BN graph rected acyclic BN17
BN17 (Figure
(Figure
(Figure 14), 14),
graph
structure) (V-
graph (V- graphs,
structure) lence
graphs,
graphs,classes
essential of
partially
essential di- AD:BD:CD
AD:BD:CD
(V-structure) Specific BN graphs, graphs, partially BCD BCD
BCD B:C :AB:C
B:C :A
:A
structure)
structure) rected
directed
graphs, acyclic
equiva-
graphs BN17 (Figure 14),
equiva-
graph (V- lence directed graphs
graphs,
lence classes
classespartially
of
of di-
di- BCDB:C:A
structure)
Specific
Specific BN
BN directed
rected graphs
rected acyclic
acyclic BN17 BN17 (Figure(Figure 14),
14),
3.2.2.Additional
3.2.2. graph
graph (V-
Additional Representations
Representations
(V- graphs, ofofBN
partially BNGeneral
General Graphs
Graphs
3.2.2. Additional Representations graphs, of partially
BN General BCD Graphs
BCD B:C:A
B:C :A
There
There structure)
are20
structure)
are 20general
general directed graphs
graphs inthe
graphs
in theBN BNlattice.
lattice.However,
However,eight eightof ofthese,
these,marked markedwith with
There
3.2.2. are 20 general
Additional directed
graphs
Representations ingraphs
the
of BN BNGeneral
lattice. However,
Graphs eight of these, marked with
asterisks in Figure 8, namely BN2*, BN4*,
asterisks in Figure 8, namely BN2*, BN4*, BN5*, BN9*, BN11*, BN14*, BN15*, and BN16*, BN5*, BN9*, BN11*, BN14*, BN15*, and BN16*,
asterisks in Figure
There are 208,general
namelygraphs BN2*, BN4*, in thethat BN5*,
BNinclude BN9*, BN11*, BN14*,
lattice. BN15*, and BN16*,
represent
represent Markov
Markov equivalenceclasses
equivalence classes that includeHowever, additionaleight
additional unique
unique of these, marked
edgetopologies
edge topologies with
that
represent
3.2.2.
asterisks Markov
Additionalin Figure equivalence
Representations
8, namely classes
BN2*, of that
BN
BN4*, include
General
BN5*, additional
Graphs
BN9*, BN11*, uniqueBN14*, edge topologies
BN15*, and thatthat
BN16*,
3.2.2. Additional Representations of BN General Graphs
have identical
represent probability distributions when include applied additional
to data. These additional topologies,
ThereMarkov are 20 equivalence
20 general graphs classes
in thethat
the BN lattice.
lattice.8,However, unique
eight of edge
of these,topologies
these, marked that
with
shownThere in theare insert general
at the bottom graphs in
right BNFigure
of However,
cannot beeight made equivalent marked with
to the
asterisks
asterisks in in Figure
Figure 8, namely
8, (those
namelywith BN2*,
BN2*, BN4*,
BN4*, BN5*, BN5*, BN9*,
BN9*, BN11*,
BN11*, BN14*, BN14*, BN15*, and
and BN16*,
BN15*, variables. BN16*,
representative graphs asterisks) by any 1:1 mapping of unlabeled
represent
represent Markov
Markov equivalence
equivalence classes
classes that
that include
include additional
additional unique
unique edge
edge topologies
topologies that
that
This property, described by Heckerman [34], who showed that BNs with differing edge
topologies can have the same independence structure and thus the same probability
shown in the insert at the bottom right of Figure 8, cannot be made equivalent to the rep-
have identical probability distributions when applied to data. These additional topol
resentative graphs (those with asterisks) by any 1:1 mapping of unlabeled variables. This
shown in the insert at the bottom right of Figure 8, cannot be made equivalent to th
property, described by Heckerman [34], who showed that BNs with differing edge topol-
resentative graphs (those with asterisks) by any 1:1 mapping of unlabeled variables
ogies can have the same independence structure and thus the same probability distribu-
Entropy 2021, 23, 986 property, described by Heckerman [34], who showed that BNs with differing 14 of 39 edge
tion, is unique to BN and is not found in RA, where there is a single unique representation
ogies can have the same independence structure and thus the same probability dis
of each RA general graph. All general graphs in Figure 8 without an asterisk have no Mar-
tion, is unique to BN and is not found in RA, where there is a single unique represen
kov equivalent representations.
of each RA general graph. All general graphs in Figure 8 without an asterisk have no
Two Bayesian Networks are Markov equivalent if and only if they have the same
distribution, iskovunique to BNrepresentations.
equivalent and is not found in RA, where there is a single unique
skeleton and the same V-structure [28], resulting in the same underlying independence
representation of eachTwo RABayesian
general graph.
Networks All general graphs
are Markov in Figure 8if without
equivalent and onlyanifasterisk
they have the
structure. The skeleton
have no Markov of a representations.
equivalent graph is its undirected representation. As already defined, a
skeleton and the same V-structure [28], resulting in the same underlying indepen
V-structure occurs when
Two Bayesian two orare
Networks more directed
Markov edges that
equivalent are not
if and onlythemselves directly
if they have the con-
structure. The skeleton of a graph is its undirected representation. Assame
already defin
nected by an edge
skeleton and the converge on
same V-structure a single node.
[28],two Figure
resulting 9 shows
indirected
the same an example
underlying of Markov non-
V-structure occurs when or more edges that areindependence
not themselves directly
equivalent (Example
structure. The 1) and
skeleton equivalent
of aedge
graph is its(Example 2) BN general graphs.
undirected
nected by an converge on a singlerepresentation. As already
node. Figure 9 shows defined,
an example of Markov
a V-structure occurs when two or more directed edges that are not themselves
equivalent (Example 1) and equivalent (Example 2) BN general graphs. directly
connected by an edge converge on a single node. Figure 9 shows an example of Markov
non-equivalent (Example 1) and equivalent (Example 2) BN general graphs.

Figure 9. Examples of Markov equivalence tests.


Figure 9. Examples of Markov equivalence tests.
Figure 9. Examples of Markov equivalence tests.
BNs that are Markov equivalent define an equivalence class; this is illustrated by
BNs that are Markov equivalent define an equivalence class; this is illustrated by BN2*
BN2* in Figure
in Figure 10 for 10 for which
which two general
other general graphs (BN2b and BN2c) included in the
BNstwo thatother
are Markov graphs (BN2b
equivalent and an
define BN2c) included
equivalence in thethis
class; insert
is illustrat
insert at the bottom
at the bottom BN2* of
of Figure Figure 8
8 are inare in the same equivalence class. All three general graphs
in Figure 10the
for same
whichequivalence class. All
two other general three(BN2b
graphs general graphs
and BN2c)are
included
are Markov
Markov equivalent
equivalent because
theythey have
the the same skeleton and V-structures, and thus
insert because
at the bottom have
of Figure same skeleton
8 are in the sameand V-structures,
equivalence andAll
class. thus thegeneral g
three
the
samesame independence
independence structure,
structure, but they
but they havehave semantically different
semantically edge orientations.
are Markov equivalent because they have different
the sameedge orientations.
skeleton BN2*
and V-structures, and
BN2* was chosen arbitrarily
was chosen arbitrarily to to represent
represent this this equivalence
equivalence classclass
and and
its its
uniqueunique independ-
independence
the same independence structure, but they have semantically different edge orienta
ence structure.
structure. BN2b BN2b
andwas and have
BN2c BN2cthehave
sametheindependence
same independence structure,
structure, and for and for corre-
corresponding
BN2* chosen arbitrarily to represent this equivalence class and its unique inde
sponding variable
variable labels,ence labels, have
havestructure. identical
identical BN2b
probability probability distributions.
distributions.
and BN2c have the same independence structure, and for
sponding variable labels, have identical probability distributions.

Figure 10. BN2*,


Figure 10. BN2*, BN2b,
BN2b, BN2c.
BN2c.
Figure 10. BN2*, BN2b, BN2c.
A BN general graph is represented in the literature by an unlabeled PDAG [29], also
known as a Maximally Oriented Graphs [30], Essential Graph [31] and equivalence classes
A BN general graph is represented in the literature by an unlabeled PDAG [29
of directed acyclic graphs [32]. In a PDAG, edges can be directed, undirected or a mix of
known as a Maximally Oriented Graphs [30], Essential Graph [31] and equivalence c
directed and undirected. A PDAG includes edge direction when a V-structure is present
and removes edge direction when no V-structure is present. If there are no V-structures
in a given BN, all edges are undirected in its PDAG representation. Figure 11 shows the
PDAG representation of the graphs shown in the insert at the bottom of Figure 8. (PDAG2
encompasses BN2b and BN2c, etc.) Undirected edges can have either direction as long as a
cycle is not created and also a V-structure is not created that is represented by another BN
general graph. For example PDAG16, labeling variables A, B, C, D in order of left to right,
top to bottom could be oriented as B←D→C (BN16*) or B→D→C (BN16b) (or its mirror
and removes edge direction when no V-structure is present. If there are no V-structures
and
in aremoves
given BN, edge direction
all edges are when no V-structure
undirected in its PDAG is present. If there are
representation. no V-structures
Figure 11 shows the
inPDAG
a given BN, all edgesofare
representation theundirected
graphs shown in its
inPDAG representation.
the insert at the bottomFigure 11 shows
of Figure 8. (PDAG2 the
PDAG representation
encompasses BN2b and of the graphs
BN2c, etc.)shown in theedges
Undirected insertcan
at the bottom
have eitherofdirection
Figure 8.as
(PDAG2
long as
encompasses
a cycle is notBN2bcreatedand BN2c,
and alsoetc.) Undirected
a V-structure is edges can have
not created thateither direction by
is represented as long
anotheras
Entropy 2021, 23, 986 15 of 39
aBNcycle is not created and also a V-structure is not created that is represented
general graph. For example PDAG16, labeling variables A, B, C, D in order of left to by another
BN general
right, top tograph.
bottomFor example
could PDAG16,
be oriented labeling(BN16*)
as B←D→C variables orA, B, C, D(BN16b)
B→D→C in order(or
of its
leftmir-
to
right, top tobut
ror image) bottom
could could beoriented
not be orientedas asB→D←C,
B←D→C (BN16*)
because orthatB→D→C
creates (BN16b) (or itsresult-
a V-structure mir-
ror
ingimage)
in abut
image) but could
different
could notnot
be be
independenceoriented
oriented as→B→D←C,
structure
as B D← because
represented
C, because thatcreates
creates
separately
that aV-structure
V-structure
bya BN17. result-
resulting
ing in a different independence structure represented separately
in a different independence structure represented separately by BN17. by BN17.

Figure 11. PDAGs for graphs in Figure 8 insert.


Figure
Figure11.
11.PDAGs
PDAGsfor
forgraphs
graphsininFigure
Figure88insert.
insert.
Although representation of an entire Markov equivalence class in a single PDAG is
Although
useful,Although
the PDAG representation
representation
does not visiblyofofanan entire
entireMarkov
display the fact equivalence
Markov equivalence
that semanticallyclass in
class a single
in
differenta single PDAG
edge PDAG is
topol-
useful,
is
ogies the the
useful,
inhere PDAG inPDAGdoesdoes
many not general
BN visibly display
not visibly
graphs the
(in fact
display of that
8 the fact
20 semantically
graphsdifferent
that semantically
general edge topol-
in thedifferent
four-variable edge
topologies
ogies Useinhere
inhere
lattice). in
of many in many
Figure BN BN
8 togeneralgeneral
display thegraphs
graphs BN(in (in
8 of8 20
general of 20 general
general
graph graphs
graphs
lattice ininthe
opts instead thefour-variable
four-variable
to show rep-
lattice).
lattice).
resentatives Use
Use ofofof Figure
Figure
these 8 display
8 to
classestoand
display the
BN BN
thetheir
also general
general
alternative graph graph lattice
lattice
topologies opts opts
instead
in the instead
insert to thetobottom
at show show
rep-
representatives
resentatives
of the figure. of of
these these classes
classes and and
also also
their their alternative
alternative topologies
topologies in the in
insert theat insert
the at
bottom the
bottom
of of the figure.
the figure.
3.2.3. BN Specific Graph Notation
3.2.3.BN
3.2.3. BNSpecific
SpecificGraph GraphNotation
Notation
A BN specific graph is simply a labeled BN general graph. As summarized in Table
1, we AABN BN specific graph
use specific
graph is
the terminology
is simply
simply aa labeled
of “specific labeled
graph”
BN
BN general
general
for
graph. As
whatgraph.
in theAs
summarized ininTable
BNsummarized
literature is called Table1,a
we use
1,labeled the terminology
we usemaximally
the terminology of “specific graph” for what in the BN literature is called a labeled
orientedofgraph“specific graph” graph
or essential for what in the BN literature
or equivalence class of directedis called a
acy-
maximally
labeled orientedoriented
maximally graph orgraphessential graph orgraph
or essential equivalence class of directed
or equivalence of acyclic graphs
clic graphs or partially directed graph; these four different terms class
all refer directed
to the same acy-
or
clic partially
graphs directed
or graph;
partially these
directed four
graph;different
these terms
four all refer
different to the
terms same
all thing.
refer to All
thespecific
same
thing. All specific graphs for a given BN general graph class can be generated by permut-
graphs
thing. for a given BN general graph class cangraph
be generated bybepermuting all permut-
possible
ing allAll specific
possible graphs
variable for a given
labels. Given BNdata,
generaltwo BN specific class cangraphs generated
with differentby labels
variable labels. Given data, two BN specific graphs with different labels from the same BN
ing
from allthe
possible
same BN variable labels.
general graph Given
classdata, two BN different
will produce specific graphs with distributions.
probability different labels
general graph class will produce different probability distributions.
from theThesamenotation BN general
that we graph
use forclass will produce
BN specific graphs different
is derivedprobability
from thedistributions.
RA notation de-
The notation that we use for BN specific graphs is derived from the RA notation
The previously.
scribed notation that Aswein use
RA,fortheBN colonspecific graphsmarginal
represents is derived or from the RAindependence
conditional notation de-
described previously. As in RA, the colon represents marginal or conditional independence
scribed previously. As in RA, the colon represents marginal
among variables and relations. For example, Figure 12 shows a labeled version of RA gen- or conditional independence
among variables and relations. For example, Figure 12 shows a labeled version of RA
among
eral graphvariables
G15G15 and BN
and relations.
general For example,
graph BN11* Figure 12can shows aequivalently
labeled version represented
of RA gen-
general graph and BN general graph BN11*which which canalso also equivalently be be represented
((𝐴
A ⊥⊥B,𝐵,C𝐶 || D 𝐷),) , ((𝐵 ⊥C𝐶| |D𝐷),
eral graph G15
by BN11b,
BN11b, bothand BN general
of which
which havethe graph
thesame BN11*
same which can also
independencies equivalently be represented the
by both of have independencies B⊥ ) , the
by
sameBN11b, both of which
conditional have distribution
probability the same independencies (𝐴 ⊥ 𝐵, 𝐶 | 𝐷),
𝑝(𝐴|𝐷)𝑝(𝐵|𝐷)𝑝(𝐶|𝐷)𝑝(𝐷) and (𝐵thus⊥ 𝐶 the| 𝐷),same
the
same conditional probability distribution p( A| D ) p( B| D ) p(C | D ) p( D ) and thus the same
same
notationconditional
AD:BD:CD. probability distribution 𝑝(𝐴|𝐷)𝑝(𝐵|𝐷)𝑝(𝐶|𝐷)𝑝(𝐷) and thus the same
notation AD:BD:CD.
notation AD:BD:CD.

Figure
Figure 12. RA and
12. RA and BN
BN notation
notation example,
example, without
without subscripts.
subscripts.
Figure 12. RA and BN notation example, without subscripts.
RA
RA notation
notation must
must be
be modified
modified to
to accommodate
accommodate the
the V-structures
V-structures that
that are
are unique
unique to
to
BNsRAand not found in RA; this is done by adding subscripts that specifythat
the independence
BNs andnotation
not foundmust be modified
in RA; to by
this is done accommodate the V-structures
adding subscripts that specify the are unique to
independence
relations
BNs encoded
and not byRA;
found in the this
V-structures.
is done by(For a BN
adding graph without
subscripts a V-structure,
that specify BN nota-
the independence
tion is identical to the RA notation.) For example, BN17 in Figure 13a has the notation
BCDB:C :A, where the colon between BCDB:C and A states the independency ( A ⊥ B, C, D ),
namely that A is marginally independent of B, C, and D. The subscript B:C states marginal
independence between B and C within the triadic, dependent, BCD relation. Figure 13b
shows the more complex BN4, which has a V-structure in which A, B, and C have arrows
going to D; this means that it has a tetradic dependency between A, B, C and D, which will
be reflected in a p( D | ABC ) in the probability expression for this graph. The graph also
has the single independency ( A ⊥ B | C ) . The notation for this graph is thus ABCDAC:BC ,
which preserves the dependency between A, B, C, and D, and also encodes the condi-
tional independence between A and B given C. (In RA, this conditional independence is
independence between B and C within the triadic, dependent, BCD relation. Figure 13b
shows the more complex BN4, which has a V-structure in which A, B, and C have arrows
going to D; this means that it has a tetradic dependency between A, B, C and D, which
will be reflected in a 𝑝(𝐷|𝐴𝐵𝐶) in the probability expression for this graph. The graph
Entropy 2021, 23, 986
also has the single independency (𝐴 ⊥ 𝐵 | 𝐶) . The notation for this graph is 16thus of 39

ABCDAC:BC, which preserves the dependency between A, B, C, and D, and also encodes
the conditional independence between A and B given C. (In RA, this conditional inde-
pendence
expressedisby
expressed by saying
saying that : BC )𝑇(𝐴𝐶:
T ( AC that = TC𝐵𝐶)
( A : B)𝑇=
C(𝐴: 𝐵)
0, where 0,
T where T is information-
is information-theoretic
theoretic transmission.)
transmission.)

Figure13.
Figure BNnotation
13.BN notationexamples
exampleswith
withsubscripts.
subscripts.(a)
(a)BCD
BCD :A;:A;
B:C
B:C (b)(b) ABCD
ABCD . .
AC:BC
AC:BC

3.2.4. BN Independencies and Probability Distributions


3.2.4. BN Independencies and Probability Distributions
As has been repeatedly stated in the above discussion, the marginal or conditional
As has been
independence repeatedly
between statedand
variables in the above is
relations discussion,
what uniquelythe marginal
specifiesoranconditional
RA or BN
independence
model. “It is known that the statistical meaning of any causal model can beRA
between variables and relations is what uniquely specifies an or BN
described
model. “It is known that the statistical meaning of any causal model
economically by its stratified protocol, which is a list of independence statements that can be described
economically by its stratified
completely characterize protocol,
the model” which is
[22,23,28]. Thea list of independence
method to determine statements
BN independen-that
completely
cies is known characterize the model”
as D-separation, and[22,23,28].
is describedThein method to determine
the Appendix A.2. BN independen-
To determine the
cies is known as D-separation, and is described in the Appendix A.2.
list of independence statements that completely describe any BN, D-separation is appliedTo determine the list
oftoindependence statements that
all possible independence completely
statements for describe
a given BN.any Those
BN, D-separation is applied to
satisfying independence
all possible independence statements for a given BN. Those satisfying
among variables are retained and represent the set of independencies that fully describe independence
among variables
the structure are retained
of relations andarepresent
within given BN.the Forsetfour
of independencies
variables, Tablethat fully describe
2 provides all pos-
the structure
sible of relations
independence within aFor
statements. given BN. For
a given BN,four
withvariables,
node labels Table
and2directed
providesedges,
all pos-
all
sible independence
independence statements.
statements from For
this atable
given BN,towith
need node labels
be tested. and directed
Independence edges,that
statements all
independence
are satisfied are statements
kept, andfrom this table
represent need
the set to be tested. Independence
of independencies statements
that fully describe thatthat
BN.
are satisfied are kept, and represent the set of independencies that fully describe that BN.
Table 2. Four-variable independence statements.
Table 2. Four-variable independence statements.
Marginal Independence Conditional Independence
General
Marginal Independence Conditional Independence
General Expression (. ⊥ .... ))
(. ⊥ (. (⊥. ⊥
. . ,..,…. ). .) . ⊥. . ..,
(.(⊥ , …. ,. …
. , ..). . .) (. ⊥
(. ⊥ . . |... |. .. ). .) . ⊥. ...| |…
(.(⊥ , .. .). .)
. .,.…
Expression
Specific
SpecificExpression
Expression 1 (𝐴 ⊥ 𝐵)
( A ⊥ B)
(𝐴( A⊥⊥𝐵,B,𝐶) C)
(𝐴( A⊥⊥𝐵,B,𝐶,C,𝐷)D ) (𝐴( A⊥ ⊥𝐵B| |𝐶) C)
(𝐴( A⊥⊥𝐵B||𝐶, 𝐷)
C, D )
21 (𝐴 ⊥ 𝐶) (𝐴 ⊥ 𝐵, 𝐷) (𝐵 ⊥ 𝐴, 𝐶, 𝐷) (𝐴 ⊥ 𝐵 | 𝐷) (𝐴 ⊥ 𝐶 | 𝐵, 𝐷)
2 ( A ⊥ C) ( A ⊥ B, D ) ( B ⊥ A, C, D ) ( A ⊥ B | D) ( A ⊥ C | B, D )
33 (𝐴 ⊥ 𝐷) (𝐴 ⊥ 𝐶, 𝐷) (𝐶 ⊥ 𝐴, 𝐵, 𝐷) (𝐴 ⊥ 𝐶 | 𝐷) (𝐴 ⊥ 𝐷 | 𝐵, 𝐶)
( A ⊥ D) ( A ⊥ C, D ) (C ⊥ A, B, D ) ( A ⊥ C | D) ( A ⊥ D | B, C )
44 (𝐵
(B ⊥⊥ C𝐶) ) (𝐵( B⊥⊥𝐴, A,𝐶) C) (𝐷( D⊥ ⊥𝐴,A,𝐵,B,𝐶)C ) (𝐵( B⊥⊥𝐴A| |𝐶) C) (𝐵( B⊥⊥𝐴A||𝐶, C,𝐷) D)
5 (𝐵
(B ⊥⊥ D𝐷) ) (𝐵( B⊥⊥𝐴, A,𝐷) D) (𝐵( B⊥⊥𝐴A| |𝐷) D) (𝐵( B⊥⊥𝐶C|| 𝐴, A,𝐷) D)
66 (𝐶
(C ⊥⊥ D𝐷) ) (𝐵( B⊥⊥𝐶, C,𝐷) D) (𝐵( B⊥⊥𝐶C| |𝐷) D) (𝐵( B⊥⊥𝐷D||𝐴, A,𝐶) C)
7
7
(𝐶(C⊥⊥𝐴, A, B)
𝐵) (C ⊥ A | B)
(𝐶 ⊥ 𝐴 | 𝐵) (C ⊥ A | B, D )
(𝐶 ⊥ 𝐴 | 𝐵, 𝐷)
8 (C ⊥ A, D ) (C ⊥ A | D ) (C ⊥ B | A, D )
89 (𝐶(C⊥⊥𝐴, B,𝐷) D) (𝐶 (C⊥ ⊥𝐴B| |𝐷) D) (𝐶(C⊥⊥𝐵D| |𝐴,A,𝐷)B)
9
10 (𝐶( D⊥ ⊥𝐵, A,𝐷) B) (𝐶( D⊥ ⊥𝐵A| 𝐷) | B) (𝐶( D⊥⊥𝐷A| |𝐴,B,𝐵) C)
11
10 (𝐷 ⊥ 𝐴, 𝐵) )
( D ⊥ A, C (𝐷 ⊥ 𝐴 | 𝐵)
( D ⊥ A | C ) (𝐷 ⊥ 𝐴 | 𝐵, 𝐶)
( D ⊥ B | A, C)
12
11 (𝐷( D⊥⊥𝐴, B,𝐶) C) ( D ⊥
(𝐷 ⊥ 𝐴 | 𝐶) B | C ) D ⊥
(𝐷 ⊥ 𝐵 | 𝐴, 𝐶)B)
( C | A,

D-separation can also be used to test the Markov equivalence of any labeled BNs.
If two BNs have the same independencies as revealed by D-separation tests, they are
in the same Markov equivalence class and thus the same BN general graph. The prior
section, however, provided a simpler way, illustrated above in Figure 9, to test for Markov
equivalence of two BNs with different edge topologies.

3.2.5. Lattice of BN General and Specific Graphs


The BN literature on lattices predominately focuses on search algorithms to find the
best BN given a scoring metric. Implicit in these search algorithms is a lattice of candidate
graphs being explored in search of the best model. Chickering [35] and others have shown
the search problem to be NP-hard, with four variables there are 543 possible BNs, with
Entropy 2021, 23, 986 17 of 39

10 variables there are O(10ˆ18) [36]. Because of this, research in this area has focused
less on characterizing exhaustively the lattice of BN graphs, and more on advancing
search heuristics to efficiently traverse the lattice to identify the best BN given a scoring
metric [37–46], and others.
Heckerman [34] first showed that BNs with differing edge topologies can have the
same independence structure and the same probability distribution, herein described as
BN specific graphs. In contrast to heuristics that search all BNs, search heuristics for BN
specific graphs have proven to be more efficient because they reduce the dimensionality of
search space [29,31,32,40,47–50], and others. For four variables, this approach reduces the
search space from 543 BNs to 185 BN specific graphs [31]. These 185 BN specific graphs can
be summarized by 20 BN general graphs all with unique independence structures when
variable labels are removed.
Building from the RA work of Klir [8] and Zwick [14], and the BN work of Pearl [21–23,51],
Verma [28], Heckerman [34], Chickering [29,35,40,52,53], Andersson [31], Rubin [54], and
others, the following procedure was used to generate the four variable BN general and
specific graph lattice of Figure 14 in a way that can be integrated with the RA general
graph lattice. While this procedure is applied in this paper to four variables, it could
in principle be used for any number of variables, although of course as the number of
variables increases the effort required increases exponentially.

3.2.6. BN Neutral System General and Specific Graph Procedure


The procedure to generate the BN neutral system general and specific graph lattice for
any number of variables is as follows:
1. Assign labels arbitrarily to the n solid squares representing variables.
2. Generate all graphs for these n variables by permuting all possible edge connections
and edge orientations. Eliminate graphs with cycles. The result is the set of all labeled
directed acyclic graphs for n variables.
3. For each directed acyclic graph, determine its independence structure using the D-
separation procedure [55] detailed in Appendix A.2. This identifies which of the
independence statements in Table 2 apply to the graph.
4. Collect together all graphs with the same unlabeled independencies. The set of these
DAGs comprise a general graph equivalence class.
5. For each general graph equivalence class, collect together all graphs with the same
labeled independencies into specific graph equivalence classes. List the RA notation
for each of these specific graphs.
6. Select one specific graph equivalence class to represent the general graph, and from
this specific graph equivalence class, select a single edge topology to represent the
general graph. List any additional equivalent general graphs with unique edge
topologies separately, as was done in the insert in Figures 8 and 14.
7. Organize general graphs into levels based upon the number of edges in each general
graph and link hierarchically nested general graphs in the lattice to reflect parent-child
general graphs.
Figure 14 shows the result of following this procedure for four variables. This BN
general and specific graph lattice can be directly compared with the RA general and specific
graph lattice. The RA lattice can also be extended to include the BN lattice. The comparison
and extension will be discussed in Section 4.
Table 3 lists specific graph representatives for each of the general graphs in Figure 14.
These specific graphs, highlighted in bold in Figure 14, assume that nodes are labeled in
the order A, B, C, D from left to right, top to bottom, which is the labeling convention
throughout this paper. The notation for a BN specific graph without a V-structure is
identical to the RA notation. As in RA, the colon represents marginal or conditional
independence among variables. For a BN graph with a V-structure, the notation adds
subscripts to represent the independence relations encoded by the V-structure, which are
unique to BNs and not found in RA. (See the Section 3.2.3. for more details on this notation).
Entropy 2021, 23, 986 18 of 39

Thus, graphs in Table 3 without subscripts are equivalent to an RA graph and graphs
Entropy 2021, 23, x FOR PEER REVIEW
with
19 of 42
subscripts are unique to BN. Equivalence and non-equivalence between RA and BN graphs
will be discussed in Section 4.

BN neutral
Figure 14. Lattice of general and specific BN neutral system
system graphs.
graphs.

3.2.6. BN Neutral System General and Specific Graph Procedure


The procedure to generate the BN neutral system general and specific graph lattice
for any number of variables is as follows:
1. Assign labels arbitrarily to the n solid squares representing variables.
Entropy 2021, 23, 986 19 of 39

Table 3. Probability distribution and independencies of BN specific graph examples.

Specific Graph Example


BN General Graph
RA Notation Probability Distribution Independencies
BN1 ABCD p(B|A)p(A)p(C|AB)p(D|ABC) none
BN2 ACD:BCD p(A|CD)p(C)p(B|CD)p(D|C) (A ⊥ B | C, D)
BN3 ABCDA:B p(C|AB)p(A)p(B)p(D|ABC) (A ⊥ B)
BN4 ABCDAC:BC p(A|C)p(C)p(B|C)p(D|ABC) (A ⊥ B | C)
BN5 BCD:AD p(A|D)p(D)p(B|CD)p(C|D) (A ⊥ B, C | D)
BN6 ABCDBC:A p(B|C)p(C)p(D|ABC)p(A) (A ⊥ B, C)
BN7 BCD:ABDA:B p(C|BD)p(B)p(D|AB)p(A) (A ⊥ B), (A ⊥ C | B, D)
BN8 ACDC:D :BCDC:D p(A|CD)p(C)p(D)p(B|CD) (C ⊥ D), (A ⊥ B | C, D)
BN9 ABD:ABCAC:BC p(A|C)p(C)p(B|C)p(D|AB) (A ⊥ B | C), (C ⊥ D | A, B)
BN10 BCD:A p(B|C)p(C)p(D|BC) (A ⊥ B, C, D)
BN11 AD:BD:CD p(A|D)p(D)p(B|D)p(C|D) (A ⊥ B, C | D), (B ⊥C | D)
BN12 ABCDA:B:C p(D|ABC)p(A)p(B)p(C) (A ⊥ B, C), (B ⊥ C)
BN13 ACDA:C :BD p(B|D)p(D|AC)p(A)p(C) (A ⊥ C), (B ⊥ A, C |D)
BN14 AD:BC:BD p(A|D)p(D)p(B|D)p(C|B) (A ⊥ B | D), (C ⊥ A, D | B)
BN15 ABDA:B :BC p(C|B)p(B)p(D|AB)p(A) (A ⊥ B, C), (C ⊥ D | A, B)
BN16 BD:CD:A p(B|D)p(D)p(C|D)p(A) (B ⊥ C | D), (A ⊥ B, C, D)
EntropyBN17
2021, 23, x FOR PEER REVIEW BCDB:C :A p(D|BC)p(B)p(C)p(A) (B ⊥ C), (A ⊥ B, C, D) 21 of 42
BN18 AD:BC p(C|B)p(B)p(D|A)p(A) ( A, D ⊥ B, C)
BN19 CD:A:B p(D|C)p(C)p(A)p(B) (B ⊥ C, D), (A ⊥ B, C, D)
BN9 ABD:ABCAC:BC p(A|C)p(C)p(B|C)p(D|AB) (A ⊥ B | C), (C ⊥ D | A, B)
BN20 A:B:C:D p(A)p(B)p(C)p(D) (A ⊥ B, C, D), (B ⊥ C, D), (C ⊥ D)
BN10 BCD:A p(B|C)p(C)p(D|BC) (A ⊥ B, C, D)
BN11 AD:BD:CD p(A|D)p(D)p(B|D)p(C|D) (A ⊥ B, C | D), (B ⊥C | D)
BN12 Table ABCD
3 shows A:B:C for each BN general graph from Figure
p(D|ABC)p(A)p(B)p(C) (A ⊥ B,14C),
a specific
(B ⊥ C) graph with its RA
BN13 notation, probability
ACDA:C:BD distribution, and minimal list of(A
p(B|D)p(D|AC)p(A)p(C) independencies
⊥ C), (B ⊥ A, C |D) resulting from the
BN14 D-separation procedure. The
AD:BC:BD probability distribution(A
p(A|D)p(D)p(B|D)p(C|B) is ⊥
obtained
B | D), (Cas⊥follows:
A, D | B)(1) For each
BN15 labeled nodeABDofA:Ba:BC
BN specific graph, list each node’s individual
p(C|B)p(B)p(D|AB)p(A) (A ⊥ B, C), (Cprobability
⊥ D | A, B)expression as
BN16 the probability of the nodep(B|D)p(D)p(C|D)p(A)
BD:CD:A (B ⊥
given its parents, i.e., p(node | parents
C | D), (A ⊥there
) ; if B, C, D)are no parents,
BN17 simply theBCDp(node
B:C:A). (2) Join (B ⊥ C), (A For
the list of probability expressions.
p(D|BC)p(B)p(C)p(A) ⊥ B,example,
C, D) for BN2* in
BN18 Figure 15, AD:BC
the individualp(C|B)p(B)p(D|A)p(A)
probability expressions are( A, p (DA|⊥
C,B,DC)) for A, p( B|C, D ) for B,
BN19 p(C ) for C,CD:A:B
and p( D |C ) for p(D|C)p(C)p(A)p(B)
D. Joining these gives p( A ⊥D
(B|C, C,)D),
p( B(A ⊥D
|C, B,) C,
p(CD)) p( D |C ) . (The
BN20 table omitsA:B:C:D (A ⊥ B, C, D), (B ⊥
the commas for variables that are given in conditional probability(Cterms.)
p(A)p(B)p(C)p(D) C, D), ⊥ D)

p(A|C,D)p(B|C,D)p(C)p(D|C)

Figure 15. Probability distribution for BN2* example.


Figure 15. Probability distribution for BN2* example.
The equivalence or non-equivalence of RA and BN graphs is discussed in detail in
The equivalence
Section 4, below, or
butnon-equivalence of an
Table 3 provides RAadvanced
and BN graphs
look atisthis
discussed in detail
issue. Any in
BN general
Section 4, below, but Table 3 provides an advanced look at this issue. Any BN general
graph with a specific graph example whose RA notation does not include subscripts is
graph with a specific
equivalent graph
to some example
general whosethere
RA graph; RA notation
are 10 of does
thesenot
BNinclude
general subscripts
graphs. Anyis BN
equivalent to some general RA graph; there are 10 of these BN general graphs. Any
general graph with a specific graph example whose notation includes subscripts is not BN
general graph with a specific graph example whose notation includes subscripts is not
equivalent to any general RA graph; there are also 10 of these BN general graphs, which
all have V-structures.

3.3. BN Directed Systems


Entropy 2021, 23, 986 20 of 39

equivalent to any general RA graph; there are also 10 of these BN general graphs, which all
have V-structures.

3.3. BN Directed Systems


The BN discussion so far has focused on BN neutral systems in which an IV-DV
distinction is not made. This section narrows the focus to BN predictive graphs, analogous
to RA directed systems, where the aim is to predict a single DV given the IVs. As in RA,
we define Z as the dependent variable in the BN directed system lattice, replacing variable
D in the neutral system lattice. We designate as the DV in a given BN any node with
the exception of a parent node within a V-structure. That is, we do not consider here the
possibility that a parent node within a V-structure could be designated as a DV; this will be
discussed further in Section 5. As is the case for RA, many graphs in the neutral system
lattice are redundant when the aim is only to predict the DV. The BN directed system lattice
of Figure 16, where only graphs with unique predictions of Z are highlighted, is thus a
subset of the BN neutral system lattice of Figure 14. For each general graph in Figure 16
with a unique prediction, associated specific graphs are listed. Specific graphs that are
bolded correspond to the displayed BN edge orientation and edge connections assuming
labeling of nodes from top left, top right, bottom left, bottom right as A, B, C, Z respectively.
These bolded specific graphs also correspond to the examples in below Table 4. Graphs
not highlighted in Figure 16 are equivalent in their predictions to highlighted graphs.
(Asterisks in this figure have the same meaning they have in BN Figures 8 and 14). For two
graphs with identical predictions, the graph with the least degrees of freedom was selected.
There are eight general graphs and 18 specific graphs in the BN directed system lattice;
this is a significant compression of the BN neutral system lattice that includes 20 general
graphs and 185 specific graphs.

Table 4. BN directed system graphs.

Predictively Equivalent Simpler Specific Graph Example Specific Graph Example


BN General Graph
Graph RA Notation Probability Distribution
BN1 BN12 ABCZ p(Z|ABC)p(C|AB)p(B|A)p(A)
BN7 ABZ:BCZ p(C|BZ)p(Z|AB)p(A|B)p(B)
BN2
BN17 ABC:BCZ p(Z|BC)p(B|CA)p(A|C)p(C)
BN3 BN12 ABCZA:B p(Z|ABC)p(C|AB)p(A)p(B)
BN4 BN12 ABCZAC:BC p(Z|ABC)p(A|C)p(B|C)p(C)
BN13 ACZ:BZ p(Z|AC)p(B|Z)p(C|A)p(A)
BN5
BN19 ABC:CZ p(Z|C)p(B|CA)p(C|A)p(A)
BN6 BN12 ABCZBC:A p(Z|ABC)p(B|C)p(C)p(A)
BCZ:ABZA:B p(C|BZ)p(Z|AB)p(B)p(A)
BN7
BN17 BCZ:ABCA:C p(Z|BC)p(B|CA)p(C)p(A)
BN8 BN17 ABCB:C :BCZB:C p(Z|BC)p(A|BC)p(B)p(C)
BN9 BN17 BCZ:ABCAB:AC p(Z|BC)p(B|A)p(C|A)p(A)
BN10 BN17 BCZ:A p(Z|BC)p(B|C)p(C)p(A)
AZ:BZ:CZ p(A|Z)p(B|Z)p(C|Z)p(Z)
BN11
BN19 AC:BC:CZ p(Z|C)p(B|C)p(C|A)p(A)
BN12 ABCZA:B:C p(Z|ABC)p(A)p(B)p(C)
BN13 ACZA:C :BZ p(Z|AC)p(B|Z)p(A)p(C)
BN19
ABCA:B :CZ p(Z|C)p(C|AB)p(A)p(B)
BN16 AB:BZ:CZ p(Z|B)p(C|Z)p(B|A)p(A)
BN14
BN19 AB:BC:CZ p(Z|C)p(B|A)p(A)p(C|B)
BN17 BCZB:C :AB p(Z|BC)p(A|B)p(B)p(C)
BN15
BN19 ABCA:C :CZ p(Z|C)p(B|CA)p(A)p(C)
BN16 BZ:CZ:A p(B|Z)p(C|Z)p(Z)p(A)
BN19
BC:CZ:A p(Z|C)p(C|B)p(B)
Entropy 2021, 23, 986 21 of 39

Table 4. Cont.

Predictively Equivalent Specific Graph Example Specific Graph Example


BN General Graph
Simpler Graph RA Notation Probability Distribution
BN17 BCZB:C :A p(Z|BC)p(B)p(C)p(A)
BN18 BN19 AB:CZ p(Z|C)p(B|A)p(A)p(C)
BN19 CZ:A:B p(Z|C)p(C)p(A)p(B)
Entropy 2021, 23, x FOR PEER REVIEW 23 of 42
BN20 A:B:C:Z p(Z)p(A)p(B)p(C)

Figure 16. BN directed system lattice.


Figure 16. BN directed system lattice.
Table 4 lists all BN directed system general graphs. When BN graphs are greyed in
column 1 it means the graph is equivalent in terms of prediction to a simpler (fewer de-
grees of freedom) general graph. Column 2 identifies which simpler graph it is equivalent
to. General graphs with a blank row in column 2 have no simpler equivalently predicting
Entropy 2021, 23, 986 22 of 39

Entropy 2021, 23, x FOR PEER REVIEW 24 of 42


Table 4 lists all BN directed system general graphs. When BN graphs are greyed in
column 1 it means the graph is equivalent in terms of prediction to a simpler (fewer degrees
of freedom) general graph. Column 2 identifies which simpler graph it is equivalent to.
graph,
Generaland are included
graphs in therow
with a blank directed
in column system lattice
2 have noofsimpler
Figure equivalently
16. Column 3predicting
provides
specific
graph, and graphareexamples
includedof in these generalsystem
the directed graphs lattice
and columnof Figure 4 shows the specific
16. Column graph
3 provides
probability distributions. Within column 4, only the expressions
specific graph examples of these general graphs and column 4 shows the specific graph that are used to predict
the dependent
probability variable are
distributions. highlighted
Within columnin4,black. only the Allexpressions
other non-predictive
that are usedrelations are
to predict
greyed. For example, BN1, BN3, BN4, and BN6 and BN12
the dependent variable are highlighted in black. All other non-predictive relations are all predict Z in the same way,
i.e.,
greyed.𝑝(𝑍|𝐴𝐵𝐶), thus they
For example, areBN3,
BN1, all equivalent
BN4, andinBN6 terms and ofBN12
prediction. However,
all predict BN12
Z in the hasway,
same the
least
i.e., pdegrees
( Z | ABCof
) , freedom
thus theyandare is
alltherefore
equivalent selected
in terms to of
represent
prediction. all five of these
However, equivalent
BN12 has the
general graphs.
least degrees of freedom and is therefore selected to represent all five of these equivalent
general graphs.
4. Joint RA-BN Neutral System Lattice
4. Joint
4.1. JointRA-BN
RA-BN Neutral SystemLattice
Neutral System LatticeIntroduction
4.1. Joint RA-BN Neutral System Lattice Introduction
This section integrates the RA and BN neutral system general graph lattices using the
This section
four variable Rho integrates
lattice [7]. the RA and the
Combining BN Rho,neutral RAsystem
and BNgeneral lattice graph
createslattices
a largerusing
and
the four variable Rho lattice [7]. Combining the Rho, RA and
more descriptive lattice than any previously identified in the literature. The lattice identi- BN lattice creates a larger
andindependence
fies more descriptive lattice than
structures unique anytopreviously
RA or to BNs, identified in the literature.
and independence The lattice
structures that
identifies independence structures unique to RA or to BNs, and independence structures
are equivalent across RA and BN. Equivalence is in terms of independence structure as
that are equivalent across RA and BN. Equivalence is in terms of independence structure as
described separately for RA in the Section 2, and BN in the Section 3. Where two or more
described separately for RA in the Section 2, and BN in the Section 3. Where two or more
graphs, RA or BN, have the same general independence structure regardless of variable
graphs, RA or BN, have the same general independence structure regardless of variable
labels, they are equivalent. General independence structure is represented with independ-
labels, they are equivalent. General independence structure is represented with indepen-
ence statements without labels. For example, (. ⊥ . . | … ), one variable is independent of
dence statements without labels. For example, (. ⊥ .. | . . .) , one variable is independent of
another, given a third. Consider, for example, RA general graph G15 and BN general
another, given a third. Consider, for example, RA general graph G15 and BN general graph
graph BN11 have the same general independence structure (. ⊥ . . , … | … ), (. . ⊥. . . | … ),
BN11 have the same general independence structure (. ⊥ .., . . . | . . .), (.. ⊥ . . . | . . .) , thus
thus they are equivalent. Two specific graphs are equivalent if they have the same inde-
they are equivalent. Two specific graphs are equivalent if they have the same independence
pendence structure given variable labels. For example, using RA general graph G15 and
structure given variable labels. For example, using RA general graph G15 and BN general
BN general graph BN11 again, Figure 17 shows these general graphs with variable labels
graph BN11 again, Figure 17 shows these general graphs with variable labels added mak-
added
ing them making them
specific specific
graphs. graphs.
Given Given
these labels, these theylabels,
havethey have equivalent
equivalent general andgeneral and
specific
independence structure, (. ⊥ .., . . . | . . . .), (.. ⊥ . . . | . . .) and ( A ⊥ B, C | D ), ( B ⊥ C | D⊥)
specific independence structure, (. ⊥ . . , … | … . ), (. . ⊥. . . | … ) and (𝐴 ⊥ 𝐵, 𝐶 | 𝐷), (𝐵
𝐶 | 𝐷) respectively.
respectively.

Figure 17. G15 and BN11* specific graph example.


Figure 17. G15 and BN11* specific graph example.
4.2. RA-BN Rho Neutral System Graphs
4.2. RA-BN
The Rho Rho(ρ)
Neutral
latticeSystem Graphs
of Figure 18 (adapted from Klir [7] (p. 237)) is a simplification
of theThe
RARho (ρ) lattice
lattice of Figure
of general graphs18 and
(adapted from
is used Klir
here to [7] (p. 237))
integrate theisRAa simplification
neutral system of
the RAwith
lattice lattice ofBN
the general graphs
neutral andlattice.
system is usedThehereRho
to integrate
lattice is the RA neutral
an even system lattice
more general
with
than the
theBN
RAneutral
generalsystem
graph lattice.
lattice The
and Rhocan lattice is anRA
map both even
andmore
BN general
generallattice
graphs than the
to one
RA general
of its elevengraph lattice A
structures. and candot
solid map both RA and
represents BN general
a variable; a linegraphs
connects to one of its eleven
variables in the
Rho latticeAif solid
structures. thesedot
tworepresents
variables aare directlya connected
variable; line connectsby variables
any box (relation)
in the Rhoinlattice
the RA if
general graph lattice. Arrows from one Rho graph to another represent
these two variables are directly connected by any box (relation) in the RA general graph hierarchy, i.e., the
generation
lattice. of afrom
Arrows childone
graph
Rhofrom
grapha parent graph.
to another ρ1 represents
represent maximal
hierarchy, i.e., theconnectedness,
generation of
aorchild
dependence,
graph from between
a parentvariables,
graph. ρ1and ρ11 represents
represents maximalindependence
connectedness,among all variables.
or dependence,
Graphs in-between
between ρ1 and
variables, and ρ11ρ11 represent
represents a mix of dependence
independence among all and independence
variables. Graphsamong
in-be-
variables.
tween Eachρ11
ρ1 and RArepresent
or BN general
a mixorofspecific graph and
dependence corresponds to one,among
independence and only one, of
variables.
the eleven
Each RA orRhoBN graphs.
general or specific graph corresponds to one, and only one, of the eleven
Rho graphs.
Entropy 2021, 23, 986 23 of 39
Entropy 2021, 23, x FOR PEER REVIEW 25 of 42

Figure18.
Figure 18.Lattice
Latticeofoffour-variable
four-variableRho
Rhographs.
graphs.

4.3.
4.3.Rho
Rhoand
andEquivalent
EquivalentRA RAand
andBN
BNGeneral
GeneralGraphs
Graphs
Out
Outofof2020
RARA neutral system
neutral general
system graphs
general and 20and
graphs BN 20neutral system general
BN neutral system graphs,
general
there are there
graphs, 10 RA aregeneral graphs, graphs,
10 RA general comprising all of the
comprising all graphs with no
of the graphs loops
with in theinRA
no loops the
lattice that that
RA lattice are equivalent to BN
are equivalent general
to BN generalgraphs.
graphs.Each of of
Each these RA-BN
these RA-BN equivalent
equivalentpairs
pairs
corresponds
correspondstotoone oneofofthe
the1111Rho
Rhographs
graphsfromfromFigure
Figure18,
18,with
withthe
theexception
exceptionofofρ4. ρ4has
ρ4.ρ4 has
corresponding RA and BN general graphs, but these do not have equivalent
corresponding RA and BN general graphs, but these do not have equivalent independence independence
structures,
structures,and
andarearediscussed
discussedininthe
thefollowing
followingSection
Section4.3.
4.3.
ρ1 reflects maximal connectedness among allfour
ρ1 reflects maximal connectedness among all fourvariables.
variables.For
Forboth
boththe
theRA
RAgeneral
general
graph
graphG1 G1and
andthetheBN
BNgeneral
generalgraph
graphBN1BN1fromfromFigures
Figures11andand88respectively,
respectively,there
thereare
arenono
independencies among the variables and thus the graphs are equivalent. Both
independencies among the variables and thus the graphs are equivalent. Both graphs have graphs have
only
onlyone
onespecific
specificgraph,
graph,ABCD.
ABCD.This
Thisisissummarized
summarizedininFigure
Figure19.
19.
Entropy 2021, 23, x FOR PEER REVIEW 26 of 42
Entropy 2021, 23,
Entropy 2021, 23, 986
x FOR PEER REVIEW 24 of
26 of 39
42

Figure 19. Rho1, G1 and BN1 specific graph.


Figure 19.
Figure Rho1, G1
19. Rho1, G1 and
and BN1
BN1 specific
specific graph.
graph.
ρ2
ρ2 corresponds
corresponds to to RA
RA general
general graph
graph G7 G7 andand BNBN general
general graph
graph BN2*,
BN2*, as
as shown
shown in
in
Figure ρ2 corresponds
20. ItIt is clear to
how RA general
BN2* graph
corresponds G7 and
to Rho BN general
graph ρ2 graph
because BN2*, as
visually shown
they in
are
Figure
Figure 20.
20. It is
is clear
clear how
how BN2*
BN2* corresponds
corresponds to
to Rho
Rho graph
graph ρ2
ρ2 because
because visually
visually they
they are
are
represented
represented in in the
the same
same wayway with
with the
the exception
exception that that the
the Rho
Rho graph
graphhas hasundirected
undirectededges.
edges.
represented
There are two in the same
additional way
BN with
generalthe exception
graphs (BN3 that
and the Rho
BN4*) graph
that has undirected
correspond to ρ2;edges.
how-
There
Theretheyare two
are have additional
two additional BN general
BN general graphs
graphs (BN3
(BN3 andandBN4*)
BN4*)that correspond
that correspond to ρ2; however
to ρ2; how-
ever no equivalent RA general graph, so they are
they have no equivalent RA general graph, so they are discussed in the next section whichdiscussed in the next section
ever
which they have
concerns no equivalent
non-equivalent RA general graph, so they are discussed in the next section
concerns non-equivalent RA andRA BN and generalBN graphs.
general ρ2, graphs.
G7, andρ2,BN2*
G7, and BN2* two
represent represent
three-
which
two concerns
three-variable non-equivalent
relations RA and
with conditional BN general graphs.
independence ρ2, G7, and BN2* represent
variable relations with conditional independence between between two variables,
two variables, with
with general
two three-variable
general independence relations with(.conditional
structure ⊥ . . | … , … . independence
). Assigning between
labels to two variables,
variables makes it with
eas-
independence structure (. ⊥ .. | . . . , . . . .) . Assigning labels to variables makes it easier to
general
ier to independence
interpret the RA structure
association(. ⊥ .
with . | …
ρ2., … . ).
Figure Assigning
20 shows labels
an to variables
example with makes it
variable eas-
la-
interpret the RA association with ρ2. Figure 20 shows an example with variable labels (one
ier to(one
bels interpret the RA association
of sixpermutations
possible permutations withofρ2. Figurelabels)
variable 20 shows an example
assigned to RA with variable la-
of six possible of variable labels) assigned to RA graph G7 graph
whichG7 which
results in
bels (one
results of
in RA six possible
specific permutations
graph ACD:BCD, of variable
inindependent labels) assigned
which A is independent to RA graph
of BD,given G7 which
RA specific graph ACD:BCD, in which A is of B given C and ( A ⊥CB|and
C, DD, ).
results
(𝐴 ⊥ 𝐵|𝐶, in RA𝐷). specific
Assigning graph ACD:BCD, in which A
in is independent of the
B given Cspecific
and D,
Assigning labels to the BNlabels
graphtointheFigureBN graph
20 yields Figure
the same20specific
yields graph. same
Other label
(𝐴 ⊥ 𝐵|𝐶,
graph. Other 𝐷).labelAssigning labels to the fiveBN other
graphequivalent
in Figure RA 20 yields BNthe same graphs:
specific
permutations yield permutations yield
five other equivalent RA and BN specific graphs: andABC:ABD, specific
ABC:ACD,
graph.
ABC:ABD, Other label permutations yield five other equivalent RA and BN specific graphs:
ABC:BCD, ABC:ACD,ABD:ACD, ABC:BCD, ABD:BCD. ABD:ACD, ABD:BCD.
ABC:ABD, ABC:ACD, ABC:BCD, ABD:ACD, ABD:BCD.

Figure 20.
Figure Rho2, G7, and BN2* example.
20. Rho2,
Figure 20. Rho2, G7, and BN2* example.
represents RA
ρ3 represents
ρ3 RA graph
graph G10 G10 and
and BNBN graph
graph BN5*
BN5* which
which have
have thethe same
same independence
independence
ρ3
structure, (. ⊥
represents
.., . . RA
. | .graph
. . .). G10
Figureand
21 BN graph
shows an BN5* which
example of have
one
structure, (. ⊥ .., … |….). Figure 21 shows an example of one of eight RA G10 of the same
eight RA independence
G10and
and BN5*
BN5*
structure,
specific (. ⊥
graphs, .., …
BCD:AD, |….). Figure
with 21 shows
independencies an ( A ⊥
example
B, of
C |one
D ) . of
Theeight
full
specific graphs, BCD:AD, with independencies (𝐴 ⊥ 𝐵, 𝐶 |𝐷). The full list of eight spe- RA
list G10
of and
eight BN5*
specific
RA G10
specific
cific and
G10BN5*
RAgraphs, and specific
BCD:AD, graphs
with
BN5* specific are: ABC:AD,
independencies
graphs (𝐴ABC:BD,
⊥ 𝐵, 𝐶 ABC:BD,
are: ABC:AD, |𝐷).
ABC:CD,
The fullABD:AC, ABD:BC,
list of eight
ABC:CD, spe-
ABD:AC,
ABD:DC,
cific RA ACD:AB,
G10 and ACD:CB,
BN5* ACD:DB,
specific BCD:BA,
graphs are: BCD:CA,
ABC:AD,
ABD:BC, ABD:DC, ACD:AB, ACD:CB, ACD:DB, BCD:BA, BCD:CA, and BCD:DA. G10 and
ABC:BD, BCD:DA.
ABC:CD, G10 has been
ABD:AC,
previouslyABD:DC,
ABD:BC, characterized as naïve BN-like.
has been previously ACD:AB,characterized ACD:CB, ACD:DB,
as naïve BN-like.BCD:BA, BCD:CA, and BCD:DA. G10
ρ4 ispreviously
has been discussed later in the section
characterized on non-equivalent
as naïve BN-like. RA and BN general graphs.
ρ5 represents RA graph G13 and BN graph BN10 which have the same independence
structure, (. ⊥ .., . . . , . . . .) in that they have no independencies in the triadic relation and the
fourth variable is independent of all three variables in the triadic relation. Figure 22 shows
an example of one of four RA G13 and BN10 specific graphs, BCD:A, with independencies
( A ⊥ B, C, D ). The full list of RA G13 and BN10 specific graphs are: ABC:D, ABD:C,
ACD:B, and BCD:A.
Entropy
Entropy2021,
2021,23,
23,x 986
FOR PEER REVIEW 27 25
ofof4239

Figure
Figure 21.
21. Rho3,
Rho3, G10,
G10, and
and BN5*
BN5* specific
specific graph
graph example.
example.

ρ4
ρ4 is
is discussed
discussed laterlater in
in the
the section
section onon non-equivalent
non-equivalent RA RA and
and BNBN general
general graphs.
graphs.
ρ5
ρ5 represents RA graph G13 and BN graph BN10 which have the same independence
represents RA graph G13 and BN graph BN10 which have the same independence
structure, (.(. ⊥
structure, ⊥ ..,
.., …,
…, ….)
….) in
in that
that they
they have
have no
no independencies
independencies in in the
the triadic
triadic relation
relation and
and the
the
fourth variable is independent of all three variables in the triadic relation. Figure
fourth variable is independent of all three variables in the triadic relation. Figure 22 shows 22 shows
an
an example
example of of one
one of
of four
four RA
RA G13
G13 and
and BN10
BN10 specific
specific graphs,
graphs, BCD:A,
BCD:A, with
with independencies
independencies
(𝐴
(𝐴 ⊥⊥ 𝐵,
𝐵, 𝐶,
𝐶, 𝐷).
𝐷). The
The full
full list
list of
of RA
RA G13
G13 and
and BN10
BN10 specific
specific graphs
graphs are:
are: ABC:D,
ABC:D, ABD:C,
ABD:C,
Figure
Figure21.
ACD:B, 21.Rho3,
Rho3,
and G10,
BCD:A.G10,and
andBN5*
BN5*specific
specificgraph
graphexample.
example.
ACD:B, and BCD:A.
ρ4 is discussed later in the section on non-equivalent RA and BN general graphs.
ρ5 represents RA graph G13 and BN graph BN10 which have the same independence
structure, (. ⊥ .., …, ….) in that they have no independencies in the triadic relation and the
fourth variable is independent of all three variables in the triadic relation. Figure 22 shows
an example of one of four RA G13 and BN10 specific graphs, BCD:A, with independencies
(𝐴 ⊥ 𝐵, 𝐶, 𝐷). The full list of RA G13 and BN10 specific graphs are: ABC:D, ABD:C,
ACD:B, and BCD:A.

Figure
Figure 22.
Figure 22. Rho5,
22. Rho5,G13,
Rho5, G13, and
G13, and BN10*
and BN10* specific
BN10* specific graph
specific graph example.
graph example.
example.

ρ6
ρ6 represents
ρ6 representsRA
represents RAgeneral
RA generalgraph
general graphG15
graph G15and
G15 andand BN
BN BN graph
graph
graph BN11*
BN11* BN11* whichwhich
which have
have the
have same
thethesame in-
samein-
dependence
independence structure,
structure, (. ⊥ (...,⊥… |
.., ….),
. . . | ( .... .⊥ …
.), ( |.. ⊥
dependence structure, (. ⊥ .., … | ….), ( .. ⊥ … | ….). There are three dyadic relations in
….). . There
. . | . .are
. .). three
There dyadic
are relations
three dyadic in
these
these graphs
relations
graphs with
in these one
one variable
withgraphs with one
variable present
presentvariablein
in all three
three dyadic
present
all in all three
dyadic relations
relationsdyadic and
and the
the other
relationsotherandthree
the
three
other three
variables
variables variables
present
present in present
in only
only one in
one of only
of threeone
three dyadic
dyadic of three dyadic relations.
relations.
relations.
This
This graph is
is described
described in inthethe literature
This graph is described in the literature (Zhang 2004)
graph literature (Zhang
(Zhang 2004)
2004) asas
as a naïve
aa naïve
naïve BN,
BN,BN, simple
simple
simple Bayes,
Bayes,
Bayes, or
or
or independence
independence Bayes,Bayes, because
because of of
its its
simple simple dyadic dyadic
relations relations
among among variables.
variables. What What
is also
independence
Figure 22.RA Bayes,
Rho5,isG13, because of its simple dyadic relations among variables. What is also
is also
clear clear RAand BN10*graph
general specific G15graph example.
represents a naïve BNofbecause of its equivalent
clear is
is RA general
general graph
graph G15G15 represents
represents aa naïve naïve BN BN because
because of its its equivalent
equivalent independ-
independ-
independence
ence structure. structure.
Figure 23 Figure
shows 23
an shows
example an of example
one of of
RA one
G15 of and
RA G15 BN11*andspecific
BN11* graphs,
specific
enceρ6 structure.
representsFigure
RA with23 shows
general an example
graph G15 and of one of RA G15 and BN11* specific graphs,
graphs,
AD:BD:CD, AD:BD:CD,
with independencies (𝐴
independencies ⊥ 𝐵, 𝐶 (|ABN
𝐷),⊥ (𝐵graph
B, ⊥
C |𝐶DBN11*
| )𝐷),
, ( Band which
⊥ | Dhave
) , and
Cconditional the same
conditional
probability
in-
AD:BD:CD,structure,
dependence with independencies
(. ⊥ .., … | (𝐴 ⊥( ..𝐵,⊥𝐶…| 𝐷),
….), | (𝐵 ⊥
….). 𝐶 | 𝐷),
There are and
threeconditional
dyadic probability
relations in
𝑝(𝐴|𝐷)
probability distribution
distribution p( A| D ) p( B| D ) p(Cfull
𝑝(𝐵|𝐷)𝑝(𝐶|𝐷)𝑝(𝐷).The | D )list
p( Dof) .The full RA list of specific RA G15 and
distribution
these graphs 𝑝(𝐴|𝐷)
with one𝑝(𝐵|𝐷)𝑝(𝐶|𝐷)𝑝(𝐷).The
variable present in all full
three of specific
listdyadicspecific RA G15
relations G15 and
and
and the
BN11*
BN11*
other
specific
specific
three
BN11* specific
graphs graphs are: AB:AC:AD, AB:BC:BD, AC:BC:CD, and AD:BD:CD.
graphs are:
variables are: AB:AC:AD,
AB:AC:AD,
present in only one
AB:BC:BD,
AB:BC:BD, AC:BC:CD,
of threeAC:BC:CD,dyadic relations.
and
and AD:BD:CD.
AD:BD:CD.
This graph is described in the literature (Zhang 2004) as a naïve BN, simple Bayes, or
independence Bayes, because of its simple dyadic relations among variables. What is also
clear is RA general graph G15 represents a naïve BN because of its equivalent independ-
ence structure. Figure 23 shows an example of one of RA G15 and BN11* specific graphs,
AD:BD:CD, with independencies (𝐴 ⊥ 𝐵, 𝐶 | 𝐷), (𝐵 ⊥ 𝐶 | 𝐷), and conditional probability
distribution 𝑝(𝐴|𝐷) 𝑝(𝐵|𝐷)𝑝(𝐶|𝐷)𝑝(𝐷).The full list of specific RA G15 and BN11* specific
graphs are: AB:AC:AD, AB:BC:BD, AC:BC:CD, and AD:BD:CD.

Figure 23. Rho 6, G15 and BN11* specific graph example.

ρ7 represents RA graph G16 and BN graph BN14* which have the same independence
structure, (. ⊥ .. | . . . .), ( . . . ⊥ ., . . . . | ..). Figure 24 shows an example of one of twelve RA
G16 and BN14* specific graphs, AD:BC:BD, with independencies ( A ⊥ B | D ), (C ⊥ A, D | B)
and conditional probability distribution p( A| D ) p( B| D ) p(C | B) p( D ) . The full list of spe-
cific RA G16 and BN14* specific graphs are: AB:AC:BD, AB:AC:CD, AB:AD:BC, AB:AD:CD,
Figure 23. Rho 6, G15 and BN11* specific graph example.
Figure 23. Rho 6, G15 and BN11* specific graph example.
ρ7 represents RA graph G16 and BN graph BN14* which have the same independ-
ρ7 represents RA graph G16 and BN graph BN14* which have the same independ-
ence structure, (. ⊥ .. | ….), (… ⊥ ., …. | ..). Figure 24 shows an example of one of twelve
ence structure, (. ⊥ .. | ….), (… ⊥ ., …. | ..). Figure 24 shows an example of one of twelve
Entropy 2021, 23, 986 RA G16 and BN14* specific graphs, AD:BC:BD, with independencies (𝐴 ⊥ 𝐵 | 𝐷), (𝐶 39 ⊥
RA G16 and BN14* specific graphs, AD:BC:BD, with independencies (𝐴 ⊥ 𝐵 | 𝐷),26(𝐶of ⊥
𝐴, 𝐷 | 𝐵) and conditional probability distribution 𝑝(𝐴|𝐷)𝑝(𝐵|𝐷)𝑝(𝐶|𝐵)𝑝(𝐷). The full list
𝐴, 𝐷 | 𝐵) and conditional probability distribution 𝑝(𝐴|𝐷)𝑝(𝐵|𝐷)𝑝(𝐶|𝐵)𝑝(𝐷). The full list
of specific RA G16 and BN14* specific graphs are: AB:AC:BD, AB:AC:CD, AB:AD:BC,
of specific RA G16 and BN14* specific graphs are: AB:AC:BD, AB:AC:CD, AB:AD:BC,
AB:AD:CD, AB:BC:CD, AB:BD:CD, AC:AD:BC, AC:AD:BD, AC:BC:BD, AC:BD:CD,
AB:AD:CD,
AB:BC:CD, AB:BC:CD, AC:AD:BC,
AB:BD:CD,AC:AD:BD,
AC:AD:BC,AC:BC:BD,
AC:AD:BD, AC:BC:BD, AC:BD:CD,
AD:BC:BD, AB:BD:CD,
and AD:BC:CD. AC:BD:CD, AD:BC:BD, and
AD:BC:BD,
AD:BC:CD. and AD:BC:CD.

Figure 24. Rho7, G16 and BN14* specific graph example.


Figure24.
Figure Rho7,G16
24.Rho7, G16and
andBN14*
BN14*specific
specificgraph
graphexample.
example.

ρ8 represents
ρ8 representsRA RA general
RA general graph
graph G17
G17 andand BN BN general
general graphgraph BN16* BN16* which
which have
have the
the
ρ8 represents general graph G17 and BN general graph BN16* which have the
same independence
sameindependence structure, (.. ⊥
structure,(..(..⊥ ⊥
independencestructure, … | ….), (. ⊥ .., …, ….).
⊥ ….). There are two dyadic relations
same … .|. .….),
| . (.
. . ⊥.),..,(.…, .., . . There
. , . . . .).
areThere are two
two dyadic dyadic
relations
in these
relations graphs
in these with one
graphs variable present
with onepresent in both
variableinpresent dyadic
in both relations, and
dyadic relations, the fourth
and thevariable
fourth
in these graphs with one variable both dyadic relations, and the fourth variable
not present
variable not in either in
present dyadic
either relation,
dyadic and thus and
relation, independent
thus of the three
independent of other
the variables.
three other
not present in either dyadic relation, and thus independent of the three other variables.
This graphThis
variables. is also representative
graph is also of a naïve of
representative BN. a Figure
naïve BN.25 Figure
shows 25 an shows
example an of one of
example
This graph is also representative of a naïve BN. Figure 25 shows an example of one of
twelve
of one of RA G17 RA and BN16* specific graphs, BD:CD:A, with independencies (𝐵 ⊥
twelve RAtwelveG17 and G17 BN16*andspecific
BN16* specific
graphs, graphs, BD:CD:A, BD:CD:A, with independencies
with independencies (𝐵 ⊥
𝐶 | 𝐷), (𝐴 ⊥ 𝐵, 𝐶, 𝐷), and conditional probability distribution p(B|D) p(C|D)p(D). The
𝐶( B| ⊥
𝐷),C(𝐴 ( A𝐶,⊥𝐷),
| D⊥), 𝐵, B, and
C, Dconditional
) , and conditional
probabilityprobability distribution
distribution p(B|D) p(B|D) p(C|D)p(D).
p(C|D)p(D). The
full full
The list of
listspecific RA G17
of specific and BN16* specific graphs are: AB:AC:D, AB:BC:D, AC:BC:D,
full list of specific RA RA
G17G17andand BN16*
BN16* specific
specific graphs
graphs are:AB:AC:D,
are: AB:AC:D,AB:BC:D,
AB:BC:D,AC:BC:D,
AC:BC:D,
AB:AD:C, AB:BD:C,
AB:AD:C, AB:BD:C, AD:BD:C,
AD:BD:C, AC:AD:B,
AC:AD:B, AC:CD:B,
AC:CD:B, AD:CD:B, AD:CD:B, BC:BD:A, BC:BD:A, BC:CD:A,
BC:CD:A, andand
AB:AD:C, AB:BD:C, AD:BD:C, AC:AD:B, AC:CD:B, AD:CD:B, BC:BD:A, BC:CD:A, and
BD:CD:A.
BD:CD:A.
BD:CD:A.

Figure 25. Rho 8, G17 and BN16* specific graph example.


Figure 25. Rho 8, G17 and BN16* specific graph example.
ρ9 represents RA
ρ9 represents RA general
general graph
graph G18 G18 andand BN BN general
general graph
graph BN18
BN18 which
which have
have the
the
same ρ9independence
represents RAstructure,
general graph G18 ⊥ and BN general graph BN18relations
which have the
same independence structure, (. , …. ⊥ .., …). There are two dyadic relations in
(. , . . . . .., . . . ). There are two dyadic in these
these
same
graphs independence
with two structure,
two variables
variables (. , ….in⊥ one
included .., …). There
dyadic are two
relation anddyadic
the relations
other two in these
included
graphs with included in one dyadic relation and the other two included in
graphs
in with two variables included in one dyadic relation and the other two included in
thethe other.
other. Figure
Figure 26 shows
26 shows an example
an example of oneofofone threeof RA
three
G18 RAandG18 andspecific
BN18 BN18 specific
graphs,
the other. Figurewith 26 shows an example( A, of one⊥ofB,three RA conditional
G18 and BN18 specific distribu-
graphs,
graphs,
AD:BC, AD:BC, independencies
with independencies (𝐴, 𝐷 ⊥ 𝐵,D𝐶) , and C ),conditional
and probability
probability distribution
AD:BC,
tion with independencies
p(C | B) p( B) p( D | A) pThe (𝐴,
( A) full 𝐷
. Thelist ⊥
full 𝐵, 𝐶) , and conditional probability distribution
𝑝(𝐶|𝐵)𝑝(𝐵)𝑝(𝐷|𝐴)𝑝(𝐴). oflist of specific
specific RA G18 RA and
G18 BN18
and BN18 specific
specific graphsgraphs
are:
𝑝(𝐶|𝐵)𝑝(𝐵)𝑝(𝐷|𝐴)𝑝(𝐴).
are: AB:CD, AC:BD, and The full list of specific RA G18 and BN18 specific graphs are:
AD:BC.
AB:CD, AC:BD, and AD:BC.
AB:CD, ρ10AC:BD, and AD:BC.
represents RA graph G19 and BN graph BN19 have the same independence
structure, (.. ⊥ . . . , . . . .), (. ⊥ .., . . . , . . . .). There is one dyadic relation and two variables
independent of all other variables. Figure 27 shows an example of one of six RA G19
and BN19 specific graphs, CD:A:B, with independencies ( B ⊥ C, D ), ( A ⊥ B, C, D ) and
conditional probability distribution p( D |C ) p(C ) p( A) p( B) . The full list of specific RA G19
and BN19 specific graphs are: AB:C:D, AC:B:D, AD:B:C, BC:A:D, BD:A:C, and CD:A:B.
ρ11 represents RA graph G20 and BN graph BN20 which have the same independence
structure (. ⊥.., . . . , . . . .), (..⊥ . . . , . . . .), ( . . . ⊥....) in which all variables are independent
of one another, ( A ⊥ B, C, D ), ( B ⊥ C, D ), (C ⊥ D ). Figure 28 shows the only specific
graph for RA G20 and BN20, A:B:C:D.
Entropy 2021, 23, 986
Entropy x FOR PEER REVIEW 29 of
27 42
of 39

Figure 26. Rho 9, G18 and BN18* specific graph example.


Figure 26. Rho 9, G18 and BN18* specific graph example.
ρ10 represents RA graph G19 and BN graph BN19 have the same independence
structure, (.. ⊥ …, ….),
ρ10 represents ⊥ .., …,G19
RA(. graph ….).and
There BNis graph BN19relation
one dyadic have the
andsame independence
two variables inde-
structure,of(..all⊥ other
pendent …, ….), (. ⊥ .., …,
variables. Figure 27 shows
….). There an example
is one of one of
dyadic relation andsixtwo
RAvariables
G19 and BN19
inde-
pendentgraphs,
specific of all other variables.
CD:A:B, with Figure 27 shows an
independencies (𝐵 example
⊥ 𝐶, 𝐷), of
(𝐴one
⊥ 𝐵,
of 𝐶,
six𝐷)
RAand
G19conditional
and BN19
specific graphs,
probability distribution with independencies (𝐵
CD:A:B, 𝑝(𝐷|𝐶)𝑝(𝐶)𝑝(𝐴)𝑝(𝐵). The⊥full
𝐶, 𝐷),
list (𝐴 ⊥ 𝐵, 𝐶, 𝐷)
of specific RAand
G19conditional
and BN19
specific graphs
probability are: AB:C:D,
distribution 𝑝(𝐷|𝐶)𝑝(𝐶)𝑝(𝐴)𝑝(𝐵).
AC:B:D, AD:B:C, BC:A:D, The fullBD:A:C, and CD:A:B.
list of specific RA G19 and BN19
specific
ρ11graphs
represents are: AB:C:D,
RA graph AC:B:D,
G20 and AD:B:C,
BN graphBC:A:D,
BN20BD:A:C, and CD:A:B.
which have the same independ-
ence ρ11
structure
represents(. ⊥.., RA
…, ….),
graph (..⊥G20
…, and
….), BN(…⊥....)
graphin BN20
whichwhich
all variables
have thearesame
independent
independ-of
ence structure(𝐴
one another, (. ⊥..,
⊥ 𝐵,…,𝐶,….),
𝐷), (𝐵(..⊥⊥…,𝐶,….),
𝐷), (𝐶 ⊥ 𝐷).inFigure
(…⊥....) which28 allshows
variables are independent
the only of
specific graph
oneRA
for another,
G20 and (𝐴 BN20,
⊥ 𝐵, 𝐶,A:B:C:D.
𝐷), (𝐵 ⊥ 𝐶, 𝐷), (𝐶 ⊥ 𝐷). Figure 28 shows the only specific graph
for RA
Figure G20
26. Rhoand 9,BN20,
G18 A:B:C:D.
and
Figure 26. Rho 9, G18 and BN18* BN18*specific
specificgraph
graphexample.
example.

ρ10 represents RA graph G19 and BN graph BN19 have the same independence
structure, (.. ⊥ …, ….), (. ⊥ .., …, ….). There is one dyadic relation and two variables inde-
pendent of all other variables. Figure 27 shows an example of one of six RA G19 and BN19
specific graphs, CD:A:B, with independencies (𝐵 ⊥ 𝐶, 𝐷), (𝐴 ⊥ 𝐵, 𝐶, 𝐷) and conditional
probability distribution 𝑝(𝐷|𝐶)𝑝(𝐶)𝑝(𝐴)𝑝(𝐵). The full list of specific RA G19 and BN19
specific graphs are: AB:C:D, AC:B:D, AD:B:C, BC:A:D, BD:A:C, and CD:A:B.
ρ11 represents RA graph G20 and BN graph BN20 which have the same independ-
ence structure (. ⊥.., …, ….), (..⊥ …, ….), (…⊥....) in which all variables are independent of
one another, (𝐴 ⊥ 𝐵, 𝐶, 𝐷), (𝐵 ⊥ 𝐶, 𝐷), (𝐶 ⊥ 𝐷). Figure 28 shows the only specific graph
for RA27.
Figure G20
Rho,and
10,BN20, A:B:C:D.
G19 and BN19 specific graph example.
Figure27.
Figure Rho,10,
27.Rho, 10,G19
G19and
andBN19
BN19specific
specificgraph
graph example.
example.

Figure28.
Figure 27. Rho11,
28.Rho
Rho, 11,G20
10, G20and
G19 andBN20
and BN20specific
BN19 specificgraph.
specific graph.example.
graph
Figure 28. Rho 11, G20 and BN20 specific graph.
Table55summarizes
Table summarizesall allequivalent
equivalentRA RAand
andBN
BNgeneral
general graphs,
graphs, with
with their
their associated
associated
Rho graph,
Rho graph,
Table 5anan example of their
example ofalltheir
summarizes specific
specificRA
equivalent graph
graph notations
andnotations and their
andgraphs,
BN general independences.
their independences. These
These
with their associated
specific
Rho graph,
specific graph examples
an examples
graph example of align with
theirwith
align the
specific BN
the BN general
graph graphs
notations
general and
graphs of Figure
of their
Figure 8 assuming
independences. labeling
These of
8 assuming labeling
nodes
specific
of A, B,
nodesgraph C, D in the
examples
A, B, C, order
D in thealign of
orderwithtop left,
theleft,
of top top
BNtop right,
general bottom
right,graphs left,
bottomofleft,bottom
Figure right.
8 assuming
bottom right. labeling
of nodes A, B, C, D in the order of top left, top right, bottom left, bottom right.
Table 5. Equivalent Rho, RA and BN neutral system general graphs.

Rho RA General BN General Specific Graph Example


Independencies
Graph Graph Graph (RA Notation)
ρ1 G1 BN1 ABCD no independencies
ρ2 G7 BN2* ACD:BCD (A ⊥ B | C, D)
Figure
ρ3 28. Rho 11, G20 and BN20
G10 BN5* specific graph.
BCD:AD (A ⊥ B, C | D)
ρ5 G13 BN10 BCD:A (A ⊥ B, C, D)
ρ6 Table 5G15
summarizesBN11*
all equivalent RA and BN general graphs,
AD:BD:CD (A ⊥ B, Cwith
| D),their
(B ⊥ Cassociated
| D)
Rho graph, an example of their specific graph notations and their independences. These
ρ7 G16 BN14* AD:BC:BD (A ⊥ B | D), (C ⊥ A, D | B)
specific graph examples align with the BN general graphs of Figure 8 assuming labeling
ρ8 G17 BN16* BD:CD:A (B ⊥ C | D), (A ⊥ B, C, D)
of nodes A, B, C, D in the order of top left, top right, bottom left, bottom right.
ρ9 G18 BN18 AD:BC (A, D ⊥ B, C)
ρ10 G19 BN19 CD:A:B (B ⊥ C, D), (A ⊥ B, C, D)
ρ11 G20 BN20 A:B:C:D (A ⊥ B, C, D), (B ⊥ C, D), (C ⊥ D)
Entropy 2021, 23, 986 28 of 39

4.4. Rho and Non-Equivalent RA and BN General Graphs


In addition to the 10 equivalent RA and BN general graphs, there are 10 general
graphs unique to the RA lattice and 10 general graphs unique to the BN lattice. All 10 non-
equivalent RA general graphs in the four variable lattice have loops and require iteration
to generate their probability distributions. BNs are acyclic and have analytic solutions, so
there are no BN general graphs that are equivalent to the RA graphs with loops. Since RA
graphs are undirected, one might think that there could be some equivalent acyclic directed
BN graphs, but this is not the case, because BN graphs that are acyclic when directions are
considered but cyclic if directions are ignored have V-structure interpretations, as described
previously. All 10 non-equivalent BN general graphs have such V-structures, which encode
independence relations unique to BNs. To illustrate: the structure A→B, B→C, C→D,
D→A is cyclic and not a legitimate BN structure, but the directed structure of A→B, B→C,
C→D, A→D (BN9b from Figure 8), which has the same undirected links, is not cyclic, and
is a legitimate BN structure. However, this latter structure is not interpreted as a set of
dyadic relations, which would be written in RA notation as AB:BC:CD:AD and contains a
loop (RA general graph G12 from Figure 1). Rather, the V-structure consisting of C→D and
A→D is interpreted as a triadic relation, which contributes a p( D | AC ) to the probability
expression, p( A) p( B| A) p(C | B) p( D | AC ) , which does not correspond to any RA structure.

4.5. Lattice of Rho, RA, BN Neutral System General Graphs


The lattice of Rho, RA and BN equivalent and non-equivalent general graphs in
Figure 29 was developed from the RA lattice in Figure 1 and the BN lattice in Figure 8. This
lattice includes all 10 unique RA general graphs, 10 unique BN general graphs, and 10 RA
and BN equivalent general graphs, for a total of 30 unique general graphs. The lattice is
organized using the Rho lattice [7]. All 20 RA general graphs and all 20 BN general graphs
for each Rho graph are represented in the joint lattice. Within each Rho graph, where RA
and BN graphs are equivalent, that is, when their independence structures are identical,
the BN graph is placed under the RA equivalent graph. Where RA or BN graphs are not
equivalent, representing an independence structure unique to RA or BN, they stand alone.
Arrows from one graph to another in the joint lattice represent the hierarchy of the
RA lattice only. As can be seen in Section 3, the hierarchy of the BN lattice has many more
links from parent to child graphs and thus is not a useful representation in the joint lattice.
Additionally, Figure A6 in Appendix A includes the Joint RA-BN lattice of general and
specific graphs. This lattice shows 53 unique RA specific graphs, 124 unique BN specific
graphs, and 61 RA-BN equivalent specific graphs, for a total of 238 combined, unique, RA
and BN specific graphs.

4.6. Joint RA-BN Lattice Algorithm


This section defines an algorithm for generating the Joint RA-BN lattice of neutral
system general and specific graphs.

4.6.1. Procedure to Generate the RA Neutral System General and Specific Graphs from a
Single Rho Graph
This is done in three steps: in Step 1, generate the most complex set of specific graphs
that correspond to the Rho graph; in Step 2, generate all their less complex specific graph
descendants; in Step 3, specific graphs are collected together in general graphs.
Step 1 begins with (Step 1.1) labeling the Rho graph, as shown in Figure 30. The most
complex specific graph that corresponds to this labeled Rho graph is obtained (Step 1.2) by
representing each clique with a single relation encompassing all the variables in the clique
and then joining these relations with a “:”. For example, in Figure 30, A, B, and C are in
a clique, i.e., are fully linked to one another and this is also the case for B, C, and D, but
A and B are not linked. The resulting specific graph is ABC:BCD, which is encompassed
in RA general graph G7. Next (Step 1.3), permute all the variables in this specific graph,
Entropy 2021, 23, 986 29 of 39

Entropy 2021, 23, x FOR PEER REVIEW


whichgenerates the other five specific graphs that are encompassed within G7, as31 of 42
shown
in RA lattice of Figure 4.

Entropy 2021, 23, x FOR PEER REVIEW 32 of 42

Step 1 begins with (Step 1.1) labeling the Rho graph, as shown in Figure 30. The most
complex specific graph that corresponds to this labeled Rho graph is obtained (Step 1.2)
by representing each clique with a single relation encompassing all the variables in the
clique and then joining these relations with a “:”. For example, in Figure 30, A, B, and C
are in a clique, i.e., are fully linked to one another and this is also the case for B, C, and D,
but A and B are not linked. The resulting specific graph is ABC:BCD, which is encom-
passed in RA general graph G7. Next (Step 1.3), permute all the variables in this specific
Figuregraph,
Figure
which
Lattice
29. Lattice
29. of
generates
of4-variable
4-variable the other
general
general Rho, five
Rho, RA
specific
RA and
and graphs
BN neutral
BN neutral that are
system
system
encompassed within G7, as
graphs.
graphs.
shown in RA lattice of Figure 4.
4.6. Joint RA-BN Lattice Algorithm
This section defines an algorithm for generating the Joint RA-BN lattice of neutral
system general and specific graphs.

4.6.1. Procedure to Generate the RA Neutral System General and Specific Graphs from a
Single Rho Graph
Example, Rho
Figure 30. Example, Rho 2.
This is done in three steps: in Step 1, generate the most complex set of specific graphs
that correspond
Step 2 then to the Rhothe
generates graph; in Step
simpler RA 2, generate all their
representations less
of G7 complex
that map tospecific graph
Rho2, namely
descendants; in Step 3, specific graphs are collected together in general graphs.
the specific graphs that are encompassed within the RA general graphs G8 and G9. Klir
[7] (p. 231) details the procedure for this step. In Step 3, specific graphs with the same
independence structure are then collected together in general graph equivalence classes.
shown in RA lattice of Figure 4.

Entropy 2021, 23, 986 30 of 39

Figure 30. Example, Rho 2.


Step 2 then generates the simpler RA representations of G7 that map to Rho2, namely
Step 2 then
the specific generates
graphs theencompassed
that are simpler RA representations
within the RA of G7 that map
general to Rho2,
graphs namely
G8 and G9. Klir [7]
the specific graphs that are encompassed within the RA general graphs
(p. 231) details the procedure for this step. In Step 3, specific graphs with G8 and G9. Klirthe same
[7] (p. 231) details
independence the procedure
structure forcollected
are then this step.together
In Step 3,inspecific
generalgraphs
graphwith the same classes.
equivalence
independence structure are then collected together in general graph equivalence classes.
Doing this for Rho 2 results in general graphs G7, G8 and G9 and their specific graphs as
Doing this for Rho 2 results in general graphs G7, G8 and G9 and their specific graphs as
shown in Figure 4.
shown in Figure 4.
4.6.2. Procedure to Generate the BN Neutral System General and Specific Graphs from a
4.6.2. Procedure to Generate the BN Neutral System General and Specific Graphs from a
Single Rho Graph
Single Rho Graph
In contrast to RA graphs, BNs are just Rho graphs with directions added to edges,
In contrast to RA graphs, BNs are just Rho graphs with directions added to edges, as
as shown in Figure
shown in Figure 31.generate
31. To To generate
all BNallspecific
BN specific
graphsgraphs for a Rho
for a given given Rho simply
graph, graph, simply
permute all possible edge directions and variable combinations, and follow the
permute all possible edge directions and variable combinations, and follow the BN neutral BN neutral
system general and specific graph procedure outlined above in Section 3.2.6.
system general and specific graph procedure outlined above in Section 3.2.6. Essentially, Essentially,
the
the process entailsdiscarding
process entails discarding redundant
redundant specific
specific graphs
graphs and graphs
and graphs withfrom
with cycles cycles
all from all
these permutations,and
these permutations, and collecting
collecting together
together BN specific
BN specific graphsgraphs with unique
with unique independence
independence
structures into
structures intoaageneral
general graph.
graph.

ρ2

Figure 31.
Figure 31.Rho
Rho2 2example,
example,with associated
with BNsBNs
associated general graphs.
general graphs.

4.6.3. Generating
4.6.3. Generatingthe Joint
the RA-BN
Joint General
RA-BN and and
General Specific GraphGraph
Specific LatticeLattice
The
The following
followingprovides a general
provides algorithm
a general to generate
algorithm the joint
to generate theRA-BN lattice oflattice of
joint RA-BN
neutral system general and specific graphs for any number of variables from
neutral system general and specific graphs for any number of variables from some specific
some specific
starting graph, either downwards or upwards.
starting graph, either downwards or upwards.
1. Identify a starting Rho graph
1.
2. Identify all
Generate a starting
possible Rho graph
RA and BN specific graphs for the given Rho graph.
2. Generate all possible RA and BN specific graphs for the given Rho graph.
a. For RA, follow the procedure detailed in the prior Section 4.6.1.
a. ForFor
b. BN,RA, follow
follow the procedure
the procedure detailed
detailed in theSection
in the prior prior 4.6.2.
Section 4.6.1.
c.b. Organize
For BN, allfollow
RA andthe
BNprocedure detailed
general graph in theclasses
equivalence prior Section 4.6.2.
into three categories:
c. RAOrganize
graphs with loops,
all RA andBN
BNgraphs
generalwith
graphV-Structures,
equivalence and equivalent
classes RA-BN
into three categories:
graphs containing
RA graphs noloops,
with loops or
BNV-structures.
graphs with V-Structures, and equivalent RA-BN
graphs containing no loops or V-structures.
3. If searching the lattice upward, add an edge to the prior Rho graph. If searching the
lattice downward, delete an edge from the prior Rho graph.
4. Repeat steps 2 and 3 until the top or bottom of the lattice is reached.
Consider for example the results of the RA and BN procedures for Rho 2. Organizing
these results via step 2c gives the following six general structures: G8 and G9 for RA
graphs with loops, BN3 and BN4* for BN graphs with V-structures, and G7 and BN2* for
equivalent RA-BN graphs. Specific structures can be simply obtained from these general
structures by listing all permutations of variable labels. Following these procedures for any
number of variables will result in the exhaustive, non-redundant, lattice of joint RA-BN
neutral system general and specific graphs.

5. Comparing RA and BN Directed System Graphs


Figure 32 shows side-by-side for comparison the RA augmented directed system
lattice from Figure 7 and the BN directed system lattice from Figure 16. To the left or right
of each BN directed system general graph is the equivalent RA directed system general
graph. For example, BN7 is equivalent to RA general graph G7. Equivalence in this context
is in terms statistical equivalence of prediction results given data. Two directed system
general graphs are equivalent if they predict the DV (Z) in the same way. Each of the BN
directed system general graphs in the lattice is equivalent to an RA general graph in the
Entropy 2021, 23, 986 31 of 39

augmented RA directed system general graph lattice. In addition, the RA directed system
lattice includes additional predictive graphs, those with loops that are not found in the BN
Entropy 2021, 23, x FOR PEER REVIEW 34 of 42
lattice. Thus, restricting BN directed systems to those where the DV is not a parent in a
V-structure, the RA augmented directed system lattice fully encompasses the BN directed
system lattice and offers additional predictive graphs.

32. Comparison
FigureFigure of RA
32. Comparison augmented
of RA directed
augmented directedsystem
system lattice toBN
lattice to BNdirected
directed system
system lattice.
lattice.

Table 6 shows all BN directed


However, system
as pointed general
out above, thegraphs and their
BN directed systemRA equivalents
lattice developedas in this pa-
well as specific graph examples with their associated probability distributions. In these If this
per was constrained to disallow any DV that is a parent node within a V-structure.
probability distributions,
constraintonly
werethe
to terms used to
be relaxed to allow
predict
DVsthethat
DVare(Z)parent
are highlighted in black;
nodes in V-structures, then
non-predictive terms
thereare
are greyed. All equivalences
BN predictive models that necessarily
give differentinvolve loopless
analytical resultsRA models;
than RA predictive
models.
half of these involve Therefore,
RA graphs in the BN directed
standard system system
directed lattice developed in this
lattice, where papermodel
every is preliminary
and incomplete.
has an IV component, and the other half involve graphs in the augmentation of this lattice.
Prior to development To illustrate
of the this point,
BN directed systemconsider BN17
lattice in this from
paper,Figure
the RA16directed
with itssystem
specific graph
ABZ A:B:C, and removing variable “C” for simplicity resulting in ABZA:B with edge orien-
lattice did not include naïve Bayes equivalent graphs, e.g., G15 and G17, and the naïve
tations A->Z<-B and with probability distribution 𝑝(𝑍|𝐴𝐵)𝑝(𝐴)𝑝(𝐵). Here, Z is the DV
Bayes-like graph, G10. The development of the BN directed system lattice in this paper in
Entropy 2021, 23, 986 32 of 39

part inspired the augmentation of the standard RA directed system lattice to include naïve
Bayes type graphs.

Table 6. BN directed system graphs and RA equivalent example.

BN General BN Specific Graph Example BN Specific Graph Example Equivalent RA Equivalent RA Graph
Equivalent RA Graph
Graph RA Notation Probability Distribution Graph Notation Probability Distribution
BN7 BCZ:ABZA:B p(C|BZ)p(Z|AB)p(B)p(A) G7 (augmentation) BCZ:ABZ p(C|BZ)p(Z|AB)p(B|A)p(A)
BN11 AZ:BZ:CZ p(A|Z)p(B|Z)p(C|Z)p(Z) G15 (augmentation) AZ:BZ:CZ p(A|Z)p(B|Z)p(C|Z)p(Z)
BN12 ABCZA:B:C p(Z|ABC)p(A)p(B)p(C) G1 ABCZ p(Z|ABC)p(ABC)
BN13 ACZA:C :BZ p(Z|AC)p(B|Z)p(A)p(C) G10 (augmentation) ACZ:BZ p(Z|AC)p(B|Z)p(C|A)p(A)
BN16 BZ:CZ:A p(B|Z)p(C|Z)p(Z)p(A) G17 (augmentation) BZ:CZ:A p(B|Z)p(C|Z)p(Z)p(A)
BN17 BCZB:C :A p(Z|BC)p(B)p(C)p(A) G7 ABC:BCZ p(Z|BC)p(B|CA)p(A|C)p(C)
BN19 CZ:A:B p(Z|C)p(C)p(A)p(B) G10 ABC:CZ p(Z|C)p(ABC)
BN20 A:B:C:Z p(Z)p(A)p(B)p(C) G13 ABC:Z p(Z)p(ABC)

However, as pointed out above, the BN directed system lattice developed in this paper
was constrained to disallow any DV that is a parent node within a V-structure. If this
constraint were to be relaxed to allow DVs that are parent nodes in V-structures, then
there are BN predictive models that give different analytical results than RA predictive
models. Therefore, the BN directed system lattice developed in this paper is preliminary
and incomplete.
To illustrate this point, consider BN17 from Figure 16 with its specific graph ABZA:B :C,
Entropy 2021, 23, x FOR PEER REVIEW 35 of 42
and removing variable “C” for simplicity resulting in ABZA:B with edge orientations
A->Z<-B and with probability distribution p( Z | AB) p( A) p( B) . Here, Z is the DV and is
the child node within the V-structure and is thus included within the BN directed system
and is the child node within the V-structure and is thus included within the BN directed
lattice developed in this paper. This graph is equivalent in terms of prediction to RA
system lattice developed in this paper. This graph is equivalent in terms of prediction to
directed
RA directed system
system graph
graphG7G7 with
withspecific
specificgraph
graph ABC:ABZ.
ABC:ABZ. In Incontrast,
contrast,consider
consider BN17 with
BN17
its specific graph ABZ
with its specific graph ABZ A:ZA:Z:C. Again, for simplicity and comparability, removing vari-variable
:C. Again, for simplicity and comparability, removing
“C”ableresults in ABZ
“C” results in ABZ with
A:Z A:Z withedge
edgeorientations A->B<-Z
orientations A->B<-Z and
and with
with probability
probability distribution
distribu-
( B| AZ
ption 𝑝(𝐵|𝐴𝑍)𝑝(𝐴)𝑝(𝑍).
) p( A) p( Z ) . Here,
Here,ZZisisthetheDV,
DV,andand a parent
is is node
a parent nodewithin
withinthethe
V-structure;
V-structure;therefore,
this specific
therefore, thisgraph was
specific not was
graph considered in the BN
not considered directed
in the system
BN directed lattice
system developed
lattice devel- in this
oped inHowever,
paper. this paper.theHowever,
predictingthe components
predicting components
within thewithin the probability
probability distribu-
distribution are different
tion thus
and are different and in
will result thus will resultstatistical
a different in a different statistical
result. result. The differences
The differences between ABZ be- A:B and
ABZtween ABZ areA:B and ABZA:Z are illustrated in Figure 33, in which a hypothetical joint proba-
illustrated in Figure 33, in which a hypothetical joint probability distribution
A:Z
pbility
( ABZ distribution
), shown in 𝑝(𝐴𝐵𝑍), shown
(a), yields in (a), yields adistribution
a conditional conditional distribution 𝑝(𝑍|𝐴𝐵)
p( Z | AB) for RA modelfor RAABZ and
model ABZ and BN model ABZA:B, shown in (b), that is different from the conditional
BN model ABZA:B , shown in (b), that is different from the conditional distribution q( Z | AB)
distribution 𝑞(𝑍|𝐴𝐵) for BN model ABZA:Z, shown in (c). ABZA:Z is an unconventional
for BN model ABZA:Z , shown in (c). ABZA:Z is an unconventional BN model in its choice of
BN model in its choice of the parent node Z as the DV. These non-conventional BN models
the parent node Z as the DV. These non-conventional BN models are not considered in this
are not considered in this paper, but are a promising topic for future research that will
paper,
extendbutthe are
work a promising
reported here.topic for future research that will extend the work reported here.

(a) p(ABZ), joint distribution for ABZ


Z0 Z1
B0 B1 B0 B1
A0 0.01 0.19 0.06 0.14
A1 0.17 0.31 0.03 0.10
(c) q(Z|AB), conditional distribution for ABZA:Z
(b) p(Z|AB), conditional distribution where q(ABZ) = p(B|AZ) p(A) p(Z)
for ABZ & ABZA:B and q(Z|AB) = q(ABZ) / q(AB)
Z0 Z1 Z0 Z1
B0 B1 B0 B1 B0 B1 B0 B1
A0 0.11 0.58 0.89 0.42 A0 0.21 0.74 0.79 0.26
A1 0.83 0.75 0.17 0.25 A1 0.74 0.64 0.26 0.36

Figure 33. BN Directed System Prediction Example.


Figure 33. BN Directed System Prediction Example.

6. Discussion
6.1. Neutral Systems
This paper builds on the RA work of Harris and Zwick [33], which developed the BN
neutral system general graph lattice of Figure 8, expanding it here to offer the BN neutral
Entropy 2021, 23, 986 33 of 39

6. Discussion
6.1. Neutral Systems
This paper builds on the RA work of Harris and Zwick [33], which developed the
BN neutral system general graph lattice of Figure 8, expanding it here to offer the BN
neutral system specific graph lattice of Figure 14. This paper also builds on the joint RA-BN
neutral system general graph lattice of Figure 29 developed in that earlier work, expanding
it here to offer the joint RA-BN neutral system specific graph lattice of Figure A6. In
developing these new lattices, this paper extends RA notation to encompass BN graphs
(see Section 3.2.3).
For four variables, the joint RA-BN neutral system general graph lattice increases the
number of general graphs from 20 in the RA lattice and 20 in the BN lattice to 30 in the joint
RA-BN lattice, and unique specific graphs from 114 in the RA lattice and 185 in the BN
lattice to 238 in the joint lattice. The integration of the two lattices offers a richer and more
expansive way to model and represent complex systems leveraging the V-structure unique
to BN graphs and the ability accommodate loops and hypergraphs in the RA lattice.
This paper also develops an algorithm to generate the joint RA-BN neutral system gen-
eral and specific graph lattices for any number of variables in both upward and downward
directions (Section 4.6). The exhaustive and non-redundant RA and BN lattices follow the
more general Rho lattice. Figure A6 shows the results of this algorithm for four variables.
Although this algorithm is exhaustive, it does not create a hierarchical nesting of general or
specific graphs. Such nesting is a desirable feature, so future extensions of this work could
enhance the algorithm by enabling it to develop sequentially with each new graph being
hierarchically nested. Given data, such an extension would allow statistical significance
tests to be performed at each incremental step of lattice generation. Additionally, the
current algorithm produces the exhaustive lattice, but searching the exhaustive lattice to
find best candidate graphs is inefficient, so algorithms to efficiently search the joint lattice
for best candidate graphs would be a useful extension.
Another promising extension of this work would be to develop hybrid RA-BN gen-
eral graphs [13] for neutral systems to further extend the expression of the joint RA-BN
neutral system lattice developed in this paper. Such hybrid graphs could incorporate
directed edges to encode BN V-structures with loops and hypergraphs found in RA. Other
possible extensions of this work could explore the application of Bayesian networks to
hypergraphs [56] and under appropriate conditions to certain types of cycles [57].

6.2. Directed Systems


This paper develops the RA augmented directed system lattice (Figure 7), which is an
extension of the conventional RA directed system lattice (Figure 5). While the conventional
RA directed system lattice encompasses all prediction graphs in the BN directed system
lattice (under the restriction that DVs in BN models are not parent variables in V-structures),
the RA conventional directed system lattice did not include naïve Bayes graphs. Doing so,
as shown in Figure 7, increases the number of general graphs from nine in the conventional
RA lattice to 12 in the augmented lattice, and the number of specific graphs from 19 to
31. The augmented RA directed system lattice thus offers more candidate graphs, and
this allows for the possibility of more accurate or simpler and thus more generalizable
RA prediction models. Augmentation of the conventional RA directed system lattice was
inspired in part by the BN directed system lattice developed in this paper.
Future extension of this work could examine whether BN graphs with predictions
equivalent to RA models but with fewer degrees of freedom than RA predictive equivalents
(because of independence constraints among the IVs) offer any advantage in calculations
of statistical significance. If so, such BN graphs might replace their RA equivalents in the
augmented directed system RA lattice. A related statistical issue that should be explored is
how to compare augmenting directed RA models whose natural reference is A:B: . . . , the
neutral system independence reference, with conventional directed systems models whose
Entropy 2021, 23, 986 34 of 39

natural reference is AB . . . :Z, i.e., a reference that has an IV component that joins together
all IVs in a single relation.
This paper develops the BN directed system lattice of prediction graphs for four
variables (Figure 16), reducing the number of possible specific graphs from 185 in the BN
neutral system lattice to 18 in the BN directed system lattice—a significant compression of
the BN neutral system lattice when prediction of a single DV is the goal. This paper also
shows that all of the graphs in the BN directed system lattice (where this lattice disallows
graphs where the DV is a V-structure parent) are equivalent in their predictions to RA
graphs, although many of them have fewer degrees freedom than their RA-equivalent
counterpart. The augmented RA directed system lattice thus encompasses all of the BN
directed system general graphs in terms of prediction, and offers additional predicative
graphs, those including loops, that are not in the BN lattice. However, the restriction
that disallows BN graphs where the DV is a V-structure parent might be relaxed, so a
future extension of this work could consider expanding the BN directed system lattice to
include such unusual BN predictive graphs. An additional extension could be to develop
an algorithm to generate the BN directed system lattice of general and specific graphs
for any number of variables allowing for efficient search of the BN lattice for graphs that
uniquely predict a single DV.

Author Contributions: Conceptualization, M.H. and M.Z.; formal analysis, M.H.; writing—original
draft preparation, M.H.; writing—review and editing, M.Z.; visualization, M.H.; supervision, M.Z.
All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Acknowledgments: Many thanks to Rajesh Venkatachalapathy for stimulating discussions of Bayesian
Networks.
Conflicts of Interest: The authors declare no conflict of interest.

Appendix A
The contents of this Appendix are the work of other researchers and are included here
to allow this paper to be understood in a self-contained way.

Appendix A.1. RA Loop Detection Procedure


In the RA graphs of Figures 1 and 4, graphs that do not have loops are highlighted
in bold, while graphs that have loops are non-bold. Graphs without loops are fitted
algebraically, whereas graphs with loops are fitted using iterative proportional fitting. To
determine if a graph has a loop, the following procedure is performed:
Given the set of relations of a specific graph:
1. Remove all variables that are unique to any individual relation
2. Remove any relation that is equal to or embedded in any other relation of the (remain-
ing) set
3. Repeat 1 and 2 until either
a. No variables remain, in which case there are no loops, or
b. The remainder is unalterable by steps 1 or 2, in which case there are loops
For example, in Figure 4, graph G7, illustrated by specific graph ABC:ABD does not
have a loop. Krippendorff’s loop detection algorithm [11] produces the following results.
First, removing variables unique to both ABC and ABD removes C from ABC and D from
ABD. What remains is AB:AB, for which the second AB is redundant and thus removed,
leaving AB. Then, removing the variables unique to AB removes both A and B, leaving the
null set, and thus this specific graph does not contain a loop.
Entropy 2021, 23, 986 35 of 39

By contrast, in Figure 4, graph G8, illustrated by specific graph ABC:AD:BD, does


have a loop. Krippendorff’s loop detection algorithm [11] produces the following results.
First, removing variables unique to one relation removes C from ABC. What remains is
AB:AD:BD. There is no redundant relation, i.e., no relation that is repeated or embedded
in another relation. There are also no variables unique to one relation. Therefore, nothing
further can be removed; because the remaining set is unalterable, the specific graph has a
loop.

Appendix A.2. D-Separation Procedure


The following provides the procedure for determining all independencies for a BN [55]
Step 1. List all possible independence statements for a given BN.
Step 2. For each independence statement, construct the ‘ancestral graph’ [58] for the
variables mentioned in the independence statement.
Step 3. ‘Moralize’ the ancestral graph by adding an undirected edge between two nodes if
2021, 23, xEntropy 2021,REVIEW
FOR PEER 23, x FOR PEER REVIEW they have a common child. 38 of 42 38 of 42
Step 4. ‘Disorient’ the moralized, ancestral graph, by making all edges undirected.
Step 5. Delete the givens (nodes) and any of their edges from the independence statement
Step 6. Read the being
Step 6.tested.
answer Read to the
the answer
independenceto the independence
statement question statement
fromquestion from the remaining
the remaining
Step
graph, if the variables6.
graph, if Read the answerare
thedisconnected
are variables toin
the independence
disconnected
the remaining statement
in the the question
remaining
graph, graph,
answer tofrom
the the remaining
answer
inde- to thegraph,
inde-
if the variables
pendence are
statement disconnected
is in the in the
affirmative. remaining graph, the answer to the independence
pendence statement is in the affirmative.
statement is in the affirmative.
The following The following
provides provides
examples of the examples of theprocedure
D-separation D-separation procedure
for BN12 for BN12 and BN9*:
and BN9*:
The following provides examples of the D-separation procedure for BN12 and BN9*:
Example 1 Example 1
Example 1
Step 1. List all Step
Step 1.
possible1. List all
all possible
possible independence
independence
List statements
independence for statements
a given BN.for
statements For
for aa given
four BN.
BN. For
For four
givenvariables, four variables,
variables,
Table
Table 2 is the complete 2 is the
Table 2 islist. complete
the complete list. list.
Step 2. For each Step 2.
2. For
For each
independence
Step each independence
statement,
independence statement,
construct construct
the ‘ancestral
statement, the
the ‘ancestral
constructgraph’ [58] forgraph’
‘ancestral the
graph’ [58]
[58] for
for the
the
variables
variables mentioned in mentioned
the independencein the independence
statement.
variables mentioned in the independence statement. statement.
An ancestral graphAn
An ancestral graph
graph of
of the probability
ancestral of the probability
expression
the expression
includes
probability includes
all nodes
expression listedall
includes innodes
all listed
the inde-
nodes listed in
in the
theinde-
inde-
pendence
pendence statement that statement
is being that
tested is being
and all tested
parents, and all parents,
grandparents, grandparents,
great-grandpar-
pendence statement that is being tested and all parents, grandparents, great-grandparents, great-grandpar-
ents,
ents, etc., of those etc.,
etc.,nodes.
of of those
those nodes.nodes.
BN12 and theBN12 and
and the
independence
BN12 statement (A ⊥
the independence
independence statement
B | D) will
statement (A ⊥
(A beBBused
⊥ || D)
D)as will
anbe
will be used
used as an example
example
throughout thethroughout
remaining the
throughout remaining
procedure procedure
(Steps (Steps
2–5, Figures A1–A5).
2–5, Figures A1–A5).

Figure A1. Step Figure


2, createA1.
theFigure
Step 2, A1. Step
create
ancestral the2,ancestral
graph. create thegraph.
ancestral graph.

Figure A2. Step Figure A2. Step


3, moralize Step 3,
3, moralize
moralize
the ancestral the ancestral graph.
graph.
Entropy 2021, 23, 986 36 of 39
Figure A2. Step 3, moralize the ancestral graph.
Figure A2. Step 3, moralize the ancestral graph.

Figure A3. Step 4, disorient the moralized, ancestral graph.


Figure A3.
Figure Step4,
A3. Step 4,disorient
disorient the
the moralized,
moralized, ancestral
ancestral graph.
graph.

y 2021, 23, x FOR PEER REVIEW 39 of 42

Figure A4. Step 5, delete the givens.


Figure A4. Step 5, delete the givens.

Figure A5.
Figure Example,
A5. full
Example, D-separation
full procedure
D-separation forfor
procedure independence statement
independence statement ⊥⊥
(C(C D|A,
D|A,B).B).

Step the
Step 3. ‘Moralize’ 3. ‘Moralize’ the ancestral
ancestral graph graph
by adding anby adding anedge
undirected undirected
betweenedge two between two
nodesa ifcommon
nodes if they have they have a common child.
child.
Step the
Step 4. ‘Disorient’ 4. ‘Disorient’
moralizedthe moralized
ancestral ancestral
graph by makinggraphallby making
edges all edges undirected.
undirected.
Step 5. Delete the givens from and any of their edges. In the continuing example, D example, D is
Step 5. Delete the givens from and any of their edges. In the continuing
the⊥‘given’
is the ‘given’ (A B | D),(A ⊥ BD|and
thus D), its
thus D and itsedges
connected connected edges
to A, B, and toC A,
areB,removed.
and C are removed.
Step 6. Read the answer to the independence statement
Step 6. Read the answer to the independence statement question from the remaining question from the remaining
graph, if the variables are disconnected in the remaining
graph, if the variables are disconnected in the remaining graph, the answer to the inde-graph, the answer to the indepen-
dence statement is in the affirmative. In this example, the
pendence statement is in the affirmative. In this example, the independence statement be- independence statement being
tested
ing tested is the is the assertion
assertion that A and thatBAareand B are independent
independent given D.given
ThisD.assertion
This assertion is false because
is false
because A and B are connected in the remaining graph; thus, they are not conditionally independent
A and B are connected in the remaining graph; thus, they are not conditionally
given D.
independent given D.
Example 2
Example 2 Consider a second example, using BN9*. The independence statement being tested
Considerhere
a second example, using
is the assertion: BN9*. The
C independent of independence
D given A and statement
B (C ⊥ D|A, being tested A5 shows all
B). Figure
here is the assertion:
steps in C theindependent
procedure for ofthis
D given A and
example, B (C ⊥ C
affirming D|A,
and B). Figure
D are indeed A5independent
shows given A
all steps in theandprocedure
B. for this example, affirming C and D are indeed independent
given A and B.
Entropy 2021, 23, x FOR PEER REVIEW 40 of 42
Entropy 2021, 23, 986 37 of 39

FigureA6.
Figure A6.Joint
JointRA-BN
RA-BNlattice
lattice of
of 44 variable
variable general
general and specific graphs.
Entropy 2021, 23, 986 38 of 39

References
1. Ashby, W.R. Constraint Analysis of Many-Dimensional Relations. Gen. Syst. Yearb. 1964, 9, 99–105.
2. Broekstra, G. Nonprobabilistic constraint analysis and a two stage approximation method of structure identification. In Proceed-
ings of the 23rd Annual SGSR Meeting, Houston, TX, USA, 3–8 January 1979.
3. Cavallo, R. The Role of System Science Methodology in Social Science Research; Martinus Nijhoff Publishing: Leiden,
The Netherlands, 1979.
4. Conant, R. Mechanisms of Intelligence: Ashby’s Writings on Cybernetics; Intersystems Publications: Seaside, CA, USA, 1981.
5. Conant, R. Extended dependency analysis of large systems. Int. J. Gen. Syst. 1988, 14, 97–123. [CrossRef]
6. Klir, G. Identification of generative structures in empirical data. Int. J. Gen. Syst. 1976, 3, 89–104. [CrossRef]
7. Klir, G. The Architecture of Systems Problem Solving; Plenum Press: New York, NY, USA, 1985.
8. Klir, G. Reconstructability analysis: An offspring of Ashby’s constraint theory. Syst. Res. 1986, 3, 267–271. [CrossRef]
9. Krippendorff, K. On the identification of structures in multivariate data by the spectral analysis of relations. In General Systems
Research: A Science, a Methodology, a Technology; Gaines, B.R., Ed.; Society for General Systems Research: Louisville, KY, USA, 1979;
pp. 82–91. Available online: http://repository.upenn.edu/asc_papers/207 (accessed on 27 July 2021).
10. Krippendorff, K. An algorithm for identifying structural models of multivariate data. Int. J. Gen. Syst. 1981, 7, 63–79. [CrossRef]
11. Krippendorff, K. Information Theory: Structural Models for Qualitative Data; Quantitative Applications in the Social Sciences #62;
Sage: Beverly Hills, CA, USA, 1986.
12. Willet, K.; Zwick, M. A software architecture for reconstructability analysis. Kybernetes 2004, 33, 997–1008. Available online:
https://works.bepress.com/martin_zwick/55/ (accessed on 27 July 2021). [CrossRef]
13. Zwick, M. Reconstructability Analysis of Epistasis. Ann. Hum. Genet. 2010, 157–171. Available online: https://works.bepress.
com/martin_zwick/3/ (accessed on 27 July 2021). [CrossRef]
14. Zwick, M. Wholes and parts in general systems methodology. In The Character Concept in Evolutionary Biology; Wagner, G., Ed.;
Academic Press: New York, NY, USA, 2001. Available online: https://works.bepress.com/martin_zwick/52/ (accessed on 27
July 2021).
15. Zwick, M. An overview of reconstructability analysis. Kybernetes 2004, 33, 887–905. Available online: https://works.bepress.
com/martin_zwick/57/ (accessed on 27 July 2021). [CrossRef]
16. Zwick, M.; Johnson, M.S. State-based reconstructability analysis. Kybernetes 2004, 33, 1041–1052. Available online: https:
//works.bepress.com/martin_zwick/47/ (accessed on 27 July 2021). [CrossRef]
17. Zwick, M.; Carney, N.; Nettleton, R. Exploratory Reconstructability Analysis of Accident TBI Data. Int. J. Gen. Syst. 2018, 47,
174–191. Available online: https://works.bepress.com/martin_zwick/80/ (accessed on 27 July 2021). [CrossRef]
18. Wright, S. Correlation and causation. J. Agric. Res. 1921, 20, 557–585.
19. Wright, S. The method of path coefficients. Ann. Math. Stat. 1934, 5, 161–215. [CrossRef]
20. Neapolitan, R. Probabilistic Reasoning in Expert Systems: Theory and Algorithms; Wiley: New York, NY, USA, 1989.
21. Pearl, J. Bayesian Networks: A Model of Self-Activated Memory for Evidential Reasoning. In Proceedings of the 7th Conference
of the Cognitive Science Society, Irvine, CA, USA, 15–17 August 1985; pp. 329–334.
22. Pearl, J. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference; Morgan Kaufmann Publishers, Inc.:
San Francisco, CA, USA, 1988.
23. Pearl, J.; Verma, T. The logic of representing dependencies by directed graphs. In Proceedings of the 6th National Conference on
Artificial Intelligence, Seattle, WA, USA, 13–17 July 1987.
24. Lauritzen, S. Graphical Models; Oxford Statistical Science Series; Oxford University Press: New York, NY, USA, 1996.
25. Driver, E.; Morrell, D. Implementation of Continuous Bayesian Networks Using Sums of Weighted Gaussians. In Proceedings of
the Eleventh Conference on Uncertainty in Artificial Intelligence, Montréal, QC, Canada, 18–20 August 1995.
26. Tang, Y.; Srihari, N. Efficient and Accurate Learning of Bayesian Networks using Chi-Squared. In Proceedings of the 21st
International Conference on Pattern Recognition, Tsukuba, Japan, 11–15 November 2012.
27. Rebane, G.; Pearl, J. The recovery of causal polytrees from statistical data. In Proceedings of the Third Conference on Uncertainty
Artificial Intelligence, Seattle, WA, USA, 10 July 1987; pp. 222–228.
28. Verma, T.; Pearl, J. Equivalence and synthesis of causal models. In Proceedings of the Sixth Annual Conference on Uncertainty in
Artificial Intelligence, Cambridge, MA, USA, 27–29 July 1990; pp. 255–270.
29. Chickering, D. Learning Equivalence Classes of Bayesian-Network Structures. J. Mach. Learn. Res. 2002, 2, 445–498.
30. Meek, C. Causal inference and causal explanation with background knowledge. In Proceedings of the Eleventh Conference on
Uncertainty in Artificial Intelligence, Montreal, QC, Canada, 18–20 August 1995; Hanks, S., Besnard, P., Eds.; Morgan Kaufmann:
Burlington, MA, USA, 1995; pp. 403–410.
31. Andersson, S.; Madigan, D.; Perlman, M. A Characterization of Markov Equivalence Classes For Acyclic Digraphs. Ann. Stat.
1997, 25, 505–541. [CrossRef]
32. Gillispie, S.; Perlman, M. Enumerating Markov Equivalence Classes of Acyclic Digraph Models. In Proceedings of the Seventeenth
conference on Uncertainty in Artificial Intelligence, Seattle, WA, USA, 2 August 2001; pp. 171–177.
Entropy 2021, 23, 986 39 of 39

33. Harris, M.; Zwick, M. Joint Lattice of Reconstructability Analysis and Bayesian Network General Graphs. In Unifying Themes in
Complex Systems X, ICCS 2020; Springer: Cham, Switzerland, 2021.
34. Heckerman, D.; Geiger, D.; Chickering, D. Learning Bayesian networks: The combination of knowledge and statistical data. In
Proceedings of the Tenth Conference on Uncertainty in Artificial Intelligence, Seattle, WA, USA, 29–31 July 1994; pp. 293–301.
35. Chickering, D.; Heckerman, D.; Meek, C. Large-Sample Learning of Bayesian Networks is NP-Hard. J. Mach. Learn. Res. 2004, 5,
1287–1330.
36. Murphy, K. A Brief Introduction to Graphical Models and Bayesian Networks. 1998. Available online: https://www.cs.ubc.ca/
~{}murphyk/Bayes/bayes_tutorial.pdf (accessed on 27 July 2021).
37. Bouckaert, R. Properties of Bayesian Belief Network Learning Algorithms. In Proceedings of the Tenth international conference
on Uncertainty in artificial intelligence, Seattle, WA, USA, 29–31 July 1994; pp. 102–109.
38. Buntine, W. Classifiers: A theoretical and empirical study. In Proceedings of the IJCAI, Sydney, Australia, 24–30 August 1991;
Morgan Kaufmann: Burlington, MA, USA, 1991; pp. 638–644.
39. Buntine, W. Theory refinement on Bayesian networks. In Proceedings of the Seventh Conference on Uncertainty in Artificial
Intelligence, Los Angeles, CA, USA, 13–15 July 1991; pp. 52–60.
40. Chickering, D.; Geiger, D.; Heckerman, D. Learning Bayesian networks: Search methods and experimental results. In Proceedings
of the Fifth Conference on Artificial Intelligence and Statistics, Ft. Lauderdale, FL, USA, 4–7 January 1995; pp. 112–128.
41. Cooper, D.A. A Bayesian method for the induction of probabilistic networks from data. Mach. Learn. 1992, 9, 309–347. [CrossRef]
42. Friedman, N.; Goldszmidt, M. Building classifiers using Bayesian Networks. In Proceedings of the National Conference on
Artificial Intelligence, Portland, OR, USA, 4–8 August 1996.
43. Friedman, N.; Koller, D. Being Bayesian about network structure: A Bayesian approach to structure discovery in Bayesian
networks. Mach. Learn. 2003, 50, 95–125. [CrossRef]
44. Koivisto, M.; Sood, K. Exact Bayesian structure discovery in Bayesian networks. J. Mach. Learn. Res. 2004, 5, 549–573.
45. Larranaga, P.; Kuijpers, C.; Murga, R.; Yurramendi, Y. Learning Bayesian network structures by searching for the best ordering
with genetic algorithms. IEEE Trans. Syst. Man Cybern. 1996, 26, 487–493. [CrossRef]
46. Malone, B.; Yuan, C.; Hanse, E. Memory-Efficient Dynamic Programming for Learning Optimal Bayesian Networks. In
Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 7–11 August 2011; pp.
1057–1062.
47. Chen, Y. Structure Discovery in Bayesian Networks: Algorithms and Applications. Master’s Thesis, Iowa State University, Ames,
IA, USA, 2016.
48. Studený, M.; Vomlel, J. Geometric view on learning Bayesian Network Structures. Int. J. Approx. Reason. 2010, 51, 573–586.
[CrossRef]
49. Tian, J.; He, R.; Ram, L. Bayesian model averaging using the k-best Bayesian network structures. In Proceedings of the
Twenty-Sixth Conference on Uncertainty in Artificial Intelligence, Catalina Island, CA, USA, 8–11 July 2010.
50. Zhang, H. The Optimality of Naive Bayes. In Proceedings of the FLAIRS Conference, Miami Beach, FL, USA, 12–14 May 2004.
51. Pearl, J. Causality: Models, Reasoning, and Inference; Cambridge University Press: Cambridge, UK, 2000.
52. Chickering, D. Transformational Characterization of Equivalent Bayesian Network Structures. In Proceedings of the Eleventh
Conference on Uncertainty in Artificial Intelligence, Montréal, QC, Canada, 18–20 August 1995; pp. 87–98.
53. Chickering, D.; Heckerman, D.; Meek, C. A Bayesian approach to learning Bayesian networks with local structure. In Proceedings
of the Thirteenth Conference on Uncertainty in Artificial Intelligence, Providence, RI, USA, 1–3 August 1997; Geiger, D., Ed.;
Morgan Kaufmann: Burlington, MA, USA, 1997; pp. 80–90.
54. Rubin, D. Bayesian Inference for Causal Effects: The Role of Randomization. Ann. Stat. 1978, 6, 34–58. [CrossRef]
55. MIT. D-Separation. 2015. Available online: http://web.mit.edu/jmn/www/6.034/d-separation.pdf (accessed on 27 July 2021).
56. Javidian, M.A.; Wang, Z.; Lu, L.; Valtorta, M. On a hypergraph probabilistic graphical model. Ann. Math Artif. Intell. 2020, 88,
1003–1033. [CrossRef]
57. Forre, P.; Mooij, J. Causal Calculus in the Presence of Cycles, Latent Confounders and Selection Bias. In Proceedings of the
Uncertainty in Artificial Intelligence Conference, Tel Aviv, Israel, 22–25 July 2019; Volume 115, pp. 71–80.
58. Richardson, T.; Spirtes, P. Ancestral graph Markov models. Ann. Stat. 2002, 30, 962–1030. [CrossRef]

You might also like