You are on page 1of 21

See discussions, stats, and author profiles for this publication at: https://www.researchgate.

net/publication/224920616

Formal Derivation of Spanning Trees Algorithms

Conference Paper in Lecture Notes in Computer Science · June 2003


DOI: 10.1007/3-540-44880-2_27 · Source: DBLP

CITATIONS READS

27 181

3 authors, including:

Jean-Raymond Abrial Dominique Mery


self employee University of Lorraine
111 PUBLICATIONS 6,597 CITATIONS 230 PUBLICATIONS 1,516 CITATIONS

SEE PROFILE SEE PROFILE

Some of the authors of this publication are also working on these related projects:

IMPEX IMPlicit and EXplicit semantics integration in proof based developments of discrete systems View project

IMPEX ANR View project

All content following this page was uploaded by Dominique Mery on 27 May 2014.

The user has requested enhancement of the downloaded file.


Formal derivation of spanning trees
algorithms
 
Jean-Raymond Abrial , Dominique Cansell , and Dominique Méry

LORIA,Université Henri Poincaré Nancy 1

Dominique.Mery@loria.fr
LORIA,INRIA Lorraine
Dominique.Cansell@loria.fr
BP 239
Vandœuvre-lès-Nancy
 Cédex,France
Consultant
Marseille, France
jr@abrial.org

Abstract. Graphs algorithms and graph-theoretical problems provide


a challenging battle field for the incremental development of proved
models. The B event-based approach implements the incremental and
proved development of abstract models which are translated into al-
gorithms; we focus our methodology on the minimum spanning tree
problem and on Prim’s algorithm. The correctness of the resulting so-
lution is based on properties over trees and we show how the greedy
strategy is efficient in this case. We compare properties proven me-
chanically to the properties found in a classical algorithms textbook.

1 Introduction

Overview. Developing distributed algorithms can be improved by the use


of refinement techniques. A refinement technique allows one to gradually
develop a distributed algorithm step by step, or to tackle complex problem
like the PCI Transaction Ordering Problem [7] or the IEEE 1394 Tree Iden-
tification Protocol [3]. The B event-based method provides a framework for
deriving abstract systems modeling distributed algorithmic solutions like
the Minimum Spanning Tree algorithms, MST algorithms for short. This
paper analyses the proof-based development of MST algorithms and Prim’s
algorithm in particular [20] is produced in fine: this is an illustration of the
effectiveness of refinement for such algorithms.
Proof-based Development. Proof-based development methods integrate for-
mal proof techniques in the development of software systems. The main idea
is to start with a very abstract model of the system under development. We
then gradually add details to this first model by building a sequence of more

Supported in part by PRST Intelligence Logicielle/QSL/DIXIT project and by
PRST Intelligence Logicielle/QSL/ADHOC project
concrete ones. The relationship between two successive models in this se-
quence is that of refinement [6, 5, 9]. It is controlled by means of a number
of proofs obligations, which guarantee the correctness of the development.
Such proof obligations are proved by automatic (and interactive) proof pro-
cedures supported by a proof engine. The essence of the refinement relation-
ship is that it preserves already proved system properties including safety
properties and termination properties. The invariant of an abstract model
plays a central role for deriving safety properties and our methodology fo-
cuses on the incremental discovery of the invariant; the goal is to obtain a
formal statement of properties through the final invariant of the last refined
abstract model. When developing formal models for the IEEE 1394 protocol,
we use the Atelier B environment [10] for generating and proving proof obli-
gations.
Refining Formal Models. Formal models, as described in this paper, con-
tain events which preserve some invariant properties; they also include as-
pects related to the termination. Such models are thus very close to action
systems introduced by R.J. Back [6] and to UNITY programs [9]. The re-
finement of formal models plays a central role in these frameworks and is a
key concept for developing algorithmic systems. When one refines a formal
model, the corresponding more concrete model may have new variables and
new events, it may also strengthen the guards of more abstract events. As
already mentioned, some proof obligations are generated in order to prove
that a refinement is correct. Notice that, if some proof obligations remain
unproved, it means that, either the formal model is not correctly refined, or
that an interactive proving session is required. The prover allows us to get
a complete proof of the development.
Organization of the paper. Section 2 introduces the MST problem and known
MST algorithms; the different steps of Prim’s algorithm and the correct-
ness are informally explained. Section 3 recalls the proof-based develop-
ment methodology. Section 4 describes the formal development of a first
spanning tree algorithm; the problem is formally stated in the B event-
based framework and the resulting model is called the generic model. Sec-
tion 5 uses the previous development by adding effective cost functions to
edges and Prim’s algorithm is then obtained with the complete proof. Sec-
tion 6 details properties over trees proved to ensure that the resulting al-
gorithm returns the minimum spanning tree. Section 7 compares our ap-
proach to related approaches. Section 8 concludes our paper.

2 The Minimum Spanning Tree Problem

The Minimum Spanning Tree Problem, MST problem for short, is the prob-
lem of finding a minimum spanning tree with respect to a connected graph.
The literature contains several algorithmic solutions like Prim’s algorithm [20]
or Kruskal’s algorithm [16]. Both algorithms implement the greedy method.
Typically, we assume that a cost function is related to every edge and the

2
problem is to infer a globally minimum spanning tree, which covers the ini-
tial graph. The cost function returns integer values. The MST problem is
strongly related to practical problems like the optimisation of circuitry and
the greedy strategy advocates making the choice that is the best one at the
moment; It does not always guarantee the optimality but certain greedy
strategies yield a MST.
Prim’s algorithm is easy to explain but it underlies mathematical proper-
ties related to the graph theory and especially the general theory of trees.
We consider two kinds of solutions; a first one is called generic algorithm
because it does not use a cost function. This first generic solution allows us
to develop a second solution: the MST one.
Let us summarize how Prim’s algorithm works. The state of the algorithm
while executing contains two sets of nodes of the current graphs. A first set
of nodes, equipped with a restriction of the relation over the global set of
nodes, defines the current spanning tree starting from a special node called
the root of the spanning tree. A second set of nodes is the complement of
the first set. The acyclicity of the spanning tree must be preserved, while
adding a new edge in the current spanning tree and the basic computation
step consists of taking an edge between a node in the current spanning tree
and a node which is in the other set. The choice leads to maintaining the
acyclicity of the current spanning tree with the new node, since both sets of
nodes are disjoint. The process is repeated as long as the set of remaining
and unchosen nodes is empty. The final computed tree is a spanning tree
computed by the generic algorithm. Now, if one adds the cost function, one
gets Prim’s algorithm by modifying the choice of the new node and edge to
add to the current spanning tree. In fact, the minimum edge is chosen and
the final spanning tree is then the minimum spanning tree. However, the
addition of the cost function is a refinement of the generic solution.
The generic MST algorithm without cost function is sketched as follows:

– Precondition: A undirected connected graph,  , over a set of nodes


and a node
– Initial Step  _ (the current set of nodes) contains only and is
included into and  (the current set of edges) is empty
– Computation Step If  _ is not empty, then choose a node  in
 _ and a node  in  _ such that the link  "!#%$ is in 
with the minimum cost and add it to  ; then add  to  _ and  "!#%$
to 
– Termination Step If  _ is empty ( '&( _ ), then  is
a minimum spanning tree on
– Postcondition ) !#*$ is a minimum spanning tree

The termination of the algorithm is ensured by decreasing the set  _ .
The genericity of the solution leads us to the refinement by introducing the
cost function in the computation step. We have a clear simple abstract view
of the problem and of the solution. We can, in fact, state the problem in the

3
B event-based framework. It remains to prove the optimality of the result-
ing spanning tree and that will be derived using tools and models. Before
starting the modeling, we recall the B-event-based modeling technique.

3 Proof-based development

3.1 Event-based modeling

Our event-driven approach [2, 4] is based on the B notation [5]. It extends


the methodological scope of basic concepts such as set-theoretical notations
and generalized substitutions in order to take into account the idea of for-
mal models. Roughly speaking, a formal model is characterized by a (finite)
list  of state variables possibly modified by a (finite) list of events; an in-
variant +, ,$ states some properties that must always be satisfied by the
variables  and maintained by the activation of the events. Abstract models
are close to guarded commands of Dijkstra [12], action systems of Back [6]
and to UNITY programs [9]. In what follows, we briefly recall definitions
and principles of formal models and explain how they can be managed by
Atelier B [10].

Definition 1. : Generalized Substitution


Generalized substitutions are borrowed from the B notation. They provide a
way to express the transformations of the values of the state variables of a
formal model. In its simple form, -.&0/1 2$ , a generalized substitution looks
like an assignment statement. In this construct,  denotes a vector build on
the set of state variables of the model, and /13,$ a vector of expressions of
the same size as the vector  . The interpretation we shall give here to this
statement is not however that of an assignment statement. We interpret it
as a logical simultaneous substitution of each variable of the vector  by
the corresponding expression of the vector /13,$ . There exists a more general
form of generalized substitution. It is denoted by the construct 4-563,7!#2$ .
This is to be read: “ is modified in such a way that the predicate 56 27!#,$
holds”, where  denotes the new value of the vector, whereas  7 denotes its old
value. It is clearly non-deterministic in general. This general form could be
considered as a normal form, since the simplest form  -8&9/13,$ is equivalent
to the more general form  -:3;&9/<  7 $#$ .

Definition 2. : Events and Before-After Predicates


An event is essentially made of two parts: a guard, which is a predicate built
on the state variables, and an action, which is a generalized substitution. An
event can take one of the forms shown in the table below. In these constructs,
>= is an identifier: this is the event name. The first event is not guarded:
it is thus always enabled. The guard of the other events, which states the
necessary condition for these events to occur, is represented by ?6 ,$ in the
second case, and by @ACBD?6 E!F,$ in the third one. The latter defines a non-
deterministic event where represents a vector of distinct local variables. The,

4
so-called, before-after predicate GIHI3J!F,K $ associated with each event shape,
describes the event as a logical predicate expressing the relationship linking
the values of the state variables just before ( ) and just after ( K ) the event
“execution”.

Event Before-after Predicate LNMNOQPSR3PUTQV

WXZY9\ [ begin P^]`_aObP>c`R)PUV end _dOQP*R3PUTbV

WXZY9\ [ select efOQPUV then PA]EgfOQP c RhP>V end efOQPUV^ijgfOQP*RhP>TQV

WXZY9\ [ any Y where efO Y R3PUV then Pk]mlNOQP c RhP*R Y V end n Yo OZefO Y R3PUVCiplNObP*R)PUTqR Y VmV

Proof obligations are produced from events in order to state that the invari-
ant condition +r3,$ is preserved. We next give the general rule to be proved.
It follows immediately from the very definition of the before-after predicate,
GIHI3J!FrK $ of each event:

+, 2$tsuGIHv "!#rK $xw +, rKq$

Notice that it follows from the two guarded forms of the events that this
obligation is trivially discharged when the guard of the event is false. When
it is the case, the event is said to be “disabled”.

3.2 Model Refinement


The refinement of a formal model allows us to enrich a model in a step by
step approach. Refinement provides a way to construct stronger invariants
and also to add details in a model. It is also used to transform an abstract
model in a more concrete version by modifying the state description. This is
essentially done by extending the list of state variables (possibly suppress-
ing some of them), by refining each abstract event into a corresponding con-
crete version, and by adding new events. The abstract state variables,  ,
and the concrete ones,  , are linked together by means of a, so-called, glu-
ing invariant yf "!#%$ . A number of proof obligations ensure that (1) each
abstract event is correctly refined by its corresponding concrete version, (2)
each new event refines zD{}| , (3) no new event take control for ever, and (4)
relative deadlock-freeness is preserved.
Definition 3. : Refinement
We suppose that an abstract model H~ with variables  and invariant +r3,$
is refined by a concrete model €~ with variables  and gluing invariant
yd3J!F‚$ . If GIHƒHv "!#rK $ and GvH„€  2!FDKq$ are respectively the abstract and con-
crete before-after predicates of the same event, we have to prove the following
statement:

5
+, 2$s†yf "!#%$tsuGvH„€  2!FDK}$‡w @DrK:B3GvHƒHv J!FrK $ts†yf3rKh!#‚K}$#$

This says that under the abstract invariant +, 2$ and the concrete one yd3J!F‚$ ,
a concrete step GvH„€ 3,!F‚K}$ can be simulated ( @D,K ) by an abstract one GIH„HI "!#,K $
in such a way that the gluing invariant yf 2Kˆ!#‚KQ$ is preserved. A new event
with before-after predicate GvHI 2!F%Kb$ must refine zD{}| (,Ka&‰ ). This leads to
the following statement to prove:

+, 2$s†yf "!#%$tsuGvHI 2!F K $xw yf3J!# K $

Moreover, we must prove that a variant Š63‚$ is decreased by each new event
(this is to guarantee that an abstract step may occur). We have thus to prove
the following for each new event with before-after predicate GvHI 2!F:Kq$ :

+, ,$s†yd3J!F‚$suGIHI32!#‚Kq$jw Š13‚Kb$^‹4Š13‚$

Finally, we must prove that the concrete model does not introduce more
deadlocks than the abstract one. This is formalized by means of the follow-
ing proof obligation:

+r3,$ts†yf3J!#%$tsŒ:Ž}>)H„~($‘w Œ:Ž}>h€~($

where Œ%ŽqSm)H„~($ stands for the disjunction of the guards of the events of the
abstract model, and Œ%ŽqS>)€~($ stands for the disjunction of the guards of
the events of the concrete one. The MST problem can now be stated in the
B-event based framework.

4 Development of a spanning tree algorithm

4.1 Formal specification of the spanning tree problem

First we define elements of the current graph namely  over the set of nodes
namely . The graph is assumed to be undirected, which is modeled by
the symmetry of the relation of the graph. Node is the root of the resulting
tree and we obtain the following B definitions:

<’“ †”  t •s
 &–,— s
v˜t

The termination of the algorithm is clearly related to properties of the cur-


rent graph; the existence of the spanning tree is based on the connectivity
of the graph. The modelling of a tree uses the acyclicity of the graph. A tree

6
is defined by a root , a node: ˜4 , and a parent function (each node
has an unique parent node, but the root): 9˜™ ›šUœ‰, . A tree is
an acyclic graph. A cycle ž in a finite graph built on a set  , is a subset
of whose elements
¡ are members of the inverse image of ž under , for-
mally žŸ’ — žZ¢ . To fulfill the requirement of acyclicity, the only set ž that
enjoys this property is necessarily the empty set. We formalize it by the left
predicate that follows, which can be proved to be equivalent to the one on
the right, which can be used as an induction rule:

£2§
£ BS
žkB §
’– •s
ž„’– ¤s §
a¡ ¦ ¨˜ d¡ s
ž„’“ — žZ¢ § §
— ¢ ’
J
w
w
ž&0¥f$ §
u& $

We prove the equivalence using Atelier B. We can now define a spanning


tree (rooted by and with the parent function ) of a graph  as one whose
parent function is included in  , formally:

`©ª««S¬­«UŒ„
°±  E!®%$–& ¯ µ¶
–˜† ›š>œ(,  s a¡ §
£²§ § § § §
B ’4 s³¨˜ s ³ — ¢J’ w u& $ts
–’´

Now we can define the set ·EŽb¸>¸¹q%$ of all spanning trees (with root ) of the
graph  , formally:

·EŽ}¸m¸¹ ‚$x&¤š>mº `©ª««*¬­«ŒC E!#‚$Eœ

We define the property of being a connected graph by »`¼:««¸>»`·®¸>rq%$ :

»`¼:««¸>»`·®½ ¸>C ‚$4& ¯


u˜† ¿¾ s ¡À
£À À À À À
BS ’4 s³¨˜ sÁ ¢J’ w u& $Â

The graph  and the node are two global constants of our problem and
must satisfy properties stated above. Moreover, we assert that there is at
least one solution to our problem. The optimality of the solution will be
analyzed later, while introducing the cost function. Now, we build the first
model which computes the solution in one shot. The event span corresponds
to producing a spanning tree among the non-empty set of possible spanning
trees for  . The variable > contains the resulting spanning tree.

7
span & ¯
begin
>A-Q˜;·EŽ}¸m¸ ‚$
end
The invariant is very simple and only a type invariant.

mA˜ ¾Ã

The initialization establishes this invariant.

The current model is in fact the specification of the simple spanning tree
problem; we have not yet mentioned the cost function. The next step is to
refine the current model into a simple spanning tree algorithm.

4.2 Development of a simple spanning tree algorithm


The second model introduces a new event which gradually computes the
spanning tree by constructing the spanning tree in a progressive way. The
new event adds a new edge to the current tree  which partly spans  . The
chosen edge is such that the first component of the pair is in  _ and
the second one is in UÄ;Å{ˆ²{ˆ2 _ . These two new variables partition
the set of nodes and we obtain the following new properties to add to the
invariant of the current model.

 _Æ’Ç s
UÄ;Å{ˆ²{ˆ2 _Æ’Ç s
 _‘ȍUÄ;Å{ˆ²{ˆ2 _É&Ê s
 _ƒËp>ÄtÅ{ˆ²{h2 _É&0¥

A new event, progress, simulates the computation step of the current solu-
tion by choosing a pair maintaining the updated invariant.

progress & ¯
select
UÄtÅ{h²{ˆ2 _60 Ì& ¥
then
any J!# where
"!#<˜;Ís³J!#ΘÎ _I”ÎUÄ;Å{ˆ²{ˆ2 _
then
¨-.&É“È'š>¨ÏÐJœvºQº
 _-8&9 _¹È'š>,œIºQº
UÄ;Å{ˆ²{ˆ2 _-8&9UÄtÅ{ˆ²{ˆ2 _›š>,œ
end
end

8
The event span is simply refined by modifying the guard of the previous
instance of the event in the abstract model. The event is triggered when the
set of remaining nodes is empty: the variable m contains a spanning tree for
the graph  .

span & ¯
select
UÄtÅ{h²{ˆ2 _&0¥
then
mA-8&9
end

The invariant of the new model states the properties of the two new vari-
ables and relates them to previous ones.

 _ Æ’Ç s
UÄtÅ{ˆ²{ˆ2 _Æ’Ç s
 _ ‘ÈuUÄtÅ{h²{ˆ2 _É&Ñ s
 _ ƒËpUÄ;Å{ˆ²{ˆ2 _É&0¥ s
ҘÓ _›š>œŸr  _  d¡ s
£²§ § § § § §
B ’– _Ðs³¨˜ sÔ — ¢J’ w  _& $

The following initialization establishes the invariant:

¨-8&0¥ ºbº
 _ Õ-.&֚>œvºQº
UÄtÅ{ˆ²{ˆ2 _›-8&0 ›š>œ

The expression of the absence of deadlock is simply stated as follows:

UÄtÅ{h²{ˆ2 _&É¥Æ×
"!#<˜;Ís
UÄtÅ{h²{ˆ2 _6É Ì& ¥–sØ@²3J!#%$`B2Ù
"!#<˜t _”;>ÄtÅ{ˆ²{h2 _2Ú

We have obtained a simple iterative solution for the simple MST problem;
the solution follows the sketch of the algorithm given in the section describ-
ing the so called generic algorithm in the book of Cormen et al. [11]. We can
derive the following algorithm from the current model:

9
ÀaÛ
algorithm ‚>U{ž _~
¨-.&Í¥%Ü
 _„&(š>œÜ
while UÄ;Å{ˆ²{ˆ2 _ Ý Ì& ¥ do
let "!# where
"!#<˜;Ís³J!#ΘÎ _   I”ÎUÄ;Å{ˆ²{ˆ2 _
then
¨-.&É“È'š>¨ÏÐJœÜ
 _-8&9 _¹È'š>,œÜ
UÄ;Å{ˆ²{ˆ2 _-8&9UÄtÅ{ˆ²{ˆ2 _›š>,œ
end
end_while
>k-8&É

The next step refines the current model into a model where the cost function
is effectively used.

4.3 A proof view of the spanning tree algorithm

The previous model computes a spanning tree, when the graph is connected.
This algorithm looks like a proof of existence of a spanning tree; the follow-
ing lemma allows us to prove that the set of spanning trees is not empty
and hence a minimum spanning tree exists:

Lemma 1 (Existence of a spanning tree)


»`¼%««¸>»E·®¸UCq%$fwÞ·EŽ}¸m¸¹ ‚$É&Í
Ì ¥

However, the previous lemma requires to construct a tree from the hypoth-
esis related to the connectivity of the graph. Hence, we must prove a first
inductive theorem on finite sets, which will include the existence of a tree.
We suppose that the set is finite and there exists a function from
to ßSàbà  , where  is the cardinality of .

Lemma 2 (An inductive theorem on finite sets)

£
5ÝBS
5ᒓâA3 <$Îs
¥¨˜t5Çs
£
H4B*H֘t5ãsuHÒ0
Ì& Áw @%ÅÕBSÅ1˜t CHäsuH„ÈkšÅrœÕ˜;5Õ$F$
w
Әt5Õ$

We can use the previous lemma with the following set:

10
°ç µmé
ç æ˜;HI›š>œŸ, H s é
ç é
ç æ’èÍ s é
± °± À À ¡ À À µ¶ ¶
šHvº H֒“ åsØ@%æ<B ’–¤s¨˜ sØæ — ¢J’ œ
£²À
B w
À
H(’

to prove that the set of spanning trees of  is not empty.

5 Development of Prim’s algorithm


The cost function is defined on the set of edges and is extended over the
global set of possible pairs of nodes.

ž`*>k-12äêës
£
 "!#%$NBS3J!#;˜Î‰w ž`*>`3;ϝÐ%$f&ɞm*m`3 ϝÐ2$#$ts
€›*mA-*âAq%$¹2äêës
€›*m`®šSœ$f&9 °± ì4s µ¶
՘;âAq%$<suJ!#Θ<2ƒ
£
)!FJ!#%$NB w
€›*>`)AÈš>ÎϝÐ,œ$f&ɀ›*>`)$Fížm*m`3tϝÐ%$

We have proved that ·EŽ}¸m¸q%$ is not empty, since the graph  is connected;
the Ä> _UU`q%$ containing every minimum spanning tree of the graph  is
defined as follows:

Ä> _ U>` ‚$f& £


š>Ä>mº Ä>¹˜Î·EŽb¸>¸ ‚$;s ƒB ¨˜<·EŽ}¸m¸Dq%$fwÀ›*>` Ä>#$^›*>` *$F$Zœ

The set Äm _ >U` ‚$ is clearly not empty. The first «one shot» model is refined
into the new model which contains only one event span. We strengthen the
definition of the choice of the resulting tree by strengthening the condition
over the set and by choosing a candidate in the set of possible MST trees.

span & ¯
begin
>A-Q˜tÄ> _ U>` ‚$
end

The second model gradually computes the spanning tree by adding a new
edge to the current «under construction» tree  spanning a part of  . The
tree  is defined over the set of already treated nodes, called  _ . The
event progress is modified to handle the minimality criterion: the guard is
modified to integrate the choice of the minimum edge among the remaining
possible ones.

11
progress & ¯
select
>ÄtÅ{ˆ²{h2 _6&0
Ì ¥
then
any J!F where
"!#Θ<Ís³J!#Θt _ՔÎUÄ;Å{ˆ²{ˆ2 _Þs
£
3Å,!ïm$JB3Å<˜; _ ‡s
ÎUÄ;Å{ˆ²{ˆ2 _‘s
Å,!Î
w
ž`*>` 6ϝÐ,$^î4ž`*>`)ïϝäł$F$
then
¨-8&4–Ȇš> ϝäœvºbº
 _ -8&É _¹È'š>,œIºQº
UÄtÅ{ˆ²{ˆ2 _-8&ÉUÄtÅ{h²{ˆ2 _›š>,œ
end
end

The event span remains unchanged:

span & ¯
select
UÄtÅ{h²{ˆ2 _&0¥
then
mA-8&9
end

The invariant includes the invariant of the refined model of the generic re-
finement and we add that the current spanning tree  is a part of a mini-
mum spanning tree of the graph  :

Û Û Û
@ B ˜tÄt> _UU`q%$;sŸ’ $

The invariant implies that after completion, when the event span occurs,
the current spanning tree  is finally a minimal one. Since ·EŽ}¸m¸q%$ is not
empty, then Ä> _ UU`q%$ is not empty and a tree can be chosen in this non-
empty set to prove that a MST exists (this MST contains ¥ ). So the invariant
holds for the initialization, using the lemma 1. The difficult task is to prove
that the event progress maintains the invariant. We can take the minimum
spanning tree given by the invariant, if 6ϝÐ is in this tree. Or else we must
provide another minimum tree which includes the current one and the new
edge  ϝä .
In fact, textbooks provide algorithms implementing the greedy strategy and
we refer our explanations to the book of Cormen et al. [11]. The authors

12
prove a theorem page 501 numbered 24.1 to assert that the choice of the two
edges is done following a given requirement, namely a safe edge (a safe edge
is a edge allowing the progress of the algorithm). We recall the theorem:

Theorem 1. (24.1, p 501from [11])


Let  be a connected, undirected graph on (set of nodes) with a real-
valued weight function ž`*> defined on  (edges). Let  be a subset of  that is
included in some minimum spanning tree for  , let   _!F  _$
be any cut of  that respects  _ , and let  "!#%$ be a light edge crossing
  _ !F  _U$ . Then edge 3J!F‚$ is safe for  _ .

Let us explain notions of cut, crosses and light edge. A cut


  _ !F  _U$F$ of an undirected graph  is a partition of . An
edge  "!#%$ crosses the cut   _!F  _$ if one of its endpoints is
in  _  and the other is in  _ . An edge is a light edge cross-
ing a cut if its weight is the minimum of any edge crossing the cut. A light
edge is not unique.
Û
Proof: Let be a minimum spanning tree that includes  , and assume that
Û
does not contain the light edge  J!F%$ , sinceÛ if it does, we are done. We shall
construct another minimum spanning tree K that includes ÎÈ֚D J!F%$Zœ by
using a cut-and-paste technique, thereby showing that  J!F%$ is a safe edge for
Û
 . The edge 3J!#%$ forms a cycle with the edges on the path | from  to  in .
Since  and  are on opposite sides of the cut   _ !F _$ , there
Û
is at least one edge in on the path | that also crosses the cut. Let 3Å,!ï`$ be
any such edge. The edge 3Å,!ï`$ is not in  , because Û
the cut respects  .Û Since
(a,b) is on the unique path from  to  in , removing 3Å,!ïm$ breaks into
two
Û
components.
Û
Adding 3J!#%$ reconnects them to Û
form a new spanning tree
K„& ›š)År!Zï`$Zœ Èuš "!#%$Zœ . We next show that K is a minimum spanning
tree. Since 3J!#%$ is a light edge crossing   _!F  _$ and )År!Zï`$
also crosses this cut, ž`*>`3J!#%$^`*>`3Å,!ïm$ . Therefore,
Û Û
€›*m` K $¹&0€›*>` $ECž`*>`)År!Zï`$Fížm*m`3J!F‚$
Û
î4€›*>` $
Û Û Û Û
But is a minimum spanning tree, so that €›*m` $pîހ›*>` Kb$ ; thus, K
must be a minimum spanning tree also. It remains to show that 3J!#%$ is ac-
Û Û
tually a safe edge for  . We have Β K , since <’ and )Å,!ï`$6j ð˜  ; thus,
Û Û
ÈpšD "!#%$Zœ ’ K . Consequently, since K is a minimum spanning tree,  J!F%$
is safe for  . ñ

We have to prove the property above that has been in fact adapted into the
B proof engine. However, it is not a simple exercise of translation but a com-
plete formulation of graph-theoretical aspects; moreover, the proof has been
completely mechanized, as we will show in the next section. Let us com-
pare the theorem and our formulation. The pair   _!F  _$
is a cut in the left part of the implication; the restriction of the tree æ to
the set of nodes  _ is a tree rooted by ; 3J!#%$ crosses the cut. Those

13
assumptions imply that there exists a spanning tree #| rooted by that is
minimum on  _ and such that there exists a light cut )År!Zï`$ preserving
the minimality property.

We must give a formal description of this theorem. We introduce a predicate


ªU·EŽb¸>¸ E!F!#$ stating that a structure U is a tree on the set 
and whose root is  :

ªU·EŽ}¸m¸3E!#!F$É& ¯
°± µ¶
9˜ë s
Uá˜Ó›š>Zœ(,  s a
 ¡§
£2§ § § § §
B ’èäs³^˜ s³U — ¢"’ w „& $

Hence, we must add the following property which is proved separately.

£ Û
 !# _S!FJ!F‚$NB
 _’4 ¤s
<˜t •s
Û
ªU·EŽ}¸m¸3!F ! $
¨˜Î _‘s
˜t _‡s
 x˜Îð  _ $;s
Û4ó
ªU·EŽ}¸m¸3!# _!>  _ ›¡ š>œò  _$F$;s
£²À À À Û À À À
B ’“ åsô<˜ s ¢’ w ËN _áÍ Ì& ¥$
w
Û
@²)År!Zï! KQ$aBS
Û
År!ZïC˜ sØÅp˜; ð  _ ‡sØÎ _‘s
Û
ªU·EŽb¸>¸ ! !  K $ts
Û Û Û
K2’Æ È — ›šï^ϝäÅ,!FŨϝ´ïUœ$,ÈšU¨ä Ï Jœ‡s
Û Û
€›*m` K}$f&0€›*>` $ECž`*>`)ïϝäł$íž`*>` ¨ÏÃ,$ts
Û
 ϝИ K›s
Û4ó Û
  _S›šUœCò  _$A’ KQ$F$

The property is the key result for ensuring the optimality of the greedy
strategy in this process. In the next section, we detail the proof of our theo-
rem.

6 On the theory of trees

As we have mentioned previously, trees play a central role in the justifica-


tion of the algorithm; the optimality of the greedy strategy is mainly based
on the proof of the theorem used by Cormen et al. [11]. We should now detail
the theory of trees and intermediate lemmas required for deriving the theo-
rem. Both the development of the tree identification protocol IEEE 1394 [3]

14
and the development of recursive functions [8] require proofs related to the
closure of relations; we apply the same technique for the closure of a func-
tion defining
Û
a tree. Û
Let  !F$ be a tree defined by Û a tree function and a root ; they satisfy the
following axioms Û ªU·EŽ}¸m ¸ !F! $ .
The closure ž`õ of Û —  is the smallest relation containing ¬ ,3<$ and stable
by application of — , that is:

ž`õN˜ ³¾ •s
¬ r) <$è  Ç
’ `
ž õ s
Û
3ž`õ#Ü — $è’Çž`õ s
£
„B
¨˜; ö¾ •s
¬ ,3 <$ 蒴“s
Û
 DÜ — $è’Ã4s
w
ž`õp’Ã
$

Useful properties on the closure can be derived from those definitions; for
instance, the closure is a fix-point; the root is connected to every node of
the connected component; the closure is transitive, etc. We summarize those
properties using our notations:

Û 
ž`õ&÷¬ :) <$fȓ3ž`õ#Ü — $EÜ
”;  ë’4ž`õ®Ü
Û
 — Üž`õ)$k’4ž`õ®Ü
3ž`õ#ÜFž`õh$A’“ž`õ#Ü
Û
Ëž`õ&Ý  ¥‚Ü
ž`õËpž`õ— ’4¬ ,3 <$`Ü

Figure 1 contains a tree with the edge ïpϝøÅ and without the edge Ýϝ
 . The construction of a new tree which contains the edge 4ϝ³ but not
the edge ï6ϝåÅ is done according to the following points (see the result in
Figure 3):

1. remove the edge ïϝäÅ


2. reverse all edges between  to ï (dashed arrows)
3. add the edge ¨Ïä

The resulting object seems to be a tree rooted by . On Figure 2 we observe


that the both parts are subtrees rooted by or ï . We should prove these two
facts.

15
a
b
x
y

Fig. 1. A spanning tree containing ù>úû

a
b
x
y

Fig. 2. Two trees

a
b
x
y

Fig. 3. A spanning tree containing ü‚úˆý

16
Lemma 3 (Concatenation of two separate trees) Û
ªU·EŽ}¸m¸3  !F  !  $
þ þ Û²þ
Û Û2þ þ þ
 ªU·EŽ}¸m¸3 þ F! !
Let  #!  !F  ! !# !F !F be such that: ÿ  Ëa &Ý¥
 Èa þ &0
˜ 
Û Û þ þ
Then ªU·EŽb¸>¸  !F !  È ÈkšU ϝÐJœ$ . 
Proof Sketch: The proof is made up of several steps. A þ first step proves that
the concatenation is a total function over the set  Èa . A second one leads
to a more technical task and we should prove the À À
inductive property
À þ
over
trees using a splitting of the inductive variable ( Ë  and Ë ). ñ

Lemma 4 (Subtree property)


Û Û
Let  !#*$ be a tree ¡
on ¡ (ªU·EŽ}¸m¸ ! ! $ ) and ï a node in .
Û
Then ªU·EŽb¸>¸)ï!Fž`õ šïUœm¢ˆ!>)ž`õ šïUœ>¢h›šïUœfò $#$

Proof Sketch: The À main difficulty ¡ is related


À
to the ¡ inductive Û
part.
¡ À We
must prove ¡ that, if ’ `
ž õ 
š U
ï m
œ ¢ , Ÿ
ï ˜ and 3
 m
ž õ 
š U
ï >
œ h
¢ ›
 
š U
ï 6
œ ò $ — ¢ ’
À À Û
À
, then ž`õ š¡ ïUœ>¢•’ . We use the inductive property on with the set
Èp Cžmõ šïUœ>¢ . ñ

Lemma 5 (Complement of a subtree)


Û
Let  !#*$ be a tree on¡ and ¡
ï a node in .
Û
Then ªU·EŽb¸>¸ ! Cž`õ šïUœm¢ˆ!>)ž`õ šïœm¢ $#$ .

À ¡ À
Proof ¡ Sketch:
Û ¡ À We should
À
prove that,
¡ if À ’ Cž`õ šïUœm¢ , ïƘ and
3ž`õ šïUœm¢ $ — ¢;’ , then C¡ `
ž õ 
š 
ï m
œ 6
¢ ’ . A hint is to use the inductive
Û À
property on with the set Èpž`õ šïUœm¢ . ñ
Now, we must characterize the subtree, where we have reversed the edge
Û Û
between ¡  to the root ï . Let ²ïZ !ïm$ be the subtree of with ï as root
Û
(it’s ž`õ šïUœm¢ˆ›šïUœkò ). This following function seems to be a good choice:

¡ Û ¡ Û 
)ž`õ — š>,œm¢ ²ïZUD !ï`$F$aȓ)ž`õ — š>,œm¢%ò‡2ïED !ï`$F$ —

¡ Û  *¡
3ž`õ — š>,œ>¢ò; ²ïZD !Zï`$#$ — is exactly all reverse edges. ž`õ — šUrœ>¢ is the set of
all parents of  .

Lemma 6 (Reverse from  to ï produces a tree)


 ¡
Let ï!# such that:
1˜ž`õ šïUœm¢
¡ U¡ Û *¡ Û 
Then ªU·EŽ}¸m¸ 2!Fžmõ šïUœ>¢h!U3ž`õ — š>,œm¢  ²ïZ !ïm$#$Nȑ3žmõ — šU,œm¢Sò 2ïED !ï`$F$ — $
¡
Proof Sketch: In this case we must use an induction on the tree ž`õ šïUœm¢ and
sometimes use an second induction with the inductive property in hypothesis. ñ

17
Lemma 7 (Existence of a spanning
Û
tree)
 ï!FÅ<˜ ¡
Let Å,!Zï!#"!# such that <˜tž`õ  š ïœm¢ ¡
ÿ  -S  ž`õ šïUœ>¢
C
Û
Then there
Û exists Û aÛ tree   K such that:
KΒ  È — ›šUÅvϝÃï!ï^ä Ï Årœ$dȑš> ϝäœ
Û
 ªU·EŽ}¸m¸3!Û F ! Kb$
Û
ÿ €›*>` K}$f&݀›*m` $ECžm*m`hïAϝÃÅD$íž`*>` 6Ð Ï ,$
Û
¨ÏÐ¡ †
 ˜ K
Û Û
ž`õ šïUœm ¢  ’ K
 Û
Proof Sketch: K is obtained by concatenation of . the two trees identified
in the two previous lemmas. Both trees are linked by the edge ¨Ïä . ñ
Finally, we have to prove the existence of an edge ïϝäÅ which is safe in the
sense of the greedy strategy.

Lemma 8 (Existence of ïAϝÃÅ )


 _ƒ´
1˜  _
 ÿ I˜t _
Let  _!# such that: °± À À Û ¡À À µ¶
’– •sΘ s ¢’
£²À
B w
À
ËN _áÑ
Ì& ¥
 Å1˜; _ 
 ï^ϝÃÅ1˜
Û
Then there exists Å and ï such that: .
ÿ 礼; ð  _¡ 
tž`õ — š>,œm¢

The property of the existence of a minimum spanning tree can now be de-

rived using lemmas and the proof of the property is then completely mech-
anized.

7 Related works
The refinement is a concept introduced by Back in his seminal paper [6] and
it is developed from the wp semantics defined by E. Dijkstra [12]. Those
seminal works inspire notations or methods for developing programs and
systems (Morgan[19], Abrial [1, 5], Gribomont [14] . . . ). Temporal aspects
are integrated into notations for extending the expressivity of the model-
ing language (UNITY [9], TLA/TLA [17, 18]). Clearly, our contribution is
based on the B event-based method, which is a proposal for an integrated
mechanization of the refinement process and a methodology for develop-
ing (reactive, distributed, sequential) systems; the B event-based approach
proposes a complete environment supporting the methodological approach.
However, as it was pointed out by previous readers, our works require com-
parisons with development of MST-like systems using a formal framework.

18
First at all, Cormen et al. [11] present a collection of algorithms which are
justified in a pseudo-mathematical language, leaving details of formaliza-
tion to the reader; we obtain a complete verified version of the algorithm.
The paper of Stroetman [21] addresses a similar kind of case study, the
constrained minimal spanning tree problem, using the ASM [15] notation.
Proofs are clearly done in a very classical way using the ASM mathematical
framework and no reference to the use of a proof tool is given. Moreover, our
case study is related to Prim’s algorithm and we used an incremental pro-
cess, namely the refinement. Although Stroetman mentions a refinement,
it is not systematically used for constructing the final solutions. Fraer [13]
has developed the Kruskal algorithm using the «classical» B method. In fact,
the main difference is our use of the B event-based approach for develop-
ing events systems. Even if Fraer has proved a part of proof obligations,
it is clear that some proof obligations might be found false in the remain-
ing unproved proof obligations. We recall that the goal of the refinement is
to produce systems which are completely proved with respect to the target
properties. Finally, we can reuse a part of our development for obtaining the
Kruskal algorithm.
Our paper illustrates a systematic approach for developing event systems
combining the refinement and the proof process supported by a tool; formal-
izations of concepts like graphs and trees are designed using the proof tool
and are checked by the proof tool; they can be reused in other case studies.

8 Conclusion
The development of Prim’s algorithm leads us to state and to prove proper-
ties over trees. The inductive definition of trees helps in deriving intermedi-
ate lemmas asserting that the growing tree converges to the MST, according
to the greedy strategy. The resulting algorithm is completely proved and we
can partially reuse current developed models to obtain Dijkstra’s algorithm
or Kruskal’s one. The greedy strategy is not always efficient and the opti-
mality of the resulting algorithm is proved by the theorem 24.1. The gain
is clear, since we have a mechanized and verified proof of Prim’s algorithm.
The mechanized proof looks like the proof given in the book of Cormen et
al.; we think that it is a proof readable by specialists of graph theory. More-
over, sketches of proofs are directly derived from the mechanized proofs; in
fact, the tool provides a way to discover these sketches. We can plan the re-
writing of a book like Cormen et al including algorithms, complete develop-
ments of each algorithm, proofs of developed algorithms and the possibility
to replay the developments to explain how each algorithm is really working.
Future work will study other techniques related to algorithmics for graph
theory, since only one chapter in the book of Cormen et al. is treated!

19
References
1. J.-R. Abrial. A formal approach to large software constructions. In J. L. A
van de Snepscheut, editor, Mathematics for Program Construction, pages 1–20.
Springer-Verlag, june 1989. LNCS 375.
2. J.-R. Abrial. Extending B without changing it (for developing distributed sys-
tems). In H. Habrias, editor, 1  Conference on the B method, pages 169–190,
November 1996.
3. J.-R. Abrial, D. Cansell, and D. Méry. A mechanically proved and incremental
development of the IEEE 1394 Tree Identify Protocol . Formal Aspects of Com-
puting, ??(??), 2002. accepted for publication.
4. J.-R. Abrial and L. Mussat. Introducing dynamic constraints in B. In D. Bert,
editor, B’98 :Recent Advances in the Development and Use of the B Method, vol-
ume 1393 of Lecture Notes in Computer Science. Springer-Verlag, 1998.
5. J.R. Abrial. The B Book - Assigning Programs to Meanings. Cambridge Univer-
sity Press, 1996. ISBN 0-521-49619-5.
6. R. J. R. Back. On correct refinement of programs. Journal of Computer and
System Sciences, 23(1):49–68, 1979.
7. Dominique Cansell, Ganesh Gopalakrishnan, Mike Jones, Dominique Méry, and
Airy Weinzoepflen. Incremental proof of the producer/consumer property for the
pci protocol. In D. Bert, editor, ZB 2002, Lecture Notes in Computer Science.
Springer-Verlag, January 2002.
8. Dominique Cansell and Dominique Méry. Développement de fonctions définies
récursivement en b. Technical report, LORIA, 2002.
9. K. M. Chandy and J. Misra. Parallel Program Design A Foundation. Addison-
Wesley Publishing Company, 1988. ISBN 0-201-05866-9.
10. ClearSy, Aix-en-Provence (F). Atelier B, 2002. Version 3.6.
11. Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Cliff Stein. In-
troduction to Algorithms. MIT Press and McGraw-Hill, 2001.
12. E. W. Dijkstra. A Discipline of Programming. Prentice-Hall, 1976.
13. R. Fraer. Formal Development in B of a Minimum Spanning Tree Algorithm. In
H. Habrias, editor, 1  Conference on the B method, pages 169–190, November
1996.
14. E. Pascal Gribomont. Concurrency without toil : a systematic method for paral-
lel program design. Science of Computer Programming, 21:1–56, 1993.
15. Y. Gurevitch. Specification and Validation Methods, chapter "Evolving Algebras
1993: Lipari Guide", pages 9–36. Oxford University Press, 1995. Ed. E. Börger.
16. J. B. Kruskal. On the shortest spanning subtree and the traveling salesman
problem. Proc. Am. Math. Soc., 7:48–50, 1956.
17. L. Lamport. A temporal logic of actions. Transactions On Programming Lan-
guages and Systems, 16(3):872–923, May 1994.
18. Leslie Lamport. Specifying Systems: The TLA+ Language and Tools for Hard-
ware and Software Engineers. Addison-Wesley, 2002.
19. C. Morgan. Programming from Specifications. Prentice Hall International Se-
ries in Computer Science. Prentice Hall, 1990.
20. R. C. Prim. Shortest connection and some generalizations. Bell Syst. Tech. J.,
36, 1957.
21. K. Stroetmann. The constrained shortest path problem: A case study in using
ASMs. J. of Universal Computer Science, 3(4):304–319, 1997.

20

View publication stats

You might also like