[...]tion for Wide Area Replicated Memory

Alfonso Sanchez, Luis Veiga, Paulo Ferreira
INESC/IST, Rua Alves Redol Nº 9, Lisboa 1000-029, Portugal
{alfonso.sanchez, luis.veiga, paulo.ferreira}@inesc.pt
[...] appear each day. Typical examples are found in the fields of concurrent engineering, cooperative ap[...]

Manual memory management is extremely difficult when developing the aforementioned dis[...]

[...] and x_j, respectively. Now, suppose that x_i contains a reference to an object z in another process k, x_j [...] and x_j is locally reachable^1. Then, the question is: [...]

1 Local (un)reachability is related to (un)accessibility [...]
[Figure 1 diagram: processes i and j, each with a local root and a replica of x; process k with a local root and object z.]

Figure 1: Safety problem of current DGC algorithms which do not handle replicated data: z is erroneously considered unreachable.

[...] i.e. it is out of the scope of the paper how the algorithm behaves in the presence of communication failures and process crashes. However, solutions similar to those found in classical DGC algorithms can also be applied (for example, leases as in RMI [18]).

This paper is organized as follows. In Section 2 we present the model of a WARM for which the DGC was defined. The DGC algorithm is described in Sections 3 and 4. Section 5 highlights some of the most important implementation aspects. Section 6 presents some performance results from a real application. The paper ends with some related work and conclusions in Sections 7 and 8, respectively.
[...] of all the replicas of the source object, x in this example, do not refer to it. We call this the Union Rule (more details in Section 4.2.2).

The second drawback, i.e. imposing severe constraints on scalability, affects current DGC algorithms conceived for WARM systems, such as Larchant [5, 10]. As a matter of fact, such algorithms are not scalable because they require the underlying communication layer to support causal delivery.

So, in conclusion, classical DGC algorithms, such as IRC and SSP Chains, are not safe for WARM systems but promise to be scalable (in particular, they do not require causal delivery); on the other hand, WARM-specific DGC algorithms, such as Larchant, deal safely with replication but lack scalability.

Thus, the main contribution of this work is the following: showing how classical DGC algorithms (conceived for function-shipping based systems) can be extended to handle replication while keeping their scalability.

We do not address the issue of fault-tolerance, [...]

[...] from the enclosing process's local root.

2 In distributed systems with replicated data, an "acquire" operation allows a process to update its local replica of a particular object with the contents of another replica of that same object, residing in some other process, by means of a data-shipping mechanism.

[...] delivery of that message is noted <deliver:M>_i→j.

In a WARM, the only way to share information is by replication of data, which can be done with a DSM-based mechanism [12]. Thus, processes do not use Remote Procedure Call (RPC) to access remote data.

It is worth noting that application code inside a process never sends messages explicitly. Instead, application code always accesses data locally; transparently to the application code, the WARM run-time system is responsible for replicating data locally when needed.

Each participating process in the WARM encloses, at least, the following entities: memory, mutator^3, and a coherence engine. In our WARM model, for each one of these entities, we consider only the operations that are relevant for GC purposes.

We believe that our model is sufficiently general to describe most distributed systems supporting wide area applications using data shipping. This model clearly defines the environment for which the DGC algorithm is conceived.

3 The term mutator [7] designates the application code which, from the point of view of the garbage collector, mutates (or modifies) the reachability graph of objects.
2.1 Memory Organization

[...] web pages, etc.

Objects can contain references pointing to other objects. An outgoing inter-process reference is a reference to a target object in a different process. An incoming inter-process reference is a reference to an object that is pointed to from a different process. Our model does not restrict how references are actually implemented. They can be virtual memory pointers, URLs, etc.

An object is said to be reachable if it is attainable directly or indirectly from a GC root (defined in Section 3.1). An object is said to be unreachable if there is no reference path (direct or indirect) from a GC root leading to that object.

The unit for coherence is the object. Any object can be replicated (i.e. cached) in any process. A replica of object x in process i is noted x_i. Each [...]

[Figure 2 diagram: object z in process k, with its local root, shown before and after <x:=y>_i.]

Figure 2: Creation of a new inter-process reference to object z through an assignment operation.

[...] or deleting references. An object becomes unreachable when the last reference to it disappears; when this occurs, such an object can be safely reclaimed by the garbage collector because there is no possibility for any process to access it.
The local collector starts the graph tracing from the process's local root and set of scions. For each outgoing inter-process reference it creates a stub in the new set of stubs. Once this tracing is completed, every object locally reachable by the mutator has been found (e.g. marked, if a mark-and-sweep algorithm is used); objects not yet found are locally unreachable; however, they can still be reachable from some other process holding a replica of, at least, one of such objects (as is the case of x_i in Figure 1).

To prevent the erroneous deletion of such objects, the collector traces the object graph from the lists inPropList and outPropList, and performs as follows. When a locally reachable object (previously discovered by the local collector) is found, the tracing along that reference path ends.

[...] these lists are not locally reachable (i.e. by the local mutator); however, they cannot be reclaimed without ensuring their global unreachability, i.e. that none of their replicas are accessible. This will be explained in detail in the following section.

The DGC algorithm is independent of the particular coherence protocol implemented by the coherence engine. In other words, the DGC algorithm does not require waiting for replicas to be coherent.

7 Note that from now on, the replica is not reachable by the local mutator; if another propagate operation occurs bringing a new replica of that same object into the process, the old replica remains locally unreachable, and a new entry is created in the inPropList with the corresponding sentUmess set to 0.
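The two tracing passes of the local collector can be illustrated with a minimal Python sketch. It is not the authors' implementation: the adjacency-dictionary representation and the names `trace`, `local_collection`, `remote_refs`, `scion_targets` and `prop_list_replicas` are assumptions introduced here. The sketch also assumes that stubs are recorded for replicas retained only because of the propagation lists, which the text does not state explicitly.

```python
def trace(graph, starts, stop_at=frozenset()):
    """Return every object reachable from `starts`, never entering `stop_at`."""
    found, stack = set(), list(starts)
    while stack:
        obj = stack.pop()
        if obj in found or obj in stop_at:
            continue  # already visited, or tracing along this path ends here
        found.add(obj)
        stack.extend(graph.get(obj, ()))
    return found


def local_collection(graph, remote_refs, local_root, scion_targets, prop_list_replicas):
    """One local collection over `graph` (object -> locally referenced objects).

    `remote_refs` maps an object to its outgoing inter-process references.
    """
    # Pass 1: objects locally reachable from the local root and the scions.
    reachable = trace(graph, list(local_root) + list(scion_targets))
    # Pass 2: trace from inPropList/outPropList, stopping at objects found in pass 1.
    only_from_lists = trace(graph, prop_list_replicas, stop_at=reachable)
    kept = reachable | only_from_lists
    # Assumption: the new stub set covers every retained object.
    new_stubs = {(obj, target) for obj in kept for target in remote_refs.get(obj, ())}
    garbage = set(graph) - kept
    return reachable, only_from_lists, new_stubs, garbage


# Hypothetical process: "a" and "b" hang from the local root, "x" is a replica
# listed in a propagation list and refers to remote object "z@k".
graph = {"a": ["b"], "b": [], "x": ["w"], "w": [], "dead": []}
remote_refs = {"x": ["z@k"]}
reachable, only_from_lists, new_stubs, garbage = local_collection(
    graph, remote_refs, local_root=["a"], scion_targets=[], prop_list_replicas=["x"]
)
```

In this example only `dead` can be reclaimed locally; `x` and `w` are retained until their global unreachability has been established.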
message        sent/received by   sent when
unreachable    LGC/DGC            object replica is reachable only from the inPropList
reclaim        LGC/DGC            all object replicas are reachable only from the inPropLists
newSetStubs    DGC/DGC            a new set of stubs is available
[...] collection; this message is sent to the processes holding the scions corresponding to the stubs in the previous stub set. In each of the receiving processes, the distributed collector matches the just received set of stubs with its set of scions; those scions that no longer have the corresponding stub are deleted.

As previously described, when a local collection takes place two kinds of messages may be sent: unreachable and reclaim. On the receiving process, these messages are handled by the distributed collector, which performs the following operations: it sets the recUmess bit in the corresponding outPropList entry, and deletes the corresponding entry in the inPropList, respectively.

[...] the model, the components responsible for sending and receiving them, and when they occur. In Table 2 we present all the events with impact on the GC and the corresponding actions taken. These two tables summarize the way GC is performed. In the next section we describe the DGC algorithm in more detail using a prototypical example.

8 Note that this may result in the creation of chains of stub-scion pairs, as happens in the SSP Chains algorithm [16].

[...] points to object z in process k. There is a single stub-scion pair (s2-s1) describing the only outgoing inter-process reference from y_j to z. For the sake of simplicity of our description, we assume that this stub-scion pair is created when the system boots.^9

Then, the sequence of steps of the prototypical example considers the following operations (see Figures 5 and 6; the effects of the operations are shown in bold).

Step 1 - Propagate y from process j to process i; this results in the creation of a new outgoing inter-process reference from object y in i to object z in k.

Step 2 - The operation <x := y>_i is performed [...]

[...] outgoing inter-process reference from object x in j to object z in k.

Step 4 - The operation <y := 0>_j is performed by the mutator in j; this results in the deletion of an outgoing inter-process reference from object y in j to object z in k.

9 For example, the reference to z could be obtained from a name service.
event                        occurs when                action taken

reference exported           propagate an object        create scion
from a process

reference imported           propagate an object        create stub
into a process

object replica reachable     LGC runs                   send unreachable message to the process with
only from the inPropList                                the corresponding outPropList entry; set the
                                                        sentUmess bit accordingly

unreachable message          unreachable message sent   set the recUmess bit accordingly; if all
received                                                recUmess bits for a particular object are set,
                                                        then send the corresponding reclaim messages
                                                        and delete the outPropList entry

reclaim message              reclaim message sent       delete corresponding inPropList entry
received

new set of stubs             LGC runs                   newSetStubs message sent to the processes
available                                               holding the scions corresponding to the
                                                        previous set of stubs

newSetStubs message          newSetStubs message sent   compare stubs set with set of scions; delete
received                                                scions with no corresponding stubs
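The receiving side of the three messages can be sketched in Python as follows. This is only an illustration under stated assumptions: the class `DistributedCollector`, its method names and the dictionary layouts are hypothetical, and sending the reclaim messages is left to the caller, who can use `all_replicas_unreachable` to check whether every recUmess bit of an object is set.

```python
class DistributedCollector:
    """Hypothetical per-process handler for unreachable, reclaim and newSetStubs."""

    def __init__(self):
        self.in_prop_list = {}   # (object, source process) -> sentUmess bit
        self.out_prop_list = {}  # (object, destination process) -> recUmess bit
        self.scions = {}         # scion id -> process holding the matching stub

    def on_unreachable(self, obj, sender):
        # Set the recUmess bit of the matching outPropList entry, if any.
        key = (obj, sender)
        if key in self.out_prop_list:
            self.out_prop_list[key] = 1

    def on_reclaim(self, obj, sender):
        # Delete the matching inPropList entry.
        self.in_prop_list.pop((obj, sender), None)

    def on_new_set_stubs(self, sender, scions_still_referenced):
        # Delete the scions of `sender` that no longer have a corresponding stub.
        for scion_id, holder in list(self.scions.items()):
            if holder == sender and scion_id not in scions_still_referenced:
                del self.scions[scion_id]

    def all_replicas_unreachable(self, obj):
        # True when every outPropList entry of `obj` has its recUmess bit set.
        bits = [bit for (o, _), bit in self.out_prop_list.items() if o == obj]
        return bool(bits) and all(bits)


# Process j propagated x to i and holds scion s3 for a stub located in i.
dgc_j = DistributedCollector()
dgc_j.out_prop_list[("x", "i")] = 0
dgc_j.scions["s3"] = "i"
dgc_j.on_unreachable("x", "i")
ready = dgc_j.all_replicas_unreachable("x")
dgc_j.on_new_set_stubs("i", scions_still_referenced=set())

# Process i received x from j and later gets the reclaim message.
dgc_i = DistributedCollector()
dgc_i.in_prop_list[("x", "j")] = 1
dgc_i.on_reclaim("x", "j")
```

Keeping the handlers free of any ordering requirement reflects the property claimed for the algorithm, namely that it does not depend on causal delivery; the sketch itself does not prove that property.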
Step 5 - Propagate y from process j to process i; this results in the deletion of an outgoing inter-process reference from object y in i to object z in k.

Step 6 - The operation <x := 0>_i is performed by the mutator in i; this results in the deletion of an outgoing inter-process reference from ob[...]

[...] creation of new outgoing inter-process references; the last five steps result in z becoming unreachable. In the next sections we describe how the DGC works in order to deal with this prototypical example.

4.1 Creation of Outgoing Inter-process References

In the prototypical example, the creation of outgoing inter-process references occurs first by propagation (step 1), then by reference assignment (step 2), and finally by propagation again (step 3). We address these cases now.

4.1.1 Propagation

The first operation in the prototypical example is propagate(y)_j→i (Figure 5, step 1). Immediately be[...] in process i, object y has to be scanned for imported outgoing inter-process references in order to create the corresponding stubs in process i, if they do not exist yet. In the prototypical example, y contains a single reference and there is no stub describing it in process i. Thus, the corresponding stub s4 is created (shown in bold); this stub, through its internal data structures, refers to the scion previously created in process j. Then, the mutator may freely access object y in process i.

Thus, the information stored in the stub-scion [...]
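The bookkeeping triggered by a propagate operation can be sketched in Python as below. The sketch is an assumption-laden illustration, not the paper's code: the class `Proc`, the function `propagate`, and the choice to identify a stub by the remote reference it describes are all hypothetical. It follows the rule that exporting a reference creates a scion and importing it creates a stub only if none exists yet, with new propagation-list entries starting at 0.

```python
class Proc:
    """Hypothetical process state relevant to propagation."""

    def __init__(self, name):
        self.name = name
        self.replicas = {}       # object -> list of outgoing inter-process references
        self.stubs = {}          # remote reference -> process holding its scion
        self.scions = set()      # (remote reference, importing process)
        self.in_prop_list = {}   # (object, source process) -> sentUmess bit
        self.out_prop_list = {}  # (object, destination process) -> recUmess bit


def propagate(obj, src, dst):
    """Propagate the replica of `obj` held by `src` into `dst`."""
    refs = list(src.replicas[obj])
    # Export side: one scion per enclosed reference, plus an outPropList entry.
    for ref in refs:
        src.scions.add((ref, dst.name))
    src.out_prop_list[(obj, dst.name)] = 0
    # Import side: install the replica and record it in the inPropList.
    dst.replicas[obj] = refs
    dst.in_prop_list[(obj, src.name)] = 0
    # Scan the imported object and create the stubs that do not exist yet.
    for ref in refs:
        dst.stubs.setdefault(ref, src.name)


# Step 1 of the prototypical example: y, which refers to z, goes from j to i.
j, i = Proc("j"), Proc("i")
j.replicas["y"] = ["z"]
propagate("y", j, i)
```

After the call, process i holds a stub for z that refers back to the scion created in j, which corresponds to the stub s4 and scion s3 of the example.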
[Figures 5 and 6: diagrams of processes i, j and k after each step of the prototypical example, showing the local roots, the replicas of x and y, object z, the inPropList and outPropList entries, and the stubs and scions S1 to S7. The step captions are listed below.]

Step 1: propagation of object y from process j to process i.
Step 2: creating a new inter-process reference through <x:=y>_i.
Step 3: propagation of object x from process i to process j.
Step 4: deleting an inter-process reference through <y:=0>_j.
Step 5: propagation of object y from process j to process i.
Step 6: deleting an inter-process reference through <x:=0>_i.
Step 7: object x in process j becomes unreachable from the local root.
Step 8: object x in process i becomes unreachable from the local root.
[...] create inter-process references very easily and frequently, through a simple reference assignment operation, such an increment would be extremely inefficient. As a matter of fact, this would require instrumenting every reference assignment and incrementing a counter accordingly, possibly on some remote process. In the following sections it will become clear that such an increment (or equivalent operation) does not need to be performed immediately.

[...] being propagated to i no longer points to any object, after the propagate is delivered, the outgoing inter-process reference from object y in process i to z is (implicitly) deleted (Figure 6, step 5). At this moment, there is absolutely no operation to be done for GC purposes. Note that, given that the object being propagated contains no references, both safety [...]

10 Note that if a local collection has previously taken place in process i, stub s5 would have been already created.
[Figure 7 diagram: timeline for processes i, j and k with the columns "8th step: initial situation" and "1st LGC" to "6th LGC", showing the unreachable(x), reclaim(x) and newSetStubs messages, the inPropList and outPropList entries, and the progressive deletion of the stubs and scions S1 to S7.]

Figure 7: Timeline describing the GC operations after the 8th step of the prototypical example.
[...] rules do not imply the execution of any particular operation.

The sixth step of the prototypical example is the execution of the operation <x := 0>_i. This results in the deletion of the outgoing inter-process reference from object x in process i to object z in process k (Figure 6, step 6). At this moment, there is absolutely no operation to be done for GC purposes.

The seventh step of the prototypical example makes object x in process j unreachable from the local root. The last step makes object x in process i unreachable from the local root. In both cases there is absolutely no operation to be done for GC purposes.

So far, the DGC has performed no operation. In particular, no scion has been deleted. Consequently, object z, which is no longer reachable, has not been reclaimed yet. This will happen only after its protecting scion s1 in process k is deleted and the local collector is executed. Now we address the modification and deletion of stubs and scions.

4.2.2 Collecting Garbage

In step 8 of the prototypical example we see that object z will be reclaimed by the local collector in process k only after its protecting scion s1 has been deleted. This scion will be deleted only after the corresponding stub s2 in process j has disappeared; this will occur only after the whole chain of stub-scion pairs s7-...-s3 gets deleted.

According to Section 3.2, the stubs and scions will disappear as a result of the local and distributed collectors in processes i and j, as explained now (see Figure 7).

1st LGC - The local collector in process i detects that object x is reachable only from the inPropList; thus, a message unreachable is sent to process j and the corresponding sentUmess bit is set.
When this message is delivered in process j, the recUmess bit in the corresponding entry of outPropList is set.

2nd LGC - The local collector in process j detects that object x is reachable only from the outPropList and the corresponding entry has its recUmess bit set to one; thus a message reclaim is sent to process i and the entry in the outPropList is deleted.
When this message is delivered in process i, the corresponding entry in inPropList is deleted.

3rd LGC - As a result of a local collection in process j, x is reclaimed and, consequently, stub s7 describing its outgoing inter-process reference to object z is not in the new set of stubs.
This new set of stubs is sent as a newSetStubs message from process j to process i; then, the distributed collector in i deletes the corresponding scion s6.
Note that stub s2, in spite of the fact that y in j holds no outgoing inter-process reference anymore, is still in the new set of stubs because it is reachable from scion s3 through its Chain data structure.

4th LGC - As a result of a local collection in process i, object x is reclaimed and the new set of stubs does not contain any stub (s5 and s4, in particular) because there are no outgoing inter-process references.
This new set of stubs is sent as a newSetStubs message from process i to process j; then, the distributed collector in j deletes the corresponding scion s3.

5th LGC - As a result of a local collection in process j a new set of stubs is generated in which there is no stub (i.e. s2) because there are no outgoing inter-process references.
This new set of stubs is sent as a newSetStubs message from process j to process k; then, the distributed collector in k deletes the corresponding scion s1.

6th LGC - Finally, a local collection occurs in [...]

[Figure diagram: two sites, S1 and S2, with an NG client (with a web browser component inside) and an NG server (web server with servlets), connected by the standard HTTP protocol, the NG protocol (make-replica) and the DGC protocol. At each site, files are accessed with any tool and made available by means of a web server.]
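The cascade described in the 3rd to 5th LGC steps, where each local collection shortens the chain of stub-scion pairs by one hop, can be replayed with a small Python simulation. Everything in it is an assumption made for illustration: the `Scion` and `Process` classes, the `chain` field standing in for the Chain data structure, and in particular the wiring of the stubs (s4 and s5 in i both referring to scion s3, and scion s6 chaining to s4 and s5), which is guessed from the text rather than read off Figure 7. The starting state is the one after the 2nd LGC, with x no longer retained in either process.

```python
from dataclasses import dataclass, field


@dataclass
class Scion:
    source: str                              # process holding the matching stub
    chain: set = field(default_factory=set)  # local stubs kept alive by this scion


@dataclass
class Process:
    stubs: dict = field(default_factory=dict)          # stub id -> (target process, scion id)
    scions: dict = field(default_factory=dict)         # scion id -> Scion
    held_by_objects: set = field(default_factory=set)  # stubs used by retained objects


def local_collection(name, processes):
    """Run one LGC in `name`, then deliver the resulting newSetStubs messages."""
    proc = processes[name]
    previous = proc.stubs
    keep = set(proc.held_by_objects)
    for scion in proc.scions.values():
        keep |= scion.chain  # stubs reachable from a scion stay in the new set
    proc.stubs = {s: t for s, t in previous.items() if s in keep}
    # newSetStubs goes to every process holding a scion for the previous stub set.
    for target in {t for t, _ in previous.values()}:
        referenced = {sid for t, sid in proc.stubs.values() if t == target}
        receiver = processes[target]
        stale = [sid for sid, sc in receiver.scions.items()
                 if sc.source == name and sid not in referenced]
        for sid in stale:
            del receiver.scions[sid]


processes = {
    "i": Process(stubs={"s4": ("j", "s3"), "s5": ("j", "s3")},
                 scions={"s6": Scion("j", {"s4", "s5"})}),
    "j": Process(stubs={"s2": ("k", "s1"), "s7": ("i", "s6")},
                 scions={"s3": Scion("i", {"s2"})}),
    "k": Process(scions={"s1": Scion("j")}),
}

history = []
for name in ("j", "i", "j"):  # 3rd, 4th and 5th LGC
    local_collection(name, processes)
    history.append(sorted(s for p in processes.values() for s in p.scions))
```

Under these assumptions the scions disappear in the order s6, s3, s1, so a subsequent local collection in process k finds z unprotected and can reclaim it.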
Table 4: Values for the top-set group of files. (Sizes in bytes, times in milliseconds.)

Table 5: Values for the branch-set group of files in the branch world/europe. (Sizes in bytes, times in milliseconds.)
[...] it takes to serialize a hash table with all the stubs corresponding to a single file.

However, in a normal browsing session, the user does not make a replica of all the files. We expect the user to browse a few top-level pages and then pick one or more branches of the hierarchy. Some of these files will be replicated into the user's local computer.

So, in order to obtain more realistic numbers, we performed the following. We picked 10 files from the top of the cnn.com hierarchy. These files are mostly entry points to the others with more specific contents. We call this set of files the top-set. We also picked another 10 files representing a branch of the cnn.com hierarchy. We call this set of files the branch-set.

In Tables 4 and 5, for each file in the top-set and in the branch-set, respectively, we present the times mentioned above along with the size of each file and the number of URLs enclosed.

These performance results are worst-case because they assume all the URLs enclosed in a file refer to a file in another site, which is not the usual case. However, they give us a good notion of the performance limits of the current implementation. In particular, we see that the most relevant performance costs are due to the scanning of a file and the serialization of the hash table. However, we believe that these values are acceptable taking into account the functionality of the system, i.e. it ensures that no dangling references and no memory leaks occur. In addition, when a user runs the NG browser and accesses any web page without making a local replica of any file, there is absolutely no performance overhead due to DGC.

We can also conclude that the size on disk of the hash table containing all the stubs for a file is about half the size of the HTML file. This rather large size is mostly due to the size of the URLs, which are responsible for about 90% of that size. The size of the file containing the stubs can certainly be reduced using regular compression techniques.

Previous work in DGC, such as IRC [14], SSP Chains [16] and Larchant [10], served as the starting point of the DGC algorithm presented in this paper. Our new algorithm builds on these previous algorithms in such a way that it combines their advantages: no need for causal delivery support to be provided by the underlying communication layer (from the first two), and the capability to deal with replicated objects (from Larchant).