USC Computer Science Technical Reports, no. 684 (1998)
Distributed Garbage Collection by Timeouts and Backward Inquiry
SungWook Ryu, B. Clifford Neuman
{ryu, bcn}@ISI.EDU
Information Sciences Institute
University of Southern California
Abstract
We present a practical and efficient garbage collection mechanism for large scale distributed systems. The mechanism collects all garbage, including distributed cyclic garbage, without global synchronization or backward links. The primary method used for local and remote garbage collection is timeouts: each object has a time-to-live, and clients which have a link to an object must refresh the target object within the time-to-live to guarantee that the link will remain valid. For cyclic garbage collection, objects suspected to be garbage are detected by last-referenceable timestamp propagation, and cyclic garbage is reclaimed by backward inquiry (back-tracing). Since the information about backward references can be obtained during the refreshing process without additional overhead, and since messages necessary for cyclic garbage collection are bundled with the messages used for refreshing, communication, computation, and storage overhead is minimized. This mechanism has been implemented and evaluated on the Prospero directory service, and the performance results show that it works well for large scale distributed systems.
Keywords: Distributed Garbage Collection, Distributed objects, Cyclic Garbage, Timeouts, Back-Tracing
Introduction
This paper suggests a practical and efficient garbage collection mechanism for large scale distributed systems. A critical problem for large scale distributed systems which manage distributed objects is deciding when objects are not reachable from any clients. Since each system has a limited amount of storage space, unreachable objects should be reclaimed as soon as possible while the referential integrity of the distributed systems is maintained.

Footnote: Clients are application programs or active objects, including root objects.

This process, called garbage collection, is important for the following reasons. First, manual garbage collection is error-prone because it is difficult for users to maintain the information about references correctly. In the current World Wide Web, because owners move or delete their web pages without considering incoming links, there are many dangling links which cause "Not Found" error messages. Second, since distributed objects are dynamically created, deleted, migrated, and shared across the network at large scale, it is difficult to determine when an object is not reachable and whether it is safe to reclaim it.
In the ideal distributed system, objects continue to exist as long as they are reachable from clients. In practice, this is difficult to support in large scale distributed systems for the following reasons:
- Distributed systems are administratively decentralized, so their well-behaved cooperation cannot be required.
- Distributed systems are very large in scale, so it is impossible to get a global view of clients, objects, and their references.
- Servers and clients can crash during garbage collection related operations.
- Messages can be lost, and the network can be partitioned for a while.
Many solutions for distributed garbage collection have been suggested, but they are not suitable for large scale distributed systems because their target systems are small in scale or they cannot collect cyclic garbage. Collecting distributed cyclic garbage is an especially challenging problem.
Ladin and Liskov use a logically centralized service which keeps all information about inter-system references, and Juul and Jul use a global comprehensive tracing algorithm. Neither scales well, because both require global synchronization. Neuman uses a local lower bound instead of a global lower bound; that is, the algorithm sacrifices safety for performance. Lang et al. introduced tracing within a group instead of the global system to avoid global synchronization. However, it cannot collect distributed cyclic garbage completely. Fuchs suggested a back-tracing mechanism, but this mechanism assumed that all objects maintain backward references, which is not scalable. Maheshwari and Liskov suggested an enhanced back-tracing mechanism which dynamically calculates backward references. However, the overhead of calculating and maintaining backward references is too heavy.

As a solution, we designed and implemented a practical and efficient distributed garbage collection mechanism which uses timeouts, last-referenceable timestamp propagation (Neuman), and backward inquiry. The primary method used for local and remote garbage collection is timeouts, which is similar to leases in Java RMI (RMI) and pinging in DCOM (Chappell). Each object has a Time-To-Live (TTL) and an expiration time, and each link maintains an expiration time of its target object. Clients and objects which have a link to another object must send a refresh message to the target object before the expiration to guarantee that the link will remain valid. Timeouts collect acyclic garbage but not cyclic garbage.
For cyclic garbage collection, local garbage (objects suspected to be garbage) is detected by last-referenceable timestamp (LRTS) propagation. When an object is accessed by a client, the LRTS of the object is set to the access time, and the LRTS (the access time) is propagated to all reachable objects recursively during the refreshing process. There is no additional communication overhead for LRTS propagation, because it is piggybacked on refresh messages. The LRTS shows the most recent time when the object was accessible by clients. Objects whose LRTSs are not more recent than a local threshold are suspects of garbage, and they start backward inquiry (back-tracing) to confirm that they are garbage. Since all referencing objects send a refresh message to the suspect within the suspect's TTL, the suspect can obtain the information about backward links. Using this information, backward inquiry is performed to examine whether the suspect is reachable from any live objects. The backward inquiry is performed during the refreshing process, so the communication overhead is minimized.
The System Architecture
Our garbage collection mechanism has been implemented on the Prospero directory service (Neuman), which manages distributed information, but the definitions and assumptions do not restrict the generality of our mechanism.
A distributed object is a unit of information (e.g., an object of CORBA (Vogel and Duddy), DCOM (Chappell), a distributed OODB, the web, or Prospero) and may contain data and methods. Data includes attributes and links. A link consists of the target object's name and attributes of the link. Every object has a unique identifier, so a link always points to at most one object.
Objects are managed by a server which has two tasks: the directory service and the garbage collection service. Clients' requests (e.g., requesting object information, adding attributes to an object, and making a link to another object) are processed by the directory service, and garbage collection is performed by the garbage collection service. Although these must be two tasks in a single server, it is more convenient to think that there are two servers: a directory server and a garbage collection server. Clients are application programs or active objects, including root objects, which have links to objects.
[Figure: Distributed Object Model, showing Client A and objects w, x, y, and z; Client A accesses the objects.]
In our system model all objects are equal, and there are no special root objects. In practice, some distributed object spaces do not have special root objects which are always alive. For that reason, our mechanism collects garbage using reachability from clients rather than from root objects. Objects reachable from clients are alive, and unreachable objects are garbage. If there are root objects, all objects reachable from the root objects are alive too.

Assumptions about our target system are listed below. Some practical systems which meet the assumptions are distributed object-oriented databases, CORBA, DCOM, and Prospero.
- Garbage accumulates at a relatively low rate, in contrast to programming languages such as Lisp or C.
- Links are created or deleted at a relatively low rate, in contrast to programming languages such as Lisp or C.
- Messages are delivered within a finite period of time.
- Crashed servers recover within a finite period of time.
- All clocks in the network are synchronized to some degree (e.g., hours).
Timeouts
Information about both remote and local links is maintained by timeouts. When an object is created by a client, the client assigns a TTL (Time-To-Live) to the object. The new object also gets an expiration time, which is the creation time plus the TTL. The TTL is the period for which the object is guaranteed to exist after being refreshed (receiving a refresh message). When a link is made to the object, the TTL is added to the current local time, and the resulting expiration time is stored in the object. Then the TTL is sent to the new link, and the link's expiration time is calculated as its local time plus the TTL.
To guarantee that a link will continue to work (that is, that the target object remains valid), it must be refreshed before its expiration. This means that if a link is not refreshed before its expiration time, the target object can disappear. To reduce the overhead of garbage collection, each server performs local garbage collection periodically. During this time, the server examines the expiration times of links, and if a link is going to expire, the server refreshes the target object and the link. Therefore, all objects which have incoming links can get new expiration times before they expire.
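The TTL and refresh bookkeeping described above can be sketched as follows. This is a minimal illustration, not the Prospero implementation: the class names `Obj` and `Link`, the function `local_gc_pass`, and the use of plain floating-point seconds for time are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Obj:
    """A distributed object with a time-to-live (hypothetical model)."""
    name: str
    ttl: float               # period the object is guaranteed to exist after a refresh
    expiration: float = 0.0  # absolute time after which the object may be reclaimed

    def refresh(self, now: float) -> float:
        """Handle a refresh message: extend the expiration and return the TTL."""
        self.expiration = now + self.ttl
        return self.ttl


@dataclass
class Link:
    """A link that keeps its own copy of the target's expiration time."""
    target: Obj
    expiration: float = 0.0

    def refresh(self, now: float) -> None:
        # The link-side expiration is the local time plus the TTL sent back.
        ttl = self.target.refresh(now)
        self.expiration = now + ttl


def local_gc_pass(links: list[Link], now: float, next_gc: float) -> int:
    """Refresh every link that would expire before the next local GC pass."""
    refreshed = 0
    for link in links:
        if link.expiration <= next_gc:
            link.refresh(now)
            refreshed += 1
    return refreshed
```

For example, a link to an object with a TTL of 100 that was last refreshed at time 0 is refreshed again by a pass at time 60 if the next pass is scheduled for time 120, because the link would otherwise expire in between.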
The advantages of timeouts are that (1) they make error recovery easy, because referencing information expires after a TTL, so any wrong information is corrected after a TTL, and (2) they are suitable for administratively decentralized distributed systems, because well-behaved cooperation of all distributed systems is not required. Only cooperative links are considered actual links.
All acyclic garbage objects eventually expire and are reclaimed. Once an object becomes unreachable from clients and other objects, it cannot be refreshed and will expire after its TTL. Cyclic garbage objects, however, are not reclaimed by their expiration times, because the objects in the same cycle refresh each other and never expire. In the Distributed Object Model figure, when client A disappears and does not access the objects any more, all objects w, x, y, and z become garbage. Object w cannot be refreshed and expires soon. However, the cyclic garbage objects x, y, and z cannot be reclaimed, because the objects in the same cycle refresh each other and never expire. LRTS (Last Referenceable TimeStamp) propagation and backward inquiry are used to collect cyclic garbage.
LRTS Propagation
Each object maintains an LRTS (Last Referenceable TimeStamp), which shows the most recent time when the object was accessible by clients. When an object is accessed by a client, the LRTS of the object is set to the access time, and the LRTS is propagated to all reachable objects recursively when links are refreshed. All objects reachable from the accessed object will eventually get new LRTSs. The LRTS of inaccessible distributed cyclic garbage, however, will stabilize at a value that is less than or equal to the time at which the cycle became inaccessible. Objects whose LRTSs are not more recent than a local threshold are local garbage. The basic idea of LRTS was suggested by Neuman, and the idea has been expanded here.
In the Distributed Object Model figure, when client A accesses object w, w gets a new LRTS, and the LRTS is propagated to the subsequent objects x, y, and z during the refreshing process. If w deletes its link to x, then x, y, and z become cyclic garbage, and they cannot get new LRTSs from w. The LRTS of inaccessible objects, including the cyclic garbage x, y, and z, will stabilize at a value that is less than or equal to the time at which the cycle became inaccessible.
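The stabilizing behavior of the LRTS can be illustrated with a small simulation. This is only a sketch: the functions `propagate_lrts` and `local_garbage`, the integer timestamps, and the exact link topology (w links to x, and x, y, z form the cycle x → y → z → x) are assumptions made for the example.

```python
def propagate_lrts(links, lrts, rounds):
    """Simulate `rounds` refresh cycles; each refresh piggybacks the sender's LRTS."""
    for _ in range(rounds):
        for source, targets in links.items():
            for target in targets:
                # A target keeps the most recent timestamp it has heard of.
                lrts[target] = max(lrts[target], lrts[source])
    return lrts


def local_garbage(lrts, threshold):
    """Objects whose LRTS is not more recent than the local threshold are suspects."""
    return sorted(name for name, ts in lrts.items() if ts <= threshold)


# Assumed topology: w -> x, and the cycle x -> y -> z -> x.
links = {"w": ["x"], "x": ["y"], "y": ["z"], "z": ["x"]}
lrts = {"w": 0, "x": 0, "y": 0, "z": 0}

lrts["w"] = 10                       # client A accesses w at time 10
propagate_lrts(links, lrts, rounds=3)

links["w"] = []                      # w deletes its link to x
lrts["w"] = 50                       # client A accesses w again at time 50
propagate_lrts(links, lrts, rounds=3)

suspects = local_garbage(lrts, threshold=20)
```

After the link is deleted, x, y, and z keep refreshing each other but their LRTS stays at 10, so with a threshold of 20 they become the suspects while w does not.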
If there is a root object, the local threshold is the LRTS of the root object, allowing time for the LRTS to propagate. If there is no root object, the local threshold is selected by clients or system administrators based on the type and number of clients and the frequency of client access. For schools, one might expect that all enrolled students will access their objects during a semester, so the local threshold can be a number of months. This means that we can assume that all objects themselves, or their ancestors, will be accessed within that many months. The local threshold affects only cyclic garbage collection, not acyclic garbage collection.
Since each system selects its own local threshold, each has a different threshold. Moreover, it is possible that LRTSs might not arrive at target objects on time, because they are sent only during the link refreshing process. Therefore, before local garbage is reclaimed, it should be confirmed to be garbage. For this confirmation, back-tracing is performed by backward inquiry.

Backward Inquiry
The basic idea of backward inquiry is that before a local garbage object (a suspect) is reclaimed, all referencing objects are examined to see whether they are garbage or not. If all referencing objects are garbage, then the object is also garbage and can be reclaimed. The best way to check referencing objects is to ask them directly to examine whether they themselves are garbage and to reply with the results.
Since referencing objects refresh their links and target objects before their expiration times, objects can get refresh messages from all referencing objects. This means that objects can get information about all referencing objects (i.e., an implicit reference list) within their TTL times, even though they do not have an explicit reference list.
In the Distributed Object Model figure, object w refreshes object x before x expires, so x knows that w has a reference to x. When x becomes local garbage, x can ask w to examine whether w is garbage or not. Since local garbage objects continue to refresh their target objects, x does not know whether w is alive or not. If w is garbage, then x is also garbage. However, if w is alive, then x is also alive.
If there is a cycle, the request might be forwarded without stopping. To avoid this unbounded looping in a cycle, objects are marked before sending a request. If an object is already marked and gets the same request again, the object can detect a cycle.
The benefits of backward inquiry are: (1) backward inquiry (back-tracing) is started from objects suspected to be garbage, and only suspects take part in the back-tracing; (2) backward inquiry is performed backward, so it is easy to find the time when synchronization is finished; and (3) the incremental overhead of backward inquiry is low, because messages necessary for the inquiry are bundled with the messages used for refreshing. The proof of correctness of the backward inquiry is attached at the end of this paper.

Footnote (on the local threshold): You may think that a threshold of months is too long. However, all acyclic objects are collected by timeouts. Only cyclic garbage is collected after months.
Footnote (on local garbage refreshing its targets): Cyclic garbage objects continue to refresh each other, so they never expire.

A simple example
[Figure: Backward inquiry for a simple cycle. Objects a, b, and c, each with processing list = {c}; steps: (1) AYG(c), (2) AYG(c), (3) Detect Cycle, (4) R[c,P], (5) R[c,P]. Legend: AYG message send, AYG message reply, Link & refreshing.]
The figure shows a simple example of backward inquiry. In this example we assume that only one object starts backward inquiry. There is a simple cycle which consists of three local garbage objects a, b, and c. When c starts backward inquiry, a new AYG (Are-You-Garbage) message is created, and the name of the starting object, c, is assigned to the AYG message. Only the original starting object can assign its name to the AYG message. The AYG(c) message is added to c's processing list. The processing list is introduced to avoid unbounded looping in cycles. When an object gets an AYG message, the object is marked with the starting object of the AYG message.

When b sends a refresh message to c (this is not shown in the figure), c sends the AYG(c) message to b with the reply to the refresh message. The c in the AYG message is the starting object of the message. c is appended to the processing list of object b to avoid unbounded looping. When a sends a refresh message to b, b sends an AYG(c) to a with the reply to the refresh message, and c is appended to the processing list of object a. When c sends a refresh message to a, a detects that there is a cycle, since the AYG(c) message is from c. The AYG(c) message is going backward and the refresh message is coming forward. If they meet at the same object, then a cycle is detected.
During the TTL time, a did not get any refresh message except the one from object c, so a replies to b with R[c,P]. P (Pending) means that there is no live object in the path. After receiving the reply, b replies to c with R[c,P], meaning that b is PENDING. When c receives the reply from object b, c knows that c is garbage, because it is not reachable from any live object.
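The walk-through above can be condensed into a synchronous sketch. It is a simplification under stated assumptions: the real mechanism is asynchronous and piggybacks AYG messages on refresh replies, whereas the hypothetical function `backward_inquiry` below simply recurses over a `referrers` map (the implicit reference list each object learns from refresh messages) and uses a `marked` set in place of per-object processing lists.

```python
ALIVE, PENDING, GLOBAL_GARBAGE = "ALIVE", "PENDING", "GLOBAL_GARBAGE"


def backward_inquiry(obj, referrers, alive, start=None, marked=None):
    """Ask the objects that reference `obj` whether any live object reaches it."""
    if start is None:
        start, marked = obj, set()
    marked.add(obj)  # plays the role of adding AYG(start) to the processing list
    replies = []
    for ref in referrers.get(obj, []):
        if ref in alive:
            return ALIVE               # refreshed by a live object
        if ref in marked:
            replies.append(PENDING)    # the same AYG was seen before: a cycle
            continue
        reply = backward_inquiry(ref, referrers, alive, start, marked)
        if reply == ALIVE:
            return ALIVE
        replies.append(reply)
    if obj == start:
        # No backward path from the starting object led to a live object.
        return GLOBAL_GARBAGE
    return PENDING if PENDING in replies else GLOBAL_GARBAGE


# The simple cycle from the example: b refreshes c, a refreshes b, c refreshes a.
cycle = {"c": ["b"], "b": ["a"], "a": ["c"]}
result = backward_inquiry("c", cycle, alive=set())
```

With no live referrer anywhere on the backward path, `result` is `GLOBAL_GARBAGE`; adding a live object to the referrers of a makes the same call return `ALIVE`.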
Starting backward inquiry
An object starts backward inquiry when the object is local garbage and should be reclaimed. A new AYG (Are-You-Garbage) message is created, and the starting object is noted in the AYG message. When an object receives an AYG message from another object, the object also starts backward inquiry for the message. The object should use the received AYG message instead of creating a new one.
A processing list is introduced to avoid unbounded looping in a cycle. If there is a cycle, an AYG message would otherwise traverse the network without stopping. When an AYG message visits an object for the first time, the starting object of the message is stored in the processing list. If a message whose starting object is the same as a previous one visits the object once more, the object detects a cycle by checking its processing list. Unbounded looping is thus avoided by marking visited objects with AYG messages.
Objects refresh their links and target objects before they expire, so target objects can get refresh messages from all referencing objects before the target objects' expiration time. A refresh message contains an LRTS and a refresh type, which is the garbage collection status of the referencing object. If a referencing object is garbage, the object does not send a refresh message. If a referencing object is alive, the refresh type is ALIVE, and if a referencing object is local garbage, the refresh type is LOCAL_GARBAGE. If a referencing object is performing backward inquiry, the refresh type is AYG_PROCESSING.
When an object which is performing backward inquiry receives a refresh message, the object examines the refresh message and, if required, sends an AYG message to the refreshing object. When the refresh message type is ALIVE, the object stops backward inquiry and becomes alive, because the object is reachable from a live object. If the message type is AYG_PROCESSING, the object compares the sender with its processing list. If the sender name is already on the processing list, a cycle is detected. By the AYG message, the object knows that there is a path from the object itself to the sender, and by the refresh message, the object knows that there is a path from the sender to the object itself, which forms a cycle. If the sender name is not in the processing list, an AYG message is sent to the referencing object.
If the message type is LOCAL_GARBAGE, the object sends an AYG message to the referencing object to see whether the referencing object is garbage or alive. The object receives and processes refresh messages until it has received refresh messages from all referencing objects, that is, during its TTL time.
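The three cases above amount to a small dispatch on the refresh type. The sketch below is an illustration only: the enum `RefreshType`, the function `on_refresh`, and the string action names it returns are hypothetical, chosen to mirror the rules in the text.

```python
from enum import Enum


class RefreshType(Enum):
    ALIVE = "ALIVE"
    LOCAL_GARBAGE = "LOCAL_GARBAGE"
    AYG_PROCESSING = "AYG_PROCESSING"


def on_refresh(sender, refresh_type, processing_list):
    """Decide what an object performing backward inquiry does with a refresh message."""
    if refresh_type is RefreshType.ALIVE:
        # Reachable from a live object: stop the inquiry and become alive.
        return "BECOME_ALIVE"
    if refresh_type is RefreshType.AYG_PROCESSING and sender in processing_list:
        # The AYG went backward to the sender and its refresh came forward: a cycle.
        return "CYCLE_DETECTED"
    # LOCAL_GARBAGE, or AYG_PROCESSING from a sender not yet on the list.
    return "SEND_AYG"
```

In the simple cycle example, object a holds the processing list {c}, so a refresh of type AYG_PROCESSING from c yields "CYCLE_DETECTED", while the same type from any other sender yields "SEND_AYG".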
A requesting list is maintained to see whether all replies to an AYG message have arrived. When an AYG message is sent to a referencing object, the referencing object's name is stored in the requesting list, and the name is removed from the list when the object replies. Some objects in the requesting list can crash without replying to AYG messages. When an object in the requesting list does not send a refresh message, the object is removed from the requesting list. When the requesting list becomes empty, backward inquiry finishes.
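A requesting list of this kind could be kept as a simple set. The class `RequestingList` and its method names below are hypothetical; the sketch assumes that "does not send a refresh message" is evaluated once per TTL period against the set of referrers that did refresh.

```python
class RequestingList:
    """Tracks referencing objects that still owe a reply to an AYG message."""

    def __init__(self):
        self._pending = set()

    def sent_ayg(self, referrer):
        self._pending.add(referrer)

    def got_reply(self, referrer):
        self._pending.discard(referrer)

    def end_of_ttl(self, refreshed):
        """Drop referrers that sent no refresh message during the TTL period."""
        self._pending &= set(refreshed)

    def finished(self):
        """Backward inquiry finishes when no reply is outstanding."""
        return not self._pending
```

A referrer that crashes after receiving the AYG message never replies, but it also stops refreshing, so `end_of_ttl` removes it and the inquiry can still finish.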
Multiple starting objects
Basically, each object performs backward inquiry for only one AYG message, because all AYG messages can share the result of an AYG message. For example, if the result of an AYG message is GLOBAL_GARBAGE, we know that the result of the rest of the messages will be GLOBAL_GARBAGE. However, if two or more objects start the AYG algorithm and each object performs backward inquiry for only one AYG message, it is possible that none of them can finish because of a deadlock.
To avoid the deadlock and to improve performance, each object maintains two lists: a processing list and a reply list. AYG messages currently being processed are stored in the processing list, and other AYG messages are stored in the reply list. When the processing list is empty, an AYG message is selected from the reply list based on the priority of the AYG messages. The starting object name of an AYG message is its priority. Each object has a unique identifier, so each object has a unique priority. The selected AYG message is moved to the processing list, and backward inquiry is performed for the message. When a new AYG message arrives, the priority of the message and the lowest-priority message in the processing list are compared. If the new message has a higher priority, then the new message is stored in the processing list, and backward inquiry is also performed for the message to avoid a deadlock. If the new message has a lower priority, the message is just stored in the reply list and is processed later.
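The two-list rule can be sketched as below. The class `AygQueues` and its methods are hypothetical, and two points are assumptions of the sketch rather than statements from the text: priorities are compared as plain strings (so "b" outranks "a"), and a message arriving at an idle object goes straight to the processing list.

```python
class AygQueues:
    """Per-object processing list and reply list, keyed by AYG starting object."""

    def __init__(self):
        self.processing = []
        self.reply = []

    def on_ayg(self, start):
        """Return True if backward inquiry proceeds now for AYG(start)."""
        if not self.processing or start > min(self.processing):
            # Idle, or higher priority than the lowest one in progress.
            self.processing.append(start)
            return True
        self.reply.append(start)  # lower priority: park it for later
        return False

    def finish(self, start):
        """Finish AYG(start); if idle, promote the highest-priority parked message."""
        self.processing.remove(start)
        if not self.processing and self.reply:
            nxt = max(self.reply)
            self.reply.remove(nxt)
            self.processing.append(nxt)
            return nxt
        return None
```

An object that first receives AYG(a) and then AYG(b) processes both, while an object that first receives AYG(b) parks a later AYG(a) in its reply list until b's inquiry finishes.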
The figure shows an example in which two objects start backward inquiry at the same time. Object a starts backward inquiry when it is local garbage and should be reclaimed. a sends an AYG(a) message to object x; x is reserved for a, and a is appended to the processing list. The AYG(a) message is sent to y from x, and y is also reserved for a. Object b also starts backward inquiry and sends an AYG(b) message to z; z is reserved for b, and b is appended to the processing list.

When the AYG(b) message is sent from z to x, b and a are compared, because x was already reserved for a. Since b is bigger than a, b is stored in the processing list and continues to go forward. The AYG(b) message can go forward until it detects a cycle. When the AYG(a) message is sent from y to z, the message is stored in the reply list and blocked, because z was already reserved for b and b is bigger than a. After b finishes backward inquiry, a continues backward inquiry.
[Figure: Processing list and reply list. Objects a, b, x, y, and z; messages: 1) AYG(a), 2) AYG(a), 3) AYG(b), 4) AYG(b), 5) AYG(a), 6) AYG(b), 7) AYG(b). List annotations: processing list = {a}, processing list = {b}, processing list = {a, b} (twice), and processing list = {b} with reply_list = {a}. Legend: Link & Refreshing, a's AYG message path, b's AYG message path.]
Processing replies
The reply to an AYG message can be ALIVE, GLOBAL_GARBAGE, or PENDING. When receiving an ALIVE reply, the object stops backward inquiry and replies with ALIVE to all objects in the processing list and the reply list, because the object is alive. The status of the object becomes alive. Otherwise, the object waits for other replies and processes them until it has received replies from all referencing objects to which it sent AYG messages. If the object has received all replies and all replies are GLOBAL_GARBAGE, then the object is garbage, so it replies with GLOBAL_GARBAGE to all objects in the processing list and the reply list. The status of the object becomes garbage.
If the object has received all replies, and at least one of the replies is PENDING and the rest of them are GLOBAL_GARBAGE, then the object replies with PENDING only to the sender of the AYG message. The object does not know whether or not it is garbage itself, but it knows that it is not reachable from live objects except through the starting object of the AYG message. The information is used only for the starting object of the AYG message.
If the result of an object's backward inquiry is GLOBAL_GARBAGE or ALIVE, then the object replies with the result to all senders of the AYG messages in the reply list. If the result is PENDING, then the object replies with PENDING to the sender of the AYG message, and the AYG message is removed from the processing list. If the processing list is empty, the object selects a new AYG message from the reply list by the priority of the AYG messages for the next backward inquiry. The next AYG message is selected, and the message is sent to referencing objects when the object receives refresh messages.
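The reply rules can be summarized in one function. This is a sketch under assumptions: the function `resolve` is hypothetical, the processing list and reply list are modeled as dictionaries mapping each AYG starting object to the sender that delivered it, and `replies` holds whatever replies have arrived so far (in the text an ALIVE reply ends the inquiry immediately, without waiting for the rest).

```python
ALIVE, PENDING, GLOBAL_GARBAGE = "ALIVE", "PENDING", "GLOBAL_GARBAGE"


def resolve(replies, current_sender, processing, waiting):
    """Combine replies into a result and pick who must be told about it.

    processing / waiting: {starting_object: sender_of_that_AYG_message}
    for the processing list and the reply list respectively.
    """
    if ALIVE in replies:
        result = ALIVE
    elif PENDING in replies:
        result = PENDING
    else:
        result = GLOBAL_GARBAGE

    if result == PENDING:
        # PENDING is only meaningful for the inquiry it belongs to.
        recipients = [current_sender]
    else:
        # ALIVE and GLOBAL_GARBAGE are shared with every waiting inquiry.
        recipients = sorted(set(processing.values()) | set(waiting.values()))
    return result, recipients
```

The asymmetry is the point of the design: a definite answer (ALIVE or GLOBAL_GARBAGE) holds for every AYG message passing through the object, whereas PENDING depends on which starting object is asking.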
A complex example
The figure shows a more complex example of backward inquiry. There is a big cycle (a, b, c, d), and a small cycle (a, b, d) is inside the big cycle.
Object a starts backward inquiry when it is local garbage and should be reclaimed. a starts backward inquiry and sends AYG(a) to d. d sends AYG(a) to b and sends AYG(a) to c. When c sends an AYG message to b, b replies PENDING, since b received the same message already. b does not send its AYG message to a, because the AYG(a) is from a. c replies to d with R[a,P], and b replies with R[a,P]. Finally, d replies to a with R[a,P], so a is global garbage, because it is not reachable from any live objects.
Local garbage collection and OUTREF Tables
A garbage collection server operates in two modes: a normal mode and a local garbage collection mode. During the normal mode, garbage collection servers process incoming refresh messages and AYG messages. At a specified interval, the garbage collection server switches to the local garbage collection mode and performs a local garbage collection algorithm. The next local garbage collection time is scheduled by the server before starting the current local garbage collection process.

[Figure: Backward inquiry algorithm for complex cycles. Objects a, b, c, and d, each with Processing List = {a}; messages: 1) AYG(a), 2) AYG(a), 3) AYG(a), 4) R[a,P], 5) R[a,P], 6) R[a,P]; a cycle is detected at two points. Legend: AYG request, AYG reply, Link & Refreshing.]
During the local garbage collection time, the garbage collection server sends refresh messages and performs reclamation, including backward inquiry if needed. First, the server visits all objects in the system and examines the expiration times of links. If a link which will expire before the next local garbage collection time is found, the link and its target object are refreshed. After refreshing is finished, the server visits all objects in the system once more to reclaim global garbage.
The overhead of sending a refresh message to a remote system is heavy, because network communication is required. To reduce the number of refresh messages, an OUTREF table is introduced. During refreshing, if a garbage collection server finds a remote link which is going to expire before the next local garbage collection time, the server stores a refresh message in the OUTREF table instead of sending the message directly. At the end of refreshing, the refresh messages are sent in batches; that is, messages whose destinations are the same host are gathered together into one message, and the message is sent. The next local garbage collection time is also sent with the message, and the time is stored in the INREF table of the target system to provide fault-tolerant refreshing.
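Batching through the OUTREF table might look like the sketch below. The function `batch_refreshes`, the representation of the OUTREF table as a list of (host, object name) pairs, and the dictionary layout of the outgoing message are all assumptions for illustration; the text does not specify the message format.

```python
from collections import defaultdict


def batch_refreshes(outref, next_gc_time):
    """Group queued refreshes into one message per destination host.

    outref: iterable of (host, object_name) pairs collected during refreshing.
    Each message also carries the sender's next local GC time, which the
    receiver would record in its INREF table.
    """
    by_host = defaultdict(list)
    for host, obj in outref:
        by_host[host].append(obj)
    return {
        host: {"refresh": objs, "next_refresh_time": next_gc_time}
        for host, objs in by_host.items()
    }


outref = [("hostA", "x"), ("hostB", "p"), ("hostA", "y")]
messages = batch_refreshes(outref, next_gc_time=3600.0)
```

Three queued refreshes to two hosts become two network messages, one per host, instead of three.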
The frequency of local garbage collection is selected by system administrators. If a system has enough storage and needs a low-overhead garbage collection algorithm, the local garbage collection algorithm can be performed infrequently. If a system has enough computing power and wants to collect garbage objects as soon as possible, then the local garbage collection algorithm can be performed frequently. However, it is possible that the TTLs of some links are less than the local garbage collection interval. As a solution, garbage collection servers maintain a list of the links whose TTLs are less than the interval and send refresh messages before the expiration times regardless of the local garbage collection schedule.

Footnote: DCOM uses the same approach, using the OXID Resolver.

Fault-tolerant features
In our mechanism, timeouts are used to distinguish live objects from garbage objects, but due to network and server failures, refresh messages might not arrive at target objects on time. This means that live objects can be reclaimed as garbage because of the failures. As a solution, the INREF table is introduced.
Each host manages an INREF table, and the table contains information about remote hosts which have references to the local objects. The table contains tuples of remote host name, previous refreshing time, and next refreshing time. The next refreshing time is the scheduled time when the remote server is going to send refresh messages. When a remote server makes a link to a local object or sends a refresh message for the first time, the remote host name is registered in the INREF table.
Before starting reclamation, the garbage collection server examines the next refreshing times in the INREF table to check whether refresh messages from all hosts have arrived. If all next refreshing times in the INREF table are greater than the current time, that is, if all refresh messages have arrived, garbage collection is performed normally.

If any next refreshing time is not more recent than the current time, that is, if any refresh message has not arrived yet, the garbage collection server requests a refresh message explicitly. If some remote hosts cannot send refresh messages because of server or network failures, reclamation is performed based on the oldest next refreshing time of the remote hosts that have not sent refresh messages yet, because the most recent time when the status of all objects was correct is the oldest next refreshing time.
If a remote host notifies the server that it does not have references to local objects, or if a remote host has been unreachable for a long time, the remote host name is removed from the INREF table.
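One way to read the INREF rule is that overdue hosts pull back the time against which expirations are judged. The sketch below follows that reading; the functions `trusted_time` and `reclaimable`, and the reduction of the INREF table to a map from host name to next refreshing time, are assumptions for illustration.

```python
def trusted_time(inref, now):
    """Return the latest time at which all referencing information is known good.

    inref: {remote_host: next_refreshing_time}. A host whose next refreshing
    time is not later than `now` still owes a refresh message.
    """
    overdue = [next_time for next_time in inref.values() if next_time <= now]
    return min(overdue) if overdue else now


def reclaimable(expirations, inref, now):
    """Names of objects whose expiration passed before the trusted time."""
    limit = trusted_time(inref, now)
    return sorted(name for name, exp in expirations.items() if exp < limit)
```

If hostB was due to refresh at time 90 and has not done so by time 100, only objects that expired before 90 are reclaimed; an object that expired at 95 is kept, since hostB's missing message might have refreshed it.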
Automatic Control of Garbage Collection Frequency
Ob jects ha vev arious lifetimes Some stable ob jects liv e
for a long time other temp orary ob jects liv e only for
a short time Ob jects ha v e a tendency that manyof
them are shortliv ed but ob jects liv e for a long time once
they ha v e surviv ed for more than some p erio d of time
Lieb erman and Hewitt Ungar and Jac kson This means that the p ossibilit y of b ecoming garbage for
a newly created ob ject is higher than the p ossibilit y for
an old one
Bak er et al measured life times of les in Sprite
distributed le system and the measuremen ts sho w ed
that most les ha v e short lifetimes Lieb erman and He
witt suggested a garbage collection algorithm based
on the lifetimes of ob jects The basic idea w as that the
frequency of garbage collection w as selected according to
the age of ob jects F or y ounger ob jects garbage collec
tion w as p erformed more frequen tly Ungar and Jac k
son used ten uring p olicies for garbage collection
algorithm to impro v e p erformance Once an ob ject has
surviv ed for more than some threshold the ob ject gets
ten ure and the garbage collection algorithm is p erformed
v ery infrequen tly on ten ured ob jects
In timeouts the TTL of an ob ject is the p erio d dur
ing whic h the ob ject is guaran teed to exist after it is re
freshed F or links that are infrequen tly referenced paren t
ob jects should refresh the links and target ob jects within
their TTL p erio d so the frequency of garbage collection
is decided b y the TTL of a target ob ject It is hard to
assign an appropriate TTL to an ob ject The shorter the
TTL the more refreshmen ts are required so the o v er
head of refreshmen t b ecomes hea vy On the other hand
the longer the TTL the more time is needed to reclaim
the ob ject if the ob ject b ecomes garbage and storage will
be w asted for a long time
T o solv e this problems w e use garbage collection based
on the lifetime and n um b er of incoming links of an ob
ject As Lieb erman and Hewitt Bak er et al sho w ed the p ossibilit y of b ecoming garbage for a newly
created ob ject is higher than the p ossibilit y for an old
one so garbage collection should b e p erformed more fre
quen tly for newly created ob jects and less for old ob jects
In our algorithm the TTL of an ob ject starts from a small
n um b er and is increased when the ob ject is accessed or
receiv es a new last referenceable timestamp The n um
b er of incoming links also correlates with the probabilit y
of b ecoming garbage Ob jects whic h are referenced b y
man y other ob jects will liv e for a long time The TTL of
an ob ject is also increased when another ob ject mak es a
link to the ob ject
The TTL is increased up to a maxim umwhic his as signed b y the o wner of the ob ject This algorithm also re
duces net w ork comm unicationo v erhead b ecause the TTL
of an ob ject will b e increased during lo cal op eration
When a remote ob ject mak es a link to a lo cal ob ject the
TTL of the lo cal ob ject b ecomes large enough to a v oid
frequen t remote refreshing
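A minimal sketch of this TTL policy is shown below. The text only says that the TTL starts small, grows on access, on a new last referenceable timestamp, and on a new incoming link, and is capped by an owner-assigned maximum. The multiplicative growth factor and the class name `AdaptiveTtl` are assumptions made for illustration.

```python
class AdaptiveTtl:
    """TTL that starts small and grows up to an owner-assigned maximum."""

    def __init__(self, initial_ttl, max_ttl, factor=2.0):
        self.ttl = initial_ttl
        self.max_ttl = max_ttl  # assigned by the owner of the object
        self.factor = factor    # assumed growth rule, not from the paper

    def _grow(self):
        # Increase the TTL but never beyond the owner-assigned maximum.
        self.ttl = min(self.ttl * self.factor, self.max_ttl)
        return self.ttl

    def on_access(self):
        return self._grow()

    def on_new_lrts(self):
        # A new last referenceable timestamp was received.
        return self._grow()

    def on_new_link(self):
        # Another object made a link to this object.
        return self._grow()
```

Young objects keep a short TTL and are checked often, while objects that keep being used drift toward the maximum and need fewer refreshes.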
Performance
This section describes the results of the performance evaluation. In our system, objects are managed by a server which has two tasks: a directory server and a garbage collector. Since the directory server should work without disruption while the garbage collector is working, the task of the directory server has a higher priority than the task of the garbage collector.
We developed various configurations of distributed object spaces and measured the overhead and performance. We measured three different performance aspects of our mechanism: the performance degradation of the directory server caused by the garbage collector, the overhead of the network communication, and the garbage collection latency of acyclic and cyclic garbage. For the measurements, Prospero servers were installed on Sun workstations (Sun Ultra, Sun SPARCstation, and Sun SPARCsystem machines) which are scattered over Ethernets.
The overhead of local GC
A garbage collector has two modes: a normal mode and a local garbage collection mode. During the normal mode, the garbage collector processes incoming refresh messages and AYG messages. At a specified interval, the garbage collector switches to the local garbage collection mode and performs the local garbage collection algorithm step by step. The garbage collector visits all objects in the system and refreshes their links and target objects, including remote target objects. After the refreshing process, it visits all objects in the system once again to examine their expiration times and last referenceable timestamps. Objects that have expired are reclaimed as garbage, and objects whose last referenceable timestamps are not more recent than a local threshold start backward inquiry to check whether they are garbage.
First, we measured the duration of local garbage collection using the object-space configuration shown in the figure "Object Spaces Configuration". The object spaces are tree structured, and every object that is not a leaf has children. To simplify the evaluation, we did not add remote links to the object spaces. The figure "Duration of Local Garbage Collection Time" shows the results. The x-axis shows the number of objects and links in the spaces. If there are n objects in the spaces, there are also n links, since the spaces are tree structured. The y-axis shows the duration of local garbage collection. The lower line shows the duration when no links are about to expire and therefore no links are refreshed. The higher line shows the duration when all links are about to expire and therefore all target objects are refreshed. The local garbage collection time increases linearly in proportion to the number of objects and links to be refreshed.
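The two passes of local garbage collection can be sketched as below. This is a simplified single-host illustration under stated assumptions: objects are plain dictionaries with hypothetical keys `links`, `expires`, and `lrts`, every link is refreshed on each cycle, remote links and message handling are omitted, and suspects are only reported rather than actually starting backward inquiry.

```python
def local_gc(objects, now, ttl, lrts_threshold):
    """Run one local GC cycle; return (reclaimed, suspects)."""
    # Pass 1: every object refreshes the targets of its links.
    for obj in objects.values():
        for target in obj["links"]:
            if target in objects:
                objects[target]["expires"] = now + ttl

    # Pass 2: examine expiration times and last referenceable timestamps.
    reclaimed, suspects = [], []
    for name, obj in objects.items():
        if obj["expires"] <= now:
            reclaimed.append(name)   # expired: reclaim as garbage
        elif obj["lrts"] <= lrts_threshold:
            suspects.append(name)    # stale LRTS: start backward inquiry

    for name in reclaimed:
        del objects[name]
    return reclaimed, suspects
```

Both passes touch each object and link once, which is consistent with the linear growth of local garbage collection time reported above.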
Figure: Object Spaces Configuration
Figure: Duration of Local Garbage Collection Time (x-axis: number of objects and references, 1000 to 5000; y-axis: local GC time in minutes, 0 to 20; series: no refreshing, with refreshing)
Second, we measured the performance degradation of the directory server caused by the garbage collector. In the figure "The overhead of the garbage collector", the x-axis shows the size of objects and the y-axis shows the object retrieving time from a remote client. The line marked "No GC" indicates that the server does not support a garbage collection service. The line marked "GC-Normal" indicates that the server supports a garbage collection service and the garbage collector is in normal mode. During the normal mode, when a client accesses an object, the garbage collector modifies the LRTS (Last Referenceable Timestamp) and the expiration time of the object and stores the object. Since the garbage collector uses a caching mechanism, there is only a little garbage collection overhead during normal mode. The line marked "GC-Local-GC" indicates that the server supports a garbage collection service and the garbage collector is in local garbage collection mode. There is additional overhead during local garbage collection. Since the duration of local garbage collection mode is very short in comparison with that of normal mode, the overall overhead of garbage collection is very low.
Table: The overhead of garbage collection operations
Operation | # of messages
refreshing a remote host |
propagating LRTSs |
sending backward inquiry |
replying backward inquiry |
Figure: The overhead of the garbage collector (x-axis: size of objects in bytes, 0 to 3500; y-axis: object reading time in seconds, 0 to 0.05; series: No GC, GC-Normal, GC-Local-GC)
Communication overhead
Our mechanism contains three operations which require network communication: refreshing, LRTS propagation, and backward inquiry. The table "The overhead of garbage collection operations" summarizes the overhead of the garbage collection operations for remote objects in terms of the number of messages.
The OUTREF table and the INREF table are introduced to reduce the remote refreshing overhead. When the garbage collector meets a remote link during the refreshing process, a refresh message is stored in the OUTREF table instead of being sent directly. After finishing the refreshing process, the garbage collector gathers the messages that have the same destination server and sends them as a batch. Therefore, one refresh message and one reply message are required for each remote host, regardless of the number of remote links. The LRTSs are propagated with refresh messages by piggybacking, so no additional message is needed. AYG messages are also sent with the replies to refresh messages, but one reply message is required for the result of backward inquiry.
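The batching step can be sketched as follows. This is only an illustration of grouping pending refreshes by destination host: the function name `batch_refreshes` and the `(host, object_name)` pair representation are assumptions, and the actual OUTREF table presumably carries more fields, such as piggybacked LRTSs.

```python
from collections import defaultdict


def batch_refreshes(remote_links):
    """Group pending refreshes by destination host.

    `remote_links` is an iterable of (host, object_name) pairs collected
    while visiting objects; the result maps each host to the list of
    objects to refresh there, i.e. one outgoing message per host.
    """
    outref = defaultdict(list)
    for host, object_name in remote_links:
        outref[host].append(object_name)
    return dict(outref)
```

The number of keys in the result is the number of refresh messages sent in one cycle, which depends on the number of remote hosts and not on the number of remote links.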
Figure: The remote objects refreshing overhead (x-axis: number of remote references, 0 to 14; y-axis: object reading time in seconds, 0 to 0.04)
Figure: The remote hosts refreshing overhead (x-axis: number of hosts, 0 to 16; y-axis: object reading time in seconds, 0 to 0.04)
We measured the remote objects refreshing overhead and the remote hosts refreshing overhead. For the remote refreshing, the garbage collector was configured to send a refresh message every second. In the figure "The remote objects refreshing overhead", the x-axis shows the number of remote links to the same remote host and the y-axis shows the object retrieving time from a remote client. Since only one refresh message is sent regardless of the number of remote links, the refreshing process does not cause network congestion and the clients' object reading time is stable. In the figure "The remote hosts refreshing overhead", the x-axis shows the number of remote hosts to which local objects have links and the y-axis shows the object retrieving time from a remote client. Since the garbage collector sends a refresh message only every second, the refreshing process does not cause network congestion and the clients' object reading time is stable. If a host does not get a refresh message from a referencing host on time, the host explicitly asks the referencing host to send a message.
Garbage collection latency
To measure the latency of acyclic and cyclic garbage collection, the Prospero servers and object spaces were configured as shown in the figures "Collection of Acyclic Garbage" and "Collection of Cyclic Garbage". Each object is stored on a different host, and each garbage collector is configured with a Local GC Cycle, a TTL, a Last Referenceable Threshold, and a Network Grace Period, each given in minutes. Each garbage collection process was run multiple times and the results were averaged.
Because refresh and AYG messages are propagated only during local garbage collection, garbage is collected after some latency which depends on the TTL and the Local GC Cycle. If they are small, the garbage collection latency decreases, but the performance of the servers is also degraded.
Figure: Collection of Acyclic Garbage (a chain Obj 1, Obj 2, ..., Obj n-1, Obj n; legend: link and refreshing, backward inquiry, start)
In the figure "Collection of Acyclic Garbage", after $obj_1$ becomes unreachable from clients, it expires after its TTL plus the Network Grace Period. $obj_2$ cannot be reclaimed until $obj_1$ is reclaimed, because $obj_1$ continues to refresh $obj_2$. Likewise, each subsequent object $obj_i$ cannot be reclaimed until $obj_{i-1}$ is reclaimed. When $obj_{i-1}$ is reclaimed, $obj_i$ can no longer be refreshed and is reclaimed as garbage after TTL plus Network Grace Period. However, if $obj_i$ has already started backward inquiry, it is reclaimed just after one Local GC Cycle. The latency to collect $obj_i$, whose maximum depth in a chain is $i$, is expressed in terms of TTL, Network Grace Period, Local GC Cycle, $i$, and Last Referenceable Threshold.
The figure "Collection Latency of Acyclic Garbage" shows the measured collection latency of an acyclic garbage chain. Two objects in the chain were reclaimed by timeouts, each taking TTL + Network Grace Period, and the others were reclaimed by backward inquiry, each taking one Local GC Cycle.
Figure: Collection Latency of Acyclic Garbage (x-axis: Max Objects in a Chain, 0 to 40; y-axis: Collection Latency in minutes, 5 to 30)
Figure: Collection of Cyclic Garbage (a cycle through Obj 1, Obj 2, ..., Obj n/2, Obj n/2 + 1, ..., Obj n-1, Obj n; legend: link and refreshing, backward inquiry, detect a cycle, start)
Cyclic garbage objects, such as the objects in the figure "Collection of Cyclic Garbage", are collected by backward inquiry. In that configuration, we gave $obj_n$ the highest priority so that it can finish backward inquiry first. To detect a cycle, an AYG message must be propagated from $obj_n$ to all objects in the cycle. This latency, AYG Propagation($n$), is expressed in terms of Last Referenceable Threshold, TTL, Local GC Cycle, and $n$. The PENDING result must then return to $obj_n$. This latency, AYG Reply($n$), is expressed in terms of $n$ and Local GC Cycle. The latency to break a garbage cycle is therefore AYG Propagation($n$) + AYG Reply($n$). The figure "Collection Latency of Cyclic Garbage" shows the measured latency of cyclic garbage collection for garbage cycles of different sizes.
Figure: Collection Latency of Cyclic Garbage (x-axis: Max Objects in a Cycle, 0 to 40; y-axis: Collection Latency in minutes, 10 to 80)
The measurements show that the latency to collect acyclic and cyclic garbage increases linearly in proportion to the length of the garbage chain. The measured latency differs from the theoretical latency for two reasons: the clocks are not perfectly synchronized, and the starting time of local garbage collection on each server is not synchronized.
Discussion
The evaluation of our garbage collection mechanism is based on the following criteria.
Safety: Objects reachable from clients are not reclaimed as long as they are referenced by clients which try to refresh target objects. Network fault tolerance is provided by INREF tables. In our mechanism, only well-behaving links are considered as actual links.
Liveness: All garbage, including distributed cyclic garbage, is reclaimed completely by timeouts, last referenceable timestamp propagation, and backward inquiry.
Low overhead: The overhead of storage, computation, and communication is minimized. Each object does not maintain a backward reference list. Instead, each host maintains a list of referencing hosts in an INREF table, so not much storage is used. The information about backward references is obtained during the refreshing process without additional computation. Communication overhead is also reduced by piggybacking. As the table "The overhead of garbage collection operations" shows, the overhead of garbage collection operations in terms of the number of messages is not heavy.
Scalability: The number of messages required for garbage collection increases linearly in proportion to the number of links, and the latency to collect garbage increases linearly in proportion to the length of the garbage chain. Our mechanism does not require global synchronization of all distributed systems, so it is suitable for large scale distributed systems.
Fault tolerance: By introducing INREF tables, safety and liveness are supported despite server and network failures, under the assumption that the failures are recovered within a finite period of time (e.g., one month). INREF tables maintain a list of referencing hosts, so garbage collection servers can examine whether the remote servers are alive. If a remote server has crashed, garbage collection is performed based on the previous time when the remote server was alive. The garbage collection process uses pairwise communication, so even when some servers are unavailable, others will be able to move forward.
Related Work
Many distributed garbage collection algorithms have been suggested, and those algorithms fall into four categories: reference counting, reference listing, tracing, and migration.
Reference counting is simple and scalable, but the algorithm is not suitable for distributed systems because it cannot collect cyclic garbage objects and it is impossible to maintain the referencing link counts correctly. Since objects can be removed in a loosely synchronized distributed system without deleting their links (e.g., during a server crash), the link count of an object can be greater than zero even though no other objects are referencing the object. CORBA [Vogel and Duddy] and DCOM [Chappell] use reference counting, and they have the above problems.
[Bevan] suggested a weighted reference counting algorithm to avoid the problems caused by extra messages and race conditions. Each object maintains a reference count, and each reference to the object has a weight. When a reference is duplicated, the weight of the reference is halved and the remaining half is given to the new reference. When a reference to the object is deleted, the reference count of the object is reduced by the weight of the deleted reference. When the reference count of the object reaches zero, the object is reclaimed as garbage. Each reference has a virtually unique weight, so potential race conditions are avoided. The shortcomings of this algorithm are the limited number of references and the two drawbacks of reference counting described above.
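The weight-halving rule can be illustrated with a small single-process sketch. It follows the description above rather than any particular implementation: the class names, the integer weights, and the initial total weight are assumptions, and the messages that a distributed system would send on deletion are replaced by direct method calls.

```python
class WeightedObject:
    """Object whose reference count is the total outstanding weight."""

    def __init__(self, total_weight=64):
        self.count = total_weight

    def first_reference(self):
        # The first reference carries the whole weight.
        return Reference(self, self.count)


class Reference:
    def __init__(self, target, weight):
        self.target = target
        self.weight = weight

    def duplicate(self):
        # Halve the weight locally; no message to the object is needed.
        if self.weight < 2:
            raise ValueError("weight exhausted: cannot duplicate further")
        half = self.weight // 2
        self.weight -= half
        return Reference(self.target, half)

    def delete(self):
        # Return the weight to the object; True means it is now garbage.
        self.target.count -= self.weight
        self.weight = 0
        return self.target.count == 0
```

The `ValueError` branch shows the limitation mentioned above: once a weight can no longer be halved, no further references can be created from it.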
[Piquer] suggested indirect reference counting to solve the problem of weighted reference counting. The algorithm localizes the creation and the duplication of references. For garbage collection, the algorithm maintains an inverted tree representing the diffusion tree of the references throughout the systems. The object itself is the root of the tree, and when a reference is created or duplicated, a node is added as a child of the creator. Each node keeps one pointer to its parent and a counter with the number of children. The tree is used only for garbage collection; a direct pointer is also created for accessing the target object. When a reference is deleted, the corresponding node is deleted from the tree only if it is a leaf. If not, the node is deleted when it becomes a leaf. When there is only a root node in the tree, the object is garbage. The shortcomings of the algorithm are the overhead of managing the inverted tree and the two drawbacks of reference counting mentioned before.
Reference listing [Shapiro et al.; Birrell et al.] was introduced to solve the problem of reference counting. Instead of storing only link counts, each object keeps some information about referencing objects in its reference list, so that garbage collection servers can periodically examine whether the object is actually referenced by the objects in its reference list. This algorithm works well in cases where there are only a few incoming links. The storage and management overhead for the reference list, however, is heavy when a large number of incoming links exist. [Shapiro et al.] use migration to collect cyclic garbage, but [Birrell et al.] do not collect cyclic garbage.
In tracing algorithms [Juul and Jul; Lang et al.], each system starts by marking all of the accessible local objects from its root objects. After completing the local marking step, systems exchange information about the reachability of remote references. Remotely referenced objects and their descendants are also marked. When a system has all the information about the remote incoming references, objects are examined for marks. Unmarked objects are reclaimed as garbage because they are unreachable from the root objects in the network. To determine whether all systems have finished the local marking and the exchange of the reference information, a distributed termination detection algorithm or a centralized server is used. These algorithms can collect distributed cyclic garbage. The overhead of the termination detection, however, is extremely heavy, and in large scale distributed systems the centralized server becomes a bottleneck.
In [Lang et al.], both reference counting and tracing are used. Reference counting is the basic algorithm, and tracing is used to collect cyclic garbage. To avoid global synchronization, this algorithm introduced tracing within a group. Before tracing starts, group negotiation is performed to organize a group, and the tracing is performed only within that group. Cyclic garbage within the group is detected and reclaimed. After the tracing finishes, the group is disbanded. This algorithm is fault tolerant and does not need global synchronization, but it cannot collect garbage completely.
[Juul and Jul] use a global comprehensive tracing algorithm. Their goal was to collect all garbage in the entire distributed system. The local garbage collector performs local tracing, and during each global garbage collection cycle the global garbage collector gets information about references from remote systems by cooperating with other global collectors. To determine when the global marking is finished, a distributed termination detection algorithm is used. This algorithm does not scale well.
Timestamp propagation uses timestamps instead of marks. Periodically, each system propagates timestamps to reachable objects from its root, and objects whose timestamps are less than a global lower bound are regarded as garbage. These algorithms do not need global synchronization, but the overhead of finding a global lower bound is extremely heavy. [Hughes] uses a distributed termination detection algorithm to find the global lower bound, but the algorithm was very expensive and did not scale well. [Ladin and Liskov] use a logically centralized service which keeps all information about inter-system references, but this algorithm did not scale well either. [Neuman] uses timeouts and last referenceable timestamps for garbage collection. Objects whose last referenceable timestamp is not more recent than a lower bound are garbage, and they are reclaimed. He uses a local lower bound instead of a global lower bound; that is, the algorithm sacrifices safety for performance. DCOM [Chappell] uses pinging to examine the status of clients. Clients periodically send a ping message to the referenced object. If an object has not received any ping message from a client for a sufficient time interval, the client is assumed to have died.
[Fuchs] suggested a backtracing mechanism, but this mechanism assumed that all objects maintain backward references. Since distributed objects usually do not maintain backward references, the mechanism is not practical. Maintaining backward references is also not scalable. [Maheshwari and Liskov] suggested an enhanced backtracing mechanism. Each system maintains two tables, inrefs and outrefs, which contain the information about remote references. The information about backward references is calculated using these tables and local tracing. The overhead of calculating backward references is heavy, and it is very hard to preserve safety and completeness in the presence of concurrent mutators and garbage collectors.
Migration [Bishop; Shapiro et al.; Gupta and Fuchs] was also suggested to collect cyclic garbage. In this algorithm, all objects on a garbage cycle are migrated to a single system and are collected during local garbage collection. This approach is not suitable for large scale distributed systems because some objects may not be migratable and some systems may not allow remote objects to be migrated to them.
Conclusion
We have described a practical and efficient distributed garbage collection mechanism which uses timeouts, last referenceable timestamp propagation, and backward inquiry.
Timeouts are suitable for large scale distributed systems because they make error recovery easy and do not require well-behaving cooperation of all distributed systems. Only cooperative references are considered as actual references for garbage collection.
Last referenceable timestamp propagation and backward inquiry collect distributed cyclic garbage safely and completely without global synchronization or backward references. Since the messages necessary for cyclic garbage collection are bundled with the messages used for refreshing, overhead is minimized.
The mechanism is fault tolerant, so safety and liveness are supported despite server and network failures. Each server maintains information about remote servers which have references to local objects and uses this information to set a threshold for the expiration of objects that have not been refreshed. By maintaining the last and the expected next refreshing time for each server, garbage collection can be performed even when some of the servers are unavailable.
References
[Bevan] D. I. Bevan. Distributed garbage collection using reference counting. In Lecture Notes in Computer Science, PARLE. Springer-Verlag, New York.
[Baker et al.] Mary G. Baker, John H. Hartman, Michael D. Kupfer, Ken W. Shirriff, and John K. Ousterhout. Measurements of a distributed file system. SOSP.
[Birrell et al.] Andrew Birrell, David Evers, Greg Nelson, Susan Owicki, and Edward Wobber. Distributed garbage collection for network objects. Technical Report, Digital Equipment Corporation Systems Research Center, December.
[Bishop] P. B. Bishop. Computer Systems with a Very Large Address Space and Garbage Collection. Technical Report MIT/LCS/TR, MIT Laboratory for Computer Science, Cambridge, MA, May.
[Chappell] David Chappell. Understanding ActiveX and OLE. Microsoft Press.
[Fuchs] Matthew Fuchs. Garbage collection on an open network. IWMM: International Workshop on Memory Management, Kinross, UK. Berlin; New York, NY: Springer. Lecture Notes in Computer Science.
[Gupta and Fuchs] Aloke Gupta and W. Kent Fuchs. Garbage collection in a distributed object-oriented system. IEEE Transactions on Knowledge and Data Engineering, April.
[Hughes] John Hughes. A distributed garbage collection algorithm. In Jean-Pierre Jouannaud, editor, ACM Conference on Functional Programming Languages and Computer Architecture, Nancy, France, September. Lecture Notes in Computer Science. Springer-Verlag.
[Juul and Jul] Niels Christian Juul and Eric Jul. Comprehensive and robust garbage collection in a distributed system. Saint-Malo, France. Edited by Yves Bekkers and Jacques Cohen. Berlin; New York, NY: Springer-Verlag. Lecture Notes in Computer Science.
[Maheshwari and Liskov] Umesh Maheshwari and Barbara H. Liskov. Collecting distributed garbage cycles by back tracing. ACM Symposium on Principles of Distributed Computing, August.
[Neuman] Barry Clifford Neuman. The virtual system model: A scalable approach to organizing large systems. PhD dissertation, University of Washington.
[Ladin and Liskov] Rivka Ladin and Barbara Liskov. Garbage collection of a distributed heap. International Conference on Distributed Computing Systems.
[Lang et al.] Bernard Lang, Christian Queinnec, and Jose Piquer. Garbage collecting the world. In Conference Record of the Nineteenth Annual ACM Symposium on Principles of Programming Languages.
[Lieberman and Hewitt] H. Lieberman and C. Hewitt. A real-time garbage collector based on the lifetimes of objects. Communications of the ACM, June.
[Piquer] J. Piquer. Indirect reference counting: A distributed garbage collection algorithm. In Proceedings of Parallel Architectures and Languages Europe, Lecture Notes in Computer Science, E. H. L. Aarts, J. van Leeuwen, and M. Rem, Eds. Springer-Verlag, Berlin.
[RMI] Sun Microsystems. Java Remote Method Invocation Specification. http://java.sun.com
[Schelvis] Marcel Schelvis. Incremental distribution of timestamp packets: A new approach to distributed garbage collection. In Proceedings of OOPSLA, October.
[Shapiro et al.] M. Shapiro, O. Gruber, and D. Plainfosse. A garbage detection protocol for a realistic distributed object-support system. Rapport de Recherche INRIA, INRIA-Rocquencourt, Paris, France, November.
[Ungar and Jackson] David Ungar and Frank Jackson. Tenuring Policies for Generation-Based Storage Reclamation. OOPSLA, September.
[Vogel and Duddy] Andreas Vogel and Keith Duddy. Java Programming with CORBA. John Wiley & Sons.
APPENDIX: Proof of Correctness
The correctness of backward inquiry is proven by introducing an AYG graph, which is defined as follows.
Definition. An AYG graph is a directed acyclic graph represented by $G(V, E)$, where $V$ is a set of objects and $E$ is a set of links. $G(V, E)$ has one exit object $v_e$, which is reachable from all $v_i \in V$. $v_e$ does not have any outgoing links, and each $v_j \in V - \{v_e\}$ has only one outgoing link.
Proposition 1. The result of backward inquiry for an AYG graph $G(V, E)$ with $v_e$ and $V_L = \{v_i \in V \mid v_i \text{ is alive}\}$ is correct. That is, if $V_L \neq \emptyset$, then ALIVE replies are returned to $v_e$. Otherwise, $v_e$ is global garbage.
Proof. The transpose of a directed graph $G(V, E)$ is the graph $G^T(V, E^T)$, where $E^T = \{(v_i, v_j) \in V \times V \mid (v_j, v_i) \in E\}$. Thus $G^T(V, E^T)$ is $G(V, E)$ with all its edges reversed. The graph $G^T(V, E^T)$ is a tree whose root is $v_e$. The AYG algorithm traverses the tree $G^T(V, E^T)$ starting from $v_e$ and searches for live objects. The path of the AYG message is a subgraph of the tree $G^T(V, E^T)$. When the AYG message reaches an object which is alive or which does not have outgoing links in $G^T$, the reply to the message returns to the exit object $v_e$ using the path in $G(V, E)$. Therefore, backward inquiry for an AYG graph always terminates. Any object $v_i \in V_L$ returns an ALIVE reply to $v_e$; thus $v_e$ is global garbage iff $V_L = \emptyset$.
Proposition 2. When backward inquiry is performed for a graph $G(V, E)$ with an AYG starting object $v_e$ and a set of live objects $V_L$, the path of the AYG message forms an AYG graph.
Proof. The path of the AYG message, $G'(V', E')$, can be generated by the following algorithm when backward inquiry is performed for the graph $G(V, E)$ with an AYG starting object $v_e$ and a set of alive objects $V_L$.
1. $V' = \emptyset$, $E' = \emptyset$, $V_t = V$
2. $V' = V' \cup \{v_e\}$
3. Choose $v_i \in V_t$ such that $(v_i, v_j) \in E$ for some $v_j \in V'$
4. $V_t = V_t - \{v_i\}$
5. If $v_i \notin V_L$ then $V' = V' \cup \{v_i\}$ and $E' = E' \cup \{(v_i, v_j)\}$, else do nothing
6. Go back to step 3 until $V_t = \emptyset$
By induction, it is proved that the graph $G'(V', E')$ generated by the above algorithm is an AYG graph. The initial graph $G'(\{v_e\}, \emptyset)$ is trivially an AYG graph. Assume that $G'_n(V'_n, E'_n)$ is an AYG graph. In the conditional step of the above algorithm, if $v_i \notin V_L$, then $v_i$ and $(v_i, v_j)$ are added to $G'_n(V'_n, E'_n)$. $v_i$ has only one outgoing link, $(v_i, v_j)$, and $v_e$ is reachable from $v_i$ because $v_j$ is reachable from $v_i$ and $v_e$ is reachable from $v_j$. Therefore $G'_{n+1}(V'_{n+1}, E'_{n+1})$ is an AYG graph, where $V'_{n+1} = V'_n \cup \{v_i\}$ and $E'_{n+1} = E'_n \cup \{(v_i, v_j)\}$.
Proposition 3. Backward inquiry for any graph $G(V, E)$ with an AYG starting object $v_e \in V$ is correct.
Proof. An AYG message is propagated from $v_e$ to objects in $V - \{v_e\}$. By Proposition 2, the path of the message forms an AYG graph. By Proposition 1, AYG backward inquiry is correct for that path. Hence the algorithm is correct for the entire graph $G(V, E)$.
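The property proved above, that the starting object is global garbage exactly when no live object can be reached by following links backward, can be checked with a small centralized simulation. This is only a sketch: the real mechanism exchanges AYG messages and ALIVE or PENDING replies between hosts, whereas the hypothetical function below walks an in-memory map of incoming links, and it treats an already visited object the way a PENDING reply would be treated.

```python
def backward_inquiry(start, incoming, alive):
    """Return "ALIVE" or "GARBAGE" for the object `start`.

    `incoming` maps each object to the objects that hold a link to it;
    `alive` is the set of objects known to be live.
    """
    visited = {start}
    stack = [start]
    while stack:
        obj = stack.pop()
        for parent in incoming.get(obj, ()):
            if parent in alive:
                return "ALIVE"   # a live referrer answers ALIVE
            if parent in visited:
                continue         # cycle detected: counts as PENDING
            visited.add(parent)
            stack.append(parent)
    return "GARBAGE"             # no live object refers to `start`
```

For a three-object cycle with no outside referrer the function returns "GARBAGE", and adding a single live referrer to any object on the cycle changes the result to "ALIVE".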
Description
Sung-Wook Ryu and B. Clifford Neuman. "Distributed garbage collection by timeouts and backward inquiry." Computer Science Technical Reports (Los Angeles, California, USA: University of Southern California. Department of Computer Science) no. 684 (1998).
Asset Metadata
Creator
Neuman, B. Clifford (author), Ryu, Sung-Wook (author)
Core Title
USC Computer Science Technical Reports, no. 684 (1998)
Alternative Title
Distributed garbage collection by timeouts and backward inquiry (title)
Publisher
Department of Computer Science, USC Viterbi School of Engineering, University of Southern California, 3650 McClintock Avenue, Los Angeles, California, 90089, USA (publisher)
Tag
OAI-PMH Harvest
Format
15 pages (extent), technical reports (aat)
Language
English
Unique identifier
UC16269196
Identifier
98-684 Distributed Garbage Collection by Timeouts and Backward Inquiry (filename)
Legacy Identifier
usc-cstr-98-684
Rights
Department of Computer Science (University of Southern California) and the author(s).
Internet Media Type
application/pdf
Copyright
In copyright - Non-commercial use permitted (https://rightsstatements.org/vocab/InC-NC/1.0/)
Source
20180426-rozan-cstechreports-shoaf (batch), Computer Science Technical Report Archive (collection), University of Southern California. Department of Computer Science. Technical Reports (series)
Access Conditions
The author(s) retain rights to their work according to U.S. copyright law. Electronic access is being provided by the USC Libraries, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright.
Repository Name
USC Viterbi School of Engineering Department of Computer Science
Repository Location
Department of Computer Science. USC Viterbi School of Engineering. Los Angeles, CA, 90089
Repository Email
csdept@usc.edu