USC Computer Science Technical Reports, no. 648 (1997)
Local Error Recovery in SRM: Comparison of Two Approaches
Ching-Gung Liu, USC
Deborah Estrin, USC/ISI
Scott Shenker, Xerox
Lixia Zhang, UCLA/Xerox
Abstract
SRM is a generic framework for reliable multicast delivery. In order to
maximize the collaboration among the group members in error recovery, both
retransmission requests and replies are multicast to the entire group. SRM uses
random timers to effectively suppress duplicate requests and replies. However,
a few members with frequent losses can still cause frequent retransmissions to
all the group members. To further improve the scalability of SRM, one must
localize the scope of error recovery traffic. In this paper we present two
approaches to local recovery: hop-based scope control and use of local recovery
groups. The first approach uses hop count to limit the distance requests and
replies travel, whereas the second approach confines error recovery traffic to
local recovery groups. The local recovery groups and hop count settings are
automatically created and dynamically adjusted based on observed loss patterns.
We use simulation experiments to examine the performance of both approaches.
Introduction
The Scalable Reliable Multicast framework (SRM) is a generic framework for
reliable multicast delivery. In order to maximize the collaboration among the
group members in the error recovery process, both retransmission requests and
replies are multicast to the entire group. SRM takes a receiver-driven approach
in error recovery to avoid the message implosion problem and uses random timers
to effectively suppress duplicate requests and replies. Unfortunately, because
every lost packet results in a reply being sent to the entire group, a single
lossy link can still cause frequent global retransmissions. This behavior limits
the scalability of SRM as network and group size increase. To further improve
the scalability of SRM, we wish to localize the scope of error recovery traffic.
In this paper we present two different mechanisms to localize the scope of error
recovery traffic. The hop-scoped error recovery mechanism uses hop count to limit
the distance that request and reply messages can travel. In contrast, the
group-scoped error recovery mechanism confines the propagation of error recovery
traffic within some local multicast groups. Simulation results of these
mechanisms suggest that they both reduce error recovery traffic without
introducing significant overhead.
The paper is organized as follows. The next section gives a brief description of
the SRM framework and the problems that we address in this paper. The two
sections after it describe the hop-scoped error recovery and group-scoped error
recovery mechanisms, respectively. We then present the simulation models and
analyze the simulation results, review related work, and conclude with a short
summary.

Basic Approaches of SRM
In this section we give an overview of SRM, emphasizing the features crucial to
our proposed mechanisms. We use the term session to mean a multicast application
that uses SRM as its underlying reliable multicast service. SRM provides basic
reliability support, i.e., it guarantees data delivery to all members in the
multicast session. Other functionalities, such as total ordering and fate
sharing, if desired, are the responsibilities of the application itself. To avoid
message implosion in the error recovery process, SRM is receiver-initiated, with
each receiver being responsible for detecting data losses and requesting
retransmissions. SRM also adopts the approach of "multicasting everything" to
maximize the collaboration among members in the process of error recovery.
Requests and replies are multicast to all members in the session. Multicasting a
request allows the nearest member with the requested data to send a reply first;
it also suppresses other members from sending out duplicate requests for the same
data. Similarly, multicasting a reply gets the reply to all members who suffer
the loss, without requiring the replier to know their exact locations, and also
suppresses duplicate replies.
The SRM mechanisms can be decomposed into two parts: group state
synchronization and receiver-initiated error recovery. Members periodically
exchange session messages to report current group state (e.g., the highest
received sequence number from each source) and to determine the propagation
delays between each pair of members. Members use group state information to
detect data losses. This information is critical for the receiver-initiated error
recovery approach because members do not otherwise know what has been sent to the
session group. Members use propagation delays to schedule their request or reply
timers, as described below.
When a packet gets lost, each member detecting the loss waits a random time
period before sending the retransmission request. The random timer is scheduled
in the time interval $[A \cdot T, (A + B) \cdot T]$, where $T$ is the propagation
delay between the requester and the data source, and $A$ and $B$ are constants.
The request timer interval is a function of the distance to the source because we
want a member near the source to request first. When the timer expires, the
scheduled request is multicast to the session group. If a duplicate request is
received or the current scheduled request is sent, the requester exponentially
backs off its request timer. If a reply is received, the scheduled request is
canceled. A member with the requested data responds to the request by scheduling
a reply. The reply is scheduled in the time interval
$[a \cdot t, (a + b) \cdot t]$, where $t$ is the propagation delay between the
replier and the requester, and $a$ and $b$ are constants. When the timer expires,
the scheduled reply is multicast to the session. The scheduled reply is canceled
if a duplicate reply is received. Therefore, the member immediately behind the
lossy link is most likely to send its request, and the member immediately above
the lossy link is most likely to send its reply.

[Footnote: Fate sharing is when a multicast session terminates if a single member
or a specific subset of members in the session fail, depending on the semantics
of the application.]

Figure 1: A member with persistent losses causes bandwidth overhead to all session
members. L1 and L2 are lossy links. (Diagram labels: source; members p, q, and r;
links L1 and L2.)

The randomization of request and reply timers in SRM gives members an
opportunity to suppress one another and thus avoid the request and reply message
implosion problem. However, a member with persistent losses can still trigger
enough request and reply activity to overwhelm other members in the session. For
example, consider the case where member p in Figure 1 loses a packet at one of
the lossy links. Its request and the reply from member q will reach all members
in the session, which causes duplicate data reception for all other members.
Moreover, multiple lossy links may make the problem worse because multiple lossy
links reduce duplicate suppression. For example, in Figure 1, if the reply from
member q is lost at the other lossy link, the scheduled reply at member r will
not be suppressed, and a duplicate reply is multicast to the session. In a
session of size n, if k members lose a packet, the error recovery traffic reaches
all n members by using global error recovery. As a result, $(n - k)/n$ of the
error recovery traffic is wasted. The premise of this paper is that an error
recovery mechanism must not only avoid message implosion but should also isolate
error recovery traffic to the required scope.
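
The timer rules and the wasted-traffic estimate above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the default constants, and the choice of doubling as the backoff rule are hypothetical, since the text says only that the constants exist and that the backoff is exponential.

```python
import random


def request_delay(T, A=2.0, B=2.0, backoffs=0, rng=random):
    """Draw a request timer uniformly from [A*T, (A+B)*T].

    T is the propagation delay to the data source. The defaults for A and B
    and the doubling per backoff round are assumptions for illustration.
    """
    scale = 2 ** backoffs
    return scale * rng.uniform(A * T, (A + B) * T)


def reply_delay(t, a=1.0, b=1.0, rng=random):
    """Draw a reply timer uniformly from [a*t, (a+b)*t].

    t is the propagation delay between the replier and the requester.
    """
    return rng.uniform(a * t, (a + b) * t)


def wasted_fraction(n, k):
    """Share of global recovery traffic reaching members that lost nothing."""
    return (n - k) / n
```

For a session of ten members in which two lose a packet, `wasted_fraction(10, 2)` evaluates to 0.8, i.e. most of the globally multicast recovery traffic is delivered to members that do not need it.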
We have experimented with two different mechanisms to localize error recovery
traffic: hop-scoped error recovery and group-scoped error recovery. Note that
local recovery is a performance optimization; thus the mechanisms do not have to
achieve the optimal or precise degree of locality. The more local the recovery,
the less recovery traffic overhead there is. However, independently of how well
our local recovery mechanisms work, all data losses are eventually recovered. In
particular, in both mechanisms that we propose here, a member may occasionally
send its requests and replies to an inappropriate scope. Such occasional abnormal
behavior is soon corrected and has little impact on the overall performance. We
describe these two approaches in detail in the following sections.

[Footnote: SRM assumes all session members, not only the data source, save all the
application data. If some members do not save the data requested, they simply do
not participate in the error recovery process.]

[Footnote: We refer to a lossy link as the place where a data packet is dropped;
it can be either a link or a router along the delivery path.]
Hop-Scoped Error Recovery
The simplest solution to control the scope of requests and replies is to limit the
number of hops they travel. We wish to use the minimum hop counts possible in
requests and replies. In order to recover from a loss, a request must reach at
least one member who has the requested data. A member behind a lossy link could
perhaps learn from its past experience to identify the lossy link and send its
subsequent requests with a hop count that reaches beyond the lossy link. However,
if there are multiple lossy links along the path, a conservative approach would be
to set the hop limit large enough so that requests can go beyond the farthest
lossy link. Consequently, request messages would travel long paths in all
directions, resulting in high overhead. Therefore, we do not attempt to have all
members' requests reach beyond the lossy link. In fact, to minimize the hop limit
for request messages, our design takes the approach that a member p's request
should extend just far enough to reach some other member q who is closer to the
source. If the loss occurred between q and p, then q will be able to retransmit
the lost packet. If the loss occurred elsewhere, so that q missed the packet as
well, then we only need to make sure that q will send a request further up towards
the source. Since request timers are set proportional to the measured propagation
delays to the source, a member can assume that it is the closest one behind the
lossy link if its request timer expires first. It does not matter if a member
further away sends an unsuppressed request that does not reach beyond the lossy
link; all that matters is that at least one request makes it across the lossy
link, and it is likely that this request comes from the closest member behind the
lossy link. Other members simply rely on this request to trigger the
retransmission of the lost data.
Note that a request is used to suppress other duplicate requests as well as to ask
for repair. While limiting the hop count of the request message limits the
overhead it generates, it also diminishes its ability to suppress the same
requests from other members. However, in general the request hop count in our
mechanism is relatively small in comparison with the distance to the source. As a
result, the request traffic per loss is still acceptable even though multiple
requests for the same retransmission are present. Moreover, because request
timers are based on measured distances to the source, a member far behind the
lossy link may receive a reply before sending a request, which relaxes SRM's
previous suppression requirement that a request should reach all members who share
the same loss.
On the other hand, we do require that a reply reach all members who suffer the
same loss. Since a replier does not know where a packet is dropped, it is
difficult for the replier to identify who shares the loss. Because a requester
assumes it is immediately behind the lossy link (members closer to the requester
are likely to be the ones whose replies are not suppressed), it can determine, as
we show below, an upper bound on the hop count needed to reach all other members
who share the loss. The upper bound is called the proxy hop count because the
requester acts as proxy for members who share the same loss. The replier
determines its reply hop count based on incoming requests. For example, if a
replier receives a request from a requester h hops away, then the replier's reply
hop count is given by h plus the proxy hop count of the requester. Note that the
reply hop count is an upper bound, and the reply may reach members who do not
share the loss.

Figure 2: Request and reply hop counts. Thick lines represent the data delivery
path; circles represent the regions of request and reply scopes. (Diagram labels:
source s; members p, q, r, u, v, and w; scope circles marked with π and Π.)
Our hop-scoped error recovery requires a member to measure its distances, in
terms of the number of hops, to other members in a session. The distance is
measured by exchanging session messages. Since session messages are periodic, the
measurement is also periodically refreshed. We discuss the algorithms that
determine the request and proxy hop counts in the next two subsections, and then
describe the detailed mechanism.

Request Hop Count
Since each requester sets its request timer according to its measured distance to
the data source, a requester near a lossy link should detect a loss and send its
request first. Its request is most efficient in terms of both the recovery delay
and the request scope. Based on this observation, a requester can rely on other
requesters closer to the lossy link to ask for repair. For example, in Figure 2 a
packet from source s is dropped downstream of member r. Member p is the closest
requester to the lossy link. Member q does not have to send its request to reach
r, because the request from p can request retransmission for it. Note that instead
of trying to explicitly identify the lossy link, each requester assumes itself to
be the immediate member downstream of the lossy link and sets its request hop
count large enough to reach at least one member beyond the lossy link. Even though
its request may not reach a member with the requested data for a particular loss,
the requester can rely on other requesters that are upstream, whose request will
reach a member with the requested data. Hence, the request hop count $\pi^s_p$
for a member p regarding a source s in a session G can be set to

$$\pi^s_p = \min \{\, h_{pq} \mid q \in G \wedge h_{sq} < h_{sp} \,\}$$

where $h_{pq}$ is the distance, in terms of the number of hops, between p and q.
$h_{pq}$ is determined using distance information obtained from session message
exchange. Note that the request hop count is calculated on a per-source basis,
under the assumption that source-specific distribution trees are used.

[Footnote: The member beyond the lossy link can be either an upstream member or a
sibling to the requester.]
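
As a minimal sketch of the request hop count rule (the smallest hop distance from a requester to any member that is closer to the source), the following Python function works on a table of measured hop distances. The function name, the dictionary layout, and the example distances are assumptions made for illustration, and the source is treated as a session member at distance zero from itself.

```python
def request_hop_count(p, source, members, hops):
    """Smallest hop distance from p to a member closer to the source than p.

    hops maps an unordered pair of members, stored as a tuple in either
    order, to the measured hop distance between them (hypothetical layout).
    """
    def h(x, y):
        if x == y:
            return 0
        return hops[(x, y)] if (x, y) in hops else hops[(y, x)]

    closer = [h(p, q) for q in members if h(source, q) < h(source, p)]
    return min(closer)


# Example: a chain s - r - p - q in which neighbours are two hops apart.
chain = {("s", "r"): 2, ("s", "p"): 4, ("s", "q"): 6,
         ("r", "p"): 2, ("r", "q"): 4, ("p", "q"): 2}
print(request_hop_count("q", "s", ["s", "r", "p", "q"], chain))
```

In this chain every requester only needs to reach its immediate upstream neighbour, so each request travels two hops rather than the full distance to the source.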
Proxy Hop Count
A requester sets its proxy hop count so as to reach other members that share the
same loss. Since a requester has no knowledge of the underlying network topology,
it can only estimate its proxy hop count.

A requester only has to consider members farther away from the source in
determining its proxy hop count. There are four kinds of relationship between a
requester and a member farther away from the source. They are demonstrated in
Figure 2 by the member pairs {p, q}, {p, u}, {p, v}, and {p, w}.

p and q have an upstream-downstream relationship; q most likely shares losses
with p. An upstream member should be proxy for its downstream members to request
retransmission. If q is downstream of p, then $h_{sp} + h_{pq}$ will be equal to
$h_{sq}$. However, because a member may be one hop away from its first-hop router,
$h_{sp} + h_{pq}$ may be two hops greater than $h_{sq}$. We refer to the
downstream distance of p regarding a source s as $\delta^s_p$:

$$\delta^s_p = \max \{\, h_{pq} \mid q \in G^s_p \wedge h_{sp} + h_{pq} \le h_{sq} + 2 \,\}$$

$G^s_p$ is the set of members who are farther away from s than p is.

p and u are siblings, and p is within u's request hop count, i.e., u requests p
for repair. p has to be proxy for u as well as u's downstream members. We refer to
this distance as $\sigma^s_p$:

$$\sigma^s_p = \max \{\, h_{pu} + \max \{ \delta^s_u, \sigma^s_u \} \mid u \in G^s_p \wedge h_{pu} \le \pi^s_u \,\}$$

p and v are also siblings, and v is within p's request hop count, but p is not
within v's request hop count. p has to be proxy for v and v's downstream members
because v's requests may be suppressed by p. We refer to this distance as
$\tau^s_p$:

$$\tau^s_p = \max \{\, h_{pv} + \max \{ \delta^s_v, \sigma^s_v \} \mid v \in G^s_p \wedge h_{pv} \le \pi^s_p \,\}$$

Note that since v is suppressed by p, p does not have to consider $\tau^s_v$.

p and w are also siblings, but they are not within the request hop counts of each
other, i.e., w does not ask p for repair and its requests are not suppressed by p.
Therefore, w must send its own requests, and p need not be proxy for w.

A requester can determine its proxy hop count by taking the maximum of its
$\delta^s_p$, $\sigma^s_p$, and $\tau^s_p$ from all members from which it hears
session messages. Note that the proxy hop count must also be calculated on a
per-source basis.

[Footnote: Issues related to the use of other types of distribution trees are left
for future study.]
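
One possible reading of the proxy hop count computation is sketched below in Python. Everything here is an assumption made for illustration: the names `delta`, `sigma`, and `tau` stand for the downstream distance and the two sibling distances described above, the two-hop allowance in the downstream test follows the first-hop-router remark, and members are processed from the farthest to the nearest so that the values of farther members are available when a nearer member needs them.

```python
def proxy_hop_counts(source, members, hops):
    """Return (request hop counts, proxy hop counts) for every member."""
    def h(x, y):
        if x == y:
            return 0
        return hops[(x, y)] if (x, y) in hops else hops[(y, x)]

    dist = {m: h(source, m) for m in members}

    # Request hop count: nearest member strictly closer to the source.
    pi = {p: min((h(p, q) for q in members if dist[q] < dist[p]), default=0)
          for p in members}

    delta, sigma, tau, proxy = {}, {}, {}, {}
    for p in sorted(members, key=lambda m: dist[m], reverse=True):
        farther = [q for q in members if dist[q] > dist[p]]
        # Downstream members: the path via p is at most two hops longer.
        delta[p] = max((h(p, q) for q in farther
                        if dist[p] + h(p, q) <= dist[q] + 2), default=0)
        # Siblings whose requests reach p.
        sigma[p] = max((h(p, u) + max(delta[u], sigma[u]) for u in farther
                        if h(p, u) <= pi[u]), default=0)
        # Siblings whose requests p may suppress.
        tau[p] = max((h(p, v) + max(delta[v], sigma[v]) for v in farther
                      if h(p, v) <= pi[p]), default=0)
        proxy[p] = max(delta[p], sigma[p], tau[p])
    return pi, proxy


# Hypothetical hop distances for a small tree rooted at source s.
tree = {("s", "p"): 3, ("s", "q"): 4, ("s", "u"): 4, ("s", "w"): 4, ("s", "x"): 5,
        ("p", "q"): 3, ("p", "u"): 3, ("p", "w"): 5, ("p", "x"): 4,
        ("q", "u"): 4, ("q", "w"): 6, ("q", "x"): 3,
        ("u", "w"): 6, ("u", "x"): 5, ("w", "x"): 7}
pi, proxy = proxy_hop_counts("s", ["s", "p", "q", "u", "w", "x"], tree)
print(pi["p"], proxy["p"])
```

Because the result is an upper bound, the proxy hop count of p in this example is larger than the distance to its farthest downstream member.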
Mechanism Description
As described above, members in a session exchange session messages to measure
the distance, in terms of the number of hops, to other members, and the distances
between each pair of members are used to compute the request and proxy hop
counts. Moreover, proxy hop count information is added to each session message.
In particular, for each source s, a member p includes its distance to the source
$h_{sp}$, its request hop count $\pi^s_p$, and $\max\{\delta^s_p, \sigma^s_p\}$,
the larger of its downstream distance and its sibling distance, in its session
messages.
The computation of request and proxy hop counts is performed iteratively. A
member recomputes its request and proxy hop counts when a session message is
received. Since session messages are multicast periodically, it takes
approximately one session cycle time to complete the computation of the request
hop count. However, the computation of the proxy hop count depends on results
from other members, and so it may take several session cycles to converge. Note
that in order to capture session dynamics, the computation of request and proxy
hop counts is timestamped and aged, so obsolete results are timed out.
When a loss is detected, a requester sends a request with its request hop count.
Normally, at least one request from members behind the lossy link reaches a member
with the requested data and triggers a reply. However, if no reply is received,
due to packet loss or an underestimated request hop count, the requester then
sends a second request globally, and correspondingly the reply will be sent
globally as well. The request message carries the distance to the source and the
proxy hop count. The distance to the source is used to determine request
suppression. A member suppresses its scheduled request if the received request is
from a member closer to the source. Otherwise, the scheduled request should be
sent. The replier uses the proxy hop count to determine the reply hop count.
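
The request-side rules in this paragraph can be written down as two small Python helpers. The names and the use of `None` to stand for a global (unscoped) request are hypothetical choices for this sketch.

```python
GLOBAL_SCOPE = None  # hypothetical marker meaning "no hop limit"


def request_scope(attempt, request_hop_count):
    """First try uses the request hop count; any retry is sent globally."""
    return request_hop_count if attempt == 0 else GLOBAL_SCOPE


def should_suppress(own_distance_to_source, received_distance_to_source):
    """Suppress the scheduled request only if the received request comes
    from a member closer to the source."""
    return received_distance_to_source < own_distance_to_source
```

A member at distance 5 from the source that hears a request from a member at distance 3 cancels its own request, while a request heard from a member at distance 7 leaves the scheduled request in place.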
When a replier responds to a request, a different approach would be that the
replier multicasts its reply with a hop count h, where h is the distance to the
original requester. After receiving the reply, the original requester relays the
reply to other downstream members with its proxy hop count. However, this
two-step reply-relaying scheme introduces additional delay in loss recovery, which
may cause additional duplicate requests to be sent. Furthermore, the scopes of the
first reply and the relayed reply overlap with one another, so that members within
the overlapping area will receive duplicate replies. Since the distance h between
a replier and a requester is relatively small in the average case, multicasting a
reply with a hop count equal to h plus the requester's proxy hop count should not
introduce significant overhead in terms of network bandwidth. Therefore, in our
hop-scoped error recovery, a replier calculates its reply hop count as the sum of
the distance to the requester and the proxy hop count of the requester. If
multiple requests are received, the replier takes the maximum of its calculations
as the reply hop count.

[Footnote: A session cycle time is the period between two consecutive session
messages sent by a member. Since all members send their session messages at the
same rate, a member should receive a session message from each member during a
session cycle time.]
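
The reply hop count rule is a one-line computation. A minimal Python sketch, with a hypothetical function name and input layout, is:

```python
def reply_hop_count(requests):
    """requests is a list of (hops_to_requester, requester_proxy_hop_count)
    pairs, one per request received for the same lost packet."""
    return max(h + proxy for h, proxy in requests)
```

For instance, a replier that hears one request from 2 hops away carrying a proxy hop count of 3, and another from 4 hops away carrying a proxy hop count of 0, sends its reply with a hop count of 5.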
Group-Scoped Error Recovery
Ideally, a request should reach a few neighbors with the requested data to ask for
repair, and a reply should reach all members who share the same loss. However, the
propagation of error recovery traffic is non-directional in hop-scoped error
recovery. The wasted bandwidth is not negligible in some cases, especially in
terms of the reply traffic. For example, the reply from member r in Figure 2
propagates upstream as well as downstream to recover a loss. In this section we
consider the use of separate multicast groups to more precisely control the scope
of error recovery traffic.
We use the concept of local recovery groups. A local group consists of a set of
members who share the same losses to at least some degree. Members share the same
losses because they share one or more lossy links along the data delivery path
from a source. Because we assume source-specific distribution trees, the creation
of local groups is on a per-source basis. However, our mechanism does not limit
members to a single local group per source. Multiple local groups can be
associated with a source, where each group is responsible for error recovery of
one or more lossy links. For a specific source, the relationship among these lossy
links is either ancestor-descendant or siblings, so the memberships of these
local groups are either perfectly nested or disjoint.
Our group-scoped error recovery follows the SRM approach in which each member
is autonomous. That is, each member joins or leaves a local group based on its
independent decision. There is no centralized coordination among members. Members
in a session start with global error recovery. A member who suffers data losses
from a source proposes, in its request sent globally, the creation of a local
multicast group for error recovery. The creation of the local group is granted by
a replier in its reply. Since the reply is sent globally, members who share the
same loss join the local group when they receive the reply. These members are
called the normal members of the group, and their subsequent requests are sent to
the local group to ask for repair. If a normal member has joined multiple nested
groups, it assumes the loss is within the scope of its innermost local group, and
therefore the member sends requests to the innermost group. If the loss actually
occurred in an outer group, another normal member who sees the outer group as its
innermost group would have detected the loss and requested retransmission.
A normal member in a local group measures the extent to which it shares losses
with that group. It stays in the group if the degree of loss sharing is high;
otherwise it leaves the group. We use the concept of an error fingerprint to
measure loss sharing. An error fingerprint is the sequence numbers of the last f
losses in the local group, and it is used by session members to determine the
degree of loss sharing.
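
An error fingerprint can be modelled as a bounded queue of sequence numbers. The sketch below is an illustration only: the class and function names are hypothetical, and the similarity measure (the fraction of fingerprint entries that the member also lost) is an assumption, since the text does not spell out how similarity is computed.

```python
from collections import deque


class ErrorFingerprint:
    """Sequence numbers of the last f losses seen in a local group."""

    def __init__(self, f):
        self.recent = deque(maxlen=f)  # older losses fall off automatically

    def record_loss(self, seq):
        self.recent.append(seq)

    def shared_fraction(self, own_losses):
        """Assumed similarity: share of fingerprint losses also lost locally."""
        if not self.recent:
            return 0.0
        own = set(own_losses)
        return sum(1 for seq in self.recent if seq in own) / len(self.recent)


def should_join(fingerprint, own_losses, x):
    """Join when more than a fraction x of the group's losses are shared."""
    return fingerprint.shared_fraction(own_losses) > x
```

With f = 4 and group losses 1 through 6, the fingerprint holds 3, 4, 5, and 6; a member that lost packets 4, 5, and 9 shares half of them.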
Other members selectively join a local group to help error recovery. They are
called helpers of the group. When a helper receives a request from a member of a
local group, the helper sends its reply to that same local group.
Our group-scoped error recovery follows the soft-state approach. That is,
membership solicitation and loss sharing measurement are periodically refreshed to
capture session dynamics. The mechanism is described in detail in the following
sections. In particular, we discuss the criteria for proposing, granting, joining,
and leaving local groups.

[Footnote: In the rest of the paper we will use the terms normal member and member
interchangeably.]
Proposing a Local Group
A member proposes a local group if its error rate exceeds a threshold. In the
extreme case, we can set this threshold to zero to encourage all error recovery to
be handled by local groups. If a member decides to propose a local group, it
should wait a period of time before proposing in order to learn of existing local
groups. If there is an existing local group, the member should join the existing
local group instead of proposing a new one. The details of a member joining an
existing local group are discussed in later sections. The waiting period can be
measured in terms of time, number of losses, or number of received data packets.
If the waiting period is long, a member has more chance to learn of existing local
groups, so the overhead of unnecessary group creation is reduced. On the other
hand, if the waiting period is short, a new group can be created quickly, and the
overhead of global error recovery is reduced at the expense of group creation
overhead.
A member proposes a new local group in its request message. It includes the
proposed multicast group address and the error fingerprint in the request message.
Since the proposed local group is not yet created, the member will use the
sequence numbers of its own losses as the initial error fingerprint. The request
proposing a local group is multicast globally to suppress other group proposals.
If a member has joined any local group, it will not propose creation of additional
local groups. However, it may join other local groups as appropriate.
Granting a Local Group
A replier grants the creation of a new local group in its reply. It includes the
address and the error fingerprint of the granted local group. The reply granting
the new local group is multicast globally to solicit members who share the same
loss. Furthermore, the replier joins the local group as a helper. Therefore, at
the beginning of group creation, there is at least one helper in the group.
Joining a Local Group as a Normal Member
A member joins a local group if it shares more than x% of the losses with the
group. When a reply granting a new local group is received, a member joins the
group if the similarity of its own losses and the error fingerprint of the granted
group exceeds x%. A member can join multiple local groups, and these groups are
nested. That is, the membership of an inner group is a subset of the membership of
an outer group.
It is important for all members to maintain a consistent view of group order, so
they can exercise these nested groups in the same fashion and produce correct loss
sharing measurements. Furthermore, the group order is used in error recovery,
since a member always sends its requests to its innermost local group first. One
simple way to determine the order of a local group is by the sequence number of
the reply granting the local group. The sequence number of the reply granting a
local group is called the order number of the local group. Generally speaking, a
local group granted later has a larger group order number and a larger scope. Note
that the session's original group is always the outermost group, even though it
does not have an order number.

Figure 3: Evolution of misplaced nested local groups. (Panels (a) to (d) show the
source, the local groups G1 and G2, and the new member p.)

The order of nested groups may not reflect their physical scopes at a particular
point of time, but abnormal cases will be fixed after the requests and replies
disseminate completely. For example, in Figure 3, a new member p may propose a new
group $G_2$ before learning of the existing local group $G_1$ (Figure 3(a) and
(b)). p will be solicited to join $G_1$ later, and then it will use $G_1$ as the
innermost local group (Figure 3(c)). The membership solicitation scheme is
discussed in the section on soliciting new members. At this point of time, the
physical scope of $G_1$ is larger than the physical scope of $G_2$. Eventually
$G_2$ will be timed out and disappear (Figure 3(d)). The group timeout scheme is
discussed in the section on timing out a local group.

The threshold x defines the tradeoffs between the number of nested local groups
and the error recovery performance. For larger x, more nested local groups are
created, and each group has a higher loss sharing ratio and achieves greater
efficiency for retransmission. As a result, the group maintenance overhead is
higher and the error recovery performance is better. On the other hand, for
smaller x, fewer nested local groups are maintained, and the loss sharing ratio in
each local group is lower. In the extreme case, if we choose x = 0, there is only
one local group in the session to recover all losses; if we choose x = 100, the
number of local groups is equal to the number of lossy links, and each local group
recovers the losses of a single lossy link.
Error Recovery in a Local Group
If a loss is detected, a member sends its request to its innermost group on its
first try. If there is no reply, it will expand its request scope by trying its
next outer group until the loss is recovered. As described earlier, even if a
request addressed to the innermost group does not reach a helper, members in the
outer group should have detected the loss and sent their requests. As a result, a
member addressing its request to its innermost group may receive a reply from its
outer group. Therefore, the majority of the losses are recovered on the first try,
and sending requests to an outer group should happen rarely. Note that since
members in the inner group may rely on members in the outer group to ask for data
repair, a member p's scheduled request should not be suppressed by a request from
a local group G if p is not a normal member in G. In other words, a request
addressed to a local group should only suppress requests of normal members in the
group.

[Footnote: To be precise, an order number consists of the sequence number of the
reply granting the local group in the high-order portion and the local group
address in the low-order portion.]
The order number of the addressed group is included in the request message. It is
used by a replier to determine the destination group for that reply. A replier
sends its reply to the local group to which the request was sent. If there is more
than one request and they are received from different groups, the replier
addresses its reply to the group with the largest order number. Moreover, if a
request is addressed to an outer group, the group address, order number, and error
fingerprint of the inner group are carried in the request message to solicit new
members.
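
The ordering rules for nested groups can be illustrated with a short Python sketch. The names are hypothetical, the 32-bit width assumed for the group address is an illustration (the text says only that the address occupies the low-order portion), and `None` is used here to stand for the session's original group, which has no order number.

```python
def make_order_number(grant_seq, group_addr, addr_bits=32):
    """Grant sequence number in the high-order portion, address in the low."""
    return (grant_seq << addr_bits) | group_addr


def request_targets(joined_order_numbers):
    """Groups to try for a request: innermost (smallest order number) first,
    then outward, ending with the session group (None)."""
    return sorted(joined_order_numbers) + [None]


def reply_destination(request_order_numbers):
    """Requests from several local groups: reply to the largest order number."""
    return max(request_order_numbers)
```

Because the grant sequence number sits in the high-order bits, a group granted later always compares as larger, and the group address only breaks ties between groups granted by replies with the same sequence number.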
Leaving a Local Group
A normal member measures the degree of loss sharing in each local group it joins.
The degree of loss sharing in a local group is the ratio of the number of
recovered losses to the number of received replies in the group. For example, the
loss sharing can be measured every m replies received in a local group. To prevent
oscillation, an exponentially weighted moving average is adopted. If a normal
member's ratio is smaller than x%, it leaves the local group.
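
A minimal sketch of this measurement in Python follows. The class name, the smoothing weight `alpha`, and the choice to seed the average with the first measured ratio are assumptions; the threshold is expressed as a fraction between 0 and 1 rather than as a percentage.

```python
class LossSharingMonitor:
    """Tracks the smoothed ratio of recovered losses to received replies."""

    def __init__(self, threshold, m, alpha=0.25):
        self.threshold = threshold  # leave when the average drops below this
        self.m = m                  # replies per measurement window
        self.alpha = alpha          # assumed weight of the newest window
        self.replies = 0
        self.recovered = 0
        self.average = None

    def on_reply(self, recovered_own_loss):
        """Record one reply; return True when the member should leave."""
        self.replies += 1
        if recovered_own_loss:
            self.recovered += 1
        if self.replies < self.m:
            return False
        ratio = self.recovered / self.replies
        self.replies = self.recovered = 0
        if self.average is None:
            self.average = ratio
        else:
            self.average = self.alpha * ratio + (1 - self.alpha) * self.average
        return self.average < self.threshold
```

The smoothing means a single window of unhelpful replies does not immediately push a member out of a group it has been sharing losses with.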
A helper leaves a local group if it does not act as an active replier in the
group. For example, a helper leaves a local group if its last k consecutive
scheduled replies for the local group are suppressed. As a result, there are at
most k helpers in a local group.
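
The helper rule amounts to a counter of consecutive suppressed replies. The following Python sketch uses hypothetical names and assumes the counter resets whenever the helper actually sends a reply.

```python
class HelperState:
    """Counts consecutive scheduled replies that were suppressed."""

    def __init__(self, k):
        self.k = k
        self.suppressed_in_a_row = 0

    def on_reply_outcome(self, sent):
        """sent is True if the scheduled reply went out, False if suppressed.
        Returns True when the helper should leave the local group."""
        if sent:
            self.suppressed_in_a_row = 0
        else:
            self.suppressed_in_a_row += 1
        return self.suppressed_in_a_row >= self.k
```

With k = 3, a helper whose replies are suppressed twice, then sent once, then suppressed three more times leaves only after the final suppression.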
Timing Out a Local Group
If there is no error recovery traffic in a local group, the local group should be
timed out to reduce group maintenance overhead. Both helpers and members determine
when a local group is dormant and leave the group. The timeout period can be
measured in terms of seconds or the number of received data packets.

Soliciting New Members

Since an error fingerprint is a snapshot of the group losses, a member who shares the majority of losses with a local group may unfortunately decide not to join the group when it compares its losses with the group error fingerprint. Furthermore, when a new member joins an ongoing session, it has no knowledge of the existing local groups. Therefore, to capture new members as well as old members whose snapshots happened to be skewed, a scheme to periodically solicit new members is required.

Footnote (on a request from a local group G reaching a member p that is not a normal member in G): p is a helper in G, which is how it is able to receive a request addressed to G.

A local group solicits new members by periodic polling. Members periodically send their requests to the next outer group to solicit new members. Members in the outer group join the inner group based on the comparison of their own losses and the inner group error fingerprint. Since the requests soliciting new members are sent to the next outer group, a new member joins local groups one at a time in an outside-in fashion until it has joined all nested local groups. Note that if a requester does not receive a reply on its first try, the next request addressed to an outer group can also serve the purpose of membership solicitation. The requests soliciting new members suppress one another to minimize the number of requests addressed to the outer group.

The same scheme is used to solicit new helpers. A replier to a request soliciting new members joins the inner group as a helper. However, a helper that is already in the inner group is closer to the requester and is most likely to respond to the request soliciting new helpers. Therefore, unless all helpers in the inner group have left the session, a new helper will only rarely need to join the inner group.
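The outside-in joining behaviour can be illustrated with a short sketch. The details below are assumptions made for illustration: the fingerprint is modelled as a list of sequence numbers of recent group losses, a member is taken to join when the fraction of fingerprint entries it also lost is at least x, and the walk stops at the first group whose fingerprint does not match. The function names are hypothetical.

```python
def should_join(own_losses: set[int], fingerprint: list[int], x: float) -> bool:
    """Join if at least a fraction x of the fingerprint losses were also
    experienced locally (assumed comparison rule)."""
    if not fingerprint:
        return False
    shared = sum(1 for seq in fingerprint if seq in own_losses)
    return shared / len(fingerprint) >= x


def join_outside_in(own_losses: set[int],
                    nested_groups: list[tuple[str, list[int]]],
                    x: float) -> list[str]:
    """Walk nested groups from outermost to innermost, joining one at a time.
    A solicitation for the next inner group is only heard inside the group
    already joined, so the walk stops at the first mismatch."""
    joined = []
    for name, fingerprint in nested_groups:
        if not should_join(own_losses, fingerprint, x):
            break
        joined.append(name)
    return joined
```

Because each solicitation is addressed only to the next outer group, a newcomer needs one polling round per nesting level before it reaches its innermost group.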

Simulation Results and Discussions

We believe the behavior of our proposed mechanisms is best understood by first testing a variety of extreme settings before moving on to more complicated scenarios. Consequently, we initially explored our local recovery mechanisms in three extreme but simple topologies (star, string, and binary tree), each with a single data source. The star topology represents a session where all members have independent losses. The string topology represents a session where downstream members share the same losses with their upstream members. The binary tree topology represents a mixture of shared and independent losses in a session.

Each topology is tested with four different session sizes: 8, 16, 32, and 64. We simulated the performance of five different mechanisms for each session size: global error recovery, hop-scoped error recovery, and group-scoped error recovery with three different degrees of loss sharing (x = 33%, 50%, and 100%).

We adopted a dynamic timer adjustment mechanism to optimize the recovery delay and the number of requests and replies per loss. The general idea of the dynamic timer adjustment mechanism is to make the generation of request and reply timers adaptive to the session environment. A member interprets the feedback from a session as an estimated session size and uses the estimated size to tune its request and reply timer parameters A, B, a, and b. These parameters are described in an earlier section, and they determine whether requests and replies are generated aggressively or conservatively. The feedback from the session is the received requests and replies. More duplicate requests and replies imply a larger session size; therefore, a member should increase its timer parameters and send requests and replies conservatively to reduce duplicates. Otherwise, a member should decrease its timer parameters to minimize recovery delay.

Footnote (on the estimated session size): The estimated session size may not reflect the actual session size. It represents the number of members that compete to send requests and replies.
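The direction of the timer adjustment can be sketched as follows. This is a hedged illustration, not the mechanism used in the simulations: the `TimerParams` record, the uniform draw over an interval scaled by distance, and the `target`, `step`, `floor`, and `ceiling` values are all assumptions. The text states only that more duplicates should make timers more conservative and fewer duplicates should make them more aggressive.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TimerParams:
    start: float  # multiplier on the distance at which the interval begins
    width: float  # multiplier on the distance giving the interval length


def draw_timer(params: TimerParams, distance: float) -> float:
    """Draw a random timer from [start * d, (start + width) * d]."""
    low = params.start * distance
    high = (params.start + params.width) * distance
    return random.uniform(low, high)


def adjust(params: TimerParams, avg_duplicates: float, target: float = 1.0,
           step: float = 0.1, floor: float = 0.5, ceiling: float = 10.0) -> TimerParams:
    """Widen the interval when duplicates exceed the target (larger estimated
    session), otherwise narrow it to cut recovery delay."""
    if avg_duplicates > target:
        width = min(ceiling, params.width + step)
    else:
        width = max(floor, params.width - step)
    return TimerParams(params.start, width)
```

A wider interval spreads the competing members' timers further apart, so more of them hear another member's request or reply before their own timer fires.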

Figure: Simulation results in the star topology, all links with uniformly distributed error rates. Panels: (a) request traffic (%), (b) reply traffic (%), and (c) delay ratio, each plotted against session size. Curves: global error recovery; hop-scoped error recovery; group-scoped error recovery with x=33%, x=50%, and x=100% (δ=0%, m=10, k=3, f=6); and the estimated lower bounds for x=33%, x=50%, and x=100%.

The first set of simulations assumed that all links have uniformly distributed error rates and that their error rates are fixed throughout the simulation. The results are shown in the figures for the star, string, and tree topologies. The request traffic is the product of the average measured scope a request propagates and the average measured number of requests per loss. The measured request scope is a fraction of the global scope. For example, in global error recovery the request scope is equal to the global scope, since each request is multicast to the entire session; therefore, the request traffic is the average number of request messages per loss. The reply traffic is the product of the average measured scope a reply propagates and the average measured number of replies per loss. The measured reply scope is a fraction of the global scope. The delay ratio is the average ratio of the measured recovery delay to the measured propagation delay from the source.
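The three metrics defined above are simple to compute; the sketch below restates them as code. The function and parameter names are illustrative assumptions, and the numbers in the usage note are made up for the example rather than taken from the simulations.

```python
def traffic(avg_scope_fraction: float, avg_messages_per_loss: float) -> float:
    """Request (or reply) traffic: average scope, as a fraction of the
    global scope, times the average number of messages per loss."""
    return avg_scope_fraction * avg_messages_per_loss


def delay_ratio(recovery_delays: list[float], propagation_delays: list[float]) -> float:
    """Average, over recovered losses, of recovery delay divided by the
    propagation delay from the source."""
    if not recovery_delays or len(recovery_delays) != len(propagation_delays):
        raise ValueError("need one propagation delay per recovery delay")
    ratios = [r / p for r, p in zip(recovery_delays, propagation_delays)]
    return sum(ratios) / len(ratios)
```

Under these definitions, two requests per loss that each cover a quarter of the session count as `traffic(0.25, 2.0)`, which is half the traffic of one globally multicast request per loss. This is why a scheme can generate more messages per loss and still reduce traffic.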

In the star topology, the distances between each pair of members are equal. Therefore, hop-scoped local recovery performs exactly like global error recovery. Since members have independent losses, there is no loss shared among members. Approximately one request message per loss is generated (star-topology figure, panel (a)). However, in group-scoped error recovery, each member creates its own local group, so the request and reply traffic is reduced significantly. Note that because of the constant number of helpers in a local group, the request and reply traffic goes down with the session size (star-topology figure, panels (a) and (b)). On the other hand, since there is little loss sharing, the number of available helpers for a specific loss is large in global error recovery and hop-scoped error recovery. As a result, multiple replies are generated per loss, and the reply traffic increases with the session size (star-topology figure, panel (b)).

Table: The number of requests per loss and the number of hops that a request travels in hop-scoped error recovery.

topology  session size  requests per loss  request hops  request traffic
star       8            0.99                16.0         98.7%
star      16            0.99                32.0         98.9%
star      32            0.99                64.0         99.3%
star      64            0.99               128.0         99.4%
string     8            1.60                 6.5         64.7%
string    16            2.11                 6.7         44.3%
string    32            2.96                 6.8         31.6%
string    64            4.84                 6.9         26.1%
tree       8            1.09                 7.6         51.8%
tree      16            1.56                 7.3         35.7%
tree      32            1.73                 7.7         20.8%
tree      64            2.09                 7.5         12.3%

The number of local groups in group-scoped error recovery is equal to the number of lossy links. In general, if there are n members in a session, the number of local groups is equal to n. Each local group recovers 1/n of the total losses in the session, and its scope is roughly 1/n of the session scope. Therefore, we can estimate a lower bound on the request and reply traffic in group-scoped error recovery as 1/n of the traffic in global error recovery. Since this estimated request and reply traffic is a lower bound, it represents the greatest degree of savings possible. The estimated values are shown as gray curves in the star-topology figure.

In the string topology, downstream members share all losses with their upstream members. A downstream member can rely on its upstream members to ask for repair in hop-scoped error recovery. As a result, request traffic in hop-scoped error recovery is reduced significantly. Moreover, request traffic in hop-scoped error recovery goes down with the session size because more downstream members rely on their upstream members to ask for repair.

The table of hop-scoped request statistics shows the average number of requests per loss and the average number of hops that a request message travels in hop-scoped error recovery. The number of requests per loss in the string topology increases with the session size, and the number of hops that a request message travels remains constant. However, the increase in the number of requests per loss is sublinear in the session size; therefore, the average request traffic still decreases with the session size. As described earlier, even if multiple requests per loss are present in hop-scoped error recovery, the overall request traffic is improved because the scope of each request is small and a member far behind a lossy link may receive a reply before even sending a request.

On the other hand, in group-scoped error recovery, request messages propagate to all downstream members, so request traffic goes up with the session size (string-topology figure, panel (a)). In terms of reply traffic, hop-scoped error recovery performs worse than group-scoped error recovery because hop-scoped error recovery does not regulate the direction in which the reply messages propagate (string-topology figure, panel (b)). Since request messages only reach a small number of members in hop-scoped error recovery, the estimated session size is much smaller. As a consequence, members send requests more aggressively and the recovery delay is reduced. On the other hand, in group-scoped error recovery, there is only a small number of helpers in a local group, and hence the recovery delay in group-scoped error recovery is greater than the recovery delay in global error recovery (string-topology figure, panel (c)).

Footnote (on the estimated lower bound): Upper bound estimates would need to take several other factors into account, for example the number of helpers and membership dynamics in a local group.

Footnote (on the requests per loss in the table): The number of requests per loss in the star topology is less than one because some of the requests are queued in the network when the simulation terminates.

Figure: Simulation results in the string topology, all links with uniformly distributed error rates. Panels: (a) request traffic (%), (b) reply traffic (%), and (c) delay ratio, each plotted against session size. Curves: global error recovery; hop-scoped error recovery; group-scoped error recovery with x=33%, x=50%, and x=100% (δ=0%, m=10, k=3, f=6); and the estimated lower bounds for x=33%, x=50%, and x=100%.

The tree topology is a mixture of the star and string topologies. The performance of hop-scoped error recovery is similar to that of group-scoped error recovery for one value of x in terms of request traffic, and similar to that of group-scoped error recovery for one value of x in terms of reply traffic. Note that both request and reply traffic decrease with the session size in group-scoped error recovery.

Figure: Simulation results in the tree topology, all links with uniformly distributed error rates. Panels: (a) request traffic (%), (b) reply traffic (%), and (c) delay ratio, each plotted against session size. Curves: global error recovery; hop-scoped error recovery; group-scoped error recovery with x=33%, x=50%, and x=100% (δ=0%, m=10, k=3, f=6); and the estimated lower bounds for x=33%, x=50%, and x=100%.

The number of local groups in group-scoped error recovery in the string and tree topologies is proportional to x. For example, if x = 100%, each local group is responsible for the error recovery of a single lossy link, and the number of local groups is equal to the number of lossy links in the session. If x = 50%, a local group is responsible for the error recovery of two lossy links, and the number of local groups is equal to half of the number of lossy links. In general, for a session of size n, the number of local groups is xn, the number of lossy links covered by a local group is 1/x, and the percentage of losses recovered by a local group is 1/(xn). Therefore, the estimated error recovery traffic T can be calculated as

$$T = \frac{1}{xn} \sum_{i=1}^{xn} \frac{f(i)}{n}$$

where f(i) is the size of the i-th local group. For the string topology, f(i) is expressed in terms of n, i, and x; for the tree topology, f(i) is expressed in terms of log n, log i, and x. Substituting these into the expression for T gives the estimated lower bounds of the error recovery traffic in the string and tree topologies, $T_{string}$ and $T_{tree}$, both functions of x and n.
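The estimate can be evaluated numerically for any group-size function: each of the xn local groups recovers a share 1/(xn) of the losses, and traffic for a group of size f(i) covers a fraction f(i)/n of the session. The sketch below is an illustration under that reading; the function name and the use of rounding to obtain a whole number of groups are assumptions, and the group-size functions in the usage note are simple test cases rather than the string or tree expressions.

```python
from typing import Callable


def estimated_traffic(n: int, x: float, group_size: Callable[[int], float]) -> float:
    """Estimated error recovery traffic, as a fraction of global traffic:
    the mean of f(i) / n over the x * n local groups."""
    groups = round(x * n)  # number of local groups
    if groups < 1:
        raise ValueError("x * n must give at least one local group")
    total = sum(group_size(i) / n for i in range(1, groups + 1))
    return total / groups
```

With x = 100% and every group containing a single member, `estimated_traffic(8, 1.0, lambda i: 1)` gives 1/8, matching the 1/n lower bound quoted for the star topology. If every group spans the whole session, the estimate is 1, which is global error recovery.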

The estimated values are shown as gray curves in the string-topology and tree-topology figures.

In the second set of simulations, a randomly selected fraction of the links have uniformly distributed error rates, and their error rates are fixed throughout the simulation. The results are shown in the table of simulation results. Both hop-scoped and group-scoped error recovery outperform global error recovery in terms of the request and reply traffic, except for hop-scoped error recovery in the star topology. In the string topology of size 16, the reply traffic in hop-scoped error recovery is close to 100% because the randomly selected lossy links are in the middle of the string topology. Since hop-scoped error recovery does not regulate the traffic direction, the reply traffic propagates to the entire session membership. The same scenario applies to the reply traffic in the tree topology. In general, hop-scoped error recovery performs worse than group-scoped error recovery in terms of the reply traffic if the lossy links are sparsely distributed.

Table: Simulation results with a randomly selected fraction of the links having uniformly distributed error rates. Request and reply traffic are in %. A: global error recovery. B: hop-scoped error recovery. C, D, E: group-scoped error recovery with x=33%, x=50%, and x=100%, respectively (δ=0%, m=10, k=3, f=6).

                 request traffic (%)           reply traffic (%)             delay ratio
topology  mech   8      16     32     64       8      16     32     64       8     16    32    64
star      A      100.0  99.3   99.1   98.9     265.5  311.9  349.2  381.6    6.40  6.56  6.48  6.59
star      B      100.0  99.3   99.1   98.9     265.5  311.9  349.2  381.6    6.40  6.56  6.48  6.59
star      C      31.2   24.6   14.5   9.7      46.9   52.6   35.7   26.6     6.45  6.50  6.45  6.48
star      D      31.2   20.3   14.9   12.6     46.9   41.1   39.4   38.4     6.45  6.49  6.48  6.48
star      E      31.2   20.3   15.5   12.6     46.9   41.5   42.9   38.4     6.45  6.50  6.48  6.48
string    A      100.0  106.0  104.3  107.0    100.1  100.4  100.3  101.2    3.69  3.01  2.54  2.90
string    B      25.0   49.6   25.9   19.9     37.5   99.9   76.4   79.7     3.70  3.08  2.28  1.94
string    C      25.7   67.71  82.34  90.5     25.7   64.0   78.3   84.4     3.70  3.02  2.58  3.09
string    D      25.7   63.07  78.88  77.9     25.7   58.5   73.9   71.7     3.70  3.10  2.71  3.20
string    E      25.7   62.88  69.88  66.5     25.7   58.4   63.9   58.0     3.70  3.10  3.02  3.85
tree      A      100.0  99.3   99.0   100.1    100.0  98.5   97.6   97.6     4.28  3.88  3.59  3.41
tree      B      31.3   21.8   16.7   9.3      56.3   63.7   49.5   34.4     4.28  3.86  3.51  3.27
tree      C      25.7   40.0   22.4   23.5     25.7   39.6   22.2   23.9     4.28  4.33  3.62  3.79
tree      D      25.7   23.1   18.2   17.0     25.7   23.1   18.1   16.9     4.28  3.87  3.61  3.57
tree      E      25.7   23.1   17.3   12.9     25.7   23.1   17.3   12.8     4.28  3.86  3.58  3.38

Figure: Mbone-like topology used in our simulations. Node labels: source, s1 to s4, p1 to p3, q1, q2, and r1 to r7.

The local error recovery mechanisms were also simulated in an Mbone-like topology, shown in the Mbone-like topology figure. Nodes connected with thick lines symbolize the Mbone. Other nodes represent local area networks. Session members are represented by black nodes, and one of them (marked as the source in the figure) is selected as the data source. The lossy links are represented by gray lines. We assume most of the losses are at local area networks.

Panels (a) and (b) of the figure of average request and reply scopes show the measured request and reply scopes of individual members, as well as the average request and reply scopes during the simulation. The scope is measured in terms of the number of hops that requests and replies travel. The average request and reply scopes are consistent with the results from the star, string, and tree topologies. The performance of hop-scoped error recovery is very competitive with group-scoped error recovery in terms of the request scope. However, the reduction of reply scope is limited in hop-scoped error recovery.

In panel (a), the request scope in group-scoped error recovery goes down as x goes up. A large x means higher loss sharing; therefore, there are fewer members in a local group and they have a higher degree of loss sharing. As a consequence, the requests and replies in the group are required by more members in the group, and less bandwidth is wasted. Most members have relatively small request scopes in both hop-scoped and group-scoped error recovery. This means that most of the requests are sent only within the local area networks. Note that member r has a relatively large request scope in both hop-scoped and group-scoped error recovery because its requests have to propagate across the Mbone to reach a helper.

Figure: Average request and reply scopes of individual members. Panels: (a) request scope and (b) reply scope, for each session member (s1 to s4, p1 to p3, q1, q2, r1 to r7) and on average. Series: global error recovery; hop-scoped error recovery; group-scoped error recovery with x=33%, x=50%, and x=100% (δ=0%, m=10, k=3, f=6).

The reply scope shown in panel (b) depends on where the request is coming from. A small reply scope means the reply stays within local area networks. For example, r's reply scope in group-scoped error recovery is small because r is only responsible for recovering losses of its downstream members within its local area network. However, r's reply scope in hop-scoped error recovery is almost double its reply scope in group-scoped error recovery because hop-scoped error recovery does not regulate the direction of reply traffic. On the other hand, a large reply scope means the reply is multicast across the Mbone. For example, several of the s and p members have relatively large reply scopes in both hop-scoped and group-scoped error recovery, which means their replies respond to requests from remote members across the Mbone. However, responses to remote requesters happen rarely. As seen in panel (b), the average reply scopes in both hop-scoped and group-scoped error recovery are still smaller than the reply scope in global error recovery.

Panels (a) and (b) of the figure of error recovery traffic dynamics show the measured request traffic and reply traffic during the simulation. The request and reply traffic is measured in terms of the number of hops that requests and replies travel. The convergence periods of request and reply scopes are short in both hop-scoped and group-scoped error recovery. Generally speaking, the convergence time of request scope in hop-scoped error recovery is approximately one session cycle time. The measurement of reply scope in hop-scoped error recovery relies on the results from other members, so the convergence could take several session cycles. The convergence time in group-scoped error recovery depends on the number of nested local groups and their order of creation. If the innermost group is created first, the convergence period is shorter. On the other hand, if the group with the largest scope is created first, this group has to be shrunk before the second nested group can be created. Therefore, it takes a longer time to converge. For example, suppose there are n nested groups, the average waiting time to propose a group creation is $t_1$, and the average shrinking time is $t_2$. In the worst case, the convergence time is approximately $n(t_1 + t_2)$. The shrinking time is a function of the period of a loss sharing measurement.

Figure: Error recovery traffic dynamics. Panels: (a) request traffic and (b) reply traffic, plotted against time in seconds. Curves: global error recovery; hop-scoped error recovery; group-scoped error recovery with x=33%, x=50%, and x=100% (δ=0%, m=10, k=3, f=6).

Figure: Request traffic dynamics of member r, whose state is reset during the simulation. Request traffic is plotted against time in seconds. Curves: global error recovery; hop-scoped error recovery; group-scoped error recovery with x=33%, x=50%, and x=100% (δ=0%, m=10, k=3, f=6).

To further understand the convergence time of a new member joining an ongoing session, we manually reset the state of member r during the next simulation. The measured request traffic of r during the simulation is shown in the figure of request traffic dynamics. r starts with global error recovery after its state is reset. However, r will calculate its request hop count based on the incoming session messages in hop-scoped error recovery, or learn of the existing local groups in group-scoped error recovery, to restore its state; therefore, its request traffic is reduced. Note that the request traffic in group-scoped error recovery oscillates because the request scope (the scope of the local group to which requests are sent) is dynamically adjusted based on observed loss patterns. On the other hand, the request traffic in hop-scoped error recovery is fairly stable as long as the topology remains unchanged.

From the simulations, we found that group-scoped error recovery performs better than hop-scoped error recovery in terms of the reply traffic. However, hop-scoped error recovery performs better than group-scoped error recovery in terms of the request traffic, except in the star topology. Since the size of a request message is much smaller than the size of a reply message, it is more important to reduce the reply traffic than to reduce the request traffic. Therefore, in terms of traffic reduction, group-scoped error recovery appears to provide a better solution. On the other hand, if we consider other sources of overhead introduced by these two approaches, it appears that group-scoped error recovery imposes more overhead on session members as well as on the underlying multicast routing.

Footnote (on the overhead imposed on session members): Group-scoped error recovery requires the host to send periodic IGMP messages to refresh the multicast delivery path for each local group.

Related Works

There have been several other treatments of error recovery for reliable multicast transport. In contrast to our proposal, which assumes session members are autonomous, these previous works require various degrees of static configuration or centralized coordination.

Hofmann proposed a local group concept. A session is split into subgroups, and each subgroup combines members in a local region. A subgroup is represented by a local group controller, which supports local loss retransmission. The establishment of local groups is supported by a communication service named Group Distance Service. A member searches for and joins the closest local group. If no suitable group exists, the member will establish a new local group and appoint itself as the controller.

Kasera et al. examined the approach of using separate multicast groups to recover individual losses in reliable multicast communication. Lost packets are categorized into groups; the retransmission of a lost packet is multicast to the group it belongs to. Receivers dynamically join and leave those groups to recover packet losses.

TMTP groups session members into domains and organizes these domains into a hierarchical control tree to improve the scalability of error recovery. Members in a domain ask the domain manager for retransmission. A domain manager is also responsible for error recovery of its children managers in the control tree. The scope of retransmission is restricted by using the TTL field. The control tree is self-organized, and it is built dynamically as domain managers join and leave the session.

Holbrook et al. suggested a hierarchical logging server structure to reduce error recovery traffic in a multicast session. The distribution and hierarchy of logging servers are statically configured. Receivers contact their local secondary server for retransmission, instead of the remote primary servers, to avoid NAK implosion and to minimize recovery latency and bandwidth. A server either unicasts or multicasts a retransmission based on the number of requests it receives.

RMTP adopts a similar hierarchical structure to avoid message implosion. A set of designated receivers (DRs) is selected statically in a session. DRs are capable of retransmitting lost data. However, the hierarchy of DRs is constructed dynamically. Each receiver selects its least upstream DR as the ACK processor (AP) and periodically sends its receiving state to the AP to request retransmissions. A retransmission is either unicast or multicast based on the number of requests.

Conclusion

We proposed two different approaches to reduce error recovery traffic in SRM. In hop-scoped error recovery, members calculate the required hop counts for their requests and replies based on distance information exchanged in session messages. Since the information is piggybacked on their session messages, the overhead imposed by hop-scoped error recovery is relatively small. However, hop-scoped error recovery does not regulate traffic direction. If the topology of a session is star-shaped, hop-scoped error recovery does not perform much better than global error recovery.

Group-scoped error recovery bounds the scope of error recovery traffic by using separate multicast groups. Members that share the same losses join a local group for error recovery; thus, the error recovery traffic is only distributed within the local group. Group-scoped error recovery requires individual members to maintain multiple local groups. Therefore, more overhead is imposed on members as well as on the underlying multicast routing.

There remain several open issues. In hop-scoped error recovery, maintaining a pair of request and reply hop counts for individual sources does not introduce significant overhead. However, maintaining multiple local groups for individual sources in group-scoped error recovery may not be acceptable. Further research should look into group aggregation across sources. A local group is associated with one or more lossy links. Sources that share the same delivery path and the same lossy links along the path should be considered the same in terms of error recovery. Therefore, error recovery of losses from these sources should be handled by a single local group.

Another scenario that we have not fully understood is the convergence time of group-scoped error recovery in the presence of network dynamics. For example, if the network topology changes, members in a local group may no longer share losses. Therefore, members have to readjust themselves so that the new membership in the local group represents a set of members who share the same losses. Another example of network dynamics is traffic congestion. Data losses due to congestion change the error rates and the locations of lossy links in a session. Since local groups are associated with lossy links, changes in error rates and locations of lossy links affect the loss sharing behavior within local groups. The membership in a local group has to be readjusted in order to adapt to these changes. The study of the convergence time of membership readjustment can help us to better understand the tolerance of group-scoped error recovery to network dynamics.

Finally, one might consider combining these two approaches by using a hop scope on the request messages sent to local groups, since hop-scoped error recovery produces better traffic reduction in terms of the request traffic. In our hop-scoped scheme, requests are addressed to the global session group with a specific hop count, and that hop count is determined by exchanging session messages and measuring how far away the closest upstream neighbor is. However, if a hop-scoped request is sent to a local group, it can only guarantee a response if the requester knows both how to set the hop count and how to choose the appropriate local group. Our hop-scoped error recovery only provides the former information. The requester would analyze session messages and determine an appropriate hop count, but then the target upstream neighbor might not be a member of that local group. More research has to be done to ensure that either the closest upstream neighbor joins the same local group or the requester only considers members in the same local group in computing its request hop count.
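The second option raised above, in which the requester only considers members of the same local group when computing its request hop count, can be illustrated with a short sketch. This is not a mechanism proposed in the report: the function name, the dictionary of upstream distances (assumed to be learned from session messages), and the use of `None` to signal that no suitable neighbor exists are all assumptions made for illustration.

```python
from typing import Optional


def request_hop_count(upstream_distance: dict[str, int],
                      group_members: set[str]) -> Optional[int]:
    """Hop count to the closest upstream neighbor that is also a member of
    the target local group, or None if no upstream neighbor is in the group."""
    candidates = [hops for member, hops in upstream_distance.items()
                  if member in group_members]
    return min(candidates) if candidates else None
```

If the closest upstream neighbor overall is outside the local group, the computed hop count is larger than the plain hop-scoped value, which reflects the trade-off described in the text: the request is guaranteed to reach a group member, but at a wider scope.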

References

Sally Floyd, Van Jacobson, Ching-Gung Liu, Steve McCanne, and Lixia Zhang. A Reliable Multicast Framework for Light-weight Session and Application Layer Framing. IEEE/ACM Transactions on Networking.
Sridhar Pingali, Don Towsley, and James Kurose. A Comparison of Sender-Initiated and Receiver-Initiated Reliable Multicast Protocols. Proceedings of ACM SIGMETRICS.
D. Clark and D. Tennenhouse. Architectural Considerations for a New Generation of Protocols. Proceedings of ACM SIGCOMM, September.
D. Clark, M. Lambert, and L. Zhang. NETBLT: A High Throughput Transport Protocol. Proceedings of ACM SIGCOMM, August.
Ching-Gung Liu. A Scalable Reliable Multicast Protocol. PhD Dissertation Proposal, University of Southern California, November.
Markus Hofmann. A Generic Concept for Large Scale Multicast. Proceedings of International Zurich Seminar on Digital Communication (IZS), Springer Verlag, February.
Markus Hofmann. Adding Scalability to Transport Level Multicast. Proceedings of Third COST Workshop, Multimedia Telecommunications and Applications, Springer Verlag, Barcelona, Spain, November.
Sneha Kasera, Jim Kurose, and Don Towsley. Scalable Reliable Multicast Using Multiple Multicast Groups. CMPSCI Technical Report TR, October.
R. Yavatkar, J. Griffioen, and M. Sudan. A Reliable Dissemination Protocol for Interactive Collaborative Applications. Proceedings of ACM Multimedia.
Hugh W. Holbrook, Sandeep K. Singhal, and David R. Cheriton. Log-Based Receiver-Reliable Multicast for Distributed Interactive Simulation. Proceedings of ACM SIGCOMM, August.
John C. Lin and Sanjoy Paul. RMTP: A Reliable Multicast Transport Protocol. Proceedings of IEEE INFOCOM, April.
S. Paul, K. K. Sabnani, J. C. Lin, and S. Bhattacharyya. Reliable Multicast Transport Protocol (RMTP). To appear in IEEE Journal on Selected Areas in Communications, special issue on Network Support for Multipoint Communication.
Ching-Gung Liu, Deborah Estrin, Scott Shenker, and Lixia Zhang. "Local error recovery in SRM : comparison of two approaches." Computer Science Technical Reports (Los Angeles, California, USA: University of Southern California. Department of Computer Science) no. 648 (1997).
Asset Metadata
Creator: Estrin, Deborah (author), Liu, Ching-Gung (author), Shenker, Scott (author), Zhang, Lixia (author)
Core Title: USC Computer Science Technical Reports, no. 648 (1997)
Alternative Title: Local error recovery in SRM: comparison of two approaches (title)
Publisher: Department of Computer Science, USC Viterbi School of Engineering, University of Southern California, 3650 McClintock Avenue, Los Angeles, California, 90089, USA (publisher)
Tag: OAI-PMH Harvest
Format: 23 pages (extent), technical reports (aat)
Language: English
Unique identifier: UC16270157
Identifier: 97-648 Local Error Recovery in SRM Comparison of Two Approaches (filename)
Legacy Identifier: usc-cstr-97-648
Rights: Department of Computer Science (University of Southern California) and the author(s).
Internet Media Type: application/pdf
Copyright: In copyright - Non-commercial use permitted (https://rightsstatements.org/vocab/InC-NC/1.0/)
Source: 20180426-rozan-cstechreports-shoaf (batch), Computer Science Technical Report Archive (collection), University of Southern California. Department of Computer Science. Technical Reports (series)
Access Conditions: The author(s) retain rights to their work according to U.S. copyright law. Electronic access is being provided by the USC Libraries, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright.
Repository Name: USC Viterbi School of Engineering Department of Computer Science
Repository Location: Department of Computer Science. USC Viterbi School of Engineering. Los Angeles, CA, 90089
Repository Email: csdept@usc.edu