USC Computer Science Technical Reports, no. 640 (1996)
Scalable Timers for Soft State Protocols

Puneet Sharma, Deborah Estrin
Information Sciences Institute
University of Southern California
Admiralty Way
Marina del Rey, CA
puneet@catarina.usc.edu, estrin@usc.edu

Sally Floyd, Van Jacobson
Lawrence Berkeley National Laboratory
Cyclotron Road
Berkeley, CA
floyd@ee.lbl.gov, van@ee.lbl.gov

September
Abstract
Soft state protocols use periodic refresh messages to keep network state alive while adapting to changing network conditions; this has raised concerns regarding the scalability of protocols that use the soft-state approach. In existing soft state protocols, the values of the timers that control the sending of these messages, and the timers for aging out state, are chosen by matching empirical observations with desired recovery and response times. These fixed timer values fail because they use time as a metric for bandwidth; they adapt neither to the wide range of link speeds that exist in most wide-area internets, nor to fluctuations in the amount of network state over time.

We propose and evaluate a new approach in which timer values adapt dynamically to the volume of control traffic and available bandwidth on the link. The essential mechanisms required to realize this scalable timers approach are: (1) dynamic adjustment of the sender's refresh rate so that the bandwidth allocated for control traffic is not exceeded, and (2) estimation of the sender's refresh rate at the receiver in order to determine when state can be timed out and deleted. The refresh messages are sent in a round robin manner, not exceeding the bandwidth allocated to control traffic, and taking into account message priorities (e.g., triggered messages about new state are higher priority than refresh messages about pre-existing state). We evaluate two receiver estimation methods for dynamically adjusting network state timeout values: (1) counting of the rounds and (2) exponential weighted moving average.

Keywords: Protocols, Internetworking, Multicast, Algorithms, Internet
Introduction

A number of proposed enhancements to the Internet architecture require the addition of new state (i.e., stored information) in network nodes. In the context of various kinds of network element failures, a central design choice is the manner in which this information is established and maintained. Soft state protocols maintain state in intermediate nodes using refreshes that are periodically initiated by endpoints. When the endpoints stop initiating refreshes, the state automatically times out. Similarly, if the intermediate state disappears, it is re-established by the endpoint-initiated refreshes.

Many soft state designs, such as RSVP, RTP and PIM, use best-effort messages to carry the refreshes. These traditional periodic refreshes, generated at fixed periods, scale poorly with the increase in the amount of state. Some designers have suggested using explicitly acknowledged, reliable messages in place of these periodic refreshes. The reliable nature more completely decouples timer management of soft state from the issue of packet losses of these refreshes. However, use of reliable message transport entails an increase in the complexity of the mechanism.

In this paper we propose a new approach, scalable timers, to improve the scaling properties of soft state mechanisms. Scalable timers replace the fixed timer settings used by existing soft state protocols with timers that adapt to the volume of control traffic and available bandwidth on the link. Scalable timers regulate the amount of control traffic independent of the amount of soft state.

In the next section we present an overview of state management in networks. We then discuss the soft state paradigm of maintaining network state, describe how fixed timers are currently used by soft state protocols to exchange refresh messages and discard stale state, and motivate our approach of scalable timers, which makes soft state protocols more scalable. The mechanisms required for the proposed approach of regulating control traffic are discussed next. We then look at the application of the scalable timers approach to PIM, followed by simulation results and a comparison of the traditional and the proposed approaches. We conclude with a summary and a few comments on future directions.

State Management in Networks
State in network nodes refers to information stored by networking protocols about the conditions of the network. The state of the network is stored in a distributed manner across various nodes and various protocols. For instance, a teleconference application might run on top of multiple protocols. The Internet Group Management Protocol (IGMP) stores information about the hosts participating in the conference. Based on the membership information, multicast routing protocols such as PIM or CBT create multicast forwarding state in the routers. A reservation protocol such as RSVP or ST-II may reserve network resources for the teleconferencing session along the multicast tree.

The state has to be modified to reflect changes in network conditions. The network nodes communicate with each other to exchange information regarding changes in network conditions. Based on these control messages, the network nodes modify their stored state. For instance, in the event of a change in network topology, the routers exchange messages, resulting in modification of the multicast forwarding state.

In this section we discuss the tradeoffs associated with various ways of maintaining state in networks.
Paradigms for maintaining state in network

The state maintained by nodes in a network can be categorized as hard state and soft state. Hard state is that which is installed in nodes upon receiving notification for setting up state and is removed only on receiving an explicit teardown (throw-away) message. Soft state, on the other hand, uses refresh messages to keep it alive and is discarded if the state is not refreshed for some time interval. Protocol designers have adopted both paradigms for storing state. The reservation protocol ST-II is an example of a protocol that maintains hard state. In ST-II, connect messages are generated to set up a new reservation, which is later torn down by disconnect messages. Similarly, the multicast routing protocol Core Based Tree (CBT) establishes hard state. On the other hand, the Resource Reservation Protocol (RSVP) uses a datagram messaging protocol with periodic refreshes to maintain soft state in network nodes. Protocol Independent Multicast (PIM), another multicast routing protocol, also has control messages that are exchanged periodically to keep the multicast forwarding entries alive; a forwarding entry that is not refreshed is discarded.

In hard state architectures, control messages are exchanged among the nodes for installing as well as removing the state, and no periodic refreshes are needed. The install and remove messages are sent reliably (resent until an acknowledgement is received) to make the mechanism robust. Soft state architectures, on the other hand, do not need to manage reliable message delivery functions (e.g., timeouts, retransmits, etc.). In addition, soft state architectures do not rely upon explicit teardown messages.

In steady state, hard state protocols require less control traffic, as there are no periodic control messages. Soft state protocols provide better (faster) adaptation and greater robustness to changes in the underlying network conditions, but at the expense of periodic refresh messages. However, if the network is highly dynamic, hard-state control messages must be generated to adapt to the changes. In such a scenario, use of hard state does not provide much advantage over soft state, because the bandwidth being used by control messages would be comparable for both paradigms.

Based on the roles played by the nodes with respect to the particular state being referenced, the control message exchange among network nodes can be modelled as an exchange of messages between two entities, the sender and the receiver. The sender is the network node that (re)generates control messages to install, keep alive, and remove state from the other node. The receiver is the node that creates, maintains and removes state based on the control messages that it receives from the sender.
Soft State Paradigm

The soft state sender generates refresh messages to keep the soft state at the receiver alive. These messages are sent periodically, once every refresh period. We assume for simplicity that the soft state protocols do not use explicit teardown messages. (A soft state protocol might optionally use explicit teardown messages to achieve faster action.)

In the absence of explicit teardown messages, if the receiver does not get refresh messages for a particular state for one refresh period, it may treat the state as stale and discard it. Refresh packets can occasionally get dropped in the network. The protocol becomes more robust to dropped packets if the receiver waits longer than one refresh period before discarding the state. The time for which the receiver waits before discarding a state is a small multiple of the refresh period. The multiplying factor is determined by the degree of robustness required and the lossiness of the link or path. (The degree of robustness is a tradeoff between tolerance to dropped packets and the overhead of maintaining state that may no longer be required.)

As the size and usage of networks grow, the amount of state to be maintained also increases. Currently, most soft state protocols generate refresh messages using fixed timers, resulting in growth of control traffic with the increase in the amount of state being maintained in the network. In the center of the network, aggregate traffic levels will contribute to large quantities of state and traffic. At the edges, lower speed links make even a lower traffic level a concern. The primary goal of a data network is to carry data traffic. Unconstrained growth of control traffic can jeopardize this primary goal. This is exacerbated by the fact that traffic levels are higher when congestion and network events are reducing overall network resource availability. Soft state protocols can be made scalable only if control messages can be constrained independent of the amount of state in the network.
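
The receiver-side aging rule described in this section (discard state that has not been refreshed within a small multiple of the refresh period) can be sketched as follows. This is only an illustration under stated assumptions: the class name `FixedTimerReceiver`, the string keys, and the default multiplying factor of 3 are hypothetical and are not taken from any protocol specification.

```python
from dataclasses import dataclass, field


@dataclass
class FixedTimerReceiver:
    """Soft state table that ages entries with a fixed timeout."""

    refresh_period: float      # sender's fixed refresh period, in seconds
    robustness: int = 3        # multiplying factor (hypothetical default)
    last_refresh: dict = field(default_factory=dict)

    @property
    def timeout(self) -> float:
        # Timeout is a small multiple of the refresh period.
        return self.robustness * self.refresh_period

    def on_refresh(self, key: str, now: float) -> None:
        """Install the state, or keep existing state alive."""
        self.last_refresh[key] = now

    def expire(self, now: float) -> list:
        """Discard and return every entry not refreshed within the timeout."""
        stale = [k for k, t in self.last_refresh.items() if now - t > self.timeout]
        for k in stale:
            del self.last_refresh[k]
        return stale
```

A larger `robustness` value tolerates more consecutive lost refreshes, at the cost of holding stale state longer.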
Control Traffic in Soft State Protocols

As we have discussed in the previous section, soft state protocols require the periodic exchange of control messages between the network nodes. Based on the state for which the control message is sent, control traffic can be categorized into two types. We define the two types as:

Refresh Traffic: These are control messages that refresh already existing state that needs to be kept alive. Such messages are generated locally at a sender node on the basis of its active local state. An example of refresh traffic is a PIM join message for an already active distribution tree.

Trigger Traffic: This consists of messages that are generated by the sender when it wants to create new state. For example, if a PIM router receives a join message for a group that is not already represented by an entry in the node's multicast forwarding table, then a join message is triggered and sent towards the source.

In the next section we identify drawbacks of the traditional ways of setting the timer values in soft state protocols and present our approach to regulate control traffic with scalable timers.

Timers for Control Traffic
Soft state protocols have two timers associated with the control traffic. The sender maintains a refresh timer that is used to clock out the refresh messages for the existing state. When the refresh timer for an existing state entry expires, a refresh message for that state is generated. The receiver maintains a state timeout timer to age the state that it maintains. The receiver discards a state entry if it does not receive a refresh message for that state before this timer expires.

Traditional Approach

Traditional protocols have fixed settings for the timer values. The values of the refresh timers and timeout timers are chosen by empirical observations with desired recovery and response in mind. These values are then used as fixed timers for sending the periodic updates. The value of the timeout timer is set to a multiple of the refresh period. For instance, a PIM router sends join messages periodically, at a fixed interval, to its next-hop router towards the sender. If state in an upstream PIM router does not get refreshed within a fixed multiple of that interval, it is treated as stale and is discarded. Such fixed refresh timers for the state update fail to address the heterogeneity of networking environments (e.g., the range of link bandwidths) and the growth of network state.

Fixed timers suffer from scaling problems. The control traffic uses bandwidth proportional to the amount of state being refreshed. State in the network might exist even when data sources are idle (e.g., state related to shared trees in PIM routers, or IGMP membership information in Designated Routers). To date, when the overhead becomes excessive, the fixed timer values have to be changed globally. In summary, the fixed timer settings fail because they use time as a metric for bandwidth and do not address the range of link speeds present across the network.
[Figure: Fixed Timers vs. Scalable Timers. Bandwidth and refresh period plotted against the amount of state, with one line for scalable timers and one for fixed timers.]
Scalable Timers
We propose a new approach for regulating the control traffic in which the timer values are dynamically adjusted to the amount of state. This approach is called scalable timers. In our approach we fix the control traffic bandwidth instead of the refresh interval. The design objective is to make the bandwidth used by control traffic negligible compared to the link bandwidth and the data traffic. Consequently, a fixed portion of the link bandwidth is allocated to the control traffic. The refresh interval at the sender is adjusted according to the fixed available control traffic bandwidth and the amount of state to be refreshed. The sender varies the refresh interval as the amount of state to be refreshed changes.

The figure (Fixed Timers vs. Scalable Timers) compares the changes in the refresh interval and control bandwidth for fixed and scalable timers. It has a line for the scalable timers approach and a line for the fixed timers approach, showing how the amount of bandwidth used by control messages and the refresh period vary as the amount of state to be refreshed increases. In the fixed timers approach, the amount of bandwidth used by control messages increases as the amount of state to be refreshed increases. However, in our scalable approach, the refresh interval increases with the increase in the amount of state, keeping the volume of control traffic constant. Unlike fixed timers, our approach scales well because the control traffic does not grow with the amount of state. In the next section we present the mechanisms required for applying scalable timers to soft state protocols.
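
The contrast between the two approaches can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that every state entry is refreshed by one message of the same size; the function names and the parameter `msg_bits` are hypothetical.

```python
def fixed_timer_bandwidth(n_entries: int, msg_bits: float, refresh_period: float) -> float:
    """Control bandwidth (bits/s) used when every entry is refreshed once per fixed period."""
    return n_entries * msg_bits / refresh_period


def scalable_refresh_interval(n_entries: int, msg_bits: float, control_bw: float) -> float:
    """Refresh interval (s) that keeps control traffic at exactly control_bw bits/s."""
    return n_entries * msg_bits / control_bw
```

With fixed timers, doubling the number of entries doubles the bandwidth consumed. With scalable timers, doubling the number of entries doubles the refresh interval while the bandwidth stays at the configured budget.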
Mechanisms for Scalable Timers

The essential mechanisms required by scalable timers are twofold: (1) the sender dynamically adjusts the refresh rate so that the bandwidth allocated for the control traffic is not exceeded, and (2) the receiver estimates the rate at which the sender is refreshing the state, in order to determine when state can be considered stale and deleted.

In this section we discuss these mechanisms for servicing the control traffic and timing out state at network nodes for point-to-point links. These mechanisms can be extended to multiaccess LANs by treating them as multiple point-to-point links.
Servicing the control traffic at the Sender

The sender needs to generate updates for its state entries such that the bandwidth allocated for control traffic is not exceeded. A simple model for generating refreshes at the sender is to equally divide the allocated control bandwidth, $\mathit{control\_bw}_{total}$, among all the state entries. Updates are sent for each state entry in a round robin fashion. In this zeroth-level algorithm, all the control traffic is serviced as one class.

As discussed earlier, trigger messages are generated when new state is created. Trigger traffic needs to be serviced faster than existing-state traffic for improved end-to-end performance and convergence (adaptability to changes in network conditions). More generally, different control messages can have different priorities. Simple round robin servicing of the control messages fails to include the priorities associated with various control messages for a protocol.

More structure needs to be added to the simple round robin model at the sender node to include the difference in the priority levels of the control messages. A priority model can be used to add more structure to the zeroth-level algorithm at the sender. The control traffic is divided into various classes based on the priority associated with the messages. One instantiation of such a priority model is to divide control traffic into two classes, trigger and refresh, assigning trigger traffic to the higher priority class. Such a simple priority model is also not suitable, as it can lead to starvation of the lower priority class messages. In the above-mentioned instantiation, refresh messages might be starved if there is a large amount of trigger traffic. Instead, isolation of bandwidth for the various classes of control messages is required, such that each class receives a guaranteed share of the control bandwidth even during heavy load.

Because each class of control traffic is guaranteed a share of the control bandwidth, there is no starvation. While bandwidth allocated to a class is not being used, it is available to other classes sharing the control traffic bandwidth. The class structure of the control messages in different protocols can be set differently based on the requirements of that protocol. This use of isolation to protect the lower classes from starvation is similar to the link sharing approach in Class Based Queueing (CBQ).

One possible class structure is to divide the control messages into two classes, trigger messages and refresh messages. The trigger traffic class is guaranteed a very large fraction (close to all) of the control bandwidth, so that trigger messages can be serviced very quickly for better response time. Refresh messages are allocated the remaining non-zero fraction of bandwidth. Guaranteeing a non-zero fraction of the control traffic bandwidth to refresh messages precludes starvation and premature state timeout at the receiver.
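
The isolation-with-borrowing behavior described above can be illustrated with a small allocation routine. This is a simplified sketch, not CBQ itself and not the authors' implementation: the function name, the dictionary-based interface, and the rule that earlier classes borrow spare bandwidth first are all assumptions made for illustration.

```python
def share_control_bandwidth(control_bw, guaranteed, demand):
    """Split control_bw (bits/s) among traffic classes.

    guaranteed: class name -> guaranteed fraction of control_bw (fractions sum to 1).
    demand:     class name -> bandwidth the class could use right now.
    Each class first receives min(demand, guaranteed share). Bandwidth left
    unused by idle classes is then lent to classes with unmet demand.
    """
    alloc = {c: min(demand[c], guaranteed[c] * control_bw) for c in guaranteed}
    spare = control_bw - sum(alloc.values())
    for c in guaranteed:  # assumption: classes listed earlier borrow first
        extra = min(spare, demand[c] - alloc[c])
        alloc[c] += extra
        spare -= extra
    return alloc
```

Under heavy trigger load the refresh class still receives its guaranteed share, and when no trigger messages are pending the refresh class can use the whole control bandwidth.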
A token bucket rate limiter can be used to serve bursts in trigger traffic. While the trigger traffic is not using the bandwidth allocated for it, that bandwidth is used by refresh traffic. The sender generates updates for the existing state in a round robin fashion. To avoid sending refresh messages too frequently, all protocols have a minimum refresh period, $\mathit{refresh\_interval}_{min}$, that is the same as the refresh period used by the fixed timers approach. In this model, the refresh period for existing state is at least $\mathit{refresh\_interval}_{min}$ and is given by

\[ \max\left( \mathit{refresh\_interval}_{min},\ \frac{\text{amount of existing state}}{\mathit{bw}_{available}} \right) \]

In this class structure there is a tradeoff between the latency in establishment of new state and the delay in removing stale state. If a larger fraction of control bandwidth is allocated to trigger traffic, the latency in establishing new state is less, but there might be more delay in discarding stale state.
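
The two sender-side pieces named above, the lower-bounded refresh period and a token bucket for trigger bursts, might look like the following. This is a minimal sketch under assumptions: state is measured in bits, time is passed in explicitly as `now`, and the names `refresh_period`, `TokenBucket`, `rate`, and `depth` are hypothetical.

```python
def refresh_period(state_bits: float, bw_available: float, refresh_interval_min: float) -> float:
    """Period between refreshes of the same entry, never below the configured minimum."""
    return max(refresh_interval_min, state_bits / bw_available)


class TokenBucket:
    """Token bucket limiter: sustained rate `rate` bits/s, bursts of up to `depth` bits."""

    def __init__(self, rate: float, depth: float, now: float = 0.0):
        self.rate = rate
        self.depth = depth
        self.tokens = depth   # start full so an initial burst can be served
        self.stamp = now

    def try_send(self, msg_bits: float, now: float) -> bool:
        """Consume tokens and return True if a message of msg_bits may be sent at time now."""
        # Refill for the elapsed time, capped at the bucket depth.
        self.tokens = min(self.depth, self.tokens + (now - self.stamp) * self.rate)
        self.stamp = now
        if self.tokens >= msg_bits:
            self.tokens -= msg_bits
            return True
        return False
```

A deeper bucket lets a larger burst of trigger messages go out back to back, while `rate` still caps the long-term trigger bandwidth.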
Timing out Network State at Receiver

In the fixed timer approach, the receiver has prior knowledge of the refresh rate and can compute the timeout interval a priori (e.g., a small multiple of the refresh period, to allow for packet loss). Since in our approach the rate at which the sender refreshes the state varies based on the total amount of state at the sender, the receiver has to track the refresh frequency and update the timeout interval accordingly. One possibility would be for the sender to explicitly notify the receiver about the change in the refresh period in the control messages. Another possibility is for the receiver to estimate the refresh frequency from the rate of arrival of the control messages.

If the receiver relies on information conveyed by the sender, the receiver is implicitly putting trust in the sender to behave properly. This is not an issue of maliciousness, but of lack of adequate incentives to motivate strict adherence. The refresh period at the sender varies as the amount of state being refreshed changes. So even the sender cannot foresee the changes in the refresh period, and hence might convey incorrect information to the receiver. For example, if a link comes up, there could be a big burst of trigger traffic, resulting in changes in refresh intervals not foreseen by the sender. Therefore, even if information is sent explicitly by the sender, the receiver still needs alternate mechanisms to avoid premature state deletion. If the receiver does not trust the sender and has its own mechanism for the estimation of the timeouts, then the information sent by the sender is redundant.

In summary, the explicit notification approach unnecessarily couples system entities. We adopt a general architectural principle, to place less trust in detailed system behavior, in order to make the architecture more robust.

In the absence of explicit notification of change in the refresh interval, the receiver needs to estimate the current refresh interval and adjust the timeout accordingly. In the next section we discuss mechanisms for estimation of the refresh interval at the receiver.
Estimating the Refresh Periods

In the previous section we discussed why receivers should not rely on explicit notification from the sender to set timeouts. Instead, the receiver has to estimate the refresh period from the time between two consecutive refresh messages that it receives for the same state. The challenge is for the receiver to adapt to changes in this refresh period over time. Changes in the refresh interval, as observed at the receiver, may be due to:

Trigger messages: Servicing of trigger messages can result in transient changes in the refresh interval, because while they are being serviced the bandwidth available for the refresh messages is reduced.

New state: As new state is created at a node, the sender has more state to serve, making the refresh interval larger. Similarly, if some state is discarded, the refresh messages are generated at a faster rate.

Packet loss: The receiver observes a longer update interval if some refresh message is lost on the link.

Unlike the fixed refresh period approach, where the timeout interval has to be robust only to dropped packets, the estimator used in scalable timers has to be more carefully designed to be both adaptive and efficient. The estimator in our approach must detect and respond quickly to an increase in the refresh period caused by an increase in the amount of trigger state. We need timers to be robust not only in the presence of an occasional dropped update, but also in the presence of rapid non-linear increases in the inter-update intervals.

The estimate of the refresh interval is used by receivers to discard old state. However, the estimate must also be conservative, so as not to time out state prematurely. The consequences of timing out state prematurely are generally more serious than the consequences of moderate delays in timing out state.

In this section we discuss two approaches, and associated problems, that can be used at the receiver to age the old state.
Counting of the Rounds

A round refers to one cycle of refreshes at the sender for all the states that it has to refresh. In this approach, instead of trying to estimate the refresh period and then using a multiple of this period to throw away the state, the state is thrown away if the receiver does not receive the refresh for a particular state for some a rounds, where a is set to provide robustness to lost packets, as with the fixed timers. Thus, by counting of the rounds, the receiver can track the sender closely.

Assuming that the sender is servicing the states in a round-robin fashion, all of the other state will have been serviced once between two consecutive updates for a particular state. For each state, the receiver maintains the last time that a refresh was received for the state. The receiver marks the beginning of a round by round markers. If two refresh messages are received for a particular state since the beginning of the current round, the last round marker is moved to the current time. The receiver can count the rounds easily using this algorithm. The algorithm is further explained in the pseudocode below.

For each state entry, the receiver stores the time that it received the last refresh message for that entry. For state i, let this time be denoted by previous_refresh[i]. The receiver maintains round markers per link. Every time the last round marker is moved (i.e., a new round starts), checks are made to see if any state has to be removed.
On receiving a refresh message for state i:

    if (previous_refresh[i] >= last_round_marker) {
        last_round_marker = current_time;
        shift round markers;
        age_state();
    }
    previous_refresh[i] = current_time;

The age_state routine checks if there is any state that has to be discarded.
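
One way to turn the pseudocode above into running code is sketched below. Several details are assumptions, because the text does not fix them: the class name `RoundCounter`, the use of a bounded deque for the round markers, keeping `rounds + 1` markers so that an entry is discarded only after missing `rounds` complete rounds, and treating the time the table was created as the first marker.

```python
from collections import deque


class RoundCounter:
    """Receiver-side aging by counting sender refresh rounds (one instance per link)."""

    def __init__(self, rounds: int, now: float = 0.0):
        # markers[-1] starts the current round; markers[0] is the oldest marker kept.
        self.markers = deque([now], maxlen=rounds + 1)
        self.previous_refresh = {}

    def on_refresh(self, key, now: float) -> list:
        """Record a refresh for `key`; return the entries discarded as stale."""
        expired = []
        prev = self.previous_refresh.get(key)
        if prev is not None and prev >= self.markers[-1]:
            # Second refresh for this entry since the round began: a new round starts.
            self.markers.append(now)   # shifts the markers, dropping the oldest
            expired = self._age_state()
        self.previous_refresh[key] = now
        return expired

    def _age_state(self) -> list:
        if len(self.markers) < self.markers.maxlen:
            return []                  # not enough rounds observed yet
        oldest = self.markers[0]
        stale = [k for k, t in self.previous_refresh.items() if t < oldest]
        for k in stale:
            del self.previous_refresh[k]
        return stale
```

Because aging runs only when a marker moves, this basic version never expires anything if refreshes stop entirely, which is the end case that the refinements to the basic algorithm address.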
With the counting of the rounds approach, the receiver is essentially assuming that the sender is using round robin for sending refresh messages. However, it is not necessary for the sender to send the refreshes in exact round-robin order; some good approximation to round-robin will suffice. That is, as long as the receiver waits at least two round times between refreshes before deleting a particular piece of state, it is not necessary to preserve the exact order of the refreshes within every round. It is important that the sender not refresh some state more frequently than the others, but it would not be a problem to refresh some other state slightly less frequently.

Refinements to basic algorithm
We identified several scenarios in which a simple implementation of the counting of the rounds can have difficulties. In the first set, the counting of the rounds can be confused if various state entries have different refresh rates. In the second set, there is an end case in which some state never expires.

If the receiver gets two refreshes for a particular piece of state, the last round marker is moved ahead. So if the refresh rate for various states is not the same, then the above algorithm fails. One simple example of this phenomenon is when we have a faulty interface card which generates a duplicate packet for each packet that it puts on the link. In this case the last round marker will be updated every time the sender generates a control message.

Since the sender cannot refresh faster than a maximum rate, even if control bandwidth is available, the receiver can use this information to check if the refresh period is smaller than the minimum configured refresh period, $\mathit{refresh\_interval}_{min}$. Accordingly, the algorithm would be modified to:
    if (previous_refresh[i] >= last_round_marker &&
        current_time - previous_refresh[i] >= refresh_interval_min) {
        last_round_marker = current_time;   /* move round markers */
        shift round markers;
        age_state();
    }
    previous_refresh[i] = current_time;
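
The refined check can be sketched as a small function. The representation of the receiver state as a dictionary, the function name `on_refresh`, and the use of `>=` in both comparisons are assumptions made for illustration; shifting the markers and aging state are left out to keep the example short.

```python
def on_refresh(state, key, now, refresh_interval_min):
    """Refined round detection: ignore refreshes that arrive faster than the sender may send.

    `state` holds 'last_round_marker' (float) and 'previous_refresh' (dict).
    Returns True when this refresh starts a new round.
    """
    prev = state["previous_refresh"].get(key)
    new_round = (
        prev is not None
        and prev >= state["last_round_marker"]
        and now - prev >= refresh_interval_min
    )
    if new_round:
        state["last_round_marker"] = now   # shifting markers and aging state would go here
    state["previous_refresh"][key] = now
    return new_round
```

A duplicate that arrives half a second after the original no longer moves the round marker, because the gap since the previous refresh is below the minimum refresh interval.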
Another scenario in which counting of the rounds can get confused is when clouds are present in the network. Networks might have clouds that do not maintain state and simply forward the control traffic received from the downstream node to the upstream node. In such a scenario, the receiver gets refreshes that are not generated locally at the neighboring router. In this case, if two or more senders are refreshing the same state, the receiver will observe a smaller refresh period for some state and update the round marker faster than the actual round, resulting in miscounting of the rounds. Moreover, this scenario is different from a multiaccess link scenario, where the receiver can distinguish between the refreshes for the same state sent by different senders by looking at the address of the sender.

Another failure mode for a simple implementation of the counting of the rounds can result in a state entry that never expires. If there is only one state entry being maintained and the sender stops sending the refreshes for this state, the receiver in a simple implementation will never time out the state. This happens because the round marker does not shift, as no updates are received. Similar behavior can occur when there are multiple entries of state and all of them die together, or when a link goes down and the receiver does not see any control messages.
Since a percentage of control bandwidth is allocated for refresh traffic, there is an upper bound on the time between two consecutive refresh packets seen by the receiver on a link in the absence of a link failure. The receiver can use this upper bound to detect the end case. Let us assume the simple two-class structure given earlier. The upper bound on the time interval between two consecutive refresh packets seen on a link is given by

\[ \mathit{gap}_{max} = \frac{\mathit{pktsize}_{avg}}{\mathit{control\_bw}_{refresh}} \]

where $\mathit{pktsize}_{avg}$ is the average size of a refresh packet and $\mathit{control\_bw}_{refresh}$ is the control bandwidth allocated to refresh messages at the sender.

If the amount of state being refreshed is low, then the round completion time is $\mathit{refresh\_interval}_{min}$. Otherwise, the round completion time is bounded by $N \cdot \mathit{gap}_{max}$, where $N$ is the number of state entries that are being refreshed. So the instantaneous upper bound on round completion time is given by

\[ \max\left( N \cdot \mathit{gap}_{max},\ \mathit{refresh\_interval}_{min} \right) \]

The above-mentioned worst case occurs when there is infinite trigger traffic to be serviced. So the round marker should also be moved if no refresh packets are seen for an interval of the above computed upper bound.
If no trigger state is being serviced, the time interval between two consecutive refresh packets is bounded by

\[ \mathit{gap}_{min} = \frac{\mathit{pktsize}_{avg}}{\mathit{control\_bw}_{total}} \]

where $\mathit{control\_bw}_{total}$ is the bandwidth allocated to control messages at the sender. ($\mathit{pktsize}_{avg}$, $\mathit{control\_bw}_{refresh}$, $\mathit{refresh\_interval}_{min}$ and $\mathit{control\_bw}_{total}$ are the same constants that are used by the sender to generate refresh messages.) So the instantaneous upper bound on round completion time is given by

\[ \max\left( N \cdot \mathit{gap}_{min},\ \mathit{refresh\_interval}_{min} \right) \]

where $N$ is the number of state entries that are being refreshed. This bound can be used for detecting the end case faster when no trigger traffic is being received. Similar upper bounds can be computed for any other class structure being used by the sender.
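
The two bounds on round completion time can be computed directly from the sender's constants. The sketch below is illustrative only; the function names and the `round_overdue` helper, which flags the end case when no refresh has been seen for longer than the bound, are hypothetical.

```python
def gap(pktsize_avg: float, control_bw: float) -> float:
    """Time between consecutive refresh packets when refreshes get control_bw (bits/s)."""
    return pktsize_avg / control_bw


def round_time_bound(n_entries: int, gap_s: float, refresh_interval_min: float) -> float:
    """Instantaneous upper bound on round completion time."""
    return max(n_entries * gap_s, refresh_interval_min)


def round_overdue(now: float, last_refresh_seen: float, bound: float) -> bool:
    """True if no refresh has been seen on the link for longer than the bound,
    so the receiver should move the round marker itself (the end case)."""
    return now - last_refresh_seen > bound
```

Passing the refresh share of the control bandwidth to `gap` gives the worst-case spacing, and passing the total control bandwidth gives the tighter spacing that applies when no trigger traffic is being serviced.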
With the suggested refinements to the basic algorithm, the counting of the rounds approach can successfully track the changes in refresh period from a single sender. In the next section we discuss Exponential Weighted Moving Average for estimating the refresh period at the receiver.
Exp onen tial W eigh ted Mo ving Av erage
The Exp onen tial W eigh ted Mo ving Av erage EWMA estimator is a lo w pass lter b elonging to a class
of estimators called r e cursive pr e diction err or algorithms EWMA has been used widely for estimation
b efore eg TCP round trip time estimation etc W e in v estigated use of EWMA estimators as an
alternate to the c ounting of the r ounds approac h
In this scheme a network node runs one EWMA estimator for each coincident link. Each EWMA estimator tracks the refresh interval average $interval_{avg}$ and mean deviation $interval_{mdev}$ for the associated link.

On receiving a new measurement $I$ for the refresh interval, the receiver updates the average of the refresh interval as

$$interval_{avg} \leftarrow (1 - w) \cdot interval_{avg} + w \cdot I$$

where $w$ is the smoothing constant of the estimator. As shown in earlier work, the EWMA can be computed faster if it is rearranged as

$$interval_{avg} \leftarrow interval_{avg} + w \cdot err, \quad \text{where } err = I - interval_{avg}$$

Mean deviation is computed similarly as

$$interval_{mdev} \leftarrow interval_{mdev} + w \cdot (|err| - interval_{mdev})$$
Once the average and mean deviation for the refresh interval are updated, the timeout value for the state is set to

$$a \cdot interval_{avg} + b \cdot interval_{mdev}$$

where $a$ and $b$ are both small integers. Here $a$ is the degree of robustness used for setting the timeout period in the fixed timers approach, and $b$ is the weight given to the refresh interval deviation in computing the timeout period.

The mean deviation is used for computing the timeout period so that the estimator is able to track sharp increases in the refresh interval. This prevents the state from being deleted prematurely due to sudden increases in the refresh interval.
When a state timeout occurs, a new timeout value is computed based on the current values of $interval_{avg}$ and $interval_{mdev}$. If the new timeout value is not greater than the previous value, the state is discarded. Otherwise the new value for the state timeout is set. In the event of long bursts of trigger traffic, the estimator does not receive any refreshes and the estimate is not updated. In such a scenario the estimator lags behind the actual refresh interval and premature timeout of state might occur. To avoid this problem, the receiver does not time out state during a burst of trigger traffic. The receiver has two modes of operation, as shown in the state diagram for the receiver. While the receiver is not receiving any trigger messages, it stays in the Normal state. It enters the No_Timeout state when it receives a trigger message. No state is timed out when the receiver is in the No_Timeout state. The state of the receiver is changed back to Normal upon receiving a refresh message.

[Figure: State diagram for an EWMA receiver, with two states, Normal and No_Timeout.
In Normal, on a timeout for state S: (1) compute a new timeout for S based on the current estimate; (2) set the new timeout value for S or time out S.
In Normal, on a refresh message for state S: (1) update the estimate; (2) compute a new timeout for S based on the new estimate; (3) set the new timeout value for S.
On a trigger message for state S (enter No_Timeout): (1) set the timeout for S based on the current estimate.
In No_Timeout, on a timeout for state S: (1) put S in the list of timeout-pending states.
In No_Timeout, on a refresh message for state S (return to Normal): (1) update the estimate; (2) compute new timeouts for all timeout-pending states and S; (3) set the new timeout value for S; (4) set new timeout values for all timeout-pending states or time them out.]
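The estimator updates and the two receiver modes can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the simulator's implementation: the class name, method names, default parameter values and the string return codes are all hypothetical, and per-state timers are left to the caller.

```python
class EwmaReceiver:
    """Per-link EWMA estimate of the refresh interval with Normal/No_Timeout modes."""

    def __init__(self, w=0.25, a=3, b=2, initial_interval=5.0):
        # w, a, b and initial_interval are illustrative defaults (assumptions).
        self.w, self.a, self.b = w, a, b
        self.interval_avg = initial_interval
        self.interval_mdev = 0.0
        self.mode = "Normal"

    def timeout_value(self):
        # a * interval_avg + b * interval_mdev
        return self.a * self.interval_avg + self.b * self.interval_mdev

    def on_refresh(self, measured_interval):
        # Rearranged EWMA update: avg += w * err, mdev += w * (|err| - mdev).
        err = measured_interval - self.interval_avg
        self.interval_avg += self.w * err
        self.interval_mdev += self.w * (abs(err) - self.interval_mdev)
        self.mode = "Normal"          # a refresh ends the trigger burst
        return self.timeout_value()

    def on_trigger(self):
        self.mode = "No_Timeout"      # never time out state during a trigger burst

    def on_timer_expiry(self, previous_timeout):
        """Decide what to do when the timer of one state entry fires."""
        if self.mode == "No_Timeout":
            return ("pending", previous_timeout)
        new_timeout = self.timeout_value()
        if new_timeout > previous_timeout:
            return ("rearm", new_timeout)
        return ("discard", None)
```

A caller would keep the entries reported as "pending" in a list and re-evaluate them on the next refresh, which corresponds to the timeout-pending list in the state diagram.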
For the EWMA estimator, the value of the smoothing constant needs to be determined. The requirement for the smoothing constant is that it should track the refresh period reasonably closely and at the same time be able to respond quickly to increases in the refresh period due to trigger messages. The estimator should not overreact to occasional long refresh intervals due to occasional drops of refresh packets and trigger traffic. If the state in the node is not large, the refresh interval is dominated by $refresh\_interval_{min}$ and the smoothing constant does not play an important role. The smoothing constant used for the EWMA is a reasonable tradeoff between responding quickly to increases in trigger traffic and not responding too strongly to occasional long refresh intervals due to packet drops.

In this and the previous sections we have discussed our approach of scalable timers and the required mechanisms. In the coming sections we present our simulation studies of scalable timers for PIM control traffic.
Using Scalable Timers for PIM Control Traffic

Protocol Independent Multicast (PIM) is a multicast routing protocol. It uses soft state mechanisms to adapt to underlying network conditions and multicast group dynamics. Explicit hop-by-hop join messages from members are sent towards the data source or Rendezvous Point. In steady state each router generates periodic join messages for each piece of forwarding state it maintains. These messages are sent to upstream neighbors periodically to capture state, topology and membership changes. A join message is also sent on an event-triggered basis each time a new forwarding entry is established for some new source or group. A multicast forwarding entry is discarded if no join messages to refresh this entry are received for the timeout interval.

Currently the refresh periods for the PIM update messages are fixed, and the PIM control traffic grows with the amount of multicast forwarding state maintained in the routers.

Members of a group graft to a shared multicast distribution tree centered at the group's Rendezvous Point. Though members join source-specific trees for those sources whose data traffic warrants it, they are always joined to the shared tree, listening for new senders. Thus groups that are active but have no active senders still need to refresh the state on the shared tree. In PIM there is some minimal control traffic for every active group even if there are no active senders to the group. So the bandwidth used by control traffic is not a strict function of the data traffic being carried by the network.
PIM Control Traffic

Besides the exchange of refresh messages for existing state, PIM has trigger traffic due to membership and network topology changes. PIM trigger traffic can result from:
a change in group membership,
a new data source to a group,
a shift from a shared tree to a source-specific tree and vice versa,
a topology change such as partitions and network healings, and
a change in Rendezvous Point reachability.

The refresh rate in PIM determines the adaptability of the protocol to changes in network conditions, because old stale state is timed out sooner if the refresh rate is high. The higher the refresh rate, the greater the adaptability to changing conditions. Delay in servicing trigger traffic can increase the join and leave latency of a group, and hence it is important to give higher priority to trigger traffic than to refresh traffic.

Based on the data traffic being sent to a group, multicast groups can be of two types. A multicast group is said to be data-active at some time if there are senders that are sending data to the group. It is more important to refresh the state related to data-active groups than to data-inactive groups, as data-inactive groups suffer less or no data loss during adaptation to changes.
Control Bandwidth

Links with less bandwidth cannot support a large number of simultaneous conversations. The link bandwidth limits the number of source-specific states for a group. Hence scaling with respect to the source-specific state is not of major concern. We analyzed the number of groups that can be supported on various links without inflating the refresh intervals by a large amount. For instance bandwidth of an ISDN link can support groups with the refresh period of seconds. Similarly a T link can support groups with of link bandwidth. We observed that using of the link bandwidth for PIM control traffic is enough to support an appropriate number of groups. We decided to use of link bandwidth for PIM control traffic for conducting our experiments.
Class Structure at the Sender

As we have discussed in the previous section, different PIM messages have different priorities associated with them. We divided PIM control traffic into two classes: trigger traffic and refresh traffic. A large fraction of bandwidth is allocated to trigger traffic so that it can be serviced as soon as it is generated. Besides a one-level class structure, a token bucket is used to service bursts in trigger traffic faster. Another possible refinement would be to service control traffic related to data-active and data-inactive groups in different classes.

There are two parameters in this class structure: the fraction of the PIM control bandwidth for trigger traffic and the size of the token bucket. Since trigger traffic needs to be serviced as fast as possible, the fraction of bandwidth for trigger traffic should be nearly all of the control bandwidth. However, allocating all of the control bandwidth to it can result in starvation of refresh traffic. Trigger traffic must therefore be allocated a very large fraction of the bandwidth, close to but not equal to the whole.

If the size of the token bucket is large, the sender can service bigger bursts of trigger traffic without increasing the join/prune latency. There is a high level of correlation among multicast groups. For instance, groups created for the audio and video streams of the same session are coupled together. Similarly there can be multiple groups for sending hierarchical video streams that are coupled with each other and exhibit related dynamics. The bucket size should be large enough to accommodate these correlations.
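The two-class structure with a trigger token bucket can be sketched in Python as below. This is a simplified illustration under several assumptions: the class and method names are hypothetical, one token pays for one message, the bucket starts full, token generation is driven by explicit tick calls rather than real timers, and the check for whether it is too early to send a refresh is omitted.

```python
from collections import deque


class TwoClassSender:
    """Trigger traffic served from a token bucket; refreshes served round-robin."""

    def __init__(self, bucket_size):
        self.bucket_size = bucket_size
        self.trigger_tokens = bucket_size   # assumption: bucket starts full
        self.refresh_tokens = 0
        self.trigger_queue = deque()        # pending trigger messages
        self.states = deque()               # round-robin order for refreshes
        self.sent = []                      # log of (kind, state_id) for inspection

    def new_state(self, state_id):
        self.trigger_queue.append(state_id)
        self.flush()

    def trigger_token_tick(self):
        self.trigger_tokens += 1
        self.flush()

    def refresh_token_tick(self):
        self.refresh_tokens = min(self.refresh_tokens + 1, 1)
        self.flush()

    def _send_refresh(self):
        if self.states:
            state_id = self.states.popleft()
            self.states.append(state_id)    # rotate for round-robin service
            self.sent.append(("refresh", state_id))

    def flush(self):
        # 1. Trigger messages go first, as long as trigger tokens last.
        while self.trigger_queue and self.trigger_tokens > 0:
            state_id = self.trigger_queue.popleft()
            self.trigger_tokens -= 1
            self.states.append(state_id)
            self.sent.append(("trigger", state_id))
        # 2. Tokens that would overflow the bucket are spent on refreshes.
        while self.trigger_tokens > self.bucket_size:
            self.trigger_tokens -= 1
            self._send_refresh()
        # 3. The refresh class spends its own tokens.
        while self.refresh_tokens > 0:
            self.refresh_tokens -= 1
            self._send_refresh()
```

For example, with a bucket of two tokens and four groups created back to back, the first two trigger messages leave immediately and the other two wait for new trigger tokens, which mirrors the burst behavior discussed for the simulations.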
Aging out the multicast routing state

Multicast routers age out forwarding entries if they do not receive refresh messages for that entry. Delaying too long in timing out a forwarding entry can result in the following overhead:
bandwidth and memory in maintaining soft state downstream for inactive trees, and
multicasting application traffic downstream after it is no longer needed.

On the other hand, if the state is aged prematurely, it results in the failure of multicast routing to send data to downstream locations until state is re-established. It is more important to be careful not to time out prematurely than to be overly precise in timing out state as soon as possible.

As described earlier, two approaches can be adopted at the receiver node to discard soft state. The next section discusses the simulation studies conducted with the scalable timers for PIM.
Simulation Studies of Scalable Timers for PIM

An existing PIM simulator, pimsim, was modified to include the scalable timers approach for PIM control traffic. A sender module was included to schedule the PIM control messages based on the class structure described above. We implemented both the counting of the rounds approach and the exponential moving average estimator for a PIM receiver. This section presents the details about the simulator and the studies conducted.

[Figure: Simple chain topology. Hosts H-1 through H-n attach to the DR, which is connected through IN to the RP.]

Illustrating the Simulator

A wide range of scenarios were simulated for studying scalable timers in the context of PIM. The topology of the network and the lifetimes of the members of various groups were provided to the simulator in configuration files. The simulation runs were validated by feeding the packet traces generated from the simulations to a Network Animator, nam. We conducted simulations on various scenarios having topologies with different link bandwidths and membership changes.
In this section we present a few simple yet representative scenarios for understanding the behavior of scalable timers. Since PIM messages are router to router, we use the single chain topology shown in the figure to investigate the behavior on a single link. The figure shows a chain topology with three PIM routers. The designated router DR sends join messages towards the Rendezvous Point RP for the groups that are active in the attached hosts. These join messages are received by the intermediate router IN. The studies focus on the behavior of the sender at the DR. The receiver's behavior was studied at IN.

PIM Sender

Various scenarios that demonstrate the sender behavior are presented here. If the link does not drop any packets, then the actions of the sender can be captured by the refresh messages as seen by the receiver. For each refresh message received, we plot the time elapsed since the previous refresh was received for the same state. Each refresh message is shown as a marker in the plots.

[Figure: Change in the refresh interval for step changes in number of groups. Axes: refresh intervals (secs.) versus time (secs.).]

In the simulations shown here, the parameters were set so that the allocated control bandwidth is sufficient to refresh two state entries without exceeding the $refresh\_interval_{min}$. We used a $refresh\_interval_{min}$ of seconds. The size of the trigger token bucket used was The bandwidth of the link between the DR and the node IN was bytes/sec.

Footnote: nam is an animation tool for networking protocols being developed at LBNL.
The figure illustrates the behavior of a sender during step changes in the amount of state, corresponding to step changes in the number of multicast groups. It presents the change in refresh intervals in response to a step change in the number of active groups at the DR.

For the simulation shown in the figure, a new group becomes active every seconds until seconds. After seconds are over, one group expires every seconds. When the amount of state is increased from one to two, the refresh intervals do not change, as enough bandwidth is available to refresh all the state within the minimum refresh period. As more state is created and needs to be refreshed, the refresh interval increases with each step. This is because the control bandwidth is not enough to send the refreshes at $refresh\_interval_{min}$, and the sender adapts to the increase in the amount of state by reducing the refresh frequency (i.e., increasing the refresh interval). The expected behavior was observed and is shown in the figure. As the amount of state decreases after seconds, we notice that refreshes are sent more often, consuming the available control bandwidth.
[Figure: Change in the refresh interval for trigger traffic bursts. Axes: refresh intervals versus time (secs.).]
In steady state all the control bandwidth is used for sending refresh messages. When a new group becomes active, trigger state tokens are used for servicing the associated trigger traffic. Spikes in the refresh intervals are observed when there is trigger traffic to be served. The height of the spike is a function of the amount of trigger traffic to be serviced and the size of the bucket. The figure shows the response of the sender when the burst size of trigger traffic is bigger than the bucket size. In this scenario, at seconds new groups become active. The sender sends the trigger traffic for all but two groups back to back. The trigger traffic for the remaining two groups is served as more trigger tokens are generated. No refresh packet is seen during this time.
One of the parameters that needs to be configured at the sender is the size of the trigger traffic token bucket. The size of the burst of trigger traffic is limited by the associated link speed. There is a tradeoff between the size of the token bucket and the latency in servicing the trigger traffic. The larger the bucket size, the smaller the associated join/prune latency. Trigger traffic suffers latency greater than in the fixed timers approach only if the trigger traffic burst is greater than the tokens available in the token bucket. The worst case delay suffered by a trigger packet is given by $burst\_size / control\_bw_{trigger}$. However, in most cases the token bucket is full, and the worst case delay suffered by a trigger packet in a burst larger than the bucket size $token\_bucket_{size}$ is given by

$$\frac{burst\_size - token\_bucket_{size}}{control\_bw_{trigger}}$$

Footnote: The figures only show refresh messages; trigger messages have not been shown.
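The two delay expressions can be combined into one small Python helper. The function name, the `tokens_available` parameter and the numbers used in the test are assumptions for illustration; the burst and the tokens are expressed in the same unit as the bandwidth (for example bytes and bytes/sec).

```python
def worst_case_trigger_delay(burst_size, control_bw_trigger, tokens_available=0):
    """Worst case delay for the last packet of a trigger burst.

    With an empty bucket this is burst_size / control_bw_trigger; with a full
    bucket, pass tokens_available = token_bucket_size to get
    (burst_size - token_bucket_size) / control_bw_trigger.
    A burst that fits within the available tokens suffers no extra delay.
    """
    backlog = max(burst_size - tokens_available, 0)
    return backlog / control_bw_trigger
```

This makes the tradeoff explicit: every unit added to the bucket removes one unit of backlog from the largest burst the sender has to absorb.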
PIM Receiver

Studies were conducted of the counting of the rounds and exponential weighted moving average approaches for aging state at the receiver. In this section we present the behavior of the two approaches as observed in the simulations shown in the previous section. The state timeout timer was set to three times the current estimate of the refresh period.

Counting of the Rounds

[Figure: Tracking behavior of Counting of the Rounds approach to step changes in number of groups. Plotted series: round markers and refresh intervals versus time (secs.).]

We studied the receiver's tracking behavior during step and burst changes in the amount of state. The figure plots the round times as observed by the receiver over time for the step change scenario shown earlier. The receiver is able to track the round times very closely. The receiver is also able to detect the end case, as described in the previous section, and times out the state properly. State was timed out when no refresh for that state was received for the previous three rounds. In this scenario no premature timing out of state was observed with the counting of the rounds approach.
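The round-based aging rule can be sketched in Python as below. This is a deliberately reduced illustration, not the algorithm as implemented in the simulator: the class and attribute names are hypothetical, the first state seen is assumed to serve as the round marker, and the end-case detection by timeout bound and the duplicate filtering described in the text are left out.

```python
class RoundCountingReceiver:
    """Ages out state that has missed more than `rounds_to_live` whole rounds."""

    def __init__(self, rounds_to_live=3):
        self.rounds_to_live = rounds_to_live
        self.round_no = 0
        self.marker = None        # state whose refresh marks the end of a round
        self.last_round = {}      # state id -> round in which it was last refreshed

    def on_refresh(self, state_id):
        if self.marker is None:
            self.marker = state_id            # assumption: first state seen is the marker
        elif state_id == self.marker:
            self.round_no += 1                # only the marker advances the round
        # A first refresh for a new, non-marker state never advances the round.
        self.last_round[state_id] = self.round_no
        return self.expire()

    def expire(self):
        expired = [s for s, r in self.last_round.items()
                   if self.round_no - r > self.rounds_to_live]
        for s in expired:
            del self.last_round[s]
        return expired
```

Because rounds are counted rather than timed, a longer round at the sender automatically stretches the lifetime of every entry at the receiver.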
The counting of the rounds approach advances the round markers only when refreshes are received. Hence the receiver is able to track the round times during bursts of trigger traffic. The corresponding figure depicts the tracking of round times during the burst of trigger traffic shown earlier.

[Figure: Tracking behavior of Counting of the Rounds to trigger traffic bursts. Plotted series: round markers and refresh intervals versus time (secs.).]

A further figure shows the behavior of the counting of the rounds approach to occasional dropping of refresh packets (approximately loss rate). If the dropped refresh packet is for the state that is currently used for advancing the round marker, then the receiver moves the round ahead when it receives the next refresh for any state. Otherwise dropped packets do not affect the tracking behavior.

[Figure: Behavior of Counting of the Rounds in presence of occasional packet drops. Plotted series: round markers and refresh intervals versus time (secs.).]

The sender does not exhibit strict round robin servicing of the state when new state is added. That is, the time between the trigger message and the first refresh for a state is not necessarily a full round. The receiver filters these small deviations from the round robin behavior by not advancing the round marker when it receives the first refresh message for a state entry.

Another figure shows the behavior of the receiver for the case when duplicates of refresh messages are received. The receiver filters out the duplicates of the refresh messages and does not misbehave in this case. However, if the sender generates refresh messages in an arbitrary order, refreshes for some state are received at the receiver more frequently. The counting of the rounds approach is not able to track round times in such a scenario.
Exponential Weighted Moving Average Estimator

In this section we show the behavior of the EWMA estimator in the presence of step and burst changes in state. We used as the value for the smoothing constant $w$ and as the weight for the interval deviation $b$. For each refresh message received, we plot the new refresh interval estimate $a \cdot interval_{avg} + b \cdot interval_{mdev}$.

The figure shows the refresh interval estimates for the step change scenario shown earlier. The EWMA estimator is able to follow the changes in the refresh intervals at the sender. The rate of convergence for EWMA is dependent on the amount of state being refreshed. The EWMA estimator has more observations of the refresh interval in one round if the amount of state is large, and therefore converges faster to the refresh interval.

The response of the EWMA estimator to the trigger traffic burst scenario is also shown in a figure. As shown there, the EWMA estimator can underestimate the refresh intervals when there are big bursts of trigger traffic, because the refresh interval suddenly increases. However, no state is timed out in this period, as the receiver is in the No_Timeout state.
[Figure: Tracking behavior of Counting of the Rounds approach to duplicate refreshes. Plotted series: round markers and refresh intervals versus time (secs.).]

[Figure: Tracking behavior of EWMA estimator to step changes in number of groups. Plotted series: EWMA estimate and refresh intervals (secs.) versus time (secs.).]
A figure shows the behavior of the EWMA estimator to occasional dropping of refresh packets. If the link drops refresh packets for a state, the receiver observes longer refresh intervals for that state. The EWMA estimator reacts conservatively to such increases in refresh intervals by increasing its estimate, and does not prematurely time out state.
Discussion

In the simulation studies we observed that the counting of the rounds receiver is able to respond to the events at the sender and track the round times more precisely than a receiver running an EWMA estimator. However, the counting of the rounds receiver fails to follow a non-round-robin sender successfully. If the sender generates refreshes in an arbitrary order, the EWMA estimator can also underestimate the refresh interval. However, in such scenarios the EWMA estimator is more conservative than the counting of the rounds receiver.

Both approaches work, in that they allow the receiver to adapt to changes at the sender; consequently there is no clear winner. Since the mechanism for aging state is local to a node, different nodes can use different approaches. There is no need for standardizing the approach across the network as a whole.

[Figure: Tracking behavior of EWMA to trigger traffic bursts. Plotted series: EWMA estimate and refresh intervals (secs.) versus time (secs.).]

[Figure: Behavior of EWMA in presence of occasional packet drops. Plotted series: EWMA estimate and refresh intervals (secs.) versus time (secs.).]
Scalable timers as compared to Fixed timers approach

We can evaluate the fixed timers and our scalable timers approaches along three dimensions: efficiency, responsiveness and complexity.

Control Bandwidth: In the traditional fixed timers approach the bandwidth used by control traffic grows with the amount of state. Scalable timers adapt to the growth in the amount of state to be refreshed and limit the control traffic bandwidth. A figure shows the change in the control bandwidth and refresh intervals for the two approaches investigated.

Responsiveness: The refresh interval determines the response time of the protocol to changes in the network conditions. If the refresh interval is smaller, the protocol adapts faster to the changes. Responsiveness in soft state protocols is of two types, namely initiation of new state and timing out of old state. Trigger messages are sent when new state is initialized. Since the trigger messages are served instantaneously, the responsiveness of a protocol using scalable timers for control traffic is comparable to the fixed timers approach. The refresh interval increases in scalable timers with increasing amounts of state, and it might take longer to time out old state than in the fixed timers approach. However, in most cases the lifetime of a state unit is considerably longer than the period for which the state is kept after the last refresh. So the overhead due to late timing out of old state (longer response time) should be negligible compared to the savings in the bandwidth used by control traffic over the lifetime of the state.

Complexity and Overhead: The additional mechanisms required for scalable timers are simple and introduce very small memory and computation overhead to the protocol. Moreover, the scalable timers module is generic and is not closely coupled with the protocol. The scalable timers approach does not require any changes to be made to the protocol.
Conclusion and Future Directions

In this paper we presented a scalable approach for regulating control traffic in soft state protocols. With scalable timers the control traffic consumes a fixed amount of bandwidth by dynamically adjusting the refresh interval. Scalable timers require mechanisms at the sender and the receiver of the control messages. Through simulations we have illustrated the effectiveness of scalable timers in terms of overhead reduction with minimal impact on protocol responsiveness; moreover, there is little increase in complexity. We plan to further study the effect of changes in parameters such as the class structure and the smoothing constant on the performance of the proposed mechanisms.

We have studied scalable timers in the context of a multicast routing protocol, i.e., PIM, which is an example of a router-to-router protocol. Multiparty end-to-end protocols were actually the first protocols to incorporate the notion of scalable timer techniques for reduction of periodic traffic (e.g., wb, RTP). Future work will investigate the applicability of the detailed mechanisms discussed in this paper to such end-to-end protocols and soft-state protocols in general.
References

David D. Clark. The design philosophy of the DARPA Internet protocols. In SIGCOMM Symposium on Communications Architectures and Protocols. ACM, August.
Lixia Zhang, Bob Braden, Deborah Estrin, Shai Herzog, and Sugih Jamin. Resource reservation protocol (RSVP), version functional specification. Internet Draft, Xerox PARC, October. Work in progress.
Henning Schulzrinne, Stephen Casner, Ron Frederick, and Van Jacobson. RTP: A transport protocol for real-time applications. Internet draft (work-in-progress) draftietfavtrtptxt, IETF, November.
S. Deering, D. Estrin, D. Farinacci, V. Jacobson, C. Liu, L. Wei, P. Sharma, and A. Helmy. Protocol independent multicast (PIM): Motivation and architecture. Internet Draft, March.
S. Deering, D. Estrin, D. Farinacci, V. Jacobson, C. Liu, L. Wei, P. Sharma, and A. Helmy. Protocol independent multicast (PIM), sparse mode protocol: Specification. Internet Draft, March.
S. Deering, D. Estrin, D. Farinacci, and V. Jacobson. Protocol independent multicast (PIM), dense mode protocol: Specification. Internet Draft, March.
Claudio Topolcic. ST II. In Nossdav, ICSI Technical Reports, Berkeley, California, November. TR.
William C. Fenner. Internet group management protocol, version. Internet Draft, May.
Tony Ballardie, Paul Francis, and Jon Crowcroft. Core based trees (CBT). In Deepinder P. Sidhu, editor, SIGCOMM Symposium on Communications Architectures and Protocols, San Francisco, California, September. ACM. Also in Computer Communication Review, Oct.
C. Topolcic. Experimental internet stream protocol: Version (ST-II), Oct. RFC.
Danny J. Mitzel, Deborah Estrin, Scott Shenker, and Lixia Zhang. An architectural comparison of ST-II and RSVP. In Proceedings of the Conference on Computer Communications (IEEE Infocom), Toronto, Canada, June.
Sally Floyd and Van Jacobson. Link-sharing and resource management models for packet networks. IEEE/ACM Transactions on Networking, August.
J. Postel. Transmission control protocol. Request for Comments (Standard) STD, RFC, Internet Engineering Task Force, September.
Van Jacobson. Congestion avoidance and control. ACM Computer Communication Review, August. Proceedings of the Sigcomm Symposium in Stanford, CA, August.
P. Sharma, D. Estrin, S. Floyd, and V. Jacobson. Appendix: Scalable timers for PIM. ftpcatarinauscedupubpuneetshpap erst app endixps, University of Southern California, June.
Liming Wei. The design of the USC PIM simulator (pimsim). Technical report tr, University of Southern California, June.
This appendix presents the details for application of the scalable timers approach for regulating PIM control traffic.

A Summary of PIM Control Messages

Table columns: Message Type | Refresh Period (sec) | Timeout Interval (sec) | Size of Packet (bytes) | Dependent on | Comments

Router Query Message: number of neighboring PIM routers
Join/Prune Message: for each source; (s,g) state with same incoming interface
Register Message: unicast to the RP; piggybacked on data packet
Assert Message: not sent periodically
B Calculations for Refresh Periods

Average PIM control packet size bytes
Let of the link bandwidth be allocated for PIM control traffic
For an ISDN link:
link bandwidth Kbps
PIM control bandwidth Kbps
Number of refresh packets which can be sent per sec states/sec
(we can send information for about states per sec)
refresh period N sec
where N is the number of the states present
if N refresh period sec
Note: Currently we send periodic join/prunes every secs

The number of source-specific states for a group is limited by the link bandwidth, as links with less bandwidth cannot support a large number of conversations. Hence the scaling with respect to the source-specific state is not of major concern. The same analysis can be restated in terms of the number of groups which can be supported downstream of a link if of its bandwidth is allocated for PIM control traffic. The following table shows the number of groups which can be supported on various link types.

Link Type | Link bw (Kbps) | Number of groups | Refresh Period (min)
ISDN
T
T

From the above analysis we can observe that of the link bandwidth is enough to support an appropriate number of groups without inflating the refresh periods by a large amount.
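The calculation above can be written out as a short Python sketch. All numbers in the test are hypothetical stand-ins chosen for easy arithmetic (they are not the values used in this appendix), and the function and parameter names are assumptions.

```python
def refreshes_per_sec(link_bw_bps, control_fraction, pkt_size_bytes):
    """Refresh packets (one state each) that fit in the PIM control bandwidth."""
    control_bw_bps = link_bw_bps * control_fraction
    return control_bw_bps / (pkt_size_bytes * 8)


def refresh_period(n_states, link_bw_bps, control_fraction, pkt_size_bytes,
                   refresh_interval_min):
    """Seconds needed to refresh all N states, never below the minimum interval."""
    rate = refreshes_per_sec(link_bw_bps, control_fraction, pkt_size_bytes)
    return max(n_states / rate, refresh_interval_min)


def groups_supported(period_secs, link_bw_bps, control_fraction, pkt_size_bytes):
    """Number of groups that can be refreshed once within the given period."""
    rate = refreshes_per_sec(link_bw_bps, control_fraction, pkt_size_bytes)
    return int(period_secs * rate)
```

Restating the analysis per link type then amounts to calling `groups_supported` with each link's bandwidth and the refresh period one is willing to tolerate.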
C Psuedo co de for a PIM Sender
F or eac h Link attac hed to the router
Constan ts
B con trol bandwidth for the PIM con trol trac
Bts bandwidth for the trigger state
Bos bandwidth for the old state
Ss state in the sender
P a v Av erage P ac k et size
t ts time for generating new trigger state tok en
t os time for generating new old state tok en
TS Buc k et size The size of the buc k et for the trigger state
Bos B os
Bts B ts
ts os B Bts Bos
ts is v ery close to so that the new state can b e ushed
fast
Weha v e a coun ter for the new state tok ens and old state tok ens
The countfor the new state tok ens nev er increases more than
TS Buc k et size The coun tfor old state tok ens is atmost While sending the old state refresh w e do a roundrobin on the en tries
trigger messages queue the queue for the trigger messages
t ts P avBts
t os P a v Bos
Ev ery t ts seconds
Generateincremen t new trigger state tok ens
Flush state
Ev ery t os seconds
Generateincremen t new old state tok ens
Flush State
When new state is created
Enqueue the related message in the trigger message queue
Flush State
flush_state()
{
    while (trigger message queue is not empty and
           trigger state tokens are available) {
        dequeue from trigger message queue and send message
        decrement trigger state tokens
    }
    while (trigger state tokens == TS_Bucket_size)
    {
        if (trigger message queue is not empty)
        {
            dequeue from trigger message queue
            and send message
            decrement trigger state tokens
        } else if (not too early to send old state refresh) {
            send old state refresh
            decrement trigger state tokens
        } else {
            break    /* too early to refresh; retry on the next flush */
        }
    }
    while (old state tokens are available)
    {
        if (not too early to send old state refresh) {
            send old state refresh
            decrement old state tokens
        } else {
            break    /* too early to refresh; retry on the next flush */
        }
    }
}
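
The pseudocode above can be sketched in Python as follows. This is an illustrative reading, not the authors' implementation. Several details are assumptions that the pseudocode leaves open: the class and method names, the `trigger_share` parameter standing in for the trigger-state fraction, the interpretation of "too early" as a minimum gap (`min_refresh_gap`) since an entry was last sent, an old-state bucket size defaulting to 1, and the use of an explicit `now` argument in place of real timers.

```python
from collections import deque


class PimSenderSketch:
    """Per-link sender with token counters for trigger and old state."""

    def __init__(self, control_bw, trigger_share, avg_packet_bits,
                 ts_bucket_size, min_refresh_gap, os_bucket_size=1):
        if not 0.0 < trigger_share < 1.0 or ts_bucket_size < 1:
            raise ValueError("need 0 < trigger_share < 1 and ts_bucket_size >= 1")
        self.b_ts = trigger_share * control_bw          # B_ts
        self.b_os = (1.0 - trigger_share) * control_bw  # B_os
        self.t_ts = avg_packet_bits / self.b_ts         # trigger token interval
        self.t_os = avg_packet_bits / self.b_os         # old-state token interval
        self.ts_bucket_size = ts_bucket_size
        self.os_bucket_size = os_bucket_size            # assumed bound
        self.min_refresh_gap = min_refresh_gap          # assumed "too early" rule
        self.ts_tokens = 0
        self.os_tokens = 0
        self.trigger_queue = deque()
        self.state = deque()       # entries in round-robin refresh order
        self.last_sent = {}        # entry -> time it was last sent
        self.sent = []             # log of (time, kind, entry)

    def on_trigger_tick(self, now):
        """Call every t_ts seconds."""
        self.ts_tokens = min(self.ts_tokens + 1, self.ts_bucket_size)
        self.flush_state(now)

    def on_old_state_tick(self, now):
        """Call every t_os seconds."""
        self.os_tokens = min(self.os_tokens + 1, self.os_bucket_size)
        self.flush_state(now)

    def on_new_state(self, entry, now):
        self.state.append(entry)
        self.last_sent[entry] = now
        self.trigger_queue.append(entry)
        self.flush_state(now)

    def _send_trigger(self, now):
        entry = self.trigger_queue.popleft()
        self.last_sent[entry] = now
        self.sent.append((now, "trigger", entry))

    def _try_refresh(self, now):
        """Refresh the next entry in round-robin order; False if none or too early."""
        if not self.state:
            return False
        entry = self.state[0]
        if now - self.last_sent[entry] < self.min_refresh_gap:
            return False
        self.state.rotate(-1)
        self.last_sent[entry] = now
        self.sent.append((now, "refresh", entry))
        return True

    def flush_state(self, now):
        # Trigger messages go first, one token each.
        while self.trigger_queue and self.ts_tokens > 0:
            self._send_trigger(now)
            self.ts_tokens -= 1
        # A full trigger bucket means spare capacity: spend it on refreshes.
        while self.ts_tokens == self.ts_bucket_size:
            if self.trigger_queue:
                self._send_trigger(now)
            elif not self._try_refresh(now):
                break
            self.ts_tokens -= 1
        # Old-state tokens pay for regular refreshes.
        while self.os_tokens > 0:
            if not self._try_refresh(now):
                break
            self.os_tokens -= 1
```

Passing `now` explicitly keeps the sketch deterministic and easy to step through by hand; a real router would drive `on_trigger_tick` and `on_old_state_tick` from timers firing every `t_ts` and `t_os` seconds. The early `break` when a refresh would be too early avoids spinning inside `flush_state` and leaves the unused token for the next flush.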
Description
Puneet Sharma, Deborah Estrin, Sally Floyd, Van Jacobson. "Scalable timers for soft state protocols." Computer Science Technical Reports (Los Angeles, California, USA: University of Southern California. Department of Computer Science) no. 640 (1996).
Asset Metadata
Creator: Estrin, Deborah (author); Floyd, Sally (author); Jacobson, Van (author); Sharma, Puneet (author)
Core Title: USC Computer Science Technical Reports, no. 640 (1996)
Alternative Title: Scalable timers for soft state protocols (title)
Publisher: Department of Computer Science, USC Viterbi School of Engineering, University of Southern California, 3650 McClintock Avenue, Los Angeles, California, 90089, USA (publisher)
Tag: OAI-PMH Harvest
Format: 24 pages (extent), technical reports (aat)
Language: English
Unique identifier: UC16269725
Identifier: 96-640 Scalable Timers for Soft State Protocols (filename)
Legacy Identifier: usc-cstr-96-640
Rights: Department of Computer Science (University of Southern California) and the author(s).
Internet Media Type: application/pdf
Copyright: In copyright - Non-commercial use permitted (https://rightsstatements.org/vocab/InC-NC/1.0/)
Source: 20180426-rozan-cstechreports-shoaf (batch), Computer Science Technical Report Archive (collection), University of Southern California. Department of Computer Science. Technical Reports (series)
Access Conditions: The author(s) retain rights to their work according to U.S. copyright law. Electronic access is being provided by the USC Libraries, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright.
Repository Name: USC Viterbi School of Engineering Department of Computer Science
Repository Location: Department of Computer Science. USC Viterbi School of Engineering. Los Angeles, CA, 90089
Repository Email: csdept@usc.edu