Scalable, Highly Available Web Caching

Katia Obraczka
USC Information Sciences Institute

Peter Danzig
Network Appliance

Solos Arthachinda
USC Electrical Engineering Department

Muhammad Yousuf
USC Computer Science Department
Abstract
This paper proposes translucent caching as an alternative to transparent caches. Translucent caches use the fact that network routing forwards a request for an object along the best path from the client to the object's home server. Along this best path, routers direct the request toward a cache chosen among a collection of nearby translucent caches. Unlike transparent caching, which relies on routers to serve as TCP connection intermediaries, translucent caches only use routers to get next-hop cache information. By doing away with TCP connection intermediaries, translucent caches preserve the robustness afforded by the end-to-end argument.
1 Introduction
This paper discusses ways of scaling Web caches. It proposes translucent caching, an alternative to transparent caching. Translucent caching works by directing a request for an object toward caches along the way from the client to the object's origin server. It relies on the network routing fabric to route the request along the shortest path between the source and destination. Translucent routers along the way intercept the request and provide the address of a next-hop cache server to the previous hop. The previous-hop cache then sends the request to the indicated next-hop cache. Similarly to transparent caching, translucent routers balance load by directing a request to a cache chosen from a collection of next-hop caches. Unlike transparent caching, which relies on routers to serve as TCP connection intermediaries, translucent caches only use routers to get next-hop cache information. By establishing the end-to-end TCP connections themselves, translucent caches avoid the problems caused by route flapping.

For performance, translucent caches can cache next-hop information to avoid the extra round-trip time. In our design, we also include a maximum number of intercepts option. By setting this flag, we limit the number of times a request gets intercepted by translucent caches along its way to the object's origin server.
The paper is organized as follows. In Section 2 we review several load balancing schemes; they work by partitioning web traffic among centrally administered caches, such as caches within an ISP or Internet backbone connectivity provider. We single out transparent caching as a router-supported load balancing mechanism and discuss its advantages and disadvantages. Section 3 introduces translucent caches and presents our design and implementation. It concludes by presenting a simple clustering failover scheme we designed and implemented to improve translucent caching availability and fault tolerance.
2 Load Balanced Caches
One way to make Web caching scale is to partition traffic among a group of collaborating caches, including proxy caches, network service provider caches, and content provider caches. In this section we review transparent caching, a router-supported load balancing mechanism, and discuss its strengths and drawbacks.

We start by reviewing some software-based load balancing schemes, namely proxy auto-configuration and server-side re-direction.
2.1 Software-based Load Balancing

Proxy Auto-Configuration
Most available browsers support proxy auto-configuration. It works by configuring browsers with the URL of a configuration program, which maps URLs to specific proxy servers. Configuration programs typically use a hash function based on the URL's hostname to perform the URL-proxy mapping. Hostname hash functions avoid duplicate caching and maximize cache hits by consistently mapping requests for an object to the same cache server.
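The hostname-hash mapping can be sketched as follows. The proxy list and hash function here are illustrative assumptions; real auto-configuration programs are typically small scripts returning browser directives such as "PROXY host:port".

```python
from urllib.parse import urlsplit

# Hypothetical proxy pool -- not the caches from any deployment in this paper.
PROXIES = ["proxy1.example.edu:3128", "proxy2.example.edu:3128",
           "proxy3.example.edu:3128"]

def select_proxy(url: str) -> str:
    """Map a URL to a proxy by hashing its hostname, so every request
    for objects on the same server goes to the same cache (avoiding
    duplicate caching and maximizing hits)."""
    host = urlsplit(url).hostname or ""
    digest = sum(ord(c) for c in host)  # deliberately simple illustrative hash
    return PROXIES[digest % len(PROXIES)]
```

Because the hash depends only on the hostname, all objects from one server land on one cache, which is what makes consistent mapping effective.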
The problem with proxy auto-configuration is that it is not transparent to the user. Since it involves browser configuration, users need to get involved in downloading new browsers or re-configuring the existing one every time new cache deployment or configuration is performed. This often leads to customer support calls and user discontent.
Server-Side Re-direction
Cache server load balancing solutions can complement and are orthogonal to client-side mechanisms. Resonate's Dispatch [9] is an example of a server-side software-based load balancing scheme. It uses a designated server, or dispatch manager, as a front-end to incoming requests.¹ Clients send requests to a virtual IP address. The dispatch manager accepts incoming requests, parses the requested object id (the URL in the case of Web requests), and re-directs the request to the appropriate server. The dispatch decision is based on object location and server load information. Cache servers, including the dispatch manager, exchange load information among themselves using a separate protocol. Once the request is forwarded to a server, that server responds directly to the client, bypassing the dispatcher.
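A minimal sketch of the dispatch decision just described, assuming hypothetical object-location and load tables (Resonate's actual protocol and data structures are not reproduced here):

```python
# Illustrative state a dispatch manager might keep: which servers hold
# each object, and each server's reported load (0.0 = idle, 1.0 = full).
object_location = {"/logo.gif": {"cacheA", "cacheB"}}
server_load = {"cacheA": 0.7, "cacheB": 0.2, "cacheC": 0.4}

def dispatch(object_id: str) -> str:
    """Pick a target: prefer servers known to hold the object, then
    break ties by load. The chosen server replies directly to the
    client, bypassing the dispatcher."""
    holders = object_location.get(object_id, set())
    candidates = holders if holders else set(server_load)
    return min(candidates, key=lambda s: server_load[s])
```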
2.2 Router-Based Load Balancing: Transparent Caching
An advantage of transparent caching over software-based load balancing mechanisms is that it cannot easily be bypassed. As the name implies, transparent caches partition web traffic transparently to the user. This means that proxy configuration is not needed, although both schemes can complement one another. Even if a sophisticated user disables proxy auto-configuration, a transparent cache captures "runaway" requests and re-directs them appropriately.
Transparent caching can be implemented by either: modifying the cache's TCP stack so that it operates in promiscuous mode to capture all possible IP addresses; or using the router to map a client's TCP session to an appropriate Web cache. In the latter approach, the router acts as a switchboard between clients and the target Web caches: it captures a client's TCP SYN packet, establishes a connection with the client, selects a Web cache, and establishes a separate connection with it. Upon establishing the two TCP sessions, the router forwards the client's TCP packets to the selected cache by performing the appropriate mappings.

Several transparent caching schemes have been proposed; we review some of them below.
¹ Although most of its customers use it as a load balancing scheme for Web servers, Dispatch can also be applied to Web cache servers.
Cache Director
Cisco's cache director system [3] uses router support to intercept HTTP requests (TCP traffic destined to port 80) and partition them among a collection of cache engines. Each cache engine is assigned a share of the 256 address groups into which the IP address space is split. The router and cache engines run the Web Cache Control Protocol to monitor cache load and to re-allocate the IP address space accordingly. Cache engines keep statistics on address subgroup hit rate as a metric for cache load. The cache director uses these statistics to dynamically re-allocate high-traffic address groups from heavily loaded to more lightly loaded caches. Address groups are also re-allocated when caches are added to or removed from the cache farm (including on cache failures).
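The address-group partitioning can be sketched as follows. Splitting on the leading octet and the three-engine assignment are assumptions for illustration; Cisco's actual partitioning policy is not specified here.

```python
# Sketch: destination IP space split into 256 groups, each assigned to
# a cache engine; hot groups can be moved to lightly loaded engines.
def address_group(ip: str) -> int:
    """One group per leading octet -- an illustrative 256-way split."""
    return int(ip.split(".")[0])

# Initial assignment of the 256 groups across three cache engines.
assignment = {g: f"engine{g % 3}" for g in range(256)}

def engine_for(ip: str) -> str:
    return assignment[address_group(ip)]

def reallocate(group: int, new_engine: str) -> None:
    """Move a high-traffic address group to a less loaded engine, as
    the cache director does based on hit-rate statistics."""
    assignment[group] = new_engine
```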
We are currently designing and implementing an alternative to the cache director scheme. The main idea behind our transparent caching scheme is to use an existing routing protocol (e.g., RIP) to advertise and process cache load information. Caches use routing updates to send current load information to routers. The advantage of this scheme is that it does not need a special-purpose protocol to implement transparent caching. Because it uses existing, standardized, publicly available routing protocols, our scheme will be more easily portable.
Content-Based Redirection
A step forward from the cache director approach consists of using a more elaborate decision-making process in the router. Instead of just deciding whether incoming traffic is Web-related or not, the router performs content-based re-direction. Routers parse requests and compose a key using the source and destination IP addresses, TCP port number, and the object tag (the URL in the case of Web requests). They use this key to look up a locally-maintained content-location mapping database. The content-location database uses location information to re-direct requests. At the target site, load information can be used to direct the request to the "best" available server.

The content-based redirection approach can be used for content servers in general, including Web caches. ArrowPoint Communications [1] is currently developing a line of network appliances based on the content-based re-direction concept.
2.3 Transparent Caching Drawbacks
The strength of transparent caching is also its main weakness: it violates the end-to-end argument by exposing network routing to higher protocol layers and trying to circumvent it. This results in decreased robustness: when routes flap, transparent cache clients may experience broken Web pages because their HTTP/TCP connections are no longer end-to-end.
3 Translucent Caching
Translucent caching is another way to perform router-supported load balancing among cache servers. Because they do not break the end-to-end nature of TCP connections, translucent caches are robust and do not suffer from problems that may result from route asymmetry. The basic idea behind translucent caching is to direct a request for an object toward caches along the way from the client to the object's origin server. It takes advantage of the fact that the network routing fabric routes the request along the best path from the source (client) to the destination (server). Routers along the way intercept the request (in ICP [12] or another caching protocol) and echo back the address of a close-by cache. The previous-hop cache can then send the request to the indicated next-hop cache.
Figure 1 sketches how translucency works. A client requesting an object sends the request to the corresponding client cache (or proxy cache) CC. If CC does not have the object, it forwards the request toward the object's home server. Router R1 along the way intercepts the request and sends back a reply containing C1 as the next-hop cache. CC then sends the request to C1. Notice that this time around, CC's request does not get intercepted by R1. We describe how we bypass router interception in Section 3.1 below.

[Figure 1: Translucent Caching. The figure shows a client, its client cache CC, routers R1-R4 (translucent routers among them), intermediate caches C1-C7, and the origin server; the legend distinguishes routers, translucent routers, and communication using the cache protocol. R1 intercepts CC's request (dst = server), replies "try C1", and CC then re-sends the request with dst = C1.]
To avoid the extra round-trip time (RTT), cache servers may cache next-hop information for future use. Like any cached object, cached next hops can be assigned a time-to-live (TTL) so that they are refreshed periodically. This allows translucent caches to adapt to changes in cache server load. Clearly, the tradeoff in setting these TTL values is between trying to avoid the extra RTT and keeping up with load and object location dynamics.
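A minimal sketch of next-hop caching with TTLs under the design just described; the class and its fields are illustrative, not the data structures of our Squid-based implementation:

```python
import time

class NextHopCache:
    """Cache of destination -> next-hop-cache mappings with a TTL, so
    stale entries are re-learned from routers and the cache adapts to
    load and object-location dynamics."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self.entries = {}  # destination -> (next_hop, expiry time)

    def put(self, destination: str, next_hop: str) -> None:
        self.entries[destination] = (next_hop, time.time() + self.ttl)

    def get(self, destination: str):
        """Return the cached next hop, or None -- in which case the
        request is sent toward the origin server and re-intercepted."""
        hit = self.entries.get(destination)
        if hit is None:
            return None
        next_hop, expiry = hit
        if time.time() >= expiry:  # stale: pay the extra RTT to re-learn
            del self.entries[destination]
            return None
        return next_hop
```

A short TTL tracks load changes closely but pays the extra RTT more often; a long TTL does the opposite, which is exactly the tradeoff above.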
Pointing to the next-hop cache becomes a load balancing issue. Routers tunnel intercepted ICP requests to a process that keeps a database of nearby cache servers. This process can be co-located with the router. The next-hop cache database is equivalent to the content-location information kept by routers in the content-based transparent caching scheme. Similarly to transparent caching, translucent caches can periodically inform routers of their current load. They can also tell routers what IP address ranges they are willing to service. Currently, we manually configure next-hop cache information. As an extension to our translucent caching implementation and as part of our work in transparent caching, we plan to use an existing routing protocol to communicate cache content and load information to translucent routers.
3.1 Design Issues
The translucent caching approach requires that routers:

- Filter packets containing a request for an object coming from a cache. In our implementation, we use the Internet Cache Protocol (ICP). Throughout this section, we use ICP to refer to how caches communicate among themselves.
- Perform a lookup for the next-hop cache information, and
- Send the next-hop cache information back to the previous cache using ICP.
The previous-hop cache sends the request to the next-hop cache. Alternatively, the router can forward the request to the next-hop cache and send next-hop information back to the previous-hop cache for future use.
Cache translucency may cause loops: a cache request may keep being intercepted by a router indefinitely. Possible approaches to avoid looping are:

- Define a special ICP message type that bypasses router interception. In other words, if a cache already has next-hop cache information (received as the result of the current or a previous request), it will use the appropriate ICP message type to prevent the next-hop router from intercepting the request. This solution requires that routers process every cache request. When it receives the bypass message, the router forwards the packet to its original destination.
- Another option is to use a different port number when bypassing router interception. This automatically bypasses the router filter without routers having to process every ICP packet.
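The two options can be contrasted in a small sketch. Opcodes 1 and 13 match the implementation described in Section 3.2; the alternate bypass port number is purely an assumption, since our implementation chose the message-type option instead.

```python
# Option 1: message-type-based bypass -- the router sees every ICP
# request but only answers normal queries with next-hop information.
ICP_OP_QUERY, ICP_OP_NO_INTERCEPT = 1, 13

def router_answers_with_next_hop(opcode: int) -> bool:
    """Routers process every request; bypass messages are forwarded
    unchanged to their original destination."""
    return opcode == ICP_OP_QUERY

# Option 2: port-based bypass -- the packet filter only grabs the
# standard ICP port, so bypass traffic never reaches the router's
# ICP processing at all. 3131 is an assumed alternate port.
ICP_PORT, BYPASS_PORT = 3130, 3131

def filter_intercepts(dst_port: int) -> bool:
    return dst_port == ICP_PORT
```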
For performance, we plan to restrict the number of next-hop cache lookups along the path between a source-destination pair. If a request goes through, say, 2 or 3 next-hop lookups without hitting a cache, it is forwarded straight to its destination, bypassing any other next-hop lookups. Take the scenario shown in Figure 1 with a maximum number of intercepts of 2. The client cache CC gets a request from a client. If it doesn't have the requested object, it sends an ICP request towards the origin server. The first router along the way, R1, tells CC to try C1. CC sends the request to C1 using the "no intercept" message to bypass R1. If it is a cache miss, C1 increments number of intercepts to one. It checks whether the number of intercepts is less than the pre-specified maximum number of intercepts, in which case it forwards the ICP request towards the server. The request gets intercepted by R3, who tells C1 to try C4. C1 then sends the request to C4 using the ICP "no intercept" message. In case C4 does not have the object, it will increment number of intercepts to two. This causes C4 to send an HTTP request for the object to the origin server, instead of sending back next-hop cache information.
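The miss-handling decision in this walk-through reduces to a small check; this is a sketch with the maximum number of intercepts set to 2, as in the example:

```python
MAX_INTERCEPTS = 2  # the limit used in the Figure 1 walk-through

def on_cache_miss(num_intercepts: int) -> str:
    """What a translucent cache does after missing locally, given the
    intercept count carried in the incoming request."""
    num_intercepts += 1  # this interception counts
    if num_intercepts >= MAX_INTERCEPTS:
        # limit reached: give up on further lookups, fetch the object
        return "HTTP to origin server"
    # under the limit: this ICP query may be intercepted by a later router
    return "ICP query towards origin server"
```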
3.2 Implementation
We currently have a working prototype implementation of translucent caching under Solaris 2.5.1. Our code is available from http://www-scf.usc.edu/~yousuf/cache.html.

Our implementation of translucent caching works as follows. Routers filter ICP request packets using the IP packet filter package [8]. The filter operates on both the inbound and outbound sides of the IP packet queue and checks packets before they get checked for source route options. The filter can be configured with a list of rules defined in the filter's configuration file. These rules allow packet filtering by IP protocol, IP options, network interface, and port number. For example, the rule block in log proto udp from any to any port = 3130 causes the filter to intercept all incoming UDP packets coming from any host and destined to any host at port 3130. The intercepted packets are redirected to /dev/ipf.
We chose to define new ICP message types instead of assigning a new Squid port number. We modified the ICP protocol and incorporated two previously undefined ICP messages. To the first one, we assigned opcode 12 and use it to convey next-hop cache information to the previous hop. The second new ICP message was assigned opcode 13 and is used to bypass translucent routers. Figure 2 shows the new messages' format. Note that the new messages' option data header field contains number of intercepts. The option data field in ICP's opcode 1 message (ICP_OP_QUERY) header was also modified to contain number of intercepts information.

[Figure 2: New ICP messages. Both messages carry the standard ICP header fields (Opcode, Version, Message Length, Request Number, Options, Option Data containing the Hop Count, Sender Host Address) followed by a null-terminated URL. The ICP reply with next-hop cache address payload (OPCODE = 12) additionally carries the Next-hop Cache Address; the ICP request with no-interception payload (OPCODE = 13) carries the Requester Host Address and Next-hop Cache Address.]
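The header layout can be illustrated by packing it with Python's struct module. The 20-byte ICP header field widths (1-byte opcode and version, 2-byte message length, then four 4-byte words) are standard ICP; the rest is a sketch, not our modified Squid code, and the sender address is shown as a bare 32-bit integer for simplicity.

```python
import struct

OP_NEXT_HOP, OP_NO_INTERCEPT = 12, 13  # the two new opcodes

def pack_icp(opcode: int, request_number: int, hop_count: int,
             sender_addr: int, url: str) -> bytes:
    """Pack an ICP message: 20-byte header (opcode, version, message
    length, request number, options, option data carrying the hop
    count, sender host address) plus a null-terminated URL payload."""
    payload = url.encode("ascii") + b"\x00"
    length = 20 + len(payload)  # message length covers header + payload
    header = struct.pack("!BBHIIII", opcode, 2, length,
                         request_number, 0, hop_count, sender_addr)
    return header + payload

msg = pack_icp(OP_NO_INTERCEPT, 1, 1, 0x0A000001, "http://server/obj")
```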
We incorporated our translucency protocol into the publicly available Squid Internet Object Cache [11] (Squid version 1.1.14). We modified the Squid server to perform the additional functions listed below. Figure 3 shows the Squid pseudocode including our modifications.

- Generate an ICP reply to send next-hop cache information to the previous-hop cache. We implemented this new message using ICP's previously undefined opcode 12 message.
- Recognize the new ICP opcode 12 message and use the next-hop cache information it contains to generate the next ICP request for the object, in this case the ICP opcode 13 message.
- Generate the new ICP opcode 13 message and send it to the next-hop cache. When a translucent router intercepts an ICP opcode 13 message, it regenerates the request with the same destination address.
- Recognize the new ICP opcode 13 message and generate a regular ICP opcode 1 message if the object is not locally cached. This request will be intercepted by the next-hop router.
(1) If an HTTP, Gopher, etc. request from client
{
If object found in local cache
{
Transfer the object to client;
}
else
{
Send ICP request with OPCODE = 1 towards the source of object;
}
}
(2) If ICP packet with OPCODE = 12 /* containing next-hop cache */
{
Send ICP request, with OPCODE = 13 for no interception, to next-hop
cache address found in message payload;
}
(3) If ICP request packet with OPCODE = 1 or OPCODE = 13 /* normal or
{                                         no-interception request */
if object found in local cache
{
reply to previous cache with ICP_HIT;
}
else
{
number of intercepts++;
If (number of intercepts == maximum number of intercepts)
{
Get object directly from source and transfer it to client proxy
cache;
}
else
{
Send ICP message ICP_MISS to previous cache;
Send ICP request with OPCODE = 1 towards the source;
}
}
}
(4) If ICP reply packet with ICP_HIT
{
Get object from the cache sending ICP_HIT and transfer it to client
proxy cache;
}
Figure 3: Modified Squid pseudocode.
for(;;)
{
Read the intercepted packet
If ICP request packet with OPCODE = 1
{
Get the next-hop cache address from cache directory;
Send back an ICP message with OPCODE = 12, containing next-hop cache
address, to the host address in ’Sender Host Address’ field of ICP
header (i.e., the previous-hop cache);
}
If ICP request packet with OPCODE = 13
Regenerate the request towards the original destination;
}
Figure 4: ipmon pseudocode.
- Process number of intercepts. If the incoming request's number of intercepts < maximum number of intercepts, increment the request's number of intercepts and forward an ICP opcode 1 (ICP_OP_QUERY) request. Otherwise, generate an HTTP request to the object's origin server. The current implementation uses maximum number of intercepts = 3.
The modifications made to the router are listed below.

- Install and run the IP packet filter (we use IP packet filter version 3.1.11 in our current implementation) in the kernel to intercept ICP messages, i.e., messages destined to port 3130.
- Run a user-level background process, ipmon. The ipmon process, whose pseudocode is shown in Figure 4, reads intercepted packets from /dev/ipf and processes them as follows:
  - If the ICP message opcode is 1, generate an ICP opcode 12 message carrying next-hop cache information to the previous-hop cache. ipmon looks up the next-hop cache host name from a cache database maintained in the router. In the current implementation, this next-hop cache database is manually configured and lookups use round-robin to select the appropriate entry.
  - If the ICP message opcode is 13, forward the message towards its original destination.
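ipmon's round-robin selection over the manually configured next-hop database can be sketched as follows (the host names are illustrative; our prototype's database lives in the router and is consulted per intercepted opcode 1 request):

```python
import itertools

# Manually configured next-hop cache database, as in the prototype.
next_hop_caches = ["paloma.usc.edu", "jalama.usc.edu"]
_rr = itertools.cycle(next_hop_caches)

def lookup_next_hop() -> str:
    """Return the next cache in round-robin order -- the entry ipmon
    places in the opcode 12 reply to the previous-hop cache."""
    return next(_rr)
```

Round-robin spreads intercepted requests evenly across the configured caches; replacing this lookup with a load-aware policy is the extension discussed earlier.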
We use the scenario in Figure 5 to demonstrate how our prototype implementation works. The client running on excalibur.usc.edu sends an HTTP request to its proxy server, also running on excalibur.usc.edu. The proxy server does not have the requested object, so it generates an ICP request for the object. This ICP request is intercepted by the translucent router dorado.usc.edu. The router looks up the address of the next-hop cache (in this example, paloma.usc.edu) and sends it back to the previous-hop cache excalibur.usc.edu using ICP's opcode 12 message. When the cache gets an ICP opcode 12 message, it knows that the payload contains the address of the next-hop cache. excalibur.usc.edu then sends an ICP opcode 13 request to paloma.usc.edu with number of intercepts = 1. Since paloma does not have the object, it generates an ICP request towards the object's origin source. This request is intercepted by another translucent router, cabrillo.usc.edu, which returns the address of the next-hop cache (here, jalama.usc.edu) to paloma.usc.edu. paloma.usc.edu then sends an ICP opcode 13 request to jalama.usc.edu with number of intercepts = 2. Since the object is not cached locally, jalama increments number of intercepts and decides to fetch the object directly from the server.

[Figure 5: Translucency sample testing configuration. The testbed consists of a client and proxy cache on excalibur.usc.edu, translucent routers dorado.usc.edu and cabrillo.usc.edu, and web caches paloma.usc.edu and jalama.usc.edu. The message sequence is: (1) HTTP request; (2) ICP request (HopCount = 0); (3) ICP reply (opcode 12) with next-hop cache address, i.e., paloma.usc.edu; (4) ICP request (opcode 13) to next-hop cache paloma.usc.edu (HopCount = 1); (5) ICP request (HopCount = 1); (6) ICP reply (opcode 12) with next-hop cache address, i.e., jalama.usc.edu; (7) ICP request (opcode 13) to next-hop cache jalama.usc.edu (HopCount = 2).]
3.3 Addressing Availability and Fault Tolerance
The fact that users and services rely heavily on caching for proper Internet connectivity also means that cache failures can be catastrophic and must be avoided.

Several of the load balancing schemes discussed in Section 2 also act as fault tolerance solutions. In proxy auto-configuration, for example, the configuration program can specify a primary and a secondary cache server. In case the primary cache is unavailable, the browser automatically forwards the request to the secondary cache.
We address fault tolerance using a clustering-based solution. The idea is to organize collaborating translucent caches in clusters of two or more cache servers.² At any given time, one of the cluster members is operating as the cluster representative, which is responsible for answering requests received by the cluster. The remaining members of the cluster operate as backups. If for any reason the cluster's representative becomes unavailable, a backup will automatically take on the representative role and will start answering requests on behalf of the cluster.

Clients reference a cache cluster using the cluster's IP address, an IP address to which the virtual interfaces of all cluster members are bound. Cache cluster clients such as web browsers are configured to point to a cluster the same way as they point to regular caches.

² A similar solution has been proposed for achieving fault tolerance in routers [6].
3.3.1 The Cluster Failover Protocol (CFP)
For simplicity, we assume a cluster of two caches. The resulting protocol can easily be extended to handle larger clusters, which is an item for future work. The cluster pair runs the cluster failover protocol (CFP), which allows the backup cache to detect failure of the cluster representative and to take on its role.

At startup, one of the caches in a cluster is configured as the cluster representative. Alternatively, the cluster can go through an initialization phase, where the cluster caches start up as backups. Depending on their configured priority, one of the caches assumes the role of the cluster representative.

The backup cache periodically polls the representative by sending standard ICMP echo request messages (pings) and waiting for replies. If for some reason the representative does not respond, the backup assumes it is down and becomes the cluster representative. Once the default representative comes back up, it resumes its functions and the backup cache goes back to its default backup state.
At any given time, there is only one cluster representative, which responds to requests destined to the cluster. Requests are addressed to a virtual IP address, the cluster's IP address, which is bound to the virtual interface of both cluster caches. The representative has that interface turned on, while the backup cache has it turned off. When the backup cache takes on the representative role, it turns on the virtual interface bound to the cluster's IP address. It also sends a gratuitous ARP broadcast to directly connected routers and gateways. The ARP broadcast causes the appropriate MAC-to-IP address mappings to be updated. This way, future client requests will be delivered to the current representative.

When the default representative comes back up, it sends an ARP broadcast to update directly connected routers. The representative also sends periodic ARP broadcasts to ensure ARP records are up-to-date.
3.3.2 Implementation
We implemented the representative and backup sides of the protocol using a single perl script. Our code is available from http://www-scf.usc.edu/~arthachi/clusterd-doc/. Through a command line parameter, the cluster administrator configures one of the cluster members as representative or as backup. Figure 6 shows the protocol's pseudocode.

Besides configuring a cache as representative or backup, the cluster administrator can also use command line parameters to configure the parameters below. Otherwise, they take on their default values.

- Path names for perl, ping, and ifconfig. Their default values are /local/bin/perl, /usr/sbin/ping, and /sbin/ifconfig.
- Name of the virtual interface bound to the cluster IP address.
- Cluster IP address.
- Cache server IP address.
- Ping timeout period, with default value set to 3 seconds.
- Poll interval, with default value set to 3 seconds.
for(;;)
{
/* Backup cache */
if cache is BACKUP
{
if (ping is successful)
/* Representative cache is up. */
{
/* Switch from representative to backup or stay as backup. */
ifconfig virtual interface DOWN
}
elseif (ping timed out)
/* Representative cache is down. */
{
/* Switch from backup to representative. */
ifconfig virtual interface UP
}
else (ping failed)
exit
}
/* Representative cache */
else
{
/* Send gratuitous ARP broadcast. */
ifconfig virtual interface UP
}
sleep POLL_INTERVAL;
}
Figure 6: Cluster protocol pseudocode.
The ping timeout interval is the interval in seconds the backup waits for an answer from the cluster representative before declaring it to be down. We set the current default value to 3 seconds.³ The frequency at which the backup cache polls the representative is given by 1/poll interval; the poll interval's default value is currently set at 3 seconds. The cluster administrator can configure the timeout and poll interval depending on how responsive to cache failures the cluster should be.
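The responsiveness these two parameters buy can be bounded with a back-of-the-envelope sketch, assuming the worst case where the representative fails just after answering a poll (the bound itself is an inference from the protocol description, not a measurement):

```python
# Worst case: the representative dies immediately after a successful
# poll. The backup only notices after waiting one full poll interval
# for the next poll, plus the ping timeout on that unanswered poll.
POLL_INTERVAL = 3.0  # seconds between polls (default above)
PING_TIMEOUT = 3.0   # seconds before declaring the peer down (default above)

def worst_case_failover_delay(poll_interval: float, ping_timeout: float) -> float:
    """Upper bound on how long the cluster answers no requests before
    the backup takes over (ignoring ARP propagation)."""
    return poll_interval + ping_timeout
```

With the 3-second defaults, the cluster can be unresponsive for up to about 6 seconds before failover, which is the knob the administrator tunes.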
4 Conclusions
In this paper, we propose translucent caching as an alternative to transparent caching-based load balancing. Translucent caches are robust: by not "splitting" the TCP connection between the client and a cache, they avoid the problems that may be caused by route flapping and asymmetry.

Translucent caching uses routers along the best path between a client and a server to direct requests toward nearby, lightly loaded caches. We presented our translucent caching design and implementation. We also presented a clustering-based failover implementation that can be used by translucent caches for improved fault tolerance and availability.
References
[1] ArrowPoint. ArrowPoint Communications home page. Available from http://www.arrowpoint.com/.

[3] Cisco. Shared network caching and Cisco's Cache Engine. Available from http://www.cisco.com/.

[6] S. Knight, D. Weaver, D. Whipple, and R. Hinden. Virtual Router Redundancy Protocol. Internet Draft draft-hinden-vrrp-00.txt, March 1997.

[8] Darren Reed. The IP packet filter. http://coombs.anu.edu.au/~avalon/ip-filter.html.

[9] Resonate. Resonate Dispatch 2.0 product overview. Available from http://www.resonateinc.com/.

[11] D. Wessels. Squid Internet Object Cache. Squid web page http://squid.nlanr.net/, July 1997.

[12] D. Wessels and K. Claffy. Internet Cache Protocol (ICP), version 2. Internet Draft draft-wessels-icp-v2-00.txt, November 1996.
³ Note that our default value overrides ping's default value of 20 seconds.