USC Computer Science Technical Reports, no. 725 (2000)
Proxy Caching for Quality Adaptive Multimedia Streams in
the Internet: A Performance Perspective
Reza Rejaie, AT&T Labs-Research, reza@research.att.com
Haobo Yu, USC/ISI, haoboy@isi.edu
Abstract—Multimedia proxy caching (MCaching) presents a cost-effective solution
to support large-scale access to high quality multimedia streams over the Internet.
However, introducing the notion of “quality” of cached streams adds a new dimension
to the evaluation space and complicates the problem. This paper proposes a compre-
hensive framework for the evaluation of multimedia proxy caching mechanisms. We
identify the fundamental tradeoff between stream quality improvement and caching
efficiency across the evaluation space. We then introduce two sets of performance met-
rics in this space to effectively capture various aspects of quality evolution and cache
efficiency at both the per-stream level and the aggregate level.
We have applied this framework to conduct simulation-based evaluation of multi-
media caching mechanisms. At the aggregate level, our results show that MCaching
efficiently utilizes cache space and adaptively maximizes overall performance along
both dimensions of the evaluation space. At the per-stream level, we provide insights
into the interactions among quality evolution, prefetching and caching efficiency, and
congestion control. Our results reveal that prefetching and caching efficiency are directly
determined by the difference between cached quality and deliverable quality.
I. INTRODUCTION
The explosive growth of commercial usage of the Internet has re-
sulted in a rapidly increasing demand for audio and video streaming.
This trend is expected to continue and a larger portion of Internet traffic
will consist of multimedia streams (i.e., audio and video) in the future.
Most current Internet multimedia streaming applications require that a
server pipeline a stream to a client, i.e., the client plays back the
available portion of the stream while the rest of it is being delivered.
There are two major obstacles in supporting high quality streaming
applications at a large scale over the Internet:
High Bandwidth Flows: High quality multimedia streams usually
consume high bandwidth. For example, an MPEG-2 stream may re-
quire as much as 4 Mbps bandwidth and may last as long as two
hours. This raises scalability concerns in terms of both network load
and server load.
Variability of Available Bandwidth: Today’s Internet is best-effort.
The quality (i.e., bandwidth) of a stream delivered to a client is limited
by the bottleneck bandwidth along the path to the server. Thus a client
with a high bandwidth local access may receive low quality streams due
to a remote bottleneck, even though it has paid a premium for a high
bandwidth local link.
Multimedia proxy caching (MCaching) [1] is an adaptive solution
that addresses both problems simultaneously. By caching popular
streams with appropriate quality at a proxy close to the interested
clients, subsequent requests for these streams can be replayed directly
from the cache. Similar to Web caching, MCaching can significantly
reduce the load on the network and the server, thereby improving scal-
ability. Furthermore, the quality of the stream delivered from the proxy
to the client is only limited by the available bandwidth between the
proxy and the client (i.e., the last hop(s)). (Certainly, if a client has only
low-bandwidth connectivity to the network, i.e., the bottleneck is the last
hop, the delivered quality cannot be improved; however, the cache is still
able to reduce the load on the network and the server.) Caching popular streams
close to interested clients also significantly reduces startup delay and fa-
cilitates interactive VCR-functionalities. Compared with other quality-
enhancing approaches (e.g., mirror servers), proxy caches are more
cost-effective and can be widely deployed.

Acknowledgments: We would also like to thank Deborah Estrin and Jennifer Rexford for their insightful comments during
the development of this paper. Haobo Yu is supported by the Defense Advanced Research Projects Agency
(DARPA) through the VINT project at USC/ISI under DARPA grant DABT63-96-C-0054.
MCaching introduces a new dimension of caching performance that
does not exist in current Web caching schemes, namely the quality of
cached streams. MCaching not only manages to keep popular streams
in the cache, but also tries to maintain appropriate quality for each
cached stream. Its goal is to maximize both the per-stream quality and
the aggregate performance of the cache at the same time.
MCaching works as follows. All the requests and responses are
“routed” through the proxy, as they are in traditional Web caches. On a
cache miss, the request is forwarded to the original server (or a neigh-
bor cache depending on the inter-cache architecture). The stream is
played back from the server to the cache, and simultaneously relayed
from the cache to the client. The quality of the cached stream after the
initial playback is thus limited by the bottleneck between the server and
the cache. On a cache hit, the stream is played back directly from the
cache. If the client has sufficient bandwidth to afford higher quality
than what is available in the cache, the missing higher quality portion
is incrementally prefetched from the server in a demand-driven fash-
ion. Consequently, the more high-bandwidth clients request a stream,
the better its quality becomes in the cache. If a cached stream (or its
high-quality portion) is not frequently used, the replacement algorithm
gradually flushes it out to make room for new streams or higher quality
portions of other streams. Thus MCaching introduces partial prefetch-
ing and fine-grain replacement, as opposed to atomic replacement used
in traditional Web caching.
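To make this hit/miss flow concrete, here is a minimal Python sketch under a deliberately simplistic model: a stream is a set of (layer, segment) pairs, quality adaptation supplies the set of segments a session needs, and congestion control is abstracted away. All names are ours for illustration, not the authors' implementation.

```python
# A minimal, hypothetical sketch of MCaching's request handling; names and
# structures are illustrative, not the authors' code.

def handle_request(cache, server_segments, needed, stream_id):
    """cache: dict stream_id -> set of (layer, seg) pairs; needed: set of
    (layer, seg) chosen by quality adaptation. Returns segments played."""
    cached = cache.setdefault(stream_id, set())
    if not cached:
        # Cache miss: relay from the server while caching. (In the real
        # system the cached quality is limited by the server-proxy
        # bottleneck; here we simply cache what the session needed.)
        fetched = needed & server_segments[stream_id]
        cached |= fetched
        return fetched
    # Cache hit: serve from the cache, prefetching missing higher-quality
    # segments in a demand-driven fashion.
    missing = (needed - cached) & server_segments[stream_id]
    cached |= missing            # prefetched segments stay in the cache
    return needed & cached

# Usage: one stream with 2 layers x 3 segments available at the server.
server = {"s1": {(l, i) for l in range(2) for i in range(3)}}
cache = {}
print(handle_request(cache, server, {(0, 0), (0, 1)}, "s1"))  # miss
print(handle_request(cache, server, {(0, 0), (1, 0)}, "s1"))  # hit + prefetch
```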
Stream quality adds a new dimension to the evaluation space of
Web caching mechanisms. Fig. 1 illustrates the hypothetical overall
performance of three caching mechanisms across the evaluation space.
The dotted lines show the expected trajectories for these mechanisms
when cache size increases. Notice that there is a tradeoff between max-
imizing quality of cached streams and maximizing the overall cache
performance (e.g., byte hit ratio), because changing the quality of a
cached stream directly affects its size. If the cache always maintains
low quality versions of streams, as in Low Quality Caching, the byte hit
ratio can be high because a larger number of streams reside in the cache.
However, the maximum deliverable stream quality is consistently low due
to the lack of any quality improvement mechanism. On the other hand,
High Quality Caching always stores the highest quality for every re-
quested stream, even if the client does not have sufficient bandwidth to
receive such high quality. It maximizes the cached quality at the cost
of lower utilization of cache storage space, and consequently a lower
byte hit ratio. MCaching tries to match the quality of cached streams
with the deliverable quality, i.e., the quality that most clients can af-
ford. Thereby MCaching uses cache storage space more efficiently and
achieves both high byte hit ratio and high cached stream quality.
This paper presents a comprehensive framework for performance
evaluation of multimedia proxy caching mechanisms. Because the no-
tion of stream quality does not exist in traditional Web caching, the ex-
isting cache performance metrics (e.g., byte hit ratio) are not sufficient
to fully evaluate multimedia caching mechanisms. We show that qual-
ity improvement and cache efficiency are orthogonal in terms of their
goals, hence the performance of multimedia caching should be exam-
ined collectively along both dimensions. Our framework proposes two
classes of metrics covering both dimensions of the evaluation space.

Fig. 1. Evaluation space for multimedia stream caching: hypothetical overall quality vs. byte hit ratio trajectories of High-Quality Caching, MCaching and Low-Quality Caching as cache size increases, with quality levels Max-Quality, Max-Deliverable-Quality and Max-Bottleneck-Quality marked.

The
caching efficiency metrics capture the cache performance in reducing
network load and server load, whereas the quality metrics measure the
cache performance in terms of improvement in stream quality. Fur-
thermore, each class contains metrics at the aggregate and per-stream
levels for evaluations at different granularities.
We have conducted simulation-based evaluations using our frame-
work to compare MCaching with two other strawman multimedia
caching mechanisms. Our results verified the fundamental tradeoff
between quality improvement and caching efficiency. At the aggre-
gate level, we found that compared to the two strawman approaches,
MCaching adaptively maximizes both caching efficiency and quality
improvement across the evaluation space by means of partial prefetch-
ing and fine-grain replacement. At the per-stream level, we observed in-
teresting interactions among quality evolution, prefetching and caching
efficiency, and congestion control. Our results reveal that prefetching
and caching efficiency are directly determined by the difference be-
tween cached quality and deliverable quality. Our results also verified
that MCaching converges the cached stream quality to the deliverable
quality, which is the exact reason why MCaching results in higher ag-
gregate cache efficiency.
The rest of this paper is organized as follows: Section II provides an
overview of the design of MCaching. Section III presents our evalua-
tion methodology for multimedia caching mechanisms, including two
sets of metrics and two strawman approaches. We then explain our sim-
ulation strategy in Section IV. In Sections V and VI we describe our
simulation results for aggregate and per-stream level evaluations. We
review related work in Section VII. Section VIII concludes the paper
and addresses our future directions.
II. MULTIMEDIA PROXY CACHING (MCACHING): AN OVERVIEW
A primary challenge for multimedia proxy caching in the Internet
is the requirement of congestion control. Because of the shared nature
of the Internet, all end systems—including streaming applications—are
expected to perform end-to-end congestion control to keep the network
utilization high while limiting overload and improving inter-protocol
fairness [2].
Performing congestion control results in unpredictable and poten-
tially wide variations in transmission rate. To maximize the deliv-
ered quality to clients while obeying congestion controlled rate lim-
its, streaming applications should be quality adaptive [3] over the
Internet—that is, they should match the quality of the delivered stream
with the average available bandwidth on the path. Once a stream is
cached, the cache can replay it for subsequent requests but it still needs
to perform congestion control and quality adaptation based on the state
of its connection to the client. The proxy-client connection is likely
to exhibit different characteristics, e.g., different changes in available
bandwidth, from previous connections. Thus, some of the segments
required by quality adaptation might be missing from the cache and
should be prefetched.

Fig. 2. Prefetching mechanism: quality (layers L0-L3) vs. time, showing the cached quality, the played-back quality, and the prefetched segments that fill the gap between them.
MCaching assumes the existence of an end-to-end architecture for
playback of quality adaptive multimedia streams in a congestion con-
trolled fashion over the Internet (such as that in [4]). TCP-friendly con-
gestion control is performed using Rate Adaptation Protocol (RAP) [5].
The quality adaptation module adjusts the quality of the played back
stream. Hierarchical encoding [6] is used to provide a layered approach
to quality adaptation. With hierarchical encoding, each stream is split
into a base layer, which contains the most essential low quality informa-
tion, and higher layers, which provide optional quality enhancement informa-
tion. For the sake of clarity, MCaching further assumes that all streams are
linear-layered encoded, i.e., all layers have the same constant bandwidth.
However, this architecture (and MCaching) can be extended
to other layered-encoding bandwidth distributions. Layered organiza-
tion provides an opportunity for proxy caches to adjust the quality of
a cached stream in a demand-driven fashion. To allow fine-grain ad-
justment of quality, each layer of the encoded stream is divided into
equal-sized pieces called segments.
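Under these assumptions (constant per-layer bandwidth, equal-sized segments), playout time maps linearly to segment indices within every layer. The sketch below shows the kind of bookkeeping this enables; the helper names are hypothetical, and the parameter values are borrowed from the simulation setup in Section IV (1KB segments, 6KB/s per-layer consumption rate).

```python
# Hypothetical helpers: with linear-layered encoding, a playout time maps
# directly to a segment index that is the same for every layer.

SEGMENT_SIZE = 1024        # bytes per segment (1KB, as in Section IV)
LAYER_RATE = 6 * 1024      # bytes/s consumed per layer (6KB/s)

def segment_index(t_seconds):
    """Index of the segment of any layer being played out at time t."""
    return int(t_seconds * LAYER_RATE) // SEGMENT_SIZE

def segments_in_window(t_start, t_end, layers):
    """All (layer, index) pairs whose playout falls in [t_start, t_end)."""
    lo, hi = segment_index(t_start), segment_index(t_end)
    return {(l, i) for l in range(layers) for i in range(lo, hi)}

print(segment_index(10.0))               # segment 60 at t = 10s
print(len(segments_in_window(0, 1, 4)))  # 4 layers x 6 segments = 24
```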
MCaching assumes that all streams between servers and proxies, and
between proxies and clients, perform congestion control and qual-
ity adaptation. It does not make any assumption about the inter-cache
architecture which determines control message exchange and request
forwarding.
A. Delivery Procedure
Relaying on a Cache Miss: On a cache miss, the request is for-
warded to the original server or a neighbor cache depending on the
inter-cache architecture. For simplicity we assume that the stream is
played back from the original server to the cache via a congestion con-
trolled connection. The cache then relays data packets to the client
through a separate congestion controlled connection. The quality of
the delivered stream is limited by the average bandwidth between the
server and the cache. Thus the client does not observe any benefit (e.g.,
quality improvement) from the presence of the cache.
The proxy always caches a missing stream during its first playback. If
cache space is exhausted, the replacement algorithm flushes a sufficient
number of segments from the cache to make room for the new stream
(Section II-C). After the first playback, there might be occasional miss-
ing segments in the cached stream that were lost but not repaired. To
perform quality adaptation effectively during subsequent playbacks, the
cache may proactively repair the missing segments or fetch new layers
during idle hours, or it may prefetch them in a demand-driven fash-
ion when it is serving subsequent requests for the stream. We adopted the
latter because future request patterns, especially the required quality,
may not be predictable.
Prefetching on a Cache Hit: On a cache hit, the proxy acts as a
server and starts playing back the requested stream. As a result the
client observes shorter startup latency. The proxy must still perform
congestion control and quality adaptation. As discussed before, the
quality variations in the cached stream may not match those required
by quality adaptation during the new session. This means that the
cache may need to send some segments that it does not have. To
improve the delivered quality, the cache should prefetch these missing
segments from the server before their playout times, i.e., the deadline
when they must be delivered for the client to play back. Fig. 2 il-
lustrates the gap between the played-back quality and the cached quality,
which is filled with prefetched segments. Because of unpredictable changes in
quality, some of the prefetched segments may not be used.
As outlined above, MCaching has two major components: 1) prefetch-
ing and 2) replacement. We discuss these two mechanisms in more
detail in the rest of this section.
B. Prefetching
During the playback of a cached stream, the cache needs to maintain
two unsynchronized connections: (i) that between the server and the
proxy for prefetching, and (ii) that between the proxy and the client
for delivery of the stream. The proxy must predict a missing segment
that may be required by quality adaptation in the future and prefetch
it before its playout time. Thus there exists a tradeoff: the earlier the
proxy prefetches a missing segment, the less accurate is the prediction,
but the higher is the chance of receiving the prefetched segment in time.
To better meet the playout deadlines, prefetching should loosely fol-
low the playback session; otherwise the prefetching stream may fall be-
hind and become useless. Towards that goal, a sliding-window mecha-
nism was devised for prefetching [1]. The proxy examines a window of
time in the near future and sends an ordered list of all the missing seg-
ments in that window based on their priority (i.e., layer number). These
segments may be missing due to packet losses, layer drops in previous
playbacks, or replacement. Furthermore, if quality adaptation decides
to add a new layer (see [3] for details on the conditions for adding a
new layer), all missing segments of the new layer within the
prefetching window are also requested (e.g., in Fig. 2).
To ensure in-time delivery of requested segments, the prefetching
window should slide as fast as the playout point. Thus the proxy pe-
riodically slides the prefetching window and sends a new prefetching
request to the server. The server delivers requested segments via a con-
gestion controlled connection to the proxy based on their priorities (i.e.,
layer numbers). For example, it first sends all the requested segments
of layer 0, then those of layer 1, and so on. To loosely synchronize the
prefetching stream with the playback stream, a new prefetching request
preempts the previous one. If the server receives a new prefetching re-
quest before finishing delivery of segments in the previous request, it
ignores the old request and starts to deliver segments in the new request.
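A minimal sketch of this sliding-window, priority-ordered, preemptive request scheme might look as follows. The function and class names are ours, and the segment-index mapping is a stand-in for the playout clock (6 segments/s, consistent with the Section IV parameters).

```python
# Hypothetical sketch of sliding-window prefetching with server-side
# preemption; not the authors' implementation.

def build_prefetch_request(cached, playout_t, window, layers,
                           seg_index=lambda t: int(t * 6)):
    """cached: set of (layer, idx). Returns missing (layer, idx) pairs in
    the window [playout_t, playout_t + window), lowest layers first."""
    lo, hi = seg_index(playout_t), seg_index(playout_t + window)
    missing = [(l, i) for l in range(layers)   # layer number = priority
                      for i in range(lo, hi)
                      if (l, i) not in cached]
    return sorted(missing)                     # layer 0 first, then 1, ...

class Server:
    """A newly arrived prefetching request preempts the previous one."""
    def __init__(self):
        self.pending = []
    def receive(self, request):
        self.pending = list(request)           # old request is discarded
    def send_next(self):
        return self.pending.pop(0) if self.pending else None

srv = Server()
srv.receive(build_prefetch_request({(0, 0)}, 0.0, 1.0, 3))
print(srv.send_next())    # (0, 1): lowest missing-layer segment first
```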
Notice that the average quality improvement of a cached stream af-
ter each playback is determined by the average prefetching bandwidth.
Thus it may take several playbacks for the stream’s quality to reach the
maximum that the clients can receive (assuming no replacement).
C. Replacement Algorithm
Due to the lack of sub-structure in ordinary Web documents, most of
the existing replacement algorithms for Web caching are atomic. They
make binary replacement decisions, i.e., pages are cached or flushed
out in their entirety. Layered encoded streams are naturally structured
into separate layers, and each layer is further divided into equal-size
segments. MCaching exploits this structure to perform fine-grain re-
placement, which allows fine-grain adjustment of the quality of cached
streams, reduces fragmentation of the cache space and improves the
cache efficiency. (See Appendix B for further discussion about replace-
ment granularity.)
Fig. 3. Replacement priority within a cached stream: number of active layers (L0-L3) vs. time, contrasting fine-grain replacement of individual segments of the top layer with coarse-grain replacement of whole layers.
The proxy maintains the popularity of individual layers of each stream
(details about the popularity function can be found in Appendix A).
To maximize the performance of the cache, the proxy always flushes
segments of the least popular layer, called the victim layer. The vic-
tim layer is always the top layer of a cached stream. It is generally
preferred to cache a contiguous portion from the beginning of a layer
to have minimum variations in quality and reduce startup latency [7].
Thus segments of the victim layer are flushed from the end toward the
beginning. Fig. 3 depicts the replacement pattern within a single cached
stream. If flushing all segments of the victim layer does not provide suf-
ficient space, the proxy then identifies a new victim layer and repeats
this process. (Note that this segment-based replacement may result in
thrashing. To avoid this, while a particular stream is played back from
the cache, its active layers are locked in the cache and cannot be replaced
during the playback.) In order to hide startup latency, the first few segments of
the base layer of each cached stream may be kept in the cache for a long
period even after its popularity becomes low. While one can devise
other replacement patterns to optimize other aspects of MCaching per-
formance, no other pattern seems to maximize the quality and minimize
the load on the server simultaneously.
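The sketch below illustrates the victim-layer flushing pattern just described, under simplifying assumptions: a single stream, the top layer taken as the victim, no popularity bookkeeping, and no locking of active layers. Names and parameter values are illustrative.

```python
# Hypothetical sketch of fine-grain replacement: flush segments of the
# victim (top) layer from the end toward the beginning, moving to a new
# victim layer only if more space is still needed.

def free_space(stream, bytes_needed, seg_size=1024, protect_base=4):
    """stream: dict layer -> sorted list of cached segment indices.
    Returns the number of bytes actually freed."""
    freed = 0
    for layer in sorted(stream, reverse=True):   # top layer = victim first
        segs = stream[layer]
        # keep the first few base-layer segments to hide startup latency
        floor = protect_base if layer == 0 else 0
        while segs and len(segs) > floor and freed < bytes_needed:
            segs.pop()                           # flush from the end
            freed += seg_size
        if freed >= bytes_needed:
            break
    return freed

cache_stream = {0: list(range(60)), 1: list(range(60)), 2: list(range(30))}
print(free_space(cache_stream, 40 * 1024))       # empties layer 2, trims 1
print({l: len(s) for l, s in cache_stream.items()})
```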
III. EVALUATION METHODOLOGY
The quality adaptive nature of multimedia streams brings two dimen-
sions to the performance evaluation of multimedia caching:
1. Quality improvement describes how the quality of individual cached
streams is improved, and,
2. Caching efficiency describes how effectively the cache reduces net-
work load and server load.
In addition, the performance of a multimedia cache can be evaluated at
two levels:
Aggregate level: where we study the overall performance of the
cache in response to a sequence of requests.
Per-stream level: where we investigate the behavior of the cache with
respect to individual streams.
Caching efficiency exists in traditional Web caching, and it has been
measured at the aggregate level. Quality improvement is specific to
multimedia caching, and it can be measured at both per-stream and ag-
gregate levels. In the following subsections, we present our evaluation
metrics for these two levels as well as two strawman mechanisms to
compare with MCaching.
In our evaluations, we only refer to the quality of cached streams
instead of the delivered quality to clients. The maximum deliverable
quality only depends on available client bandwidth, and is a constant
target during a single experiment (see Section IV). We then examine
the evolution of the cached quality to see how effectively it matches
with the target deliverable quality.
A. Evaluation Metrics
We present our aggregate and per-stream level metrics separately.
Note that at each level, we need metrics to measure both caching effi-
ciency and quality improvement. Table I classifies all our metrics based
on their granularity and corresponding dimension.

TABLE I
EVALUATION METRICS

              Per-Stream Level                           Aggregate Level
Quality       completeness, continuity, variation        average cached quality
Efficiency    layer hit ratio, prefetching efficiency    byte hit ratio, network load
A.1 Per-Stream Metrics
Several metrics are required to capture various aspects of the caching
mechanism at the per-stream level. We first describe quality-related
metrics, then the metrics for caching efficiency.
A.1.a Stream Quality. There is no well-accepted metric for mea-
suring perceptual quality of a multimedia stream. Instead, we present
three metrics that collectively capture the pattern of changes in num-
ber of layers (i.e., layer add and drop) for a layered encoded stream in
the cache. Notice that the perceptual effect of layer add and drop is
an encoding-specific issue. These metrics enable us to quantitatively
track the evolution of a layered encoded stream in the cache. Thus for
any given encoding, one could analyze the effect of this evolution on
perceptual quality.
Completeness is defined on a per-layer basis and measures the per-
centage of the layer resident in the cache. This metric allows us to trace
the quality evolution for a cached layer. We define a chunk as a continuous
group of segments in a single layer of a cached stream; segments may be
missing due to either packet loss or quality adaptation dropping a layer.
Let $K_{s,l}(r)$ be the set of all chunks of layer $l$ of stream $s$ after
request $r$, let $|c|$ be the length (in segments) of chunk $c$, and let
$N_{s,l}$ be the "official length" of the layer. The completeness of layer
$l$ of cached stream $s$ after an arbitrary request $r$ is defined as the
ratio of the layer size in cache to its official length:

$$C_{s,l}(r) = \frac{\sum_{c \in K_{s,l}(r)} |c|}{N_{s,l}} \qquad (1)$$

Obviously the value of completeness always falls within [0,1]. Notice that
$r$ is not restricted to requests for $s$; thus if $s$ is not in the cache
at the time of $r$, this implies $C_{s,l}(r) = 0$.

Continuity is defined on a per-layer basis and reflects the average
chunk size for a layer. Completeness alone does not reflect the num-
ber of "holes" in a cached layer. For a given value of completeness,
the higher the value of continuity, the lower the number of holes (or
chunks). The continuity of layer $l$ of cached stream $s$ after an arbi-
trary request $r$ is defined as the average size of all chunks in the layer,
normalized by the official layer length so that we can compare continuities
of different streams:

$$Cont_{s,l}(r) = \frac{\mathrm{mean}_{c \in K_{s,l}(r)} |c|}{N_{s,l}} \qquad (2)$$

Note that continuity may exhibit rapid changes due to the random na-
ture of packet loss and layer add and drop.

Variation is defined on a per-layer basis and shows the standard devi-
ation among chunk sizes within a layer. Continuity alone does not pro-
vide any information about the distribution of chunk sizes within a layer.
For a given completeness and continuity, the lower the variation, the
more uniform are the chunk sizes within that layer. The variation of
layer $l$ of cached stream $s$ after the $r$-th request is defined as:

$$V_{s,l}(r) = \mathrm{stddev}_{c \in K_{s,l}(r)} |c| \qquad (3)$$
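Given the set of cached segment indices for one layer, the three metrics above can be computed directly from the chunk lengths. A minimal sketch in our own notation ($N$ is the layer's official length in segments):

```python
# Hypothetical sketch computing the per-layer metrics of Eqs. (1)-(3)
# from the cached segments of one layer.
from statistics import mean, pstdev

def chunks(segments):
    """Group cached segment indices into maximal contiguous chunks."""
    out, run = [], []
    for i in sorted(segments):
        if run and i != run[-1] + 1:
            out, run = out + [run], []
        run.append(i)
    return out + [run] if run else out

def quality_metrics(segments, N):
    ch = [len(c) for c in chunks(segments)]
    completeness = sum(ch) / N                     # Eq. (1), in [0, 1]
    continuity = (mean(ch) / N) if ch else 0.0     # Eq. (2), normalized
    variation = pstdev(ch) if len(ch) > 1 else 0.0 # Eq. (3)
    return completeness, continuity, variation

# A layer of official length 100 with two holes -> three chunks.
print(quality_metrics(set(range(0, 40)) | set(range(50, 70)) | {99}, 100))
```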
The above metrics are defined on a per-layer basis, for every stream
after each request. To use a single metric, we collapse them into per-
stream numbers to show the impact of other parameters, e.g., stream
popularity. For each of these metrics (except variation, the average of
which is meaningless), we define its per-stream version as its average
across all layers in the stream and all requests in an experiment. Per-
stream completeness represents the average quality (in terms of layers)
of a cached stream during the given experiment. Per-stream continu-
ity represents the average fragmentation of all layers within a stream
during the given experiment.
A.1.b Caching Efficiency. We use two metrics to represent the
caching efficiency of MCaching to improve the quality of a single
cached stream.
Layer Hit Ratio measures the efficiency of the caching mechanism in
maintaining segments of each layer. It is in fact a per-layer byte hit
ratio. We define the layer hit ratio of layer $l$ in cached stream $s$
during request $r$ as:

$$H_{s,l}(r) = \frac{B^{cache}_{s,l}(r)}{B^{total}_{s,l}(r)} \qquad (4)$$

where $B^{cache}_{s,l}(r)$ is the number of bytes delivered from the cache, and
$B^{total}_{s,l}(r)$ is the total amount of bytes delivered. When the cache has all
the segments of layer $l$ that the client wants during a request, the layer hit
ratio of layer $l$ is 100%.
Prefetching Efficiency measures the effectiveness of the prefetching
mechanism in delivering higher layers that are missing from the cache.
A prefetched segment might not be played back due to incorrect predic-
tion or late arrival; either case reduces the efficiency of the
prefetching mechanism. We define the prefetching efficiency of layer $l$ in
cached stream $s$ after request $r$ as the portion of prefetched segments
of $l$ that have been played back during the session of $r$:

$$E_{s,l}(r) = \frac{P^{used}_{s,l}(r)}{P^{total}_{s,l}(r)} \qquad (5)$$

where $P^{total}_{s,l}(r)$ is the number of total prefetched bytes, among which
$P^{used}_{s,l}(r)$ bytes arrived in time and were delivered to the requesting
client.
For each of the above two metrics, we define its per-stream version
as its average across all layers in the stream and all cache hits of the
stream in an experiment. Per-stream layer hit ratio captures the overall
effectiveness of the caching mechanism to maintain the cached quality
of a stream at the deliverable quality during an experiment. Per-stream
prefetching efficiency measures the overall effectiveness of prefetching.
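Both efficiency metrics are simple byte ratios that the proxy can accumulate per layer and per request. The sketch below, with hypothetical names and illustrative numbers, shows the computation of Eqs. (4)-(5) and the per-stream averaging just described; it also adopts the convention, noted later in Section VI, of counting prefetching efficiency as 100% when nothing was prefetched.

```python
# Hypothetical sketch of the efficiency metrics of Eqs. (4)-(5); the byte
# counters would be collected by the proxy during each request.

def layer_hit_ratio(bytes_from_cache, bytes_total):
    """Eq. (4): fraction of a layer's delivered bytes served from cache."""
    return bytes_from_cache / bytes_total if bytes_total else 1.0

def prefetching_efficiency(bytes_played, bytes_prefetched):
    """Eq. (5): fraction of prefetched bytes that arrived in time and were
    delivered; counted as 100% when there was no prefetching at all."""
    return bytes_played / bytes_prefetched if bytes_prefetched else 1.0

def per_stream(per_layer_values):
    """Per-stream version: average across all layers and all cache hits."""
    flat = [v for layer in per_layer_values for v in layer]
    return sum(flat) / len(flat)

# 3 layers x 2 cache hits of one stream (illustrative per-layer ratios).
hits = [[1.0, 1.0], [0.8, 0.9], [0.2, 0.6]]
print(per_stream(hits))    # 0.75: the per-stream layer hit ratio
```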
A.2 Aggregate Metrics
The following three aggregate metrics are intended to measure the
overall behavior of a caching mechanism with respect to quality im-
provement and caching efficiency.
Average Cached Quality measures the quality of cached streams av-
eraged over time and across all streams. Let the set of all multimedia
streams be $S$ and the set of all requests during the simulation be $R$. We
define the average quality of the entire cache as the average of per-layer
completeness $C_{s,l}(r)$ across every stream $s \in S$, every layer $l$ and every
request $r \in R$ during an experiment. We only use completeness to measure
average quality because averaging the other quality metrics does not provide
much meaningful information. This metric represents the average quality of
all cached streams during every request in an experiment.
Byte Hit Ratio is used to measure the overall efficiency of a proxy
cache in reducing network traffic. It is defined as the percentage of
bytes delivered from the cache among the total bytes delivered to the
client during an experiment. The difference between per-stream layer
hit ratio and byte hit ratio is that per-stream layer hit ratio is only mea-
sured during cache hits but byte hit ratio takes into account both cache
hits and misses. Hence per-stream layer hit ratio does not reflect the
overall efficiency of the entire cache.
Network Load is the total amount of traffic from the server to the
cache during an entire experiment. It represents how effectively the
cache responds to client requests without introducing additional load to
the network and the server.
B. Strawman Mechanisms
We are not aware of any multimedia proxy caching mechanism
that addresses quality improvement or incorporates congestion con-
trol and quality adaptation (the only one we know about is a commercial
product from Inktomi, for which we do not have any technical details [8]).
Given that MCaching is tightly coupled
with congestion control and quality adaptation, comparison with other
schemes without these two components would be unfair and meaning-
less. Therefore, for the sake of comparison we present two variations
of MCaching as follows:
No Prefetching (NOP): This scheme is similar to MCaching, but
without prefetching and with atomic replacement. Hence the quality
of a cached stream is limited by the bottleneck bandwidth during the
playback from the server and is never improved. Streams are cached
and flushed in their entirety instead of on a per-layer basis;
Off-line Prefetching (OFFP): This scheme is similar to MCaching,
but with atomic replacement and atomic improvement. After the first
request, all the missing segments of the stream are fetched from the
server via a TCP connection (the prefetching TCP connection may overlap
with the RAP connection from the cache to the server during a cache miss).
Thus the proxy holds the highest quality version of the stream that resides
at the server. Streams are cached and flushed in their entirety instead of
on a per-layer basis.
Each of these strawman mechanisms attempts to maximize the per-
formance along only one dimension of the design space. NOP attempts
to achieve higher byte hit ratio by caching a larger number of streams at
the cost of lower quality. In contrast, OFFP provides maximum quality
regardless of available bandwidth to the clients, which determines the
deliverable quality.
IV. SIMULATION SETUP
We evaluate the multimedia caching mechanisms with simulation
using ns-2 [9]. We use RAP [5] (for congestion control) along with
layered quality adaptation [3] as the transport protocol for multimedia
streams. Our simulations do not include any error control mechanism,
i.e., there is no mechanism to repair packet losses (e.g., retransmission
or FEC). However, priority-based prefetching in MCaching is able to
fill the holes caused by packet losses. Adding an error control mech-
anism will certainly speed up the quality improvement process. In the
absence of error control, our results represent the worst case scenarios.
There are two important factors in designing our simulation sce-
narios: 1) request sequence, and 2) the bandwidth distribution among
clients and the location of the bottleneck link. We address these issues
next.
A. Request Sequence
Without any knowledge about access patterns of Internet multimedia
streams, we generate request sequences using a customized version of
the SURGE Web workload generator [10]. Three factors are needed to
generate a request sequence: the number of requests for each stream
(i.e., stream popularity), request ordering and request interval distribu-
tion. We first assume that stream popularity conforms to the Zipf’s law,
which has been observed in various Web traces [11]
9
. Given the num-
ber of total requests and total number of streams , we let the th
popular stream have
requests, where . We then
The only one we know about is a commercial product from Inktomi for which we do not have any
technical details [8].
The prefetching TCP connection may overlap with the RAP connection from the cache to the server
during a cache miss.
Many Web traces do not exactly follow the Zipf’s law, instead they exhibit Zipf-like behavior [11]. For
simplicity we use Zipf’s law in this paper.
BWpc2
BWpc1
Client 1
BWsp
Proxy
Stream
Server
Client 2
Fig. 4. Simulation topology
generate request ordering so that the stack distance of the requests ex-
hibits log-normal distribution with empirical parameters [10]. Finally,
although there are empirical request interval distributions, it is difficult
to apply them to multimedia stream caching because Web pages are
usually much smaller than multimedia streams. Thus using the same
distribution is likely to result in a proliferation of traffic. In the absence
of empirical data about access patterns of Internet multimedia streams,
we chose a uniform distribution from 300 seconds to 400 seconds as
our request interval model. One consequence is that most requests are
sequential as seen by the cache
10
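A sketch of this request-sequence model under the stated assumptions follows; we substitute a plain random shuffle for the log-normal stack-distance ordering of [10], so this is only an approximation of the actual generator, and the counts are illustrative.

```python
# Hypothetical sketch of the request-sequence model: Zipf popularity
# (count of the i-th stream proportional to 1/i) and uniform request
# intervals in [300s, 400s]. A random shuffle stands in for the
# log-normal stack-distance ordering of the SURGE-based generator.
import random

def zipf_counts(total_requests, n_streams):
    w = [1.0 / i for i in range(1, n_streams + 1)]
    s = sum(w)
    # Rounding may perturb the total slightly; fine for a sketch.
    return [round(total_requests * wi / s) for wi in w]

def request_sequence(total_requests, n_streams, seed=1):
    rng = random.Random(seed)
    reqs = [i for i, c in enumerate(zipf_counts(total_requests, n_streams))
              for _ in range(c)]
    rng.shuffle(reqs)                    # stand-in for stack-distance model
    t, out = 0.0, []
    for stream in reqs:
        t += rng.uniform(300, 400)       # request interval model
        out.append((t, stream))
    return out

print(request_sequence(400, 10)[:3])     # first few (time, stream) pairs
```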
B. Network Scenario
Throughout our simulation, we use a simple network topology
(Fig. 4). BWsp denotes the link bandwidth between the server and
the proxy, whereas BWpc1 and BWpc2 are the link bandwidths between the
proxy and the two clients, respectively. When only one client is active, say
client 1, one may construct two interesting scenarios from this simple
topology:
Scenario I: BWsp < BWpc1, the server-proxy connection is the
bottleneck.
Scenario II: BWsp > BWpc1, the proxy-client connection is the
bottleneck.
In scenario II, because the bottleneck is at the client side, the cached
stream quality is usually higher than the deliverable quality and this
leaves no need for further quality improvement. To observe the overall
effect of quality improvement and caching efficiency, we mainly ex-
plore scenario I in our simulations.
We focus on the effect of client bandwidth heterogeneity which is
the key factor that determines the quality of cached streams. In our
simulations, we set , and to 56Kbps ( 1.2 lay-
ers), 1.5Mbps and 128Kbps ( 2.7 layers) respectively. By changing
the distribution of requests between the high bandwidth client and the
low bandwidth client, we are able to continuously traverse from one
extreme—where all clients are low-bandwidth—to the other—where
all clients have high bandwidth connectivity. We tune the request dis-
tribution by the ratio of requests from the low bandwidth client. For
example, a request ratio of 5% means that only 5% of the requests
come from the low bandwidth client and the rest are issued by the high
bandwidth client. This ratio essentially provides a tuning knob for the
average available client bandwidth seen by the cache. Therefore in the
following section, we will use “client bandwidth ratio” and “average
client bandwidth” interchangeably.
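For reference, the layer counts quoted above are consistent with the per-layer consumption rate of 6KB/s (roughly 48Kbps) given in the parameter settings below, assuming the usual bits-per-byte conversion:

$$\frac{56\ \mathrm{Kbps}}{48\ \mathrm{Kbps/layer}} \approx 1.2\ \mathrm{layers}, \qquad \frac{128\ \mathrm{Kbps}}{48\ \mathrm{Kbps/layer}} \approx 2.7\ \mathrm{layers}, \qquad \frac{1.5\ \mathrm{Mbps}}{48\ \mathrm{Kbps/layer}} > 6\ \mathrm{layers}.$$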
We do not impose background traffic on the server-proxy link in any
of our simulations. We have conducted simulations using a self-similar
Web traffic model [12], which proved to be too expensive in terms of
execution time for our simulations with many requests to large mul-
timedia streams. In addition, results of our medium-sized simulations
exhibited the same qualitative trends with and without background traf-
fic (see Appendix C for details).
There are other parameters that affect our simulations, such as seg-
ment size, layer consumption rate (the rate at which the receiver consumes
data in each layer), number of layers, number of streams and stream
length. To focus on the main variables, we limit the number of parame-
ters and assume that all streams have 6 layers, the segment size is 1KB, and
the layer consumption rate is 6KB/s. We expect that changing these
parameters will not qualitatively change our results as long as related
parameters are also changed proportionally. We discuss other parame-
ter settings in the following sections.

Fig. 5. Byte hit ratio of MCaching, NOP and OFFP: (a) byte hit ratio vs. cache size (CacheSize/DataSize, with ClientBW=0.05); (b) byte hit ratio vs. client bandwidth (ratio of low bandwidth requests, with CacheSize/DataSize=0.025).
V. AGGREGATE PERFORMANCE EVALUATIONS
In this section we present aggregate performance evaluations for
MCaching and the two strawman mechanisms (NOP and OFFP). Our
data set consists of 200 streams. The stream popularities follow Zipf's
law, and the most popular stream has 400 requests. Without statis-
tical knowledge about the size distribution of real Internet multimedia
streams, we set the stream lengths to be uniformly distributed between
30 seconds and 3 minutes (the size in bytes can be obtained by multiply-
ing the stream length, the number of layers and the layer consumption
rate). Longer streams can be viewed as combinations of several shorter
streams with the same popularity; hence our results can be scaled with
respect to stream length.
Two major parameters that affect the aggregate behavior of a caching
mechanism are: cache size and client bandwidth. Cache size changes
the caching efficiency whereas client bandwidth controls the quality of
cached streams. Thus, in our simulations we vary the cache size to be
0.025, 0.05, 0.1, 0.2 and 0.4 times of the data set size (i.e., the size of all
streams), and vary the ratio of requests from the low bandwidth client
between 0% and 100%.
A. Byte Hit Ratio
Fig. 5(a) shows byte hit ratio as a function of cache size when client
bandwidth ratio is 0.05. The following discussion applies to all other
client bandwidth ratios that we have tested. We observe that MCaching
always exhibits higher byte hit ratio than both NOP and OFFP. The
smaller the cache size, the larger is the advantage of MCaching. The
culprit is the atomic replacement in OFFP and NOP. Fine-grain replace-
ment is more efficient in utilizing cache space than atomic replacement,
as shown in Appendix B. When cache storage is abundant (e.g., 0.4
times the data set size), there is less contention for cache space and
cache utilization is a less significant factor; hence OFFP is able to
achieve a relatively high byte hit ratio. For smaller cache sizes, how-
ever, contention for cache space becomes strong and cache utilization
becomes dominant. Thus OFFP experiences more cache misses and its
byte hit ratio is low. Notice that the smallest cache size in this simula-
tion (2.5% of the total data set) is not an unreasonable number compared
with the sizes of most current Web caches. (The number of publicly
indexable Web pages in the world is estimated to be about 800 million,
containing about 15TB [13]; this is orders of magnitude larger than the
sizes of most current Web caches. When large multimedia streams become
common, it is reasonable to expect this number to grow even further.)

Fig. 6. (a) Average cached quality of MCaching, NOP and OFFP as a function of cache size (CacheSize/TotalSize) and ratio of low bandwidth requests; (b) network load (MB) vs. cache size (ClientBW=0.05).
Fig. 5(b) shows byte hit ratio as a function of client bandwidth when
cache size is 0.025. Again, the following discussion applies to other
cache sizes as well. When most requests come from the high band-
width client, OFFP’s byte hit ratio increases by 44%. Because OFFP
prefetches all missing layers of a stream after a cache miss, when client
bandwidth is low, higher prefetched layers unnecessarily occupy cache
space. This results in low utilization of cache space, which leads to
a low byte hit ratio. In contrast, MCaching adaptively adjusts its be-
havior based on client bandwidth and only prefetches the required seg-
ments. This partial prefetching results in more efficient utilization of
cache space and is able to achieve high byte hit ratio regardless of the
client bandwidth. Client bandwidth has no effect on the byte hit ratio
of NOP: because in our simulations the server-proxy connection is the
bottleneck, NOP always results in a cached quality that is lower than
the deliverable quality and thus does not waste cache space.
B. Average Cached Quality
Fig. 6(a) depicts the average cached quality as a function of cache
size and client bandwidth ratio. We use a 3D graph because this func-
tion is not homogeneous along either dimension. When cache size is
large, OFFP results in higher average cached quality than MCaching
and NOP. But for smaller cache sizes (e.g., 0.025 times the data set
size), MCaching has higher average cached quality than OFFP. This is
again the result of more efficient cache utilization of MCaching due to
its fine-grain replacement. The average cached quality of NOP is al-
ways lower than the other two, which shows the quality improvement
effect of prefetching in MCaching and OFFP.
Fig. 7. Aggregate behavior of the three multimedia caching mechanisms over the same set of simulation results: (a) average cached quality vs. byte hit ratio; (b) average cached quality vs. network load (MB).
Client bandwidth has no effect on average cached qualities of OFFP
and NOP because they do not adjust quality based on the bandwidth
of the interested clients. MCaching adjusts the average cached quality
according to the available client bandwidth; we examine the relationship
between cached quality and deliverable quality more closely in Section VI.
C. Network Load
Fig. 6(b) shows the network load between the stream server and the
proxy cache as a function of cache size in log scale. The following
conclusion remains the same for different values of client bandwidth
ratios that we have tested. In general, the network overhead of OFFP
is an order of magnitude higher than MCaching. This is the joint effect
of blind prefetching of the entire requested streams and the atomic re-
placement. The network load of MCaching is about 2-5 times higher
than NOP, which is a reasonable cost for its 2-4 times improvement in
the average cached quality over NOP.
D. Summary
Fig. 7(a) summarizes our previous simulation results with respect to
the overall performance in the evaluation space, which uses average
cached quality and byte hit ratio as two dimensions (same as Fig. 1).
Notice that both cache size and client bandwidth change across these
results. Fig. 7(a) verifies our hypothesis that there is a fundamental
tradeoff between cached stream quality and byte hit ratio. The higher
the average quality of cached streams, the smaller the number of cached
streams and the lower the byte hit ratio.
Each of the three mechanisms exploits this tradeoff differently.
OFFP is able to achieve both high average cached quality and high
byte hit ratio when cache storage is abundant, e.g., 0.4 times the data
set size. But for smaller cache sizes, OFFP results in more cache misses
and exhibits both low byte hit ratio and low average cached quality. In
contrast, NOP always results in high byte hit ratio (>65%) at the cost
of significantly lower average cached quality (<1). MCaching exploits
the tradeoff more effectively to improve the performance along both
dimensions. It achieves adequate average cached quality while keep-
ing byte hit ratio always above 70%. Although both MCaching and
OFFP achieve high quality, MCaching can effectively reduce network
load because of its efficient utilization of cache space. This observa-
tion is verified by Fig. 7(b), which maps the same set of simulation
results from a different angle, using quality and network load as two
dimensions. This figure clearly demonstrates that MCaching achieves
similar quality as OFFP at significantly lower network load. NOP has
lower network load than MCaching, but its quality is also reduced by a
similar factor.
Our aggregate performance evaluation presents evidence that per-
formance evaluations of multimedia proxy caching mechanisms should
be conducted with respect to various dimensions of the evaluation
space. Furthermore, any performance assessment should be drawn
from results along various dimensions collectively.
In the next section we focus on the evolution of quality of cached
streams as well as caching efficiency at the per-stream level. Given
the static nature of OFFP and NOP in quality adjustment, we will only
examine MCaching in order to demonstrate the dynamics of quality
adjustment due to fine-grain replacement and partial prefetching.
VI. PER-STREAM PERFORMANCE EVALUATIONS
In this section, we will first illustrate the “micro-level” quality evo-
lution on a per-layer basis. Then based on these observations, we will
discuss the impact of stream popularity and client bandwidth on the
resulting quality of cached streams.
In order to closely track the quality evolution of every cached stream
on a per-layer basis, our data set consists of only 10 streams for our per-
stream performance evaluation. Stream popularities follow Zipf's
law, and the most popular stream has 100 requests. Similar to the aggre-
gate evaluations, stream lengths are uniformly distributed between 30
seconds and 3 minutes. Because we have already discussed the impact
of cache size on the aggregate behavior, we set the cache size to half of
the size of our data set to observe moderate replacement.
A. Per-Layer Quality Evolution
To relate quality evolution to caching and prefetching efficiencies,
we focus on the quality adjustment of a single cached stream. Fig. 8
depicts the evolution of per-layer quality (completeness, continuity and
variation) for the most popular stream during a sequence of requests
where 95% of the requests are from the low bandwidth client. Ar-
rival time of requests for the most popular stream from the high and
low bandwidth clients are shown at the top of each graph as long and
short ticks, respectively. Fig. 8 also shows per-layer layer hit ratio and
prefetching efficiency to demonstrate their correlation with quality evo-
lutions.
After the first request, the cached stream quality is low (70% of the
base layer, 30% of the second layer and 10% of the third layer) due to
the bottleneck link between the server and the cache. Since both clients
can afford higher quality streams, as more requests arrive, prefetching
gradually brings in higher layers. The quality improvement occurs on
a layer-by-layer basis, i.e., the quality of lower layers are improved be-
fore any higher layers. This is most clearly shown in the variation plot.
When a layer is initially fetched on a cache miss, the variation first in-
creases because of random “holes” caused by layer drop or packet loss.
Once a layer is mostly resident in the cache, its completeness reaches
close to 100%, then prefetching starts to fill the holes. This mono-
tonically reduces the variation until it reaches zero. Thus, the peak of
variation for any layer always occurs before that of all the higher layers.
Although the low-bandwidth client can only afford about 2.7 layers,
the higher layers are occasionally prefetched into the cache upon the
arrival of a request from the high bandwidth client. Requests from the
high bandwidth client arrive roughly at times 8000s, 41000s, 55000s,
80500s, 83000s, 90000s and are most clearly visible as spikes in the
per-layer prefetching efficiency plot.
Fig. 8. Evolution of quality and efficiency for the most popular stream over time (seconds), for layers 0-5: (a) per-layer completeness, (b) per-layer continuity, (c) per-layer variation, (d) per-layer prefetching efficiency, and (e) per-layer layer hit ratio.
The lower layers are only prefetched during a few initial requests,
then they reach maximum quality and stay in the cache because of fre-
quent access to these layers. Consequently, their layer hit ratios quickly
reach 100% and remain at the maximum level. In contrast, higher lay-
ers are only needed to serve a request from the high bandwidth client
which is responsible for only 5% of all requests. Therefore, their pop-
ularities (hence status of residency in the cache) are small and heavily
dependent on the temporal distribution of requests. For example, most
of the requests from the high bandwidth client are far apart, thus the
top layer usually does not stay in the cache. The only exception is the
arrival of three requests from the high bandwidth client within the interval
[80000s, 90000s], which prefetched most of the top layer into the cache and
kept it there for a short period.
The prefetching efficiency plot also shows that prefetching efficiency
decreases as the layer becomes more complete, i.e., prefetching is most
efficient in “adding new layers” but not in “filling holes”. For exam-
ple, as layer 4 is filled up, its prefetching efficiency drops from 60% (at
time 8000s) to 14% (at time 55000s). This is because of the rate adap-
tation enforced by the congestion control mechanism, RAP: it is
easier to transfer a continuous stream of data than a burst of prefetched
segments in a short period of time. This behavior can be improved
by fine-tuning of the prefetching mechanism to spread a big burst of
required segments over time within a certain time window. We could
also use a bigger prefetching window to smooth out the variations in
prefetching bandwidth.
The layer hit ratio plot depicts how well the cache satisfies requests
for individual layers without going to the server. While it quickly
reaches 100% for the first three layers, it is very bursty for higher lay-
ers. Note that layer hit ratio does not capture the amount of data sent
during a request. If a layer has only a few segments in the cache, and
these happen to be exactly the segments needed by the quality
adaptation mechanism during a session, the layer hit ratio may increase
sharply for that session.
B. Per-Stream Quality
MCaching adjusts stream quality based on per-layer popularity,
which is significantly affected by two parameters: 1) the popularity of
the entire stream based on its access probability from clients, and 2) the
available bandwidth between the cache and the interested clients. Fig. 9
depicts per-stream completeness and continuity as a function of client
bandwidth (i.e., ratio of low bandwidth requests) for all 10 streams in
our data set. The ratio of requests from the low bandwidth client varies
from 0% to 100%.
This figure reveals two important trends. First, as stream popularity
decreases, the quality of the cached streams drops. The layers of pop-
ular streams are likely to stay in the cache for a longer period because
of frequent requests, hence they have a higher chance to improve their
qualities by prefetching. Furthermore, the cached quality of popular
streams is close to the deliverable quality. For example, the most pop-
ular stream (0) keeps almost all 6 layers in the cache when all requests
come from the high bandwidth client (which can afford all layers), and
it keeps 3.08 layers when all requests come from the low bandwidth
client (which can afford 2.7 layers).
In contrast, unpopular streams have fewer requests and have a slim
chance to remain in the cache. Hence, while these streams reside in the
cache, their qualities are mostly limited by the initial playback from the
server. Since the server-cache connection is the main bottleneck (≈1.2
layers) in our simulations, the quality of unpopular streams is likely to
be low. There are a number of fluctuations in Fig. 9, especially for the
unpopular streams in the continuity plot. Because unpopular streams
have few requests and lower chances for adding layers and filling holes,
the effect of random packet loss and layer drop are more visible in their
continuity measurements.
Second, the impact of client bandwidth on quality is as important as that of
stream popularity.

Fig. 9. Impact of popularity and available bandwidth on cached stream quality under MCaching: (a) per-stream completeness and (b) per-stream continuity, as functions of stream popularity and the ratio of low-bandwidth requests.

For example, when most requests come from the
high bandwidth client who can receive all layers, the completeness of
the most popular stream (i.e., stream 0) is 5.8, which is almost 6 times
higher than that of the least popular stream, 0.98. However, when most
requests come from the low bandwidth client who can afford only 2.7
layers, the completeness (3.08) of the most popular stream is almost the
same as that (2.52) of the least popular stream.
We can explain this phenomenon as follows. When client band-
width is high, most layers of the popular streams are likely to remain in
the cache. This leaves less space for unpopular streams, which conse-
quently results in more flushing and lower quality. When client band-
width is low, none of the streams is likely to keep higher layers in the
cache. This provides more space for the lower layers of the less popular
streams and results in higher per-stream quality for those streams. In
fact, in these simulations the cache size is half of the data set size and
each stream has 6 layers. When client bandwidth is high, the 6 layers
of the 5 most popular streams occupy most of the cache space and leave
no room for other less popular streams. When client bandwidth is low
(2.7 layers in our simulations), all streams are able to keep about half
of their layers in the cache, which are all the required layers.
C. Per-Stream Caching Efficiency
We have shown earlier that per-stream caching efficiency is closely
correlated with per-stream quality. In this section, we study the effect
of the quality of cached streams on per-stream caching efficiency using
per-stream layer hit ratio and prefetching efficiency as evaluation met-
rics. More specifically, we examine the effect of stream popularity and
client bandwidth, which have significant impact on the stream quality.
Fig. 10 shows the impact of these two parameters on per-stream layer
hit ratio and prefetching efficiency.
There are two cases when layer hit ratio reaches 100%: 1) for the
most popular streams regardless of client bandwidth, and 2) for all
streams when most requests come from the low bandwidth client. In
both scenarios, most of the required layers are able to remain in the
cache, which results in high layer hit ratio. As the client bandwidth
increases, or the stream popularity decreases, layer hit ratio decreases
because there is a larger difference between cached quality and maxi-
mum deliverable quality, and more bytes need to be prefetched from
the server. The lowest layer hit ratio is 60%, which is obtained for the
least popular stream with the highest client bandwidth.

Fig. 10. Impact of popularity and available bandwidth on efficiency under MCaching: (a) per-stream layer hit ratio and (b) per-stream prefetching efficiency, as functions of stream popularity and the ratio of low-bandwidth requests.
There exists an almost complementary relation between layer hit
ratio and prefetching efficiency. When layer hit ratio is high, most
bytes are delivered from the cache, and prefetching is rarely needed.
Prefetching efficiency is low because prefetching only occurs in a
bursty fashion to fill some remaining random holes (footnote 14). Prefetching
efficiency increases when layer hit ratio is lower, and becomes the high-
est for streams with mid-range popularities. For example, when client
bandwidth is the highest, the prefetching efficiency of the 4th most popular
stream can be as high as 95%. These streams are not popular enough
to keep all layers always in the cache, but are still popular enough so
that their lower layers remain in the cache. Thus they only need to
prefetch a few higher layers, which can be prefetched in time without
overloading the server-cache bottleneck.
Unpopular streams are exceptions to this complementary behavior
between prefetching efficiency and layer hit ratio. They always have
the lowest prefetching efficiency and the lowest layer hit ratio. There
are different reasons for this behavior depending on client bandwidth.
When the client bandwidth is high, quality of unpopular streams is
likely to be low with frequent variations. If prefetching occurs, it is
more likely that prefetching has to fill the holes in lower layers first
before it adds higher layers. Thus prefetching must smooth out these
random variations and bring in the missing required segments. This
results in relatively rapid variations in the prefetching bandwidth require-
ment. As we discussed in Section VI-A, in such cases prefetching does
not perform efficiently due to the bandwidth regulation by the conges-
tion control protocol. When client bandwidth is low, unpopular streams
are able to remain in the cache and prefetching is only needed to fill
random holes. Prefetching achieves low efficiency in such cases as we
discussed above.
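The layered-decoding constraint (see footnote 16 in the Appendix) also dictates the order in which holes should be prefetched. The following sketch is our own illustration of such an ordering, not the paper's actual prefetching algorithm: it fills earlier playout times first and, at a given playout time, lower layers first, skipping layers above the target quality.

```python
# Hedged sketch: ordering prefetch requests under the layered-decoding
# constraint (a segment of layer j is useful only if the corresponding
# segments of all lower layers are present). Data layout is our assumption.
def prefetch_order(missing, target_layers):
    """missing: set of (layer, segment_index) holes in the cached stream.
    Returns holes sorted so earlier playout times, then lower layers, win."""
    return sorted((hole for hole in missing if hole[0] < target_layers),
                  key=lambda hole: (hole[1], hole[0]))  # time first, then layer

# Toy usage: holes in layers 0-3; target quality is 3 layers.
missing = {(0, 5), (2, 3), (1, 3), (3, 9)}
print(prefetch_order(missing, target_layers=3))
# -> [(1, 3), (2, 3), (0, 5)]  (layer 3 excluded: above the target quality)
```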
Footnote 14: Notice that the prefetching efficiency of the most popular stream is sometimes 100%. This is an artifact: we count prefetching efficiency as 100% when there is no prefetching at all.
D. Summary
Micro-level examination of per-layer quality evolutions reveals that
MCaching improves quality on a layer-by-layer basis starting from the
lowest layer. Quality evolutions are closely correlated with prefetch-
ing efficiency. In particular, prefetching of continuous pieces of data
can be performed more efficiently than sporadic prefetching in a
congestion controlled fashion.
By exploring the effect of popularity and client bandwidth on the
cached stream quality, we have found:
Lower layers that are more frequently requested remain in the cache,
whereas higher layers that are less frequently accessed are prefetched
and flushed in a demand-driven fashion.
As the popularity of a stream increases, its average cached quality
approaches the maximum deliverable quality, which is determined by the
available client bandwidth.
Evaluations of per-stream caching efficiency reveal that the differ-
ence between cached stream quality and the target deliverable qual-
ity determines both per-stream caching efficiency and prefetching effi-
ciency.
VII. RELATED WORK
There have been many comprehensive studies of Web cache per-
formance. Among them, the work in [11] presented a thorough study of
the implications of Zipf's law for cache performance and derived the
asymptotic hit ratio based on this observation. The work in [10] presented
a comprehensive Web workload model. We adopted their model in our
request sequence generator.
Using prefetching to improve Web cache performance has been dis-
cussed before [14], [15]; these studies evaluated strategies to predict
future requests. Prefetching may, however, increase source burstiness,
which has a negative impact on the network; the work in [16] proposed
techniques to reduce this effect by rate-limiting prefetching requests.
These prefetching mechanisms operate at the per-request level while
our prefetching works inside the substructure of a single stream, hence
these schemes complement each other.
There is also significant previous work on Web proxy cache replace-
ment algorithms [17], [18]. The behavior of these replacement algo-
rithms for traditional Web caches is well understood. However, their
behaviors for access patterns with a significant number of requests to
large multimedia streams, especially those with variable quality, have
not been studied. This is partially due to the absence of any proposal
for multimedia Web caching mechanisms.
The work in [19] addresses the implications of the resource requirements
(i.e., bandwidth and space) of multimedia streams for cache replacement
algorithms. Another scheme [7] stores prefixes of multimedia streams
in proxy caches to improve startup latency and provides an opportunity
for smoothing. Our work complements these efforts in that we provide
a comprehensive study of multimedia caching mechanisms focusing on
quality improvement and cache efficiency.
VIII. CONCLUSION
This paper presented an initial attempt to bridge the gap between the
evaluations of traditional Web caching and multimedia proxy caching.
Multimedia proxy caching introduces cached stream quality as a new
dimension of cache performance evaluation space, which is not cap-
tured by existing cache performance metrics, e.g., byte hit ratio. Our
main contribution is a comprehensive framework for performance eval-
uation of multimedia caching mechanisms. In this framework, we pro-
posed evaluation metrics for both stream quality and caching efficiency
at both aggregate level and per-stream level. Using simulations, we
identified the fundamental tradeoff between stream quality improve-
ment and caching efficiency. We showed that compared with alterna-
tive approaches, MCaching effectively exploits this tradeoff by adap-
tively changing the cached stream quality to match the deliverable qual-
ity, therefore maximizing both overall stream quality and caching effi-
ciency. Our simulations also revealed interesting dynamic interactions
among evolution of cached stream quality, prefetching efficiency and
congestion control. Because of these interactions, per-stream caching
efficiency and prefetching efficiency are directly determined by the dif-
ference between cached stream quality and deliverable quality.
Multimedia proxy caching is still a new area of research. We plan
to extend our work in several directions. First, we will leverage the
previous studies on Web cache prefetching and replacement algorithms
and examine the possibility and impact of incorporating compatible
mechanisms into multimedia caching. Second, given the large num-
ber of parameters and the dependencies among them, simulation-based
study alone seems inadequate for a deep understanding of the dynamics of
replacement and its interactions with prefetching. Therefore, we plan
to devise an appropriate analytical model that captures key aspects of
the problem and helps us to better understand the effect of various pa-
rameters on cached stream quality and caching efficiency. Finally, we
plan to gather real Internet multimedia stream access traces and study
their differences from Web page access patterns. These trace data will
also facilitate real-world evaluation of the MCaching mechanism when
its implementation becomes publicly available.
REFERENCES
[1] Reza Rejaie, Haobo Yu, Mark Handley, and Deborah Estrin, “Multimedia proxy caching mechanism
for quality adaptive streaming applications in the Internet,” in Proc. IEEE Infocom, 2000, to appear.
[2] S. Floyd and K. Fall, “Promoting the use of end-to-end congestion control in the Internet,”
IEEE/ACM Transactions on Networking, vol. 7, no. 4, pp. 458–472, Aug. 1999.
[3] R. Rejaie, M. Handley, and D. Estrin, “Quality adaptation for congestion controlled playback video
over the Internet,” Proc. ACM SIGCOMM, Sept. 1999.
[4] R. Rejaie, M. Handley, and D. Estrin, “Architectural considerations for playback of quality adaptive
video over the Internet,” Tech. Rep. 98-686, USC-CS, Nov. 1998.
[5] R. Rejaie, M. Handley, and D. Estrin, “RAP: An end-to-end rate-based congestion control mecha-
nism for realtime streams in the Internet,” Proc. IEEE Infocom, Mar. 1999.
[6] M. Vishwanath and P. Chou, “An efficient algorithm for hierarchical compression of video,” Proc.
IEEE International Conference on Image Processing, Nov. 1994.
[7] S. Sen, J. Rexford, and D. Towsley, “Proxy prefix caching for multimedia streams,” in Proc. IEEE
Infocom, 1999.
[8] Inktomi Inc., “Streaming media caching brief,” 1998.
[9] S. Bajaj, L. Breslau, D. Estrin, K. Fall, S. Floyd, P. Haldar, M. Handley, A. Helmy, J. Heidemann,
P. Huang, S. Kumar, S. McCanne, R. Rejaie, P. Sharma, S. Shenker, K. Varadhan, H. Yu, Y. Xu,
and D. Zappala, “Virtual InterNetwork Testbed: Status and research agenda,” Tech. Rep. 98-678,
University of Southern California, 1998.
[10] P. Barford and M. Crovella, “Generating representative Web workloads for network and server
performance evaluation,” in Proc. ACM SIGMETRICS, June 1998, pp. 151–160.
[11] L. Breslau, P. Cao, L. Fan, G. Phillips, and S. Shenker, “On the implications of Zipf’s Law for Web
caching,” in Proc. IEEE Infocom, 1999.
[12] A. Feldmann, A. C. Gilbert, P. Huang, and W. Willinger, “Dynamics of IP traffic: A study of the
role of variability and the impact of control,” in Proc. ACM SIGCOMM, Boston, MA, Sept. 1999.
[13] Steve Lawrence and Lee Giles, “Accessibility and distribution of information on the Web,” Nature,
vol. 400, pp. 107–109, 1999.
[14] Li Fan, Quinn Jacobson, Pei Cao, and Wei Lin, “Web prefetching between low-bandwidth clients
and proxies: Potential and performance,” in Proc. ACM SIGMETRICS, 1999.
[15] T.M. Kroeger, D. Long, and J. Mogul, “Exploring the bounds of Web latency reduction from caching
and prefetching,” in Proceedings of The USENIX Symposium on Internet Technologies and Systems,
Dec. 1997.
[16] Mark Crovella and Paul Barford, “The network effects of prefetching,” in Proc. IEEE Infocom,
1998.
[17] P. Cao and S. Irani, “Cost-aware WWW proxy caching algorithms,” in Proc. USENIX Symposium on
Internet Technologies and Systems, Dec. 1997, pp. 193–206.
[18] S. Williams, M. Abrams, C. R. Standridge, G. Abdulla, and E. A. Fox, “Removal policies in network
caches for World-Wide Web documents,” in Proc. ACM SIGCOMM, 1996, pp. 293–305.
[19] R. Tewari, H. Vin, A. Dan, and D. Sitaram, “Resource based caching for Web servers,” in Proc. of
SPIE/ACM Conference on Multimedia Computing and Networking, San Jose, 1998.
APPENDIX
I. POPULARITY FUNCTION
To do replacement, the proxy should keep track of the popularity for
each cached stream, i.e., the level of client interest in the stream. We
assume that the total playback time of each stream indicates the level of
client interest. For example, if a client only watches half of a stream,
it contributes half the interest of a client who watches the entire stream.
Based on this observation we extend the semantics of a hit and define
the term weighted hit (whit) as follows (footnote 15):

$$ whit = \frac{T_{playback}}{T_{stream}} \qquad (6) $$

where $T_{playback}$ and $T_{stream}$ denote the total playback time of
a session and the length of the entire stream, respectively; both are
measured in time (e.g., seconds). While the level of client interest does
not affect per-layer popularity, adding and dropping layers by quality
adaptation results in a different $T_{playback}$ for each layer in a
session, and consequently in a different popularity for each cached
layer. For example, even if all layers of a stream are available in the
cache and the client watches the entire stream, quality adaptation may
send Layers 0, 1 and 2 for only 100%, 80% and 50% of the playback time,
respectively.
To capture both the level of client interest and the usefulness of
individual layers in the cache as determined by layer add and drop, the
proxy calculates whit on a per-layer basis for each playback. The total
playback time for each layer is recorded and used to calculate the whit
for that layer at the end of the session. The cumulative value of whit
during a recent window (called the popularity window) is used as the
popularity index of the layer. The popularity of each layer is
recalculated at the end of a session as follows:

$$ P = \frac{\sum_{\text{sessions in window}} whit}{\Delta} \qquad (7) $$

where $P$ and $\Delta$ denote popularity and the width of the popularity
window, respectively. Applying the definition of popularity on a per-layer
basis is compatible with the fine-grain replacement algorithm, because
layered encoding guarantees that the popularity of different layers of
the same stream monotonically decreases with the layer number (footnote
16). Thus a victim layer is always the highest in-cache layer of one of
the cached streams. Notice that the length of a layer does not affect its
popularity, because whit is normalized by the stream length.
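A minimal sketch of this per-layer bookkeeping follows. The class and method names are hypothetical (the text does not specify the proxy's data structures): it records per-layer playback time, computes whit at session end (Eq. 6), maintains the windowed popularity (Eq. 7), and selects a victim as the highest in-cache layer of the least popular cached stream.

```python
# Hedged sketch of per-layer popularity bookkeeping (Eqs. 6 and 7).
import time
from collections import defaultdict, deque

WINDOW = 3600.0  # popularity window width in seconds (illustrative value)

class LayerPopularity:
    def __init__(self):
        # (stream, layer) -> deque of (timestamp, whit) within the window
        self.whits = defaultdict(deque)

    def end_session(self, stream, playback_per_layer, stream_length):
        """Record one session: playback_per_layer maps layer -> seconds played."""
        now = time.time()
        for layer, t_playback in playback_per_layer.items():
            whit = t_playback / stream_length          # Eq. 6
            self.whits[(stream, layer)].append((now, whit))

    def popularity(self, stream, layer):
        """Eq. 7: cumulative whit over the popularity window, divided by width."""
        now = time.time()
        q = self.whits[(stream, layer)]
        while q and q[0][0] < now - WINDOW:            # expire old sessions
            q.popleft()
        return sum(w for _, w in q) / WINDOW

    def victim(self, in_cache_top_layer):
        """Victim = highest in-cache layer of the least popular stream.
        in_cache_top_layer maps stream -> its highest cached layer number."""
        return min(in_cache_top_layer.items(),
                   key=lambda sl: self.popularity(sl[0], sl[1]))
```

Because layered encoding makes per-layer popularity monotonically decreasing within a stream, it suffices for `victim` to compare only each stream's highest cached layer.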
II. REPLACEMENT GRANULARITY
In this section, we first show that compared with atomic replace-
ment, fine-grain replacement improves cache efficiency, then discuss
the tradeoff between replacement granularity and book-keeping over-
head.
Denote the $k$-th segment of the $j$-th layer of the $i$-th stream as
$s_{i,j,k}$. Let the cache size be finite and able to hold $N$ segments.
Let $A$ denote the set of segments remaining in the cache when atomic
replacement is used. We define its asymptotic byte hit ratio as:

$$ H_A = \sum_{s_{i,j,k} \in A} p_{i,j} \, b(s_{i,j,k}) \qquad (8) $$

where $p_{i,j}$ is the popularity of the corresponding layer for segment
$s_{i,j,k}$, and $b(s_{i,j,k})$ is the segment size. Similarly, let $F$
denote the set of segments remaining in the cache when fine-grain
replacement is used. We define its asymptotic byte hit ratio as:

$$ H_F = \sum_{s_{i,j,k} \in F} p_{i,j} \, b(s_{i,j,k}) \qquad (9) $$

Notice that because the atomic replacement algorithm suffers more
from fragmentation than the fine-grain algorithm, we have $|A| \le |F|$.
In order to compare Eqs. 8 and 9, we derive the following properties:
Property 1: The popularity of every segment in $A \setminus F$ is equal
to or less than that of any segment in $F$.
Footnote 15: The term “weighted hit” has been used in the caching literature [17] to take page sizes into account. Here we extend the definition of this term to the context of multimedia stream caching.
Footnote 16: The encoding constraint requires that, in order to decode a segment of a given layer, the corresponding segments of all lower layers must be available.
[Figure 11(a): layer completeness (%) of Layers 0–5 of page 0 at cache 1 over time (0–120,000 s), with self-similar background traffic.]
[Figure 11(b): layer completeness (%) of Layers 0–5 of page 0 at cache 2 over time (0–120,000 s), without self-similar background traffic.]
Fig. 11. Impact of bursty traffic.
This property is proved by contradiction. By design, the fine-grain
replacement algorithm keeps in the cache the layers (hence segments) with
the highest per-layer popularity. The existence of a segment in
$A \setminus F$ that is more popular than some segment in $F$ would
violate this fact.
Property 2: From Property 1, every segment in $A \setminus F$ is no more
popular than any segment in $F \setminus A$; moreover, since
$|A| \le |F|$, we have $|A \setminus F| \le |F \setminus A|$. Therefore
(with fixed-size segments, the second sum on the left is dominated term
by term by the second sum on the right):

$$ H_A = \sum_{s \in A \cap F} p\, b(s) + \sum_{s \in A \setminus F} p\, b(s)
\;\le\; \sum_{s \in A \cap F} p\, b(s) + \sum_{s \in F \setminus A} p\, b(s) = H_F $$

That is, fine-grain replacement achieves an asymptotic byte hit ratio at
least as high as that of atomic replacement. This reasoning can be pushed
to the extreme to favor replacement
based on per-segment popularity. However, there exists a tradeoff be-
tween replacement granularity and book-keeping overhead. Maintain-
ing popularity at a finer granularity may result in a higher cache effi-
ciency, but it also requires higher book-keeping overhead. Given that
multimedia streams usually consist of a very large number of segments,
we believe that per-layer popularity presents the best tradeoff. A
back-of-the-envelope calculation illustrates this point. Let the cache
size be 10GB. Each stream has an average size of 2MB, 5 layers, and a
segment size of 1KB. Every popularity number takes 6 bytes since it is a
floating point number. The per-segment popularity table for the cache is
therefore 60MB; considering that this table is sorted and updated on
every access, it is quite large for most caches. Repeating the
calculation for per-layer popularity gives a popularity table of about
150KB, which is much easier to fit in main memory.
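The arithmetic behind these table sizes, reproduced as a runnable check (all sizes come from the text above; the constant names are our own):

```python
# Book-keeping overhead: per-segment vs. per-layer popularity tables,
# using the sizes stated in the text above.
GB, MB, KB = 1024**3, 1024**2, 1024

cache_size   = 10 * GB
stream_size  = 2 * MB      # average stream size
layers       = 5           # layers per stream
segment_size = 1 * KB
entry_size   = 6           # bytes per popularity number (floating point)

segments = cache_size // segment_size            # ~10.5M segments in cache
per_segment_table = segments * entry_size        # per-segment table bytes
streams = cache_size // stream_size              # ~5K streams in cache
per_layer_table = streams * layers * entry_size  # per-layer table bytes

print(f"per-segment table: {per_segment_table / MB:.0f} MB")   # 60 MB
print(f"per-layer table:   {per_layer_table / KB:.0f} KB")     # 150 KB
```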
III. IMPACT OF BURSTY TRAFFIC
In this section we use per-stream completeness to illustrate the influ-
ence of bursty background traffic. The network topology in this simu-
lation is identical to that described in Section IV. The data set consists of 10 streams, and the stream shown is the most popular stream.
[Figure 12(a): layer completeness (%) of Layers 0–5 of page 0 at cache 2 vs. request number (0–300), with concurrent requests.]
[Figure 12(b): layer completeness (%) of Layers 0–5 of page 0 at cache 2 vs. request number (0–600), without concurrent requests.]
Fig. 12. Impact of concurrent requests.
95% of all requests are from the low bandwidth client. The Web traffic
was generated using the methods described in [12].
Fig. 11 shows the results. We observe that both simulations show
a qualitatively identical trend: when the client bandwidth is low, the
higher layers of the most popular stream can be flushed out. The
difference between the two is that the stream with bursty traffic has
more variable quality; as a result, its higher layers experience more
fluctuation than in the simulation without bursty traffic. The reason is
exactly the highly bursty nature of the self-similar background traffic.
Because of this burstiness, it is difficult to interpret the results in
terms of average available bandwidth when bursty background traffic is
used, whereas a single bottleneck link with fixed bandwidth makes this
task much easier. This is another reason that we chose not to use bursty
background traffic, in addition to its extraordinarily long execution time.
IV. EFFECT OF CONCURRENT REQUESTS
In this section we examine the impact of concurrent requests on qual-
ity improvement of cached streams. The network topology in this sim-
ulation is identical to that described in Section IV. The data set consists
of 10 streams, and the stream shown is the most popular stream. 95%
of all requests are from the low bandwidth client. We set the layer
bandwidth to 2.5KB/s, so that the bottleneck between the cache and
the server can afford 2.8 layers. 80% of one request overlaps with a
subsequent request.
Fig. 12 shows the results. We observe that the results with and
without concurrent requests are qualitatively similar, with two subtle
differences. First, with concurrent requests, it takes longer for layers
4 and 5 to reach maximum quality, because less bandwidth is available
for prefetching on the cache-server bottleneck link. Second, the highest
layer (layer 5) experiences less fluctuation and higher quality, because
the other cached streams have lower quality due to the limited bandwidth
and hence leave more room in the cache for this layer.