USC Computer Science Technical Reports, no. 578 (1994)
On Multimedia Repositories, Personal Computers, and Hierarchical Storage Systems
Shahram Ghandeharizadeh and Cyrus Shahabi
Department of Computer Science
University of Southern California
Los Angeles, California
Abstract
The past decade has witnessed a proliferation of personal computers at homes, businesses, classrooms, libraries, etc. Most often, these systems are used to disseminate information. Recently, multimedia repositories have added to the excitement of this information age by allowing a user to retrieve and manipulate continuous media data types (audio and video objects). The design and implementation of these systems is challenging due to both the large size of the objects that constitute this media type and their continuous bandwidth requirement. Compression, in combination with the availability of fast CPUs for real-time decompression, provides effective support for a continuous display of those objects with a high bandwidth requirement. Hierarchical storage structures (consisting of RAM, disk, and tertiary storage devices) provide a cost-effective solution for the large size of their repositories. The focus of this study is on personal computers (single user, single display) that employ fast CPUs, compression, and hierarchical storage structures to support multimedia applications. Its goals are to ensure a continuous display of audio and video objects while minimizing the latency time observed by the user. Its contributions include a novel pipelining mechanism and PIRATE, a technique to manage the disk resident objects.
Introduction
During the past few years, information technology has evolved to store and retrieve continuous media data types, e.g., audio and video. These systems are expected to play a major role in library information systems, educational applications, etc. A challenging task when implementing these systems is to support a continuous display of video objects. This is because:
1. Video objects require a high bandwidth for their continuous display. For example, the bandwidth required by NTSC for network-quality video is about megabits per second (mbps) [Has]. Recommendation of the International Radio Consultative Committee (CCIR) calls for a mbps bandwidth for video objects [Fox]. A video object based on HDTV requires a bandwidth of approximately mbps. One may employ compression (lossless or lossy; see [Fox] for an overview), declustering [GR], striping [TPBG], or a combination of these techniques [BGMJ] to support the bandwidth requirement (i.e., a continuous display) of video objects.
2. Video objects are large in size. For example, a minute uncompressed video clip based on NTSC is approximately gigabytes in size. With a compression technique that reduces the bandwidth requirement of this object to mbps, this object is approximately megabytes in size. A repository (e.g., corresponding to an encyclopedia) that contains hundreds of such clips is terabytes in size.
Footnote: This research was supported in part by the National Science Foundation under grants IRI, IRI (NYI award), and CDA, a Hewlett-Packard research grant, and a research grant from AT&T/NCR/Teradata.
Footnote: The sustained bandwidth of the current magnetic disk technology is typically rated between to mbps.
The low-end market place (personal computers, workstations) employs lossy compression techniques (e.g., MPEG [Gal]) to reduce both the size and the bandwidth requirement of video objects. For example, MPEG ensures that the bandwidth requirement of a video clip does not exceed mbps. The aim of MPEG is coding for up to mbps, with the further goal of higher-quality presentations [ARA]. While these techniques reduce the bandwidth requirements of video objects, the large size of repositories containing video objects remains significant (see the bullet on object size above). To reduce the cost of storage, the storage manager of these systems is expected to be hierarchical, consisting of memory, magnetic disk drives, and one or more tertiary storage devices (e.g., Digital Audio Tape, CD-ROM, etc.). In order to simplify the discussion, assume that the system consists of one disk drive and one tertiary storage device. As the different levels of this hierarchy are traversed (starting with memory), the density of the medium (the amount of data it can store) and its latency increase, while its transfer rate and cost per megabyte decrease. At the time of this writing, the cost ranges from /megabyte of memory to /megabyte of disk storage to less than /megabyte of a tertiary storage device. The database resides permanently on the tertiary storage device. The disk storage is used as a temporary staging area for the frequently accessed objects in order to minimize the number of references to the tertiary. An application referencing an object that is disk resident observes both the average latency time and the transfer rate of a magnetic disk drive, which are superior to those of a tertiary storage device.
Footnote: by reducing the quality of presentation.
In order for a system based on hierarchical storage structures to be useful on a day-to-day basis, the system should provide a fraction of a second latency time while ensuring a continuous display of both audio and video objects. To accomplish these objectives, the available resources should be utilized in an intelligent manner. Assuming a single user system that displays one object at a time, the contributions of this paper include:
1. The design of a pipelining mechanism that employs both the magnetic disk and tertiary store for a display. It overlaps the display of a portion of an object from the disk with the materialization of its remainder from the tertiary store. This minimizes the fraction of each object that should be disk resident in order for a user referencing that object to observe the disk latency time. Given a fixed size disk drive, this mechanism allows a larger number of objects to become disk resident, minimizing the average latency time of the system.
2. The PIRATE replacement policy, which manages the disk resident objects in a manner that minimizes the average latency time and/or the variance in latency observed by a user referencing objects. PIRATE complements pipelining in order to maximize the utilization of both the disk and the tertiary store while hiding both the high latency time and the low bandwidth of tertiary from the user.
The rest of this paper is organized as follows. We first describe our target environment and a framework that ensures a continuous display of an object from the magnetic disk drive. This discussion is not detailed because it is almost identical to that of several other studies, e.g., [TPBG], [RV], [CL]. We then extend this framework to employ the tertiary store in a pipelined manner in order to overlap the display of a portion of an object with its materialization. Subsequently, we present the PIRATE replacement policy. Our conclusions and future research directions are contained in the final section.
Figure: Target platform (memory, tertiary, display, disk).
An assumption of this study is that the bandwidth requirement of an object does not exceed the bandwidth of the magnetic disk drive. For those objects that violate this assumption, the techniques proposed by [BGMJ] are appropriate.
Display of Continuous Media
Our target environment (see the target platform figure) is a typical personal computer that consists of a display, some memory, a disk drive, and a tertiary storage device. We assume that memory serves as an intermediate staging area between the disk drive and the tertiary storage device, enabling both the tertiary and the disk to produce data simultaneously. (An object is not required to be disk resident in order to be displayed; a portion of it can be displayed from tertiary.) We assume that updates to the database are infrequent operations. Example environments include library information systems (e.g., multimedia books and encyclopedias) and the entertainment industry (short documentaries on a specific topic).
In order to support a continuous display of an object (say X) from the disk drive, X is split into n fixed size blocks: $X_1, X_2, \ldots, X_n$. A block represents a contiguous portion of an object and determines the unit of transfer from the disk drive. Its size is determined at system configuration time. In order to ensure a continuous display of X, the system reads one block of X (say $X_1$) into memory. It allows the disk drive to be time shared among other requests while $X_1$ is being displayed, as long as block $X_2$ is staged in memory before the display of $X_1$ completes. The size of a block and its bandwidth requirement determine the number of simultaneous displays that can be supported by the disk subsystem. The memory required with this paradigm is one block. One may reduce this memory requirement by employing techniques such as those described in [NY] (extensions are required to [NY] due to disk idiosyncrasies: zones, thermal recalibration [RW]).
Pipelining to Minimize Latency Time
When a user references an object that is not disk resident, one approach might materialize the object on the disk drive in its entirety before initiating its display. In this case, the latency time observed by the user is a function of the bandwidth of the tertiary storage device and the size of the referenced object. This latency time can be reduced using pipelining. Briefly, the pipelining mechanism groups the blocks of object X into s logical slices ($S_{X,1}, S_{X,2}, S_{X,3}, \ldots, S_{X,s}$) such that the display time of $S_{X,1}$, $T_{Display}(S_{X,1})$, eclipses the time required to materialize $S_{X,2}$, $T_{Materialize}(S_{X,2})$; $T_{Display}(S_{X,2})$ eclipses $T_{Materialize}(S_{X,3})$; etc. This ensures a continuous display while reducing the latency time because the system initiates the display of an object once a fraction of X (i.e., $S_{X,1}$) is disk resident.
We describe the pipelining mechanism for two possible cases: the bandwidth of tertiary is either lower or higher than the bandwidth required to display an object.
The ratio between the production rate of tertiary and the consumption rate at a display station is termed the Production Consumption Ratio ($PCR = B_{Tertiary} / B_{Display}$). When $PCR < 1$ ($PCR > 1$), the production rate of tertiary is lower (higher) than the consumption rate of a display station. Consider each case in turn.
$PCR < 1$
Let an object X consist of n blocks. The time required to materialize X, $T_{Materialize}(X)$, is $\lceil n \times size(block) / B_{Tertiary} \rceil$, while its display time, $T_{Display}(X)$, is $\lceil n \times size(block) / B_{Display} \rceil$. The time required to materialize an object is greater than its display time. With pipelining, a portion of the time required to materialize X can be overlapped with its display. This is achieved by splitting X into s logical slices ($S_{X,1}, S_{X,2}, S_{X,3}, \ldots, S_{X,s}$) such that $T_{Display}(S_{X,1})$ eclipses $T_{Materialize}(S_{X,2})$, $T_{Display}(S_{X,2})$ eclipses $T_{Materialize}(S_{X,3})$, etc. Thus:
$T_{Display}(S_{X,i}) \geq T_{Materialize}(S_{X,i+1})$ for $1 \leq i < s$
Upon the retrieval of a tertiary resident object X, the pipelining mechanism is as follows:
1. Materialize the blocks that constitute $S_{X,1}$ on the disk drive.
2. For i = 2 to s do
   a. Initiate the materialization of $S_{X,i}$ from tertiary onto the disk.
   b. Initiate the display of $S_{X,i-1}$.
3. Display the last slice ($S_{X,s}$).
Footnote: The discussion for the case when the bandwidth of tertiary is equivalent to that of display is a special case of item.
Figure: The pipelining mechanism (labels: MATERIALIZE, DISPLAY, Overlap Period, STEP 1, STEP 2.a, STEP 2.b; blocks 1, 2, 3, ..., n-1, n).
The duration of Step 1 determines the latency time of the system. During Step 2, while the subsequent slices are materialized from tertiary, the disk resident slices are displayed. Steps 2a and 2b correspond to two different processes that execute in parallel. While no constraints are imposed on Step 2a, the amount of required memory is minimized when a block is flushed onto the disk drive as soon as it becomes memory resident. Step 3 displays the last slice materialized on the disk drive. In order to minimize the latency time, $S_{X,s}$ should consist of a single block. To illustrate this, consider the figure of the pipelining mechanism. If the last slice consists of more than one block, then the duration of the overlap is reduced, resulting in a longer duration of time for Step 1.
The amount of time required for each step is computed as follows. Since $S_{X,s}$ consists of a single block, the duration of Step 2 corresponds to the time required to display $n-1$ blocks from the disk drive, i.e., $(n-1) \times \frac{size(block)}{B_{Display}}$. Therefore, the portion of the object materialized by Step 2 is $\lfloor PCR \times (n-1) \times size(block) \rfloor$. The remainder of the object must constitute $S_{X,1}$: $size(S_{X,1}) = n - \lfloor PCR \times (n-1) \rfloor$ ($size(S_{X,1})$ is in the granularity of blocks). The latency time of the system (i.e., the duration of Step 1) is determined by the time required to reposition the read head of tertiary to the appropriate physical location corresponding to $S_{X,1}$ and render $S_{X,1}$ disk resident, i.e.,
$\frac{size(block)}{B_{Tertiary}} \times size(S_{X,1})$
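The slice sizing described above can be sketched in a few lines of Python. This is a minimal sketch rather than the authors' implementation: the function names and the `reposition_s` parameter are hypothetical, and the `n - 1` term is taken from the statement that the last slice consists of a single block.

```python
import math


def first_slice_blocks(n: int, pcr: float) -> int:
    """Blocks in the first slice when tertiary is slower than display (PCR < 1).

    Assumes the last slice is a single block, so Step 2 lasts as long as the
    display of n - 1 blocks, during which floor(pcr * (n - 1)) blocks are
    materialized from tertiary. The rest must be disk resident beforehand.
    """
    if n < 1 or not 0 < pcr < 1:
        raise ValueError("expects n >= 1 and 0 < pcr < 1")
    return n - math.floor(pcr * (n - 1))


def pipelined_latency(n, pcr, block_bits, b_tertiary_bps, reposition_s=0.0):
    """Seconds before display starts: reposition time plus Step 1."""
    return reposition_s + first_slice_blocks(n, pcr) * block_bits / b_tertiary_bps


def unpipelined_latency(n, block_bits, b_tertiary_bps, reposition_s=0.0):
    """Seconds to materialize the entire object before display starts."""
    return reposition_s + n * block_bits / b_tertiary_bps
```

With illustrative values (9 blocks of 8,000,000 bits, a 1,000,000 bits-per-second tertiary device, and PCR = 0.25), the first slice holds 7 blocks, so pipelining hides the materialization of only 2 blocks. The benefit grows as PCR approaches 1.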
Example: Assume that object X is a one minute video clip with $B_{Display}$ = mbps. Thus, X is megabytes in size. Assume a inch, gigabyte disk drive that consists of cylinders, where size(block) = size(cylinder) = bytes. Hence, X consists of blocks. If the bandwidth of tertiary is mbps, then PCR = . If X is tertiary resident, without pipelining a user referencing X observes a latency time of minutes. With pipelining, this latency time is reduced to minutes (a improvement).
Footnote: Maximize the duration of the pipeline.
$PCR > 1$
In this case, the bandwidth of tertiary exceeds the bandwidth required to display an object. Therefore, the constraint $T_{Display}(S_{X,i}) \geq T_{Materialize}(S_{X,i+1})$ is satisfied when each slice of an object consists of a single block. Two alternative approaches can be employed to compensate for the fast production rate: either (1) multiplex the bandwidth of tertiary among several requests referencing different objects, or (2) increase the consumption rate of an object by flushing the blocks to the disk drive at a faster rate. The first approach wastes the bandwidth of the tertiary storage device by requiring it to reposition its read head multiple times. Moreover, its benefits might be marginal at best due to the significant overhead incurred when repositioning the read head of a tertiary storage device. (This repositioning is due to the load and unload time of the medium containing the referenced data; e.g., with a Hewlett-Packard rewritable optical disk library, the average time required to load and unload a platter is typically seconds.) With the second approach, the pipelining algorithm is the same as that described for $PCR < 1$, with the difference that the display of X starts once its first block becomes disk resident.
The blocks produced by the tertiary can be flushed during the display of a block (Step 2b) as long as the following constraint is satisfied:
$\frac{size(block)}{B_{Display}} \geq (\lfloor PCR \rfloor + 1) \times S_{Disk}$
where $S_{Disk}$ denotes the disk service time: $S_{Disk} = \frac{size(block)}{tfr} + max\_seek + max\_latency$. In the worst case, the maximum amount of memory required is $(PCR - \lfloor PCR \rfloor) \times size(block)$. If the constraint posited in this equation is violated, then extra memory is required, as the available disk bandwidth is insufficient to both support a continuous display and consume the blocks produced by tertiary. In the worst case, the memory requirement of this technique is:
$\left( (\lfloor PCR \rfloor + 1) \times S_{Disk} - \frac{size(block)}{B_{Display}} \right) \times B_{Tertiary} \times n$
where n is the number of blocks that constitute the referenced object. One may minimize this memory requirement by forcing the tertiary to sit idle periodically (this wastes the bandwidth of tertiary).
PIRATE Replacement Policy
Upon the retrieval of a tertiary resident object (say Z), if the storage capacity of the disk drive is exhausted, then the system must replace one or more objects (victims) in order to allow Z to become disk resident. A standard approach, termed Atomic, might replace each of the victim objects in their entirety, requiring each object to be either completely disk resident or not disk resident at all. With the PartIal ReplAcement TEchnique (PIRATE), the system chooses a larger number of objects as victims; however, it replaces a portion of each of its victims in order to free up sufficient space for the blocks of Z. The inputs to PIRATE include: (1) the size and frequency of access of each object X in the database, termed size(X) and heat(X) respectively [CABK], (2) a set of objects with a disk resident fraction (except Z), denoted F, and (3) the size of the object referenced by the pending request, size(Z). Its side effect is that it makes enough of the disk space available to accommodate Z.
PIRATE deletes blocks of an object one at a time, starting with those that constitute the tail end of the object. For example, if PIRATE decides to replace those blocks that constitute minutes of a minute video clip, it deletes those blocks that represent the last minutes of the clip, leaving the first minutes disk resident. The number of blocks that constitute the first portion of X is denoted disk(X), while the number of its deleted (non disk resident) blocks is termed absent(X): $absent(X) = size(X) - disk(X)$. Note that the granularity of absent(X), size(X), and disk(X) is in blocks. The following section provides a formal statement of the replacement problem and PIRATE as a solution for a single display system.
Formal Statement of the Problem
The portion of disk space allocated to continuous media data types consists of C blocks. The database consists of m objects $\{o_1, \ldots, o_m\}$ with heats $heat(o_j)$ satisfying $\sum_{j=1}^{m} heat(o_j) = 1$, and sizes $size(o_j) < C$ for all $1 \leq j \leq m$. The size of the database exceeds the storage capacity of the system, i.e., $\sum_{j=1}^{m} size(o_j) > C$. Consequently, the database resides permanently on the tertiary storage device, and objects are swapped in and out from the disk. We assume that the size of each object is smaller than the storage capacity of the disk drive ($size(o_j) < C$ for $1 \leq j \leq m$). Moreover, to simplify the discussion, we assume that the tertiary is not required to change tapes/platters or reposition its read head once it starts to transfer an object. Assume a process that generates requests for objects in which object $o_j$ is requested with probability $heat(o_j)$, all independent. We assume no advance knowledge of the possible permutation of requests for different objects.
Let F denote the set of objects with a disk resident fraction, except the one that is referenced by the pending request, with $size(F) = \sum_{x \in F} disk(x)$. Moreover, assuming a new request arrives referencing object Z ($F = F - \{Z\}$), we define free_disk_space as $C - (size(F) + disk(Z))$. If $absent(Z) \leq free\_disk\_space$, then no replacement is required. In this study, we focus on the scenario where replacement is required, i.e., $absent(Z) > free\_disk\_space$.
We define the latency time observed by a request referencing Z, $\ell(Z)$, as the amount of time elapsed from the arrival of the request to the onset of the display. It is a function of disk(Z) and $B_{Tertiary}$. If $disk(Z) \geq size(S_{Z,1})$, then the maximum value for $\ell(Z)$ is the worst reposition time of the tertiary storage device. One may reduce this latency time to zero by incrementing $size(S_{Z,1})$ with the amount of data corresponding to this time, i.e., $\frac{worst\_reposition\_time \times B_{Display}}{size(block)}$. This optimization is assumed for the rest of this paper. If $disk(Z) \geq size(S_{Z,1})$, then $\ell(Z) = 0$ due to the assumed optimization. Otherwise (i.e., $disk(Z) < size(S_{Z,1})$), the system determines the starting address of the non disk resident portion of Z (missing), and $\ell(Z)$ is defined as the total sum of: (1) the repositioning of tertiary to the physical location corresponding to missing, and (2) the materialization time of the remainder of the first slice, $\frac{size(block)}{B_{Tertiary}} \times (size(S_{Z,1}) - disk(Z))$. The average (expected value) of latency as a function of requests can be defined as:
$\mu = \sum_{x} heat(x) \times \ell(x)$
The variance is:
$\sigma^2 = \sum_{x} heat(x) \times (\ell(x) - \mu)^2$
By deleting a portion of an object, we may increase its latency time, resulting in a higher $\mu$ and $\sigma^2$. However, once the disk capacity is exhausted, deletion of an object is unavoidable. In this case, it is desired for some x in F to reduce disk(x) such that enough disk space becomes available to render object Z disk resident in its entirety. The problem is how to determine those x and their corresponding fractions to be deleted to minimize both the average latency time and its variance. Unfortunately, minimizing the average latency time might increase the variance and vice versa. In the next section, we present simple PIRATE and demonstrate that it minimizes the average latency. Subsequently, extended PIRATE is introduced as a generalization of simple PIRATE with a knob that can be adjusted by the user to tailor the system to strike a compromise between these two objectives.
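The latency definition and the heat-weighted average and variance above can be illustrated with a small sketch. The function names, the `block_time_s` parameter (standing for the time to materialize one block from tertiary), and the sample values are assumptions made for illustration. The sketch also assumes the optimization under which a request observes zero latency once its first slice is disk resident.

```python
def latency(disk_blocks, first_slice_blocks, block_time_s, reposition_s=0.0):
    """Latency of a request, given how much of the first slice is on disk."""
    if disk_blocks >= first_slice_blocks:
        return 0.0  # first slice fully disk resident: display starts at once
    missing = first_slice_blocks - disk_blocks
    return reposition_s + block_time_s * missing


def latency_stats(heat, lat):
    """Heat-weighted mean and variance of latency over all objects.

    `heat` maps each object to its access probability (summing to 1) and
    `lat` maps each object to the latency a request for it would observe.
    """
    mean = sum(heat[x] * lat[x] for x in heat)
    variance = sum(heat[x] * (lat[x] - mean) ** 2 for x in heat)
    return mean, variance


# Two equally hot objects: X has its first slice on disk, Y has nothing.
heat = {"X": 0.5, "Y": 0.5}
lat = {"X": latency(4, 4, 1.0), "Y": latency(0, 4, 1.0)}
mean, variance = latency_stats(heat, lat)
```

Because both quantities are weighted by heat, removing blocks from a rarely requested object changes them less than removing the same number of blocks from a popular one.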
Simple PIRATE
The figure below presents simple PIRATE. Logically, it operates in two passes. In the first pass, it deletes from those objects (say i) whose disk resident portion is greater than the size of their first slice ($S_{i,1}$). By doing so, it can ensure a zero latency time for requests that reference these objects in the future by employing the pipelining mechanism described earlier. Note that PIRATE deletes objects at a granularity of a block. Moreover, it frees up only sufficient space to accommodate the pending request and no more than that. For example, if absent(Z) is equivalent to a fraction of size(X) and X is chosen as the victim, then only the blocks corresponding to that last fraction of X are deleted in order to render Z disk resident.

    define pfs: potential_free_space
    define rds: required_disk_space
    rds = absent(Z) - free_disk_space
    repeat
        victim = object i from set F with the lowest heat and disk(i) > size(S_{i,1})
        if victim is NOT null then
            pfs = disk(victim) - size(S_{victim,1})
        else
            victim = object i from set F with the lowest heat
            pfs = disk(victim)
        if pfs > rds then
            disk(victim) = disk(victim) - rds
            rds = 0
        else
            disk(victim) = disk(victim) - pfs
            rds = rds - pfs
    until rds = 0

Figure: Simple PIRATE
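The two passes of simple PIRATE can be sketched in Python as follows. This is an illustrative reading of the pseudocode, not the authors' code: the dictionary-based data model and the function name are assumptions, and the second pass here skips objects that have no disk resident blocks left (a guard that the figure does not spell out) so that the loop always makes progress.

```python
def simple_pirate(disk, heat, first_slice, required):
    """Free `required` blocks and return the new disk-resident block counts.

    disk[x]        -- number of blocks of object x currently on disk
    heat[x]        -- access frequency of object x
    first_slice[x] -- size of the first slice of x, in blocks
    """
    disk = dict(disk)  # work on a copy; the caller's mapping is untouched
    rds = required     # required disk space still to be freed
    while rds > 0:
        # Pass 1: trim objects that hold more than their first slice.
        candidates = [x for x in disk if disk[x] > first_slice[x]]
        if candidates:
            victim = min(candidates, key=lambda x: heat[x])
            pfs = disk[victim] - first_slice[victim]
        else:
            # Pass 2: delete from the coldest object that still has blocks.
            candidates = [x for x in disk if disk[x] > 0]
            if not candidates:
                raise ValueError("not enough disk resident blocks to free")
            victim = min(candidates, key=lambda x: heat[x])
            pfs = disk[victim]
        freed = min(pfs, rds)  # never free more than is still required
        disk[victim] -= freed  # blocks are removed from the tail of the victim
        rds -= freed
    return disk


result = simple_pirate(
    disk={"X": 10, "Y": 6, "W": 4},
    heat={"X": 0.5, "Y": 0.3, "W": 0.2},
    first_slice={"X": 4, "Y": 6, "W": 4},
    required=8,
)
```

In this example the first pass trims X down to its first slice (freeing 6 blocks), and the second pass takes the remaining 2 blocks from W, the object with the lowest heat.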
If the disk space made available by the first pass is insufficient, then simple PIRATE enters its second pass. This pass deletes objects starting with the one that has the lowest heat, following the greedy strategy suggested for the fractional knapsack problem [CLR]. One might argue that a combination of heat and size should be considered when choosing victims. However, absent(Z) blocks (where Z is the object required to become disk resident) are required to be deleted from disk, independent of the size of the victims. The following proof formalizes this statement and proves the optimality of simple PIRATE in minimizing the latency time of the system.
Lemma: To minimize the average latency time of the system during pass 2, PIRATE must delete those blocks corresponding to the object with the lowest heat, independent of the object sizes.
Proof: Without loss of generality, assume $F = \{X, Y\}$, $size(block) = 1$, and $B_{Tertiary} = 1$. Assume a request arrives at $t_0$ referencing Z and the disk capacity is exhausted. Let $t_1$ be the time when a portion of X and/or Y is deleted. We define $\mu_0$ ($\mu_1$) to be the average latency at $t_0$ ($t_1$), following the definition of average latency given earlier. Subsequently, $disk_0(i)$ and $disk_1(i)$ represent the disk resident fraction of object i at time $t_0$ and $t_1$, respectively. Let $\delta_i$ denote the number of blocks of object i deleted from disk at time $t_1$, i.e., $\delta_i = disk_0(i) - disk_1(i)$. By deleting X and/or Y partially, we increase the average latency by $\Delta = \mu_1 - \mu_0$. However, since deletion is unavoidable, the objective is to minimize $\Delta$ while $absent(Z) = \delta_X + \delta_Y$.
$\mu_0 = heat(X) \times (size(S_{X,1}) - disk_0(X)) + heat(Y) \times (size(S_{Y,1}) - disk_0(Y))$
$\mu_1 = heat(X) \times (size(S_{X,1}) - disk_1(X)) + heat(Y) \times (size(S_{Y,1}) - disk_1(Y))$
Thus:
$\Delta = heat(X) \times \delta_X + heat(Y) \times \delta_Y$
$\Delta = heat(X) \times \delta_X + heat(Y) \times (absent(Z) - \delta_X)$
$\Delta = heat(Y) \times absent(Z) + \delta_X \times (heat(X) - heat(Y))$
Since $heat(Y) \times absent(Z)$ and $heat(X) - heat(Y)$ are constants, in order to minimize $\Delta$ we can only vary $\delta_X$ (this impacts $\delta_Y$ because $absent(Z) = \delta_X + \delta_Y$). If $heat(X) > heat(Y)$, then $heat(X) - heat(Y)$ is a positive value; hence, in order to minimize $\Delta$, the value of $\delta_X$ should be minimized (i.e., the object with higher heat, X, should not be replaced). On the other hand, if $heat(X) < heat(Y)$, then $heat(X) - heat(Y)$ is a negative value; hence, in order to minimize $\Delta$, the value of $\delta_X$ should be maximized (i.e., the object with lower heat, X, should be replaced). This demonstrates that the amount of data deleted from victims ($\delta_i$) in order to free up disk space depends only on heat(i) and not size(i).
Extended PIRATE
Extended PIRATE is a generalization of simple PIRATE that can be customized to strike a compromise between the two goals: to minimize either the average latency time of the system or the variance in the latency time. The major difference between simple and extended PIRATE is as follows. Extended PIRATE (see the figure below) requires a minimum fraction, termed least(x), of the most frequently accessed objects to be disk resident. Logically, extended PIRATE operates in three passes. Its first pass is identical to that of simple PIRATE. If this pass fails to provide sufficient disk space for the referenced object, then during the second pass it deletes from objects until each of their disk resident portions corresponds to least(x). If pass two fails (i.e., provides insufficient space to materialize the referenced object), it enters pass 3. This pass is identical to pass 2 of simple PIRATE, where objects are deleted in their entirety starting with the one that has the lowest heat.
Footnote: In computing the average latency, we ignore the latency of the other objects in the database as well as the repositioning time of the tertiary. This is because it only adds a constant to both the average latency before the deletion and the average latency after it, which is eliminated when their difference is computed.

    define pfs: potential_free_space
    define rds: required_disk_space
    rds = absent(Z) - free_disk_space
    repeat
        victim = object i from set F with the lowest heat and disk(i) > size(S_{i,1})
        if victim is NOT null then
            pfs = disk(victim) - size(S_{victim,1})
        else
            victim = object i from set F with the lowest heat and disk(i) > least(i)
            if victim is NOT null then
                pfs = disk(victim) - least(victim)
            else
                victim = object i from set F with the lowest heat
                pfs = disk(victim)
        if pfs > rds then
            disk(victim) = disk(victim) - rds
            rds = 0
        else
            disk(victim) = disk(victim) - pfs
            rds = rds - pfs
    until rds = 0

Figure: Extended PIRATE
With extended PIRATE, least(X) for each disk resident object is defined as follows:
$least(X) = \min(\lceil knob \times heat(X) \times size(S_{X,1}) \rceil, size(S_{X,1}))$
where knob is an integer whose lower bound is zero. The minimum function prevents least(x) from exceeding the size of the first slice. When knob = 0, extended PIRATE is identical to simple PIRATE. As knob increases, a larger portion of each object becomes disk resident. Obviously, the ideal case is to increase the knob until the first slice of all the objects becomes disk resident. However, due to the limited storage capacity, this might be infeasible. By increasing the knob, we force a portion of some objects with lower heat to remain disk resident at the expense of deleting from objects with a high heat. By providing each request referencing an object a latency time proportional to the heat of that object, extended PIRATE improves the variance while not increasing the average dramatically. There is an optimal value for knob that minimizes the variance. If the value of knob exceeds this value, then the variance starts to increase also.
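The definition of least can be made concrete with a short sketch. The function and parameter names are hypothetical, and the sample heats and slice sizes are chosen only to show how the knob shifts the protected portion of an object.

```python
import math


def least(knob: int, heat: float, first_slice_blocks: int) -> int:
    """Minimum number of blocks of an object that extended PIRATE protects.

    Grows with knob and heat, and is capped at the size of the first slice.
    """
    if knob < 0:
        raise ValueError("knob has a lower bound of zero")
    return min(math.ceil(knob * heat * first_slice_blocks), first_slice_blocks)


# knob = 0 protects nothing, which matches the behavior of simple PIRATE.
protected_at_zero = least(0, 0.5, 10)
# A moderate knob protects part of the first slice, rounded up to a block.
protected_partial = least(3, 0.25, 10)
# A large knob is capped at the size of the first slice.
protected_capped = least(4, 0.5, 10)
```

Since the result is rounded up to whole blocks, any object with a nonzero heat keeps at least one block once knob is positive.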
Lemma: The optimal value for knob is $\frac{C}{Avg\_Slice_1}$.
Proof: Let U be the total number of unique objects that are referenced over a period of time. We define:
$Avg\_Slice_1 = \sum_{x} heat(x) \times size(S_{x,1})$
$Avg\_Heat = \frac{\sum_{x} heat(x)}{U} = \frac{1}{U}$
$Avg\_Least = knob \times Avg\_Heat \times Avg\_Slice_1$
The ideal case is when the least of almost all the objects that constitute the database are disk resident (C is the total number of disk blocks):
$C = U \times Avg\_Least = U \times knob \times \frac{1}{U} \times Avg\_Slice_1$
Solving for knob, we obtain $knob = \frac{C}{Avg\_Slice_1}$.
Substituting the optimal value of knob in the definition of least, we obtain:
$least(X) = \frac{heat(X) \times size(S_{X,1})}{\sum_{i=1}^{m} heat(i) \times size(S_{i,1})} \times C$
This is intuitively the amount of disk space an object X deserves. The performance evaluation section employs a simulation study to confirm this analytical result. Note that because heat is considered in the computation of least, with $knob = \frac{C}{Avg\_Slice_1}$ the average latency time degrades proportional to the improvement in variance. In summary, when knob = 0, PIRATE replaces objects in a manner that minimizes average latency time. However, when $knob = \frac{C}{Avg\_Slice_1}$, it minimizes the variance. To observe this, consider the following discussion.
In the long run, with knob = 0, PIRATE maintains the first slice of all the objects with the highest heat disk resident, while the others compete with each other for a small portion of the disk space (see panel a of the figure below). To approximate the number of objects that become disk resident with knob = 0, we use the average size ($Avg\_Slice_1$) as follows: $\frac{C}{Avg\_Slice_1}$. However, with $knob = \frac{C}{Avg\_Slice_1}$, PIRATE maintains only a minimum portion of all these objects disk resident. To achieve this, in the optimal case it requires $Avg\_Least = \frac{C}{U}$ of disk space per object. The rest of the disk space can be used for the minimum portion of the other objects (see panel b of the figure below). Therefore, in the long run, the number of disk resident objects with $knob = \frac{C}{Avg\_Slice_1}$ is larger than with knob = 0.
Figure: Status of the first slice of objects (X, Y, U, V, W, Q, S, ordered by decreasing heat; disk resident versus absent) for (a) knob = 0 and (b) knob = C/Avg_Slice1.
However, with knob = 0 the first slice of these objects is disk resident, while with $knob = \frac{C}{Avg\_Slice_1}$ only the least of each of the objects is disk resident. This results in the following tradeoff. On one hand, a request referencing an object Z has a higher hit ratio with $knob = \frac{C}{Avg\_Slice_1}$ as compared to knob = 0. On the other hand, a hit with knob = 0 translates into a fraction of a second latency time, while with $knob = \frac{C}{Avg\_Slice_1}$ it results in a minimum latency time of $(size(S_{Z,1}) - least(Z)) \times \frac{size(block)}{B_{Tertiary}}$. This explains why, with $knob = \frac{C}{Avg\_Slice_1}$, PIRATE improves the variance proportional to the degradation in average latency.
Performance Evaluation
We implemented a simulation model to verify the analytical models of this study and compare PIRATE with Atomic. For the purposes of this evaluation, we assumed a gigabyte disk drive consisting of cylinders, each cylinder with a capacity of megabyte. The bandwidth of the disk drive was mbps. The capacity of the tertiary store was set at gigabytes. Its bandwidth was mbps. The system was configured with a megabyte block size.
Footnote: This is optimistic because the objects have the highest heats, thus a large minimum portion. While it is not realistic to use Avg_Least in the equation, it is useful for approximation.
Table: Average latency time in seconds and standard deviation (in parentheses) of PIRATE for skewed and uniform distributions of access, as a function of knob (including knob = C/Avg_Slice1).
The database consists of objects, each with a mbps bandwidth requirement ($B_{Display}$ = mbps). The size of the objects was varied from megabytes (display time of seconds) to megabytes (display time of minute and seconds). The average object size was megabytes (display time of one minute). Based on the physical characteristics of the system, the average size of the first slice of the objects was megabytes.
We manipulated the mean of an exponential distribution to model two alternative distributions of access to the objects: skewed and uniform, each obtained with a different mean. The table of average latency times presents the obtained results as a function of knob (knob has no impact on Atomic).
With a skewed distribution of access, Atomic provides an average latency time of seconds with a standard deviation of. Both Atomic and PIRATE observe at least a second latency due to the seek and transfer time of the first block. The marginal improvement with PIRATE is because it maintains the first slice of the infrequently accessed objects disk resident. This improvement is marginal because the disk space is not a scarce resource. However, by maintaining the first slice of these objects, PIRATE results in a significantly lower standard deviation when compared with Atomic (versus). The value of knob has no impact on observed variance because PIRATE executes neither pass nor pass of Figure.

With a uniform distribution of access, the available
disk space becomes a scarce resource. Atomic results in an average latency time of seconds and a standard deviation of. Table presents the obtained results with PIRATE. When knob, PIRATE results in a significantly lower latency time and a lower standard deviation as compared to Atomic by maintaining the first slice of the frequently accessed objects disk resident. Note the change in both the average latency and standard deviation as a function of the knob. The optimal value of knob, evaluated using Lemma of Section, results in the ideal compromise between the average latency and the standard deviation. The percentage degradation in the average latency time with knob as compared to knob is. The variance improves by the same percentage. A random choice of value for the knob (e.g., either or in Table) can result in a degradation in average latency with marginal improvement on the variance. These results verify the analytical models of Section.

In a second set of experiments, we analyzed the
percentage of requests that observe a certain latency time in the system. The parameters are the same as in the previous experiment. The obtained result for a uniform distribution of access (mean) is presented in Figure 6. In these figures, the x-axis represents the incurred latency time while the y-axis represents the percentage of requests that observed this latency time. With knob = 0, PIRATE results in a significantly higher percentage of requests that observe a zero second latency time in Figure 6c. However, when compared with knob = C/Avg Slice, it results in a significantly lower percentage of requests that observe a latency time that varies from to seconds (compare Figure 6c with Figure 6e). Figures 6b, 6d, and 6f present the magnified portion of curves in Figures 6a, 6c, and 6e, respectively. Note that of requests observe a latency time higher than seconds when knob = C/Avg Slice as compared to with knob = 0 (Figures 6d and 6f). These results demonstrate the tradeoff between the variance and the average latency time of the system.
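The quantities compared in this section (average latency, standard deviation, and the percentage of requests at or above a given latency) can be computed from a list of observed latencies as sketched below. The function name, the threshold, and the two latency traces are hypothetical; they are not the measurements of the study and only illustrate how a lower variance can coexist with a higher average.

```python
from statistics import mean, pstdev


def summarize_latencies(latencies, threshold_seconds):
    """Summarize per-request latencies (in seconds).

    Returns the mean, the population standard deviation, the percentage of
    requests with zero latency, and the percentage above the threshold.
    """
    n = len(latencies)
    if n == 0:
        raise ValueError("at least one latency observation is required")
    return {
        "mean": mean(latencies),
        "std_dev": pstdev(latencies),
        "pct_zero": 100.0 * sum(1 for x in latencies if x == 0) / n,
        "pct_above": 100.0 * sum(1 for x in latencies if x > threshold_seconds) / n,
    }


# Hypothetical traces: the first has many zero-latency hits and one long wait,
# the second makes every request wait a moderate amount.
mostly_hits = [0] * 9 + [100]
all_moderate = [12] * 10

a = summarize_latencies(mostly_hits, threshold_seconds=50)
b = summarize_latencies(all_moderate, threshold_seconds=50)
```

In this made-up example the first trace has the lower mean (10 versus 12 seconds) but a much larger spread (a standard deviation of 30 versus 0), which is the shape of the tradeoff discussed above.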
Conclusion and Future Research Directions
This study investigates the role of hierarchical multimedia storage managers for continuous media data types (audio and video objects). We described a pipelining mechanism that overlaps the display of a portion of an object from the disk drive with the materialization of its remainder from the tertiary. This minimizes the latency time observed for requests referencing those objects that are not disk resident. Moreover, it reduces the fraction of an object that should be disk resident in order for the user to observe the latency provided by the disk drive.
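The effect of pipelining on the disk-resident fraction can be illustrated with a simplified fluid model. The sketch assumes constant display and tertiary bandwidths, a tertiary device that starts delivering the remainder as soon as the display begins, and no seek or reposition overhead. These assumptions, the function names, and the sample values are illustrative and do not reproduce the slice-based analysis of this study.

```python
def min_first_slice_fraction(display_mbps: float, tertiary_mbps: float) -> float:
    """Smallest disk-resident fraction that avoids a hiccup (fluid model).

    The byte at offset x is needed at time x / display_mbps and arrives from
    tertiary at time (x - first_slice) / tertiary_mbps. Requiring arrival
    before need for the last byte gives
    first_slice / size >= 1 - tertiary_mbps / display_mbps.
    """
    if display_mbps <= 0 or tertiary_mbps <= 0:
        raise ValueError("bandwidths must be positive")
    return max(0.0, 1.0 - tertiary_mbps / display_mbps)


def is_hiccup_free(size_mb: float, first_slice_mb: float,
                   display_mbps: float, tertiary_mbps: float) -> bool:
    """Check that the remainder is materialized before the display needs it."""
    remainder_mb = size_mb - first_slice_mb
    time_to_materialize = remainder_mb * 8 / tertiary_mbps  # last byte arrives
    time_last_byte_needed = size_mb * 8 / display_mbps      # display finishes
    return time_to_materialize <= time_last_byte_needed


# Hypothetical bandwidths: the display consumes 4 mbps, the tertiary supplies 1 mbps.
print(min_first_slice_fraction(4, 1))  # 0.75
```

Under this model, only the portion that the tertiary cannot deliver in time has to stay on disk, and a tertiary at least as fast as the display requires no disk-resident portion at all.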
In addition, we introduced PIRATE as a technique to manage the available disk space. It provides the user with a knob that can be fine-tuned to minimize either the average latency time of the system or the variance in latency.

We intend to extend this study in several directions.
Figure 6: Comparison of Atomic and PIRATE. Panels: (a) Atomic; (b) Atomic (Magnified); (c) PIRATE (knob=0); (d) PIRATE (knob=0, Magnified); (e) PIRATE (knob=25); (f) PIRATE (knob=25, Magnified). x-axis: Latency (in Seconds); y-axis: Percentage of Requests (%).
First, this study assumed a single user displaying a single object. The design of extended PIRATE in the presence of either a single user displaying several independent objects simultaneously or multiple users accessing different objects requires further investigation. In both cases, the current design could cause the tertiary to become a bottleneck for the system, reducing the utilization of the disk drive and resulting in a lower throughput. Second, an environment that consists of several disk drives and tertiary storage devices provides PIRATE with additional options to consider (this is especially true for multiple users). Finally, we intend to demonstrate the feasibility of the proposed ideas by implementing them.
References
[ARA] P. Ang, P. Ruetz, and D. Auld. Video Compression Makes Big Gains. In IEEE Spectrum, pages.
[BGMJ] S. Berson, S. Ghandeharizadeh, R. Muntz, and X. Ju. Staggered striping in multimedia information systems. In Proceedings of the ACM SIGMOD International Conference on Management of Data.
[CABK] G. Copeland, W. Alexander, E. Boughter, and T. Keller. Data Placement in Bubba. In Proceedings of the ACM SIGMOD International Conference on Management of Data, pages.
[CL] H. J. Chen and T. Little. Physical Storage Organizations for Time-Dependent Multimedia Data. In Proceedings of the Foundations of Data Organization and Algorithms (FODO) Conference, October.
[CLR] Thomas H. Cormen, Charles E. Leiserson, and Ronald L. Rivest, editors. Introduction to Algorithms. The MIT Press and McGraw-Hill Book Company.
[CP] D. K. Campbell and K. Proehl. Optical Advances. BYTE Magazine, March.
[Fox] E. A. Fox. Advances in Interactive Digital Multimedia Systems. IEEE Computer, pages, October.
[Gal] D. Le Gall. MPEG: a video compression standard for multimedia applications. Communications of the ACM, April.
[GR] S. Ghandeharizadeh and L. Ramos. Continuous retrieval of multimedia data using parallelism. IEEE Transactions on Knowledge and Data Engineering, August.
[Has] B. Haskell. International standards activities in image data compression. In Proceedings of Scientific Data Compression Workshop, pages. NASA conference Pub, NASA Office of Management, Scientific and technical information division.
[NY] R. Ng and J. Yang. Maximizing Buffer and Disk Utilization for News On-Demand. In Proceedings of the International Conference on Very Large Databases.
[RV] P. Rangan and H. Vin. Efficient Storage Techniques for Digital Continuous Media. IEEE Transactions on Knowledge and Data Engineering, August.
[RW] C. Ruemmler and J. Wilkes. An Introduction to Disk Drive Modeling. IEEE Computer, March.
[TPBG] F. A. Tobagi, J. Pang, R. Baird, and M. Gang. Streaming RAID: A Disk Array Management System for Video Files. In First ACM Conference on Multimedia, August.
Description
Shahram Ghandeharizadeh and Cyrus Shahabi. "On multimedia repositories personal computers and hierarchical storage systems." Computer Science Technical Reports (Los Angeles, California, USA: University of Southern California. Department of Computer Science) no. 578 (1994).
Asset Metadata
Creator
Shahabi, Cyrus (author), Ghandeharizadeh, Shahram (author)
Core Title
USC Computer Science Technical Reports, no. 578 (1994)
Alternative Title
On multimedia repositories personal computers and hierarchical storage systems (title)
Publisher
Department of Computer Science,USC Viterbi School of Engineering, University of Southern California, 3650 McClintock Avenue, Los Angeles, California, 90089, USA
(publisher)
Tag
OAI-PMH Harvest
Format
10 pages
(extent),
technical reports
(aat)
Language
English
Unique identifier
UC16270998
Identifier
94-578 On Multimedia Repositories Personal Computers and Hierarchical Storage Systems (filename)
Legacy Identifier
usc-cstr-94-578
Rights
Department of Computer Science (University of Southern California) and the author(s).
Internet Media Type
application/pdf
Copyright
In copyright - Non-commercial use permitted (https://rightsstatements.org/vocab/InC-NC/1.0/)
Source
20180426-rozan-cstechreports-shoaf
(batch),
Computer Science Technical Report Archive
(collection),
University of Southern California. Department of Computer Science. Technical Reports
(series)
Access Conditions
The author(s) retain rights to their work according to U.S. copyright law. Electronic access is being provided by the USC Libraries, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright.
Repository Name
USC Viterbi School of Engineering Department of Computer Science
Repository Location
Department of Computer Science. USC Viterbi School of Engineering. Los Angeles\, CA\, 90089
Repository Email
csdept@usc.edu