USC Computer Science Technical Reports, no. 578 (1994)
Description
Shahram Ghandeharizadeh and Cyrus Shahabi. "On multimedia repositories personal computers and hierarchical storage systems." Computer Science Technical Reports (Los Angeles, California, USA: University of Southern California. Department of Computer Science) no. 578 (1994). 
Transcript (if available)
On Multimedia Repositories, Personal Computers, and Hierarchical Storage Systems
Shahram Ghandeharizadeh and Cyrus Shahabi
Department of Computer Science
University of Southern California
Los Angeles, California

Abstract
The past decade has witnessed a proliferation of personal computers at homes, businesses, classrooms, libraries, etc. Most often, these systems are used to disseminate information. Recently, multimedia repositories have added to the excitement of this information age by allowing a user to retrieve and manipulate continuous media data types: audio and video objects. The design and implementation of these systems is challenging due to both the large size of objects that constitute this media type and their continuous bandwidth requirement. Compression, in combination with the availability of fast CPUs for real-time decompression, provides effective support for a continuous display of those objects with a high bandwidth requirement. Hierarchical storage structures (consisting of RAM, disk, and tertiary storage devices) provide a cost-effective solution for the large size of their repositories. The focus of this study is on personal computers (single user, single display) that employ fast CPUs, compression, and hierarchical storage structures to support multimedia applications. Its goals are to ensure a continuous display of audio and video objects while minimizing the latency time observed by the user. Its contributions include a novel pipelining mechanism and PIRATE as a technique to manage the disk resident objects.
Introduction
During the past few years, the information technology has evolved to store and retrieve continuous media data types, e.g., audio and video. These systems are expected to play a major role in library information systems, educational applications, etc. A challenging task when implementing these systems is to support continuous display of video objects. This is because:

1. Video objects require a high bandwidth for their continuous display. For example, the bandwidth required by NTSC for network-quality video is about  megabits per second (mbps) [Has]. Recommendation  of the International Radio Consultative Committee (CCIR) calls for a  mbps bandwidth for video objects [Fox]. A video object based on HDTV requires a bandwidth of approximately  mbps. One may employ compression (lossless or lossy; see [Fox] for an overview), declustering [GR], striping [TPBG], or a combination of these techniques [BGMJ] to support the bandwidth requirement (i.e., a continuous display) of video objects.
2. Video objects are large in size. For example, a  minute uncompressed video clip based on NTSC is approximately  gigabytes in size. With a compression technique that reduces the bandwidth requirement of this object to  mbps, this object is approximately  megabytes in size. A repository (e.g., corresponding to an encyclopedia) that contains hundreds of such clips is terabytes in size.

This research was supported in part by the National Science Foundation under grants IRI, IRI (NYI award), and CDA, a Hewlett-Packard research grant, and a research grant from AT&T/NCR/Teradata.
The sustained bandwidth of the current magnetic disk technology is typically rated between  to  mbps. The low-end market place (personal computers, workstations) employs lossy compression techniques (e.g., MPEG [Gal]) to reduce both the size and the bandwidth requirement of video objects. For example, MPEG ensures that the bandwidth requirement of a video clip does not exceed  mbps. The aim of MPEG is coding for up to  mbps with the further goal of higher-quality presentations [ARA]. While these techniques reduce the bandwidth requirements of video objects (by reducing the quality of presentation), the large size of repositories containing video objects remains significant (see bullet number 2). To reduce the cost of storage, the storage manager of these systems is expected to be hierarchical, consisting of memory, magnetic disk drives, and one or more tertiary storage devices (e.g., Digital Audio Tape, CD-ROM, etc.). In order to simplify the discussion, assume that the system consists of one disk drive and one tertiary storage device. As the different levels of this hierarchy are traversed (starting with memory), the density of the medium (the amount of data it can store) and its latency increase, while its transfer rate and cost per megabyte decrease (at the time of this writing, from  /megabyte of memory, to  /megabyte of disk storage, to less than  /megabyte of a tertiary storage device). The database resides permanently on the tertiary storage device. The disk storage is used as a temporary staging area for the frequently accessed objects in order to minimize the number of references to the tertiary. An application referencing an object that is disk resident observes both the average latency time and the transfer rate of a magnetic disk drive, which are superior to those of a tertiary storage device.
In order for a system based on hierarchical storage structures to be useful on a day-to-day basis, the system should provide a fraction of a second latency time while ensuring a continuous display of both audio and video objects. To accomplish these objectives, the available resources should be utilized in an intelligent manner. Assuming a single user system that displays one object at a time, the contributions of this paper include:

1. The design of a pipelining mechanism that employs both the magnetic disk and tertiary store for a display. It overlaps the display of a portion of an object from the disk with materialization of its remainder from the tertiary store. This minimizes the fraction of each object that should be disk resident in order for a user referencing that object to observe the disk latency time. Given a fixed size disk drive, this mechanism allows for a larger number of objects to become disk resident, minimizing the average latency time of the system.
2. PIRATE replacement policy to manage the disk resident objects in a manner that minimizes the average latency time and/or the variance in latency observed by a user referencing objects. PIRATE complements pipelining in order to maximize the utilization of both the disk and the tertiary store, while hiding both the high latency time and the low bandwidth of tertiary from the user.
The rest of this paper is organized as follows. Section  provides a description of our target environment and a framework that ensures a continuous display of an object from the magnetic disk drive. This discussion is not detailed because it is almost identical to that of several other studies, e.g., [TPBG, RV, CL]. Section  extends this framework to employ the tertiary store in a pipelined manner in order to overlap the display of a portion of an object with its materialization. Subsequently, Section  presents the PIRATE replacement policy. Our conclusion and future research directions are contained in Section  .

An assumption of this study is that the bandwidth requirement of an object does not exceed the bandwidth of the magnetic disk drive. For those objects that violate this assumption, techniques proposed by [BGMJ] are appropriate.

[Figure  : Target platform (memory, disk, tertiary, display)]
Display of Continuous Media
Our target environment (see Figure  ) is a typical personal computer that consists of a display, some memory, a disk drive, and a tertiary storage device. We assume that memory serves as an intermediate staging area between the disk drive and the tertiary storage device, enabling both the tertiary and the disk to produce data simultaneously. An object is not required to be disk resident in order to be displayed (a portion of it can be displayed from tertiary). We assume that updates to the database are infrequent operations. Example environments include library information systems (e.g., multimedia books and encyclopedias) and the entertainment industry (short documentaries on a specific topic).

In order to support a continuous display of an object (say X) from the disk drive, X is split into n fixed size blocks ($X_1, X_2, \ldots, X_n$). A block represents a contiguous portion of an object and determines the unit of transfer from the disk drive. Its size is determined at system configuration time. In order to ensure a continuous display of X, the system reads one block of X, say $X_i$, into memory. It allows the disk drive to be time shared among other requests while $X_i$ is being displayed, as long as block $X_{i+1}$ is staged in memory before the display of $X_i$ completes. The size of a block and its bandwidth requirement determine the number of simultaneous displays that can be supported by the disk subsystem. The memory required with this paradigm is one block. One may reduce this memory requirement by employing techniques such as those described in [NY] (extensions are required to [NY] due to disk idiosyncrasies: zones, thermal recalibration [RW]).
Pipelining to Minimize Latency Time
When a user references an object that is not disk resident, one approach might materialize the object on the disk drive in its entirety before initiating its display. In this case, the latency time observed by the user is a function of the bandwidth of the tertiary storage device and the size of the referenced object. This latency time can be reduced using pipelining. Briefly, the pipelining mechanism groups the blocks of object X into s logical slices ($S_{X,1}, S_{X,2}, S_{X,3}, \ldots, S_{X,s}$), such that the display time of $S_{X,1}$, $T_{Display}(S_{X,1})$, eclipses the time required to materialize $S_{X,2}$, $T_{Materialize}(S_{X,2})$; $T_{Display}(S_{X,2})$ eclipses $T_{Materialize}(S_{X,3})$; etc. This ensures a continuous display while reducing the latency time, because the system initiates the display of an object once a fraction of X (i.e., $S_{X,1}$) is disk resident.

We describe the pipelining mechanism for two possible cases: the bandwidth of tertiary is either (1) lower, or (2) higher than the bandwidth required to display an object. The ratio between the production rate of tertiary and the consumption rate at a display station is termed Production Consumption Ratio ($PCR = \frac{B_{Tertiary}}{B_{Display}}$). When PCR < 1 (PCR > 1), the production rate of tertiary is lower (higher) than the consumption rate of a display station. Consider each case in turn.
PCR < 1
Let an object X consist of n blocks. The time required to materialize X, $T_{Materialize}(X)$, is $\lceil \frac{n \times size(block)}{B_{Tertiary}} \rceil$, while its display time, $T_{Display}(X)$, is $\lceil \frac{n \times size(block)}{B_{Display}} \rceil$. The time required to materialize an object is greater than its display time. With pipelining, a portion of the time required to materialize X can be overlapped with its display. This is achieved by splitting X into s logical slices ($S_{X,1}, S_{X,2}, S_{X,3}, \ldots, S_{X,s}$) such that $T_{Display}(S_{X,1})$ eclipses $T_{Materialize}(S_{X,2})$, $T_{Display}(S_{X,2})$ eclipses $T_{Materialize}(S_{X,3})$, etc. Thus:

$$T_{Display}(S_{X,i}) \ge T_{Materialize}(S_{X,i+1}) \quad \text{for } 1 \le i < s$$
Upon the retrieval of a tertiary resident object X, the pipelining mechanism is as follows:

1. Materialize the blocks that constitute $S_{X,1}$ on the disk drive.
2. For i = 2 to s do
   a. Initiate the materialization of $S_{X,i}$ from tertiary onto the disk.
   b. Initiate the display of $S_{X,i-1}$.
3. Display the last slice ($S_{X,s}$).

[Figure  : The pipelining mechanism (timeline of Step 1, Step 2.a materialization, and Step 2.b display over blocks 1 to n, with the overlap period marked)]

Note: the discussion for the case when the bandwidth of tertiary is equivalent to that of display is a special case of item  .

The duration of Step 1 determines the latency time of the system. During Step 2, while the subsequent slices are materialized from tertiary, the disk resident slices are displayed. Steps 2.a and 2.b correspond to two different processes that execute in parallel. While no constraints are imposed on Step 2.a, the amount of required memory is minimized when a block is flushed onto the disk drive as soon as it becomes memory resident. Step 3 displays the last slice materialized on the disk drive. In order to minimize the latency time, $S_{X,s}$ should consist of a single block. To illustrate this, consider Figure  . If the last slice consists of more than one block, then the duration of the overlap is reduced, resulting in a longer duration of time for Step 1.

The amount of time required for each step is computed as follows. Since $S_{X,s}$ consists of a single block, the duration of Step 2 corresponds to the time required to display n − 1 blocks from the disk drive, i.e., $(n - 1) \times \frac{size(block)}{B_{Display}}$. Therefore, the portion of the object materialized by Step 2 is $\lfloor PCR \times (n - 1) \times size(block) \rfloor$. The remainder of the object must constitute $S_{X,1}$:

$$size(S_{X,1}) = n - \lfloor PCR \times (n - 1) \rfloor$$

($size(S_{X,1})$ is in the granularity of blocks). The latency time of the system (i.e., the duration of Step 1) is determined by the time required to: (1) reposition the read head of tertiary to the appropriate physical location corresponding to $S_{X,1}$, and (2) render $S_{X,1}$ disk resident, i.e., $\frac{size(block)}{B_{Tertiary}} \times size(S_{X,1})$.
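The first-slice size and the resulting latency for the case PCR ≤ 1 can be sketched in Python as follows. This is a minimal illustration of the formulas above, not the authors' implementation; the function names, the units (bits and bits per second), and the optional `reposition_s` parameter are assumptions introduced here.

```python
import math


def first_slice_blocks(n_blocks: int, pcr: float) -> int:
    """size(S_X,1) = n - floor(PCR * (n - 1)), in blocks (assumes 0 < PCR <= 1)."""
    if not 0 < pcr <= 1:
        raise ValueError("this formula covers only 0 < PCR <= 1")
    return n_blocks - math.floor(pcr * (n_blocks - 1))


def pipelined_latency(n_blocks, block_bits, b_tertiary, b_display, reposition_s=0.0):
    """Duration of Step 1: reposition tertiary, then materialize the first slice."""
    pcr = b_tertiary / b_display
    s1 = first_slice_blocks(n_blocks, pcr)
    return reposition_s + s1 * block_bits / b_tertiary


def unpipelined_latency(n_blocks, block_bits, b_tertiary, reposition_s=0.0):
    """Latency when the whole object is materialized before its display starts."""
    return reposition_s + n_blocks * block_bits / b_tertiary


# Hypothetical values: 10 blocks of 8e6 bits, tertiary at 1e6 bps, display at 2e6 bps.
print(first_slice_blocks(10, 0.5))
print(pipelined_latency(10, 8e6, 1e6, 2e6), unpipelined_latency(10, 8e6, 1e6))
```

With these hypothetical values, PCR is 0.5, so six of the ten blocks form the first slice and the user waits for those six blocks instead of all ten.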
Example  : Assume that object X is a one minute video clip with $B_{Display}$ =  mbps. Thus, X is  megabytes in size. Assume a  inch,  gigabyte disk drive that consists of  cylinders, where size(block) = size(cylinder) =  bytes. Hence, X consists of  blocks. If the bandwidth of tertiary is  mbps, then PCR =  . If X is tertiary resident, without pipelining a user referencing X observes a latency time of  minutes. With pipelining, this latency time is reduced to  minutes (a  improvement).

(Footnote on minimizing the latency time: maximize the duration of the pipeline.)

PCR > 1
In this case, the bandwidth of tertiary exceeds the bandwidth required to display an object. Therefore, Equation  is satisfied when each slice of an object consists of a single block. Two alternative approaches can be employed to compensate for the fast production rate: either (1) multiplex the bandwidth of tertiary among several requests referencing different objects, or (2) increase the consumption rate of an object by flushing the blocks to the disk drive at a faster rate. The first approach wastes the bandwidth of the tertiary storage device by requiring it to reposition its read head multiple times. Moreover, its benefits might be marginal at best due to the significant overhead incurred when repositioning the read head of a tertiary storage device. This repositioning is due to the load and unload time of the medium containing the referenced data (e.g., with a Hewlett-Packard rewritable optical disk library, the average time required to load and unload a platter is typically  seconds). With the second approach, the pipelining algorithm is the same as that described in Section  , with the difference that the display of X starts once its first block becomes disk resident.
The blocks produced by the tertiary can be flushed during the display of a block (Step 2.b) as long as the following constraint is satisfied:

$$\frac{size(block)}{B_{Display}} \ge (\lfloor PCR \rfloor + 1) \times S_{Disk}$$

where $S_{Disk}$ denotes the disk service time, $S_{Disk} = \frac{size(block)}{tfr} + max\_seek + max\_latency$. In the worst case, the maximum amount of memory required is $(PCR - \lfloor PCR \rfloor) \times size(block)$. If the constraint posited in Equation  is violated, then extra memory is required, as the available disk bandwidth is insufficient to both support a continuous display and consume the blocks produced by tertiary. In the worst case, the memory requirement of this technique is

$$\left( (\lfloor PCR \rfloor + 1) \times S_{Disk} - \frac{size(block)}{B_{Display}} \right) \times B_{Tertiary} \times n$$

where n is the number of blocks that constitute the referenced object. One may minimize this memory requirement by forcing the tertiary to sit idle periodically (this wastes the bandwidth of tertiary).
PIRATE Replacement Policy
Upon the retrieval of a tertiary resident object, say Z, if the storage capacity of the disk drive is exhausted, then the system must replace one or more objects (victims) in order to allow Z to become disk resident. A standard approach, termed Atomic, might replace each of the victim objects in their entirety, requiring each object to either be completely disk resident or not disk resident at all. With PartIal ReplAcement TEchnique (PIRATE), the system chooses a larger number of objects as victims; however, it replaces a portion of each of its victims in order to free up sufficient space for the blocks of Z. The input to PIRATE includes: (1) the size and frequency of access to each object X in the database, termed size(X) and heat(X), respectively [CABK], (2) a set of objects with a disk resident fraction (except Z), denoted F, and (3) the size of the object referenced by the pending request, size(Z). Its side effect is that it makes enough of the disk space available to accommodate Z.

PIRATE deletes blocks of an object one at a time, starting with those that constitute the tail end of the object. For example, if PIRATE decides to replace those blocks that constitute  minutes of a  minute video clip, it deletes those blocks that represent the last  minutes of the clip, leaving the first  minutes disk resident. The number of blocks that constitute the first portion of X is denoted disk(X), while its deleted (non disk resident) blocks are termed absent(X): absent(X) = size(X) − disk(X). Note that the granularity of absent(X), size(X), and disk(X) is in blocks. The following section provides a formal statement of the replacement problem and PIRATE as a solution for a single display system.
Formal Statement of the Problem
The portion of disk space allocated to continuous media data types consists of C blocks. The database consists of m objects $\{o_1, \ldots, o_m\}$, with heat $heat(o_j) \in (0, 1)$ satisfying $\sum_{j=1}^{m} heat(o_j) = 1$, and sizes $size(o_j) \in (0, C)$ for all $1 \le j \le m$. The size of the database exceeds the storage capacity of the system, i.e., $\sum_{j=1}^{m} size(o_j) > C$. Consequently, the database resides permanently on the tertiary storage device, and objects are swapped in and out from the disk. We assume that the size of each object is smaller than the storage capacity of the disk drive ($size(o_j) < C$ for $1 \le j \le m$). Moreover, to simplify the discussion, we assume that the tertiary is not required to change tapes/platters or reposition its read head once it starts to transfer an object. Assume a process that generates requests for objects in which object $o_j$ is requested with probability $heat(o_j)$, all independent. We assume no advance knowledge of the possible permutation of requests for different objects.
Let F denote the set of objects with a disk resident fraction, except the one that is referenced by the pending request; $size(F) = \sum_{x \in F} disk(x)$. Moreover, assuming a new request arrives referencing object Z ($F \leftarrow F - \{Z\}$), we define free_disk_space as $C - (size(F) + disk(Z))$. If $absent(Z) \le free\_disk\_space$, then no replacement is required. In this study, we focus on the scenario where replacement is required, i.e., $absent(Z) > free\_disk\_space$.

We define the latency time observed by a request referencing Z, $\ell(Z)$, as the amount of time elapsed from the arrival of the request to the onset of the display. It is a function of disk(Z) and $B_{Tertiary}$. If $disk(Z) \ge size(S_{Z,1})$, then the maximum value for $\ell(Z)$ is the worst reposition time of the tertiary storage device. One may reduce this latency time to zero by incrementing $size(S_{Z,1})$ with the amount of data corresponding to this time, i.e., $\lceil \frac{worst\_reposition\_time \times B_{Display}}{size(block)} \rceil$. This optimization is assumed for the rest of this paper. If $disk(Z) \ge size(S_{Z,1})$, then $\ell(Z) = 0$ due to the assumed optimization. Otherwise (i.e., $disk(Z) < size(S_{Z,1})$), the system determines the starting address of the non disk resident portion of Z (missing), and $\ell(Z)$ is defined as the total sum of: (1) the repositioning of tertiary to the physical location corresponding to missing, and (2) the materialization time of the remainder of the first slice, $\frac{size(block)}{B_{Tertiary}} \times (size(S_{Z,1}) - disk(Z))$. The average (expected value) of latency as a function of requests can be defined as:

$$\mu = \sum_{x} heat(x) \times \ell(x)$$

The variance is:

$$\sigma^2 = \sum_{x} heat(x) \times (\ell(x) - \mu)^2$$

By deleting a portion of an object, we may increase its latency time, resulting in a higher $\mu$ and $\sigma^2$. However, once the disk capacity is exhausted, deletion of an object is unavoidable. In this case, it is desired, for some x in F, to reduce disk(x) such that enough disk space becomes available to render object Z disk resident in its entirety. The problem is how to determine those x and their corresponding fractions to be deleted to minimize both the average latency time and its variance. Unfortunately, minimizing the average latency time might increase the variance and vice versa. In the next section, we present simple PIRATE and demonstrate that it minimizes the average latency. Subsequently, extended PIRATE is introduced as a generalization of simple PIRATE with a knob that can be adjusted by the user to tailor the system to strike a compromise between these two objectives.
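The latency, average, and variance definitions of this section can be sketched in Python as follows. This is an illustrative reading of the definitions, assuming the zero-latency optimization for objects whose first slice is disk resident; the function names, the dictionary layout, and the `reposition_time` parameter are assumptions introduced here.

```python
def latency(disk_blocks, first_slice_blocks, block_size, b_tertiary, reposition_time=0.0):
    """Latency of a request: zero if the first slice is disk resident,
    otherwise reposition tertiary and materialize the rest of the first slice."""
    if disk_blocks >= first_slice_blocks:
        return 0.0
    missing_blocks = first_slice_blocks - disk_blocks
    return reposition_time + missing_blocks * block_size / b_tertiary


def mean_and_variance(heat, lat):
    """Heat-weighted average latency and its variance.

    heat and lat are dicts keyed by object name; heat values sum to 1.
    """
    mu = sum(heat[x] * lat[x] for x in heat)
    var = sum(heat[x] * (lat[x] - mu) ** 2 for x in heat)
    return mu, var


# Hypothetical two-object database with unit block size and unit tertiary bandwidth.
heat = {"X": 0.5, "Y": 0.5}
lat = {"X": latency(6, 6, 1.0, 1.0), "Y": latency(2, 6, 1.0, 1.0)}
print(mean_and_variance(heat, lat))
```

In this hypothetical case, X has its entire first slice on disk (latency 0) while Y is missing four blocks of its first slice (latency 4), so the two objects contribute equally to both the average and the variance.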
Simple PIRATE
Figure  presents simple PIRATE. Logically, it operates in two passes. In the first pass, it deletes from those objects (say i) whose disk resident portion is greater than the size of their first slice ($S_{i,1}$). By doing so, it can ensure a zero latency time for requests that reference these objects in the future by employing the pipelining mechanism of Section  . Note that PIRATE deletes objects at a granularity of a block. Moreover, it frees up only sufficient space to accommodate the pending request and no more than that. For example, if absent(Z) is equivalent to size(X)/  and X is chosen as the victim, then only the blocks corresponding to the last  of X are deleted in order to render Z disk resident.

    define pfs: potential free space
    define rds: required disk space
    rds ← absent(Z) − free_disk_space
    repeat
        victim ← object i from set F with the lowest heat and disk(i) > size(S_{i,1})
        if (victim is NOT null) then
            pfs ← disk(victim) − size(S_{victim,1})
        else
            victim ← object i from set F with the lowest heat
            pfs ← disk(victim)
        if (pfs > rds) then
            disk(victim) ← disk(victim) − rds
            rds ← 0
        else
            disk(victim) ← disk(victim) − pfs
            rds ← rds − pfs
    until (rds = 0)

Figure  : Simple PIRATE

If the disk space made available by the first pass is insufficient, then simple PIRATE enters its second pass. This pass deletes objects starting with the one that has the lowest heat, following the greedy strategy suggested for the fractional knapsack problem [CLR]. One might argue that a combination of heat and size should be considered when choosing victims. However, absent(Z) blocks (where Z is the object required to become disk resident) are required to be deleted from disk, independent of the size of the victims. The following proof formalizes this statement and proves the optimality of simple PIRATE in minimizing the latency time of the system.
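The two passes of simple PIRATE can be sketched in Python as follows. This is a minimal sketch of the procedure described above, not the authors' code: the dictionary-based representation of F, the function name `simple_pirate`, and the decision to return a new dictionary rather than update in place are all assumptions introduced here. Objects whose disk resident portion drops to zero are treated as no longer being candidates.

```python
def simple_pirate(disk, first_slice, heat, rds):
    """Free `rds` blocks from the objects in F.

    disk:        dict object -> number of disk resident blocks (the set F, excluding Z)
    first_slice: dict object -> size of the first slice, in blocks
    heat:        dict object -> access frequency
    rds:         required disk space, absent(Z) - free_disk_space, in blocks
    Returns a new dict with the updated disk resident block counts.
    """
    disk = dict(disk)
    while rds > 0:
        # Pass 1: victims that still hold more than their first slice.
        candidates = [i for i in disk if disk[i] > first_slice[i]]
        if candidates:
            victim = min(candidates, key=lambda i: heat[i])
            pfs = disk[victim] - first_slice[victim]  # potential free space
        else:
            # Pass 2: lowest heat object, regardless of its first slice.
            candidates = [i for i in disk if disk[i] > 0]
            if not candidates:
                raise ValueError("F does not hold enough blocks to free")
            victim = min(candidates, key=lambda i: heat[i])
            pfs = disk[victim]
        freed = min(pfs, rds)  # never free more than is required
        disk[victim] -= freed
        rds -= freed
    return disk


# Hypothetical example: two objects, each with a 4-block first slice.
disk = {"X": 10, "Y": 8}
first_slice = {"X": 4, "Y": 4}
heat = {"X": 0.7, "Y": 0.3}
print(simple_pirate(disk, first_slice, heat, 3))
print(simple_pirate(disk, first_slice, heat, 12))
```

In the first call only the tail of Y (the colder object) is trimmed. In the second call the first pass trims both objects down to their first slices, and the second pass then removes two more blocks from Y.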
Lemma 1: To minimize the average latency time of the system during pass 2, PIRATE must delete those blocks corresponding to the object with the lowest heat, independent of the object sizes.

Proof: Without loss of generality, assume F = {X, Y}, size(block) = 1, and $B_{Tertiary} = 1$. Assume a request arrives at $t_0$ referencing Z and the disk capacity is exhausted. Let $t_1$ be the time when a portion of X and/or Y is deleted. We define $\mu_0$ ($\mu_1$) to be the average latency at $t_0$ ($t_1$); see Equation  . Subsequently, $disk_0(i)$ and $disk_1(i)$ represent the disk resident fraction of object i at time $t_0$ and $t_1$, respectively. Let $\delta_i$ denote the number of blocks of object i deleted from disk at time $t_1$, i.e., $\delta_i = disk_0(i) - disk_1(i)$. By deleting X and/or Y partially, we increase the average latency by $\Delta = \mu_1 - \mu_0$. However, since deletion is unavoidable, the objective is to minimize $\Delta = \mu_1 - \mu_0$ while $absent(Z) = \delta_X + \delta_Y$.

$$\mu_0 = heat(X) \times (size(S_{X,1}) - disk_0(X)) + heat(Y) \times (size(S_{Y,1}) - disk_0(Y))$$
$$\mu_1 = heat(X) \times (size(S_{X,1}) - disk_1(X)) + heat(Y) \times (size(S_{Y,1}) - disk_1(Y))$$

Thus:

$$\Delta = heat(X) \times \delta_X + heat(Y) \times \delta_Y$$
$$\Delta = heat(X) \times \delta_X + heat(Y) \times (absent(Z) - \delta_X)$$
$$\Delta = heat(Y) \times absent(Z) + \delta_X \times (heat(X) - heat(Y))$$

Since $heat(Y) \times absent(Z)$ and $(heat(X) - heat(Y))$ are constants, in order to minimize $\Delta$ we can only vary $\delta_X$ (this impacts $\delta_Y$ because $absent(Z) = \delta_X + \delta_Y$). If heat(X) > heat(Y), then $(heat(X) - heat(Y))$ is a positive value; hence, in order to minimize $\Delta$, the value of $\delta_X$ should be minimized, i.e., the object with higher heat (X) should not be replaced. On the other hand, if heat(X) < heat(Y), then $(heat(X) - heat(Y))$ is a negative value; hence, in order to minimize $\Delta$, the value of $\delta_X$ should be maximized, i.e., the object with lower heat (X) should be replaced. This demonstrates that the amount of data deleted from victims ($\delta_i$) in order to free up disk space depends only on heat(i) and not size(i).

Extended PIRATE
Extended PIRATE is a generalization of simple PIRATE that can be customized to strike a compromise between the two goals: to minimize either the average latency time of the system or the variance in the latency time. The major difference between simple and extended PIRATE is as follows. Extended PIRATE (see Figure  ) requires a minimum fraction (termed least(x)) of the most frequently accessed objects to be disk resident. Logically, extended PIRATE operates in three passes. Its first pass is identical to that of simple PIRATE. If this pass fails to provide sufficient disk space for the referenced object, then during the second pass it deletes from objects until each of their disk resident portions corresponds to least(x). If pass two fails (i.e., provides insufficient space to materialize the referenced object), it enters pass 3. This pass is identical to pass 2 of simple PIRATE, where objects are deleted in their entirety starting with the one that has the lowest heat.

    define pfs: potential free space
    define rds: required disk space
    rds ← absent(Z) − free_disk_space
    repeat
        victim ← object i from set F with the lowest heat and disk(i) > size(S_{i,1})
        if (victim is NOT null) then
            pfs ← disk(victim) − size(S_{victim,1})
        else
            victim ← object i from set F with the lowest heat and disk(i) > least(i)
            if (victim is NOT null) then
                pfs ← disk(victim) − least(victim)
            else
                victim ← object i from set F with the lowest heat
                pfs ← disk(victim)
        if (pfs > rds) then
            disk(victim) ← disk(victim) − rds
            rds ← 0
        else
            disk(victim) ← disk(victim) − pfs
            rds ← rds − pfs
    until (rds = 0)

Figure  : Extended PIRATE

(Footnote to the proof of the first lemma: In computing the average latency, we ignore the latency of the other objects in the database as well as the repositioning time of the tertiary. This is because it only adds a constant to both averages, $\mu_0$ and $\mu_1$, which is eliminated by the computation of their difference.)
With extended PIRATE, least(X) for each disk resident object is defined as follows:

$$least(X) = \min(\lceil knob \times heat(X) \times size(S_{X,1}) \rceil, \; size(S_{X,1}))$$

where knob is an integer whose lower bound is zero. The minimum function prevents least(x) from exceeding the size of the first slice. When knob = 0, extended PIRATE is identical to simple PIRATE. As knob increases, a larger portion of each object becomes disk resident. Obviously, the ideal case is to increase the knob until the first slice of all the objects becomes disk resident. However, due to the limited storage capacity, this might be infeasible. By increasing the knob, we force a portion of some objects with lower heat to remain disk resident at the expense of deleting from objects with a high heat. By providing each request referencing an object a latency time proportional to the heat of that object, extended PIRATE improves the variance while not increasing the average dramatically. There is an optimal value for knob that minimizes $\sigma^2$. If the value of knob exceeds this value, then $\sigma^2$ starts to increase also.
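The definition of least(X) can be sketched in Python as follows. The function name and argument names are assumptions introduced here for illustration; sizes are in blocks.

```python
import math


def least(knob: int, heat_x: float, first_slice_blocks: int) -> int:
    """Minimum number of blocks of X kept disk resident by extended PIRATE:
    min(ceil(knob * heat(X) * size(S_X,1)), size(S_X,1))."""
    if knob < 0:
        raise ValueError("knob has a lower bound of zero")
    return min(math.ceil(knob * heat_x * first_slice_blocks), first_slice_blocks)


# Hypothetical values: a 6-block first slice.
print(least(0, 0.5, 6))    # knob = 0 protects nothing, as in simple PIRATE
print(least(2, 0.25, 6))   # a partial minimum portion
print(least(10, 0.5, 6))   # capped at the size of the first slice
```

The three calls show the three regimes of the definition: no protected portion, a protected portion that grows with knob and heat, and the cap imposed by the minimum function.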
Lemma 2: The optimal value for knob is $\frac{C}{Avg\_Slice}$.

Proof: Let U be the total number of unique objects that are referenced over a period of time. We define:

$$Avg\_Slice = \sum_{x} heat(x) \times size(S_{x,1})$$
$$Avg\_Heat = \frac{\sum_{x} heat(x)}{U} = \frac{1}{U}$$
$$Avg\_Least = knob \times Avg\_Heat \times Avg\_Slice$$

The ideal case is when the least of almost all the objects that constitute the database is disk resident (C is the total number of disk blocks):

$$C = U \times Avg\_Least = U \times knob \times \frac{1}{U} \times Avg\_Slice$$

Solving for knob, we obtain $knob = \frac{C}{Avg\_Slice}$.

Substituting the optimal value of knob in Equation  , we obtain:

$$least(X) = \frac{heat(X) \times size(S_{X,1})}{\sum_{i=1}^{m} heat(i) \times size(S_{i,1})} \times C$$

This is intuitively the amount of disk space an object X deserves. Section  employs a simulation study to confirm this analytical result. Note that because heat is considered in the computation of least, with $knob = \frac{C}{Avg\_Slice}$ the average latency time degrades proportional to the improvement in variance. In summary, when knob = 0, PIRATE replaces objects in a manner that minimizes average latency time. However, when $knob = \frac{C}{Avg\_Slice}$, it minimizes the variance. To observe, consider the following discussion.
In the long run, with knob = 0, PIRATE maintains the first slice of all the objects with the highest heat disk resident, while the others compete with each other for a small portion of the disk space (see Figure  a). To approximate the number of objects that become disk resident with knob = 0 ($\gamma$), we use the average size Avg_Slice as follows:

$$\gamma = \frac{C}{Avg\_Slice}$$

However, with $knob = \frac{C}{Avg\_Slice}$, PIRATE maintains only a minimum portion of all these $\gamma$ objects disk resident. To achieve this, in the optimal case it requires $\gamma \times Avg\_Least = \gamma \times \frac{C}{U}$ of disk space. The rest of the disk space ($C - \gamma \times \frac{C}{U}$) can be used for the minimum portion of the other objects (see Figure  b). Therefore, in the long run, the number of disk resident objects with $knob = \frac{C}{Avg\_Slice}$ (i.e., $\frac{C}{Max\_Size \times Max\_Heat \times knob}$  U) is larger than $\gamma$. However, with knob = 0 the first slice of $\gamma$ objects is disk resident, while with $knob = \frac{C}{Avg\_Slice}$ only least of each of the $\gamma$ objects is disk resident. This results in the following tradeoff. On one hand, a request referencing an object Z has a higher hit ratio with $knob = \frac{C}{Avg\_Slice}$ as compared to knob = 0. On the other hand, a hit with knob = 0 translates into a fraction of a second latency time, while with $knob = \frac{C}{Avg\_Slice}$ it results in a minimum latency time of $(size(S_{Z,1}) - least(Z)) \times \frac{size(block)}{B_{Tertiary}}$. This explains why, with $knob = \frac{C}{Avg\_Slice}$, PIRATE improves the variance proportional to the degradation in average latency.

[Figure  : Status of the first slice of objects, ordered by decreasing heat, showing the disk resident and absent portions for (a) knob = 0 and (b) knob = C/Avg_Slice]

Performance Evaluation
We implemented a simulation model to (1) verify the analytical models of this study, and (2) compare PIRATE with Atomic. For the purposes of this evaluation, we assumed a [?] gigabyte disk drive consisting of [?] cylinders, each cylinder with a capacity of [?] megabytes. The bandwidth of the disk drive was [?] mbps. The capacity of the tertiary store was set at [?] gigabytes. Its bandwidth was [?] mbps. The system was configured with a [?] megabyte block size.

Table [?]: Average latency time in seconds (standard deviation in parentheses) for PIRATE under the skewed and uniform distributions of access, as a function of knob (the rows include knob = C/Avg_Slice1).

Footnote to the optimal-case disk space estimate: This is optimistic because these objects have the highest heats, and thus a large minimum portion. While it is not realistic to use Avg_Least in the equation, it is useful for approximation.
The database consists of [?] objects, each with a [?] mbps bandwidth requirement ($B_{\mathit{Display}}$ = [?] mbps). The size of the objects was varied from [?] megabytes (display time of [?] seconds) to [?] megabytes (display time of [?] minute and [?] seconds). The average object size was [?] megabytes (display time of one minute). Based on the physical characteristics of the system, the average size of the first slice of the objects was [?] megabytes.
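The pairing of object sizes with display times above follows from the constant display bandwidth: display time is the object size divided by B_Display. A minimal sketch of that conversion, assuming megabytes of 10^6 bytes and a hypothetical display rate of 1.5 mbps (the report's actual figures are not reproduced here):

```python
def display_time_seconds(object_size_mb: float, display_mbps: float) -> float:
    """Seconds needed to display an object at a constant bandwidth.

    object_size_mb is in megabytes (10**6 bytes); display_mbps is in megabits per second.
    """
    if display_mbps <= 0:
        raise ValueError("display bandwidth must be positive")
    return object_size_mb * 8.0 / display_mbps


if __name__ == "__main__":
    # Hypothetical values chosen so the result is a round number.
    print(display_time_seconds(11.25, 1.5), "seconds")
```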
We manipulated the mean of an exponential distribution to model two alternative distributions of access to the objects: skewed (mean = [?]) and uniform (mean = [?]). Table [?] presents the obtained results as a function of knob. (knob has no impact on Atomic.)
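One way to picture how the mean of an exponential distribution controls the skew of references is sketched below. The mapping from an exponential variate to an object rank (truncate to an integer and redraw anything past the last object) is an assumption made for this example, as are the object count, the two means, and the function names; the report does not spell out its own mapping.

```python
import random


def sample_object_ranks(num_objects: int, mean: float, num_requests: int, seed: int = 0):
    """Draw object ranks (0 = hottest) from a truncated exponential distribution."""
    rng = random.Random(seed)
    ranks = []
    while len(ranks) < num_requests:
        rank = int(rng.expovariate(1.0 / mean))  # expovariate takes the rate, 1/mean
        if rank < num_objects:  # redraw variates that fall past the last object
            ranks.append(rank)
    return ranks


def share_of_hottest(ranks, top_k: int) -> float:
    """Fraction of requests that reference the top_k hottest objects."""
    return sum(1 for r in ranks if r < top_k) / len(ranks)


if __name__ == "__main__":
    skewed = sample_object_ranks(num_objects=1000, mean=10.0, num_requests=10_000)
    flat = sample_object_ranks(num_objects=1000, mean=1000.0, num_requests=10_000)
    print("share of requests to the 10 hottest objects (small mean):", share_of_hottest(skewed, 10))
    print("share of requests to the 10 hottest objects (large mean):", share_of_hottest(flat, 10))
```

A small mean concentrates most requests on a handful of objects, so a small disk-resident set already covers them; a large mean spreads requests across the database, which is the regime where disk space becomes scarce.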
With a skewed distribution of access, Atomic provides an average latency time of [?] seconds with a standard deviation of [?]. Both Atomic and PIRATE observe at least a [?] second latency due to the seek and transfer time of the first block. The improvement with PIRATE comes from maintaining the first slice of the infrequently accessed objects disk resident. This improvement is marginal because the disk space is not a scarce resource. However, by maintaining the first slice of these objects, PIRATE results in a significantly lower standard deviation when compared with Atomic ([?] versus [?]). The value of knob has no impact on the observed variance because PIRATE executes neither pass [?] nor pass [?] of Figure [?].

With a uniform distribution of access, the available
disk space becomes a scarce resource. Atomic results in an average latency time of [?] seconds and a standard deviation of [?]. Table [?] presents the obtained results with PIRATE. When knob = 0, PIRATE results in a significantly lower latency time and a lower standard deviation as compared to Atomic, by maintaining the first slice of the frequently accessed objects disk resident. Note the change in both the average latency and the standard deviation as a function of the knob. The optimal value of knob, evaluated using Lemma [?] of Section [?], results in the ideal compromise between the average latency and the standard deviation. The percentage degradation in the average latency time with knob = [?] as compared to knob = [?] is [?]. The variance improves by the same percentage. A random choice of value for the knob (e.g., either [?] or [?] in Table [?]) can result in a degradation in average latency with only a marginal improvement in the variance. These results verify the analytical models of Section [?].

In a second set of experiments, we analyzed the
percentage of requests that observe a certain latency time in the system. The parameters are the same as in the previous experiment. The obtained result for a uniform distribution of access (mean = [?]) is presented in Figure 6. In these figures, the x-axis represents the incurred latency time, while the y-axis represents the percentage of requests that observed this latency time. With knob = 0, PIRATE results in a significantly higher percentage of requests that observe a zero second latency time ([?] in Figure 6c). However, when compared with knob = $C/\mathit{Avg\_Slice}_1$, it results in a significantly lower percentage of requests that observe a latency time that varies from [?] to [?] seconds (compare Figure 6c with Figure 6e). Figures 6b, 6d, and 6f present the magnified portion of the curves in Figures 6a, 6c, and 6e, respectively. Note that [?] of requests observe a latency time higher than [?] seconds when knob = $C/\mathit{Avg\_Slice}_1$, as compared to [?] with knob = 0 (Figures 6d and 6f). These results demonstrate the tradeoff between the variance and the average latency time of the system.
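The quantities compared in this section (average latency, standard deviation, the share of zero-latency requests, and the share of requests above a latency threshold) can all be computed from a list of per-request latencies. The sketch below assumes such a list is available from a simulation run; the two sample lists are invented to mimic the shape of the tradeoff, not taken from the reported results, and the population standard deviation is an assumption since the report does not say which estimator it used.

```python
import statistics


def summarize_latencies(latencies, threshold_seconds: float) -> dict:
    """Summary statistics for a non-empty list of per-request latencies (seconds)."""
    n = len(latencies)
    return {
        "mean": statistics.fmean(latencies),
        "stdev": statistics.pstdev(latencies),  # population standard deviation (assumed)
        "pct_zero": 100.0 * sum(1 for x in latencies if x == 0) / n,
        "pct_above": 100.0 * sum(1 for x in latencies if x > threshold_seconds) / n,
    }


if __name__ == "__main__":
    # Invented samples: many instant hits plus a few long waits, versus
    # uniformly moderate waits with no long tail.
    mostly_hits = [0, 0, 0, 0, 0, 0, 0, 0, 100, 100]
    moderate_waits = [15, 15, 15, 15, 15, 35, 35, 35, 35, 35]
    print(summarize_latencies(mostly_hits, threshold_seconds=60))
    print(summarize_latencies(moderate_waits, threshold_seconds=60))
```

The first sample has the lower mean but a much larger spread and a visible tail above the threshold; the second gives up some average latency in exchange for a tighter distribution, which is the kind of exchange the knob controls.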
Conclusion and Future Research Directions

This study investigates the role of hierarchical multimedia storage managers for continuous media data types (audio and video objects). We described a pipelining mechanism that overlaps the display of a portion of an object from the disk drive with the materialization of its remainder from the tertiary. This minimizes the latency time observed for requests referencing those objects that are not disk resident. Moreover, it reduces the fraction of an object that should be disk resident in order for the user to observe the latency provided by the disk drive. In addition, we introduced PIRATE as a technique to manage the available disk space. It provides the user with a knob that can be fine tuned to minimize either the average latency time of the system or the variance in latency.

We intend to extend this study in several directions.

Figure 6: Comparison of Atomic and PIRATE. Panels: (a) Atomic; (b) Atomic (magnified); (c) PIRATE (knob = 0); (d) PIRATE (knob = 0, magnified); (e) PIRATE (knob = 25); (f) PIRATE (knob = 25, magnified). Each panel plots the percentage of requests (%) against latency (in seconds).
First, this study assumed a single user displaying a single object. The design of an extended PIRATE in the presence of either a single user displaying several independent objects simultaneously or multiple users accessing different objects requires further investigation. In both cases, the current design could cause the tertiary to become a bottleneck for the system, reducing the utilization of the disk drive and resulting in a lower throughput. Second, an environment that consists of several disk drives and tertiary storage devices provides PIRATE with additional options to consider (this is especially true for multiple users). Finally, we intend to demonstrate the feasibility of the proposed ideas by implementing them.
References

[ARA] P. Ang, P. Ruetz, and D. Auld. Video Compression Makes Big Gains. IEEE Spectrum, pages [?].

[BGMJ] S. Berson, S. Ghandeharizadeh, R. Muntz, and X. Ju. Staggered striping in multimedia information systems. In Proceedings of the ACM SIGMOD International Conference on Management of Data.

[CABK] G. Copeland, W. Alexander, E. Boughter, and T. Keller. Data Placement in Bubba. In Proceedings of the ACM SIGMOD International Conference on Management of Data, pages [?].

[CL] H. J. Chen and T. Little. Physical Storage Organizations for Time-Dependent Multimedia Data. In Proceedings of the Foundations of Data Organization and Algorithms (FODO) Conference, October [?].

[CLR] Thomas H. Cormen, Charles E. Leiserson, and Ronald L. Rivest, editors. Introduction to Algorithms. The MIT Press and McGraw-Hill Book Company.

[CP] D. K. Campbell and K. Proehl. Optical Advances. BYTE Magazine, March [?].

[Fox] E. A. Fox. Advances in Interactive Digital Multimedia Systems. IEEE Computer, pages [?], October [?].

[Gal] D. Le Gall. MPEG: a video compression standard for multimedia applications. Communications of the ACM, April [?].

[GR] S. Ghandeharizadeh and L. Ramos. Continuous retrieval of multimedia data using parallelism. IEEE Transactions on Knowledge and Data Engineering, August [?].

[Has] B. Haskell. International standards activities in image data compression. In Proceedings of Scientific Data Compression Workshop, pages [?]. NASA conference Pub [?], NASA Office of Management, Scientific and technical information division.

[NY] R. Ng and J. Yang. Maximizing Buffer and Disk Utilization for News On-Demand. In Proceedings of the International Conference on Very Large Databases.

[RV] P. Rangan and H. Vin. Efficient Storage Techniques for Digital Continuous Media. IEEE Transactions on Knowledge and Data Engineering, August [?].

[RW] C. Ruemmler and J. Wilkes. An Introduction to Disk Drive Modeling. IEEE Computer, March [?].

[TPBG] F. A. Tobagi, J. Pang, R. Baird, and M. Gang. Streaming RAID: A Disk Array Management System for Video Files. In First ACM Conference on Multimedia, August [?].
Asset Metadata
Creator Shahabi, Cyrus (author), Ghandeharizadeh, Shahram (author)
Core Title USC Computer Science Technical Reports, no. 578 (1994) 
Alternative Title On multimedia repositories personal computers and hierarchical storage systems (title) 
Publisher Department of Computer Science,USC Viterbi School of Engineering, University of Southern California, 3650 McClintock Avenue, Los Angeles, California, 90089, USA (publisher) 
Tag OAI-PMH Harvest 
Format 10 pages (extent), technical reports (aat) 
Language English
Unique identifier UC16270998 
Identifier 94-578 On Multimedia Repositories Personal Computers and Hierarchical Storage Systems (filename) 
Legacy Identifier usc-cstr-94-578 
Rights Department of Computer Science (University of Southern California) and the author(s). 
Internet Media Type application/pdf 
Copyright In copyright - Non-commercial use permitted (https://rightsstatements.org/vocab/InC-NC/1.0/)
Source 20180426-rozan-cstechreports-shoaf (batch), Computer Science Technical Report Archive (collection), University of Southern California. Department of Computer Science. Technical Reports (series) 
Access Conditions The author(s) retain rights to their work according to U.S. copyright law. Electronic access is being provided by the USC Libraries, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright. 
Repository Name USC Viterbi School of Engineering Department of Computer Science
Repository Location Department of Computer Science. USC Viterbi School of Engineering. Los Angeles, CA, 90089
Repository Email csdept@usc.edu