USC Computer Science Technical Reports, no. 584 (1994)
A Pipelining Mechanism to Minimize the Latency Time in
Hierarchical Multimedia Storage Managers
Shahram Ghandeharizadeh, Ali Dashti, Cyrus Shahabi
Department of Computer Science
University of Southern California
Los Angeles, California

Abstract
An emerging area of database system research is to provide support for continuous media data types such as digital audio and video. These data types are expected to play a major role in applications such as library information systems, scientific databases, entertainment technology, etc. They require both a high volume of storage and a high bandwidth for their continuous display. The storage organization of systems that support these data types is expected to be hierarchical, consisting of one or more tertiary storage devices, several disk drives, and some memory. The database resides permanently on the tertiary storage device. The disk drives store a number of frequently accessed objects, while the memory is used to stage a small fraction of a referenced object for immediate display.

When a user references an object that is tertiary resident, if the system elects to materialize the object on the disk drives in its entirety before initiating its display, then the user would observe a high latency time. This paper describes a general purpose pipelining mechanism that overlaps the display of an object with its materialization on the disk drives in order to minimize the latency time of the system. The pipelining mechanism is novel because it ensures a continuous retrieval of an object to a display station in order to support its continuous display.

Introduction
During the past few years, information technology has evolved to store and retrieve digital audio and video data types, termed continuous media data types [MWS]. Systems that support this data type are commonly referred to as multimedia information systems. These systems utilize a variety of human senses to provide an effective means of conveying information and play a major role in educational applications, library information systems, etc. A challenging task when implementing these systems is to support the sustained bandwidth required to display continuous media objects [GRAQ, SAD, MWS]. This is due to the low I/O bandwidth of the current disk technology, the high bandwidth requirement of continuous media data types, and the large size of their objects that almost always requires them to be disk resident. For example, a one minute uncompressed video object based on HDTV is approximately six gigabytes in size and requires megabit per second (mbps) bandwidth to support its continuous display. In the presence of the low bandwidth of the current disk technology, one may employ either compression (lossless or lossy, see [Fox] for an overview), declustering [GS], striping [TPBG], or a combination of these techniques [BGMJ] to support a continuous retrieval of multimedia objects.

Footnote: This research was supported in part by the National Science Foundation under grants IRI, IRI (NYI award), and CDA, a Hewlett-Packard research grant, a TRW research grant, and a research grant from AT&T/NCR/Teradata.

The storage organization of systems that support multimedia applications is expected to be hierarchical, consisting of a tertiary storage device, a group of disk drives, and some memory [GS, MWS, BGMJ]. The database resides permanently on the tertiary storage device, and its objects are materialized on the disk drives on demand and deleted from the disk drives when the disk storage capacity is exhausted. A small fraction of a referenced object is staged in memory to support its display. The reason for expecting hierarchical storage managers is the cost of storage. At the time of this writing, the approximate cost per megabyte of memory is disk storage is and tertiary storage is less than. It is economical to stage the data at the different levels of the hierarchy in the following manner: a small fraction of an object in memory for immediate display, a number of frequently accessed objects on the disk drives, and the remaining objects on the tertiary storage device.

One might be tempted to replace the magnetic disk drives with the tertiary storage devices in order to reduce the cost further. This is not appropriate for the frequently referenced objects that require a fraction of a second transfer initiation delay, i.e., the time elapsed from when a device is activated until it starts to produce data. This delay is determined by the time required for a device to reposition its read head to the physical location containing the referenced data; this time is significantly longer for a tertiary storage device (ranges from several seconds to minutes) as compared to that for a magnetic disk drive (ranges from to milliseconds). Similarly, the tertiary storage device should not be replaced by magnetic disk drives because the cost of storage increases, and it might be acceptable for some applications to incur a high latency time for infrequently referenced objects.

The current technology trends in the area of tertiary storage devices are a rapid increase in both storage capacity and sustained bandwidth, a rapid decline in cost per megabyte of storage, and a modest improvement in their head reposition time (with the distance of travel determining the duration of this time) [Sch]. For example, the mm helical tape drives introduced in the late s can store five gigabytes of data at a cost of megabyte and support a mbps sustained transfer rate. One version of this technology supports a rack of tapes to provide gigabytes of storage at a cost of megabyte; this organization increases the time required to reposition the read head (by introducing switching time from one storage media to another) in favor of both a larger storage capacity and a lower cost per megabyte of storage. Recently, Digital introduced the DLT tape drives based on the linear tape technology. It provides gigabyte of storage and supports a sustained transfer rate of mbps at a cost less than megabyte. Similarly, METRUM has introduced a tertiary device based on helical scan technology, named RSSb. It provides terabyte of storage and supports a sustained transfer rate of mbps at a cost less than megabyte. Various tertiary devices based on Peregrine can support sustained bandwidths ranging from to mbps [Sch]. The near future calls for tertiary devices that store hundreds of terabytes (if not petabytes) of data and support sustained transfer rates ranging from to mbps.

When a request references an object, we term the time elapsed from when the request arrives until the onset of its display as the latency time incurred by this request. With a hierarchical storage organization, when a request references an object that is not disk resident, one approach might materialize the object on the disk drives in its entirety before initiating its display. In this case, assuming a zero system load, the latency time of the system is determined by the time for the tertiary to reposition its read head to the starting address of the referenced object, the bandwidth of the tertiary storage device, and the size of the referenced object. Assuming that the referenced object is continuous media (e.g., audio, video) and requires a sequential retrieval to support its display, a superior alternative is to use pipelining in order to minimize the latency time. Briefly, the pipelining mechanism splits an object into $s$ logical slices ($S_1, S_2, S_3, \ldots, S_s$) such that the display time of $S_1$ overlaps the time required to materialize $S_2$, the display time of $S_2$ overlaps the time to materialize $S_3$, so on and so forth. This ensures a continuous display while reducing the latency time because the system initiates the display of an object once a fraction of it (i.e., $S_1$) becomes disk resident.

Another advantage of pipelining is that it enhances the useful utilization of resources when a user decides to abort the display of a referenced object. To illustrate, consider a user that requests an obscure (i.e., tertiary resident) minute video object and decides that it is not of interest after a few minutes of display. With pipelining, the display of the object is overlapped with its materialization, and once the user aborts the display, the system can abort the pipeline, preventing the tertiary from materializing the remaining slices of the referenced object (instead, the tertiary can be used to service some other request). Without pipelining, in addition to forcing the user to wait for a longer interval of time, the system would have had to use the tertiary for a longer interval of time in order to stage the entire object on the disk clusters.

Footnote: The transfer rates quoted in this paper assume a sequential read of the referenced data (no reposition times). The maximum transfer rate of tertiary devices is typically double their sustained transfer rate. Due to the continuous bandwidth requirement of video and audio objects, we use the sustained bandwidths provided by the vendors.

Figure: Three alternative dataflow paradigms

The contribution of this paper is the design of a general-purpose pipelining mechanism for continuous media that minimizes the latency time of the system while ensuring a continuous retrieval of the referenced object. It can be used with a tertiary storage device whose bandwidth is either equal to, higher, or lower than the bandwidth required to display an object. The pipelining mechanism is described assuming an architecture that consists of some memory, several disk drives, and a tertiary storage device. We consider two alternative organizations of these components: (1) memory serves as an intermediate staging area between the tertiary storage device, the disk drives, and the display stations, and (2) the tertiary storage device is visible only to the disk drives via a fixed size memory. With the first organization, the system may elect to display an object from the tertiary storage device by using the memory as an intermediate staging area. With the second organization, the data must first be staged on the disk drives before it can be displayed. We capture these two organizations using three alternative paradigms for the flow of data among the different components:
Sequential Data Flow (SDF): The data flows from tertiary to memory (STREAM 1 of Figure), from memory to the disk drives (STREAM 2), from the disk drives back to memory (STREAM 3), and finally from memory to the display station referencing the object (STREAM 4).
Neither simulated nor implemented.
Parallel Data Flow (PDF): The data flows from the tertiary to memory (STREAM 1), and from memory to both the disk drives and the display station in order to materialize (STREAM 2) and display (STREAM 4) the object simultaneously. PDF eliminates STREAM 3.
Incomplete Data Flow (IDF): The data flows from tertiary to memory (STREAM 1) and from memory to the display station (STREAM 4) to support a continuous retrieval of the referenced object. IDF eliminates both STREAM 2 and 3.

Figure models the second architecture (tertiary storage is accessible only to the disk drives) by partitioning the available memory into two regions: one region serves as an intermediate staging area between tertiary and disk drives (used by STREAM 1 and 2), while the second serves as a staging area between the disk drives and the display stations (used by STREAM 3 and 4). SDF can be used with both architectures. However, neither PDF nor IDF is appropriate for the second architecture because the tertiary is accessible only to the disk drives. When the bandwidth of the tertiary storage device is lower than the bandwidth required by an object, SDF is more appropriate than both PDF and IDF because it minimizes the amount of memory required to support a continuous display of an object. IDF is ideal for cases where the expected future access to the referenced object is so low that it should not become disk resident, i.e., IDF prevents this object from replacing other disk resident objects.
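
The three paradigms differ only in which streams they use, and a small lookup can make that explicit. The following is an illustrative Python sketch, not part of the report. It assumes the streams are numbered 1 through 4 in the order the SDF description lists them (tertiary to memory, memory to disk, disk to memory, memory to display); the names `STREAMS`, `usable_paradigms`, and `makes_object_disk_resident` are hypothetical.

```python
# Streams used by each dataflow paradigm. The numbering is an assumption:
# 1 tertiary->memory, 2 memory->disk, 3 disk->memory, 4 memory->display.
STREAMS = {
    "SDF": (1, 2, 3, 4),
    "PDF": (1, 2, 4),
    "IDF": (1, 4),
}


def usable_paradigms(tertiary_memory_visible_to_display: bool) -> list:
    """Return the paradigms that an architecture can support.

    PDF and IDF send data from the tertiary staging memory straight to the
    display station, so they need the first organization, in which that
    memory is visible to the display stations. SDF works with both.
    """
    if tertiary_memory_visible_to_display:
        return ["SDF", "PDF", "IDF"]
    return ["SDF"]


def makes_object_disk_resident(paradigm: str) -> bool:
    """A paradigm materializes the object on disk only if it uses STREAM 2."""
    return 2 in STREAMS[paradigm]
```

For instance, `usable_paradigms(False)` leaves only SDF, matching the observation that neither PDF nor IDF suits the second architecture.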

The rest of this paper is organized as follows. In Section 2, we describe how our target environment supports a continuous display of an object. Section 3 uses this framework to describe the pipelining mechanism assuming that the database consists of a single media type with a fixed bandwidth requirement ($B_{Display}$). We describe our technique from the perspective of a tertiary storage device whose bandwidth is either higher or lower than the bandwidth requirement of a media type ($B_{Display}(X)$). Extension of the pipelining mechanism to a database that consists of a mix of media types, each with a different bandwidth requirement, is described briefly in Appendix A. Our conclusion and future research directions are described in the concluding section.

Overview
Given a system that consists of $D$ disk drives and a database that consists of objects that belong to a single media type with bandwidth requirement $B_{Display}$, we utilize the aggregate bandwidth of $d$ disk drives to support a continuous display of an object. This is achieved as follows. First, the $D$ disk drives in the system are partitioned into $R$ clusters, where $R = \lfloor D/d \rfloor$. Next, each object in the database, say $X$, is striped [SGM] into $n$ equi-sized subobjects $X_1, X_2, \ldots, X_n$. Each subobject $X_i$ represents a continuous portion of $X$. When $X$ is materialized from the tertiary storage device, its subobjects are assigned to the clusters in a round-robin manner, starting with an available cluster. The primary reason for a round-robin assignment of the subobjects to the clusters is to distribute the workload imposed by the display of an object evenly across the clusters. This prevents a cluster from becoming the bottleneck for the system, maximizing its processing capability. In a cluster, a subobject is declustered [RE, LKB, GDQ] into $d$ pieces (termed fragments), with each fragment assigned to a different disk in the cluster. The value of $d$ is chosen such that the bandwidth of a cluster is greater than or equal to the bandwidth required to display an object.
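
The partitioning and placement rules above can be sketched in a few lines of Python. This is an illustration only: it assumes the bandwidth of a cluster is simply $d$ times a per-disk bandwidth, and the parameter `b_disk` and all function names are hypothetical rather than taken from the report.

```python
import math


def disks_per_cluster(b_display: float, b_disk: float) -> int:
    """Smallest d such that d * b_disk >= b_display.

    Assumes a cluster's bandwidth is the sum of its disks' bandwidths.
    """
    return math.ceil(b_display / b_disk)


def number_of_clusters(total_disks: int, d: int) -> int:
    """R = floor(D / d)."""
    return total_disks // d


def round_robin_assignment(n: int, num_clusters: int, start_cluster: int) -> list:
    """Cluster index (0-based) holding each of the subobjects X_1 .. X_n."""
    return [(start_cluster + i) % num_clusters for i in range(n)]
```

With 12 disks and `d = 4`, the system has three clusters, and a five-subobject object that starts on cluster 1 is placed on clusters 1, 2, 0, 1, 2.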

Assuming that the bandwidth of each cluster is high enough that it can be multiplexed between $U_{Cluster}$ requests, the memory used to display objects is partitioned into $R \times U_{Cluster}$ frames. To ensure a continuous display of an object, the system maintains a time cycle for each cluster. A time cycle consists of $U_{Cluster}$ time intervals (also termed slots). A time interval is the time required for a cluster to reposition its disk heads and transfer a subobject into a memory frame. Given a request for an object $X$ that consists of $n$ subobjects, the system reserves $n$ time intervals on behalf of this request, one per time cycle of the system. Relative in time, the distance between any two time slots reserved on behalf of a request is $U_{Cluster}$. The cluster employed in the first time cycle is the one containing $X_1$, say $C_i$. The display of $X$ starts once $X_1$ is staged in memory. In the second cycle, cluster $C_{(i+1) \bmod R}$ is employed to read $X_2$. The organization of both the cycles and intervals is such that $X_2$ is memory resident immediately before the display of $X_1$ completes (this explains why the memory is partitioned into $R \times U_{Cluster}$ frames). This is achieved by setting the duration of a time cycle to be equivalent to the display time of a subobject:

$$TimeCycle = \frac{size(subobject)}{B_{Display}}$$

The system switches from the memory frame containing $X_1$ to the one containing $X_2$ in order to support a continuous display of $X$. The system iterates over the clusters and memory frames until $X$ is displayed in its entirety, employing a single cluster in each time cycle.
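
As a rough illustration of the time-cycle rule and the cluster rotation, the following Python sketch computes the cycle duration and the cluster read in each cycle. It is not the report's implementation; the function names are hypothetical, clusters are indexed from 0, and sizes are assumed to be in megabits so that dividing by a rate in mbps yields seconds.

```python
def time_cycle_seconds(subobject_size_megabits: float, b_display_mbps: float) -> float:
    """Duration of a time cycle: the display time of one subobject."""
    return subobject_size_megabits / b_display_mbps


def cluster_in_cycle(first_cluster: int, cycle: int, num_clusters: int) -> int:
    """Cluster read in a given cycle; cycle 0 reads X_1 from first_cluster."""
    return (first_cluster + cycle) % num_clusters


def display_schedule(n: int, first_cluster: int, num_clusters: int) -> list:
    """Pairs (subobject index, cluster), one per time cycle, for X_1 .. X_n."""
    return [(k + 1, cluster_in_cycle(first_cluster, k, num_clusters)) for k in range(n)]
```

A four-subobject object whose first subobject lives on cluster 2 of three clusters is read from clusters 2, 0, 1, 2 in consecutive cycles.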

Example: Assume a system that consists of disk clusters. Moreover, assume that the bandwidth of each cluster is twice the bandwidth required to display an object ($U_{Cluster} = 2$). Let the following three objects reside on the disk clusters: X, Y, and Z. The size of a subobject of each of these objects is identical. X is striped into subobjects, while each of Y and Z is striped into subobjects (X is larger in size than both Y and Z). These objects are assigned to the clusters in a round-robin manner, starting with a different cluster for each object, say cluster for X, cluster for Y, and cluster for Z (the assignment of subobjects to the different clusters is shown in Table).

Footnote: This equation is explained in the following description.

Figure: A schedule for servicing three requests

Assume that three requests are issued, each referencing a different object, in the following order of arrival: X, Y, followed by Z. Figure demonstrates the scheduling of the time intervals to display the different subobjects as a function of time. This schedule is possible due to the assignment of Table, allowing the system to display X, Y, and Z in the same time cycle. Note that $X_i$ employs the same time interval in each time cycle (interval). Moreover, relative in time, the slots reserved on behalf of a display (say Y) are one slot ($U_{Cluster}$) apart. Thus, the time cycle equation ensures that $Y_{i+1}$ is memory resident before the display of $Y_i$ completes.

Each of STREAM 2 and 3 in Figure requires the use of time intervals. As demonstrated by Example, STREAM 3 requires a single time interval. The number of time intervals required by STREAM 2 depends on the bandwidth of the tertiary storage device.

When materializing an object from tertiary, if the bandwidth of tertiary is lower than the bandwidth required to display the object, then the tertiary cannot produce an entire subobject during each time interval to be flushed to a disk cluster. This is not a problem for a system that consists of a single cluster ($R = 1$). However, this is a problem when $R > 1$ because the layout of an object across the disk drives is not sequential (a fragment does not represent a contiguous portion of an object, see [GRAQ, BGMJ]). Consequently, when materializing object X, the tertiary produces $\frac{B_{Tertiary}}{B_{Display}}$ of $X_1$ during the first time cycle. During the second time cycle, if the system forces the tertiary storage device to reposition its read head in order to produce $\frac{B_{Tertiary}}{B_{Display}}$ of $X_2$, the tertiary may incur an unacceptable overhead. One approach to resolve this mismatch is to write the data on the tape in the same order as it is expected to be delivered to the disks. For example, in a system that consists of disk clusters, if $\frac{B_{Tertiary}}{B_{Display}}$ then the subobjects could be stored on the tertiary as follows: X X X X X X X X X etc. (see Example for a completion of this illustration). This would allow the system to read X sequentially from tertiary while ensuring a round-robin assignment of its subobjects to the disk clusters. We assume this approach in this paper.

Term              Definition
$D$               Number of disk drives in the system
$B_{Display}$     Bandwidth required to display an object
$B_{Tertiary}$    Bandwidth of the tertiary storage device
$R$               Number of disk clusters in the system
$size(X)$         Size of object X
$T_{Reposition}$  Time required for the tertiary to reposition its read head
$n$               Number of subobjects that constitute an object
$PCR$             Production Consumption Ratio, $B_{Tertiary} / B_{Display}$

Table: List of terms used repeatedly in this paper and their respective definitions

When the system consists of a single disk cluster ($R = 1$) and $U_{Cluster} = 1$, the system can either display or materialize a single object (either STREAM 3 or 2). In this case, the pipelining mechanism cannot be employed. In this paper, we assume that the product of $U_{Cluster}$ and $R$ is greater than one ($U_{Cluster} \times R > 1$).

Pipelining
In this section, we assume that all the objects belong to a single media type and have the same bandwidth requirement. We describe the pipelining mechanism for two possible cases: the bandwidth of the tertiary is either (1) lower or (2) higher than the bandwidth required to display an object. The ratio between the production rate of tertiary and the consumption rate at a display station is termed the Production Consumption Ratio ($PCR = \frac{B_{Tertiary}}{B_{Display}}$). When $PCR < 1$ ($PCR > 1$), the production rate of tertiary is lower (higher) than the consumption rate at both a display station (STREAM 4 of Figure) and the bandwidth simulated by using a single time interval per time cycle (STREAM 3 of Figure).

Footnote: The discussion for the case when the bandwidth of tertiary is equivalent to the display is a special case of item.

Figure: Non-Pipelining Approach (Single Disk Cluster)

PCR < 1
In this case, the time required to materialize an object is greater than its display time. Neither PDF nor IDF is appropriate because the bandwidth of tertiary cannot support a continuous display of the referenced object (assuming that the size of the first slice exceeds the size of memory). We start by describing SDF for a system that consists of a single disk cluster with $U_{Cluster} = 2$. Subsequently, we extend the discussion to a system that consists of $R$ disk clusters.

Let an object $X$ consist of $n$ subobjects. The time required to materialize $X$ is $\lceil n/PCR \rceil$ time cycles, while its display requires $n$ time cycles. If $X$ is tertiary resident, without pipelining the latency time incurred to display $X$ is $\lceil n/PCR \rceil + 1$ time cycles (see Figure). Plus one because an additional time cycle is needed to both flush the last subobject to the disk cluster and allow the first subobject to be staged in the memory buffer for display (e.g., time cycle in Figure). To reduce this latency time, a portion of the time required to materialize $X$ can be overlapped with its display time. This is achieved as follows. An object $X$ is split into $s$ logical slices ($S_{X,1}, S_{X,2}, \ldots, S_{X,s}$) such that the display time of $S_{X,1}$, $T_{Display}(S_{X,1})$, overlaps the time required to materialize $S_{X,2}$, $T_{Materialize}(S_{X,2})$; $T_{Display}(S_{X,2})$ overlaps $T_{Materialize}(S_{X,3})$, etc. Thus:

$$T_{Display}(S_{X,i}) \geq T_{Materialize}(S_{X,i+1}) \quad \text{for } 1 \leq i < s$$
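
To make the non-pipelined latency and the slice condition concrete, here is a small Python sketch. It is a simplification under stated assumptions, not the report's model: a slice of $k$ subobjects is taken to need $k$ time cycles to display and $k/PCR$ time cycles to materialize (ignoring the extra flush cycle and rounding), PCR is passed as an exact `Fraction`, and all function names are hypothetical.

```python
from fractions import Fraction
import math


def materialization_cycles(n: int, pcr: Fraction) -> int:
    """Time cycles needed to read n subobjects from tertiary: ceil(n / PCR)."""
    return math.ceil(Fraction(n) / pcr)


def latency_without_pipelining(n: int, pcr: Fraction) -> int:
    """ceil(n / PCR) + 1: one extra cycle to flush the last subobject and
    stage the first subobject in memory for display."""
    return materialization_cycles(n, pcr) + 1


def slices_satisfy_overlap(slice_sizes: list, pcr: Fraction) -> bool:
    """Check T_Display(S_i) >= T_Materialize(S_{i+1}) for consecutive slices.

    slice_sizes holds the number of subobjects in S_1 .. S_s (assumed model:
    display takes k cycles, materialization takes k / PCR cycles).
    """
    return all(
        Fraction(current) >= Fraction(following) / pcr
        for current, following in zip(slice_sizes, slice_sizes[1:])
    )
```

For example, with `n = 4` and `PCR = Fraction(2, 5)`, materialization takes 10 time cycles and the non-pipelined latency is 11 time cycles.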

Figure: Pipelining Mechanism (Single Disk Cluster)

Upon the retrieval of a tertiary resident object $X$, the pipelining mechanism with SDF is as follows:
STEP 1: Materialize the subobjects that constitute $S_{X,1}$ on the disk drives.
STEP 2: For $i = 2$ to $s$ do
  a. Initiate the materialization of $S_{X,i}$ from the tertiary device onto the disks.
  b. Initiate the display of $S_{X,i-1}$.
STEP 3: Display the last subobject.
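
The three steps can be written out as an ordered plan, which is sometimes easier to follow than the loop notation. The Python sketch below only enumerates the actions in the order the steps initiate them; it does not model timing or resources, and the function name and tuple layout are hypothetical.

```python
def sdf_pipeline_plan(s: int) -> list:
    """List the actions of the SDF pipelining steps for an object with s slices.

    Each entry is (step label, action, slice index). Within STEP 2 the two
    actions of an iteration are initiated together; they are listed one after
    the other here only because a list is sequential.
    """
    plan = [("STEP 1", "materialize", 1)]
    for i in range(2, s + 1):
        plan.append(("STEP 2a", "materialize", i))
        plan.append(("STEP 2b", "display", i - 1))
    # STEP 3 displays the last subobject, which belongs to the final slice.
    plan.append(("STEP 3", "display last subobject", s))
    return plan
```

Calling `sdf_pipeline_plan(3)` yields the materialization of slice 1, then the paired materialize and display actions for slices 2 and 3, followed by the final display.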

The duration of STEP 1 determines the latency time of the system. During STEP 2, while the subsequent slices are materialized from the tertiary device, the disk resident slices are being displayed. STEP 3 displays the last subobject materialized on the disk clusters.

STEP 1 requires a single memory frame and a single time interval per time cycle, while STEP 2 requires two memory frames and two time intervals per cycle (additional resources should not be allocated as they cannot be utilized), and STEP 3 requires one memory frame and time slot to display the last subobject. The tertiary storage device is fully utilized during both STEP 1 and 2, and it is not required during STEP 3. In STEP 1, during the first time cycle, the system reads PCR of a subobject into a memory frame. In each of its subsequent time cycles, it flushes the partially full frame onto a disk cluster and continues to read PCR of a subobject from the tertiary storage device. In STEP 2, during each time cycle, the system uses one memory frame and time interval to repeat the procedure outlined for STEP 1 (to accomplish STEP 2a), and a second memory frame and time interval to support the display of subobjects that constitute a previously materialized slice ($S_{X,i-1}$). Finally, in STEP 3, the last subobject is displayed using a single memory frame. Note that both STEP 1 and 2a waste $1 - PCR$ of a time interval because they write partially full buffer frames.

Footnote: During this step, at least a portion of (not necessarily all of) $S_{X,i}$ is materialized on the disk cluster.

Footnote: The last subobject has already been staged in the memory frame.

Example: Assume a system with a single disk cluster ($R = 1$) as in Figure. Each time cycle consists of two time intervals ($U_{Cluster} = 2$). Assume that object X consists of subobjects. If PCR then the time required to materialize the object is time cycles ($\lceil n/PCR \rceil$). Figure shows the materialization of the object from the tertiary device followed by its display without the use of the pipelining mechanism, yielding a latency time of time cycles. Figure shows how the pipelining mechanism overlaps the display of X with its materialization. In this case, the incurred latency time is time cycles, while the system ensures a continuous display of X.

The number of time cycles required for each step is computed as follows. From Figure, it is obvious that STEP 3 requires no time slots. Furthermore, the duration of STEP 2 corresponds to the display time of $n - 1$ subobjects, which requires $n - 1$ time cycles. To compute the number of time cycles for STEP 1, $TC_{STEP1}(X)$, we subtract the total time required for STEP 2 ($n - 1$ time cycles) from the time required to flush X to the disk clusters in its entirety ($\lceil n/PCR \rceil + 1$ time cycles):

$$TC_{STEP1}(X) = \left\lceil \frac{n}{PCR} \right\rceil - n + 2$$

During STEP 2, the portion of the object materialized by the pipelining mechanism is $\lfloor PCR \times (n - 1) \rfloor$. The remainder of the object must constitute $S_{X,1}$:

$$S_{X,1} = n - \lfloor PCR \times (n - 1) \rfloor$$

The granularity of $S_{X,1}$ is in terms of subobjects. The size of $S_{X,1}$ is important because it determines the latency incurred when X is referenced. The sizes of the remaining slices are different and can be computed as a function of $S_{X,1}$. This is not presented because their size has no impact on the latency time of the system.
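
The subtraction described above can be checked numerically with a short Python sketch. The constants are an assumption on our part: it reads the text as saying that flushing the whole object takes $\lceil n/PCR \rceil + 1$ time cycles and that STEP 2 lasts $n - 1$ time cycles, so the result should be treated as illustrative. The function names are hypothetical and PCR is passed as an exact `Fraction`.

```python
from fractions import Fraction
import math


def step1_cycles(n: int, pcr: Fraction) -> int:
    """Time cycles spent in STEP 1 (the latency), under the assumed reading:
    (ceil(n / PCR) + 1) cycles to flush the object, minus (n - 1) cycles of STEP 2.
    """
    flush_entire_object = math.ceil(Fraction(n) / pcr) + 1
    step2 = n - 1
    return flush_entire_object - step2


def cycles_saved_by_pipelining(n: int, pcr: Fraction) -> int:
    """Difference between the non-pipelined latency and the STEP 1 duration."""
    without_pipelining = math.ceil(Fraction(n) / pcr) + 1
    return without_pipelining - step1_cycles(n, pcr)
```

Under these assumptions the saving is always the $n - 1$ time cycles of display that overlap materialization, regardless of PCR.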

Now we extend the discussion of the pipelining mechanism to a system that consists of $R$ disk clusters. Recall that the subobjects that constitute X are assigned to the disk clusters in a round-robin manner. Furthermore, the physical layout of X on the tertiary device accommodates the round-robin assignment of the subobjects and the time slots to the $R$ disk clusters. To preserve the round-robin assignment of the subobjects, the system may require more than $\lceil n/PCR \rceil$ time cycles to render object X disk resident. This is because the last subobject of X (i.e., $X_n$) might be memory resident; however, the cluster that should contain $X_n$ might be busy servicing other requests. In the worst case, this cluster might be busy for $R$ additional time cycles before it is assigned to the materialization procedure, allowing the system to render $X_n$ disk resident. If the memory containing $X_n$ is accessible to the display stations, then the system can display $X_n$ from memory in order to ensure a continuous display of X (using the PDF paradigm for the last subobject). Otherwise, the number of time cycles required for STEP 1 should be extended with the number of time cycles required to make $X_n$ disk resident in order to ensure a continuous display. This increases the size of the first slice, resulting in a higher latency time.

Figure: Pipelining Mechanism (Disk Clusters)

To compute the exact number of time cycles required before the entire object becomes disk resident with $R$ disk clusters, the system can employ the following equation:

$$TC_{Materialization}(X) = \left( \left\lfloor \frac{l}{R} \right\rfloor + \left\lceil \frac{\left\lfloor \frac{l \bmod R}{\max(1,\; n \bmod R)} \right\rfloor}{R} \right\rceil \right) \times R + (n \bmod R), \quad \text{where } l = \left\lceil \frac{n}{PCR} \right\rceil$$

This equation compensates for the delay associated with the preservation of a round-robin assignment of the subobjects to the disk clusters. The last subobject may not become disk resident immediately after $l = \lceil n/PCR \rceil$ time cycles. The number of complete round-robin cycles required for the object materialization onto the disk clusters is $\left\lfloor \frac{l}{R} \right\rfloor + \left\lceil \frac{\left\lfloor \frac{l \bmod R}{\max(1,\; n \bmod R)} \right\rfloor}{R} \right\rceil$. The last subobject ($X_n$) becomes disk resident after (number of complete round-robin cycles $\times R$) + (location of $X_n$ in the round-robin assignment).

Substituting the equation for $TC_{Materialization}(X)$ in place of $\lceil n/PCR \rceil$ in the equation for $TC_{STEP1}(X)$, the number of cycles required for STEP 1 is:

$$TC_{STEP1}(X) = TC_{Materialization}(X) - n + 2$$

The system compensates for the newly introduced delays by expanding the duration of STEP 1 (increasing the size of the first slice) to ensure a continuous display. Hence, the number of time cycles in STEP 2 and STEP 3 remains unchanged. Consequently, after completing the retrieval of the object, the tertiary is free to service other requests. The last portion of X continues to be memory resident until it is flushed to a disk cluster. This may prevent the tertiary from servicing another request if the available memory is exhausted.

Footnote: A round-robin cycle consists of $R$ time cycles.

Footnote: This is captured by $n \bmod R$.

Footnote: STEP 2 may either partially employ or not employ the time slots allocated to it (see Example for illustration).

Example: Assume a system with disk clusters as in Figure, and the same values for $n$, PCR, and $U_{Cluster}$ as in Example. If object X consists of subobjects, its subobjects would be stored on the tertiary as follows: X X X X X X X X X X X X X X X X X. The semicolon separates the portions that become memory resident during each time cycle. Figure shows how each portion is flushed to the disk cluster to attain the round-robin assignment of X as shown in Table. The first portion of X (X) is read into the memory during time cycle. During time cycle, this portion is flushed to cluster and X is read from tertiary. This process is repeated for the remaining portions of X. The last portion (X X) becomes memory resident at the end of the tenth time cycle. The portion corresponding to X is flushed to cluster during the eleventh time cycle (wasting of a time interval). The portion corresponding to X must be stored on cluster (X must reside on cluster to ensure a round-robin assignment, see Table). If the tertiary abides by the round-robin usage of clusters, then X is flushed during the twelfth time cycle (see Figure). The preceding equations increase the size of the first slice accordingly to ensure that this discrepancy does not disrupt the continuous retrieval of X, resulting in a higher latency as the display of X starts during time cycle instead of. If tertiary is allowed to flush X to cluster during time cycle, then the display of X could start during time cycle as before. However, note that this violation may interfere with the display of another object that requires a different subobject that resides on cluster during time cycle.

If the memory used as the staging area between tertiary and disk clusters is available to the display stations, the system may employ PDF for the last subobject. In this case, the display of X starts during time cycle. The last portion (i.e., X) is displayed and flushed simultaneously during time cycle.

Using the above equations, a scheduler can determine the number of time slots and memory frames required for the pipelining mechanism at each time cycle, and the number of subobjects required to be disk resident ($S_{X,1}$) before the display of object X can start.

Footnote: The display of X is disrupted when the system attempts to display a subobject that is not disk resident.

PCR > 1
In this case, the bandwidth of tertiary exceeds the bandwidth required to display an object. Therefore, the slice overlap condition (the display time of each slice covers the materialization time of the next slice) is satisfied when each slice of an object consists of a single subobject. Moreover, the layout of each object on the tertiary storage device is sequential and device independent because a complete subobject can be materialized during each time cycle.

Two alternative approaches can be employed to compensate for the fast production rate: either (1) multiplex the bandwidth of tertiary among several requests referencing different objects, or (2) increase the consumption rate of an object by reserving more time intervals per time cycle to render that object disk resident. The first approach wastes the tertiary bandwidth because the device is required to reposition its read head multiple times. The second approach utilizes more resources in order to prevent the tertiary device from repositioning its read head. The resources required per time cycle on behalf of an approach can vary. In the following two sections, we consider each approach in turn. We develop analytical models that determine the combination of resources required for each approach.

Multiplexing
One approach to compensate for the high bandwidth of tertiary is to multiplex its bandwidth among several requests, providing each request referencing an object X with a bandwidth equal to $B_{Display}(X)$. We start this section by describing this approach for the PDF paradigm. Subsequently, we demonstrate how the utilization of the tertiary storage device can be maximized using a combination of both PDF and IDF. We conclude by describing this approach for SDF.
Assume that the tertiary storage device is multiplexed among j distinct requests and a new request arrives, increasing the total number of requests to k (i.e., k = j + 1). The following shows the PDF paradigm for the newly arrived request:
STEP 1: Materialize the first portion of the object referenced by the newly arrived request into memory buffers (a portion consists of one or more subobjects).
STEP 2: (a) Each memory buffer is displayed and flushed to a disk cluster at the same time. (b) The next portions of each of the k referenced objects are transferred to the memory buffers.
STEP 3: If the materialization of one of the k objects (say X) completes, then the tertiary either sits idle and waits for the arrival of a new request, or services another request based on the availability of its resources and the bandwidth requirement of the object referenced by the new request.
Parameter                                  Value
$B_{Tertiary}$                             mbps
$B_{Display}$                              mbps
$size(o_x)$                                megabyte
$size(o_y)$                                megabyte
$size(subobject)$                          megabyte
k
$\sum_{i=1}^{k} T_{Reposition}(o_i)$       second
Table: System parameters for the example
The number of objects that can be multiplexed simultaneously (k) depends on $B_{Tertiary}$ and the time required to reposition the read head of tertiary among these requests. Assume that a fixed size memory is allocated to each of the k requests, where the size of memory is a multiple of the subobject size, say $z \times size(subobject)$ where z is an integer. The time required to display the memory buffer for each object X, i.e., $\frac{z \times size(subobject)}{B_{Display}}$, should be greater than or equal to the total time required to: (1) multiplex the tertiary among the k - 1 other requests, i.e., $\frac{(k-1) \times z \times size(subobject)}{B_{Tertiary}}$, (2) materialize the next portion of X, i.e., $\frac{z \times size(subobject)}{B_{Tertiary}}$, and (3) reposition the head of the tertiary among the k requests, i.e., $\sum_{i=1}^{k} T_{Reposition}(o_i)$. Thus, multiplexing must satisfy the following constraint on behalf of each of the k requests:

$$\frac{z \times size(subobject)}{B_{Display}} \geq \frac{k \times z \times size(subobject)}{B_{Tertiary}} + \sum_{i=1}^{k} T_{Reposition}(o_i)$$

If this constraint is violated, then PDF cannot guarantee a continuous display of an object. In this case, the system may employ the SDF paradigm as described in section, because each of the k requests observes a PCR less than 1. For the rest of this section, we describe PDF assuming that the constraint posited in Equation is satisfied.
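The multiplexing constraint can be checked mechanically. The following is a minimal sketch, not the authors' scheduler: the function names are hypothetical, sizes are assumed to be in megabits and bandwidths in megabits per second, and exact rational arithmetic is used only to keep the comparison free of rounding error.

```python
from fractions import Fraction


def max_total_reposition(k, z, subobject_size, b_display, b_tertiary):
    """Largest total repositioning time (seconds) the constraint tolerates.

    Assumed units: sizes in megabits, bandwidths in megabits per second.
    """
    # Time to display the z subobjects buffered for one request.
    display_time = Fraction(z * subobject_size) / b_display
    # Time the tertiary spends transferring z subobjects for each of k requests.
    transfer_time = Fraction(k * z * subobject_size) / b_tertiary
    return display_time - transfer_time


def pdf_multiplex_ok(z, subobject_size, b_display, b_tertiary, reposition_times):
    """True if the k = len(reposition_times) requests satisfy the constraint."""
    k = len(reposition_times)
    budget = max_total_reposition(k, z, subobject_size, b_display, b_tertiary)
    return sum(reposition_times) <= budget
```

With the illustrative (not source-provided) values of an 8 megabit subobject, a 4 mbps display rate, and a 40 mbps tertiary, `max_total_reposition(2, 1, 8, 4, 40)` is 8/5 of a second, and the tolerated repositioning time grows linearly with z.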
Equation assumes that k time slots are reserved per time cycle. Moreover, the k reserved time slots should be positioned such that the system flushes the subobjects to the clusters in the same manner (round-robin) as it would have been read if the object was disk resident (each of the z subobjects read on behalf of a request is flushed and displayed simultaneously, one per time cycle). The upper bound on the number of required time slots is $\lfloor PCR \rfloor$; no additional time slots should be allocated, as they cannot be used.
An interesting property of Equation is the effect of memory buffers. By increasing z, more data is transferred every time the read head of the tertiary is repositioned. This enables the system to multiplex requests that require the read head of tertiary to travel a longer distance and incur a higher repositioning time.
z        Maximum $\sum_{i=1}^{k} T_{Reposition}(o_i)$ tolerated by Equation (seconds)
Table: The effect of memory on repositioning time
Example: Consider the system parameters shown in Table. Assuming one frame of memory is dedicated for each object to be multiplexed (z = 1) and two time slots are available (k = 2), the constraint posited in Equation is satisfied. Table shows the effect of increasing the amount of memory (z) on the feasible repositioning times. By increasing z, Equation is satisfied for higher repositioning times, depending on the distance between the k multiplexed objects.
A static scheduler can determine if a new request can be added to the k currently multiplexed requests based on the available resources (time slots and memory) and the idle time of the tertiary device. To multiplex k objects using PDF, k time slots and $k \times z$ memory frames are required. The idle time of the tertiary when multiplexed among k requests, derived using Equation, is:

$$IdleTime = \frac{z \times size(subobject)}{B_{Display}} - \sum_{i=1}^{k} T_{Reposition}(o_i) - \frac{z \times k \times size(subobject)}{B_{Tertiary}}$$

If the constraint posited in Equation is satisfied on behalf of each multiplexed request, then IdleTime is greater than or equal to zero. The minimum value for k is one. In this case, the tertiary is used for $\frac{z \times size(subobject)}{B_{Tertiary}}$ and sits idle for the duration of time computed by Equation. This process is repeated $\frac{n}{z}$ times. This shows that in the worst case, the system can employ the PDF paradigm with one time slot and one memory frame (the required resources to display an object). Upon the arrival of a new request, the scheduler may encounter alternative scenarios: (1) the additional reposition time required to service the new request renders the idle time of the tertiary a negative number (this can also happen when the new request results in a k that is greater than $\lfloor PCR \rfloor$), (2) there is insufficient memory to service the new request, (3) there are insufficient time slots to materialize the referenced object, and (4) a combination of the first three scenarios. In the first two cases, and any combination that includes these two cases, the system has no choice and must schedule the new request to be served at some point in the future, when one of the k active requests completes.
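The idle-time expression suggests a simple admission test for the first scenario. The sketch below is an illustration under stated assumptions: `idle_time` and `can_admit` are hypothetical names, the new request is assumed to add exactly one extra repositioning term, and memory and time-slot availability (the second and third scenarios) are not modeled.

```python
from fractions import Fraction
from math import floor


def idle_time(z, subobject_size, b_display, b_tertiary, reposition_times):
    """Idle time of the tertiary when multiplexed among k requests.

    k is taken to be len(reposition_times); units are assumed to be
    megabits, megabits per second, and seconds.
    """
    k = len(reposition_times)
    display_time = Fraction(z * subobject_size) / b_display
    transfer_time = Fraction(z * k * subobject_size) / b_tertiary
    return display_time - sum(reposition_times) - transfer_time


def can_admit(z, subobject_size, b_display, b_tertiary,
              reposition_times, new_reposition):
    """True if adding one request keeps the idle time non-negative."""
    candidate = list(reposition_times) + [new_reposition]
    pcr = Fraction(b_tertiary) / b_display
    # k may not exceed floor(PCR): extra time slots could not be used.
    if len(candidate) > floor(pcr):
        return False
    return idle_time(z, subobject_size, b_display, b_tertiary, candidate) >= 0
```

A request that fails this test would be queued until one of the active requests completes, as described above.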
For the third scenario, the scheduler may employ the IDF paradigm in combination with PDF to harness the full bandwidth of the tertiary storage device without materializing the object on the disk drives (the IDF paradigm is especially useful for requests that reference objects whose frequency of access is so low that the system elects not to materialize them on the disk drives). Assuming that the system services m requests using the IDF paradigm, the multiplexing approach must guarantee the following constraint on behalf of each of the k + m requests:

$$\frac{z \times size(subobject)}{B_{Display}} \geq \frac{(k+m) \times z \times size(subobject)}{B_{Tertiary}} + \sum_{i=1}^{k+m} T_{Reposition}(o_i)$$
where $k + m \leq \lfloor PCR \rfloor$. The system is pure PDF (IDF) when m = 0 (k = 0).
With multiplexing, the SDF paradigm is almost identical to the PDF paradigm. The major differences are as follows. First, SDF requires more resources as compared to PDF. To multiplex k objects using SDF, the system must allocate:
- $2 \times k$ time slots: k time slots to transfer the different subobjects from tertiary to disk cluster, and another k time slots to display the disk resident subobjects.
- $k \times z + k$ frames of memory: $k \times z$ frames serve as an intermediate staging area between tertiary and multi-disk to materialize k objects, while k memory frames are used to initiate the display of k disk resident subobjects.
PDF requires only k time slots and $k \times z$ memory frames because it eliminates STREAM of SDF. Note that SDF cannot be applied unless two time intervals are allocated on behalf of a request: one for STREAM and a second for STREAM. Second, the latency time is higher with SDF as compared to PDF. With PDF, the latency time is equivalent to one time cycle for STREAM. With SDF, the latency time incurred by a request is dependent on the assignment of its two time slots. If the slots are assigned horizontally (i.e., the slots are from a single cluster $C_i$), its latency time is two time cycles: one for STREAM and a second to perform STREAM and concurrently. However, if the two slots are assigned vertically (one from cluster $C_i$ and a second from cluster $C_j$, say i < j), then the latency time is the sum of: (1) one time cycle for STREAM, (2) one time cycle for STREAM (subobject flushed to $C_i$), and (3) a delay of j - i time cycles until STREAM can be initiated (j - i is the number of time cycles required for the interval corresponding to $C_i$ to reach the position of the time interval corresponding to $C_j$ that contains the first subobject of X). Third, PDF enforces k to be lower than $\lfloor PCR \rfloor$. This constraint can be violated with the SDF paradigm, resulting in a higher latency time on behalf of each request (i.e., when $k \leq \lfloor PCR \rfloor$ the first slice consists of a single subobject; when $k > \lfloor PCR \rfloor$, the number of subobjects for the first slice is determined by the discussion of Section). Note that when the system services m requests using IDF, it allocates $m \times z$ memory frames to support their display.
If the first subobject was flushed to $C_i$ instead of $C_j$ in item, then item would incur a delay of R - (j - i). The system can choose either $C_i$ or $C_j$ to decrease the delay: Min(R - (j - i), j - i).
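The resource comparison between SDF and PDF under multiplexing reduces to two small formulas. A minimal sketch follows; the function names and the dictionary layout are hypothetical conveniences, not part of the original design.

```python
def pdf_resources(k, z):
    """Resources to multiplex k objects with PDF (z frames per request)."""
    return {"time_slots": k, "memory_frames": k * z}


def sdf_resources(k, z):
    """Resources to multiplex k objects with SDF.

    SDF needs k slots to move subobjects from tertiary to disk plus k slots
    to display them, and k extra frames to initiate the display.
    """
    return {"time_slots": 2 * k, "memory_frames": k * z + k}


if __name__ == "__main__":
    # Illustrative values only: three requests, two frames per request.
    print(pdf_resources(3, 2))
    print(sdf_resources(3, 2))
```

For any k and z, SDF costs k more time slots and k more memory frames than PDF, which is the overhead of staging through the disk before display.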
Non-Multiplexing
The second approach accommodates the high bandwidth of the tertiary storage device by increasing the consumption rate of the system. This is achieved by dedicating additional time slots per time cycle to materialize the referenced object (increasing the rate of data flow in STREAM), with one time slot dedicated to display the object (STREAM). This technique is appropriate for neither PDF nor IDF because both techniques require the production rate of the tertiary (STREAM) to be approximately the same as $B_{Display}$ (STREAM). However, it is appropriate for SDF because it uses the disk space as a staging area to materialize the object (STREAM) at a rate of $B_{Tertiary}$ while retrieving (STREAM) and displaying (STREAM) it at a rate of $B_{Display}$, eliminating the constraint enforced by both PDF and IDF. With SDF, the display of an object is initiated once the first subobject becomes disk resident.
Assuming k is the number of time slots allocated per time cycle to materialize the object, the ideal case is when k is equal to $\lceil PCR \rceil$, because it renders the consumption rate of the disk clusters either higher than or equivalent to $B_{Tertiary}$. If k < PCR, then some extra memory is required to temporarily buffer the portion that cannot be flushed to the clusters. In this case, the memory resident portion continues to accumulate as a function of time until the object becomes disk resident in its entirety.
This section describes analytical models to determine the number of required time slots and memory frames used to materialize the subobjects on the disk clusters. We start by assuming that k is a constant for the total number of time cycles required to materialize the referenced object. Subsequently, we relax this assumption and consider the case where the value of k fluctuates from one cycle to another.
Assuming k time slots are available per time cycle, consider a vertical assignment of time slots across the clusters: the time slots are assigned among the clusters in a round-robin manner, starting with an available cluster. Thus, $h_c$ time slots are dedicated to cluster c, where $\lfloor \frac{k}{R} \rfloor \leq h_c \leq \lceil \frac{k}{R} \rceil$ (i.e., the k time slots are divided equally among the R clusters). When $h_c$ the clusters are adjacent due to round-robin assignment. Obviously, k should be $\sum_{c=1}^{R} h_c$, and the physical upper bound enforced on k is $R \times U_{Cluster}$. Assuming the vertical assignment, we now calculate the maximum amount of memory required to prevent overflow. During each time cycle, Min(k, PCR) subobjects are transferred from memory to disk clusters. Assuming the object consists of n subobjects, the number of time cycles required to render the object disk resident in its entirety is:

$$q = \left\lceil \frac{n}{Min(k, PCR)} \right\rceil$$

Multiplexing, as described in section, decreases the rate of data flow in STREAM in order to employ both PDF and IDF.
Moreover, the number of time cycles required to read the object from tertiary is $\frac{n}{PCR}$. If k < PCR, then the production rate will be higher than the consumption rate, requiring more cycles to flush the object onto the disk drives as compared to the number of cycles required to read it from tertiary. The difference between these two components, $\frac{n}{Min(k, PCR)} - \frac{n}{PCR}$, determines the portion of an object that becomes memory resident once tertiary completes servicing this request. To compute the exact number of subobjects that are memory resident, it is sufficient to multiply this difference by Min(k, PCR):

$$MaxMem = n - \frac{n \times Min(k, PCR)}{PCR}$$

MaxMem represents the maximum amount of memory required in one or more time cycles.
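The two closed forms for q and MaxMem are easy to evaluate directly. The sketch below uses hypothetical function names and illustrative parameters (n = 12 subobjects and PCR = 3/2 are assumptions chosen for the demonstration); exact fractions avoid rounding in the ceiling.

```python
from fractions import Fraction
from math import ceil


def cycles_to_disk(n, k, pcr):
    """q: time cycles until all n subobjects are disk resident."""
    return ceil(Fraction(n) / min(k, pcr))


def max_mem(n, k, pcr):
    """MaxMem: peak number of memory-resident subobjects."""
    rate = min(Fraction(k), Fraction(pcr))
    return n - n * rate / pcr


if __name__ == "__main__":
    pcr = Fraction(3, 2)          # assumed production/consumption ratio
    for k in (1, 2):
        print(k, cycles_to_disk(12, k, pcr), max_mem(12, k, pcr))
```

With one time slot the object needs 12 cycles and 4 subobjects of buffer; with two time slots (k > PCR) it needs 8 cycles and no extra memory, at the cost of partially wasted disk bandwidth.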
The amount of memory required for each time cycle is computed as follows: (1) compute the fraction of an object that is disk resident per time cycle, (2) compute the fraction that remains on the tertiary storage device per time cycle, and (3) subtract the sum of items (1) and (2) from the size of the object. The number of subobjects that are tertiary resident at cycle i, for each time cycle i where $1 \leq i \leq q$, is:

$$Ter_i = Max(n - i \times PCR, 0)$$

where PCR is the number of subobjects retrieved from the tertiary per time cycle, i is the number of time cycles elapsed since the initiation of the materialization, and n is the number of subobjects that constitute the referenced object. Therefore, $i \times PCR$ is the number of subobjects that are retrieved from the tertiary thus far. By deducting this from n, we compute the number of subobjects that are still tertiary resident. The maximum function is used to prevent Equation from producing a negative value once the tertiary has completed the retrieval of the object. When the materialization first starts (time cycle 0), $Ter_0 = n$.
The number of subobjects that reside on the disk clusters at time cycle i is:

$$Disk_i = Min(Disk_{i-1} + Min(k, PCR), n)$$

where $Disk_{i-1}$ represents the accumulated subobjects that reside on the disk clusters at time cycle i - 1, and Min(k, PCR) is the number of subobjects added to the disk clusters at time cycle i. The minimum function is used to prevent Equation from producing a value greater than n once the object has been transferred to the disk clusters in its entirety. When the materialization first starts, $Disk_0 = 0$.
In computing MaxMem, we assumed that different fragments of different subobjects are transferred from memory to disk clusters simultaneously. Therefore, the available memory should be managed in frames, where the size of a frame corresponds to the size of a fragment; otherwise, the available memory may become fragmented, reducing its utilization. A detailed discussion of memory management is part of a system scheduler and constitutes our future research direction.
Figure: Object layout on the tertiary
Hence, the number of subobjects that are memory resident at time cycle i is:

$$Mem_i = n - (Ter_i + Disk_i)$$
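The three per-cycle quantities can be tabulated with a short loop. This is a sketch under assumptions: `simulate` is a hypothetical helper, k is constant, and the parameters in the usage note (n = 12, k = 1, PCR = 3/2) are illustrative values rather than figures taken from the tables.

```python
from fractions import Fraction


def simulate(n, k, pcr):
    """Return rows (i, Ter_i, Disk_i, Mem_i) until the object is on disk."""
    if k <= 0 or pcr <= 0:
        raise ValueError("k and pcr must be positive")
    rows = []
    disk = Fraction(0)                       # Disk_0 = 0
    i = 0
    while disk < n:
        i += 1
        ter = max(n - i * pcr, Fraction(0))  # still on tertiary
        disk = min(disk + min(k, pcr), n)    # accumulated on disk
        mem = n - (ter + disk)               # staged in memory
        rows.append((i, ter, disk, mem))
    return rows


if __name__ == "__main__":
    for row in simulate(12, 1, Fraction(3, 2)):
        print(*row)
```

For these values the memory-resident portion peaks at 4 subobjects in cycle 8, the cycle in which the tertiary finishes, and drains to zero by cycle 12, which agrees with the closed-form MaxMem.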
To illustrate the application of the above analytical models, and to discuss how this method supports the round-robin assignment of the subobjects to the disk clusters, consider the following examples. Table contains the parameters for all the examples. Fig. shows the layout of the object on the tertiary storage device and, based on the PCR, the portion read per time cycle. Thus, it requires $\frac{n}{PCR}$ time cycles to read X from tertiary. The round-robin assignment of X on the three-cluster system is shown in Table.
Example: Assume k = 1 time slot is dedicated per time cycle to materialize X. Equation determines that q time cycles are required before X becomes completely disk resident. Equation determines that the maximum amount of required memory corresponds to the size of four subobjects (MaxMem = 4). Equations to are used to compute the amount of memory required during each time cycle (Table). The numbers in the first three rows of the table correspond
Table: Desired assignment of subobjects to a three-cluster system (one column per cluster; entries are the subobjects of X)
Table: Example five (rows: time cycle i, $Ter_i$, $Disk_i$, $Mem_i$, subobjects flushed to disk, memory-resident subobjects)
Table: Example six (rows: time cycle i, $Ter_i$, $Disk_i$, $Mem_i$, subobjects flushed to disk)
to the number of subobjects that reside on either the tertiary, disk, or memory. The last two rows of the Table show the subobjects that become either memory or disk resident at each time cycle, respectively. The schedule for allocating the clusters to flush the subobjects onto the disk drives is identical to the one for displaying them (see Figure). Table shows that the maximum memory required is four subobjects during time cycle eight (cycle $\lfloor \frac{n}{PCR} \rfloor$), and it is the same as that computed by Equation.
Example: By dedicating two time slots per cycle to materialize X (k = 2), the time required to materialize X is reduced to q time cycles and no extra memory is required (MaxMem = 0). Note that because k > PCR, k - PCR of the disk bandwidth allocated during a time interval is wasted (see Table and Figure).
In Example, where k > PCR, once the pipelining mechanism employs k clusters during a time cycle (say clusters i, i + 1, ..., (i + k - 1) modulo R), in the next cycle it employs the k clusters starting with either cluster (i + k - 1) modulo R or cluster (i + k) modulo R. If cluster (i + k - 1) modulo R is employed as the starting cluster during the next time cycle, then there is a logical overlap between the two time cycles. To illustrate, in Example there was an overlap between cycles and (cluster) and no overlap between cycles and. In general, when $PCR \times m$ is an integer, the first cycle uses the (i + k) modulo R criterion, followed by m - 1 cycles that use the (i + k - 1) modulo R criterion. This pattern repeats itself for the duration of materialization. To illustrate, if PCR was then m and the employed clusters in the different time cycles are etc. If there is no such m that $PCR \times m$ is an integer, then there is always an overlap between consecutive cycles (i.e., employ the (i + k - 1) modulo R criterion). When k < PCR, there is no overlap between consecutive time cycles (see Example). Note that there is no constraint on the occupation of time slots during each time cycle. For example, X may occupy the second time slot of the first time cycle that corresponds to cluster (Figure).
If n is not divisible by PCR, then the last portion of the object is less than Min(k, PCR). This is not captured in Equation; however, Equation compensates to compute the exact amount of required memory per cycle.
In Figure, the physical overlap between the second and third time cycles (i.e., cluster) is because of modulo R and is not a logical overlap.
Figure: Schedule for example six
We now extend the discussion to consider the case where k is no longer a constant. The scheduler may vary k for different time cycles to achieve the best memory/time-slot combination based on its perspective on the future status of resources. Obviously, Equations and are no longer valid. Furthermore, Equation should be revised to:

$$Disk_i = Min(Disk_{i-1} + Min(k_i, PCR + Mem_{i-1}), n)$$

where $k_i$ is the number of time slots available during time cycle i. To describe $Min(k_i, PCR + Mem_{i-1})$, note that at the i-th time cycle, $Mem_{i-1}$ subobjects are memory resident and ready for consumption. In addition, PCR subobjects are produced per time cycle by the tertiary. Thus, if enough time slots are available per cycle, all of these $PCR + Mem_{i-1}$ subobjects can be consumed. Otherwise, the provided number of time slots in that cycle ($k_i$) determines the number of consumed subobjects.
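The revised recurrence can be exercised with a per-cycle slot schedule. The sketch below is illustrative only: `simulate_variable_k` is a hypothetical helper, and the schedule used in the usage note (one slot per cycle, except two slots in cycle 5 and three in cycle 7, with n = 12 and PCR = 3/2) is an assumption, not the example's actual parameters.

```python
from fractions import Fraction


def simulate_variable_k(n, pcr, default_k, overrides=None, max_cycles=10_000):
    """Rows (i, Ter_i, Disk_i, Mem_i) when k_i varies from cycle to cycle.

    overrides maps a 1-based cycle number to its k_i; other cycles use
    default_k. max_cycles guards against schedules that never finish.
    """
    overrides = overrides or {}
    rows = []
    disk = Fraction(0)   # Disk_0
    mem = Fraction(0)    # Mem_0
    for i in range(1, max_cycles + 1):
        k_i = overrides.get(i, default_k)
        ter = max(n - i * pcr, Fraction(0))
        # Consume at most k_i subobjects, and no more than are available.
        disk = min(disk + min(k_i, pcr + mem), n)
        mem = n - (ter + disk)
        rows.append((i, ter, disk, mem))
        if disk >= n:
            return rows
    raise ValueError("object did not become disk resident within max_cycles")


if __name__ == "__main__":
    for row in simulate_variable_k(12, Fraction(3, 2), 1, {5: 2, 7: 3}):
        print(*row)
```

Under this assumed schedule the object is disk resident after 9 cycles instead of 12, and at most 2 subobjects are ever memory resident, which shows how a few extra time slots trade against buffer space.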
In Equation, since k is fixed in all q time cycles, if $k \geq PCR$ then $Mem_{i-1} = 0$ for all $i \leq q$. This results in $Min(k, PCR + Mem_{i-1}) = PCR$. However, if k < PCR, then independent of the value of $Mem_{i-1}$, $Min(k, PCR + Mem_{i-1}) = k$. This explains why $Mem_{i-1}$ does not appear in Equation.
In general, the number of available time slots per time cycle determines the amount of memory required per time cycle. This technique does not waste the bandwidth of the tertiary storage device; however, it requires more resources as compared to multiplexing. Moreover, if $k_i$ is greater than $PCR + Mem_{i-1}$, then $k_i - (PCR + Mem_{i-1})$ defines the fraction of a time interval wasted during time cycle i.
Figure: Schedule for example seven
Example: Assume the system parameters presented in Table. Moreover, assume k for the fifth time cycle, k for the seventh time cycle, and k for the remaining cycles. Here, the maximum amount of memory and the total period of materialization cannot be calculated directly from Equation and. However, by using Equation instead of Equation, a table similar to Table and can be constructed (see Table). This table shows that the time required for X to become disk resident is reduced to time cycles (q) as compared to the time cycles required in Example. Moreover, only subobjects become memory resident during the fourth and fifth time cycles (MaxMem). The schedule for flushing the subobjects onto the disk clusters is shown in Figure.
The strategy used to determine the position of time slots in different time cycles is the same as described before; the only difference is that if $k_i > PCR + Mem_{i-1}$, then time cycle i has a logical overlap with cycle i + 1.
In Examples, the k available time slots were allocated vertically. However, the k time slots
Scheduler may be forced to do multiplexing due to lack of resources.
Table: Example for horizontal assignment (rows: time cycle i, $Ter_i$, subobjects flushed to disk, memory-resident subobjects)
        PCR < 1     PCR > 1
                    Multiplexing    Non-Multiplexing
SDF     YES         MAYBE           YES
PDF     MAYBE       YES             MAYBE
IDF     MAYBE       YES             MAYBE
Table: Suitability of proposed techniques
might be available horizontally: the k allocated time slots are provided in one time cycle and correspond to a single disk cluster ($k \leq U_{Cluster}$). To demonstrate, recall Example and assume that the available time slots are assigned horizontally. Table shows the subobjects transferred to disk per time cycle. During the first time cycle, although two time slots are available, only one of them can be utilized. We do not assign X to cluster in order to preserve the round-robin assignment of subobjects to the disk clusters. This portion should remain memory resident until the time slot corresponding to disk cluster becomes available. With horizontal allocation of time slots, Equations and remain valid. However, Equations and need to be extended. The reason is that neither k nor $k_i$ available time slots imply that either k or $k_i$ subobjects are transferred to the disk clusters. Moreover, contrary to the vertical assignment, here the subobjects are not transferred to disk sequentially. For example, in Table, X becomes disk resident before X. Therefore, a closed form formula to compute the portion of the object that resides on disks per time cycle (similar to Equation) does not exist. This is because it is run-time dependent on the value of $k_i$ and the sequence of the subobjects being transferred. In Appendix B, an algorithm is provided to determine the portion of the subobjects transferred to disk for a given time cycle i and available time slots $k_i$. Obviously, $Disk_i$ is the accumulated amount of these portions and the portions from the previous runs of the algorithm for to i. This computation can replace Equation when time slots are reserved horizontally. Figure shows the inputs and outputs of the algorithm.
Figure: Inputs and outputs of the algorithm
Conclusion and Future Directions
This paper presented the design of a pipelining mechanism that minimizes the latency time of systems based on a hierarchical storage structure. It is novel and different from the pipelining mechanism proposed for parallel relational systems because it ensures a continuous display of audio and video objects. It is general purpose because it can support: (1) different organizations of memory, disk, and tertiary storage, (2) tertiary devices with various bandwidths, (3) objects that require different bandwidths, and (4) a system that allocates various resources (memory and disk bandwidth) dynamically based on its load and the availability of resources. It splits an object into s logical slices ($S_1, S_2, \ldots, S_s$). In the best case, it can reduce the latency time of the system, as compared to the non-pipelining case, by the following ratio: $\frac{size(X) - size(S_{X,1})}{size(X)}$. The precise reduction in the latency time is dependent on the architecture, the resources allocated to the mechanism, the bandwidth of the tertiary, and the bandwidth required to support a continuous display of the referenced object. We present analytical models that compute this latency time based on these input parameters.
In summary, we have described three pipelining techniques: SDF, PDF, and IDF. While SDF and PDF render an object disk resident, IDF displays an object from the tertiary storage device without storing it on the disk clusters. IDF is appropriate for those objects whose expected future reference is so low that they should not replace other objects that are currently disk resident. However, when PCR < 1, IDF and PDF would require the first slice of an object to become memory resident prior to the display of an object. The value of PCR defines the size of the first slice of an object, which in turn dictates the amount of required memory. Depending on the availability of resources, both PDF and IDF may not be appropriate when PCR < 1. SDF might be a more suitable pipelining paradigm because it minimizes the amount of required memory by using the disk as the staging area.
With PCR > 1, we considered two alternative execution paradigms: multiplexing and non-multiplexing. Multiplexing strives to render PCR = 1, with the rate of data delivery from tertiary matching that of display, by introducing tape seeks. In this case, PDF and IDF are both suitable because they require both less disk bandwidth and less memory as compared to SDF (this bandwidth and memory could be employed to service other requests). Non-multiplexing retrieves an object from the tertiary as fast as possible without introducing seeks. In this case, the object becomes memory resident faster than its consumption rate at a display station. Once again, SDF minimizes the amount of required memory by flushing the subobjects to the disk as soon as possible. PDF and IDF may not be appropriate because they may require a substantial amount of memory, depending on the value of PCR.
We intend to extend this study in two ways. First, we plan to investigate the impact of pipelining on the staging of different objects on the disk. Intuitively, it might be justifiable to stage all slices of a frequently accessed object on the disk clusters. However, it might be more appropriate to stage only a few slices of a moderately accessed object on the disk clusters, and only the first slice of an object with a low frequency of access on the disks. This would maximize the utilization of the disk space and further reduce the latency time of the system. However, it would require a modification of the replacement policy used to swap objects in and out of the disk storage. Second, we intend to investigate the design of a scheduler that uses the pipelining mechanism. It should allocate resources (memory and disk bandwidth) to each request in a manner that maximizes the utilization of resources and minimizes the latency time of the system, while ensuring the real-time constraint to support a continuous display. Moreover, it should analyze the system load when rendering decisions. The alternative dataflow paradigms presented in this paper provide the scheduler with a wide variety of choices to optimize the utilization of resources.
Acknowledgments
We would like to thank David DeWitt for bringing to our attention the future trends in the area of tertiary storage devices and providing us with the literature that supports these trends. In addition, we would like to thank the anonymous referees for their valuable comments.
References
[BGMJ] S. Berson, S. Ghandeharizadeh, R. Muntz, and X. Ju. Staggered Striping in Multimedia Information Systems. In Proceedings of the ACM SIGMOD International Conference on Management of Data.
[Fox] E. A. Fox. Advances in Interactive Digital Multimedia Systems. IEEE Computer, October.
[GCEMJ] S. Ghandeharizadeh, H. Chan, M. Escobar-Molano, and X. Ju. On configuring hierarchical multimedia storage managers. Technical Report USC, University of Southern California.
[GDQ] S. Ghandeharizadeh, D. DeWitt, and W. Qureshi. A performance analysis of alternative multi-attribute declustering strategies. In Proceedings of the ACM SIGMOD International Conference on Management of Data, June.
[GRAQ] S. Ghandeharizadeh, L. Ramos, Z. Asad, and W. Qureshi. Object Placement in Parallel Hypermedia Systems. In Proceedings of the International Conference on Very Large Databases.
[GS] S. Ghandeharizadeh and C. Shahabi. Management of Physical Replicas in Parallel Multimedia Information Systems. In Proceedings of the Foundations of Data Organization and Algorithms (FODO) Conference, October.
[LKB] M. Livny, S. Khoshafian, and H. Boral. Multi-Disk Management Algorithms. In Proceedings of the ACM SIGMETRICS Intl. Conf. on Measurement and Modeling of Computer Systems, May.
[MWS] D. Maier, J. Walpole, and R. Staehli. Storage System Architectures for Continuous Media Data. In Proceedings of the Foundations of Data Organization and Algorithms (FODO) Conference, October.
[RE] D. Ries and R. Epstein. Evaluation of distribution criteria for distributed database systems. UCB/ERL Technical Report M, UC Berkeley, May.
[SAD] M. Stonebraker, R. Agrawal, U. Dayal, E. Neuhold, and A. Reuter. DBMS Research at a Crossroads: The Vienna Update. In Proceedings of the International Conference on Very Large Databases.
[Sch] T. Schwarz. High performance quarter-inch cartridge tape systems. Third NASA GSFC Conference on Mass Storage Systems and Technologies, February.
[SGM] K. Salem and H. Garcia-Molina. Disk striping. In Proceedings of International Conference on Database Engineering, February.
[TPBG] F. A. Tobagi, J. Pang, R. Baird, and M. Gang. Streaming RAID: A Disk Array Management System for Video Files. In First ACM Conference on Multimedia, August.
Appendix A: Mix of Media Types
This appendix describes the extensions of the pipelining mechanism to support a database that consists of a mix of media types. We begin by providing an overview of two alternative approaches that enable our target architecture to support a mix of media types: block-based [GCEMJ] and staggered striping [BGMJ]. Subsequently, we describe the extensions to the pipelining mechanism to accommodate each approach.
To simplify the discussion, assume that the database consists of two media types, A and B ($B_{Display}(A) \neq B_{Display}(B)$). In this case, PCR is no longer a constant because the objects of media type A have a different bandwidth requirement than objects of media type B. We define PCR(i) for objects of media type i as $\frac{B_{Tertiary}}{B_{Display}(i)}$. Therefore, for a fixed $B_{Tertiary}$, PCR might be greater than one for a media type (PCR(A) > 1) and less than one for another media type (PCR(B) < 1).

The block-based [GCEMJ] approach reads a block of an object from a cluster during each time cycle. A block consists of $n_i$ subobjects ($size(block) = n_i \times size(subobject)$); $n_i$ might be a real number. Those objects that belong to a single media type, say A, have the same block size. However, objects of different media types have different $n_i$ values ($n_A \neq n_B$). The ratio between $n_A$ and $n_B$ corresponds to $\frac{B_{Display}(A)}{B_{Display}(B)}$. Hence, the display time of a block is identical for all objects regardless of their media type. Since one time slot is needed to display one subobject, this approach requires $n_A$ time slots to display one block of an object that belongs to media type A. Note that $\sum_{i=1}^{k} n_i$ might be a real number.
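
The relationship between block sizes and display bandwidths can be checked with a small sketch. It assumes, purely for illustration, that $n_i$ is obtained by dividing each display bandwidth by a common reference bandwidth (`B_REFERENCE`); the report only states that the ratio of $n_A$ to $n_B$ follows the ratio of the display bandwidths. All numeric values are hypothetical.

```python
import math

# Hypothetical values; none of them come from the report.
SIZE_SUBOBJECT = 8.0                 # megabits per subobject
B_DISPLAY = {"A": 60.0, "B": 40.0}   # display bandwidth per media type, mbps
B_REFERENCE = 20.0                   # assumed common reference bandwidth, mbps


def subobjects_per_block(b_display, b_reference):
    """n_i for each media type, proportional to its display bandwidth."""
    return {media: bw / b_reference for media, bw in b_display.items()}


n = subobjects_per_block(B_DISPLAY, B_REFERENCE)
block_size = {media: n[media] * SIZE_SUBOBJECT for media in n}
display_time = {media: block_size[media] / B_DISPLAY[media] for media in n}

# Time slots needed to multiplex k = 3 objects of types A, B and B.
multiplexed = ["A", "B", "B"]
total_slots = sum(n[media] for media in multiplexed)

print("n_i:", n)
print("display time of one block (s):", display_time)
print("time slots for the multiplexed objects:", total_slots)
```

Because $n_i$ scales with the display bandwidth, the display time of one block comes out the same for both media types, and the sum of the $n_i$ values gives the number of time slots needed when several objects are multiplexed.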

When $PCR_A < 1$, the discussion presented in the corresponding section of the main text is modified by changing the term subobject to block.

When $PCR_A > 1$, with multiplexing, it is sufficient to replace the references to $k$ time slots and memory frames in the corresponding section with $\sum_{i=1}^{k} n_i$ time slots and memory frames in order to multiplex $k$ objects with different bandwidth requirements. With non-multiplexing, references to $k$ time slots and memory frames should be changed to $n_A \times k$, where $n_A$ represents the number of subobjects that constitute a block of the referenced object X, which belongs to media type A.

Staggered striping [BGMJ] constructs the disk clusters logically instead of physically and removes the constraint that the assignment of two consecutive subobjects of X, say $X_i$ and $X_{i+1}$, be non-overlapping. Instead, it numbers the D disk drives logically and assigns the subobjects such that the disk that contains the first fragment of $X_{i+1}$ is $l$ disks (modulo the total number of disks) apart from the disk drive that contains the first fragment of $X_i$. The distance between $X_i$ and $X_{i+1}$ is termed stride. The staggered striping figure demonstrates the assignment of two objects, X and Y, whose bandwidth requirements relative to $B_{Disk}$ determine their degrees of declustering, $M_X$ and $M_Y$ respectively. Both objects use the same stride. In order to display object X, the system locates the $M_X$ logically adjacent disk drives that contain its first subobject. If these disk drives are idle, they are employed during the first time cycle to retrieve and display the first subobject of X. During the second time cycle, the next $M_X$ disk drives are employed by shifting stride disks to the right.
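
The placement rule of staggered striping can be sketched as follows. The function name and all parameter values are assumptions made for illustration: disks are numbered from 0, an object's first subobject starts on `first_disk`, and each later subobject starts `stride` disks to the right, modulo the total number of disks.

```python
def disks_for_subobject(i, first_disk, degree, stride, total_disks):
    """Logical disks that hold the fragments of subobject i (0-based).

    degree      -- degree of declustering (number of fragments per subobject)
    stride      -- distance between the first fragments of consecutive subobjects
    total_disks -- total number of disk drives in the system
    """
    start = (first_disk + i * stride) % total_disks
    return [(start + j) % total_disks for j in range(degree)]


# Hypothetical configuration: 8 disks, degree of declustering 3, stride 1.
for i in range(4):
    print(f"X_{i} ->", disks_for_subobject(i, first_disk=0, degree=3,
                                            stride=1, total_disks=8))
```

In this assumed configuration consecutive subobjects overlap on two disks, which is exactly the overlap that staggered striping permits and that physical clusters would forbid.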

When $PCR_X < 1$, the discussion in the corresponding section of the main text is applicable with one difference: the display of an object might be delayed even after the retrieval of $S_X$. This is because the logical disk cluster that contains the subobject of X to be displayed might be employed by the materialization of the remaining slices of X or by the display of other objects. In this case, the display must wait until this cluster becomes available. (This discussion applies to PCR values for which $k$ is still an integer.)

Figure: Staggered striping

With multiplexing, $k$ objects should be materialized in $k$ clusters simultaneously. As the clusters are logical with staggered striping, it is important that the located clusters do not share a single disk drive. In this case, staggered striping is a special case of the discussion in the main text where the number of time slots in a cycle is one (i.e., $U_{Cluster} = 1$).
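
The requirement that simultaneously used logical clusters share no disk drive can be expressed as a simple check. This is only a sketch under assumed conventions (disks numbered from 0, a logical cluster identified by its starting disk and degree of declustering); the helper names are hypothetical.

```python
def logical_cluster(start, degree, total_disks):
    """Set of logical disks forming a cluster of `degree` adjacent drives."""
    return {(start + j) % total_disks for j in range(degree)}


def share_no_disk(clusters):
    """True when no disk drive appears in more than one of the clusters."""
    seen = set()
    for cluster in clusters:
        if seen & cluster:
            return False
        seen |= cluster
    return True


# Hypothetical system with 8 disks.
TOTAL_DISKS = 8
c0 = logical_cluster(0, 3, TOTAL_DISKS)   # disks 0, 1, 2
c1 = logical_cluster(3, 2, TOTAL_DISKS)   # disks 3, 4
c2 = logical_cluster(6, 3, TOTAL_DISKS)   # disks 6, 7, 0 (wraps around)

print(share_no_disk([c0, c1]))       # the two clusters are disjoint
print(share_no_disk([c0, c1, c2]))   # c2 wraps onto disk 0, used by c0
```

A scheduler that multiplexes k objects would only accept a set of clusters for which this check succeeds.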

With non-multiplexing, the tertiary device should be dedicated to a single object X until the object is materialized in its entirety. Therefore, by reserving the appropriate $k$ time slots in one time cycle, $k$ sequential subobjects of X are flushed onto disk clusters. With staggered striping, this may not be feasible because sequential subobjects are flushed onto logical disk clusters that might share physical disk drives. For example, in the staggered striping figure, consecutive subobjects of X cannot be flushed simultaneously because they share disk drives. This can be resolved in two ways.

The first method changes the layout of the objects on the tertiary storage device. In this case, the fragments that can be flushed simultaneously should be stored on the tertiary storage device sequentially. In order to employ pipelining, the layout should be such that at least one subobject is flushed during each time cycle and, if subobject $X_i$ becomes disk resident during time cycle $i$, then subobject $X_{i+1}$ becomes disk resident during time cycle $i+1$. For example, in the second time cycle, the fragments needed to complete the next subobject must be flushed to render it disk resident in order to avoid a hiccup.

The second method does not violate the sequential layout of the
object on the tertiary. The subobjects are retrieved sequentially. However, during each time cycle, only those fragments that do not compete for the same disk drives are flushed onto the disk clusters.

Transferred_Subobjects(i, k_i)
    /* i is the time cycle */
    /* k_i is the number of available time slots in time cycle i */
    /* C: consumed, P: produced */
STEP 1:
    C_cluster = i mod R
    C_index = 0
    while (Consumed[C_index, C_cluster] = 1) do
        C_index = C_index + 1
STEP 2:
    prod = Min(n, i × PCR)
    P_index = ⌈prod⌉ div R;  P_cluster = ⌈prod⌉ mod R
    P_portion = prod - ⌊prod⌋;  if (P_portion = 0) then P_portion = 1
    /* The production cluster is behind */
    if (P_cluster < C_cluster) then
        P_portion = 1
        P_index = P_index - 1
    /* The production cluster has passed the consumed cluster */
    if (P_cluster > C_cluster) then P_portion = 1
    P_cluster = C_cluster
STEP 3:
    Disk_i = 0
    for pointer = C_index to Min(P_index, C_index + k_i - 1) do
        /* The last subobject may be produced partially */
        if (pointer = P_index) then
            Consumed[pointer, C_cluster] = P_portion
        else Consumed[pointer, C_cluster] = 1
        print Consumed[pointer, C_cluster] of subobject (pointer × R + C_cluster) is transferred
        Disk_i = Disk_i + Consumed[pointer, C_cluster]
    /* The accumulated amount that resides on disk at the i-th time cycle */
    Disk_i = Disk_i + Disk_{i-1}

Figure: Transferred subobjects in the i-th time cycle

For example, during the first time cycle, although two subobjects of X are memory resident, only the fragments that constitute the first subobject and one fragment of the second are flushed. In this case, a memory buffer is required as a staging area between the tertiary storage device and the disk drives. Moreover, the disk bandwidth is wasted during the first few time cycles. This is identical to the horizontal assignment of time slots described in the main text: assuming C disk drives are available per time cycle to materialize object X with degree of declustering $M_X$, $k = \mathrm{Min}(\lfloor C / M_X \rfloor, PCR_X)$ time slots are allocated horizontally. Moreover, $M_X$ fragments can be flushed onto disk clusters per time slot. Similar to horizontal assignment, although $k$ time slots are available, it may not be feasible to flush $M_X \times k$ fragments because some of them do not belong to the available disk drives or several of them belong to the same available disk drive.

One may employ an algorithm similar to the one in the figure "Transferred subobjects in the i-th time cycle" to determine the number of time cycles and the memory required to materialize object X. This algorithm should be extended to manipulate disk drives and fragments instead of clusters and subobjects.
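
The second method (flushing only fragments that do not compete for the same disk drive) can be sketched as a greedy selection over the memory-resident fragments. This is an illustrative sketch, not the report's algorithm: the fragment labels such as `X0.1`, the function name, and the one-write-per-disk-per-time-slot capacity are assumptions.

```python
def select_flushable(fragments, available_disks, writes_per_disk=1):
    """Split memory-resident fragments into those flushed now and those staged.

    fragments       -- list of (fragment_label, target_disk) in tertiary order
    available_disks -- disks that may be written during this time slot
    writes_per_disk -- assumed number of fragments one disk accepts per slot
    """
    remaining = {disk: writes_per_disk for disk in available_disks}
    flushed, staged = [], []
    for label, disk in fragments:
        if remaining.get(disk, 0) > 0:
            remaining[disk] -= 1
            flushed.append(label)
        else:
            staged.append(label)   # stays in the memory buffer for a later cycle
    return flushed, staged


# Hypothetical layout: degree of declustering 3, stride 1, so the first
# subobject occupies disks 0-2 and the second occupies disks 1-3.
memory_resident = [
    ("X0.0", 0), ("X0.1", 1), ("X0.2", 2),
    ("X1.0", 1), ("X1.1", 2), ("X1.2", 3),
]
flushed, staged = select_flushable(memory_resident, available_disks={0, 1, 2, 3})
print("flushed:", flushed)
print("staged: ", staged)
```

Under these assumptions the first subobject is flushed completely together with the single fragment of the second subobject that targets a free disk, while the conflicting fragments wait in the memory buffer.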

Appendix B: Algorithm for Horizontal Allocation

The figure "Transferred subobjects in the i-th time cycle" shows an algorithm to compute the number of subobjects flushed to the disk drives (consumed) per time cycle for the horizontal assignment of time slots. The figure "Consumed" shows the main data structure of the algorithm, termed Consumed, for a system that consists of R clusters and a referenced object that consists of n subobjects.

Figure: Consumed, the main data structure of the algorithm

Each element of this data structure corresponds to a portion of a subobject that should be assigned to a cluster. When the algorithm is first invoked, all elements of this data structure are initialized to zero. The value of each element must be a real number between 0 and 1. It represents the fraction of a subobject that has become disk resident: 1 indicates that the subobject has been flushed in its entirety, while 0 indicates that it is either tertiary or memory resident. A number between 0 and 1 indicates the fraction of a subobject flushed thus far. In the figure, the subobject corresponding to each element is shown at the top left position of the element. The algorithm always flushes the first subobject of X to the first disk cluster. Modifying it to start from a specific disk cluster is trivial. The general steps of the algorithm are as follows:
STEP 1: Locate the cluster C_Cluster such that, during time cycle i, the k_i allocated time slots correspond to C_Cluster. Moreover, determine the smallest index C_index that satisfies the following constraint: Consumed[C_Cluster, C_index] < 1. During this step, the last subobject that corresponds to C_Cluster and has been partially flushed is located.
STEP 2: Find the largest index P_index of the subobject that has been produced for C_Cluster. This step determines the last subobject that corresponds to C_Cluster and has become either partially or fully memory resident.
STEP 3: Flush the subobjects Consumed[C_Cluster, j], where j varies from C_index to either P_index or C_index + k_i - 1, whichever is smaller. During this step, the subobjects that have become memory resident are flushed to the disk clusters. The number of flushed subobjects is determined by the number of either available time slots or subobjects that are produced by the tertiary storage device.
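
The three steps can be sketched in runnable form. This is a minimal sketch under stated assumptions, not a transcription of the report's figure: subobjects are numbered from 0, subobject j belongs to cluster j mod R, time cycles are numbered from 1 and cycle i serves cluster (i - 1) mod R, the tertiary device has produced min(n, i × PCR) subobjects by cycle i, and each call returns only the amount newly flushed in that cycle. All names and example values are hypothetical.

```python
import math


def flush_cycle(consumed, i, k_i, clusters, n, pcr):
    """Flush subobjects during time cycle i; return the amount newly flushed."""
    # STEP 1: cluster served in this cycle and its first entry that is
    # not yet completely disk resident.
    c = (i - 1) % clusters
    row = consumed[c]
    c_index = 0
    while c_index < len(row) and row[c_index] >= 1.0:
        c_index += 1

    # STEP 2: largest index on this cluster whose subobject is at least
    # partially memory resident (-1 when nothing has been produced for it).
    produced = min(float(n), i * pcr)
    p_index = math.ceil((produced - c) / clusters) - 1

    # STEP 3: flush at most k_i entries, bounded by what has been produced.
    flushed = 0.0
    for idx in range(c_index, min(p_index, c_index + k_i - 1) + 1):
        subobject = idx * clusters + c
        fraction = min(1.0, produced - subobject)  # the last one may be partial
        flushed += fraction - row[idx]
        row[idx] = fraction
    return flushed


def materialize(n, clusters, pcr, slots_per_cycle):
    """Amount flushed in each time cycle until all n subobjects are on disk."""
    rows = math.ceil(n / clusters)
    consumed = [[0.0] * rows for _ in range(clusters)]  # Consumed[cluster][index]
    per_cycle = []
    i = 0
    while n - sum(per_cycle) > 1e-9:
        i += 1
        per_cycle.append(flush_cycle(consumed, i, slots_per_cycle, clusters, n, pcr))
    return per_cycle


# Hypothetical run: 8 subobjects, 2 clusters, PCR of 2.5, 2 slots per cycle.
print(materialize(n=8, clusters=2, pcr=2.5, slots_per_cycle=2))
```

In this assumed run, the first cycle flushes one complete subobject and half of another because the tertiary device has produced only 2.5 subobjects; the partially flushed subobject is completed the next time its cluster is served.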
Description
Shahram Ghandeharizadeh and Ali Dashti and Cyrus Shahabi. "A pipelining mechanism to minimize the latency time in hierarchical multimedia storage managers." Computer Science Technical Reports (Los Angeles, California, USA: University of Southern California. Department of Computer Science) no. 584 (1994).
Asset Metadata
Creator
Dashti, Ali
(author),
Ghandeharizadeh, Shahram
(author),
Shahabi, Cyrus
(author)
Core Title
USC Computer Science Technical Reports, no. 584 (1994)
Alternative Title
A pipelining mechanism to minimize the latency time in hierarchical multimedia storage managers (
title
)
Publisher
Department of Computer Science,USC Viterbi School of Engineering, University of Southern California, 3650 McClintock Avenue, Los Angeles, California, 90089, USA
(publisher)
Tag
OAI-PMH Harvest
Format
31 pages
(extent),
technical reports
(aat)
Language
English
Unique identifier
UC16269475
Identifier
94-584 A Pipelining Mechanism to Minimize the Latency Time in Hierarchical Multimedia Storage Managers (filename)
Legacy Identifier
usc-cstr-94-584
Rights
Department of Computer Science (University of Southern California) and the author(s).
Internet Media Type
application/pdf
Copyright
In copyright - Non-commercial use permitted (https://rightsstatements.org/vocab/InC-NC/1.0/)
Source
20180426-rozan-cstechreports-shoaf
(batch),
Computer Science Technical Report Archive
(collection),
University of Southern California. Department of Computer Science. Technical Reports
(series)
Access Conditions
The author(s) retain rights to their work according to U.S. copyright law. Electronic access is being provided by the USC Libraries, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright.
Repository Name
USC Viterbi School of Engineering Department of Computer Science
Repository Location
Department of Computer Science. USC Viterbi School of Engineering. Los Angeles, CA, 90089
Repository Email
csdept@usc.edu