USC Computer Science Technical Reports, no. 601 (1995)
On Configuring Hierarchical Multimedia Storage Managers
Shahram Ghandeharizadeh, Hsun-Ko Chan, Martha L. Escobar-Molano, and Xiangyu Ju
Computer Science Department
University of Southern California
February

Abstract
Multimedia information systems have emerged as an essential component of many application domains, ranging from library information systems to entertainment technology. A challenging task when implementing these systems is to support a continuous display of multimedia objects. The challenge is due to the low I/O bandwidth of current disk technology, the high bandwidth requirement of multimedia objects, and the large size of these objects, which requires them to be almost always disk resident. One approach to resolve this limitation is to decluster a multimedia object across multiple disk drives in order to employ the aggregate bandwidth of several disks to support its continuous retrieval and display. To provide online access to a vast amount of data economically, the storage architecture of these systems is expected to be hierarchical.
Assuming a hierarchical storage manager that consists of some memory, D disk drives, and a tertiary storage device, this paper describes a technique to support a continuous display of (possibly compressed) multimedia objects, the fundamental factors that impact a choice of configuration parameters for the system, and an algorithm to compute them.
Introduction
During the past decade, information technology has evolved to store and retrieve multimedia data (e.g., audio, video). Multimedia information systems utilize a variety of human senses to provide an effective means of conveying information. Already, these systems play a major role in educational applications, entertainment technology, and library information systems. A challenging task when implementing these systems is to support a continuous retrieval of an object at the bandwidth required by its media type [SAD, MWS, GRAQ] in order to ensure its continuous display. This is challenging because certain media types, in particular video, require very high bandwidths. For example, the bandwidth required by NTSC (the US standard established by the National Television System Committee) for network-quality uncompressed video is approximately megabits per second (mbps) [Has]. Recommendation 601 of the International Radio Consultative Committee (CCIR) calls for a mbps bandwidth for video objects. A video object based on HDTV (High Definition Television) quality images requires approximately a mbps bandwidth. Compare these bandwidth requirements with the typical mbps bandwidth of a magnetic disk drive,
which is not expected to increase significantly in the near future [PGK]. The solution proposed in this paper is to decluster the disk drives so that their aggregate bandwidth compensates for the high bandwidth requirements of the multimedia devices. Theoretically, a cluster of d disk drives, each with a bandwidth of $B_{disk}$, could provide a bandwidth of $d \times B_{disk}$. For example, given a video following the NTSC standard and disk drives of a known bandwidth, we decluster the disk drives in groups large enough for $d \times B_{disk}$ to meet the bandwidth requirement of the video. In reality, we have to consider the activation, seek, and latency times when computing the aggregate bandwidth. In fact, these overheads increase as the size of the cluster increases: more devices need to be activated and more heads need to be repositioned.
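As a small illustration of the theoretical bound $d \times B_{disk}$, the sketch below computes the smallest group size whose ideal aggregate bandwidth meets a display requirement. The function names and the numbers in the usage note are hypothetical and are not taken from the paper; activation, seek, and latency overheads are deliberately ignored here.

```python
import math


def ideal_aggregate_bandwidth(d: int, b_disk: float) -> float:
    """Ideal bandwidth of a group of d drives, ignoring all overheads."""
    return d * b_disk


def min_drives_per_cluster(b_display: float, b_disk: float) -> int:
    """Smallest d such that d * b_disk >= b_display (overheads ignored)."""
    if b_display <= 0 or b_disk <= 0:
        raise ValueError("bandwidths must be positive")
    return math.ceil(b_display / b_disk)
```

For instance, with an assumed display requirement of 100 mbps and an assumed drive bandwidth of 20 mbps, `min_drives_per_cluster(100.0, 20.0)` yields a group of 5 drives. Once overheads are taken into account, this value is only a lower bound on d.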
Assuming that all the disk drives in the system are identical and that the database consists of objects that belong to a single media type with bandwidth requirement $B_{Display}$, we utilize the aggregate bandwidth of d disk drives to support a continuous display of an object. This is achieved as follows. First, the D disk drives in the system are partitioned into R disk clusters, where $R = \lfloor D/d \rfloor$. Next, each object in the database (say X) is striped [SGM] into n equi-sized subobjects $X_1, X_2, \ldots, X_n$. Each subobject $X_i$ represents a contiguous portion of X. When X is materialized from the tertiary storage device, its subobjects are assigned to the disk clusters in a round-robin manner, starting with an available cluster. In a cluster, a subobject is declustered [RE, LKB, GD] into d pieces (termed fragments), with each fragment assigned to a different disk drive in the cluster.
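The two-level layout described above (round-robin striping across clusters, then declustering within a cluster) can be sketched as follows. This is a minimal illustration, not the authors' implementation: clusters are numbered from 0, the function names are hypothetical, and fragments are taken to be contiguous byte ranges of near-equal size, which the text does not specify.

```python
from typing import List


def assign_subobjects(n_subobjects: int, n_clusters: int, start_cluster: int) -> List[int]:
    """Cluster index for each subobject X_1..X_n, assigned round-robin."""
    return [(start_cluster + i) % n_clusters for i in range(n_subobjects)]


def decluster(subobject: bytes, d: int) -> List[bytes]:
    """Split one subobject into d fragments, one per drive of the cluster."""
    frag = -(-len(subobject) // d)  # ceiling division
    return [subobject[k * frag:(k + 1) * frag] for k in range(d)]
```

Concatenating the fragments returned by `decluster` in order reproduces the subobject, so a cluster can rebuild a memory frame from d parallel reads.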
Assuming that the bandwidth of each disk cluster is high enough that it can be multiplexed between $U_{Cluster}$ requests, the available memory is partitioned into $R \times U_{Cluster}$ frames. To ensure a continuous display of an object, the system maintains a time cycle for each cluster. A time cycle consists of $U_{Cluster}$ time intervals (also termed slots). A time interval is the time required for a cluster to reposition its disk heads and transfer a subobject into a memory frame. The duration of a time cycle corresponds to the display time of a subobject. Given a request for an object X that consists of n subobjects, the system reserves n time intervals on behalf of this request, one per time cycle of the system. Each reserved time slot occupies the same slot per time cycle. Relative in time, the distance between any two consecutive time slots reserved on behalf of a request is $U_{Cluster}$. The cluster employed in the first time cycle is the one containing $X_1$, say $C_i$. The display of X starts once $X_1$ is staged in memory. In the second cycle, cluster $C_{(i+1) \bmod R}$ is employed to read $X_2$. $X_2$ becomes memory resident immediately before the display of $X_1$ completes, because the duration of a cycle corresponds to the display time of a subobject ($X_1$ in this case). The system switches from the memory frame containing $X_1$ to the one containing $X_2$ to support a continuous display of X. The system iterates over the clusters and memory frames until X is displayed in its entirety, employing a single cluster in each time cycle.

Footnote: The concepts described in this paper are applicable to other secondary storage devices.
Footnote: $X_n$ is an exception to this statement.

[Figure: A schedule for servicing three requests]

[Table: An assignment of subobjects to a three cluster system. One column per cluster; the entries are the subobjects of X, Y, and Z.]
To illustrate, assume a system that consists of three disk clusters. Moreover, assume that the bandwidth of each cluster is twice the bandwidth required to display an object ($U_{Cluster} = 2$). Let the following three objects reside on the disk clusters: X, Y, and Z. Each object is striped into subobjects and assigned to the clusters in a round-robin manner, starting with a different cluster for each object; the table of subobject assignments shows the result. Assume that three requests are issued, each referencing a different object, with the following order of arrival: X, Y, followed by Z. The figure captioned "A schedule for servicing three requests" demonstrates the scheduling of the time intervals to display the different subobjects as a function of time. This schedule is possible due to the assumed assignment, which allows the system to start the display of X, Y, and Z in the same time cycle. Note that each $X_i$ employs the same time interval in each time cycle. Moreover, relative in time, the slots reserved on behalf of a display (say Y) are $U_{Cluster}$ slots apart. The duration of a time cycle corresponds to the display time of each subobject. Consequently, $X_2$ will be memory resident before the display of $X_1$ completes.

If the round-robin assignment of the objects had started with the same cluster, then $X_1$, $Y_1$, and $Z_1$ would all have been assigned to that cluster. In this case, the displays of X and Y would start in the first time cycle ($X_1$ occupying the first interval and $Y_1$ the second interval of that cluster), while the display of Z would employ a time interval in the second cycle of that cluster. Thus, the duration of a time interval determines the wait time for the request referencing object Y, while the duration of a time cycle determines the wait time for the request referencing object Z. Note that the round-robin assignment of subobjects enables the system to support a maximum of six simultaneous displays.
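The two scenarios in this example can be reproduced with a small admission sketch. It rests on simplifying assumptions that are not in the paper: all requests are pending at cycle 0, no display finishes during the horizon considered, and clusters, cycles, and slots are numbered from 0. Because every active display advances by one cluster per cycle, displays that share a cluster in one cycle share a cluster in every cycle; the code groups them by a hypothetical "phase" and admits at most $U_{Cluster}$ displays per phase.

```python
from typing import Dict, List, Optional, Tuple


def admit_requests(
    requests: List[Tuple[str, int]], n_clusters: int, u_cluster: int
) -> Dict[str, Optional[Tuple[int, int]]]:
    """Map each request name to (start_cycle, slot), or None if the system is full.

    requests: (name, cluster holding the first subobject), in arrival order.
    """
    load = [0] * n_clusters  # displays already admitted in each phase
    result: Dict[str, Optional[Tuple[int, int]]] = {}
    for name, first_cluster in requests:
        result[name] = None
        for cycle in range(n_clusters):
            # Displays in the same phase visit the same cluster in every cycle.
            phase = (first_cluster - cycle) % n_clusters
            if load[phase] < u_cluster:
                result[name] = (cycle, load[phase])
                load[phase] += 1
                break
    return result
```

With three clusters and $U_{Cluster} = 2$, starting X, Y, and Z on different clusters lets all three begin in cycle 0, whereas starting all three on the same cluster delays Z to the next cycle, as the text describes. At most `n_clusters * u_cluster` requests are admitted in total.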
The concept of declustering a multimedia object across multiple disk drives in order to support its continuous display was originally described in [Ram, GRAQ, GR]. These studies assumed a shared-nothing architecture [Sto] as the hardware platform of a multimedia information system. In [GS], an extension with a tertiary storage device is assumed in order to provide online access to a vast amount of data. It introduced the concept of a disk cluster and virtual data replication as a mechanism to support the display of objects to different users. However, it does not stripe the objects (i.e., the entire object resides in one disk cluster) and does not consider a mix of media types. In [BGMJ], striping is used as an alternative to virtual data replication. With this technique, an object is striped into several subobjects, with each subobject assigned to a different disk cluster. It also assigns a degree of declustering to the bandwidth requirement of each media type, based on the theoretical disk cluster bandwidth (i.e., ignoring the repositioning time of the heads). While these studies were a significant first step, they assumed a minimal amount of available memory and strived to establish a producer-consumer relationship between a disk cluster producing the data and a display station consuming it (by displaying the data). This study extends the previous work by using memory as an intermediate staging area to display an object.
Since multimedia objects are considerably large, compression is essential. But when dealing with compressed objects, the bandwidth requirements of the displays become variable. Therefore, trying to synchronize the disk clusters to produce data at the same pace as the display stations consume it could incur significant latency times. Because physical memory is not mechanical, staging data in memory before displaying it alleviates the problem.

Footnote: If all subobjects of X, Y, and Z were assigned to cluster one (i.e., by ignoring the round-robin strategy), then cluster one could have displayed only two of the pending requests simultaneously, while the third request would have had to wait even though two clusters in the system remain idle, waiting for work.
Term                 Definition
tfr                  Transfer rate of a single disk drive
$T_{act}(d)$         Overhead of activating d disk drives
$T_{seek}(d)$        Worst case seek time of d disk drives
size(subobject)      Size of the unit of transfer between a disk cluster and the memory
size(mem)            Total size of memory
$B_{Display}$        Bandwidth required to display an object
$B_{Cluster}$        Bandwidth of a disk cluster
$B_{Tertiary}$       Bandwidth of the tertiary storage device
D                    Number of disk drives in the system
d                    Number of disk drives per disk cluster
R                    Number of disk clusters in the system, $R = \lfloor D/d \rfloor$
$U_{Cluster}$        Number of simultaneous displays supported by one disk cluster per time interval
$U_{IO}$             Number of simultaneous displays supported by the I/O subsystem per time interval

Table: List of terms used repeatedly in this paper and their respective definitions
This paper addresses the following research topics:
1. How many disk drives should be assigned to each cluster?
2. What is the design of a cycle and its time interval, and how does it ensure a continuous display of a multimedia object?
3. What is the size of a subobject?
4. How does the system support a database that consists of a mix of multimedia objects, each with a different bandwidth requirement?
5. How does the system support compression?
The rest of this paper is organized as follows. We first describe the assumed architecture. We then provide answers to the above questions, starting with the simplest case and incrementally adding complexity. The Single Media Type section focuses on a database that consists of non-compressed objects of a single media type and identifies the fundamental factors that impact a solution. The Mix of Media Types section extends this work to a database that consists of a mix of media types. A later section extends this discussion to explain how compression can be implemented. Our conclusions and future research directions are contained in the final section.
Architecture
This paper presents an approach to support the continuous display of objects based on a hierarchical storage architecture that consists of a tertiary storage device, a group of disk drives, and memory. The purpose of having a hierarchical architecture is to strike a compromise between the quality of the service and the cost of providing such a service. [MWS] identifies the second factor as an important criterion for a multimedia information system. While memory is ideal for fast service, its cost and the massive amount of information that a multimedia information system manages make secondary storage necessary. In addition, cost-wise, tertiary storage is favorable over disk drives. But the delays incurred by repositioning the read head of a tertiary storage device are higher than those incurred by disk drives. Therefore, a hierarchical architecture enables the system to maximize the utilization of its resources in order to service many more requests simultaneously.

We assume that the entire database is stored on a tertiary storage device and that the objects are materialized on the disk drives on demand. The objects are then staged into main memory to be displayed later. Buffering the data in memory helps to absorb the difference between the production rate (disk cluster bandwidth) and the consumption rate (display bandwidth) without wasting disk drive bandwidth by repositioning the heads too often. The solution presented in this paper assumes that part of the memory is assigned in advance to serve as a buffer between the disk drives and display stations.
We consider two alternative organizations of the components in the hierarchical architecture: (1) memory serves as an intermediate staging area between the tertiary storage device, the disk drives, and the display stations, and (2) the tertiary storage device is accessible only to the disk drives via a fixed-size memory. With the first organization, the system may elect to display an object from the tertiary storage device by using memory as an intermediate staging area. With the second organization, the data must first be staged on the disk drives before it can be displayed. We capture these two organizations using three alternative paradigms for the flow of data among the different components:

- Sequential Data Flow (SDF): The data flows from tertiary to memory (STREAM 1 of the data flow figure), from memory to the disk drives (STREAM 2), from disk drives back to memory (STREAM 3), and finally from memory to the display station referencing the object (STREAM 4).
- Parallel Data Flow (PDF): The data flows from tertiary to memory (STREAM 1), and from memory to both the disk drives and the display station in order to materialize (STREAM 2) and display (STREAM 4) the object simultaneously. PDF eliminates STREAM 3.
- Incomplete Data Flow (IDF): The data flows from tertiary to memory (STREAM 1) and from memory to the display station (STREAM 4) to support a continuous retrieval of the referenced object. IDF eliminates both STREAM 2 and STREAM 3.

[Figure: Three alternative data flow paradigms]

The tertiary storage device impacts (1) memory requirements only (IDF), or (2) memory and disk bandwidth requirements (SDF, PDF). This paper focuses on the interaction between disk drives, memory, and display stations. The first impact is studied in [GDS]. The second impact can be viewed as an additional path (STREAMS 1 and 2). The difference between STREAMS 1 and 2 and the ones considered in this paper is that the roles are changed: the disk drives become consumers instead of producers. But the goal is to equate the producer and consumer rates; therefore, the change of roles is irrelevant for the configuration procedures. Hence, tertiary can be modeled as another media type.
Single Media Type
This section focuses on a database that consists of a single media type, i.e., the bandwidth requirement of each object, $B_{Display}$, is identical. Assuming that a system consists of D disk drives and a fixed amount of memory, we develop a technique that computes the configuration parameters of the system (d, size(subobject)). The objective of this technique is to maximize the number of simultaneous displays. This enhances the overall performance of the system by improving its response time and increasing its overall useful utilization.

We first discuss how to construct the disk clusters to support displays without hiccups under some constraints, and what the memory constraint is. Finally, we describe a configuration procedure and a heuristic search to find the d and size(subobject) that achieve the maximal $U_{IO}$.
Display Without Hiccups and Memory Constraint
The motivation for partitioning the D disk drives into R clusters is to increase the I/O bandwidth of the system: partitioning the disks increases the fraction of time the disk drives spend performing useful work (reading data) instead of wasteful work (either waiting to be activated or repositioning their heads). Consider the alternative forms of wasteful work in turn. First, the overhead of activating a disk cluster increases as a function of the number of disk drives, d, that constitute a cluster; it is modeled as $T_{act}(d)$. Second, the average seek time of a cluster increases as a function of d if the organization of data across the d disk drives is not identical. Once a request is activated on a cluster, its seek time is determined by the disk drive that has the longest seek time [LKB, PGK]. The expected seek time of a disk cluster that consists of d disk drives was derived in [BG]. Assuming that $T_{act}(d)$ is known, the bandwidth of a disk cluster as a function of d and the size of a subobject can be defined as:

$$B_{Cluster} = \frac{size(subobject)}{T_{act}(d) + T_{seek}(d) + \frac{size(subobject)}{d \times tfr}}$$

where tfr is the transfer rate of a single disk drive. One constraint on the bandwidth of a cluster is that it should be greater than or equal to the bandwidth required to support a continuous display of an object in the database:

$$B_{Cluster} \geq B_{Display}$$
Otherwise, the system would not be able to support a continuous display of an object. If $B_{Cluster}$ is significantly higher than $B_{Display}$, then a disk cluster can be multiplexed among $U_{Cluster}$ requests:

$$U_{Cluster} = \frac{B_{Cluster}}{B_{Display}}$$
In this case, the size of a subobject (say $X_i$) should be chosen such that its display time (i.e., $size(subobject) / B_{Display}$) is greater than or equal to the sum of the amount of time a disk cluster is multiplexed among the $U_{Cluster} - 1$ other requests accessing subobjects and the time required to read the next subobject, $X_{i+1}$. This constraint can be expressed as follows:

$$\frac{size(subobject)}{B_{Display}} \geq U_{Cluster} \times \left( T_{act}(d) + T_{seek}(d) + \frac{size(subobject)}{d \times tfr} \right)$$

[Figure: A schedule for servicing three requests using a cluster]
The reason for using greater than or equal to is as follows. If the display time of a subobject is smaller than the right-hand side of the inequality, then the data will not be produced at the desired rate, resulting in hiccups. When the display time of a subobject is greater, the data will be produced faster than it can be consumed; hence, we have to waste part of the disk bandwidth. Therefore, we should aim for the equality. For example, if all the subobjects of X, Y, and Z are assigned to one cluster and the display time of a subobject spans more time intervals than $U_{Cluster}$ (see the figure captioned "A schedule for servicing three requests using a cluster"), then we will have to waste disk bandwidth, i.e., the disk cluster will be idle between the reading of each subobject of Z and the next subobject of X.
Furthermore, since we want to maximize $U_{Cluster}$, we use the equality form of the above constraint to define $U_{Cluster}$ as a function of d and size(subobject):

$$U_{Cluster} = \frac{size(subobject) / B_{Display}}{T_{seek}(d) + T_{act}(d) + \frac{size(subobject)}{d \times tfr}}$$

or to define size(subobject) as a function of d and $U_{Cluster}$:

$$size(subobject) = \frac{U_{Cluster} \times (T_{seek}(d) + T_{act}(d))}{\frac{1}{B_{Display}} - \frac{U_{Cluster}}{d \times tfr}}$$

These definitions are useful for explaining the search space of our optimization problem and the heuristic developed in this section.

Footnote: Note that the size of a subobject is fixed for all objects because this section assumes that all objects belong to a single media type.
Footnote: $U_{Cluster}$ is a real number; we will apply a floor function to make it an integer in our heuristic search.
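The relationships between cluster bandwidth, $U_{Cluster}$, and size(subobject) can be checked numerically with a short sketch. The function names are hypothetical, the activation and seek overheads are passed in as already-evaluated numbers for the chosen d (the paper models them as functions of d), and units are assumed to be consistent (for example megabits, mbps, and seconds).

```python
def cluster_bandwidth(size: float, d: int, tfr: float, t_act: float, t_seek: float) -> float:
    """Effective bandwidth of a cluster of d drives for a given subobject size."""
    return size / (t_act + t_seek + size / (d * tfr))


def u_cluster(size: float, d: int, tfr: float, b_display: float,
              t_act: float, t_seek: float) -> float:
    """Real-valued number of displays one cluster can multiplex."""
    return (size / b_display) / (t_seek + t_act + size / (d * tfr))


def size_subobject(u: float, d: int, tfr: float, b_display: float,
                   t_act: float, t_seek: float) -> float:
    """Subobject size at which exactly u displays keep the cluster busy."""
    denom = 1.0 / b_display - u / (d * tfr)
    if denom <= 0:
        raise ValueError("u displays exceed the raw bandwidth of d drives")
    return u * (t_seek + t_act) / denom
```

The two definitions are inverses of each other: feeding the output of `size_subobject` back into `u_cluster` returns the original u, and `cluster_bandwidth` at that size equals `u * b_display`.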
The goal is to configure the system so as to maximize the total number of simultaneous displays, $U_{IO}$. We now define $U_{IO}$ as:

$$U_{IO} = R \times U_{Cluster}$$

where R is the number of disk clusters in the system ($R = \lfloor D/d \rfloor$). However, in order for the system to support $U_{IO}$ displays, the memory should consist of $R \times (U_{Cluster} + 1)$ frames, where the size of a frame corresponds to the size of a subobject; hence the following constraint:

$$R \times (U_{Cluster} + 1) \times size(subobject) \leq size(memory)$$

Configuration
In this section, we describe a technique that strikes a compromise between d and size(subobject) to achieve the maximal $U_{IO}$. Since we have two equations and three variables (d, size(subobject), and $U_{Cluster}$), we approximate an optimal solution using a heuristic. We compute the maximal $U_{Cluster}$ for each d; then we can select the maximal $U_{IO}$ based on $U_{IO} = R \times U_{Cluster}$. The search space for this problem is shown from two different angles in the figure captioned "Two dimensional view of the search space". This figure was generated for fixed values of D, $T_{act}$ (a number of microseconds multiplied by d), $B_{Display}$ (in mbps), tfr (in mbps), and size(memory) (in Gigabytes).
From the definition of $U_{Cluster}$, it is easy to see that, for a fixed value of d, $U_{Cluster}$ increases as size(subobject) increases. As shown in panel (a) of the search space figure, for a fixed value of d, the number of simultaneous displays ($U_{IO}$) as a function of size(subobject) increases in a stepwise manner. The height of each step is R. The explanation is as follows: $U_{Cluster}$ is a function of size(subobject) and should be an integer, so we apply a floor function to its definition here. As one increases the size of a subobject, the value of $U_{Cluster}$ increases by one at regular intervals. Each time this happens, $U_{IO}$ increases by R (since $U_{IO} = R \times U_{Cluster}$). Therefore, for each d, it suffices to compute the maximal size(subobject) that satisfies the constraints to obtain the maximal $U_{Cluster}$. Unlike the case of a fixed d, if we fix size(subobject), $U_{IO}$ is not necessarily a monotonic function with respect to d, as can be seen in panel (b) of the same figure.

Footnote: size(subobject) is a real number; therefore, we have to align size(subobject) to the byte. We can do this alignment in an interleaved way: for example, we read the rounded-down number of bytes in the first, second, and third cycles, one byte more in the fourth, and so forth.
Footnote: Plus one per cluster, because an additional frame is needed to momentarily stage both $X_i$ and $X_{i+1}$ in memory when the system switches between two subobjects.
Footnote: In current workstation technology, memory size is a fixed fraction of disk capacity; the memory size assumed here follows from that ratio and the given disk capacity.

[Figure: Two dimensional view of the search space]

[Figure: Eliminate search space using the memory constraint]
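The stepwise growth of $U_{IO}$ for a fixed d, and the memory requirement that accompanies it, can be illustrated with the sketch below. The function names and all numbers are hypothetical; `overhead` stands for $T_{act}(d) + T_{seek}(d)$ evaluated at the chosen d, and the frame count includes the one extra frame per cluster mentioned in the footnote.

```python
import math


def u_io_for(size: float, d: int, D: int, tfr: float, b_display: float,
             overhead: float) -> int:
    """U_IO = R * floor(U_Cluster) for a given subobject size and cluster size d."""
    u = math.floor((size / b_display) / (overhead + size / (d * tfr)))
    return (D // d) * u


def memory_needed(size: float, d: int, D: int, u_cluster: int) -> float:
    """Memory for R * (U_Cluster + 1) frames of one subobject each."""
    return (D // d) * (u_cluster + 1) * size
```

With D = 4 and d = 2 (so R = 2), sweeping the subobject size over 0.2, 1.0, 2.0, and 4.0 gives $U_{IO}$ values of 0, 2, 4, and 6: each step has height R, and the memory needed grows with both the frame count and the frame size.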
When the memory constraint is violated due to a limited amount of memory, one can satisfy it by striking a compromise between d and size(subobject); these two parameters determine $U_{Cluster}$ and, in turn, $U_{IO}$. We change the inequality of the memory constraint into an equality in order to compute the upper bound for $U_{Cluster}$. By combining the definition of size(subobject) with the memory constraint, we obtain $U_{Cluster}$ as a function of size(mem) and d:

$$U_{Cluster} = \frac{-b + \sqrt{b^2 + 4 \times (T_{act}(d) + T_{seek}(d)) \times \lfloor D/d \rfloor \times tfr \times d \times \frac{size(mem)}{B_{Display}} \times d \times tfr}}{2 \times (T_{act}(d) + T_{seek}(d)) \times \lfloor D/d \rfloor \times d \times tfr}$$

where $b = (T_{act}(d) + T_{seek}(d)) \times \lfloor D/d \rfloor \times tfr \times d + size(mem)$. The memory constraint eliminates a portion of the search space (see the figure captioned "Eliminate search space using the memory constraint"). This search space can be restricted further using the hardware characteristics of the magnetic
disk drives. A magnetic disk drive is almost always required to reposition its head if the unit of transfer is larger than the size of a cylinder. Consequently, there are marginal advantages to choosing a size(subobject) that renders the unit of transfer from each disk drive larger than a cylinder. Thus, we can limit the search by assuming that the size of a subobject is no larger than $d \times size(cylinder)$.

Input: $B_{Display}$, D, size(cylinder), tfr, $T_{seek}(d)$, $T_{act}(d)$, and size(memory)
Output: d, size(subobject), and $U_{IO}$
Begin
    $d = \lceil B_{Display} / tfr \rceil$    (* d cannot be smaller than this value *)
    ProspectiveSet = NULL
    For all d (Integer and $\lceil B_{Display} / tfr \rceil \leq d \leq D$) Do
        $size(subobject) = d \times size(cylinder)$
        Compute $U_{Cluster}$ from its definition and take the floor of its value
        Compute size(subobject) from its definition as a function of d and $U_{Cluster}$
        (* Check the memory constraint *)
        If $(U_{Cluster} + 1) \times \lfloor D/d \rfloor \times size(subobject) > size(memory)$ then
            Compute $U_{Cluster}$ from the memory upper bound and take the floor of its value
            Compute size(subobject) from its definition as a function of d and $U_{Cluster}$
        EndIf
        Compute $B_{Cluster}$ from its definition
        If $B_{Cluster} \geq B_{Display}$ then    (* check the bandwidth constraint *)
            Compute $U_{IO} = R \times U_{Cluster}$
            Insert <d, size(subobject), $U_{IO}$> into ProspectiveSet
        EndIf
        Increment d by one
    End For
    Choose the element of ProspectiveSet with the maximum $U_{IO}$ value and
    return its d, size(subobject), and $U_{IO}$
End

Figure: A heuristic to compute d, size(subobject), and $U_{IO}$

The heuristic in this figure employs the rule of thumb above and performs an exhaustive search of the remaining space. It visits the different states by analyzing all possible values of d, ranging from $\lceil B_{Display} / tfr \rceil$ (its lower bound) to D. For each value, it sets the size of a subobject as a function of d and the cylinder size. Next, it uses the definition of $U_{Cluster}$ to compute the number of users supported by a cluster and converts this value to an integer. It uses d and the obtained $U_{Cluster}$ to recompute size(subobject). This recomputation is necessary because the equality no longer holds for the new value of $U_{Cluster}$, and it may reduce size(subobject) significantly. Using the obtained d, size(subobject), and $U_{Cluster}$, the heuristic checks the memory constraint. If this constraint is violated, it recomputes $U_{Cluster}$ as a function of d and the available memory in order to compute a new size(subobject). It employs the obtained values to check the bandwidth constraint ($B_{Cluster} \geq B_{Display}$). If they violate this constraint, the obtained values are eliminated from further consideration. Otherwise, they are treated as a solution and maintained in a set. This procedure is repeated for all values of d up to D. From the obtained set of solutions, we choose the solution that maximizes $U_{IO}$. The order of this heuristic is $D - \lceil B_{Display} / tfr \rceil$.
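The heuristic can be sketched in a few lines of code. This is an illustrative rendering under stated assumptions, not the authors' implementation: `t_seek` and `t_act` are caller-supplied functions of d (the paper takes them as inputs without fixing their form here), their sum is assumed to be positive, units are assumed consistent, the memory bound is the positive root of the quadratic obtained from the memory constraint with $U_{Cluster} + 1$ frames per cluster, and a small relative tolerance is added to the bandwidth check to absorb floating-point rounding.

```python
import math
from typing import Callable, Optional, Tuple


def configure(b_display: float, D: int, size_cylinder: float, tfr: float,
              t_seek: Callable[[int], float], t_act: Callable[[int], float],
              size_memory: float) -> Optional[Tuple[int, float, int]]:
    """Return (d, size_subobject, u_io) maximizing u_io, or None if none qualifies."""
    best: Optional[Tuple[int, float, int]] = None
    for d in range(max(1, math.ceil(b_display / tfr)), D + 1):
        R = D // d
        overhead = t_seek(d) + t_act(d)  # assumed > 0

        def size_for(u: int) -> float:
            # Subobject size at which u displays exactly fill a time cycle.
            return u * overhead / (1.0 / b_display - u / (d * tfr))

        # Rule of thumb: start from one cylinder per drive.
        size = d * size_cylinder
        u = math.floor((size / b_display) / (overhead + size / (d * tfr)))
        size = size_for(u)

        # Memory constraint: R * (u + 1) frames must fit in memory.
        if (u + 1) * R * size > size_memory:
            a = overhead * R * d * tfr
            b = a + size_memory
            c = -size_memory * d * tfr / b_display
            u = math.floor((-b + math.sqrt(b * b - 4 * a * c)) / (2 * a))
            size = size_for(u)

        # Bandwidth constraint (a cluster with u == 0 fails it automatically).
        b_cluster = size / (overhead + size / (d * tfr))
        if b_cluster >= b_display * (1 - 1e-9):
            u_io = R * u
            if best is None or u_io > best[2]:
                best = (d, size, u_io)
    return best
```

As a usage example with hypothetical numbers (D = 4, $B_{Display}$ = 10, tfr = 20, a cylinder of 4 units, a constant seek time of 0.02, and an activation time of 0.005 per drive), ample memory leads to d = 2 with six simultaneous displays, while shrinking the memory to 5 units pushes the choice to d = 1 with four displays. Ties are broken in favor of the smallest d, which is a choice of this sketch.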
Mix of Media Types
We present a new approach to configure a system with a mix of media types. As compared to a database that consists of a single media type, the following additional factors impact the performance of the system: (1) the mix of media, (2) the scheduling policy, and (3) the duration of each display. The approach presented considers a representative queue of requests and a scheduling policy to compute the size of a subobject and the number of disk drives per cluster in order to maximize the performance of the system.

The model used to display the objects is similar to that of a database consisting of a single media type. The basic difference is that, instead of reading subobjects, the system reads blocks. The size of a block depends on the bandwidth requirement of its object, i.e., the media type: the higher the bandwidth, the larger the block size.

An immediate question is how to determine the size of a block. A naive approach is to configure the block size based on one media type (say A) and define the block size for the other media types (say B) as a function of this size by multiplying the block size by $B_{display}(B) / B_{display}(A)$. We start with an example demonstrating that the configuration parameters produced by considering a single media type are not optimal from a global perspective. Subsequently, we describe the model and the new factors to be considered. Finally, we present the algorithm to compute the optimal configuration parameters and a case study where a heuristic can be applied to compute the parameters faster.
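The naive block-size scaling described above amounts to one multiplication per media type. The sketch below uses a hypothetical function name and hypothetical bandwidth values; it only illustrates the ratio $B_{display}(B) / B_{display}(A)$ and is not the configuration algorithm the paper goes on to develop.

```python
from typing import Dict


def scaled_block_sizes(reference_type: str, reference_block_size: float,
                       b_display: Dict[str, float]) -> Dict[str, float]:
    """Block size per media type, proportional to its display bandwidth."""
    ref_bandwidth = b_display[reference_type]
    return {media: reference_block_size * bandwidth / ref_bandwidth
            for media, bandwidth in b_display.items()}
```

For two media types with assumed bandwidths of 2 and 8 units, configuring the first with a block of size 1 gives the second a block of size 4, so both blocks take the same time to display.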
Example: Consider a system with a given number of disks and a fixed amount of memory, where each disk is characterized by its capacity (in GBytes), its number of cylinders and their size (in MBytes), its activation time (in seconds), and its transfer rate (in megabits per second).

Assume a database consisting of five media types: CD audio, and video using the CIF, NTSC, CCIR Recommendation 601, and HDTV representations. Suppose that every display has a video and an audio component; therefore, CD objects occur half of the time. Furthermore, the bandwidth required to display the objects of each media type, the heat of each media type (the probability that a request is of a given type), and the average display time of the objects of each media type are as follows:

Media Type   Bandwidth (mbps)   Heat (percentage)   Average Display Time (minutes)
CD
CIF
NTSC
CCIR
HDTV

Assume that the objects are referenced randomly. If we build a representative queue of requests that satisfies the previous assumptions, we can compare the service time of the queue when the system is configured based on a single media type with the service time when it is configured based on the representative queue of requests. (Footnote: the number of requests is large enough to make the requests compete for the system resources.)

The following table shows the configuration parameters for each media type and the time required to service the representative queue:

Media Type   Bandwidth (mbps)   d (disks)   R (clusters)   size(subobject) (MBytes)   $B_{Cluster}$ (mbps)   Service Time (hours:minutes:seconds)
CD           Cannot support all the media types
CIF          Cannot support all the media types
NTSC         Cannot support all the media types
CCIR
HDTV

However, the average service time is lower if the system is configured based on a representative queue of requests. Therefore, the service time when configuring based on a single media type (CCIR or HDTV) could be higher than when configuring based on the representative queue.
Model

This section presents a model to support the simultaneous display of a mix of media types in a clustered multi-disk system. We start by considering the new factors that impact the performance of the system. Subsequently, we describe how the objects can be displayed without hiccups and establish the memory requirements of the system. Finally, we list the parameters needed to support a mix of media types and their constraints.

New Factors

The factors that impact the configuration parameters of the system include the mix of media, the scheduling policy, and the display time of the requests. Consider each in turn.

The first and obvious difference is that each media type has its own bandwidth requirement. This leads us to read the objects in blocks instead of subobjects. The size of the block is fixed for objects of one media type and is proportional to the bandwidth required by that media type. The size of a block is different for alternative media types because the consumption rates for the display of different media types vary. Therefore, the amount of memory allocated for the display of each media type should be proportional to its consumption rate. If we allocated the same amount of memory for each media type, the data of an object with the lowest consumption rate would reside in memory longer than required, preventing the display of other objects. Therefore, we use a different block size for each bandwidth.
F or the same reasons that w e read only one sub ob ject of an ob ject p er cycle w e read only one
blo c k of an ob ject p er cycle And as b efore the blo c ks are assigned to the clusters in a round robin
manner Also w e allo w to read the blo c ks in pieces spread within a cycle Otherwise if the blo c ks
m ust b e read con tiguously it will limit the p ossible requests to t in a cycle w emayha vet woor
more noncon tiguous time in terv als that could b e used to read a blo c k But it is imp ortan t that
the blo c ks are read in the same manner in all cycles
That is, if an object A is split into blocks $A_1, \ldots, A_m$ and block $A_1$ is split into pieces $A_{1,1}, \ldots, A_{1,p}$, then if the piece $A_{1,j}$ is read from $t_{s_j}$ to $t_{e_j}$ in a cycle, then for each i, $A_{i,j}$ is read from $t_{s_j}$ to $t_{e_j}$ in the cycle where $A_i$ is read. The following example illustrates a possible scheduling of the display of an object.
Example: Assume that an object A consists of [...] blocks and that, given the current status of the system, the system is forced to split each block into [...] pieces. Suppose that we configure the system with [...] clusters. A possible schedule for the object is:
Otherwise, hiccups could be caused, or pieces of the object would reside in memory longer than required.
[Schedule figure: three consecutive cycles, starting at cycle i on cluster j and moving to the following cluster (modulo the number of clusters) in each later cycle. In every cycle, the pieces of the corresponding block of A are read in the same time intervals, each delimited by a start time $t_s$ and an end time $t_e$.]
The mix of media impacts the performance of the system. The order in which different media types are queued in a system that serves on a first-come, first-served basis could affect the performance. The following example illustrates this effect.
Example: Suppose that the database consists of objects that belong to two different media types, A and B, with $B_{display}(A) =$ [...] mbps while $B_{display}(B) =$ [...] mbps. Suppose that a cycle consists of [...] time slots, the block sizes of A and B are [...] and [...] subobjects respectively, and there is only one disk cluster. Suppose that the scheduler serves the requests in an FCFS manner and a queue of [...] requests is waiting to be served. Assume that the size of each referenced object is [...] block. Moreover, suppose that the queue starts with a request of A followed by a request of B, and every other request is of the same type. Therefore, the scheduler will read three requests during each cycle for the first six cycles and then the last two requests in the seventh cycle.
If instead the queue consists of a request of A followed by two requests of B, with the same pattern repeated [...] times, followed by [...] requests of A, then the only difference with the previous case is the order of the requests. However, the scheduler requires only [...] cycles to service the queue.
Even though we are reading blocks of objects instead of subobjects, we maintain the concept of subobject as a unit of reading to consider the repositioning time.
As shown in the following example, another factor that impacts the performance of the system is the scheduling policy.
Example: Suppose that we have a situation similar to the first case of Example [...], but the scheduler selects the next request in the queue that fits in what is left in the cycle. Then we can schedule to service the requests [...] and [...] in the first cycle, [...] and [...] in the second cycle, and so forth. Therefore, only [...] cycles are required to service the queue, which is shorter than the seven cycles required with the first-come, first-served policy of Example [...].
The other factor to consider is the duration of each display. As demonstrated in Example [...], the way the media types are mixed affects the performance of the system. Increasing the length of a request can be represented by adding, after the original request, an extra request of length equal to the increase in size. For example, if the original queue consists of two requests of different media types and the size of the first request is doubled, then we can represent the new queue as consisting of three requests: the second request is identical to the first, while the first and last requests are the same as in the original queue. Since the mix of media is different in the second case, its performance could be different.
We thus configure the system based on a representative queue of requests, which captures the factors described above.
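The effect of queue order and scheduling policy can be sketched with a small simulation. This is only an illustration under stated assumptions: one disk cluster, every request is a single block, and the cycle length and block sizes (3 time slots, 2 slots for type A, 1 slot for type B) are hypothetical values, not the ones used in the examples above. The function names are likewise illustrative.

```python
def cycles_fcfs(queue, sizes, slots_per_cycle):
    """Count cycles when requests are served strictly in queue order."""
    if any(sizes[r] > slots_per_cycle for r in queue):
        raise ValueError("a block does not fit in a cycle")
    cycles, free = 0, 0
    for request in queue:
        need = sizes[request]
        if need > free:          # the head of the queue does not fit: open a new cycle
            cycles += 1
            free = slots_per_cycle
        free -= need
    return cycles


def cycles_first_fit(queue, sizes, slots_per_cycle):
    """Count cycles when the scheduler picks the next queued request that fits."""
    if any(sizes[r] > slots_per_cycle for r in queue):
        raise ValueError("a block does not fit in a cycle")
    pending, cycles = list(queue), 0
    while pending:
        cycles += 1
        free, skipped = slots_per_cycle, []
        for request in pending:
            if sizes[request] <= free:
                free -= sizes[request]   # scheduled in the current cycle
            else:
                skipped.append(request)  # keeps its queue position for later cycles
        pending = skipped
    return cycles


# Hypothetical block sizes in time slots (assumed values).
SIZES = {"A": 2, "B": 1}
print(cycles_fcfs("ABABABAB", SIZES, 3))       # alternating mix
print(cycles_fcfs("AAAABBBB", SIZES, 3))       # same requests, different order
print(cycles_first_fit("AAAABBBB", SIZES, 3))  # same queue, different policy
```

With these assumed values, the same eight requests need a different number of cycles depending only on their order under FCFS, and the first-fit policy recovers the shorter schedule for the unfavorable order.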
Displaying without hiccups
We now describe how to display objects without hiccups. Since the display of a block takes a cycle and all the blocks are read in the same way in all the cycles, it suffices to consider how to display one block without hiccups.
Assume that we split the cycle into equi-intervals (time intervals of equal duration t) such that
$t \le \frac{size(subobject)}{B_{Cluster}}$
Also assume that the block b is read in a cycle as follows:
| ... | $t_{s_1}$ $b_1$ $t_{e_1}$ | ... | $t_{s_2}$ $b_2$ $t_{e_2}$ | ... | $t_{s_j}$ $b_j$ $t_{e_j}$ | ... | $t_{s_m}$ $b_m$ $t_{e_m}$ | ... |
where b is the concatenation of the pieces $b_1, \ldots, b_m$, and each piece lies within one equi-interval.
Time to read a subobject.
Intuitively, the algorithm to schedule the display of the block b tries to start the display at $t_{e_1}$. If there is a hiccup, it tries to fix it by starting later, so that the hiccup produced before will not happen, and it continues until it finds a beginning time for which there are no hiccups. It considers only the starting times such that $b_j$ is displayed starting at $t_{e_j}$ for some j. More precisely:
Input: $|b_1|, \ldots, |b_m|, t_{s_1}, t_{e_1}, \ldots, t_{s_m}, t_{e_m}$
Output: Time t at which we can start displaying b
Begin
    found = false
    i = 1
    While not found
        Let t be the time to start displaying b such that the display of $b_i$ starts at $t_{e_i}$
        If there are no hiccups when starting the display of b at t
        Then found = true
             Return t
        Else Let i be the smallest j such that, if the display of b starts at t, then there is a hiccup before the display of $b_j$ (i.e., $b_j$ is not ready in memory when we finish displaying $b_{j-1}$)
        End If
    End While
End
We can show that the previous algorithm always terminates. First notice that, after setting t in the loop, the pieces $b_1, \ldots, b_i$ can be displayed without hiccups. Therefore, the value of i increases with the number of iterations of the loop. Since the maximum value of i is m, the number of iterations is bounded.
Since the display of a block requires a cycle, we start displaying the following block at t in the next cycle. Furthermore, we will not have hiccups, because we were able to display b without hiccups and the distribution of the pieces in a cycle is exactly the same as that of $b_1, \ldots, b_m$ in the current cycle.
In conclusion, the previous algorithm always gives us a way of scheduling the display of an object without hiccups.
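The start-time search can be sketched in Python as follows. This is a minimal sketch, not the authors' implementation: it assumes a constant consumption rate (`display_rate`, a hypothetical parameter), treats a piece as ready once its read end time has passed, and uses 0-based indices. After each restart it checks only the pieces after the current one, relying on the termination argument above that earlier pieces already display without hiccups.

```python
from itertools import accumulate


def display_start_time(piece_sizes, read_end_times, display_rate):
    """Return a start time t for displaying block b without hiccups.

    piece_sizes[j]    -- size of piece b_j (e.g. in bytes)
    read_end_times[j] -- time t_e_j at which b_j is completely in memory
    display_rate      -- consumption rate in size units per time unit (assumed constant)
    """
    m = len(piece_sizes)
    # offsets[j]: time between the start of the display of b and the start of b_j.
    offsets = [0.0] + [done / display_rate for done in accumulate(piece_sizes)][:-1]
    i = 0
    while True:
        # Start the display so that piece b_i begins exactly at t_e_i.
        t = read_end_times[i] - offsets[i]
        # First later piece that is not yet in memory when its display should begin.
        late = next((j for j in range(i + 1, m)
                     if read_end_times[j] > t + offsets[j]), None)
        if late is None:
            return t
        i = late  # i strictly increases, so the loop runs at most m times


# Two pieces of size 2 read by times 1 and 5, displayed at rate 1:
# starting at time 1 would stall before the second piece, so the start is delayed.
print(display_start_time([2, 2], [1, 5], 1))
```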
Memory Requirement
The use of memory as an intermediate stage gives us flexibility to handle different consumption rates. Since the number of disk drives per cluster is fixed, the production rate is fixed, while the consumption rate varies with the media type of the object. Another advantage of having memory is being able to read a block anywhere within the cycle.
We now establish the memory requirement for a given configuration. We start by describing how memory is managed, and then we prove that we require at most $R \times n \times size(subobject)$ as the size of the memory, where n is the length of the cycle in time slots.
If n is an integer, we can split the memory into pages of size equal to size(subobject). The pages are allocated to disk clusters and time intervals as follows: for each interval i of the cluster j in the cycle k, allocate page number k n i mod n j k mod R mod R n. For example, assume that n [...] and R [...]. Then the memory assignment is:
[Table: pages assigned to each interval of each cluster over three consecutive cycles.]
If n is not an integer, we can split the cycle into equi-intervals instead of time intervals and set the page size to $B_{Cluster} \times equiinterval$ instead of size(subobject). Then the memory requirement is R n size(subobject) R $B_{Cluster}$ equiinterval. Since an equi-interval is shorter than a time interval, the memory requirement is at most $R \times n \times size(subobject)$.
Finally, we prove that we require at most $R \times n \times size(subobject)$ of memory to schedule the display of objects without hiccups, by proving that the memory management just described can be used by the algorithm in Section [...].
Lemma [...]: If the length of the cycle is n, then we need at most $R \times n \times size(subobject)$ of memory to be able to display an object without hiccups.
Proof: Assume that we schedule the display following the algorithm in Section [...]. Then we prove that the memory management described above is appropriate for the scheduling algorithm. It suffices to show that, for each i, the memory allocated to read $b_i$ in cycle p is released by $t_{e_i}$ in the cycle p + 1. Let j be the smallest index such that the display of $b_j$ starts at $t_{e_j}$. It is easy to see that such a j exists. We consider two cases:
$i < j$: The memory used is released when the reading of $b_j$ finishes. Therefore, it is available for the next cycle.
$i \ge j$: Since the block is supposed to be displayed in a cycle, the memory used for $b_j, \ldots, b_m$ is released by t in the next cycle. Since $t \le t_{e_j}$, the memory used for $b_j, \ldots, b_m$ is available by $t_{e_j}$.
Parameters and Constraints
We now list the parameters of the model and their constraints. Given k types of media, each with a bandwidth requirement of $B_1, \ldots, B_k$, the following parameters determine the performance of the system:
(i) n: the length of a cycle.
(ii) size(subobject): the size of a subobject.
(iii) d: the number of disk drives per cluster ($R = \lfloor D/d \rfloor$).
(iv) $n_1, \ldots, n_k$: the size of a block for each media type.
To implement the model described above, the parameters must satisfy the following constraints:
(i) A block is read within a cycle. For each i: $n_i \le n$
(ii) The display time of a block is a cycle. For each i: $\frac{n_i \times size(subobject)}{B_i} = \frac{n \times size(subobject)}{B_{Cluster}}$
(iii) The memory requirement can be satisfied: $R \times size(subobject) \times n \le size_{mem}$
(iv) On average, the size of each fragment is less than one cylinder: $size(subobject) \le Q \times d$
(v) The disk cluster bandwidth can support the display of a subobject without hiccups: $B_{Cluster} \ge Max\{B_1, \ldots, B_k\}$
In time slot units.
Constraint (ii), posited in Equation [...], can be expressed as:
$\frac{n_1}{B_1} = \frac{n_2}{B_2} = \cdots = \frac{n_k}{B_k} = \frac{n}{B_{Cluster}}$
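As a sketch of how these constraints interact, the block sizes can be derived from the display-time equality ($n_i = n \times B_i / B_{Cluster}$) and the remaining constraints checked directly. The function names and the choice of returning the labels of violated constraints are illustrative assumptions, and all quantities are assumed to be expressed in mutually consistent units.

```python
def block_sizes(n, bandwidths, b_cluster):
    """Block size n_i of each media type, from n_i / B_i = n / B_Cluster."""
    return [n * b_i / b_cluster for b_i in bandwidths]


def violated_constraints(n, size_subobject, d, D, Q, size_mem, bandwidths, b_cluster):
    """Return the labels of the constraints that a candidate configuration violates.

    Constraint (ii) holds by construction because the block sizes are derived from it.
    """
    R = D // d  # number of clusters, R = floor(D / d)
    violated = []
    if any(n_i > n for n_i in block_sizes(n, bandwidths, b_cluster)):
        violated.append("i")    # a block must be read within a cycle
    if R * size_subobject * n > size_mem:
        violated.append("iii")  # memory requirement
    if size_subobject > Q * d:
        violated.append("iv")   # each fragment fits within one cylinder
    if b_cluster < max(bandwidths):
        violated.append("v")    # cluster bandwidth covers the fastest media type
    return violated


# Hypothetical values chosen only to exercise the checks.
print(block_sizes(4, [1.5, 3.0], 6.0))
print(violated_constraints(4, 1, 2, 10, 1, 20, [1.5, 3.0], 6.0))
print(violated_constraints(4, 1, 2, 10, 1, 19, [1.5, 3.0], 6.0))
```

Note that when constraint (v) holds, every $n_i$ is at most n, so constraint (i) follows from (ii) and (v).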
Notice that $n, n_1, \ldots, n_k$ are real numbers; therefore, we have to align the block sizes to the byte. The alignment can be done by interleaving alignment to the higher byte and to the lower byte. For example, if the block size is [...], then we read [...] bytes in the first cycle, [...] bytes in the second and third, [...] in the fourth, and so forth.
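One way to interleave rounding up and rounding down is to read, in each cycle, the difference between consecutive floors of the cumulative byte count. This is an assumed scheme shown for illustration (the text does not spell out the exact interleaving pattern); it keeps the total number of bytes read after c cycles within one byte of c times the block size. The block size 2.5 below is a hypothetical value.

```python
from fractions import Fraction
from math import floor


def bytes_per_cycle(block_size, cycles):
    """Whole number of bytes to read in each cycle for a fractional block size."""
    x = Fraction(block_size)  # exact arithmetic avoids floating-point drift
    return [floor((c + 1) * x) - floor(c * x) for c in range(cycles)]


print(bytes_per_cycle("2.5", 4))
```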
We assumed that the heads are repositioned once for each time slot. That assumption can be changed easily by modifying the seek time in the computation of $B_{Cluster}$. For example, if on average the heads are repositioned twice in an interval, we can multiply the seek time in $B_{Cluster}$ by a factor of 2.
Configuration
In this section we present an algorithm to compute the parameters of the model in order to minimize the service time. This algorithm is based on a representative queue of requests.
We start by reducing the parameters to consider only size(subobject) and d. Subsequently, we present the algorithm to compute the optimal parameters. Next, we consider a case study and describe a heuristic to make the computation faster.
The previous section demonstrated that the configuration parameters are size(subobject), d, n, $n_1, \ldots, n_k$. We now show how the number of parameters can be reduced to two, namely size(subobject) and d. Given the size of a subobject and the number of disk drives per cluster, we can compute $B_{Cluster}$. For a given $B_{Cluster}$, Constraint (ii) limits the number of choices for $(n, n_1, \ldots, n_k)$. In particular, if $(n, n_1, \ldots, n_k)$ satisfies (ii), then the tuples that satisfy the constraint can be expressed as $c \times (n, n_1, \ldots, n_k)$, where c is a scalar. Therefore, the different values for the tuple do not affect the service time. For example, if c = 2, then with $2 \times (n, n_1, \ldots, n_k)$ instead of $(n, n_1, \ldots, n_k)$ we would have cycles twice as long; however, we would also read blocks that are twice the size. Thus, the net effect is the same. Hence, it suffices to consider one possible value for $(n_1, \ldots, n_k, n)$ that satisfies constraint (ii) for a given value of $B_{Cluster}$.
The selection of the tuple is made to minimize the fragmentation of blocks in a cycle. When a request finishes, it leaves a fragment of time that can be used for the next request. But if the next request does not fit exactly in the fragment, we end up leaving smaller fragments to be used later on. This could lead to fragmentation of the blocks within a cycle into pieces smaller than a subobject. Therefore, we decided to select the tuple that gives us the largest block sizes. Given a value for $B_{Cluster}$, we select the value for $(n_1, \ldots, n_k, n)$ considering the largest n that satisfies constraint (iii).
The following algorithm computes the configuration parameters. We assume that there is a function ServTime which gives the service time of a queue of requests for a given configuration.
ServTime computes the service time based on the scheduling policy.
Input: $B_1, \ldots, B_k$, D, Q, $size_{mem}$, Queue, ServTime, $min_s$, $max_s$, $delta_s$, $min_d$, $max_d$, $delta_d$
Output: size(subobject), d
Begin
    minST = Time to service Queue in a sequential manner, one at a time
    $best_s$ = $min_s$
    $best_d$ = $min_d$
    For s = $min_s$ To $max_s$ Step $delta_s$
        For d = $min_d$ To $max_d$ Step $delta_d$
            Compute the length of the cycle: $n = \frac{size_{mem}}{\lfloor D/d \rfloor \times size(subobject)}$
            Check constraints:
            If n [...] and constraint (iv) is not violated and constraint (v) is not violated
            Then Compute the size of the blocks for each media type:
                 For i = 1 To k Do $n_i = \frac{n}{B_{Cluster}(size(subobject), d)} \times B_i$
                 CurrentST = ServTime(Queue, n, $n_1, \ldots, n_k$, s, d)
                 If CurrentST < minST
                 Then minST = CurrentST
                      $best_s$ = s
                      $best_d$ = d
                 End If
            End If
        Next d
    Next s
End
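The search above can be sketched in Python. Several points are assumptions made for illustration rather than details taken from the text: `b_cluster` and `serv_time` are caller-supplied callables with hypothetical signatures, the candidate values of s and d are passed as explicit sequences, and the lost condition on n is taken to be "the cycle has at least one time slot" (n >= 1).

```python
def configure(bandwidths, D, Q, size_mem, queue, serv_time, b_cluster,
              s_values, d_values, sequential_time):
    """Grid search for the (size_subobject, d) pair with the smallest service time.

    serv_time(queue, n, blocks, s, d) -> service time   (hypothetical signature)
    b_cluster(s, d)                   -> cluster bandwidth (hypothetical signature)
    sequential_time                   -- time to service the queue one request at a time
    """
    best = (s_values[0], d_values[0])
    min_st = sequential_time
    for s in s_values:
        for d in d_values:
            clusters = D // d                  # R = floor(D / d)
            if clusters == 0:
                continue
            n = size_mem / (clusters * s)      # largest cycle allowed by memory
            bc = b_cluster(s, d)
            # Assumed check n >= 1, then constraints (iv) and (v).
            if n < 1 or s > Q * d or bc < max(bandwidths):
                continue
            blocks = [n * b_i / bc for b_i in bandwidths]
            current = serv_time(queue, n, blocks, s, d)
            if current < min_st:
                min_st, best = current, (s, d)
    return best, min_st


# Toy stand-ins, purely illustrative: they are not models of a real disk or scheduler.
toy_b_cluster = lambda s, d: 10.0 * d
toy_serv_time = lambda queue, n, blocks, s, d: len(queue) / (s * d)

print(configure([1.0, 2.0], 4, 2, 16, list(range(10)), toy_serv_time,
                toy_b_cluster, [1, 2, 4], [1, 2], 100.0))
```

Passing ServTime as a parameter keeps the search independent of the scheduling policy, which matches the role the text gives to that function.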
How expensive the computation is depends on the search space, that is, the values of size(subobject) and d for which ServTime is invoked. It can potentially be invoked $D \times size_{mem}$ times. But constraint (iv) reduces it to Q D times, which is reduced further by constraint (v).
We conclude this section by presenting a case study where a heuristic can reduce the computation time, namely a system that serves the requests on a first-come, first-served basis. We first demonstrate that, if we disregard the constraints, then ServTime for a given d decreases as size(subobject) increases. We thus need to consider only the largest size(subobject) that satisfies
constraints [...] and [...] for each d, which reduces the potential search space to D.
Lemma [...]: Assume that the system serves the requests on a first-come, first-served basis. Then, for a given d, if $s_1 < s_2$ and $(s_1, d)$ and $(s_2, d)$ satisfy constraints [...] and [...], then
$ServTime(Queue, n, n_1, \ldots, n_k, s_2, d) \le ServTime(Queue, n, n_1, \ldots, n_k, s_1, d)$
Proof: As the selection of the duration of a cycle potentially does not affect the service time, we fix the length of a cycle to $\tau$ for both cases $s_1$ and $s_2$, where $\tau$ satisfies constraints [...] and [...] for both $(s_1, d)$ and $(s_2, d)$. Then we compare the service time when the cycle length is $\tau$ for both $(s_1, d)$ and $(s_2, d)$. We first select $\tau$ to be the duration of the cycle computed by the algorithm for $(s_2, d)$:
$\tau = \frac{size_{mem}}{\lfloor D/d \rfloor \times s_2} \times \left( T_{act}(d) + T_{seek} + \frac{s_2}{tfr \times d} \right)$
Since
$\frac{size_{mem}}{\lfloor D/d \rfloor \times s_1} \times \left( T_{act}(d) + T_{seek} + \frac{s_1}{tfr \times d} \right) \ge \tau$
the length of the cycle computed by the algorithm for $(s_1, d)$ is greater than $\tau$. Therefore, when considering $(s_1, d)$, the number of intervals in each cycle is smaller than the number computed by the algorithm:
$n = \frac{\tau \times B_{Cluster}(s_1, d)}{s_1} \le \frac{size_{mem}}{\lfloor D/d \rfloor \times s_1}$
which implies that n satisfies constraint (iii) for $(s_1, d)$. Therefore, we can consider $\tau$ as the length of the cycle to compute the service time for both $(s_1, d)$ and $(s_2, d)$.
For a given media type i, the size of the block in bytes read for $(s_1, d)$ and $(s_2, d)$ is $\tau \times B_i$. Therefore, the number of cycles taken to display an object is the same for both cases $(s_1, d)$ and $(s_2, d)$. Since $B_{Cluster}(s_1, d) \le B_{Cluster}(s_2, d)$ and $\lfloor D/d \rfloor$ is the same for both cases, we may be able to schedule the display of an object for $(s_2, d)$ earlier than for $(s_1, d)$. Furthermore, there is no object that is scheduled for $(s_1, d)$ earlier than for $(s_2, d)$. Therefore, the service time for $(s_2, d)$ is smaller than or equal to the service time for $(s_1, d)$.
A pair (s, d) satisfies constraint [...] if there exists a positive number n such that the constraint is satisfied for (s, d, n).
Notice that the values $n, n_1, \ldots, n_k$ can be uniquely determined by size(subobject), d and the length of the cycle.
Compression
Compression could make the system use its resources (i.e., memory and disk bandwidth) more efficiently. We study how to incorporate compression to display objects under the architecture described in Section [...]. The only assumption about the compression technique is that it allows the display of data in an incremental manner, i.e., we can start the display of an object with the data that is currently in memory; we do not have to wait for additional data to start the display. Therefore, our approach is general enough to cover a considerable number of compression techniques currently in use.
The order in which compression is done with respect to the striping determines the behavior of the system. We present one model for each of the following orderings: (1) compress the object before splitting it into blocks, and (2) split the object into blocks before compressing. We also describe how to configure the system for each case. In fact, the configuration procedure for each model can be done using the configuration for noncompressed objects with the appropriate parameters.
Compress then Split
The average bandwidth requirement for a compressed image is potentially smaller than for a noncompressed image. In the model presented in Section [...], the display time of a block is one cycle. Therefore, the display time of a compressed block could potentially be longer than a cycle. If we compress the objects before splitting them into blocks, we may have some extra disk bandwidth available for other applications without any loss in display throughput. For example, if the average ratio for image compression is 10 to 1, then the average display time of a block for a compressed image is ten cycles, while the display time of a noncompressed block is one cycle. Therefore, the disk bandwidth assigned to an object can be used for other applications during 9 cycles on average.
We are taking a cycle as the unit of time. We are not considering the specific time within the cycle that the display starts.
On the other hand, if the system is used exclusively for the display of objects, we may want to use the extra disk bandwidth to service other objects. But since the duration of the display of a block is variable, the concept of a cycle no longer applies. We still have to fragment the time into intervals for each disk cluster. Only one request can be read during one time interval. The duration of those time intervals could be either fixed or variable. Another factor to consider is the size of a block: it could be fixed or variable.
We present two models to schedule requests when the objects are compressed and then split into blocks. One approach considers time intervals of variable duration and block sizes proportional to the average bandwidth requirement of the media type. The other considers time intervals of fixed duration and blocks of fixed size. Then we present a method to configure the system for either approach.
Variable Time Intervals and Variable Block Sizes
The advantage of making the block size proportional to the bandwidth requirement of its media type is to use the memory as fairly as possible. For example, when the objects are not compressed, on average each unit of memory will hold a fraction of an object for the same amount of time: one cycle. With that goal in mind, considering block sizes proportional to the average bandwidth requirement of the media type is a good approach. We start by giving a brief description of the model, then the scheduling algorithm. Next, we revise the constraints posited in Section [...].
The idea is to read a request block by block from the proper disk cluster without violating memory constraints, while making sure that there is a way of scheduling the display without hiccups. The only difference with the model presented in Section [...] is that we do not have to read a block within a cycle. Also, the way blocks are read does not have to follow any pattern timewise. Thus, the scheduler serves a request based on the availability of resources and real-time constraints (no hiccups) by assigning time intervals to the request.
The scheduler needs to keep track of the disk bandwidth available by marking the time intervals already assigned to a request for each disk cluster. It will assign noncontiguous time intervals to a block if necessary. To ensure that memory is sufficient to service a request, the scheduler must keep track of the memory available. To trace memory utilization, it splits the memory into frames and assigns frames to requests as they are scheduled. To release a frame, it computes the time when the data in the frame has been completely displayed. Such computation requires the following information: the time when data in the frame starts being displayed, and the average compression factor. When more than one object is using the same frame, it considers the last object that finishes displaying the data.
The compression factor can vary within a frame. Without loss of generality, we can assume the average rate to compute the duration of the display.
The scheduler will map block by block to available time intervals in the proper disk cluster. The assignment is done only if memory availability allows it. The following algorithm schedules the blocks of a request starting with block number BNumber and assuming that block BNumber is displayed starting at $t_0$. The second argument, $t_0$, must be set to a very large number (to represent infinity) for the case of the first block.
Schedule(BNumber, $t_0$)
    While True
        Search for the earliest time intervals, in the cluster where the block BNumber is, that satisfy memory constraints and can meet the deadline of starting to display the block at $t_0$
        If Search succeeded
        Then If BNumber = 1
             Then Let $t_0$ = [...]
                  Apply the algorithm in Section [...] to the first block
             End If
             Update memory requirements for the time intervals when the block is scheduled and the time when memory is released
             $t_1$ = $t_0$ + display time of Block BNumber
             If BNumber is the last in the request
             Then Return True
             Else If Schedule(BNumber + 1, $t_1$)
                  Then Return True
                  End If
             End If
        End If
    End While
We now present the constraints for this model. We denote the average bandwidth requirement of a media type A after compression by $\bar{B}_{display}(A)$, where $B_{display}(A)$ is the bandwidth requirement of A without compression. For example, given a media type A with a bandwidth requirement of $B_{display}(A)$, if the average compression ratio for media type A is [...] to [...], we consider $\bar{B}_{display}(A) =$ [...] $\times B_{display}(A)$ as the average bandwidth requirement for objects of type A. Constraints (iv) and (v) remain unchanged. Constraints (ii) and (iii) are revised as follows:
(i) The block size is proportional to its average bandwidth requirement:
$\frac{n_1}{\bar{B}_1} = \frac{n_2}{\bar{B}_2} = \cdots = \frac{n_k}{\bar{B}_k}$
(ii) The memory requirement for the service of the queue of requests is always smaller than or equal to $size_{mem}$. In particular, a block must fit in memory. For each i: $n_i \times size(subobject) \le size_{mem}$
The following new constraint ensures the absence of hiccups:
Let $o = o_1, \ldots, o_m$ be an object such that, for each k, the piece $o_k$ is read continuously in the time interval $[t_{s_k}, t_{e_k}]$. Let $t_0$ be the start time to display o. Let $t_{MaxLat}$ be the maximum latency time incurred by activation and seek time. Then
$t_0 \ge t_{s_1} + t_{MaxLat}$
and, for all j,
$t_0 + t^{display}_{o_1 \ldots o_{j-1}} \ge t_{s_j} + t_{MaxLat}$
where $t^{display}_{o_1 \ldots o_{j-1}}$ is the time to display $o_1, \ldots, o_{j-1}$.
Fixed Time Intervals and Fixed Block Sizes
j Fixed Time In terv als and Fixed Blo c k Sizes
The exibili t y of the previous mo del could lead to excesiv e fragmen tation of the time Therefore
the ob jects ma y end up b eing partitioned in to small pieces whic h implies excessiv e seek and la
tency times F urthermore its implemen tation could b e computational exp ensiv e T oo v ercome the
problems just describ ed w e establish a xed size of the blo c k and time in terv als of xed duration
The sc heduling under this approac h is a sp ecial case of the sc heduling pro cedure presen ted in
Section W enowha v e blo c ks of xed size then the time to read a blo c k is xed W emap
the time in terv al to the time to read a blo c k and require to read the blo c ks within a time in terv al
Therefore the fragmen tation problem is no longer presen t and the searc h space of the sc heduling
algorithm is reduced making the computation less exp ensiv e
Configuration
To configure the system, we can apply the same method used for the noncompressed case with different parameters. We now consider the average bandwidth requirements of the different media types after compression instead of the bandwidth disregarding compression. The other parameters remain the same.
For the case of Fixed Time Intervals and Fixed Block Sizes, a question that arises is how to define the block size, since the configuration procedure gives a block size for each media type. There are several alternatives, depending on what the priorities are. One extreme of the spectrum is to pick the smallest block size, arguing that memory is a very valuable resource, so it is better to read the least in order to decrease the possibility of an object retaining memory for a long period. The other extreme is to select the largest block size to decrease the latency and seek time. One intermediate alternative is to compute an average block size based on the Heat of the media types in consideration.
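The intermediate alternative can be sketched as a weighted mean. The text does not give the formula, so this is only one plausible reading: it assumes that the Heat of a media type is a non-negative weight such as its access frequency, and the function name and sample values are hypothetical.

```python
def heat_weighted_block_size(block_sizes, heats):
    """Average of the per-media-type block sizes, weighted by the Heat of each type."""
    total_heat = sum(heats)
    if total_heat <= 0:
        raise ValueError("at least one media type must have positive heat")
    return sum(size * heat for size, heat in zip(block_sizes, heats)) / total_heat


# Two media types with block sizes 2.0 and 4.0; the first is accessed three times as often.
print(heat_weighted_block_size([2.0, 4.0], [3, 1]))
```

The result always lies between the smallest and the largest block size, so it sits between the two extremes described above.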
Split then Compress
When compression is applied to the blocks instead of the whole object, the block sizes become smaller, freeing disk bandwidth for other applications. For example, if the average ratio for image compression is [...] to [...], then the size of a block is reduced by a factor of [...] on average, which implies that, on average, the disk bandwidth utilization is [...] of the utilization for the case when compression is not applied.
But if the system is used exclusively for the display of objects, the released disk bandwidth can be used to service other objects. In that case, the model for noncompressed objects should be modified. We start by describing the differences with the model that does not consider compression. Then we give the revised constraints. Finally, we describe how to configure the system to handle compression.
Now we have the situation where the block sizes are not fixed but the display time of a block is still a cycle. In order to utilize the disk bandwidth as much as possible, we would like to read a block of another object, say B, shortly after a block of an object, say A, is read. But in the next cycle the next block of A could be larger, causing the next block of B to be read at a different time interval. Therefore, the assumption of reading the blocks of an object in the same manner in all the cycles cannot be made. The following example illustrates this issue.
Example: Let $A = a_1, \ldots, a_m$, $B = b_1, \ldots, b_k$ and $C = c_1, \ldots, c_p$. A possible way of reading the objects is:
[Figure: two consecutive cycles, each showing the time intervals in which one block of A, one block of B and one block of C are read.]
Since the block sizes of objects A and B vary in the two cycles presented, the blocks must be read in a different manner. For object C the size is the same, but its block had to be read at a different interval because the blocks of A and B took longer in the second cycle.
The irregularity of the distribution of a block within a cycle requires additional memory. For this case, Lemma [...] is no longer true. But the memory requirement is still bounded by a fixed number, namely [...] $\times R \times n \times size(subobject)$. The worst-case scenario is when a block is very small and is read at the beginning of one cycle and at the end of the next cycle. Therefore, the start of the object display must be done close to the end of the cycle, which implies that the memory frame holding a block will be released at most [...] cycles later. Therefore, the upper bound of [...] $\times R \times n \times size(subobject)$ is sufficient.
In summary, Constraints [...] and [...] remain unchanged, and Constraint (iii) becomes
[...] $\times R \times n \times size(subobject) \le size_{mem}$
Notice that the block sizes $n_1, \ldots, n_k$ correspond to the sizes before compression.
To schedule requests, we still consider cycles and schedule each block within a cycle. The only difference with the scheduler for noncompressed objects is that blocks of the same object are not read in exactly the same manner in all cycles and the block size is variable. The scheduler is very similar to the one for noncompressed objects. Basically, it keeps track of the requests scheduled for each cluster in each cycle and looks for a sequence of consecutive cycles that can read all the blocks of an object from the proper cluster.
The configuration procedure is very similar to the one for noncompressed objects. The only difference is the way the duration of the cycle is computed. For compressed objects:
$n = \frac{size_{mem}}{[\ldots] \times \lfloor D/d \rfloor}$
This approach is simpler than Compress then Split, but it could end up wasting disk bandwidth because of very small blocks after compression.
Conclusions and Future Work
In this paper we presented a new technique for supporting the continuous display of possibly compressed objects. This technique declusters the disk drives and stripes the objects across the clusters. It also employs memory as an intermediate stage between the disk drives and the display stations. When displaying compressed objects, the bandwidth requirements become variable. Therefore, buffered I/O is fundamental to compensate for the variable bandwidths.
We also presented the fundamental factors that impact the performance of the system and a configuration method that considers those factors. The configuration method introduced aims to achieve the optimal throughput for a given scheduling policy and a situation specified in the representative queue of requests. If the goal is to optimize throughput in a given situation (average, worst, best case, etc.), the queue must represent the load in that situation. One interesting extension of this work is the study of the behavior of a system with temporal and spatial constraints.
Considering the case when multimedia objects are instances of a type (class) in an object-oriented database, we have two approaches: sharing the buffer pool with the other types in the database, or assigning a fraction of the memory to the multimedia types. We assumed the second approach, so memory frames used for the display of objects are not swapped out before their display. If the first approach is taken, the consideration of real-time constraints in buffer pool management is an issue to be studied.
References
[BG] D. Bitton and J. Gray. Disk shadowing. In Proceedings of the International Conference on Very Large Databases, September.
[BGMJ] S. Berson, S. Ghandeharizadeh, R. Muntz, and X. Ju. Staggered Striping in Multimedia Information Systems. Submitted to SIGMOD.
[GD] S. Ghandeharizadeh and D. DeWitt. A multiuser performance analysis of alternative declustering strategies. In Proceedings of International Conference on Database Engineering.
[GDS] S. Ghandeharizadeh, A. Dashti, and C. Shahabi. A Pipelining Mechanism to Minimize the Latency Time in Hierarchical Multimedia Storage Managers. Submitted to SIGMOD.
[GR] S. Ghandeharizadeh and L. Ramos. Continuous retrieval of multimedia data using parallelism. IEEE Transactions on Knowledge and Data Engineering, August.
[GRAQ] S. Ghandeharizadeh, L. Ramos, Z. Asad, and W. Qureshi. Object Placement in Parallel Hypermedia Systems. In Proceedings of the International Conference on Very Large Databases.
[GS] S. Ghandeharizadeh and C. Shahabi. Management of Physical Replicas in Parallel Multimedia Information Systems. In Proceedings of the Foundations of Data Organization and Algorithms (FODO) Conference, October.
[Has] B. Haskell. International standards activities in image data compression. In Proceedings of Scientific Data Compression Workshop, pages [...]. NASA conference Pub, NASA Office of Management, Scientific and technical information division.
[LKB] M. Livny, S. Khoshafian, and H. Boral. Multi-disk management algorithms. In Proceedings of the ACM SIGMETRICS Intl. Conf. on Measurement and Modeling of Computer Systems, May.
[MWS] D. Maier, J. Walpole, and R. Staehli. Storage System Architectures for Continuous Media Data. In Proceedings of the Foundations of Data Organization and Algorithms (FODO) Conference, October.
[PGK] D. Patterson, G. Gibson, and R. Katz. A case for Redundant Arrays of Inexpensive Disks (RAID). In Proceedings of the ACM SIGMOD International Conference on Management of Data, May.
[Ram] L. Ramos. Real-time retrieval of continuous multimedia data using parallelism. PhD thesis, University of Southern California.
[RE] D. Ries and R. Epstein. Evaluation of distribution criteria for distributed database systems. UCB/ERL Technical Report M, UC Berkeley, May.
[SAD] M. Stonebraker, R. Agrawal, U. Dayal, E. Neuhold, and A. Reuter. DBMS Research at a Crossroads: The Vienna Update. In Proceedings of the International Conference on Very Large Databases.
[SGM] K. Salem and H. Garcia-Molina. Disk striping. In Proceedings of International Conference on Database Engineering, February.
[Sto] M. R. Stonebraker. The case for Shared-Nothing. In Proceedings of the Data Engineering Conference. IEEE.
Description
Shahram Ghandeharizadeh, Hsun-Ko Chan, Martha L. Escobar-Molano and Xiangyu Ju. "On configuring hierarchical multimedia storage managers." Computer Science Technical Reports (Los Angeles, California, USA: University of Southern California. Department of Computer Science) no. 601 (1995).