USC Computer Science Technical Reports, no. 589 (1994)
Placement of Objects in Parallel Object-Based Systems
Shahram Ghandeharizadeh and David Wilhite
Department of Computer Science
University of Southern California
Abstract
Parallelism is a viable solution to constructing high performance object-oriented database systems. This paper analyzes the role of parallelism in such systems. In parallel systems based on a shared-nothing architecture, the database is horizontally declustered across multiple processors, enabling the system to employ multiple processors to speed up the execution of queries and improve throughput. The placement of objects across the processors has a significant impact on the performance of queries that traverse a few objects. This paper describes and evaluates a greedy algorithm for the placement of objects across the processors of a system. Moreover, it describes three alternative availability strategies that maintain backup copies of objects to enable the system to continue operation in the presence of disk failures. It describes how the backup copies can be used to enhance the performance of the system during normal mode of operation. This study quantifies the performance tradeoff of alternative techniques using a trace-driven simulation study.

Introduction
In the 1990s, the object-oriented data model is expected to be a prominent paradigm for representing and manipulating data. This paradigm provides a rich set of semantic constructs that can represent and manipulate structurally complex interrelationships among data found in many application domains: Computer-Aided Design (CAD) and Manufacturing (CAM), office information systems, and scientific and medical applications, to name a few. Several more recent applications, such as Mosaic and other hypertext applications, also utilize object-oriented functionality. These applications have provided the motivation for research on object-oriented systems. Many object-oriented systems have been developed, including Bubba [BACC], Mariposa [Stoa], Omega [GCKLa, GCKLb], Volcano [Gra], and XPRS [SKPO]. Each system represents and manipulates data using the object-oriented data model.

Footnote: A preliminary version of this paper appeared in The IEEE International Conference on Data Engineering, February. This version provides an overview of the role of parallelism in object-oriented database management systems, as well as additional object placement techniques and more comprehensive experimental results.
An object-oriented data model can support a wide variety of applications. Both the size of objects and the resource requirements of queries for different applications are diverse and vary from one application to another. In multimedia information systems, for example, video objects can be several megabytes in size. Moreover, each video object must be stored across several disks in order to support its real-time display [GRAQ, GR]. In other applications, such as CAD/CAM and certain scientific and office information systems, the objects are relatively small and the resource requirements of a query accessing these objects are low enough that each object is assigned to a single node. The focus of this study is on this class of applications.
Several Object-Oriented Database Management Systems (OODBMS) have been developed to date. Performance becomes an important issue when implementing such systems. To achieve high performance, the number of I/Os performed by the system when executing a query should be minimized. Time spent performing I/Os constitutes a significant fraction of response time for processing queries. This I/O bottleneck has been well recognized in the database literature [PGK, GD]. Furthermore, the current trends in hardware technology add to the significance of this factor. Processor speed is increasing each year at a considerably higher rate than the speed of a magnetic disk [RW]. With this trend, it is of utmost importance to minimize the number of I/Os performed on behalf of queries in order to minimize the response time of the system.
Existing object-based systems minimize the number of I/Os performed to execute navigational queries using clustering techniques. Several clustering algorithms [Sta, HK, BD, TN, TN] have been developed and evaluated. These algorithms place related objects (those connected via relationships) together on the same disk page. During query processing, when a query references an object, the system materializes the page containing this object in memory. Since navigational queries access related objects, the probability that the next referenced (related) object resides in memory is relatively high. This enables the system to maximize its buffer pool hit ratio, minimizing response time and enhancing system performance.

Footnote: In this paper, the term node is used to denote a unit of the shared-nothing architecture, consisting of a CPU, some RAM, and one or more disk drives.
In addition to minimizing the response time of OODBMSs, maximizing throughput (the number of queries processed per unit of time) is an important issue. This motivates the need for parallelism. Sharing data quickly and easily between multiple users is becoming increasingly common, and sometimes even necessary, for companies to remain competitive. As an example, the information superhighway has visions of providing every home in America with access to huge amounts of data. Thousands or even millions of simultaneous queries may be executed at a time, creating a need for multiple nodes in order to provide a reasonable response time to each user. When storing vast amounts of data across multiple nodes, an intelligent method for distributing data across the nodes is required to maintain acceptable response times and throughput. This paper investigates these issues in the context of parallel object-oriented systems.
This research focuses on parallel systems based on the shared-nothing architecture [Sto]. The shared-nothing architecture provides two main benefits that motivate this focus: scalability and reliability. Both of these issues have been identified as important topics for database research in the 1990s [Sel, SADN]. The scalability characteristics of the shared-nothing architecture allow it to grow incrementally with the needs of the application, making it useful both for applications that process gigabytes of data and for those that process terabytes of data. Reliability is always an issue, as data should remain available in the presence of hardware failures, and this architecture can preserve availability in a number of adverse circumstances.

Footnote: This is not to say that this research is irrelevant to shared-memory systems. On the contrary, in shared-memory systems with multiple disks, techniques are needed for distributing the objects across the disks. Storing related objects on the same disk allows clustering techniques to store them on the same disk page, reducing I/Os during query processing. The techniques described in this work can be extended to a shared-memory system consisting of multiple disks.

The Omega, Mariposa, and Bubba prototypes are based on the shared-nothing architecture, in which processors do not share disk drives or random access memory and only communicate with one another by passing messages using a high speed communication network. Mass storage is distributed among the processors by connecting one or more disk drives to each processor. These systems horizontally decluster [LKB, RE] the objects of an application across multiple processors in order to speed up the execution of queries by employing parallelism.
The placement of objects across the nodes in a shared-nothing architecture has a significant impact on the performance of navigational queries that traverse the relationships between objects. Once such a query is directed to the node containing the object that initiates its traversal, the query should find the subsequent objects to be traversed on that same node. Otherwise, it will have to access another node for the appropriate object, incurring extra communication and coordination associated with migrating the query to another node. Migrating a query from one node to another in order to access an object is termed an internode traversal. Excessive internode traversals increase the communication and coordination overheads of the system. In addition, they may also increase the total number of disk operations (I/Os) performed by the system on behalf of a query. When an internode traversal occurs, the probability of finding the referenced object in the memory of the new node is nondeterministic and might be quite small. On the other hand, if all the relevant objects were assigned to a single node, intelligent clustering techniques could have placed them all on one disk page (assuming they fit), minimizing the number of I/Os by increasing the buffer pool hit ratio. The strategy employed to assign objects to nodes has a significant impact on the ability of a clustering technique to store related objects together. Poor object placement strategies may assign related objects to different nodes, forcing the system to assign them to different disk pages and rendering the clustering techniques ineffective. This may cause a query to access multiple nodes during its execution, increasing the overheads of parallelism (I/O and communication) and hence the overall response time of the system. If a query is required to access several nodes for the relevant objects, the overheads of parallelism may outweigh its benefits, enabling a single-processor system to provide a better response time. In addition, these overheads may diminish the capability of the system to scale to a large number of nodes. And finally, these overheads waste system resources (e.g., both the disk and the network bandwidth), reducing the throughput of the system.
Footnote: This research is based on Omega, which migrates the query to the node storing the remote object when a remote object is accessed. Systems have also been proposed which migrate the data to the query. [Stoa] provides a brief discussion of these two alternative paradigms.

When assigning objects to the nodes, the object placement algorithm should strike a compromise between two conflicting objectives. First, it should assign two objects to the same node if one references the other frequently. Second, it should distribute the workload of an application evenly across all nodes in order to maximize the benefits of parallelism and maintain scalability. The first goal enables the system to use clustering techniques to store related objects on a single disk page, minimizing the number of I/Os performed by the system when executing a query. The second goal maximizes the degree of parallelism while avoiding the formation of hot spots and bottleneck nodes [GD]. Bottlenecks must be avoided whenever possible because they cause some resources to sit idle while pending requests wait in a queue. This wastes system resources and compromises the scalability of the system. These two goals conflict because the first argues for assigning related objects to a single node in order to group relevant objects together, while the second advocates distributing the objects across all available nodes. This paper describes and evaluates a greedy object placement technique that approximates both objectives.
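The two conflicting objectives can be made concrete by scoring a candidate placement on each of them. The following is a minimal sketch, not the paper's greedy algorithm: the function names, the reference weights, and the access frequencies are hypothetical values chosen for illustration.

```python
from collections import Counter

def internode_reference_weight(placement, references):
    """Sum the weights of references whose two endpoints sit on different nodes."""
    return sum(weight for (src, dst), weight in references.items()
               if placement[src] != placement[dst])

def load_imbalance(placement, access_freq, num_nodes):
    """Busiest node's load divided by the mean load (1.0 means perfectly even)."""
    load = Counter()
    for obj, node in placement.items():
        load[node] += access_freq[obj]
    mean = sum(load.values()) / num_nodes
    return max(load.values()) / mean

# Hypothetical reference frequencies and per-object access frequencies.
references = {("Bob", "Suzi"): 5, ("Suzi", "Nikki"): 3, ("Suzi", "Timmy"): 3}
access_freq = {"Bob": 5, "Suzi": 5, "Nikki": 3, "Timmy": 3}

# Objective 1 alone: keep every related object on one node.
clustered = {"Bob": 0, "Suzi": 0, "Nikki": 0, "Timmy": 0}
# Objective 2 alone: spread the load evenly over two nodes.
spread = {"Bob": 0, "Suzi": 1, "Nikki": 0, "Timmy": 1}

clustered_score = (internode_reference_weight(clustered, references),
                   load_imbalance(clustered, access_freq, 2))
spread_score = (internode_reference_weight(spread, references),
                load_imbalance(spread, access_freq, 2))
```

Under these assumed numbers the clustered placement has no cross-node references but twice the mean load on one node, while the spread placement is perfectly balanced at the cost of cross-node references. A placement algorithm has to trade one score against the other.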
As the number of nodes in a shared-nothing architecture increases, the probability of a disk failure increases proportionally [PGK]. In order to keep the objects available in the presence of such failures, a backup copy of each object is assigned to a different node. The Omega system utilizes the backup copies of objects for query processing during the normal mode of operation in order to further minimize the number of internode traversals. This work describes three alternative techniques for assigning the backup copies of objects: LOad Balanced Object replication (LOBO), Subpartition Interleaved REplication (SIRE), and a third technique denoted LORE. These strategies are a variant of Teradata's interleaved declustering strategy [Ter, CK], adapted for parallel object-oriented systems. They are novel because they assign the backup copies of objects with the objective of minimizing internode traversals over a period of time, both during the normal mode of operation and in the presence of failures.
This research focuses on object placement in parallel object-oriented systems. This paper is organized in the following manner. Related work is described first, followed by an overview of the role of parallelism in object-oriented systems. A greedy object placement algorithm for declustering data is then described. Alternative availability strategies are discussed and evaluated next, and the final section presents conclusions and future directions for this project.
Related Work
A number of studies are related to this work. First, there has been extensive research on data placement in parallel relational systems and its impact on associative queries [CABK, GD, GDQ]. This study is different because it focuses on object-based systems and navigational queries. (The section on parallelism and the OODBMS compares and contrasts navigational and associative queries.) Typical declustering techniques found in relational systems (e.g., partitioning by range predicates or hash functions) do not consider relationships that exist between objects and are inappropriate for declustering data that is accessed via navigational queries.
Our initial object placement technique is simply a greedy graph partitioning algorithm. Hence, the family of graph partitioning algorithms is highly related to our initial object placement strategy. The algorithms of this family vary slightly from one another depending on the particular domain each was developed for. Kernighan and Lin [KL] developed a general graph partitioning algorithm that can be utilized for the initial object placement problem. Several other graph partitioning algorithms [FM, Kri] have been developed in the context of VLSI circuit design, some of which can be adapted and effectively used for the initial object placement problem. Other algorithms, however, are not appropriate. For example, consider the clustering algorithms [TN, DK, Sta] mentioned previously for placing objects upon disk pages; they belong to the family of graph partitioning algorithms. They are inappropriate because they would diminish the benefits of parallelism by assigning objects to as few nodes as possible. However, these techniques can be used effectively at each node in order to maximize its buffer pool hit ratio (minimize the number of I/Os) when executing queries. The intent of this research is not to survey previous graph partitioning research. While the initial object placement strategy may utilize these graph partitioning techniques, initial placement simply provides a starting point for our research to investigate the role of availability strategies in parallel OODBMSs.
Availability techniques in a distributed environment have been examined extensively for the relational database domain [GM]. RAID [PGK] stores parity information, as opposed to backup copies of objects, in order to allow the system to continue operation in the presence of a failure. The parity information cannot be utilized effectively for query processing during the normal mode of operation. This is because RAID constructs the backup copies of objects by analyzing the parity data from several nodes. In order to use backup information for query processing, several nodes must be utilized to reconstruct the data, defeating the purpose of our processing paradigm, which strives to minimize the number of nodes referenced during query processing by using the backup copy of an object.
Mirroring techniques [Tan] can be extended to the object-oriented domain. However, since the placement of backup copies of objects is fixed (determined by the mirroring technique), the workload can become unevenly distributed if backup copies of objects are utilized for query processing (as per our assumptions), possibly forming bottlenecks. This is because the mirroring techniques would assign backup copies of objects statically, regardless of their relationships to other objects in the system. The backup copies of objects stored on one node might be accessed quite frequently, while the backup copies of objects on another may be totally unrelated to the primary copies stored there and not referenced at all. In addition, bottlenecks can form if a node fails, since the entire workload of the failed node is imposed onto its mirror. The chained declustering availability technique [HD] evenly distributes the workload of a failed node across the other nodes in the system, but can still result in uneven workload distribution during normal processing. With chained declustering, backup copies of objects are placed on the next logically adjacent node, i.e., backup copies of objects on node i would be stored on node i+1. This can result in one node referencing its backup copies of objects quite frequently while another node does not reference its backup copies of objects, similar to the problems with the mirroring techniques. On the other hand, Teradata's interleaved declustering strategy [Ter] leaves a degree of freedom when placing backup copies of data. When extended to the object-oriented domain, it presents opportunities to strategically place the backup copies of objects in order to minimize internode traversals during query processing. Furthermore, it allows the opportunity for placing backup copies of objects such that the workload distribution remains even when the backup copies are utilized for query processing. In the case of a node's failure, it also allows the workload of a failed node to be evenly distributed across the remaining nodes.
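The static nature of chained declustering, as described above, can be seen in a one-line placement rule. This is a minimal sketch under two assumptions that are not stated in the text: nodes are numbered from 0, and the last node wraps around to the first. The function name is hypothetical.

```python
def chained_backup_node(primary_node, num_nodes):
    """Return the node holding the backup of data whose primary is on primary_node.

    Assumes nodes are numbered 0..num_nodes-1 and that the chain wraps around.
    """
    return (primary_node + 1) % num_nodes

# Backup location for each primary node in a hypothetical 4-node system.
backup_map = {node: chained_backup_node(node, 4) for node in range(4)}
```

The rule depends only on the node number, never on which objects reference each other, which is why the backup copies on a node may be unrelated to the primary copies stored there.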
Finally, several techniques for dynamic object placement in distributed systems [BFR, WJa, WJb, HW, HW] have been developed. [WJa] presented an object placement technique for distributed object-oriented systems. It focused on minimizing the communication costs of a system, hence maximizing its performance. It introduced and evaluated two dynamic object replication algorithms. However, several critical issues were overlooked by the algorithms that may impact their practicality and effectiveness (i.e., response time) for a parallel database system:

1. Given the failure of a single node, some objects may become unavailable (the system does not have access to all its objects).
2. The network connections between nodes are restricted to be hierarchical in nature. No token ring networks are allowed, nor any other cyclical network. If a node in the center of the network fails, two portions of the system become isolated and objects in one portion become unavailable to the other.
3. Scalability problems may arise. The distribution of the workload across the nodes is not considered. This can result in the formation of bottlenecks, degrade system performance, and compromise its scalability. This is not acceptable for parallel databases.
4. I/O costs are not considered.

Recently, [HW] presented a technique that resolved some of these issues. This study addressed the availability issue by assuming that the system maintains at least n copies of each object. It also resolved the restriction on the network architecture. However, it considers neither scalability (the distribution of the workload) nor the effects of dynamic placement on I/Os. Our research addresses each of these issues: performance, availability of the data in the presence of multiple failures, and workload distribution and scalability. In addition, our approach imposes no restriction on the network architecture.
Parallelism and the OODBMS
Several studies have examined the role of parallelism in relational systems [CABK, BACC, DGS, GD, Gra, GDQ]. The impact of parallelism on object-oriented systems, however, is not as well understood. In order to describe the tradeoffs of parallelism (its benefits and overheads) in an object-oriented system, consider the characteristics of queries that constitute the workload of such systems.

Figure 1: An object database
Object-oriented data can be viewed as a directed and possibly cyclic graph (see Figure 1). Each vertex in the graph represents an object, and each directed edge represents an inter-object relationship. The direction of each edge defines the order of traversal between objects. Queries are navigational in nature. They navigate the object graph by traversing the directed edges, referencing either a few or many related objects. An example of such a query is the following OSQL query [FBCC], which selects the objects corresponding to the grandchildren of Bob:

select child(child(Bob))

This query accesses object Bob, traverses the child edge to Suzi, then traverses the child edges from Suzi to Nikki and Timmy (child is a multivalued attribute). Navigational queries on object-oriented data are distinct from traditional associative queries on relational data. Associative queries are essentially value-based in nature, where an attribute is compared with either another attribute or a value using value-based operators:

select * from EMPLOYEE where EMPLOYEE.salary > ...

This example associative query retrieves all employees whose salary is greater than a given value. Associative queries select a set of objects that meet some value-based criteria. Navigational queries, on the other hand, select a set of objects based on the relationships between the data. These differences between associative and navigational queries impose different constraints upon the system when parallelism is considered and are discussed further in the following paragraphs.
When an object-oriented query is introduced to the system, it is directed for execution to the node storing the primary copy of the object initiating the query's traversal in the object graph. The query accesses some or many related objects, and the result is returned to the requesting process. In a parallel OODBMS, traversing the object graph from one object to another may result in an internode traversal. The query processing paradigm utilized in Omega for accessing remote objects is to migrate a portion of the query to the remote data and resume execution, setting up a pipeline from the initiating node to the new node. The results of the query are then returned through the pipelines to the initiating node, where the final result of the query is constructed. This paradigm can yield a substantial improvement over the traditional "move the data to the query" paradigm. For instance, consider the previous object-oriented query:

select child(child(Bob))

Given that Bob is stored on node 1 while Suzi, Nikki, and Timmy are stored on node 2, migrating the query requires fewer communications than retrieving the data. Consider the execution of this query under Omega's processing paradigm. It begins executing on node 1, accesses Bob, and then migrates to node 2, where it accesses Suzi, Nikki, and Timmy. One internode traversal is required. With the alternative paradigm, however, more traversals are required. In this instance, the query begins its execution on node 1 by accessing Bob, then contacts node 2 for a copy of Suzi. After receiving Suzi, it again contacts node 2 for Nikki and Timmy. Here, more than one internode communication is required. The query migration paradigm minimizes communication and synchronization overheads if object placement techniques place related objects together on the same node. For queries accessing many objects, this can yield substantial savings as opposed to a system that moves data to the query. Furthermore, these overheads affect both the response time and the throughput of a parallel system.
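The effect of placement on the grandchildren query can be checked with a small traversal counter. This is a simplified sketch rather than Omega's query processor: it counts every traversed edge whose two endpoints live on different nodes, which matches the single migration in the example above but may overcount when several remote objects share one node. The function name and the node numbers are assumptions for illustration.

```python
def count_internode_traversals(start, graph, placement, depth):
    """Walk `depth` edges outward from `start`, counting edges that cross nodes."""
    traversals = 0
    frontier = [start]
    for _ in range(depth):
        next_frontier = []
        for obj in frontier:
            for child in graph.get(obj, []):
                if placement[child] != placement[obj]:
                    traversals += 1  # the query would have to leave this node
                next_frontier.append(child)
        frontier = next_frontier
    return traversals

# The child edges from the example: Bob -> Suzi -> {Nikki, Timmy}.
child_edges = {"Bob": ["Suzi"], "Suzi": ["Nikki", "Timmy"]}

# Placement from the example: Bob on one node, everyone else on another.
related_together = {"Bob": 1, "Suzi": 2, "Nikki": 2, "Timmy": 2}
# A hypothetical poor placement: every object on its own node.
scattered = {"Bob": 1, "Suzi": 2, "Nikki": 3, "Timmy": 4}

good = count_internode_traversals("Bob", child_edges, related_together, depth=2)
poor = count_internode_traversals("Bob", child_edges, scattered, depth=2)
```

With related objects kept together the query leaves its node once; with every object on its own node, each edge of the traversal crosses a node boundary.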
To consider the role of parallelism in an OODBMS, we first evaluate its effects on the average response time of queries in the presence of a single user. Subsequently, we extend our discussion to analyze the effect of parallelism on throughput. For the rest of this paper, the reader should assume a single-user environment whenever the discussion focuses on response time. On the other hand, throughput is discussed in the context of a multiuser environment.
Response time is defined as the average amount of time elapsed from when a user submits a query to the completion of its execution by the system in the presence of a single user. To analyze the effects of parallelism on response time, we assume that a single stream of queries is submitted to the system and that they are executed one after another until the stream is exhausted. The response time of the system, then, is the average time required to execute a query in this stream. However, in order to measure the effects of parallelism on the response time of queries in an OODBMS, the nature of object-oriented queries must first be considered.
In general, the Degree of Parallelism (DoP) for navigational queries is unknown. The DoP of a query refers to the average number of nodes that are concurrently utilized to process the query during its execution. Some queries access objects in a transitive manner and require that objects be retrieved serially, such as finding the ex-wife of Jane's husband:

select exwife(husband(Jane))

This query has a DoP of one, since only one object can be retrieved at a time and thus only one node can be utilized at a time to process the query. Other queries, however, allow several objects to be retrieved in parallel, such as finding the children of Dick:

select child(Dick)

Even with this query, however, the DoP is still unknown until it actually executes. The DoP is drastically affected by the placement of objects across the nodes. For example, object placement techniques might place Dick and all his children on a single node. In this case, the DoP of the query is one, since the objects reside on a single node and only one node is utilized when processing the query. In contrast, an object placement technique might place Dick and his children all on different nodes. In this case, the DoP is greater than one, since three nodes can be used concurrently to access his children (though the retrieval of Dick cannot be overlapped with the retrieval of his children).
Figure 2: The effects of parallelism on I/O and internode traversal overheads

Note that with associative queries in parallel relational systems, the DoP of queries can be determined before the query executes, based on the data placement technique utilized by the system. For example, if an associative query selecting a portion of a relation is introduced to the system, it can be directed to all nodes when hash or round-robin partitioning is used, or to a subset of the nodes if range partitioning is used. For selecting a single tuple of a relation, utilizing range or hash partitioning allows the query to be directed to a single node, while utilizing round-robin partitioning requires it to be directed to all the nodes. In essence, the nodes used to execute a query can be determined before its execution.
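The way a partitioning strategy fixes the set of target nodes before execution can be sketched as follows. The function names, the integer keys, the modulo hash, and the range boundaries are all assumptions made for illustration; they are not taken from any particular relational system.

```python
from bisect import bisect_right

def nodes_for_exact_match(key, strategy, num_nodes, upper_bounds=None):
    """Nodes that must be contacted to find the single tuple with this key."""
    if strategy == "hash":
        return [key % num_nodes]                  # stand-in hash on integer keys
    if strategy == "range":
        return [bisect_right(upper_bounds, key)]  # node whose key range covers key
    if strategy == "round_robin":
        return list(range(num_nodes))             # tuple could be anywhere
    raise ValueError(f"unknown strategy: {strategy}")

def nodes_for_range_query(low, high, upper_bounds):
    """Under range partitioning, only nodes overlapping [low, high] are needed."""
    return list(range(bisect_right(upper_bounds, low),
                      bisect_right(upper_bounds, high) + 1))

# Hypothetical 4-node system; node 0 holds keys below 100, node 1 holds 100-199, etc.
bounds = [100, 200, 300]
```

For example, `nodes_for_exact_match(150, "range", 4, bounds)` names a single node, whereas the same lookup under round-robin partitioning names all four. In both cases the answer is available before the query runs, which is not true of a navigational query.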
Before a navigational query begins its execution, there is no way to determine the objects or nodes it will reference. Hence, its DoP is unknown. In this analysis, we assume that the ideal DoP for object-oriented queries is one in order to gain a better understanding of the effects of parallelism on response time in an OODBMS. This assumption is accurate for systems that execute small queries referencing a few related objects. Our object placement techniques place related objects together and, in many cases, confine the processing of such queries to a single node.
Figure 3: Processing overheads

The response time of a parallel OODBMS in the presence of a single user is inversely related to its buffer pool hit ratio. More disk I/Os imply a higher response time. Assume a fixed size database whose objects are accessed based on a skewed distribution, i.e., some objects are accessed more frequently than others. Moreover, assume that related objects are clustered together on the same disk page. As one increases the number of nodes (and therefore memory) in the system, the buffer pool hit ratio also increases since additional objects fit in memory (see Figure 2a). Initially, the hit ratio improves dramatically as more objects fit in memory. Afterwards, this improvement levels off because the frequently accessed objects have become memory resident, additional memory is used to store infrequently accessed objects, and these objects provide only marginal improvement. In this analysis, the relationship between the number of nodes (m), the number of objects (o), the number of objects that can fit in memory on a single node (n), and the hit ratio (hr) is approximated by the function:

\[
hr(m) =
\begin{cases}
\dfrac{m \cdot n}{o} & \text{for } m \le \dfrac{o}{n} \\[2ex]
1 & \text{for } m > \dfrac{o}{n}
\end{cases}
\]

The hit ratio improves as nodes are added until the database becomes main memory resident (when m = o/n) and is 1 thereafter. The function is normalized between 0 and 1 in order to compare it with the costs of internode traversals.
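The hit ratio approximation can be written as a short function. This is a minimal sketch of the formula as read here (linear growth in m * n / o, capped at 1 once the database is memory resident); the function name and the sample sizes are hypothetical.

```python
def hit_ratio(m, n, o):
    """Approximate buffer pool hit ratio.

    m: number of nodes
    n: number of objects that fit in the memory of a single node
    o: number of objects in the database
    """
    return min(1.0, m * n / o)

# Hypothetical database of 1000 objects, with 100 objects fitting in memory per node.
half_resident = hit_ratio(5, 100, 1000)    # half of the objects fit in memory
fully_resident = hit_ratio(10, 100, 1000)  # m = o / n: everything fits
beyond = hit_ratio(20, 100, 1000)          # extra nodes add no further benefit
```

Once m reaches o / n, adding nodes no longer improves the hit ratio, so any further change in response time must come from other costs.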
In addition to improving the hit ratio, increasing the number of nodes in the system increases the number of internode traversals (see Figure 2b). As the objects in the database are declustered across more nodes, the number of relationships between objects of different nodes increases. This is approximated by the function:

\[
\mathit{internode\_traversals}(m) =
\begin{cases}
\sqrt{\dfrac{n \cdot m}{o}} & \text{for } m \le o \\[2ex]
\sqrt{n} & \text{for } m > o
\end{cases}
\]

Again, this function is normalized between 0 and 1 when the number of nodes is less than or equal to o/n (at which point the database is memory resident) to allow comparison with the earlier I/O costs. Note that when the number of nodes is greater than the number of objects, the percentage of internode traversals becomes constant because no two objects are assigned to the same node (in order to distribute the workload evenly) and an object may not be declustered across multiple nodes.

Footnote: This will be demonstrated in our analysis of the object placement algorithms in later sections.
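The internode traversal approximation can be sketched in the same way. The square-root form below is a reading of the formula above (it reaches 1 at m = o / n and becomes constant once m exceeds o); the function name and the sample sizes are hypothetical.

```python
import math

def internode_traversals(m, n, o):
    """Approximate, normalized internode traversal cost for m nodes.

    n: number of objects that fit in the memory of a single node
    o: number of objects in the database
    """
    # Beyond m = o every object already has its own node, so the cost stops growing.
    return math.sqrt(n * min(m, o) / o)

at_memory_resident = internode_traversals(10, 100, 1000)   # m = o / n
at_one_object_each = internode_traversals(1000, 100, 1000) # m = o
past_one_object_each = internode_traversals(2000, 100, 1000)
```

Note that the curve keeps rising after the database is memory resident, which is the source of the tradeoff against the hit ratio.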
Figure 3a shows the combination of the I/O and internode traversal costs, termed the overheads of parallelism, as a function of the number of nodes. Initially, the overheads of parallelism decline as nodes are added because the benefits of a higher hit ratio outweigh the costs of additional internode traversals. However, this overhead levels off as more nodes are added, then begins to increase. This is because the benefits of storing more infrequently accessed objects in memory no longer outweigh the costs of additional internode traversals. While the specific point at which the performance degrades is completely dependent on the relative costs of I/Os versus internode traversals, Figure 3a demonstrates the basic tradeoffs associated with parallelism in OODBMSs. In contrast, Figure 3b demonstrates the number of I/Os performed by a single-node system as a function of its memory size. As compared to the parallel system with an identical amount of memory, a single-node system provides superior performance because it observes a similar buffer pool hit ratio without incurring the overheads of internode traversals. However, it is important to note that this comparison assumes an ideal DoP of one for queries that constitute the workload of the system. The ideal DoP of queries affects this comparison considerably. When all other variables are fixed and the ideal DoP is increased, the response time for a parallel system can decrease as nodes are added until the number of nodes is greater than the DoP, at which point the performance levels off. However, in a single-node system, the DoP of a query cannot be increased because all the processing is performed by a single node; thus no similar benefit is possible. For these queries, therefore, parallel systems can outperform single-node systems.
To summarize the response time analysis for a single-user environment: single-node systems
provide better performance than parallel systems when queries have a DoP of one and the total
memory size of each system is the same. As the DoP increases, however, parallel systems can provide
a better response time by allowing some processing to be performed in parallel. While single-node
systems can outperform parallel systems with respect to response time, the same cannot be stated
of throughput.
The throughput of a system is defined as the number of queries that it can execute over a fixed
period of time, measured in terms of queries completed per second. Assuming the ideal DoP of
queries submitted to the system is one, it is possible for n nodes to service n queries simultaneously
as long as the queries execute upon different nodes. (For systems with very slow networks, additional
nodes may not provide any improvement at all.) Theoretically, therefore, throughput can increase
linearly with the number of nodes in the system. However, several factors impact throughput and
make its analysis difficult. First, the throughput of a system is affected by the distribution of the
workload across the nodes. During the execution of multiple queries on a parallel OODBMS, several
different queries might each access data stored on the same node simultaneously. This results in
the formation of a queue of queries on one node while other nodes are idle, waiting for work. These
bottlenecks degrade the throughput of the system.
Second, throughput is affected when queries frequently migrate from one node to another.
Essentially, work performed by the system falls into one of two categories: useful work or
wasteful work. When an internode traversal occurs, wasteful work is performed for migrating
the query to a new node. If queries migrate frequently, the system performs a lot of wasteful
work, leaving less processing time for useful work. The throughput of the system diminishes
accordingly. Internode traversals must be minimized to allow maximum throughput in parallel
systems. Object placement impacts the frequency of query migration and hence has a significant
impact on throughput.
In summary, while parallelism may not be an appropriate solution for reducing the response
time of navigational queries in OODBMSs, it can be used to increase throughput by allowing
several queries to be executed concurrently. Object placement is critical to maximizing throughput
in parallel object-based systems executing navigational queries. Intelligent object placement
techniques minimize the number of internode traversals, the number of disk I/Os, and the standard
deviation in the workload across the nodes.
Initial Object Placement
The placement of objects across the nodes has a significant effect on the performance of navigational
queries. Intelligent declustering techniques place related objects together upon the same node
in order to localize the execution of queries to a single node, minimizing the communication and
synchronization costs incurred by accessing several nodes for relevant objects. Furthermore, these
techniques enable clustering strategies to place related objects together upon the same disk page
in order to minimize the I/Os performed by the system on behalf of queries, further reducing response
time. A greedy declustering strategy has been developed with the objectives of minimizing the
number of internode traversals performed on behalf of queries over some period of time while
maintaining an even workload distribution.
Figure: A graph representation of an object database (objects Bob, Mary, Tom, and Kathy connected by Spouse and Child edges, each edge labeled with its traversal frequency)
Formal Statement of the Problem
Assuming that the frequency of access and the size of each object x in the database are provided
(termed heat(x) and size(x) [CABK], respectively), we define the work imposed on a node by each
object x as
$\mathrm{work}(x) = \mathrm{heat}(x) \times \mathrm{size}(x)$
The workload of a node $P_i$ is defined as the total work of the objects (say N) assigned to it, i.e.,
$\mathrm{workload}(P_i) = \sum_{k=1}^{N} \mathrm{work}(\mathrm{object}_k)$
Conceptually, an object-oriented database can be viewed as the vertices of a directed and possibly
cyclic graph (see Figure ). Each vertex in this graph represents an object and each directed edge
represents an interobject reference. The direction of each edge defines the order of traversal among
objects. Associated with each edge is the number of times it is traversed (its frequency of traversal).
Using this terminology, the placement of objects across a system with m nodes can be viewed
as a graph partitioning problem. The formal statement of the problem is as follows: decluster
the graph into m partitions such that the total number of traversals between the m partitions
is minimized and the total workload of each partition is approximately the same, with one
partition assigned to each node.
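The two quantities in this problem statement can be made concrete with a small sketch. It assumes that work is the product of heat and size, and it uses hypothetical object names, heats, sizes, edge frequencies, and a hand-picked two-node assignment; none of these values come from the study.

```python
from collections import defaultdict

def work(heat, size):
    # Assumed definition: work(x) = heat(x) * size(x).
    return heat * size

def node_workloads(objects, assignment):
    """Total work per node. objects: {name: (heat, size)}, assignment: {name: node}."""
    loads = defaultdict(float)
    for name, (heat, size) in objects.items():
        loads[assignment[name]] += work(heat, size)
    return dict(loads)

def internode_traversals(edges, assignment):
    """Sum the traversal frequencies of edges whose endpoints sit on different nodes.

    edges: {(source, target): traversal frequency}
    """
    return sum(freq for (src, dst), freq in edges.items()
               if assignment[src] != assignment[dst])

# Hypothetical toy database with four objects and four directed edges.
objects = {"a": (10, 2), "b": (5, 4), "c": (8, 1), "d": (3, 4)}
edges = {("a", "b"): 30, ("b", "a"): 10, ("b", "c"): 5, ("c", "d"): 20}
assignment = {"a": 0, "b": 0, "c": 1, "d": 1}

loads = node_workloads(objects, assignment)
cut = internode_traversals(edges, assignment)
print(loads, cut)
```

A partitioning is better when `cut` is smaller and the values in `loads` are closer to one another; the two goals usually pull in opposite directions.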
Initial Object Placement Strategy
Assuming that $\mathrm{travs}(o_i, o_j)$ represents the number of times the edge from object $o_i$ to $o_j$ is traversed,
we define the relevance between these two objects as
$\mathrm{relevance}(o_i, o_j) = \mathrm{travs}(o_i, o_j) + \mathrm{travs}(o_j, o_i)$
Using this concept, Figure outlines a greedy algorithm to fragment a database into partitions.
Briefly, the algorithm creates each partition as follows. First, it computes the workload that should
be imposed on the current partition (its workload quota). Next, it assigns the most frequently
accessed object that is still unassigned, say $o_i$, to the partition. The next assigned object is the
one with the highest relevance to $o_i$, forming a subgraph. The subsequent objects assigned are
determined based on their degree of relevance to the objects of this subgraph. The algorithm continues
to assign objects to this partition until its workload quota is met.
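The growth of a single partition, as described in this paragraph, can be sketched as follows. This is a simplified illustration rather than the algorithm of the figure: it uses a binary heap with lazy deletion in place of the hash and B-tree indices, treats relevance as symmetric when collecting neighbors, stops when the next candidate would exceed the quota (or when no related object remains), and omits the final overshoot check. The function name `grow_partition` and the toy data are hypothetical.

```python
import heapq

def relevance(travs, a, b):
    # relevance(a, b) = travs(a, b) + travs(b, a)
    return travs.get((a, b), 0) + travs.get((b, a), 0)

def grow_partition(unassigned, heat, work, travs, quota):
    """Greedily build one partition; `unassigned` is a set and is mutated in place."""
    neighbors = {}
    for a, b in travs:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    # Seed with the most frequently accessed unassigned object.
    current = max(unassigned, key=lambda o: (heat[o], o))
    partition, load = [], 0
    weight, heap = {}, []

    while current is not None and load + work[current] <= quota:
        partition.append(current)
        unassigned.discard(current)
        load += work[current]
        weight.pop(current, None)
        # Raise the weight of every unassigned object related to the new member.
        for nb in neighbors.get(current, ()):
            if nb in unassigned:
                weight[nb] = weight.get(nb, 0) + relevance(travs, current, nb)
                heapq.heappush(heap, (-weight[nb], nb))
        # Pick the unassigned object with the highest up-to-date weight.
        current = None
        while heap:
            neg_w, cand = heapq.heappop(heap)
            if cand in unassigned and weight.get(cand) == -neg_w:
                current = cand
                break
    return partition, load

# Hypothetical toy data.
heat = {"a": 9, "b": 4, "c": 3, "d": 2, "e": 1}
work = {"a": 3, "b": 2, "c": 2, "d": 1, "e": 1}
travs = {("a", "b"): 5, ("b", "a"): 1, ("a", "c"): 2, ("b", "d"): 7, ("c", "e"): 4}
unassigned = set(heat)

partition, load = grow_partition(unassigned, heat, work, travs, quota=6)
print(partition, load, sorted(unassigned))
```

In the toy run the seed "a" pulls in "b" (its most relevant neighbor), which in turn pulls in "d"; "c" is the next candidate but would exceed the quota, so it stays unassigned for a later partition.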
Given a system with m nodes and a database consisting of n objects, our greedy algorithm
fragments the database into m partitions (the size of partition m was allowed to vary). The
algorithm assigns each of the first m fragments ( to m ) to a different node ($P$ to $P_m$). Next, it
redistributes the objects assigned to partition m across the m nodes. This is advantageous for
two reasons. First, it distributes the internode traversals more evenly across all nodes and avoids
the scenario where most of the traversals are between the last node and the other m nodes in the
system. Second, in most cases it minimizes both the number of traversals across nodes and the
average variance in the distribution of the workload.
(If the fragments were constructed by partitioning the database into m pieces incrementally, then as the
objects are assigned, more and more constraints would be imposed on the assignment of the remaining
objects. This results in the assignment of related groups of objects to the first m fragments, the
assignment of many unrelated objects to the last fragment, and many references that span from the
first m fragments to the last one.)
The complexity of this algorithm is $O(k \, n \log n)$, where k is the average number of outgoing
AssignObjs(L, P)
/* Assign the n objects in set L to the m nodes in set P */
workload(L) = SUM(i = 1..n) work(object_i)
workload_quota = last_fract workload(L) m i
for i from to m do
    workload(P_i) = 0
    create empty hash index H for L (the indexed attribute is the identity of the object)
    create empty B-tree index B for L (the indexed attribute is its weight attribute)
    choose object o_j from L with highest frequency to start this partition
    while workload(P_i) + work(o_j) <= workload_quota do
        assign o_j to partition P_i and remove it from L
        workload(P_i) = workload(P_i) + work(o_j)
        workload(L) = workload(L) - work(o_j)
        probe H with o_j's identity; if it exists then remove o_j from both H and B
        for each edge (o_j, o_k) such that o_k in L do
            if o_k exists in H then
                weight(o_k) = weight(o_k) + relevance(o_j, o_k)
                update B to reflect new weight(o_k)
            else
                weight(o_k) = relevance(o_j, o_k)
                insert o_k into both H and B indices
            end if-else
        end for
        let o_j be the object in L with highest weight (search B)
    end while
    if workload_quota - workload(P_i) > workload(P_i) + work(o_j) - workload_quota then
        assign o_j to partition P_i and remove it from L
        workload(P_i) = workload(P_i) + work(o_j)
        workload(L) = workload(L) - work(o_j)
    end if
end for
for i from to m do
    workload_quota = workload(L) m i
    create empty B-tree index B for L (the indexed attribute is its weight attribute)
    for each object o_j in L do
        weight(o_j) = 0
        for each edge (o_j, o_k) such that o_k in P_i do
            weight(o_j) = weight(o_j) + relevance(o_j, o_k)
        insert o_j into B
    end for
    let o_j be the object in L with the highest weight to P_i (search B)
    while workload(P_i) + work(o_j) <= workload_quota do
        assign o_j to partition P_i; remove o_j from both L and B
        workload(P_i) = workload(P_i) + work(o_j)
        workload(L) = workload(L) - work(o_j)
        for each edge (o_j, o_k) such that o_k in L do
            weight(o_k) = weight(o_k) + relevance(o_j, o_k)
            update B to reflect new weight(o_k)
        end for
        let o_j be the object in L with highest weight to P_i (search B)
    end while
    if workload_quota - workload(P_i) > workload(P_i) + work(o_j) - workload_quota then
        assign o_j to partition P_i and remove it from L
        workload(P_i) = workload(P_i) + work(o_j)
        workload(L) = workload(L) - work(o_j)
    end if
end for
assign the rest of objects in L to partition P_m
Figure: A Greedy Object Placement Algorithm
edges per object. This complexity is achieved using several index structures. Many variations of
greedy graph partitioning algorithms exist, with varying complexities. Note that the complexity
of the version presented is independent of the number of nodes. Additional techniques have been
developed that offer a small performance gain at the cost of increased algorithm complexity (and
therefore increased redistribution time), such as the graph partitioning technique of Kernighan and
Lin [KL]. Several other graph partitioning algorithms developed for VLSI circuit design (see
section ) are also applicable but have higher complexities. Our greedy algorithm is presented
as a computationally simple basis for further research into availability and replication techniques.
Evaluation of the Greedy Object Placement Strategy
In this section, we quantify the tradeoffs associated with the object placement algorithm using both
analytical and trace-driven evaluation studies. The following factors have a significant impact on
our object placement algorithm:
1. The number of nodes in the system.
2. The relationship between the objects in the database and its mapping cardinality: one-to-many
   (1:N), hierarchical many-to-many (N:M), and arbitrary many-to-many (N:M).
3. The degree of skew in the distribution of traversals associated with the edges.
4. The access pattern of the queries that constitute the workload of an application and how they
   navigate the objects. The objects may be navigated in a transitive manner, a breadth-first
   manner, or a combination of the two.
We evaluated our algorithm for the last three factors as a function of the number of nodes in the
system. We made the following observations:
1. For a fixed database size, the percentage of internode traversals increases as one increases
   the number of nodes in the system, due to an increase in the number of graph partitions.
2. Our algorithm distributes the workload of an application fairly evenly across the nodes. The
   variance in the distribution of the workload was less than for all access patterns and
   relationships investigated in this study.
3. Our algorithm results in the lowest percentage of internode traversals for a database with a
   1:N relationship between objects. The percentage of internode traversals increases for an N:M
   relationship.
4. In general, the algorithm performs best for queries that navigate objects in a breadth-first
   manner.
In addition, we have evaluated the algorithm as extended by the Kernighan partitioning
technique [KL]. Briefly, our results demonstrate that this technique improves the performance of our
algorithm only marginally. There are two possible explanations for this: either the assignment of
the greedy algorithm is near optimal, leaving marginal room for improvement by the technique, or
its assignment directs the technique towards a local minimum.
For the following analysis of the greedy algorithm, the size of partition m (the overflow
partition) was fixed at of the database size. Seventy-five percent of the workload was distributed
across the first m partitions, with the remainder assigned to the overflow partition. The objects of
the overflow partition were then redistributed across the first m partitions. The size of the overflow
partition was set at of the database size after analyzing the algorithm's performance over a
variety of databases and query types while varying its size from to of the total workload.
In general, the partitioning improved as the size of the overflow partition was increased from ,
then degraded as it approached of the database size. This was due to two opposing factors.
First, when the overflow partition was of size , the last few partitions assigned contained more
and more unrelated objects (the first few partitions contained related objects and few, if any,
unrelated objects). Objects of the last partition were related strongly to other partitions, yet relatively
weakly to one another. As the overflow partition size increased, this factor diminished: object
partitioning assigned more related objects within the last few partitions. However, as the size of
the overflow partition approached of the database size, each of the first m partitions contained
fewer and fewer related objects before redistribution. The algorithm captured less locality within
the first m partitions before redistribution. At some point this drawback outweighed the benefits
of redistributing the objects, and performance degraded. In general, setting the size of the overflow
partition at of the database size provided good performance.
The remainder of this section is organized as follows. Section provides a description of
the database, identifying the alternative relationships between objects. Section describes the
analytical evaluation of the algorithm and provides the associated results. Section describes
the trace-driven evaluation of the algorithm, and Section provides the evaluation of the object
placement algorithm as a function of various configuration sizes.
Description of the test database
For this evaluation, we used a synthetic database based on a subset of the Tektronix Hypermodel
Benchmark [ABMP]. The database is a graph derived using a balanced seven-level tree, with each
level numbered from to (termed level-id). The number of objects at each level is level id. For
example, there is one object at level , at level , at level , etc. The total number of objects
in the database is .
We characterized the performance of our object placement strategy and the alternative
availability strategies for three types of interobject relationships: parent-children (PC), partOf-parts
(PP), and graph relationships. The parent-children relationship is a 1:N relationship. The children
of a given parent are ordered, with each child having at most one parent. With this relationship,
each object at level i references five unique objects at level i ( i ). There are no references
(no outgoing edges) from objects at level . The partOf-parts relationship is a hierarchical N:M
relationship. While parts may share subparts, this relationship is acyclic. This relationship is
generated by mapping each object at level i to randomly chosen objects at level i ( i ).
Finally, the graph relationship is an arbitrary N:M relationship with five edges from each object
(including objects at level ) to randomly chosen objects in the database. Consequently, it has
approximately five times as many edges as both parent-children and partOf-parts.
Figure: Analytical Evaluation of the Object Placement Algorithm (Kernighan augmentation)
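As a quick check on the shape of the parent-children tree, the following sketch counts objects per level. It assumes a single root, a fan-out of five at every non-leaf level, and seven levels, as the description above suggests; indexing depth from zero at the root is a choice made here for illustration and may not match the paper's level-id numbering.

```python
FANOUT = 5   # each non-leaf object references five unique objects one level down
LEVELS = 7   # balanced seven-level tree

# Depth 0 is the root in this sketch; the paper's own level numbering may differ.
objects_per_level = [FANOUT ** depth for depth in range(LEVELS)]
total_objects = sum(objects_per_level)

# Every non-leaf object contributes FANOUT parent-children edges.
pc_edges = FANOUT * sum(objects_per_level[:-1])

print(objects_per_level, total_objects, pc_edges)
```

Under the same assumptions, a graph relationship with five outgoing edges from every object has `FANOUT * total_objects` edges.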
Analytical evaluation
For the analytical evaluation, the object placement strategy partitioned the databases for a varying
number of nodes. Both the heat of each object and the frequency associated with each edge were
randomly generated. The object placement algorithm partitioned each database type (PC, PP,
and Graph), and the total frequency of edges crossing node boundaries was counted to determine
the percentage of internode traversals. This count was generated by adding the frequencies of the
edges associated with objects assigned to different nodes. In each of the experiments, the greedy
partitioning was augmented with the Kernighan partitioning technique and evaluated. These results
are shown in Figure . They illustrate several of our observations. Only marginal improvement is
obtained by augmenting the greedy algorithm with the Kernighan technique. This is significant
because the Kernighan partitioning technique provides a pairwise optimal partitioning: no
two objects can be exchanged between two partitions such that the resulting partitioning provides
fewer internode traversals. In this way, our greedy algorithm is near pairwise optimal. Other
observations include better performance of the greedy algorithm for databases exhibiting the 1:N
relationship (PC) as opposed to the N:M relationships (PP, graph). The algorithm performs
better with lower connectivity between objects (the PP relationship has approximately one fifth of the
edges of the graph relationship). Lastly, the percentage of internode traversals grows with
the number of nodes in the system: as the number of nodes increases, the percentage
of internode traversals also increases.
Figure: Three access patterns
Description of the trace-driven evaluation
For each interobject relationship, we analyzed three different access patterns: STRING, STAR,
and HYBRID (see Figure ). Each access pattern references five objects by navigating the edges
in the database. The trace consisted of one million sample queries based on a single given access
pattern. The trace is used at two different stages: at the training stage, to compute the heat
of each object and the frequency of traversals associated with each edge; and at the evaluation
stage, to compute (a) the percentage of time a query crosses the boundary of a node when navigating
the interobject references and (b) the distribution of the workload across the nodes. The evaluation
trace is identical to the training trace. We calculate the number of internode traversals by counting
the number of times a query crosses from one node to another when navigating an edge for the
relevant object. For example, assume that the first object of an access pattern in Figure resides on
node $P_i$ and the other four objects are assigned to some other node $P_j$ ($P_i \neq P_j$). A query based
on the STAR access pattern (termed a STAR query) results in four internode traversals because the
query is required to navigate from $P_i$ to $P_j$ four times, once for each of its traversals. Both HYBRID
and STRING queries result in one internode reference since, once they navigate from the first object
to the second, the remaining navigations are confined to $P_j$.
The size of each object was determined by choosing a random value between and ,
uniformly distributed.
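The counting rule in this example can be written down directly. The sketch below uses hypothetical object labels ("root", "x1" to "x4") and node names; each query is modeled as the ordered list of edges it navigates, with the HYBRID query taken to be one transitive step followed by three breadth-first steps.

```python
def count_internode_traversals(steps, node_of):
    """Count navigated edges whose source and target reside on different nodes.

    steps: ordered list of (source, target) edges navigated by one query
    node_of: {object: node}
    """
    return sum(1 for src, dst in steps if node_of[src] != node_of[dst])

# Hypothetical placement: the first object on node "Pi", the other four on "Pj".
node_of = {"root": "Pi", "x1": "Pj", "x2": "Pj", "x3": "Pj", "x4": "Pj"}

star = [("root", "x1"), ("root", "x2"), ("root", "x3"), ("root", "x4")]
string = [("root", "x1"), ("x1", "x2"), ("x2", "x3"), ("x3", "x4")]
hybrid = [("root", "x1"), ("x1", "x2"), ("x1", "x3"), ("x1", "x4")]

for name, steps in (("STAR", star), ("STRING", string), ("HYBRID", hybrid)):
    print(name, count_internode_traversals(steps, node_of))
```

The STAR query returns to the first object before each navigation, so every one of its four edges crosses the node boundary, whereas STRING and HYBRID cross it only once.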
Given a relationship, at the training stage each access pattern results in a different degree of
skew for the distribution of access to the objects (their heat) and the frequency of traversals
associated with the edges. To demonstrate, consider each access pattern individually, starting with
STRING.
A query based on the STRING access pattern (termed a STRING query) visits five objects
in a transitive manner. Since each database is a graph derived using a tree, for both the PC and
PP relationships a STRING query starts its traversal with a random object at levels to (i.e.,
the first three levels). Otherwise, it would not be able to visit five objects because the objects
at level have no outgoing edges (a STRING query initiating at an object in level can visit at
most objects). The partOf-parts relationship results in several objects that are never visited by
a STRING query (i.e., they have a zero frequency of access) because these objects have no edge
path to objects in the first three levels of the tree (some objects at level are not referenced by
those at level ). With the Graph relationship, the STRING query may start with any object in
the database because each object has five outgoing edges. This study investigates two possible
distributions for the Graph relationship: uniform and skewed distribution of accesses to objects
and traversals across the edges. These distributions are hereafter referred to as uniform-Graph
(uG) and skewed-Graph (sG), respectively. With the uniform-Graph, a STRING query starts
its traversal with a random object that resides at an arbitrary level of the tree, resulting in a
uniform distribution of traversals across the edges. With the skewed-Graph, each query starts with
an object at the first levels of the tree, similar to PC and PP. This results in a skewed pattern
of access and provides a fair comparison between the sG, PC, and PP relationships.
Figure: Graph Relationships
A STAR query traverses the edges in a breadth-first manner. For both PC and PP, a query
begins its traversal with a randomly chosen object that resides at levels to (objects at level
have no outgoing edges and cannot initiate a STAR query). This results in a uniform distribution
of access to edges and a fairly uniform distribution of access to the objects as well. Access to objects
is slightly skewed since objects at levels and will average fewer accesses than those in the
middle levels of the database structure. Similarly, the sG relationship requires the STAR query to
initiate at objects residing at levels to . Again, the distribution of access to objects is slightly
skewed, while the frequency distribution across the edges associated with the objects at levels to
is uniform. Note that the other edges (from objects in level ) will not be traversed and have
frequencies of . The uG relationship chooses arbitrary objects to start a STAR query, resulting
in a uniform distribution of access to both the objects and the traversals across the edges.
A HYBRID query performs a single transitive traversal followed by three breadth-first
traversals. For the PC, PP, and sG relationships, a query based on this access pattern starts its
traversal with a randomly chosen object at levels to . For the uG database, this query starts
with an arbitrarily chosen object in the database.
Evaluation of the object placement algorithm
Figures a and b present the percentage of internode traversals obtained by the greedy algorithm
for the uG and sG relationships, respectively. With the uG database, the alternative access patterns
result in approximately the same percentage of internode traversals. This is because each access
pattern yields a uniform distribution of access to both objects and edges, directing the object
placement algorithm to partition the graph equivalently for each access pattern. The probability
of internode traversals is equivalent for each access pattern since each navigates a fixed number of
edges.
Compared to the uG database, the sG relationship results in a lower percentage of internode
traversals for the alternative access patterns. The skewed distribution of traversals across
the edges provides a hint to the object placement algorithm that directs it to localize the most
frequently traversed edges to a single node. In addition, the initial object placement algorithm
provides the best performance for the STRING access pattern and the worst for HYBRID. This is
because the STRING access pattern results in the most skew along the edges, while HYBRID
generates the least skew. In these experiments, the standard deviation for the distribution of the
workload across the nodes was less than .
The results obtained for the PP relationship are similar to sG for HYBRID and STAR queries.
The object placement algorithm, however, performs worst for the STRING query as compared to
the other two query types. The PP database consists of several subgraphs, and the STRING query
causes many of these subgraphs to become frigid (heat = 0), as described in section . The
database effectively contains fewer accessed objects with higher traversal frequencies across the
related edges, forcing the object placement algorithm to distribute these strongly connected hot
objects to different nodes in order to distribute the workload as evenly as possible. This causes a
relatively high percentage of internode traversals as compared to the other access patterns. With
nodes, the standard deviation in the distribution of the workload reached for the STRING
query.
The greedy algorithm performs best for the PC relationship. It results in internode
traversals for the STAR query (all configurations) and approximately internode traversals for
the STRING and HYBRID queries. With this relationship, the STAR query results in a uniform
distribution of access to edges, causing the object placement algorithm to perform essentially a
breadth-first assignment of objects. This localizes most of the objects traversed by a STAR query
to a single node, hence the low percentage of internode traversals. The STRING query results
in a higher skew among the edges of PC because the edges between objects from levels to
are traversed most frequently. The number of traversals associated with the outgoing edges at
different levels can be ranked as follows: ( has no outgoing edges). This causes the
object placement algorithm to assign the objects at levels through in essentially a depth-first
manner. The objects at levels to are then assigned in a breadth-first manner. Guided by the
structure of the database, several distinct subgraphs of levels through are assigned to each
node. Consequently, once a STRING query is directed to a node, it crosses a node boundary
between levels and , for one fault out of the five objects referenced.

Term  Definition
m     Number of processors in the system
C     Number of clusters in the system
c     Number of processors per cluster, c = m/C
n     Number of objects in the database
Table: List of terms used repeatedly in this work and their respective definitions

Availability Strategies
As one increases the number of nodes in the system in a shared-nothing architecture, the probability
of a node failure increases proportionally [PGK]. To maintain the availability of data in the
presence of such failures, a backup copy of each object is stored on a different node. While this
increases the overhead associated with updates (both the primary and backup copies of each object
must be updated), these backup copies of objects can be utilized to minimize the number of
internode traversals during query processing. Thus, the assignment of the backup copies of objects to
the nodes is important. This section describes three availability strategies whose objective is to
minimize the number of internode traversals over some period of time.
Three alternative availability strategies
This section describes three alternative availability strategies: LOad Balanced Object replication
(LOBO), Subpartition Interleaved REplication (SIRE), and LORE. These strategies replicate the
data assuming that the m nodes in the system are partitioned into C clusters, each cluster with
c = m/C nodes. First, the greedy algorithm is employed to partition the database into C
equi-workload subgraphs, with each subgraph assigned to a different cluster. Subsequently, each of
these subgraphs is partitioned into c subgraphs using the greedy algorithm in order to generate
the fragments of the database, with each fragment assigned to a different node in the cluster.
Next, one copy of a fragment is designated as the primary copy and the other as the backup copy
(termed primary and backup fragments). LOBO, SIRE, and LORE each manipulate the backup
fragment by partitioning it into c - 1 backup subfragments. Each of these backup subfragments
is stored on a node in the cluster other than the one containing the corresponding primary fragment.
The concept of a cluster enables the system to continue operation in the presence of multiple failures
as long as there is no more than one failure per cluster [CK]. One disadvantage of using this
concept is that it increases the variance in the distribution of the workload across the nodes in the
presence of failures.
Figure: Interleaved declustering (primary fragments S and backup subfragments s across the processors of two clusters)
To illustrate, assume an eight-node system (m = 8) partitioned into two clusters (C = 2), each
consisting of four nodes (c = 4). The greedy algorithm is employed to divide the objects into two
cluster partitions. It is used again upon each of these partitions to construct the primary fragments
of the database (the S fragments; see Figure ). Next, an availability strategy is employed to decluster the
backup fragment of a node (say s) into three subfragments, with each subfragment
assigned to a different node in the cluster.
The availability strategies presented generate backup subfragments such that each imposes an
equal amount of workload on its respective node when the corresponding primary fragment is
unavailable. For example, in Figure the three backup subfragments of a primary fragment are generated
such that the workload of its node is evenly distributed across the other three nodes of the cluster
should that node fail. This provides two main benefits to the system during a failure. First, evenly distributing
the workload of a failed node across the others in its cluster minimizes the probability of forming
a bottleneck and significantly degrading the performance of the system when a failure occurs.
Second, this minimizes the time required to reconstruct the data of the failed disk since each backup
subfragment can be accessed concurrently. (Many systems utilize a hot standby on which the data
of the failed disk is reconstructed when a failure occurs.) During reconstruction, if another node in
the cluster fails, some data may become unavailable. By minimizing the time necessary to reconstruct
the data of the failed disk, the mean time to data loss (when all copies of some object are unavailable)
is maximized.
If the system elects to use the backup copies of objects for query processing during the
normal mode of operation to reduce the number of internode traversals, this approach results in
a higher standard deviation in the distribution of the workload during the normal mode of operation
as compared to the greedy initial object placement algorithm. This is because the number of times a
backup copy of an object is referenced during the normal mode of operation is unrelated to
the number of times it is referenced during a failure that renders its corresponding
primary copy unavailable. Object placement can be optimized such that the workload is evenly
distributed either in the presence of a failure or during the normal mode of operation. An alternative
approach that assigns backup copies of objects with the objective of distributing the workload evenly
during the normal mode of operation can result in the formation of a bottleneck when a node fails,
since all the workload of the failed node might be imposed on only one other node in its cluster.
This demonstrates a basic tradeoff in the design of an availability strategy. The alternative approach
is discussed in more detail in Section but was abandoned due to its complexity.
While LOBO, SIRE, and LORE each generate equi-workload backup subfragments, they
generate these backup subfragments in a different manner. LOBO generates the c - 1
subfragments by analyzing each object individually. It attempts to store backup copies of objects
on the nodes they are most related to. SIRE views each backup fragment s as itself a graph to be
partitioned into c - 1 subgraphs. Its main objective is to maintain strong relationships between
objects within backup subfragments. Secondary to this goal, it attempts to store backup copies of
objects on the nodes they are most related to, as with LOBO. LORE, on the other hand, tries to
reach a compromise between these two approaches (hence its name). It attempts to store backup
copies of objects on the nodes they are most related to while maintaining relationships between objects
within backup subfragments. We describe each strategy in turn.
LOad Balanced Object replication (LOBO)
LOBO is designed to minimize the probability of a traversal crossing the boundary of a node by maximizing the chance of the traversal finding the backup copy of the referenced object locally. Given a cluster with c nodes, LOBO analyzes each object, say o_i, of a backup fragment assigned to a node, say P_i, individually and assigns the backup copy of this object to a different node in the cluster. LOBO assigns the backup copy of o_i to the node P_j that does not yet have 1/(c - 1) of P_i's workload and has the highest value for

\[ \mathrm{Related}(o_i, P_j) = F + \mathrm{EFU}(P_i) \times T \]

where F represents the total frequency of traversals associated with edges from the primary copies of objects assigned to P_j to object o_i, and T is the total frequency of traversals associated with edges from o_i to the primary copies of objects that reside on node P_j. The Expected Fractional Unavailability of P_i, EFU(P_i), represents the fraction of time P_i is expected to be unavailable during an interval of time. In this work, we assume that the EFU of a node is known and do not develop techniques to compute it. This is a significant research topic in itself that deserves to be addressed in its own right and is beyond the scope of this work; statistical analysis of either the mean time to failure of a disk or historical data collected from the system is among several possible approaches. In the absence of this information, letting EFU(P_i) = 0 models a system with no failures.
This equation enables LOBO to be a general purpose algorithm that can be used for alternative shared-nothing platforms with different values for EFU(P_i). The equation is appropriate for both homogeneous systems, where the EFU is constant for each node, and heterogeneous platforms, where the EFU of the nodes varies from one to another.
The reasoning behind the equation is as follows. When assigning the backup copy of an object, one goal is to minimize the number of internode traversals over a period of time by using the backup copy of an object for query processing. Given an object o_i, if the EFU of the node containing its primary copy is very low, then its backup copy should be assigned to the node whose objects most frequently reference this object. Otherwise, its backup copy should be assigned to the node whose objects are most frequently referenced by o_i, because the backup copy of o_i is used to initiate queries when the node storing its primary copy is unavailable. The equation uses the EFU of a node in order to minimize the number of internode traversals over a period of time.

[Figure: A three node system. Nodes P1, P2, and P3 hold objects O1, O2, and O3; the edge labels (150, 1, 150, 100, 700) are traversal frequencies.]

To demonstrate this, consider a cluster with three nodes as shown in the figure. The primary copy of each object resides on a different node, and the edges represent the frequency of traversals between these objects. When assigning the backup copy of an object, LOBO can assign it to either of the two nodes that do not hold its primary copy. If the EFU of the node holding the primary copy is expected to be low, the backup copy is assigned to the node whose objects reference it most frequently, since that node has the higher Related value; this reduces the frequency of internode traversals by more than the alternative assignment would. However, if that EFU is expected to be high, the equation directs LOBO to assign the backup copy to the other node, whose Related value is now the higher of the two, in order to minimize the overall number of internode traversals.
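The effect of the EFU on the choice of node can be illustrated with a small sketch. It assumes the relatedness measure has the form F + EFU(P_i) × T; the node names and traversal frequencies below are hypothetical and are not the values of the three node figure.

```python
def related(f: float, t: float, efu: float) -> float:
    """Relatedness of o_i to a candidate node P_j.

    f   -- traversals from primary copies on P_j to o_i (F)
    t   -- traversals from o_i to primary copies on P_j (T)
    efu -- expected fractional unavailability of the node holding
           the primary copy of o_i, a fraction in [0, 1]
    Assumed form: F + EFU * T.
    """
    if not 0.0 <= efu <= 1.0:
        raise ValueError("EFU is a fraction of time and must lie in [0, 1]")
    return f + efu * t


# Hypothetical candidates: P_a references o_i often, while o_i
# references the objects of P_b often.
candidates = {
    "P_a": {"f": 150.0, "t": 100.0},
    "P_b": {"f": 100.0, "t": 700.0},
}


def best_node(efu: float) -> str:
    """Return the candidate with the highest relatedness for a given EFU."""
    return max(
        candidates,
        key=lambda n: related(candidates[n]["f"], candidates[n]["t"], efu),
    )
```

With these made-up frequencies, a low EFU favors the node whose objects reference o_i, and a high EFU shifts the choice to the node whose objects o_i references.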
If the backup fragment of each node contains many objects, when assigning the backup copies of objects from node P_i to P_j, LOBO's objective is to assign to P_j the backup copies of those objects most related to P_j before exhausting its quota, i.e., 1/(c - 1) of P_i's workload. LOBO assumes that this assignment is ideal for declustering the backup copies of objects across the nodes. This assumption is not valid for all circumstances, as demonstrated in the results. LOBO assigns the backup copies of objects from P_i to P_j as follows. First, it collects the backup copies of those objects of P_i that have not yet been assigned into a list L. For each element, say o_i, of L, it computes Related(o_i, P_j). Next, it sorts these elements based on this value. As a final step, it assigns the backup copy of each object in L to P_j until P_j's quota is exhausted, starting with the first element of L (the one with the highest value for Related(o_i, P_j)).
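The three steps above (collect, sort, assign until the quota is exhausted) can be sketched as follows. The function name, the dictionary inputs, and the rule that stops once the accumulated workload reaches the quota are assumptions made for illustration; the text does not specify how an object that would overshoot the quota is handled.

```python
def assign_backups(unassigned, related_to_pj, workload, quota):
    """Pick the backup copies of P_i's objects to place on P_j.

    unassigned    -- ids of P_i's objects whose backup is not yet placed (list L)
    related_to_pj -- dict: object id -> Related(o, P_j)
    workload      -- dict: object id -> workload imposed by the object
    quota         -- share of P_i's workload that P_j may absorb
    """
    # Sort L by decreasing relatedness to P_j.
    ranked = sorted(unassigned, key=lambda o: related_to_pj[o], reverse=True)
    chosen, used = [], 0.0
    for obj in ranked:
        if used >= quota:  # assumed stopping rule: quota exhausted
            break
        chosen.append(obj)
        used += workload[obj]
    return chosen


# Hypothetical example: four objects, a cluster of c = 3 nodes.
related_to_pj = {"a": 5.0, "b": 9.0, "c": 1.0, "d": 7.0}
workload = {"a": 10.0, "b": 10.0, "c": 10.0, "d": 10.0}
c = 3
quota = sum(workload.values()) / (c - 1)  # 1/(c - 1) of P_i's workload
placed_on_pj = assign_backups(["a", "b", "c", "d"], related_to_pj, workload, quota)
```

In this example P_j receives the two objects most related to it, and the rest remain in L for the other node of the cluster.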
Subpartition Interleaved REplication (SIRE)
SIRE is designed with two objectives. First, SIRE maintains locality within the backup subfragments, i.e., objects in the backup subfragments are related to each other. Its secondary objective is to minimize the probability of a traversal crossing the boundary of a node, similar to LOBO. Given a cluster with c nodes, SIRE views the backup fragment of each node as a graph and employs the greedy algorithm of Section to partition it into c - 1 subgraphs, each with approximately 1/(c - 1) of the workload of the fragment. Next, it assigns each subgraph to a node in the cluster other than the one containing its primary copy. It stores a subgraph on the node with the highest value for the Related equation that has not been assigned a subgraph as yet. In this case, F represents the total number of traversals from the primary copies of objects assigned to P_j to objects in that subgraph, and T is the total number of traversals from the objects of the subgraph to the primary copies of objects stored on P_j. The rationale behind using the equation is the same as that for LOBO (see Section).

LORE
LORE was developed as a compromise between LOBO and SIRE. Our initial studies demonstrated that both LOBO and SIRE had desirable traits not found in the other. During normal operation (no failures), LOBO provides a better reduction in internode traversals than SIRE. This is due to the importance LOBO places on storing the backup copy of an object on the node it is most related to. With multiple failures, however, the performance of LOBO degrades significantly while SIRE's performance remains stable, enabling SIRE to outperform LOBO. This occurs because SIRE maintains relationships within the backup subfragments, while LOBO does not. LORE is an attempt to gain the best of both worlds.

[Figure: An object graph and partial object placement.]

The implementation of LORE is quite similar to that of LOBO. When assigning backup copies of objects from node P_i to P_j, it examines each of P_i's objects individually and assigns the object that is most highly related to those of P_j. The only difference between LOBO and LORE is that LORE considers those backup copies of objects that have already been assigned from P_i to P_j, while LOBO does not. This allows LORE to maintain locality within the backup subfragments while still retaining performance close to LOBO.
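One way to read the difference between LOBO and LORE is sketched below. The scoring rule is an assumption: the text says only that LORE "considers" the backup copies already assigned from P_i to P_j, so this sketch simply adds the traversal frequency between a candidate and those already-placed copies to its relatedness to P_j. All names and numbers are hypothetical.

```python
def lore_assign(objects, base_related, edge_freq, workload, quota):
    """Greedy LORE-style selection of backup copies to place on P_j.

    objects      -- ids of P_i's objects whose backup is not yet placed
    base_related -- dict: object id -> Related(o, P_j) w.r.t. primary copies
    edge_freq    -- dict: frozenset({a, b}) -> traversal frequency between a and b
    workload     -- dict: object id -> workload imposed by the object
    quota        -- share of P_i's workload that P_j may absorb
    """
    remaining = set(objects)
    placed, used = [], 0.0
    while remaining and used < quota:

        def score(o):
            # Assumed locality bonus: edges to backups already placed on P_j.
            locality = sum(edge_freq.get(frozenset((o, p)), 0.0) for p in placed)
            return base_related[o] + locality

        # sorted() makes tie-breaking deterministic.
        best = max(sorted(remaining), key=score)
        placed.append(best)
        remaining.remove(best)
        used += workload[best]
    return placed


base_related = {"a": 10.0, "b": 6.0, "c": 7.0}
workload = {"a": 1.0, "b": 1.0, "c": 1.0}
edges = {frozenset(("a", "b")): 5.0}

with_locality = lore_assign(["a", "b", "c"], base_related, edges, workload, 2.0)
without_locality = lore_assign(["a", "b", "c"], base_related, {}, workload, 2.0)
```

With no edges between backup copies the selection matches a LOBO-style ranking; once an edge ties "b" to the already-placed "a", "b" is preferred over "c", keeping related backup copies together.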
Object Placement for Even Workload Distribution During Normal Mode of Operation
As mentioned in Section, our three availability strategies generate the backup subfragments so that the workload of a failed node is evenly distributed across the other nodes in its cluster. This is achieved by assuming that the backup copy of an object imposes workload on its assigned node proportional to the total workload of the object. This assumption is true when a node has failed and allows the workload of the failed node to be evenly distributed across the others in its cluster. However, this results in a less uniform workload distribution during the normal mode of operation, because the assumption is invalid with no failures. This observation motivated us to investigate techniques that more evenly distribute the workload during the normal mode of operation.
The most crucial problem in distributing the workload evenly during the normal mode of operation is the calculation of the frequency of access to the backup copy of an object when it is assigned. Given this information, the workload imposed by each copy on its assigned node can be found. Initially, before backup copies are assigned, the primary copy of each object is assigned the total heat of the object. Placing the backup copy of an object, o_back, on a node P causes o_back to be accessed whenever an edge is traversed from an object accessed on P to o_back. The frequency of access to o_back during normal mode operation is fully dependent on the number of traversals from objects of P to o_back. However, even assuming a known frequency of access to copies of objects on P and the frequency of traversals along the edges from these objects to o_back, it is not possible to accurately estimate the frequency of access to o_back. For example, consider the object graph and the partial object placement in Figure. Given this configuration, it is impossible to determine the frequency of access to the backup copy of an object once it is assigned to a node. The object graph records how many times the edge into that object is traversed. However, given these statistics, it is impossible to determine whether the edge will be traversed from the primary copy of the source object or from its backup copy. The backup copy of the target object might therefore be accessed as many times as the edge is traversed, or it might never be accessed. The statistics assumed in this study do not provide for an accurate estimation of the frequency of access to backup copies of objects. Due to these considerations, this approach was neither implemented nor analyzed in this study.

However, a probabilistic model for determining the frequency of access to a backup copy of an object can be constructed if additional statistics regarding the number of transitive traversals along an edge are maintained (in addition to the total frequency of traversal along the edge). A transitive traversal along an edge from object o_i to object o_j is defined as a traversal from o_i to o_j when o_i did not initiate the query. It is termed transitive because only queries that access objects in a transitive manner can generate a transitive traversal, i.e., STRING and HYBRID queries (the STAR query cannot generate a transitive traversal). With these statistics, a probabilistic model can more accurately estimate the number of times the backup copy of an object is referenced during the normal mode of operation. Even with this information, it is still difficult to evenly distribute the workload, and the process is more computationally complex than our approach.
Assuming that the frequency of access to the backup copy of an object o_i can be calculated when it is assigned (using a probabilistic model), the number of references directed to the primary copy of o_i can also be determined (simply the total heat of o_i minus the number of accesses to the backup copy of o_i). However, the calculations of the frequency of access to copies of objects related to o_i assumed that only one copy of o_i existed: the primary copy, which was assigned the total heat of o_i. The assignment of the backup copy of o_i to P_i causes fewer accesses to the primary copy, invalidating this assumption. Thus, the frequency of access to all copies of all objects related to o_i may no longer be correct after the backup copy of o_i is assigned, and this problem propagates itself throughout the entire object graph. In order to accurately estimate the workload of each copy of each object (and therefore distribute the workload evenly), the frequency of access to every copy of every object may need to be recalculated each time a backup copy of any object is assigned. The complexity of an algorithm to accurately estimate the frequency of access to each copy of all objects during the normal mode of operation explodes. For systems with many objects (i.e., parallel systems), availability algorithms with a high computational cost are not feasible, and we elected to abandon this approach.
Evaluation of the three alternative availability strategies
In this section, we quantify the tradeoffs associated with the alternative availability strategies using a trace-driven evaluation study. As with the greedy object placement strategy, the following factors have a significant impact on the availability strategies:

• The number of nodes in the system.
• The relationship between the objects in the database and its mapping cardinality: one-to-many (1:N) and many-to-many (N:M).
• The degree of skew in the distribution of traversals associated with the edges.
• The access pattern of queries that constitute the workload of an application and how they navigate the objects. The objects may be navigated in either a transitive, breadth-first, or combination of these two manners.

In addition to these four factors that also impact the object placement algorithm, the following impact the percentage of internode traversals observed with the alternative availability strategies:

• The size of each cluster (c).
• The number of clusters in the system (C).
• The number of failed nodes in the system.
• The EFU associated with each node.
• The copy of an object employed by the system when initiating the traversal performed by a query. To clarify, for a query that initiates its navigation with object x, the system can either (1) direct the query to the node containing the primary copy of x and use the backup copies of objects for subsequent traversals performed by the query, or (2) employ either the primary or backup copy of x depending on the load of the node containing the respective copy of x (dynamic load balancing).

We evaluate the alternative availability strategies assuming that the system employs the primary copy of x to initiate the traversal and can utilize local backup copies of objects for subsequent traversals. In all experiments, we assumed a single fixed value for the EFU of each node. We evaluated LOBO, SIRE, and LORE for various cluster sizes (c), numbers of clusters (C), and failures. We draw the following conclusions:
• LOBO is superior to SIRE for queries that navigate objects in a breadth-first manner, regardless of the relationship between the objects (see Figure (b)). LORE yields performance close to that of LOBO. For a system with a fixed number of nodes, LOBO becomes less effective when the nodes are grouped into a larger number of clusters (C). LORE becomes slightly less effective, while SIRE remains relatively unaffected. In the presence of multiple failures, SIRE maintains stable performance while the performance of LOBO degrades significantly; SIRE outperforms LOBO for this class of queries. The performance of LORE degrades slightly but remains close to that of SIRE.
• When the relationship between the objects is 1:N, SIRE is slightly superior to LOBO for queries that navigate objects in a depth-first (transitive) manner (see Figure (a)). LORE is slightly better than both of these techniques. SIRE outperforms LOBO by a wider margin as the number of clusters (C) increases, and begins to outperform LORE. In the presence of multiple failures, SIRE remains the superior strategy. The performance of LOBO degrades significantly, while the performance of LORE degrades slightly. For queries that navigate objects in a breadth-first manner, LOBO outperforms LORE, which in turn outperforms SIRE.
• For an N:M relationship between objects, LOBO is superior to SIRE for various access patterns and for different cluster sizes. In the presence of failures, SIRE outperforms LOBO.
• For almost all relationships and access patterns, SIRE provides the best workload distribution. The standard deviations in the workload provided by LOBO and LORE are nearly the same.

To summarize, in general, LOBO is the best strategy during the normal mode of operation, followed closely by LORE. SIRE, however, maintains the best workload distribution. In the presence of multiple failures, the performance of LOBO degrades significantly, while LORE maintains a relatively stable performance, degrading slightly. SIRE maintains stable performance and outperforms both LOBO and LORE. Furthermore, it maintains a better workload distribution under failure than either LOBO or LORE.

The rest of this section is organized as follows. The next two subsections evaluate LOBO, SIRE, and LORE during the normal mode of operation and in the presence of multiple failures, respectively.

Evaluation of LOBO, SIRE, and LORE with no failure
We evaluated LOBO, SIRE, and LORE as a function of (1) the size of each cluster (c) and (2) the number of clusters in a system (C). In each of these experiments, the number of objects in the database was held fixed. In the first experiment, we vary the size of a cluster and analyze the percentage of internode traversals observed with each availability strategy. Since the results are comparable with those obtained in Section, we report the percentage improvement observed by each strategy as compared to the object placement algorithm. The strategy with the highest percentage improvement is a superior strategy. In the second experiment, we fix the number of nodes and quantify the impact of forming different numbers of clusters on the percentage of internode traversals in the system. Once again, we report the percentage improvement observed by each strategy.
[Figure: skewedGraph, varying cluster size, no failure.]

Cluster size (c)
Figure presents the percentage improvement in the number of internode traversals obtained by the availability strategies for alternative access patterns with the sG database. With the STAR access pattern, LOBO results in the best savings as compared to LORE or SIRE (the same observation is made for both the PC and PP relationships). This is best explained using an example. Assume that a node, say P_i, contains the primary copy of an object o in Figure (b). LOBO has the potential to assign the backup copies of the four objects related to o to P_i. LORE also has this potential, yet is restricted somewhat by considering relationships between objects in the backup subfragments. When there are no failures in a system under the STAR query, traversals between two backup objects do not occur, yet LORE considers them. This provides a slightly lower performance for LORE. SIRE, on the other hand, does not have this potential to assign copies of objects independently, because it partitions the backup fragments into subgraphs and assigns each subgraph separately.

With the STRING access pattern, the savings obtained by LOBO are not as pronounced when compared to LORE and SIRE. To explain this, consider the STRING of Figure (a). Given the primary copy of o_1 stored on P_i, LOBO may assign the backup copy of o_2 to P_i. Next, it assigns the backup copy of o_3 to the node containing the primary copy of o_2, say P_j. If P_i is different from P_j, an internode reference still exists in navigating from o_1 to o_2 to o_3. Thus, LOBO's performance is not as good for the STRING query as for the STAR. LORE's performance also degrades, but not as much as LOBO's, since it captures some locality within the backup subfragments, enabling LORE to store o_1, o_2, and o_3 on the same node. The performance of SIRE for the STRING query, on the other hand, is similar to its performance on the STAR query, since it maintains strong relationships within backup subfragments; hence the backup copies of o_2 and o_3 have a higher probability of being stored together.

Table: Summary of the results

          ParentChildren   PartOfParts        UniformGraph   SkewedGraph
STAR      LOBO             LOBO               LOBO           LOBO
STRING    LORE, SIRE       LOBO, SIRE, LORE   LOBO           LOBO
HYBRID    LORE             LORE               LOBO, LORE     LOBO

[Table: Standard deviation in the workload, varying cluster size. Columns: LOBO, LORE, SIRE; rows: STAR, STRING, and HYBRID for each number of nodes.]
The HYBRID access pattern causes LOBO and LORE to behave somewhere between the STRING and STAR access patterns, while the percentage improvement of SIRE remains relatively insensitive to the access pattern (it is the same as for STAR and STRING). Table identifies which strategy is superior for the alternative relationships and access patterns.

In these experiments, we observed that when a strategy results in a higher percentage of savings, it also results in a less uniform distribution of the workload (see Table). The higher savings derive from more accesses to backup copies of objects. The three availability strategies assume that the number of references to the backup copy of an object is proportional to the heat of the object. This assumption is not true during the normal mode of operation (as discussed in Section) and results in a higher standard deviation in the workload. As demonstrated in Table, the STRING access pattern resulted in the worst distribution of the workload for all database types, while the STAR access pattern yielded the best.
[Figure: skewedGraph, fixed number of processors, varying number of clusters (C), no failure.]

Number of clusters (C)
In this experiment, the number of nodes was held fixed. We analyzed the savings in internode traversals for each strategy when the number of clusters is varied. Figure presents the percentage improvement provided by the availability strategies for each access pattern for the sG relationship. In general, the results demonstrate that LOBO and LORE are sensitive to the number of clusters, while SIRE is relatively insensitive.

As the number of clusters in the system increases, the percentage improvement obtained by LOBO and LORE for the STAR query decreases significantly. This holds true to a lesser extent for the HYBRID query. This is because the backup copy of each object is constrained to a single cluster. The backup copy of an object is assigned based on its relevance to a subset of the primary fragments in the database (those assigned to a cluster). With one cluster, the backup copy of each object is assigned by considering its relevance to all the other primary fragments. With eight clusters, the backup copy of each object is assigned based only on its relevance to one eighth of the primary fragments (i.e., those within the cluster). This may cause suboptimal assignments (the optimal assignment may be to a node not in this cluster), reducing the overall performance of the system. The impact is not exactly this fraction, because the fragments assigned to the nodes of a cluster have a high degree of relevance (see Section for how the primary fragments are formed).

With SIRE, the number of connected objects per backup subfragment of each node increases in proportion to the number of clusters in the system (due to fewer nodes per cluster). For example, with a single cluster containing all the nodes, each backup fragment is divided into many subfragments and distributed across the other nodes in the system. However, with several smaller clusters, each backup fragment is divided into only a few subfragments; hence, each subfragment is larger and contains more connected objects. This has no impact on the STAR query, because increasing the number of objects per backup fragment does not impact its frequency of access during the normal mode of operation. Thus, LOBO and LORE outperform SIRE. However, for the STRING query, this provides a substantial savings as compared to LOBO and LORE. This observation is slight for the N:M relationships. However, for the PC relationship, this provides SIRE a significant savings as compared to LOBO and LORE; SIRE outperforms them even with a small number of clusters. This observation contributes to the stable performance of SIRE shown in Figure.

The observations made concerning the standard deviation in this experiment were similar to those drawn from the previous experiment. With respect to standard deviation in the workload, SIRE provides the best performance for all databases and query types. LOBO and LORE yield nearly equivalent workload distribution, with LOBO slightly better. In general, the availability strategies provide the best workload distribution for the STAR query type, while the STRING query type results in the worst workload distribution. However, no conclusion could be drawn concerning the effect of increasing the number of clusters on the standard deviation of the workload. Workload distribution was better in some cases as the number of clusters varied, and worse in others.
Evaluation of LOBO, SIRE, and LORE with failures
We evaluated the three availability strategies in the presence of failures as a function of (1) the size of each cluster and (2) the number of failures in a system. The number of objects in the system was again held fixed for these experiments. In the first experiment, we assumed a single cluster and measured the percentage of internode traversals incurred by the availability strategies in the presence of a single failure as a function of the number of nodes in a cluster. In the second experiment, we analyzed the performance of LOBO, SIRE, and LORE for a node configuration consisting of multiple clusters, as a function of the number of failures (up to one per cluster).

[Figure: skewedGraph, varying cluster size, one failure. Percentage of external traversals.]

In these experiments, when a query references an object stored on the failed node, it is directed to the nodes containing the failed node's backup subfragments. The query (1) finds the referenced object on one of these nodes and starts its navigation, and (2) stops executing on those nodes that do not contain the backup copy of the referenced object. We used the percentage of internode traversals incurred after the query starts its navigation as our evaluation criterion.
Cluster size (c)
Given a system with m nodes organized in one cluster, we assumed a failure for each distinct node and used the evaluation trace to compute both the percentage of internode traversals and the average distribution of the workload across the nodes. The reported results represent the average of these numbers.

Figure presents the percentage of internode traversals incurred by each replication strategy for the alternative access patterns with the sG relationship. In general, as we increased the number of nodes in a cluster, we observed two opposing factors: (1) the probability that a query initiates its traversal with the backup copy of an object decreases, and (2) once the backup copy of an object is referenced, the probability of a subsequent reference crossing the boundary of a node increases, because the backup subfragments of the failed node contain fewer objects. With few nodes per cluster, SIRE and LORE outperform LOBO because the frequency of access to the backup copies of objects is relatively high and the backup subfragments are larger; since SIRE and LORE maintain the connectivity between the backup copies of objects, each backup subfragment captures more locality and the impact of the second factor is minimized (note that the number of objects per backup subfragment is the same for LOBO, SIRE, and LORE). The first factor causes LOBO's performance to degrade significantly from the no-failure case, while the second allows the performance of LORE and SIRE to degrade less (see Figure).

[Figure: skewedGraph, varying cluster size, one failure, performance degradation.]

When the cluster size is larger, LOBO and LORE outperform SIRE because (1) the number of queries initiating their traversals with backup copies of objects is small, and (2) there are fewer objects per backup subfragment, reducing the ability of SIRE and LORE to capture locality within them (this does not affect LORE as significantly as SIRE, since its priorities in assigning backup objects are different). The first factor minimizes the degradation of the performance of LOBO and LORE from the no-failure case (see Figure), while the second minimizes the impact of the ability of SIRE and LORE to capture locality within the backup subfragments. Table lists the superior strategy for the PC and PP relationships and each access pattern.

In summary, SIRE performs best when failures occur and more queries are redirected to the backup copies of objects. This is because it captures locality within backup subfragments. LOBO's performance, on the other hand, degrades significantly when failures occur and more queries are directed to backup copies of objects. This is because it does not capture locality within backup subfragments. LORE provides a compromise between LOBO and SIRE. It maintains locality within backup subfragments and receives the associated benefits. Its performance degrades when a failure occurs, but far less than that of LOBO. Under no failures, its performance is close to that of LOBO and is much better than that of SIRE. Thus, when a failure occurs, its overall performance is better than either LOBO or SIRE in many cases (see Table). The standard deviation observations in this experiment provided no new conclusions beyond those of the previous experiments.

[Figure: skewedGraph, fixed numbers of processors and clusters, varying number of failures.]

(Footnote: This is how Omega is designed to execute queries in the presence of failures.)
Number of Failures
In a final experiment, we assumed a system whose nodes are grouped into clusters. We varied the number of failures from zero to one failure per cluster and analyzed the percentage of internode traversals with the alternative availability strategies. The accompanying figure presents the obtained results for the alternative access patterns with skewedGraph. In general, the results demonstrate that LOBO is sensitive to the number of failures, whereas LORE and SIRE are relatively insensitive.
Figure: varying number of failures.
For the alternative access patterns, LOBO yields a higher percentage of internode traversals as the number of failures grows because it does not consider the relationships between the backup copies of objects when assigning them, and as the number of failures increases, more queries initiate their traversals within the backup subfragments. LORE and SIRE, on the other hand, keep related backup copies of objects together within backup subfragments, and their performance remains more stable.
Table: Summary of the results, varying number of failures (L = LOBO, S = SIRE, O = LORE).
          parent-children (PC)       partOf-parts (PP)
STAR      L   O   O   O   S          L   L   LO  O   OS
STRING    S   S   S   S   S          L   L   S   S   OS
HYBRID    S   S   S   S   S          O   O   O   O   OS
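The metric reported in this experiment, the percentage of internode traversals, can be illustrated with a minimal sketch. The function name, the dictionary-based placement, and the toy trace below are assumptions made for illustration; they are not taken from the simulator used in this study.

```python
def internode_traversal_pct(placement, traversals):
    """Return the percentage of traversals whose endpoints live on different nodes.

    placement:  dict mapping object id -> node id (hypothetical representation)
    traversals: iterable of (source_object, target_object) pairs from a trace
    """
    traversals = list(traversals)
    if not traversals:
        return 0.0
    # A traversal is "internode" when source and target sit on different nodes.
    crossing = sum(1 for src, dst in traversals if placement[src] != placement[dst])
    return 100.0 * crossing / len(traversals)


# Toy example: four objects spread over two nodes, four traversals.
placement = {"a": 0, "b": 0, "c": 1, "d": 1}
trace = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
pct = internode_traversal_pct(placement, trace)
```

In this toy trace two of the four traversals cross a node boundary. A placement that keeps related objects on the same node lowers the value; placing every object on one node would drive it to zero at the cost of all parallelism.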
For the PC relationship, the gap between LOBO and the other two strategies widens, and they outperform LOBO in the presence of fewer failures. This gap is the widest for the STAR query (see panel (a) of the figure). The summary table lists the superior strategy for the PC and PP relationships for each access pattern. Note that for the PC relationship, SIRE performs best for STRING and HYBRID queries in all cases, and that as the number of failures increases, SIRE is generally the best strategy.
Panel (b) of the figure presents the standard deviation in the workload as a function of the number of failures with LOBO (at most one failure per cluster). The standard deviation increases from zero to one failure because the workload of the failed node is distributed only across the nodes within the cluster of the failed node and not across all nodes in the system. The deviation peaks at four failures because the workload of half the nodes (i.e., the nodes of those clusters with a failure) is significantly higher than that of the remaining nodes in the system. With additional failures, more than half the nodes incur the workload associated with each failed node, causing the standard deviation to decline. The deviation with eight failures is lower than that with no failures because, in essence, the database has been declustered across fewer nodes (the average workload of a node is higher).
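The rise and decline of the standard deviation can be reproduced qualitatively with a small model. The sketch below is an illustration under stated assumptions, not the simulator used in this study: every node starts with one unit of workload, there is at most one failure per cluster, and a failed node's workload is split evenly among the survivors of its own cluster. The figure of four nodes per cluster is hypothetical. Because the baseline here is perfectly uniform, the deviation is exactly zero both with no failures and with one failure in every cluster, whereas the measured workload is skewed; only the overall shape should be compared.

```python
from statistics import pstdev


def workload_stddev(num_clusters, nodes_per_cluster, failures):
    """Standard deviation of per-node workload over the surviving nodes.

    Assumes one unit of workload per node before any failure, at most one
    failure per cluster, and even redistribution inside the failed node's cluster.
    """
    if not 0 <= failures <= num_clusters:
        raise ValueError("at most one failure per cluster")
    if nodes_per_cluster < 2:
        raise ValueError("a cluster needs a survivor to absorb the workload")
    loads = []
    for cluster in range(num_clusters):
        if cluster < failures:
            # This cluster lost one node: survivors share its workload.
            survivors = nodes_per_cluster - 1
            loads.extend([1.0 + 1.0 / survivors] * survivors)
        else:
            loads.extend([1.0] * nodes_per_cluster)
    return pstdev(loads)


# Hypothetical configuration: 8 clusters of 4 nodes, 0 to 8 failures.
curve = [workload_stddev(8, 4, f) for f in range(9)]
```

In this model the deviation is largest when roughly half of the surviving nodes carry extra workload, which happens near the middle of the failure range, and it falls again as more clusters are affected.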
Conclusions and Future Research Directions
This work investigates object placement in parallel object-oriented systems. A greedy object partitioning algorithm is presented, which is used for partitioning the database into clusters and assigning objects to nodes within clusters. It provides a basis for research into placing additional copies of objects to enhance system performance and maintain system availability. Three availability strategies are presented which provide alternate viewpoints in determining the placement of backup copies of objects. LOBO examines each object to determine which node it is most related to, while SIRE partitions each backup fragment into highly related subfragments, each of which is assigned to a different node. LORE attempts a compromise between these two paradigms when assigning backup copies of objects. It stores the backup copy of an object on a node it is highly related to, yet maintains relationships between objects within the backup subfragments. By using these availability strategies, the system can maintain operation and ensure data availability when multiple nodes have failed.
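The per-object viewpoint attributed to LOBO above can be sketched as follows. This is only an illustration of "find the node an object is most related to": the relationship weights, the exclusion of the primary's node, the restriction to one cluster, and the tie-breaking rule are all assumptions of this sketch, and the function name is hypothetical.

```python
from collections import defaultdict


def lobo_style_backup_node(obj, primary_node, related, cluster_nodes):
    """Pick the node in the cluster (other than the primary's) to which obj is most related.

    primary_node:  dict object -> node holding its primary copy
    related:       dict object -> {neighbor_object: relationship weight}
    cluster_nodes: node ids of the cluster that holds obj
    """
    # Accumulate how strongly obj is related to the objects stored on each node.
    score = defaultdict(float)
    for neighbor, weight in related.get(obj, {}).items():
        score[primary_node[neighbor]] += weight
    # The backup must survive a failure of the primary's node, so exclude it.
    candidates = [n for n in cluster_nodes if n != primary_node[obj]]
    if not candidates:
        raise ValueError("a backup needs a node other than the primary's")
    # Ties are broken by the lowest node id to keep the choice deterministic.
    return max(sorted(candidates), key=lambda n: score[n])


primary_node = {"x": 0, "y": 1, "z": 2, "w": 1}
related = {"x": {"y": 3.0, "z": 1.0, "w": 2.0}}
chosen = lobo_style_backup_node("x", primary_node, related, cluster_nodes=[0, 1, 2])
```

Here object `x` is related most strongly to objects whose primary copies reside on node 1, so its backup copy goes there. Note that this choice looks at one object at a time and ignores which other backup copies share the node, which is the property the experiments identify as LOBO's weakness under failures.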
Future research for this project extends in several directions. The first concerns the effects of dynamic load balancing on the availability strategies. Currently, the analysis of our algorithms assumes that a query is always directed to the node storing the primary copy of the object initiating the query, and that backup copies of objects are employed subsequently, once the query begins its navigation. With a dynamic load balancing technique, a query might initiate its traversal on a node containing the backup copy of the initiating object if the node storing the primary copy of this object is busy (has a large I/O queue, a busy CPU, etc.). This impacts the performance of each of our availability techniques. The availability strategies have not yet been fully evaluated with respect to minimizing I/Os. A simulator is under development that incorporates these object placement algorithms so their effect on I/Os can be determined. Furthermore, the impact of declustering upon clustering must be measured. Obviously, if two related objects are not assigned to the same node, they cannot be assigned to the same disk page. Declustering affects clustering, and the relationship between these techniques requires further investigation and formalization.
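The dynamic load balancing idea described above, starting a traversal at the backup copy when the primary's node is busy, might look like the following sketch. The queue-length measure, the threshold, and all names are hypothetical; the text leaves open what counts as "busy".

```python
def choose_start_node(obj, primary_node, backup_node, queue_length, busy_threshold):
    """Return the node on which a query rooted at obj should begin its traversal.

    primary_node / backup_node: dict object -> node holding that copy
    queue_length:               dict node -> outstanding requests (assumed load measure)
    busy_threshold:             queue length above which a node counts as busy
    """
    primary = primary_node[obj]
    if queue_length[primary] <= busy_threshold:
        # Default policy: the query starts at the primary copy.
        return primary
    backup = backup_node[obj]
    # Redirect only if the backup's node is actually less loaded.
    return backup if queue_length[backup] < queue_length[primary] else primary


primary_node = {"x": 0}
backup_node = {"x": 1}
start_busy = choose_start_node("x", primary_node, backup_node, {0: 10, 1: 2}, busy_threshold=5)
start_idle = choose_start_node("x", primary_node, backup_node, {0: 3, 1: 2}, busy_threshold=5)
```

With such a policy, queries enter backup subfragments even when no node has failed, so the locality captured within those subfragments would matter during normal operation as well.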
We also plan to examine clustering more closely in the context of parallel object-oriented systems. Clustering involves placing the objects across disk pages in order to minimize disk I/Os. Much work has been done in this area for single-processor object-oriented systems. However, the clustering algorithms developed consider only one copy of each object. When considering multiple copies of each object, several policy decisions must be made. Should primary and backup copies of objects be stored together on the same page or separately? If separately, should the primary and backup pages be interleaved or should they be separated? What are the tradeoffs of these alternatives? These questions must be considered more fully.
The availability algorithms presented manage only one backup copy of each object. The performance of the system might be enhanced further by replicating an object additional times and assigning each replica to a different node in the system (note that if all objects are replicated upon all nodes, then no internode traversals occur at all). To this end, we are developing an algorithm which considers creating multiple copies of each object.
Each of the algorithms described in this work is static. They analyze the database at some point in time and decluster the objects across the system. The placement of objects is then fixed for future system operation. Over time, however, the workload of the system and its objects may change. In evolving systems, a dynamic object placement algorithm may be necessary to stabilize the performance of the system. This motivates the development of a dynamic object placement algorithm which considers even workload distribution. Several issues critical to dynamic object placement must be investigated. For example, if the object identifier (OID) of an object denotes its physical location, moving the object effectively changes its OID. This has an enormous impact on dynamic placement, since every object that maintains the OID of the moved object must have its relationship to that object modified. This issue provides a deterrent to dynamic object placement and must be further investigated. A dynamic object placement algorithm is also under development and has yet to be implemented or evaluated.
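The cost of physical OIDs under dynamic placement can be made concrete with a short sketch. It assumes, hypothetically, that an OID is a `(node, slot)` pair and that references are held in an in-memory dictionary; neither representation comes from this work.

```python
def move_object(store, references, oid, new_node, new_slot):
    """Move an object whose OID is its physical location and repair all references.

    store:      dict physical OID (node, slot) -> object payload
    references: dict OID -> list of OIDs that object points to
    Returns the new OID and the number of references that had to be rewritten.
    """
    new_oid = (new_node, new_slot)
    if new_oid in store:
        raise ValueError("target location is occupied")
    # Relocating the object changes its identity, because the OID is the location.
    store[new_oid] = store.pop(oid)
    references[new_oid] = references.pop(oid, [])
    updated = 0
    # Every object in the database may hold the old OID, so all must be scanned.
    for targets in references.values():
        for i, target in enumerate(targets):
            if target == oid:
                targets[i] = new_oid
                updated += 1
    return new_oid, updated


store = {(0, 1): "part", (0, 2): "assembly", (1, 1): "drawing"}
references = {(0, 1): [], (0, 2): [(0, 1)], (1, 1): [(0, 1), (0, 2)]}
new_oid, updated = move_object(store, references, (0, 1), new_node=1, new_slot=7)
```

Moving a single object here forces a scan of every reference list, and in a shared-nothing system those lists are spread across nodes. A logical OID resolved through a mapping table would avoid the rewrite at the price of an extra lookup per traversal, which is one of the tradeoffs a dynamic placement algorithm would have to weigh.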
Description
Shahram Ghandeharizadeh, David Wilhite. "Placement of objects in parallel object-based systems." Computer Science Technical Reports (Los Angeles, California, USA: University of Southern California. Department of Computer Science) no. 589 (1994).