AN INTEGRATED APPROACH TO SCHEMA VIEWS AND VERSIONS FOR OBJECT DATABASES

by

Kwang June Byeon

A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the Requirements for the Degree
DOCTOR OF PHILOSOPHY
(Computer Science)

December 1993

Copyright 1993 Kwang June Byeon

UMI Number: DP22860
All rights reserved

INFORMATION TO ALL USERS
The quality of this reproduction is dependent upon the quality of the copy submitted. In the unlikely event that the author did not send a complete manuscript and there are missing pages, these will be noted. Also, if material had to be removed, a note will indicate the deletion.

Dissertation Publishing
UMI DP22860
Published by ProQuest LLC (2014). Copyright in the Dissertation held by the Author. Microform Edition © ProQuest LLC. All rights reserved. This work is protected against unauthorized copying under Title 17, United States Code.
ProQuest LLC, 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106-1346

UNIVERSITY OF SOUTHERN CALIFORNIA
THE GRADUATE SCHOOL
UNIVERSITY PARK
LOS ANGELES, CALIFORNIA 90007

This dissertation, written by Kwang June Byeon under the direction of his Dissertation Committee, and approved by all its members, has been presented to and accepted by The Graduate School, in partial fulfillment of requirements for the degree of DOCTOR OF PHILOSOPHY

Dean of Graduate Studies
Date: October 7, 1993

Acknowledgements

I wish to express my deepest appreciation to my advisor, Professor Dennis McLeod, for his continuous guidance and support. I would like to sincerely thank Professor Alice C. Parker for her invaluable advice and encouragement. I am deeply grateful to Professor Shahram Ghandeharizadeh for his help and comments on my dissertation. I am also grateful to my other committee members, Professors George Bekey and Ellis Horowitz.
I would like to thank all of my friends for their friendship. In particular, I am indebted to Hamideh Afsarmanesh for her sincere help and concern, and the database research group members Doug Fang, Joachim Hammer, Jonghyun Kahng, and Antonio Si, for their comments on my ideas. I am grateful to my parents, Jong Hwa and Sung Bo Byeon, with all my heart for their love and concern. Finally, I wish to thank my wife, Joo Eun, for her love, patience, and unending support.

Contents

Acknowledgements
List Of Figures
Abstract
1 Introduction
  1.1 Background and Motivations
  1.2 Research Goals
  1.3 Overview of Approach
  1.4 Organization of Dissertation
2 Related Research
  2.1 Views
    2.1.1 Class Views
    2.1.2 Schema Views
    2.1.3 Discussion
  2.2 Meta-data Versions
    2.2.1 Class Versions
    2.2.2 Schema Versions
    2.2.3 Discussion
  2.3 Between Views and Meta-data Versions
3 Core Object Data Model
4 Virtual Databases
  4.1 Concept of a Virtual Database
    4.1.1 Schema and Instances
    4.1.2 Virtual Database Derivation Graph
  4.2 Virtual Database Manipulation Mechanism
    4.2.1 Creation on a Single Base
    4.2.2 Creation on Multiple Bases
    4.2.3 Schema Change and Deletion
    4.2.4 Instance Manipulation
5 Virtual Database Manipulation Mechanism
  5.1 Creation on a Single Base
    5.1.1 Preparing Phase
    5.1.2 Importing Phase
      5.1.2.1 Class Definition Importing Step
      5.1.2.2 Type-Closure Checking Step
      5.1.2.3 Is-a Relationship Finding Step
      5.1.2.4 Instance Arranging Step
    5.1.3 Pruning Phase
    5.1.4 Transforming Phase
  5.2 Creation on Multiple Bases
    5.2.1 Preparing Phase
    5.2.2 Phases for Primary Base
    5.2.3 Phases for Non-Primary Bases
      5.2.3.1 Importing and Pruning Phases
      5.2.3.2 Integrating Phase
    5.2.4 Transforming Phase
  5.3 Schema Change and Deletion
    5.3.1 Schema Change
    5.3.2 Deletion
  5.4 Instance Manipulation
    5.4.1 Creation Operation
    5.4.2 Deletion and Move Operations
    5.4.3 Change, Retrieval, and Execution Operations
6 Applications of Virtual Databases
  6.1 Schema Views
    6.1.1 Concept of a Schema View
    6.1.2 Creation
    6.1.3 Schema Change and Deletion
    6.1.4 Instance Manipulation
    6.1.5 Comparisons with Related Work
  6.2 Schema Versions
    6.2.1 Concept of a Schema Version
    6.2.2 Creation
    6.2.3 Schema Change and Deletion
    6.2.4 Instance Manipulation
    6.2.5 Comparisons with Related Work
  6.3 Special Virtual Databases
    6.3.1 Concept of a Special Virtual Database
    6.3.2 Manipulation of Special Virtual Databases
  6.4 Support of User Perspectives
    6.4.1 Virtual Database Application Cases
    6.4.2 Comparisons with Previous Work
7 Prototype Implementation
  7.1 Omega
  7.2 Virtual Database Mechanism
    7.2.1 Representation of Virtual Databases
    7.2.2 Implementation of Virtual Database Operations
8 Conclusion
  8.1 Summary
  8.2 Contributions
  8.3 Directions for Future Research
    8.3.1 Single Database Environment
    8.3.2 Multiple Database Environment
Appendix A Virtual Database Manipulation Operations
  A.1 Virtual Database Operations
  A.2 Class Operations
  A.3 Attribute and Method Operations
  A.4 Is-a Operations
  A.5 Instance Operations
Reference List

List Of Figures

3.1 An Example Class Definition based on CODM
3.2 An Example Database DB_1
4.1 An Example Virtual Database VDB_1
4.2 Multiple Class Membership among Virtual Databases
4.3 Virtual Database Derivation Graph for DB_1
4.4 Creation of Virtual Database VDB_1
4.5 Creation of Virtual Database VDB_3
4.6 Virtual Databases VDB_2 and VDB_3
5.1 VDB_1 after Importing Phase
5.2 VDB_1 after Class Definition Importing Step
5.3 Type-Closure Checking Step
5.4 VDB_1 after Type-Closure Checking Step
5.5 Is-a Relationship Finding Step
5.6 Procedures for Base
5.7 Procedures for Virtual Database
5.8 During Is-a Relationship Finding Step for VDB_1
5.9 VDB_1 after Is-a Relationship Finding Step
5.10 VDB_1 after Instance Arranging Step
5.11 VDB_1 after Pruning Phase
5.12 Pruning Phase
5.13 Operations for Transforming Phase
5.14 VDB_3 after Importing and Pruning Phases for VDB_1
5.15 VDB_3 after Importing and Pruning Phases for VDB_2
5.16 Integrating Phase
5.17 VDB_3 after Integrating Phase
5.18 Instance Manipulation Operations
6.1 An Example Virtual Database VDB_4 as a Schema View
6.2 An Example Virtual Database VDB_2 as a Schema Version
6.3 Possible Cases of Virtual Database Application
6.4 Example Cases of Virtual Database Application
6.5 Virtual Database Derivation Graph for DB_1
7.1 Representation of Virtual Databases in DB_1
7.2 An Example Schema Definition Script File VDB_1 for Omega
8.1 Virtual Databases in a Multiple Database Environment

Abstract

There have been a number of approaches to views and meta-data versioning for object databases. The common objective of these approaches is to support users who may require different and changing perspectives of an object database. However, both views and versions are used to represent the user perspectives only in a limited way. In addition, due to the essential similarities between the notions of views and versions, it is possible to unify these two notions, but the resulting unified notion still has limitations.

This dissertation introduces the concept of a virtual database that not only unifies the notions of views and versions but also extends them. A mechanism for manipulating virtual databases as well as their schemas and instances is also presented. The application of the virtual database concept to supporting views, versions, and new notions that are neither views nor versions, is described, and how this application supports user perspectives of a database is examined. The virtual database concept and its manipulation mechanism have been implemented using an object database system.

Chapter 1 Introduction

1.1 Background and Motivations

Databases are presently used in a variety of applications.
In these applications, a database is typically shared by several users who may require different and changing perspectives of the database. Work on views and versioning of meta-data in the object database context [1, 4, 10, 22, 28, 30, 41, 42, 43, 47, 48, 49] has explored techniques to support these user perspectives in an object database system. (Footnote 1: Similar work has been done in the relational database context [16, 40]. However, in this dissertation, we focus on the object database context.)

There have been several approaches to views [1, 4, 10, 22, 28, 41, 42, 43, 48, 49] and two major approaches to versions of meta-data [30, 47]. Depending on the granularity of views that they introduce, the view approaches can be divided into two groups: class view and schema view. Similarly, the meta-data version approaches can be divided into two groups: class version and schema version. A class view (and version) consists of a single class, while a schema view (and version) consists of multiple interrelated classes (i.e., a schema). Schema views (and versions) are conceptually more general than class views (and versions), in the sense that a user of an object database does not necessarily want to see a class view (and version) alone; he/she often wants to see a schema where the class view (and version) is interrelated with other classes and/or class views (and versions) of the database. Hence, in this dissertation, we focus on the schema view and version approaches.

We can observe that schema views and versions have many similarities. In particular, suppose that a schema view SVIEW_i has been created on an object database DB_i, and a schema version SVERSION_i of DB_i has also been created. If we compare the schema of SVIEW_i and that of SVERSION_i with the schema of DB_i respectively, it is clear that both schemas are created through some schema changes from the schema of DB_i, and thus have the same structural form as the schema of DB_i.
In addition, users can manipulate (i.e., create, delete, change, and retrieve) the instances of SVIEW_i and those of SVERSION_i, as they do for the instances of a typical database. As a result, SVIEW_i (and SVERSION_i) is regarded by its users as their own tailored database that is defined on DB_i. These similarities motivated us to develop a concept that unifies schema views and versions.

In spite of these similarities, there has been to date no approach that unifies schema views and versions, or considers both concepts in a single context. As a result, the previous schema view and version approaches have ignored many cases that need to be supported for real-world applications, which indicates that these approaches support user perspectives in a limited way. The following are some of those ignored cases. Consider the above example of SVIEW_i, SVERSION_i, and DB_i again. First, some users may be interested in only a subset of SVERSION_i. Second, some may be interested in both a part of SVIEW_i and a part of SVERSION_i. Third, some may want to change the schema of SVIEW_i, while others may want to continue to use the current SVIEW_i. The failure of the previous approaches to adequately address those cases is another motivation for considering schema views and versions in one context and developing a unifying concept.

Even though the above cases have not been explicitly considered by the previous approaches, they can be supported to some extent by using schema views and versions. The first case can be supported by creating a schema view on SVERSION_i; the second case, by creating a schema view on SVIEW_i and SVERSION_i; the third case, by creating a schema version of SVIEW_i. However, the advantage of using the unifying concept here is clear. The above cases can be supported by a single structure and its manipulation mechanism rather than by two different structures and their corresponding manipulation mechanisms.
Based on these motivations, we first tried to develop a concept that unifies the schema view and version concepts. However, we found that simply unifying both concepts and using a unified concept as a schema view or version would still support user perspectives in a limited way. Consider the example of SVIEW_i, SVERSION_i, and DB_i one more time. Suppose that a user wants to see a part of SVIEW_i and a part of SVERSION_i, with new meta-data (e.g., classes) that are not derived from SVIEW_i or SVERSION_i. What the user wants here is neither a schema view (due to new meta-data) nor a schema version (due to derivation from multiple bases); hence, it cannot be represented by a single unified concept which can be used only as a schema view or version.

One may claim that the tailored database can be represented by a schema view and a schema version (or by two unified concepts); for example, by creating a schema version SVERSION_i' of SVERSION_i and then creating a schema view on SVIEW_i and SVERSION_i'. However, this solution does not create exactly what the user wants, but creates a schema view close to it; in addition, the solution requires creation of an extra schema version. This limitation of the unified concept motivated us to develop a concept that not only unifies the schema view and version concepts but also extends them.

1.2 Research Goals

The primary goal of our research presented in this dissertation is to develop an approach that supports different and changing user perspectives of an object database in a comprehensive way. In order to achieve this goal, the following tasks need to be done.

• First, develop an approach that supports a new concept that not only unifies the schema view and version concepts but also extends them, and also provides a mechanism for manipulating this new concept.

• Second, examine how this approach supports user perspectives of a database.
  In particular, examine how this approach can be used as a schema view, schema version, or unified approach, and show that it resolves the limitations of and also extends the schema view, schema version, and unified approaches.

• Third, implement the new concept and its manipulation mechanism using an object database system.

1.3 Overview of Approach

In our approach, we introduce a new concept called a virtual database, which is based on a simple object data model. A virtual database is a special database whose schema and instances are partially or totally derived from those of one or more bases. Each base of a virtual database can be a (regular) database or another virtual database, and must be unique. A virtual database thus always contains information (i.e., meta-data and data/instances) that is derived from its base(s), and may contain additional information that is not derived but newly created in it. It is also actually created and exists in physical form.

Any number of virtual databases can be derived directly or indirectly from a single underlying database. Note that a virtual database is directly derived from its base(s). When a virtual database is derived from multiple bases, at most one base can be the underlying database. A database and the virtual databases derived from it are thus organized into the (virtual database) derivation graph for the database. Further, each virtual database is regarded by its users as their own tailored database that is defined on the underlying database.

The virtual database approach also provides a mechanism for manipulating virtual databases as well as their schemas and instances. This mechanism supports operations for

• creating a virtual database on a single base or multiple bases,
• deleting a virtual database,
• changing the schema of a virtual database, and
• manipulating (i.e., creating, deleting, changing, and retrieving) instances.
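The structure described above can be illustrated with a minimal sketch. It is not the mechanism of this dissertation (which is detailed in later chapters); the class names `Database` and `VirtualDatabase`, the `imports` argument, and the choice to copy imported instances are all assumptions made only for illustration. The sketch checks the two stated constraints (bases are unique, and at most one base is a regular database) and records the derivation graph.

```python
class Database:
    """A regular database (hypothetical, simplified representation)."""

    def __init__(self, name):
        self.name = name
        self.schema = {}      # class name -> set of attribute names
        self.instances = {}   # class name -> set of object identifiers
        self.derived = []     # virtual databases derived directly from this one

    def add_class(self, cls, attributes):
        self.schema[cls] = set(attributes)
        self.instances.setdefault(cls, set())

    def create_instance(self, cls, oid):
        self.instances[cls].add(oid)


class VirtualDatabase(Database):
    """A database whose schema and instances are partly or wholly derived from bases."""

    def __init__(self, name, bases, imports):
        super().__init__(name)
        if not bases:
            raise ValueError("a virtual database needs at least one base")
        if len(set(map(id, bases))) != len(bases):
            raise ValueError("each base must be unique")
        regular = [b for b in bases if not isinstance(b, VirtualDatabase)]
        if len(regular) > 1:
            raise ValueError("at most one base can be the underlying database")
        self.bases = list(bases)
        for base in bases:
            base.derived.append(self)
            # Assumption: imported classes and instances are copied, not shared.
            for cls in imports.get(base.name, []):
                self.schema[cls] = set(base.schema[cls])
                self.instances[cls] = set(base.instances[cls])


def derivation_edges(db):
    """Collect (base, derived) name pairs reachable from db: its derivation graph."""
    edges, stack, seen = set(), [db], set()
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        for child in node.derived:
            edges.add((node.name, child.name))
            stack.append(child)
    return edges


db1 = Database("DB1")
db1.add_class("Person", ["name", "age"])
db1.add_class("Course", ["title"])
db1.create_instance("Person", "p1")

vdb1 = VirtualDatabase("VDB1", [db1], {"DB1": ["Person"]})
vdb2 = VirtualDatabase("VDB2", [db1], {"DB1": ["Course"]})
vdb3 = VirtualDatabase("VDB3", [vdb1, vdb2],
                       {"VDB1": ["Person"], "VDB2": ["Course"]})
vdb3.add_class("Note", ["text"])  # newly created in VDB3, not derived from any base
```

In this sketch `vdb3` holds derived classes from two bases together with a class of its own, which is the kind of tailored database that the text argues is neither a schema view nor a schema version.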
With these operations, users can manipulate a virtual database as they would a (regular) database. We do not provide a specific query language itself; instead, we provide the manipulation operations that can be integrated into a query language.

The virtual database approach can be used as a schema view or version approach, and resolves the limitations of the previous schema view and version approaches, respectively. Hence, it can be used as an approach that unifies the schema view and version approaches. However, it is more general than the unified approach. This indicates that the virtual database approach not only unifies the schema view and version approaches, but also extends them. Hence, the virtual database approach supports different and changing user perspectives of a database more comprehensively than the schema view, schema version, and unified approaches.

Finally, an experimental prototype of our virtual database approach has been implemented using the Omega database system [19] developed at USC. In this prototype, virtual databases that are defined on a common underlying database are created and actually stored in the underlying database; the database language of Omega has been extended to include the virtual database manipulation operations.

1.4 Organization of Dissertation

The remainder of this dissertation is organized as follows. Chapter 2 reviews the previous work on views and meta-data versions in the object database context. Chapter 3 presents a simple object data model on which our approach is based. Chapter 4 introduces the concept of a virtual database, and also briefly describes the characteristics of a mechanism for manipulating virtual databases. Chapter 5 presents the details of the virtual database manipulation mechanism. Chapter 6 examines how the virtual database approach supports user perspectives of a database, and compares this approach with others.
Chapter 7 describes the issues related to our experimental prototype implementation. Finally, Chapter 8 summarizes the results and contributions of this research and explains future work.

Chapter 2 Related Research

This chapter reviews the previous work on views and meta-data versions in the object database context.

2.1 Views

There have been several approaches to views for object databases [1, 4, 10, 22, 28, 41, 42, 43, 48, 49]. Depending on the granularity of views that they introduce, these approaches can be divided into two groups: class view and schema view. In this section, we describe the features of each group and also compare one group with the other.

2.1.1 Class Views

There are many similarities among the class view approaches [10, 43, 49]. In these approaches, a view is a single class and is defined as a query in a query language. The operations used for constructing a query are similar to the relational operations (e.g., select, project, join) [16, 31, 35]. A view is treated just like a regular class; hence, it can be used as an operand of a query defining another view. In addition to these similarities, each approach has its distinctive features.

In [10], a view can have additional attributes and methods that are not derived from its base classes (i.e., the classes from which the view is derived). Also, a view is not created in the class hierarchy of its underlying database; instead, it is created in a view derivation hierarchy which is orthogonal to the class hierarchy. When a view is created, its instances are also created with new object identifiers. In addition, the correspondences between these instances and their base class instances are stored, which may help views to be updated without ambiguities. However, the view update problem is hardly considered in this paper.
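To make the idea of stored correspondences concrete, the following is a minimal sketch, not the design of [10]. The function names, the dictionary representation of instances, and the identifier format are assumptions for illustration only. View instances receive fresh identifiers, and a correspondence map from view identifiers to base identifiers lets an update on the view be routed to exactly one base instance.

```python
import itertools


def materialize_view(view_name, base_instances, predicate):
    """Create view instances with new identifiers for the base instances
    that satisfy predicate; also return the view-to-base correspondence."""
    ids = itertools.count(1)
    view_instances, correspondence = {}, {}
    for base_oid, values in base_instances.items():
        if predicate(values):
            view_oid = f"{view_name}#{next(ids)}"
            view_instances[view_oid] = dict(values)   # copy of the base values
            correspondence[view_oid] = base_oid
    return view_instances, correspondence


def update_through_view(view_instances, correspondence, base_instances,
                        view_oid, attribute, value):
    """Apply an update to a view instance and to its corresponding base instance."""
    view_instances[view_oid][attribute] = value
    base_instances[correspondence[view_oid]][attribute] = value


employees = {
    "e1": {"name": "Ann", "salary": 50},
    "e2": {"name": "Bob", "salary": 90},
}
well_paid, links = materialize_view("WellPaid", employees,
                                    lambda v: v["salary"] > 60)
update_through_view(well_paid, links, employees, "WellPaid#1", "salary", 95)
```

Because each view identifier maps to a single base identifier, the target of the update is unambiguous in this simple select-style case; views built by join-like operations would need a richer correspondence.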
In [43, 49], a view is created in the class hierarchy of its underlying database, where it is placed as a subclass or superclass of its base class(es). The instances of a view are derived from those of its base class(es), which indicates that some instances are shared by a view and its base class(es). When a view is created, however, this instance derivation is not performed but is represented by a query defining the view. After the view is created, the derivation is performed whenever needed (e.g., when updating the view). [43] focuses on how this instance sharing feature allows views to be updated without causing ambiguities. [49] briefly discusses the notion of subschemas as well as this view mechanism, in order to support virtual transformation of a database. Here, a subschema is a subset of a database, which consists of views and regular classes interrelated through attributes, methods, and is-a relationships.

2.1.2 Schema Views

In the schema view approaches [1, 4, 22, 28, 41, 42, 48], a view is a schema that consists of multiple interrelated classes. The distinctive features of each approach are as follows.

In [1], a view is a schema that consists of multiple classes interrelated through attributes and is-a relationships. Note that attributes and methods are not distinguished in this approach; instead, two kinds of attributes (i.e., attributes whose values are either stored or computed) are used. A view is defined as a query, and can be derived from other views. In a query, a view is actually created by importing classes from its base(s), deriving virtual classes and attributes, and hiding attributes; here, deriving and hiding are optional, and a base of a view is a database or another view defined on the database. This creation also organizes imported and virtual classes into a class hierarchy based on a simple strategy.
Imported classes share the instances of their corresponding classes in the base(s); the instances of virtual classes are usually derived from other imported and/or virtual classes. However, instances are newly created for those virtual classes whose instances cannot be derived from other classes (e.g., classes derived by an operation similar to the relational join operation).

In [22], a view is a set of classes interrelated through functions. A view is defined by a set of queries that derive its classes from the base(s), and can be derived from other views. The operations used for constructing a query are similar to the relational operations (e.g., select, project, join). In a query, the classes of a view are derived via these operations from their corresponding classes of its base(s) (i.e., base classes). New instances are also created for the classes of a view, and the correspondences between these instances and their base class instances are stored. Further, instance update functions of a view are implemented by the user/designer in terms of those of its base(s), so that updates on a view can be mapped into those on its base(s). Finally, a view is type-closed, which means that a view includes all classes that are recursively referenced by the functions of its classes.

In [41, 42], a view is a schema that consists of classes interrelated through attributes, methods, and is-a relationships. A view is defined as a query, and actually created by deriving new virtual classes in its global schema, and adding these (and other) classes of the global schema to the view. This indicates that views are not derived from the underlying database but from the global schema. The global schema is the underlying database augmented with virtual classes; hence, the original underlying database becomes a view created on this schema.
A view can be derived from other views in the sense that all classes of the global schema, including virtual classes derived by individual views, can be accessed by any view. Further, a view must be type-closed and is-a valid. It is made type-closed by finding the classes in the global schema that are recursively referenced by the classes of a view but are not yet added by the user, and adding such classes to the view. An algorithm for doing this is presented in [42] and formally verified. In addition, a view is made is-a valid by deriving is-a relationships among its classes from the global schema. An algorithm for doing this is presented in [41] and formally verified.

In [48], a view is a schema that consists of attributes, methods, and is-a relationships. A view is created by creating virtual classes and finding is-a relationships among these classes. However, is-a relationships are found by a user, and the relationships between views and their underlying database are not clearly described.

In [28], a model of queries for object databases is presented, and the notion of a view is not explicitly described. A query graph can be considered a view, which consists of multiple classes interrelated through attributes, methods, and is-a relationships. A view can be used as an operand of another query. The operations used for constructing a query are similar to the relational operations (e.g., select, project, join).

In [4], a query language for an object database is presented, and the notion of a view is briefly considered. A view is a schema consisting of classes interrelated through attributes and is-a relationships. In this approach, a query consists of a context clause and an operation clause. The context clause defines a subdatabase on which operations specified in the operation clause are to be executed; it is represented similarly to a query in SQL [16, 31, 35], and its result can be saved as a view.
In addition, the result of the operation clause (i.e., the result of the query) is a table.

2.1.3 Discussion

It appears that schema views are more general than class views. Consider the following example. Suppose that a user is interested in the class hierarchy rooted at a class C_i, but does not want to see an attribute of C_i. This hierarchy can be easily defined as a schema view. However, it is neither easy nor natural to represent the hierarchy with class views; a class view may need to be created for each class in the hierarchy. In order to alleviate this limitation, some class view approaches [43, 49] provide additional structures (e.g., subschemas) that consist of multiple class views and/or base classes.

Even though there is no widely accepted definition of a view in the object database context [29], we believe that many important features that are required for a desirable view approach have been recognized and proposed by the previous approaches. However, hardly any approach incorporates these features in a single framework.

2.2 Meta-data Versions

Users may want to change the schema of a database. This issue of changing a database schema, often called schema evolution, has been extensively investigated in the object database context [8, 33, 38, 39, 40]. However, the major problem of schema evolution is that the current schema is replaced with a new one even though it may be needed by other users. There have been two major approaches for solving this problem [30, 47]. They both propose the versioning of meta-data, but the granularity of versioning differs between them. [47] proposes the versioning of a single class (i.e., class version), while [30] proposes the versioning of a database schema (i.e., schema version).
2.2.1 Class Versions

In [47], a version set contains a class and its versions, and is represented by a version set interface, which is a disjunction of the definitions of the class and its versions. In a version set, an instance is associated with the class version in which it is created. Also, error handlers are added by a user to each class version in order to handle the functions that are defined in the version set interface but not in the definition of the class version. Due to the version set interface and error handlers, all instances of a class and its versions can be used as if they are the instances of a single class, and no change is needed to application programs that have been using old class versions.

In this approach, the versioning of classes is not used to support queries that deal with historical information for instances. An example query can be “for the instance i1 of the class version C1, show the values of its attributes under the class version C2”, where C2 was derived from C1 by deleting one attribute. The versioning is rather used to enable the application programs (that do or do not want to change the definitions of classes) to continue to share and access their common database after changes.

2.2.2 Schema Versions

In [30], a schema version is a database that is derived from a (regular) database or a previously defined schema version, through some schema changes. Any number of schema versions can be derived from a database or another schema version; hence, an underlying database and its schema versions are organized into a hierarchy. A schema version is derived from a database or another schema version, by first copying (the schema and instances of) the latter into the former and then making schema changes to the former. In addition to schema versioning, this approach uses instance versioning [9, 13, 25, 26, 44].
Further, it allows queries for handling historical information to be posed against a database and its schema versions.

2.2.3 Discussion

The limitations of the class version approach of [47] are as follows. First, it appears that class versions are more limited than schema versions. Consider the following example. Suppose that a user wants to add an attribute to the class C_i that is the root of a class hierarchy. Since the added attribute is inherited into the subclasses of C_i, a class version needs to be created for each class in the class hierarchy (rooted at C_i), and the user needs to consider these several class versions. However, in [30], only a single schema version needs to be created. Second, since a user (or program) is usually interested in a specific version of a class rather than its multiple versions, a “virtual” schema version, which contains only a single version for each class, is needed in this approach, as discussed in [30]. Third, users must program error handlers in this approach, which may not be an easy task.

Even though schema versions are more general than class versions, the schema version approach of [30] has the following limitations. First, due to the adoption of instance versioning, different versions of an instance may exist in different schema versions; therefore, an instance that is deleted in one schema version SVERSION_i may still exist in another schema version SVERSION_j. This may be somewhat confusing to the user who deletes the instance from SVERSION_i and then accesses SVERSION_j. Second, even instance update may cause the creation of a schema version. If other schema versions are derived from the schema version SVERSION_i, instance update is not allowed on SVERSION_i. Instead, a new schema version is derived from SVERSION_i by copying the entire SVERSION_i, and the instance update is made on this new version.
This indicates that too many schema versions may be created in the worst case.

2.3 Between Views and Meta-data Versions

Relationships between views and meta-data versions have been identified and the possibility of applying views to schema evolution has been considered [10, 30, 49]. [10] discusses how class views can be used to support some of the schema evolution cases found in [8, 37]. [49] presents a methodology for supporting schema evolution by using class views and subschemas. [30] discusses the possibility of using schema views to support schema versions. To the best of our knowledge, however, there has been no approach that unifies schema views and versions, or attempts to consider both concepts in a single context.

Chapter 3
Core Object Data Model

We use a simple object data model called the Core Object Data Model (CODM). CODM provides the basic modeling constructs and features found in most object data models [2, 5, 7, 32, 36]. In this chapter, we describe those constructs and features of CODM that are relevant to our subsequent discussion.

• Objects and instances
In this model, every real-world entity is modeled as an object. Objects are divided into two groups: simple and abstract objects. A simple object is a string of characters, an integer, or a real number, and is represented by itself. It also belongs to one of the system-defined simple classes String, Integer, and Real. On the other hand, an abstract object is assigned a system-defined unique identifier called an object identifier (OID). In addition to its OID, an abstract object can be represented by its associated attribute(s) and/or method(s). Abstract objects that share the same attributes and methods belong to the same abstract class. Further, regardless of whether they are simple or abstract, objects that belong to a class are termed the instances of the class.

• Classes
A class consists of its definition and instances.
The definition is represented by superclasses, subclasses, and attributes and methods that are applicable to its instances. An example class definition is shown in Figure 3.1.

    class RA
      superclass Graduate, Employee
      subclass
      ssno: Integer, name: String, major: Department, empno: Integer, advisor: Faculty,
      method find_advisor() return Integer body FindAdvisor();

Figure 3.1: An Example Class Definition based on CODM

• Attributes and methods
An attribute is specified by its name and domain/value class (i.e., a collection of possible values), and a method is specified by its name, input arguments, output argument, and body.^1 In CODM, a method of a class can reference only the attributes and methods of the same class including itself, as well as system-defined classes. Also, even though CODM does not support derived attributes [11, 21], methods can support them indirectly. For example, suppose that the class RA has the attribute yearly_salary, and that the user wants to add to RA the derived attribute monthly_salary whose derivation predicate is yearly_salary / 12. This addition is not allowed in CODM; however, the user can create a method that computes the derivation predicate and returns the result.

• Is-a relationships
Classes are organized into a directed acyclic graph (DAG) via is-a relationships representing superclass-subclass connections; this DAG is termed a class lattice.

^1 Instead of the actual body of a method, the file containing it is specified.

Figure 3.2: An Example Database DB1
A class inherits attributes and methods from its superclasses, and instances of a class become instances of its superclasses. We call the instances directly created for a class the direct instances of the class, and other instances (i.e., instances of its subclasses) are indirect instances. The root of the class lattice is a system-defined class called Object. Object has the system-defined classes String, Integer, and Real as its subclasses. Figure 3.2 represents an example database based upon this model. The classes String, Integer, and Real are not included for simplicity, and only direct instances of a class are associated with the class. Also, the class RA has the four inherited attributes ssno, name, major, and empno; it also has the (directly) defined attribute advisor and the defined method find_advisor.

• Property inheritance and naming conflicts
Inheritance of properties (i.e., attributes and/or methods) along is-a relationships may cause two kinds of naming conflicts. One is the conflict between a class C_i and its superclass S_i. This conflict occurs if a same-named property p_i is defined for both C_i and S_i. Note that these two classes may not necessarily have the same definition of p_i. The conflict is resolved by making the p_i of C_i override the inherited p_i of S_i. The other is the conflict between superclasses of a class, which is the so-called multiple inheritance problem [7]. This conflict occurs if a same-named property p_i is defined for more than one superclass of a class. In CODM, it is assumed that these superclasses must have the same definition of p_i. Hence, the conflict can be resolved by making p_i of any superclass override that of the others. Further, if the superclasses have different definitions of p_i, these definitions must be changed into the same definition, or p_i must be renamed appropriately in those superclasses.
• Multiple class membership
CODM allows an object to be an instance of more than one class at the same time. We have already considered a special case in which an object is an instance of a class and its superclass(es). In addition to this special case, an object can be an instance of multiple classes that are not in superclass-subclass relationships. In this case, another kind of naming conflict may happen, since the classes to which an object belongs may have properties with the same name. We elaborate on this case, including the strategy for resolving the naming conflict, in Sections 4.1.1 and 5.4.

Chapter 4
Virtual Databases

In this chapter, we introduce the concept of a virtual database, including the characteristics of the schema and instances of a virtual database and the notion of a derivation graph. We also present a mechanism for manipulating virtual databases as well as their schemas and instances. This mechanism is briefly described in this chapter, and its details are discussed in the next chapter.

4.1 Concept of a Virtual Database

A virtual database is a special database whose schema and instances are partially or totally derived from those of one or more bases. Each base of a virtual database can be a (regular) database or another virtual database, and must be unique. A virtual database thus always contains information that is derived from its base(s), and may contain additional information that is not derived but newly created in it. It also exists physically, in contrast to a view for a relational database, which exists in the form of a query [16].

A virtual database is created (directly) on its base(s).^1 In addition, in a single database environment,^2 virtual databases are created directly or indirectly on their common underlying database.
Here, each virtual database is regarded by its users as their own (tailored) database defined on the underlying database; it represents the perspective of the underlying database that the users require. Further, users can manipulate a virtual database as they manipulate a (regular) database.

Figure 4.1 illustrates an example virtual database VDB1. Here, VDB1 is created on the database DB1 shown in Figure 3.2. Some classes of VDB1 (e.g., Person, RA) are found in DB1, while others (e.g., Junior_Faculty, Project) are not. In what follows, we describe the distinctive characteristics of virtual databases in detail.

^1 This means that a virtual database is created based on the information derived (directly) from its base(s).
^2 In Section 8.3.2, we will consider the application of virtual databases to information sharing in a multiple database environment.

Figure 4.1: An Example Virtual Database VDB1

4.1.1 Schema and Instances

A virtual database consists of a schema and instances. Its schema may have three kinds of classes: imported, derived, and local. Also, a virtual database may include two kinds of instances: imported and local.

• An imported class is a class whose definition is copied from one or more bases of the virtual database, with or without schema changes including renaming. Each of these bases thus has a class corresponding to the imported class, which is termed a base class of the imported class. This also indicates that an imported class may have more than one base class. In this case, the definitions of the base classes are copied and merged into a single class definition.
In addition, due to possible schema changes, an imported class and its base class(es) may not have the same class definition.

An imported class shares the instances of its base class(es). In fact, the instances of the base class(es) are always the instances of the imported class.^3 These instances are called imported instances in the virtual database, while they may be called differently (i.e., imported or local instances) in the base(s) of the virtual database (see Section 4.1.2). We can thus observe that an imported instance in a virtual database must also be an instance in at least one base of the virtual database. However, this multiple class membership may cause the following naming conflict: an imported instance may have same-named attributes and/or methods that are defined in the imported class and in some of its base class(es), respectively. We resolve this conflict by making the attributes and methods of the base(s) override those of the virtual database.

Further, an imported class may have instances that are not instances of the base(s) but are newly created in the virtual database under certain conditions.^4 These instances are called local instances in the virtual database. Therefore, the instances of an imported class always include all instances of its base class(es) (i.e., imported instances), and may include additional instances (i.e., local instances).

• A derived class is a class that is derived from other classes (i.e., other imported, derived, and/or local classes). Its instances are computed from these other classes, based on the derivation predicate; thus, they may include imported and/or local instances. Since derived classes are not supported by CODM, they are conceptually stored as regular classes in a virtual database, and their derivation predicates are not associated with these stored classes but with the definition of a virtual database.
• A local class is a class that is newly created in the virtual database; it is neither imported nor derived. A local class has only local instances.

^3 CODM allows an object to become an instance of more than one class.
^4 Details of instance manipulation including these conditions are in Section 5.4.

Figure 4.2: Multiple Class Membership among Virtual Databases

In Figure 4.1, Person, RA, Faculty, and Department are imported classes, Junior_Faculty and Senior_Faculty are derived classes, and Project is a local class. The class definitions of some imported classes are different from those of their base classes; for example, RA of VDB1 vs. its base class RA of DB1. Each imported class has imported instances, which are represented by “shaded” small ovals (e.g., r1 of RA). According to the above definition of imported instances, an imported instance in the virtual database VDB1 is also an instance in its base DB1; for example, r1 (of RA) in VDB1 is also an instance (of RA) in DB1. However, in Figures 3.2 and 4.1, the same object r1 is represented by two instances with the same OID (i.e., r1 in VDB1 and r1 in DB1). This representation is assumed to be equivalent to the one shown in Figure 4.2. In this dissertation, the former representation is employed for simplicity. The attributes and methods major, advisor, and find_advisor of the imported instance r1 in VDB1 are overridden by the same-named attributes and methods of r1 in DB1.
However, the attribute work_on of r1 in VDB1 is not overridden, since this attribute is not defined for r1 in DB1 (see the transforming phase in Sections 4.2.1 and 5.1.4). The classes Junior_Faculty and Senior_Faculty are derived from the class Faculty, via the derivation predicates has_tenure = “no” and has_tenure = “yes”, respectively. The imported instances of Faculty are checked using the derivation predicates and appropriately moved into Junior_Faculty and Senior_Faculty. Further, Project is a local class and has only local instances, which are represented by “unshaded” small ovals (e.g., j1).

A virtual database always contains imported classes and instances, but does not necessarily contain other classes and instances. Imported instances always belong to imported classes, and may belong to derived classes that are derived from imported classes. Local instances always belong to local classes, and may belong to some imported or derived classes under certain conditions. In addition, imported classes and instances, as well as derived classes that are derived from only imported classes, indicate information derived from the base(s) of a virtual database. On the other hand, other classes and instances (i.e., local classes and instances, derived classes that are derived from local classes), as well as properties that are newly added to imported or derived classes (e.g., work_on of RA in Figure 4.1), represent additional information that is not derived from the base(s) of a virtual database but newly created in the virtual database.

As with a database based on CODM, a virtual database always contains the system-defined classes Object, String, Integer, and Real by default. They are considered special classes that are not imported, derived, or local classes. The classes of a virtual database are organized into a class lattice rooted at the class Object.
Object has String, Integer, and Real as its direct subclasses, and user-defined classes as its direct and indirect subclasses. The is-a relationships among the imported classes of a virtual database must be derivable from the is-a relationships in the base(s) of the virtual database. This means that if a virtual database has an is-a relationship between any two imported classes that are imported from the same base, this base must have a direct or indirect is-a relationship between the base classes of the imported classes.

Finally, a virtual database must always contain all classes that are recursively referenced by the attributes and methods of its classes. This indicates that the virtual database must be type-closed [22, 41, 48]. In Figure 4.1, suppose that the method find_advisor references only the attribute advisor in the class RA. It is then clear that the virtual database VDB1 is type-closed.

Figure 4.3: Virtual Database Derivation Graph for DB1

4.1.2 Virtual Database Derivation Graph

As mentioned earlier, virtual databases are created directly or indirectly on an underlying database, and each virtual database is created (directly) on one or more bases. In addition, since every base must be unique, at most one base of a virtual database can be the underlying database. As a result, a database and its virtual databases are organized into a directed acyclic graph (DAG) rooted at the database. We call this DAG the (virtual database) derivation graph for the database. Figure 4.3 shows the derivation graph for the database DB1. Here, for example, VDB1 is created on DB1; VDB3 is created on VDB1 and VDB2.

The virtual database derivation graph for a database has the following characteristics.

• Since the underlying database is not a virtual database, it does not contain any information derived from others. It thus contains only classes and instances that are created in it.
These classes and instances are termed local classes and instances of the database, respectively.

• The classes of a virtual database or the underlying database are imported directly or indirectly into other virtual databases. Hence, multiple classes, each of which is in a different virtual database or the underlying database, may represent the same class; for example, in Figures 3.2 and 4.1, the classes RA of DB1 and RA of VDB1. Among those classes, different attributes (and methods) cannot have the same name. Note that the same attributes (and methods) must have the same definition, but may have different names due to renaming.

• A real-world entity is represented by only one instance. However, this instance may belong to several virtual databases and the underlying database. The instance is called a local instance in the virtual database or the underlying database where it is first created, while it is called an imported instance in other virtual databases.

• Suppose that an imported class of VDB_i has one of its base classes in VDB_j, and this base class is a derived class. If all classes involved in the derivation predicate for the derived class of VDB_j are imported into VDB_i, the imported class is considered a derived class in VDB_i; otherwise, it is an imported class.

• The derivation graph for the database may be changed by the operations of manipulating virtual databases and their schemas, but must always remain a DAG rooted at the database.

4.2 Virtual Database Manipulation Mechanism

In this section, we briefly describe the characteristics of the virtual database manipulation mechanism. With this mechanism, users can create and delete a virtual database, change its schema, and manipulate its instances, as they do for a regular database.

4.2.1 Creation on a Single Base

A virtual database can be created on either a single base or multiple bases.
We consider the former case in this section, and the latter in Section 4.2.2. The process of creating a virtual database on a single base can be divided into the following phases: preparing, importing, pruning, and transforming. The preparing and importing phases are mandatory, and the rest are optional. In addition, each phase may consist of one or more operations, each of which is dynamically executed by the user.

In the preparing phase, the empty virtual database is created on its base; in the importing phase, classes are imported from the base into the empty virtual database, and the is-a relationships among the classes are derived from the base; in the pruning phase, the properties and classes that were unnecessarily imported during the importing phase are deleted; in the transforming phase, the schema of the virtual database is changed into the final desired form.

An illustrative example is shown in Figure 4.4. Here, the user creates the virtual database VDB1 (Figure 4.1) on the base DB1 (Figure 3.2). Note that DB1 is the underlying database.

Preparing Phase

In order to create a virtual database, the user must start from its underlying database. Hence, in our example, the user starts from DB1 (line 1). The user then creates the empty virtual database VDB1 on DB1, which happens to be the base (line 2), and accesses VDB1 for the importing phase (line 3).

Importing Phase

The user imports the classes Person, RA, and Faculty from DB1 into the empty VDB1 (line 4). This importing also determines and imports the classes that are recursively referenced by the attributes and methods of the imported classes but are not imported by the user. In this example, the classes Department and Staff are located and imported, as shown in the system response beneath line 4. In addition, the is-a relationships among the imported classes are derived from DB1 and created in VDB1.
Finally, the instances of the base classes in DB1 are made to also belong to the corresponding imported classes in VDB1.

     1. > ... start from DB1
     2. > create vdb VDB1 on DB1;
     3. > access vdb VDB1;
     4. VDB1> import Person, RA, Faculty from DB1;
              Department, Staff imported from DB1
     5. VDB1> delete property empno from RA;
     6. VDB1> delete property empno from Faculty;
     7. VDB1> delete property has_staff* from Department;
     8. VDB1> add class Project (projno: String, p_investigator: Faculty, total_amount: Integer);
     9. VDB1> add attribute to RA (work_on: Project);
    10. VDB1> add class Junior_Faculty subclass of Faculty via has_tenure = “no”;
    11. VDB1> add class Senior_Faculty subclass of Faculty via has_tenure = “yes”;
    12. VDB1> ... manipulation of schema and instances ...
    13. VDB1> exit;
    14. > ... back to DB1 ...

Figure 4.4: Creation of Virtual Database VDB1

Pruning Phase

The user deletes unnecessary properties and classes from VDB1 by deleting properties and optionally the classes that are recursively referenced by these properties.^5 The attribute empno is deleted from the class RA (line 5) and from the class Faculty (line 6). The attribute has_staff is then deleted from the class Department (line 7). Here, the symbol “*” attached to has_staff means that this operation also deletes the classes that are recursively referenced by has_staff. Hence, the domain class Staff of has_staff, which is a user-defined class, is deleted.

Transforming Phase

The user changes the schema of VDB1. The local class Project with three attributes is created (line 8), and the new attribute work_on is added to the class RA (line 9).
In addition, the derived class Junior_Faculty is created as a subclass of Faculty, via the derivation predicate has_tenure = “no” (line 10); similarly, the derived class Senior_Faculty is created via has_tenure = “yes” (line 11). After the transforming phase, VDB1 becomes the one shown in Figure 4.1, except for the local instances of the class Project.

^5 For details of this delete-property operation, see Section 5.1.3.

4.2.2 Creation on Multiple Bases

The process of creating a virtual database on multiple bases can be divided into the following phases: (1) preparing, (2) importing and pruning for one selected base, (3) importing, pruning, and integrating for each of the other bases, and (4) transforming. Here, the preparing, importing, and integrating phases are mandatory, and the rest are optional. In addition, before the process begins, the user must select, among the bases, the one from which the virtual database to be created will import the most classes; this selected base is termed the primary base.

In the preparing phase, the empty virtual database is created on the bases. Next, the importing and pruning phases are applied to the primary base. In the importing phase, classes are imported from the base into the empty virtual database, and is-a relationships among the classes are derived from the base; in the pruning phase, the properties and classes that were unnecessarily imported are deleted. Next, the importing, pruning, and integrating phases are applied to each of the bases other than the primary base. The importing and pruning phases are identical to the corresponding phases applied to the primary base; in the integrating phase, the classes and instances derived from a non-primary base are integrated into the virtual database. Finally, in the transforming phase, the schema of the virtual database is changed into the final desired form.

     1. > ... start from DB1
     2. > create vdb VDB3 on VDB1, VDB2;
     3. > access vdb VDB3;
     4. VDB3> import Faculty, Department, Project from VDB1;
     5. VDB3> import Faculty, Course from VDB2;
              Department imported from VDB2
     6. VDB3> delete property offered_by from VDB2.Course;
     7. VDB3> integrate;
     8. VDB3> delete property empno from Faculty;
     9. VDB3> add attribute to Project (belong_to: Department);
    10. VDB3> ... manipulation of schema and instances ...
    11. VDB3> exit;
    12. > ... back to DB1 ...

Figure 4.5: Creation of Virtual Database VDB3

An illustrative example is shown in Figure 4.5. Here, the user creates the virtual database VDB3 (Figure 4.6(b)) on the virtual databases VDB1 (Figure 4.1) and VDB2 (Figure 4.6(a)). Suppose that the user wants to import {Faculty, Department, Project} from VDB1 and {Faculty, Course} from VDB2.^6 Since more classes are to be imported from VDB1, the user selects it as the primary base. Note that if the same number of classes are to be imported from both VDB1 and VDB2, the user can select either as the primary base. Now, the creation process begins.

^6 We assume that a user knows in advance which classes are to be imported from which bases. However, such classes may not include the classes that may be imported during the importing phase.

Figure 4.6: Virtual Databases VDB2 and VDB3 ((a) VDB2, (b) VDB3)
Preparing Phase
The user starts from the underlying database DB1 (line 1), creates the empty virtual database VDB3 on VDB1 and VDB2 (line 2), and accesses VDB3 for the importing phase (line 3).

Importing Phase (for VDB1)
The user imports the classes Faculty, Department, and Project from VDB1 into the empty VDB3 (line 4). Here, no additional class is imported; the is-a relationships among the imported classes are derived from VDB1 and created in VDB3; finally, the instances of the corresponding base classes (in VDB1) also become instances of the imported classes.

Pruning Phase (for VDB1)
No pruning is done for the properties and classes from VDB1.

Importing Phase (for VDB2)
The user imports the classes Faculty and Course from VDB2 into VDB3 (line 5). Here, the class Department is also imported, as shown beneath line 5; the is-a relationships among the imported classes are derived from VDB2; finally, the instances of the corresponding base classes (in VDB2) also become instances of the imported classes.

Pruning Phase (for VDB2)
The user deletes the attribute offered_by from the class Course that is imported from VDB2 (line 6). Note that this does not delete the domain class of the attribute.

Integrating Phase (for VDB2)
The classes and instances from VDB2 are integrated into VDB3 by the system (line 7). Here, identical classes and instances are merged, and redundant is-a relationships caused by this merging are removed.

Transforming Phase
The user deletes the attribute empno from the class Faculty (line 8), and adds the attribute belong_to to the class Project (line 9). After this phase, VDB3 becomes the one shown in Figure 4.6(b).
4.2.3 Schema Change and Deletion

We now consider how the virtual database manipulation mechanism supports the schema change and deletion of a virtual database.

Schema Change
The schema of a virtual database VDB_i can be changed in either or both of the following two ways:
• through transforming
• through importing, pruning, and integrating.

The former way uses the transforming phase of the creation process. Here, the schema of VDB_i is simply changed with the operations used for the transforming phase (see Section 5.1.4). In addition, this way can be used if VDB_i is not a base of another virtual database.

The latter way uses the importing, pruning, and integrating phases for a non-primary base of the multiple-base creation process. Here, the schema change of VDB_i is done by importing classes from the underlying database or another virtual database, deleting unnecessary properties and/or classes from these imported classes, and integrating the result into VDB_i. In addition, this way is allowed if VDB_i is not a base of another virtual database and no cycle appears in the derivation graph for the underlying database as a result.

Further, the above two ways can be used in a repeated and combined manner. For example, the schema of VDB_i can be changed by first importing and integrating classes of other virtual databases into VDB_i (i.e., by applying the latter way to each of these other virtual databases), and then modifying the schema of VDB_i (i.e., by applying the former way to VDB_i).

Deletion
A user can delete a virtual database VDB_i if it is not a base of another virtual database. For example, in Figure 4.3, a user can delete the virtual database VDB3 with the following operation:

> delete vdb VDB3;

4.2.4 Instance Manipulation

Users can manipulate the instances of a virtual database as they do for the instances of a regular database.
The virtual database manipulation operations for instances are used for doing this (see Section 5.4); they include operations for
• creating an instance of a class,
• deleting an instance (from a virtual database),
• moving an instance to a class,
• changing the value of an attribute of an instance,
• retrieving the value of an attribute of an instance, and
• executing a method on an instance.

Chapter 5
Virtual Database Manipulation Mechanism

In this chapter, we describe the details of the virtual database manipulation mechanism.

5.1 Creation on a Single Base

As described in Section 4.2.1, the single-base creation process can be divided into the following phases: preparing, importing, pruning, and transforming. The preparing and importing phases are mandatory, and the rest are optional. Also, each phase may consist of one or more operations/commands, each of which is dynamically executed by the user. An illustrative example is shown in Figure 4.4, where the virtual database VDB1 (Figure 4.1) is created on the database DB1 (Figure 3.2).

In what follows, we consider each phase of the general single-base creation process in detail. Suppose that a user wants to create a virtual database VDB_i on a base BASE_i.

5.1.1 Preparing Phase

In the preparing phase, the user creates the empty virtual database (footnote 1) VDB_i on BASE_i, and accesses VDB_i for the importing phase. A detailed description of this phase is as follows.

Footnote 1: An empty virtual database means a virtual database that contains no user-defined classes and instances.

First, the user must check whether he/she is currently accessing the underlying database. This is because the creation, deletion, and access/opening of a virtual database must start from its underlying database. If the user is accessing not the underlying database but one of its virtual databases, he/she must exit to the underlying database by using the operation exit.
Footnote 2: Details of this operation and all other operations discussed in this chapter can be found in Appendix A.

Second, the user creates the empty virtual database VDB_i on BASE_i by executing the following create vdb operation.

create vdb VDB_i on BASE_i;

Third, the user accesses VDB_i for the importing phase by executing the access vdb operation.

access vdb VDB_i;

Example of VDB1
As mentioned earlier, lines 1 to 3 in Figure 4.4 indicate the preparing phase of the creation process for VDB1. After this phase, VDB1 is an empty virtual database created on DB1, which is its underlying database.

5.1.2 Importing Phase

During this phase, classes and their instances are imported from BASE_i into the empty VDB_i, and the is-a relationships among the imported classes are derived from BASE_i. The user selects classes of BASE_i and executes the following import operation for doing this.

import C_1{*}, ..., C_k{*}, ..., C_n{*} from BASE_i;

Here, C_k{*} indicates (1) C_k (i.e., the class C_k only) or (2) C_k* (i.e., the class C_k and its direct and indirect subclasses). In addition, when the user selects all classes of BASE_i, the following import operation is used.

import Object* from BASE_i;

[Figure 5.1: VDB1 after Importing Phase. Class diagram not reproduced.]

Example of VDB1
In Figure 4.4, line 4 indicates the importing phase. The user selects the classes Person, RA, and Faculty to import from DB1 into VDB1. VDB1 after this phase is shown in Figure 5.1.

The import operation actually executes the following steps: class definition importing, type-closure checking, is-a relationship finding, and instance arranging.
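The two forms of class selection in the import operation (a class only, or a class together with its direct and indirect subclasses) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function expand_import, the trailing "*" string convention, and the small hierarchy used below are hypothetical and do not reproduce the actual implementation.

```python
def expand_import(selectors, direct_subclasses):
    """Expand import selectors into the set of user-selected classes.

    selectors: class names, where a trailing "*" means "this class and
        all of its direct and indirect subclasses".
    direct_subclasses: maps a class name to its direct subclasses
        (an assumed representation of the base's class lattice).
    """
    selected = set()
    for selector in selectors:
        if not selector.endswith("*"):
            selected.add(selector)          # form (1): the class only
            continue
        # form (2): walk down the lattice from the named class
        seen = set()
        stack = [selector[:-1]]
        while stack:
            cls = stack.pop()
            if cls in seen:
                continue
            seen.add(cls)
            stack.extend(direct_subclasses.get(cls, ()))
        selected |= seen
    return selected


# A small, assumed fragment of a class lattice.
hierarchy = {
    "Object": ["Person", "Department"],
    "Person": ["Student", "Employee"],
    "Employee": ["Faculty", "Staff"],
}
selection = expand_import(["Department", "Employee*"], hierarchy)
print(sorted(selection))  # ['Department', 'Employee', 'Faculty', 'Staff']
```

Selecting "Object*" in this sketch returns every class reachable from the root, which corresponds to importing all classes of the base.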
In the class definition importing step, the definitions of the user-selected classes are copied from BASE_i into the empty VDB_i. In the type-closure checking step, the classes that are recursively referenced by the attributes and methods of the imported classes but are not imported by the user are located and imported from BASE_i into VDB_i. In the is-a relationship finding step, the is-a relationships among the imported classes are derived from BASE_i and created in VDB_i. Finally, in the instance arranging step, the instances that belong to the base classes of the imported classes are made to also belong to the imported classes. In what follows, we describe each step in detail.

[Figure 5.2: VDB1 after Class Definition Importing Step. Class diagram not reproduced.]

5.1.2.1 Class Definition Importing Step

The definitions of the user-selected classes, without is-a relationships, are copied into the empty VDB_i. However, if the user selects all classes of BASE_i, the is-a relationships among the classes are also copied into VDB_i; since VDB_i is type-closed in this special case, the type-closure checking and is-a relationship finding steps are skipped.

Example of VDB1
In Figure 4.4, the user imports three classes (i.e., Person, RA, and Faculty) of DB1 into the empty VDB1. VDB1 after this step is shown in Figure 5.2 (see footnote 3).

5.1.2.2 Type-Closure Checking Step

After the class definition importing step, VDB_i may not contain some classes that are recursively referenced by the attributes and methods of the imported classes. In this step, such classes are located in BASE_i and imported into VDB_i.
Figure 5.3 explains how this can be done; this approach is similar to [22, 41, 48].

Footnote 3: The attributes and methods directly defined for each class are included in a square bracket, while those inherited are not.

Type-Closure Checking
begin
    Newly_imported_classes := ∅;
    for each class C_i in VDB_classes do
        check_type_closure(C_i);
    VDB_classes := VDB_classes ∪ Newly_imported_classes;
end;

procedure check_type_closure(C_i)
begin
    for each attribute attr of C_i do
    begin
        C_j := Domain(attr);
        if (C_j is not in VDB_classes) and
           (C_j is not in Newly_imported_classes) and
           (C_j is not a system-defined class) then
        begin
            import C_j from base;
            insert C_j in Newly_imported_classes;
            check_type_closure(C_j);
        end;
    end;
end;

Figure 5.3: Type-Closure Checking Step

[Figure 5.4: VDB1 after Type-Closure Checking Step. Class diagram not reproduced.]

Here, VDB_classes is the set of classes that are in VDB_i before this step; Newly_imported_classes is the set of classes that may be imported during this step. The check_type_closure procedure is invoked for each class C_i in VDB_classes. It checks each attribute attr of C_i to determine whether its domain class C_j is in VDB_classes, in Newly_imported_classes, or a system-defined class (i.e., whether C_j is currently in VDB_i). The reason for considering only attributes here is that a method of a class can reference only attributes and methods of the same class, and the system-defined classes. Then, if C_j is not in VDB_i yet, it is imported from BASE_i and added to Newly_imported_classes. The check_type_closure procedure is then executed on C_j recursively.
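The procedure of Figure 5.3 can also be read as the following Python sketch. It is a transliteration under stated assumptions, not the system's actual code: the schema is modelled as a dictionary from class name to {attribute: domain class}, the function name type_closure is hypothetical, and the DB1 fragment is filled in from the attributes mentioned in the running example.

```python
SYSTEM_CLASSES = {"String", "Integer", "Real"}  # system-defined classes


def type_closure(vdb_classes, base_schema):
    """Return the classes that must additionally be imported from the base.

    vdb_classes: classes already in the virtual database.
    base_schema: class -> {attribute name: domain class} in the base
        (inherited attributes included), an assumed representation.
    """
    newly_imported = set()

    def check_type_closure(cls):
        for domain in base_schema[cls].values():
            if (domain not in vdb_classes
                    and domain not in newly_imported
                    and domain not in SYSTEM_CLASSES):
                newly_imported.add(domain)      # "import C_j from base"
                check_type_closure(domain)      # follow references recursively

    for cls in sorted(vdb_classes):
        check_type_closure(cls)
    return newly_imported


# Assumed fragment of DB1, reconstructed from the running example.
db1 = {
    "Person": {"ssno": "Integer", "name": "String"},
    "RA": {"ssno": "Integer", "name": "String", "major": "Department",
           "empno": "Integer", "advisor": "Faculty"},
    "Faculty": {"ssno": "Integer", "name": "String", "empno": "Integer",
                "faculty_of": "Department", "has_tenure": "String"},
    "Department": {"dname": "String", "has_staff": "Staff"},
    "Staff": {"ssno": "Integer", "name": "String", "empno": "Integer"},
}
extra = type_closure({"Person", "RA", "Faculty"}, db1)
print(sorted(extra))  # ['Department', 'Staff']
```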
The recursive execution of check_type_closure continues until all classes that are recursively referenced by the properties of C_j are located and imported.

Example of VDB1
After the class definition importing step, the set VDB_classes contains the classes Person, RA, and Faculty. First, the check_type_closure procedure is invoked for Person. This does not import any additional class from DB1 into VDB1, since the domain classes of all attributes of Person are system-defined classes. Next, check_type_closure is invoked for RA. This imports Department from DB1, since the domain class of the attribute major of RA is Department, which is not imported yet. Note that the domain classes of the other attributes of RA are system-defined classes or classes that are already in VDB1. Next, check_type_closure is invoked for Department, which imports Staff from DB1; it is then invoked for Staff, which imports no additional class. Finally, check_type_closure is invoked for Faculty, which imports no additional class. In summary, two classes (i.e., Department and Staff) are imported during this step, as shown beneath line 4 in Figure 4.4. VDB1 after this step is shown in Figure 5.4.

5.1.2.3 Is-a Relationship Finding Step

After the type-closure checking step, VDB_i still does not have is-a relationships among its classes. In this step, such is-a relationships are derived from BASE_i and created in VDB_i. Figures 5.5, 5.6, and 5.7 explain how this can be done; this approach is similar to [42].

First, the transitive closure of the is-a relationship (TC) for BASE_i is computed if necessary. Note that the TC for BASE_i may already exist if another virtual database has already been created on BASE_i, in which case it is not recomputed. The TC for BASE_i is computed by invoking the procedures label_class and transitive_closure_of_base.
Is-a Relationship Finding
begin
    if the is-a transitive closure of the base does not exist then
    begin
        label_class();
        transitive_closure_of_base();
    end;
    find_isa_of_VDB();
    check_property_of_VDB();
end;

Figure 5.5: Is-a Relationship Finding Step

In label_class, the classes of BASE_i are labelled C_1, ..., C_n. Note that we do not consider the system-defined classes String, Integer, and Real, which are always the direct subclasses of Object. A variation of the topological sort [3, 24] is used for this procedure. Each class C_j is first associated with the number of its direct superclasses, which is represented by count(C_j). The class Object (with count(Object) = 0) is selected and labelled C_1. For each of its subclasses C', count(C') is decreased by 1. Then, a class C with count(C) = 0 is selected and given the next label, and for each of its subclasses C', count(C') is decreased by 1. This step is repeated until all classes of BASE_i are labelled.

In the transitive_closure_of_base procedure, the is-a transitive closure of BASE_i is computed. A variation of the transitive closure algorithm for directed graphs [3, 24] is used here, since the class lattice of a virtual database is a rooted directed acyclic graph. The upper-right part of the n x n matrix BASE is first set to 0. This matrix is used to represent the is-a relationships between classes in BASE_i. The value of each element BASE[i,j] is either 0 or 1, which indicates whether class C_i is a superclass of class C_j. Due to the nature of the labelling, only low-labelled classes can be superclasses of high-labelled classes. This means that only elements BASE[i,j] where i < j can have the value 1. As a result, we consider only the upper-right part of the matrix. After this setting, all possible is-a relationships between classes are computed using the transitivity of the is-a relationship.
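The labelling performed by label_class, as described above, is essentially a queue-based topological sort. A minimal Python sketch is given below; the function name label_classes and the dictionary representation are hypothetical, and the lattice is an assumed reconstruction of DB1 in which the order of sibling subclasses is chosen arbitrarily.

```python
from collections import deque


def label_classes(direct_subclasses, root="Object"):
    """Return the classes in label order: position k holds the class C_(k+1).

    direct_subclasses: class -> list of its direct subclasses
        (system-defined classes such as String are left out).
    """
    # count(C) = number of direct superclasses of C
    count = {}
    for cls, subs in direct_subclasses.items():
        count.setdefault(cls, 0)
        for sub in subs:
            count[sub] = count.get(sub, 0) + 1

    order = []
    queue = deque([root])
    while queue:
        cls = queue.popleft()
        order.append(cls)                 # "label C as C_j; j := j + 1"
        for sub in direct_subclasses.get(cls, ()):
            count[sub] -= 1
            if count[sub] == 0:           # all superclasses already labelled
                queue.append(sub)
    return order


# Assumed class lattice of DB1; RA and TA have two direct superclasses each.
db1_lattice = {
    "Object": ["Person", "Department"],
    "Person": ["Student", "Employee"],
    "Student": ["Undergraduate", "Graduate"],
    "Employee": ["Faculty", "Staff", "RA", "TA"],
    "Graduate": ["TA", "RA"],
}
labels = label_classes(db1_lattice)
print(labels[:3])  # ['Object', 'Person', 'Department']
```

Because a class is enqueued only after all of its direct superclasses have been labelled, every superclass receives a lower label than its subclasses, which is the property the matrix BASE relies on.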
Classes in BASE_i are visited in the order of their labels, starting from C_1. Suppose that a class C_i is being visited. For each is-a relationship <C_i,C_j> outgoing from C_i, its corresponding matrix element BASE[i,j] is set to 1. Further, if there exists an is-a relationship from any class C_k to the class C_i where 1 ≤ k < i, but there exists no is-a relationship from C_k to C_j, the is-a relationship from C_k to C_j is derived by the transitivity of the is-a relationship.

Next, the TC for VDB_i is derived from the TC for BASE_i. The is-a relationships that are not redundant (footnote 4) are found from this derived TC for VDB_i, and created in VDB_i. This is done by executing the procedure find_isa_of_VDB. Here, the columns and rows for the classes in VDB_classes (i.e., the classes of VDB_i) are first extracted from the matrix BASE, and stored in a new matrix VDB whose size is |VDB_classes| x |VDB_classes|. We consider only the upper-right part of VDB, as we do for BASE. Then, all necessary is-a relationships among the classes in VDB_classes are determined, and created in VDB_i as follows. Classes are visited in the order of their labels, starting from C_1. Suppose that the class C is being visited. Each is-a relationship <C,C'> outgoing from C is selected and created in VDB_i. Here, this is-a relationship means that VDB[C,C'] = VDB[i,j] = 1. Redundant is-a relationships are then found by checking the is-a relationships VDB[j,k] and VDB[i,k] where j < k ≤ |VDB_classes|. If both relationships exist (i.e., both are 1), VDB[i,k] becomes a redundant is-a relationship and is removed (i.e., set to 0).

Footnote 4: The transitivity of the is-a relationship produces redundant is-a relationships in the TCs.

procedure label_class()
begin
    for each class C_i in base do
        count(C_i) := the number of superclasses of C_i;
    j := 1;
    {Suppose that the root of the class lattice of the base is Object.}
    enqueue(Object, Class_queue);
    while Class_queue is not empty do
    begin
        C := front(Class_queue);
        dequeue(Class_queue);
        label C as C_j;
        j := j + 1;
        for each subclass C' of C do
        begin
            count(C') := count(C') - 1;
            if count(C') = 0 then enqueue(C', Class_queue);
        end;
    end;
end;

procedure transitive_closure_of_base()
begin
    for i = 1 to n-1, j = i+1 to n do
        BASE[i,j] := 0;
    for i = 1 to n-1 do
        for each is-a relationship <C_i,C_j> do
        begin
            BASE[i,j] := 1;
            if i > 1 then
                for k = 1 to i-1 do
                    if BASE[k,i] = 1 and BASE[k,j] = 0 then BASE[k,j] := 1;
        end;
end;

Figure 5.6: Procedures for Base

procedure find_isa_of_VDB()
begin
    for i = 1 to n-1, j = i+1 to n do
        if (C_i is in VDB_classes) and (C_j is in VDB_classes) then
            VDB[C_i,C_j] := BASE[i,j];
    for i = 1 to |VDB_classes|-1, j = i+1 to |VDB_classes| do
        if VDB[i,j] = 1 then
        begin
            {Suppose that VDB[i,j] = VDB[C,C'].}
            create the is-a relationship <C,C'>;
            if j < |VDB_classes| then
                for k = j+1 to |VDB_classes| do
                    if VDB[j,k] = 1 and VDB[i,k] = 1 then VDB[i,k] := 0;
        end;
end;

procedure check_property_of_VDB()
begin
    for i = 1 to |VDB_classes| do
        for each inherited property prop of C_i in base do
            if prop is not inherited from any superclass of C_i in current VDB then
                prop is made to be a directly-defined property of C_i in current VDB;
end;

Figure 5.7: Procedures for Virtual Database
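The procedures transitive_closure_of_base and find_isa_of_VDB can be sketched in Python as follows. The sketch uses 0-based label indexes (label C_k corresponds to index k-1); the function names and the list of direct is-a relationships are assumptions made for illustration, with the relationships chosen so that the closure agrees with the matrix of the DB1 example.

```python
def transitive_closure(n, direct_isa):
    """Compute the upper-right is-a closure matrix for n labelled classes.

    direct_isa: set of (i, j) pairs with i < j, meaning the class with
        label index i is a direct superclass of the class with index j.
    """
    base = [[0] * n for _ in range(n)]
    for i in range(n):                       # visit classes in label order
        for j in sorted(j for (a, j) in direct_isa if a == i):
            base[i][j] = 1
            for k in range(i):               # every superclass of i ...
                if base[k][i] == 1:
                    base[k][j] = 1           # ... is also a superclass of j
    return base


def non_redundant_isa(base, selected):
    """Return the is-a pairs to create among the selected label indexes.

    selected: sorted label indexes of the classes kept in the VDB.
    """
    m = len(selected)
    vdb = [[base[selected[a]][selected[b]] if a < b else 0
            for b in range(m)] for a in range(m)]
    created = []
    for a in range(m):
        for b in range(a + 1, m):
            if vdb[a][b] == 1:
                created.append((selected[a], selected[b]))
                for c in range(b + 1, m):    # drop relationships implied via b
                    if vdb[b][c] == 1 and vdb[a][c] == 1:
                        vdb[a][c] = 0
    return created


# Assumed direct is-a relationships of DB1, written with 1-based labels.
pairs = [(1, 2), (1, 3), (2, 4), (2, 5), (4, 6), (4, 7),
         (5, 8), (5, 9), (5, 10), (5, 11), (7, 10), (7, 11)]
direct = {(i - 1, j - 1) for i, j in pairs}
base = transitive_closure(11, direct)
kept = non_redundant_isa(base, [0, 1, 2, 7, 8, 10])   # C1, C2, C3, C8, C9, C11
print([(i + 1, j + 1) for i, j in kept])
# [(1, 2), (1, 3), (2, 8), (2, 9), (2, 11)]
```

Visiting classes in label order is what makes a single pass sufficient: when class i is visited, all of its superclasses have already been recorded in column i of the matrix.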
VDB_i now has is-a relationships among its classes. However, we still need to check one more thing, i.e., the properties of each class. Let us consider the following situation. Suppose that BASE_i has the classes C_1, C_2, and C_3, where C_1 is a superclass of C_2, which is in turn a superclass of C_3. Also, suppose that C_1 has the attribute a_1, C_2 has a_2, and C_3 has a_3. If only C_1 and C_3 are imported into VDB_i, the is-a relationship <C_1,C_3> is created in VDB_i by the find_isa_of_VDB procedure, but there is a problem. C_3 is supposed to inherit a_1 and a_2 in VDB_i, as it inherits them in BASE_i. However, after the creation of the is-a relationship <C_1,C_3>, C_3 inherits only a_1 in VDB_i. This problem is solved by changing a_2 from an inherited attribute to a directly-defined attribute of C_3 in VDB_i. This kind of property checking and arrangement is done by executing the check_property_of_VDB procedure. In fact, for each property prop that a class C_i inherits in BASE_i, the procedure checks whether prop is inherited into C_i from a superclass of C_i in VDB_i. If not, prop is made to be a directly-defined property of C_i in VDB_i.

Example of VDB1
Suppose that there exists no TC for DB1. In the label_class procedure, each class C_i of DB1 is first associated with count(C_i); for example, count(Object) is 0, count(Person) is 1, and count(TA) is 2. Object is labelled C_1, and for each subclass C' of Object, count(C') is decreased by 1. As a result, both count(Person) and count(Department) become 0. Person is labelled C_2, and count(Student) and count(Employee) become 0. Department is then labelled C_3. Similarly, the classes of DB1 are labelled C_1, ..., C_11 as shown in Figure 5.8(a).

In the transitive_closure_of_base procedure, the elements BASE[i,j] with i < j are set to 0 in the first "for" loop.
Note that BASE[i,j] = BASE[C_i,C_j]. In the second loop, class C_1 is first visited; BASE[1,2] corresponding to the outgoing edge <C_1,C_2> is set to 1; and BASE[1,3] corresponding to <C_1,C_3> is set to 1. Class C_2 is visited next, and BASE[2,4] corresponding to <C_2,C_4> is set to 1. Since BASE[1,2] = 1 and BASE[1,4] = 0, BASE[1,4] is now set to 1. BASE[2,5] corresponding to <C_2,C_5> is then set to 1, and BASE[1,5] is set to 1. Similarly, each of the remaining classes from C_3 to C_n is visited one by one, and the elements of the matrix BASE corresponding to the outgoing edges and the derived is-a relationships are set to 1. Figure 5.8(b) shows the matrix BASE after executing this procedure.

(a) DB1 after label_class [class lattice diagram with labels C1 to C11; not reproduced]

(b) BASE after transitive_closure_of_base (upper-right part)

       C2  C3  C4  C5  C6  C7  C8  C9  C10 C11
  C1    1   1   1   1   1   1   1   1   1   1
  C2        0   1   1   1   1   1   1   1   1
  C3            0   0   0   0   0   0   0   0
  C4                0   1   1   0   0   1   1
  C5                    0   0   1   1   1   1
  C6                        0   0   0   0   0
  C7                            0   0   1   1
  C8                                0   0   0
  C9                                    0   0
  C10                                       0

(c) VDB after extraction from BASE

       C2  C3  C8  C9  C11
  C1    1   1   1   1   1
  C2        0   1   1   1
  C3            0   0   0
  C8                0   0
  C9                    0

(d) VDB after find_isa_of_VDB

       C2  C3  C8  C9  C11
  C1    1   1   0   0   0
  C2        0   1   1   1
  C3            0   0   0
  C8                0   0
  C9                    0

Figure 5.8: During Is-a Relationship Finding Step for VDB1

In the find_isa_of_VDB procedure, the columns and rows for those classes in VDB_classes are extracted from the matrix BASE, and are stored in the new matrix VDB. Note that VDB[i,j] is not necessarily equal to VDB[C_i,C_j]. Figure 5.8(c) shows VDB after doing this. Then, the is-a relationship <C_1,C_2> corresponding to VDB[1,2] is created in VDB1. VDB[1,4] (i.e., VDB[C_1,C_8]) is set to 0, since VDB[2,4] (i.e., VDB[C_2,C_8]) is 1 and VDB[1,4] is 1. VDB[1,5] and VDB[1,6] are then set to 0. Similarly, the rest of the is-a relationships are created in VDB1.
Figure 5.8(d) shows the matrix VDB after executing this procedure.

[Figure 5.9: VDB1 after Is-a Relationship Finding Step. Class diagram not reproduced.]

Finally, in the check_property_of_VDB procedure, each class of VDB1 is checked to find any property that the class inherits in DB1 but does not inherit in VDB1; such a property is newly defined in the class of VDB1. In fact, the attribute empno is defined for each of the classes Faculty and Staff. In addition, the attributes major and empno are defined for the class RA. VDB1 after the is-a relationship finding step is shown in Figure 5.9.

5.1.2.4 Instance Arranging Step

VDB_i now contains the imported classes and their is-a relationships, but does not contain their instances. In this step, such instances are derived from BASE_i. First, for each class C_i of VDB_i, the direct instances of its base class (of BASE_i) are made to be the direct instances of C_i. Then, for each class C_i of VDB_i, its indirect instances are compared with the indirect instances of its base class. The latter must include the former (i.e., the two sets are equal, or the latter is larger). Therefore, if the latter is larger, those instances that are in the latter but not in the former (i.e., the indirect instances of the base class that are not indirect instances of C_i) are made to become direct instances of C_i.
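The instance arranging step can be sketched in Python as follows. Several points are assumptions rather than statements of the text: the function name arrange_instances, the set-based representation of instances, the instance identifiers, and in particular the bottom-up processing order, which is chosen here so that instances promoted to direct instances of a class are already counted when its superclasses are examined.

```python
def arrange_instances(vdb_subclasses, base_direct, base_all, order):
    """Return the direct instances of each class of the virtual database.

    vdb_subclasses: class -> direct subclasses in the virtual database.
    base_direct: class -> set of direct instances of its base class.
    base_all: class -> set of all (direct and indirect) instances of
        its base class.
    order: classes of the virtual database, superclasses first.
    """
    direct = {cls: set(base_direct[cls]) for cls in order}
    extent = {}
    for cls in reversed(order):              # subclasses before superclasses
        below = set()
        for sub in vdb_subclasses.get(cls, ()):
            below |= extent[sub]             # indirect instances in the VDB
        # indirect instances of the base class that the VDB class lacks
        missing = base_all[cls] - direct[cls] - below
        direct[cls] |= missing
        extent[cls] = direct[cls] | below
    return direct


# Hypothetical instances: u1 is an undergraduate and t1 a TA in the base,
# but neither Undergraduate nor TA is imported into the virtual database.
order = ["Object", "Person", "Department", "Faculty", "Staff", "RA"]
subclasses = {"Object": ["Person", "Department"],
              "Person": ["Faculty", "Staff", "RA"]}
base_direct = {"Object": set(), "Person": set(), "Department": {"d1"},
               "Faculty": {"f1", "f2"}, "Staff": {"s1"}, "RA": {"r1"}}
base_all = dict(base_direct)
base_all["Person"] = {"f1", "f2", "s1", "r1", "u1", "t1"}
base_all["Object"] = base_all["Person"] | {"d1"}

arranged = arrange_instances(subclasses, base_direct, base_all, order)
print(sorted(arranged["Person"]))  # ['t1', 'u1']
```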
[Figure 5.10: VDB1 after Instance Arranging Step. Class diagram not reproduced.]

Example of VDB1
After handling direct instances, Person is the only class whose indirect instances are different from those of its base class. Hence, the indirect instances of the base class that are not the indirect instances of Person (i.e., u1, u2, t1, and t2) are made to become direct instances of Person. VDB1 after this instance arranging step is shown in Figure 5.10.

5.1.3 Pruning Phase

The importing phase, in particular its type-closure checking step, may import properties and/or classes in which the user is not interested (i.e., that are unnecessary). In this pruning phase, the user deletes such properties and/or classes from VDB_i by using the following delete property operation.

delete property p_1{*}, ..., p_k{*}, ..., p_n{*} from C_i;

Here, the properties p_1, ..., p_n must be defined in the class C_i; p_k{*} indicates (1) p_k (i.e., the property p_k only) or (2) p_k* (i.e., the property p_k and the classes that are recursively referenced by p_k). Hence, this operation is equivalent to applying one of the following operations to each property p_k.

delete property p_k from C_i;
delete property p_k* from C_i;

[Figure 5.11: VDB1 after Pruning Phase. Class diagram not reproduced.]

Example usages of these two operations are given in the following.

Example of VDB1
In Figure 4.4, lines 5 to 7 indicate the pruning phase.
The first operation is used for lines 5 and 6, respectively, while the second one is used for line 7. VDB1 after this phase is shown in Figure 5.11.

We now consider the detailed semantics of each of the above two operations. The first one checks whether the property to be deleted, p, is referenced by a method of the class C_i in which it is defined. If so, an error message is returned and no further action is taken. Otherwise, this operation deletes the property p from the class C_i. In addition, if p is an attribute, and the instances of C_i have values for p that are stored in VDB_i, these values are also dropped (see footnote 5).

Footnote 5: Note that in this pruning phase, instances of VDB_i do not have attribute values that are stored in VDB_i.

Example of VDB1
In Figure 4.4, this operation is used to delete the attribute empno from the classes RA (line 5) and Faculty (line 6), respectively. Note that no method references this attribute.

The second operation first executes the first operation. If p is a method, this operation stops here. Otherwise (i.e., if p is an attribute), its domain class C_j and the classes that are recursively referenced by the attributes (see footnote 6) of C_j but are not necessary are also deleted. Here, necessary classes are those that were imported by the user during the importing phase, or that are referenced by classes that are not recursively referenced by C_j. Figure 5.12 explains how this recursive deletion of classes can be done; this approach uses the type-closure checking algorithm of Figure 5.3. Here, VDB_classes is the set of classes that are recursively referenced by the class C_j; hence, it is first set to {C_j}. Also, Originally_imported_classes is the set of classes that were imported during the importing phase; Deleted_classes is the set of classes to be deleted; Undeleted_classes is the set of classes that are not deleted.
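The selection of unnecessary classes, following the notion of necessary classes just defined, can be sketched in Python as follows. This is an illustration under stated assumptions, not the actual implementation: the function name classes_to_delete and the dictionary schema (class -> {attribute: domain class}) are hypothetical, and the schema below is an assumed state of VDB1 after has_staff has already been removed from Department.

```python
SYSTEM_CLASSES = {"String", "Integer", "Real"}  # system-defined classes


def classes_to_delete(start, schema, originally_imported):
    """Return the unnecessary classes reachable from the domain class start.

    schema: class -> {attribute name: domain class} in the current VDB.
    originally_imported: classes imported by the user.
    """
    # Classes recursively referenced by start (the set VDB_classes).
    closure, stack = {start}, [start]
    while stack:
        for domain in schema[stack.pop()].values():
            if domain not in closure and domain not in SYSTEM_CLASSES:
                closure.add(domain)
                stack.append(domain)

    deleted, undeleted = set(), set()

    def visit(cls):
        referenced_from_outside = any(
            domain == cls and owner not in closure
            for owner, attrs in schema.items()
            for domain in attrs.values())
        if cls in originally_imported or referenced_from_outside:
            undeleted.add(cls)               # a necessary class: keep it
            return
        deleted.add(cls)
        for domain in schema[cls].values():
            if (domain not in deleted and domain not in undeleted
                    and domain not in SYSTEM_CLASSES):
                visit(domain)

    visit(start)
    return deleted


# Assumed VDB1 schema after "has_staff" was deleted from Department.
vdb1 = {
    "Person": {"ssno": "Integer", "name": "String"},
    "RA": {"major": "Department", "advisor": "Faculty"},
    "Faculty": {"faculty_of": "Department", "has_tenure": "String"},
    "Department": {"dname": "String"},
    "Staff": {"ssno": "Integer", "name": "String", "empno": "Integer"},
}
imported = {"Person", "RA", "Faculty"}
print(classes_to_delete("Staff", vdb1, imported))       # {'Staff'}
print(classes_to_delete("Department", vdb1, imported))  # set()
```

In the second call, Department is kept because it is still referenced by attributes of RA and Faculty, which lie outside the set of classes recursively referenced by Department.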
The type-closure checking algorithm is first executed on C_j in order to find the classes that are recursively referenced by the attributes of the class C_j, and to put the found classes in VDB_classes. The recursively_delete_class procedure is then executed on C_j to determine unnecessary classes as follows. This procedure checks whether C_j is a class that was imported by the user during the importing phase, or whether C_j is referenced by an attribute of a class that is not in VDB_classes. If either is true, it inserts C_j in Undeleted_classes and stops. Otherwise, the procedure inserts C_j in Deleted_classes. Subsequently, the procedure checks each attribute attr of C_j to find whether its domain class (i.e., C_l) is in Deleted_classes, in Undeleted_classes, or a system-defined class. If none of these holds, the recursively_delete_class procedure is executed on C_l in turn. This means that the procedure is recursively executed until all unnecessary classes are found. After this recursive execution, Deleted_classes includes all unnecessary classes to be deleted. Hence, the delete class operation (see footnote 7) is executed on each of the classes in Deleted_classes.

Footnote 6: As mentioned in Section 5.1.2.2, the reason for considering only attributes is that a method of a class can reference only attributes and methods of the same class, and the system-defined classes in CODM.

Recursively Deleting Unnecessary Classes
begin
    {Originally_imported_classes includes classes that were imported by user.}
    VDB_classes := {C_j};
    execute the type-closure checking algorithm for VDB_classes;
    Deleted_classes := ∅;
    Undeleted_classes := ∅;
    recursively_delete_class(C_j);
    for each class C_m in Deleted_classes do
        delete class C_m from VDB_i;
end;

procedure recursively_delete_class(C_j)
begin
    if (C_j is in Originally_imported_classes) then
    begin
        insert C_j in Undeleted_classes;
        return;
    end;
    for each attribute iattr whose domain class is C_j do
    begin
        {Suppose that iattr is defined in C_k.}
        if (C_k is not in VDB_classes) then
        begin
            insert C_j in Undeleted_classes;
            return;
        end;
    end;
    insert C_j in Deleted_classes;
    for each attribute attr of C_j do
    begin
        C_l := Domain(attr);
        if (C_l is not in Deleted_classes) and
           (C_l is not in Undeleted_classes) and
           (C_l is not a system-defined class) then
            recursively_delete_class(C_l);
    end;
end;

Figure 5.12: Pruning Phase

Example of VDB1
In Figure 4.4, this operation is executed on the attribute has_staff, whose domain class is Staff (line 7). This attribute is first deleted from Department, since it is not referenced by a method. Next, the type-closure checking algorithm is executed on Staff. Since this class references only the system-defined class Integer through the attribute empno, no additional class is found. Next, the recursively_delete_class procedure is executed on Staff. Since this class was not imported by the user during the importing phase, and is not referenced by any attribute (see footnote 8), it is inserted in Deleted_classes. After executing this procedure, Deleted_classes contains only Staff; hence, Staff is deleted.

5.1.4 Transforming Phase

The user changes the schema of VDB_i into the final desired form.
The operations for doing this are shown in Figure 5.13. They include operations for
• adding, deleting, and renaming a class,
• adding, deleting, and renaming an attribute or a method (i.e., a property), and
• adding and deleting an is-a relationship.

In addition, the operations for adding classes allow classes to be derived from other classes based on derivation predicates. We adopt the method of specifying predicate-based derived classes from SDM [21]. The detailed syntax and semantics of each operation shown in Figure 5.13 are in Appendix A. In what follows, we describe the transforming phase of the creation process for VDB1.

Footnote 7: Details of this operation can be found in Appendix A.
Footnote 8: The attribute has_staff, which references this class, was deleted.

Class
• add class <class-name> {subclass of <class-name-list>} {(<property-def-list>)} {via <class-derivation-predicate>};
• delete class <class-name>;
• rename class <class-name> as <class-name>;

Attribute and Method
• add attribute to <class-name> (<attribute-def-list>);
• add method to <class-name> (<method-def-list>);
• delete property <r-property-name-list> from <class-name>;
• rename property <property-name> of <class-name> as <property-name>;

Is-a
• add isa <class-name> to <class-name>;
• delete isa <class-name> to <class-name>;

Figure 5.13: Operations for Transforming Phase

Example of VDB1
In Figure 4.4, lines 8 to 11 indicate the transforming phase. The user creates the local class Project with three attributes (line 8), and adds the new attribute work_on to the class RA (line 9). Note that RA of VDB1 now has an attribute that its base class (i.e., RA of DB1) does not have.
The user also creates the derived class Junior_Faculty as a subclass of Faculty, via the derivation predicate has_tenure = "no" (line 10); similarly, the derived class Senior_Faculty, via has_tenure = "yes" (line 11). The instances of Faculty are examined to find whether they satisfy those predicates; hence, f1 and f2 are moved into Junior_Faculty, while f3 and f4 are moved into Senior_Faculty. VDB1 after this phase becomes the one shown in Figure 4.1, except for the local instances of the class Project.⁹

⁹ These local instances are added to VDB1 after it is created. See Section 5.4.

5.2 Creation on Multiple Bases

The multiple-base creation process can be divided into the following phases: (1) preparing, (2) importing and pruning for one selected base, (3) importing, pruning, and integrating for each of the other bases, and (4) transforming. The preparing, importing, and integrating phases are mandatory, and the rest are optional. Also, before the process begins, the user must select the primary base among the bases, from which the virtual database to be created will import the most classes. Note that if the same number of classes are to be imported from the bases, the user can select any base as the primary base. An illustrative example is shown in Figure 4.5, where the virtual database VDB3 (Figure 4.6(b)) is created on the virtual databases VDB1 (Figure 4.1) and VDB2 (Figure 4.6(a)).

In what follows, we consider each phase of the general creation process in detail. Suppose that a user wants to create a virtual database VDB_i on bases BASE_1, ..., BASE_n, and that the user selects BASE_i as the primary base.

5.2.1 Preparing Phase

In this phase, the user creates the empty VDB_i on the bases and accesses VDB_i. This phase is identical to the corresponding phase of the single-base creation process, except for multiple bases.
Hence, the operation used for creating VDB_i is as follows.

    create vdb VDB_i on BASE_1, ..., BASE_n;

Example of VDB3

As mentioned earlier, lines 1 to 3 in Figure 4.5 indicate the preparing phase. After this phase, VDB3 is an empty virtual database created on VDB1 and VDB2.

5.2.2 Phases for Primary Base

The importing and pruning phases are applied to the primary base BASE_i. These phases are identical to the corresponding phases of the single-base creation process.

Example of VDB3

In Figure 4.5, line 4 indicates the importing phase for VDB1, but no pruning is done. Note that the user selected VDB1 as the primary base. In the importing phase, since all classes that are recursively referenced by the properties of the imported classes are already in VDB3, no additional class is imported (during the type-closure checking step). VDB3 after the importing and pruning phases is shown in Figure 5.14.

[Figure 5.14: VDB3 after Importing and Pruning Phases for VDB1. Class diagram showing Object, Faculty, Project, and Department, with the attributes ssno, name, faculty_of, has_tenure, projno, pinvestigator, total_amount, and dname.]

5.2.3 Phases for Non-Primary Bases

The importing, pruning, and integrating phases are applied to each of the bases other than the primary base BASE_i. Suppose that the current non-primary base is BASE_j.

5.2.3.1 Importing and Pruning Phases

The importing and pruning phases are identical to the corresponding phases of the single-base creation process. In our approach, the schema and instances imported from BASE_j are maintained separately from the schema and instances of VDB_i until the integrating phase (for BASE_j).

Example of VDB3

In Figure 4.5, lines 5 and 6 indicate the importing and pruning phases for VDB2, respectively.
In the importing phase, the class Department, which is the domain class of the attribute faculty_of of the class Faculty, is imported during the type-closure checking step. In addition, in the pruning phase, the identifier of the current non-primary base is prefixed to every class for the delete property operations; for example, in line 6, the class Course from VDB2 is represented by "VDB2.Course". VDB3 after these two phases is shown in Figure 5.15. Note that the schema and instances imported from VDB2 are not yet integrated into the schema and instances of VDB3.

[Figure 5.15: VDB3 after Importing and Pruning Phases for VDB2. Two separate class diagrams. VDB3 so far: Object, Faculty (ssno, name, faculty_of, has_tenure), Project (projno, pinvestigator, total_amount), Department (dname). Information from VDB2: Object, Faculty (ssno, name, empno, faculty_of, research_area), Course (courseno, instructor), Department (dname).]

5.2.3.2 Integrating Phase

The schema and instances imported from the current non-primary base BASE_j are integrated into those of VDB_i. This integration is done by the integrate operation.

    integrate;

This operation takes the following simple integration strategy. For each class C_a in the schema from BASE_j, the operation checks whether it is in VDB_i. If C_a is in VDB_i with the same or a different name (say C_a'), the properties of C_a are merged into C_a', which adds to C_a' those properties of C_a that are not defined for C_a'. In addition, the instances of C_a are integrated into those of C_a', which makes those instances of C_a that are not instances of C_a' become instances of C_a'.¹⁰

¹⁰ Note that in the course of this integration, the schema and instances from the non-primary base BASE_j are removed.

The integration of classes may cause redundant is-a relationships to be created in the class lattice of VDB_i. For example, suppose that in VDB_i, C_1 is a superclass of C_2, which is in turn a superclass of C_3, and that in the schema imported from BASE_j, C_1 is a superclass of C_3. Then, in the integrating phase, the classes C_1 and C_3 are integrated into VDB_i. This integration causes the redundant is-a relationship <C_1, C_3> to be created in VDB_i.

Our approach eliminates redundant is-a relationships. Figure 5.16 explains how this can be done. Here, the classes of VDB_i are labelled C_1, ..., C_n by using the procedure label_class of Figure 5.6. Next, in the procedure check_isa_of_VDB, the upper-right part of the n x n matrix VDB is first set to 0. For each is-a relationship <C_i, C_j>, the value of its corresponding element VDB[i,j] is increased by 1. Note that the value of VDB[i,j] may be more than 1, since there may be more than one is-a relationship <C_i, C_j> in VDB_i. Finally, in the procedure remove_isa_of_VDB, which is similar to the procedure find_isa_of_VDB of Figure 5.7, redundant is-a relationships are found and removed as follows. Classes are visited in the order of their labels. Suppose that the class C_i is being visited. For each is-a relationship <C_i, C_j>, if there is more than one such is-a relationship (i.e., VDB[i,j] > 1), the redundant ones are removed; in addition, if there exist is-a relationships <C_j, C_k> and <C_i, C_k> where j < k ≤ n (i.e., VDB[j,k] = 1 and VDB[i,k] = 1), the (redundant) is-a relationship <C_i, C_k> is removed.

Example of VDB3

In Figure 4.5, line 7 indicates the integrating phase for VDB2. As shown in Figure 5.15, the schema and instances from VDB2 consist of the classes Object, Faculty, Course, and Department, and their instances.
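The redundancy check described for Figure 5.16 can be sketched in Python as follows. This is only an illustrative sketch, not the dissertation's implementation: the function name `remove_redundant_isa` and the edge-list representation are assumptions, and the sketch tests for the presence of an edge after collapsing duplicates rather than comparing matrix entries with 1.

```python
from collections import Counter


def remove_redundant_isa(n, isa_edges):
    """Return the is-a edges that survive the clean-up.

    n         -- number of labelled classes C_1 .. C_n
    isa_edges -- list of (i, j) label pairs with i < j, read as
                 "C_i is a superclass of C_j"; duplicates are allowed
    """
    # Counterpart of check_isa_of_VDB: count each <C_i, C_j>.
    vdb = Counter(isa_edges)

    # Keep a single copy of every edge (removes the duplicates).
    kept = set(vdb)

    # Visit classes in label order and drop <C_i, C_k> whenever
    # <C_i, C_j> and <C_j, C_k> are both present (i < j < k <= n).
    for i in range(1, n):
        for j in range(i + 1, n + 1):
            if (i, j) not in kept:
                continue
            for k in range(j + 1, n + 1):
                if (j, k) in kept and (i, k) in kept:
                    kept.discard((i, k))
    return sorted(kept)


# C_1 > C_2 > C_3 plus the shortcut <C_1, C_3> from the text.
print(remove_redundant_isa(3, [(1, 2), (2, 3), (1, 3)]))
```

Like the procedure it follows, the sketch only detects shortcuts that bypass a single intermediate class; it is not a full transitive reduction.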
All classes except for Course are found in VDB3 and merged into the corresponding classes of VDB3; in addition, their instances are also integrated into VDB3. Here, Faculty (and Department) of VDB3 now has two base classes with the same name, which are in VDB1 and VDB2, respectively. Similarly, each instance of Faculty (and Department) also becomes an instance of its two base classes. Due to the merging of Object, the class Course remains as a subclass of Object in VDB3.

The above integration of classes creates a redundant is-a relationship between Object and Faculty and between Object and Department, respectively. Such is-a relationships are removed by the procedures of Figure 5.16 as follows. In the label_class procedure, the classes of VDB3 (i.e., Object, Faculty, Department, Project, and Course) are labelled C_1, ..., C_5, respectively. After the check_isa_of_VDB procedure, only two elements VDB[1,2] and VDB[1,3] of the matrix VDB have a value greater than 1. This indicates that there are two is-a relationships between Object and Faculty and between Object and Department, respectively. Hence, the remove_isa_of_VDB procedure removes one is-a relationship between Object and Faculty and one between Object and Department. VDB3 after the integrating phase is shown in Figure 5.17.

Removing Redundant Is-a Relationships

    begin
      label_class();
      check_isa_of_VDB();
      remove_isa_of_VDB();
    end;

    procedure check_isa_of_VDB()
    begin
      for i = 1 to n-1, j = i+1 to n do
        VDB[i,j] := 0;
      for i = 1 to n-1 do
        for each is-a relationship <C_i, C_j> do
          VDB[i,j] := VDB[i,j] + 1;
    end;

    procedure remove_isa_of_VDB()
    begin
      for i = 1 to n-1, j = i+1 to n do
      begin
        if VDB[i,j] > 1 then
          remove (VDB[i,j]-1) is-a relationships <C_i, C_j>;
        if VDB[i,j] = 1 then
        begin
          if j < n then
            for k = j+1 to n do
              if VDB[j,k] = 1 and VDB[i,k] = 1 then
                remove is-a relationship <C_i, C_k>;
        end;
      end;
    end;

Figure 5.16: Integrating Phase

[Figure 5.17: VDB3 after Integrating Phase. Class diagram showing Object with the classes Course (courseno, instructor), Faculty (ssno, name, faculty_of, has_tenure, empno, research_area), Project (projno, pinvestigator, total_amount), and Department (dname).]

5.2.4 Transforming Phase

This phase is identical to the corresponding phase of the single-base creation process.

Example of VDB3

In Figure 4.5, lines 8 and 9 indicate the transforming phase. VDB3 after this phase becomes the one shown in Figure 4.6(b).

5.3 Schema Change and Deletion

This section describes how the virtual database manipulation mechanism supports the schema change and deletion of a virtual database.

5.3.1 Schema Change

The schema of a virtual database VDB_i can be changed in either or both of the following two ways:

• through transforming
• through importing, pruning, and integrating.

The former way indicates that the schema change is achieved by using the transforming phase of the creation process. In other words, it is achieved by simply modifying the schema of VDB_i. Hence, the virtual database operations used during the transforming phase are also used for doing this modification. In addition, the modification is allowed if VDB_i is not a base of another virtual database; otherwise, it is not allowed. However, a user may sometimes want to perform this modification even though it cannot be allowed for VDB_i. This can be supported as follows. The user first creates a new virtual database VDB_i' on VDB_i by importing the entire (schema and instances of) VDB_i, and then executes the modification on VDB_i'; the user now continues to use VDB_i' instead of VDB_i.
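The rule for changing a schema through transforming, together with the workaround of creating a copy, can be sketched as below. This is a minimal sketch under stated assumptions: the class `DerivationGraph` and its methods `create_vdb`, `is_base`, `transform`, and `copy_for_change` are hypothetical names chosen for illustration, and the actual schema modification is reduced to a string.

```python
class DerivationGraph:
    """Records which bases each virtual database was created on."""

    def __init__(self):
        # virtual database name -> list of the names of its bases
        self.bases = {}

    def create_vdb(self, name, *bases):
        self.bases[name] = list(bases)

    def is_base(self, name):
        # True if some virtual database was created on `name`.
        return any(name in bs for bs in self.bases.values())

    def transform(self, name, change):
        # Transforming is allowed only if `name` is not a base.
        if self.is_base(name):
            raise PermissionError(
                f"{name} is a base of another virtual database")
        return f"{change} applied to {name}"

    def copy_for_change(self, name):
        # Workaround: create VDB' on VDB (importing all of it) and
        # continue working with VDB' instead.
        copy = name + "'"
        self.create_vdb(copy, name)
        return copy


graph = DerivationGraph()
graph.create_vdb("VDB1", "DB1")
graph.create_vdb("VDB3", "VDB1")      # VDB1 is now a base of VDB3
copy = graph.copy_for_change("VDB1")  # so work on a copy instead
print(graph.transform(copy, "rename class"))
```

The copy is a leaf of the derivation graph, so the modification is permitted there while VDB1 and the databases built on it stay unchanged.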
The latter way indicates that the schema change is achieved by using the importing, pruning, and integrating phases for a non-primary base of the multiple-base creation process.¹¹ In other words, it is achieved by (1) importing classes into VDB_i from VDB_j, which is selected by the user from the underlying database and its virtual databases except for VDB_i,¹² (2) pruning unnecessary properties and/or classes from these imported classes, and (3) integrating the result into VDB_i. Here, due to the importing phase, a new edge is created in the virtual database derivation graph if VDB_j is not a base of VDB_i. Hence, this schema change through importing, pruning, and integrating is allowed if VDB_i is not a base of another virtual database and no cycle appears in the virtual database derivation graph as a result; otherwise, it is not allowed. However, a user may sometimes want to do this schema change even though it cannot be allowed for VDB_i. This situation can be resolved as follows. The user creates a new virtual database VDB_i' on VDB_i and VDB_j. This is just the creation of a virtual database with multiple bases; VDB_i is the primary base and the entire VDB_i is imported into VDB_i'.

¹¹ As for the creation process, the pruning phase is optional.
¹² VDB_j can be a base of VDB_i.

Finally, the above two ways can be used in a repeated and combined manner, which means that each way can be used repeatedly, and combined with the other. For example, suppose that a user wants to first import information from other virtual databases into VDB_i and then change the schema of VDB_i. This can be achieved by using the latter way for importing information from each virtual database into VDB_i, and then using the former way for changing the schema of VDB_i.

5.3.2 Deletion

A user may want to delete a virtual database that is no longer needed.
If this virtual database is not a base of another virtual database, the user can delete it by using the following delete vdb operation.

    delete vdb VDB_i;

This operation deletes the schema and instances of VDB_i. However, the imported instances of VDB_i are not completely deleted. Since they are also instances of the base(s) of VDB_i, only their class memberships with VDB_i are deleted. The operation also removes the node VDB_i and its incoming edges from the virtual database derivation graph.

Examples of Deletion

In Figure 4.3, a user can delete the virtual database VDB3 as follows:

    > delete vdb VDB3;

In this figure, only those virtual databases that are leaf nodes of the derivation graph (i.e., VDB3, VDB4, and VDB5) can be deleted.

5.4 Instance Manipulation

Users can manipulate the instances of a virtual database in a way similar to how they manipulate the instances of a regular database. The virtual database manipulation operations for instances shown in Figure 5.18¹³ are used for doing this; they include operations for

• creating an instance of a class,
• deleting an instance,
• moving an instance to a class,
• changing the value of an attribute of an instance,
• retrieving the value of an attribute of an instance, and
• executing a method on an instance.

Among these operations, only the operation for creating a new instance for an imported class does not have the same semantics for every virtual database; it can be interpreted in two different ways for each virtual database, as described below. Hence, a user needs to define a virtual database as strongly dependent or weakly dependent on its bases, in order to decide in which way that operation is interpreted in the virtual database.
Note that this notion of dependency is used only to decide the semantics of the operation, and that the operation used for defining the dependency (i.e., the define operation) can be found in Appendix A. If a user does not define the dependency, a virtual database is defined as weakly dependent on its bases by default. In addition, a user can change the dependency of a virtual database from strongly dependent to weakly dependent, but not vice versa. In what follows, we describe the semantics of each instance manipulation operation in detail.

¹³ The complete syntax of each operation can be found in Appendix A.

• create instance <class-name>;
• delete instance <instance-oid>;
• move instance <instance-oid> to <class-name>;
• change attribute value <instance-oid>.<attribute-name> into <value-oid>;
• retrieve attribute value <instance-oid>.<attribute-name>;
• execute method <instance-oid>.<method-name>({<input-oid-list>});

Figure 5.18: Instance Manipulation Operations

5.4.1 Creation Operation

We first explain how the operation for creating an instance for an imported class C_i is handled in a virtual database VDB_i. If VDB_i is defined as strongly dependent on its bases, this operation first chooses one base class (e.g., C_j) of C_i, and simply invokes the same operation for C_j in the base to which C_j belongs. On the other hand, if VDB_i is defined as weakly dependent, this operation first creates a local instance for C_i in VDB_i. This local instance is then made to be shared by (i.e., to be an (imported) instance of) each virtual database that shares other instances of C_i with VDB_i. This case indicates that an imported class (e.g., C_i) may have imported instances as well as local instances in a weakly dependent virtual database.

Consider the database DB1 and its virtual databases VDB1, VDB2, and VDB3 shown in Figures 3.2, 4.1, and 4.6.
In VDB3, creating an instance for the imported class Faculty is handled as follows. If VDB3 is defined as strongly dependent on its bases VDB1 and VDB2, this operation first chooses a base class of Faculty (e.g., Faculty of VDB1), and invokes the same operation for Faculty in VDB1. Since Faculty is also an imported class in VDB1, the invocation in turn chooses a base class of Faculty of VDB1 (i.e., Faculty of DB1), and creates a local instance (say f5) for Faculty in DB1. This created instance f5 is shared by the virtual databases VDB1, VDB2, and VDB3; it becomes an (imported) instance of the class Faculty in each of these virtual databases. On the other hand, if VDB3 is defined as weakly dependent, the operation simply creates a local instance for Faculty of VDB3. Since VDB3 is not a base of another virtual database, this instance is not shared by other virtual databases.

We now consider other instance creation cases. Creating an instance for a derived class is not allowed because of the ambiguities that may arise in the course of updating [11]. However, if an instance is created for the class(es) on which a derived class is based, and also satisfies the associated derivation predicate, this instance is moved/added to the derived class. Creating an instance for a local class is handled in the same fashion as for an imported class in a weakly dependent virtual database.

In VDB1 (Figure 4.1), creating instances for the derived classes Junior_Faculty and Senior_Faculty is prohibited. Creating instances for the local class Project is done by simply creating local instances (e.g., j1, j2) for the class, which are then made to be (imported) instances of VDB3 (Figure 4.6(b)).
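The two interpretations of instance creation for an imported class can be sketched as follows. This is a simplified illustration, not the dissertation's mechanism: the `Database` class, the `derives_from` and `create_instance` helpers, and the rule of picking the first base that contains the class are all assumptions made for the example.

```python
class Database:
    """An underlying database (no bases) or a virtual database."""

    def __init__(self, name, bases=(), strongly_dependent=False):
        self.name = name
        self.bases = list(bases)
        self.strongly_dependent = strongly_dependent
        self.classes = set()
        self.local = {}     # class name -> set of local instance ids
        self.imported = {}  # class name -> set of imported instance ids


def derives_from(vdb, db):
    """True if `db` is a direct or indirect base of `vdb`."""
    return any(b is db or derives_from(b, db) for b in vdb.bases)


def create_instance(db, cls, oid, all_dbs):
    """Create `oid` for `cls`; return the database that stores it."""
    if db.bases and db.strongly_dependent:
        # Strongly dependent: delegate to a base class (here: the
        # first base that contains the class, an assumption).
        base = next(b for b in db.bases if cls in b.classes)
        return create_instance(base, cls, oid, all_dbs)
    # Weakly dependent (or underlying database): create locally ...
    db.local.setdefault(cls, set()).add(oid)
    # ... and share it with every database derived from this one.
    for other in all_dbs:
        if cls in other.classes and derives_from(other, db):
            other.imported.setdefault(cls, set()).add(oid)
    return db


db1 = Database("DB1")
vdb1 = Database("VDB1", [db1], strongly_dependent=True)
vdb2 = Database("VDB2", [db1])
vdb3 = Database("VDB3", [vdb1, vdb2], strongly_dependent=True)
everything = [db1, vdb1, vdb2, vdb3]
for d in everything:
    d.classes.add("Faculty")

home = create_instance(vdb3, "Faculty", "f5", everything)
print(home.name)
```

With VDB3 and VDB1 strongly dependent, the request travels down to DB1, and f5 then appears as an imported instance in VDB1, VDB2, and VDB3, as in the example above.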
5.4.2 Deletion and Move Operations

The operation for deleting an instance t_i deletes t_i from everywhere (i.e., the underlying database and its virtual databases).¹⁴ This can be done by simply deleting the object t_i, since there exists only one t_i and deleting it deletes its class memberships with the underlying database and virtual databases. In VDB1 (Figure 4.1), deleting the instance j1 (and d1) deletes j1 (and d1) from the underlying database DB1 and its virtual databases.

¹⁴ Deleting an imported instance from a virtual database but not from others is not allowed, since this deletes the instance from an imported class (in the virtual database) but not from its base class(es). Note that an imported class must always include all instances of its base class(es).

The operation for moving an instance t_i to a class C_i first finds the class C_j one of whose direct instances is t_i. If C_j is a (direct or indirect) subclass of C_i, t_i is made to be a direct instance of C_i, and its class memberships with the subclasses of C_i (if any) are deleted; also, if t_i has values for the directly-defined attributes of those subclasses in VDB_i, these values are also deleted. On the other hand, if C_j is a (direct or indirect) superclass of C_i, t_i is made to be a direct instance of C_i; hence, it becomes an indirect instance of C_j. This operation is often used to delete an instance from a class but not from the superclasses of the class in VDB_i.

5.4.3 Change, Retrieval, and Execution Operations

The operation for changing the value of an attribute a_i of an instance t_i first checks whether a_i is overridden by any attribute. Note that an attribute (and a method) of an instance in a virtual database may or may not be overridden by the same-named attribute (and method) that is defined for the instance in the underlying database or another virtual database, due to multiple class membership and property overriding (see Section 4.1.1). If a_i is not overridden, this operation simply changes the value in the current virtual database VDB_i. If it is overridden by a_j that is defined for t_i in VDB_j (i.e., the underlying database or another virtual database), the operation changes the value of this a_j of t_i in VDB_j. Further, in either case, the operation evaluates the new value of a_i of t_i, and moves t_i to or out of a derived class, if necessary.

The operation for retrieving the value of an attribute a_i of an instance t_i first checks whether a_i is overridden by any attribute. If not, this operation simply retrieves the value in VDB_i. If it is overridden by a_j defined for t_i in VDB_j, the operation retrieves the value of this a_j of t_i in VDB_j.

The operation for executing a method m_i on an instance t_i first checks whether m_i is overridden by any method. If not, this operation simply executes the method in VDB_i. If it is overridden by m_j defined for t_i in VDB_j, the operation executes this m_j on t_i in VDB_j.

In VDB1 (Figure 4.1), consider the instance r1 of the class RA. Its attribute work_on is not overridden by any other attribute, while its attributes ssno, name, major, and advisor are overridden by the same-named attributes that are defined for r1 in DB1 (Figure 3.2). Hence, changing/retrieving the value of work_on of r1 is done by simply changing/retrieving the value in VDB1; however, changing/retrieving the values of ssno, name, major, and advisor of r1 is done by changing/retrieving the values of the same-named attributes of r1 in DB1. Also, changing/retrieving the associated attribute values of the instances of the class Project is done by simply changing/retrieving the values in VDB1.
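The override check for changing and retrieving attribute values can be sketched as follows. This is a minimal sketch, assuming that overriding follows the base links of the derivation graph and that a property defined in a base always overrides the same-named property above it; the names `Db`, `resolve`, `retrieve`, and `change` are hypothetical, and the re-evaluation of derived-class membership after a change is left out.

```python
class Db:
    """A database or virtual database holding attribute values."""

    def __init__(self, name, bases=()):
        self.name = name
        self.bases = list(bases)
        self.defined = set()  # (instance id, attribute) pairs defined here
        self.values = {}      # (instance id, attribute) -> stored value


def resolve(db, oid, attr):
    """Return the database whose copy of the attribute is effective."""
    for base in db.bases:
        if (oid, attr) in base.defined:
            # Overridden by the same-named attribute in a base;
            # keep following the chain downward.
            return resolve(base, oid, attr)
    return db  # not overridden: the current database holds the value


def retrieve(db, oid, attr):
    return resolve(db, oid, attr).values.get((oid, attr))


def change(db, oid, attr, value):
    resolve(db, oid, attr).values[(oid, attr)] = value


db1 = Db("DB1")
vdb1 = Db("VDB1", [db1])
db1.defined.add(("r1", "ssno"))
vdb1.defined.update({("r1", "ssno"), ("r1", "work_on")})

change(vdb1, "r1", "work_on", "j1")  # stored in VDB1 itself
change(vdb1, "r1", "ssno", 1234)     # redirected to DB1
print(resolve(vdb1, "r1", "ssno").name, retrieve(vdb1, "r1", "ssno"))
```

Executing a method would follow the same resolution step before invoking the method in the database that `resolve` returns.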
In VDB2 (Figure 4.6(a)), consider the instance r1 of the class RA. Its method find_advisor is overridden by find_advisor of r1 in VDB1 (Figure 4.1), which is in turn overridden by find_advisor of r1 in DB1 (Figure 3.2). Hence, executing find_advisor on r1 in VDB2 is done by executing find_advisor on r1 in DB1.

Chapter 6

Applications of Virtual Databases

In this chapter, we examine how the virtual database approach supports different and changing user perspectives of an object database. In particular, we first examine how virtual databases can be used as schema views or versions, and compare these schema view and version concepts with the previous schema view and version concepts, respectively. We also examine how the virtual database approach supports a concept that is neither the schema view nor the version concept. Finally, we present the possible cases of user perspective support of the virtual database approach, and compare this approach with other approaches.

6.1 Schema Views

Virtual databases can be used as schema views for an object database. In this section, we describe the characteristics of the schema view approach that this virtual database application supports. We also compare this schema view approach with the previous approaches.

6.1.1 Concept of a Schema View

A virtual database that is totally derived from and defined as strongly dependent on its base(s) can be used as a schema view. The characteristics of this schema view concept are as follows. In this concept, a schema view consists of a schema and instances, and is created on one or more bases. Each base can be an underlying database or another schema view created on the database, and must be unique. Each class of a schema view is either an imported class, or a derived class computed from imported and/or other derived classes. New methods can be added to classes of a schema view, but new attributes cannot.¹ Unlike a relational view that often exists in the form of a query, a schema view exists physically. A schema view can be created on one or more previously created schema views; also, schema views are created on their common underlying database. As a result, a database and schema views created on it are organized into a directed acyclic graph rooted at the database.

¹ If new attributes are added to a schema view, the values for these attributes of instances in the schema view represent information that cannot be derived from the base(s) of the schema view. Note that a schema view must be totally derived from its base(s).

Example of VDB4

The virtual database VDB4 of Figure 6.1(a) represents a schema view created on the database DB1 of Figure 3.2. It satisfies the requirements for a virtual database that can be used as a schema view: totally derived and strongly dependent. VDB4 is totally derived from DB1, since it contains only imported classes and instances. It is also defined as strongly dependent on DB1, as shown in Figure 6.1(b).

6.1.2 Creation

The processes of creating a virtual database on one or more bases are used for creating a schema view. Hence, a schema view is created by dynamically executing a series of virtual database manipulation operations, rather than by a single query/operation. A schema view usually imports some (but not all) classes of its base(s) in the importing phase of the creation process. In addition, since a schema view is totally derived from its base(s), local classes as well as new attributes must not be introduced during the transforming phase. However, we allow new methods to be added to a schema view, since they do not change the state of an instance.

Example of VDB4

The user creates the schema view VDB4 (Figure 6.1(a)) on the database DB1 (Figure 3.2), by executing the virtual database operations shown in Figure 6.1(b).
[Figure 6.1(a): VDB4. Class diagram showing Object, Employee, Staff, TA, RA, and Department, with the attributes ssno, name, empno, major, and faculty_of.]

    > ... start from DB1 ...
    > create vdb VDB4 on DB1;
    > access vdb VDB4;
    VDB4> import Employee* from DB1;
          Department imported from DB1
    VDB4> delete property advisor, find_advisor from RA;
    VDB4> delete property has_staff from Department;
    VDB4> define VDB4 as strongly dependent;
    VDB4> ... manipulation of schema and instances ...
    VDB4> exit;
    > ... back to DB1 ...

    (b) Creation of VDB4

Figure 6.1: An Example Virtual Database VDB4 as a Schema View

Here, the user imports the class Employee with its direct and indirect subclasses, which in turn imports the class Department. He/she then deletes the attribute advisor and the method find_advisor from the class RA, and the attribute has_staff from the class Department. However, no new class or property is added to this schema view.

6.1.3 Schema Change and Deletion

The operations for changing the schema of a virtual database are used for changing the schema of a schema view. As for a virtual database, the schema of a schema view can be changed by simply modifying the current schema, or by importing classes from the underlying database or another schema view. However, this schema change must not introduce local classes and new attributes to a schema view; the latter case must not create a cycle in the virtual database derivation graph. In addition, the operation for deleting a virtual database is used for deleting a schema view. Finally, the operations for changing the schema of and deleting a schema view are allowed only if the schema view is not a base of any other schema view.

6.1.4 Instance Manipulation

The operations for manipulating instances of a virtual database are used for a schema view.
Note that, as mentioned above, a virtual database used as a schema view must be defined as strongly dependent on its base(s); for example, in Figure 6.1(b), VDB4 is defined as strongly dependent on DB1. Among a database and its schema views, only the database contains local instances; schema views have only imported instances. Thus, if the same-named property is defined for an imported instance in a schema view as well as in the underlying database, the property in the database overrides that in the schema view. Further, even though an imported instance of a schema view may have a method m_i that is not defined in the underlying database, this method eventually references the attribute(s) and/or method(s) (of the schema view) that are overridden by the same-named corresponding attribute(s) and/or method(s) of the database; consequently, the manipulation of m_i requires the manipulation of those attribute(s) and/or method(s) of the database.

6.1.5 Comparisons with Related Work

We now compare this schema view approach with the previous approaches. We believe that the previous approaches have recognized and supported the features that are required for a desired schema view approach for object databases. However, there is hardly any approach that supports all of these features. Our schema view approach adopts these necessary features from the previous approaches, and incorporates them into a single framework. Among those features, the distinctive ones are the following.

• A schema view is actually created and exists physically, rather than existing in the form of a query.
• A schema view is materialized in the sense that it contains instances (i.e., imported instances).
• The schema of a schema view can be changed, which can be done in two ways: with or without importing classes.
• This approach does not provide a specific query language; instead, it provides the manipulation operations that can be integrated into a query language.
• This approach does not have a problem similar to the view update problem in the relational database context [6, 14, 20, 27, 45]. In this approach, updating instances in a schema view does not cause any ambiguity, since (1) a schema view has instances stored with it, (2) the instance manipulation operations are executed on specific instances of a schema view, and (3) the operation for adding an instance to a derived class, which is the only operation that may cause an ambiguity, is not allowed.

Finally, some previous class and schema view approaches [22, 28, 43] support the join operation among classes, similar to the relational join operation. Our approach does not provide this operation, but implicitly supports it in a limited way. In particular, the join operation among classes that are interrelated through attributes and/or methods can be supported by including these classes in a schema view; the join operation among classes that are not interrelated is not supported. Even though our approach does not provide the latter join operation, the current prototype of our approach, implemented using the Omega database system [19], supports this operation. We will elaborate on this in Chapter 7.

6.2 Schema Versions

Virtual databases can be used as schema versions to support schema evolution of an object database. In this section, we describe the characteristics of the schema version approach that this virtual database application supports. We also compare this schema version approach with the previous approaches.

6.2.1 Concept of a Schema Version

A virtual database that is created on a single base and defined as weakly dependent on the base can be used as a schema version. The characteristics of this schema version concept are as follows.
In this concept, a schema version consists of a schema and instances. Since it represents a database evolution [8, 33, 38, 39, 40], a schema version must be created on only one base. Its base can be either an underlying database or another schema version created on the database. A schema version can have imported, derived, and local classes; it can also have imported and local instances. As for a schema view, a schema version exists physically. Any number of schema versions can be created on a base, which is either a database or another schema version. As a result, a database and schema versions created on it are organized into a hierarchy rooted at the database.

Example of VDB2

The virtual database VDB2 of Figure 6.2(a) (also, of Figure 4.6(a)) represents a schema version created on the database DB1 of Figure 3.2. It satisfies the requirements for a virtual database that can be used as a schema version: single base and weakly dependent. VDB2 is created on a single base (i.e., DB1), and defined as weakly dependent on the base as shown in Figure 6.2(b). It contains the local class Course and the local instances c1 and c2, as well as imported classes and instances.

6.2.2 Creation

The process of creating a virtual database on a single base is used for creating a schema version. Hence, a schema version is created by dynamically executing a series of virtual database operations. A schema version usually imports all classes of its base in the importing phase of the creation process. In addition, local classes as well as new properties can be added to a schema version during the transforming phase.

Example of VDB2

The user creates the schema version VDB2 (Figure 6.2(a)) on the database DB1 (Figure 3.2), by executing the virtual database operations illustrated in Figure 6.2(b). Here, the user first imports all classes of DB1.
He/she then deletes the class Undergraduate, and recursively deletes the attribute has_staff from the class Department, which in turn deletes the class Staff. In addition, the user adds the local class Course to this schema version.

[Figure 6.2(a): class lattice of VDB2, showing the classes Person, Employee, Graduate, Faculty, Department, and Course with their attributes and instances.]

    > ... start from DB1 ...
    > create vdb VDB2 on DB1;
    > access vdb VDB2;
    VDB2> import Object* from DB1;
    VDB2> delete class Undergraduate;
    VDB2> delete property has_staff* from Department;
    VDB2> add class Course (courseno: String, offered_by: Department, instructor: Faculty);
    VDB2> define VDB2 as weakly dependent;
    VDB2> ... manipulation of schema and instances ...
    VDB2> exit;
    > ... back to DB1 ...

(a) VDB2 (b) Creation of VDB2
Figure 6.2: An Example Virtual Database VDB2 as a Schema Version

6.2.3 Schema Change and Deletion

The operations for changing the schema of a virtual database are used for changing the schema of a schema version. As for a virtual database, the schema of a schema version can be changed by simply modifying the current schema or by importing classes. However, in the latter case, the schema version can import additional classes only from its base, since a schema version must be defined on a single base. The operation for deleting a virtual database is used for deleting a schema version. Finally, the operations for changing the schema of a schema version and deleting a schema version are allowed only if the schema version is not a base of any other version.

6.2.4 Instance Manipulation

The operations for manipulating instances of a virtual database are used for a schema version.
Note that, as mentioned earlier, a virtual database used as a schema version must be defined as weakly dependent on its base; for example, in Figure 6.2(b), VDB2 is defined as weakly dependent on DB1. A schema version may have imported instances as well as local instances; its imported instances are shared by the underlying database and/or other schema versions. As a result, this schema version approach does not support instance versioning, and thus does not support queries for handling historical information.

6.2.5 Comparisons with Related Work

We now compare this schema version approach with the previous approach of [30].

• Our approach is more limited than the previous approach, in the sense that instance versioning is not used. However, this limitation actually allows us to avoid the problems of complicated instance manipulation of the previous approach, which are discussed in Section 2.2.3.
• As opposed to the previous approach, our approach does not support queries that handle historical information for instances,² since it does not support instance versioning. It rather focuses on supporting users, who have changing perspectives of a database, to share the database through different schema versions at the same time.
• In the previous approach, instance update on a schema version may cause the creation of a new schema version, if the schema version is a base of another schema version (see Section 2.2.3). In our approach, this situation does not happen, since instance update is allowed on any schema version, regardless of whether it is a base of another schema version or not.

In summary, our approach does not support instance versioning, but resolves the problems of the previous approach.

² An example of these queries can be found in Section 2.2.
6.3 Special Virtual Databases

Since virtual databases can be used as schema views or versions, the virtual database concept can be considered a concept that unifies the schema view and version concepts. However, the virtual database concept is more general than the unified concept. Hence, the difference between the former and the latter can be considered another concept that is supported by the virtual database approach, but is not the schema view, schema version, or unified concept. This concept is termed the special virtual database concept. Note that the virtual database concept is thus the union of the schema view, schema version, and special virtual database concepts. In this section, we describe the characteristics of the special virtual database concept and its manipulation mechanism.

6.3.1 Concept of a Special Virtual Database

A virtual database that is neither a schema view nor a schema version is used as a special virtual database. Hence, the characteristics of the virtual database concept, except for those of the schema view and version concepts, are applied to the special virtual database concept. In what follows, we consider the distinctive characteristics of a special virtual database.

Consider a virtual database VDB_i created by importing some (but not all) classes from its base(s), and adding local classes to these imported classes. This VDB_i is a virtual database but is neither a schema view nor a schema version; hence, it is a special virtual database.³ Regardless of whether VDB_i is defined as strongly or weakly dependent, it is different from a schema view, in the sense that it contains local classes (and instances) that are neither imported nor derived. Further, VDB_i is not a schema version. If it has multiple bases, or it is defined as strongly dependent, it is obviously different from a schema version.
⁴ If it has only one base and is defined as weakly dependent, VDB_i may look like a schema version, but is still different. In particular, VDB_i is created by importing only some classes of its base and transforming them, while a schema version is created by importing all classes of a base and transforming them. One may argue that VDB_i can be created by importing all classes of its base and transforming them. However, this could be very expensive, since a virtual database often imports only a small number of classes from its base.

³ Note that, as mentioned above, the special virtual database concept is not a concept that unifies the schema view and version concepts.
⁴ Note that a virtual database used as a schema version must be created on a single base, and defined as weakly dependent on its base, as described in Section 6.2.1.

Example of VDB1

The virtual database VDB1 of Figure 4.1 represents a special virtual database created on the database DB1 of Figure 3.2. It contains only information about faculties and research assistants from DB1, and the local class Project. Note that it has two derived classes (i.e., Junior_Faculty and Senior_Faculty) derived from the class Faculty. VDB1 is neither a schema view nor a schema version.

6.3.2 Manipulation of Special Virtual Databases

The process of creating a virtual database is used for creating a special virtual database. A special virtual database imports some (but not all) classes of its base(s) in the importing phase of the creation process. In addition, local classes as well as new properties can be added to a special virtual database during the transforming phase.

Example of VDB1

The user creates the special virtual database VDB1 (Figure 4.1) on the database DB1 (Figure 3.2), by executing the virtual database operations illustrated in Figure 4.4.
Details of this creation process are in Sections 4.2.1 and 5.1. The user imports some (but not all) classes from DB1, deletes unnecessary classes and properties, and adds derived and local classes as well as new properties.

The operations for deleting a virtual database, changing its schema, and manipulating its instances are used for a special virtual database. Note that a virtual database used as a special virtual database can be defined as strongly or weakly dependent.

6.4 Support of User Perspectives

In this section, we examine how comprehensively the virtual database approach supports user perspectives of a database. In order to do this, we first describe the possible application cases of the virtual database approach; these cases actually indicate the cases of user perspective support of this approach. We then compare the virtual database approach with the schema view, schema version, and unified approaches.

6.4.1 Virtual Database Application Cases

We have observed that a virtual database can be used as a schema view, a schema version, or a special virtual database. Since a virtual database can be created on other virtual databases and/or the underlying database, these three concepts can be used in the same database environment. Figure 6.3 illustrates the possible cases of this virtual database application. Here, there are two kinds of arrows: regular and bold. A regular arrow from A to B indicates that only one A can be a base of B, while a bold arrow from A to B indicates that more than one A can be bases of B. In addition, there are two kinds of circles: regular and bold. The regular circle for Database indicates that there exists only one underlying database; the regular circle for Schema Version indicates that a schema version has only one base.
The bold circle for Schema View (and Special Virtual Database) indicates that a schema view (and a special virtual database) can have more than one base. Further, there are four nodes: Database, Schema View, Schema Version, and Special Virtual Database.

• The node Database is included in a regular circle, and has three outgoing arrows, all of which are regular. This means that there exists only one underlying database, which can be a base of a schema view, a schema version, or a special virtual database; at most one base of a schema view, a schema version, or a special virtual database can be the database.
• The node Schema View is included in a bold circle, and has four incoming arrows, where three of them are bold and one is regular. This means that a schema view is created on one or more bases, which can be selected from the underlying database as well as the schema views, schema versions, and special virtual databases created on the database; at most one base of a schema view can be the database.
• The node Schema Version is included in a regular circle, and has four incoming arrows, all of which are regular. This means that a schema version is created on a single base, which is the underlying database, a schema view, a schema version, or a special virtual database.
• The node Special Virtual Database is included in a bold circle, and has four incoming arrows, where three of them are bold and one is regular. This means that a special virtual database is created on one or more bases, which can be selected from the underlying database as well as the schema views, schema versions, and special virtual databases created on the database; at most one base of a special virtual database can be the database.

[Figure 6.3: Possible Cases of Virtual Database Application. Nodes: Database, Schema View, Schema Version, Special Virtual Database.]
The possible cases of the virtual database application illustrated in Figure 6.3 can be summarized as follows. First, a schema view or a special virtual database can be created on

• the underlying database,
• one or more schema views,
• one or more schema versions,
• one or more special virtual databases, or
• any combination of the above.

Second, a schema version can be created on

• the underlying database,
• one schema view,
• one schema version, or
• one special virtual database.

Examples of VDB5 and VDB6

Figure 6.4 represents two example cases: (a) a schema view created on a schema version and (b) a schema version created on a schema view. First, Figure 6.4(a) represents the schema view VDB5 created on the schema version VDB2, which is created by importing only information about courses as well as their instructors and teaching assistants from VDB2; VDB5 must be defined as strongly dependent on VDB2. Second, Figure 6.4(b) represents the schema version VDB6 created on the schema view VDB4, which is created by importing the entire VDB4, adding the local class Salary_Status, and adding the attribute salary to the (imported) class Employee.
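The base rules summarized above can be sketched as a small check. This is a minimal illustration only, written in Python rather than in the prototype's language; the names `Node`, `bases_allowed`, and the kind constants are hypothetical and do not appear in the virtual database mechanism itself.

```python
from dataclasses import dataclass, field
from typing import List

# Node kinds of Figure 6.3 (constant names are illustrative assumptions).
DATABASE = "database"
VIEW = "schema view"
VERSION = "schema version"
SPECIAL = "special virtual database"


@dataclass
class Node:
    """A database or virtual database together with the bases it is created on."""
    name: str
    kind: str
    bases: List["Node"] = field(default_factory=list)


def bases_allowed(kind: str, bases: List[Node]) -> bool:
    """Return True if `bases` is a legal set of bases for a node of `kind`."""
    if kind == DATABASE:
        # The underlying database is the root and has no base.
        return not bases
    if not bases:
        # Every virtual database is created on at least one base.
        return False
    if kind == VERSION:
        # A schema version is created on exactly one base of any kind.
        return len(bases) == 1
    if kind in (VIEW, SPECIAL):
        # One or more bases; at most one of them can be the database.
        return sum(b.kind == DATABASE for b in bases) <= 1
    raise ValueError(f"unknown kind: {kind}")


# The example cases of Figure 6.4.
db1 = Node("DB1", DATABASE)
vdb2 = Node("VDB2", VERSION, [db1])
vdb4 = Node("VDB4", VIEW, [db1])
vdb5 = Node("VDB5", VIEW, [vdb2])
vdb6 = Node("VDB6", VERSION, [vdb4])
```

Under these assumptions, `bases_allowed(vdb5.kind, vdb5.bases)` and `bases_allowed(vdb6.kind, vdb6.bases)` both hold, while a schema version on two bases is rejected.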
[Figure 6.4: Example Cases of Virtual Database Application. (a) VDB5 (b) VDB6]

Finally, since a database and its virtual databases are organized into a directed acyclic graph (DAG) rooted at the database (i.e., the derivation graph for the database), it follows that a database and the schema views, schema versions, and special virtual databases created on the database are organized into the derivation graph for the database. Figure 6.5 shows the derivation graph for the database DB1, which includes all example virtual databases considered in this dissertation.

6.4.2 Comparisons with Previous Work

Figure 6.3 illustrates that the virtual database approach supports user perspectives that the schema view and version approaches, as well as the unified approaches of both, support.

First, in the schema view approaches, a schema view represents a user perspective, and can be created on

• the underlying database,
• one or more schema views, or
• any combination of the above.

The virtual database approach supports these cases, since in Figure 6.3, they are represented by the nodes Database and Schema View and the arrows among them. It also supports all possible schema change cases that the schema view approaches support.
[Figure 6.5: Virtual Database Derivation Graph for DB1. Legend: schema view, schema version, special virtual database.]

Second, in the schema version approaches, a schema version represents a user perspective, and can be created on

• the underlying database, or
• one schema version.

The virtual database approach supports these cases, since in Figure 6.3, they are represented by the nodes Database and Schema Version and the arrows among them. It also supports all possible schema change cases that the schema version approaches support.

Third, in the unified approaches, a unified concept represents a user perspective, and is used as either a schema view or version.⁵ Hence, a schema view can be created on

• the underlying database,
• one or more schema views,
• one or more schema versions, or
• any combination of the above.

⁵ Note that this concept is not the same as a special virtual database, which is neither a schema view nor a schema version.

Also, a schema version can be created on

• the underlying database,
• one schema view, or
• one schema version.

The virtual database approach supports these cases, since in Figure 6.3, they are represented by the nodes Database, Schema View, and Schema Version, and the arrows among them. It also supports all possible schema change cases that the unified approaches support.

Figure 6.3 also illustrates that the virtual database approach supports those user perspectives that the above schema view, schema version, and unified approaches do not support. In the virtual database approach, special virtual databases are supported, and can be used with schema views and versions in the same database environment. This is represented by the node Special Virtual Database, the self-loop arrow of this node, and the arrows between this node and other nodes in Figure 6.3.
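The distinction among schema views, schema versions, and special virtual databases drawn in Sections 6.1 to 6.3 can be sketched as a simple decision procedure. This is a simplified reading under stated assumptions, not part of the mechanism: the record `VirtualDatabase`, its field names, and the function `classify` are hypothetical, and the field values given for VDB4 other than its dependency are assumed for illustration.

```python
from dataclasses import dataclass


@dataclass
class VirtualDatabase:
    """Hypothetical summary of the properties that Chapter 6 uses to tell the kinds apart."""
    name: str
    num_bases: int
    strongly_dependent: bool
    has_local_classes: bool
    imported_all_base_classes: bool


def classify(vdb: VirtualDatabase) -> str:
    """Classify a virtual database as a schema view, schema version, or special one."""
    # Schema view: strongly dependent, with no local classes (only imported or derived).
    if vdb.strongly_dependent and not vdb.has_local_classes:
        return "schema view"
    # Schema version: single base, weakly dependent, created by importing all classes.
    if (vdb.num_bases == 1
            and not vdb.strongly_dependent
            and vdb.imported_all_base_classes):
        return "schema version"
    # Everything else is a special virtual database.
    return "special virtual database"


# The running examples, all created on DB1.
vdb1 = VirtualDatabase("VDB1", 1, strongly_dependent=False,
                       has_local_classes=True, imported_all_base_classes=False)
vdb2 = VirtualDatabase("VDB2", 1, strongly_dependent=False,
                       has_local_classes=True, imported_all_base_classes=True)
vdb4 = VirtualDatabase("VDB4", 1, strongly_dependent=True,
                       has_local_classes=False, imported_all_base_classes=False)
```

Note that VDB1 is classified as a special virtual database whether it is defined as strongly or weakly dependent, which matches the argument of Section 6.3.1.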
In summary, the virtual database approach can be considered a schema view or version approach. As a result, it can also be considered an approach that unifies the schema view and version approaches. However, as discussed above, it is more general than the unified approach, due to the special virtual database concept that it supports. This indicates that the virtual database approach not only unifies the schema view and version approaches, but also extends them. Hence, the virtual database approach supports different and changing user perspectives of a database more comprehensively than the schema view, schema version, and unified approaches.

Chapter 7
Prototype Implementation

An experimental prototype of the virtual database mechanism has been implemented using the Omega database system [19] developed at USC. In this chapter, we briefly describe the features of Omega that are relevant to our prototype implementation. We also discuss the important issues related to this implementation.

7.1 Omega

Omega is a parallel object-based database system constructed using the Wisconsin Storage Structure [12], which is a relational file system. It is based on a functional data model similar to [46], whose primary modeling constructs are objects, functions, and classes. This data model supports two kinds of functions: stored and computed. A stored function can be considered an attribute, while a computed function can be considered a method with a single input argument. In addition, the data model supports classification, aggregation, generalization, (multiple) inheritance of functions, multiple class membership, and property overriding.

Omega supports a variant of OSQL [18, 50] as its database language.
Like SQL for relational databases, this OSQL (variant) serves as the data definition, data manipulation, and query language of Omega¹; hence, it provides the operations for manipulating meta-data (i.e., classes and functions) as well as data (i.e., instances). In addition, OSQL uses the select operation for querying instances, which is similar to the select operation of SQL in the relational database context. Consider the following example select queries.

    select vehicleid(c), maker(c)
    for each Car c;

    select pname(p), maker(c)
    for each Person p, Car c
    where ((pname(p) = owner(c)) and (work_for(p) = maker(c)));

¹ In the remainder of this dissertation, we will use the term OSQL to indicate the database language of Omega.

Suppose that these queries are posed against the database consisting of the following three classes: Car (vehicleid: String, owner: String, maker: Company), Person (pname: String, work_for: Company), and Company (cname: String, location: String). The first query returns the vehicle identifier and maker of each car, while the second returns the names of persons who work for the makers of their own cars, together with these makers. Note that the second query illustrates a case that is similar to joining two tables in the relational database context.

7.2 Virtual Database Mechanism

Omega supports the modeling features of CODM discussed in Chapter 3, except that attributes and methods of CODM can be represented as stored and computed functions in Omega respectively, and computed functions correspond to special methods. In addition, we could access and modify the source code of Omega. Hence, we decided to use Omega for implementing the virtual database mechanism. In this section, we first illustrate how virtual databases can be represented in an Omega database, and then discuss the issues related to the implementation of the virtual database manipulation operations.
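As a point of reference for the second select query of Section 7.1, its join-like semantics can be emulated outside Omega. The following Python sketch is an illustration only: it mirrors the three classes Car, Person, and Company given in the text, while the function name `owners_employed_by_maker` and the sample instances are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Company:
    cname: str
    location: str


@dataclass(frozen=True)
class Car:
    vehicleid: str
    owner: str
    maker: Company


@dataclass(frozen=True)
class Person:
    pname: str
    work_for: Company


def owners_employed_by_maker(persons: List[Person],
                             cars: List[Car]) -> List[Tuple[str, Company]]:
    """Emulate: select pname(p), maker(c) for each Person p, Car c
    where pname(p) = owner(c) and work_for(p) = maker(c)."""
    return [
        (p.pname, c.maker)
        for p in persons          # "for each Person p"
        for c in cars             # "for each Car c"
        if p.pname == c.owner and p.work_for == c.maker
    ]


# Hypothetical sample instances.
acme = Company("Acme", "Los Angeles")
zenith = Company("Zenith", "Detroit")
persons = [Person("Kim", acme), Person("Lee", zenith)]
cars = [Car("V1", "Kim", acme), Car("V2", "Lee", acme)]

result = owners_employed_by_maker(persons, cars)
```

The nested iteration over both classes with a conjunctive condition is what makes the query resemble a relational join; with the sample instances above, only Kim owns a car made by her employer.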
7.2.1 Representation of Virtual Databases

In our implementation, virtual databases defined on an underlying database are actually created and stored in the database. The class lattice of each virtual database is created in the database, and its root is made to be a direct subclass of the root of the class lattice of the database. This indicates that the class lattice of the virtual database is integrated into a branch of the class lattice of the database. However, this integration may cause the following conflict. A database and its virtual databases may have classes with the same name; in this case, virtual databases cannot be stored in the database, since a class must have a unique name in a database. We resolve this conflict by simply prefixing the (unique) identifier of a virtual database to the names of its classes; for example, VDB1_Object, VDB1_Person. The class names that are used in the definitions of attributes and methods in a virtual database are also changed. Figure 7.1 illustrates how the virtual databases of the database DB1 are represented and stored in DB1. However, users do not see the virtual database identifiers prefixed to class names when they access virtual databases. Hence, they simply use non-prefixed ordinary class names in their queries or virtual database operations; the query processor then changes these names into prefixed names and uses the latter internally.

[Figure 7.1: Representation of Virtual Databases in DB1]

In Figure 7.1, the way of representing virtual databases in a database does not reveal the order of derivation among them (i.e., the derivation graph).
This order is important for instance manipulation, as discussed in Section 5.4. It is not stored in the database; instead, it is stored and maintained in the table Derivation_Graph where, for each virtual database (including the underlying database), its bases as well as the virtual databases of which it is a base are specified.

7.2.2 Implementation of Virtual Database Operations

The current prototype of the virtual database mechanism does not fully support all virtual database operations found in Chapter 5. It fully supports the operations for deletion and instance manipulation of a virtual database, but partially supports those for single-base creation, multiple-base creation, and schema change. The supported operations have been included in OSQL. In what follows, we discuss the issues and problems related to the implementation of the virtual database operations.

Creation on a Single Base

As described in Section 5.1, the single-base creation process consists of the following phases: preparing, importing, pruning, and transforming. The operations used for the preparing phase (i.e., create vdb and access vdb) have been implemented to create and open a script file, respectively. Note that the name of this file is the identifier of the virtual database to be created.

The operation for the importing phase (i.e., import) has been implemented as follows. As discussed in Section 5.1.2, this operation is divided into the following steps: class definition importing, type-closure checking, is-a relationship finding, and instance arranging.
Since classes must be associated with other classes through is-a relationships, but the is-a relationships among the (imported) classes of a virtual database are created during the is-a relationship finding step, the first three steps have been implemented to copy the definitions of the classes into, and manipulate, a simple data structure (e.g., an array of structures in the C programming language). After the is-a relationship finding step, the class definitions are retrieved from the data structure, and converted into the Omega class definition format; this conversion also prefixes the identifier of the virtual database to all class names. The converted class definitions are then put into the script file created during the preparing phase, which is executed to actually create imported classes in the underlying database. An example script file is shown in Figure 7.2², which creates the imported classes of VDB1 (Figure 4.1) in the database DB1 (Figure 3.2). Note that the root class of VDB1 (i.e., VDB1_Object) is explicitly created, as shown in the first line of the file. After creating imported classes in the virtual database, the instance arranging step is done by using the add type operation of OSQL, which makes an instance belong to a specified class.

After the importing phase, the virtual database being created becomes another virtual database and is stored in the underlying database.³ The operations for the pruning and transforming phases thus require the dynamic schema changes of the database. Omega partially supports only a few of these operations (i.e., creating and deleting a class or function). Since Omega was not originally designed and developed to effectively support dynamic schema changes, however, it is too hard to implement the other operations for the pruning and transforming phases using Omega. As a result, these operations have not been implemented.
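The conversion step of the importing phase described above, which prefixes the virtual database identifier to class names and emits class definitions for the script file, can be sketched as follows. This is a hedged illustration in Python, not the prototype's C implementation: the function names `prefixed` and `create_type_statement` and the constant `BUILTIN_TYPES` are hypothetical, and the statement layout is modeled on the script of Figure 7.2.

```python
from typing import List, Optional, Tuple

# Value types that are not classes and therefore are not prefixed (assumption).
BUILTIN_TYPES = {"INTEGER", "STRING"}


def prefixed(vdb_id: str, class_name: str) -> str:
    """Prefix the virtual database identifier to a class name, e.g. VDB1_Person."""
    return f"{vdb_id}_{class_name}"


def create_type_statement(vdb_id: str,
                          name: str,
                          superclass: Optional[str],
                          attributes: List[Tuple[str, str]]) -> str:
    """Build one 'create type' statement in the layout of Figure 7.2."""
    statement = f"create type {prefixed(vdb_id, name)}"
    if superclass is not None:
        statement += f" subtype of {prefixed(vdb_id, superclass)}"
    if attributes:
        columns = ", ".join(
            # Class-valued attributes refer to the prefixed class name.
            f"{attr} {typ if typ in BUILTIN_TYPES else prefixed(vdb_id, typ)}"
            for attr, typ in attributes
        )
        statement += f" ( {columns} )"
    return statement + ";"


# A few imported class definitions of VDB1, as (name, superclass, attributes).
definitions = [
    ("Object", None, []),
    ("Person", "Object", [("ssno", "INTEGER"), ("name", "STRING")]),
    ("Department", "Object", [("dname", "STRING"), ("has_staff", "Staff")]),
]

script_lines = [create_type_statement("VDB1", n, s, a) for n, s, a in definitions]
```

Prefixing is applied both to the class being defined and to class names used as attribute types, which corresponds to the statement in Section 7.2.1 that class names used in attribute and method definitions are also changed.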
Finally, the virtual database mechanism presented in Chapter 5 does not support the join operation among arbitrary classes similar to the join operation in the relational database context. However, as illustrated in the above example select query, the select operation of OSQL allows users to execute this join operation similarly to a relational view. In addition, the select operation can store the result of the join into a class. Note that this class is stored as a subclass of the root of the class lattice of a virtual database. For example, the user can store the result of the above select example in the class Employment by the following select query.

    select pname(p), maker(c)
    into Employment as stored
    for each Person p, Car c
    where ((pname(p) = owner(c)) and (work_for(p) = maker(c)));

² Omega uses the term "type" to indicate a class.
³ Note that a database and its virtual databases are stored in the database, as described in Section 7.2.1.

    create type VDB1_Object;
    create type VDB1_Person subtype of VDB1_Object ( ssno INTEGER, name STRING );
    create type VDB1_Staff subtype of VDB1_Person ( empno INTEGER );
    create type VDB1_Department subtype of VDB1_Object ( dname STRING, has_staff VDB1_Staff );
    create type VDB1_Faculty subtype of VDB1_Person ( empno INTEGER, faculty_of VDB1_Department, has_tenure STRING );
    create type VDB1_RA subtype of VDB1_Person ( major VDB1_Department, empno INTEGER, advisor VDB1_Faculty );
    create function find_advisor (VDB1_RA r) -> (INTEGER i) as computed;
    quit;

Figure 7.2: An Example Schema Definition Script File VDB1 for Omega

Creation on Multiple Bases

The multiple-base creation process described in Section 5.2 consists of the following phases: (1) preparing, (2) importing and pruning for a primary base, (3) importing, pruning, and integrating for each
non-primary base, and (4) transforming. Here, the operations used for the importing, pruning, and integrating phases for each non-primary base require dynamic schema changes of the underlying database; hence, they have not been implemented. However, the operations for other phases are identical to the corresponding operations used for the single-base creation process.

Schema Change

As described in Section 5.3.1, the schema of a virtual database can be changed (1) through transforming, and/or (2) through importing, pruning, and integrating. The former uses the operations for the transforming phase of the single/multiple-base creation process, while the latter uses those for the importing, pruning, and integrating phases for a non-primary base of the multiple-base creation process. Hence, the former has been partially implemented, while the latter has not been implemented.

Deletion

The operation for deleting a virtual database (i.e., delete vdb) has been implemented by using the delete instance, remove type, and delete type operations of OSQL. It first deletes instances and then classes of a virtual database. Here, delete instance deletes a local instance from a database; remove type removes the membership of an imported instance from a virtual database; delete type deletes a class (with no instance) from a database.

Instance Manipulation

Omega originally supports the operations that are equivalent to some of the instance manipulation operations described in Section 5.4 but have different syntax. The operation for creating an instance (i.e., create instance) has been implemented by using the create instance operation of OSQL as follows.
The instance is created in a database or virtual database by executing create instance of OSQL; it is then made to be a member of another virtual database, if necessary, by executing add type of OSQL. The operation for deleting an instance (i.e., delete instance) is equivalent to the delete instance operation of OSQL. The operation for moving an instance to a class (i.e., move instance) has been implemented by using the remove type and add type operations of OSQL.

The operation for changing an attribute value of an instance (i.e., change attribute value) is equivalent to the set operation of OSQL. Further, the operations for retrieving an attribute value of an instance (i.e., retrieve attribute value) and executing a method on an instance (i.e., execute method) are originally supported by Omega in the form of the select operation of OSQL; the select queries discussed in Section 7.1 illustrate example cases of retrieving attribute values. Due to the support of multiple class membership in Omega, however, an instance may have multiple functions with the same name. This naming conflict may be caused within a database or virtual database (i.e., among classes and their superclasses), or among a virtual database and its bases (i.e., among imported classes and their base classes). It is obvious that this conflict must be resolved to support the above three operations (i.e., change attribute value, retrieve attribute value, and execute method). Omega originally resolves the former conflict case by run-time binding. However, this binding mechanism had to be extended to resolve the latter conflict case. Thus, the resolution strategy discussed in Section 4.1.1 has been implemented.

Chapter 8
Conclusion

In this chapter, we summarize the results and contributions of the virtual database approach presented in this dissertation.
We also discuss directions for future research based on this approach.

8.1 Summary

In this dissertation, we have introduced the concept of a virtual database, including the characteristics of the schema and instances of a virtual database, and the notion of a derivation graph. We have also presented a mechanism for manipulating virtual databases, which provides users with the operations for creating and deleting a virtual database and changing its schema, as well as for creating, deleting, changing, and querying/retrieving its instances. In addition, we have illustrated how virtual databases can be used as schema views or versions, and compared these schema view and version concepts with the previous schema view and version approaches, respectively. We have discussed how the virtual database approach supports a concept that is neither the schema view nor the version concept. Further, we have examined how the virtual database approach can support different and changing user perspectives of an object database, and compared this approach with the schema view, schema version, and unified approaches, respectively. Finally, we have discussed the issues related to the implementation of an experimental prototype of the virtual database concept and its manipulation mechanism.

8.2 Contributions

The contributions of the virtual database approach can be summarized as follows.

• The virtual database approach has not been addressed previously. It can be considered a schema view or version approach. As a result, it can also be considered an approach that unifies the schema view and version approaches. However, it is more general than the unified approach. This indicates that the virtual database approach not only unifies the schema view and version approaches, but also extends them.
Hence, the virtual database approach supports different and changing user perspectives of a database more comprehensively than the schema view, schema version, and unified approaches.

• The virtual database approach can be considered a schema view approach, since virtual databases can be used as schema views. This schema view approach resolves the limitations of the previous approaches. We believe that the features required for a desirable schema view concept for object databases have been recognized and proposed by the previous approaches. However, hardly any schema view approach supports all of these features. Our schema view approach adopts these required features from the previous approaches and incorporates them into a single framework.

• The virtual database approach can be considered a schema version approach, since virtual databases can be used as schema versions. This schema version approach resolves the limitations of the previous approach (i.e., [30]). It is more limited than the previous approach, in the sense that instance versioning is not employed. However, this limitation actually allows us to avoid the complicated instance manipulation of the previous approach.

• The virtual database approach is based on a simple object data model (i.e., CODM), and some of the manipulation operations deal with dynamic changes to the schema of a virtual database. Hence, as long as a database system is based on a data model that supports the features of CODM, and also allows the schema of its database to be dynamically changed, the virtual database approach can be easily implemented in this database system.

8.3 Directions for Future Research

In this section, we discuss directions for future work based on the virtual database approach. We divide this section into two parts.
In the first part, we describe the extension and refinement that is needed for the current approach. In the second part, we consider the application of our approach to the partial integration of component databases in federated database systems.

8.3.1 Single Database Environment

The current status of the virtual database approach requires the following future research.

• The virtual database approach presently provides only a minimum set of manipulation operations that are needed to show the distinctive features of virtual databases. Other operations, such as the select operation in OSQL (see Section 7), are simply assumed to be supported by a query language into which the manipulation operations are integrated. However, it is not clear which operations are included in those other operations, since there is no consensus about the requirements for a query language in the object database context. Therefore, we need to work (1) on developing a query language that embodies the virtual database manipulation operations and those other operations, or (2) at least on specifying those other operations that must be supported by a query language into which the virtual database manipulation operations are integrated.

• Some of the manipulation operations have side effects; hence, it is often difficult to predict the results of executing these operations. For example, when a user creates a virtual database, he/she selects some classes from its base(s) with the import operation. Since this operation may import additional classes that are not specified by the user and creates the is-a relationships among classes, predicting its result is very difficult even for a simple case. Hence, a tool is required that allows users to display the schema of a virtual database whenever needed.
Presently, the display schema operation (see Appendix A) displays the definition of the classes of a (virtual) database in a text form. Due to the graphical nature of (virtual) database schemas, we need to develop a new operation that displays the schema of a (virtual) database in a graphical form.

• Many view approaches in the object database context propose an operation similar to the join operation for relational databases. They propose it because they claim that object database systems must support the features that are supported by relational database systems. We need to work on supporting this arbitrary join operation in the virtual database approach.

• During the importing phase of the creation process, the type-closure checking step simply imports all classes that are recursively referenced by the user-selected classes. However, this step may import many unnecessary additional classes that are then deleted in the pruning phase. A more efficient algorithm that reduces the number of unnecessary classes to be imported is required. In addition, the algorithm for the is-a relationship finding step requires some additional optimization of its procedures.

8.3.2 Multiple Database Environment

The virtual database approach can be applied to information sharing in an environment that consists of heterogeneous and autonomous databases (we call this environment a multiple database environment). In this environment, each database is termed a component database, and may have virtual databases; we call an environment consisting of a component database and its virtual databases a single database environment. In addition, since component databases share information with (i.e., import information from) others, we regard them as virtual databases. Figure 8.1 shows an example multiple database environment.
Here, the component database DB2 and its virtual databases VDB2.1 and VDB2.2 are interrelated via solid arrows; the information sharing among single database environments is represented via dotted arrows. The component database DB5 imports information from the component database DB4 and the virtual database VDB2.1; the virtual database VDB3.1 imports from the component database DB2; the virtual database VDB4.1 imports from the virtual databases VDB2.2 and VDB3.2.

Figure 8.1: Virtual Databases in a Multiple Database Environment

This virtual database application provides a simple but flexible mechanism for the seamless partial integration of component databases. This mechanism does not put emphasis on the methodologies for the subtasks of partial integration, for example, the methodologies for solving heterogeneity problems among component databases and the methodologies for integrating schemas of component databases. Instead, it assumes that users know such methodologies, and provides the users with operations with which they can easily carry out the subtasks of partial integration following their methodologies. For example, the mechanism provides the operations for specifying export schemas¹ for a component database, dynamically importing and integrating information from a remote database into a local database, and changing the meta-data of imported information as well as the schema of a local database for resolving heterogeneity between local and remote databases. In order to support this mechanism, the virtual database manipulation mechanism needs to be extended to deal with the interactions and relationships among multiple databases.

This application also provides a flexible architecture for information sharing among heterogeneous and autonomous databases.
There have been a number of research approaches for sharing information among existing heterogeneous databases [15, 17, 23, 34, 37]. The common principle of these approaches is to provide the users of the local database with information in remote databases, so that the users can manipulate this remote information as they manipulate local information.

[Footnote 1: An export schema of a component database indicates a part of the database that can be shared by other component databases [23]. It can be represented by a virtual database.]

There are three major directions that the existing approaches take for implementing this principle. The first direction is to develop a multidatabase query language to support queries involving information in multiple databases. [34] takes this direction. The problem of this approach is that the users of the local database must know what is in remote databases, in order to place queries involving multiple databases (we call these queries multidatabase queries). The second direction is to totally integrate the schemas of local and remote databases into a single global schema. Multidatabase queries are placed on this global schema. [15] and [37] take this direction. In particular, [15] concentrates on the methodologies for integrating schemas and processing multidatabase queries, while [37] concentrates on the operations for interactively integrating and manipulating schemas. However, the common problem of these two approaches is that constructing a global schema is complicated and sometimes impossible. The third and final direction is to partially integrate remote databases into the local database (i.e., integrating portions of remote databases into the local database). [23] and [17] take this direction.
[23] discusses the integration of the partial schemas of remote databases into a single schema (called an import schema) for the local database, but does not address the issue of integrating this import schema into the local database. [17] proposes an experimental system for partially integrating the schemas and instances of remote databases into the local database, but it actually concentrates on an efficient mechanism for accessing local and remote information (without considering their locations) from the local database, rather than on the mechanism for partial integration. However, the common problem of these two approaches is that importing information into a local database changes the current schema of the local database, even though the current schema may be needed later by some users.

The architecture that the virtual database application provides has several advantages over the approaches that deal with total or partial integration of component databases (i.e., those that take the second and third directions). Some of them are discussed in the following. First, in this architecture, component databases as well as their virtual databases can participate in information sharing, rather than component databases only. In other words, information can be imported directly into virtual databases as well as component databases. This allows more diverse information sharing patterns. Second, if some users of a local database do not want to change the schema, importing information from remote databases into a local database can be achieved by creating a virtual database that imports from remote databases as well as the local database. Third, since virtual databases are often smaller than their underlying component database, the size of information imported into virtual databases is often smaller than the size of information imported into component databases.
This indicates smaller overhead of integration.

Appendix A
Virtual Database Manipulation Operations

This appendix describes the syntax and semantics of the virtual database manipulation operations. The syntax of the operations includes the following notations.

• <...> is a nonterminal symbol,
• {...} is an optional part, and
• | represents "or".

A.1 Virtual Database Operations

This section contains the operations for handling a virtual database itself or for involving more than one virtual database. Details of these operations except for the exit, display schema, and define operations can be found in Chapter 5.

• create vdb <vdb-id> on <base-id-list>;
• delete vdb <vdb-id>;
• access vdb <vdb-id>;
• exit;
• import <r-class-name-list> from <base-id>;
• integrate;
• display schema <base-id>;
• define <vdb-id> as <vdb-dependency>;

where

  <base-id> := <vdb-id> | <db-id>
  <base-id-list> := <base-id>{, <base-id-list>}
  <r-class-name> := <class-name>{*}
  <r-class-name-list> := <r-class-name>{, <r-class-name-list>}
  <vdb-dependency> := strongly dependent | weakly dependent

Here, <vdb-id> is the identifier of a virtual database; <db-id> is the identifier of a database; <class-name> is the name of a class.

• The create vdb operation creates an empty virtual database on one or more bases.
• The delete vdb operation deletes a virtual database if it is not a base of another virtual database.
• The access vdb operation accesses a virtual database.
• The exit operation exits from a virtual database or a database.
• The import operation imports classes and their instances from a base into the current virtual database, and also derives is-a relationships from the base.
• The integrate operation integrates information imported from a base into the current virtual database.
• The display schema operation displays the definition of each class of a virtual database or a database in a text form.
• The define operation defines a virtual database as strongly dependent or weakly dependent.

A.2 Class Operations

This section contains the operations for handling classes.

• add class <class-name> {subclass of <class-name-list>} {(<property-def-list>)} {via <class-derivation-predicate>};
• delete class <class-name>;
• rename class <class-name> as <class-name>;

where

  <class-name-list> := <class-name>{, <class-name-list>}

<class-derivation-predicate> has one of the following forms:

1. A simple predicate, which is either
   1) <path-name> <scalar-comparator> <constant> or
   2) <path-name> <scalar-comparator> <path-name>
   where <path-name> is a sequence of attribute names, <scalar-comparator> is one of =, ≠, <, ≤, >, and ≥, and <constant> is an integer, a string, or a Boolean value
2. A Boolean combination of simple predicates using the operators and, or, and not
3. <class-name> <set-operator> <class-name>
   where <set-operator> is one of intersection, union, and difference

• The add class operation creates a new class C_i. It is invoked with or without <class-derivation-predicate>. When this predicate is not given, the class C_i with the given superclass(es) is created. If no superclass is specified, the class Object becomes the superclass of C_i. The attributes and methods of the superclass(es) are inherited by C_i.

On the other hand, three kinds of predicates can be given. The first one is a simple predicate, which has one of the two forms. First, an example of using the first form is the creation of the class Honor_student as a subclass of the class Student with the predicate gpa > 3.5. We here suppose that Student has the attribute gpa whose domain class is Real.
Second, an example of using the second form is the creation of the class Special_RA as a subclass of RA with the predicate major ≠ advisor.faculty_of. When creating a class with a simple predicate, the superclass of the class is specified by a user.

The second one is a Boolean combination of simple predicates using the operators and, or, and not. An example of using this kind of predicate is the creation of the class CS_RA as a subclass of RA with the following predicate: major.dname = "CS" and major = advisor.faculty_of. When creating a class with this kind of predicate, the superclass of the class is specified by a user.

In the cases using these two kinds of predicates, the attributes and methods of the superclass(es) are inherited by the class. Further, the instances of the superclass(es) which satisfy the predicate are moved into the class.

The last one is a predicate using one of the set operators intersection, union, and difference. First, an example of using a predicate with intersection is the creation of the class TA_and_RA with the predicate: TA intersection RA. The classes TA and RA become the superclasses of the class TA_and_RA. The attributes and methods of TA and RA are inherited by TA_and_RA. Further, the instances of both TA and RA are moved from TA and RA into the created class. Second, an example of using a predicate with union is the creation of the class Graduate_Assistant with the predicate: TA union RA. The classes TA and RA become the subclasses of the class Graduate_Assistant, and the common superclass(es) of TA and RA become the superclass(es) of Graduate_Assistant. The attributes and methods of the superclass(es) are inherited by Graduate_Assistant.
Also, the properties which are common to TA and RA, but are not the inherited properties of Graduate_Assistant, are moved from TA and RA into Graduate_Assistant. However, the migration of the instances of TA and RA is not needed. Third, an example of using a predicate with not is the creation of the class TA_Only with the predicate: TA not RA. The class TA becomes the superclass of the class TA_Only. The attributes and methods of TA are inherited by TA_Only. Further, the instances of TA which are not the instances of RA are moved into TA_Only.

• The delete class operation deletes a class. This operation first deletes all is-a relationships from the class to its subclasses by using the delete isa operation, and then removes all is-a relationships from its superclasses to itself. It then moves the instances of the class to its superclasses, and finally removes the class itself. Further, if there exist attributes and/or methods referring to the class, the user is asked to decide whether this operation is aborted, or those attributes and/or methods are also deleted.

• The rename class operation changes the name of a class, and has no effect on instances. If there exist attributes and/or methods referring to the class, the user is asked to decide whether this operation is aborted, or those attributes and/or methods are changed.

A.3 Attribute and Method Operations

This section contains the operations for handling attributes and methods.
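
As an aside on the class-derivation predicates of Section A.2, the instance migration they describe can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Omega/OSQL implementation: the dictionary `extent` (class name to the set of instances held directly in that class), the dictionary `values`, and all function names are hypothetical, and schema effects such as inheritance and is-a links are left out.

```python
# Hypothetical in-memory model (an assumption, not the Omega/OSQL prototype):
# `extent` maps a class name to the set of instances held directly in it.

def derive_by_predicate(extent, values, new, superclass, predicate):
    """Create `new` under `superclass`; instances satisfying `predicate` move into it."""
    moved = {i for i in extent[superclass] if predicate(values[i])}
    extent[superclass] -= moved
    extent[new] = moved

def derive_intersection(extent, new, a, b):
    """`new` is a subclass of both `a` and `b`; instances of both move into it."""
    shared = extent[a] & extent[b]
    extent[a] -= shared
    extent[b] -= shared
    extent[new] = shared

def derive_difference(extent, new, a, b):
    """`new` is a subclass of `a`; instances of `a` that are not in `b` move into it."""
    only_a = extent[a] - extent[b]
    extent[a] -= only_a
    extent[new] = only_a

def derive_union(extent, new, a, b):
    """`new` is a superclass of `a` and `b`; no instance migration is needed."""
    extent[new] = set()

# Honor_student as a subclass of Student with the predicate gpa > 3.5
values = {"s1": {"gpa": 3.9}, "s2": {"gpa": 3.1}}
students = {"Student": {"s1", "s2"}}
derive_by_predicate(students, values, "Honor_student", "Student",
                    lambda v: v["gpa"] > 3.5)
```

In this sketch a moved instance is still an indirect member of the superclass, because the new class is a subclass of it; only the direct extents change.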
• add attribute to <class-name> (<attribute-def-list>);
• add method to <class-name> (<method-def-list>);
• delete property <r-property-name-list> from <class-name>;
• rename property <property-name> of <class-name> as <property-name>;

where

  <attribute-def> := <attribute-name> : <class-name>
  <attribute-def-list> := <attribute-def>{, <attribute-def-list>}
  <method-def> := <method-name> ({<class-name-list>}) return <class-name> body <method-body-file>
  <method-def-list> := <method-def>{, <method-def-list>}
  <r-property-name> := <property-name>{*}
  <r-property-name-list> := <r-property-name>{, <r-property-name>}

• The add attribute to operation adds an attribute to a class. The added attribute is inherited by the subclasses of the class.

• The add method to operation adds a method to a class. The name, input argument(s), output argument, and body of the method are specified in this operation. However, the actual body of the method is not given inline; the name of the file containing it is specified instead. Thus, the body of the method must have been created in the file before this operation is invoked. The added method is inherited by the subclasses of the class.

• The delete property operation deletes an attribute or a method from a class. The property (i.e., the attribute or method) must be defined in the class. This operation also deletes the property from all subclasses of the class. If this property is an attribute, the instances of the class may lose their values for the attribute. If it is a method, the operation has no effect on instances of classes. Further, if there exist methods referring to this property, the user is asked to decide whether this operation is aborted or those methods are also deleted.

• The rename property operation changes the name of a property defined in a class, and has no effect on instances.
However, if there exist methods referring to this property, the user is asked to decide whether this operation is aborted or those methods are changed.

A.4 Is-a Operations

This section contains the operations for handling is-a relationships.

• add isa <class-name> to <class-name>;
• delete isa <class-name> to <class-name>;

• The add isa operation creates an is-a relationship from a class S to another class C. Thus, the class S becomes a superclass of the class C. The created is-a relationship must not be a redundant one, and must not introduce a cycle in the class lattice. After this creation, the is-a relationships among the imported classes of a virtual database must be still derivable from the is-a relationships in the base(s) of the virtual database. The attributes and methods of S are inherited by C and its subclasses.

• The delete isa operation deletes an is-a relationship from a class S to another class C. This deletion must not leave the class lattice disconnected. Thus, if S is the only superclass of C, the immediate superclasses of S become the immediate superclasses of C. After this deletion, the is-a relationships among the imported classes of a virtual database must be still derivable from the is-a relationships in the base(s) of the virtual database. The attributes and methods which were defined in the class S are deleted from the class C.

A.5 Instance Operations

This section contains the operations for handling instances. Details of these operations can be found in Section 5.4.
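
Before the instance operations are listed, the lattice constraints stated for add isa and delete isa in Section A.4 can be sketched as follows. This is a minimal Python illustration under assumptions of our own: the `Lattice` class and its method names are hypothetical, "redundant" is interpreted as "S is already a direct or indirect superclass of C", and the requirement that imported is-a relationships stay derivable from the bases is not modeled.

```python
class Lattice:
    """Hypothetical class lattice: each class maps to its immediate superclasses."""

    def __init__(self):
        self.supers = {"Object": set()}

    def add_class(self, name, supers=("Object",)):
        self.supers[name] = set(supers)

    def ancestors(self, name):
        """Return all direct and indirect superclasses of `name`."""
        seen, stack = set(), list(self.supers[name])
        while stack:
            s = stack.pop()
            if s not in seen:
                seen.add(s)
                stack.extend(self.supers[s])
        return seen

    def add_isa(self, s, c):
        """Make `s` a superclass of `c`, rejecting cycles and redundant links."""
        if s == c or c in self.ancestors(s):
            raise ValueError("is-a link would introduce a cycle")
        if s in self.ancestors(c):
            raise ValueError("is-a link is redundant")
        self.supers[c].add(s)

    def delete_isa(self, s, c):
        """Remove the link from `s` to `c`, keeping `c` connected to the lattice."""
        if s not in self.supers[c]:
            raise ValueError("no such is-a link")
        self.supers[c].remove(s)
        if not self.supers[c]:
            # s was the only superclass of c: c takes over s's immediate superclasses
            self.supers[c] = set(self.supers[s])
```

For example, with Person under Object, Student under Person, and TA under Student, deleting the link from Student to TA leaves TA directly under Person, which matches the rule that the immediate superclasses of S become those of C.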
• create instance <class-name>;
• delete instance <instance-oid>;
• move instance <instance-oid> to <class-name>;
• change attribute value <instance-oid>.<attribute-name> into <value-oid>;
• retrieve attribute value <instance-oid>.<attribute-name>;
• execute method <instance-oid>.<method-name> ({<input-oid-list>});

where

  <instance-oid> := OID of an instance
  <value-oid> := OID of an attribute value
  <input-oid-list> := <input-oid>{, <input-oid-list>}

• The create instance operation creates an instance of a class.
• The delete instance operation deletes an instance.
• The move instance operation moves an instance to a class.
• The change attribute value operation changes the value of an attribute of an instance into a new one.
• The retrieve attribute value operation returns the value of an attribute of an instance.
• The execute method operation executes a method on an instance.

Reference List

[1] S. Abiteboul and A. Bonner. Objects and views. In Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM SIGMOD, May 1991.
[2] H. Afsarmanesh and D. McLeod. The 3DIS: An extensible, object-oriented information management environment. ACM Transactions on Information Systems, 7(3):339-377, October 1989.
[3] A. V. Aho, J. E. Hopcroft, and J. D. Ullman. Data Structures and Algorithms. Addison-Wesley, 1983.
[4] A. M. Alashqur, S. Y. W. Su, and H. Lam. OQL: A query language for manipulating object-oriented databases. In Proceedings of the International Conference on Very Large Databases. VLDB Endowment, August 1989.
[5] M. Atkinson, F. Bancilhon, D. DeWitt, K. Dittrich, D. Maier, and S. Zdonik. The object-oriented database system manifesto. In Proceedings of the 1st International Conference on Deductive and Object-Oriented Databases, December 1989.
[6] F. Bancilhon and N. Spyratos.
Update semantics of relational views. ACM Transactions on Database Systems, 6(4):557-575, December 1981.
[7] J. Banerjee, H. Chou, J. Garza, W. Kim, D. Woelk, N. Ballou, and H. Kim. Data model issues for object-oriented applications. ACM Transactions on Office Information Systems, 5(1):3-26, January 1987.
[8] J. Banerjee, W. Kim, H.-J. Kim, and H. F. Korth. Semantics and implementation of schema evolution in object-oriented databases. In Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM SIGMOD, May 1987.
[9] D. Beech and B. Mahbod. Generalized version control in an object-oriented database. In Proceedings of the International Conference on Data Engineering. IEEE, January 1988.
[10] E. Bertino. A view mechanism for object-oriented databases. In Proceedings of the International Conference on Extending Database Technology, 1992.
[11] I. A. Chen and D. McLeod. Derived data update in semantic databases. In Proceedings of the International Conference on Very Large Databases. VLDB Endowment, August 1989.
[12] H. T. Chou, D. J. DeWitt, R. Katz, and T. Klug. Design and implementation of the Wisconsin Storage System. Software Practices and Experience, 15(10), 1985.
[13] H. T. Chou and W. Kim. A unifying framework for version control in a CAD environment. In Proceedings of the International Conference on Very Large Databases. VLDB Endowment, August 1986.
[14] U. Dayal and P. A. Bernstein. On the correct translation of update operations on relational views. ACM Transactions on Database Systems, 8(3):381-416, September 1982.
[15] U. Dayal and H. Hwang. View definition and generalization for database integration in a multidatabase system. IEEE Transactions on Software Engineering, SE-10(6):628-645, November 1984.
[16] R. Elmasri and S. Navathe. Fundamentals of Database Systems. Benjamin/Cummings, 1989.
[17] D. Fang and D. McLeod.
A testbed and mechanism for object-based sharing in federated database systems. Technical Report USC-CS-92-507, Computer Science Department, University of Southern California, Los Angeles, CA 90089-0781, February 1992.
[18] D. Fishman, D. Beech, H. Cate, E. Chow, T. Connors, T. Davis, N. Derrett, C. Hoch, W. Kent, P. Lyngbaek, B. Mahbod, M. Neimat, T. Ryan, and M. Shan. Iris: An object-oriented database management system. ACM Transactions on Office Information Systems, 5(1):48-69, January 1987.
[19] S. Ghandeharizadeh, V. Choi, C. Ker, and K. Lin. The design and implementation of the Omega object-based system. In Proceedings of the Australian Database Conference, 1993.
[20] G. Gottlob, P. Paolini, and R. Zicari. Properties and update semantics of consistent views. ACM Transactions on Database Systems, 13(4):486-524, December 1988.
[21] M. Hammer and D. McLeod. Database description with SDM: A semantic database model. ACM Transactions on Database Systems, 6(3):351-386, September 1981.
[22] S. Heiler and S. Zdonik. Object views: Extending the vision. In Proceedings of the International Conference on Data Engineering. IEEE, 1990.
[23] D. Heimbigner and D. McLeod. A federated architecture for information systems. ACM Transactions on Office Information Systems, 3(3):253-278, July 1985.
[24] E. Horowitz and S. Sahni. Fundamentals of Data Structures. Computer Science Press, 1976.
[25] R. H. Katz. Toward a unified framework for version modeling in engineering databases. ACM Computing Surveys, 22(4):375-408, December 1990.
[26] R. H. Katz and T. Lehman. Database support for versions and alternatives of large design files. IEEE Transactions on Software Engineering, SE-10(2):191-200, March 1984.
[27] A. M. Keller. Choosing a view update translator by dialog at view definition time. In Proceedings of the International Conference on Very Large Databases. VLDB Endowment, August 1986.
[28] W. Kim.
A model of queries for object-oriented databases. In Proceedings of the International Conference on Very Large Databases. VLDB Endowment, August 1989.
[29] W. Kim. Research directions in object-oriented databases. Technical Report ACT-OODS-013-90, MCC, January 1990.
[30] W. Kim and H. Chou. Versions of schema for object-oriented databases. In Proceedings of the International Conference on Very Large Databases. VLDB Endowment, September 1988.
[31] H. Korth and A. Silberschatz. Database System Concepts. McGraw-Hill, 1986.
[32] C. Lecluse, P. Richard, and F. Velez. O2, an object-oriented data model. In Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM SIGMOD, June 1988.
[33] Q. Li, K. J. Byeon, and D. McLeod. An experimental system for conceptual evolution in object databases. In B. Srinivasan and J. Zeleznikow, editors, Proceedings of the Australian Database Research Conference, February 1990.
[34] W. Litwin and A. Abdellatif. Multidatabase interoperability. IEEE Computer, December 1986.
[35] D. Maier. The Theory of Relational Databases. Computer Science Press, 1983.
[36] D. Maier, J. Stein, A. Otis, and A. Purdy. Development of an object-oriented DBMS. In Proceedings of the Conference on Object-Oriented Programming Systems, Languages, and Applications. ACM, September 1986.
[37] A. Motro. Superviews: Virtual integration of multiple databases. IEEE Transactions on Software Engineering, SE-13(7), 1987.
[38] G. T. Nguyen and D. Rieu. Schema evolution in object-oriented database systems. Data and Knowledge Engineering, 4(1):43-67, 1989.
[39] D. J. Penney and J. Stein. Class modification in the GemStone object-oriented DBMS. In Proceedings of the Conference on Object-Oriented Programming Systems, Languages, and Applications, 1987.
[40] J. F. Roddick. Schema evolution in database systems - An annotated bibliography. ACM SIGMOD Record, 21(4), December 1992.
[41] E. A. Rundensteiner.
MultiView: A methodology for supporting multiple views in object-oriented databases. In Proceedings of the International Conference on Very Large Databases. VLDB Endowment, 1992.
[42] E. A. Rundensteiner and L. Bic. Automatic view schemata generation in object-oriented databases. Technical Report 92-15, University of California, Irvine, January 1992.
[43] M. H. Scholl, C. Laasch, and M. Tresch. Updatable views in object-oriented databases. In Proceedings of the 2nd International Conference on Deductive and Object-Oriented Databases, December 1991.
[44] E. Sciore. Multidimensional versioning for object-oriented databases. In Proceedings of the 2nd International Conference on Deductive and Object-Oriented Databases, December 1991.
[45] A. Sheth, J. Larson, and E. Walkins. Tailor: A tool for updating views. In Proceedings of the International Conference on Extending Database Technology, 1988.
[46] D. Shipman. The functional data model and the data language DAPLEX. ACM Transactions on Database Systems, 2(3):140-173, March 1981.
[47] A. H. Skarra and S. B. Zdonik. The management of changing types in an object-oriented database. In Proceedings of the Conference on Object-Oriented Programming Systems, Languages, and Applications, 1986.
[48] K. Tanaka, M. Yoshikawa, and K. Ishihara. Schema virtualization in object-oriented databases. In Proceedings of the International Conference on Data Engineering. IEEE, January 1988.
[49] M. Tresch and M. H. Scholl. Schema transformation without database reorganization. ACM SIGMOD Record, 22(1), March 1993.
[50] K. Wilkinson, P. Lyngbaek, and W. Hasan. The Iris architecture and implementation. IEEE Transactions on Knowledge and Data Engineering, 2(1):63-75, March 1990.
Asset Metadata
Core Title
00001.tif
Tag
OAI-PMH Harvest
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-oUC11257611
Unique identifier
UC11257611
Legacy Identifier
DP22860