ADVANCED TECHNIQUES FOR APPROXIMATING VARIABLE ALIASING IN LOGIC PROGRAMS

by

Anno Langen

A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the Requirements for the Degree
DOCTOR OF PHILOSOPHY
(Computer Science)

February 1991

Copyright 1991 Anno Langen

UMI Number: DP22823. All rights reserved.

INFORMATION TO ALL USERS: The quality of this reproduction is dependent upon the quality of the copy submitted. In the unlikely event that the author did not send a complete manuscript and there are missing pages, these will be noted. Also, if material had to be removed, a note will indicate the deletion.

Dissertation Publishing. UMI DP22823. Published by ProQuest LLC (2014). Copyright in the Dissertation held by the Author. Microform Edition © ProQuest LLC. All rights reserved. This work is protected against unauthorized copying under Title 17, United States Code. ProQuest LLC, 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106-1346.

UNIVERSITY OF SOUTHERN CALIFORNIA
THE GRADUATE SCHOOL
UNIVERSITY PARK
LOS ANGELES, CALIFORNIA 90089-4015

This dissertation, written by Anno Langen under the direction of ...... Dissertation Committee, and approved by all its members, has been presented to and accepted by The Graduate School, in partial fulfillment of requirements for the degree of DOCTOR OF PHILOSOPHY.

Dean of Graduate Studies
Date:
DISSERTATION COMMITTEE
Chairperson

Acknowledgements

Many thanks to my advisor and friend Dean Jacobs. Dean provided valuable advice, worked closely with me on numerous issues, and made research in graduate school a pleasure for me. I am grateful to the committee: Alvin Despain helped sustain my interest in Logic Programming. His wealth of experience in the field was invaluable. Professor Dubois has always been very helpful. Several colleagues influenced my work and interests in Logic Programming.
Thanks to Doug DeGroot, who introduced me to Restricted And-Parallelism and its compilation problem. Will Winsborough, who introduced me to multiple specialization of Logic Programs, encouraged me most kindly. His research attitudes and patience have made working with him a pleasure. Many thanks to Kim Marriott, who clarified many issues in denotational semantics and abstract interpretation, and encouraged attention to mathematical rigor. Manuel Hermenegildo and Muthu (K. Muthukumar) encouraged me by putting some of my work to use in their Restricted And-Parallelism implementation project. I thank all colleagues that influenced my interests through discussions with them: Rick Hull, Don Cohen, Peter Van Roy, Tom Getzinger.

Contents

Acknowledgements
Abstract
1 Introduction
  1.1 Abstract Interpretation
  1.2 Abstract Domains for Aliasing
  1.3 Independent And-Parallelism
  1.4 The Organization of this Dissertation
2 Preliminaries
  2.1 Sets
  2.2 Lattices
  2.3 Functions
  2.4 Logic Program Syntax
3 Abstract Interpretation
  3.1 Concrete Semantics
    3.1.1 Substitutions and the Domain ESubst
    3.1.2 The Query-Dependent Formulation
    3.1.3 The Query-Independent Formulation
  3.2 Abstract Semantics
    3.2.1 Local Soundness Requirements
    3.2.2 The Query-Dependent Formulation
    3.2.3 The Query-Independent Formulation
    3.2.4 Global Soundness
    3.2.5 Comparison of the Two Formulations
  3.3 Power Abstract Domains
  3.4 Mode Inference
  3.5 Mode-Oriented Semantics
  3.6 Summary
4 Abstract Domains for Aliasing
  4.1 The Domain Sharing
    4.1.1 Local Soundness of aunify
    4.1.2 Algebraic Properties of aunify
  4.2 The Domain ESharing
    4.2.1 Local Soundness of aunify
    4.2.2 Algebraic Properties of aunify
    4.2.3 Implementation Issues
  4.3 The Worst Case Operation
  4.4 The Domain Prop
    4.4.1 Local Soundness of aunify
    4.4.2 Algebraic Properties of aunify
  4.5 Summary
5 Abstract Execution
  5.1 Abstract Execution Using Condensing
    5.1.1 Comparison with Extension Tables
  5.2 Mode Inference Using Condensing
    5.2.1 Mode Inference Using Extension Tables
  5.3 Mode-Oriented Abstract Execution
  5.4 Fast Condensing
  5.5 Summary
6 Independent And-Parallelism
  6.1 Introduction
  6.2 EGE Compilation
    6.2.1 Execution Graph Expressions
    6.2.2 The Compilation Problem
  6.3 Maximal And-Parallelism
  6.4 Conditional Dependency Graphs
  6.5 A Compilation Framework
    6.5.1 Compilation with ESharing
  6.6 The Zerocost Heuristic
  6.7 Summary
7 Conclusions

Abstract

During the execution of a Logic Program, two program variables are aliased if they are bound to terms that share a common variable. Information about variable aliasing is essential for certain optimizations, notably for exploiting Independent And-Parallelism. It can also be used to improve the accuracy of the analysis of other run-time properties of the variable bindings, such as groundness, freeness, linearity, and term structure, that are essential to other optimizations.
The general problem of deciding whether variables can possibly be aliased at some point in a program is undecidable. Thus, any algorithm for deriving information about variable aliasing must necessarily produce an approximation. This dissertation presents several results that advance the accuracy and efficiency of the approximation of variable aliasing in Logic Programs. We use the framework of abstract interpretation to provide a formal basis for the approximation of run-time properties. We present a novel domain for approximating variable aliasing, compare it with previous and alternative proposals, and demonstrate its feasibility. We present a new evaluation technique, called condensing, within the framework of abstract interpretation, and discuss its advantages and limitations. We present a version of the semantics of Logic Programs that uses only a single operation to model the parameter unification on entry and exit. Some previous semantic definitions of Logic Programs have been obscured by the need to standardize apart the variable names of the call and the procedure head. We have encapsulated the proper treatment of variable names into a single operation outside of the semantic definitions, which became more lucid as a result. Finally, we apply our analysis to a particular approach to Independent And-Parallelism. Specifically, we show how variable aliasing analysis improves the generation of control expressions that use simple run-time tests to schedule goals for parallel execution.

Chapter 1
Introduction

During the execution of a Logic Program, two program variables are aliased if they are bound to terms that share a common variable. Information about variable aliasing is essential for certain optimizations, notably for exploiting Independent And-Parallelism.
It can also be used to improve the accuracy of the analysis of other run-time properties of the variable bindings, such as groundness, freeness, linearity, and term structure, that are essential to other optimizations. The general problem of deciding whether variables can possibly be aliased at some point in a program is undecidable. Thus, any algorithm for deriving information about variable aliasing must necessarily produce an approximation. This dissertation attacks the problem of deriving variable aliasing at several levels.

• Abstract interpretation provides a general framework for deriving approximations of the run-time properties of programs. We develop a general-purpose scheme for the abstract interpretation of Logic Programs and study how different evaluation techniques affect accuracy and efficiency.

• Within the framework of abstract interpretation, the abstract domain determines which run-time properties are derived. We present a novel domain for approximating variable aliasing, compare it with previous and alternative proposals, and demonstrate its feasibility.

• Compilation for Independent And-Parallelism requires information about variable aliasing. We show how variable aliasing analysis improves the generation of control expressions that use simple run-time tests to schedule goals for parallel execution.

In this chapter, we discuss each of these levels in more detail and give an overview of our results.

1.1 Abstract Interpretation

Abstract interpretation provides an elegant framework for deriving dataflow information about computer programs [1, 10]. In this approach, the given language is assigned both a concrete semantics and an abstract semantics. The domain of computation states in the concrete semantics is replaced by a domain of descriptions of states in the abstract semantics. Each basic operation on the concrete domain is replaced by a corresponding operation on the abstract domain.
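As a toy illustration of this replacement of concrete operations by abstract counterparts (an invented example, not from the dissertation), consider describing integers by their sign and mirroring multiplication by a table over the sign descriptions; the names `NEG`, `ZERO`, `POS`, `TOP` are hypothetical:

```python
# Toy sign abstraction: each integer is described by its sign.
NEG, ZERO, POS, TOP = "neg", "zero", "pos", "top"

def alpha(n):
    """Abstraction function: map a concrete integer to its sign description."""
    if n < 0:
        return NEG
    if n == 0:
        return ZERO
    return POS

def abs_mul(a, b):
    """Abstract counterpart of concrete multiplication on sign descriptions."""
    if ZERO in (a, b):
        return ZERO
    if TOP in (a, b):
        return TOP
    return POS if a == b else NEG

# Soundness on samples: abstracting then multiplying abstractly
# agrees with multiplying concretely then abstracting.
for x in (-3, 0, 4):
    for y in (-2, 0, 5):
        assert abs_mul(alpha(x), alpha(y)) == alpha(x * y)
```

The same pattern, with far richer domains, underlies the aliasing analysis developed in the following chapters.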
Execution of a program according to the abstract semantics produces an approximation to the data-flow information given by the concrete semantics. Naively executing a program according to the abstract semantics is inefficient. The commonly accepted alternative to naive abstract execution is tabulation of the results for specific calls with Extension Tables [26]. Extension Tables implement a form of "memoizing" where the results of previous abstract calls are stored for reuse. Computing a new entry for such a table requires iterating to a fixpoint. Our framework suggests a second alternative to naive abstract execution. We pre-compute a fixed approximation to the meaning of clauses, a process called condensing, which allows specific calls to be abstractly executed with a single unification operation. The condensing process, however, may require iterating to a fixpoint. We show that condensing does not result in a loss of accuracy if the abstract unification operation satisfies certain algebraic properties, and argue that these properties are important even if a more conventional form of abstract execution is employed. Generally speaking, the cost of abstract execution using condensing is equal to the cost of abstract execution using extension tables if procedures are used only in their most general modes. Extension tables are more efficient if procedures are used in a small number of specific modes, while condensing is more efficient if procedures are used in a large number of more general modes. As analysis of more and more features is demanded, the number of distinguishable modes increases. In this sense, we expect abstract execution using condensing to scale up better. The proof of soundness of our analysis is divided into two parts, one global and one local. In the global part, the soundness of the general-purpose framework is proven to be a consequence of certain properties of the underlying abstract domain and operations.
In the local part, these properties are shown to hold for our particular domain for aliasing. Our framework is based on "query-dependent" and "query-independent" formulations of both the concrete and abstract semantics. The query-dependent concrete semantics (QDCS) has the conventional form and is the standard against which soundness is measured. The query-independent abstract semantics (QIAS) provides the basis for condensing. A fundamental result of this thesis is the global soundness of the QIAS with respect to the QDCS. This is established by proving the equivalence of the QDCS and the query-independent concrete semantics (QICS), and then proving the global soundness of the QIAS with respect to the QICS. The conditions under which condensing does not result in a loss of accuracy are established by developing a conventional query-dependent abstract semantics (QDAS) and comparing it with the QIAS. Of course, the QDAS is shown to be globally sound with respect to the QDCS. Most frameworks for the abstract interpretation of logic programs, e.g., [3, 18], use a concrete semantics where the particular choice of variables appearing in the range of substitutions is relevant, even though this choice is irrelevant to the user. This approach is problematic in our context, where the QDCS and the QICS must be shown to be equivalent to justify the soundness of condensing. We solve this problem by defining the semantics over appropriate equivalence classes of substitutions. Our framework is also unique in that it is defined in terms of a single operation, called 'unify', that packages together several low-level operations, including variable renaming, unification, composition, and restriction. Generally speaking, this approach improves the accuracy of approximations and simplifies local soundness proofs. Commonly, Logic Program clauses with a common head functor are compiled into procedures. At run-time, a goal is solved by calling such a procedure.
With abstract interpretation, we can approximate all the possible arguments to procedure calls, based on some declaration of possible top-level calls. This application of abstract interpretation is known as mode inference. Mode inference is essential for optimization of Logic Programs at the procedure level. The basis for mode inference is a specification of possible arguments to procedure calls. An approximation is, again, obtained by replacing certain concrete domains and operations by corresponding abstract domains and operations. We scrutinize a little-discussed problem with abstract interpretation: details of the formulation of the concrete semantics may affect the abstract semantics. In this case, there are several equivalent specifications for the possible arguments to procedure calls, which may result in different approximations. We show under which circumstances one formulation is better than another.

1.2 Abstract Domains for Aliasing

In the framework of abstract interpretation, run-time properties of a Logic Program are approximated by executing over an "abstract domain". The abstract domain describes sets of substitutions in terms of the run-time properties of interest. Abstract domains for variable aliasing describe sets of substitutions by constraining how variables may be shared. Several abstract domains for variable aliasing have been proposed elsewhere [6, 8, 11, 18, 30]. These domains describe the variable sharing of substitutions that are possible at run-time at given points in some program clause. Chang [6] proposes to describe variable sharing by partitioning the set of clause variables into classes, called coupling classes. Such a partitioning describes all substitutions that share no variables across coupling classes. Jones and Sondergaard [18] propose to describe variable sharing by listing pairs of clause variables that may share variables.
Debray [11] proposes to describe variable sharing by listing, for each clause variable, those other clause variables with which it may share variables. Our abstract domain for aliasing, called Sharing, is based on the notion of sharing groups. For a given substitution, a sharing group is a set of clause variables that share some variable. An element of the Sharing domain describes a set of substitutions by listing possible sharing groups. We will compare our domain with previous proposals and show that it captures variable aliasing information with a higher degree of accuracy. The accuracy of the Sharing domain can be improved by additionally approximating "linear variables". We propose an extension of the Sharing domain, called ESharing, which approximates variable aliasing, "linear variables", and "free variables". Listing all possible sharing groups explicitly may not be feasible. In particular, when we must assume the worst for variable aliasing among some set of n clause variables, we would have to list 2^n possible sharing groups. The number of sharing groups of typical substitutions is much smaller. The main problem with approximating variable aliasing by explicitly listing sharing groups is that the effort of working with the approximations increases as their quality decreases. To address this problem we present a special representation for sets of sharing groups. In this representation we can decrease the size of an approximation by forfeiting some accuracy. For each of the abstract domains we propose, we prove local soundness and discuss the algebraic properties of their abstract unification operation. Abstract unification over Sharing enjoys a commutativity property, so that the same approximation results from all permutations of a sequence of unifications. For the domains for variable aliasing proposed elsewhere, the best approximations are obtained when "grounding" precedes "aliasing" in a sequence of unifications.
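To make the sharing-group notion concrete, here is a minimal sketch (not the dissertation's algorithm) that computes the sharing groups of a substitution, representing each clause variable's binding by the set of run-time variables occurring in its bound term:

```python
def sharing_groups(binding_vars):
    """binding_vars maps each clause variable to the set of variables
    occurring in the term it is bound to. Each run-time variable v
    induces the sharing group of clause variables whose terms contain v."""
    occurs = {}
    for clause_var, term_vars in binding_vars.items():
        for v in term_vars:
            occurs.setdefault(v, set()).add(clause_var)
    return {frozenset(group) for group in occurs.values()}

# Under the substitution {X -> f(U), Y -> g(U, V), Z -> V}:
# U is shared by X and Y, V is shared by Y and Z.
groups = sharing_groups({"X": {"U"}, "Y": {"U", "V"}, "Z": {"V"}})
assert groups == {frozenset({"X", "Y"}), frozenset({"Y", "Z"})}
```

A ground binding contributes no sharing group at all, which is how groundness information falls out of the same representation.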
Unfortunately, abstract unification over ESharing is also sensitive to the order of unifications. Moreover, the intended use of our specialized representation of sets of sharing groups is incompatible with commutativity. Clearly, if we are prepared to dynamically forfeit accuracy whenever the approximation size exceeds some bound, then the size of intermediate approximations for a particular sequence of unifications will influence the result. Let us say that an abstract domain is well behaved if its abstract unification operation enjoys all the algebraic properties needed to guarantee equivalence of the query-dependent and query-independent abstract semantics. None of the abstract domains for variable aliasing is well behaved in that sense. This is evident when we compare the approximations resulting from abstract interpretation based on Extension Tables and based on condensing. Because of the unfortunate lack of some algebraic property, we can exhibit examples for both techniques where using one results in more accuracy than using the other. Several simple abstract domains, however, are well behaved. We also show that a powerful abstract domain for approximating "groundness", which Marriott, Sondergaard, and Jones propose in [20], is well behaved.

1.3 Independent And-Parallelism

A variety of different techniques for executing logic programs in parallel have been proposed. In Or-Parallelism, alternative solutions to a goal are explored in parallel. In And-Parallelism, subgoals of a goal are solved in parallel. In a pure Logic Program, the order in which subgoals are solved is irrelevant; an implementation is free to execute all subgoals in parallel and "join" the solutions in a database style. But this can lead to tremendous inefficiency when only a single solution, or a single solution at a time, is required. In this case, when two goals share a variable their independent solution might produce different values for the variable. This is called a binding conflict.
The problem of binding conflicts has been addressed. Wise [28, 29] proposes to reconcile the bindings after execution of the goals completes. The alternative is to prevent the parallel execution of two subgoals when they share a variable. Independent And-Parallelism avoids binding conflicts by requiring that goals execute in parallel only when they share no variables. Deciding whether two goals share a variable is difficult because it is partly determined by the current substitution at run-time. A textually shared variable may be bound to a ground term, or textually different variables may be aliased. Conery has proposed implementing Independent And-Parallelism for Prolog by monitoring shared variables at runtime [9]. Compilation for Independent And-Parallelism requires information about variable aliasing at compile-time. Chang and Despain show how to analyze programs in order to identify subgoals that cannot share variables [7]. A compiler can use this analysis to exploit parallelism without run-time overhead. Some opportunities for parallelism are missed by pure compile-time analysis. For example, in

child(C, M, F) :- mother(C, M), father(C, F).

the calls in the body should not execute in parallel if we are generating children for particular parents, but may execute in parallel if we are generating parents of children. A hybrid approach, called Restricted And-Parallelism (RAP), is proposed by DeGroot [14, 15]. This execution model for logic programming combines compile-time analysis with run-time monitoring. Here, programs are compiled into special control expressions, called Execution Graph Expressions (EGEs), which initiate goals on the basis of tests on program variables. The RAP model is easier to implement than a general graph-based scheme such as Conery's because EGEs have a linear form; goals are collected into groups for scheduling. This permits simple "fork/join"-style synchronization mechanisms.
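The run-time independence condition can be sketched as follows (an illustrative fragment, not DeGroot's actual EGE tests; the term representation and function names are invented):

```python
def vars_of(term):
    """Collect variables in a nested term. Here a term is either a string
    (uppercase = variable, lowercase = constant) or a tuple (functor, args...)."""
    if isinstance(term, str):
        return {term} if term[:1].isupper() else set()
    functor, *args = term
    out = set()
    for a in args:
        out |= vars_of(a)
    return out

def independent(goal1_args, goal2_args):
    """Two goals may run in parallel only if their arguments, under the
    current bindings, share no variable."""
    v1 = set().union(*(vars_of(t) for t in goal1_args))
    v2 = set().union(*(vars_of(t) for t in goal2_args))
    return not (v1 & v2)

# For child(C, M, F): once C is bound to a ground term c, the goals
# mother(c, M) and father(c, F) share no variable and may run in parallel.
assert independent([("c",), "M"], [("c",), "F"])
# With C still free, both goals contain C and must not run in parallel.
assert not independent(["C", "M"], ["C", "F"])
```

This is exactly the kind of test that aliasing analysis can discharge at compile time when it proves that two goals can never share a variable.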
In general, this linear form restricts the amount of parallelism that can be achieved because initiation of a goal cannot be made dependent upon the completion of an arbitrary collection of other goals. DeGroot [14] argues that RAP can still achieve more than enough parallelism to keep a moderately-sized multiprocessor computer busy. Recent empirical evidence [5] lends support to these arguments. Generating EGEs is difficult for two reasons. First, determining whether an EGE is correct for a clause may entail considerable reasoning about tests and goals. Second, choosing from among various correct alternatives may require a rather subjective assessment of their benefits. This is because different EGEs may achieve more parallelism in different circumstances. Moreover, an EGE that achieves a great deal of parallelism may be unacceptably large. We introduce a novel approach to generating correct EGEs and choosing between them. In our approach, a clause is first converted into a graph-based computational form, called a Conditional Dependency Graph (CDG), which achieves Maximal And-Parallelism (MAP). MAP is the maximal and-parallelism possible while maintaining correctness. This CDG is then gradually transformed into an EGE, potentially at a loss of parallelism, using two rewrite rules operating on hybrid expressions. Since these rules are sound, in the sense that they always produce correct results from correct sources, correct EGEs are always produced. Compilation algorithms are defined within this framework by giving heuristics for choosing where and how rules should be applied.

1.4 The Organization of this Dissertation

This dissertation is structured as follows. Chapter 2 reviews some standard mathematical concepts and notation. They are grouped together for easy reference. Chapter 3 presents our abstract interpretation framework.
It introduces an appropriate equivalence class of substitutions and the operation unify, with the algebraic properties commutativity, idempotence, and additivity. We give both query-dependent and query-independent formulations of the concrete semantics and show their equivalence. Then we present both formulations of the abstract semantics, prove their soundness based on local soundness requirements, which an individual abstract domain must meet, and relate the precision of the two abstract semantic formulations based on algebraic properties of the abstract unification operation 'aunify'. We present a concrete semantics for mode inference. We show that there are different, equivalent formulations of procedure call modes that may lead to different approximations. We compare the different approximations and show under which circumstances they are equivalent. In Chapter 4 we present several abstract domains. We present our domain Sharing for analysis of aliasing and discuss the algebraic properties of its abstract unification operation. Then we present an extension that approximates more features of substitutions and approximates aliasing with greater accuracy, but whose abstract unification operation does not enjoy any of the algebraic properties of interest. Finally, we give an example of a powerful abstract domain whose abstract unification operation enjoys all the algebraic properties needed to guarantee equivalence of precision of both formulations of the abstract semantics. Chapter 5 illustrates our new technique for abstract execution. We contrast abstract execution using extension tables with abstract execution using condensing. We then compare the results of using both techniques to generate mode information. We develop examples showing how abstract interpretation with the extended Sharing domain, which lacks a commutative aunify, may gain or lose accuracy when condensing, rather than extension tables, is used.
Chapter 6 develops our technique for generation of execution control expressions for Restricted And-Parallelism. Chapter 7 draws conclusions from the presented results and points towards further research.

Chapter 2
Preliminaries

This chapter collects some basic, standard definitions and concepts and specifies their notation. These can be found in standard texts on lattices, e.g., [2], on denotational semantics, e.g., [23], and on Logic Programming, e.g., [19]. We hope that listing these standard concepts together allows the reader to find the definitions more easily and to read through the following chapters more smoothly. Non-standard definitions can be found before their first use.

2.1 Sets

We assume familiarity with the concept of sets. We write ∅ for the empty set, ⊆ for the subset relation, and ⊂ for the proper subset relation. We write × for cartesian product, ∩ for intersection, ∪ for union, and \ for subtraction of sets. The precedence of the operations decreases in this list. Individually, the operations associate to the left. This means, for example, that the expression A \ B × C \ D is implicitly parenthesized (A \ (B × C)) \ D.

2.2 Lattices

A set R ⊆ S × S is called a binary relation on S. For a binary relation R we write aRb for (a, b) ∈ R. A binary relation ⊑ on D is a quasi-order iff ⊑ is:

1. reflexive: x ⊑ x for all x ∈ D; and
2. transitive: x ⊑ y ∧ y ⊑ z ⇒ x ⊑ z for all x, y, z ∈ D.

A quasi-order ⊑ on D is a partial order iff ⊑ is antisymmetric: x ⊑ y ∧ y ⊑ x ⇒ x = y for all x, y ∈ D. For example, for every set X the subset relation ⊆ is a partial order on the power set of X, written P(X). A set P together with a partial order on P is called a partially ordered set (poset). Its partial order is denoted ⊑_P, or just ⊑ if omitting the index will not result in confusion. The product of two posets A and B is the poset A × B. Its partial order is defined by (a, b) ⊑_{A×B} (a', b') iff a ⊑_A a' ∧ b ⊑_B b'.
A quasi-order ≡ on D is an equivalence relation iff ≡ is symmetric: x ≡ y iff y ≡ x. An equivalence relation on D partitions D into equivalence classes of the form {x ∈ D | a ≡ x} for all a ∈ D. Corresponding to every set D with quasi-order ⊑ there is a poset [D], whose elements are the equivalence classes [d] = {x ∈ D | x ⊑ d ∧ d ⊑ x} and whose partial order is defined by [x] ⊑_[D] [y] iff x' ⊑ y' for some, hence all, x' ∈ [x] and y' ∈ [y]. Let D be a poset. If there exists an element b ∈ D such that b ⊑ x for all x ∈ D, then b is denoted by the symbol ⊥, read bottom. For all a, b ∈ D the expression a ⊔ b, read least upper bound of a and b, denotes the element in D, if it exists, such that

1. a ⊔ b is an upper bound: a ⊑ a ⊔ b and b ⊑ a ⊔ b, and
2. a ⊔ b is least among upper bounds: a ⊑ x ∧ b ⊑ x ⇒ a ⊔ b ⊑ x for all x ∈ D.

The expression a ⊓ b, read greatest lower bound of a and b, is definable analogously. A poset D is a ⊔-semilattice iff a ⊔ b exists for all a, b ∈ D. The poset D is a lattice iff both a ⊔ b and a ⊓ b exist for all a, b ∈ D. The least upper bound operation can be generalized to sets. Let D be a poset. For a set S ⊆ D the expression ⊔S denotes the element in D, if it exists, such that a ⊑ ⊔S for all a ∈ S and (∀a ∈ S : a ⊑ x) ⇒ ⊔S ⊑ x for all x ∈ D. The least upper bound ⊔S of a set S may equivalently be written ⊔_{x∈S} x. A set C ⊆ D is a chain iff a ⊑ b or b ⊑ a for all a, b ∈ C. A ⊔-semilattice D is a complete partial order (cpo) iff ⊔C exists for all non-empty chains C ⊆ D. A cpo D is a complete lattice iff both greatest lower bounds and least upper bounds exist for all sets S ⊆ D. For example, for every set X the powerset P(X) is a complete lattice that is partially ordered by the subset relation.
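As a quick sanity check of the powerset example, the following sketch (illustrative only) realizes the least upper bound as union and the greatest lower bound as intersection in P(X), and verifies the defining properties on a small instance:

```python
from itertools import combinations

def powerset(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1)
            for c in combinations(xs, r)]

# In the complete lattice P(X) ordered by subset inclusion,
# the lub of two elements is their union and the glb their intersection.
X = {1, 2, 3}
P = powerset(X)
a, b = frozenset({1}), frozenset({2, 3})
lub, glb = a | b, a & b

assert a <= lub and b <= lub                            # lub is an upper bound
assert all(lub <= x for x in P if a <= x and b <= x)    # and the least one
assert glb <= a and glb <= b                            # glb is a lower bound
assert all(x <= glb for x in P if x <= a and x <= b)    # and the greatest one
```

The same ordering by inclusion is what makes powerset-based abstract domains such as Sharing into complete lattices.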
This means, for example, that the expressions A — * B — > C and f a b are implicitly parenthesized A (B — » C ) and ( / a)b. Sometimes we may add redundant parentheses in function application expressions, e.g., f(a) instead of fa . For functions / : A — > B and g : B — > G their composition g o f : A — > C is & function defined by (g o / ) x = g(f x ) for all x £ E A. For a set D and poset R the set of functions D — * R is a poset. Its partial order relation is defined by / Q d-*r 9 iff f x C g x for all x € D. For posets D and R the function / : D — » R is monotonic iff x y =>• / x f y. For epos D and R a monotonic function / : D — * R is continuous iff / ( j j C) = ]_j / x for all xEC non-empty chains C C D. For a function f : D ^ D an element * 6 D is a fixpoint iff / * = x. For a partially ordered set D and a function / : D — > D the expression fix(f), read least fixpoint of / , denotes the fixpoint in D , if it exists, such th at f x — x = > fix(f) E x for all x G D. For a cpo D with _ L element the least fixpoint exists for all functions / : D — * D. It is defined fix(f) = [_] / n -L, where f ° x — x and / n+1 x = f ( f nx). n > 0 2.4 Logic Program Syntax We distinguish the following classes of symbols: variables, functions, predicates, and connectives. Functors are either function symbols or predicate symbols. Functors with arity zero are called constants. A term is defined inductively: 1. A variable is a term. 2. If / is a function symbol with arity n and ...t n are terms, then / ( £ i,. . . , £„) is a term. If p is a predicate symbol with arity n and . . . tn are terms, then p(£i,... ,tn) is an atom. A clause h:-b consists of the atom h , its head, and a finite sequence of atoms, its body. We refer to atoms in a clause body as goals. A logic program or 11 ijust program r; q consists of a finite sequence of clauses r, its rule base, and a finite sequence of atoms g, its query. 
We introduce the following syntactic categories: the set of all variables Var, the set of all terms Term, the set of all atoms Atom, the set of all finite sequences of atoms Body, the set of all clauses Clause, the set of all finite sequences of clauses RuleBase, and the set of all programs Program. In Chapter 3 we will define semantics for the last four of these syntactic categories.

Chapter 3

Abstract Interpretation

Abstract interpretation provides an elegant framework for deriving data-flow information about computer programs [1, 10]. In this approach, the given language is assigned both a concrete semantics and an abstract semantics. The domain of computation states in the concrete semantics is replaced by a domain of descriptions of states in the abstract semantics. Each basic operation on the concrete domain is replaced by a corresponding operation on the abstract domain. Execution of a program according to the abstract semantics produces an approximation to the data-flow information given by the concrete semantics.

This chapter presents a general-purpose framework for the abstract interpretation of logic programs.

3.1 Concrete Semantics

In this section, we define our concrete semantics for logic programs. We give a particular form of concrete semantics, called a collecting semantics [10], which specifies the set of all substitutions that may occur at each point in a program. As a first step, we introduce a domain ESubst of equivalence classes of substitutions and an operation unify on ESubst. We then present the query-dependent and query-independent formulations of the collecting semantics in terms of ESubst and unify. Finally, we prove the equivalence of these formulations.

3.1.1 Substitutions and the Domain ESubst

Let Var be the domain of variables and Term be the domain of terms.
A substitution σ̂ in Subst is a mapping from a finite subset of Var, its domain dom(σ̂), to Term. The domain of an explicitly enumerated substitution is given by the enumeration, e.g., {X ↦ X, Y ↦ X} has domain {X, Y}. The application of substitution σ̂ to term t, denoted σ̂t, is the term obtained from t by replacing all occurrences of variables in dom(σ̂) by their associated terms. The composition of substitutions σ̂ and θ̂, denoted σ̂ ∘ θ̂, is the substitution with domain dom(σ̂) ∪ dom(θ̂) defined by (σ̂ ∘ θ̂)t = σ̂(θ̂t). We let rgv(σ̂) = ∪_{v∈dom(σ̂)} var(σ̂v) be the set of all variables occurring in some term in the range of σ̂.

Our treatment of substitutions differs from the standard one, where substitutions are mappings from Var to Term that are almost everywhere the identity. Our approach reflects the fact that the current substitution during execution has significant bindings only for variables in the current clause. It turns out that explicit accounting of these active variables is essential to our treatment of aliasing.

Term a is said to be an instance of term b, denoted a ≤ b, iff there exists a substitution σ̂ such that a = σ̂b. The quasi-order ≤ on terms induces equivalence classes of terms with consistently renamed variables. Let ETerm be the domain of these equivalence classes and [t] denote the class of all consistent renamings of t. The quasi-order ≤ on terms becomes a partial order on ETerm. We write ⊔ for the least upper bound, which is the most specific generalization, and ⊓ for the greatest lower bound, which is the most general common instance, on ETerm. For example,

    [f(X, X, Y)] ⊔ [f(X, Y, a)] = [f(X, Y, Z)]
    [f(X, X, Y)] ⊓ [f(X, Y, a)] = [f(Y, Y, a)]

Note that ETerm has greatest element [X] but no least element, and least upper bounds always exist. The quasi-order ≤ on terms is lifted to substitutions by σ̂ ≤ θ̂ iff dom(θ̂) ⊆ dom(σ̂) and σ̂t ≤ θ̂t for all terms t with var(t) ⊆ dom(θ̂).
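The application and composition operations above translate directly into code. A sketch under an assumed term representation (variables as Python strings, a compound term f(t₁, ..., tₙ) as the tuple ('f', t₁, ..., tₙ)); the helper names `apply` and `compose` are our own:

```python
def apply(sub, t):
    """Apply substitution sub (a dict Var -> Term) to term t."""
    if isinstance(t, str):                  # t is a variable
        return sub.get(t, t)
    return (t[0],) + tuple(apply(sub, arg) for arg in t[1:])

def compose(sig, th):
    """Composition: (sig o th) t = sig(th t); domain is dom(sig) ∪ dom(th)."""
    out = {v: apply(sig, s) for v, s in th.items()}
    for v, s in sig.items():
        out.setdefault(v, s)
    return out

sig = {"Y": ("a",)}        # Y -> a
th = {"X": ("f", "Y")}     # X -> f(Y)
print(apply(compose(sig, th), "X"))   # ('f', ('a',))
```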
Note that the effects of applying a substitution to terms with variables outside its domain are ignored when it is compared with other substitutions. The quasi-order ≤ on substitutions induces equivalence classes of substitutions with consistently renamed range variables. Let ESubst be the domain of these equivalence classes and [σ̂] denote the class of all consistent renamings of σ̂, e.g., [{X ↦ Y}] = [{X ↦ Z}]. We let ι_C denote the equivalence class of substitutions containing the identity substitution over the variables in clause C. We write ⊔ and ⊓ for the least upper bound and greatest lower bound, respectively, on ESubst. From now on, we let hatted variables range over substitutions and unadorned variables range over classes of substitutions, e.g., [σ̂] = σ. Application of a class of substitutions to a term is defined to be the class of terms resulting from application of the individual substitutions, i.e., [σ̂]t = [σ̂t]. We refer to elements of ESubst simply as substitutions, rather than equivalence classes of substitutions, where ambiguity will not result.

A unifier of terms a and b is a substitution μ̂ such that var(a) ∪ var(b) ⊆ dom(μ̂) and μ̂a = μ̂b. A most general unifier of a and b is a unifier that is largest w.r.t. ≤ on substitutions. When the semantics of logic programs is directly defined using terms and substitutions, an arbitrary choice must be made between possible most general unifiers. This is somewhat problematic if one wishes to compare several different formulations of the semantics. We avoid this problem by working with the domain ESubst of equivalence classes of substitutions. It is easy to show that for any function mgu(a, b) that computes a most general unifier of a and b, we have

    [mgu(a, b)] = ⊔_{μ̂a = μ̂b} [μ̂]

Here, and from now on, an equation like μ̂a = μ̂b above implicitly constrains the domain of μ̂ to cover at least the variables in the terms that μ̂ is applied to.
In discussing most general unifiers, we use the following operational definition.

Definition 1 (mgu)

    mgu(c, c) = ∅  if c is a constant

    mgu(a, b) = {a ↦ a, b ↦ a}  if {a, b} ⊆ Var
                {a ↦ b}          if a ∈ Var
                {b ↦ a}          if b ∈ Var
    if at least one of a or b is in Var

    mgu(f(a₁ ... aₙ), f(b₁ ... bₙ)) = mgu(σ̂aₙ, σ̂bₙ) ∘ σ̂
        where σ̂ = mgu(f(a₁ ... aₙ₋₁), f(b₁ ... bₙ₋₁))

We define the semantics of logic programs in terms of a single operation unify : Term × ESubst × Term × ESubst → ESubst. Oversimplifying somewhat, the result of executing a clause h :- b for a given call a under current substitution σ is taken to be

    unify(a, σ, h, execute(b, unify(h, ι_{h:-b}, a, σ)))

where execute(b, θ) is the result of executing the body b beginning in the substitution θ. Note that unify encompasses all of the low-level operations, including variable renaming, unification, composition, and restriction, normally associated with clause entry and exit. Generally speaking, direct approximations of unify will be more accurate than compositions of approximations of the low-level operations. Moreover, local soundness proofs are simplified since only one operation is involved.

To construct unify, we introduce a "tagging" scheme that models the fact that the terms under consideration have no variables in common. Let V̄ar ⊆ Var be a set of tagged variables that cannot appear in programs, and for X ∈ Var \ V̄ar let X̄ ∈ V̄ar be its tagged version. Tagging is extended to Term, Subst, and ESubst in the natural way, e.g., f(X)‾ = f(X̄), {Y ↦ f(X)}‾ = {Ȳ ↦ f(X̄)}, and [σ̂]‾ = [σ̂‾]. The restriction of substitution σ̂ to untagged variables, denoted |σ̂|, simply removes tagged variables from the domain of σ̂, i.e., dom(|σ̂|) = dom(σ̂) \ V̄ar.

Definition 2 (unify)

    unify(a, σ, b, θ) = [|mgu(μ̂a, μ̂b̄) ∘ μ̂|]  where [μ̂] = σ ⊓ θ̄, if σa ⊓ θ̄b̄ exists
                        undefined,              otherwise

Note that the particular representative μ̂ chosen here is unimportant. We relate this definition to lattice operations on ESubst.
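The operational definition of mgu above can be sketched in code. The version below simplifies the dissertation's treatment: it binds one variable directly to the other instead of recording identity bindings to track the domain, and, like the definition, it performs no occur check. Variables are strings and compound terms are tuples ('f', args...); None signals a clash.

```python
def apply(sub, t):
    if isinstance(t, str):
        return sub.get(t, t)
    return (t[0],) + tuple(apply(sub, a) for a in t[1:])

def compose(sig, th):
    out = {v: apply(sig, s) for v, s in th.items()}
    for v, s in sig.items():
        out.setdefault(v, s)
    return out

def mgu(a, b):
    """Most general unifier, structured like Definition 1 (no occur check)."""
    if isinstance(a, str):
        return {} if a == b else {a: b}
    if isinstance(b, str):
        return {b: a}
    if a[0] != b[0] or len(a) != len(b):
        return None                        # functor clash: no unifier
    sub = {}
    for x, y in zip(a[1:], b[1:]):         # unify arguments left to right,
        s = mgu(apply(sub, x), apply(sub, y))
        if s is None:                      # composing as in the definition
            return None
        sub = compose(s, sub)
    return sub

print(mgu(("f", "X", ("a",)), ("f", ("b",), "Y")))  # {'X': ('b',), 'Y': ('a',)}
```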
Theorem 1 (Characterization of unify)

    unify(a, σ, b, θ) = σ ⊓ θ̄ ⊓ ⊔_{μ̂a = μ̂b̄} [μ̂]

Proof: We have [|mgu(a, b̄)|] = ⊔_{μ̂a = μ̂b̄} [μ̂] and [mgu(μ̂a, μ̂b̄) ∘ μ̂] = [μ̂] ⊓ ⊔_{μ̂′a = μ̂′b̄} [μ̂′], which is the most general extension of [μ̂] that unifies a and b̄. Together they imply the claim. □

Since the collecting semantics is concerned with sets of substitutions, we extend unify to a total function on Answer = P(ESubst).

Definition 3 (Extended unify)

    unify : Term × Answer × Term × Answer → Answer
    unify(a, S, b, T) = {unify(a, σ, b, θ) | σ ∈ S ∧ θ ∈ T ∧ σa ⊓ θ̄b̄ exists}

This version of unify enjoys the following properties.

Theorem 2 (Properties of unify)

    Idempotence:   unify(a, {σ}, b, unify(b, T, a, {σ})) = unify(a, {σ}, b, T)
    Commutativity: unify(a₂, unify(a₁, S, b₁, T₁), b₂, T₂) = unify(a₁, unify(a₂, S, b₂, T₂), b₁, T₁)
    Additivity:    unify(a, S₁ ∪ S₂, b, T) = unify(a, S₁, b, T) ∪ unify(a, S₂, b, T)

Proof: Idempotence of extended unify is implied by a corresponding idempotence of unify on substitutions, i.e., unify(a, σ, b, unify(b, θ, a, σ)) = unify(a, σ, b, θ) if σa ⊓ θ̄b̄ exists. Note that (unify(b, θ, a, σ))b = θb ⊓ σ̄ā. By the characterization of Theorem 1 it remains to show

    σ ⊓ ⊔_{μ̂′a ≤ θ̄b̄} [μ̂′] = σ ⊓ ⊔_{μ̂a ≤ σa ⊓ θ̄b̄} [μ̂]

To show ≤ it suffices to show for all μ̂′ with μ̂′a ≤ θ̄b̄ that (σ ⊓ μ̂′)a ≤ σa ⊓ θ̄b̄. And indeed we have (σ ⊓ μ̂′)a ≤ σa ⊓ μ̂′a ≤ σa ⊓ θ̄b̄. To show ≥ it suffices to show for all μ̂ with μ̂a ≤ σa ⊓ θ̄b̄ that (σ ⊓ μ̂)a ≤ θ̄b̄, which is clearly true. Commutativity of extended unify is implied by commutativity of unify on substitutions; the latter is reduced to the commutativity of the greatest lower bound ⊓ on ESubst by the characterization in Theorem 1. Additivity follows directly from the definition of extended unify, Definition 3. □

3.1.2 The Query-Dependent Formulation

The concrete semantics consists of the following semantic domains and functions over these domains.
The meaning of a program, which consists of a rule base and a query, is an Answer, a set of substitutions. The meaning of a clause body is a function mapping Answer to Answer, defined in the context of the meaning of a rule base. The meaning of a rule base is the combination of the meanings of its clauses, defined as a fixpoint. The meaning of a clause is a function returning an Answer for a given goal and Answer, the latter being the set of current substitutions.

    Answer = P(ESubst)
    ClauseMeaning = Atom × Answer → Answer
    RuleBaseMeaning = {1 ... n_c} → ClauseMeaning
    BodyMeaning = Answer → Answer

where n_c is the number of clauses in the rule base.

    P : Program → Answer
    R : RuleBase → RuleBaseMeaning
    C : Clause → RuleBaseMeaning → ClauseMeaning
    B : Body → RuleBaseMeaning → BodyMeaning
    F : RuleBase → RuleBaseMeaning → RuleBaseMeaning

    P⟦r; q⟧ = B⟦q⟧ (R⟦r⟧) {ι_q}
    R⟦r⟧ = fix(F⟦r⟧)
    C⟦h :- b⟧ e (a, S) = ∪_{σ∈S} unify(a, {σ}, h, B⟦b⟧ e unify(h, {ι_{h:-b}}, a, {σ}))
    B⟦⟧ e S = S
    B⟦a₁ ... aᵢ⟧ e S = ∪_{1≤j≤n_c} e j (aᵢ, B⟦a₁ ... aᵢ₋₁⟧ e S)
    F⟦C₁ ... C_{n_c}⟧ e j = C⟦Cⱼ⟧ e

The domain ClauseMeaning is a domain of functions that are continuous in Answer. RuleBaseMeaning is a cpo with ⊥ j (a, S) = ∅ and (⊔E) j (a, S) = ∪_{e∈E} e j (a, S). We show that B⟦b⟧ is continuous by induction over the length of the body b. B⟦⟧ is clearly continuous. Using the continuity of B⟦a₁ ... aᵢ₋₁⟧ and of the ClauseMeaning function ⊔D we have

    B⟦a₁ ... aᵢ⟧ (⊔D) S = ∪_{1≤j≤n_c} ∪_{d∈D} (⊔D) j (aᵢ, B⟦a₁ ... aᵢ₋₁⟧ d S).

After using the definition of ⊔D we combine the two unions over elements of D and fold the definition of B⟦a₁ ... aᵢ⟧ to obtain ∪_{d∈D} B⟦a₁ ... aᵢ⟧ d S. The continuity of C⟦h :- b⟧ is implied by the continuity of B⟦b⟧ and an additivity property of unify: unify(a, S, b, ∪D) = ∪_{d∈D} unify(a, S, b, d). We have shown that B⟦b⟧ and C⟦C⟧ are continuous. It is well known that, under these conditions, the least fixpoint fix(F⟦r⟧) is ⊔ₙ F⟦r⟧ⁿ⊥, B⟦b⟧ (R⟦r⟧) S = ∪ₙ B⟦b⟧ F⟦r⟧ⁿ⊥ S,
and C⟦C⟧ (R⟦r⟧) (a, S) = ∪ₙ C⟦C⟧ F⟦r⟧ⁿ⊥ (a, S), where F⟦r⟧⁰⊥ = ⊥ and F⟦r⟧ⁿ⁺¹⊥ = F⟦r⟧(F⟦r⟧ⁿ⊥).

3.1.3 The Query-Independent Formulation

In the above query-dependent formulation of the concrete semantics, the body of a clause is processed in the context of the query (a, S), i.e., C⟦h :- b⟧ e (a, S) is derived from B⟦b⟧ e applied to unify(h, {ι_{h:-b}}, a, {σ}) for σ ∈ S. In the following alternative formulation, in which each semantic function A is replaced by a corresponding semantic function Ã, the body is processed independently from the query. The definition of each semantic function Ã is the same except for C̃⟦C⟧, which is defined as follows.

Definition 4 (C̃⟦C⟧)

    C̃⟦h :- b⟧ e (a, S) = unify(a, S, h, B̃⟦b⟧ e {ι_{h:-b}})

Here, the Answer B̃⟦b⟧ (R̃⟦r⟧) {ι_{h:-b}} is fixed; it is independent from queries and run-time states. The meaning of a clause incorporates the query and state simply by applying the operation unify to this fixed value. Intuitively, this suggests a bottom-up implementation where the body goals are solved in isolation, after which the head is unified against the query. Generally speaking, this implementation cannot be recommended since the fixed Answer obtained by solving the body goals may be infinite and may contain many substitutions that are not unifiable with the query. It is, however, useful to approximate this query-independent formulation in the abstract semantics, as we will see in Section 3.2.3.

The following theorem shows that the two formulations of the concrete semantics are equivalent.

Theorem 3 (Concrete Equivalence)

    C⟦C⟧ (R⟦r⟧) = C̃⟦C⟧ (R̃⟦r⟧)

Proof: We prove by induction over n that the following three equations hold for all n. The theorem follows directly from the fact that Claim II holds for all n.
    I.   B̃⟦b⟧ F̃⟦r⟧ⁿ⊥ unify(h, T, a, S) = unify(h, B̃⟦b⟧ F̃⟦r⟧ⁿ⊥ T, a, S)
    II.  C⟦h :- b⟧ F⟦r⟧ⁿ⊥ (a, S) = C̃⟦h :- b⟧ F̃⟦r⟧ⁿ⊥ (a, S)
    III. C̃⟦C⟧ F̃⟦r⟧ⁿ⊥ (a₁, unify(a₂, S, a′, T)) = unify(a₂, C̃⟦C⟧ F̃⟦r⟧ⁿ⊥ (a₁, S), a′, T)

In the base case, n = 0 and B⟦b⟧ ⊥ S = ∅ = B̃⟦b⟧ ⊥ S for all non-empty bodies b. If b is empty then I holds by the definitions of B⟦⟧ and B̃⟦⟧. Thus I holds for all b. II is implied by I and the idempotence of unify: unfold the definition of C⟦h :- b⟧, use I, apply idempotence of unify, collapse the union of unifications with singleton sets into one unification, and fold the definition of C̃⟦h :- b⟧. III is implied by II and the commutativity of unify: use II, unfold the definition of C̃⟦h :- b⟧, apply commutativity of unify, and fold the definition of C̃⟦h :- b⟧.

For the inductive step, n > 0 and we show I by induction over the body b. I holds trivially if b is empty. For non-empty bodies b = a₁ ... aᵢ, I is implied by the inductive assumption of I and the inductive assumption of III, i.e., after unfolding the definition of B̃⟦a₁ ... aᵢ⟧ and F̃⟦r⟧ⁿ⊥ and using the inductive assumption, the left hand side is

    ∪_{1≤j≤n_c} C̃⟦Cⱼ⟧ F̃⟦r⟧ⁿ⁻¹⊥ (aᵢ, unify(h, B̃⟦a₁ ... aᵢ₋₁⟧ F̃⟦r⟧ⁿ⊥ T, a, S)).

After applying the inductive assumption of III, use additivity of unify to collapse the union into unify to obtain

    unify(h, ∪_{1≤j≤n_c} C̃⟦Cⱼ⟧ F̃⟦r⟧ⁿ⁻¹⊥ (aᵢ, B̃⟦a₁ ... aᵢ₋₁⟧ F̃⟦r⟧ⁿ⊥ T), a, S).

The right hand side of I is obtained from this by folding the definition of F̃⟦r⟧ⁿ⊥ and then B̃⟦a₁ ... aᵢ⟧. The proofs of II and III in the inductive step are analogous to the proofs in the base case. □

3.2 Abstract Semantics

In this section, we present our general-purpose framework for the abstract interpretation of logic programs. As a first step, we state required properties of the underlying abstract domain and its operations.
We then present query-dependent and query-independent formulations of the abstract semantics, and show how condensing is derived from the latter. We give algebraic properties that ensure that these two formulations are equivalent, so that condensing does not result in a loss of accuracy. Finally, we prove global soundness of the two formulations of the abstract semantics with respect to the corresponding formulations of the concrete semantics. This ensures that the query-independent abstract semantics, and therefore condensing, is globally sound with respect to the query-dependent concrete semantics.

3.2.1 Local Soundness Requirements

The domain Answer of sets of substitutions in the concrete semantics is replaced by a domain Answer′ in the abstract semantics. Each proposed abstract domain Answer′ must meet the following requirements.

1. Answer′ is a cpo with a least element ⊥.
2. There is a concretization function γ : Answer′ → Answer such that γ(S) is the set of all substitutions that satisfy the property of interest described by S. We require that the concretization function be monotonic, i.e., ∪_{S∈W} γ(S) ⊆ γ(⊔W) for all W ⊆ Answer′.
3. There is a function init : Clause → Answer′ satisfying ι_C ∈ γ(init(C)), and there is a function aunify : Term × Answer′ × Term × Answer′ → Answer′ satisfying unify(a, γ(S), b, γ(T)) ⊆ γ(aunify(a, S, b, T)).
4. Finally, to ensure that fixpoints in the semantics are well-defined, we require that aunify be continuous in its Answer′ arguments.

These local soundness requirements are used to establish global soundness, i.e., that for every point in the program, the set S ∈ Answer of possible current substitutions is approximated by an element S′ ∈ Answer′ with S ⊆ γ(S′).

As a very simple example, the following domain approximates the set of variables bound to ground terms. Answer′ is P(Var) with G ⊔ G′ = G ∩ G′ and least element the set of all variables Var.
Here, γ(G) = {[σ̂] | ∀v ∈ G : var(σ̂v) = ∅}, the set of all substitutions that ground at least the variables in G. We define init(C) = ∅ and

    aunify(a, G, b, G′) = var(a) ∪ G  if var(b) ⊆ G′
                          G            otherwise

which satisfies the requirements above.

3.2.2 The Query-Dependent Formulation

The following abstract domains and operations correspond directly to domains and operations in the concrete semantics.

    ClauseMeaning′ = Atom × Answer′ → Answer′
    RuleBaseMeaning′ = {1 ... n_c} → ClauseMeaning′
    BodyMeaning′ = Answer′ → Answer′

where n_c is the number of clauses in the rule base.

    P′ : Program → Answer′
    R′ : RuleBase → RuleBaseMeaning′
    C′ : Clause → RuleBaseMeaning′ → ClauseMeaning′
    B′ : Body → RuleBaseMeaning′ → BodyMeaning′
    F′ : RuleBase → RuleBaseMeaning′ → RuleBaseMeaning′

    P′⟦r; q⟧ = B′⟦q⟧ (R′⟦r⟧) init(q)
    R′⟦r⟧ = fix(F′⟦r⟧)
    C′⟦h :- b⟧ e (a, S) = aunify(a, S, h, B′⟦b⟧ e aunify(h, init(h :- b), a, S))
    B′⟦⟧ e S = S
    B′⟦a₁ ... aᵢ⟧ e S = ⊔_{1≤j≤n_c} e j (aᵢ, B′⟦a₁ ... aᵢ₋₁⟧ e S)
    F′⟦C₁ ... C_{n_c}⟧ e j = C′⟦Cⱼ⟧ e

ClauseMeaning′ is the domain of functions that are continuous in Answer′. As in the standard semantics, B′⟦b⟧ is shown to be continuous by induction over the length of the body b, using the fact that all ClauseMeaning′ functions are continuous in Answer′. Similarly, B′⟦b⟧ e is shown to be continuous. C′⟦h :- b⟧ e is well-defined, i.e., continuous in Answer′, since aunify is continuous in Answer′ and B′⟦b⟧ e is continuous. Finally, the continuity of C′⟦C⟧ is implied by the continuity of B′⟦b⟧ and of aunify. Having established continuity of C′⟦C⟧ and B′⟦b⟧, the least fixpoint is known to be ⊔ₙ F′⟦r⟧ⁿ⊥, B′⟦b⟧ (R′⟦r⟧) S = ⊔ₙ B′⟦b⟧ F′⟦r⟧ⁿ⊥ S, and C′⟦C⟧ (R′⟦r⟧) (a, S) = ⊔ₙ C′⟦C⟧ F′⟦r⟧ⁿ⊥ (a, S).
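The simple groundness domain of Section 3.2.1 is easy to implement. A sketch (the term encoding, variables as strings and compound terms as tuples, is our assumption): an abstract answer is the set of variables known to be ground, and aunify follows the case split in the definition.

```python
def variables(t):
    """The set of variables occurring in a term."""
    if isinstance(t, str):                 # a variable
        return {t}
    out = set()
    for arg in t[1:]:                      # ('f', t1, ..., tn)
        out |= variables(arg)
    return out

def aunify(a, G, b, Gp):
    """If b is ground under Gp, unifying a against b grounds var(a) as well."""
    if variables(b) <= Gp:
        return variables(a) | G
    return G

# Calling p(X, Y) against a head p(Z, f(a)) where Z is already ground:
# afterwards both X and Y are known ground.
print(aunify(("p", "X", "Y"), set(), ("p", "Z", ("f", ("a",))), {"Z"}))
```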
3.2.3 The Query-Independent Formulation

Naive abstract execution of a goal according to the above query-dependent formulation of the semantics requires computation of a fixpoint in the abstract domain. If the same or similar calls are made repeatedly, as is often the case in the analysis of logic programs, then this technique will be unacceptably slow for a practical compiler. A solution to this problem can be derived from the following alternative formulation of the abstract semantics, in which the body of a clause is processed independently from the query. In this formulation, semantic functions A′ are replaced by semantic functions Ã′ in analogy to the concrete case.

Definition 5 (C̃′⟦C⟧)

    C̃′⟦h :- b⟧ e (a, S) = aunify(a, S, h, B̃′⟦b⟧ e init(h :- b))

Here, the Answer′ B̃′⟦b⟧ (R̃′⟦r⟧) init(h :- b) is fixed; it is independent from queries and run-time states. The meaning of a clause incorporates the query and state simply by applying the operation aunify to this fixed value. This suggests an implementation where an approximation to the meaning of each clause body is pre-computed, a process we refer to as condensing, allowing specific calls to be abstractly executed without computation of a fixpoint. Such pre-computation is feasible because the approximation of the meaning of a clause body is finite, in contrast to the meaning itself, which is generally infinite. Note that the process of condensing itself requires computation of a fixpoint; the saving comes only in abstractly executing subsequent calls. An example of condensing appears in Section 5.1.

3.2.4 Global Soundness

We now show the global soundness of the query-independent abstract semantics with respect to the query-independent concrete semantics. In combination with Theorem 3, this implies the soundness of condensing. As a first step, we present a lemma relating concrete clause meanings C̃⟦C⟧ to abstract clause meanings C̃′⟦C⟧.
The results in this section are stated in terms of the concretization function γ introduced in Section 3.2.1.

Lemma 4

    I.  B̃⟦b⟧ F̃⟦r⟧ⁿ⊥ γ(S) ⊆ γ(B̃′⟦b⟧ F̃′⟦r⟧ⁿ⊥ S)
    II. C̃⟦C⟧ F̃⟦r⟧ⁿ⊥ (a, γ(S)) ⊆ γ(C̃′⟦C⟧ F̃′⟦r⟧ⁿ⊥ (a, S))

Proof: By induction over n. For n = 0, claim I is trivially true. For all n, claim II is implied by claim I, soundness of init and aunify, and monotonicity of unify. For n > 0, claim I can be shown by induction over the length of the body b, using the inductive assumption of claim II for n − 1 and monotonicity of F̃′⟦r⟧ⁿ⊥ j and γ in Answer′. □

Theorem 5 (Global Soundness)

    P⟦r; q⟧ ⊆ γ(P̃′⟦r; q⟧)
    C̃⟦C⟧ (R̃⟦r⟧) (a, γ(S)) ⊆ γ(C̃′⟦C⟧ (R̃′⟦r⟧) (a, S))
    B̃⟦b⟧ (R̃⟦r⟧) γ(S) ⊆ γ(B̃′⟦b⟧ (R̃′⟦r⟧) S)

Proof: The lemma, together with the monotonicity of γ, the characterizations of fix(F̃⟦r⟧) and fix(F̃′⟦r⟧), and the continuity of B̃⟦b⟧, C̃⟦C⟧, B̃′⟦b⟧, and C̃′⟦C⟧, implies the main soundness claim. □

We now show the global soundness of the query-dependent abstract semantics with respect to the query-dependent concrete semantics.

Theorem 6

    P⟦r; q⟧ ⊆ γ(P′⟦r; q⟧)
    C⟦C⟧ (R⟦r⟧) (a, γ(S)) ⊆ γ(C′⟦C⟧ (R′⟦r⟧) (a, S))
    B⟦b⟧ (R⟦r⟧) γ(S) ⊆ γ(B′⟦b⟧ (R′⟦r⟧) S)

Proof: Following [20], we introduce the so-called plural semantics for clause meanings. It differs from the standard semantics by failing to match up the result of executing the body with the substitution with which the execution started. We obtain the plural semantics by replacing all semantic functions A with corresponding semantic functions A_plu. The semantic equations are the same except for C_plu⟦C⟧, which is defined

    C_plu⟦h :- b⟧ e (a, S) = unify(a, S, h, B⟦b⟧ e unify(h, {ι_{h:-b}}, a, S))

Clearly, C⟦C⟧ e (a, S) ⊆ C_plu⟦C⟧ e (a, S). This inclusion can easily be lifted to the other semantic functions. It then remains to show

    C_plu⟦C⟧ e (a, γ(S)) ⊆ γ(C′⟦C⟧ e (a, S))

The proof of this is analogous to the proof of Theorem 5.
□

3.2.5 Comparison of the Two Formulations

The two formulations of the abstract semantics may produce different results. Below we examine how the abstract query-dependent and query-independent semantics relate in accuracy, based on various algebraic properties of the abstract unification operation aunify. We start by defining the algebraic properties whose consequences we examine below.

Definition 6 (Commutativity of aunify)

    aunify(a₁, aunify(a₂, S, b₂, T₂), b₁, T₁) = aunify(a₂, aunify(a₁, S, b₁, T₁), b₂, T₂)

Definition 7 (Additivity of aunify)

    ⊔_{S∈W} aunify(a, S, b, T) = aunify(a, ⊔_{S∈W} S, b, T)

Note that monotonicity of aunify, which is one of the local soundness requirements, is already one half of additivity. Namely,

    ⊔_{S∈W} aunify(a, S, b, T) ⊑ aunify(a, ⊔_{S∈W} S, b, T)

Definition 8 (Idem-⊑)

    aunify(a, S, b, T) ⊑ aunify(a, S, b, aunify(b, T, a, S))

Definition 9 (Idem-⊒)

    aunify(a, S, b, T) ⊒ aunify(a, S, b, aunify(b, T, a, S))

We say aunify is idempotent iff it is both Idem-⊑ and Idem-⊒. For some abstract domains, unification can only further restrict a given approximation, i.e., aunify(a, S, b, T) ⊑ S. This implies Idem-⊒ for such a domain since aunify is monotonic. All downwards closed abstract domains, a class of abstract domains defined by [25], have this property. An abstract domain Abs is downwards closed if, in addition to satisfying the requirements stated in Section 3.2.1, Abs is a lattice with greatest lower bound ⊓, and all substitution sets in the range of γ are closed under instantiation, i.e., A ∈ Abs ∧ σ ∈ γ(A) ∧ σ′ ≤ σ ⟹ σ′ ∈ γ(A). Given this definition we see that A ⊓ aunify(a, A, a′, A′) is a safe approximation of unify(a, γ(A), a′, γ(A′)), and therefore aunify can be defined in a way that satisfies aunify(a, S, b, T) ⊑ S. The domain Prop presented in Section 4.4 is downwards closed.
Dually, for some abstract domains abstract unification can never restrict the approximation, i.e., S ⊑ aunify(a, S, b, T). This implies Idem-⊑ for those domains. For example, in free variable analysis a domain approximates the set of variables that are definitely mapped to variables. In such a domain variables never become definitely free as a result of abstract unification, so aunify is Idem-⊑. It is difficult to achieve Idem-⊒ in the analysis of free variables.

The different semantics of a clause without a body in the query-dependent and query-independent formulations illustrate the importance of Idem-⊒. Consider the clause p(X,X), whose concrete semantics is interchangeable with built-in unification, =/2. We would hope to infer that A and B remain free after executing p(f(A,B),C) if A, B, and C are free and pairwise independent beforehand. However, the query-dependent semantics first approximates the freeness of X in p(X,X), and X is not free after unification with p(f(A,B),C). Unless this approximation includes some information about the term structure of X at run-time, the query-dependent semantics will not be able to infer freeness of any variables in the call. The query-independent semantics allows us to infer freeness of A and B in the call with less effort.

The algebraic properties are important even if a more conventional form of abstract execution is employed. Idempotence states that repeated abstract execution of the same unification operation does not change the ultimate result. Commutativity states that abstract execution of two different unification operations produces the same result regardless of the order in which they execute. Lack of idempotence may make the query-dependent semantics sensitive to the execution order of goals in a body even if aunify is commutative. Section 4.1.2 gives an example of this.
Chapter 4 presents an example that shows the effects of the lack of commutativity: certain aliasing domains are less accurate when aliasing unifications precede grounding unifications. Additivity ensures that no information is lost in taking the least upper bound of the meanings of individual clauses to compute the meaning of a procedure. As an example of the effects of the lack of additivity, in-line expansion of a call in a clause, a process that entails making multiple copies of the clause, improves the accuracy of abstract execution for non-additive domains.

Before relating the query-independent and query-dependent abstract semantics, we prove some lemmata concerning commutativity of abstract unification and the abstract meanings of bodies and clauses.

Lemma 7 (Commutativity of unification and clause meaning)

If aunify is commutative then

    aunify(a₁, C̃′⟦C⟧ e (a₂, S), a′, T) = C̃′⟦C⟧ e (a₂, aunify(a₁, S, a′, T))

Proof: The lemma is a direct consequence of the definition of C̃′⟦C⟧ and commutativity of aunify. □

The next lemma states that no accuracy is lost by eagerly evaluating a unification before applying the meaning of a body.

Lemma 8 (No loss for unification before body meaning)

If aunify is commutative

    B̃′⟦b⟧ F̃′⟦r⟧ⁿ⊥ aunify(a, S, a′, T) ⊑ aunify(a, B̃′⟦b⟧ F̃′⟦r⟧ⁿ⊥ S, a′, T)

Proof: By induction over the length of the body b. If b is empty the claim holds trivially. If b is not empty but n = 0 the claim also holds since its left hand side is ⊥. If n > 0 and b = a₁ ... aᵢ, we unfold the definition of B̃′⟦a₁ ... aᵢ⟧ and use the inductive assumption and monotonicity of the ClauseMeaning′ functions to obtain

    B̃′⟦a₁ ... aᵢ⟧ F̃′⟦r⟧ⁿ⊥ aunify(a, S, a′, T) ⊑ ⊔_{1≤j≤n_c} F̃′⟦r⟧ⁿ⊥ j (aᵢ, aunify(a, B̃′⟦a₁ ... aᵢ₋₁⟧ F̃′⟦r⟧ⁿ⊥ S, a′, T))

Now we unfold the definition of F̃′⟦r⟧ⁿ⊥, use Lemma 7 and monotonicity of aunify to obtain

    ⊔_{1≤j≤n_c} F̃′⟦r⟧ⁿ⊥ j (aᵢ, aunify(a, B̃′⟦a₁ ... aᵢ₋₁⟧ F̃′⟦r⟧ⁿ⊥ S, a′, T)) ⊑ aunify(a, ⊔_{1≤j≤n_c} C̃′⟦Cⱼ⟧ F̃′⟦r⟧ⁿ⁻¹⊥ (aᵢ, B̃′⟦a₁ ... aᵢ₋₁⟧ F̃′⟦r⟧ⁿ⊥ S), a′, T)

From this right hand side we obtain the right hand side of the claim by folding the definitions of F̃′⟦r⟧ⁿ⊥ and B̃′⟦a₁ ... aᵢ⟧. □

Lemma 9 (Commutativity of unification and body meaning)

If aunify is commutative and additive

    B̃′⟦b⟧ F̃′⟦r⟧ⁿ⊥ aunify(a, S, a′, T) = aunify(a, B̃′⟦b⟧ F̃′⟦r⟧ⁿ⊥ S, a′, T)

Proof: The proof is analogous to the proof of Lemma 8. Equality is obtained by using the stronger inductive assumption for shorter bodies and additivity instead of monotonicity of aunify. □

Now we are prepared to relate the query-independent and query-dependent abstract semantics.

Lemma 10

If aunify is commutative and Idem-⊒

    I.  B′⟦b⟧ F′⟦r⟧ⁿ⊥ S ⊑ B̃′⟦b⟧ F̃′⟦r⟧ⁿ⊥ S
    II. C′⟦h :- b⟧ F′⟦r⟧ⁿ⊥ (a, S) ⊑ C̃′⟦h :- b⟧ F̃′⟦r⟧ⁿ⊥ (a, S)

Proof: By induction over n. For n = 0, claim I is trivially true if the body b is empty, and true for non-empty bodies b since the left hand side is ⊥. For all n, claim II is implied by claim I: unfold the definition of C′⟦h :- b⟧ and use claim I and monotonicity of aunify to obtain

    C′⟦h :- b⟧ F′⟦r⟧ⁿ⊥ (a, S) ⊑ aunify(a, S, h, B̃′⟦b⟧ F̃′⟦r⟧ⁿ⊥ aunify(h, init(h :- b), a, S))

Now we use monotonicity of aunify together with Lemma 8, then Idem-⊒, and fold the definition of C̃′⟦h :- b⟧ to obtain the right hand side of claim II. For n > 0, claim I is shown by induction over the length of the body b. The base case is trivial. For b = a₁ ... aᵢ, we unfold the definition of B′⟦a₁ ... aᵢ⟧, use the inductive assumption for shorter bodies and monotonicity of the ClauseMeaning′ function F′⟦r⟧ⁿ⊥ j, then unfold F′⟦r⟧ⁿ⊥, use claim II for n − 1, and fold F̃′⟦r⟧ⁿ⊥ and B̃′⟦a₁ ... aᵢ⟧. □
Theorem 11

If aunify is commutative and Idem-⊒

    C′⟦C⟧ R′⟦r⟧ (a, S) ⊑ C̃′⟦C⟧ R̃′⟦r⟧ (a, S)
    B′⟦b⟧ R′⟦r⟧ S ⊑ B̃′⟦b⟧ R̃′⟦r⟧ S

Proof: The theorem is a consequence of Lemma 10 and continuity of the semantic functions. □

Section 5.2 shows an example where lack of Idem-⊒ allows the query-independent semantics to infer a more accurate approximation. Commutativity, however, is not a necessary condition. The conclusion of Theorem 11 also follows if abstract unification cannot lose accuracy, i.e., if aunify satisfies aunify(h, S, a, S′) ⊑ S. The optimal aunify in a downwards closed domain always satisfies this.

Lemma 12

If aunify is commutative, additive, and Idem-⊑

    I.  B̃′⟦b⟧ F̃′⟦r⟧ⁿ⊥ S ⊑ B′⟦b⟧ F′⟦r⟧ⁿ⊥ S
    II. C̃′⟦h :- b⟧ F̃′⟦r⟧ⁿ⊥ (a, S) ⊑ C′⟦h :- b⟧ F′⟦r⟧ⁿ⊥ (a, S)

Proof: By induction over n. For n = 0, claim I is trivially true if the body b is empty, and true for non-empty bodies b since the left hand side is ⊥. For all n, claim II is implied by claim I: unfold the definition of C̃′⟦h :- b⟧ and use Idem-⊑ to obtain

    C̃′⟦h :- b⟧ F̃′⟦r⟧ⁿ⊥ (a, S) ⊑ aunify(a, S, h, aunify(h, B̃′⟦b⟧ F̃′⟦r⟧ⁿ⊥ init(h :- b), a, S))

Now we use Lemma 9, for which we need additivity, then claim I and monotonicity of aunify, and then fold C′⟦h :- b⟧. For n > 0, claim I is shown by induction over the length of the body b using claim II for n − 1. The steps and justifications are analogous to the proof of this case in Lemma 10. □

Theorem 13

If aunify is commutative, additive, and Idem-⊑

    C̃′⟦C⟧ R̃′⟦r⟧ (a, S) ⊑ C′⟦C⟧ R′⟦r⟧ (a, S)
    B̃′⟦b⟧ R̃′⟦r⟧ S ⊑ B′⟦b⟧ R′⟦r⟧ S

Proof: The theorem is a consequence of Lemma 12 and continuity of the semantic functions.
□

In sections 4.1.2 and 4.2.2 we show examples where lack of additivity and commutativity, respectively, allows the query-dependent semantics to infer more accurate approximations.

The premises of Theorems 11 and 13 indicate the conditions under which the query-dependent and query-independent abstract semantics are equivalent.

Theorem 14 (Equivalence of abstract semantics) If aunify is commutative, additive, and idempotent

C′[[c]] R′[[r]] (a, S) = C″[[c]] R″[[r]] (a, S)
B′[[b]] R′[[r]] S = B″[[b]] R″[[r]] S

Proof: The theorem is a corollary of Theorems 11 and 13. □

In the simple domain for groundness presented in section 3.2.1, aunify is commutative, additive, and idempotent. Thus, condensing does not result in a loss of accuracy. In section 4.4 we present a considerably higher-accuracy domain for groundness, originally proposed by [20], whose elements are propositional formulas. There we show that aunify for this domain is also commutative, additive, and idempotent. In the abstract domains for aliasing proposed in [6, 8, 11, 18, 30], aunify is not commutative or additive, thus condensing may result in a loss of accuracy. In section 4.1 we present a new abstract domain for aliasing. Its aunify is commutative and Idem-⊑ but not additive. In the next section we show how such domains may be lifted.

3.3 Power Abstract Domains

An abstract domain D can be lifted to a power domain P(D). If abstract unification in D is commutative and Idem-⊑, the query-independent formulation using P(D) is more accurate than the query-dependent formulation.

For every abstract domain D we can consider a corresponding power domain P(D) with union as least upper bound and the empty set as least element. Formally, an element of the power domain is a set that is closed under inclusion of smaller elements, i.e.,

d ∈ P ∧ d′ ⊑ d ⟹ d′ ∈ P.
This requirement ensures that the smallest power-domain elements that correspond to two comparable domain elements are comparable. There is a certain redundancy in this representation of substitutions. It could be removed by restricting the elements of the power domain to be sets of pairwise incomparable domain elements and introducing a more complicated least upper bound operator, which removes subsumed elements from the set union. Avoiding the redundancy is important for an efficient implementation, but is irrelevant for the following considerations of algebraic properties.

The concretization function for P(D) is defined in terms of the concretization function for D:

γ_P(D)(P) = ⋃_{d ∈ P} γ_D(d)

Clearly, γ_P(D) is monotonic. The operations aunify_D and init_D are extended naturally to the power domain:

init_P(D)(c) = {S | S ⊑ init_D(c)}
aunify_P(D)(a, P, b, P′) = ⋃_{S ∈ P, S′ ∈ P′} {S″ | S″ ⊑ aunify_D(a, S, b, S′)}

The local soundness of init_P(D) and aunify_P(D) is implied by the definition of γ_P(D) and the local soundness of init_D and aunify_D.

Lemma 15 If aunify_D for D is Idem-⊑ then aunify_P(D) is Idem-⊑.

Proof: We have

aunify_P(D)(a, P, b, P′)
  = {S″ | S″ ⊑ aunify_D(a, S, b, S′) ∧ S ∈ P ∧ S′ ∈ P′}
  ⊆ {S″ | S″ ⊑ aunify_D(a, S, b, aunify_D(b, S′, a, S)) ∧ S ∈ P ∧ S′ ∈ P′}
  ⊆ {S″ | S″ ⊑ aunify_D(a, S₁, b, aunify_D(b, S′, a, S₂)) ∧ S₁ ∈ P ∧ S₂ ∈ P ∧ S′ ∈ P′}
  ⊆ {S″ | S″ ⊑ aunify_D(a, S, b, S‴) ∧ S ∈ P ∧ S‴ ∈ aunify_P(D)(b, P′, a, P)}
  = aunify_P(D)(a, P, b, aunify_P(D)(b, P′, a, P))

by unfolding aunify_P(D), Idem-⊑ of aunify_D, basic set algebra, and folding aunify_P(D) twice. □

Theorem 16 If aunify_D for D is Idem-⊑ and commutative then the query-independent abstract semantics for P(D) is more accurate than the query-dependent abstract semantics.

Proof: By Theorem 13 it suffices to show that aunify_P(D) is additive, commutative, and Idem-⊑. By definition aunify_P(D) is additive.
By Lemma 15 aunify_P(D) is Idem-⊑, and it is easily seen that aunify_P(D) is commutative if aunify_D is. □

3.4 Mode Inference

Commonly, clauses whose heads share a common functor and arity are compiled together as a procedure, and when an atom a is solved with a set of current substitutions S, the procedure defining a is called. A mode describes which arguments to a procedure call are possible. This information is essential for optimization at the procedure level. A mode abstracts from the caller's substitutions as well as from substitutions over any of the procedure's clauses. We describe possible arguments by describing possible substitutions over standardized atoms, where each variable corresponds to a unique argument position.

Given an assumption about how a rule base r will be used, the standard semantics given in section 3.1 determines the possible set of substitutions for every point in the program. This is easily extended to specify procedure modes. We have already observed that different, equivalent formulations of the concrete semantics, namely the query-dependent and the query-independent formulation, may give rise to different approximations. We encounter even more variety in equivalent formulations of the concrete semantics that determine procedure modes. In general, different formulations lead to different approximations. This problem deserves more attention. Below we present two formalizations of procedure modes. Then, we give several safe approximations of procedure modes, show under which circumstances they are equivalent, and compare their accuracy.

We introduce a number of conventions and auxiliary functions to formalize procedure modes. We assume a fixed rule base r with nc clauses. We consistently write cᵢ for the i-th clause and hᵢ for its head. The domain Site contains pairs of indices corresponding to all points in the program, i.e., (i, j) ∈ Site if cᵢ has at least j atoms.
We write a₍ᵢ,ⱼ₎ for the j-th atom of the i-th clause and, in general, for a given site s we write aₛ for the atom at site s and bₛ for the, possibly empty, list of atoms before aₛ. We define the auxiliary function enter : {1 … nc} × Atom × Answer → Answer as

enter(i, a, S) = unify(hᵢ, {ι_{cᵢ}}, a, S)

to describe the possible substitutions after entering the i-th clause from call a with its possible set of substitutions S.

Formally, we model a mode as a set of substitutions over a standardized atom, that is, the domain Mode is a subset of Answer. Let skel(a) be the atom a stripped of all proper subterms, with canonical variable names in the argument positions. For example, skel(qsort(M, S, [A|X])) = qsort(A, B, C). Using this standardization we define a function callmode : Atom × Answer → Mode that specifies the arguments of a call to a's procedure when a is solved with S, as a Mode over skel(a):

callmode(a, S) = unify(skel(a), {ι_{skel(a)}}, a, S).

The mode inference problem is to describe, for each procedure in a program, with which modes it can possibly be called, based on some entry declaration. The entry declaration explicitly states possible calling modes for some predicates. For simplicity we assume that the declaration is given as a function decl : Atom → Mode, mapping an atom a to a description of possible entries to a's procedure. This description is given in the form of a Mode over the variables in skel(a). This function and the mode functions below map all atoms corresponding to the same procedure to the same Mode, i.e., f a = f skel(a).

We now define a function entry : (Atom → Mode) → {1 … nc} → Answer that specifies possible substitutions for each clause after unification with the head, based on some declaration. entry d is the smallest function satisfying both

enter(i, skel(hᵢ), d hᵢ) ⊆ entry d i

and

enter(k, a₍ᵢ,ⱼ₎, B[[b₍ᵢ,ⱼ₎]] R[[r]] (entry d i)) ⊆ entry d k.
This states that all clause entries for which there are declared procedure entries, as well as clause entries resulting from solving the atom a₍ᵢ,ⱼ₎, are possible clause entries. This definition of entry d is well defined since enter and B[[b]] R[[r]] are continuous. We give the function at : (Atom → Mode) → Site → Answer, which specifies possible substitutions for each site based on some declaration:

at d (i, j) = B[[b₍ᵢ,ⱼ₎]] R[[r]] (entry d i).

Using this function we can describe the possible arguments of calls to a's procedure based on a declaration d by mode : (Atom → Mode) → Atom → Mode:

mode d a = d a ∪ ⋃_{s ∈ Site, skel(a) = skel(aₛ)} callmode(aₛ, at d s)

Here the possible procedure entries are defined in terms of possible substitutions at each site in the rule base. Alternatively, modes may be specified directly as a fixpoint. We define m̂ode : (Atom → Mode) → Atom → Mode by requiring that m̂ode d is the smallest function satisfying both

d a ⊆ m̂ode d a

and

callmode(a₍ᵢ,ⱼ₎, B[[b₍ᵢ,ⱼ₎]] R[[r]] enter(i, skel(hᵢ), m̂ode d hᵢ)) ⊆ m̂ode d a₍ᵢ,ⱼ₎.

This states that all declared procedure entries are possible and that a possible procedure entry to the i-th clause leads to possible procedure entries for the atoms called in the body. These two specifications are equivalent in the concrete domain but may give rise to different approximations in the abstract domain. The equivalence uses the following property of callmode and skel.

Lemma 17 unify(a, S, skel(a′), callmode(a′, T)) = unify(a, S, a′, T)

Proof: By the definition of unify it suffices to show for {θ} = callmode(a′, {σ}) that θ skel(a′) = σa′. Since σa′ ≤ skel(a′), the latter follows from the definitions of callmode and unify. □

As a direct consequence of this lemma we have enter(i, skel(a), callmode(a, S)) = enter(i, a, S), which we now use to establish the equivalence of the two specifications.
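The standardization skel used above can be sketched operationally. In the sketch below, atoms are represented as nested tuples (functor, arg₁, …, argₙ) with a Var wrapper for variables; this representation is an assumption of the sketch, not the thesis's formalism:

```python
from string import ascii_uppercase

class Var:
    """A logic variable; equality is by name."""
    def __init__(self, name):
        self.name = name
    def __repr__(self):
        return self.name
    def __eq__(self, other):
        return isinstance(other, Var) and self.name == other.name
    def __hash__(self):
        return hash(self.name)

def skel(atom):
    """Strip all proper subterms of an atom, putting a canonical variable
    (A, B, C, ...) in each argument position (at most 26 arguments here)."""
    functor, args = atom[0], atom[1:]
    return (functor,) + tuple(Var(ascii_uppercase[i]) for i in range(len(args)))

# skel(qsort(M, S, [A|X])) = qsort(A, B, C); lists use the functor '.'
atom = ('qsort', Var('M'), Var('S'), ('.', Var('A'), Var('X')))
print(skel(atom))  # ('qsort', A, B, C)
```

Only the predicate symbol and arity survive, so two atoms have the same skeleton exactly when they call the same procedure.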
Theorem 18 mode d a ⊆ m̂ode d a

Proof: By the definition of mode it suffices to show that d a ⊆ m̂ode d a, which is a requirement on m̂ode, and for all (i, j) ∈ Site with skel(a₍ᵢ,ⱼ₎) = skel(a) that

callmode(a₍ᵢ,ⱼ₎, B[[b₍ᵢ,ⱼ₎]] R[[r]] entry d i) ⊆ m̂ode d a.

Using the second requirement in the definition of m̂ode and monotonicity of callmode and B[[b]] R[[r]], it suffices to show entry d i ⊆ enter(i, skel(hᵢ), m̂ode d hᵢ). For this it suffices to show that f d, for f d i = enter(i, skel(hᵢ), m̂ode d hᵢ), satisfies the two requirements in the definition of entry. The first requirement, enter(i, skel(hᵢ), d hᵢ) ⊆ f d i, is satisfied by the definition of m̂ode and monotonicity of enter. The second requirement on f d is enter(k, a₍ᵢ,ⱼ₎, B) ⊆ f d k, where B = B[[b₍ᵢ,ⱼ₎]] R[[r]] (f d i). If skel(hₖ) ≠ skel(a₍ᵢ,ⱼ₎) then this requirement is trivially true since enter(k, a₍ᵢ,ⱼ₎, B) = ∅. Otherwise, we have enter(k, a₍ᵢ,ⱼ₎, B) ⊆ enter(k, skel(a₍ᵢ,ⱼ₎), callmode(a₍ᵢ,ⱼ₎, B)) from Lemma 17. Finally, we have enter(k, skel(hₖ), callmode(a₍ᵢ,ⱼ₎, B)) ⊆ f d k by monotonicity of enter and the second requirement in the definition of m̂ode. □

Theorem 19 m̂ode d a ⊆ mode d a

Proof: It suffices to show that mode d satisfies the two requirements in the definition of m̂ode. The first is trivially satisfied, d a ⊆ mode d a. It remains to show

callmode(a₍ᵢ,ⱼ₎, B[[b₍ᵢ,ⱼ₎]] R[[r]] enter(i, skel(hᵢ), mode d hᵢ)) ⊆ mode d a₍ᵢ,ⱼ₎.

By the definition of mode it suffices to show that the left-hand side is a subset of callmode(a₍ᵢ,ⱼ₎, at d (i, j)). Since B[[b]] R[[r]] and callmode are monotonic, it suffices to show enter(i, skel(hᵢ), mode d hᵢ) ⊆ entry d i. Since enter is additive, i.e.,

enter(i, a, ⋃ Ψ) = ⋃_{S ∈ Ψ} enter(i, a, S),

it suffices to show, by the definition of mode, that enter(i, skel(hᵢ), d hᵢ) ⊆ entry d i, which is the first requirement in the definition of entry, and for all s ∈ Site with skel(aₛ) = skel(hᵢ) that enter(i, skel(hᵢ), callmode(aₛ, at d s)) ⊆ entry d i.
Using the second requirement in the definition of entry, it remains to show enter(i, skel(hᵢ), callmode(aₛ, at d s)) ⊆ enter(i, aₛ, at d s), which follows from Lemma 17. □

We now turn to approximating possible procedure arguments using abstract interpretation. Given an abstract domain (Answer′, ⊔, ⊥, γ) we define functions approximating entry, at, mode, and m̂ode by replacing the operations enter, callmode, and B[[b]] R[[r]] with three abstract operations. We present two possible sets of abstract operations. In the simple case, enter′ : {1 … nc} × Atom × Answer′ → Answer′ and callmode′ : Atom × Answer′ → Answer′ are defined by replacing the operation unify in the definitions of enter and callmode by aunify. The functions are locally sound iff they satisfy

enter(i, a, γ(S)) ⊆ γ(enter′(i, a, S))
callmode(a, γ(S)) ⊆ γ(callmode′(a, S)).

In the simple case local soundness is implied by the local soundness requirement for aunify. We use a function propagate : Body × Answer′ → Answer′ to approximate B[[b]] R[[r]]. The local soundness requirement is

B[[b]] R[[r]] γ(S) ⊆ γ(propagate(b, S)).

We may use either the query-dependent or the query-independent abstract semantics to define propagate, since both have been shown to approximate the standard semantics (see Theorems 3, 5, and 6), i.e., B[[b]] R[[r]] γ(S) ⊆ γ(B′[[b]] R′[[r]] S) as well as B[[b]] R[[r]] γ(S) ⊆ γ(B″[[b]] R″[[r]] S).

In the more complicated case, we may want to preserve more accuracy in the approximation of set union for call modes than for sets of substitutions. If aunify for Answer′ is not additive, it may be sensible to separate the two roles of ⊔, namely for approximating call modes and execution. When execution is approximated, ⊔ approximates the union of contributions from different clauses of the same procedure. When call modes are approximated, ⊔ approximates the combination of different uses of a procedure.
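The power-domain construction of section 3.3, reused here to keep call modes separate, can be sketched as follows; the base domain's order and aunify are passed in as parameters, and the toy base domain in the demo (sets of possibly nonground variables with aunify as union) is a hypothetical stand-in. Note that the lifted aunify is additive by construction, being a union of pointwise contributions:

```python
def down_close(elems, leq, universe):
    """Close a set of base-domain elements under inclusion of smaller
    elements, relative to a finite enumeration of the base domain."""
    return {d for d in universe for e in elems if leq(d, e)}

def lift_aunify(base_aunify, leq, universe):
    """Lift a base-domain aunify to the power domain. The result is
    additive by construction: a union of pointwise contributions."""
    def aunify_p(a, p, b, q):
        return {d for s in p for t in q
                for d in down_close([base_aunify(a, s, b, t)], leq, universe)}
    return aunify_p

# Demo with a toy base domain (a stand-in, not one of the thesis's domains):
# frozensets of possibly-nonground variables ordered by inclusion, with the
# base aunify taken to be plain union.
leq = lambda d, e: d <= e
universe = [frozenset(s) for s in ['', 'X', 'Y', 'XY']]
aunify_p = lift_aunify(lambda a, s, b, t: s | t, leq, universe)
p = down_close([frozenset('X')], leq, universe)  # downward closure of {X}
q = down_close([frozenset('Y')], leq, universe)  # downward closure of {Y}
print(sorted(map(sorted, aunify_p('a', p, 'b', q))))  # [[], ['X'], ['X', 'Y'], ['Y']]
```

Keeping elements downward closed is what makes comparable base elements map to comparable power-domain elements, at the cost of the redundancy discussed in section 3.3.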
It can be argued that in practical programs calling patterns vary more widely than the descriptions of contributions from different clauses, and thus the approximation of call modes deserves the additional accuracy of a power domain. In this case we lift the previous definitions enter′, callmode′, and propagate to corresponding functions on sets, enter′_P : {1 … nc} × Atom × P(Answer′) → P(Answer′), callmode′_P : Atom × P(Answer′) → P(Answer′), and propagate_P : Body × P(Answer′) → P(Answer′), and define P(D) = (P(Answer′), ∪, ∅, γ_P) for a given domain D = (Answer′, ⊔, ⊥, γ), where γ_P(Ψ) = ⋃_{S ∈ Ψ} γ(S). Here the local soundness requirements are reduced to the previous local soundness requirements:

enter(i, a, γ_P(Ψ)) ⊆ γ_P(enter′_P(i, a, Ψ))
callmode(a, γ_P(Ψ)) ⊆ γ_P(callmode′_P(a, Ψ))
B[[b]] R[[r]] γ_P(Ψ) ⊆ γ_P(propagate_P(b, Ψ))

Note that this does not suggest approximating B[[b]] R[[r]] by the power abstract domain function B_P[[b]] R_P[[r]] introduced in section 3.3. Lifting B[[b]] R[[r]] results in the definition propagate_P(b, Ψ) = ⋃_{S ∈ Ψ} B′[[b]] R′[[r]] S, which does not incur the full cost of computing with power abstract domains.

Global soundness follows from local soundness and monotonicity of the concretization function. The following lemmas lead up to two theorems establishing soundness of mode′ and m̂ode′.

Lemma 20 (∀a ∈ Atom : d a ⊆ γ_M(d′ a)) ⟹ entry d i ⊆ γ_D(entry′ d′ i)

Proof: It suffices to show that entry d is a smaller function than f d′, for f d′ i = γ_D(entry′ d′ i), which is done by showing that f d′ satisfies the two requirements in
□ T h e o re m 22 (Va £ Atom : da C 7 Af(d'a)) =>- mode da C 7 M{mode d' a) P ro o f: A direct consequence of monotonicity of 7 ^/, the condition of the lemma local soundness condition for callmode', and Lemmas 20 and 21. □ T h e o re m 23 (Va £ Atom : da C . 7 Af(d'a)) =>■ mode da C ^^{m ode'd' a) P ro o f: It suffices to show that moded is a smaller function than / d ', for i = 7 Mimode'd' 1), which is done by showing th at f d' satisfies the two requirements in the definition of mode. The first requirement is met by the condition of the lemma and monotonicity of 7 . The second requirement is met by the local soundness conditions and monotonicity of 7 . □ We now compare the two alternative formulations of approxim ated modes. T h e o re m 24 I f enter'(i, a, S) C enter'(i, skel(a), callmode'(a, S)) then mode'da C I mode'da P ro o f: The proof is analogous to the proof in the standard case, where the condition holds naturally. □ It is hard to imagine abstract domains, for which the condition of this theorem is violated. We should always be able to improve the accuracy of an approximation by avoiding the intermediate approximation of procedure arguments. T h e o re m 25 I f enter'(i,skel(a), callmode'(a, S)) C enter'(i, a, S) and enter' is ad ditive, i.e., enter'(i,a,|_|\P ) = j_J enter'( i , a ,S), then mode'da C mode'da se* 39 P ro o f: The proof is analogous to the proof in the standard case, where the condition holds naturally. C I Although the additivity requirement is satisfied for non-additive domains Answer' by replacing enter', callmode' and propagate by their corresponding lifted variants, enter'p, callmode'p and propagateP, it is difficult to retain full information during the conversion to approximation of procedure arguments and back. In particular, information is lost if a head, p (f(X ,Y )), and atom , p ( f ( g , Z )), have unifiable proper subterms but the abstract domain does not approxim ate the term structure of procedure arguments. 
3.5 Mode-Oriented Semantics Lemma 17 gives rise to additional formulations of standard semantics. They are mode-oriented in the sense that they give meaning to procedures rather than clauses. The meaning of a rule base is given as a function mapping procedure entry modes to exit modes. A procedure entry mode is given as a pair of an atom and a set of substitutions over the skeleton of that atom. The query dependent mode-oriented formulation has the following semantic domains and functions. Mode = Answer = V{ESubst) RuleBaseMeaning = Atom x Mode — * Mode BodyMeaning = Answer — > Answer P m i Program — * Answer R-M : RuleBase — > RuleBaseMeaning Bm : Body — » RuleBaseMeaning — » BodyMeaning F m : RuleBase — * RuleBaseMeaning — > ■ RuleBaseMeaning .40] R m W = ^ ( F m W ) F M W e (a i^ ) = U callmode(h, Bm[[&] e unify(fi., {t^: - 5}, skel(a),S)) h ~ b £ r ikel(a)—skel(h) B m D o S = S unify(ai, {<x}, skel(ai), e(<Zj, callmode(a,i, {<r}))) < T eB M|6]c5 where b = ax ... a.;_i RuleBaseMeaning is a domain of continuous functions for which the skeleton of the atom in the first argum ent determines the result, i.e., / 6 RuleBaseMeaning = = > ■ f( a ,S ) = f(skel(a),S). BmI&] is clearly continuous. L em m a 26 B u W F m H I S = B [ 6j F lrp L S Proof: By induction over n. If n = 0, it is easily seen that the equation holds. For n > 0 the claim is shown by induction over length of b using two base cases, for empty and single atom bodies. For empty bodies the claim holds trivially. Let the body consist of the single atom a. After unfolding definition of B m M and F m H I we delay the union over clauses in the rule base and promote the union over cr E S, the left hand side, B m M F m W * !L S , is shown equal to [J [J unify(a, {<r}, skel(a), callmode(h, BmP>] Fm I^ r ^ M ) ) h • * ‘ &£r skel(a)=skel(h) where A i = unify(h, {t^:- 5}, skel(a), callmode^a, {<r})). 
Using Lemma 17 twice this becomes

⋃_{σ ∈ S} ⋃_{h:-b ∈ r, skel(a) = skel(h)} unify(a, {σ}, h, B_M[[b]] F_M[[r]]ⁿ⁻¹⊥ unify(h, {ι_{h:-b}}, a, {σ}))

After using the inductive assumption for n − 1 and folding C[[h:-b]] we obtain

⋃_{h:-b ∈ r, skel(a) = skel(h)} C[[h:-b]] F[[r]]ⁿ⁻¹⊥ (a, S)

Here, we can expand the union to all clauses in r, since C[[h:-b]] e (a, S) = ∅ if skel(a) ≠ skel(h), and then fold the definitions of F[[r]]ⁿ⊥ and B[[a]] to obtain the right-hand side of the claim. For b = a₁ … aᵢ₋₁ we have B[[b aᵢ]] e S = B[[aᵢ]] e (B[[b]] e S) as well as B_M[[b aᵢ]] e S = B_M[[aᵢ]] e (B_M[[b]] e S), and therefore the inductive step is trivial. □

The mode-oriented query-independent semantics has a different domain for the meaning of rule bases, RuleBaseMeaning = Atom → Mode. Writing B̂_M, F̂_M, R̂_M, P̂_M for its semantic functions:

P̂_M : Program → Answer
R̂_M : RuleBase → RuleBaseMeaning
B̂_M : Body → RuleBaseMeaning → BodyMeaning
F̂_M : RuleBase → RuleBaseMeaning → RuleBaseMeaning

P̂_M[[r, q]] = B̂_M[[q]] (R̂_M[[r]]) {ι_q}
R̂_M[[r]] = lfp(F̂_M[[r]])
F̂_M[[r]] e a = ⋃_{h:-b ∈ r, skel(a) = skel(h)} callmode(h, B̂_M[[b]] e {ι_{h:-b}})
B̂_M[[]] e S = S
B̂_M[[a₁ … aᵢ]] e S = unify(aᵢ, B̂_M[[a₁ … aᵢ₋₁]] e S, skel(aᵢ), e(aᵢ))

Lemma 27 B̂_M[[b]] F̂_M[[r]]ⁿ⊥ S = B[[b]] F[[r]]ⁿ⊥ S

Proof: By induction over n. If n = 0, it is easily seen that the equation holds. For n > 0 the claim is shown by induction over the length of b using two base cases, for empty and single-atom bodies. For empty bodies the claim holds trivially. Let the body consist of the single atom a. After unfolding the definitions of B̂_M[[a]] and F̂_M[[r]]ⁿ⊥ we use additivity of unify to delay the union over clauses in the rule base. This shows the left-hand side is equal to

⋃_{h:-b ∈ r, skel(a) = skel(h)} unify(a, S, skel(a), callmode(h, B̂_M[[b]] F̂_M[[r]]ⁿ⁻¹⊥ {ι_{h:-b}}))

We now use Lemma 17 and the inductive assumption for n − 1, and then fold C[[h:-b]]
to obtain

⋃_{h:-b ∈ r, skel(a) = skel(h)} C[[h:-b]] F[[r]]ⁿ⁻¹⊥ (a, S)

Here, we can expand the union to all clauses in r, since C[[h:-b]] e (a, S) = ∅ if skel(a) ≠ skel(h), and then fold the definitions of F[[r]]ⁿ⊥ and B[[a]] to obtain the right-hand side of the claim. For b = a₁ … aᵢ₋₁ we have B[[b aᵢ]] e S = B[[aᵢ]] e (B[[b]] e S) as well as B̂_M[[b aᵢ]] e S = B̂_M[[aᵢ]] e (B̂_M[[b]] e S), and therefore the inductive step is trivial. □

Corresponding to the two concrete, mode-oriented semantics there are two abstract, mode-oriented semantics. The abstract, query-dependent, mode-oriented semantics is obtained by replacing domains and operations of a "plural" version of the concrete, query-dependent, mode-oriented semantics. It has the following semantic domains.

Mode′ = Answer′
RuleBaseMeaning′ = Atom × Mode′ → Mode′
BodyMeaning′ = Answer′ → Answer′

P′_M : Program → Answer′
R′_M : RuleBase → RuleBaseMeaning′
B′_M : Body → RuleBaseMeaning′ → BodyMeaning′
F′_M : RuleBase → RuleBaseMeaning′ → RuleBaseMeaning′

P′_M[[r, q]] = B′_M[[q]] (R′_M[[r]]) init(q)
R′_M[[r]] = lfp(F′_M[[r]])
F′_M[[r]] e (a, S) = ⊔_{h:-b ∈ r, skel(a) = skel(h)} callmode′(h, B′_M[[b]] e aunify(h, init(h:-b), skel(a), S))
B′_M[[]] e S = S
B′_M[[a₁ … aᵢ]] e S = aunify(aᵢ, S′, skel(aᵢ), e(aᵢ, callmode′(aᵢ, S′)))   where S′ = B′_M[[b]] e S and b = a₁ … aᵢ₋₁

RuleBaseMeaning′ is a domain of continuous functions for which the skeleton of the atom in the first argument determines the result, i.e., f ∈ RuleBaseMeaning′ ⟹ f(a, S) = f(skel(a), S). B′_M[[b]] is clearly continuous.

The abstract, query-independent, mode-oriented semantics is obtained directly by replacing domains and operations of the concrete, query-independent, mode-oriented semantics. Its semantic domains are the same as for the query-dependent abstract semantics except for RuleBaseMeaning′ = Atom → Mode′.
P″_M : Program → Answer′
R″_M : RuleBase → RuleBaseMeaning′
B″_M : Body → RuleBaseMeaning′ → BodyMeaning′
F″_M : RuleBase → RuleBaseMeaning′ → RuleBaseMeaning′

P″_M[[r, q]] = B″_M[[q]] (R″_M[[r]]) init(q)
R″_M[[r]] = lfp(F″_M[[r]])
F″_M[[r]] e a = ⊔_{h:-b ∈ r, skel(a) = skel(h)} callmode′(h, B″_M[[b]] e init(h:-b))
B″_M[[]] e S = S
B″_M[[a₁ … aᵢ]] e S = aunify(aᵢ, B″_M[[a₁ … aᵢ₋₁]] e S, skel(aᵢ), e(aᵢ))

Global soundness of the query-independent, mode-oriented, abstract semantics with respect to the query-independent, mode-oriented, concrete semantics is implied by the local soundness requirements. The proof of this is analogous to the proof of Theorem 5. Similarly, global soundness of the query-dependent, mode-oriented, abstract semantics with respect to the query-dependent, mode-oriented, concrete semantics is implied by the local soundness requirements. The proof of global soundness is analogous to the proof of Theorem 6. Here the plural version of the query-dependent, mode-oriented, concrete semantics differs only in the body-meaning function

B_M^pl[[a₁ … aᵢ]] e S = unify(aᵢ, S′, skel(aᵢ), e(aᵢ, callmode(aᵢ, S′)))   where S′ = B_M^pl[[b]] e S and b = a₁ … aᵢ₋₁.

The plural version approximates the concrete semantics and is approximated by the abstract semantics. The latter fact is established by a proof that is analogous to the proof of Theorem 5.

3.6 Summary

In this chapter we presented a framework for analyzing run-time properties of Logic Programs. This framework is in the spirit of abstract interpretation, which was introduced by the Cousots [10] and has been applied to Logic Programs by others [3, 4, 12, 18, 22, 25, 26, 27, 30]. In abstract interpretation, run-time properties of a program are analyzed by determining an approximation of the set of possible program states for each point in the program.
This is done by formally defining a domain of program states, the concrete domain, with a few primitive operations, and a domain of approximations, the abstract domain, with corresponding primitive operations. Corresponding to the concrete semantics, which is formally defined in terms of the concrete domain, an abstract semantics is obtained by replacing the concrete domain with the abstract domain in the semantic definition.

In this chapter, we chose the domain Answer, which consists of sets of certain equivalence classes of substitutions, as the concrete domain. The definition of the equivalence classes separates the concerns of a rigorous handling of variable renaming on one hand and a lucid definition of the Logic Program semantics on the other. In choosing sets rather than sequences of substitutions we committed to a level of abstraction that is well suited for the analysis of variable aliasing. Given a query, our semantic definition specifies an Answer for each point in the program, assuming left-to-right execution of clause bodies. The general framework of abstract interpretation can also be applied to definitions of concrete semantics that are more specific. There, the definition of concrete semantics may reflect depth-first search and additional implementation details of particular Logic Programming languages. Our definition of Logic Program semantics uses a single primitive operation on Answers.

In this chapter, we gave several equivalent formulations of concrete Logic Program semantics at the chosen level of abstraction. These formulations give rise to different formulations of abstract semantics, which are in general not comparable. We analyzed under which conditions which formulations result in better approximations. The variety of equivalent formulations stems from the following two possible choices. First, we may assign meaning to individual clauses or to procedures, where a procedure is the set of clauses with a common head functor. Second, although the meaning of
Second, although the meaning of 45 a clause is always defined as a function giving answers for a query, the formulations of meaning vary in how dependent the meaning of the clause body is on the query. Similarly, the meaning of procedures may be formulated in a query dependent and a query independent way. We focused on the differences between the query dependent and query indepen dent formulations. The corresponding two abstract semantics are equivalent if the abstract unification operator, aunify, has certain algebraic properties, namely, if it is commutative, additive, and idem potent. The query dependent abstract seman tics leads to more accurate approximations if aunify is commutative and Idem - 3 • The query independent abstract semantics leads to more accurate approximations if aunify is commutative, additive, and Idem -3- 46 Chapter 4 Abstract Domains for Aliasing In this chapter, we introduce abstract domains Sharing and ESharing th at identify aliasing between variables. An initial version of the domain Sharing appeared in [16]. We define the abstract operations init and aunify on the two domains and prove local soundness, and algebraic properties of aunify. In section 4.4 we present a domain approximating groundness information. This powerful domain satisfies all three algebraic properties required for equivalence of the query-independent and query-dependent abstract semantics in Theorem 14 in chapter 3. 4.1 The Domain Sharing As a first step, we compare several previously proposed domains for aliasing. The substitution o -e = {W i-+ A , X (-» A, Y i-> B, Z i-* f (A, B)} is used in examples below to compare these domains. We consider whether a domain can capture basic independence information, e.g., the fact th at X and Y are inde pendent, and information about propagation of groundness, e.g., th at grounding X grounds W . Chang [6 ] models Answer’ as maps from variables to descriptions of terms. 
Three basic descriptions are provided: G means the variable is ground, Cᵢ means the variable is a member of the i-th equivalence class of dependent variables, and I means the variable is not ground and is in some singleton equivalence class. For example, the substitution {W ↦ ground, X ↦ f(A), Y ↦ g(A, B), Z ↦ f(C)} is approximated by

{W ↦ G, X ↦ C₁, Y ↦ C₁, Z ↦ I}.

This approach captures basic independence information with only limited accuracy; for example, σe can at best be approximated by

{W ↦ C₁, X ↦ C₁, Y ↦ C₁, Z ↦ C₁},

which fails to capture the fact that X and Y are independent. Moreover, it does not allow us to infer that grounding of one variable grounds others. The problem here is that grounding of one member of an equivalence class does not allow us to infer that any of its other members should be ground. Note that this behavior prevents aunify from being commutative.

Xia and Giloi [30] and Citrin [8] present extensions to [6] which address the specific problem of groundness propagation. Their technique is to track strongly coupled subclasses Eᵢⱼ of the equivalence classes Cᵢ, where a set of variables is strongly coupled if grounding any one of its members grounds all of its members. For example, σe can be approximated by

{W ↦ E₁₁, X ↦ E₁₁, Y ↦ C₁, Z ↦ C₁},

which allows us to infer that grounding either one of W or X grounds the other. Their technique does not address the fundamental limitations of this approach, e.g., they still fail to capture the fact that X and Y are independent. Moreover, it fails to propagate groundness in general, e.g., they cannot infer that grounding of Z grounds W, X, and Y.

Debray [11] maps a variable to the set of variables on which it depends. Jones and Søndergaard [18] define an abstract domain consisting of sets of pairs of dependent variables, which is essentially equivalent in expressive power to [11]. This
This 48 approach captures basic independence information with considerably more accuracy, for example, cre can be approxim ated by { W i—► { X , Z } , X {W , Z } , Y i —^ { Z } , Z i-» { W , X , Y } } which captures the independence of X and Y. However, it is not good for propagating groundness, for example, they cannot infer that grounding of X grounds W and grounding of Z grounds W and X. Note th at they can infer th at grounding of Z grounds Y. Our domain Sharing is based on the notion of sharing groups. For a given substitution cr we say variables u, v G dom(or ) share variable w if w G var(o-u.) f l var(d-u). The sharing group of a variable w for cr, denoted sg(<r,w), is the set of all variables that share w. Definition 10 (Sharing Groups) sg : Subst x Var — * V(Var) sg(cr,w) = {v G dom(o-)|u; G var(<ri;)} For example, sg(<re, A ) = {W, X , Z } . We define a Sharing to be a set of sharing groups. Intuitively, substitution cr is approximated by Sharing S if S contains (at least) the sharing group of each variable in rgv(cr). A slight complication arises here because we use equivalence classes of substitutions, however, this is not a significant problem because equivalent substitutions have the same set of sharing groups, i.e., if [< r] = [0 ] then {sy(o-, t;)|t/ G rgv(o-)} = {s^(0,n)|v G rgv(0)}. Definition 11 (Sharing) Sharing = T(V (V ar)) S U T = S U T ± = 0 T^) = {[^IVv € rgv(<x) : sg(o-,v) G S'} For example, [o^] G 7 ({{W, X , Z } , { Y , Z } } ) . Clearly, Sharing is a cpo and 7 is mono tonic. Groundness and independence information can be derived from a Sharing S as follows. Let < r be a given substitution in 7 ( 5 '). Variable v is ground iff there is 49 no variable w “shared by” v, which follows if v does not appear in any set in S. Variables u and v are independent iff they do not share any variable w, which follows if they do not appear together in any set in S. This approach captures aliasing with a great deal of accuracy. 
For example, [σ_e] ∈ γ({{W,X,Z},{Y,Z}}) captures the fact that X and Y are independent, that grounding either one of W or X grounds the other, and that grounding Z grounds W, X, and Y. Moreover, we can infer that grounding Y strongly couples W, X, and Z. We now define the operation init on the domain Sharing.

Definition 12 (init)
init(C) = {{v} | v ∈ var(C)}

It is easy to show that the local soundness requirement ι_C ∈ γ(init(C)) is satisfied. To define aunify on Sharing, we extend the tagging operation of section 3.1.1 to sets of variables and Sharings in the natural way, i.e., ḡ = {v̄ | v ∈ g} and S̄ = {ḡ | g ∈ S}. Restriction to untagged variables is given by ⌊g⌋ = g ∩ Var and ⌊S⌋ = {⌊g⌋ | g ∈ S}.

Definition 13 (aunify)
aunify(a, S, b, T) = ⌊amge(a, b̄, S ∪ T̄)⌋ if a and b are unifiable
aunify(a, S, b, T) = ∅ otherwise

In section 4.1.1 we will relate amge(a, b, S) to mge(a, b, σ) = mgu(σa, σb) ∘ σ, which is the most general extension of σ that unifies a and b. To define amge, we introduce three auxiliary functions. First, the closure under union of a Sharing S, denoted S*, is the smallest superset of S satisfying

X ∈ S* ∧ Y ∈ S* ⟹ X ∪ Y ∈ S*

The Sharing S* approximates all further instantiations of substitutions in γ(S), e.g., {{W,X,Z},{Y,Z}}* = {{W,X,Z},{Y,Z},{W,X,Y,Z}}. Second,

rel(t, S) = {X ∈ S | var(t) ∩ X ≠ ∅}

denotes the component of a Sharing S which is relevant to a term t. Third,

S × T = {X ∪ Y | X ∈ S ∧ Y ∈ T}.

Definition 14 (amge)
amge(a, b, S) = (S \ rel(a,S) \ rel(b,S)) ∪ (rel(a,S) × rel(b,S))* if at least one of a or b is a variable or constant
amge(a, b, S) = amge(a_n, b_n, amge(f(a_1 … a_{n-1}), f(b_1 … b_{n-1}), S)) if a = f(a_1 … a_n) and b = f(b_1 … b_n).

As an example, we compute aunify(a, S, h, T) where a = append(X, Y, Z), S = {{X},{Y},{Z}}, h = append([A|L], M, [A|N]), and T = {{A},{M,N}}. After processing the i-th argument the result S_i is given by the following.
S_0 = S ∪ T̄ = {{X},{Y},{Z},{Ā},{M̄,N̄}},
S_1 = amge(X, [Ā|L̄], S_0) = {{X,Ā},{Y},{Z},{M̄,N̄}},
S_2 = amge(Y, M̄, S_1) = {{X,Ā},{Y,M̄,N̄},{Z}},
S_3 = amge(Z, [Ā|N̄], S_2) = {{X,Z,Ā},{Y,Z,M̄,N̄},{X,Y,Z,Ā,M̄,N̄}}.

The restriction to untagged variables results in ⌊S_3⌋ = {{X,Z},{Y,Z},{X,Y,Z}}, thus aunify(a, S, h, T) = {{X,Z},{Y,Z},{X,Y,Z}}.

4.1.1 Local Soundness of aunify

We show that unify(a, γ(S), b, γ(T)) ⊑ γ(aunify(a, S, b, T)) by establishing an approximation for amge. Let mge : Term × Term × Subst → Subst be the most general extension of a substitution that unifies two terms, i.e., mge(a, b, σ) = mgu(σa, σb) ∘ σ. From the definition of mgu and associativity of composition we see that mge satisfies mge(c, c, σ) = σ for constants c, and

mge(a, b, σ) = mge(a_n, b_n, mge(f(a_1 … a_{n-1}), f(b_1 … b_{n-1}), σ))

if a = f(a_1 … a_n) and b = f(b_1 … b_n).

Lemma 28 (Soundness of amge) If a and b are unifiable then
[σ] ∈ γ(S) ⟹ [mge(a, b, σ)] ∈ γ(amge(a, b, S))

Proof: By structural induction over the pair a and b. If a and b are constants the claim is trivially true. If at least one of a or b is a variable let μ be mge(a, b, σ). Assuming a and b are unifiable and [σ] ∈ γ(S), it remains to show [μ] ∈ γ((S \ rel(a,S) \ rel(b,S)) ∪ (rel(a,S) × rel(b,S))*). Note that ∀v ∈ var(σa) : sg(σ,v) ∈ rel(a,S). A variable v ∈ rgv(μ) is in var(μa) iff it is in var(μb) since μa = μb. We show [μ] ∈ γ(amge(a, b, S)) by showing for all v ∈ rgv(μ) that sg(μ,v) ∈ amge(a, b, S). There are two cases to consider. First, if v ∈ rgv(μ) \ var(μa) then v ∈ rgv(σ). Also sg(μ,v) = sg(σ,v) and sg(σ,v) ∈ S \ rel(a,S) \ rel(b,S) assuming [σ] ∈ γ(S). Second, if v ∈ rgv(μ) ∩ var(μa) then v ∈ rgv(mgu(σa, σb)). The sharing group sg(μ,v) is ⋃_{v' ∈ G} sg(σ, v'), where G = sg(mgu(σa, σb), v). Moreover, there is at least one v'_a ∈ var(σa) and at least one v'_b ∈ var(σb) in G.
Thus sg(μ,v) is in (rel(a,S) × rel(b,S))* by the assumption [σ] ∈ γ(S). This concludes the proof of the base case. If a = f(a_1 … a_n), a' = f(a_1 … a_{n-1}), b = f(b_1 … b_n), b' = f(b_1 … b_{n-1}), and μ' = mge(a', b', σ), then mge(a, b, σ) = mge(a_n, b_n, mge(a', b', σ)) and amge(a, b, S) = amge(a_n, b_n, amge(a', b', S)). Therefore the inductive step is simply two applications of the inductive assumption. □

Theorem 29 (Local Soundness)
unify(a, γ(S), b, γ(T)) ⊑ γ(aunify(a, S, b, T))

Proof: It is easily seen that ⌊σ⌋ ∈ γ(⌊S⌋) for all σ ∈ γ(S), and σ ⊓ θ̄ ∈ γ(S ∪ T̄) if σ ∈ γ(S) and θ ∈ γ(T). Together with the lemma, this establishes the soundness of aunify. □

4.1.2 Algebraic Properties of aunify

In this section we establish Idem-⊑ and commutativity of aunify for the domain Sharing. We start by proving a lemma for commutativity of amge under specific circumstances.

Lemma 30 (Commutativity of amge) Assuming a_i and b_i are unifiable for i ∈ {1,2},
amge(a_2, b_2, amge(a_1, b_1, S)) = amge(a_1, b_1, amge(a_2, b_2, S))

Proof: By induction over the structure of pairs of terms (a_i, b_i). There are two base cases: a_i and b_i are the same constant, and at least one of a_i or b_i is a variable for i ∈ {1,2}. Commutativity is easily established for the first base case, and for the inductive steps. Therefore, we present only the proof for the second base case. Because of symmetry it suffices to show for all sets of variables in the lhs (left hand side) that they are in the rhs (right hand side). We do this by showing for each construction of a set of variables in the lhs in terms of unions of sets in S, that there is a construction of the same set on the rhs. For purposes of this proof we say a set of variables X is at a term t iff var(t) ∩ X ≠ ∅, we say X is at i iff X is at a_i or b_i, and we say a union X ∪_i Y is of Type i iff X is at a_i and Y is at b_i.
For Z in the lhs we distinguish two cases. If Z is not at 2, then Z ∈ amge(a_1, b_1, S). In this case Z is either a single set from S or the union of some sets from S, none of which are at 2, and therefore all of them are in amge(a_2, b_2, S) and their union is included in the rhs. If Z is at 2, let T be amge(a_1, b_1, S); then Z is either from U = rel(a_2, T) × rel(b_2, T) or the result of a union of two sets from U*. Notice that in either case Z is the result of a Type 2 union. If X ∈ rel(a_2, T) and Y ∈ rel(b_2, T), and Z = X ∪_2 Y, then we show that Z is in the rhs by induction over the number of Type 1 unions. If Z is constructed with no Type 1 union, then Z is not at 1, and therefore Z ∈ amge(a_2, b_2, S) and Z remains untouched by the unification of a_1 and b_1. If Z is constructed with some Type 1 union, assume w.l.o.g. that X = U ∪_1 V with U at a_1 and V at b_1. As a special case assume that V is at 2, but U is not at 2. In this case (U ∪_1 V) ∪_2 Y = U ∪_1 (V ∪_2 Y). By inductive assumption V ∪_2 Y is in the rhs and therefore so is Z. In the general case at least one of U and V must be at 2 since their union is, and we can show that X ∪_2 Y = U' ∪_1 V', where U' = U ∪_2 Y if U is at 2 and U' = U otherwise, and analogously for V'. If Z = X ∪_2 Y, with X and Y from U*, then we show that Z is in the rhs by induction over the number of union operations involved in the construction of Z. We have already dealt with the base case, and we can assume that X and Y are in the rhs. Therefore it suffices to show for all X and Y in the rhs, with X at a_2 and Y at b_2, that the union X ∪_2 Y is also in the rhs. This claim is proved by induction over the number of Type 1 unions in X and Y combined. If there are none, then X ∪_2 Y ∈ (rel(a_2, S) × rel(b_2, S)) and therefore in the rhs. The inductive step is analogous to the inductive step in the previous induction on the number of Type 1 unions.
□

Theorem 31 (Commutativity of aunify)
aunify(a_1, aunify(a_2, S, b_2, T_2), b_1, T_1) = aunify(a_2, aunify(a_1, S, b_1, T_1), b_2, T_2)

Proof: We reduce this property to the commutativity of amge by considering that the two unification operations are evaluated using equivalent but different tagging and restriction schemes. Let V̿ar be another set of variables, of doubly tagged variables, that cannot appear in programs and is disjoint from both Var and V̄ar. Let ⌊⌊S⌋⌋ be analogous to ⌊S⌋ in that it removes all doubly tagged variables from S. Clearly aunify(a_2, S, b_2, T_2) can be evaluated equivalently as ⌊⌊amge(a_2, b̿_2, S ∪ T̿_2)⌋⌋. We have

⌊amge(a_1, b̄_1, ⌊⌊amge(a_2, b̿_2, S ∪ T̿_2)⌋⌋ ∪ T̄_1)⌋
= ⌊⌊⌊amge(a_1, b̄_1, amge(a_2, b̿_2, S ∪ T̿_2) ∪ T̄_1)⌋⌋⌋
= ⌊⌊⌊amge(a_1, b̄_1, amge(a_2, b̿_2, S ∪ T̿_2 ∪ T̄_1))⌋⌋⌋

The first step delays the double restriction operation until the end of evaluating the unification of a_1 and b_1. This will not affect the result, since carrying the extra doubly tagged variables around cannot affect which singly tagged and untagged variables are grouped by unions. The second step moves a Sharing with singly tagged variables into the argument of the inner amge expression. This does not change the result since a_2 and b̿_2 contain no singly tagged variables. Thus we have reduced the commutativity of aunify to the commutativity of amge, and aunify is commutative by Lemma 30. □

Theorem 32 (Idem-⊑ of aunify)
aunify(a, S, b, T) ⊑ aunify(a, S, b, aunify(b, T, a, S))

Proof: We use the same scheme of double tagging and restriction from the proof of Theorem 31. We assume that single tagging leaves doubly tagged variables unchanged. This reduces Idem-⊑ of aunify to a special property of amge and double restriction for certain Sharing elements.
Writing tag(·) for the tagging operation, we have

aunify(a, S, b, aunify(b, T, a, S))
= ⌊amge(a, b̄, S ∪ tag(⌊amge(b, ā, T ∪ S̄)⌋))⌋
= ⌊⌊⌊amge(a, b̄, S ∪ tag(amge(b, ā, T ∪ S̄)))⌋⌋⌋
= ⌊⌊⌊amge(a, b̄, S ∪ amge(b̄, a̿, T̄ ∪ S̿))⌋⌋⌋
= ⌊⌊⌊amge(a, b̄, amge(b̄, a̿, S ∪ T̄ ∪ S̿))⌋⌋⌋

The first step is unfolding of aunify. The second step delays double restriction, which cannot affect the result. The third step promotes single tagging, and the fourth step moves untagged sharing groups into the inner amge expression, which doesn't affect the result since b̄ and a̿ contain no untagged variables. Now it suffices to show

⌊amge(a, b̄, S ∪ T̄)⌋ ⊆ ⌊⌊⌊amge(a, b̄, amge(b̄, a̿, S ∪ T̄ ∪ S̿))⌋⌋⌋

The proof is by induction over the number of amge operations on the left hand side. We use the following relation ◁ to describe the proof invariant. Let the tilde operator on sets of variables add doubly tagged variables corresponding to the untagged variables already in the set, i.e., g̃ = g ∪ {v̿ | v ∈ g ∩ Var} for all g; then we define L ◁ R iff for all g ∈ L, {g, g̿} ⊆ R if g ∩ (V̄ar ∪ V̿ar) = ∅, and g̃ ∈ R otherwise. For example, {{X},{A,B̄}} ◁ R implies that {{X},{X̿},{A,A̿,B̄}} ⊆ R. We will show for L ◁ R that

(I) amge(a, b̄, L) ◁ amge(a, b̄, amge(b̄, a̿, R))

This implies the theorem since, initially, S ∪ T̄ ◁ S ∪ T̄ ∪ S̿ and, after the last amge operation, we have L ◁ R ⟹ ⌊L⌋ ⊆ ⌊⌊⌊R⌋⌋⌋. We show (I) by induction over the number of amge operations on the left hand side. For the base step at least one of a or b̄ is a variable or a constant, and we consider two cases for g ∈ amge(a, b̄, L). If g ∈ L and g ∩ var(a) = g ∩ var(b̄) = ∅ then by assumption L ◁ R the sets g and g̿ (or the set g̃) are (is) in R, and these corresponding sets share no variables with a, b̄, or a̿ and, thus, are in amge(a, b̄, amge(b̄, a̿, R)). For the remainder of the base case let L' = rel(a, L) × rel(b̄, L), R' = amge(b̄, a̿, R), and R'' = rel(a, R') × rel(b̄, R'). It remains to show that g̃ ∈ R''* for all g ∈ L'*.
It suffices to show g̃ ∈ R'' for all g ∈ L', since the closure is monotonic. If g ∈ L' then g = g_a ∪ g_b with g_a ∩ var(a) ≠ ∅ and g_b ∩ var(b̄) ≠ ∅. We consider whether g_a contains any tagged variables. First, if g_a contains only untagged variables then g̿_a ∪ g̃_b is in R' and in rel(b̄, R'). And since g_a ∈ rel(a, R') we have g̃ = g_a ∪ g̿_a ∪ g̃_b ∈ R''. Second, if g_a contains some tagged variable then g̃ = g̃_a ∪ g̃_b is in R' as well as in both rel(a, R') and rel(b̄, R'), and therefore also in R''. The inductive step uses commutativity of amge to set up two uses of the inductive assumption. For a = f(a_1 … a_n), b = f(b_1 … b_n), a' = f(a_1 … a_{n-1}), and b' = f(b_1 … b_{n-1}) we have

amge(a, b̄, amge(b̄, a̿, R))
= amge(a_n, b̄_n, amge(a', b̄', amge(b̄_n, a̿_n, amge(b̄', a̿', R))))
= amge(a_n, b̄_n, amge(b̄_n, a̿_n, amge(a', b̄', amge(b̄', a̿', R))))

by definition and commutativity of amge, and we can use the inductive assumption first for the inner pair and then for the outer pair of amge operations. □

Abstract unification for Sharing is not idempotent, as the following example shows. Let a = p(f(A,B), f(A,C)), b = p(X,Y), S = {{A},{B},{C}}, and T = {{X},{Y}}; then aunify(a, S, b, T) = S* \ {{B,C}}, but aunify(a, S, b, aunify(b, T, a, S)) = S*. Therefore the query-dependent clause meaning may be less accurate than the query-independent clause meaning. Moreover, the accuracy of query-dependent abstract execution is sensitive to the order of goals in the body in spite of aunify's commutativity. Consider the query "A=ground, p(f(A,B),f(A,C))" and its permutation "p(f(A,B),f(A,C)), A=ground" against a rule base that contains the facts X=X and p(X,Y). With execution according to the query-independent semantics, the independence of B and C can be inferred for both queries. But with execution according to the query-dependent semantics, independence of B and C can only be inferred for the first query.
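The Sharing operations used in the examples above — the closure under union, rel, ×, and the base case of amge from Definition 14 — can be prototyped in a few lines. The following Python sketch is an illustration, not the thesis's implementation; tagging is modelled here by simply priming the variable names of the second term.

```python
from itertools import combinations

def star(S):
    """Closure under union: smallest superset of S with X, Y in S* implying X | Y in S*."""
    closed = set(S)
    changed = True
    while changed:
        changed = False
        for X, Y in combinations(list(closed), 2):
            if X | Y not in closed:
                closed.add(X | Y)
                changed = True
    return closed

def rel(t_vars, S):
    """Component of Sharing S relevant to a term with variable set t_vars."""
    return {g for g in S if g & t_vars}

def cross(A, B):
    """S x T = {X union Y | X in S, Y in T}."""
    return {X | Y for X in A for Y in B}

def amge_step(a_vars, b_vars, S):
    """Base case of amge: (S \\ rel(a,S) \\ rel(b,S)) union (rel(a,S) x rel(b,S))*."""
    A, B = rel(a_vars, S), rel(b_vars, S)
    return (S - A - B) | star(cross(A, B))

def sharing(*groups):
    return {frozenset(g) for g in groups}

# append(X,Y,Z) against append([A|L], M, [A|N]); primed names stand in for tags.
S0 = sharing({"X"}, {"Y"}, {"Z"}, {"A'"}, {"M'", "N'"})
S1 = amge_step({"X"}, {"A'", "L'"}, S0)   # X = [A'|L']
S2 = amge_step({"Y"}, {"M'"}, S1)         # Y = M'
S3 = amge_step({"Z"}, {"A'", "N'"}, S2)   # Z = [A'|N']
# restriction to untagged variables: drop the primed names
result = {frozenset(v for v in g if not v.endswith("'")) for g in S3}
```

Running this reproduces the append example: `result` is {{X,Z},{Y,Z},{X,Y,Z}}, the value of aunify computed after Definition 14.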
Also, aunify for Sharing is not additive; thus information may be lost during condensing, in taking the least upper bound of the meanings of individual clauses to compute the meaning of a procedure. For example, given

p(ground, X).
p(X, ground).
q(X, Y) :- p(X, Y).

the condensed version of the body of q is {{X},{Y}}, which does not allow us to conclude that A is ground after executing the atom q(A, A). But again the accuracy of the query-dependent abstract semantics is affected by the order of goals in a body. Consider the two queries "A=B, q(A,B)" and "q(A,B), A=B". This time, execution according to the query-independent abstract semantics consistently fails to infer that A and B are ground after execution, but execution according to the query-dependent abstract semantics infers that A and B are ground for the first query but fails to infer their groundness for the second. Even though aunify for Sharing is not additive, its commutativity and Idem-⊑ make condensing P(Sharing) very attractive.

4.2 The domain ESharing

To see how the Sharing domain underachieves, consider constructing the example substitution σ_e from the previous section by a unification. The example substitution class, [σ_e], is obtained by unify(a, ι_a, b, ι_b), where a = f(W,X,Y,Z) and b = f(A,A,B,f(A,B)). Its approximation using abstract unification on Sharing is aunify(a, init(a), b, init(b)) = {{W,X,Z},{Y,Z},{W,X,Y,Z}}, which indicates that X and Y may be dependent. The result is, however, the best possible approximation based on the input, since the substitution σ' = {W ↦ W, X ↦ X, Y ↦ Y, Z ↦ f(Q,Q)} is approximated by init(a) as well, and soundness requires γ(aunify(a, init(a), b, init(b))) to contain unify(a, σ', b, ι_b) = {W ↦ A, X ↦ A, Y ↦ A, Z ↦ f(A,A)}. The accuracy of the Sharing domain can be extended if linear terms are tracked. A term is linear iff it contains no variable repeatedly.
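The linearity test is a direct occurrence count. As a small illustration (with a term representation invented for this sketch: compound terms as tuples `(functor, arg1, ..., argn)` and, Prolog-style, strings starting with an upper-case letter taken as variables):

```python
from collections import Counter

def variables(t):
    """All variable occurrences of a term, in order, with repetitions.
    Variables are strings whose first character is upper case;
    compound terms are tuples (functor, arg1, ..., argn)."""
    if isinstance(t, str):
        return [t] if t[:1].isupper() else []
    occs = []
    for arg in t[1:]:
        occs.extend(variables(arg))
    return occs

def is_linear(t):
    """A term is linear iff no variable occurs in it more than once."""
    return all(n == 1 for n in Counter(variables(t)).values())
```

For example, `("f", "X", ("g", "Y", "Z"))` is linear while `("f", "X", "X")` is not; repeated constants do not affect linearity.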
For a given substitution we say a variable is linear iff it is bound to a linear term. We can approximate which terms σt are linear by approximating the aliasing and linear variables of σ. In the analysis of logic programs we wish to approximate free variables to identify unconditional dependence. For a given substitution, a variable is free if it is bound to a variable. When aliasing information is considered, free variables can be approximated more accurately. The domain Extended Sharing, written ESharing, approximates Sharing, linear variables, and free variables. Formally, an ESharing is a triple (S, L, F) from Sharing × P(Var) × P(Var). There are named projection functions:

sharing : ESharing → Sharing, sharing(S, L, F) = S
linear : ESharing → P(Var), linear(S, L, F) = L
free : ESharing → P(Var), free(S, L, F) = F

We define the following concretization functions:

γ_L : P(Var) → P(Subst), γ_L(V) = {σ | ∀v ∈ V : σv is linear}
γ_F : P(Var) → P(Subst), γ_F(V) = {σ | ∀v ∈ V : σv ∈ Var}
γ_E : ESharing → P(Subst), γ_E(E) = γ(sharing E) ∩ γ_L(linear E) ∩ γ_F(free E)

The least upper bound for ESharing is defined

(S, L, F) ⊔ (S', L', F') = (S ∪ S', L ∩ L', F ∩ F').

The different orientation of the set operations corresponds to approximating possible sharing groups as opposed to definitely linear and free variables. The substitution ι_C is approximated by init_E(C) = (init(C), var(C), var(C)). Abstract unification for Extended Sharing is defined analogously to abstract unification for Sharing, except for the function amge. That is,

Definition 15 (aunify)
aunify(a, S, b, T) = ⌊amge(a, b̄, S ∪ T̄)⌋ if a and b are unifiable
aunify(a, S, b, T) = ⊥ otherwise

where the union of ESharings is the triple of unions of their components.

Definition 16 (amge for ESharing)
amge(a, b, E) = (newsharing(E, a, b), newlinear(E, a, b), newfree(E, a, b)) if at least one of a or b is a variable or constant
amge(a, b, E) = amge(a_n, b_n, amge(f(a_1 … a_{n-1}), f(b_1 … b_{n-1}), E)) if a = f(a_1 … a_n) and b = f(b_1 … b_n).

Below we develop the functions for the various components: newsharing, newlinear, and newfree. We start by motivating and defining some auxiliary functions. Then we define the three main functions. Finally we prove semantic properties of the auxiliary functions and use those to prove soundness of the three main functions. Given an Extended Sharing we can approximate pairs of definitely independent variables, possibly non-ground variables, and definitely linear terms.

indep : ESharing × Term × Term → Bool
png : ESharing → P(Var)
linterm : ESharing → P(Term)

indep(E, a, b) iff rel(a, sharing E) ∩ rel(b, sharing E) = ∅
png E = ⋃_{g ∈ sharing E} g
linterm E = {t ∈ Term | all v ∈ png E ∩ var(t) occur only once in t, png E ∩ var(t) ⊆ linear E, and ∀u, v ∈ var(t), u ≠ v : indep(E, u, v)}

The aliasing information in an ESharing limits the scope of variables that can possibly be affected when a substitution becomes further instantiated as a result of a unification. A variable remains unaffected unless it is in the sharing group of some variable that, under the current substitution, occurs in at least one of the two terms. We formalize the possibly affected variables of a term:

affected : ESharing × Term → P(Var)
affected(E, t) = ⋃ rel(t, sharing E)

Lemma 36 proves the soundness of this formalization. Because the aliasing information can narrow the scope of affected variables, it helps approximate linear and free variables with high accuracy. When two linear terms that share no variables are unified, the resulting term as well as all new bindings of the substitution are linear. When a linear term is unified with a non-linear term, the resulting term is possibly non-linear. Note that variables that occur in the non-linear term under the substitution will only be bound to linear terms if the two terms shared no variables beforehand.
Since the variables in the linear term under the substitution may be bound to non-linear terms, no affected variable can still be considered linear in the approximation. Formally,

newlinear : ESharing × Term × Term → P(Var)
newlinear(E, a, b) =
  linear E \ (A ∩ B) if {a, b} ⊆ linterm E and indep(E, a, b)
  linear E \ A if a ∈ linterm E and indep(E, a, b)
  linear E \ B if b ∈ linterm E and indep(E, a, b)
  linear E \ (A ∪ B) otherwise

where A = affected(E, a) and B = affected(E, b). Note that, as a consequence, newlinear(E, a, b) = linear E if at least one of A or B is empty. It would also be sound to add variables to the linear component as we infer that they are ground. We keep track of free variables similarly. The aliasing information allows us to limit the scope of variables that become bound to a non-variable term.

newfree(E, a, b) =
  free E if {a, b} ⊆ free E
  free E \ A if a ∈ free E
  free E \ B if b ∈ free E
  free E \ (A ∪ B) otherwise

where A = affected(E, a) and B = affected(E, b). Aliasing information can be approximated more accurately if we distinguish the common cases where at least one of the arguments is linear. For example, if we know that A and B are linear and independent on entry to the clause, we know that X and Y are independent at the call of p(X, Y).

h(A, B) :- A = [X|Y], B = [A], p(X, Y).

newsharing(E, a, b) =
  S ∪ A × B if {a, b} ⊆ linterm E and indep(E, a, b)
  S ∪ A* × B if a ∈ linterm E and indep(E, a, b)
  S ∪ A × B* if b ∈ linterm E and indep(E, a, b)
  S ∪ A* × B* otherwise

where S = sharing E \ A \ B, A = rel(a, sharing E), and B = rel(b, sharing E). (The star falls on the side of the linear argument: by Lemma 42, unifying against a linear σa produces groups built from one group relevant to b and possibly several groups relevant to a.)

As an example we compute the approximation aunify(a, init(a), b, init(b)) for a = f(W,X,Y,Z) and b = f(A,A,B,f(A,B)), which indicated a possible aliasing of X and Y when the approximation was based on Sharing alone. Let L = F = {A, B, W, X, Y, Z}, and let E_i be the ESharing after processing the i-th argument.
E_0 = init(a) ∪ init(b̄) = ({{Ā},{B̄},{W},{X},{Y},{Z}}, L, F)
E_1 = amge(W, Ā, E_0) = ({{Ā,W},{B̄},{X},{Y},{Z}}, L, F)
E_2 = amge(X, Ā, E_1) = ({{Ā,W,X},{B̄},{Y},{Z}}, L, F)
E_3 = amge(Y, B̄, E_2) = ({{Ā,W,X},{B̄,Y},{Z}}, L, F)
E_4 = amge(Z, f(Ā,B̄), E_3) = ({{Ā,W,X,Z},{B̄,Y,Z}}, L, {Ā,B̄,W,X,Y})

Finally, aunify(a, init(a), b, init(b)) = ⌊E_4⌋ = ({{W,X,Z},{Y,Z}}, {W,X,Y,Z}, {W,X,Y}), which indicates that X and Y are independent. It is easily seen that a certain star closure is necessary in newsharing when one of a or b is not in linterm E. It is less obvious that the star closure is necessary if some variables occur through both a and b under the substitution, i.e., when they are not indep(E, a, b). The necessity of the star closure can be seen in the following example. Consider the substitution σ = {A ↦ A, B ↦ B, C ↦ C, D ↦ f(A,B,C)} with [σ] ∈ γ_E(E) for E = ({{A,D},{B,D},{C,D}}, {A,B,C,D}, ∅), and consider aunify(a, E, b, init(b)) with a = p(A,B,C,D) and b = p(X,Y,Z,f(Y,Z,X)). During computation of this aunify the linearity of all variables is asserted correctly until the fourth argument of p/4 is considered. There, the three sharing groups {A,D,X̄}, {B,D,Ȳ}, and {C,D,Z̄} are relevant to both terms D and f(Ȳ,Z̄,X̄). The resulting set of sharing groups must include {A,B,C,D,X̄,Ȳ,Z̄} since unify(a, [σ], b, ι_b) results in the substitution [{A ↦ A, B ↦ A, C ↦ A, D ↦ f(A,A,A)}].

4.2.1 Local Soundness of aunify

In this rather technical section we prove the local soundness of aunify for ESharing. We start by proving the soundness of the auxiliary operations png, indep, affected, and linterm. Then we prove some properties of concrete unification concerning linear variables, followed by the soundness proof of newlinear. Then we prove some properties of concrete unification concerning sharing groups and prove soundness of newsharing. Finally, we prove soundness of amge and aunify.
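The case analysis of newsharing, and the need for the full star closure in the default case, can be checked with a small executable sketch. This is an illustration under simplifying assumptions: the linterm and indep tests are passed in as caller-supplied flags, tags are elided, and star/rel/cross are the operations of Definition 14.

```python
from itertools import combinations

def star(S):
    # closure under union (Definition 14)
    closed = set(S)
    changed = True
    while changed:
        changed = False
        for X, Y in combinations(list(closed), 2):
            if X | Y not in closed:
                closed.add(X | Y)
                changed = True
    return closed

def rel(t_vars, S):
    return {g for g in S if g & t_vars}

def cross(A, B):
    return {X | Y for X in A for Y in B}

def newsharing(S, a_vars, b_vars, a_lin, b_lin, indep):
    """The four cases of newsharing; a_lin, b_lin, and indep stand for the
    linterm and indep tests, which this sketch assumes are computed elsewhere."""
    A, B = rel(a_vars, S), rel(b_vars, S)
    rest = S - A - B
    if indep and a_lin and b_lin:
        return rest | cross(A, B)
    if indep and a_lin:                  # star on the linear side (Lemma 42)
        return rest | cross(star(A), B)
    if indep and b_lin:
        return rest | cross(A, star(B))
    return rest | star(cross(A, B))      # default: full star closure

# Star-closure example from the text: sigma binds D to f(A,B,C); unifying the
# fourth arguments D and f(Y,Z,X) makes all three groups relevant to both sides,
# so indep fails and the default case applies even though both terms are linear.
S = {frozenset(g) for g in ({"A", "D", "X"}, {"B", "D", "Y"}, {"C", "D", "Z"})}
out = newsharing(S, {"D"}, {"X", "Y", "Z"}, True, True, False)
```

As the text requires, `out` contains the group {A,B,C,D,X,Y,Z}, which only the star closure can produce.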
Lemma 33 (Soundness of png)
[σ] ∈ γ_E(E) ∧ var(σv) ≠ ∅ ⟹ v ∈ png E

Proof: If u ∈ var(σv) then sg(σ,u) ∈ sharing E by [σ] ∈ γ_E(E), and v ∈ sg(σ,u). □

Lemma 34 (Soundness of indep)
indep(E, a, b) ∧ [σ] ∈ γ_E(E) ⟹ var(σa) ∩ var(σb) = ∅

Proof: Suppose there is a v ∈ var(σa) ∩ var(σb). Then by [σ] ∈ γ_E(E) the sharing group sg(σ,v) would be in sharing E and in both rel(a, sharing E) and rel(b, sharing E). □

Lemma 35 (Soundness of affected)
[σ] ∈ γ_E(E) ∧ var(σv) ∩ var(σa) ≠ ∅ ⟹ v ∈ affected(E, a)

Proof: Consider the sharing group g = sg(σ,u) of some variable u ∈ var(σv) ∩ var(σa). By [σ] ∈ γ_E(E) we have g ∈ sharing E and g ∈ rel(a, sharing E). □

Lemma 36 (Scope of change of mge)
[σ] ∈ γ_E(E) ∧ v ∉ affected(E, a) ∪ affected(E, b) ⟹ mge(a, b, σ)(v) = σv

Proof: Note that mge(a, b, σ) = mgu(σa, σb) ∘ σ and that dom(mgu(σa, σb)) ⊆ var(σa) ∪ var(σb). By Lemma 35, v ∉ affected(E, a) ∪ affected(E, b) implies var(σv) ∩ (var(σa) ∪ var(σb)) = ∅, and thus var(σv) ∩ dom(mgu(σa, σb)) = ∅, and thus σv = mge(a, b, σ)(v). □

Lemma 37 (Soundness of newfree)
[σ] ∈ γ_E(E) ∧ σa ⊓ σb exists ⟹ mge(a, b, σ) ∈ γ_F(newfree(E, a, b))

Proof: Soundness of newfree follows from Lemma 36 and the definition of newfree. □

Lemma 38 (Soundness of linterm)
[σ] ∈ γ_E(E) ∧ t ∈ linterm E ⟹ σt is linear

Proof: Suppose a variable u appears repeatedly in σt. Either two variable positions in t have been instantiated to terms containing u, or some variable of t has been instantiated to a term containing u repeatedly. The first case is ruled out by requiring that all variables occurring repeatedly in t are bound to ground terms by σ, and all pairs of variables in t are bound to terms sharing no variables. The second case is ruled out by requiring that all variables of t are either bound to ground or linear terms.
□

The following two lemmas about the concrete mgu state which variables will remain bound to linear terms by the resulting substitution. Both lemmas consider the unification of terms that share no variables. The first lemma considers the case of two linear terms and the second lemma considers the case where one of the two terms is linear.

Lemma 39 If σa ⊓ σb exists and for v ∈ dom(σ) the terms σv, σa, and σb are linear, var(σa) ∩ var(σb) = ∅, and either var(σa) ∩ var(σv) = ∅ or var(σb) ∩ var(σv) = ∅, then mgu(σa, σb)(σv) is linear.

Proof: By induction over the number of evaluations of mgu according to definition 1. The claim holds trivially if σa = σb is a constant. If σa is a variable then we consider two cases. First, if var(σb) ∩ var(σv) ≠ ∅ then σa ∉ var(σv) and mgu(σa, σb)(σv) = σv, which is linear. Second, if σa ∈ var(σv) then mgu(σa, σb)(σv) is like σv with the single occurrence of σa replaced by σb. The term is linear since σb is linear and var(σb) ∩ var(σv) = ∅. If σb is a variable the claim holds by symmetrical considerations. If σa = f(a_1 … a_n) and σb = f(b_1 … b_n) then for μ = mgu(f(a_1 … a_{n-1}), f(b_1 … b_{n-1})) the term μ(σv) is linear under the conditions of the claim by inductive assumption. Moreover, since σa and σb are linear and share no variables, μ does not affect a_n or b_n, i.e., μa_n = a_n and μb_n = b_n. Thus by a second inductive assumption the term mgu(μa_n, μb_n)(μ(σv)) is linear. □

Lemma 40 If σa ⊓ σb exists and for v ∈ dom(σ) the terms σv and σb are linear, var(σa) ∩ var(σb) = ∅, and var(σb) ∩ var(σv) = ∅, then mgu(σa, σb)(σv) is linear.

Proof: By induction over the number of evaluations of mgu according to definition 1. The claim holds trivially if σa = σb is a constant. If σa is a variable then mgu(σa, σb)(σv) is like σv with the possible single occurrence of σa replaced by σb. The term is linear since σb is linear and var(σb) ∩ var(σv) = ∅.
If σb is a variable then mgu(σa, σb)(σv) = σv, which is linear, since σb ∉ var(σv). If σa = f(a_1 … a_n) and σb = f(b_1 … b_n) then for μ = mgu(f(a_1 … a_{n-1}), f(b_1 … b_{n-1})) the term μ(σv) is linear under the conditions of the claim by inductive assumption. Moreover, since σb is linear and shares no variables with σa, μ does not affect b_n, i.e., μb_n = b_n. Moreover, b_n shares no variables with either μa_n or μ(σv), since for t ∈ {a_n, σv} : var(μt) ⊆ var(σa, b_1 … b_{n-1}), and b_n shares no variables with σa or σv by the assumption of the claim. Now we can apply a second inductive assumption according to which the term mgu(μa_n, μb_n)(μ(σv)) is linear. □

Lemma 41 (Soundness of newlinear)
[σ] ∈ γ_E(E) ∧ σa ⊓ σb exists ⟹ mge(a, b, σ) ∈ γ_L(newlinear(E, a, b))

Proof: By showing for v ∈ newlinear(E, a, b) that mge(a, b, σ)(v) is linear. For v ∈ linear E the term σv is linear by [σ] ∈ γ_E(E). By Lemma 35, v ∉ affected(E, a) implies var(σv) ∩ var(σa) = ∅. If indep(E, a, b) we have var(σa) ∩ var(σb) = ∅ by Lemma 34. Now we distinguish according to the first three cases of the definition of newlinear. If {a, b} ⊆ linterm E then both σa and σb are linear by Lemma 38. For v ∈ linear E \ (affected(E, a) ∩ affected(E, b)) we meet all conditions of Lemma 39 to conclude that mge(a, b, σ)(v) is linear. If b ∈ linterm E then σb is linear, and for v ∈ linear E \ affected(E, b) we meet all conditions of Lemma 40 to conclude that mge(a, b, σ)(v) is linear. If a ∈ linterm E we use Lemma 40 symmetrically to show that mge(a, b, σ)(v) is linear for v ∈ linear E \ affected(E, a). Lemma 36 established that at most the variables from affected(E, a) ∪ affected(E, b) can become non-linear when a and b are unified.
□

The next lemma states that the sharing group of a variable under the most general extension of substitutions unifying a linear term with some opposing term is the union of a single sharing group of a variable in the opposing term with some union of sharing groups of variables from the linear term.

Lemma 42 (Sharing groups from unification with a linear term)
If σa is linear, var(σa) ∩ var(σb) = ∅, μ = mge(a, b, σ), and u ∈ var(μa), then there are v_b ∈ var(σb) and V_a ⊆ var(σa) such that

sg(μ, u) = sg(σ, v_b) ∪ ⋃_{v_a ∈ V_a} sg(σ, v_a)

Proof: By induction over the number of evaluations of mgu according to definition 1. The claim holds trivially if σa = σb is a constant. If at least one of σa or σb is a variable then assume without loss of generality that σb is a variable and μa = σa. Clearly, for all u ∈ var(σa) the sharing group sg(μ, u) = sg(σ, u) ∪ sg(σ, σb). If σa = f(a_1 … a_n) and σb = f(b_1 … b_n) then let μ' = mge(f(a_1 … a_{n-1}), f(b_1 … b_{n-1}), σ). For variables u ∈ var(μa) \ var(μa_n) the groups sg(μ, u) = sg(μ', u), which decompose into the claimed unions by inductive assumption on the construction of μ'. It remains to consider the groups sg(μ, u) for u ∈ var(μa_n). Since σa is linear and σa and σb share no variables, μ'a_n = a_n, which is linear, and sg(μ', v_a) = sg(σ, v_a) for all v_a ∈ var(a_n). Moreover, var(μ'b_n) ⊆ var(f(a_1 … a_{n-1})) ∪ var(σb), so that μ'b_n and a_n share no variables. We conclude by a second inductive assumption for mge(a_n, b_n, μ') that sg(μ, u) = sg(μ', v_b) ∪ ⋃_{v_a ∈ V_a} sg(σ, v_a) for some v_b ∈ var(μ'b_n) and V_a ⊆ var(a_n). As a special case of the first inductive assumption for μ' we have that sg(μ', v_b) decomposes into sg(σ, v'_b) ∪ ⋃_{v'_a ∈ V'_a} sg(σ, v'_a) for some v'_b ∈ var(σb) and V'_a ⊆ var(σa). Thus, sg(μ, u) = sg(σ, v'_b) ∪ ⋃_{v_a ∈ V_a ∪ V'_a} sg(σ, v_a), for all u ∈ var(μa). □

Lemma 43 (Soundness of newsharing)
[σ] ∈ γ_E(E) ∧ σa ⊓ σb exists ⟹ mge(a, b, σ) ∈ γ(newsharing(E, a, b))

Proof: Let μ = mge(a, b, σ); then we show for all v ∈ rgv(μ) that sg(μ, v) ∈ newsharing(E, a, b). If v ∉ var(μa) then sg(μ, v) = sg(σ, v) by Lemma 36. If v ∈ var(μa) then sg(μ, v) will be the union of some sharing groups from rel(a, sharing E) and rel(b, sharing E). We use soundness of linterm and indep, Lemmata 38 and 34, to establish the conditions for applying Lemma 42. We have that σa (or σb) is linear if a ∈ linterm E (or b ∈ linterm E), and var(σa) ∩ var(σb) = ∅ if indep(E, a, b). We now consider each of the cases in the definition of newsharing. If {a, b} ⊆ linterm E and indep(E, a, b) then by two symmetrical applications of Lemma 42 we have that sg(μ, v) = sg(σ, v_a) ∪ sg(σ, v_b) for some v_a ∈ var(σa) and v_b ∈ var(σb), and thus sg(μ, v) ∈ rel(a, sharing E) × rel(b, sharing E). Similar applications of Lemma 42 show that sg(μ, v) ∈ newsharing(E, a, b) for the cases where indep(E, a, b) holds and only one of a or b is in linterm E. The soundness of the default case follows from the soundness of amge for Sharing, Lemma 28. □

Lemma 44 (Soundness of amge)
[σ] ∈ γ_E(E) ∧ σa ⊓ σb exists ⟹ [mge(a, b, σ)] ∈ γ_E(amge(a, b, E))

Proof: By structural induction over the pair a and b. If at least one of a or b is a variable or constant then soundness follows from Lemmata 37, 41, and 43. If a = f(a_1 … a_n), a' = f(a_1 … a_{n-1}), b = f(b_1 … b_n), and b' = f(b_1 … b_{n-1}) then amge(a, b, E) = amge(a_n, b_n, amge(a', b', E)) by definition of amge, and mge satisfies mge(a, b, σ) = mge(a_n, b_n, mge(a', b', σ)). Therefore the inductive step simply consists of two applications of the inductive assumption.
□

Theorem 45 (Local Soundness)

  unify(a, γ_E(E), b, γ_E(E')) ⊆ γ_E(aunify(a, E, b, E'))

Proof: It is easily seen that |σ| ∈ γ_E(|E|) for all σ ∈ γ_E(E), and that σ ∩ θ ∈ γ_E(E ∪ E') if σ ∈ γ_E(E) and θ ∈ γ_E(E'). Together with Lemma 44, this establishes the soundness of aunify for ESharing. □

4.2.2 Algebraic Properties of aunify

The ESharing domain no longer enjoys a commutative property. In particular, the best approximation of variable aliasing is obtained if grounding unifications precede aliasing ones. For example, consider the two sequences of unifications "f(X,X,Y) = A, X = a" and "X = a, f(X,X,Y) = A". We infer all possible sharings of variables relevant to A when we approximate the first sequence. When we approximate the second sequence, however, no additional sharing for variables relevant to A is inferred if Y is linear and independent of A. That is, starting with the ESharing ({{A,B}, {A,C}, {X}, {Y}}, {Y}, ∅), the result of approximating the second sequence is ({{A,B,Y}, {A,C,Y}}, ∅, ∅), which indicates that B and C are independent. But the result of approximating the first sequence contains the additional sharing group {A,B,C,Y} in the sharing component, indicating that B and C may share a variable.

In the following example we show that this may allow the query-dependent semantics to infer a more accurate approximation. Consider the trivial rule base

  h(X, A) :- f(X, X, Y) = A, X = X.

and the query a = h(ground, Q). We show that the query-dependent semantics can infer that Q is linear after execution, but the query-independent semantics cannot. With the query-independent semantics, B = B'⟦f(X,X,Y)=A⟧ R'⟦r⟧ init(C₁) = ({{A,X}, {A,Y}}, {X,Y}, {X,Y}), which indicates that A may be non-linear. Therefore, B'⟦a⟧ R'⟦r⟧ init(a) = aunify(a, init(a), h(X,A), B) = ({{Q}}, ∅, ∅) cannot guarantee that Q is linear.
However, aunify(h(X,A), init(C₁), a, init(a)) = ({{A}, {Y}}, {A,Y}, {A,Y}) captures the linearity of Y. Using this in B'⟦f(X,X,Y)=A⟧ R'⟦r⟧ results in ({{A,Y}}, {Y}, {A,Y}), and therefore B'⟦a⟧ R'⟦r⟧ init(a) = ({{Q}}, ∅, {Q}), which captures that Q is linear.

4.2.3 Implementation Issues

It may require an extraordinary amount of space to represent the set of possible sharing groups of a substitution for a clause with more than a few variables. The theoretical maximum number of sharing groups of a set of substitutions over n variables is 2ⁿ. In practice, the number of sharing groups of a set of actual substitutions tends to stay much smaller. As long as the approximation is good and reflects ground and linear variable bindings, the number of approximated sharing groups will stay accordingly smaller than the maximum. But occasionally approximation information is lost, and as the vagueness increases the number of possible sharing groups explodes. A feasible implementation must provide a graceful degradation. In section 4.3 we describe a tunable degradation of accuracy for ESharing.

The operation that is most problematic is the star closure, the closure under union, of the Sharing that is relevant to a term across from a non-linear term. The expansion of the star closure may result in an exponential increase in the number of sharing groups. In the remainder of this section we introduce a number of rules and lemmas that are useful for reasoning about representations that avoid expanding the star closure.

We start out by considering rules that allow us to compute with Sharing expressions, Exp. All elements of the Sharing domain are in Exp, and if E and E' are in Exp then E*, E × E', and E ∪ E' are also in Exp. The precedence of the operators decreases in the order listed. Thus, the expression A ∪ B × C* is implicitly parenthesized as A ∪ (B × (C*)).
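The Sharing operations that these expressions denote can be made concrete with a small sketch. This is not code from the dissertation: a Sharing is modeled here as a Python set of frozensets (sharing groups), and the names star, cross, rel, rest, and amge_base are our own labels for the star closure, the cross product, the two filters, and the base case of amge.

```python
from itertools import combinations

# A Sharing is a set of sharing groups; each group is a frozenset of
# variable names.  Illustrative sketch only.

def star(s):
    """Star closure S*: close s under union of groups."""
    closed = set(s)
    changed = True
    while changed:
        changed = False
        for g, h in combinations(list(closed), 2):
            u = g | h
            if u not in closed:
                closed.add(u)
                changed = True
    return closed

def cross(s, t):
    """Cross product S x T: all pairwise unions of groups."""
    return {g | h for g in s for h in t}

def rel(term_vars, s):
    """rel(t, S): the groups sharing a variable with term t."""
    return {g for g in s if g & term_vars}

def rest(a_vars, b_vars, s):
    """rest(a, b, S): the groups relevant to neither a nor b."""
    return s - rel(a_vars, s) - rel(b_vars, s)

def amge_base(a_vars, b_vars, s):
    """Base case of amge on Sharing: rest(a,b,S) u rel(a,S)* x rel(b,S)*."""
    return rest(a_vars, b_vars, s) | cross(star(rel(a_vars, s)),
                                           star(rel(b_vars, s)))
```

For instance, unifying the variable X against the variable B in S = {{A,X}, {A,Y}, {B}} yields {{A,Y}, {A,X,B}}: the groups of X and B merge while {A,Y} is untouched. The filtering rule rel(a, E*) = rel(a, E) × E* stated below can likewise be checked on small inputs this way.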
In section 4.1 we defined restriction to untagged variables and filtering out relevant sharing groups for elements of Sharing. Here, we introduce an additional operation rest : Term × Term × Sharing, defined as rest(a, b, S) = S \ rel(a, S) \ rel(b, S), and extend the three operations to Exp. Note that, with the new function rest, we can express the base case of the definition of amge on Sharing as

  amge(a, b, S) = rest(a, b, S) ∪ rel(a, S)* × rel(b, S)*.

For restriction of expressions we have

  A1  |E ∪ E'| = |E| ∪ |E'|
  A2  |E × E'| = |E| × |E'|
  A3  |E*| = |E|*

Note that rule A2 holds because we defined |S| as the natural extension of restriction to sets of variables g, |g| = g ∩ Var. Since the set of approximated substitutions does not change when the empty sharing group is removed, it would have been sound to define restriction of Sharing in a way that explicitly excludes the empty sharing group. The elegance of the second rule justifies the original definition retroactively.

For filtering unaffected sharing groups we have

  B1  rest(a, b, E ∪ E') = rest(a, b, E) ∪ rest(a, b, E')
  B2  rest(a, b, E × E') = rest(a, b, E) × rest(a, b, E')
  B3  rest(a, b, E*) = rest(a, b, E)*

For filtering relevant sharing groups we have

  C1  rel(a, E ∪ E') = rel(a, E) ∪ rel(a, E')
  C2  rel(a, E × E') = rel(a, E) × E' ∪ E × rel(a, E')
  C3  rel(a, E*) = rel(a, E) × E*

Note that the union on the right hand side of rule C2 is not disjoint; the overlap contains rel(a, E) × rel(a, E'). The soundness proofs for these rules are very simple. Soundness of each of the rules can be shown, separately, by structural induction over the expressions. With the exception of rules C2 and C3, the soundness of the base cases follows easily from the definitions. For the base cases of rules C2 and C3 we show four set inclusions.
We have rel(a, E × E') ⊆ rel(a, E) × E' ∪ E × rel(a, E'), and rel(a, E*) ⊆ rel(a, E) × E*, since at least one of a set of sharing groups must share a variable with a if their union does. We have rel(a, E) × E' ⊆ rel(a, E × E'), and rel(a, E) × E* ⊆ rel(a, E*), since the union of a set of sharing groups shares a variable with a if at least one of them does.

Now we can use Exp to compute abstract unification with the Sharing domain. Unfortunately, it is not clear how to define a least upper bound operation for Exp, or how to decide whether two expressions represent the same Sharing, without expanding the definitions of multiplication, union, and star closure. Nevertheless, the rules allow us to delay expansion of the star closure until we have restricted the Sharing to untagged variables. This is advantageous if the Sharing T is large in comparison with S in aunify(a, S, b, T).

Now we develop a domain, Tup, that is more abstract than Sharing and never needs to expand a star closure. Elements of this domain Tup are of the form tuple(S, S') where S and S' are two disjoint Sharing elements. The concretization γ_T : Tup → ESubst is defined in terms of γ for Sharing, γ_T(tuple(S, T)) = γ(S ∪ T*). We define aunify for Tup like aunify for Sharing, by naturally extending union, tagging, and restriction to pairs. The function amge is defined using the auxiliary function mktuple : Sharing × Sharing → Tup, with mktuple(S, T) = tuple(S \ T*, basis(T)), where basis(S) is the, unique, smallest Sharing such that basis(S)* = S*. Note that we do not need to expand the star closure for T to compute S \ T* = {g ∈ S | ⋃{g' ∈ T | g' ⊆ g} ≠ g}. We define

  amge(a, b, tuple(S, T)) = mktuple( rest(a, b, S),
      rest(a, b, T) ∪ rel(a, S) × rel(b, S) ∪ rel(a, S ∪ T) × rel(b, T) × T ∪ rel(a, T) × rel(b, S) × T )

if at least one of a or b is a variable or constant, and amge(a, b, T) = amge(aₙ, bₙ, amge(f(a₁ … aₙ₋₁), f(b₁ … bₙ₋₁), T)) if a = f(a₁ … aₙ) and b = f(b₁ … bₙ). We will prove soundness of amge using additional algebraic properties and approximations.

The operations ∪ and × are commutative and associative with identity ∅ and {∅} respectively. The operator ∪ has the familiar idempotence, S ∪ S = S, and the operator × is idempotent for star closed sets, i.e., S* × S* = S*.

  D1  S** = S*
  D2  (S × T)* = S* × T*
  D3  (S ∪ T)* = S* ∪ T* ∪ S* × T*
  D4  (S ∪ T) × U = S × U ∪ T × U

There are derived identities, (S × T*)* = (S × T)*, from D1 and D2, and (S ∪ T*)* = (S ∪ T)*, from D1 and D3. This allows us to systematically remove inner star closures from star closed expressions. Similarly, within a star closed expression × is idempotent, e.g., (S ∪ T × T × U)* = (S ∪ T × U)*. Approximations: S ⊆ S × S ⊆ S*. In combination with D1, D2 and D3 we obtain the derived approximations S × T* ⊆ (S × T)* and S ∪ T* ⊆ (S ∪ T)*.

Lemma 46 (Soundness of amge)

  [σ] ∈ γ_T(tuple(S, T)) ⟹ mge(a, b, σ) ∈ γ_T(amge(a, b, tuple(S, T)))

Proof: Soundness of amge for Tup is implied by the following approximation of the amge for Sharing:

  amge(a, b, S ∪ T*) ⊆ rest(a, b, S) ∪
      ( rest(a, b, T) ∪ rel(a, S) × rel(b, S) ∪ rel(a, S ∪ T) × rel(b, T) × T ∪ rel(a, T) × rel(b, S) × T )*

We derive the approximation from left to right using the algebraic properties and approximations from above.

  amge(a, b, S ∪ T*)
  = rest(a, b, S) ∪ rest(a, b, T)* ∪ (rel(a, S) ∪ rel(a, T) × T*)* × (rel(b, S) ∪ rel(b, T) × T*)*
  = rest(a, b, S) ∪ rest(a, b, T)* ∪
      ( rel(a, S) × rel(b, S) ∪ rel(a, S) × T × rel(b, T) ∪ rel(a, T) × T × rel(b, S) ∪ rel(a, T) × T × rel(b, T) )*
  ⊆ rest(a, b, S) ∪
      ( rest(a, b, T) ∪ rel(a, S) × rel(b, S) ∪ rel(a, S ∪ T) × rel(b, T) × T ∪ rel(a, T) × rel(b, S) × T )*  □

4.3 The Worst Case Operation

Often an approximation of a substitution has crisp information for some of the domain variables, but must assume the worst for the rest.
If we must assume the worst for n variables we include 2ⁿ − 1 variable groups in the sharing component. It is clearly not appropriate to represent lack of knowledge this expensively. Instead we introduce a special operation for this common case and develop a domain of expressions built up from elements of ESharing and a constructor wc, which corresponds to some operation wco. Let the function wco : ESharing × P(Var) → ESharing dilute the approximation of a given ESharing by assuming the worst for a given set of variables. Assuming the worst for a set of variables V and an Answer corresponds to considering all possible further instantiations of the variables in V. For example, the ESharing that describes the set of substitutions where A is ground and that assumes the worst for B and C is obtained by wco(({{B}, {C}}, {A,B,C}, {B,C}), {B,C}). Formally, we define a worst case operation for Sharing and for ESharing.

Definition 17 (wco)

  wco : Sharing × P(Var) → Sharing
  wco(S, V) = rest(V, S) ∪ rel(V, S)*

  wco : ESharing × P(Var) → ESharing
  wco((S, L, F), V) = (wco(S, V), L \ affected(V, E), F \ affected(V, E))

Here and elsewhere in this section we apply the functions rel, rest, and affected to sets of variables as if they were terms. This is unproblematic since these functions depend only on the set of variables of their term arguments, and a finite set of variables can easily be thought of as a special kind of term. Also, we abbreviate rest(V, V, S) by rest(V, S).

We now define an abstract domain of certain canonical expressions, Wexp, built up from elements of ESharing and the constructor wc. More precisely, elements of Wexp are equivalence classes of such expressions. All canonical expressions can be constructed from elements of ESharing and the function cwc : Wexp × P(Var) → Wexp defined as follows.
  cwc((S, L, F), V) = wc((rest(V, S) ∪ basis(rel(V, S)), L \ A, F \ A), V)
                        if rel(V, S) contains more than one sharing group
  cwc((S, L, F), V) = (S, L \ A, F \ A)   otherwise

  cwc(wc(E, V), V') = wc(cwc(E, V'), V)   if E_V ∩ E_V' = ∅
  cwc(wc(E, V), V') = cwc(E, V ∪ V')      otherwise

where E_t = rel(t, sharingE) for t ∈ {V, V'}, A = affected(V, (S, L, F)), and basis(S) is the, unique, smallest Sharing such that basis(S)* = S*. The Sharing basis(S) is obtained from S by removing all sets that are the union of some other sets in S. A canonical expression lists strictly fewer sharing groups in the sharing component of its innermost ESharing than the ESharing denoted by the expression, and the different variable sets for which the worst is assumed affect different sharing groups of the innermost ESharing. The equivalence wc(wc(E, U), V) = wc(wc(E, V), U) induces equivalence classes on canonical expressions. Wexp is the domain of equivalence classes of expressions constructed from ESharing and the function cwc.

We say a construction cwc(E, V) is non-trivial if rel(V, sharingE) contains at least one sharing group. All elements of Wexp are either in ESharing or the result of a non-trivial construction cwc(E, V). We will use the following two properties of non-trivial constructions.

Lemma 47 For an ESharing E and sets of variables V and V', let E_V be rel(V, sharingE); then for the canonical construction cwc(cwc(E, V), V') we have

  I   cwc(cwc(E, V), V') = cwc(E, V ∪ V')   if E_V ∩ E_V' ≠ ∅
  II  cwc(cwc(E, V), V') = cwc(cwc(E, V'), V)

Proof: By induction over the number of constructors, wc, in E. If E ∈ ESharing then claims I and II follow from the definition of cwc and the assumptions that the constructions are non-trivial. If E = wc(E₁, V₁) then cwc(E₁, V₁) is a non-trivial construction of E and we consider the two intersections E₁_V₁ ∩ E₁_V and E₁_V₁ ∩ E₁_V'.
If both intersections are empty then claim II follows from the definition of cwc and the inductive assumption, and claim I follows by noting additionally that E₁_V₁ ∩ (E₁_V ∪ E₁_V') is also empty. If neither of the two intersections is empty then claims I and II follow from the definition of cwc and the inductive assumption. If E₁_V₁ ∩ E₁_V = ∅ but E₁_V₁ ∩ E₁_V' ≠ ∅ then claim I follows from the definition of cwc and the inductive assumption; claim II follows from the definition of cwc, the inductive assumption, and claim I. The remaining case is analogous for claim I and symmetrical for claim II to the previous case. □

The following least upper bound operation induces a cpo structure on Wexp.

  ⊔ : Wexp × Wexp → Wexp
  wc(E, V) ⊔ E' = E' ⊔ wc(E, V) = cwc(E ⊔ E', V)

The definition is given for canonical expressions, but with claim II from Lemma 47 it is easily seen that equivalent expressions result in equivalent least upper bounds. Claim II of Lemma 47 also ensures that the least upper bound of two canonical expressions wc(E, V) and wc(E', V') is not ambiguously defined. The definition of ⊔ on Wexp indeed induces a partial order with ⊔ as least upper bound, as the following theorem shows.

Theorem 48 The operation ⊔ is idempotent, symmetric, and associative, and thus ⊔ is the least upper bound for the ordering ⊑ defined by E ⊑ E' iff E ⊔ E' = E'.

  idempotence    E ⊔ E = E
  symmetry       E ⊔ E' = E' ⊔ E
  associativity  (E ⊔ E') ⊔ E'' = E ⊔ (E' ⊔ E'')

Proof: It suffices to prove the properties for elements of ESharing and non-trivial constructions. The properties clearly hold for elements of ESharing. We use claims I and II from Lemma 47 and one more claim,

  III  cwc(E, V) ⊔ E' = cwc(E ⊔ E', V),

to prove the properties for non-trivial constructions. First, we prove claim III by induction over the structure of E. The claim follows from the definition of ⊔ if E ∈ ESharing. If E = wc(E₁, V₁) then consider E₁_V₁ ∩ E₁_V.
If the intersection is empty then the claim follows from the definitions of cwc and ⊔, the inductive assumption, and claim II of Lemma 47. If the intersection is non-empty then the claim follows from the definitions of cwc and ⊔, the inductive assumption, and claim I of Lemma 47. With claim III, idempotence of ⊔ for all non-trivial constructions follows from claim I of Lemma 47. Symmetry of ⊔ for all canonical expressions follows from claim III, the definition of ⊔, and claim II. Note that cwc(E, V) is a non-trivial construction of the class of wc(E, V), if wc(E, V) is a canonical expression. Associativity of ⊔ can be shown for all canonical expressions by induction over the structures of E, E' and E''. The inductive steps use Lemma 47 to set up the applications of the inductive assumptions. □

All the auxiliary functions for ESharing can be lifted to Wexp since a canonical expression denotes a unique ESharing. In particular, the concretization function γ for the domain Wexp is just the concretization function for ESharing lifted to Wexp. The denoted ESharing can be obtained by evaluating wc as wco. The following lemma shows that equivalent canonical expressions denote the same ESharing.

Lemma 49 If rel(V, sharingE) ∩ rel(V', sharingE) = ∅ then

  wco(wco(E, V), V') = wco(wco(E, V'), V)

Proof: We have

  wco(wco(S, V), W)
  = rest(W, rest(V, S)) ∪ rest(W, rel(V, S))* ∪ (rel(W, rest(V, S)) ∪ rel(W, rel(V, S)) × rel(V, S)*)*
  = rest(W ∪ V, S) ∪ rest(W, rel(V, S))* ∪ (rel(W, rest(V, S)) ∪ rel(W, rel(V, S)) × rel(V, S))*

from the definition of wco and the algebraic rules A1 through C3 from section 4.2.3. Note that all the filtering operations commute; for example, rest(W, rel(V, S)) = rel(V, rest(W, S)). Therefore, with rel(V, rel(W, S)) = ∅, the right hand side is clearly symmetrical in V and W. □

Abstract unification for Wexp is not just the lifted version of aunify for ESharing.
We define aunify for Wexp in a way that avoids computing the star closure in certain circumstances, by increasing the set of variables for which the worst must be assumed. Abstract unification for Wexp is defined analogously to aunify for ESharing except for the base case of amge. That is,

Definition 18 (aunify)

  aunify(a, S, b, T) = |amge(a, b, S ⊔ T)|   if a and b are unifiable
  aunify(a, S, b, T) = ∅                     otherwise

where the union of Wexps is defined wc(E, V) ⊔ E' = E' ⊔ wc(E, V) = cwc(E' ⊔ E, V). For non-variable, unifiable term arguments a = f(a₁ … aₙ) and b = f(b₁ … bₙ) we define amge(a, b, E) = amge(aₙ, bₙ, amge(f(a₁ … aₙ₋₁), f(b₁ … bₙ₋₁), E)). When at least one of a or b is a variable, then

  amge(a, b, wc(E, V)) = cwc(amge(a, b, E), V)               if A_V ∩ A_a = A_V ∩ A_b = ∅
  amge(a, b, wc(E, V)) = cwc(amge_W(a, b, E), V ∪ var(a))    otherwise

  amge_W(a, b, wc(E, V)) = cwc(amge_W(a, b, E), V)              if A_V ∩ A_a = A_V ∩ A_b = ∅
  amge_W(a, b, wc(E, V)) = cwc(amge_W(a, b, E), V ∪ var(a))     otherwise

  amge_W(a, b, (S, L, F)) = (rest(a, b, S) ∪ P, L \ A, F \ A)

where A_t = affected(t, E) for t ∈ {V, a, b}, P = rel(a, S) × rel(b, S), and A = ⋃P.

For a sufficiently large set of variables V, abstract unification can be computed without computing a star closure: for var(a) ⊆ V we have amge(a, b, wc(E, V)) = cwc(amge_W(a, b, E), V). Using expressions from Wexp, it is always possible to avoid computation of the star closure by sacrificing some accuracy. In this sense the domain Wexp addresses the problem of graceful degradation posed in section 4.2.3.

Soundness of abstract operations on Wexp is reduced to soundness of the corresponding abstract operations on ESharing by proving, for their definitions, that the right hand side denotes a larger ESharing than the left hand side. For soundness of ⊔ on Wexp we will prove wco(E, V) ⊔ E' ⊆ wco(E ⊔ E', V) in Lemma 51.
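To make the base case of Definition 18 concrete, here is a sketch (our own Python, not the dissertation's code; the names amge_w_sharing, cross, rel, and rest are assumptions) of the sharing component of amge_W. It forms the single cross product rel(a, S) × rel(b, S) instead of a star closure and returns the affected set A, whose variables join the worst-case set.

```python
def cross(s, t):
    """Pairwise unions of sharing groups."""
    return {g | h for g in s for h in t}

def rel(term_vars, s):
    """Groups sharing a variable with the given term."""
    return {g for g in s if g & term_vars}

def rest(a_vars, b_vars, s):
    """Groups relevant to neither term."""
    return s - rel(a_vars, s) - rel(b_vars, s)

def amge_w_sharing(a_vars, b_vars, s):
    """Sharing part of amge_W: rest(a,b,S) u P with P = rel(a,S) x rel(b,S).
    Returns the new sharing and the affected set A (the union of P);
    A's variables are added to the worst-case set instead of star-closing."""
    p = cross(rel(a_vars, s), rel(b_vars, s))
    affected = set().union(*p) if p else set()
    return rest(a_vars, b_vars, s) | p, affected
```

For S = {{X,U}, {X,W}, {Y,V}} and the unification of X with Y, the full amge would star-close {{X,U}, {X,W}} and also produce the group {X,U,W,Y,V}; amge_W omits it, since it is covered by assuming the worst for the returned affected set.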
For soundness of aunify we will prove the following approximations concerning ESharing and the operation wco, under appropriate conditions, in Lemmas 52, 54, 55, and 59.

  I    wco(E, V) ∪ E' = wco(E ∪ E', V)
  II   amge(a, b, wco(E, V)) ⊆ wco(amge(a, b, E), V)
  III  amge(a, b, wco(E, V)) ⊆ wco(amge_W(a, b, E), V ∪ var(a))
  IV   |wco(E, V)| = wco(|E|, |V|)

There are two simple approximations of wco.

Lemma 50

  I   E ⊆ wco(E, V)
  II  wco(wco(E, V), W) ⊆ wco(E, V ∪ W)

Proof: Proof of I is straightforward from the definition of wco. For II we first expand the definition of wco, following the proof of Lemma 49. Then we see that it suffices to show that both rest(W, rel(V, S))* and (rel(W, rest(V, S)) ∪ rel(W, rel(V, S)) × rel(V, S))* are subsets of rel(V ∪ W, S)*, which is clearly true. □

Lemma 51

  wco(E, V) ⊔ E' ⊆ wco(E ⊔ E', V)

Proof: It is easily seen that wco is monotonic in both arguments, i.e., for V₁ ⊆ V₂ and E₁ ⊆ E₂ we have wco(E₁, V₁) ⊆ wco(E₂, V₂). The claim follows from this monotonicity and claim I of Lemma 50. □

Lemma 52 Assuming E and V contain no tagged variables,

  wco(E, V) ∪ E' = wco(E ∪ E', V)
  E' ∪ wco(E, V) = wco(E' ∪ E, V)
  wco(E, V) ∪ wco(E', V') = wco(E ∪ E', V ∪ V')

The union here is the componentwise union for ESharing.

Proof: Straightforward from the definition of wco and the disjointness of the tagged and untagged variable sets. □

To show approximations for amge of ESharing obtained from a worst case assumption, we will first state a lemma about the set of affected variables for an ESharing obtained from a worst case assumption and introduce a naming convention for Sharings.

Lemma 53

  affected(a, wco(E, V)) = affected(a, E)   if rel(a, rel(V, sharingE)) = ∅
  affected(a, wco(E, V)) = A_a ∪ A_V        otherwise

where A_a = affected(a, E) and A_V = affected(V, E).

Proof: Let S = sharingE, S_V = rel(V, S), and R = rest(V, S); then

  affected(a, wco(E, V)) = ⋃ rel(a, R ∪ S_V*) = ⋃ (rel(a, R) ∪ rel(a, S_V) × S_V*).
Thus if rel(a, S_V) = ∅ we have affected(a, wco(E, V)) = ⋃ rel(a, R) = affected(a, E), and otherwise we have ⋃ rel(a, R) ⊆ A_a ⊆ affected(a, wco(E, V)) and A_V ⊆ ⋃ (rel(a, S_V) × S_V*) ⊆ A_a ∪ A_V. Together this implies the claim. □

We introduce a few special names and abbreviations for the remainder of this section. We will consistently refer to the sharing component of the ESharing, E, as S. Note that the filtering operations rel and rest commute, e.g., rel(V, rest(a, b, S)) = rest(a, b, rel(V, S)). Therefore the Sharing elements in the following claims can be expressed canonically as the union of products of, possibly star closed, factors. There are six basic factors, which we name as follows.

              rel V    rest V
  rel a       S_a      R_a
  rel b       S_b      R_b
  rest a b    S_V      R

For example, S_V = rel(V, rest(a, b, S)) = rest(a, b, rel(V, S)). The basic factors are pairwise disjoint except, possibly, for the pairs S_a, S_b and R_a, R_b. Recall that amge_W is the special abstract operation that avoids computing the star closure. We will show that this is sound if amge_W is surrounded by a sufficiently pessimistic worst case assumption.

Lemma 54 If affected(V, E) ∩ affected(a, E) = affected(V, E) ∩ affected(b, E) = ∅ then

  I   amge(a, b, wc(E, V)) = wc(amge(a, b, E), V)
  II  amge_W(a, b, wc(E, V)) = wc(amge_W(a, b, E), V)
Therefore the premise implies a number of identities: i ] indep(wco(Ef V ),a,b ) iff indep(E,a,b) j rest(a, 6 , wco(5', V)) = wco(rest(a, b, S), V) affected(t,wco(E,V)) = affected{t, E) for t € {«,&} : t 6 linterm(wco(E, V)) iff t £ UntermE i W ith this claims lb and Ic are easily seen to hold. To show claim Ia we distinguish the four cases in the definition of newsharing. If {a, 6 } C UntermE and indep(E, a, b) then the corresponding conditions hold for , wco(i?, V), i.e., {a, 6 } C linterm(wco(E, V)) and indep(vrco(E,V),a,b). Therefore it suffices to show newsharing(wco(E,V),a,b) = wco (rest(a,b, S ),V ) U rel(at S ) x rel(b,S) = wco (sharingE1 , V) which follows from the definitions of wco and newsharing. The other three cases are shown along the same line. The condition th at some case requires of E, implies the corresponding condition on wco(E, V ) and therefore the same case applies, and the star closure is performed for the same subsharing on both sides. The proof of claim I I is analogous to the proof of claim I. C L e m m a 55 v ar(a) C V =4* a m g e (a , 6 ,w co(£, F )) C w co(am gew (a,b, E ),V ) 79 P roof: Let E' = a m g e ^ (a ,6 ,E ) and S = sharingE, then the main claim corre sponds to the following three claims for the components of an ESharing. I newsharing(wco(E,V),a,b) C wco(sharingE',V) I I linearE' \ affected(V, E') C newlinear(wco(E,V),a,b) I I I fre eE '\ affected(V, E') C newfree(wco(E,V),a,b) Let S' = wco(S', V), then note that newsharing(wco(E,V),a,b) C rest(a, b, S') U rel(a, S')* x rel(b, S')*. Therefore for claim I it suffices to show th at rest(a, b, S') U rel(a,S')* x rel(b,S')* C wco(sharingE',V). According to our naming conventions we have R a = 0 by the premise, rel(a,S') — Sa x rel(V,S), rel(b,S') = f?jU Sb x rel{V, S), and sharingE' = R U Sy U Sa X (i?& U S & ). 
Thus it suffices to show:

  S_V* ∪ R ∪ S_a* × rel(V, S)* × (R_b ∪ S_b × rel(V, S))* ⊆ R ∪ (S_V ∪ S_a × (R_b ∪ S_b))*

Here it is easily seen that all products on the left hand side are subsets of some subset of the right hand side. Namely, S_V* ⊆ (S_V ∪ S_a × (R_b ∪ S_b))*, R ⊆ R, S_a* × rel(V, S)* × R_b* ⊆ (S_V ∪ S_a × (R_b ∪ S_b))*, and S_a* × rel(V, S)* × S_b* ⊆ (S_V ∪ S_a × S_b)*.

For claim II we distinguish two cases. If at least one of rel(a, S) or rel(b, S) is empty, we have newlinear(E, a, b) = linearE, and newlinear(wco(E, V), a, b) = linear(wco(E, V)) since rel(x, S) = ∅ ⟹ rel(x, wco(S, V)) = ∅. Therefore in this case linearE' \ affected(V, E') = newlinear(wco(E, V), a, b). If both rel(a, S) and rel(b, S) are non-empty, then we have affected(V, E') = affected(V, E) ∪ affected(b, E) by definition of amge_W since var(a) ⊆ V, and

  linearE' \ affected(V, E')
  = linearE' \ affected(V, E) \ affected(b, E)
  ⊆ linearE \ affected(V, E) \ affected(b, E)
  = linear(wco(E, V)) \ affected(b, E)
  ⊆ linear(wco(E, V)) \ affected(a, wco(E, V)) \ affected(b, wco(E, V))
  ⊆ newlinear(wco(E, V), a, b).

The first step is justified since affected(V, E') = affected(V, E) ∪ affected(b, E). The second step just uses newlinear(E, a, b) ⊆ linearE. The third step follows from the definition of wco, the fourth from Lemma 53 and the fact that affected(a, E) ⊆ affected(V, E) if var(a) ⊆ V. The last step is just a worst case approximation of newlinear.

To show III, let W_t, A_t, and A'_t be affected(t, wco(E, V)), affected(t, E), and affected(t, E'), respectively, for t ∈ {a, b, V}. We again distinguish whether at least one of rel(a, S) or rel(b, S) is empty. If it is, then freeE' = freeE \ A_a \ A_b, since t ∈ freeE implies rel(t, S) ≠ ∅ for t ∈ {a, b}, and A'_V = A_V \ A_a \ A_b. Also, one of rel(a, wco(S, V)) or rel(b, wco(S, V)) is empty, and newfree(wco(E, V), a, b) = free(wco(E, V)) \ W_a \ W_b, and by Lemma 53 we have free(wco(E, V)) \ A_a \ A_b ⊆ newfree(wco(E, V), a, b). Therefore, freeE' \ A'_V ⊆ newfree(wco(E, V), a, b) in this case.
Otherwise, if both rel(a, S) and rel(b, S) are non-empty, A'_V = A_V, and A'_a = A'_b = A_a ∪ A_b = ⋃(rel(a, S) × rel(b, S)) ⊆ A_V since var(a) ⊆ V. Therefore, freeE' \ A'_V = freeE \ A_V \ A_b, and freeE \ A_V \ A_b ⊆ newfree(wco(E, V), a, b) by the definition of newfree. □

Restriction to untagged variables for ESharing is well defined. Instead of lifting this definition to arbitrary canonical expressions, we define a subclass of canonical expressions whose restriction is well behaved with respect to the constructor wc. A canonical expression, wc(E, V), is called restrictable if E is restrictable and

  ∀ g ∈ sharingE : |g| ∩ |V| = ∅ ⟹ g ⊆ Var ∨ g ∩ V = ∅.

For restrictable Wexps, wc(E, V), we define restriction, |wc(E, V)| = cwc(|E|, |V|). We will show that all Wexps that occur during computation of abstract unification are restrictable in Lemmas 56, 57, and 58.

Lemma 56 If E and E' are two Wexps that contain no tagged variables then E ⊔ E' is restrictable.

Proof: By induction over the structure of canonical expressions. If E and E' are ESharings then E ⊔ E' is an ESharing and thus restrictable. The union wc(E, V) ⊔ E' = wc(E ⊔ E', V) is restrictable since |g| ∩ |V| = g ∩ V for all g ∈ sharingE, and for g ∈ sharingE' we have g ⊆ Var. The union E' ⊔ wc(E, V) = wc(E' ⊔ E, V) is restrictable since g ∩ V = ∅ for g ∈ sharingE', and for g ∈ sharingE we have g ⊆ Var. □

Lemma 57 Let A_t = affected(t, E) for t ∈ {a, b, V}; then if wc(E, V) is restrictable and A_a ∩ A_V = A_b ∩ A_V = ∅, then wc(amge(a, b, E), V) is restrictable.

Proof: Let S = sharingE and S' = newsharing(E, a, b); then we consider two cases for g ∈ S'. If g ∈ rest(a, b, S) then g ∈ S, and since wc(E, V) is restrictable we have |g| ∩ |V| = ∅ ⟹ g ⊆ Var ∨ g ∩ V = ∅. If g ∉ rest(a, b, S) then g ⊆ (A_a ∪ A_b) and thus g ∩ V = ∅. □

Lemma 58 If wc(E, V) is restrictable then wc(amge_W(a, b, E), V ∪ var(a)) is restrictable.
Proof: Let S = sharingE and S' = sharing(amge_W(a, b, E)); then we consider two cases for g ∈ S'. If g ∈ rest(a, b, S) then |g| ∩ |V ∪ var(a)| = |g| ∩ |V|, and |g| ∩ |V| = ∅ ⟹ g ⊆ Var ∨ g ∩ V = ∅ since wc(E, V) is restrictable. If g ∈ rel(a, S) × rel(b, S) then |g| ∩ |V ∪ var(a)| ⊇ |g| ∩ var(a) ≠ ∅. □

Finally, we show that restriction is sound for restrictable Wexps.

Lemma 59 If wc((S, L, F), V) is restrictable then

  |wco((S, L, F), V)| = wco((|S|, |L|, |F|), |V|)

Proof: The identity is easily seen for the linear and free components of the two sides. For the sharing component we first consider the case where no sharing group g ∈ S contains only tagged variables. By the premise and the definition of rel we then have rel(V, S) = rel(|V|, S) and therefore |wco(S, V)| = |wco(S, |V|)| = wco(|S|, |V|). In general we can partition S into two Sharings, S₁ = {g ∈ S | g contains only tagged variables} and S₂ = S \ S₁. Now we have wco(S, V) = wco(S₁, V) ∪ wco(S₂, |V|) ∪ rel(V, S₁)* × rel(|V|, S₂)*, using the established result for S₂ and some of the algebraic rules. We have |wco(S₁, V)| = {∅} and |rel(V, S₁)* × rel(|V|, S₂)*| = |rel(|V|, S₂)*|, since the unions with elements from S₁ do not result in additional restricted sharing groups. Therefore we have |wco(S, V)| = wco(|S₂|, |V|) ∪ {∅} = wco(|S|, |V|). □

We have introduced the domain Wexp with operations ⊔, cwc, and aunify and demonstrated its soundness. We continue by discussing the trade-off between loss of accuracy and savings in computation. Since E ⊆ cwc(E, V) for all Wexps E and sets of variables V, it is always sound to, preemptively, assume the worst for some set of variables for which a large portion of some Sharing is relevant. Once the worst has been assumed for these variables, no star closure involving sharing groups relevant to this set will have to be computed.
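The trade-off can be sketched concretely. This illustrative snippet (our own Python; the names wco_sharing and star are assumptions) implements wco on the sharing component per Definition 17 and exhibits the 2ⁿ − 1 blow-up that the unexpanded wc(E, V) form avoids by keeping rel(V, S) as it is.

```python
from itertools import combinations

def star(s):
    """Closure of a set of frozenset groups under union."""
    closed = set(s)
    changed = True
    while changed:
        changed = False
        for g, h in combinations(list(closed), 2):
            if g | h not in closed:
                closed.add(g | h)
                changed = True
    return closed

def wco_sharing(s, v):
    """Definition 17 on the sharing component:
    wco(S, V) = rest(V, S) u rel(V, S)*."""
    relevant = {g for g in s if g & v}
    return (s - relevant) | star(relevant)

# Assuming the worst for n independent variables expands to 2^n - 1
# groups, while the wc(S, V) form keeps only the n singletons and V.
singletons = {frozenset({x}) for x in "ABCD"}
expanded = wco_sharing(singletons, {"A", "B", "C", "D"})
```

Here `expanded` holds all 2⁴ − 1 = 15 non-empty subsets of {A,B,C,D}, which the compact representation never materializes.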
Certain representations of Sharing with worst case operations, the sharing components of Wexps, correspond directly to elements of the domain of weak and strong equivalence classes reviewed in section 4.1. The Sharing obtained by starting with the set of strong equivalence classes and then assuming the worst for each weak equivalence class captures all the information present in the equivalence class description. For example, the description {V ↦ C₁, W ↦ E₁₁, X ↦ E₁₁, Y ↦ C₂, Z ↦ C₂} specifies that the sets {V,W,X} and {Y,Z} are weakly coupled and W and X are strongly coupled. This description corresponds directly to the sharing component of some Wexp, wc(wc({{V}, {W,X}, {Y}, {Z}}, {V,W,X}), {Y,Z}). Note that the representation using the worst case operation explicitly lists the trivial, singleton strong equivalence classes for non-ground variables. These classes are implicit in the notation with subscripted C's and E's. Both representations capture the fact that variables in different weak coupling classes cannot share variables and that grounding one member of a strong coupling class grounds all its members.

Some accuracy is lost by forcing descriptions to correspond directly to strong and weak equivalence classes. For example, let S be the Sharing from the previous example; then after unification of X with f(Y,Z) we obtain

  aunify(X = f(Y,Z), S, A = A, {{A}})
  = |amge(X = f(Y,Z), A = A, wc(wc({{V}, {W,X}, {Y}, {Z}, {A}}, {V,W,X}), {Y,Z}))|
  = |amge(f(Y,Z), A, wc(wc({{V}, {A,W,X}, {Y}, {Z}}, {V,W,X}), {Y,Z}))|
  = |wc({{V}, {A,W,X,Y}, {A,W,X,Z}}, {V,W,X,Y,Z})|
  = wc({{V}, {W,X,Y}, {W,X,Z}}, {V,W,X,Y,Z})
  ⊆ wc({{V}, {W,X}, {Y}, {Z}}, {V,W,X,Y,Z})

Here, the result obtained with worst case operations of Sharing captures the fact that subsequent grounding of Y strongly couples {W,X,Z}.
In the last line above, we also give the most accurate Sharing that directly corresponds to weak and strong equivalence classes and approximates the result of the unification. It is more concise but less accurate and suffers from the defects elaborated in section 4.1. In particular, it is never permissible to infer that variables become strongly coupled as the result of grounding some other variable.

4.4 The Domain Prop

The domain Prop proposed in [20] tracks groundness. Abstract unification in the domain Prop enjoys all three properties necessary to condense without loss of accuracy. Although it does not approximate aliasing, the approximation of groundness information can be used to establish independence of goals in Independent And-Parallelism. Elements of this domain are propositional formulas, or more precisely classes of monotonic propositional formulas, that describe groundness dependencies between variables. For example, X ↔ Y describes the set of all substitutions where X and Y are in the same state of groundness and all subsequent instantiation will leave them in the same state of groundness. Thus, X ↔ Y describes the substitutions {X ↦ a, Y ↦ b} and {X ↦ Y}, but does not describe the substitutions {X ↦ a} or {Y ↦ f(Z), X ↦ f(Q)}. Further, true describes all substitutions (⊤) and false describes no substitutions (⊥). The concretization function is given by γ(P) = {σ | ∀θ ≤ σ : P holds under θ}. Here, variable v holds under substitution σ iff σ grounds v, and the propositional connectives have their usual interpretation. Thus, for example, γ(X ∧ Y) = {σ | σ grounds X and Y}. We have aunify(a, S, b, T) = |amge(a, b̂, S ∧ T̂)| where

amge(a, b, S) = S ∧ ((⋀_{u∈var(a)} u) ↔ (⋀_{v∈var(b)} v)) if at least one of a or b is a constant or a variable, and
amge(a, b, S) = amge(aₙ, bₙ, amge(f(a₁ … aₙ₋₁), f(b₁ … bₙ₋₁), S)) if a = f(a₁ … aₙ) and b = f(b₁ … bₙ).

Here a formula is tagged by tagging the variables in the formula.
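The base case of amge and the projection performed by restriction can be illustrated concretely. The following Python sketch is our own illustration, not the dissertation's formalism: a formula is represented as a predicate over truth assignments, tagged variables are written with a trailing apostrophe, and the example unifies a = p(X, Y) with b = p(f(Q), R) under S = true and T = Q ↔ R, for which restriction should yield X ↔ Y.

```python
from itertools import product

# A formula is a predicate over truth assignments: dicts from variable
# names to booleans. Tagged variables carry a trailing apostrophe.

def conj(f, g):
    return lambda env: f(env) and g(env)

def iff_conj(avars, bvars):
    # (/\ u in avars) <-> (/\ v in bvars); all([]) == True handles constants
    return lambda env: all(env[v] for v in avars) == all(env[v] for v in bvars)

def amge(avars, bvars, f):
    # base case of the abstract mgu, after argument-wise term decomposition
    return conj(f, iff_conj(avars, bvars))

def restrict(f, tagged):
    # |F|: disjunction over all truth assignments to the tagged variables
    def g(env):
        return any(f({**env, **dict(zip(tagged, vals))})
                   for vals in product([False, True], repeat=len(tagged)))
    return g

# a = p(X, Y), b = p(f(Q), R), S = true, T = Q <-> R (tagged to Q' <-> R')
F = amge(["Y"], ["R'"], amge(["X"], ["Q'"], iff_conj(["Q'"], ["R'"])))
result = restrict(F, ["Q'", "R'"])   # describes X <-> Y
```

Enumerating the four assignments to X and Y confirms that the restricted formula is satisfied exactly when X and Y agree, i.e., it denotes X ↔ Y.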
Restriction is defined by |S| = ⋁_{τ∈Tass(S)} τ(S), where Tass(S) = V̂ar ∩ var(S) → {true, false} is the domain of truth assignments for tagged variables in the formula S. For example, for a = p(X, Y), b = p(f(Q), R), S = true, and T = Q ↔ R we have

aunify(a, S, b, T) = |amge(a, b̂, Q̂ ↔ R̂)| = |(Q̂ ↔ R̂) ∧ (X ↔ Q̂) ∧ (Y ↔ R̂)| = X ↔ Y

4.4.1 Local Soundness of aunify

Lemma 60 (Local Soundness of amge) [σ] ∈ γ(F) ⟹ [mge(a, b, σ)] ∈ γ(amge(a, b, F))

Proof: It is easily seen that the formula ⋀_{v∈var(a)} v holds under σ iff aσ is ground. Let μ = [mge(a, b, σ)]. Since θa = θb for all θ ≤ μ, it is certainly true that θa is ground iff θb is ground. Thus the formula (⋀_{u∈var(a)} u) ↔ (⋀_{v∈var(b)} v) holds under θ for all θ ≤ μ. Moreover, since μ ≤ [σ], μ ∈ γ(F). Together this establishes that μ ∈ γ(F ∧ ((⋀_{u∈var(a)} u) ↔ (⋀_{v∈var(b)} v))). □

Lemma 61 (Soundness of restriction) σ ∈ γ(F) ⟹ |σ| ∈ γ(|F|)

Proof: For each substitution σ let τ : V̂ar ∩ var(F) → Bool be the truth assignment defined by τ(v) = true iff σv is ground. Clearly, both σ and |σ| are in γ(τ(F)), and thus in γ(|F|). □

4.4.2 Algebraic Properties of aunify

Lemma 62 (Commutativity of amge) amge(a, b, amge(c, d, F)) = amge(c, d, amge(a, b, F))

Proof: By reduction to commutativity of conjunction. □

Lemma 63 (Idempotence of amge) amge(a, b, amge(a, b, F)) = amge(a, b, F)

Proof: By reduction to idempotence of conjunction. □

Lemma 64 (Additivity of amge) amge(a, b, ⋁ᵢ Fᵢ) = ⋁ᵢ amge(a, b, Fᵢ)

Proof: By reduction to distributivity of disjunction and conjunction. □

Theorem 65 (Commutativity of aunify) aunify(a₂, aunify(a₁, F, b₁, F₁), b₂, F₂) = aunify(a₁, aunify(a₂, F, b₂, F₂), b₁, F₁)

Proof: By reduction to commutativity of amge. Following the proof of commutativity of abstract unification in the Sharing domain, we introduce a second tagging scheme, double tagging. All restriction can be delayed until the last step of the evaluation because of the distributivity of disjunction and conjunction. □

The following proofs use some new concepts in reasoning about formulas that may contain doubly as well as singly tagged variables. We introduce an operator, ‖F‖, that replaces all doubly tagged variables with their corresponding untagged variables in formulas. We say a formula F is balanced iff ‖F‖ = F. For example, all formulas of the form A ↔ Ā are balanced.

Lemma 66 If F is balanced, then amge(a, b̄, amge(b̄, ā, F)) is also balanced.

Proof: Let F′ = amge(a, b̄, amge(b̄, ā, F)) = F ∧ (A ↔ B̄) ∧ (Ā ↔ B̄), where A = ⋀_{v∈var(a)} v and B = ⋀_{v∈var(b)} v. Then we have ‖F′‖ = ‖F‖ ∧ (A ↔ B) = F ∧ (A ↔ B), since F is balanced. To show F′ ⟹ ‖F′‖ it suffices to show F ∧ (A ↔ B̄) ∧ (Ā ↔ B̄) ⟹ ‖F ∧ (A ↔ B)‖. We show this by constructing a truth assignment τ′ for the variables in F ∧ (A ↔ B) from each truth assignment τ for the variables in F′: for all v ∉ V̄ar, τ′v = τv, and for v̄ ∈ V̄ar, τ′v̄ = τv. To show ‖F′‖ ⟹ F′ it suffices to show ‖F ∧ (A ↔ B)‖ ⟹ F′. This is clearly true since every truth assignment satisfying F ∧ (A ↔ B) also satisfies F. □

Theorem 67 (Idempotence of aunify) aunify(a, F′, b, aunify(b, F′, a, F)) = aunify(a, F, b, F′)

Proof: We use two different but equivalent tagging and restriction schemes for the left hand side, which is then equivalent to |‖amge(a, b̄, amge(b̄, ā, F̄))‖|. By the previous lemma this is equivalent to |F′| for F′ = amge(a, b, amge(b, a, F)). It remains to show that F′ = amge(a, b, F), which is shown by induction over the number of evaluations of amge.
□

Lemma 68 (Additivity of aunify) aunify(a, ⋁ᵢ Fᵢ, b, F′) = ⋁ᵢ aunify(a, Fᵢ, b, F′)

Proof: By reduction to additivity of amge and commutativity and associativity of disjunction. □

4.5 Summary

In this chapter we discussed limitations of previously proposed domains for approximating variable aliasing, and presented our alternative. The key idea is to approximate the set of sharing groups, rather than pairwise independence or coupling classes. The domain Sharing approximates sharing groups. We then showed how to improve approximations of sharing groups by additionally approximating "linear" variable bindings. The domain ESharing is mainly a domain for approximating aliasing that also approximates linear and free variable bindings.

A domain that approximates the set of sharing groups by listing the possible sharing groups is simplistic: the vaguest approximations are the largest and require the most computational effort. To address this problem, we presented a representation for ESharing that represents large sets of sharing groups with the so-called worst case operator.

We investigated the algebraic properties of these domains. For the previously proposed domains, the abstract unification primitive was not commutative. In particular, accuracy was lost when aliasing unification preceded grounding unifications. The lack of commutativity makes those domains less suitable for approximations using query independent abstract semantics. The abstract unification operator for the Sharing domain is commutative and Idem-3 but not additive. Therefore, this domain is most accurate as a power domain with query independent abstract semantics. The abstract unification operator for ESharing is not commutative. For this domain we give examples where query independent semantics leads to more accuracy and others where query dependent semantics leads to more accuracy.
Chapter 5 Abstract Execution

The query independent abstract semantics gives rise to a new technique for abstract execution. In a pre-processing phase, called condensing, the query independent meaning of clause bodies or procedures can be computed for the entire rule base. The result is a condensed form of the rule base which, for each clause, retains only its head and the approximation of the query independent meaning of its body. This chapter discusses several issues connected to abstract execution. First, we give a small example of condensing using the Sharing domain and use the result to propagate several approximations. We contrast this with abstract execution according to the query dependent semantics using extension tables. Extension tables are the conventional solution to the inefficiency of naive abstract execution. Then we give an example of abstract execution according to mode-oriented formulations of abstract semantics. Here we use the ESharing domain and, again, contrast the results of executing according to query dependent and query independent abstract semantics.

5.1 Abstract Execution Using Condensing

In this section, we give an example of abstract execution of the predicate

append([], L, L).
append([A|L], M, [A|N]) :- append(L, M, N).

using condensing over the domain Sharing. Recall that in this approach, the body of a clause is processed independently of specific queries. We compute a fixed array A where A[i] contains the result of processing the body of the i-th clause. The array A is computed by iterating to a fixpoint following the query-independent abstract semantics, i.e., after the n-th iteration A[i] = B′⟦bᵢ⟧ R′ₙ⟦r⟧ init(hᵢ:-bᵢ). We abstractly execute goal a under current Sharing S by taking the least upper bound of all aunify(a, S, hᵢ, A[i]) where hᵢ and a are unifiable. As a first step, we condense the append predicate over the domain Sharing. Let C₁ be the first clause for append and h₁ be its head.
Similarly, let C₂ be the second clause, h₂ be its head, and a be the goal in its body. After initialization we have A[1] = ⊥ and A[2] = ⊥. After the first cycle we have

A[1] = init(append([], L, L)) = {{L}}
A[2] = aunify(a, init(C₂), h₁, A[1]) ⊔ aunify(a, init(C₂), h₂, A[2])
     = {{A}, {M, N}} ⊔ ⊥
     = {{A}, {M, N}}

After the second cycle we have

A[2] = aunify(a, init(C₂), h₁, A[1]) ⊔ aunify(a, init(C₂), h₂, A[2])
     = {{A}, {M, N}} ⊔ {{A}, {L, N}, {L, M, N}, {M, N}}
     = {{A}, {L, N}, {L, M, N}, {M, N}}

During the third cycle we discover that the fixed point has been reached. Suppose we now want to determine the result of abstractly executing the call a = append(X, Y, Z) given that X is ground. We start with the initial Sharing S = {{Y}, {Z}, {Y, Z}}, which assumes nothing about Y and Z. The result is aunify(a, S, h₁, A[1]) ⊔ aunify(a, S, h₂, A[2]) = {{Y, Z}}. This says that Y and Z may be dependent; however, they will not have any unshared variables. Thus, if we subsequently determine that one of Y or Z becomes ground, we can deduce that the other is ground.

5.1.1 Comparison with Extension Tables

It is possible to evaluate a function of a fixpoint on a by-need basis using Extension Tables. The table stores function values for the smallest possible subset of the domain that is required to evaluate the function for a particular domain element. In this evaluation technique, the following policy for table update replaces the evaluation of the function f for a given domain element d. Return the value stored for d if the table already contains an entry for d. Otherwise, enter the tentative value ⊥ and repeatedly improve the table entry until there is no more change. Obtain an improved value for d by evaluating the definition of f(d), possibly using some tentative table values in the process. As an example we might tabulate the function C′⟦h:-b⟧ R′⟦r⟧ for the smallest possible number of pairs (a, S).
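The table-update policy just described can be sketched generically. The following Python sketch is our own illustration (not the dissertation's implementation): it looks up stored entries, otherwise installs the tentative bottom value and iterates the definition until the entry stops changing. With mutual recursion, an entry confirmed against tentative inner values may still need reevaluation, as the text notes later; this sketch ignores that subtlety.

```python
def make_tabled(step, bottom=frozenset()):
    table = {}
    def ev(d):
        if d in table:
            return table[d]      # possibly a tentative value inside a cycle
        table[d] = bottom        # tentative entry, the "bottom" of the domain
        while True:
            new = step(d, ev)    # may consult tentative entries via ev
            if new == table[d]:
                return new       # no more change: treat the entry as confirmed
            table[d] = new       # improved value; iterate toward the fixpoint
    return ev

# Toy instance: least fixpoint of reachability over a (call) graph,
# including a self-loop at node 'c'.
graph = {'a': ['b'], 'b': ['c'], 'c': ['c']}
reach = make_tabled(
    lambda n, rec: frozenset({n}).union(*[rec(m) for m in graph[n]]))
```

Calling reach('a') fills entries for 'b' and 'c' on demand; the self-recursive entry for 'c' starts from the tentative bottom and is confirmed after one improvement, mirroring the policy above.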
Here, a table value for (a, S) is improved by attempting to evaluate aunify(a, S, h, B′⟦b⟧ R′⟦r⟧ aunify(h, init(h:-b), a, S)). The i-th attempt will result in a value S′ with C′⟦h:-b⟧ R′ᵢ⟦r⟧ (a, S) ⊑ S′. Note that no attempt can result in a value exceeding C′⟦h:-b⟧ R′⟦r⟧ (a, S).

We illustrate this with the example from the previous section, that is, evaluation of B′⟦a⟧ R′⟦r⟧ S, for a = append(X, Y, Z) and S = {{Y}, {Z}, {Y, Z}}, where the rule base r contains the definition of append/3. Let hᵢ be the head of the i-th clause and a₁ be the recursive call of append in the second clause. We tabulate C′⟦h:-b⟧ R′⟦r⟧ as follows. First, the table entries for the pairs (a, S) and (a₁, S₁) are initialized with the tentative value ⊥. Here, S₁ = aunify(h₂, init(C₂), a, S) = {{M}, {N}, {M, N}}. The first improvement for the table entry at (a₁, S₁) is {{M, N}}, resulting from unification with h₁. A second evaluation for (a₁, S₁) confirms this as the fixpoint. Using this value we obtain the first improvement for the table entry at (a, S), {{Y, Z}}, which is confirmed by reevaluation. So the same result has been computed with a single additional table entry.

There are two kinds of optimizations for this tabulation technique. First, a small amount of bookkeeping can identify certain values as the final result without having to confirm them through reevaluation. Second, a single table entry can be used to compute the result of the tabulated function for several related domain elements. In the example above it is unnecessary to compute the second table entry, since it can be obtained by renaming the result of the first. The latter optimization can be combined with modes, which are discussed in the next section, to use a particularly small number of table entries to represent the tabulated function. As noted earlier, computing with Extension Tables corresponds to computing the function of a fixpoint by need.
With them, the result of a particular query a begun in a particular Sharing S can be computed. This computation does not produce complete body meanings B′⟦b⟧ R′⟦r⟧ init(h:-b), as in condensing, but rather only a finite part of such meanings; in particular, just enough to derive C′⟦h:-b⟧ R′⟦r⟧ (a, S) from B′⟦b⟧ R′⟦r⟧ aunify(h, init(h:-b), a, S). Assuming the call is fairly specific, this computation is, by itself, less work than condensing. If many calls are made, however, then the associated computation of fixpoints will be considerably more work than condensing.

Note that Extension Tables implement a form of memoizing. The previously computed table values are reused when an answer to a query begun with some Sharing is needed more than once. Reuse is facilitated by extrapolating the results of calls from the results of different, but related, calls. In particular, the result of the call (a, S) can be computed from the result of any uniform renaming of the call (a, filter(a, S)), where filter(a, S) denotes the subcomponent of S that is, in an appropriate sense, relevant to a. Thus, entries in the table are effectively for renaming equivalence classes of calls (a, S) where filter(a, S) = S. For atoms with only variables in the argument positions this corresponds directly to modes. Different calling modes require separate entries in the Extension Table. Table 5.1 shows various modes for append(X, Y, Z) reflecting different uses of the predicate.

Use                          Mode          Table Entry
find prefix                  {{X}}         ∅
find suffix                  {{Y}}         ∅
append first two arguments   {{Z}}         ∅
split last argument          {{X}, {Y}}    ∅
create difference list       {{Y}, {Z}}    {{Y, Z}}

Table 5.1: Different modes of append/3

The first row here says that the first argument of append becomes ground if the second and third arguments were initially ground. Thus, abstract execution of append(A, B, C) begun in {{A}, {D}} leads to {{D}}.
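Filtering and renaming equivalence can be sketched concretely. The following Python sketch is ours, not the dissertation's definition: since the precise filter is not spelled out here, we take the simplification of intersecting each sharing group with the call's variables, then renaming variables to argument positions so that calls differing only in variable names produce the same table key (their calling mode).

```python
def filter_sharing(avars, sharing):
    # keep the part of a Sharing relevant to the call's variables
    # (our simplification: intersect each group with those variables)
    return {g & avars for g in sharing if g & avars}

def call_key(args, sharing):
    # canonical renaming: variables -> argument positions
    pos = {v: i for i, v in enumerate(args)}
    return frozenset(frozenset(pos[v] for v in g)
                     for g in filter_sharing(frozenset(args), sharing))

# append(X, Y, Z) with X ground, and append(A, B, C) with A ground plus an
# unrelated variable W in the current Sharing, map to one table entry:
k1 = call_key(['X', 'Y', 'Z'],
              {frozenset('Y'), frozenset('Z'), frozenset('YZ')})
k2 = call_key(['A', 'B', 'C'],
              {frozenset('B'), frozenset('C'), frozenset('BC'), frozenset('W')})
```

Both calls yield the key {{1}, {2}, {1, 2}}, i.e., the "create difference list" mode of Table 5.1 up to renaming.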
The use of extension tables eliminates repeated fixpoint computations associated with related abstract calls. However, a separate fixpoint computation is still required for each mode of interest, in contrast with condensing, which always requires exactly one fixpoint computation. Generally speaking, the cost of abstract execution using condensing is equal to the cost of abstract execution using extension tables if procedures are used only in their most general modes. Extension tables are more efficient if procedures are used in a small number of specific modes, while condensing is more efficient if procedures are used in a large number of more general modes. The latter case occurs more frequently as the underlying abstract domain becomes richer, i.e., as it encompasses more dataflow information with a higher degree of accuracy.

5.2 Mode Inference using Condensing

When abstract interpretation using condensing over the domain ESharing is compared to interpretation using extension tables, the results are, in general, not comparable. Since aunify for ESharing is not commutative and lacks Idem-3, condensing and Extension Tables may give different results. For the following predicate qsort/3 the query-dependent semantics cannot infer that X in the body of the second qsort clause is free, but the query-independent semantics can. Therefore, we infer a more specific call mode for qsort using condensing than using Extension Tables.

split([], B, [], []).
split([A|L], B, [A|M], N) :- A < B, split(L, B, M, N).
split([A|L], B, M, [A|N]) :- B =< A, split(L, B, M, N).

qsort(L, S) :- qsort(L, S, []).

qsort([], S, S).
qsort([A|L], S, D) :-
    split(L, A, M, N),
    qsort(M, S, [A|X]),
    qsort(N, X, D).

We substantiate this by computing mode′, which was defined in section 3.4, using first the query-independent formulation for body meaning and then the query-dependent formulation for body meaning. First we condense the clauses.
The following array A[i] = B′⟦bᵢ⟧ R′⟦r⟧ init(Cᵢ) is computed for clause Cᵢ with body bᵢ:

A[1] = ({{B}}, {B}, {B})
A[2] = A[3] = (∅, ∅, ∅)
A[4] = ({{L, S}}, ∅, {L, S})
A[5] = ({{S}}, {S}, {S})
A[6] = ({{A, S}, {S, D, X}}, {D}, {A, S, D, X})

Here, the condensed versions of the second and third clauses used the condensed version of the Prolog built-ins </2 and >=/2, which succeed only if both arguments are ground. Now, we compute entry mode information for the clauses given an entry mode declaration for qsort/2. Let the entry mode declaration declare that qsort/2 is called with the first argument ground and the second argument a variable. Formally, this is specified by the function decl : Atom → Mode with

decl a = ({{B}}, {B}, {B}) if skel(a) = qsort(A, B), and
decl a = ⊥ otherwise.

We obtain an approximation of the current substitution after entering the clause by abstractly unifying the head and the mode information. To propagate this approximation over a call we take the least upper bound of abstract unifiers with the condensed version of the clauses. From an approximation before a call we obtain an entry mode for the call's procedure with one unification with the normalized procedure head. Below we discuss the generation of entry modes for qsort/3 in some detail. Since throughout this example all variables are either linear or ground, we omit the third component of ESharing. In Table 5.2 we list mode decl a for all skeletons a for which a corresponding call occurs in the program.

skel(a)          mode decl a
split(A,B,C,D)   ({{C}, {D}}, {C, D})
qsort(A,B)       ({{B}}, {B})
qsort(A,B,C)     ({{B}, {C}}, {B})

Table 5.2: Modes using condensing

Let Mcall : Site → ESharing be a function such that, for a fixed rule base r and declaration function decl, Mcall s is the approximation of the set of substitutions at each site in the program, based on mode′ decl, computed from enter′(i, skel(hᵢ), mode decl hᵢ). In Table 5.3 we list Mcall s for all sites s.
(i,j)          Mcall (i,j)
(2,2), (3,2)   ({{M}, {N}}, {M, N})
(4,1)          ({{S}}, {S})
(6,1)          ({{S}, {D}, {M}, {N}, {X}}, {S, M, N, X})
(6,2)          ({{S}, {D}, {X}}, {S, X})
(6,3)          ({{S, X}, {D}}, {S, X})

Table 5.3: ESharing at each site using condensing

Note that we encounter two different modes for qsort/3. The mode from Mcall (4,1) is ({{B}}, {B}), which indicates that the third argument is ground. The modes from Mcall (6,2) and Mcall (6,3) are ({{B}, {C}}, {B}), which indicate that the third argument may be non-ground. When qsort/3 is first called from site (4,1), the second recursive call enters qsort/3 with its third argument ground, but the third argument is not ground on the first recursive call. When the third argument is not ground on entry to qsort/3, the third argument of neither recursive call is. Therefore the call modes at sites (6,2) and (6,3) dominate the mode of qsort/3.

5.2.1 Mode Inference using Extension Tables

Now we compute mode′ with Extension Tables. We use them to tabulate the function B′⟦a⟧ R′⟦r⟧ E for pairs of an atom a and an ESharing E. This requires fewer entries than tabulating C′⟦Cⱼ⟧ R′⟦r⟧ (a, E) for each clause and such pairs. Since B′⟦b, a⟧ E = B′⟦a⟧ (B′⟦b⟧ E) for bodies b and atoms a, the function B′⟦ ⟧ R′⟦r⟧ can be computed for all bodies from the table entries. The entries are given only for "filtered" ESharings, which contain only variables that occur in the call of that entry. The value B′⟦a⟧ R′⟦r⟧ E for an arbitrary E can be extrapolated in a straightforward way from the filtered ESharing. The Extension Table after evaluating qsort(A, B) with ({{B}}, {B}) is given in Table 5.4.
a                  E                       B′⟦a⟧ R′⟦r⟧ E
qsort(L,S)         ({{S}}, {S})            (∅, ∅)
qsort(L,S,[])      ({{S}}, {S})            (∅, ∅)
split(L,A,M,N)     ({{M}, {N}}, {M, N})    (∅, ∅)
qsort(M,S,[A|X])   ({{S}, {X}}, {S, X})    ({{S, X}}, ∅)
qsort(N,X,D)       ({{X}, {D}}, ∅)         ({{X, D}}, ∅)
qsort(N,X,D)       (∅, ∅)                  (∅, ∅)

Table 5.4: Extension Table for the qsort/3 program

Let us follow the construction of the table entries for qsort(M, S, [A|X]) in some detail. This call is first encountered with current ESharing ({{X}, {S}}, {S, X}) in an attempt to improve the table entry for qsort(L, S, []). On first encounter it is initialized with the tentative value ⊥. Then, we take the least upper bound of the results of abstractly executing the two clauses for qsort/3. During this first improvement attempt, the second clause contributes nothing, since the table entry, ⊥, is looked up for the body meaning of the first recursive call. Note that the look-up only succeeds because of "filtering", since the current ESharing is, literally, ({{S}, {X}, {D}}, {S, X}). The first improvement attempt results in ({{S, X}}, ∅). Here, the query-dependent semantics has already missed the fact that X remains free. This is a consequence of the lack of Idem-3, which we noted in section 3.2.5. The second improvement attempt needs to evaluate the second recursive call, qsort(N, X, D), with ESharing ({{S, X}, {D}}, ∅), which is filtered to ({{X}, {D}}, ∅). After its evaluation the second improvement attempt for the entry of qsort(M, S, [A|X]) completes with ({{S, X}}, ∅) as the confirmed value.

We can compute the mode approximation, mode′ decl, and use the Extension Table above to compute B′⟦b⟧ R′⟦r⟧ S for propagate(b, S). As an aside, the sets of substitutions, at d, and the concrete modes, mode d, for conforming d with da ⊆ γ(decl a) can also be approximated by considering for which pairs the Extension Table contains entries.
It can be shown that

mode d a ⊆ γ( ⨆ { callmode′(a′, E) | (a′, E) ∈ D, skel(a′) = skel(a) } )

where D is the domain of the Extension Table, i.e., the set of pairs from Atom × ESharing for which the Table contains confirmed entries. This method of approximating modes may be more accurate than at′ decl and mode′ decl. In the qsort example, however, mode′ decl results in the same mode approximation as the method above. Table 5.5 lists mode′ decl for the skeletons of all calls in the rule base.

skel(a)          mode′ decl a
split(A,B,C,D)   ({{C}, {D}}, {C, D})
qsort(A,B)       ({{B}}, {B})
qsort(A,B,C)     ({{B}, {C}}, ∅)

Table 5.5: Modes using Extension Table

Table 5.6 lists the current ESharing at each site based on mode′ decl.

(i,j)          Mcall (i,j)
(2,2), (3,2)   ({{M}, {N}}, {M, N})
(4,1)          ({{S}}, {S})
(6,1)          ({{S}, {D}, {M}, {N}, {X}}, {S, M, N, X})
(6,2)          ({{S}, {D}, {X}}, {S, X})
(6,3)          ({{S, X}, {D}}, {S})

Table 5.6: ESharing at each site using Extension Table

Note that the modes obtained from the extension table differ from the modes obtained with condensing. In particular, the exit mode of qsort/3, which never indicates a free variable, does not permit us to infer that the variable X remains free at (6,3), that is, after executing qsort(M, S, [A|X]). Therefore the call of qsort(N, X, D) is approximated with the overly pessimistic mode ({{B}, {C}}, ∅). A more powerful abstract domain that maintains part of the term structure is necessary to infer that X remains free if extension tables are used for abstract interpretation.

5.3 Mode-Oriented Abstract Execution

The query-dependent, mode-oriented semantics presented in section 3.5 enables maximal reuse of extension table entries. A single table entry is constructed for each mode of a call. The pre-computation required for the query-independent, mode-oriented semantics condenses procedures rather than clauses.
Using the qsort program again as the example, we tabulate the function F′⟦r⟧(α, M) for skeletons α and the minimal set of modes M that are needed to evaluate R′⟦r⟧ E for some atom a and ESharing E. The result of evaluating qsort(L, S) with ({{S}}, {S}) is given in Table 5.7.

α                E
split(A,B,C,D)   ({{C}, {D}}, {C, D})    (∅, ∅)
qsort(A,B)       ({{B}}, {B})            (∅, ∅)
qsort(A,B,C)     ({{B}}, {B})            (∅, ∅)
qsort(A,B,C)     ({{B}, {C}}, {B})       ({{B, C}}, ∅)
qsort(A,B,C)     ({{B}, {C}}, ∅)         ({{B, C}}, ∅)

Table 5.7: Extension Table for mode-oriented semantics

Table 5.8 lists the condensed procedures of the qsort program.

α                condensed meaning
split(A,B,C,D)
qsort(A,B)       ({{A, B}}, ∅)
qsort(A,B,C)     ({{A, B}, {B, C}}, {C})

Table 5.8: Condensed procedures

The condensed procedures accurately reflect the degenerate case of sorting a list with a single variable element.

5.4 Fast Condensing

The condensed version of a rule base is a fixpoint. Some special optimizations are possible to find this fixpoint. The algorithm below partitions the rule base into groups of mutually recursively defined predicates. These groups can be condensed independently. Thus, it is possible for each group to use the condensed version for all predicates except those that are defined in this group. For each procedure P we can determine which other procedures must be condensed together with P. If Q is called by P and Q calls P, then Q and all procedures that must be condensed together with Q must be condensed together with P. If Q is called by P and calls no procedures that both are called by P and call Q, then Q can be condensed independently from P. The following algorithm maintains a stack of procedures that we are trying to condense. Every procedure in the stack is called by the procedures below it. The pseudo-code function group(P, CallingP) returns the group of procedures that must be condensed together with P.
The sequence of procedures in CallingP is known to call P, and within the sequence each procedure is called by the ones following it.

group(P, CallingP)
    G_P = ∅
    for Q called by P with Q ≠ P and uncondensed(Q) do
        if CallingP contains Q preceded by the sequence CalledByQ then
            add Q as well as all of CalledByQ to G_P
        else
            let G_Q = group(Q, [P | CallingP]) in
                if G_Q contains P then
                    add G_Q to G_P
                else
                    condense G_Q together with Q now
    od
    return G_P

Within each group, we can rewrite the clauses so that all calls to predicates that have not been condensed are at the end. After the group has been rearranged in this way, the prefix of calls with known condensed versions is evaluated and the result stored. At each subsequent iteration, only clauses containing calls to predicates in the current group need to be evaluated, and within these clauses only those calls. This optimization is sound since the order of goals in a body does not affect its meaning. It exploits the fact that, during condensing, we are only interested in approximating the result of executing the whole body; we are not interested in approximating the set of substitutions at all points within the body of the clause. The optimization may affect the accuracy of the approximation of body meaning, however. If aunify is commutative, the order of goals in a body b does not change the mode-oriented, query-independent abstract body meaning. If aunify is commutative and additive, the order of goals does not change the query-independent abstract body meaning. Note that, in the case of the domain ESharing, whose aunify is not commutative, the optimization may actually improve the accuracy. Consider, for example, a variable that occurs repeatedly in a recursive call and that will be grounded after the recursive call in the original goal order.
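The groups computed by group are exactly the strongly connected components of the call graph, and condensing them callees-first gives the order the text requires. The following Python sketch is our own illustration using the standard Tarjan SCC algorithm rather than the exact stack discipline of the pseudo-code above; predicate names and the toy call graph are assumptions for the example.

```python
def condensation_order(calls):
    # calls: dict mapping each procedure to the procedures it calls.
    # Returns the SCCs (groups of mutually recursive procedures) in an
    # order where every group appears after the groups it depends on,
    # i.e., a safe condensing order.
    index, low, onstack, stack, sccs = {}, {}, set(), [], []
    counter = [0]

    def strongconnect(p):
        index[p] = low[p] = counter[0]; counter[0] += 1
        stack.append(p); onstack.add(p)
        for q in calls.get(p, []):
            if q not in index:
                strongconnect(q)
                low[p] = min(low[p], low[q])
            elif q in onstack:
                low[p] = min(low[p], index[q])
        if low[p] == index[p]:          # p is the root of a component
            group = set()
            while True:
                q = stack.pop(); onstack.discard(q); group.add(q)
                if q == p:
                    break
            sccs.append(group)          # emitted callees-first
    for p in calls:
        if p not in index:
            strongconnect(p)
    return sccs

# qsort/3 calls split/4 and itself; split/4 only calls itself:
order = condensation_order({'qsort/3': ['split/4', 'qsort/3'],
                            'split/4': ['split/4']})
```

Here split/4 forms its own group and is condensed before the group containing qsort/3, so the qsort group can use split's already-condensed version, as described above.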
Here, approximation of the body in the original order may approximate an intermediate non-linear variable binding that degrades accuracy.

5.5 Summary

In this chapter we elaborated on the differences in abstract execution implied by the query dependent formulation on one hand and the query independent formulation on the other. The query independent formulation encourages pre-computation of the clause body meaning. This phase is called condensing. Condensing requires iteration to some fixpoint. The contribution of a clause to the meaning of a query is obtained from its condensed version with a single unification. The query dependent formulation defines an abstract meaning function as a certain fixpoint. This function can be evaluated "by need" for particular elements of its domain. The extension of the function may be tabulated as an Extension Table. Every new entry in the Extension Table requires iterating to a fixpoint. Previously computed entries may simply be looked up. The effort required by the condensing phase in abstract execution according to query independent semantics corresponds to computing table entries for the most general call of all clauses according to the query dependent semantics. Therefore, abstract execution using condensing is more efficient than abstract execution with extension tables if all clauses are invoked in several ways, including the most general invocation. For practical programs a number of factors influence the efficiency of abstract execution. The cost of computing a table entry depends on the number of steps needed to confirm that a fixpoint has been reached as well as the cost of abstract unification with various intermediate approximations. An abstract domain element with a small representation requires less effort as an argument to abstract unification and may require fewer iteration steps.
Therefore, abstract execution with extension tables is more efficient than abstract execution using condensing if procedures are called with few distinguishable "modes" and these modes have smaller representations than the most general procedure invocation. These two insights are consistent with the variation of execution time for test runs of our &-Prolog compiler, which analyzes variable aliasing to automatically uncover opportunities for Independent And-Parallelism. This compiler compiles a Prolog program into an &-Prolog program, using mode declarations and abstract interpretation over the ESharing domain to omit unnecessary run-time tests. This application of variable aliasing analysis and the compilation problem are described in chapter 6. Our &-Prolog compiler exists in two versions: one uses extension tables, the other uses condensed procedures. Table 5.9 shows the total runtime for the two versions on three example programs. The circuit program finds the smallest circuit using NAND gates and inverters to fit a given truth table. The nature of the program is "search"; the invocations of the program's clauses are very general. For the qsort program with difference lists, the version with extension tables fails to recognize that the second argument will always be free, and therefore computes three table entries for qsort. But for all three entries the fixpoint can be confirmed very quickly assuming the first argument is ground. The prover program is apparently a transliterated Lisp program. All procedures are consistently invoked with a unique mode. Moreover, each argument of these modes is either ground or uninstantiated. For at least one procedure of the prover program, approximating the most general call is much more expensive than computing the table entry for the very restricted mode in which this procedure is actually used.
Program    Etab Version    Condensing Version    Ratio
circuit    19416           11745                 1.65
qsort      6883            6478                  1.06
prover     26572           34011                 0.78

Table 5.9: Average runtime of &-Prolog compiler (in ms)

Beyond the improved efficiency of abstract execution using condensing in programs of the "search" nature, we expect condensing to scale up better. A more refined abstract domain that captures more run-time properties will distinguish more "modes". Abstract execution with extension tables will compute a separate table entry for each of the distinguishable modes, whereas the effort of condensing remains comparable to computing a single entry, namely for the most general invocation.

A condensed procedure completely describes the abstract meaning of the procedure, rather than just its value for a few selected domain elements. This makes new kinds of inferences feasible. In [17] we propose to direct specialization of procedures from the knowledge about particularly beneficial optimizations. We may want to compile a very efficient loop and guard its entry with a run-time test whose success guarantees the safety of the optimization used to obtain the efficient loop. Here, we must describe queries that will result in given, desirable run-time states in the loop body. With condensed procedures, it may become feasible to search systematically for a description of queries that will result in certain answers. Such inference is not feasible if a description of safe entry modes must be obtained by generating and testing, where each test may require iteration to a fixpoint. In [17] we construct representations of the meaning of execution paths that are akin to condensed procedures. We also give an example of an algorithm that, for a certain abstract domain, directly computes a description of safe entry modes from the "condensed execution path meaning" and a given optimization safety requirement.
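The extension-table scheme contrasted with condensing above can be rendered as a memoized fixpoint loop. The following Python sketch is our own illustration, not the dissertation's implementation; the abstract domain, the names `step` and `bottom`, and the toy transfer function are hypothetical stand-ins. It shows why every new call pattern costs an iteration to a fixpoint while a repeated call is a mere lookup.

```python
# Sketch of an extension-table style analysis loop: each *new* table
# entry is seeded with `bottom` (to break recursive cycles) and iterated
# to a fixpoint; a previously computed entry is simply looked up.

def analyze(call, step, bottom, table):
    """Return the abstract answer for `call`, memoizing it in `table`.
    `step(call, recurse)` computes one approximation of the answer and
    uses `recurse` for abstract calls (possibly to `call` itself)."""
    if call in table:                 # previously computed: just look up
        return table[call]
    table[call] = bottom              # seed the new entry
    while True:                       # iterate this one entry to a fixpoint
        new = step(call, lambda c: analyze(c, step, bottom, table))
        if new == table[call]:
            return new
        table[call] = new

# A toy, self-recursive transfer function: the answer for a call pattern
# accumulates the pattern itself; the seeded bottom breaks the cycle.
def step(call, recurse):
    return recurse(call) | {call}

table = {}
analyze("p(ground)", step, frozenset(), table)
```

A second call with the same pattern hits the table immediately; a condensing analyzer would instead compute, once, the entry for the most general call.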
Chapter 6

Independent And-Parallelism

6.1 Introduction

Logic Programs can exploit several forms of parallelism. In And-Parallelism subgoals of a goal are solved in parallel. In Independent And-Parallelism, goals may execute in parallel only if their solutions do not restrict each other's search space. This is guaranteed by requiring that concurrently executing goals share no variables. Whether two goals share a variable is a run-time property, since a textually shared variable may be bound to a ground term, or textually different variables may be aliased. For example, under Independent And-Parallelism, the recursive call of all_p in

all_p([], _).
all_p([H|T], X) :- p(H, X), all_p(T, X).

may be executed concurrently with the call of p only if X is ground and H and T are independent, i.e., are bound to terms that do not share common variables. Determining the independence of goals is an important application of variable aliasing analysis.

Restricted And-Parallelism (RAP) further restricts Independent And-Parallelism. Here, goals are grouped at compile-time and the parallel execution of two groups is guarded by a simple run-time test. Success of the run-time test must guarantee independence of the groups. Programs are compiled into special control expressions, called Execution Graph Expressions (EGEs), which initiate goals on the basis of tests on program variables.

Generating EGEs is difficult for two reasons. First, determining whether an EGE is correct for a clause may entail considerable reasoning about tests and goals. Second, choosing from among various correct alternatives may require a rather subjective assessment of their benefits. This is because different EGEs may achieve more parallelism in different circumstances. Moreover, an EGE that achieves a great deal of parallelism may be unacceptably large. In this chapter, we introduce a novel approach to generating correct EGEs and choosing between them.
In our approach, a clause is first converted into a graph-based computational form, called a Conditional Dependency Graph (CDG), which achieves Maximal And-Parallelism (MAP). MAP is the maximal and-parallelism possible while maintaining correctness. This CDG is then gradually transformed into an EGE, potentially at a loss of parallelism, using two rewrite rules operating on hybrid expressions. Since these rules are sound, in the sense that they always produce correct results from correct sources, correct EGEs are always produced. Compilation algorithms are defined within this framework by giving heuristics for choosing where and how rules should be applied.

This chapter is organized as follows. In section 6.2, we define EGEs and motivate the compilation problem. In section 6.3, we define correctness and MAP. In section 6.4, we define CDGs and prove that a clause can be converted into a CDG which achieves MAP. In section 6.5, we introduce our compilation framework, define the rules for transforming CDGs into EGEs, and prove their soundness. In section 6.6, we discuss a particular heuristic for generating EGEs that attempts to minimize the loss of parallelism incurred by the restricted nature of EGEs.

6.2 EGE Compilation

Recall from section 6.1 that in RAP only independent goals are supposed to execute in parallel. RAP permits run-time testing to establish the independence of goals. DeGroot [14] suggested Execution Graph Expressions, which mix tests and Prolog code. Hermenegildo [15] presented a Prolog-like syntax for EGEs, called &-Prolog, which we adopt. In this section we define EGEs in &-Prolog syntax and introduce the compilation problem.

6.2.1 Execution Graph Expressions

We use the following terminology in discussing EGEs. Variable identifiers appear textually within goals, e.g., X and Y appear in a(X, Y), and are bound to terms when a program executes.
Variables are unnamed objects which appear inside terms and may be unified with other terms. Let variable identifiers X and Y be bound to terms t and u respectively. X becomes further instantiated when a variable in t becomes unified with a term. X is ground if t contains no variables. X and Y are dependent if t and u contain a common variable, otherwise they are independent. Two goals are dependent if there is a variable identifier of the first goal and a variable identifier of the second goal which are dependent, otherwise they are independent.

EGEs are defined inductively as follows. Every goal is an EGE and for EGEs E1 and E2 the following are EGEs.

SEQ expression   (E1, E2)
PAR expression   (E1 & E2)
IF expression    (C -> E1 ; E2)
CPAR expression  (C => E1 & E2)

Execution of an EGE proceeds from the outermost expression inwards. A SEQ expression executes the subexpressions sequentially from left to right. A PAR expression executes the subexpressions concurrently. An IF expression executes E1 or E2 depending on whether the condition C is true or false. A CPAR expression may be viewed as an abbreviation for (C -> E1 & E2 ; E1, E2).

A condition C is a conjunction of basic tests on variable identifiers and is used to determine whether goals are independent. Two kinds of basic tests are provided: L i M holds only if all variables in the list of variables L are independent from the variables in M, e.g., X and Y are independent if [X] i [Y] holds; gL holds only if all variables in the list of variables L are ground, e.g., X is ground if g[X] holds.

We now give several examples of EGEs. The clause h(X, Y) :- a(X), b(Y) can be compiled into the EGE

h(X, Y) :- [X]i[Y] => a(X) & b(Y)

since a(X) and b(Y) can execute in parallel as long as X and Y are independent. Similarly, the clause h(X) :- a(X), b(X) can be compiled into the EGE

h(X) :- g[X] => a(X) & b(X)

since a(X) and b(X) can execute in parallel as long as X is ground.
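As an illustration (ours, not part of the dissertation), the four EGE forms above can be rendered as nested Python tuples, with an evaluator that, given the truth of each basic condition, lists the goal groups in the order they are started; goals inside one group may run in parallel. The lock-step merging of PAR phases is a simplification of the truly asynchronous execution.

```python
# Toy rendering of EGE forms as tuples:
#   ("goal", g), ("seq", E1, E2), ("par", E1, E2),
#   ("if", C, E1, E2), ("cpar", C, E1, E2)
# `holds(C)` decides a basic condition such as "g[X]" or "[X]i[Y]".

def phases(ege, holds):
    kind = ege[0]
    if kind == "goal":
        return [[ege[1]]]
    if kind == "seq":                       # (E1, E2): E2 strictly after E1
        return phases(ege[1], holds) + phases(ege[2], holds)
    if kind == "par":                       # (E1 & E2): start both together
        p1, p2 = phases(ege[1], holds), phases(ege[2], holds)
        merged = [a + b for a, b in zip(p1, p2)]
        return merged + p1[len(p2):] + p2[len(p1):]
    _, c, e1, e2 = ege
    if kind == "if":                        # (C -> E1 ; E2)
        return phases(e1 if holds(c) else e2, holds)
    # CPAR (C => E1 & E2) abbreviates (C -> E1 & E2 ; E1, E2)
    return phases(("par" if holds(c) else "seq", e1, e2), holds)

ege = ("cpar", "[X]i[Y]", ("goal", "a(X)"), ("goal", "b(Y)"))
```

For the first example clause, phases(ege, lambda c: True) yields [['a(X)', 'b(Y)']], while phases(ege, lambda c: False) yields [['a(X)'], ['b(Y)']]: the failed test degrades the CPAR to sequential execution.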
The clause h(X,Y) :- a(X), b(Y), c(X, Y) can be compiled into the EGE

h(X, Y) :- ( g[X]    -> a(X) & (g[Y] => b(Y) & c(X, Y))
           ; g[Y]    -> b(Y) & (a(X), c(X, Y))
           ; [X]i[Y] -> (a(X) & b(Y)), c(X, Y)
           ; a(X), b(Y), c(X, Y) ).

which executes one of four sub-EGEs depending on X and Y.

6.2.2 The Compilation Problem

In this section, we motivate our compilation techniques by way of several examples. These examples show that a compiler must be able to

• reason about independence of goals to ensure correctness,
• accurately determine when losses of parallelism occur, and
• choose between different alternatives when losses of parallelism occur or when the "perfect" expression is unacceptably large.

To determine whether an EGE is correct, a compiler must be able to infer that, whenever two goals might execute in parallel, the sequence of tests that lead to that point was sufficient to ensure that the goals were independent. As an example, consider the first branch of the last EGE in the previous section. Since X is ground on this branch, we know X and Y are independent so a(X) can execute in parallel with the other goals. In addition, b(Y) and c(X, Y) can execute in parallel if Y is ground. Similar arguments show that the other branches are correct. In general, the inferencing necessary to determine whether an EGE is correct can be quite complex. For example, in the EGE [X]i[Y] -> E1, E2 ; E3 it may be necessary to analyze E1 to determine whether [X] i [Y] still holds when E2 starts executing.

The following example demonstrates that the linear form of EGEs restricts the amount of and-parallelism that can be achieved. In this example, and throughout this chapter, we consider only those EGEs which lead to dependent goals executing in the same order as they appear in the original clause. This restriction is in the spirit of Prolog; we rely on the programmer to determine the best order in which to execute dependent goals.
The limitations of EGEs are still present if goal reordering is permitted. Consider the clause

h(X, Y) :- a(X), b(X), c(Y), d(X, Y).

An EGE of the form g[X] -> E1 ; g[Y] -> E2 ; [X]i[Y] -> E3 ; E4 tests all relevant initial cases for this clause and therefore can achieve the maximal possible parallelism. Now consider the third branch, where X and Y are independent but neither is ground. Suppose we decide to construct E3 using the subexpression (a(X), b(X)) & c(Y), for which there is no loss of parallelism. Under no condition can d(X, Y) be initiated in parallel with this subexpression, since d(X, Y) cannot start until after a(X) finishes. We must therefore compose d(X, Y) sequentially after this subexpression. But if a(X) grounds X and c(Y) finishes before b(X), then d(X,Y) will needlessly wait for b(X) to finish. As an alternative, if we try starting with a(X) & c(Y) then b(X) will end up needlessly waiting for c(Y) to finish. Finally, if we let a(X) execute on its own, then c(Y) will end up needlessly waiting for a(X) to finish. Thus, every possible EGE for E3 leads to losses in parallelism. This is because initiation of a goal cannot be made dependent on the completion of an arbitrary collection of other goals. Note that none of these expressions is clearly the best with respect to parallelism alone, since for each of them there are circumstances where it does better than the others.

We conclude this section with a discussion of certain implementation details of the RAP model and the way in which these details affect compilation. In general, evaluation of a test such as [X] i [Y] requires scanning the terms bound to X and Y at run-time. Since such terms can be arbitrarily large and may have to be scanned quite frequently, testing can introduce considerable overhead. An important part of the RAP model is an algorithm which efficiently computes an approximation to these tests.
This algorithm is safe in the sense that it never allows a test to succeed when it should have failed. Thus, independent goals may be viewed as being dependent but not the converse. It is our view that the inexactness of tests should not directly affect the compilation process; the same strategy for deriving EGEs should be used regardless of whether tests are exact or inexact. However, the inexactness of tests should indirectly affect compilation. Consider the EGE [X]i[Y] -> E1, E2 ; E3 introduced earlier. Suppose analysis of E1 shows that [X] i [Y] still holds when E2 starts executing. If the tests were exact, we would be at liberty to retry [X] i [Y] in E2 anyway; at worst we would be performing a redundant test. However, since the tests are inexact, if we retry [X] i [Y] in E2 it may fail, thereby reducing the amount of parallelism achieved. Thus, it is important to avoid redundant testing as much as possible.

6.3 Maximal And-Parallelism

In this section, we present a formal notion of correctness and define MAP to be the maximal and-parallelism possible while maintaining correctness. Execution of a clause begun in some initial state is correct if the following two restrictions are observed.

1. Dependent goals never execute concurrently.
2. Dependent goals never execute out of order.

The second restriction is in the spirit of Prolog; we rely on the programmer to determine the best order in which to execute dependent goals. These restrictions are characterized by the following constraint on scheduling.

Definition 19 (Constraint on Scheduling) (*) A goal should not be initiated while there is a dependent goal to its left which has not finished executing.

Theorem 69 ((*) Characterizes the Correctness of Execution) Dependent goals never execute concurrently or out of order if and only if (*) is observed.

Proof: Only if: Suppose (*) is not observed.
At the point (*) is violated, a goal B is initiated while there is a dependent goal A to its left which has not finished executing. If A is currently executing, then two dependent goals are executing concurrently, violating the first restriction. If A is not executing, then A and B are executing out of order, violating the second restriction.

If: Suppose (*) is observed. First, we show that concurrently executing goals A and B, A left of B, must be independent. Under (*), A and B must be independent at the time B is initiated. We can show that they will remain independent as long as B is executing since no goal can get access to variables of both A and B. Second, we show that dependent goals A and B, A left of B, cannot execute out of order. Under (*), B cannot be initiated until A has finished. □

MAP is the maximal and-parallelism possible while maintaining correctness.

Definition 20 (Scheduling Discipline for MAP) (**) A goal should be initiated as soon as all dependent goals to its left have finished executing.

Definition 21 (Correctness and MAP) Let E be the representation of a clause in some computational formalism such as EGEs. E is correct if execution of E begun in any initial state observes (*). E achieves MAP if execution of E begun in any initial state implements (**).

6.4 Conditional Dependency Graphs

In this section, we introduce CDGs and prove that a clause can be converted into a CDG which achieves MAP. A CDG is a directed acyclic graph where the vertices are goals and each edge is labeled by a condition. The CDG Θ(L) associated with a clause L has a vertex for each goal of L and an edge from goal A to goal B if A is left of B. The condition labeling edge (A, B) is indep(A, B), defined as follows.
Definition 22 (indep) The condition indep(A, B) is the conjunction of the following tests: g[X] for every variable identifier X occurring in both A and B, and [X] i [Y] for every variable identifier X occurring in A but not B and identifier Y occurring in B but not A.

It is easy to show that goals A and B are independent if and only if the condition indep(A, B) holds. Note that there is no need to test whether a variable identifier occurring in both goals is independent from other variable identifiers since we require that it be ground. As an example, Figure 6.1 shows the CDG for the clause h(X,Y) :- a(X), b(Y), c(X,Y), which was presented in section 6.2.

[Figure 6.1: Example Conditional Dependency Graph — vertices a(X), b(Y), and c(X,Y); edge (a(X), b(Y)) labeled [X]i[Y], edge (a(X), c(X,Y)) labeled g[X], edge (b(Y), c(X,Y)) labeled g[Y]]

The CDG execution model is as follows.

Definition 23 (CDG Execution Model) Perform the following two step execution cycle repeatedly. A cycle should start as soon as a goal finishes or a variable identifier is further instantiated.
1) Edge Removal: Remove every edge whose origin has finished executing. If the conditions hold on all edges going into a goal, then remove those edges.
2) Goal Initiation: Initiate all goals with no incoming edges.

The following theorem shows that Θ(L) achieves MAP.

Theorem 70 (Θ(L) achieves MAP) Execution of Θ(L) begun in any initial state implements (**).

Proof: We must show that a goal B is initiated during execution of Θ(L) as soon as all dependent goals to its left have finished executing. Consider the change which results in all goals left of B being either finished or independent of B for the first time. A CDG execution cycle will start as soon as this change occurs. Since all edges going into B come from goals on its left, all such edges will be removed after the first step of this cycle. Thus, B will be initiated in the second step of this cycle. It remains to be shown that B could not have been initiated in a previous cycle.
During every previous cycle, there was some dependent goal A to B's left which had not finished executing. The edge (A, B) initially in Θ(L) could not have been removed after step one in any of these cycles. Thus, there was always an edge going into B and B could not have been initiated in step two. □

6.5 A Compilation Framework

In this section we introduce our framework for compiling clauses into EGEs. In our approach, a clause L is first converted into the CDG Θ(L) which achieves MAP. This CDG is then gradually transformed into an EGE, potentially at a loss of parallelism, using two rewrite rules operating on hybrid expressions. We show the rules are sound in the sense that they always transform correct hybrids into correct hybrids. This ensures that any EGE derived from Θ(L) will be correct since Θ(L) is correct.

The hybrid expressions consist of EGEs with CDGs as well as goals as elementary components. The rewrite rules replace a CDG within a hybrid by an EGE containing more refined sub-CDGs. When an IF expression is introduced, facts about dependencies between variable identifiers become known. Such facts are accumulated in a context. The effective condition on an edge of a sub-CDG in a given context is, in general, simpler than that edge's label. For a fixed context, we may say that an edge is removed when its effective condition is true, that is, when its label is implied by the context. Formally, we introduce the notion of an Extended CDG (ECDG), (Γ, C), consisting of a CDG, Γ, together with a context, C. Hybrid expressions consist of EGEs with ECDGs as well as goals as elementary components.

A context of a CDG Γ is an element from an abstract domain in the sense of chapter 3. It describes the set of substitutions that are possible before Γ executes.
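Definition 22 and the construction of Θ(L) from section 6.4 can be transcribed directly. The following Python sketch is our own illustration: each goal is represented by its set of variable identifiers, and the edge labels are lists of basic tests read as conjunctions.

```python
# Build indep(A, B) from the variable sets of two goals (Definition 22):
# g[X] for each shared identifier, [X]i[Y] for each cross pair of
# identifiers occurring in only one of the goals.

def indep(vars_a, vars_b):
    shared = vars_a & vars_b
    tests = ["g[%s]" % x for x in sorted(shared)]
    tests += ["[%s]i[%s]" % (x, y)
              for x in sorted(vars_a - shared)
              for y in sorted(vars_b - shared)]
    return tests

# Theta(L): one labeled edge (A, B) for every goal A left of goal B.
def cdg(goals):                      # goals: list of (name, variable set)
    return {(a, b): indep(va, vb)
            for i, (a, va) in enumerate(goals)
            for (b, vb) in goals[i + 1:]}

edges = cdg([("a(X)", {"X"}), ("b(Y)", {"Y"}), ("c(X,Y)", {"X", "Y"})])
```

For the clause of Figure 6.1 this yields exactly the three labeled edges of the figure: [X]i[Y] on (a(X), b(Y)), g[X] on (a(X), c(X,Y)), and g[Y] on (b(Y), c(X,Y)).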
We introduce four functions dealing with contexts:

initC : Clause → Context
propagate : Hybrid × Context → Context
simplify-label : Context → Atom × Atom → Test
T : Test → {true, false} → Context → Context

Our compilation framework is parameterized by these functions. As a very simple instance we present definitions for the functions in this section that require only local analysis. In this simple instance, the domain of contexts is P(Var) × P(Var) × P(Var × Var) × P(Var × Var). An element (G, N, I, D) represents the knowledge that the variables in G and N are definitely ground and definitely non-ground, respectively, and the pairs of variables in I and D are definitely independent and definitely dependent, respectively. The least upper bound of contexts is obtained by intersecting the tuple components. In section 6.5.1 we show how to compile CDGs using ESharing for contexts and global analysis for accurate definitions of initC and propagate.

For a clause L the context initC(L) describes the set of substitutions that are possible before any goals of L's body have executed. For the simple context domain, it is defined initC(h:-b) = (∅, B, B × B, ∅) where B is the set of local variables, var(b) \ var(h). The context propagate(T, C) approximates the set of substitutions that are possible after execution of T, if it started in context C. Here, we define propagate(T, (G, N, I, D)) = (G, N \ var(T), ∅, ∅), which is sound but rather imprecise.

For a testable condition, T, we distinguish its meaning as a condition, ⟦T⟧, which is the set of substitutions that satisfy it, from an interpretation of T as a pair of functions that modify the context based on an assumption of the test outcome, T⟦T⟧ ∈ {true, false} → Context → Context. For example, the testable condition g[X] corresponds to an executable program that tests whether X is ground. The meaning of this condition is the set of substitutions that satisfy it.
In this case, ⟦g[X]⟧ = {σ | var(σ(X)) = ∅}. Compound conditions have the standard meaning: ⟦T, T′⟧ = ⟦T⟧ ∩ ⟦T′⟧, ⟦T ; T′⟧ = ⟦T⟧ ∪ ⟦T′⟧, and ⟦not T⟧ = Subst \ ⟦T⟧. The interpretation of a testable condition, T, as context modifiers, T⟦T⟧, relates T to the chosen abstract domain of contexts. This interpretation specifies a sound approximation of the set of substitutions for each of the two possible outcomes of the test based on an approximation before the test. The soundness requirement for a test interpretation of a testable condition T is

γ(C) ∩ ⟦T⟧ ⊆ γ(T⟦T⟧ true C)
γ(C) \ ⟦T⟧ ⊆ γ(T⟦T⟧ false C)

Given the soundness of basic tests it is easily seen that the following rules for compound tests are sound.

T⟦T, T′⟧ true C = T⟦T⟧ true (T⟦T′⟧ true C)
T⟦T, T′⟧ false C = (T⟦T⟧ false C) ⊔ (T⟦T′⟧ false C)
T⟦T ; T′⟧ true C = (T⟦T⟧ true C) ⊔ (T⟦T′⟧ true C)
T⟦T ; T′⟧ false C = T⟦T⟧ false (T⟦T′⟧ false C)

The test interpretation for the simple domain is fully defined by defining the interpretation of basic tests.

T⟦g[X]⟧ true (G, N, I, D) = (G ∪ {X}, N, I, D)
T⟦g[X]⟧ false (G, N, I, D) = (G, N ∪ {X}, I, D)
T⟦[X] i [Y]⟧ true (G, N, I, D) = (G, N, I ∪ {(X, Y), (Y, X)}, D)
T⟦[X] i [Y]⟧ false (G, N, I, D) = (G, N, I, D ∪ {(X, Y), (Y, X)})

Finally, we introduce the function that simplifies an edge label in a given context. We abbreviate simplify-label C (A, B) as C_AB. Success of C_AB must imply the independence of A and B. Formally, the soundness requirement for label simplification is:

⟦C_AB⟧ ∩ γ(C) ⊆ ⟦indep(A, B)⟧

We lift this definition of C_AB to CDGs by writing C_αβ for the conjunction of C_AB for all goals A ∈ α and B ∈ β. For the simple domain we define

(G, N, I, D)_AB = false, if S ∩ N ≠ ∅ ∨ (A′ × B′) ∩ D ≠ ∅;
(G, N, I, D)_AB = g(S \ G) ∧ ⋀_{X ∈ A′\G, Y ∈ B′\G} [X] i [Y], otherwise;

where S = var(A) ∩ var(B), A′ = var(A) \ S, and B′ = var(B) \ S. We assume that g∅ is simplified to true.
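The simple domain just defined can be prototyped in a few lines. This is our own sketch, not the dissertation's code: contexts are 4-tuples (G, N, I, D) of frozensets, `t_ground` and `t_indep` implement the basic-test interpretation T⟦·⟧, and `label` implements the label simplification (G, N, I, D)_AB as given above.

```python
# Contexts are (G, N, I, D): definitely ground / non-ground variables and
# definitely independent / dependent ordered pairs.

def t_ground(x, outcome, ctx):          # T[g[X]] true / false
    g, n, i, d = ctx
    return (g | {x}, n, i, d) if outcome else (g, n | {x}, i, d)

def t_indep(x, y, outcome, ctx):        # T[[X]i[Y]] true / false
    g, n, i, d = ctx
    p = {(x, y), (y, x)}
    return (g, n, i | p, d) if outcome else (g, n, i, d | p)

def label(ctx, vars_a, vars_b):         # simplified edge label (ctx)_AB
    g, n, i, d = ctx
    s = vars_a & vars_b                 # S, A', B' as in the definition
    a2, b2 = vars_a - s, vars_b - s
    if (s & n) or any((x, y) in d for x in a2 for y in b2):
        return ["false"]
    tests = ["g[%s]" % x for x in sorted(s - g)]
    tests += ["[%s]i[%s]" % (x, y)
              for x in sorted(a2 - g) for y in sorted(b2 - g)]
    return tests or ["true"]            # the empty conjunction is true
```

For instance, after T⟦g[X]⟧ true the label for the edge (a(X), c(X,Y)) simplifies to true, and after T⟦g[X]⟧ false it simplifies to false, as in the derivation example of the next section.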
Our first rewrite rule, called the Split Rule, introduces PAR, SEQ, and CPAR expressions.

Definition 24 (The Split Rule) Input to the Split Rule consists of an ECDG (Γ, C) and a partitioning of Γ into two sub-CDGs α and β. The Split Rule may be applied only if there are no edges from β to α. If C_αβ = true, then the result of the Split Rule is the PAR expression ((α, C) & (β, C)). If C_αβ = false, then the result is the SEQ expression ((α, C), (β, propagate(α, C))); otherwise the result is the CPAR expression (T => (α, C) & (β, (T⟦T⟧ true C) ⊔ propagate(α, T⟦T⟧ false C))), where T = C_αβ.

Our second rewrite rule, called the If Rule, introduces IF expressions.

Definition 25 (The If Rule) Input to the If Rule consists of an ECDG (Γ, C) and a testable condition T. The result of the If Rule is (T -> (Γ, T⟦T⟧ true C) ; (Γ, T⟦T⟧ false C)).

As an example, we use the two rewrite rules to derive the EGE

( g[X]    -> a(X) & (g[Y] => b(Y) & c(X, Y))
; g[Y]    -> b(Y) & (a(X), c(X, Y))
; [X]i[Y] -> (a(X) & b(Y)), c(X, Y)
; a(X), b(Y), c(X, Y))

given for the clause h(X,Y) :- a(X), b(Y), c(X,Y) in section 6.2. The CDG associated with this clause was given in section 6.4. With our simple domain, the initial context is (∅, ∅, ∅, ∅) and restricts nothing. We first apply the If Rule with the condition g[X]. In the "then" branch, both edges coming out of a(X) are removed under the resulting context ({X}, ∅, ∅, ∅). Applying the Split Rule to separate a(X) from the other goals results in the PAR expression with a sub-CDG in the second half. Splitting this sub-CDG results in the CPAR for b(Y) and c(X,Y). In the "else" branch the new effective condition on the edge from a(X) to c(X, Y) is false. We now apply the If Rule with the condition g[Y] to this CDG. In the new "then" branch the edges from a(X) to b(Y) and b(Y) to c(X,Y) are both removed.
Applying the Split Rule to separate b(Y) from the other goals results in the PAR expression with a sub-CDG in the second half. Splitting this sub-CDG results in the SEQ expression for (a(X), c(X,Y)). The other two branches are derived in an analogous manner. Note that the final "else" clause a(X), (g[Y] => b(Y) & c(X,Y)) is also derivable. And, indeed, if X and Y are dependent then execution of a(X) may ground Y.

We now prove the soundness of the rewrite rules. To accomplish this we define an execution model for hybrid expressions which is a straightforward combination of the EGE and CDG execution models. At the outermost level, hybrids execute according to the EGE model. When an ECDG is ready to execute according to the EGE model, its CDG is initiated and executes according to the CDG model. The CDG finishes as soon as all its goals have finished. Using this execution model, we define a notion of correctness for hybrids.

Definition 26 (Correctness of Hybrids) A hybrid H is correct under some context if during execution of H begun in any initial state satisfying that context
1. goals are scheduled correctly and
2. for each ECDG (Γ, C) in H, the part of the state before the first goal in Γ is initiated that is relevant to Γ satisfies the context C.

The set of substitutions that are relevant to a CDG, Γ, and satisfy a context, C, is {σ|D | σ ∈ γ(C)} with D = var(Γ), where σ|D is a substitution like σ restricted to domain D. It is not sensible to attempt to describe all possible substitutions before the first goal of Γ executes. In that case we would have to include the union of all substitutions possible before, during and after the execution of goals executing in parallel. Trying to approximate such a large set cannot improve accuracy of the approximation.

Theorem 71 (Soundness of the Split Rule) If the Split Rule is applied to an ECDG (Γ, C) in a source hybrid that is correct under some context, then the result hybrid will also be correct under that context.

Proof: First, we argue that goals in the result hybrid will be scheduled correctly. It is always correct when α and β execute concurrently in the result hybrid since they would have executed concurrently in the source hybrid. It is always correct when α executes sequentially before β in the result hybrid since there are no edges from β to α.

Second, we argue that appropriate contexts are introduced. If α and β execute sequentially then the context propagate(α, C) is correct by soundness of propagate. That is, the context describes the set of possible substitutions at this point and therefore it certainly describes the set of substitutions relevant to β. If α and β may execute in parallel, we have no description for the set of possible substitutions at the point of initiation of either α or β. For each of the two groups it is possible that the other group is initiated first and has further instantiated the variables to which it has access. But since the groups are scheduled correctly, they share no variables and the set of substitutions that are relevant to the respective groups are described by the context at the point of the test, C. Therefore, the contexts C and C ⊔ propagate(α, C) always describe the set of substitutions relevant to α and β respectively. In the case of a PAR or SEQ expression, the approximation may be sharpened to C and propagate(α, C) by the same reasoning. In the case of the CPAR expression, the same reasoning leads to the context (T⟦T⟧ true C) ⊔ propagate(α, T⟦T⟧ false C) for β, by distinguishing the two possible outcomes of the test T, which guards the parallel execution. □

Let us hint at an alternative definition of the Split Rule that specifies the context propagate(α, C) for β in the second ECDG in the CPAR rule. This context may well be sharper than the context for β in the current definition for some choices of abstract domains representing the context. This is because the least upper bound is a conservative approximation of the union of described substitutions and it is not clear how much of the information that is won in a test is representable in the chosen abstract domain. A sharper approximation will lead to weaker conditions guarding parallel execution; it will result in more parallelism. This context, propagate(α, C), is not sound according to the given definition of sound ECDGs, however. Substitutions that cause α to fail are not approximated, and therefore β may be initiated with some substitutions that are not described by propagate(α, C). To take advantage of the sharper approximation we must be willing to risk violation of the scheduling discipline in a few cases where the result of the computation is guaranteed to be discarded. The alternative definition of the Split Rule would have to be accompanied by a more liberal definition of soundness of ECDGs.
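The case analysis of Definition 24 can be summarized in a few lines. This sketch (ours, not the dissertation's) leaves the domain abstract, taking `t_true`, `t_false`, `prop`, and `lub` as stand-ins for T⟦T⟧ true, T⟦T⟧ false, propagate, and the least upper bound of the chosen context domain.

```python
# Pick PAR, SEQ, or CPAR for a split (alpha, beta) of an ECDG with
# context ctx, given the already-simplified guard C_alpha_beta.

def split(alpha, beta, ctx, guard, t_true, t_false, prop, lub):
    if guard == "true":                       # no test needed: PAR
        return ("par", (alpha, ctx), (beta, ctx))
    if guard == "false":                      # never independent: SEQ
        return ("seq", (alpha, ctx), (beta, prop(alpha, ctx)))
    # CPAR: beta's context must cover both outcomes of the guard
    beta_ctx = lub(t_true(guard, ctx), prop(alpha, t_false(guard, ctx)))
    return ("cpar", guard, (alpha, ctx), (beta, beta_ctx))
```

With trivial stand-ins (identity test interpretation, a propagate that marks the context, and a pairing as lub) one can see each of the three shapes arise from the corresponding guard.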
This context may well be sharper than the context for /3 in the current definition for some choices of abstract domains representing the context. This is, because the least upper bound is a conservative approximation of the union of described substitutions and it is not clear how much of the information th at is won in a test is representable in the chosen abstract domain. A sharper approximation will lead to weaker condi tions guarding parallel execution; it will result in more parallelism. This context, propagate{a., C), is not sound according to the given definition of sound ECDGs, however. Substitutions that cause a to fail, are not approximated, and therefore /3 may be initiated with some substitutions that are not described by propagate{a, C). To take advantage of the sharper approximation we must be willing to risk violation of the scheduling discipline in a few cases where the result of the com putation is guaranteed to be discarded. The alternative definition of the Split Rule would have to be accompanied by a more liberal definition of soundness of ECDGs. 116 T heorem 72 (Soundness o f the If R ule) I f the I f Rule is applied to an ECDG (r, C) in a source hybrid which is correct in some context, then the result hybrid will also be correct in that context. P ro o f: The soundness requirement for test interpretation guarantees the soundness of the introduced contexts. C U We now argue that every EGE derived in our framework is correct. The initial context initC (L) is correct for a clause L since the first occurrence of a “local” variable identifier is guaranteed to be non-ground and independent from all other variable identifiers. For example, the initial context for the clause h(X) : - a(X) , b(Y) , c(X, Y) consists of the fact [X] i [Y] . Note th at entry mode information, e.g., that some argum ent is always ground, can be easily incorporated into the initial context. 
Such information may be derived from global program analysis along the lines of the function entry from chapter 3 or programmer annotations [24]. Compilation begins with the initial hybrid (Θ(L), initC(L)). The correctness of Θ(L) ensures that the initial hybrid is correct under initC(L). By the soundness of the rewrite rules, every EGE derived from the initial hybrid will be correct under initC(L).

6.5.1 Compilation with ESharing

We can use the domain ESharing to accumulate contexts. We can obtain the initial context from a local approximation like wco(init(h:-b), var(h)), or from an approximation based on declarations and entry mode analysis, like the function entry defined in section 3.4. We can use any of the various functions from chapter 3 approximating the body meaning to define a sound propagate function. The interpretation of testable conditions is defined:

T⟦g[X]⟧ true (S, L, F) = (S \ rel(X, S), L \ {X}, F \ {X})
T⟦[Y] i [X]⟧ true (S, L, F) = ({g ∈ S | X ∉ g ∨ Y ∉ g}, L, F)
T⟦T⟧ false E = E

Two considerations determine the necessary test guarding parallel execution of goals in a context represented by ESharing. First, no parallel execution is possible when some shared variable is definitely free. Second, the test that enables parallel execution must exclude the possible sharing groups that are shared by the two goals. The effective condition, C_AB, for parallel execution of goals A and B in context C is defined

C_AB = false, if free C ∩ var(A) ∩ var(B) ≠ ∅;
C_AB = ⋀_{g ∈ rel(A, rel(B, sharing C))} exclude g, otherwise;

exclude g = g[X], if g = {X};
exclude g = ⋁_{{X,Y} ⊆ g} [X] i [Y], otherwise.

For example, if sharing C is {{X, Y}} and X ∉ free C then splitting a(X) and b(X, Y) results in ([X]i[Y] => a(X) & b(X,Y)). Note that, although it is possible that X is non-ground, it is not necessary to test whether X is ground because the context implies that X is ground if it is independent of Y.
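The guard construction just described can be sketched as follows. This is our own illustration; `rel` mirrors the relevance function used in the definitions, sharing groups are frozensets of variable identifiers, and the string rendering of conditions is a hypothetical convenience.

```python
# Effective condition for parallel execution of goals A and B in an
# ESharing context: fail if a definitely-free variable is shared;
# otherwise conjoin, for every sharing group relevant to both goals,
# a test excluding that group (any independent pair excludes it).

def rel(vs, sharing):
    return [g for g in sharing if g & vs]

def exclude(group):
    if len(group) == 1:
        return "g[%s]" % next(iter(group))
    xs = sorted(group)
    return " ; ".join("[%s]i[%s]" % (x, y)
                      for k, x in enumerate(xs) for y in xs[k + 1:])

def guard(sharing, free, vars_a, vars_b):
    if free & vars_a & vars_b:
        return "false"
    groups = rel(vars_a, rel(vars_b, sharing))
    return " , ".join(exclude(g) for g in groups) or "true"
```

For sharing C = {{X, Y}} with X not definitely free, splitting a(X) and b(X, Y) yields the guard [X]i[Y], matching the example above; if X is definitely free the guard collapses to false.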
Whenever g ⊆ g′, the condition exclude g implies the condition exclude g′. Therefore, we obtain an equivalent but simplified compound guard by "excluding" only those sharing groups that are minimal w.r.t. the subset ordering. If we modify the example to assume that sharing C = {{X}, {X, Y}}, then the split would result in (g[X] => a(X) & b(X, Y)).

6.6 The Zerocost Heuristic

We have presented two rewrite rules to derive correct EGEs for a clause. Compilation of a clause into an EGE can now be presented as a set of heuristics that choose an appropriate rule and determine how it is applied. The heuristics should be sufficiently biased against the If Rule to prevent unduly large CDGs. This bias must be balanced with a bias against the Split Rule, because its application will, in general, lose opportunities for parallel execution. Consider for example the clause

    h(X, Y) :- a(X), b(X), c(X, Y).

where it is known from global analysis that X will be ground after a(X) has finished execution. The Split Rule can be applied in two ways. When the last goal is split off before the first two goals are split apart, the body will be compiled into the EGE (g[X] => (g[X] => a(X) & b(X)) & c(X, Y)). This EGE is bad for several reasons. First, the test is repeated although the outcome of the first test determines the outcome of the second. Second, unless X is ground initially, the last goal must wait for both goals to finish. Maximal And-Parallelism is obtained only if the last goal can be initiated as soon as X becomes ground. Splitting off the first goal before splitting apart the last two goals results in the much more desirable EGE (g[X] => a(X) & (b(X) & c(X, Y))). This EGE achieves MAP. So, evidently, some applications of the Split Rule do not lose any opportunities for parallelism. We say such a split has zero cost. Whenever during a derivation a zero cost split is possible, the derivation should proceed by splitting there.
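The subset-minimality simplification described above is a one-line filter if sharing groups are modelled as Python frozensets (our representation, purely for illustration):

```python
def minimal_groups(groups):
    """Keep only the sharing groups minimal w.r.t. subset ordering; excluding
    a minimal group already implies the exclusion of all its supersets."""
    return {g for g in groups if not any(h < g for h in groups)}
```

For the modified example, minimal_groups({frozenset({"X"}), frozenset({"X", "Y"})}) keeps only {X}, so the compound guard reduces to the single groundness test g[X].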
This is, apparently, the only uncontroversial heuristic. Therefore, we present how to recognize a zero cost split in some detail in this section. In general, it would be interesting to compare the cost of two alternative applications of the Split Rule, based on the loss of opportunities for parallelism. But, as we have seen in section 6.2.2, there is no best EGE for all circumstances. A split is zero cost if no goals are delayed unnecessarily (as always, modulo our ability to determine necessity through our effective approximation). As a very simple example, a split of a two goal sequence into two single goal sequences is zero cost. In general, splitting a CDG Γ into two CDGs α and β is zero cost if, whenever the guard C_αβ fails, all goals in the suffix β depend on some goal to their left until the last goal of the prefix α has finished execution. This leads us to a relatively simple formal condition. A split of an n goal sequence 1 ... n into α = 1 ... s and β = s+1 ... n is zero cost if

    ∧ { propagate(1 ... i−1, T⟦C_αβ⟧ false C)_ij = false | 1 ≤ i ≤ s, s < j ≤ n }

Note that, for the variables that are relevant to goal i at the point when i is initiated, the context that is propagated up to i is sound. The condition above requires that each β goal definitely depends on each α goal when that α goal is initiated after the guard has failed. This condition is overly pessimistic for several reasons. First, some goals in β may be independent from α but are blocked by some β goal to their left. In that case they are still not ready to execute until all α goals are finished. Second, the β goal j does not really need to depend on all α goals. It must, however, depend on the last goal in α to finish. Third, the requirement propagate(1 ... i−1, T⟦C_αβ⟧ false C)_ij = false guarantees that j must wait for i to finish if the guard fails. But it depends on the precision of the test interpretation and propagation.
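The formal condition quantifies over every prefix/suffix pair. A small Python sketch, with the whole propagate and test-interpretation machinery abstracted into a callback (all names are ours, for illustration only):

```python
def zero_cost_split(n, s, dep):
    """Pessimistic zero-cost test for splitting a goal sequence 1..n into
    the prefix 1..s and the suffix s+1..n.  dep(i, j) stands in for
    'propagate(1..i-1, T[C_ab] false C)_ij = false', i.e. suffix goal j
    unconditionally depends on prefix goal i once the guard has failed."""
    return all(dep(i, j)
               for i in range(1, s + 1)
               for j in range(s + 1, n + 1))
```

If any single prefix/suffix pair lacks the unconditional dependency, the pessimistic condition rejects the split.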
For goals i that are ready to execute immediately, the condition C_ij => C_αβ also guarantees that j must wait for i to finish if the guard fails. For both choices of context domains considered previously, the latter condition can identify more zero cost splits. Note that, for i ∈ α and j ∈ β, the conditions C_ij => C_αβ and C_ij = C_αβ are equivalent. To improve identification of zero cost splits we identify the blocked β goals, the prefix goals that could possibly finish last, and the goals that can be initiated immediately. For example, splitting through the middle in

    h(X, Y) :- a(X), b(X), c(Y), d(Y).

is zero cost if a(X) cannot ground X. If X and Y are independent, the subsequences execute in parallel and there is no cost. Otherwise neither X nor Y is ground, and a(X) cannot finish last. Also, d(Y) is blocked by c(Y) unless executing a(X), b(X) grounds Y. And, by assuming that a(X) cannot ground its argument, we know a(X) cannot ground Y, and thus d(Y) will remain blocked until both a(X) and b(X) have finished executing.

We formalize the special sets of goals. The set of prefix goals that can start immediately is

    I = {1} ∪ {2 ≤ i ≤ s | C_(1...i−1)i = true}.

The set of prefix goals that will not finish last if the guard fails is

    N = {i ∈ I | C_i(i+1...s) => C_αβ} ∪ {i ∈ α | propagate(1 ... i, T⟦C_αβ⟧ false C)_i(i+1...s) = false}.

The set of suffix goals that are blocked by other suffix goals is

    B = {s+1 < k ≤ n | ∧_{1≤i≤s} propagate(1 ... i, T⟦C_αβ⟧ false C)_(s+1...k−1)k = false}.

In the special case of ESharing as contexts, propagation can never introduce unconditional dependencies. Formally, propagate(b, C)_ij = false implies C_ij = false. This is likely to be true for many other choices of context domains. In these cases the set of blocked β goals is

    B = {s+1 < k ≤ n | propagate(α, T⟦C_αβ⟧ false C)_(s+1...k−1)k = false}.
As an additional refinement, if there is a single goal i that can possibly be the last to finish, it suffices to test that the β goal depends on some other β goal to its left at the point when i is initiated. If α contains only one goal, it is initiatable immediately, and a β goal k is blocked if C_(s+1...k−1)k => C_αβ. If α contains the single goal i, the split is zero cost if

    ∧ { C_ij = C_αβ | j ∈ β \ B }    where B = {s+1 < k ≤ n | C_(s+1...k−1)k => C_αβ}.

When α contains more than one goal, a split at s is zero cost if

    ∧ { C_ij = C_αβ | i ∈ I \ N, j ∈ β \ B }
    ∧  ∧ { propagate(1 ... i−1, T⟦C_αβ⟧ false C)_ij = false | i ∈ α \ (I ∪ N), j ∈ β \ B }

where the set of blocked β goals, B, is obtained as described for the general case above.

As examples we list the conditions for zero cost splits in simple special cases. We write ab|c to indicate a split of a sequence of three goals into a beginning sequence of length two and a rest sequence of a single goal. Split a|b is always zero cost. Split a|bc is zero cost if C_ab = C_a(bc) and either C_ac = C_a(bc) or c ∈ B, i.e., C_bc => C_a(bc). This condition can also be rewritten equivalently as C_ab => C_ac and C_(ab)c => C_ab. Split ab|c is trivially zero cost if C_(ab)c = true. Split ab|c is also zero cost if C_ab = true and C_ac = C_bc. The following clause can be compiled into an EGE using only zero cost splits.

    h(X, Y) :- a(X, I, J), b(I, P), c(J, Q), d(P, Q, Y).

Splitting off the first goal and the last goal are zero cost. This leaves a two goal split in the middle, which is trivially zero cost. The zero cost splits in this example were obvious because many goals were unconditionally dependent. The unconditional dependencies caused by a shared local variable are not affected by goals left of its first occurrence. Therefore, every goal in β that contains some local variable depends unconditionally on the goal with the first occurrence of that variable.
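This first-occurrence observation yields purely syntactic approximations of the special goal sets N and B, computable from per-goal variable sets alone. A hedged Python sketch (the function name, goal numbering, and list-of-sets representation are ours):

```python
def syntactic_n_b(goal_vars, head_vars, s):
    """Approximate, from variable occurrences only, the prefix goals that
    cannot finish last (N) and the blocked suffix goals (B) for the split
    1..s | s+1..n.  goal_vars[i-1] is the set of variables of goal i."""
    n = len(goal_vars)
    local = set().union(*goal_vars) - head_vars   # local variables of the body
    big_n, big_b = set(), set()
    for i in range(1, s + 1):
        first_occ = goal_vars[i - 1] - set().union(*goal_vars[:i - 1])
        rest_of_prefix = set().union(*goal_vars[i:s])
        if first_occ & rest_of_prefix & local:
            big_n.add(i)
    prefix_vars = set().union(*goal_vars[:s])
    for j in range(s + 1, n + 1):
        earlier_suffix = set().union(*goal_vars[s:j - 1])
        if (earlier_suffix - prefix_vars) & goal_vars[j - 1] & local:
            big_b.add(j)
    return big_n, big_b
```

For the four-goal clause above, splitting off the first goal (s = 1) gives N = ∅ and B = {4}: d(P, Q, Y) is blocked because its first view of P and Q comes from suffix goals to its left.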
Let L = var(b) \ var(h) be the set of local variables. Then

    (var(i ... j−1) \ var(1 ... i−1)) ∩ var(j) ∩ L ≠ ∅  implies  propagate(1 ... i−1, T⟦C_αβ⟧ false C)_ij = false.

Therefore, we obtain the following simple approximations for the goals of interest. Let L = var(b) \ var(h). Then the set N of α goals that cannot finish last includes

    {1 ≤ i ≤ s | (var(i) \ var(1 ... i−1)) ∩ var(i+1 ... s) ∩ L ≠ ∅},

the set B of blocked β goals includes

    {s+1 ≤ j ≤ n | (var(s+1 ... j−1) \ var(α)) ∩ var(j) ∩ L ≠ ∅},

and a split is zero cost if

    ∧ { C_ij = C_αβ | i ∈ I \ N, j ∈ β \ B }
    ∧  ∧ { (var(i) \ var(1 ... i−1)) ∩ var(j) ∩ L ≠ ∅ | i ∈ α \ (I ∪ N), j ∈ β \ B }.

In practice this sufficient condition, in combination with the special treatment of splits where α consists of a single goal, recognizes as many zero cost splits as the more complicated condition.

6.7 Summary

In this chapter we discussed an important application of analysis for variable aliasing, namely, uncovering opportunities for independent and-parallelism. In independent and-parallelism, two goals may execute in parallel if they share no variables at run-time. Variable aliasing completely determines the soundness of parallel execution. The approximation of variable aliasing may determine unconditional sharing or unconditional independence of two goals, but in general it will indicate some potential sharing. In general, the program will execute two goals in parallel if it can exclude the remaining possible sharing through a run-time test. The approximation of variable aliasing completely determines the necessary run-time tests.

We focused on a particular approach to independent and-parallelism, called Restricted And-Parallelism. It was originally proposed by DeGroot in [14]. In this approach goals are grouped at compile time and parallel execution of the resulting groups is guarded by simple run-time tests.
Finding appropriate groups and guards poses a special compilation problem. Keen approximation helps to omit unnecessary run-time tests. Moreover, actual run-time tests may be cheap approximations that err on the safe side rather than searching large terms for common variables. In that case, parallel execution may depend on compile-time analysis, because a conservative run-time test may fail even if we can infer that the corresponding ideal run-time test would never fail.

Grouping goals at compile time is problematic. In this chapter we formalized the earliest possible initiation of a goal in compliance with Restricted And-Parallelism. We showed that for some clauses no grouping of goals can guarantee the earliest possible initiation for all its goals. We presented a framework for generating EGEs, which are the executable expressions that initiate groups of goals for parallel execution based on run-time tests. This framework allows for heuristics to determine desirable grouping and eliminate undesirable testing. If initiation of a goal might be delayed as the result of some grouping, we can attribute this delay to specific decisions during EGE generation. We presented a particular heuristic, the zero cost heuristic, which indicates choices that introduce no delays.

Chapter 7

Conclusions

Analysis of Variable Aliasing is the centerpiece of analysis of non-trivial Logic Programs. In a general Logic Program, a solution is increasingly constrained by matching and instantiating the binding of Logical Variables. Global analysis seeks to approximate certain properties of the possible run-time bindings. Variable Aliasing analysis approximates how Logical Variables are shared. Since solving one goal only affects the binding of variables accessible from it, variable aliasing limits the set of variables that are possibly affected by its solution.
For the analysis of numerous run-time properties of the variable binding, variable aliasing analysis is used to limit the impact of executing a goal. Variable Aliasing can improve the analysis of term structure, linearity, freeness, and groundness. To determine opportunities for Independent And-Parallelism and semi-intelligent backtracking, variable aliasing information is needed directly. In this dissertation we presented several results aimed at improving analysis of variable aliasing in Logic Programs.

Abstract Interpretation is a general framework for the analysis of run-time properties. The approximation of some run-time property of interest is obtained by defining an abstract semantics, which operates on approximations of variable bindings. The abstract semantics is derived from a "concrete semantics" by replacing the concrete domains and the concrete primitive operations with corresponding abstract domains and primitive operations. In the abstract interpretation community, each researcher has presented their own similar, or equivalent, concrete semantics for Logic Programs. The semantics differ not only in the choice of concrete domain, but also in the formulations of semantics with a fixed domain. We acknowledge that the concrete semantics is not really fixed by the language definition. Rather, the choice of concrete semantics should be deliberate and justified. Choosing the domain of the concrete semantics fixes a certain level of abstraction. The level of abstraction that is best suited to describe "logical" features of variable bindings may be inadequate to describe the run-time properties of interest to a particular logic programming language, or to a particular implementation of such a language. For the concrete domain in this dissertation we chose sets of variable bindings, or more precisely, sets of certain equivalence classes of carefully defined variable bindings.
This choice was good for the analysis of variable aliasing and several other run-time properties. To approximate properties of the sequence and number of solutions in a language that is committed to depth-first search, a different concrete domain must be chosen, namely one of sequences of variable bindings.

Even with a fixed concrete domain, different, equivalent formulations of the standard semantics result in different abstract semantics. In this dissertation we presented several formulations of concrete semantics and exhibited the rather restrictive conditions under which the corresponding formulations of abstract semantics will be equivalent. There is no formulation that is best in all circumstances. We explicitly stated when a formulation will lead to a more accurate approximation. We expressed these conditions in terms of algebraic properties of the abstract primitive operations. The properties can be derived in a systematic way. Each implication in the proof of equivalence of the concrete semantics formulations may require some algebraic property of some concrete primitive operation. It remains to determine how the abstract semantics is affected when the corresponding abstract primitive operations lack some of those algebraic properties.

One important result of this dissertation is the formulation and investigation of the query-independent semantics for Logic Programs. The corresponding abstract semantics encourages a new technique for abstract execution. In a pre-computation phase, called condensing, a complete description of the abstract semantics is computed. This technique is applicable for all approximations of sets of variable bindings. We showed under which conditions condensing can lead to improved, equivalent, or inferior accuracy. We briefly compared the computation necessary for abstract execution using condensing and Extension Tables. During abstract execution using Extension Tables, new table entries are found by iterating to a fixed point. The condensing phase similarly requires iteration to a fixpoint. If a procedure is used in different ways, i.e., with distinguishable approximations, different table entries must be computed for the extension table, whereas corresponding results are obtained with a single operation from the pre-computed, condensed procedure. This may result in faster analysis. But the two techniques are different, so that there is no correspondence between the intermediate approximations during the respective iterations. Condensing may be slower if procedures are used in few modes, and/or the approximations of these modes are more succinct than the approximation of the most general mode.

In chapter 4 we presented the abstract domain, Sharing, for the analysis of variable aliasing. Sharing approximates the set of sharing groups of a variable binding. Sharing captures groundness and independence of variables and propagates this information better than previously proposed domains. To propagate sharing information even better we extended the domain to additionally capture linearity and freeness of variables. We introduced a special representation for certain large sets of sharing groups. With this, the size of the representation of a Sharing may be bounded.

Finally, we point out several directions for further research. We have not attempted to quantify the benefit of increased precision of variable aliasing analysis to various applications that require some variable aliasing information. The variable aliasing in many practical Prolog programs can be captured completely by very simple abstract domains. A large class of programs corresponds directly to functional programs. Such programs use Logical Variables in an extremely restricted way: a variable is bound to a ground term at its first binding.
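As a side note to the summary of chapter 4: the core binding step of the Sharing domain is compact enough to prototype directly. The sketch below follows the abstract unification of [16] in spirit only; the set representation and all names are our own choices, and the linearity and freeness refinements are omitted:

```python
from itertools import combinations

def star(groups):
    """Close a set of sharing groups under union (all non-empty unions)."""
    result = set(groups)
    changed = True
    while changed:
        changed = False
        for g, h in combinations(list(result), 2):
            u = g | h
            if u not in result:
                result.add(u)
                changed = True
    return result

def amgu(sharing, x, t_vars):
    """Abstract effect of the binding x = t on a set of sharing groups:
    groups irrelevant to the binding survive unchanged; relevant ones are
    replaced by unions of one (closed) x-group with one (closed) t-group."""
    rel_x = {g for g in sharing if x in g}
    rel_t = {g for g in sharing if g & t_vars}
    rest = sharing - (rel_x | rel_t)
    return rest | {gx | gt for gx in star(rel_x) for gt in star(rel_t)}
```

Starting from the most general description {{x}, {y}, {z}}, the binding x = f(y) merges the x- and y-groups; a subsequent binding of x to a ground term then removes the merged group, correctly propagating groundness from x to y.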
The variable aliasing in such programs can be captured completely by tracking the sets of ground and untouched variables. It still remains to be shown that undue overhead can be avoided while approximating sharing groups when variable aliasing is trivial. Some of the possible sharing groups of an Answer should be represented implicitly rather than explicitly. The representation should be designed to take advantage of common special cases, such as functional-style programming.

The query-dependent and query-independent formulations can be understood as extremes of a spectrum. Rather than finding approximations for all encountered modes of a predicate independently, or approximating the most general invocation and extrapolating from it to special modes, it should be possible to extrapolate approximations for encountered modes from approximations of a few modes that have been selected for particularly efficient approximation. For example, in variable aliasing, approximating the sharing groups after a call with many ground arguments may be simpler than approximating the sharing groups of the most general invocation. On the other hand, approximating the sharing groups after a call whose arguments share variables in intricate ways may be most efficient by extrapolating from the approximation of the most general invocation, particularly if execution of the predicate grounds its arguments. This suggests defining yet another formulation of semantics, which may be called query-selective. Here, Answers to general queries are expressed in terms of Answers of certain canonical queries. The canonical queries for variable aliasing should be defined as queries where each argument is either ground or an independent, free variable. The details for this alternative abstract execution need to be worked out.
It seems that, unlike the query-dependent approximation, the accuracy of approximation according to query-selective semantics would not suffer if the abstract unification lacks the Idem-gain property.

One exciting feature of condensing is its ability to completely capture certain semantic functions, which were previously only describable by an algorithm that required iteration to a fixpoint to apply the function. What applications become feasible, now that we can succinctly capture the complete meaning of certain semantic functions? Program analysis might include hypothetical and abductive reasoning. That is, we may be able to compute the necessary preconditions for a call of the clause to guarantee certain conditions on exit. In [17] we and others propose a technique for multiple specialization of Logic Programs with run-time tests that uses a technique similar to condensing. We start by summarizing the goal of multiple specialization.

Most compiler optimizations of logic programs can be performed only if certain "safety requirements" hold. For example, unification can be simplified if certain arguments to a call are free or ground, data structures can be destructively updated if they are no longer referenced, and independent calls can be scheduled for parallel execution. A specialized implementation of a program component incorporates optimizations that are safe only for particular activations of that component. Commonly, procedures are specialized according to their possible activations as determined by global flow analysis. In single specialization [4, 13, 18, 21, 22, 26], one version of each procedure is created to handle all of its activations. In multiple specialization [27], several versions of each procedure are created, and the appropriate version is selected at each invocation. In both cases, user-supplied "mode declarations" for entry predicates may be used to restrict the class of reachable activations.
An alternative way of introducing specialization is to explicitly guard different versions of program components by run-time tests. Applications of this approach, e.g., DeGroot's formulation of restricted and-parallelism [14], which we discussed in chapter 6, have tests inserted directly at the site of the optimization. This can lead to excessive testing that, at least in some cases, offsets the benefit of the optimization. Nevertheless, this approach has considerable merit since it allows specialization to be directed towards specific optimizations. In contrast, specialization based solely on flow analysis allows an optimization to be performed only if an activation enabling it happens to be generated. An ideal compromise would be to use tests at outer levels of the call tree to enable particularly useful specializations.

We hope to extend the idea of condensing the meaning of a procedure to condensing the meaning of a set of execution paths that lead from the test site to the optimization site. This condensed meaning can then be used to search for run-time tests that will guarantee the safety condition. To define this formally, we must choose a concrete Logic Program semantics that describes substitutions whose domain includes variables from the test site as well as the optimization site. This is a departure from the concrete semantics presented in chapter 3, where the domain of a substitution is always the set of variables of a single clause. The details of this semantic formulation need to be worked out.

Reference List

[1] S. Abramsky and C. Hankin, editors. Abstract Interpretation of Declarative Languages. Ellis Horwood, 1987.
[2] G. Birkhoff. Lattice Theory. American Mathematical Society, 1973.
[3] M. Bruynooghe. A framework for the abstract interpretation of logic programs. Technical report, Department of Computer Science, K.U. Leuven, 1987.
[4] M. Bruynooghe, G. Janssens, A. Callebaut, and B. Demoen.
Abstract interpretation: Towards the global optimization of Prolog programs. In Proceedings of the 1987 Symposium on Logic Programming, pages 192-204. IEEE, 1987.
[5] M. Carlton and P. Van Roy. A distributed Prolog system with and-parallelism. In 1988 Hawaii International Conference on System Sciences, 1988.
[6] J. Chang. High Performance Execution of Prolog Programs Based on a Static Data Dependency Analysis. PhD thesis, Univ. of Cal. at Berkeley, 1985. Dept. of EECS Report No. UCB/CSD 86/263.
[7] J.-H. Chang, A. Despain, and D. DeGroot. And-parallelism of logic programs based on a static dependency analysis. In Proceedings Spring Compcon, pages 218-225. IEEE, 1985.
[8] W. V. Citrin. Parallel Unification Scheduling in Prolog. PhD thesis, Univ. of Cal. at Berkeley, 1988. Dept. of EECS Report No. UCB/CSD 88/415.
[9] J. Conery and D. Kibler. Parallel interpretation of logic programs. In Proceedings Conference on Functional Programming Languages and Computer Architecture, pages 163-170. ACM, 1981.
[10] P. Cousot and R. Cousot. Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints. In 4th Symposium on Principles of Programming Languages, pages 238-252. ACM, 1977.
[11] S. K. Debray. Static inference of modes and data dependencies in logic programming. In preparation, 1989.
[12] S. K. Debray and D. S. Warren. Automatic mode inference for logic programs. J. Logic Prog., 5(3), Sep. 1988.
[13] S. K. Debray and D. S. Warren. Automatic mode inference for logic programs. Journal of Logic Programming, 5(3):207-229, September 1988.
[14] D. DeGroot. Restricted and-parallelism. In Proceedings of the International Conference on Fifth Generation Computer Systems, pages 471-478. ICOT, 1984.
[15] M. Hermenegildo. A Restricted And-Parallel Execution Model and Abstract Machine for Prolog Programs. Kluwer Academic Press, 1987.
[16] D. Jacobs and A. Langen.
Accurate and efficient approximation of variable aliasing in logic programs. In Proc. North American Conf. on Logic Programming, pages 154-165. MIT Press, 1989.
[17] D. Jacobs, A. Langen, and W. Winsborough. Multiple specialization of logic programs with run-time tests. In Proc. International Conference on Logic Programming, 1990.
[18] N. D. Jones and H. Sondergaard. A semantics-based framework for the abstract interpretation of Prolog. In S. Abramsky and C. Hankin, editors, Abstract Interpretation of Declarative Languages, chapter 6, pages 124-142. Ellis Horwood, 1987.
[19] J. W. Lloyd. Foundations of Logic Programming. Springer Verlag, 1984.
[20] K. Marriott, H. Sondergaard, and N. Jones. Denotational abstract interpretation of logic programs. In preparation, 1990.
[21] C. S. Mellish. Some global optimizations for a Prolog compiler. Journal of Logic Programming, 2(1):43-66, January 1985.
[22] C. S. Mellish. A semantics-based framework for the abstract interpretation of Prolog. In S. Abramsky and C. Hankin, editors, Abstract Interpretation of Declarative Languages, chapter 6, pages 181-198. Ellis Horwood, 1987.
[23] D. A. Schmidt. Denotational Semantics: A Methodology for Language Development. W. C. Brown Publishers, 1988.
[24] E. Shapiro. Concurrent Prolog: a progress report. In IEEE Computer 19, 8, pages 44-58, 1986.
[25] H. Sondergaard. Semantics-Based Analysis and Transformation of Logic Programs. PhD thesis, Univ. of Copenhagen, Denmark, Dec. 1989.
[26] R. Warren, M. Hermenegildo, and S. K. Debray. On the practicality of global flow analysis of logic programs. In Proceedings of the Fifth Conference on Logic Programming, pages 684-699. MIT Press, 1988.
[27] W. Winsborough. Path-dependent reachability analysis for multiple specialization. In Proc. North American Conf. on Logic Programming, pages 133-153. MIT Press, 1989. (Full paper to appear in J. Logic Programming.)
[28] M. Wise.
A parallel Prolog: the construction of a data driven model. In Proc. Conf. LISP and Func. Prog. ACM, 1982.
[29] M. Wise. Prolog Multiprocessors. Prentice-Hall International, 1986.
[30] H. Xia and W. K. Giloi. A new application of abstract interpretation in Prolog programs: Data dependency analysis. In IFIP WG 10.0 Workshop on Concepts and Characteristics of Declarative Systems, 1988.