MULTI-METHOD PLANNING

by

Soowon Lee

A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(Computer Science)

April 1994

Copyright 1994 Soowon Lee

UMI Number: DP22887. All rights reserved. INFORMATION TO ALL USERS: The quality of this reproduction is dependent upon the quality of the copy submitted. In the unlikely event that the author did not send a complete manuscript and there are missing pages, these will be noted. Also, if material had to be removed, a note will indicate the deletion. UMI Dissertation Publishing, UMI DP22887. Published by ProQuest LLC (2014). Copyright in the Dissertation held by the Author. Microform Edition (c) ProQuest LLC. All rights reserved. This work is protected against unauthorized copying under Title 17, United States Code. ProQuest LLC, 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106-1346.

UNIVERSITY OF SOUTHERN CALIFORNIA
THE GRADUATE SCHOOL
UNIVERSITY PARK
LOS ANGELES, CALIFORNIA 90007

This dissertation, written by Soowon Lee under the direction of his Dissertation Committee, and approved by all its members, has been presented to and accepted by The Graduate School, in partial fulfillment of requirements for the degree of DOCTOR OF PHILOSOPHY.

Dean of Graduate Studies

Date: April 19, 1994

DISSERTATION COMMITTEE
Chairperson

Dedication

To My Parents

Acknowledgements

First, I would like to thank Paul Rosenbloom, my advisor, for his superb guidance, encouragement, and support throughout the course of my research. He always provided me with insightful suggestions and invaluable comments on both the work and the draft.

I would like to thank Shankar Rajamoney, Craig Knoblock, and Behrokh Khoshnevis, the other members of my dissertation committee, for many helpful suggestions, criticism, and advice on the thesis. Craig's comments on an early draft were particularly helpful. I would also like to thank Ari Requicha and Dan Moldovan for serving on my guidance committee and providing a lot of useful feedback.

I would like to thank the members of the ISI Soar group, including Jose-Luis Ambite-Molina, Bonghan Cho, Randy Hill, Lewis Johnson, Jihie Kim, Hidemi Ogasawara, Miran Park, Karl Schwamb, Ben Smith, Iain Stobie, and Milind Tambe, for interesting discussions and helpful suggestions on this research. I am also grateful to Allen Newell, John Laird, and members of the Soar group for their comments and encouragement.

I would like to thank Mike Barley, Haym Hirsh, Pat Langley, and Steve Minton for their helpful comments on my work.

I would like to thank my friends at USC, including Sanghoi Koo, Sungbok Kim, and Daeyeon Park, who have made these past years so enjoyable.

I would like to thank my parents and all others in my family for their constant love, encouragement, and support. Finally, I thank my wife, Yoolim, for her love and understanding during the long course leading to this dissertation.
This research has been sponsored by the Defense Advanced Research Projects Agency and the Office of Naval Research under contract number N00014-89-K-0155, and under subcontract to the University of Southern California Information Sciences Institute from the University of Michigan, as part of contract N00014-92-K-2015 from the Advanced Research Projects Agency and the Naval Research Laboratory.

Contents

Dedication
Acknowledgements
List Of Figures
List Of Tables
Abstract

1 Introduction
 1.1 Overview of the Approach
  1.1.1 Method Creation
  1.1.2 Method Coordination
 1.2 Implementation
 1.3 Contributions
 1.4 Guide to the Thesis

2 Bias in Planning
 2.1 Bias in Inductive Learning
 2.2 Application of Bias to Planning
 2.3 Examples of Planning Biases
 2.4 Relationship with Search Control
 2.5 Summary

3 Single-Method Planners
 3.1 Planning in Soar
  3.1.1 Overview of Soar
  3.1.2 Operator Representation in Soar
  3.1.3 Plan Representation in Soar
  3.1.4 Planning Methods in Soar
 3.2 Implemented Planning Biases
 3.3 Learning in Single-Method Planners
 3.4 Experimental Results
 3.5 Summary

4 Multi-Method Planners
 4.1 Monotonic Multi-Method Planners
  4.1.1 Restricted Dominance Relation
  4.1.2 Performance Analysis
  4.1.3 Learning in Multi-Method Planning
 4.2 Bias-Relaxation Multi-Method Planners
  4.2.1 Experimental Results
 4.3 Fine-Grained Multi-Method Planners
  4.3.1 Experimental Results
 4.4 Comparison with Partial-Order Planning
 4.5 Summary

5 Application to a Complex Domain
 5.1 Simulated Battlefield Environments
 5.2 Tactical Air Simulation
 5.3 Implementation
 5.4 Summary

6 Related Work
 6.1 Biases in Planning
  6.1.1 Linearity
  6.1.2 Protection
 6.2 Planning and Learning in Soar
 6.3 Multi-Method Planning
  6.3.1 STEPPINGSTONE
  6.3.2 FAILSAFE-2
  6.3.3 Partial Order Planners

7 Conclusion
 7.1 Summary of the Approach and Results
 7.2 Limitations and Future Work

Appendix A
Experimental Results: The Blocks-World Domain
 A.1 Performance over 30 Training Problems
 A.2 Performance over 100 Testing Problems

Appendix B
Experimental Results: The Machine-Shop Scheduling Domain
 B.1 Performance over 30 Training Problems
 B.2 Performance over 100 Testing Problems

List Of Figures

1.1 The Sussman's anomaly in the blocks-world domain.
1.2 A problem in the one-way-rocket domain.
1.3 Hypothetical trade-off between single-method planners' scope and performance.
1.4 An example of a multi-method planner.
2.1 Analogy between concept learning and planning.
2.2 Example of the effects of a linearity bias on the plan space: (a) initial state and goal conjuncts, (b) plan eliminated, (c) plan remaining.
2.3 Example of the effects of a protection bias on the plan space: (a) initial state and goal conjuncts, (b) plan eliminated, (c) plan remaining.
2.4 Example of the effects of a directness bias on the plan space: (a) initial state and goal conjuncts, (b) plan eliminated, (c) plan remaining.
2.5 Example of the effects of a goal-nonrepetition bias on the plan space: (a) initial state and goal conjuncts, (b) plan eliminated, (c) plan remaining.
2.6 A recursive planning procedure based on means-ends analysis.
3.1 The Soar architecture.
3.2 The planning algorithm based on means-ends analysis as implemented in Soar.
3.3 Examples of operator proposal rules.
3.4 Examples of operator application rules.
3.5 Examples of goal expansion rules.
3.6 A generalized plan.
3.7 The plan representation in Soar: (a) a set of problems which are solvable by the rule in Figure 3.6, (b) the sequence of steps for a four-block-stacking problem, (c) the sequence of operators.
3.8 Planning in the blocks-world using (a) linear, (b) nonlinear, and (c) abstraction planning.
3.9 Goal-flexibility dimension.
3.10 Goal-protection dimension.
3.11 The planning methods generated by the bias dimensions.
3.12 Planning in Soar.
3.13 Four block unstacking with nonlinear planning.
3.14 Learned rules for block unstacking with nonlinear planning.
3.15 Four block unstacking with directness.
3.16 A learned rule for block unstacking with directness.
3.17 Four block stacking with protection.
3.18 A learned rule for four block stacking with protection.
3.19 Three and five block unstacking with complete protection: (a) a projected path, (b) transfer of the learned rule to a different number.
3.20 A learned rule for four block unstacking with complete protection.
4.1 Restricted dominance graphs for the single-method planners.
4.2 Example of learning which planners to use for which classes of problems: (a) a learned rule to avoid the direct goal-protection planner, (b) a class of problems in which this rule is applicable.
4.3 Performance of single-method planners (+), coarse-grained multi-method planners (o), and fine-grained multi-method planners (*) in the blocks-world domain.
4.4 Performance of single-method planners (+), coarse-grained multi-method planners (o), and fine-grained multi-method planners (*) in the scheduling domain.
5.1 A skeleton of the goal hierarchy for the 1-v-1 aggressive bogey scenario.

List Of Tables

2.1 Examples of planning biases and their justifications classified according to a MEA-based planning procedure.
3.1 Number of problems solved.
3.2 Average number of decisions (and CPU time (sec.)) per problem.
3.3 Average plan length per problem.
4.1 The performance of the six single-method planners for the problem sets defined by the scopes of the planners.
4.2 Ten bias-relaxation multi-method planners in the blocks-world.
4.3 Single-method and bias-relaxation multi-method planning.
4.4 Single-method and coarse-grained multi-method vs. fine-grained multi-method planning in the blocks-world domain.
4.5 Significance test results for the blocks-world domain.
4.6 Single-method and coarse-grained multi-method vs. fine-grained multi-method planning in the machine-shop scheduling domain.
4.7 Significance test results for the machine-shop scheduling domain.
4.8 Experimental results for Mp→Me and POCL.
5.1 The 2×n planning methods generated from directness and protection, where there are n threat-levels.
5.2 Example of fine-grained multi-method planning for the tactical air domain.
A.1 Performance of M1 over 30 training problems in the blocks-world domain.
A.2 Performance of M2 over 30 training problems in the blocks-world domain.
A.3 Performance of M3 over 30 training problems in the blocks-world domain.
A.4 Performance of M4 over 30 training problems in the blocks-world domain.
A.5 Performance of M5 over 30 training problems in the blocks-world domain.
A.6 Performance of M6 over 30 training problems in the blocks-world domain.
A.7 Performance of single-method planners over 100 testing problems in the blocks-world domain.
A.8 Performance of single-method planners over 100 testing problems in the blocks-world domain (continued).
A.9 Performance of coarse-grained multi-method planners over 100 testing problems in the blocks-world domain.
A.10 Performance of coarse-grained multi-method planners over 100 testing problems in the blocks-world domain (continued).
A.11 Performance of coarse-grained multi-method planners over 100 testing problems in the blocks-world domain (continued).
A.12 Performance of coarse-grained multi-method planners over 100 testing problems in the blocks-world domain (continued).
A.13 Performance of fine-grained multi-method planners over 100 testing problems in the blocks-world.
A.14 Performance of fine-grained multi-method planners over 100 testing problems in the blocks-world (continued).
A.15 Performance of fine-grained multi-method planners over 100 testing problems in the blocks-world (continued).
A.16 Performance of fine-grained multi-method planners over 100 testing problems in the blocks-world (continued).
B.1 Performance of M1 over 30 training problems in the machine-shop scheduling domain.
B.2 Performance of M2 over 30 training problems in the machine-shop scheduling domain.
B.3 Performance of M3 over 30 training problems in the machine-shop scheduling domain.
B.4 Performance of M4 over 30 training problems in the machine-shop scheduling domain.
B.5 Performance of M5 over 30 training problems in the machine-shop scheduling domain.
B.6 Performance of M6 over 30 training problems in the machine-shop scheduling domain.
B.7 Performance of single-method planners over 100 testing problems in the machine-shop scheduling domain.
B.8 Performance of single-method planners over 100 testing problems in the machine-shop scheduling domain (continued).
B.9 Performance of multi-method planners over 100 testing problems in the machine-shop scheduling domain.
B.10 Performance of multi-method planners over 100 testing problems in the machine-shop scheduling domain (continued).

Abstract

The ability to find a low execution-cost plan efficiently over a wide domain of applicability is the core of domain-independent planning systems. The approach investigated here to building such a planning system begins with two hypotheses: (1) no single method will satisfy both sufficiency and efficiency for all situations; and (2) multi-method planning can outperform single-method planning in terms of sufficiency and efficiency. To evaluate these hypotheses, a set of single-method planners has been constructed. The results obtained from the experiments with these planners for the domains investigated show that these planners have trouble performing efficiently over a wide range of problems.

As an alternative to single-method planning, multi-method planning is investigated in this thesis. A multi-method planner consists of a coordinated set of methods which have different performance and scope. Given a set of created methods, the key issue in multi-method planning is how to coordinate individual methods in an efficient manner so that the multi-method planner can have high performance. The multi-method planning framework presented here provides one way to do this based on the notion of bias-relaxation. In a bias-relaxation multi-method planner, planning starts by trying highly restricted and efficient methods, and then successively relaxes restrictions until a sufficient method is found.

A class of bias-relaxation multi-method planners has been developed.
These planners vary in the granularity at which individual methods are selected and used. Depending on the granularity of method switching, two variations on strongly monotonic multi-method planners are implemented: coarse-grained multi-method planners, where methods are switched on a problem-by-problem basis; and fine-grained multi-method planners, where methods are switched on a goal-by-goal basis. The experimental results indicate that, at least for the domains investigated, both coarse-grained and fine-grained multi-method planning can reduce plan length significantly compared with single-method planning, and fine-grained planning can improve the planning time significantly compared with coarse-grained and single-method planning. Application to a simulated agent domain also shows one way that multi-method planning can be used in more complex domains.

Chapter 1

Introduction

Research in domain-independent planning systems has been a main theme in the area of AI planning. These systems vary according to the way in which the search space is defined and traversed, the way in which plans are represented, the way in which goal interactions are dealt with, the way in which time and resources are handled, the way in which planning interacts with execution, and so on [Allen et al., 1990]. Among the criteria used to evaluate these systems, three typical ones are the amount of time required to find the plan; the execution cost of the plan itself; and the ability to find some plan, or an optimal plan, for any problem in an arbitrary domain. Thus, finding a low execution-cost plan efficiently over a wide domain of applicability is the core of domain-independent planning systems. The key issue here in building such a system is how to construct a single planning method, or a coordinated set of different planning methods.

The hypotheses underlying this research are (1) no single method will satisfy both sufficiency and efficiency for all situations; and (2) multi-method planning can outperform single-method planning in terms of sufficiency and efficiency. The first hypothesis is based on the observation that most conventional planning systems which encode planning behaviors within a single fixed method — such as linear planning, nonlinear planning, abstraction, and so on — have a limitation in performing efficiently over a wide range of problems.

For example, STRIPS-type planners can generate plans quite efficiently for some problems by using the linearity assumption [Fikes and Nilsson, 1971]. With this assumption, the number of goal conjuncts considered at each planning step can be reduced, so that planning time can be saved. However, this assumption makes the planners unable to generate an optimal plan in certain domains, and fail to find a plan in domains with irreversible operators.

    Initial State: (on C A), (on A Table), (on B Table)
    Goal: G1: (on A B); G2: (on B C)
    Operators:
      (MOVE-ONTO-TABLE <x>): put down block <x> onto the table.
        Preconditions: (on <x> <z>), (clear <x>)
        Add list: (on <x> Table), (clear <z>)
        Delete list: (on <x> <z>)
      (MOVE-ONTO-BLOCK <x> <y>): stack block <x> onto block <y>.
        Preconditions: (on <x> <z>), (clear <x>), (clear <y>), (type <y> Block)
        Add list: (on <x> <y>), (clear <z>)
        Delete list: (on <x> <z>), (clear <y>)

    Figure 1.1: The Sussman's anomaly in the blocks-world domain.
Sussman's anomaly in the blocks-world domain is a classical problem where an optimal solution cannot be found by a linear planner [Sussman, 1973]. Figure 1.1 shows the initial state, goal conjuncts, and operators for this problem.¹ Since a linear planner does not consider the other goal conjuncts until the current goal conjunct is achieved, both goal orderings — (on A B) followed by (on B C), or (on B C) followed by (on A B) — generate non-optimal operator sequences.

Footnote 1: Throughout this thesis, variables in operators are denoted by angle brackets, as in <a>.

    Initial State: (at O1 LocA), (at O2 LocA), (at Rocket LocA)
    Goal: G1: (at O1 LocB); G2: (at O2 LocB)
    Operators:
      (LOAD <obj>): load <obj> into Rocket.
        Preconditions: (at <obj> <loc>), (at Rocket <loc>)
        Add list: (inside <obj> Rocket)
        Delete list: (at <obj> <loc>)
      (MOVE-ROCKET): move Rocket from LocA to LocB.
        Preconditions: (at Rocket LocA)
        Add list: (at Rocket LocB)
        Delete list: (at Rocket LocA)
      (UNLOAD <obj>): unload <obj> from Rocket.
        Preconditions: (inside <obj> Rocket), (at Rocket <loc>)
        Add list: (at <obj> <loc>)
        Delete list: (inside <obj> Rocket)

    Figure 1.2: A problem in the one-way-rocket domain.

A more serious problem occurs in domains with irreversible operators. Figure 1.2 shows a problem in the one-way-rocket domain which cannot be solved by a linear planner [Veloso, 1989]. In this problem, achieving either goal conjunct individually inhibits achieving the other goal conjunct. For example, after achieving the first goal conjunct (at O1 LocB) by applying (LOAD O1) → (MOVE-ROCKET) → (UNLOAD O1), the second goal conjunct (at O2 LocB) cannot be achieved because the Rocket cannot return to pick up the remaining object.

Nonlinear planners can generate optimal plans for these problems because they are free from the linearity assumption.² However, for other problems that could be solved by a linear planner, nonlinear planners may be less efficient than linear planners. For example, a nonlinear planner which uses a goal set as opposed to a goal stack, such as NOLIMIT [Veloso, 1989], has more choices to consider at each goal-selection point. This allows an optimal plan to be generated for a given problem; however, the overall planning performance may be decreased by the increased branching factor.

Footnote 2: The term "nonlinear" in this context implies that it is allowable to interleave operators in service of different goal conjuncts. It does not necessarily mean that either partial-order or least-commitment planning are used.

It has been known that partial-order planners can efficiently solve problems in which the specific order of the plan steps is critical [Sacerdoti, 1975, Tate, 1977, Chapman, 1987, McAllester and Rosenblitt, 1991, Barrett and Weld, 1992]. This is done by delaying step-ordering decisions as long as possible, so that the size of the plan space can be smaller than those of total-order planners. However, they pay the cost of having a more complex ordering procedure [Minton et al., 1991]. For example, the partial-order planner SNLP [McAllester and Rosenblitt, 1991, Barrett and Weld, 1992] detects a threat between a step and a causal link whenever a new step or causal link is added. The ordering procedure searches over the space of ordering constraints to resolve the detected threat. This scheme can be quite effective if there are many threats in a problem. However, if there are only a few trivially-resolvable threats in a problem, it is generally less efficient to use such a complex threat-detecting and resolving algorithm for the entire problem.
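The irreversibility in Figure 1.2 is easy to see by executing the linear plan for G1 forward. The following minimal sketch (in Python, which the thesis does not use — its planners are Soar programs) encodes the Figure 1.2 operators as precondition/add/delete sets; the tuple-based state encoding and helper names are illustrative assumptions, not the thesis's representation.

    def op(pre, add, delete):
        return {"pre": frozenset(pre), "add": frozenset(add), "del": frozenset(delete)}

    def load(o, loc):
        return op([("at", o, loc), ("at", "Rocket", loc)],
                  [("inside", o, "Rocket")],
                  [("at", o, loc)])

    def unload(o, loc):
        return op([("inside", o, "Rocket"), ("at", "Rocket", loc)],
                  [("at", o, loc)],
                  [("inside", o, "Rocket")])

    move_rocket = op([("at", "Rocket", "LocA")],
                     [("at", "Rocket", "LocB")],
                     [("at", "Rocket", "LocA")])

    def apply_op(state, o):
        assert o["pre"] <= state              # the operator must be applicable
        return (state - o["del"]) | o["add"]

    state = frozenset({("at", "O1", "LocA"), ("at", "O2", "LocA"),
                       ("at", "Rocket", "LocA")})
    for step in (load("O1", "LocA"), move_rocket, unload("O1", "LocB")):
        state = apply_op(state, step)         # linear plan for G1: (at O1 LocB)

    assert ("at", "O1", "LocB") in state      # G1 is achieved ...
    assert not move_rocket["pre"] <= state    # ... but (at Rocket LocA) can never
                                              # be restored, so G2 is unreachable

After the three-step plan for G1, no operator can re-establish (at Rocket LocA), which is exactly the irreversibility that defeats both linear goal orderings.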
Figure 1.3 illustrates the scope and performance for a hypothetical set of single-method planners. The inherent trade-off between a planner's scope and its performance, as shown in the figure, suggests that single-method planning has a limitation in performing efficiently over a wide range of problems, so that a more flexible planning approach is needed.

    Figure 1.3: Hypothetical trade-off between single-method planners' scope and performance.

1.1 Overview of the Approach

As an alternative to single-method planning, multi-method planning is investigated in this thesis. A multi-method planner is an integrated system which utilizes a coordinated set of methods, where each method has different scope and performance [Lee and Rosenbloom, 1992, Lee and Rosenbloom, 1993]. The focus in this thesis is on multi-method planning in a single serial environment.³ Figure 1.4 shows an example of a multi-method planner in which two different methods — linear planning and nonlinear planning — are coordinated sequentially in a single serial environment. In this planner, the linear method has better overall performance than the nonlinear method, while the nonlinear method can solve more problems than the linear method. Given a problem, the linear method is tried first to solve that problem. If it cannot solve the problem, the nonlinear method is tried.

Footnote 3: In a multi-agent environment, multi-method planning can be accomplished by running the methods in parallel until the problem is solved via one of the methods [Bond and Gasser, 1988]. However, detailed discussion of this issue is beyond the scope of this thesis.

    Figure 1.4: An example of a multi-method planner (linear planning followed by nonlinear planning).

The potential advantage of multi-method planning over single-method planning is that multi-method planning can achieve both applicability and efficiency at the same time. Theoretically, the scope of a multi-method planner can be the union of the scopes of all individual single-method planners in the multi-method planner. Thus, a multi-method planner is at least as applicable as the most general single-method planner within the multi-method planner. If a single-method planner is complete for a domain — that is, it can solve all problems in a domain — then any multi-method planner which includes the single-method planner is also complete.

With respect to efficiency, if a multi-method planner includes a method which is very efficient for some classes of problems, and that method can be selected for those classes of problems without too much extra effort, then multi-method planning can have an overall efficiency gain over single-method planning.

With this potential advantage of multi-method planning in hand, the ideal multi-method planner would be able to solve each problem with the most efficient method that is sufficient to solve it. In general, however, it is not known a priori which method is the most appropriate one for a given problem. The best way to approach this ideal is to learn which methods to use for which classes of problems from a training set of problems. This type of method learning can be accomplished by either an analytical approach or an empirical approach.

The analytical approach to learning is based on reasoning about why the given training problem is solved (or cannot be solved) by the current method. If the problem is solved by the method, one constructs an explanation which proves that the problem is a positive instance of the goal concept 'solved'.
The best way t o ; I j approach this ideal is to learn about which methods to use for which classes of' problems from a training set of problems. This type of m ethod learning can be j accomplished by either an analytical approach or an empirical approach, i The analytical approach to learning is based on reasoning about why the given training problem is solved (or cannot be solved) by the current method. If the problem is solved by the method, one constructs an explanation which proves' I that the problem is a positive instance of the goal concept ‘solved’. Then, this explanation is generalized and positive control knowledge is learned which selects the m ethod for later similar problems. In contrast, if the m ethod fails to solve the problem, one constructs an explanation which proves that the problem is a positive instance of the goal concept ‘unsolvable’. In this case, negative control knowledge is learned which avoids the m ethod for later similar problems. The empirical approach to learning is based on the performance of those m eth ods for a training set of problems. Instead of learning control knowledge by an alyzing a solution trace for each problem, this approach extracts the information ] needed to select an appropriate method or to avoid a set of inappropriate methods for the set of problems under a fixed distribution. Since the extracted information is a function of the problem distribution, this approach can be used flexibly for other problem distributions or other domains. The m ulti-method planning frame work investigated in this thesis is based on the empirical approach; however, the analytical approach will also be discussed later in more detail. W ithin the empirical multi-method planning framework, the main goal of this research is to create a set of multi-method planners which are more efficient and I ; applicable than single-method planners. Towards this end, the basic issues to j be investigated are: (1) how to create individual methods which have different f performance and scope so that the created m ulti-method planner can have both j highly efficient methods and highly applicable methods; and (2) how to coordinate 1 the created methods in an efficient manner so that the multi-method planner can j have high performance. Each of these issues is discussed in turn in the followingJ subsections. i 1 1.1.1 M ethod C reation ! i I I j 1 In order for a multi-method planner to satisfy both efficiency and applicability,1 i ' the single methods in the multi-method planner should range from highly efficient, methods to a complete method. For this purpose, a methodology to create a set o f' 1 methods with different performance and scope is developed. This methodology is based on the notion of bias in planning [Rosenbloom et a i, 1993]. Bias in planning; is any basis for choosing one plan over another other than plan correctness. W ith the view of planning as search over a space of plans [Korf, 1987], a bias is a restriction over the space of plans considered that determines which portion of the entire plan space can or will be the output of the planning process. For example, a linearity bias eliminates plans in which operators for different goal conjuncts are interleaved. In general, a bias can potentially reduce computational effort by reducing the num ber of plans that must be examined, and it can potentially generate shorter plans by avoiding plans containing inefficient operator sequences. However, this is not always the case. 
For example, if the space eliminated by a bias is not large enough or the eliminated space does not include a sufficient num ber of inefficient plans, the bias has no effect. W hether or not these cases happen relies on the domain characteristics. Thus, it is im portant to devise biases which are really effective for a given domain in term s of performance improvement. For a training set of problems in a given domain, a bias is called effective, if the average planning effort for the biased m ethod over the training problem set is less than the average planning effort for the method that does not use that bias, and the average length of plans generated from the biased m ethod over the training problem set is less than the average length of the plans generated from the method which does not use that bias. J By varying the effective biases, a set of methods with different performance and j i scope can be created. Given a set of effective biases, the most restricted m ethod; — which uses all of the biases — is the most efficient one, but can be incomplete! if the desired plans are eliminated. On the other hand, the least restricted m ethod; — which uses no bias — is the least efficient one, but can be a complete method since no plans are eliminated. ! s 1.1.2 M ethod Coordination i Once a set of methods with different performance and scope is created, these meth- < ods need to be coordinated efficiently so that the created m ulti-m ethod planner can 1 satisfy both efficiency and applicability. Method coordination, as used here, refers to (1) the selection of appropriate methods as situations arise, and (2) the granu larity of method switching as the situational demands shift. 1 M eth od selection: For method selection, individual methods need to be or ganized so that a higher level control structure can determine which m ethod to use first and which method to use next if the current m ethod fails. Two straightfor- 1 j ward ways of organizing individual methods in a m ulti-method planner are sequen- j tial and time-shared. A sequential m ulti-method planner consists of a sequence of { single-method planners. A time-shared m ulti-method planner consists of a set of j single-method planners in which each m ethod is active in turn for a given tim e slice [Barley, 1991]. In either approach, the key issue is how to reduce the effort) I wasted in using inappropriate methods. j The wasted effort in a sequential m ulti-method planner is the cost of trying, inappropriate earlier methods in the sequence, whereas the wasted effort in a time- j shared m ulti-method planner is the cost of trying all methods in the m ethod set except the one th at actually solves the problem. The wasted effort in sequential 1 multi-method planning is sensitive to the ordering of the methods because it takes ; | I 1 too much tim e if inappropriate earlier methods are not efficient enough, or in a n ! ' extreme case, it may not be able to generate a plan at all if one of the inappropriate J ! j earlier methods does not halt. On the other hand, the wasted effort in time- j I > : shared m ulti-method planning is sensitive to the number of individual methods. ' I Also, time-shared multi-method planning switches among methods more often than ; sequential multi-method planning, and it has more overhead for context sw itching.: The planning approach primarily investigated in this thesis is a special typei of sequential m ulti-method planning, called monotonic multi-method planning [Lee j ; and Rosenbloom, 1992]. 
In a monotonic m ulti-method planner, individual m ethods | axe sequenced so that the earlier methods are more efficient and have less coverage than the later methods. Compared with the single-method approach with planner, completeness and the time-shared multi-method approach, the monotonic m ulti-' m ethod approach can potentially generate plans more efficiently. The idea is that if the biases used in efficient methods can prune the search space, the problems solvable by efficient methods should be solved more quickly, while problems requir ing less biases should not waste too much extra tim e trying out the insufficient early methods. In this way, a monotonic m ulti-method planner can retain plan ner completeness by allowing the least restricted m ethod to be used, while it can generate low cost plans efficiently by using more restricted methods. One way to construct a monotonic multi-method planner is to use the biases which themselves increase efficiency. Individual methods are sequenced so that the set of biases used in a m ethod is a subset of the biases used in earlier m eth ods, and the later methods have more coverage than the earlier methods. This means that planning starts by trying highly efficient methods, and then succes sively relaxing biases until a sufficient m ethod is found. This type of planning i is called bias-relaxation multi-method planning. A bias relaxation multi-method! J i 1 planner is not necessarily a monotonic multi-method planner if there axe interac- | j tions among biases. However, one can generate monotonic multi-method planners J via bias-relaxation by just testing whether monotonicity holds for the created bias- * ! relaxation multi-method planners. In bias-relaxation m ulti-m ethod planning, each , bias is evaluated independently by comparing a m ethod which uses that bias only I and a m ethod which uses no bias. Thus, bias-relaxation multi-method planning has a restricted scope in creating and comparing individual methods, j G ra n u la rity of m e th o d sw itching: The second issue of m ethod coordination ; is the granularity at which individual methods are switched [Lee and Rosenbloom, ' 1993]. This issue is im portant in terms of a planner’s performance, because the I performance of a multi-method planner can be changed according to the granularity of shifting control from m ethod to method. The family of m ulti-method planning j t systems can be viewed on a granularity spectrum. At one extreme there is the | i i normal single-method approach, where one m ethod is selected ahead of tim e for ; : the entire set of problems. At another point of this spectrum are coarse-grained multi-method planners, where methods are switched for a whole problem when no solution can be found within the current method. Toward the other extreme, there i 10 are fine-grained multi-method planners, where methods are switched at any point during a problem at which a new set of subgoals is formulated. Time-shared m ulti m ethod planners, where methods are switched based on the tim e slice, also can be i viewed on the spectrum. There is a trade-off between coarse-grained multi-method planning and fine grained m ulti-method planning. A coarse-grained m ulti-method planner examines all paths within the current biased space until a solution is found or all paths are exhausted. 
Granularity of method switching: The second issue of method coordination is the granularity at which individual methods are switched [Lee and Rosenbloom, 1993]. This issue is important in terms of a planner's performance, because the performance of a multi-method planner can change according to the granularity of shifting control from method to method. The family of multi-method planning systems can be viewed on a granularity spectrum. At one extreme there is the normal single-method approach, where one method is selected ahead of time for the entire set of problems. At another point of this spectrum are coarse-grained multi-method planners, where methods are switched for a whole problem when no solution can be found within the current method. Toward the other extreme, there are fine-grained multi-method planners, where methods are switched at any point during a problem at which a new set of subgoals is formulated. Time-shared multi-method planners, where methods are switched based on the time slice, can also be viewed on the spectrum.

There is a trade-off between coarse-grained multi-method planning and fine-grained multi-method planning. A coarse-grained multi-method planner examines all paths within the current biased space until a solution is found or all paths are exhausted. Thus, a coarse-grained multi-method planner finds a solution within the first method that has one, at the cost of searching the entire biased space in the worst case (unless some form of within-method learning or heuristics is used to prune out some portions of the space, or unless the time limit is exceeded). On the other hand, a fine-grained multi-method planner falls back on the next method whenever the partial plan for the current solution path cannot be expanded without violating the biases used in the current method. Thus, it can save the effort of backtracking within the current method. However, it is not guaranteed to find a solution that may exist within the current biased space.

1.2 Implementation

A set of single-method planners and bias-relaxation multi-method planners — both coarse-grained and fine-grained versions — have been implemented in the context of the Soar architecture [Laird et al., 1987, Rosenbloom et al., 1991]. Soar is a useful vehicle for this work because its impasse-driven subgoaling scheme provides the necessary context for planning and its multiple problem-space scheme facilitates the multi-method planning approach, though it is difficult to implement context switching for time-shared multi-method planners in Soar.

Speed-up learning is used in both single-method planners and multi-method planners for each problem, but only within-trial transfer was allowed; that is, rules learned during one problem are not used for other problems. However, learned rules were allowed to transfer from an earlier method to a later method (for the same problem). That is, if a search path is evaluated within one method, and the results of the evaluation depend only on aspects of the method that are shared by a second method, then it should not be necessary to repeat that path when the second method is tried. Learned rules do not transfer across trials, because some rules are expensive, so that they may increase the planning time for later problems [Tambe and Newell, 1988]. Restricting expressiveness, such as by the unique-attribute scheme, can solve this problem [Tambe et al., 1990]; however, this thesis uses the multi-attribute scheme to learn rules with higher generality.

The implemented multi-method planners are compared with single-method planners theoretically and experimentally in three domains: blocks-world, machine-shop scheduling, and a simulated agent domain. In this thesis, the focus is on plans represented by STRIPS-like operators; however, since the multi-method framework in this thesis is independent of the operator representation, this framework should be extendable to planners with more expressive plan representations.

1.3 Contributions

The primary contributions of this thesis include the following:

1. A methodology for building a set of planning methods with different performance and scope. A bias determines the portion of the entire plan space considered. In particular, an effective bias improves planning performance by reducing the number of candidate plans and generates shorter plans by avoiding inefficient operator sequences. A methodology is developed to select a set of effective biases based on performance over a training problem set. By varying the selected effective biases, a set of methods with different performance and scope can be created.

2. A new planning framework for multiple methods.
A new multi-method planning framework is developed based on the relaxation of biases. Issues arising in multi-method planning, such as how to efficiently coordinate individual methods within a multi-method framework and the granularity at which methods can be switched, are investigated.

3. Performance improvement over single-method planning. Bias-relaxation multi-method planning with various granularities of method switching provides a planning system which can improve planning efficiency and reduce plan length without loss of planner completeness. In fact, for the domains investigated, both coarse-grained and fine-grained multi-method planning can reduce plan length significantly compared with single-method planning, and fine-grained planning can improve the planning time significantly compared with coarse-grained and single-method planning.

1.4 Guide to the Thesis

The body of this thesis consists of six chapters. Chapter 2 defines the notion of bias in planning. Examples of planning bias are presented along with the justifications on which these biases depend. The differences between bias and search-control heuristics are described.

Chapter 3 explains two bias dimensions — goal flexibility and goal protection — and defines single-method planners that vary along these dimensions. The implementation of these planners in Soar is described, and learning in Soar for single-method planning is discussed. Finally, experimental results in the blocks-world and machine-shop scheduling domains are provided.

Chapter 4 specifies how to build monotonic multi-method planners and bias-relaxation multi-method planners from a set of single-method planners. The issue of the granularity at which individual methods can be switched is investigated, and learning in multi-method planning is discussed. Experimental results for coarse-grained and fine-grained multi-method planners are presented and compared with the results for single-method planners. Finally, the performance of multi-method planning is compared with the performance of partial-order planning.

Chapter 5 shows how this approach can be applied to a more complex domain such as a simulated agent domain. Finally, chapters 6 and 7 discuss related work and conclusions.

Chapter 2

Bias in Planning

Bias was originally defined in the context of concept learning from preclassified training instances as any basis for choosing one generalization over another, other than strict consistency with the observed training instances [Mitchell, 1980]. Transferring the notion of bias to planning, it can be defined as "any basis for choosing one plan over another other than plan correctness", where a plan is correct if the application of the plan transforms the initial state into the goal state [Rosenbloom et al., 1993].

The notion of bias is useful in planning, because bias can reduce computation effort by reducing the number of plans that must be examined, and it can potentially generate shorter plans by avoiding plans containing inefficient operator sequences, such as ones that undo achieved goals or loop on states. The notion is particularly useful in multi-method planning, because bias can provide a basis for building a set of planning methods with different performance and scope. Also, method switching in multi-method planning can be easily accomplished by changing the set of biases used in the individual methods.
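The correctness criterion just given can be rendered directly for STRIPS-like operators. The sketch below reuses the set-of-literals encoding assumed in the Chapter 1 sketch; the dictionary-based operator format is an illustrative assumption, not the thesis's representation.

    def correct(plan, initial_state, goal):
        # A plan is correct iff applying its operators in sequence transforms
        # the initial state into a state satisfying every goal conjunct
        # (the goal is only a partial description of the desired state).
        state = set(initial_state)
        for op in plan:
            if not op["pre"] <= state:     # an operator is inapplicable
                return False
            state = (state - op["del"]) | op["add"]
        return goal <= state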
This chapter begins with the notion of bias in inductive concept learning, and then describes how this notion is applied to planning. Some examples of planning biases are presented, and finally, the relationship between search control and bias is discussed.

2.1 Bias in Inductive Learning

An induction problem is, in general, given an instance description language and a set of training instances, to determine a generalization that is consistent with the training instances [Mitchell, 1980]. In induction, an unbiased hypothesis space, denoted as H, consists of every possible generalization on the instance space — that is, the power set of the training instances. The unbiased version space, denoted as H_V ⊆ H, is the portion of H that is consistent with the observed training instances. Then, a bias b determines a biased hypothesis space, denoted as H^b ⊆ H, so that the output generalization can be selected from H_V ∩ H^b.

It has been shown that bias plays an important role in induction, because it influences hypothesis selection [Utgoff, 1986]. Without bias, an induction system has no basis for choosing one generalization over another. In other words, bias enables induction systems to determine how to go beyond the training instances; that is, which inductive leaps to make.

Bias can be either absolute or relative. An absolute bias completely removes parts of the unbiased version space. For example, a generalization language provides an absolute bias by eliminating any element of the unbiased version space not expressible in the language. A relative bias defines a partial order over portions of the unbiased version space. For example, one can prefer one hypothesis to another based on measures such as simplicity of the hypothesis [Michalski et al., 1986].

2.2 Application of Bias to Planning

As in inductive learning, the notion of bias can be formalized in planning. Planning can be defined in terms of the notion of a problem space [Newell et al., 1991]. A problem space consists of a set of states S, and a set of operators O. A problem, denoted as p = (S_0, S_g), consists of two components, S_0 ∈ S and S_g ∈ S, where S_0 is a description of an initial state of the world and S_g is a partial description of a desired state. A plan for a problem p = (S_0, S_g) can be defined as a structure that represents the sequence of operators in O that achieves S_g from S_0 by applying each operator to each of the resulting states in the sequence.

    Figure 2.1: Analogy between concept learning and planning: (a) inductive concept learning; (b) planning.
In both cases the output of the process is to be some element of the unbiased hypothesis space that is consistent with the process’s correctness criterion. Where the two cases differ is in the definitions of “unbiased hypothesis space” and “correctness criterion”. In concept learning, the unbiased hypothesis space is the power set of the training instances, and the correctness criterion is con sistency with the observed training instances. In planning, the unbiased hypothesis space is the power sequence of the possible operators, and the correctness criterion is whether the application of the plan achieves the goal state from the initial state. I In spite of these differences, bias together with the process’s correctness criterion, | I in both cases, determines which portion of the unbiased space can be the output of the process. | As in the case of induction, an absolute use of bias in planning engenders in- j completeness in the planner. This incompleteness can be used to speed up the ! planner by reducing the number of plans th at the planner can possibly generate; I for particular problems. However, it only really helps if the bias is an appropriate I . one; otherwise, desired plans can be eliminated. A relative use of bias does not ■ ; introduce incompleteness. However, if the bias is not an appropriate one, generated ’ plans may not be the desired ones. Thus, in order to show that using a bias is plausible, some form of appropriate justification is needed. For example, with a i ' f ................................... 1 " i 1Tfae specification here assumes that the plan space contains only totally-ordered sequences of operators, but it does not rule out a search strategy that incrementally specifies an element of t i the plan space by refining a partially-ordered plan structure. H Dj m Initial State Gl: (on AB) G2: (on C D) Goal (a) (MOVE B Table) —► (MOVE D Table) —* ► (MOVE A B)—► (MOVE C D) ForG l ForG2 F orG l ForG2 (b) (MOVE B Table) —► (MOVE A B) —► (MOVE D Table) —► (MOVE C D) ForG l ForG2 i (C) Figure 2.2: Example of the effects of a linearity bias on the plan space: (a) initial state and goal conjuncts, (b) plan eliminated, (c) plan remaining. independence justification, one assumes that goal conjuncts are achieved by inde- ! pendent processes without interfering with other goal conjuncts in a conjunctive ; _ . . . . . 1 ; goal problem. W ith a progress justification, one assumes that it is always possible J to move forward to solve the problem, and never required to move backward. A j boundedness justification limits the total effort that it is reasonable to expend in ♦ ! solving a problem or a set of problems. In the next section, examples of planning j biases based on these justifications are presented. 2.3 E x a m p les o f P la n n in g B ia ses I , i ; Two typical planning biases justified by an independence justification are lin- j earity and protection. A linearity bias removes all plans in which operators for different unachieved goal conjuncts occur in succession; that is, once an operator 1 for one unachieved goal conjunct is in the plan, operators for other conjuncts can be placed only after the first goal conjunct has been achieved. 
For example, given Gl: (onCD) G2: (on B Q G3: (on A B) Goal (a) t : (MOVE A B) — (MOVE C D)—► (MOVE A Table) —► (MOVE B C) —► (MOVE A B) ForG3 ForG l ForG2 ForG3 (b) (MOVE C D)—► (MOVE B C) —— (MOVE A B) F orG l ForG2 ForG3 (c) Figure 2.3: Example of the effects of a protection bias on the plan space: (a) initial state and goal conjuncts, (b) plan eliminated, (c) plan remaining. ; the initial state and the goal conjuncts in Figure 2.2(a), plans such as the one in ; Figure 2.2(b) would be eliminated, while plans such as the one in Figure 2.2(c) | i ! would remain. Linearity depends on an independence justification because one , i . . . ! assumes that while solving one goal conjunct, operators for other goal conjuncts need not be considered. A protection bias eliminates all plans in which an operator undoes a goal con junct that is either true in the initial state or established by an earlier operator in the sequence.2 For example, given the initial state and the goal conjuncts in ! Figure 2.3(a), plans such as the one in Figure 2.3(b) would be eliminated since the , 2The notion of protection used here was introduced by Sussman [1973]. Other forms \ of protection can be found in the planning literature. For example, one can protect the current goal conjunct from being clobbered by other operators while regressing an opera tor or a goal through a partial linear plan [Warren, 1976, Waldinger, 1977]. In partial- order planning, one can protect a causal link from being clobbered by any other plan ning steps, within the interval where the causal link is needed [Tate, 1977, Chapman, 1987, Barrett and Weld, 1992]. H 0 H H Initial State 20 | Gl: (on A Table) G2: (cm B Table) G3: (on C Table) Initial State Goal (a) i (MOVE A Table) — (MOVE B A) —► (MOVE C Table) —► (MOVE B Table) i ForGl ForG3 ForG2 i ' (b) I t (MOVE A Table)—► (MOVE B Table) —► (MOVE C Table) ForGl FotG2 ForG3 t (c > ! i Figure 2.4: Example of the effects of a directness bias on the plan space: (a) initial, state and goal conjuncts, (b) plan eliminated, (c) plan remaining. i | operator (MOVE A T a b l e ) undoes the goal conjunct ( o n A B ) which is established; I by the earlier operator (MOVE A B ) , while plans such as the one in Figure 2.3(c) ■ would remain. Protection is based on an independence justification since one as- j sumes that while solving one goal conjunct, operators that interact negatively w ith' previous goal conjuncts need not be considered. A progress justification underlies all greedy biases. For example, protection is also justified by a progress justification, because once a goal is achieved, it would never be undone. Another bias justified by a progress justification is directness. ; A directness bias eliminates all plans in which there is at least one operator that does not directly achieve a goal conjunct included in the problem definition. For example, given the goal conjuncts and operators in Figure 2.4(a), plans such as the one in Figure 2.4(b) would be eliminated since the operator (MOVE B A ) does not directly achieve any of the goal conjuncts in the problem definition, while plans such as the one in Figure 2.4(c) would remain. 
Directness is justified by a progress justification, because whenever an operator is applied, a new goal is always achieved, increasing the degree of goal achievement for the entire problem. Directness is a quite interesting bias because it ensures that the number of operators to achieve each of the goal conjuncts is bounded by one.

Biases justified by a boundedness justification include goal-depth, goal-breadth, plan-length, and goal-nonrepetition. Both goal-depth and goal-breadth limit the size of the goal hierarchy used in planners based on means-ends analysis (MEA), so that planning effort can be reduced. For a predefined bound n, a goal-depth bias eliminates from the hypothesis space all plans that require more than n levels of subgoals to generate, while a goal-breadth bias eliminates all plans that require more than n conjunctive subgoals for a single goal. Directness is also justified by boundedness. In fact, directness can be viewed as a special case of the goal-depth bias, since it allows no generation of subgoals, thus ensuring that the depth of the goal hierarchy is bounded by one.

A plan-length bias eliminates all plans which consist of a sequence of more than n operators, for a predefined bound n, so that the length of the output plan can be bounded. It can be used either on a problem-by-problem basis or on a goal-by-goal basis. If plan-length is used on a problem-by-problem basis, the length of the output plan for the entire problem is guaranteed to be no more than n. If it is used on a goal-by-goal basis for a set of conjunctive goals, the length of the plan for the conjunctive goals is bounded by n times the number of goal conjuncts. In fact, directness is also a special case of the goal-by-goal plan-length bias, where n = 1, because the length of the plan for each goal is bounded by one.

Figure 2.5: Example of the effects of a goal-nonrepetition bias on the plan space: (a) initial state and goal conjuncts (G1: (on A Table), G2: (clear C)); (b) plan eliminated: (MOVE A D) [for (clear B)] → (MOVE B Table) [for G2] → (MOVE A Table) [for G1]; (c) plan remaining: (MOVE A Table) [for G1] → (MOVE B Table) [for G2].

A goal-nonrepetition bias eliminates all plans that require a repetition on a goal literal; that is, if satisfying an unmet precondition for a selected operator requires a new goal conjunct whose literal is equivalent to the literal of its ancestor in the goal hierarchy, then that plan would be eliminated. For example, given the goal conjuncts and operators in Figure 2.5(a), plans such as the one in Figure 2.5(b) would be eliminated because it requires a repetition on a goal literal — operator (MOVE B Table) is chosen for conjunct (clear C), but in making it applicable, an iterative clear conjunct (clear B) is generated (resulting in the selection of (MOVE A D) as the first operator). On the other hand, plans such as the one in Figure 2.5(c) would remain. The prime reason to use a goal-nonrepetition bias is that it forces learning from non-repetitive paths by eliminating all plans that require a repetition on a goal conjunct, so that learning specific rules for each size of repetition can be avoided. In this way, it is closely related to Etzioni's [1990b] work on restricting EBL to learn from only non-recursive explanations. This relationship will be discussed later in more detail.
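Stated procedurally, each of these biases is a test that eliminates candidate plans. The sketch below, offered as a minimal illustration rather than as any implementation from this thesis, phrases three of the biases as predicates over totally-ordered plans; the STRIPS-style add/delete representation of operators is an assumption made for the example.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Operator:
        name: str
        add: frozenset      # literals the operator establishes
        delete: frozenset   # literals the operator undoes

    def violates_protection(plan, initial_state, goal_conjuncts):
        # A protection bias eliminates plans in which some operator undoes a
        # goal conjunct that is true initially or established earlier.
        state = set(initial_state)
        for op in plan:
            if any(lit in goal_conjuncts and lit in state for lit in op.delete):
                return True
            state = (state - op.delete) | op.add
        return False

    def violates_directness(plan, goal_conjuncts):
        # A directness bias eliminates plans containing an operator that
        # achieves no goal conjunct from the problem definition.
        return any(not (op.add & goal_conjuncts) for op in plan)

    def violates_plan_length(plan, n):
        # A problem-by-problem plan-length bias with bound n.
        return len(plan) > n

A biased planner would simply discard any candidate plan for which one of these tests returns True.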
2.4 Relationship with Search Control

As described before, bias affects the output of the planning process. However, if bias can be incorporated directly into the planning procedure, then it can also have a significant impact on the efficiency of the planning process by reducing the number of candidate plans that are generated. In this way, bias can lead to effective control of search.

1. If the goal of the problem is achieved, show the output plan and stop; else continue.
2. Select a goal from the goal hierarchy.
3. Select an operator to achieve the selected goal.
4. If the selected operator is applicable to the current state, create a new state by applying the operator, and remove achieved goals from the goal hierarchy. Go to step 1.
5. If the selected operator is not applicable to the current state, create subgoals to establish the unmet preconditions of the operator and add them to the goal hierarchy. Go to step 1.

Figure 2.6: A recursive planning procedure based on means-ends analysis.

For example, consider a recursive planning procedure based on means-ends analysis, as shown in Figure 2.6.3 Table 2.1 shows the planning biases classified according to the way they can be incorporated into this procedure. Linearity can be incorporated into goal selection (step 2) by selecting a new goal conjunct only after the current goal conjunct is achieved. Protection and plan-length can be incorporated into operator selection (step 3) by rejecting operators which violate the criterion for the bias. Goal-depth, goal-breadth, directness, and goal-nonrepetition can be incorporated into goal expansion (step 5) by limiting the expansion of the goal hierarchy.

3 This algorithm is comparable to the one used in NOLIMIT [Veloso, 1989].

    Class                Bias                 Independence   Progress   Boundedness
    Goal selection       Linearity                 o
    Operator selection   Protection                o            o
                         Plan-length                                        o
    Goal expansion       Directness                             o          o
                         Goal-depth                                        o
                         Goal-breadth                                      o
                         Goal-nonrepetition                                o

Table 2.1: Examples of planning biases and their justifications classified according to a MEA-based planning procedure.

However, despite this close relationship between search control and bias, there is a distinction between the two. Bias determines which plan is generated from the plan space, while search strategies determine the efficiency with which that plan is found from the search space. In general, the search space is not necessarily equivalent to the plan space. For example, a node in the search space for a MEA-based planner with the above procedure can be defined as a combination of the current state and goal hierarchy, whereas a node in the plan space for this planner can be defined as a partial sequence of operators. Whenever an operator is applied to the current state, a node is expanded both in the search space and the plan space. However, if the selected operator is not applicable to the current state, a node in the plan space is not expanded, while a node in the search space is expanded for the new goal hierarchy.
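To show how these incorporation points line up with Figure 2.6, the following sketch interleaves the bias checks of Table 2.1 into a depth-first rendering of the procedure. It is a simplified illustration under assumed STRIPS-style operators (with pre, add, and delete sets), not the planner used in this research; in particular, it flattens the goal hierarchy into one goal set and makes no repeated-state check, so a plan-length or goal-depth bound is what keeps the sketch finite.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Op:
        name: str
        pre: frozenset
        add: frozenset
        delete: frozenset

    def mea(state, goals, operators, biases, depth=0, plan=()):
        # Steps numbered as in Figure 2.6; biases as classified in Table 2.1.
        if goals <= state:
            return list(plan)                               # step 1
        open_goals = [g for g in goals if g not in state]   # order unspecified
        if biases.get("linearity"):                         # step 2: goal selection
            open_goals = open_goals[:1]   # commit to one conjunct at a time
        for goal in open_goals:
            for op in operators:
                if goal not in op.add:
                    continue
                # Step 3: operator selection (protection, plan-length).
                if biases.get("protection") and op.delete & goals & state:
                    continue
                if len(plan) >= biases.get("plan-length", float("inf")):
                    continue
                if op.pre <= state:                         # step 4: apply
                    result = mea((state - op.delete) | op.add, goals,
                                 operators, biases, depth, plan + (op.name,))
                else:
                    # Step 5: goal expansion (directness, goal-depth).
                    if biases.get("directness"):
                        continue
                    if depth >= biases.get("goal-depth", float("inf")):
                        continue
                    new_goals = goals | (op.pre - state)
                    if new_goals == goals:
                        continue   # already pursuing these preconditions
                    result = mea(state, new_goals, operators, biases,
                                 depth + 1, plan)
                if result is not None:
                    return result
        return None

Under this reading, tightening a bias simply prunes branches at the marked steps, which is the sense in which bias acts as search control.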
2.5 Summary

In this chapter, the notion of bias is applied to planning. An analogy between the processes of concept learning and planning is presented in terms of the usage of bias in the process. Since bias determines which portion of the unbiased space can be the output of the process, using an appropriate bias is critical; if it is too weak it has no effect, but if it is too strong it can eliminate the desired output. Some examples of planning biases are introduced which can be justified by independence, progress, or boundedness. The relationship between search control and bias is discussed.

Chapter 3

Single-Method Planners

As described in the introduction, the first hypothesis underlying this research is that no single method will satisfy both sufficiency and efficiency for all situations. The ideal way to evaluate the hypothesis would be to construct all possible single-method planners and to evaluate their performance and scope for every possible domain. However, this is not possible. In this research, a system is constructed that can utilize a set of different planning methods, which vary in the amount of bias used. These methods are implemented in the context of Soar, an architecture which integrates basic capabilities for problem-solving, use of knowledge, learning, and perceptual-motor behavior [Laird et al., 1987, Rosenbloom et al., 1991].

Soar has not traditionally been seen as a planning architecture, partly because it does not create structures that resemble traditional plans, such as totally-ordered plans or partially-ordered plans, and partly because its problem-solving approach does not closely resemble the traditional planning methods [Rosenbloom et al., 1993]. However, recent work on a Soar-based framework for planning has demonstrated how versions of such standard planning methods as linear, nonlinear, and abstraction planning can be derived from the Soar architecture [Rosenbloom et al., 1990].

This chapter begins with an overview of planning in Soar and introduces a set of different planning methods as implemented in Soar (version 6). The effect of learning in these methods with respect to the performance of planning is discussed. Finally, these methods are evaluated experimentally in terms of planner completeness (for sufficiency), planning time and plan length (for planning efficiency and execution efficiency, respectively) in two domains.

3.1 Planning in Soar

3.1.1 Overview of Soar

Soar is based on the hypothesis that all symbolic goal-oriented behavior may be represented in terms of problem spaces [Newell et al., 1991]. A problem space is defined by a set of states and a set of operators. The states represent situations, and the operators represent actions which apply to current states to yield new states. Problem-solving in Soar is driven by applying operators to states within a problem space to achieve a goal. A goal context consists of a goal, together with the current problem space, state, and operator.

Figure 3.1 illustrates the architectural structure of Soar. Knowledge is stored in a permanent recognition memory and a temporary working memory. Recognition memory consists of a set of variabilized rules, where each rule is a condition-action pair. The conditions of each rule match against the content of working memory. Conditions can contain variables, so that a single condition can match against different data in working memory. If the conditions of a rule are matched, the actions of the rule are instantiated to propose preferences that change the working memory. The most typical preferences are feasibility (acceptable, reject) and desirability (best, better, indifferent, worse, worst) preferences. These preferences are held in preference memory and used by a decision procedure to determine what changes are made to working memory.1
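As a rough illustration of this decision cycle, the fragment below combines a candidate set with a simplified preference scheme. It is only a sketch under assumed encodings (preferences as (type, operator) pairs); Soar's actual decision procedure handles the full preference language and the impasses discussed next.

    def decide(candidates, preferences):
        # Keep operators that are acceptable and not rejected.
        live = [c for c in candidates
                if ("acceptable", c) in preferences
                and ("reject", c) not in preferences]
        # A unique best candidate wins outright.
        best = [c for c in live if ("best", c) in preferences]
        if len(best) == 1:
            return best[0]
        # Otherwise avoid worst candidates when possible.
        rest = [c for c in live if ("worst", c) not in preferences] or live
        # A unique survivor is selected; anything else stands in for an impasse.
        return rest[0] if len(rest) == 1 else None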
1 The current Soar version has a separate preference memory, which is not included in this figure for simplicity.

Figure 3.1: The Soar architecture (recognition memory and working memory, the decision procedure, subgoaling, learning, and perception/motor control connected to the environment).

If the system does not have sufficient information about a situation to make a decision for that situation, then an impasse arises. For example, if the system is unable to select the next operator from a set of candidate operators, then an impasse, called a tie impasse (or selection impasse), arises. Other types of impasses are generated if the system fails to generate a set of candidate operators (generation impasse), or fails to execute the selected operator (execution impasse) [Rosenbloom et al., 1990].

In response to an impasse, a subgoal is automatically generated. Within the subgoal, Soar searches for more information that can lead to the resolution of the impasse. As the result of the subgoal, new preferences are generated and new rules are learned (via a chunking process) whose actions are based on the preferences that are the results of the subgoal, and whose conditions are based on the working-memory elements in supergoals that led to the results. In effect, chunking is much like explanation-based learning [Rosenbloom and Laird, 1986].

Note that the notions of subgoal and operator in Soar should be distinguished from those in traditional planning. A Soar subgoal is generated in response to an impasse whenever progress cannot be made on the current goal, and terminated when the impasse is resolved. On the other hand, a planning subgoal is generated in response to a precondition violation and terminated when the violated condition is achieved. A precondition violation may or may not create an impasse in Soar depending on whether or not knowledge to achieve the violated condition is available in the current goal context.

In the planning framework for this research, planning goals (together with subgoals) and their hierarchy are explicitly represented as augmentations of Soar states. Precondition violation is handled in a single goal context without creating a Soar subgoal. However, if there is no information about how to apply an operator (yielding an execution impasse) or how to select among the candidate operators (yielding a tie impasse), a Soar subgoal is created. A planning operator is represented as a set of variabilized rules which create and apply an instantiated Soar operator to change the current Soar state, where the current planning state and the goal hierarchy are represented.

1. If the goal of the problem is achieved, stop; else continue.
2. Select an operator to achieve one of the active goal conjuncts in the goal hierarchy.
3. If the selected operator is applicable to the current state, create a new state by applying the operator, and remove achieved goals from the goal hierarchy. Go to step 1.
4. If the selected operator is not applicable to the current state, create subgoals to establish the unmet preconditions of the operator and add them to the goal hierarchy. Go to step 2.

Figure 3.2: The planning algorithm based on means-ends analysis as implemented in Soar.

In this work, the predominant planning method in Soar is means-ends analysis (MEA).
The version of means-ends analysis implemented in this work is close to the algorithm described in Figure 2.6. Figure 3.2 shows the skeleton of the MEA-based planning algorithm implemented in Soar for the framework of this thesis. There are only two differences between this algorithm and the one shown in Figure 2.6. First, in this algorithm, a goal conjunct is selected implicitly from the goal hierarchy when an operator is selected in step 2. By merging two steps (goal selection and operator selection) into a single operator-selection step, the number of decisions required to generate a plan can be reduced. Second, there is no explicit output plan to print in step 1 in this algorithm. This is because a plan in Soar is rarely represented as a unitary entity like a totally-ordered or partially-ordered plan. Instead, a plan in Soar is represented as a set of control rules or a set of preferences which jointly specify which operators should be executed at each point in time. In the following sections, operator representation, plan representation, and planning in Soar are described in more detail.

3.1.2 Operator Representation in Soar

In the planning framework for this thesis, the implementations of planning operators are represented by three classes of variabilized rules in recognition memory — operator proposal rules, operator application rules, and goal expansion rules — plus instantiated operator objects in working memory. An operator proposal rule implements a bit of means-ends analysis, determining when it is appropriate to propose operators. This rule is instantiated (possibly multiple times) based on the current goal hierarchy represented in the working memory, creating a set of instantiated Soar operators in the working memory.

    If the problem-space is blocks-world
    ∧ There exists an active goal (on <x> <y>)
    ∧ (on <x> <y>) is not achieved
    ⇒ Propose an operator (MOVE <x> <y>) for (on <x> <y>)

    (a) An operator proposal rule for (on <x> <y>).

    If the problem-space is blocks-world
    ∧ There exists an active goal (clear <x>)
    ∧ (clear <x>) is not achieved
    ∧ There exists a block <top> on top of <x>
    ∧ There exists an object <y> which is different from <x> and <top>
    ⇒ Propose an operator (MOVE <top> <y>) for (clear <x>)

    (b) An operator proposal rule for (clear <x>).

Figure 3.3: Examples of operator proposal rules.

Figure 3.3 shows examples of operator proposal rules in the blocks-world domain. In our implementation of blocks-world, there is a single general operator, MOVE, which moves a block from one location to another. However, depending on the type of goal this operator is trying to achieve, different operator proposal rules can be specified, as shown in Figure 3.3(a) and (b). A single operator proposal rule can be instantiated with different components of the state, yielding multiple instantiated operators. In Figure 3.3(a), for example, if the goal of a problem is to stack a set of n blocks, represented by (and (on Block1 Block2) (on Block2 Block3) ... (on Blockn-1 Blockn)), then (on <x> <y>) is instantiated with each of the n - 1 goal conjuncts when the problem solving starts.
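Outside of Soar, the effect of such a rule can be imitated by a function that pattern-matches the goal hierarchy. The sketch below does this for the rule of Figure 3.3(a); the tuple encoding of goals is an assumption of the example, not the system's working-memory representation.

    def propose_move_for_on(active_goals, achieved):
        # One instantiation of (MOVE <x> <y>) per unachieved (on <x> <y>)
        # goal, mimicking the proposal rule of Figure 3.3(a).
        return [("MOVE", x, y)
                for (pred, x, y) in active_goals
                if pred == "on" and (pred, x, y) not in achieved]

    # For the n-block stacking goal discussed above (here n = 4), the single
    # rule yields n - 1 instantiated operators:
    goals = {("on", "A", "B"), ("on", "B", "C"), ("on", "C", "D")}
    assert len(propose_move_for_on(goals, achieved=set())) == 3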
    If the problem-space is blocks-world
    ∧ The operator is (MOVE <x> <z>) for goal <w>
    ∧ <x> is on <y>
    ∧ <x> and <z> are clear
    ∧ <z> is the table
    ⇒ <x> is not on <y>
    ∧ <x> is on <z>
    ∧ <y> is clear
    ∧ <w> is achieved

    (a) An operator application rule to put down a block onto the table.

    If the problem-space is blocks-world
    ∧ The operator is (MOVE <x> <z>) for goal <w>
    ∧ <x> is on <y>
    ∧ <x> and <z> are clear
    ∧ <z> is a block
    ⇒ <x> is not on <y>
    ∧ <x> is on <z>
    ∧ <y> is clear
    ∧ <z> is not clear
    ∧ <w> is achieved

    (b) An operator application rule to stack a block onto another block.

Figure 3.4: Examples of operator application rules.

Once an operator has been selected for the current state by the decision procedure, it can be applied to generate a new state if its preconditions are met. Figure 3.4 shows examples of operator application rules for the MOVE operator. In this implementation of blocks-world, two operator application rules are used for this operator: one to put down a block onto the table, and one to stack a block onto another block. Once an operator has been applied, operator proposal rules are matched with the new state and the updated goal hierarchy, generating the next set of candidate operators.

If the selected operator is not applicable, goal expansion rules (as shown in Figure 3.5) are instantiated to generate a new goal hierarchy. In effect, goal expansion rules implement operator subgoaling in means-ends analysis.

    If the problem-space is blocks-world
    ∧ The operator is (MOVE <x> <z>) for goal <w>
    ∧ <x> is not clear
    ⇒ Create a new goal <new> to clear <x>
    ∧ The parent of <new> is <w> in the goal hierarchy

    (a) A goal expansion rule in which the block to be moved is not clear.

    If the problem-space is blocks-world
    ∧ The operator is (MOVE <x> <z>) for goal <w>
    ∧ <z> is not clear
    ⇒ Create a new goal <new> to clear <z>
    ∧ The parent of <new> is <w> in the goal hierarchy

    (b) A goal expansion rule in which the destination block is not clear.

Figure 3.5: Examples of goal expansion rules.

3.1.3 Plan Representation in Soar

As described in Chapter 2, a plan for a problem can be defined as a structure that represents the sequence of actions to be taken for that problem [Rosenbloom et al., 1993]. With this definition of a plan in hand, two predominant structures can be identified that serve as plans in Soar. The first structure is the set of variabilized control rules in recognition memory that serves as generalized plans for classes of potential goals. Control rules are different from operator representation rules described in the previous section in that control rules generate instantiated preferences to help select the current operator from the candidate operators, thus yielding indirectly a sequence of operators. The second structure is the set of instantiated preferences in preference memory that serves as instantiated plans for active goals.

    If the problem-space is blocks-world
    ∧ Goal protection is assumed to hold
    ∧ There exists an active goal (on <x> <y>)
    ∧ There exists an active goal (on <y> <z>)
    ∧ (on <x> <y>) and (on <y> <z>) are not achieved
    ∧ <x> and <y> are blocks
    ∧ <x> and <y> are clear
    ∧ The proposed operator is (MOVE <x> <y>)
    ⇒ The operator is worst

Figure 3.6: A generalized plan.
The instantiated preferences can be generated either by the generalized plans or as the results of subgoals (that is, by planning).

Figure 3.6 shows an example of a generalized plan for a set of problems shown in Figure 3.7(a). This rule implies that if goal protection is assumed to hold, one wants a stack of at least three blocks, neither of the top two blocks (out of the three) is in position, both of the top two blocks (out of the three) are clear, and an operator is proposed to put the top one on the second one, then that operator is worst.

Figure 3.7(b) shows the sequence of steps to generate a sequence of operators for a four-block-stacking problem. For each step, it shows the current state, the goal conjuncts that have not yet been achieved, the operators proposed, and the portion of the instantiated plan generated from the generalized plan in Figure 3.6. Figure 3.7(c) then shows the actual operator sequence this plan generates.

Figure 3.7: The plan representation in Soar: (a) a set of problems which are solvable by the rule in Figure 3.6 (stacking goals of increasing size: G1: (on A B), G2: (on B C); then G1-G3 adding (on C D); then G1-G4 adding (on D E); and so on); (b) the sequence of steps for a four-block-stacking problem, showing at each step the unachieved goal conjuncts, the proposed operators, and the instantiated plan fragments ("(MOVE A B) is worst", "(MOVE B C) is worst"); (c) the sequence of operators: (MOVE C D) → (MOVE B C) → (MOVE A B).

The plan representation in Soar has many interesting aspects. First, the preference language has an imperative construct (best) that allows relatively direct specification of the next action to perform; however, it also goes beyond this. For example, partial orders can be specified by using binary preferences such as worse and better, and also operator avoidance can be specified by using worst and reject preferences. Second, the use of control rules provides a fine-grained conditionality and context sensitivity that allows it to easily encode such control structures as conditionals and loops. In addition, the variabilization of the control rules allows a single plan fragment to be instantiated for multiple related decisions.

3.1.4 Planning Methods in Soar

Although the plan representation in Soar is different from conventional plan representations, recent work on a Soar-based framework of planning has demonstrated how versions of such standard planning methods as linear, nonlinear, and abstraction planning can be derived by adding method increments that include core means-ends knowledge about what operators to suggest for consideration, and varying knowledge about how to respond to impasses resulting from precondition failures [Rosenbloom et al., 1990].

Figure 3.8 illustrates initial traces of particular versions of these three forms of planning as implemented in Soar for Sussman's anomaly (Figure 1.1) in the blocks-world.2 They all start with a top-level operator that is to achieve the entire conjunctive goal — (and (on B C) (on A B)) — directly from the initial state, and reach an execution impasse if there is no information about how to do this.

2 Abstraction in the blocks world is shown for comparison purposes. Although abstraction has not actually been implemented within the planning framework for this research, it has been implemented in Soar by Unruh [1993].
In response to this impasse, a subgoal is created where means-ends analysis is used to generate the set of candidate operators — (MOVE B C) and (MOVE A B) — that are known to potentially be able to achieve any of the goal conjuncts. A tie impasse then occurs unless there is information about how to pick among them (or unless only one operator is generated). In this tie impasse, a look-ahead search begins by selecting one of the alternatives to evaluate — here it is (MOVE A B). Its preconditions are tested and if the operator is known to be applicable, it is executed to create a new state. If it is not known to be applicable — as here — what happens next depends on the planning method.

Figure 3.8: Planning in the blocks-world using (a) linear, (b) nonlinear, and (c) abstraction planning (legend: current state, execution and tie impasses, active and pending goals, entering/returning from a subgoal, operator application).

With abstraction, the operator is executed anyway and problem solving just continues. In Figure 3.8(c), for example, operator (MOVE A B) is executed even though block A is not clear. Without abstraction, as in Figure 3.8(a) and (b), a new set of goal conjuncts is generated from the operator's unmet preconditions.

The difference between linear and nonlinear planning, at least for these versions, is in the focus of operator generation from the new goal hierarchy. Linear planning shifts focus completely to the new conjunct — (clear A) as in Figure 3.8(a). It stays with the new conjunct until it is achieved, and then pops back to the original conjunct that led to the precondition violation. Processing shifts to one of its siblings (if there are any) only after the original conjunct is achieved.

Nonlinear planning instead shifts focus to an expanded set of conjuncts that includes the new set plus the original set minus the conjunct that led to the precondition violation, yielding (on B C) and (clear A) here (Figure 3.8(b)). At any point in time an operator can be selected for any of these conjuncts, enabling operator sequences to be interleaved as necessary (similar to the casual-commitment approach to nonlinear planning [Veloso, 1989]). For both planning methods, once the new focus has been determined, planning continues recursively by using means-ends analysis to generate candidate operators from the new goal hierarchy.

So far, we have been referring to these methods as "planning methods", because they are versions of classical methods used in the creation of plans. With this notion in hand, the question to be asked then is how they actually yield plans. As mentioned earlier, a plan in Soar consists of a set of plan fragments — that is, a set of either instantiated preferences or generalized control rules. Instantiated plan fragments are generated whenever operator preferences are created in working memory. This can happen simply by the instantiation of a generalized plan fragment (by the execution of a control rule) or as a result of projection in an operator-selection subgoal.
In projection, one or more operators are tried out in look-ahead search to see which ones lead to success or failure. Success engenders best preferences and failure engenders worst preferences. For example, in Figure 3.8(a) a best preference is returned from the selection subgoal if the result of evaluating (MOVE A B) is success, whereas a worst preference is returned if the result is failure. These preferences act directly as fragments of a plan for the currently active goals. In addition, whenever a preference is returned as a result of a subgoal, it triggers Soar's chunking process, which creates and stores a control rule that acts as a generalized plan fragment for classes of problems. These relationships are summarized by the following two influence paths.

    Planning method ⇒ Projection ⇒ Instantiated plan
    Planning method ⇒ Projection ⇒ Learning ⇒ Generalized plan

While projection plays an integral role in determining which plans are created, what is projected and what is considered to be success or failure are determined by the planning method. Within this framework, planning biases are implemented by altering the planning method, which then determines which plans are created, through the influence paths above. For example, a protection bias is implemented by altering the planning method to terminate look-ahead with failure any time a projected path leads to a protection violation. In comparison to the same planner without this bias, the protection planner will lead to the creation of worst preferences (and negative control rules) which will avoid paths that violate protection.

3.2 Implemented Planning Biases

Within the context of Soar, an integrated planning system has been constructed which utilizes a set of different methods. These methods vary in the amount of bias used. The planning biases that have been concentrated on in this research are directness, linearity, and protection. Linearity and protection are chosen because they have been widely used in the planning literature, and directness is chosen because it can generate an efficient plan very quickly for a number of problems. These three biases are defined along two bias dimensions — goal flexibility and goal protection.

Figure 3.9: Goal-flexibility dimension: (a) no subgoal (directness), (b) local focus (linear), (c) global focus (nonlinear); each point is shown as a goal hierarchy of active and pending goals.

The goal-flexibility dimension is shown in Figure 3.9. It determines the degree of flexibility the planner has in generating new subgoals and in shifting the focus in the goal hierarchy. This dimension subsumes the directness and linearity biases. The most restricted point along this dimension allows no generation of new subgoals for precondition violations (Figure 3.9(a)), yielding a single-level goal hierarchy. This implements a directness bias by ensuring that each of the operators in a plan directly achieves an initial goal conjunct, rather than an unmet precondition of another operator.

The second point along the flexibility dimension allows generation of new subgoals, but only a single local set of conjuncts is attended to at any point in time (Figure 3.9(b)). This local focus of attention has two main consequences for the planner.
First, it reduces the branching factor of the planner's search — with respect to the nonlinear planner — by restricting the set of operators that the planner can consider at any point in time to just those that are able to achieve the local conjuncts. Second, with the assumption that an operator achieves only one goal conjunct and that the placement of operators in the plan is restricted within the context of the local conjuncts from which it arose, it enforces linearity on the resulting plans (thus implementing linear planning) by ensuring that operators for different goal conjuncts cannot be interleaved in the output plans.

The third point along the flexibility dimension allows the global use of subgoals; that is, new goal conjuncts are generated for unmet preconditions, and operators are simultaneously considered for all unsatisfied conjuncts (Figure 3.9(c)). This is the least restricted version, and implements nonlinear planning by allowing operators for different goal conjuncts to be interleaved, as in NOLIMIT [Veloso, 1989].

Figure 3.10: Goal-protection dimension: (a) goal protection, (b) no goal protection (shown as search trees).

The goal-protection (GP) dimension is shown in Figure 3.10. The two points implemented along this dimension correspond to goal protection (Figure 3.10(a)) — that is, every achieved top-level goal conjunct is protected between the time it is achieved and the time it is no longer needed — and to no goal protection (Figure 3.10(b)). The main consequence of using goal protection is that it shrinks the search space by cutting off sequences of operators which violate goal protection.

Figure 3.11 characterizes the 3x2 set of planning methods derived from these bias dimensions. Each of the cells in Figure 3.11 shows a label representing the planner for that cell along with a problem that is just hard enough to require that planner; that is, the problem can be solved optimally by the planner represented by that cell, but not by either the planner to its left or the planner above it. The bottom-left cell represents an extended blocks-world problem where a block that is second from the top of a tower can be moved [Etzioni, 1990a]. The most restricted planner (M1) — a direct goal-protection planner — is in the top-left cell of the figure. While quite restrictive, it is sufficient to solve the block-stacking problem shown in that cell of the figure. The least restricted planner (M6) — a nonlinear planner without goal protection — is in the bottom-right cell of the figure. It is the only planner in the figure capable of generating an optimal solution to the blocks-world problem shown in that cell. Between these two extremes, moving up or to the left yields more bias, while moving down or to the right yields less bias.

Note that in the blocks-world domain, both M5 and M6 are complete planners in that they can potentially solve every problem, though M5 may not be able to generate an optimal solution for some problems. However, in domains with irreversible operators, as shown in Figure 1.2, M6 is the only complete planner.
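For later reference, the 3x2 space of Figure 3.11 can also be written down as a simple table from bias settings to method labels. The dictionary encoding below is just a presentational convenience, not a structure from the implementation.

    # The 3x2 method space of Figure 3.11, indexed by
    # (goal flexibility, goal protection).
    METHODS = {
        ("no-subgoal", "GP"):    "M1",   # directness + protection
        ("local",      "GP"):    "M2",   # linearity + protection
        ("global",     "GP"):    "M3",   # protection only
        ("no-subgoal", "no-GP"): "M4",   # directness only
        ("local",      "no-GP"): "M5",   # linearity only
        ("global",     "no-GP"): "M6",   # no bias
    }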
Figure 3.11: The planning methods generated by the bias dimensions (rows: GP, no GP; columns: no subgoal (directness), local (linear), global (nonlinear); the cells contain standard blocks-world problems, except the bottom-left cell, which contains an extended blocks-world problem).

Figure 3.12 compares the traces of these methods for Sussman's anomaly. They all start with a combination of the initial state, the entire conjunctive goal — (and (on B C) (on A B)) — and the initial set of candidate operators — (MOVE B C) and (MOVE A B) — which are generated by means-ends analysis. If there is no information about which operator to select, a tie impasse occurs.3 In this tie impasse, a look-ahead search begins by selecting one of the alternatives to evaluate — here it is (MOVE A B).

3 For simplicity of presentation, these traces only show tie impasses.

Figure 3.12: Planning in Soar: (a) the initial state, goals, and operators; (b) directness & protection (M1); (c) linear & protection (M2); (d) nonlinear & protection (M3); (e) linear & no protection (M5).

If the directness bias is used — as in Figure 3.12(b) — the evaluation of (MOVE A B) is terminated immediately with failure as the evaluation value, and the other operator (MOVE B C) is selected. If the directness bias is not used, a new set of goal conjuncts is generated from the operator's unmet preconditions (Figure 3.12(c-e)). Linear planning focuses on the new conjunct — (clear A) as in Figure 3.12(c) and (e) — until it is achieved, and then returns to the original conjunct that led to the impasse — here, (on A B). Sibling conjuncts — here, (on B C) — are considered only after the original conjunct is achieved. In this problem, this eventually leads to failure if a protection bias is used (Figure 3.12(c)), or generates a non-optimal plan if a protection bias is not used (Figure 3.12(e)). Nonlinear planning instead shifts focus to the entire set of goal conjuncts (except the one that led to the impasse) — (and (on B C) (clear A)) as in Figure 3.12(d). This eventually can yield an optimal plan for this problem regardless of the use of a protection bias.

3.3 Learning in Single-Method Planners

For each of the single-method planners, chunking is performed over the planner's projection (look-ahead) process: the elements to be explained are the preferences generated during projection, and the explanations are the traces of the projections that led to the preferences. Both positive rules and negative rules can be learned from projections. Figures 3.13 and 3.15 provide a simple example of this.

Figure 3.13 shows a path projected by the nonlinear planner for a simple four-block-unstacking problem. This projection proceeds through multiple tie impasses until the problem is successfully solved. In this example, (MOVE A Table) is evaluated in the first operator-selection subgoal, and (MOVE B Table) is evaluated in the second operator-selection subgoal.

Figure 3.13: Four block unstacking with nonlinear planning (the projected path builds Chunk-2 and Chunk-4 on the way to goal achievement).

As shown in Figure 3.14, this results in a pair of positive control rules, one for each correct decision on the solution path.
Evaluating an operator which is not directly applicable in the current state — here (MOVE B Table) or (MOVE C Table) in the first operator-selection subgoal — also leads to success in nonlinear planning, though the learned rules are more complex.

Figure 3.15 shows a path projected with a directness bias for the same block-unstacking problem. In contrast to the previous case, the projection is terminated with failure as soon as the non-applicable operator (MOVE B Table) is selected. As shown in Figure 3.16, this yields a negative control rule for the incorrect decision on the solution path.

    Chunk-2: If the problem-space is blocks-world
             ∧ There exists an active goal (on <y> Table)
             ∧ There exists an active goal (on <z> Table)
             ∧ (on <y> Table) and (on <z> Table) are not achieved
             ∧ <y> is on <z>
             ∧ <y> is clear
             ∧ The proposed operator is (MOVE <y> Table)
             ⇒ The operator is best

    (a)

    Chunk-4: If the problem-space is blocks-world
             ∧ There exists an active goal (on <x> Table)
             ∧ There exists an active goal (on <y> Table)
             ∧ There exists an active goal (on <z> Table)
             ∧ (on <x> Table), (on <y> Table), and (on <z> Table) are not achieved
             ∧ <x> is on <y>
             ∧ <y> is on <z>
             ∧ <x> is clear
             ∧ The proposed operator is (MOVE <x> Table)
             ⇒ The operator is best

    (b)

Figure 3.14: Learned rules for block unstacking with nonlinear planning.

Note that if the planner's bias is reflected in an altered planning method, which in turn yields an altered projector, then the planner's bias can indirectly induce a bias in the resulting learning process. For example, the rules in Figure 3.14 are relatively specialized, because each must encapsulate the entire explanation for why a particular operator will eventually lead to success. In larger problems these explanations get even larger, and the rules end up being even more specialized.

On the other hand, the explanation for the rule in Figure 3.16 is quite short — based as it is on the explicit assumption that directness can hold and on the failure of the first selected operator to be applicable. As it turns out, this single rule is general enough to handle the entire problem, by removing from consideration all operators that attempt to move unclear blocks onto the table. The bias in this case has thus yielded faster planning and learning — because of shorter projections and explanations — and has resulted in the acquisition of fewer, more general rules.

Figure 3.15: Four block unstacking with directness (the projection terminates with a directness failure, building Chunk-6).

    Chunk-6: If the problem-space is blocks-world
             ∧ Directness is assumed to hold
             ∧ There exists an active goal (on <y> Table)
             ∧ (on <y> Table) is not achieved
             ∧ <y> is not clear
             ∧ The proposed operator is (MOVE <y> Table)
             ⇒ The operator is worst

Figure 3.16: A learned rule for block unstacking with directness.

Implicit in this example is one approach to producing generalization to N [Bostrom, 1990, Cohen, 1988, Shavlik, 1989, Subramanian and Feldman, 1990], where a plan learned for a problem of a particular size can transfer to solve problems with the same structure but of arbitrary size [Rosenbloom et al., 1993]. Without directness, the control rules are specific to particular numbers of blocks, and thus can only be used to directly solve terminal subregions of larger problems.
However, with directness, a single rule is learned that removes from consideration at each decision all operators that move unclear blocks to the table, no matter how many unclear blocks there are.

This idea can be applied to other problems and biases as well. Figure 3.17, for example, shows a path projected with protection for a four-block-stacking problem. As with the directness bias in block unstacking, a protection bias leads here to learning a single negative rule (Figure 3.18) that can be applied to stacking problems of arbitrary size.

A third type of bias that can also induce generalization to N is complete protection. Complete protection is a variant on goal protection that provides a very strong bias by not only protecting established goals, but also protecting established operator sequences. That is, it disallows any backtracking on operator selection, thus letting projection be terminated with success whenever an operator is selected, rather than waiting until the entire problem has been solved. As with the directness example, projection is terminated here after the first operator is selected (Figure 3.19(a)). However, in this case it is terminated with success as soon as the top block is moved to the table. The explanation for this success depends only on the explicit assumption of complete protection and on the fact that the operator was successfully applied, so a relatively general, positive control rule is learned (Figure 3.20). Although this is a positive rule, it also turns out to produce generalization to N, but now by always specifying that the one clear block that is not already on the table — if it were already on the table, there would be no active goal conjunct for it — should be moved to the table. The resulting rule can transfer to any number of iterations, as shown in Figure 3.19(b).

Figure 3.17: Four block stacking with protection (the projection terminates with a protection failure, building Chunk-8).

    Chunk-8: If the problem-space is blocks-world
             ∧ Goal protection is assumed to hold
             ∧ There exists an active goal (on <x> <y>)
             ∧ There exists an active goal (on <y> <z>)
             ∧ (on <x> <y>) and (on <y> <z>) are not achieved
             ∧ <x> and <y> are blocks
             ∧ <x> and <y> are clear
             ∧ The proposed operator is (MOVE <x> <y>)
             ⇒ The operator is worst

Figure 3.18: A learned rule for four block stacking with protection.

Figure 3.19: Three and five block unstacking with complete protection: (a) a projected path (building Chunk-10), (b) transfer of the learned rule to a different number.

    Chunk-10: If the problem-space is blocks-world
              ∧ Complete protection holds
              ∧ There exists an active goal (on <x> Table)
              ∧ (on <x> Table) is not achieved
              ∧ <x> is clear
              ∧ The proposed operator is (MOVE <x> Table)
              ⇒ The operator is best

Figure 3.20: A learned rule for four block unstacking with complete protection.

The key to producing generalization to N with these biases is that they enable learning from non-iterative paths — in this way it is similar to Etzioni's [1990a] work on restricting EBL to learn from only non-recursive paths. In the directness and protection cases, the success paths are iterative, but (negative) rules can instead be learned from non-iterative failure paths. In the complete-protection case, learning occurs from a fragment of the success path that corresponds to just a single cycle of iteration. In both cases, the resulting rules can transfer to any number of iterations.
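The size-independence of these learned rules is easy to see when a rule such as Chunk-6 is read as a test. The following restatement (with tuple-encoded goals and operators assumed for the illustration) mentions no block count, which is exactly why it transfers across problem sizes.

    def chunk6_rejects(operator, active_goals, achieved, clear_blocks):
        # A procedural reading of Chunk-6 (Figure 3.16): under directness,
        # an operator that moves an unclear block to the table is worst.
        kind, block, destination = operator
        return (kind == "MOVE" and destination == "Table"
                and ("on", block, "Table") in active_goals
                and ("on", block, "Table") not in achieved
                and block not in clear_blocks)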
In the complete-protection case, I i learning occurs from a fragment of the success path th at corresponds to just a sin- ‘ gle cycle of iteration. In both cases, the resulting rules can transfer to any num ber I of iterations. I I i | 3 .4 E x p e r im e n ta l R e su lts j Experim ental results from the six planners in two planning domains — the blocks-1 ; world domain and the machine-shop scheduling domain — are shown in Tables 3.1 -1 3.3. The data comes from running each planner on the same set of 100 problems for | ' each domain. For each problem in the blocks-world domain, the num ber of blocks 1 was randomly selected between three and four. Given the num ber of blocks, an I initial state was randomly generated among the possible configurations of the blocks I 52 j and the table (3 configurations for 3 blocks, and 5 configurations for 4 blocks). The generated initial state was represented as a set of ( o n Xj yi)-type predicates. Likewise a set of ( o n xj y j) - type goal conjuncts was random ly generated that num bered between two and the num ber of blocks in the initial state. For each goal conjunct, Xj was selected randomly from the initial set of blocks, and then yj was selected randomly from among the table and the blocks which have not yet been selected as y* (fc < j). The num ber of possible combinations of goal conjuncts for n-block problems is 0 (nn), because for each of the n blocks, there are n possible locations. A task in the machine-shop scheduling domain is to determ ine a sequence of machining operations to produce the desired objects so as to m eet the given re quirements [Minton, 1988]. 4 The shop contains several machines, including ROLL, LA TH E, PUNCH, D R I L L -P R E S S , P O L IS H , G R IN D E R , S P R A Y -P A IN T , and IM M E R SIO N - - P A I N T . Each object has five attributes — s h a p e , h a s - h o l e , s u r f a c e - c o n d i t i o n , p a i n t e d , and t e m p e r a t u r e . Each attribute can have one of two to four types of values. For each problem, the initial state was generated by assigning a randomly j generated type to each attribute for an object (except th at the initial tem pera-' ture is always c o l d ) . The number of goal conjuncts for each problem was fixed as five. The goal conjuncts for each problem were generated random ly as with the initial-state generation. Learning was turned on for each problem, but only within-trial transfer was ^ allowed; that is, rules learned during one problem were not used for other problems.; Planning tim e is mainly measured in term s of decisions, the basic behavioral cycle j in Soar. This measure is not quite identical to the more traditional measure of! i num ber of planning operators executed, but should still correlate with it relatively closely. J 4The version of the machine-shop domain used in this research is almost identical to the original: PRODIGY version presented in [Minton, 1988]. The only difference between the two versions is that the time augmentation for each generated operation in the original version is not specified in our version, because our main focus here is on the sequence of operations rather than the tim e1 when to execute the operations. No subgoal Local Global (Directness) (Linear) (Nonlinear) GP No GP GP No GP Table 3.1: Number of problems solved. Mi m 2 m 3 68 (A i) 95 (A2) 96 (A3) m 4 M5 M 6 68 (A4) 100 (As) 100 (Ae) (a) Blocks world domain. 
             No subgoal     Local          Global
             (Directness)   (Linear)       (Nonlinear)
    GP       M1: 70 (A1)    M2: 70 (A2)    M3: 70 (A3)
    No GP    M4: 100 (A4)   M5: 100 (A5)   M6: 100 (A6)

    (b) Machine-shop scheduling domain.

Table 3.1: Number of problems solved.

    (a) Blocks-world domain.

    GP             M1             M2             M3
    A1 (A4)        8.63 (0.17)    15.06 (0.41)   16.24 (0.43)
    A2             -              22.67 (0.92)   23.66 (0.73)
    A3             -              -              23.60 (0.73)
    A5 (A6)        -              -              -

    No GP          M4             M5             M6
    A1 (A4)        12.34 (0.29)   22.21 (0.67)   33.40 (2.19)
    A2             -              29.41 (1.06)   47.12 (3.66)
    A3             -              29.48 (1.06)   48.06 (3.68)
    A5 (A6)        -              29.22 (1.04)   47.93 (3.62)

    (b) Machine-shop scheduling domain.

    GP             M1             M2             M3
    A1 (A2, A3)    20.73 (0.49)   20.73 (0.49)   20.73 (0.49)
    A4 (A5, A6)    -              -              -

    No GP          M4             M5             M6
    A1 (A2, A3)    31.47 (0.85)   31.47 (0.85)   31.47 (0.85)
    A4 (A5, A6)    33.97 (0.92)   33.97 (0.92)   33.97 (0.92)

Table 3.2: Average number of decisions (and CPU time (sec.)) per problem.

    (a) Blocks-world domain.

    GP             M1      M2      M3
    A1 (A4)        1.82    1.85    2.00
    A2             -       2.35    2.54
    A3             -       -       2.54
    A5 (A6)        -       -       -

    No GP          M4      M5      M6
    A1 (A4)        1.82    3.00    2.90
    A2             -       3.78    3.88
    A3             -       3.83    4.07
    A5 (A6)        -       3.82    4.14

    (b) Machine-shop scheduling domain.

    GP             M1      M2      M3
    A1 (A2, A3)    2.43    2.43    2.43
    A4 (A5, A6)    -       -       -

    No GP          M4      M5      M6
    A1 (A2, A3)    4.13    4.13    4.13
    A4 (A5, A6)    4.47    4.47    4.47

Table 3.3: Average plan length per problem.

Table 3.1 shows the number of problems solved by each cell's planner, as defined in Figure 3.11, for these two domains. The label Ai denotes the problem set that the method Mi implicitly defines. With a sufficient time limit, every problem solvable in principle by a planner was actually solved. Not surprisingly, the data show a monotonic relationship between planner bias and scope, from a low of 68 problems in the blocks-world domain and 70 problems in the machine-shop scheduling domain for the most restricted planner to a high of 100 problems in both domains for the least restricted planner.

Tables 3.2 and 3.3 show the average number of decisions, average CPU time, and average plan lengths — which should positively correlate with execution time — for each distinct problem set defined in Table 3.1. In the standard blocks-world domain, four distinct problem sets are defined. This is because A4 is the same as A1, since if a problem is not solvable with protection, it also is not solvable with directness; and A5 is the same as A6, since both M5 and M6 are complete in this domain, though M5 may not be able to generate an optimal solution. These four problem sets are associated with the four rows within each cell. In the machine-shop scheduling domain, no precondition subgoals are required because there is no operator which achieves any of the unmet preconditions. Thus both directness and linearity are irrelevant. However, there are strong interactions among the operators, so protection violations are still relevant. In consequence, two distinct problem sets are defined, A1 and A4.

The timing results are shown in Table 3.2. The two columns within each cell show the average number of decisions and the average CPU time, respectively, which are required to generate plans for the problems. The table shows that planning effort is also a monotonically decreasing function of the amount of bias along these dimensions (only for protection in the machine-shop scheduling domain).
For example, for problem set A1 in the blocks-world domain, effort ranged from a low of 8.63 decisions for the most biased method (that is, the direct goal-protection method) to a high of 33.40 decisions for the least biased method (that is, nonlinear planning without goal protection). This trade-off between efficiency and completeness implies that selecting an appropriate amount of bias for a given problem is critical for finding a solution quickly. Table 3.3 exhibits a similar monotonic relationship between plan length and the amount of bias used.

3.5 Summary

Six single methods are defined along two bias dimensions: goal-flexibility and goal-protection. These methods are implemented in Soar, in which generated plans are represented as sets of control rules that jointly specify which operators should be executed at each point in time. The six implemented methods are compared empirically in terms of planner completeness, planning time, and plan length. The experimental results show a trade-off between completeness and efficiency. This implies that the planning system would be best served if it could always opt for the most restricted method adequate for its current situation.

Chapter 4

Multi-Method Planners

One of the main problems with the planners examined in the previous chapter is that each is either incomplete or performs a significant amount of excess work for some of the problems (both in planning and execution). An alternative approach is to build a multi-method planner which utilizes a coordinated set of planning methods, where each individual method has different scope and performance. The basic idea underlying this thesis is to select and coordinate a set of individual methods based on the empirical performance of those methods for a training set of problems.

Within the empirical multi-method planning framework, the main goal of this research is to create a set of multi-method planners which are more efficient and applicable than single-method planners. The previous chapter introduced a methodology to create individual methods which have different performance and scope based on the amount of bias used. Given a set of created methods, the key issue is then how to coordinate the methods in an efficient manner so that the multi-method planner can have high performance. Method coordination refers to (1) the selection of appropriate methods as situations arise, and (2) the granularity of method switching as the situational demands shift.

For method selection, individual methods need to be organized so that a higher-level control structure can determine which method to use first, and which method to use next when the current method fails. Two straightforward ways of organizing individual methods are a sequential and a time-shared manner. A sequential multi-method planner consists of a sequence of single-method planners. A time-shared multi-method planner consists of a set of single-method planners in which each method is active in turn for a given time slice [Barley, 1991]. In this thesis, a special type of sequential multi-method planning, called monotonic multi-method planning, is focused on. In a monotonic multi-method planner, the single methods are sequenced according to increasing coverage and decreasing efficiency.
With the assumptions that earlier methods terminate, and that methods which are efficient when they succeed do not waste too much time when they fail, monotonic multi-method planners can generate plans efficiently by using more restricted methods earlier in the sequence [Lee and Rosenbloom, 1992].

One way to construct a monotonic multi-method planner is to use the biases which themselves increase efficiency. Individual methods are sequenced so that the set of biases used in a method is a subset of the biases used in earlier methods, and the later methods have more coverage than the earlier methods. This means that planning starts by trying highly efficient methods, and then successively relaxing biases until a sufficient method is found. This type of planning is called bias-relaxation multi-method planning. A bias-relaxation multi-method planner is not necessarily a monotonic multi-method planner if there are interactions among biases. However, one can generate monotonic multi-method planners via bias-relaxation by just testing whether monotonicity holds for the created bias-relaxation multi-method planners. In bias-relaxation multi-method planning, each bias is evaluated independently by comparing a method which uses that bias only and a method which uses no bias. Thus, bias-relaxation multi-method planning has more restricted scope in creating and comparing individual methods than monotonic multi-method planning.1

1 Strongly-monotonic multi-method planning described in [Lee and Rosenbloom, 1993] also uses a bias-relaxation scheme. However, it evaluates each bias along with other biases, thus examining more methods than bias-relaxation multi-method planning.

The second issue of method coordination is the granularity at which individual methods are switched. This issue is important in terms of a planner's performance, because the performance of a multi-method planner can change according to the granularity of shifting control from method to method. Depending on the granularity of method switching, multi-method planners can be further specialized: coarse-grained multi-method planners, where methods are switched on a problem-by-problem basis; and fine-grained multi-method planners, where methods are switched on a goal-by-goal basis [Lee and Rosenbloom, 1993].

This chapter investigates these two issues of coordinating individual methods in multi-method planning in depth. To investigate the method-organization issue, a scheme to construct monotonic multi-method planners from a set of single-method planners is provided, and then a formal model is presented to compare the performance of constructed monotonic multi-method planners with time-shared multi-method planners and single-method planners. Also, a scheme to construct a set of bias-relaxation multi-method planners is provided, and the constructed bias-relaxation multi-method planners are compared experimentally with single-method planners. To investigate the granularity of method switching, the performance of coarse-grained bias-relaxation multi-method planners and fine-grained bias-relaxation multi-method planners (called simply coarse-grained multi-method planners and fine-grained multi-method planners, respectively, throughout this thesis) are evaluated experimentally, and compared with the performance of single-method planners.
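At the control level, a coarse-grained sequential multi-method planner reduces to a short loop over its component methods. The sketch below is one way to phrase that loop; it assumes, as discussed above, that every method terminates, and it treats each single-method planner as a function from a problem to a plan or None.

    def sequential_multi_method(problem, methods):
        # Methods are ordered from most to least biased; in a monotonic
        # multi-method planner the later methods have more coverage, so the
        # first sufficient method produces the answer.
        for method in methods:
            plan = method(problem)
            if plan is not None:
                return plan
        return None    # the problem lies outside even the last method's scope

The cost of each failed call in this loop is the wasted cost f introduced below, which is why the loop pays off only when early methods fail cheaply.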
Partial-order planning is one of the most popular approaches in the planning literature. At the end of this chapter, multi-method planning is compared with partial-order planning in terms of planning performance.

4.1 Monotonic Multi-Method Planners

In a monotonic multi-method planner, individual methods are sequenced so that the earlier methods are more efficient and have less coverage than the later methods. The idea is that if the biases used in efficient methods can prune the search space, the problems solvable by efficient methods should be solved more quickly, while problems requiring less bias should not waste too much extra time trying out the insufficient early methods.

This approach is inspired by iterative deepening [Korf, 1985]. In iterative deepening, a sequence of depth-first searches is performed, each to a greater depth than the previous one. If a solution is found at a shallow depth, the cost of searching to a greater depth is saved. If a solution is not found at a particular depth, a deeper search is performed. The cost of doing the shallower searches is then wasted, but since the deeper search costs at least β times the cost of the shallower search (where β is the branching factor of the search tree), this cost can be relatively quite small. Thus, if the proportion of problems solvable at shallow depths is large enough, and the ratio of costs for successive levels is large enough, there should be a net gain.

A monotonic multi-method planner can be defined formally by using a restricted dominance relation [Lee and Rosenbloom, 1992].

4.1.1 Restricted Dominance Relation

Let Mi be a single-method planner. Let A be a sample set of problems, and let Ai ⊆ A be the subset of A which is solvable in principle by Mi. The functions s(Mi, AS) and l(Mi, AS) represent, respectively, the average cost that Mi requires to succeed and the average length of plans generated by Mi, for the problems in AS ⊆ Ai. Similarly, f(Mi, AF) represents the average wasted cost for Mi to fail for the problems in AF ⊆ A − Ai.

Given a set of methods {Mi} (i = 1, ..., n), a restricted dominance relation Mx ≺ My is defined between two different single-method planners, Mx and My, if the following conditions hold:

(1) Ax ⊆ Ay
(2) s(Mx, Ai) ≤ s(My, Ai), for every Ai ⊆ Ax
(3) l(Mx, Ai) ≤ l(My, Ai), for every Ai ⊆ Ax.

A sequential multi-method planner which consists of n different single-method planners is denoted as Mk1→Mk2→...→Mkn. A sequential multi-method planner Mk1→Mk2→...→Mkn is called monotonic if Mki ≺ Mki+1 holds for each i = 1, ..., n−1.

                             Decisions                Plan length
Planner                       A1     A2     A3        A1    A2    A3
M1 (directness, protection)  12.50   -      -         1.56  -     -
M2 (linearity, protection)   13.00  18.90   -         1.56  2.32  -
M3 (protection)              13.21  26.91   -         1.62  2.49  -
M4 (directness)              13.54   -      -         1.56  -     -
M5 (linearity)               14.81  24.47  24.84      2.10  3.22  3.34
M6                           16.23  40.85  40.96      2.02  3.17  3.37

(a) Blocks-world domain.

                             Decisions        Plan length
Planner                       A1     A4       A1    A4
M1 (directness, protection)  22.14   -        2.68  -
M2 (linearity, protection)   22.14   -        2.68  -
M3 (protection)              22.14   -        2.68  -
M4 (directness)              35.33  37.42     4.36  4.68
M5 (linearity)               34.27  36.10     4.45  4.82
M6                           34.27  36.10     4.45  4.82

(b) Machine-shop scheduling domain.

Table 4.1: The performance of the six single-method planners for the problem sets defined by the scopes of the planners.
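A small sketch of how the relation can be tested from measured data; the cost tables here are hypothetical stand-ins for measurements like those in Table 4.1, with problem sets represented as frozensets so they can index the tables.

    def dominates(mx, my, scope, s, l, problem_sets):
        """Test Mx -< My: condition (1) on coverage, and conditions (2)
        and (3) on every measured problem set inside Mx's scope."""
        if not scope[mx] <= scope[my]:           # (1) Ax is a subset of Ay
            return False
        for a in problem_sets:                   # each a is a frozenset of problems
            if a <= scope[mx]:
                if s[mx][a] > s[my][a]:          # (2) Mx succeeds no more slowly
                    return False
                if l[mx][a] > l[my][a]:          # (3) Mx's plans are no longer
                    return False
        return True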
The straightforward way to build monotonic multi-method planners is to run each of the individual methods on a set of training problems, and then from the resulting data to generate method sequences for which monotonicity holds. Table 4.1 shows the average number of decisions, s(Mki, Akj), and the average plan lengths, l(Mki, Akj), over a training problem set for the six single-method planners defined in Chapter 3, for the blocks-world domain and the machine-shop scheduling domain.

For each domain, the problem set consists of 30 problems which are randomly generated as in Chapter 3. In the blocks-world domain, A2 and A3 are different sets in principle, because problems such as Sussman's anomaly cannot be solved by a linear planner with protection (M2) but can be solved by a nonlinear planner with protection (M3). However, among the 30 training problems, these "anomaly" problems did not occur, yielding A2 = A3 for this set of problems.

Figure 4.1 exhibits restricted dominance graphs based on the results in Table 4.1. Each node in a graph represents a single-method planner, and an arc from Mx to My implies that Mx ≺ My holds. Thus every path in the graph corresponds to a monotonic multi-method planner. A monotonic multi-method planner Mk1→Mk2→...→Mkn is complete if Mkn is complete. In the blocks-world domain, seven complete 2-method planners and four complete 3-method planners can be constructed, whereas in the machine-shop scheduling domain, nine complete 2-method planners can be constructed.

[Figure 4.1: Restricted dominance graphs for the single-method planners. (a) Blocks-world domain. (b) Machine-shop scheduling domain.]

The next section compares a monotonic multi-method planner with its corresponding time-shared multi-method planner and single-method planner in terms of planning time and plan length.

4.1.2 Performance Analysis

Planning time: In this section, it is assumed that the individual methods in a sequential multi-method planner are switched on a problem-by-problem basis. For a given problem a ∈ A, the planning time of a sequential multi-method planner Mk1→Mk2→...→Mkn, where Mki is the first method which solves a, can be represented as

    s(M_{k_1} \to M_{k_2} \to \cdots \to M_{k_n}, \{a\}) = s(M_{k_i}, \{a\}) + \sum_{j=1}^{i-1} f(M_{k_j}, \{a\}),    (4.1)

where s(Mki, {a}) is the cost for Mki to solve a, and the sum of the f(Mkj, {a}) is the total cost for the inappropriate earlier methods to fail on a.

The corresponding time-shared multi-method planner consists of the same set of single methods, denoted as Mk1‖Mk2‖...‖Mkn. Let Mki be the first method that solves a in a horse-race manner. Suppose that the switching in a time-shared multi-method planner is based on a unit time slice. Then, the expected planning time of Mk1‖Mk2‖...‖Mkn for a problem a ∈ A can be represented as

    s(M_{k_1} \| M_{k_2} \| \cdots \| M_{k_n}, \{a\}) = s(M_{k_i}, \{a\}) + \sum_{j \neq i} \min[f(M_{k_j}, \{a\}), s(M_{k_i}, \{a\})],    (4.2)

where the first term is the cost for the method that actually solves a, and the second term is the sum of the costs for the rest of the methods either to fail on a (f(Mkj, {a})) or to keep trying to solve a until Mki finishes (s(Mki, {a})).

The average planning time for a problem set A can be represented by using a probability function. Let Pki (= |Aki|/|A|) be the probability that an arbitrary problem in A is solvable by Mki. Let Mk0 be a null planner which cannot solve any problem; that is, Ak0 = ∅ and Pk0 = 0. Let A'ki = Aki − Aki−1,
for 1 ≤ i ≤ n, be the set of problems which are solvable by Mki but not by Mki−1, and let P'ki = |A'ki|/|A|, for 1 ≤ i ≤ n. Let s(Mk0, AS) = l(Mk0, AS) = 0, for any AS, and f(Mk0, AF) = 0, for any AF.

For a problem set A, the planning time of a complete sequential multi-method planner Mk1→Mk2→...→Mkn can be rewritten as the sum of the average planning times for the disjoint problem sets A'ki (1 ≤ i ≤ n):

    s(M_{k_1} \to \cdots \to M_{k_n}, A) = \sum_{i=1}^{n} P'_{k_i} \, s(M_{k_1} \to \cdots \to M_{k_n}, A'_{k_i}), where
    s(M_{k_1} \to \cdots \to M_{k_n}, A'_{k_i}) = s(M_{k_i}, A'_{k_i}) + \sum_{j=1}^{i-1} f(M_{k_j}, A'_{k_i}).    (4.3)

The planning time of the corresponding time-shared multi-method planner can be rewritten as

    s(M_{k_1} \| \cdots \| M_{k_n}, A) = \sum_{i=1}^{n} P'_{k_i} \, s(M_{k_1} \| \cdots \| M_{k_n}, A'_{k_i}), where
    s(M_{k_1} \| \cdots \| M_{k_n}, A'_{k_i}) = s(M_{k_i}, A'_{k_i}) + \sum_{j \neq i} \frac{\sum_{a \in A'_{k_i}} \min[f(M_{k_j}, \{a\}), s(M_{k_i}, \{a\})]}{|A'_{k_i}|}.    (4.4)

The relative performance between a complete sequential multi-method planner and the corresponding time-shared multi-method planner depends on the ordering of the methods and on the costs f(Mkj, {a}) and s(Mki, {a}). If Mk1→Mk2→...→Mkn is monotonic, the later methods (Mkj, j = i+1, ..., n) would not fail to solve a, from the definition. Thus,

    s(M_{k_1} \| \cdots \| M_{k_n}, A'_{k_i}) = (n-i+1) \, s(M_{k_i}, A'_{k_i}) + \sum_{j=1}^{i-1} \frac{\sum_{a \in A'_{k_i}} \min[f(M_{k_j}, \{a\}), s(M_{k_i}, \{a\})]}{|A'_{k_i}|}.    (4.5)

In particular, if f(Mkj, {a}) ≤ s(Mki, {a}) (1 ≤ j ≤ i−1) for each a, we have

    s(M_{k_1} \| \cdots \| M_{k_n}, A'_{k_i}) = (n-i+1) \, s(M_{k_i}, A'_{k_i}) + \sum_{j=1}^{i-1} f(M_{k_j}, A'_{k_i}).    (4.6)

Thus, the performance difference between a monotonic multi-method planner and the corresponding time-shared multi-method planner is

    s(M_{k_1} \| \cdots \| M_{k_n}, A) - s(M_{k_1} \to \cdots \to M_{k_n}, A) = \sum_{i=1}^{n} P'_{k_i} \, (n-i) \, s(M_{k_i}, A'_{k_i}) \geq 0.    (4.7)

This implies that if the cost of failure for a restricted planner is always less than the cost of success for a more relaxed planner, then monotonic multi-method planners outperform the corresponding time-shared multi-method planners; otherwise, time-shared multi-method planners may perform better. This all depends on the relative search-space size for the restricted planner and the density and distribution of solutions in the search space. However, if the biases used in a restricted method are strong enough to cut off all the failure paths at shallow depths, the cost to determine whether a method fails may be less than the cost to determine whether a method succeeds. Moreover, if the rule learned from a failure path can transfer to other failure paths, the cost of failure can be even less.

The performance of monotonic multi-method planners and single-method planners is compared as follows. For each monotonic multi-method planner Mk1→Mk2→...→Mkn, there is a corresponding single-method planner Mkn which has the same coverage of solvable problems. If Mk1→Mk2→...→Mkn is complete, Mkn is also complete. We compare a complete monotonic multi-method planner with its corresponding single-method planner in terms of planning time. The performance of Mkn is

    s(M_{k_n}, A) = \sum_{i=1}^{n} P'_{k_i} \, s(M_{k_n}, A'_{k_i}).    (4.8)

To compare the performance of Mk1→Mk2→...→Mkn with Mkn, it is necessary to subtract (4.3) from (4.8), yielding

    s(M_{k_n}, A) - s(M_{k_1} \to \cdots \to M_{k_n}, A) = \sum_{i=1}^{n} P'_{k_i} \Big( s(M_{k_n}, A'_{k_i}) - s(M_{k_i}, A'_{k_i}) - \sum_{j=1}^{i-1} f(M_{k_j}, A'_{k_i}) \Big).    (4.9)

This means that if the performance gain from using a cheaper method (s(Mkn, A'ki)
− s(Mki, A'ki)) is greater than the time wasted on inappropriate methods (the sum of the f(Mkj, A'ki)) in a monotonic multi-method planner, then it is preferable to use that planner over the single-method planner; otherwise, the single-method planner is preferred (at least where planning time is concerned).

Plan length: The plan length l for a complete monotonic multi-method planner and for the corresponding time-shared multi-method planner is the same, and equal to

    l(M_{k_1} \to \cdots \to M_{k_n}, A) = l(M_{k_1} \| \cdots \| M_{k_n}, A) = \sum_{i=1}^{n} P'_{k_i} \, l(M_{k_i}, A'_{k_i}),    (4.10)

while the plan length for the corresponding single-method planner Mkn is

    l(M_{k_n}, A) = \sum_{i=1}^{n} P'_{k_i} \, l(M_{k_n}, A'_{k_i}).    (4.11)

Since Mk1→Mk2→...→Mkn is monotonic, l(Mki, A'ki) ≤ l(Mkn, A'ki). Therefore, the lengths of plans generated by a monotonic multi-method planner and the corresponding time-shared multi-method planner are always less than or equal to the length of plans generated by the corresponding single-method planner.

4.1.3 Learning in Multi-Method Planning

The analytical results in the previous section show that monotonicity can yield a performance gain in sequential multi-method planning by using cheaper methods earlier in the sequence. The performance of sequential multi-method planning can be further improved by ameliorating the effects of wasting effort on insufficient planners via learning, in particular of two sorts. The first sort of learning is within-planner learning that can transfer across planners (possibly for the same problem). If a projection is performed within one planner, and the results of the projection depend only on aspects of the planner that are shared by a second planner, then it should not be necessary to repeat that projection when the second planner is tried. For example, a rule learned from a plan violating goal protection in the direct goal-protection planner should be able to transfer to the nonlinear goal-protection planner, where it prevents the planner from reprojecting along paths that violate goal protection.

The second sort of learning is about which methods to use for which classes of problems. To the extent that this can be done, the effort wasted in trying inadequate methods can be avoided in the future. In our Soar-based implementation, bias selection is structured just as would be any other selection, so this sort of learning can happen automatically by chunking. From an experiment with such learning, Figure 4.2 shows a rule learned to avoid using the most restricted method (that is, direct goal-protection) under specific circumstances where there is only one active goal conjunct but (at least) two blocks must be moved to achieve it. This rule was learned during the first problem and could be used in three later problems to avoid even trying this method.

Though we have examined instances of learning about which methods to use for which classes of problems in the context of multi-method planning, no systematic study has yet been made of their effectiveness or of whether issues of overgeneralization and/or undergeneralization will prove troublesome, which they are likely to be. Another reason why this form of learning is not used is that the current multi-attribute encoding creates some expensive chunks for some of the problems.
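A minimal sketch of this second sort of learning, assuming a hypothetical signature function that abstracts a problem into the kind of features tested by the rule in Figure 4.2; as the text notes, an over-general signature could wrongly skip a sufficient method.

    failed = set()  # learned (method name, problem signature) pairs

    def plan_with_selection_learning(methods, problem, signature):
        sig = signature(problem)
        for method in methods:
            if (method.name, sig) in failed:
                continue                     # avoid a method known to fail here
            plan = method.plan(problem)
            if plan is not None:
                return plan
            failed.add((method.name, sig))   # learn to skip this method next time
        return None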
[Figure 4.2: Example of learning which planners to use for which classes of problems: (a) a learned rule to avoid the direct goal-protection planner; (b) a class of blocks-world problems in which this rule is applicable. The rule in (a) reads: if the problem-space is "select-method", one conjunct is unachieved, the goal wants a stack of at least two blocks, the upper block is not in position, the upper block is not clear, and an operator is proposed to use the "directness & protection" method, then the operator is worst.]

Future work should include rerunning the experiments summarized in Table 4.1 with this form of learning enabled.

4.2 Bias-Relaxation Multi-Method Planners

Section 4.1.1 showed an approach to creating monotonic multi-method planners by using a restricted dominance graph. This approach is quite straightforward in the sense that each pair of methods is directly compared. However, given a set of methods, it does not specify which methods should be created and which pairs of methods should be compared. Let k be the number of biases. Then, the number of single methods generated by every combination of these biases is O(2^k), and the number of comparisons for creating a restricted dominance graph for these methods is O(2^{2k}). Although it is tractable to generate all monotonic multi-method planners by using a restricted dominance graph with a small set of initial biases (as in the case of Section 4.1.1), it may not be tractable if the number of biases considered is increased. Therefore, a scheme is needed to restrict the scope of the methods to be generated and compared.

One way to remedy this problem is to compare the effectiveness of each bias in isolation, instead of comparing the performance of methods which are generated with respect to combinations of these biases. The approach presented in this section is based on bias-relaxation [Lee and Rosenbloom, 1993]. Bias-relaxation multi-method planners can be created as combinations of effective biases only, so that later methods embody subsets of the effective biases incorporated into earlier methods.² Method switching is implemented by relaxing some of these biases; that is, planning starts with a set of effective biases, and then successively relaxes one or more biases until a solution is found within the method.

²Positive bias, as defined in [Lee and Rosenbloom, 1993], is a different notion from effective bias here in that the effectiveness of a positive bias is evaluated along with other biases.

This can be formalized as follows. Let Bki be the set of biases used in Mki. A bias b is called effective in a problem set A and a method set {Mki} if, for a pair of methods Mkx and Mky in {Mki} such that Bkx = {b} and Bky = ∅,

(1) s(Mkx, Akx) ≤ s(Mky, Akx), and
(2) l(Mkx, Akx) ≤ l(Mky, Akx).

A sequential multi-method planner Mk1→Mk2→...→Mkn is called a bias-relaxation multi-method planner if

(1) Bki−1 ⊃ Bki, for 2 ≤ i ≤ n, and
(2) Bki−1 − Bki consists of effective biases only, for 2 ≤ i ≤ n.

Given a set of k biases, the time complexity of testing whether these biases are effective or not is O(k) (by factoring out the complexity of solving problems), which is exponentially smaller than O(2^{2k}).

The results in Table 4.1 imply that directness and protection are effective in the blocks-world domain, while linearity is not, since l(M5, A1) > l(M6, A1) and l(M5, A2) > l(M6, A2).
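A sketch of the effectiveness test, assuming a hypothetical table perf that holds the measured (average decisions, average plan length) of the single-bias method and of the bias-free method over the same problem set.

    def is_effective(bias, perf):
        """A bias b is effective if the method using b alone is no worse,
        on its own scope, than the method using no bias at all."""
        s_b, l_b = perf[bias]    # method with B = {b}, on its scope
        s_0, l_0 = perf['none']  # method with B = {}, on the same problems
        return s_b <= s_0 and l_b <= l_0

For example, with the blocks-world numbers on A1 from Table 4.1, perf['linearity'] would be (14.81, 2.10) and perf['none'] would be (16.23, 2.02); the plan-length condition fails (2.10 > 2.02), so linearity is not effective.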
If one uses linearity as an independent bias (so that one set of multi-method planners is generated using it and one set without it) and varies directness and protection within the individual multi-method planners, we get a set of ten different bias-relaxation multi-method planners (four three-method planners and six two-method planners), as shown in Table 4.2. In the machine-shop scheduling domain, only protection is effective. In consequence, only one bias-relaxation multi-method planner is generated: M1→M4.

Type        Multi-method planners
Linear      M1→M2→M5   M1→M4→M5   M1→M5   M2→M5   M4→M5
Nonlinear   M1→M3→M6   M1→M4→M6   M1→M6   M3→M6   M4→M6

Table 4.2: Ten bias-relaxation multi-method planners in the blocks-world domain.

Note that if there are no interactions among effective biases, a bias-relaxation multi-method planner is a special case of a monotonic multi-method planner. However, this is not necessarily true if there are interactions among them. For example, although directness is effective, this does not necessarily mean that the method that uses directness and protection is more efficient than the method that uses protection only.

In order to generate monotonic multi-method planners via bias-relaxation, one can just test whether monotonicity holds for the created bias-relaxation multi-method planners. The time complexity of this procedure is linear in the number of biases, because at least one bias is relaxed whenever a method is switched.

In the next section, the experimental results for all of the created bias-relaxation multi-method planners are presented.

                        Decisions                     Plan length
Planner             A1     A2     A3     A6       A1    A2    A3    A6
M5                 22.21  29.41  29.48  29.22     3.00  3.78  3.83  3.82
M6                 33.40  47.12  48.06  47.93     2.90  3.88  4.07  4.14
  Average (A6)                          38.58                       3.98
M1→M2→M5           13.26  24.69  25.07  26.13     1.82  2.48  2.54  2.58
M1→M3→M6           13.26  26.34  26.55  28.91     1.82  2.52  2.54  2.59
M1→M4→M5           13.26  26.16  26.41  26.79     1.82  2.85  2.92  2.94
M1→M4→M6           13.26  36.78  37.40  37.30     1.82  2.91  2.99  3.02
M1→M5              13.26  25.68  25.86  26.04     1.82  2.96  3.02  3.03
M1→M6              13.26  31.54  31.85  31.77     1.82  2.89  2.94  2.97
M2→M5              19.54  27.89  28.18  29.34     1.85  2.43  2.49  2.58
M3→M6              21.22  28.46  28.41  30.67     2.00  2.52  2.52  2.57
M4→M5              16.85  27.81  27.95  28.38     1.82  2.83  2.88  2.93
M4→M6              16.85  33.33  33.59  34.47     1.82  2.83  2.85  2.95
  Average (A6)                          29.98                       2.82

(a) Blocks-world domain.

                        Decisions        Plan length
Planner              A1     A4        A1    A4
M4, M5, M6          31.47  33.97      4.13  4.47
M1→M4, M2→M5, M3→M6 26.17  35.91      2.43  3.58

(b) Machine-shop scheduling domain.

Table 4.3: Single-method and bias-relaxation multi-method planning.

4.2.1 Experimental Results

We have implemented the ten bias-relaxation multi-method planners in Soar6. Each single-method planner in a bias-relaxation multi-method planner was implemented as a specialization of a general problem-space. Based on the sequence of single-method planners, a set of meta-level control rules was provided to coordinate which problem-space is tried next if the current problem-space does not generate a plan for the given problem. Only within-trial learning was turned on for each problem, as in the experiments with the single-method planners, but learned rules were also allowed to transfer from an earlier method to a later method (for the same problem). This is equivalent to the type of transfer allowed in the single-method planners, because the scope of transfer is limited to the current trial only.
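The comparisons that follow rely on paired-sample z-tests over the per-problem measurements; a minimal sketch of the statistic (the data values passed in are hypothetical):

    import math

    def paired_z(xs, ys):
        """z = mean of the paired differences over its standard error."""
        d = [x - y for x, y in zip(xs, ys)]
        n = len(d)
        mean = sum(d) / n
        var = sum((di - mean) ** 2 for di in d) / (n - 1)
        return mean / math.sqrt(var / n)

    # |z| > 1.96 and |z| > 2.58 correspond roughly to significance at the
    # 5% and 1% levels, respectively.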
Table 4.3 compares the ten bias-relaxation multi-method planners with the two complete single-method planners over the test set of 100 randomly generated problems used in Chapter 3 (this test set is different from the 30-problem training set used in developing the multi-method planners). Paired-sample z-tests are made for the average performance on A6 (because it is the only complete problem set in this domain) between bias-relaxation multi-method planners and single-method planners. The results reveal that bias-relaxation multi-method planners take significantly less planning time (z=2.27, p<.05) and generate significantly shorter plans than single-method planners (z=4.86, p<.01). In the machine-shop scheduling domain, paired-sample z-tests are made for the average performance on A4. The results show that bias-relaxation multi-method planners take slightly more planning time than single-method planners; however, no significance is found at the 5% level (z=1.00). In terms of plan length, bias-relaxation multi-method planners generate significantly shorter plans than single-method planners (z=3.15, p<.01) in this domain also.

Although it has been shown that bias-relaxation multi-method planners can outperform single-method planners (in the blocks-world domain), it does not necessarily mean that, for all situations, there exists a bias-relaxation multi-method planner which outperforms the most efficient single-method planner. In fact, the performance of these planners depends on the biases used in the bias-relaxation multi-method planners and on the problem set used in the experiments. For example, if the problems are so complex that most of them are solvable only by the least restricted method, the performance loss from trying inappropriate earlier methods in multi-method planners might be relatively considerable. On the other hand, if the problems are so trivial that it takes only a few decisions for the least restricted method to solve them, the slight performance gain from using more restricted methods in multi-method planners might be overridden by the complexity of the meta-level processing required to coordinate the sequence of primitive planners.

4.3 Fine-Grained Multi-Method Planners

The approach to multi-method planning described so far starts with a restricted method and switches to a less restricted method whenever the current method fails. This switch is always made on a problem-by-problem basis. However, this is not the only granularity at which methods could be switched. The family of multi-method planning systems can be viewed on a granularity spectrum. While in coarse-grained multi-method planners methods are switched for a whole problem when no solution can be found for the problem within the current method, in fine-grained multi-method planners (denoted as Mk1→k2→...→kn) methods can be switched at any point during a problem at which a new set of subgoals is formulated, and the switch only occurs for that set of subgoals (and not for the entire problem) [Lee and Rosenbloom, 1993]. At this finer level of granularity it is conceivable
that the planner could use a highly restricted and efficient method over much of a problem, but fall back on a nonlinear method without protection for those critical subregions where there are tricky interactions.

With this flexibility of method switching, fine-grained multi-method planning can potentially outperform both coarse-grained multi-method planning and single-method planning. Compared with coarse-grained multi-method planning, it can save the effort of backtracking when the current method cannot find a solution or when the current partial plan violates the biases used in the current method. Moreover, it can save the extra effort of using a less restricted method on later parts of the problem just because one early part requires it. Compared with single-method planning, a fine-grained multi-method planner can utilize biases which would cause incompleteness in a single-method planner (such as directness or protection in the blocks-world domain) while still remaining complete. The result is that a fine-grained multi-method planner can potentially be more efficient than a single-method planner that has the same coverage of solvable problems.
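A sketch of the fine-grained control loop, under the assumption (with hypothetical interfaces) that planning decomposes a problem into successive sets of subgoals and that each method can plan for one subgoal set from the current state:

    def fine_grained_plan(methods, subgoal_sets, state):
        """Run the bias-relaxation sequence independently for each set of
        subgoals, so one tricky subregion does not force a weaker method
        onto the whole problem."""
        plan = []
        for subgoals in subgoal_sets:
            for method in methods:                   # most restricted first
                subplan = method.plan(subgoals, state)
                if subplan is not None:
                    plan.extend(subplan)
                    state = method.result(state, subplan)
                    break
            else:
                return None       # no method could solve this subgoal set
        return plan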
4.3.1 Experimental Results

Table 4.4 compares the bias-relaxation fine-grained multi-method planners with the corresponding bias-relaxation coarse-grained multi-method planners and (complete) single-method planners over the same 100-problem test set as used in Table 4.3, in the blocks-world domain.

                        Decisions                     Plan length
Planner             A1     A2     A3     A6       A1    A2    A3    A6
M5                 22.21  29.41  29.48  29.22     3.00  3.78  3.83  3.82
M6                 33.40  47.12  48.06  47.93     2.90  3.88  4.07  4.14
  Average (A6)                          38.58                       3.98
M1→M2→M5           13.26  24.69  25.07  26.13     1.82  2.48  2.54  2.58
M1→M3→M6           13.26  26.34  26.55  28.91     1.82  2.52  2.54  2.59
M1→M4→M5           13.26  26.16  26.41  26.79     1.82  2.85  2.92  2.94
M1→M4→M6           13.26  36.78  37.40  37.30     1.82  2.91  2.99  3.02
M1→M5              13.26  25.68  25.86  26.04     1.82  2.96  3.02  3.03
M1→M6              13.26  31.54  31.85  31.77     1.82  2.89  2.94  2.97
M2→M5              19.54  27.89  28.18  29.34     1.85  2.43  2.49  2.58
M3→M6              21.22  28.46  28.41  30.67     2.00  2.52  2.52  2.57
M4→M5              16.85  27.81  27.95  28.38     1.82  2.83  2.88  2.93
M4→M6              16.85  33.33  33.59  34.47     1.82  2.83  2.85  2.95
  Average (A6)                          29.98                       2.82
M1→2→5              8.63  12.87  13.00  13.01     1.82  2.80  2.84  2.90
M1→3→6              8.63  13.38  13.43  13.56     1.82  2.53  2.53  2.59
M1→4→5              8.63  13.19  13.29  13.25     1.82  3.25  3.32  3.34
M1→4→6              8.63  13.48  13.73  13.63     1.82  2.87  2.96  2.97
M1→5                8.63  12.21  12.36  12.51     1.82  2.63  2.73  2.81
M1→6                8.63  13.22  13.27  13.23     1.82  2.68  2.69  2.73
M2→5               19.19  23.75  23.76  23.80     2.56  3.07  3.11  3.16
M3→6               16.62  23.45  23.56  24.22     2.03  2.56  2.57  2.71
M4→5               13.57  17.24  17.30  17.38     2.44  3.71  3.77  3.77
M4→6               14.10  19.28  19.58  19.83     2.41  3.33  3.43  3.46
  Average (A6)                          16.44                       3.04

Table 4.4: Single-method and coarse-grained multi-method vs. fine-grained multi-method planning in the blocks-world domain.

Planning type     Average decisions    z vs. single    z vs. coarse-grained
Single            38.58                -               -
Coarse-grained    29.98                2.27*           -
Fine-grained      16.44                5.37**          6.72**

(a) Decisions.

Planning type     Average plan length  z vs. single    z vs. coarse-grained
Single            3.98                 -               -
Coarse-grained    2.82                 4.86**          -
Fine-grained      3.04                 3.42**          1.77

(b) Plan length.

Table 4.5: Significance test results for the blocks-world domain (* p<.05; ** p<.01).

Paired-sample z-tests on this data, as shown in Table 4.5, reveal that fine-grained multi-method planners take significantly less planning time than both single-method planners (z=5.35, p<.01) and coarse-grained multi-method planners (z=6.72, p<.01). This likely stems from fine-grained multi-method planners preferring to search within the more efficient spaces defined by the biases (thus tending to outperform single-method planners), but being able to recover from bias failure without throwing away everything already done for a problem (thus tending to outperform coarse-grained multi-method planners).

Fine-grained multi-method planners also generate significantly shorter plans than single-method planners (z=3.42, p<.01). They generate slightly longer plans than coarse-grained multi-method planners; however, no significance is found at the 5% level (z=1.77). These results likely arise because, whenever possible, both types of multi-method planners use the more restrictive methods that yield shorter plan lengths, while there may be little difference between the methods that ultimately succeed for the two types of multi-method planners.

Table 4.6 illustrates the performance of these three types of planners over the same test set of 100 problems used in Table 4.3, in the machine-shop scheduling domain. As with the blocks-world domain, paired-sample z-tests in the scheduling domain, as shown in Table 4.7, indicate that fine-grained planners dominate both single-method planners (z=10.91, p<.01) and coarse-grained planners (z=8.95, p<.01) in terms of planning time. Fine-grained planners also generate significantly shorter plans than do the single-method planners (z=6.49, p<.01). They generate slightly shorter plans than coarse-grained multi-method planners; however, no significance is found at the 5% level (z=1.28).

                        Decisions        Plan length
Planner              A1     A4        A1    A4
M4, M5, M6          31.47  33.97      4.13  4.47
M1→M4, M2→M5, M3→M6 26.17  35.91      2.43  3.58
M1→4, M2→5, M3→6    18.71  19.07      2.87  3.29

Table 4.6: Single-method and coarse-grained multi-method vs. fine-grained multi-method planning in the machine-shop scheduling domain.

Planning type     Average decisions    z vs. single    z vs. coarse-grained
Single            33.97                -               -
Coarse-grained    35.91                1.00            -
Fine-grained      19.07                10.91**         8.95**

(a) Decisions.

Planning type     Average plan length  z vs. single    z vs. coarse-grained
Single            4.47                 -               -
Coarse-grained    3.58                 3.15**          -
Fine-grained      3.29                 6.49**          1.28

(b) Plan length.

Table 4.7: Significance test results for the machine-shop scheduling domain (** p<.01).

Figures 4.3 and 4.4 plot the average number of decisions versus the average plan lengths for the data in Tables 4.4 and 4.6.

[Figure 4.3: Performance of single-method planners (+), coarse-grained multi-method planners (o), and fine-grained multi-method planners (*) in the blocks-world domain; average plan length plotted against average decisions.]

[Figure 4.4: Performance of single-method planners (+), coarse-grained multi-method planners (o), and fine-grained multi-method planners (*) in the scheduling domain; average plan length plotted against average decisions.]

These figures graphically illustrate how the coarse-grained approach primarily reduces plan length in comparison to
the single-method approach, and how the fine-grained approach primarily improves efficiency in comparison to the coarse-grained approach.

4.4 Comparison with Partial-Order Planning

Partial-order planning can be more efficient than total-order planning, because partial-order planning avoids premature commitment to an incorrect ordering between operators, and thus reduces the size of the search space [Minton et al., 1991; Barrett and Weld, 1992]. In particular, Barrett and Weld [1992] showed experimentally that the total-order planner TOCL exhibited apparently exponential time complexity while the partial-order planner POCL maintained near-linear performance as the number of goals was increased in the D1S1 domain. This section compares multi-method planning with partial-order planning, and shows that multi-method planners can perform as well as partial-order planners in this domain.

A template for generating operators in D1S1 is illustrated as follows:

    (define-operator
        :action       A_i
        :precondition {I_i}
        :add          {G_i}
        :delete       {I_{i-1}})

Note that operator A_i deletes the precondition of operator A_{i-1}.³ This implies that for any problem in this domain, there exists a single ordering of operators which solves that problem.

³D1S1 means that each operator has one entry in its delete set and it takes only one step to achieve a goal.

Planning in TOCL is similar to planning in M5 (without learning) in the sense that both are complete and search over the space of sequences of task operators. In the search, operators can be tried which are not provably right, and backtracking can occur if the choice is wrong. In contrast, POCL searches over the space of ordering constraints. It tries out constraints that may be right, and then backtracks over them if they prove wrong. In general, POCL can outperform TOCL by avoiding premature step-ordering constraints.

The performance of TOCL can be improved by adding EBL. The idea is that if a control rule is learned by EBL when a failure occurs, and this rule can cut off all similar failure paths, then TOCL may perform as well as POCL. However, this may not lead to linear performance, because EBL is committing even less than least-commitment planning with respect to adding constraints. In EBL, constraints (i.e., preference rules) are generated only when they are provably correct, and they are never backtracked over. Proving that a constraint is correct can be a non-trivial task, and may not guarantee a polynomial complexity in planning time. In POCL, on the other hand, the added constraints are not proved correct. In the D1S1 domain, however, the constraints generated by POCL do work without backtracking, since there is no operator which adds a precondition of another operator.

Bias-relaxation multi-method planning can improve the performance of TOCL, because a bias allows learning constraints based on weaker proofs (which may prove wrong), and multi-method planning allows backtracking over these constraints. In this domain, a bias called precondition protection is used, which eliminates all plans in which disachieved preconditions are reachieved. A multi-method planner can be constructed which consists of a method using precondition protection (denoted as Mp) and one without it; that is, the least restricted planner (M6).
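For concreteness, a sketch that generates D1S1-style operators following the template above (the dictionary representation is hypothetical); because A_i deletes I_{i-1}, the precondition of A_{i-1}, the only workable ordering is A_1, A_2, ..., A_n:

    def make_d1s1_operators(n):
        ops = []
        for i in range(1, n + 1):
            ops.append({
                'action': f'A{i}',
                'precondition': {f'I{i}'},
                'add': {f'G{i}'},
                # A_i clobbers the precondition of A_{i-1} (no deletion for A_1)
                'delete': {f'I{i - 1}'} if i > 1 else set(),
            })
        return ops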
In fact, a precondition protection bias is weak enough to solve all problems in this domain (thus Mp is complete). However, Mp itself may not be complete for other domains. In that case, backtracking across incorrect preference rules (that is, across methods) can happen in Mp→M6.

                Mp→M6                        POCL
Number of    Number of      CPU Time    Number of        CPU Time
Goals        Nodes (S)      (Sec.)      Nodes (S + O)    (Sec.)
1            1              0.07        2                0.03
2            2              0.09        4                0.05
3            3              0.09        6                0.06
4            4              0.09        8                0.10
5            5              0.13        10               0.13
6            6              0.56        12               0.15
7            7              0.12        14               0.17
8            8              0.14        16               0.19
9            9              0.17        18               0.23
10           10             0.18        20               0.26
11           11             0.20        22               0.31
12           12             0.21        24               0.31
13           13             0.24        26               0.33

Table 4.8: Experimental results for Mp→M6 and POCL.

Table 4.8 shows experimental results for Mp→M6 and POCL on 13 problems in the D1S1 domain. In terms of the number of nodes visited, both planners show linear performance. Note that the definitions of node in the two planners are different because their search spaces are different. In Mp→M6, a node represents an element of the space of operator sequences (S), whereas in POCL a node represents an element of the space of sets of operators (S) plus the sets of ordering constraints among them (O). This explains the factor of two between these two columns.

The data for CPU time also exhibit (near) linear performance for both planners. Although the simulation for these planners is done on the same machine, the differences in CPU time between the two planners do not imply much, because they are coded in different languages (Soar on top of C for Mp→M6, and Lisp for POCL). Nevertheless, the (near) linear performance of Mp→M6 suggests that it can perform as well as POCL in this domain.

4.5 Summary

In this chapter the notion of monotonicity in sequential multi-method planning is investigated. In a monotonic multi-method planner, the single methods are sequenced according to increasing coverage and decreasing efficiency. A formal analysis shows that (1) if the cost of failure for a restricted planner is always less than the cost of success for a more relaxed planner, then monotonic multi-method planners outperform the corresponding time-shared multi-method planners; otherwise, time-shared multi-method planners may perform better; (2) a monotonic multi-method planner takes less planning time than the corresponding single-method planner if the performance gain from using a cheaper method is greater than the time wasted on inappropriate methods in the monotonic multi-method planner; and (3) the lengths of plans generated by a monotonic multi-method planner and the corresponding time-shared multi-method planner are less than or equal to the length of plans generated by the corresponding single-method planner.

A set of bias-relaxation multi-method planners has been constructed. In bias-relaxation multi-method planning, each bias is evaluated in isolation. Thus, bias-relaxation multi-method planning has a restricted scope in creating and comparing individual methods. The constructed bias-relaxation multi-method planners vary in the granularity at which individual methods are selected and used.
De pending on the granularity of m ethod switching, two variations on bias-relaxation m ulti-m ethod planners are implemented: coarse-grained m ulti-m ethod planners, where m ethods are switched on a problem-by-problem basis; and fine-grained m ulti m ethod planners, where m ethods are switched on a goal-by-goal basis. J The experimental results in the blocks-world and machine-shop-scheduling d o -, mains imply th at (1) in term s of planning tim e, fine-grained m ulti-m ethod plan-1 ners can be significantly more efficient than coarse-grained m ulti-m ethod planners j and single-method planners; and (2) in term s of plan length, both fine-grained and coarse-grained m ulti-m ethod planners can be significantly more efficient than single-method planners. Finally, the comparison of m ulti-method planning with partial-order planning in D 1S 1 suggests th at m ulti-m ethod planning can be as efficient as partial-order planning in term s of planning performance. J I C h a p ter 5 ! i A p p lic a tio n to a C o m p lex D o m a in The investigations of m ulti-m ethod planning in the previous chapter have occurred in the context of the blocks-world and machine-shop scheduling domains. These are classical planning domains th at provide good environments for developing and i evaluating m ulti-m ethod planners. However, the intent here is to transfer the multi- | m ethod planning technology to a more realistic domain; in particular, a sim ulated, I battlefield domain. ; The task focused on in this domain is to simulate autom ated intelligent agents i th at can accomplish tactical missions in navy fighters. One interesting aspect of! I this domain is th at the main criterion to evaluate planning is how well the missions j j can be accomplished, whereas planning tim e and plan length are secondary criteria.! j This chapter shows how the m ulti-method planning framework can be applied toj I domains with such a criterion. ! I Since this domain involves the complexity of the real world and the domain it-1 j . . 1 I self is not clearly defined, the full implementation of a planner th at can be used i n , I ! . such agents is beyond the scope of this thesis. The focus in this thesis is on inves tigating the issues related to planning in this domain and dem onstrating planning J capabilities th at m ulti-m ethod planners have, rather than developing a real planner th at can actually be deployed. The application of m ulti-m ethod planning to this , domain will help both in evaluating the degree of domain independence provided by the m ulti-m ethod planning framework, as well as being a step toward integrating the technology into a broader agent. , i This chapter begins with an overview of simulated battlefield environments and , ! . . . . . I ! describes the tactical air simulation task. Then, it demonstrates how m ulti-m ethod planning can be applied to this task. i 5.1 S im u la ted B a ttle fie ld E n v iro n m en ts i The goal of the work in a simulated battlefield environment is to create agents that act as virtual agents to participate in exercises with real hum an agents. These exercises are to be used for training as well as for development of tactics. In order for these exercise to be realistic, the agents m ust be able to behave as much like , hum ans as possible. 
To approximate human behaviors, the agents must have capabilities including obeying tactical missions, planning and reacting in real time, adapting to new situations, learning from experience, exhibiting the cognitive limitations and strengths of humans, interacting with other agents, and so on. Developing agents with such capabilities is a non-trivial task with many real-world complexities. Soar-IFOR is an attempt to build such agents within the Soar architecture. Soar is a promising candidate for developing such agents, because it is a single unified system which can integrate various components of AI technologies such as problem-solving, planning, reasoning, learning, perception, motor control, and so on. In addition, Soar is the basis for the development of unified theories of human cognition [Newell, 1990], and thus can provide an appropriate framework for modeling human-like agents. To begin the effort to build automated intelligent agents for simulated battlefield environments, Soar-IFOR has mainly focused on creating specific automated agents, called TacAir-Soar, for simulated tactical air environments [Jones et al., 1993; Rosenbloom et al., 1994].

5.2 Tactical Air Simulation

The goal of building TacAir-Soar is to construct automated intelligent agents for flight simulators that are used to train navy pilots in flight tactics. For example, such an agent can be used in simulating a Barrier Combat Air Patrol (BARCAP) mission; that is, to patrol the skies to protect a High-Value Unit (HVU) such as an aircraft carrier. During the course of the mission, if the agent detects a hostile aircraft, it intercepts the aircraft by firing missiles and then resumes its patrol.

One of the important characteristics of tactical air simulation is that it is a highly reactive, real-time, I/O-intensive task. Thus, the agents must be able to make decisions in real time and react appropriately to changes in the environment. On the other hand, it is a highly goal-oriented (or mission-oriented) task. The goals include accomplishing multiple missions and survival. Thus, the agent must have a planning capability which can deal with multiple goals at the same time.

Dealing with multiple goals involves the following issues: how to represent multiple goals and their interactions, how to generate appropriate actions that satisfy multiple goals at a time, how to decide on appropriate actions when multiple goals require conflicting behaviors, which goals can be ignored if necessary, and so on. The next section presents a prototype agent which employs the multi-method planning technique for tactical air simulation, and demonstrates how multi-method planning can deal with these multiple-goal issues. Although TacAir-Soar is a highly reactive agent, the implementation of the prototype agent based on the multi-method planning technique focuses on the planning capabilities only, and not on the reactive capabilities.

5.3 Implementation

The application of multi-method planning in tactical air simulation is based on a beyond-visual-range (BVR) 1-v-1 aggressive bogey scenario [Jones et al., 1993; Johnson, 1994; Tambe and Rosenbloom, 1994]. This scenario involves two armed aircraft with similar capabilities. One aircraft (an F14) is attempting to protect a high-value unit and the other (a MiG29) is attempting to destroy it. When the two aircraft come in contact, they both attempt to intercept and destroy each other, with the overall goals of accomplishing their missions (here, protecting or attacking the HVU), maintaining situational awareness, and surviving.
W hen the two aircraft come in contact, they both attem pt to intercept and destroy each other, with the overall goals of accomplishing their missions — here, protecting the HVU (or attacking the HVU) — situational awareness, and surviving. Protect HVU Escape Situational Awareness Evade M issile Intercept Bogey Racetrack Get M issile LAR Push Fire Botton Confuse Select M issile Survive Destroy Bandit Support M issile Figure 5.1: A skeleton of the goal hierarchy for the 1-v-l aggressive bogey scenario. W hile performing BARCAP, if a bogey (an unknown aircraft) is noticed, the F14 tries to determ ine whether the bogey is a bandit (an enemy aircraft). If the bogey is identified as a bandit, the F14 attem pts to destroy it by firing missiles. In order to destroy it, the F14 selects a missile — a long-range missile (LRM) 88 here — and approaches the MiG29 close enough to get into its LRM’s launch- ' acceptability region (LAR). After launching a LRM, the F14 makes an F-pole (a , maneuver involving a 25-50 degree turn) to provide radar guidance to the missile, while decreasing the closure between the two aircraft. The fight continues until ! one aircraft is destroyed or runs away. A skeleton of the goal hierarchy for this | , scenario from the F l4 ’s point of view is shown in Figure 5.1. 1 The key issue in implementing a planner in this domain is that some goals can j never be achieved completely. These goals are called maintenance goals. P r o te c t- j I H V U , s itu a tio n a l-a w a re n e s s and su rv iv e are such examples. Thus, the role of ■ an operator for a maintenance goal is not to achieve the goal but to continuously j 1 m aintain the status of the goal. For example, a single application of the oper ator BARCAP does not achieve the goal protect-HVU. By applying this operator continuously, the HVU remains protected. M aintenance goals make it possible to define m ultiple achievement-levels rather than two levels — achieved and unachieved. For example, the achievement-level ! for the goal protect-H V U is maximum when there is no threat to the HVU, while the achievement-level for this goal is minimum when the HVU is destroyed. W hen , a bogey is noticed, the achievement-level is decreased from the maximum, because the bogey can potentially destroy the HVU. If the bogey is identified as a bandit, the achievement-level is further decreased. From the opposite point of view, one can define m ultiple threat-levels for each m aintenance goal; that is the threat-level is maximum when the achievement-level is minimum, and vise versa. For each m aintenance goal, operators which decrease the threat-level for th at goal are proposed. The m ultiple threat-level scheme allows the notion of protection to be refined for m aintenance goals. Instead of protecting achieved goals from being undone, it protects the threat-levels for other goals from being increased. The strongest form of protection in this domain, denoted as GPo, eliminates all plans in which an operator increases the threat-level of another goal. Weaker forms of protection, denoted as GPi (*=1, ...,n — 2, where n is the number of threat-levels) eliminates 89 all plans in which an operator increases the threat-level of another goal by more than i1. Goal Flexibility Dimension Directness Nonlinear GPo Goal Protection GPi Dimension ... GPn- 2 N o GP Table 5.1: The 2 x n planning methods generated from directness and protection, where there are n threat-levels. The notions of directness and linearity are not changed here. 
The key to implementing the prototype agent is that the threat-levels for maintenance goals must remain as low as possible. By doing so, the probability that one (or some) of the maintenance goals is seriously threatened can be decreased. Single-method planners are not appropriate here: if their biases are too strong, they cannot solve the problem without seriously threatening other goals; on the other hand, if their biases are too weak, planning takes too much time since the search space is too large.

Based on the planning methods shown in Table 5.1, a fine-grained multi-method planner is implemented for the 1-v-1 scenario. Table 5.2 presents an example of how the implemented fine-grained multi-method planning actually works. In this implementation, six threat-levels are used, as shown in Table 5.2 (a).

Threat-level    Threat-type
0               No threat
1               Potential threat
2               Minor threat
3               Intermediate threat
4               Major threat
5               Fatal threat

(a) Six threat-levels for the 1-v-1 aggressive bogey scenario.

Current state: The bandit is aggressive.

                                                Threat-Level Change
Goal           Proposed Operator    Goal           Before    After
protect-HVU    DESTROY-BANDIT       protect-HVU    3         2
                                    survive        3         4
survive        ESCAPE               survive        3         1
                                    protect-HVU    3         5

Selected operator: DESTROY-BANDIT.

(b) Operator selection when the bandit is aggressive.

Table 5.2: Example of fine-grained multi-method planning for the tactical air domain.

Table 5.2 (b) shows how an operator is selected when multiple operators are proposed in the situation where the bandit is aggressive. In this situation, the threat-levels for both goals protect-HVU and survive are set to three, because the aggressive behavior of the bandit is considered a medium level of threat to protecting the HVU and to surviving. For the goal protect-HVU, the DESTROY-BANDIT operator is proposed, which can decrease that goal's threat-level to minor threat. For the goal survive, the ESCAPE operator is proposed, which can decrease that goal's threat-level to potential threat. In evaluating these operators, however, applying ESCAPE increases the threat-level for the other goal, protect-HVU, to fatal threat, whereas applying DESTROY-BANDIT increases the threat-level for survive only to major threat. Since DESTROY-BANDIT increases the threat-level for the other goal less than ESCAPE does, DESTROY-BANDIT is selected here.
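The selection in Table 5.2 (b) can be read as choosing the proposed operator whose worst increase to the other goals' threat-levels is smallest; a sketch reproducing that situation (the data structures are hypothetical):

    def select_operator(proposals):
        """proposals: list of (name, own_goal, before, after) tuples, where
        before/after map goals to threat-levels."""
        def worst_increase(p):
            _, goal, before, after = p
            return max(after[g] - before[g] for g in before if g != goal)
        return min(proposals, key=worst_increase)[0]

    before = {'protect-HVU': 3, 'survive': 3}
    proposals = [
        ('DESTROY-BANDIT', 'protect-HVU', before,
         {'protect-HVU': 2, 'survive': 4}),   # survive rises by 1
        ('ESCAPE', 'survive', before,
         {'protect-HVU': 5, 'survive': 1}),   # protect-HVU rises by 2
    ]
    assert select_operator(proposals) == 'DESTROY-BANDIT'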
5.4 Summary

In this chapter, how multi-method planning can be applied to a tactical air domain is briefly discussed. A preliminary investigation is made of some of the planning issues in this domain, such as how to deal with maintenance goals and how to decide on appropriate actions when multiple goals require conflicting behaviors. In doing this, the notion of protection is refined such that one protects the threat-levels for other goals from being increased. Multi-method planning based on the refined protection biases shows how appropriate actions can be generated by such a planner.

Chapter 6

Related Work

This chapter describes work related to the multi-method planning framework. Section 6.1 describes biases used in other planning systems. Section 6.2 compares the planning and learning framework in Soar to other planning frameworks. Finally, Section 6.3 compares the presented multi-method planning technique to other related approaches.

6.1 Biases in Planning

Some of the planning biases used here were introduced by earlier planning systems as planning heuristics. For example, the linearity assumption has been used in planners using a goal stack because of its simplicity [Fikes and Nilsson, 1971]. Also, protection has been used in many planners to reduce the size of the search space and to avoid generating non-optimal plans. These two biases are discussed in more detail here.

6.1.1 Linearity

In a conjunctive goal problem, the assumption that subgoals can be achieved sequentially, and thus that the generated plan is a sequence of complete subplans for the conjunctive goals, is known as the linearity assumption [Sussman, 1973]. Although many problems cannot be solved without interleaving goal conjuncts, this assumption has two interesting properties.

First, it makes the original problem simpler by allowing decomposition of the problem into a set of subproblems and then solving each subproblem in sequence. Since only a single goal conjunct is considered for each subproblem, the search space to solve the entire problem can be reduced.

Second, it provides a basis for classifying groups of problems in terms of a problem's complexity. Korf [1987] provided a more refined taxonomy of how subgoals interact with each other. He defined a set of subgoals to be independent if each operator only changes the distance to a single subgoal. Though this definition is based on a very strong assumption about goal interference, an optimal global solution can be achieved by simply concatenating together optimal solutions to the individual subproblems in any order. Solving a single independent subgoal might be nontrivial, but the complexity of problems with independent subgoals increases only linearly with the number of subgoals.

Also, he defined a set of subgoals to be serializable if there exists an ordering among the subgoals such that the subgoals can always be solved sequentially without ever violating a previously solved subgoal in the order. Since this definition is based on the linearity assumption and goal protection, a problem which consists of serializable subgoals can be classified as an element of A2 (that is, the set of problems solvable by the linear protection method) in the multi-method planning framework. Barrett and Weld [1992] defined a set of subgoals to be trivially serializable if they can be solved in any order without ever violating a previously solved subgoal. From these definitions it follows that if a set of subgoals is independent, it is trivially serializable, and that if a set of subgoals is trivially serializable, it is serializable.
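These definitions suggest a direct test; a minimal sketch, assuming a hypothetical solver solve_keeping(state, goal, kept) that achieves goal while protecting the already-solved subgoals in kept, returning the new state or None on failure:

    def serializable_in_order(goals, state, solve_keeping):
        kept = []
        for goal in goals:
            state = solve_keeping(state, goal, kept)
            if state is None:
                return False  # this ordering violates a previously solved subgoal
            kept.append(goal)
        return True

    # A subgoal set is serializable if some ordering passes this test,
    # and trivially serializable if every ordering does.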
6.1.2 Protection

The notion of protection was introduced in HACKER [Sussman, 1973]. In HACKER, a protection violation is detected if "a protected subgoal is clobbered between the time it is established and the time it is no longer needed" [Sussman, 1973, page 63]. HACKER deals with protection violations by employing procedures called critics that recognize such violations. When necessary, HACKER is able to repair the plan by rearranging the steps in the plan.

Waldinger [1977] developed an approach, called goal regression, to protect achieved goals. It involves creating a plan to solve one subgoal, followed by constructive modifications to achieve the other subgoals. It differs from HACKER in that it uses the notion of goal protection to guide the linear placement of actions in the plan. Rather than building incorrect plans and then debugging them, it builds partial linear plans in non-sequential order and moves subgoals backwards through the partial linear plans to where they do not interfere with other subgoals. Vere [1983] also developed a technique, called splicing, which relaxes protection when it has caused a deadlock.

SNLP uses causal links to represent protection intervals and deals with threats to them.

6.2 Planning and Learning in Soar

In this thesis, planning operators are represented by operator proposal rules, operator application rules, goal expansion rules, and instantiated Soar operators in working memory. However, this is not the only way to represent planning operators in Soar. For example, Unruh's [1993] operator representation for abstraction includes rules to check operators' preconditions explicitly before applying operators. Also, in Soar, goal expansion for a violated precondition is usually implemented by creating a new Soar subgoal and achieving the violated condition within the subgoal, via the operator subgoaling scheme [Laird et al., 1987].

Planning in Soar is similar to planning in PRODIGY in that both systems use a set of preference-based control rules to yield a sequence of operators. One of the differences between these two systems is that while Soar learns control rules from the results of look-ahead search, PRODIGY learns control rules from its own problem-solving trace [Minton, 1988] or from a static analysis of the domain theory [Etzioni, 1990a]. Another difference is that the original PRODIGY (version 2.0) uses a linear planning approach [Minton et al., 1989]. Veloso [1989] developed a nonlinear version of PRODIGY, but the learning method used was a case-based approach.

6.3 Multi-Method Planning

The basic approach of bias relaxation in multi-method planning is similar to the shift of bias for inductive concept learning [Russell and Grosof, 1987; Utgoff, 1986]. In the planning literature, this approach is closely related to ordering modification, a control strategy for preferring to explore some plans before others [Gratch and DeJong, 1990]. If the preference is wrong, the alternatives will eventually be reached; thus, ordering modification retains planner completeness. Gratch and DeJong explicitly distinguished this modification from structural modification, which prunes portions of the potential plan space. Planning systems which employ multi-method planning techniques include STEPPINGSTONE [Ruby and Kibler, 1991] and FAILSAFE-2 [Bhatnagar and Mostow, 1990].
6.2 Planning and Learning in Soar

In the thesis, planning operators are represented by operator proposal rules, operator application rules, goal expansion rules, and instantiated Soar operators in working memory. However, this is not the only way to represent planning operators in Soar. For example, Unruh's [1993] operator representation for abstraction includes rules that check operators' preconditions explicitly before applying the operators. Also, in Soar, goal expansion for a violated precondition is usually implemented by creating a new Soar subgoal and achieving the violated condition within that subgoal, via the operator subgoaling scheme [Laird et al., 1987].

Planning in Soar is similar to planning in PRODIGY in that both systems use a set of preference-based control rules to yield a sequence of operators. One of the differences between the two systems is that while Soar learns control rules from the results of look-ahead search, PRODIGY learns control rules from its own problem-solving trace [Minton, 1988] or from a static analysis of the domain theory [Etzioni, 1990a]. Another difference is that the original PRODIGY (version 2.0) uses a linear planning approach [Minton et al., 1989]. Veloso [1989] developed a nonlinear version of PRODIGY, but the learning method used was a case-based approach.

6.3 Multi-Method Planning

The basic approach of bias relaxation in multi-method planning is similar to the shift of bias for inductive concept learning [Russell and Grosof, 1987, Utgoff, 1986]. In the planning literature, this approach is closely related to ordering modification, a control strategy that prefers exploring some plans before others [Gratch and DeJong, 1990]. If the preference is wrong, the alternatives will eventually be reached; thus, ordering modification retains planner completeness. Gratch and DeJong explicitly distinguished this modification from structural modification, which prunes portions of the potential plan space. Planning systems which employ multi-method planning techniques include STEPPINGSTONE [Ruby and Kibler, 1991] and FAILSAFE-2 [Bhatnagar and Mostow, 1990]. These two systems are discussed in more detail here, and a comparison of multi-method planning with partial-order planning is presented.

6.3.1 STEPPINGSTONE

STEPPINGSTONE is a learning problem solver that decomposes a problem into simple and difficult subproblems. It solves the simple subproblems with an inexpensive constrained problem solver. To solve the difficult subproblems, STEPPINGSTONE uses an unconstrained problem solver. Once it solves a difficult subproblem, it uses the solution to generate a sequence of subgoals, or steppingstones, that can be used by the constrained problem solver to solve this difficult subproblem when it occurs again.

The constrained problem solver takes as input a set of subgoals which are ordered based on a heuristic called openness. It attempts to solve the subgoals in the given order, and generates a solution for the subgoals if one is found. The constraint used in this problem solver is that each solved subgoal is protected. If the constrained problem solver is unable to solve a subgoal, a memory component is called. The memory component is based on a case-based approach: it matches the current problem-solving context — that is, the subgoal currently being solved, the currently protected subgoals, and the current state — against stored contexts, and then returns the ordered subgoals for the matching context. When the memory component fails to return any useful subgoal ordering, the unconstrained problem solver is called. The unconstrained problem solver relaxes the protection on the solved subgoals to find a solution.

Since STEPPINGSTONE generates a solution according to the prescribed subgoal ordering, the constrained problem solver is comparable to M2 (the linear planner with protection), while the unconstrained problem solver is comparable to M5 (the linear planner without protection). This implies that STEPPINGSTONE is close to the sequential multi-method planner M2 → M5; the difference between the two is that STEPPINGSTONE has a case-based memory component in between M2 and M5, which is analogous to the transfer of control rules across problems.

6.3.2 FAILSAFE-2

FAILSAFE-2 (FS2) is a system that performs adaptive search by learning from its failures. The FS2 problem solver uses two types of search control knowledge: goal selection rules, which constrain the selection of which goal to pick as the next current goal; and censors, which constrain the selection of which operator to apply to the current state.

There are two types of interactions between the problem solver and the learner. The first type occurs when the search is under-constrained. The symptoms of under-constrained search include violating a protected goal, reaching a state loop, and exceeding a preset goal-depth limit. If any of these symptoms is found, the problem solver declares a failure and invokes the learner. If the learner is able to identify the problem-solving step that led to the failure, it adds a new censor to prevent similar failures in the future.

The other type of interaction between the problem solver and the learner occurs when the search is over-constrained. Over-constrained search prunes away all solution paths. Domain-independent heuristics are used to detect over-constrained search. When it is detected, the problem solver calls a heuristic procedure which relaxes a censor. If relaxing the censor leads to achieving the current goal, FS2 infers that the censor was over-general and calls the learner to specialize it.
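The interplay of censoring and relaxation can be pictured roughly as follows. This is a greatly simplified sketch under assumed interfaces; FS2 itself suspends censored states and applies weak backward chaining rather than simply discarding a censor:

```python
def select_operator(state, operators, censors, applicable):
    """Pick an applicable operator; on deadlock, relax (drop) one censor.

    censors: predicates censor(state, op) -> True to prune op
    Returns (operator or None, censors still in force).
    """
    candidates = [op for op in operators if applicable(state, op)]
    active = list(censors)
    while active:
        allowed = [op for op in candidates
                   if not any(c(state, op) for c in active)]
        if allowed:
            return allowed[0], active
        active.pop()   # over-constrained: relax the most recent censor, retry
    return (candidates[0] if candidates else None), active
```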
The basic idea of censor relaxation in FS2 is close to the bias-relaxation mechanism in the thesis. However, there are a number of differences, such as the granularity at which censors are relaxed and the way they are relaxed. Whenever applying an operator to the current state violates a censor, that state is marked as suspended. Once the problem solver cannot make progress by forward search with the censor, FS2 selects the suspended state that is likely to be closest to the goal, based on a heuristic, and uses a weak form of backward chaining (WBC) which recurses on the failed preconditions of an operator one at a time. If a solution is found by this relaxation, the censor is specialized so that it does not prevent the expansion of the search tree in the future.

6.3.3 Partial-Order Planners

Fine-grained multi-method planning is related to traditional partial-order planning, where heuristics are used to guide search over the space of partially ordered plans without violating planner completeness. For example, SNLP [McAllester and Rosenblitt, 1991, Barrett and Weld, 1992] uses a heuristic which prefers nodes with fewer unresolved goals. Using directness in fine-grained multi-method planners is similar to this heuristic, because applying an operator without violating directness reduces the number of unachieved goals by at least one.

The least-commitment approach can be viewed as planning which starts with the strong assumption that the problem can be solved without any ordering constraints, and relaxes that assumption by adding ordering constraints successively, only as necessary. In this sense, it is similar to the bias-relaxation approach, which starts from a set of biases and relaxes the biases only when the problem (or subproblem) cannot be solved with those biases.

Chapter 7

Conclusion

This chapter summarizes the methodology used in the thesis and the results, and then presents some of the limitations of this methodology and directions for future work.

7.1 Summary of the Approach and Results

In this thesis, two hypotheses are investigated in depth: (1) no single planning method will satisfy both sufficiency and efficiency for all situations; and (2) multi-method planning can outperform single-method planning in terms of sufficiency and efficiency. To evaluate these hypotheses, a set of single-method planners and a set of multi-method planners have been created. The creation of these planners is based on the notion of bias in planning.

Bias is a useful notion in planning because it can potentially reduce computation effort by reducing the number of plans that must be examined, and it can potentially generate shorter plans by avoiding plans containing inefficient operator sequences. By varying the amount of bias used, a set of planning methods with different performance and scope can be generated.

To evaluate the first hypothesis, a system has been constructed that can utilize different single methods, defined along two bias dimensions: goal-flexibility and goal-protection. The goal-flexibility dimension determines the degree of flexibility the planner has in generating new subgoals and in shifting the focus in the goal hierarchy; this dimension subsumes the directness and linearity biases. The goal-protection dimension determines whether or not an achieved top-level goal conjunct is protected between the time it is achieved and the time it is no longer needed. By taking the cross-product of these two dimensions, six different methods are created.
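For illustration, the cross-product can be enumerated as below. The thesis identifies M2 as the linear planner with protection and M5 as the linear planner without it; the three labels used here for the goal-flexibility settings are assumptions made for this sketch, not the thesis's terminology.

```python
from itertools import product

GOAL_FLEXIBILITY = ("direct", "linear", "nonlinear")   # assumed labels
GOAL_PROTECTION = (True, False)                        # protect achieved goals?

METHODS = {f"M{i + 1}": {"flexibility": f, "protection": p}
           for i, (p, f) in enumerate(product(GOAL_PROTECTION,
                                              GOAL_FLEXIBILITY))}

# Under this numbering M1-M3 protect achieved goal conjuncts and M4-M6 do
# not, so that, for example:
#   METHODS["M2"] == {"flexibility": "linear", "protection": True}
#   METHODS["M5"] == {"flexibility": "linear", "protection": False}
```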
These methods have been implemented in Soar. In Soar, plans are represented as sets of variabilized control rules and sets of instantiated preferences that jointly specify which operators should be executed at each point in time. The effect of learning in these methods on the performance of planning has been investigated. The six implemented methods have been compared empirically in terms of planner completeness, planning time, and plan length. The experimental results show a trade-off between completeness and efficiency for these methods — that is, if a method is too restricted, it cannot generate plans for some problems, while if it is too relaxed, it takes too much time to generate plans, and the generated plans are inefficient.

As an alternative to single-method planners, multi-method planners have been created. A multi-method planner consists of a coordinated set of planning methods, where each individual method has different scope and performance. Given a set of created methods, the key issue is how to coordinate the methods in an efficient manner so that the multi-method planner can have high performance. This includes the issues of selecting appropriate methods as situations arise, and of the granularity of method switching as the situational demands shift.

For the method selection issue, two ways of organizing individual methods in a multi-method planner — sequential and time-shared — have been compared analytically. The wasted effort in a sequential multi-method planner is the cost of trying the earlier methods in the sequence, whereas the wasted effort in a time-shared multi-method planner is the cost of trying all methods in the method set except the one that actually solves the problem. The wasted effort in sequential multi-method planning is sensitive to the ordering of the methods, because it takes too much time if the earlier methods are not efficient enough, or, in an extreme case, it may not be able to generate a plan at all if one of the earlier methods does not halt. On the other hand, the wasted effort in time-shared multi-method planning is sensitive to the number of individual methods.

As an approach to reducing the wasted time in sequential multi-method planning, monotonic multi-method planning has been investigated. In a monotonic multi-method planner, the individual methods are ordered according to decreasing efficiency and increasing coverage, based on the empirical performance of those methods on a training set of problems. A formal analysis shows that (1) a monotonic multi-method planner takes less planning time than the corresponding single-method planner if the performance gained by using a cheaper method exceeds the time wasted by using inappropriate methods in the monotonic multi-method planner; and (2) the lengths of plans generated by a monotonic multi-method planner are less than or equal to the lengths of plans generated by the corresponding single-method planner.
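A minimal sketch of the sequential organization (assumed interfaces; each restricted method must itself terminate with failure for the ordering to pay off): the wasted effort is exactly the time spent inside the earlier methods that fail.

```python
def sequential_multimethod(problem, methods):
    """Try methods in order, from most restricted (cheapest) to most
    general (most expensive); a monotonic ordering sorts them by
    decreasing efficiency and increasing coverage.

    methods: callables method(problem) -> plan or None
    """
    for method in methods:
        plan = method(problem)
        if plan is not None:
            return plan      # earlier failed attempts are the wasted effort
    return None              # even the most general method failed
```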
To restrict the scope of the individual methods to be generated and compared, a set of bias-relaxation multi-method planners has been constructed based on the notion of effective bias. In a bias-relaxation multi-method planner, planning starts by trying highly efficient methods, and then successively relaxes effective biases until a sufficient method is found.

The second issue in coordinating individual methods in multi-method planning is the granularity at which individual planning methods are switched. While in coarse-grained multi-method planners methods are switched for a whole problem when no solution can be found for the problem within the current method, in fine-grained multi-method planners methods can be switched at any point during a problem at which a new set of subgoals is formulated, and the switch only occurs for that set of subgoals (and not for the entire problem). Both coarse-grained and fine-grained multi-method planners are implemented via bias relaxation.

There is a trade-off between coarse-grained and fine-grained multi-method planning. A coarse-grained multi-method planner finds a solution within the first method that has one, at the cost of searching the entire biased space in the worst case. On the other hand, a fine-grained multi-method planner can save the effort of searching all other alternatives within the current method; however, it does not guarantee finding a solution that may exist within the current biased space.

The experimental results in the blocks-world and machine-shop-scheduling domains imply that (1) in terms of planning time, fine-grained multi-method planners can be significantly more efficient than coarse-grained multi-method planners and single-method planners; and (2) in terms of plan length, both fine-grained and coarse-grained multi-method planners can be significantly more efficient than single-method planners.

In summary, the primary contribution of this thesis is the development of a new multi-method planning framework. This framework is based on the notion of bias (for method creation), and on the notions of monotonicity, bias relaxation, and the granularity of method switching (for method coordination). The experimental results indicate that, at least for the domains investigated, the created multi-method planners are more efficient than complete single-method planners.

7.2 Limitations and Future Work

The multi-method planning framework investigated in this thesis is based on three biases: linearity, protection, and directness. One way to enhance the framework would be to extend the set of available biases. Candidates include biases that limit the size of the goal hierarchy, such as goal-depth or goal-breadth (to reduce the search space); that limit the length of the plans generated, such as plan-length (to shorten execution time); and that lead to learning more effective rules, such as goal-nonrepetition (to increase transfer).

The multi-method planners used here do not guarantee finding optimal plans for a given problem. However, if a plan-length bias is incorporated into coarse-grained multi-method planning, where the bound is incrementally specified along with the sequence of individual methods, the multi-method planner will be able to find optimal plans for all problems. In fact, this approach implements depth-first iterative-deepening [Korf, 1985] on the length of the plans generated.
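A sketch of that observation, assuming a planner plan_with_bound that honors a plan-length bias: relaxing the bound one step at a time is exactly iterative deepening on plan length, so the first plan returned is a shortest one. (As written, the loop does not terminate on unsolvable problems; a cutoff would be needed in practice.)

```python
from itertools import count

def shortest_plan(problem, plan_with_bound):
    """plan_with_bound(problem, k) -> a plan of length <= k, or None."""
    for bound in count(1):          # relax the plan-length bias step by step
        plan = plan_with_bound(problem, bound)
        if plan is not None:
            return plan             # no shorter plan exists
```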
The bias selection approach used here is based on preprocessing a set of training examples in order to develop fixed sequences of biases (and methods). This approach has a limitation when it is hard to generate testing problems or when the problem distribution is unknown. A more dynamic, run-time approach would be to learn, while doing, which biases (and methods) to use for which classes of problems. If such learned information can transfer to later problems, much of the effort wasted in trying inappropriate methods may be reduced.

One problem with learning which methods to use for which classes of problems in the current multi-method framework is that some learned rules (chunks) are expensive. Restricting the expressiveness of task encodings, such as by the unique-attribute scheme, can solve this problem [Tambe et al., 1990]; however, the learned rules based on this scheme may not be general enough to transfer to later problems because of the limited expressibility. Another approach to solving the expensive-chunk problem is to incorporate search control knowledge into the explanation [Kim and Rosenbloom, 1993]. This approach can solve the expensive-chunk problem without restricting expressiveness: a learned rule can be used with its cost bounded by the cost of the problem solving from which it was learned.

The methodology for generating a set of monotonic multi-method planners or a set of bias-relaxation multi-method planners does not specify which multi-method planner is the optimal one for a given problem distribution. Greiner [1992] developed an algorithm called PALO which searches the space of performance elements and selects a near locally optimal element by using statistical techniques to approximate the distribution. By employing the PALO algorithm in the multi-method planning framework, it may be possible to generate the optimal multi-method planner for a given distribution.

Within simulated battlefield environments, the focus of this thesis is on planning for the beyond-visual-range 1-v-1 aggressive bogey scenario. One direction for future work in this domain is to apply the technique described in Chapter 5 to other scenarios, such as 1-v-2, 2-v-1, and 2-v-n scenarios, or the within-visual-range scenario. Application to air-to-ground or ground-to-ground combat simulation would be another possibility.

Appendix A

Experimental Results: The Blocks-World Domain

This appendix gives the detailed numeric information from the experiments in the blocks-world domain. Appendix A.1 presents the experimental results for the six single-method planners over 30 training problems. Appendix A.2 presents the experimental results for the six single-method planners and the created multi-method planners over 100 test problems.

A.1 Performance over 30 Training Problems

[The body of Table A.1 is too garbled in the source scan to reconstruct row by row. Its recoverable summary rows are: total decisions 196 / 202 / 202 over the three trials (average 200.0); total plan length 25 / 25 / 25 (average 25.0); 16 of the 30 problems solved in every trial; average decisions per solved problem 12.25 / 12.62 / 12.62 (average 12.50); average plan length 1.56 in every trial.]

Table A.1: Performance of M1 over 30 training problems in the blocks-world domain.
Method M2
(Problem number, B = blocks, G = goal conjuncts; decisions over Trials 1-3 with their average; plan length over Trials 1-3 with their average. "-" marks an unsolved problem.)

No. B G | Decisions: T1 T2 T3 Avg | Plan length: T1 T2 T3 Avg
1 4 3 | 19 19 40 26.0 | 4 4 4 4.0
2 4 3 | 16 16 16 16.0 | 2 2 2 2.0
3 4 4 | - - - - | - - - -
4 3 3 | 34 33 32 33.0 | 3 3 3 3.0
5 4 3 | 8 8 8 8.0 | 1 1 1 1.0
6 4 2 | 27 27 31 28.3 | 4 4 4 4.0
7 3 2 | 26 27 27 26.7 | 3 3 3 3.0
8 3 2 | 8 8 8 8.0 | 1 1 1 1.0
9 3 3 | - - - - | - - - -
10 4 4 | 8 8 8 8.0 | 1 1 1 1.0
11 4 4 | 36 25 35 32.0 | 4 3 4 3.7
12 4 2 | 31 33 25 29.7 | 3 3 3 3.0
13 3 3 | 33 25 34 30.7 | 3 3 3 3.0
14 4 3 | 3 3 3 3.0 | 0 0 0 0.0
15 3 3 | 16 16 26 19.3 | 2 2 2 2.0
16 3 2 | 8 8 8 8.0 | 1 1 1 1.0
17 3 2 | 9 9 9 9.0 | 2 2 2 2.0
18 4 2 | 16 16 16 16.0 | 2 2 2 2.0
19 4 4 | 8 8 8 8.0 | 1 1 1 1.0
20 3 2 | 32 25 25 27.3 | 3 3 3 3.0
21 3 3 | 28 28 27 27.7 | 4 4 3 3.7
22 3 2 | 8 8 8 8.0 | 1 1 1 1.0
23 4 3 | 25 34 25 28.0 | 3 4 3 3.3
24 3 3 | 8 8 8 8.0 | 1 1 1 1.0
25 4 2 | 9 9 9 9.0 | 2 2 2 2.0
26 4 2 | 16 16 16 16.0 | 2 2 2 2.0
27 4 2 | 18 27 29 24.7 | 3 4 5 4.0
28 4 3 | 25 31 43 33.0 | 3 3 5 3.7
29 4 3 | - - - - | - - - -
30 3 2 | - - - - | - - - -
Total | 475 475 524 491.3 | 59 60 62 60.3
Solved problems | 26 26 26 26.0 | 26 26 26 26.0
Average | 18.27 18.27 20.15 18.90 | 2.27 2.31 2.38 2.32

Table A.2: Performance of M2 over 30 training problems in the blocks-world domain.

Method M3

No. B G | Decisions: T1 T2 T3 Avg | Plan length: T1 T2 T3 Avg
1 4 3 | 19 19 41 26.3 | 4 4 4 4.0
2 4 3 | 16 16 16 16.0 | 2 2 2 2.0
3 4 4 | - - - - | - - - -
4 3 3 | 33 32 33 32.7 | 4 3 4 3.7
5 4 3 | 8 8 8 8.0 | 1 1 1 1.0
6 4 2 | 34 50 202 95.3 | 4 5 5 4.7
7 3 2 | 33 33 33 33.0 | 3 3 3 3.0
8 3 2 | 8 8 8 8.0 | 1 1 1 1.0
9 3 3 | - - - - | - - - -
10 4 4 | 8 8 8 8.0 | 1 1 1 1.0
11 4 4 | 32 72 48 50.7 | 3 5 4 4.0
12 4 2 | 46 32 40 39.3 | 3 3 3 3.0
13 3 3 | 32 32 41 35.0 | 3 3 4 3.3
14 4 3 | 3 3 3 3.0 | 0 0 0 0.0
15 3 3 | 16 24 16 18.7 | 2 2 2 2.0
16 3 2 | 8 8 8 8.0 | 1 1 1 1.0
17 3 2 | 9 9 9 9.0 | 2 2 2 2.0
18 4 2 | 16 16 16 16.0 | 2 2 2 2.0
19 4 4 | 8 8 8 8.0 | 1 1 1 1.0
20 3 2 | 25 25 32 27.3 | 3 3 3 3.0
21 3 3 | 34 49 41 41.3 | 3 4 3 3.3
22 3 2 | 8 8 8 8.0 | 1 1 1 1.0
23 4 3 | 32 119 74 75.0 | 3 7 6 5.3
24 3 3 | 8 8 8 8.0 | 1 1 1 1.0
25 4 2 | 9 9 9 9.0 | 2 2 2 2.0
26 4 2 | 16 16 16 16.0 | 2 2 2 2.0
27 4 2 | 34 58 18 36.7 | 4 5 3 4.0
28 4 3 | 94 40 56 63.3 | 6 3 4 4.3
29 4 3 | - - - - | - - - -
30 3 2 | - - - - | - - - -
Total | 589 710 800 699.7 | 62 67 65 64.7
Solved problems | 26 26 26 26.0 | 26 26 26 26.0
Average | 22.65 27.31 30.77 26.91 | 2.38 2.58 2.50 2.49

Table A.3: Performance of M3 over 30 training problems in the blocks-world domain.

Method M4

[The body of Table A.4 is too garbled in the source scan to reconstruct row by row. Its recoverable summary rows are: total decisions 213 / 216 / 221 over the three trials (average 216.7); total plan length 25 / 25 / 25 (average 25.0); 16 of the 30 problems solved in every trial; average decisions per solved problem 13.31 / 13.50 / 13.81 (average 13.54); average plan length 1.56 in every trial.]

Table A.4: Performance of M4 over 30 training problems in the blocks-world domain.
110 M ethod Ms Problem s D ecisions P lan Length Num ber Blocks Goals Trial 1 Trial 2 Trial 3 Average Tried 1 Trial 2 Trial 3 Average 1 4 3 58 68 188 104.7 7 9 26 14.0 2 4 3 16 16 16 16.0 2 2 2 2.0 3 4 4 27 29 18 24.7 4 4 3 3.7 4 3 3 24 32 24 26.7 3 3 3 3.0 5 4 3 8 8 8 8.0 1 1 1 1.0 6 4 2 44 34 44 40.7 6 4 6 5.3 7 3 2 26 25 25 25.3 3 3 3 3.0 8 3 2 8 8 8 8.0 1 1 1 1.0 9 3 3 20 20 18 19.3 4 4 3 3.7 10 4 4 8 8 8 8.0 1 1 1 1.0 11 4 4 25 34 35 31.3 3 4 4 3.7 12 4 2 25 25 36 28.7 3 3 5 3.7 13 3 3 33 49 32 38.0 3 12 7 7.3 14 4 3 3 3 3 3.0 0 0 0 0.0 15 3 3 16 26 26 22.7 2 2 2 2.0 16 3 2 8 8 8 8.0 1 1 1 1.0 17 3 2 25 27 16 22.7 7 8 2 5.7 18 4 2 16 16 16 16.0 2 2 2 2.0 19 4 4 8 8 8 8.0 1 1 1 1.0 20 3 2 25 26 26 25.7 3 3 3 3.0 21 3 3 25 27 25 25.7 3 3 3 3.0 22 3 2 8 8 8 8.0 1 1 1 1.0 23 4 3 64 52 34 50.0 9 6 4 6.3 24 3 3 8 8 8 8.0 1 1 1 1.0 25 4 2 16 16 28 20.0 2 2 4 2.7 26 4 2 16 16 16 16.0 2 2 2 2.0 27 4 2 36 36 28 33.3 5 5 4 4.7 28 4 3 33 44 25 34.0 3 4 3 3.3 29 4 3 54 35 50 46.3 7 5 6 6.0 30 3 2 18 20 18 18.7 3 4 3 3.3 Total 701 732 803 745.3 93 101 107 100.3 Solved Problem s 30 30 30 30.0 30 30 30 30.0 Average 23.37 24.40 26.77 24.84 3.10 3.37 3.57 3.34 Table A.5: Performance of Ms over 30 training problems in the blocks-world do main. I l l M ethod Me Problem s Decisions P lan Length Num ber Blocks Goals Trial 1 Trial 2 Trial 3 Average Trial 1 Trial 2 Tried 3 Average 1 4 3 379 240 436 351.7 13 14 11 12.7 2 4 3 16 16 16 16.0 2 2 2 2.0 3 4 4 29 18 34 27.0 4 3 4 3.7 4 3 3 32 40 40 37.3 3 4 4 3.7 5 4 3 8 8 8 8.0 1 1 1 1.0 6 4 2 154 160 48 120.7 7 14 4 8.3 7 3 2 34 34 33 33.7 3 3 3 3.0 8 3 2 8 8 8 8.0 1 1 1 1.0 9 3 3 20 20 20 20.0 4 4 4 4.0 10 4 4 8 8 8 8.0 1 1 1 1.0 11 4 4 32 80 32 48.0 3 5 3 3.7 12 4 2 40 32 48 40.0 3 3 4 3.3 13 3 3 56 65 40 53.7 8 12 4 8.0 14 4 3 3 3 3 3.0 0 0 0 0.0 15 3 3 26 24 24 24.7 2 2 2 2.0 16 3 2 8 8 8 8.0 1 1 1 1.0 17 3 2 21 16 16 17.7 5 2 2 3.0 18 4 2 16 16 16 16.0 2 2 2 2.0 19 4 4 8 8 8 8.0 1 1 1 1.0 20 3 2 32 33 33 32.7 3 3 3 3.0 21 3 3 33 50 33 38.7 3 4 3 3.3 22 3 2 8 8 8 8.0 1 1 1 1.0 23 4 3 32 89 32 51.0 3 6 3 4.0 24 3 3 8 8 8 8.0 1 1 1 1.0 25 4 2 16 16 26 19.3 2 2 4 2.7 26 4 2 16 16 16 16.0 2 2 2 2.0 27 4 2 76 28 26 43.3 8 4 4 5.3 28 4 3 40 40 48 42.7 3 3 4 3.3 29 4 3 82 147 74 101.0 6 11 6 7.7 30 3 2 18 20 18 18.7 3 4 3 3.3 Total 1259 1259 1168 1228.7 99 116 88 101.0 Solved Problem s 30 30 30 30.0 30 30 30 30.0 Average 41.97 41.97 38.93 40.96 3.30 3.87 2.93 3.37 Table A.6: Performance of M6 over 30 training problems in the blocks-world do main. A .2 P erfo rm a n ce over 100 T estin g P r o b le m s The entries in the tables are defined as follows: N: Problem num ber D: Number of decisions B: Number of blocks L: Plan length G: Number of goal conjuncts C: CPU tim e (sec.) SP: Solved problems. 112 Pbm Mi M2 m 3 Mi M 5 M 6 N B G D L C D L C D L C D L C D L C D L C 1 4 3 . - - 49 4 1.74 64 5 2.93 - - - 49 4 1.69 64 5 2.90 2 3 3 9 2 0.14 9 2 0.14 9 2 0.12 18 2 0.37 16 2 0.28 16 2 0.28 3 4 2 . . 
- 26 3 0.78 33 3 1.01 - - - 26 3 0.81 33 3 1.07 4 4 4 9 2 0.25 9 2 0.25 9 2 0.25 16 2 0.42 16 2 0.41 16 2 0.43 5 4 3 10 3 0.29 46 3 2.48 26 4 0.84 24 3 0.66 24 3 0.64 24 3 0.65 6 4 4 9 2 0.25 9 2 0.25 9 2 0.25 18 2 0.56 16 2 0.43 16 2 0.43 7 3 2 - - - 44 4 1.08 34 4 0.78 - - - 30 5 0.64 45 5 1.07 8 4 2 8 1 0.13 8 1 0.13 8 1 0.13 8 1 0.13 8 1 0.14 8 1 0.13 9 4 4 - - - 294 6 31.77 59 5 2.64 - - - 63 8 2.74 91 8 4.76 10 3 3 8 1 0.11 8 1 0 4 0 8 1 0.10 8 1 0.10 8 1 0.11 8 1 0.11 11 4 3 9 2 0.22 9 2 0.22 9 2 0.21 18 2 0.53 26 4 0.81 16 2 0.38 12 4 2 9 2 0.21 9 2 0.20 9 2 0.20 18 2 0.48 37 6 1.29 16 2 0.34 13 3 2 - - - 27 3 0.54 32 3 0.75 - - - 26 3 0.56 39 3 1.06 14 3 2 9 2 0.13 9 2 0.13 9 2 0.13 18 2 0.34 21 5 0.46 16 2 0.26 15 3 2 9 2 0.13 9 2 0.12 9 2 0.12 18 2 0.34 16 2 0.26 16 2 0.26 16 4 4 - - - 26 4 0.91 49 5 2.23 - - - 33 4 1.12 464 17 108.50 17 4 2 - - - 34 4 1.03 48 4 1.53 - - - 25 3 0.67 118 9 5.89 18 3 2 8 1 0.09 8 1 0.10 8 1 0.10 8 1 0.10 8 1 0.10 8 1 0.09 19 3 2 8 1 0.09 8 1 0.10 8 1 0.10 8 1 0.09 8 1 0.10 8 1 0.10 20 4 3 - - - 43 5 1.50 50 5 1.80 - - - 53 7 2.12 80 6 3.47 21 3 3 3 0 0.04 3 0 0.03 3 0 0.03 3 0 0.04 3 0 0.04 3 0 0.04 22 4 3 - - - 25 3 0.72 34 4 1.11 - - - 25 3 0.71 57 5 2.11 23 4 3 10 3 0.29 24 3 0.67 32 3 0.97 10 3 0.28 31 3 0.99 56 5 2.28 24 4 3 - - - 59 4 3.56 57 4 2.46 - - - 54 7 2.05 113 7 7.00 25 4 4 - - - 32 4 1.21 63 4 2.49 - - - 71 6 2.85 73 5 2.99 26 3 3 8 1 0.10 8 1 0.10 8 1 0.10 8 1 0.10 8 1 0.10 8 1 0.11 27 4 3 10 3 0.30 25 3 0.75 40 3 1.58 10 3 0.28 32 3 0.96 56 5 1.67 28 3 3 8 1 0.11 8 1 0.10 8 1 0.11 8 1 0.11 8 1 0.10 8 1 0.11 29 4 3 10 3 0.29 35 4 1.23 48 4 1.63 10 3 0.28 36 4 1.15 48 4 1.70 30 3 2 - - - 17 2 0.28 17 2 0.28 - - - 17 2 0.28 17 2 0.27 31 4 3 9 2 0.22 9 2 0.22 9 2 0.22 16 2 0.38 16 2 0.38 16 2 0.37 32 4 4 - - - 42 4 1.55 66 5 2.93 - - - 34 4 1.24 89 7 3.95 33 3 3 8 1 0.11 8 1 0.11 8 1 0.11 8 1 0.11 8 1 0.11 8 1 0.10 34 3 3 9 2 0.13 9 2 0.14 9 2 0.13 16 2 0.27 16 2 0.28 16 2 0.28 35 4 3 - - - 26 4 0.79 42 5 2.18 - - - 206 25 16.20 179 15 13.19 36 4 2 8 1 0.15 8 1 0.14 8 1 0.14 8 1 0.14 8 1 0.14 8 1 0.14 37 3 3 8 1 0.11 8 1 0.10 8 1 0.11 8 1 0.11 8 1 0.10 8 1 0.10 38 4 2 - - - 64 6 2.38 40 4 1.33 - - - 53 7 1.94 119 10 5.00 39 3 2 8 1 0.10 8 1 0.10 8 1 0.09 8 1 0.10 8 1 0.09 8 1 0.09 40 3 2 3 0 0.03 3 0 0.03 3 0 0.04 3 0 0.04 3 0 0.04 3 0 0.04 41 3 2 - - - 19 2 0.35 17 2 0.28 - - - 17 2 0.29 17 2 0.27 42 3 2 - - - 21 3 0.50 21 3 0.47 - - - 27 4 0.58 27 4 0.58 43 4 3 8 1 0.15 8 1 0.13 8 1 0.14 8 1 0.13 8 1 0.14 8 1 0.14 44 4 3 10 3 0.27 25 3 0.71 58 3 2.05 10 3 0.26 25 3 0.72 24 3 0.63 45 4 4 16 2 0.43 16 2 0.44 16 2 0.44 16 2 0.43 16 2 0.43 16 2 0.43 46 4 4 9 2 0.23 16 2 0.40 25 3 0.76 9 2 0.23 16 2 0.40 16 2 0.40 47 3 2 - - - 17 2 0.29 17 2 0.28 - - - 17 2 0.29 17 2 0.28 48 3 3 8 1 0.10 8 1 0.10 8 1 0.10 8 1 0.10 8 1 0.11 8 1 0.10 49 4 4 11 4 0.41 18 4 0.61 34 5 1.31 42 4 1.88 119 13 6.26 374 10 43.41 50 3 3 10 3 0.17 27 3 0.64 25 3 0.57 19 3 0.39 32 7 0.85 40 4 1.07 Table A .7: Perform ance of single-m ethod planners over 100 testing problem s in the; blocks-world dom ain. 113 P b m M i m 2 m 3 M i M e N B G D L C D L C D L C D L C D L C D L C 5 1 4 3 1 0 3 0 . 2 8 3 9 3 1 . 5 0 6 4 t 2 . 5 5 1 0 3 0 . 2 8 3 4 4 1 . 1 2 4 8 4 1 . 6 5 5 2 3 3 8 1 0 . 1 0 8 1 0 . 1 0 8 1 0 . 1 1 8 1 0 . 1 1 8 1 0 . 1 0 8 1 0 . 1 0 5 3 3 2 9 2 0 . 1 3 9 2 0 . 1 2 9 2 0 . 1 3 1 6 2 0 . 2 6 2 1 5 0 . 4 6 1 6 2 0 . 2 6 5 4 4 3 1 0 3 0 . 2 8 2 4 3 0 . 6 6 2 4 3 0 . 6 7 1 0 3 0 . 2 8 5 0 6 1 . 8 4 6 4 5 2 . 6 8 5 5 4 4 1 1 4 0 . 3 8 4 9 4 1 . 8 3 4 1 5 1 . 
5 2 1 1 4 0 . 3 9 5 3 5 2 . 2 3 8 7 5 4 . 3 1 5 6 3 2 8 1 0 . 1 0 8 1 0 . 1 0 8 1 0 . 1 0 8 1 0 . 1 0 8 1 0 . 0 9 8 1 0 . 0 9 5 7 4 4 9 2 0 . 2 4 1 6 2 0 . 4 0 1 6 2 0 . 4 2 9 2 0 . 2 2 1 6 2 0 . 4 0 3 2 3 1 . 0 1 5 8 3 3 9 2 0 . 1 3 2 6 3 0 . 5 9 4 0 3 1 . 0 5 9 2 0 . 1 3 2 6 3 0 . 5 9 4 0 3 1 . 0 3 5 9 4 4 - - - 5 2 5 2 . 1 0 4 8 5 1 . 8 7 - - - 8 2 1 0 3 . 8 2 6 7 6 3 . 1 1 6 0 3 2 3 0 0 . 0 4 3 0 0 . 0 3 3 0 0 . 0 3 3 0 0 . 0 4 3 0 0 . 0 3 3 0 0 . 0 4 6 1 3 3 9 2 0 . 1 5 2 4 2 0 . 5 2 1 6 2 0 . 2 9 9 2 0 . 1 4 3 2 7 0 . 8 9 1 6 2 0 . 2 8 6 2 3 2 - - - - - - 1 8 3 0 . 3 2 - - - 3 6 9 1 . 0 0 1 3 8 2 2 6 . 1 0 6 3 3 2 9 2 0 . 1 4 9 2 0 . 1 3 9 2 0 . 1 2 1 6 2 0 - 2 6 2 3 6 0 . 5 5 2 3 6 0 . 5 5 6 4 3 3 9 2 0 . 1 3 1 6 2 0 . 2 8 1 6 2 0 . 2 8 9 2 0 . 1 2 2 6 2 0 . 5 5 1 6 2 0 . 2 7 6 5 3 2 9 2 0 . 1 2 1 6 2 0 . 2 6 1 6 2 0 . 2 6 9 2 0 . 1 2 1 6 2 0 . 2 6 1 6 2 0 . 2 5 6 6 4 2 8 1 0 . 1 4 8 1 0 . 1 4 8 I 0 . 1 4 8 I 0 . 1 5 8 1 0 . 1 4 8 1 0 . 1 4 6 7 3 3 9 2 0 . 1 4 1 6 2 0 . 2 8 2 4 2 0 . 5 1 9 2 0 . 1 3 2 6 2 0 . 5 3 2 4 2 0 . 4 9 6 8 4 2 3 0 0 . 0 4 3 0 0 . 0 4 3 0 0 . 0 4 3 0 0 . 0 4 3 0 0 . 0 4 3 0 0 . 0 3 6 9 4 3 - - - 4 6 4 1 . 8 8 3 3 4 1 . 0 6 - - - 8 9 1 2 4 . 1 5 2 0 7 2 6 2 1 . 1 4 7 0 3 2 1 8 3 0 . 3 3 2 0 4 0 . 3 9 7 1 3 2 8 1 0 . 1 0 8 1 0 . 1 0 8 1 0 . 1 0 8 1 0 . 0 9 8 1 0 . 0 9 8 1 0 . 0 9 7 2 3 3 8 1 0 . 1 0 8 1 0 . 1 1 8 1 0 . 1 0 8 1 0 . 1 0 8 1 0 . 1 0 8 1 0 . 1 0 7 3 4 2 8 1 0 . 1 4 8 1 0 . 1 4 8 1 0 . 1 4 8 1 0 1 4 8 1 0 . 1 4 8 1 0 . 1 4 7 4 4 3 9 2 0 . 2 0 1 6 2 0 . 3 7 3 1 3 1 . 1 0 9 2 0 . 2 2 9 8 1 3 4 . 6 7 1 6 2 0 . 3 6 7 5 3 2 - - - 2 8 4 0 . 6 1 6 4 5 1 . 6 7 - - - 2 8 4 0 . 5 9 6 5 5 1 . 8 1 7 6 3 2 9 2 0 . 1 3 9 2 0 . 1 2 9 2 0 . 1 3 1 8 2 0 . 3 4 2 1 5 0 . 4 6 2 3 6 0 . 5 6 7 7 4 3 - 3 6 5 1 . 3 0 1 1 9 1 2 6 . 6 9 7 8 4 2 - - - 1 7 2 0 . 3 8 4 1 3 1 . 4 7 - - - 2 6 3 0 . 7 3 1 7 2 0 . 3 6 7 9 4 2 - - - 2 5 3 0 . 6 6 4 1 4 1 . 3 6 - - - 2 5 3 0 . 6 7 2 5 3 0 . 6 5 8 0 3 2 3 0 0 . 0 4 3 0 0 . 0 4 3 0 0 . 0 3 3 0 0 . 0 3 3 0 0 . 0 3 3 0 0 . 0 4 8 1 3 3 1 0 3 0 . 1 7 1 7 3 0 . 3 3 1 7 3 0 . 3 3 1 7 3 0 . 3 1 3 2 7 0 . 8 8 4 5 7 1 . 3 5 8 2 3 3 9 2 0 . 1 3 2 6 2 0 . 5 9 2 5 3 0 . 5 6 9 2 0 . 1 3 2 9 2 0 . 6 6 1 6 2 0 . 2 8 8 3 3 3 9 2 0 . 1 4 9 2 0 . 1 3 9 2 0 . 1 3 1 6 2 0 . 2 8 2 5 7 0 . 6 8 2 9 9 0 . 9 3 8 4 3 3 2 0 3 0 . 4 0 2 0 3 0 . 4 1 8 5 3 3 1 0 3 0 . 1 7 1 7 3 0 . 3 2 1 7 3 0 . 3 3 2 6 3 0 . 5 5 4 2 9 1 . 2 0 2 4 3 0 . 5 1 8 6 3 2 8 1 0 . 1 0 8 1 0 . 0 9 8 1 0 . 1 0 8 1 0 . 1 0 8 1 0 . 0 9 8 1 0 . 1 0 8 7 4 2 8 1 0 . 1 4 8 1 0 . 1 4 8 1 0 . 1 4 8 1 0 . 1 4 8 1 0 . 1 4 8 1 0 . 1 4 8 8 4 2 - - - 2 6 3 0 . 7 1 5 8 5 2 . 0 3 - - - 2 6 3 0 . 7 1 4 1 3 1 . 1 9 8 9 3 2 8 1 0 . 1 0 8 1 0 . 0 9 8 1 0 . 0 9 8 1 0.11 8 1 0 . 1 0 8 1 0 . 0 9 9 0 4 2 - - - 2 7 4 0 . 8 9 5 7 5 2 . 1 1 - - - 9 6 1 2 4 . 6 8 9 7 7 4 . 8 7 9 1 3 2 1 8 3 0 . 3 2 2 0 4 0 . 3 9 9 2 4 4 1 1 4 0 . 4 1 4 7 4 2 . 7 9 2 6 4 0 . 9 3 5 3 4 2 . 7 5 6 6 8 2 . 8 6 2 9 2 2 2 3 6 . 6 3 9 3 4 2 - - - 2 5 3 0 . 6 8 4 0 3 1 . 3 6 - - - 6 9 7 2 . 5 1 2 5 3 0 . 6 7 9 4 4 2 8 1 0 . 1 4 8 1 0 . 1 4 8 1 0 . 1 4 8 1 0 . 1 4 8 1 0 . 1 5 8 1 0 . 1 3 9 5 4 4 1 0 3 0 . 3 1 1 0 3 0 . 3 1 1 0 3 0 . 3 2 2 4 3 0 . 7 3 4 5 6 1 . 7 0 3 1 6 1 3 3 3 . 5 3 9 6 3 2 - - - 1 9 2 0 . 3 5 1 9 2 0 . 3 5 - - - 1 7 2 0 . 2 7 1 9 2 0 . 3 5 9 7 3 3 9 2 0 . 1 4 2 4 2 0 . 5 1 2 5 3 0 . 5 6 9 2 0 . 1 3 2 9 2 0 . 6 7 2 4 2 0 . 5 0 9 8 3 3 1 0 3 0 . 1 6 4 1 3 1 . 3 2 3 4 3 0 . 7 9 1 0 3 0 . 1 6 2 4 3 0 . 5 1 3 2 3 0 . 7 7 9 9 3 3 1 0 3 0 . 1 7 2 7 3 0 . 6 4 1 7 3 0 . 3 3 1 7 3 0 . 3 2 2 4 3 0 . 5 2 3 3 8 0 . 9 6 1 0 0 4 4 1 0 3 0 . 
3 3 3 3 3 1 . 5 6 3 3 4 1 . 1 8 1 7 3 0 . 4 9 5 0 6 1 . 9 6 5 0 6 2 . 0 1 T o t a l 5 8 7 1 2 4 1 5 . 3 6 2 1 5 4 2 2 3 9 1 . 5 1 2 2 6 6 2 4 4 8 2 . 7 5 8 3 9 1 2 4 2 3 . 9 0 2 9 2 2 3 8 2 1 0 4 . 1 1 4 7 9 3 4 1 4 3 6 1 . 5 6 S P 6 8 6 8 6 8 9 5 9 5 9 5 9 6 9 6 9 6 6 8 6 8 6 8 1 0 0 1 0 0 1 0 0 1 0 0 1 0 0 1 0 0 A v e r a g e 8 . 6 3 1 . 8 2 0 . 2 3 2 2 . 6 7 2 . 3 5 0 . 9 6 2 3 . 6 0 2 . 5 4 0 . 8 6 1 2 . 3 4 1 . 8 2 0 3 5 2 9 . 2 2 3 . 8 2 1 . 0 4 4 7 . 9 3 4 . 1 4 3 . 6 2 Table A .8: Performance of singie-m ethod planners over 100 testing problems in the ! blocks-world domain (continued). I 114 : P bm M i— ► M i — M 5 M j—M 4 - M s M j — Ms M 2 — Ms m 4— M s N B G D L C D L C D L C D L C D L C 1 4 3 54 5 2.16 83 8 3.87 52 5 1.83 54 4 1.96 52 5 1.80 2 3 3 14 2 0.23 14 2 0.23 14 2 0.23 14 2 0.22 21 2 0.47 3 4 2 38 4 1.35 39 3 1.15 36 4 1.06 31 3 1.00 34 3 0.92 4 4 4 14 2 0.36 14 2 0.36 14 2 0.36 14 2 0.36 23 2 0.69 5 4 3 15 3 0.41 15 3 0.39 15 3 0.38 51 3 2.55 31 3 0.97 6 4 4 14 2 0.36 14 2 0.37 14 2 0.36 14 2 0.36 21 2 0.69 7 3 2 37 4 0.91 41 5 0.92 35 4 0.71 49 4 1.23 36 5 0.76 8 4 2 13 1 0.22 13 1 0.22 13 1 0.22 13 1 0.21 13 1 0.22 9 4 4 72 5 3.10 80 8 3.77 96 8 4.17 299 6 32.89 75 8 3.61 10 3 3 13 1 0.20 13 1 0.19 13 1 0.19 13 1 0.19 13 1 0.19 11 4 3 14 2 0.32 14 2 0.33 14 2 0.33 14 2 0.32 23 2 0.63 12 4 2 14 2 0.30 14 2 0.30 14 2 0.28 14 2 0.29 21 2 0.53 13 3 2 49 3 1.48 39 3 0.88 35 3 0.67 32 3 0.72 35 3 0.68 14 3 2 14 2 0.22 14 2 0.21 14 2 0.21 14 2 0.20 21 2 0.42 15 3 2 14 2 0.21 14 2 0.22 14 2 0.21 14 2 0.20 23 2 0.42 16 4 4 128 5 8.14 128 12 6.20 65 8 2.62 31 4 1.21 220 15 13.70 17 4 2 77 3 3.22 47 4 1.36 43 4 1.29 39 4 1.28 41 3 1.14 18 3 2 13 1 0.17 13 1 0.18 13 1 0.17 13 1 0.18 13 1 0.17 19 3 2 13 1 0.18 13 1 0.19 13 1 0.18 13 1 0.17 13 1 0.18 20 4 3 44 4 1.55 65 7 2.44 61 7 2.26 48 5 1.83 41 3 1.22 21 3 3 3 0 0.04 3 0 0.04 3 0 0.04 3 0 0.03 3 0 0.04 22 4 3 62 6 2.47 77 9 3.00 52 6 1.92 30 3 0.99 33 3 0.91 23 4 3 15 3 0.41 15 3 0.40 15 3 0.38 29 3 0.92 15 3 0.39 24 4 3 46 5 1.75 60 7 2.20 62 7 2.24 64 4 3.89 61 7 2.27 25 4 4 74 6 3.37 47 4 1.67 67 6 2.62 37 4 1.51 42 4 1.44 26 3 3 13 1 0.19 13 1 0.19 13 1 0.18 13 1 0.19 13 1 0.19 27 4 3 15 3 0.40 15 3 0.41 15 3 0.39 30 3 1.00 15 3 0.40 28 3 3 13 1 0.19 13 1 0.20 13 1 0.19 13 1 0.19 13 1 0.19 29 4 3 15 3 0.40 15 3 0.41 15 3 0.41 40 4 1.53 15 3 0.39 30 3 2 27 2 0.59 30 2 0.57 25 2 0.46 22 2 0.45 25 2 0.44 31 4 3 14 2 0.34 14 2 0.34 14 2 0.33 14 2 0.34 21 2 0.59 32 4 4 42 4 1.73 61 6 2.28 63 7 2.51 47 4 1.86 54 6 1.92 33 3 3 13 1 0.20 13 1 0.20 13 1 0.20 13 1 0.20 13 1 0.19 34 3 3 14 2 0.23 14 2 0.23 14 2 0.23 14 2 0.23 23 2 0.45 35 4 3 44 4 1.58 57 4 1.95 116 15 5.45 31 4 1.07 155 17 10.06 36 4 2 13 1 0.27 13 1 0.27 13 1 0.26 13 1 0.26 13 1 0.27 37 3 3 13 1 0.20 13 1 0.20 13 1 0.19 13 1 0.19 13 1 0.19 38 4 2 99 8 4.24 67 8 2.35 61 7 2.14 69 6 2.66 74 9 2.82 39 3 2 13 1 0.18 13 1 0.17 13 1 0.17 13 1 0.17 13 1 0.17 40 3 2 3 0 0.03 3 0 0.03 3 0 0.03 3 0 0.03 3 0 0.03 41 3 2 27 2 0.59 30 2 0.57 27 2 0.50 24 2 0.43 25 2 0.43 42 3 2 28 3 0.66 41 4 0.93 35 4 0.75 26 3 0.59 28 4 0.56 43 4 3 13 1 0.23 13 1 0.23 13 1 0.23 13 1 0.23 13 1 0.23 44 4 3 15 3 0.40 15 3 0.39 15 3 0.37 30 3 0.97 15 3 0.37 45 4 4 21 2 0.70 21 2 0.71 21 2 0.70 21 2 0.70 21 2 0.69 46 4 4 14 2 0.34 14 2 0.34 14 2 0.33 21 2 0.66 14 2 0.33 47 3 2 29 3 0.68 32 3 0.64 27 3 0.50 22 2 0.46 25 2 0.43 48 3 3 13 1 0.20 13 1 0.20 13 1 0.19 13 1 0.19 13 1 0.19 49 4 4 16 4 0.54 16 4 0.54 16 4 0.54 23 4 0.91 49 4 2.29 50 3 3 15 3 0.28 15 3 0.27 15 3 0.27 32 3 0.84 22 3 0.51 
Table A .9: Perform ance of coarse-grained m ulti-m ethod planners over 100 testing problem s in the blocks-world dom ain. 115 Pbm M i — m 2- Ms M i-* M i-* Ms M i —M s M 2— M s M i —M s N B G D L C D L C D L C D L C D L C 51 4 3 15 3 0.40 15 3 0.40 15 3 0.39 44 3 1.79 15 3 0.39 52 3 3 13 1 0.19 13 1 0.20 13 1 0.18 13 1 0.19 13 1 0.19 53 3 2 14 2 0.21 14 2 0.22 14 2 0.22 14 2 0.21 23 2 0.43 54 4 3 15 3 0.41 15 3 0.40 15 3 0.39 29 3 0.92 15 3 0.38 55 4 4 16 4 0.50 16 4 0.51 16 4 0.49 54 4 2.16 16 4 0.48 56 3 2 13 1 0.18 13 1 0.18 13 1 0.18 13 1 0.17 13 1 0.17 57 4 4 14 2 0.35 14 2 0.34 14 2 0.34 21 2 0.65 14 2 0.33 58 3 3 14 2 0.23 14 2 0.23 14 2 0.23 31 3 0.80 14 2 0.22 59 4 4 118 6 6.16 49 5 1.78 59 6 2.22 57 5 2.41 61 7 2.39 60 3 2 3 0 0.03 3 0 0.03 3 0 0.04 3 0 0.03 3 0 0.03 61 3 3 14 2 0.23 14 2 0.23 14 2 0.22 29 2 0.72 14 2 0.22 62 3 2 61 8 1.65 50 9 1.28 43 9 1.07 55 8 1.49 41 7 0.98 63 3 2 14 2 0.21 14 2 0.21 14 2 0.21 14 2 0.21 21 2 0.43 64 3 3 14 2 0.22 14 2 0.22 14 2 0.22 21 2 0.46 14 2 0.21 65 3 2 14 2 0.21 14 2 0.22 14 2 0.21 29 2 0.66 14 2 0.20 66 4 2 13 1 0.24 13 1 0.24 13 1 0.24 13 1 0.24 13 1 0.23 67 3 3 14 2 0.22 14 2 0.22 14 2 0.22 21 2 0.45 14 2 0.22 68 4 2 3 0 0.03 3 0 0.03 3 0 0.04 3 0 0.05 3 0 0.04 69 4 3 57 4 3.73 170 12 10.86 211 17 15.33 31 4 1.09 106 14 4.63 70 3 2 32 3 0.68 31 3 0.61 26 3 0.48 27 3 0.53 26 3 0.46 71 3 2 13 1 0.18 13 1 0.18 13 1 0.18 13 1 0.17 13 1 0.17 72 3 3 13 1 0.20 13 1 0.19 13 1 0.19 13 1 0.19 13 1 0.19 73 4 2 13 1 0.24 13 1 0.24 13 1 0.24 13 1 0.23 13 1 0.23 74 4 3 14 2 0.30 14 2 0.31 14 2 0.30 35 2 1.25 14 2 0.30 75 3 2 38 4 0.95 38 3 0.80 35 3 0.68 33 4 0.79 33 3 0.65 76 3 2 14 2 0.22 14 2 0.21 14 2 0.21 14 2 0.22 21 2 0.42 77 4 3 107 4 5.24 51 5 1.82 43 4 1.53 145 10 7.21 77 8 3.13 78 4 2 27 2 0.74 41 4 1.21 36 4 1.04 31 3 0.97 25 2 0.54 79 4 2 45 4 1.53 40 3 1.16 43 4 1.25 50 6 2.03 51 4 1.74 80 3 2 3 0 0.03 3 0 0.04 3 0 0.03 3 0 0.04 3 0 0.03 81 3 3 15 3 0.27 15 3 0.27 15 3 0.27 22 3 0.52 24 3 0.49 82 3 3 14 2 0.23 14 2 0.23 14 2 0.22 21 2 0.46 14 2 0.22 83 3 3 14 2 0.23 14 2 0.24 14 2 0.22 14 2 0.23 23 2 0.46 84 3 3 33 3 0.71 31 3 0.65 26 3 0.50 30 3 0.62 26 3 0.50 85 3 3 15 3 0.28 15 3 0.27 15 3 0.26 36 3 0.95 31 3 0.76 86 3 2 13 1 0.19 13 1 0.18 13 1 0.18 13 1 0.18 13 1 0.18 87 4 2 13 1 0.27 13 1 0.27 13 1 0.27 13 1 0.26 13 1 0.26 88 4 2 38 3 1.17 41 3 1.11 69 6 2.25 50 4 1.74 34 3 0.89 89 3 2 13 1 0.18 13 1 0.17 13 1 0.17 13 1 0.17 13 1 0.17 90 4 2 37 4 1.35 108 13 4.56 66 10 2.56 41 5 1.60 43 4 1.37 91 3 2 34 4 0.74 31 3 0.62 26 3 0.47 27 3 0.53 26 3 0.47 92 4 4 16 4 0.56 16 4 0.54 16 4 0.53 23 4 0.89 44 4 1.85 93 4 2 80 7 3.41 40 3 1.14 34 3 0.90 49 5 1.75 60 5 2.11 94 4 2 13 1 0.23 13 1 0.23 13 1 0.22 13 1 0.22 13 1 0.22 95 4 4 15 3 0.46 15 3 0.45 15 3 0.45 15 3 0.44 29 3 1.03 96 3 2 27 2 0.59 32 2 0.64 27 2 0.49 24 2 0.43 27 2 0.49 97 3 3 14 2 0.23 14 2 0.23 14 2 0.22 21 2 0.47 14 2 0.22 98 3 3 15 3 0.26 15 3 0.26 15 3 0.26 32 3 0.77 15 3 0.26 99 3 3 15 3 0.27 15 3 0.27 15 3 0.26 22 3 0.52 24 3 0.49 100 4 4 15 3 0.46 15 3 0.45 15 3 0.45 63 3 3.75 24 3 0.78 Total 2613 258 86.32 2679 294 83.06 2604 303 82.17 2934 258 115.89 2838 293 92.04 SP 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 Average 26.13 2.58 0.86 26.79 2.94 0.83 26.04 3.03 0.82 29.34 2.58 1.16 28.38 2.93 0.92 Table A .10: Perform ance of coarse-grained m ulti-m ethod planners over 100 testing problem s in th e blocks-world dom ain (continued). 
116 Pbm M i-* M 3 —Me M j—M i— Me M i — Me M 3— Me m 4- M e N B G D L C D L C D L C D L C D L C 1 4 3 93 7 4.75 87 6 3.95 91 7 4.33 69 5 3.30 91 7 4.35 2 3 3 14 2 0.23 14 2 0.23 14 2 0.23 14 2 0.23 23 2 0.45 3 4 2 43 3 1.66 46 3 1.42 41 3 1.25 38 3 1.38 33 3 0.88 4 4 4 14 2 0.36 14 2 0.36 14 2 0.37 14 2 0.36 21 2 0.67 5 4 3 15 3 0.40 15 3 0.39 15 3 0.39 31 4 1.11 33 3 1.00 6 4 4 14 2 0.38 14 2 0.37 14 2 0.35 14 2 0.36 23 2 0.69 7 3 2 47 4 1.24 58 5 1.37 43 5 0.98 39 4 0.98 44 5 1.13 8 4 2 13 1 0.24 13 1 0.23 13 1 0.22 13 1 0.21 13 1 0.22 9 4 4 68 5 3.66 136 9 7.53 90 7 4.15 64 5 3.02 207 12 18.71 10 3 3 13 1 0.20 13 1 0.20 13 1 0.19 13 1 0.18 13 1 0.18 11 4 3 14 2 0.33 14 2 0.33 14 2 0.32 14 2 0.33 21 2 0.60 12 4 2 14 2 0.29 14 2 0.30 14 2 0.29 14 2 0.28 21 2 0.54 13 3 2 50 3 1.32 63 4 1.51 41 3 1.01 37 3 0.94 41 3 1.06 14 3 2 14 2 0.22 14 2 0.22 14 2 0.21 14 2 0.21 21 2 0.42 15 3 2 14 2 0.22 14 2 0.21 14 2 0.22 14 2 0.21 23 2 0.43 16 4 4 111 8 6.86 77 5 3.25 72 5 3.08 54 5 2.62 64 5 2.40 17 4 2 42 3 1.29 130 8 4.76 65 5 2.11 53 4 1.81 193 12 10.74 18 3 2 13 1 0.18 13 1 0.18 13 1 0.17 13 1 0.17 13 1 0.18 19 3 2 13 1 0.20 13 1 0.19 13 1 0.18 13 1 0.18 13 1 0.18 20 4 3 67 5 2.60 106 9 5.07 72 5 2.53 55 5 2.14 40 3 1.18 21 3 3 3 0 0.03 3 0 0.03 3 0 0.04 3 0 0.03 3 0 0.04 22 4 3 51 4 1.92 71 6 2.55 64 4 2,36 39 4 1.40 128 8 6.50 23 4 3 15 3 0.40 15 3 0.40 15 3 0.39 37 3 1.25 15 3 0.38 24 4 3 104 6 5.10 180 12 13.95 166 12 10.00 62 4 2.80 108 8 5.40 25 4 4 76 5 3.62 63 4 2.41 106 7 5.19 68 4 2.89 68 6 2.99 26 3 3 13 1 0.19 13 1 0.19 13 1 0.19 13 1 0.18 13 1 0.18 27 4 3 15 3 0.40 15 3 0.40 15 3 0.39 45 3 1.90 15 3 0.38 28 3 3 13 1 0.20 13 1 0.20 13 1 0.19 13 1 0.19 13 1 0.18 29 4 3 15 3 0.40 15 3 0.40 15 3 0.39 53 4 1.95 15 3 0.39 30 3 2 27 2 0.58 30 2 0.56 25 2 0.43 22 2 0.45 25 2 0.43 31 4 3 14 2 0.32 14 2 0.33 14 2 0.34 14 2 0.33 23 2 0.63 32 4 4 75 5 3.19 101 6 4.28 88 6 4.23 71 5 3.38 48 4 1.63 33 3 3 13 1 0.19 13 1 0.19 13 1 0.19 13 1 0.18 13 1 0.19 34 3 3 14 2 0.23 14 2 0.23 14 2 0.22 14 2 0.23 23 2 0.45 35 4 3 52 5 2.30 170 12 10.35 170 13 11.43 47 5 2.54 67 5 2.32 36 4 2 13 1 0.26 13 1 0.27 13 1 0.26 13 1 0.26 13 1 0.27 37 3 3 13 1 0.19 13 1 0.20 13 1 0.19 13 1 0.19 13 1 0.19 38 4 2 51 4 1.97 84 6 3.40 127 8 6.01 45 4 1.60 118 9 5.99 39 3 2 13 1 0.18 13 1 0.18 13 1 0.17 13 1 0.17 13 1 0.17 40 3 2 3 0 0.04 3 0 0.03 3 0 0.04 3 0 0.03 3 0 0.04 41 3 2 29 2 0.58 30 2 0.56 27 2 0.50 22 2 0.45 25 2 0.41 42 3 2 28 3 0.66 33 4 0.71 27 4 0.52 26 3 0.58 28 4 0.55 43 4 3 13 1 0.24 13 1 0.23 13 1 0.23 13 1 0.24 13 1 0.23 44 4 3 15 3 0.38 15 3 0.37 15 3 0.37 63 3 2.36 15 3 0.37 45 4 4 21 2 0.70 21 2 0.69 21 2 0.72 21 2 0.71 21 2 0.68 46 4 4 14 2 0.34 14 2 0.34 14 2 0.34 30 3 1.08 14 2 0.33 47 3 2 29 3 0.66 32 3 0.63 27 3 0.50 22 2 0.45 27 3 0.50 48 3 3 13 1 0.19 13 1 0.19 13 1 0.19 13 1 0.19 13 1 0.19 49 4 4 16 4 0.54 16 4 0.54 16 4 0.54 39 5 1.68 43 4 1.73 50 3 3 15 3 0.27 15 3 0.28 15 3 0.27 30 3 0.78 24 3 0.49 Table A .ll: Perform ance of coarse-grained m ulti-m ethod planners over 100 testing problem s in the blocks-world dom ain (continued). 
117 Pbm M a M6 M i — M« — Ms Mi - M e M 3— Mg M 4— M e N B G D li C D L C D L C D L C D L C 51 4 3 15 3 0.40 15 3 0.39 15 3 0.39 69 5 2.98 15 3 0.39 52 3 3 13 1 0.19 13 1 0.19 13 1 0.19 13 1 0.18 13 1 0.18 53 3 2 14 2 0.21 14 2 0.21 14 2 0.22 14 2 0.21 21 2 0.43 54 4 3 15 3 0.40 15 3 0.40 15 3 0.40 29 3 0.91 15 3 0.38 55 4 4 16 4 0.49 16 4 0.50 16 4 0.51 46 5 1.91 16 4 0.49 56 3 2 13 1 0.18 13 1 0.18 13 1 0.17 13 1 0.17 13 1 0.17 57 4 4 14 2 0.35 14 2 0.34 14 2 0.35 21 2 0.65 14 2 0.34 58 3 3 14 2 0.23 14 2 0.23 14 2 0.22 45 3 1.28 14 2 0.22 59 4 4 92 6 4.39 143 9 7.84 132 7 10.90 53 5 2.27 91 6 4.07 60 3 2 3 0 0.04 3 0 0.04 3 0 0.03 3 0 0.04 3 0 0.04 61 3 3 14 2 0.23 14 2 0.24 14 2 0.22 21 2 0.46 14 2 0.22 62 3 2 47 5 1.31 96 11 3.24 62 7 1.71 23 3 0.50 59 5 1.58 63 3 2 14 2 0.22 14 2 0.22 14 2 0.21 14 2 0.20 23 2 0.43 64 3 3 14 2 0.22 14 2 0.22 14 2 0.22 21 2 0.45 14 2 0.22 65 3 2 14 2 0.22 14 2 0.20 14 2 0.20 21 2 0.43 14 2 0.21 66 4 2 13 1 0.24 13 1 0.24 13 1 0.24 13 1 0.24 13 1 0.24 67 3 3 14 2 0.22 14 2 0.22 14 2 0.22 29 2 0.69 14 2 0.21 68 4 2 3 0 0.04 3 0 0.04 3 0 0.03 3 0 0.03 3 0 0.03 69 4 3 67 5 2.92 525 13 90.83 201 17 13.29 38 4 1.36 141 10 7.47 70 3 2 34 4 0.73 33 4 0.68 28 4 0.54 29 4 0.59 28 4 0.53 71 3 2 13 1 0.18 13 1 0.17 13 1 0.18 13 1 0.18 13 1 0.17 72 3 3 13 1 0.20 13 1 0.19 13 1 0.19 13 1 0.19 13 1 0.19 73 4 2 13 1 0.24 13 1 0.24 13 1 0.24 13 1 0.23 13 1 0.25 74 4 3 14 2 0.31 14 2 0.31 14 2 0.30 21 2 0.58 14 2 0.30 75 3 2 42 3 1.04 46 3 1.17 41 3 0.96 46 3 1.26 50 3 1.08 76 3 2 14 2 0.22 14 2 0.22 14 2 0.21 14 2 0.20 21 2 0.42 77 4 3 239 4 13.05 41 4 1.31 35 4 1.07 254 5 13.11 140 11 7.60 78 4 2 27 2 0.72 30 2 0.70 25 2 0.54 22 2 0.57 25 2 0.54 79 4 2 36 3 1.14 40 3 1.13 34 3 0.92 30 3 0.90 49 3 1.51 80 3 2 3 0 0.04 3 0 0.03 3 0 0.03 3 0 0.04 3 0 0.04 81 3 3 15 3 0.27 15 3 0.26 15 3 0.26 22 3 0.52 24 3 0.49 82 3 3 14 2 0.23 14 2 0.22 14 2 0.22 29 2 0.71 14 2 0.22 83 3 3 14 2 0.24 14 2 0.23 14 2 0.23 14 2 0.23 21 2 0.48 84 3 3 35 3 0.77 33 3 0.72 28 3 0.57 30 3 0.62 28 3 0.57 85 3 3 15 3 0.27 15 3 0.27 15 3 0.26 30 3 0.79 31 3 0.76 86 3 2 13 1 0.19 13 1 0.18 13 1 0.18 13 1 0.19 13 1 0.18 87 4 2 13 1 0.27 13 1 0.26 13 1 0.26 13 1 0.26 13 1 0.26 88 4 2 78 6 3.20 46 3 1.28 57 3 1.67 54 4 2.05 41 3 1.12 89 3 2 13 1 0.18 13 1 0.18 13 1 0.18 13 1 0.18 13 1 0.18 90 4 2 135 8 8.30 172 8 8.16 172 9 10.71 108 7 5.45 125 9 6.50 91 3 2 34 4 0.73 33 4 0.68 28 4 0.53 27 3 0.52 26 3 0.46 92 4 4 16 4 0.54 16 4 0.54 16 4 0.54 63 6 2.96 39 4 1.51 93 4 2 51 3 1.91 63 3 2.09 65 4 2.27 53 4 2.07 118 6 8.52 94 4 2 13 1 0.23 13 1 0.23 13 1 0.22 13 1 0.23 13 1 0.22 95 4 4 15 3 0.44 15 3 0.44 15 3 0.44 15 3 0.44 38 3 1.55 96 3 2 29 2 0.57 30 2 0.55 25 2 0.42 24 2 0.45 25 2 0.42 97 3 3 14 2 0.22 14 2 0.23 14 2 0.23 30 3 0.78 14 2 0.23 98 3 3 15 3 0.26 15 3 0.26 15 3 0.25 39 3 1.01 15 3 0.25 99 3 3 15 3 0.27 15 3 0.27 15 3 0.27 22 3 0.52 22 3 0.50 100 4 4 15 3 0.45 15 3 0.45 15 3 0.44 38 4 1.54 24 3 0.78 Total 2891 259 102.81 3730 302 206.58 3177 297 124.46 3067 257 104.82 3447 295 135.39 SP 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 Average 28.91 2.59 1.03 37.30 3.02 2.07 31.77 2.97 1.24 30.67 2.57 1.05 34.47 2.95 1.35 Table A .12: Perform ance of coarse-grained m ulti-m ethod planners over 100 testing problem s in th e blocks-world dom ain (continued). 118 Table A. 13: Performance o f fine-grained multi-method planners over 1 0 0 testing problems i n the blocks-world. 
[The body of Table A.13 (problems 1-50), printed rotated in the original, is illegible in the source scan; none of its values can be recovered. Table A.14 continues with problems 51-100.]

N B G | M1→3→5: D L C | M1→4→5: D L C | M1→5: D L C | M2→5: D L C | M4→5: D L C
51 4 3 10 3 0.30 10 3 0.31 10 3 0.30 36 4 1.31 10 3 0.29
52 3 3 8 1 0.10 8 1 0.10 8 1 0.11 8 1 0.10 8 1 0.10
53 3 2 9 2 0.13 9 2 0.13 9 2 0.13 9 2 0.13 23 6 0.62
54 4 3 10 3 0.30 10 3 0.30 10 3 0.30 53 6 2.41 10 3 0.29
55 4 4 11 4 0.40 11 4 0.40 11 4 0.39 49 4 2.09 11 4 0.38
56 3 2 8 1 0.09 8 1 0.09 8 1 0.09 8 1 0.10 8 1 0.10
57 4 4 9 2 0.24 9 2 0.24 9 2 0.24 30 6 1.50 9 2 0.23
58 3 3 9 2 0.14 9 2 0.14 9 2 0.13 26 3 0.66 9 2 0.13
59 4 4 21 5 0.92 18 7 1.07 21 5 0.93 40 7 1.97 22 6 1.05
60 3 2 3 0 0.04 3 0 0.03 3 0 0.04 3 0 0.04 3 0 0.04
61 3 3 9 2 0.14 9 2 0.13 9 2 0.14 16 2 0.31 9 2 0.14
62 3 2 25 7 0.76 23 10 0.79 27 12 1.05 25 7 0.67 23 10 0.73
63 3 2 9 2 0.13 9 2 0.13 9 2 0.13 9 2 0.12 16 2 0.29
64 3 3 9 2 0.13 9 2 0.13 9 2 0.13 26 2 0.59 9 2 0.13
65 3 2 9 2 0.12 9 2 0.12 9 2 0.12 24 2 0.54 9 2 0.12
66 4 2 8 1 0.15 8 1 0.14 8 1 0.15 8 1 0.14 8 1 0.14
67 3 3 9 2 0.13 9 2 0.13 9 2 0.13 16 2 0.31 9 2 0.13
68 4 2 3 0 0.04 3 0 0.04 3 0 0.04 3 0 0.04 3 0 0.04
69 4 3 19 4 0.76 19 4 0.76 19 4 0.76 32 4 1.23 33 4 1.76
70 3 2 13 4 0.29 13 4 0.29 13 4 0.29 11 3 0.20 18 3 0.38
71 3 2 8 1 0.10 8 1 0.10 8 1 0.10 8 1 0.09 8 1 0.10
72 3 3 8 1 0.10 8 1 0.10 8 1 0.10 8 1 0.10 8 1 0.11
73 4 2 8 1 0.14 8 1 0.15 8 1 0.14 8 1 0.14 8 1 0.14
74 4 3 9 2 0.20 9 2 0.21 9 2 0.21 16 2 0.40 9 2 0.20
75 3 2 11 3 0.19 11 3 0.20 11 3 0.20 27 3 0.58 14 4 0.29
76 3 2 9 2 0.13 9 2 0.13 9 2 0.14 9 2 0.12 16 2 0.29
77 4 3 12 4 0.51 12 4 0.51 21 5 0.90 60 5 2.91 28 5 1.18
78 4 2 17 2 0.50 17 2 0.49 17 2 0.49 17 2 0.42 17 2 0.48
79 4 2 27 4 0.97 27 4 0.99 18 3 0.56 42 4 1.57 18 3 0.57
80 3 2 3 0 0.03 3 0 0.03 3 0 0.04 3 0 0.04 3 0 0.04
81 3 3 10 3 0.18 10 3 0.17 10 3 0.17 17 3 0.36 17 3 0.35
82 3 3 9 2 0.14 9 2 0.14 9 2 0.13 16 2 0.31 9 2 0.14
83 3 3 9 2 0.14 9 2 0.14 9 2 0.13 9 2 0.14 16 2 0.32
84 3 3 15 5 0.34 11 3 0.22 17 6 0.43 17 6 0.38 11 3 0.21
85 3 3 10 3 0.17 10 3 0.17 10 3 0.17 36 9 1.22 33 6 0.95
86 3 2 8 1 0.09 8 1 0.10 8 1 0.10 8 1 0.09 8 1 0.09
87 4 2 8 1 0.14 8 1 0.15 8 1 0.15 8 1 0.14 8 1 0.15
88 4 2 45 16 3.32 14 4 0.57 35 11 1.97 39 3 1.40 12 3 0.41
89 3 2 8 1 0.09 8 1 0.10 8 1 0.10 8 1 0.10 8 1 0.10
90 4 2 46 5 2.50 133 61 63.29 36 5 1.61 33 4 1.36 146 67 70.66
91 3 2 13 4 0.29 13 4 0.29 13 4 0.29 11 3 0.20 20 4 0.47
92 4 4 11 4 0.43 11 4 0.42 11 4 0.44 18 4 0.68 47 11 2.79
93 4 2 27 4 1.00 18 3 0.57 18 3 0.56 44 5 1.68 18 3 0.55
94 4 2 8 1 0.15 8 1 0.14 8 1 0.14 8 1 0.14 8 1 0.14
95 4 4 10 3 0.32 10 3 0.33 10 3 0.33 10 3 0.32 38 11 2.29
96 3 2 10 2 0.16 10 2 0.16 10 2 0.16 19 2 0.37 10 2 0.16
97 3 3 9 2 0.13 9 2 0.13 9 2 0.14 29 2 0.77 9 2 0.13
98 3 3 10 3 0.16 10 3 0.16 10 3 0.17 37 3 1.04 10 3 0.16 99 3 3 10 3 0.17 10 3 0.18 10 3 0.17 17 3 0.35 17 3 0.34 100 4 4 10 3 0.36 10 3 0.33 10 3 0.34 28 5 1.31 36 6 1.72 Total 1301 290 44.12 1325 334 101.14 1251 281 39.51 2380 316 103.50 1738 377 124.82 SP 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 Average 13.01 2.90 0.44 13.25 3.34 1.01 12.51 2.81 0.40 23.80 3.16 1.03 17.38 3.77 1.25 Table A. 14: Performance of fine-grained multi-method planners over 100 testing problems in the blocks-world (continued). 120 Problem —6 m ^ 4 —6 M i — 6 M 3 — 6 m 4_*6 N B G D L C D L c D L C D L c D L C 1 4 3 47 7 2.42 47 7 2.43 47 7 2.42 56 5 2.84 47 7 2.42 2 3 3 9 2 0.14 9 2 0.13 9 2 0.13 9 2 0.13 16 2 0.31 3 4 2 25 3 0.87 25 3 0.87 25 3 0.86 33 3 1.11 25 3 0.86 4 4 4 9 2 0.24 9 2 0.24 9 2 0.24 9 2 0.25 16 2 0.45 5 4 3 10 3 0.29 10 3 0.29 10 3 0.28 33 4 1.20 75 13 4.63 6 4 4 9 2 0.25 9 2 0.27 9 2 0.25 9 2 0.25 16 2 0.47 7 3 2 20 4 0.45 20 4 0.45 20 4 0.45 34 4 0.89 27 4 0.68 8 4 2 8 1 0.13 8 1 0.13 8 1 0.13 8 1 0.14 8 1 0.13 9 4 4 37 6 1.96 37 6 1.97 37 6 1.98 59 5 2.99 20 5 0.86 10 3 3 8 1 0.10 8 1 0.11 8 1 0.11 8 1 0.11 8 1 0.12 11 4 3 9 2 0.22 9 2 0.22 9 2 0.22 9 2 0.22 42 5 1.78 12 4 2 9 2 0.19 9 2 0.20 9 2 0.19 9 2 0.20 16 2 0.36 13 3 2 11 3 0.21 11 3 0.19 11 3 0.20 32 3 0.85 18 3 0.40 14 3 2 9 2 0.13 9 2 0.13 9 2 0.12 9 2 0.13 21 5 0.52 15 3 2 9 2 0.13 9 2 0.13 9 2 0.13 9 2 0.13 16 2 0.29 16 4 4 37 6 1.89 61 17 4.46 34 9 2.08 49 5 2.54 26 4 1.09 17 4 2 20 4 0.76 34 8 1.64 20 4 0.77 48 4 1.75 18 3 0.59 18 3 2 8 1 0.10 8 1 0.10 8 1 0.10 8 1 0.10 8 1 0.10 19 3 2 8 1 0.09 8 1 0.10 8 1 0.10 8 1 0.10 8 1 0.10 20 4 3 20 4 0.80 40 8 2.15 28 5 1.33 50 5 2.06 27 4 1.17 21 3 3 3 0 0.04 3 0 0.04 3 0 0.04 3 0 0.04 3 0 0.04 22 4 3 28 5 1.31 29 6 1.37 31 7 1.56 34 4 1.25 62 9 3.23 23 4 3 10 3 0.29 10 3 0.29 10 3 0.29 32 3 1.09 10 3 0.28 24 4 3 52 10 3.03 49 8 2.67 38 7 1.82 57 4 2.79 61 12 3.45 25 4 4 28 5 1.42 12 4 0.54 28 5 1.40 63 4 2.88 28 5 1.38 26 3 3 8 1 0.10 8 1 0.10 8 1 0.11 8 1 0.10 8 1 0.11 27 4 3 10 3 0.30 10 3 0.30 10 3 0.28 40 3 1.78 10 3 0.28 28 3 3 8 1 0.10 8 1 0.11 8 1 0.11 8 1 0.10 8 1 0.11 29 4 3 10 3 0.29 10 3 0.30 10 3 0.30 48 4 1.89 10 3 0.29 30 3 2 17 2 0.33 17 2 0.33 17 2 0.33 17 2 0.31 17 2 0.33 31 4 3 9 2 0.22 9 2 0.22 9 2 0.22 9 2 0.23 52 7 2.33 32 4 4 28 5 1.29 20 4 0.85 12 4 0.53 66 5 3.37 29 6 1.51 33 3 3 8 1 0.10 8 1 0.11 8 1 0.11 8 1 0.10 8 1 0.10 34 3 3 9 2 0.14 9 2 0.14 9 2 0.13 9 2 0.13 16 2 0.32 35 4 3 19 4 0.76 21 10 1.51 21 9 1.88 21 5 0.93 51 11 2.71 36 4 2 8 1 0.14 8 1 0.14 8 1 0.14 8 1 0.14 8 1 0.14 37 3 3 8 1 0.10 8 1 0.10 8 1 0.10 8 1 0.11 8 1 0.10 38 4 2 27 4 1.09 32 7 1.49 27 4 1.14 57 5 2.47 82 12 4.87 39 3 2 8 1 0.10 8 1 0.09 8 1 0.09 8 1 0.10 8 1 0.09 40 3 2 3 0 0.03 3 0 0.03 3 0 0.03 3 0 0.03 3 0 0.04 41 3 2 10 2 0.16 10 2 0.15 10 2 0.16 17 2 0.30 10 2 0.15 42 3 2 20 4 0.47 12 4 0.26 18 3 0.37 18 3 0.36 12 4 0.24 43 4 3 8 1 0.14 8 1 0.14 8 1 0.14 8 1 0.14 8 1 0.14 44 4 3 10 3 0.27 10 3 0.28 10 3 0.28 58 3 2.34 10 3 0.27 45 4 4 16 2 0.48 16 2 0.47 16 2 0.48 16 2 0.48 16 2 0.47 46 4 4 9 2 0.23 9 2 0.23 9 2 0.23 16 2 0.43 9 2 0.23 47 3 2 17 2 0.34 19 3 0.42 19 3 0.40 19 3 0.37 19 3 0.41 48 3 3 8 1 0.10 8 1 0.10 8 1 0.11 8 1 0.10 8 1 0.11 49 4 4 11 4 0.41 11 4 0.42 11 4 0.42 18 4 0.66 38 4 1.69 50 3 3 10 3 0.17 10 3 0.17 10 3 0.17 17 3 0.35 17 3 0.34 Table A. 15: Perform ance of fine-grained m ulti-m ethod planners over 100 problem s in the blocks-world (continued). 
Problem Mi — 3— 6 Mi — 4— 6 Ml __6 M3_6 M*_ 6 N B G D L c D L c D L C D L C D L C 51 4 3 10 3 0.30 10 3 0.29 10 3 0.29 56 5 2.38 10 3 0.28 52 3 3 8 1 0.10 8 1 0.11 8 1 0.11 8 1 0.11 8 1 0.11 53 3 2 9 2 0.13 9 2 0.13 9 2 0.12 9 2 0.13 25 7 0.73 54 4 3 10 3 0.30 10 3 0.30 10 3 0.29 40 4 1.47 10 3 0.28 55 4 4 11 4 0.38 11 4 0.40 11 4 0.38 48 5 2.28 11 4 0.37 56 3 2 8 1 0.09 8 1 0.09 8 1 0.10 8 1 0.09 8 1 0.10 57 4 4 9 2 0.24 9 2 0.24 9 2 0.23 24 2 0.78 9 2 0.23 58 3 3 9 2 0.13 9 2 0.13 9 2 0.13 32 3 0.86 9 2 0.13 59 4 4 30 6 1.74 15 6 0.83 15 6 0.82 64 6 3.20 30 6 1.45 60 3 2 3 0 0.03 3 0 0.03 3 0 0.04 3 0 0.04 3 0 0.04 61 3 3 9 2 0.14 9 2 0.13 9 2 0.14 24 2 0.58 9 2 0.13 62 3 2 18 3 0.38 37 11 1.59 18 3 0.38 34 4 0.88 57 13 3.22 63 3 2 9 2 0.12 9 2 0.13 9 2 0.13 9 2 0.12 16 2 0.29 64 3 3 9 2 0.13 9 2 0.13 9 2 0.13 16 2 0.31 9 2 0.13 65 3 2 9 2 0.13 9 2 0.13 9 2 0.12 16 2 0.28 9 2 0.12 66 4 2 8 1 0.15 8 1 0.14 8 1 0.14 8 1 0.14 8 1 0.14 67 3 3 9 2 0.13 9 2 0.12 9 2 0.13 24 2 0.57 9 2 0.13 68 4 2 3 0 0.04 3 0 0.04 3 0 0.04 3 0 0.04 3 0 0.04 69 4 3 48 8 2.65 53 12 3.37 27 4 1.11 33 4 1.18 102 19 7.52 70 3 2 13 4 0.28 11 3 0.21 13 4 0.28 13 4 0.27 20 4 0.46 71 3 2 8 1 0.10 8 1 0.10 8 1 0.09 8 1 0.09 8 1 0.10 72 3 3 8 1 0.10 8 1 0.11 8 1 0.11 8 1 0.11 8 1 0.10 73 4 2 8 1 0.15 8 1 0.14 8 1 0.14 8 1 0.14 8 1 0.14 74 4 3 9 2 0.20 9 2 0.20 9 2 0.20 44 6 2.07 9 2 0.20 75 3 2 11 3 0.19 11 3 0.19 11 3 0.19 32 3 0.79 11 3 0.19 76 3 2 9 2 0.13 9 2 0.13 9 2 0.13 9 2 0.13 16 2 0.29 77 4 3 21 5 0.83 12 4 0.49 12 4 0.49 104 6 5.00 45 7 2.15 78 4 2 17 2 0.48 17 2 0.48 17 2 0.46 17 2 0.40 17 2 0.47 79 4 2 18 3 0.56 18 3 0.55 27 4 0.95 56 4 2.45 27 4 0.95 80 3 2 3 0 0.04 3 0 0.03 3 0 0.04 3 0 0.04 3 0 0.04 81 3 3 10 3 0.17 10 3 0.17 10 3 0.17 17 3 0.36 17 3 0.35 82 3 3 9 2 0.13 9 2 0.14 9 2 0.14 25 3 0.62 9 2 0.13 83 3 3 9 2 0.13 9 2 0.14 9 2 0.14 9 2 0.14 16 2 0.31 84 3 3 22 4 0.56 11 3 0.22 11 3 0.22 32 11 1.23 11 3 0.21 85 3 3 10 3 0.17 10 3 0.17 10 3 0.17 41 4 1.28 31 7 0.97 86 3 2 8 1 0.10 8 1 0.10 8 1 0.09 8 1 0.10 8 1 0.10 87 4 2 8 1 0.14 8 1 0.14 8 1 0.15 8 1 0.14 8 1 0.15 88 4 2 36 5 1.51 19 3 0.61 36 5 1.50 49 3 1.75 19 3 0.61 89 3 2 8 1 0.09 8 1 0.09 8 1 0.10 8 1 0.10 8 1 0.09 90 4 2 33 4 1.33 37 9 5.10 56 14 13.64 73 7 3.39 51 10 9.20 91 3 2 11 3 0.21 11 3 0.21 13 4 0.28 11 3 0.19 18 3 0.37 92 4 4 11 4 0.42 11 4 0.43 11 4 0.42 26 4 1.05 50 11 2.84 93 4 2 18 3 0.55 18 3 0.54 27 4 0.96 25 3 0.74 27 4 0.95 94 4 2 8 1 0.14 8 1 0.14 8 1 0.15 8 1 0.14 8 1 0.14 95 4 4 10 3 0.33 10 3 0.32 10 3 0.32 10 3 0.31 30 3 1.30 96 3 2 10 2 0.15 10 2 0.16 10 2 0.16 19 2 0.37 10 2 0.15 97 3 3 9 2 0.14 9 2 0.14 9 2 0.14 25 3 0.63 9 2 0.14 98 3 3 10 3 0.17 10 3 0.17 10 3 0.16 34 3 0.87 10 3 0.16 99 3 3 10 3 0.17 10 3 0.17 10 3 0.17 17 3 0.35 22 6 0.58 100 4 4 10 3 0.33 10 3 0.34 10 3 0.33 25 3 0.91 17 3 0.54 Total 1356 259 42.63 1363 297 50.04 1323 273 52.78 2422 271 84.96 1983 346 82.91 SP 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 Average 13.56 2.59 0.43 13.63 2.97 0.50 13.23 2.73 0.53 24.22 2.71 0.85 19.83 3.46 0.83 Table A. 16: Perform ance of fine-grained m ulti-m ethod planners over 100 testing problem s in th e blocks-world (continued). 122 A p p e n d ix B E x p e r im e n ta l R esu lts: T h e M a ch in e-S h o p S ch ed u lin g D o m a in This appendix gives the detailed numeric information from the experiments in the machine-shop scheduling domain. Appendix B .l presents the experimental results for the six single-method planners over 30 training problems. 
Appendix B.2 presents the experimental results for the six single-method planners and the created multi-method planners over 100 test problems.

B.1 Performance over 30 Training Problems

Method M1
Problem | Decisions | Plan Length
Number Goals | Trial 1 Trial 2 Trial 3 Average | Trial 1 Trial 2 Trial 3 Average
1 5 | 17 23 25 21.7 | 3 3 3 3.0
2 5 | 24 24 26 24.7 | 3 3 3 3.0
3 5 | 23 23 23 23.0 | 3 3 3 3.0
4 5 | 17 23 17 19.0 | 3 3 3 3.0
5 5 | 26 24 26 25.3 | 3 3 3 3.0
6 5 | 22 22 22 22.0 | 2 2 2 2.0
7 5 | - - - - | - - - -
8 5 | 39 32 39 36.7 | 4 4 4 4.0
9 5 | - - - - | - - - -
10 5 | 16 16 16 16.0 | 2 2 2 2.0
11 5 | - - - - | - - - -
12 5 | 23 23 17 21.0 | 3 3 3 3.0
13 5 | 16 16 16 16.0 | 2 2 2 2.0
14 5 | 25 25 17 22.3 | 3 3 3 3.0
15 5 | 24 24 24 24.0 | 3 3 3 3.0
16 5 | 25 25 17 22.3 | 3 3 3 3.0
17 5 | - - - - | - - - -
18 5 | - - - - | - - - -
19 5 | - - - - | - - - -
20 5 | 24 24 40 29.3 | 3 3 3 3.0
21 5 | 8 8 8 8.0 | 1 1 1 1.0
22 5 | - - - - | - - - -
23 5 | 25 25 23 24.3 | 3 3 3 3.0
24 5 | 40 40 24 34.7 | 3 3 3 3.0
25 5 | 32 32 34 32.7 | 4 4 4 4.0
26 5 | 16 16 16 16.0 | 2 2 2 2.0
27 5 | 8 8 8 8.0 | 1 1 1 1.0
28 5 | - - - - | - - - -
29 5 | 16 16 16 16.0 | 2 2 2 2.0
30 5 | 24 24 24 24.0 | 3 3 3 3.0
Total | 490 493 478 487.0 | 59 59 59 59.0
Solved Problems | 22 22 22 22.0 | 22 22 22 22.0
Average | 22.27 22.41 21.73 22.14 | 2.68 2.68 2.68 2.68

Table B.1: Performance of M1 over 30 training problems in the machine-shop scheduling domain.

Method M2
Problem | Decisions | Plan Length
Number Goals | Trial 1 Trial 2 Trial 3 Average | Trial 1 Trial 2 Trial 3 Average
1 5 | 17 23 25 21.7 | 3 3 3 3.0
2 5 | 24 24 26 24.7 | 3 3 3 3.0
3 5 | 23 23 23 23.0 | 3 3 3 3.0
4 5 | 17 23 17 19.0 | 3 3 3 3.0
5 5 | 26 24 26 25.3 | 3 3 3 3.0
6 5 | 22 22 22 22.0 | 2 2 2 2.0
7 5 | - - - - | - - - -
8 5 | 39 32 39 36.7 | 4 4 4 4.0
9 5 | - - - - | - - - -
10 5 | 16 16 16 16.0 | 2 2 2 2.0
11 5 | - - - - | - - - -
12 5 | 23 23 17 21.0 | 3 3 3 3.0
13 5 | 16 16 16 16.0 | 2 2 2 2.0
14 5 | 25 25 17 22.3 | 3 3 3 3.0
15 5 | 24 24 24 24.0 | 3 3 3 3.0
16 5 | 25 25 17 22.3 | 3 3 3 3.0
17 5 | - - - - | - - - -
18 5 | - - - - | - - - -
19 5 | - - - - | - - - -
20 5 | 24 24 40 29.3 | 3 3 3 3.0
21 5 | 8 8 8 8.0 | 1 1 1 1.0
22 5 | - - - - | - - - -
23 5 | 25 25 23 24.3 | 3 3 3 3.0
24 5 | 40 40 24 34.7 | 3 3 3 3.0
25 5 | 32 32 34 32.7 | 4 4 4 4.0
26 5 | 16 16 16 16.0 | 2 2 2 2.0
27 5 | 8 8 8 8.0 | 1 1 1 1.0
28 5 | - - - - | - - - -
29 5 | 16 16 16 16.0 | 2 2 2 2.0
30 5 | 24 24 24 24.0 | 3 3 3 3.0
Total | 490 493 478 487.0 | 59 59 59 59.0
Solved Problems | 22 22 22 22.0 | 22 22 22 22.0
Average | 22.27 22.41 21.73 22.14 | 2.68 2.68 2.68 2.68

Table B.2: Performance of M2 over 30 training problems in the machine-shop scheduling domain.
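The summary rows in these training tables follow directly from the per-problem entries, with averages taken over the solved problems only; the following worked example is derived from Table B.1 rather than stated in the text. For problem 1, the average number of decisions over the three trials is

$$\frac{17 + 23 + 25}{3} \approx 21.7,$$

and the overall average in the bottom row is the total divided by the number of solved problems,

$$\frac{487.0}{22} \approx 22.14.$$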
Method M3
Problem | Decisions | Plan Length
Number Goals | Trial 1 Trial 2 Trial 3 Average | Trial 1 Trial 2 Trial 3 Average
1 5 | 17 23 25 21.7 | 3 3 3 3.0
2 5 | 24 24 26 24.7 | 3 3 3 3.0
3 5 | 23 23 23 23.0 | 3 3 3 3.0
4 5 | 17 23 17 19.0 | 3 3 3 3.0
5 5 | 26 24 26 25.3 | 3 3 3 3.0
6 5 | 22 22 22 22.0 | 2 2 2 2.0
7 5 | - - - - | - - - -
8 5 | 39 32 39 36.7 | 4 4 4 4.0
9 5 | - - - - | - - - -
10 5 | 16 16 16 16.0 | 2 2 2 2.0
11 5 | - - - - | - - - -
12 5 | 23 23 17 21.0 | 3 3 3 3.0
13 5 | 16 16 16 16.0 | 2 2 2 2.0
14 5 | 25 25 17 22.3 | 3 3 3 3.0
15 5 | 24 24 24 24.0 | 3 3 3 3.0
16 5 | 25 25 17 22.3 | 3 3 3 3.0
17 5 | - - - - | - - - -
18 5 | - - - - | - - - -
19 5 | - - - - | - - - -
20 5 | 24 24 40 29.3 | 3 3 3 3.0
21 5 | 8 8 8 8.0 | 1 1 1 1.0
22 5 | - - - - | - - - -
23 5 | 25 25 23 24.3 | 3 3 3 3.0
24 5 | 40 40 24 34.7 | 3 3 3 3.0
25 5 | 32 32 34 32.7 | 4 4 4 4.0
26 5 | 16 16 16 16.0 | 2 2 2 2.0
27 5 | 8 8 8 8.0 | 1 1 1 1.0
28 5 | - - - - | - - - -
29 5 | 16 16 16 16.0 | 2 2 2 2.0
30 5 | 24 24 24 24.0 | 3 3 3 3.0
Total | 490 493 478 487.0 | 59 59 59 59.0
Solved Problems | 22 22 22 22.0 | 22 22 22 22.0
Average | 22.27 22.41 21.73 22.14 | 2.68 2.68 2.68 2.68

Table B.3: Performance of M3 over 30 training problems in the machine-shop scheduling domain.

Method M4
Problem | Decisions | Plan Length
Number Goals | Trial 1 Trial 2 Trial 3 Average | Trial 1 Trial 2 Trial 3 Average
1 5 | 40 32 57 43.0 | 5 4 8 5.7
2 5 | 39 32 32 34.3 | 4 4 4 4.0
3 5 | 39 48 48 45.0 | 4 6 6 5.3
4 5 | 39 40 39 39.3 | 4 5 4 4.3
5 5 | 39 32 32 34.3 | 4 4 4 4.0
6 5 | 23 32 32 29.0 | 2 4 4 3.3
7 5 | 72 55 56 61.0 | 9 6 7 7.3
8 5 | 48 41 57 48.7 | 6 6 8 6.7
9 5 | 48 40 57 48.3 | 6 5 8 6.3
10 5 | 16 23 23 20.7 | 2 2 2 2.0
11 5 | 16 25 25 22.0 | 2 4 4 3.3
12 5 | 39 39 40 39.3 | 4 4 5 4.3
13 5 | 24 32 24 26.7 | 3 4 3 3.3
14 5 | 39 63 56 52.7 | 4 7 7 6.0
15 5 | 32 24 32 29.3 | 4 3 4 3.7
16 5 | 48 40 65 51.0 | 6 5 9 6.7
17 5 | 24 24 40 29.3 | 3 3 5 3.7
18 5 | 42 40 41 41.0 | 7 5 6 6.0
19 5 | 40 40 40 40.0 | 5 5 5 5.0
20 5 | 41 57 41 46.3 | 6 8 6 6.7
21 5 | 24 24 24 24.0 | 3 3 3 3.0
22 5 | 40 55 39 44.7 | 5 6 4 5.0
23 5 | 40 40 40 40.0 | 5 5 5 5.0
24 5 | 73 41 41 51.7 | 10 6 6 7.3
25 5 | 41 32 41 38.0 | 6 4 6 5.3
26 5 | 23 24 16 21.0 | 2 3 2 2.3
27 5 | 16 15 15 15.3 | 2 1 1 1.3
28 5 | 48 56 73 59.0 | 6 7 10 7.7
29 5 | 16 23 16 18.3 | 2 2 2 2.0
30 5 | 24 24 40 29.3 | 3 3 5 3.7
Total | 1093 1093 1182 1122.7 | 134 134 153 140.3
Solved Problems | 30 30 30 30.0 | 30 30 30 30.0
Average | 36.43 36.43 39.40 37.42 | 4.47 4.47 5.10 4.68

Table B.4: Performance of M4 over 30 training problems in the machine-shop scheduling domain.
Method M5
Problem | Decisions | Plan Length
Number Goals | Trial 1 Trial 2 Trial 3 Average | Trial 1 Trial 2 Trial 3 Average
1 5 | 50 40 41 43.7 | 8 5 6 6.3
2 5 | 32 33 40 35.0 | 4 5 5 4.7
3 5 | 57 40 32 43.0 | 8 5 4 5.7
4 5 | 41 39 76 52.0 | 6 4 13 7.7
5 5 | 33 24 32 29.7 | 5 3 4 4.0
6 5 | 23 33 33 29.7 | 2 5 5 4.0
7 5 | 66 49 57 57.3 | 10 7 8 8.3
8 5 | 39 32 40 37.0 | 4 4 5 4.3
9 5 | 75 57 47 59.7 | 12 8 5 8.3
10 5 | 16 16 16 16.0 | 2 2 2 2.0
11 5 | 16 16 16 16.0 | 2 2 2 2.0
12 5 | 32 56 31 39.7 | 4 7 3 4.7
13 5 | 24 23 24 23.7 | 3 2 3 2.7
14 5 | 93 31 40 54.7 | 15 3 5 7.7
15 5 | 31 32 32 31.7 | 3 4 4 3.7
16 5 | 40 39 49 42.7 | 5 4 7 5.3
17 5 | 49 24 24 32.3 | 7 3 3 4.3
18 5 | 33 32 59 41.3 | 5 4 10 6.3
19 5 | 24 24 42 30.0 | 3 3 7 4.3
20 5 | 50 32 32 38.0 | 8 4 4 5.3
21 5 | 24 24 33 27.0 | 3 3 5 3.7
22 5 | 40 32 32 34.7 | 5 4 4 4.3
23 5 | 65 41 49 51.7 | 9 6 7 7.3
24 5 | 32 32 40 34.7 | 4 4 5 4.3
25 5 | 48 56 40 48.0 | 6 7 5 6.0
26 5 | 16 16 23 18.3 | 2 2 2 2.0
27 5 | 16 16 15 15.7 | 2 2 1 1.7
28 5 | 49 48 76 57.7 | 7 6 13 8.7
29 5 | 16 16 16 16.0 | 2 2 2 2.0
30 5 | 24 24 31 26.3 | 3 3 3 3.0
Total | 1154 977 1118 1083.0 | 159 123 152 144.7
Solved Problems | 30 30 30 30.0 | 30 30 30 30.0
Average | 38.47 32.57 37.27 36.10 | 5.30 4.10 5.07 4.82

Table B.5: Performance of M5 over 30 training problems in the machine-shop scheduling domain.

Method M6
Problem | Decisions | Plan Length
Number Goals | Trial 1 Trial 2 Trial 3 Average | Trial 1 Trial 2 Trial 3 Average
1 5 | 50 40 41 43.7 | 8 5 6 6.3
2 5 | 32 33 40 35.0 | 4 5 5 4.7
3 5 | 57 40 32 43.0 | 8 5 4 5.7
4 5 | 41 39 76 52.0 | 6 4 13 7.7
5 5 | 33 24 32 29.7 | 5 3 4 4.0
6 5 | 23 33 33 29.7 | 2 5 5 4.0
7 5 | 66 49 57 57.3 | 10 7 8 8.3
8 5 | 39 32 40 37.0 | 4 4 5 4.3
9 5 | 75 57 47 59.7 | 12 8 5 8.3
10 5 | 16 16 16 16.0 | 2 2 2 2.0
11 5 | 16 16 16 16.0 | 2 2 2 2.0
12 5 | 32 56 31 39.7 | 4 7 3 4.7
13 5 | 24 23 24 23.7 | 3 2 3 2.7
14 5 | 93 31 40 54.7 | 15 3 5 7.7
15 5 | 31 32 32 31.7 | 3 4 4 3.7
16 5 | 40 39 49 42.7 | 5 4 7 5.3
17 5 | 49 24 24 32.3 | 7 3 3 4.3
18 5 | 33 32 59 41.3 | 5 4 10 6.3
19 5 | 24 24 42 30.0 | 3 3 7 4.3
20 5 | 50 32 32 38.0 | 8 4 4 5.3
21 5 | 24 24 33 27.0 | 3 3 5 3.7
22 5 | 40 32 32 34.7 | 5 4 4 4.3
23 5 | 65 41 49 51.7 | 9 6 7 7.3
24 5 | 32 32 40 34.7 | 4 4 5 4.3
25 5 | 48 56 40 48.0 | 6 7 5 6.0
26 5 | 16 16 23 18.3 | 2 2 2 2.0
27 5 | 16 16 15 15.7 | 2 2 1 1.7
28 5 | 49 48 76 57.7 | 7 6 13 8.7
29 5 | 16 16 16 16.0 | 2 2 2 2.0
30 5 | 24 24 31 26.3 | 3 3 3 3.0
Total | 1154 977 1118 1083.0 | 159 123 152 144.7
Solved Problems | 30 30 30 30.0 | 30 30 30 30.0
Average | 38.47 32.57 37.27 36.10 | 5.30 4.10 5.07 4.82

Table B.6: Performance of M6 over 30 training problems in the machine-shop scheduling domain.

B.2 Performance over 100 Testing Problems

The entries in the tables are defined as follows:
D: Number of decisions
L: Plan length
C: CPU time (sec.)
N: Problem number
B: Number of blocks
G: Number of goal conjuncts
SP: Solved problems
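The Total, SP, and Average rows in the tables that follow are simple aggregates of the per-problem entries, with unsolved problems (shown as dashes) excluded throughout. The sketch below illustrates this bookkeeping in Python; it is an illustrative helper written for this appendix, with a made-up record format, not code from the planners themselves.

# Recompute the Total / SP / Average summary rows of the tables below.
# Each record is a (D, L, C) triple for a solved problem, or None for
# an unsolved problem (a dash in the tables). Illustrative helper only.
def summarize(records):
    solved = [r for r in records if r is not None]
    sp = len(solved)                                        # SP row
    totals = [round(sum(col), 2) for col in zip(*solved)]   # Total row (D, L, C)
    averages = [round(t / sp, 2) for t in totals]           # Average row
    return totals, sp, averages

# Example with two solved problems and one unsolved one:
print(summarize([(23, 3, 0.67), (26, 3, 0.53), None]))
# -> ([49, 6, 1.2], 2, [24.5, 3.0, 0.6])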
Problem | M1 | M2 | M3 | M4 | M5 | M6
N G | D L C | D L C | D L C | D L C | D L C | D L C
1 5 | 23 3 0.67 | 23 3 0.67 | 23 3 0.68 | 40 5 1.23 | 40 5 1.24 | 40 5 1.25
2 5 | 26 3 0.53 | 26 3 0.53 | 26 3 0.54 | 41 6 1.05 | 41 6 1.05 | 41 6 1.06
3 5 | 17 3 0.33 | 17 3 0.33 | 17 3 0.33 | 71 8 2.32 | 71 8 2.33 | 71 8 2.34
4 5 | 17 3 0.33 | 17 3 0.33 | 17 3 0.33 | 50 8 1.73 | 50 8 1.74 | 50 8 1.74
5 5 | 26 3 0.54 | 26 3 0.54 | 26 3 0.54 | 41 6 1.07 | 41 6 1.07 | 41 6 1.07
6 5 | 16 2 0.30 | 16 2 0.30 | 16 2 0.31 | 42 7 1.53 | 42 7 1.54 | 42 7 1.55
7 5 | - - - | - - - | - - - | 40 5 0.99 | 40 5 0.98 | 40 5 0.99
8 5 | 32 4 0.77 | 32 4 0.77 | 32 4 0.79 | 32 4 0.75 | 32 4 0.76 | 32 4 0.75
9 5 | - - - | - - - | - - - | 57 8 1.70 | 57 8 1.67 | 57 8 1.67
10 5 | 16 2 0.31 | 16 2 0.31 | 16 2 0.31 | 23 2 0.48 | 23 2 0.49 | 23 2 0.49
11 5 | - - - | - - - | - - - | 34 6 1.15 | 34 6 1.15 | 34 6 1.18
12 5 | 25 3 0.74 | 25 3 0.74 | 25 3 0.77 | 50 8 1.78 | 50 8 1.78 | 50 8 1.77
13 5 | 16 2 0.31 | 16 2 0.31 | 16 2 0.31 | 24 3 0.55 | 24 3 0.55 | 24 3 0.56
14 5 | 17 3 0.33 | 17 3 0.33 | 17 3 0.33 | 55 6 1.43 | 55 6 1.43 | 55 6 1.43
15 5 | 24 3 0.50 | 24 3 0.50 | 24 3 0.51 | 24 3 0.52 | 24 3 0.52 | 24 3 0.52
16 5 | 17 3 0.33 | 17 3 0.33 | 17 3 0.33 | 81 11 2.97 | 81 11 2.96 | 81 11 2.96
17 5 | - - - | - - - | - - - | 24 3 0.55 | 24 3 0.56 | 24 3 0.56
18 5 | - - - | - - - | - - - | 58 9 1.76 | 58 9 1.76 | 58 9 1.77
19 5 | - - - | - - - | - - - | 47 5 1.29 | 47 5 1.27 | 47 5 1.28
20 5 | 24 3 0.52 | 24 3 0.52 | 24 3 0.49 | 32 4 0.76 | 32 4 0.75 | 32 4 0.75
21 5 | 8 1 0.17 | 8 1 0.17 | 8 1 0.17 | 15 1 0.29 | 15 1 0.30 | 15 1 0.29
22 5 | - - - | - - - | - - - | 41 6 1.08 | 41 6 1.08 | 41 6 1.07
23 5 | 25 3 0.75 | 25 3 0.75 | 25 3 0.80 | 41 6 1.08 | 41 6 1.08 | 41 6 1.07
24 5 | 30 3 0.77 | 30 3 0.77 | 30 3 0.78 | 40 5 0.99 | 40 5 0.99 | 40 5 0.99
25 5 | 34 4 0.76 | 34 4 0.76 | 34 4 0.76 | 41 6 1.04 | 41 6 1.06 | 41 6 1.05
26 5 | 16 2 0.31 | 16 2 0.31 | 16 2 0.31 | 16 2 0.31 | 16 2 0.31 | 16 2 0.31
27 5 | 8 1 0.17 | 8 1 0.17 | 8 1 0.17 | 15 1 0.29 | 15 1 0.30 | 15 1 0.30
28 5 | - - - | - - - | - - - | 56 7 1.59 | 56 7 1.59 | 56 7 1.59
29 5 | 16 2 0.31 | 16 2 0.31 | 16 2 0.31 | 16 2 0.31 | 16 2 0.31 | 16 2 0.31
30 5 | 24 3 0.51 | 24 3 0.51 | 24 3 0.51 | 24 3 0.52 | 24 3 0.51 | 24 3 0.52
31 5 | 8 1 0.16 | 8 1 0.16 | 8 1 0.17 | 15 1 0.30 | 15 1 0.29 | 15 1 0.29
32 5 | - - - | - - - | - - - | 48 6 1.23 | 48 6 1.24 | 48 6 1.23
33 5 | 26 3 0.54 | 26 3 0.54 | 26 3 0.54 | 33 5 0.81 | 33 5 0.81 | 33 5 0.81
34 5 | 24 3 0.50 | 24 3 0.50 | 24 3 0.52 | 24 3 0.51 | 24 3 0.50 | 24 3 0.51
35 5 | 24 3 0.52 | 24 3 0.52 | 24 3 0.51 | 24 3 0.51 | 24 3 0.51 | 24 3 0.51
36 5 | - - - | - - - | - - - | 40 5 0.99 | 40 5 0.99 | 40 5 0.99
37 5 | - - - | - - - | - - - | 31 3 0.69 | 31 3 0.69 | 31 3 0.69
38 5 | - - - | - - - | - - - | 32 4 0.76 | 32 4 0.77 | 32 4 0.77
39 5 | - - - | - - - | - - - | 34 6 1.07 | 34 6 1.07 | 34 6 1.08
40 5 | 16 2 0.30 | 16 2 0.30 | 16 2 0.31 | 16 2 0.31 | 16 2 0.31 | 16 2 0.31
41 5 | 16 2 0.31 | 16 2 0.31 | 16 2 0.31 | 16 2 0.32 | 16 2 0.31 | 16 2 0.31
42 5 | - - - | - - - | - - - | 40 5 0.96 | 40 5 0.96 | 40 5 0.96
43 5 | - - - | - - - | - - - | 40 5 0.97 | 40 5 0.95 | 40 5 0.97
44 5 | 16 2 0.30 | 16 2 0.30 | 16 2 0.30 | 23 2 0.47 | 23 2 0.47 | 23 2 0.46
45 5 | 33 4 1.01 | 33 4 1.01 | 33 4 0.95 | 73 10 2.31 | 73 10 2.30 | 73 10 2.29
46 5 | 24 3 0.54 | 24 3 0.54 | 24 3 0.51 | 24 3 0.52 | 24 3 0.53 | 24 3 0.52
47 5 | 16 2 0.32 | 16 2 0.32 | 16 2 0.30 | 23 2 0.47 | 23 2 0.47 | 23 2 0.47
48 5 | 24 2 0.75 | 24 2 0.75 | 24 2 0.72 | 33 5 0.96 | 33 5 0.97 | 33 5 0.97
49 5 | 23 3 0.59 | 23 3 0.59 | 23 3 0.58 | 49 7 1.41 | 49 7 1.43 | 49 7 1.43
50 5 | - - - | - - - | - - - | 65 9 2.02 | 65 9 2.02 | 65 9 2.03

Table B.7: Performance of single-method planners over 100 testing problems in the machine-shop scheduling domain.
Problem | M1 | M2 | M3 | M4 | M5 | M6
N G | D L C | D L C | D L C | D L C | D L C | D L C
51 5 | 8 1 0.18 | 8 1 0.18 | 8 1 0.16 | 8 1 0.17 | 8 1 0.17 | 8 1 0.16
52 5 | 24 3 0.53 | 24 3 0.53 | 24 3 0.53 | 68 12 2.85 | 68 12 2.84 | 68 12 2.84
53 5 | - - - | - - - | - - - | 32 4 0.74 | 32 4 0.74 | 32 4 0.74
54 5 | - - - | - - - | - - - | 63 7 1.78 | 63 7 1.79 | 63 7 1.79
55 5 | 26 3 0.54 | 26 3 0.54 | 26 3 0.54 | 40 5 0.99 | 40 5 0.98 | 40 5 0.99
56 5 | - - - | - - - | - - - | 64 8 1.71 | 64 8 1.72 | 64 8 1.73
57 5 | 22 2 0.56 | 22 2 0.56 | 22 2 0.55 | 42 7 1.53 | 42 7 1.52 | 42 7 1.55
58 5 | - - - | - - - | - - - | 16 2 0.36 | 16 2 0.36 | 16 2 0.36
59 5 | 8 1 0.18 | 8 1 0.18 | 8 1 0.17 | 16 2 0.37 | 16 2 0.37 | 16 2 0.36
60 5 | 16 2 0.33 | 16 2 0.33 | 16 2 0.31 | 24 3 0.55 | 24 3 0.55 | 24 3 0.55
61 5 | - - - | - - - | - - - | 25 4 0.69 | 25 4 0.69 | 25 4 0.69
62 5 | 16 2 0.32 | 16 2 0.32 | 16 2 0.32 | 16 2 0.31 | 16 2 0.31 | 16 2 0.31
63 5 | 24 3 0.53 | 24 3 0.53 | 24 3 0.52 | 31 3 0.68 | 31 3 0.68 | 31 3 0.69
64 5 | 24 2 0.76 | 24 2 0.76 | 24 2 0.74 | 23 2 0.47 | 23 2 0.47 | 23 2 0.48
65 5 | - - - | - - - | - - - | 40 5 0.99 | 40 5 1.00 | 40 5 0.99
66 5 | 8 1 0.18 | 8 1 0.18 | 8 1 0.16 | 16 2 0.36 | 16 2 0.37 | 16 2 0.36
67 5 | 3 0 0.06 | 3 0 0.06 | 3 0 0.05 | 3 0 0.05 | 3 0 0.05 | 3 0 0.05
68 5 | 24 3 0.53 | 24 3 0.53 | 24 3 0.51 | 24 3 0.51 | 24 3 0.52 | 24 3 0.51
69 5 | 24 3 0.52 | 24 3 0.52 | 24 3 0.50 | 32 4 0.75 | 32 4 0.76 | 32 4 0.76
70 5 | 34 4 0.79 | 34 4 0.79 | 34 4 0.77 | 49 7 1.31 | 49 7 1.29 | 49 7 1.32
71 5 | 24 2 0.75 | 24 2 0.75 | 24 2 0.73 | 24 3 0.55 | 24 3 0.55 | 24 3 0.54
72 5 | 16 2 0.32 | 16 2 0.32 | 16 2 0.32 | 16 2 0.32 | 16 2 0.31 | 16 2 0.31
73 5 | 39 4 1.02 | 39 4 1.02 | 39 4 0.99 | 47 5 1.17 | 47 5 1.17 | 47 5 1.18
74 5 | 24 2 0.78 | 24 2 0.78 | 24 2 0.74 | 33 5 0.99 | 33 5 0.98 | 33 5 0.99
75 5 | 22 2 0.59 | 22 2 0.59 | 22 2 0.55 | 32 4 0.91 | 32 4 0.91 | 32 4 0.90
76 5 | 16 2 0.33 | 16 2 0.33 | 16 2 0.31 | 16 2 0.30 | 16 2 0.31 | 16 2 0.31
77 5 | 16 2 0.31 | 16 2 0.31 | 16 2 0.30 | 16 2 0.30 | 16 2 0.31 | 16 2 0.31
78 5 | - - - | - - - | - - - | 48 6 1.36 | 48 6 1.36 | 48 6 1.36
79 5 | - - - | - - - | - - - | 32 4 0.74 | 32 4 0.74 | 32 4 0.74
80 5 | - - - | - - - | - - - | 41 6 1.20 | 41 6 1.20 | 41 6 1.20
81 5 | 24 2 0.74 | 24 2 0.74 | 24 2 0.72 | 60 11 2.57 | 60 11 2.58 | 60 11 2.57
82 5 | - - - | - - - | - - - | 31 3 0.69 | 31 3 0.69 | 31 3 0.69
83 5 | 8 1 0.18 | 8 1 0.18 | 8 1 0.17 | 16 2 0.37 | 16 2 0.37 | 16 2 0.37
84 5 | 16 2 0.32 | 16 2 0.32 | 16 2 0.30 | 24 3 0.55 | 24 3 0.56 | 24 3 0.56
85 5 | 24 3 0.54 | 24 3 0.54 | 24 3 0.51 | 32 4 0.73 | 32 4 0.74 | 32 4 0.73
86 5 | 24 2 0.74 | 24 2 0.74 | 24 2 0.72 | 24 3 0.55 | 24 3 0.54 | 24 3 0.54
87 5 | - - - | - - - | - - - | 42 7 1.55 | 42 7 1.54 | 42 7 1.53
88 5 | 31 4 0.83 | 31 4 0.83 | 31 4 0.80 | 58 9 1.78 | 58 9 1.78 | 58 9 1.78
89 5 | 46 3 1.36 | 46 3 1.36 | 46 3 1.32 | 32 4 0.77 | 32 4 0.77 | 32 4 0.76
90 5 | - - - | - - - | - - - | 17 3 0.41 | 17 3 0.39 | 17 3 0.40
91 5 | 24 2 0.76 | 24 2 0.76 | 24 2 0.73 | 24 3 0.54 | 24 3 0.55 | 24 3 0.55
92 5 | 8 1 0.17 | 8 1 0.17 | 8 1 0.16 | 15 1 0.30 | 15 1 0.29 | 15 1 0.30
93 5 | - - - | - - - | - - - | 24 3 0.55 | 24 3 0.56 | 24 3 0.55
94 5 | 16 2 0.32 | 16 2 0.32 | 16 2 0.31 | 16 2 0.31 | 16 2 0.31 | 16 2 0.31
95 5 | 8 1 0.17 | 8 1 0.17 | 8 1 0.17 | 24 3 0.61 | 24 3 0.61 | 24 3 0.61
96 5 | 22 2 0.58 | 22 2 0.58 | 22 2 0.55 | 24 3 0.55 | 24 3 0.54 | 24 3 0.54
97 5 | 17 3 0.36 | 17 3 0.36 | 17 3 0.33 | 40 5 1.00 | 40 5 0.98 | 40 5 0.98
98 5 | - - - | - - - | - - - | 32 4 0.77 | 32 4 0.77 | 32 4 0.76
99 5 | 24 3 0.53 | 24 3 0.53 | 24 3 0.52 | 24 3 0.51 | 24 3 0.51 | 24 3 0.51
100 5 | 34 4 0.79 | 34 4 0.79 | 34 4 0.77 | 47 5 1.16 | 47 5 1.18 | 47 5 1.17
Total | 1451 170 44.19 | 1451 170 44.19 | 1451 170 43.38 | 3397 447 91.98 | 3397 447 92.00 | 3397 447 92.07
SP | 70 70 70 | 70 70 70 | 70 70 70 | 100 100 100 | 100 100 100 | 100 100 100
Average | 20.73 2.43 0.63 | 20.73 2.43 0.63 | 20.73 2.43 0.62 | 33.97 4.47 0.92 | 33.97 4.47 0.92 | 33.97 4.47 0.92

Table B.8: Performance of single-method planners over 100 testing problems in the machine-shop scheduling domain (continued).
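As a consistency check, the three summary rows above are related by Average = Total / SP for each method; the following figures are derived from the table, e.g. for M1 and M4:

$$\frac{1451}{70} \approx 20.73 \ \text{decisions}, \quad \frac{44.19}{70} \approx 0.63 \ \text{sec}; \qquad \frac{3397}{100} = 33.97, \quad \frac{91.98}{100} \approx 0.92.$$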
Problem | M2→M5 | M3→M6 | M2→5 | M3→6
N G | D L C | D L C | D L C | D L C
1 5 | 28 3 0.83 | 28 3 0.84 | 18 4 0.51 | 18 4 0.52
2 5 | 31 3 0.71 | 31 3 0.70 | 27 6 0.71 | 27 6 0.71
3 5 | 22 3 0.50 | 22 3 0.50 | 18 4 0.42 | 18 4 0.42
4 5 | 22 3 0.49 | 22 3 0.50 | 17 3 0.35 | 17 3 0.35
5 5 | 31 3 0.70 | 31 3 0.71 | 27 6 0.73 | 27 6 0.70
6 5 | 21 2 0.46 | 21 2 0.46 | 17 3 0.41 | 17 3 0.41
7 5 | 58 5 1.43 | 58 5 1.43 | 26 5 0.68 | 26 5 0.67
8 5 | 39 4 0.96 | 39 4 0.96 | 32 4 0.84 | 32 4 0.84
9 5 | 74 7 1.90 | 74 7 1.92 | 27 6 0.71 | 27 6 0.72
10 5 | 21 2 0.49 | 21 2 0.49 | 16 2 0.33 | 16 2 0.33
11 5 | 24 2 0.57 | 24 2 0.58 | 9 2 0.24 | 9 2 0.24
12 5 | 22 3 0.51 | 22 3 0.50 | 18 4 0.43 | 18 4 0.43
13 5 | 21 2 0.49 | 21 2 0.48 | 16 2 0.34 | 16 2 0.33
14 5 | 30 3 0.86 | 30 3 0.85 | 18 4 0.43 | 18 4 0.43
15 5 | 29 3 0.68 | 29 3 0.69 | 24 3 0.57 | 24 3 0.57
16 5 | 28 3 0.76 | 28 3 0.75 | 18 4 0.43 | 18 4 0.43
17 5 | 67 7 2.01 | 67 7 1.97 | 17 3 0.41 | 17 3 0.41
18 5 | 49 6 1.37 | 49 6 1.38 | 21 7 0.50 | 21 7 0.50
19 5 | 60 7 2.05 | 60 7 2.05 | 17 3 0.41 | 17 3 0.41
20 5 | 61 3 1.88 | 61 3 1.89 | 24 3 0.55 | 24 3 0.55
21 5 | 13 1 0.25 | 13 1 0.26 | 8 1 0.16 | 8 1 0.16
22 5 | 90 12 3.30 | 90 12 3.30 | 17 3 0.37 | 17 3 0.38
23 5 | 22 3 0.49 | 22 3 0.50 | 18 4 0.43 | 18 4 0.43
24 5 | 29 3 0.71 | 29 3 0.69 | 24 3 0.56 | 24 3 0.56
25 5 | 39 4 0.94 | 39 4 0.95 | 32 4 0.80 | 32 4 0.81
26 5 | 21 2 0.47 | 21 2 0.48 | 16 2 0.33 | 16 2 0.33
27 5 | 13 1 0.24 | 13 1 0.24 | 8 1 0.17 | 8 1 0.16
28 5 | 73 6 1.85 | 73 6 1.89 | 28 7 0.74 | 28 7 0.75
29 5 | 21 2 0.49 | 21 2 0.47 | 16 2 0.33 | 16 2 0.34
30 5 | 29 3 0.67 | 29 3 0.68 | 24 3 0.56 | 24 3 0.56
31 5 | 13 1 0.25 | 13 1 0.25 | 8 1 0.16 | 8 1 0.17
32 5 | 111 16 4.47 | 111 16 4.47 | 26 5 0.67 | 26 5 0.68
33 5 | 31 3 0.71 | 31 3 0.71 | 24 3 0.54 | 24 3 0.57
34 5 | 29 3 0.69 | 29 3 0.70 | 24 3 0.55 | 24 3 0.55
35 5 | 29 3 0.68 | 29 3 0.68 | 24 3 0.56 | 24 3 0.56
36 5 | 99 14 3.99 | 99 14 3.98 | 18 4 0.45 | 18 4 0.44
37 5 | 40 4 0.94 | 40 4 0.93 | 17 3 0.37 | 17 3 0.38
38 5 | 41 4 1.00 | 41 4 1.01 | 18 4 0.45 | 18 4 0.45
39 5 | 33 4 0.86 | 33 4 0.86 | 18 4 0.51 | 18 4 0.51
40 5 | 21 2 0.47 | 21 2 0.47 | 16 2 0.33 | 16 2 0.33
41 5 | 21 2 0.48 | 21 2 0.48 | 16 2 0.32 | 16 2 0.33
42 5 | 49 5 1.15 | 49 5 1.15 | 19 5 0.47 | 19 5 0.47
43 5 | 60 5 1.74 | 60 5 1.74 | 26 5 0.68 | 26 5 0.68
44 5 | 21 2 0.47 | 21 2 0.47 | 16 2 0.33 | 16 2 0.33
45 5 | 36 4 0.98 | 36 4 1.00 | 25 4 0.60 | 25 4 0.60
46 5 | 29 3 0.69 | 29 3 0.69 | 24 3 0.56 | 24 3 0.56
47 5 | 21 2 0.47 | 21 2 0.47 | 17 3 0.41 | 17 3 0.40
48 5 | 27 2 0.73 | 27 2 0.74 | 16 2 0.32 | 16 2 0.33
49 5 | 28 3 0.75 | 28 3 0.75 | 18 4 0.43 | 18 4 0.44
50 5 | 58 5 1.40 | 58 5 1.38 | 26 5 0.69 | 26 5 0.68

Table B.9: Performance of multi-method planners over 100 testing problems in the machine-shop scheduling domain.
Problem | M2→M5 | M3→M6 | M2→5 | M3→6
N G | D L C | D L C | D L C | D L C
51 5 | 13 1 0.24 | 13 1 0.25 | 8 1 0.16 | 8 1 0.16
52 5 | 60 3 2.04 | 60 3 2.04 | 24 3 0.56 | 24 3 0.57
53 5 | 67 5 1.91 | 67 5 1.91 | 26 5 0.70 | 26 5 0.70
54 5 | 102 14 3.97 | 102 14 3.96 | 28 7 0.79 | 28 7 0.79
55 5 | 29 3 0.69 | 29 3 0.69 | 26 5 0.67 | 26 5 0.68
56 5 | 75 10 2.19 | 75 10 2.21 | 19 5 0.42 | 19 5 0.42
57 5 | 21 2 0.46 | 21 2 0.47 | 17 3 0.40 | 17 3 0.40
58 5 | 42 6 1.36 | 42 6 1.36 | 9 2 0.24 | 9 2 0.24
59 5 | 13 1 0.25 | 13 1 0.24 | 8 1 0.17 | 8 1 0.17
60 5 | 27 2 0.72 | 27 2 0.72 | 17 3 0.40 | 17 3 0.40
61 5 | 24 2 0.57 | 24 2 0.57 | 9 2 0.24 | 9 2 0.24
62 5 | 21 2 0.47 | 21 2 0.48 | 16 2 0.33 | 16 2 0.33
63 5 | 29 3 0.68 | 29 3 0.68 | 24 3 0.55 | 24 3 0.56
64 5 | 29 2 0.83 | 29 2 0.82 | 16 2 0.33 | 16 2 0.33
65 5 | 59 4 1.65 | 59 4 1.65 | 26 5 0.68 | 26 5 0.69
66 5 | 13 1 0.24 | 13 1 0.24 | 8 1 0.17 | 8 1 0.16
67 5 | 3 0 0.04 | 3 0 0.05 | 3 0 0.05 | 3 0 0.05
68 5 | 29 3 0.68 | 29 3 0.68 | 24 3 0.54 | 24 3 0.55
69 5 | 45 3 1.24 | 45 3 1.24 | 24 3 0.56 | 24 3 0.55
70 5 | 37 4 0.93 | 37 4 0.93 | 34 6 0.95 | 34 6 0.96
71 5 | 29 2 0.82 | 29 2 0.82 | 17 3 0.41 | 17 3 0.42
72 5 | 21 2 0.47 | 21 2 0.47 | 16 2 0.33 | 16 2 0.33
73 5 | 37 4 0.93 | 37 4 0.91 | 27 6 0.72 | 27 6 0.73
74 5 | 29 2 0.81 | 29 2 0.82 | 17 3 0.42 | 17 3 0.40
75 5 | 29 2 0.80 | 29 2 0.80 | 16 2 0.33 | 16 2 0.33
76 5 | 21 2 0.47 | 21 2 0.47 | 16 2 0.32 | 16 2 0.33
77 5 | 21 2 0.47 | 21 2 0.47 | 16 2 0.33 | 16 2 0.33
78 5 | 48 5 1.27 | 48 5 1.28 | 18 4 0.45 | 18 4 0.45
79 5 | 66 4 1.83 | 66 4 1.85 | 26 5 0.68 | 26 5 0.68
80 5 | 40 4 0.93 | 40 4 0.93 | 20 6 0.52 | 20 6 0.51
81 5 | 21 2 0.47 | 21 2 0.48 | 16 2 0.33 | 16 2 0.34
82 5 | 42 3 1.04 | 42 3 1.04 | 17 3 0.41 | 17 3 0.41
83 5 | 13 1 0.24 | 13 1 0.25 | 8 1 0.17 | 8 1 0.17
84 5 | 21 2 0.46 | 21 2 0.47 | 16 2 0.33 | 16 2 0.33
85 5 | 29 3 0.70 | 29 3 0.69 | 24 3 0.56 | 24 3 0.56
86 5 | 27 2 0.71 | 27 2 0.72 | 17 3 0.40 | 17 3 0.41
87 5 | 85 11 3.07 | 85 11 3.09 | 17 3 0.42 | 17 3 0.42
88 5 | 30 4 0.74 | 30 4 0.73 | 26 5 0.67 | 26 5 0.67
89 5 | 35 3 0.94 | 35 3 0.94 | 25 4 0.64 | 25 4 0.64
90 5 | 32 3 0.77 | 32 3 0.78 | 17 3 0.44 | 17 3 0.44
91 5 | 29 2 0.82 | 29 2 0.82 | 17 3 0.40 | 17 3 0.40
92 5 | 13 1 0.25 | 13 1 0.25 | 8 1 0.17 | 8 1 0.17
93 5 | 57 4 1.40 | 57 4 1.40 | 17 3 0.41 | 17 3 0.41
94 5 | 21 2 0.47 | 21 2 0.48 | 16 2 0.34 | 16 2 0.33
95 5 | 13 1 0.25 | 13 1 0.25 | 8 1 0.16 | 8 1 0.17
96 5 | 29 2 0.81 | 29 2 0.82 | 16 2 0.33 | 16 2 0.34
97 5 | 28 3 0.76 | 28 3 0.75 | 18 4 0.42 | 18 4 0.43
98 5 | 34 4 0.83 | 34 4 0.83 | 18 4 0.45 | 18 4 0.48
99 5 | 29 3 0.68 | 29 3 0.69 | 24 3 0.56 | 24 3 0.57
100 5 | 39 4 0.96 | 39 4 0.96 | 33 5 0.87 | 33 5 0.88
Total | 3591 358 98.31 | 3591 358 98.49 | 1907 329 45.75 | 1907 329 45.94
SP | 100 100 100 | 100 100 100 | 100 100 100 | 100 100 100
Average | 35.91 3.58 0.98 | 35.91 3.58 0.98 | 19.07 3.29 0.46 | 19.07 3.29 0.46

Table B.10: Performance of multi-method planners over 100 testing problems in the machine-shop scheduling domain (continued).