Page 133
maximize the right-hand side of Equation (5.13). Furthermore, F must be an element of the set X = \{F : (5.15), (5.16), (5.17)\}, where constraints (5.15), (5.16), (5.17) are defined as follows:⁹

\[ F_{(m_n),(1)}(t) = 1 \tag{5.15} \]

\[ F_{\sigma,Q}(t) = \sum_{m_l \in A(\sigma)} F^{(m_l)}_{\sigma,Q}(t) + F^{(m_0)}_{\sigma,Q}(t) \tag{5.16} \]

\[ F_{(\sigma,m_l),(Q,q)}(t) = \int_0^t \widehat{P}(m_l, t', q)\, F^{(m_l)}_{\sigma,Q}(t')\, p_l(t - t')\, dt' \tag{5.17} \]

for all agents n = 1, ..., N, sequences \sigma \in \Sigma_n and outcomes Q \in \{0, 1\}^{|\sigma|}. Here, (\sigma, m_l) is the concatenation of \sigma and (m_l), and (Q, q) is the concatenation of Q and (q), whereas \widehat{P}(m_l, t', q) (explained later) is the probability that method m_l is enabled before time t' (in case q = 1) or not enabled before time t' (in case q = 0).

Constraints (5.15), (5.16), (5.17) are explained as follows: Constraint (5.15) ensures that each agent n = 1, ..., N starts the execution from its starting state s_{n,0}. To see this, recall that the starting state s_{n,0} of agent n is encoded in the CR-DEC-MDP framework as s_{n,0} = \langle m_n, l_{n,0}, l_{n,0}, 1 \rangle, that is, a spoof method m_n is by default completed successfully (with q = 1) at time l_{n,0} = 0. Hence the probability F_{(m_n),(1)}(t) that method m_n will be completed successfully before time t must be 1 for all t \in [0, \Delta].

Constraint (5.16) can be interpreted as the conservation of probability mass flow through a method sequence \sigma. Applicable only if |A(\sigma)| > 0, it ensures that the cumulative distribution function F_{\sigma,Q} is split into cumulative distribution functions F^{(m_l)}_{\sigma,Q} for methods

⁹ Constraints (5.15), (5.16), (5.17) are defined for a single method execution time window [0, \Delta]. The extension to multiple time windows is shown in Equation (5.6).
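Constraint (5.17) is a convolution: the probability mass F^{(m_l)}_{\sigma,Q} routed into method m_l, weighted by the enabling probability \widehat{P}, is pushed forward through the method's duration density p_l. As a minimal numerical sketch of that integral (not the thesis's algorithm), one can evaluate it on a discrete time grid; the exponential rates, the choice \widehat{P} \equiv 1 (the q = 1 case with m_l always enabled), and the grid resolution below are all hypothetical stand-ins.

```python
import numpy as np

# Sketch of constraint (5.17) on a single window [0, Delta]:
#   F_{(sigma,m_l),(Q,q)}(t) = integral_0^t Phat(m_l,t',q) * F^(m_l)_{sigma,Q}(t') * p_l(t-t') dt'
# All concrete choices below are illustrative assumptions, not model values.

DELTA = 10.0
ts = np.linspace(0.0, DELTA, 501)       # discrete time grid over [0, Delta]
dt = ts[1] - ts[0]

lam = 0.8
p_l = lam * np.exp(-lam * ts)           # assumed duration density p_l of method m_l
F_ml = 1.0 - np.exp(-0.5 * ts)          # assumed nondecreasing flow F^(m_l)_{sigma,Q}
Phat = np.ones_like(ts)                 # assume m_l enabled with certainty (q = 1)

# Left-Riemann discretization of the convolution for each grid point t = ts[i]:
# p_l[i::-1] lists p_l(t - t') for t' = ts[0], ..., ts[i].
F_next = np.array([
    dt * np.sum(Phat[: i + 1] * F_ml[: i + 1] * p_l[i::-1])
    for i in range(len(ts))
])

# F_next behaves like a (sub)probability CDF: it starts at 0, never decreases,
# and stays below 1 -- which is what lets it serve as the flow for the
# extended sequence (sigma, m_l) in the next round of constraints.
```

Constraint (5.16) then requires that such flows, summed over the methods in A(\sigma) plus the residual F^{(m_0)}_{\sigma,Q}, reassemble the mass F_{\sigma,Q} exactly, so no probability is created or lost along a sequence.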
Object Description
Title  Planning with continuous resources in agent systems 
Author  Marecki, Janusz 
Author email  marecki@usc.edu 
Degree  Doctor of Philosophy 
Document type  Dissertation 
Degree program  Computer Science 
School  Viterbi School of Engineering 
Date defended/completed  2008-05-07 
Date submitted  2008 
Restricted until  Unrestricted 
Date published  2008-06-19 
Advisor (committee chair)  Tambe, Milind 
Advisor (committee member)  Lesser, Victor; Gratch, Jonathan; Maheswaran, Rajiv T.; Ordonez, Fernando I. 
Abstract  My research concentrates on developing reasoning techniques for intelligent, autonomous agent systems. In particular, I focus on planning techniques for both single- and multi-agent systems acting in uncertain domains. In modeling these domains, I consider two types of uncertainty: (i) the outcomes of agent actions are uncertain and (ii) the amount of resources consumed by agent actions is uncertain and only characterized by continuous probability density functions. Such rich domains, ranging from Mars rover exploration to unmanned aerial surveillance to automated disaster rescue operations, are commonly modeled as continuous resource Markov decision processes (MDPs) that can then be solved in order to construct policies for agents acting in these domains. This thesis addresses two major unresolved problems in continuous resource MDPs. First, they are very difficult to solve, and existing algorithms are either fast but make additional restrictive assumptions about the model, or do not introduce these assumptions but are very inefficient. Second, the continuous resource MDP framework is not directly applicable to multi-agent systems, and current approaches all discretize resource levels or assume deterministic resource consumption, which automatically invalidates the formal solution quality guarantees. The goal of my thesis is to fundamentally alter this landscape in three contributions: I first introduce CPH, a fast analytic algorithm for solving continuous resource MDPs. CPH solves the planning problems at hand by first approximating, with a desired accuracy, the probability distributions over the resource consumptions with phase-type distributions, which use exponential distributions as building blocks. 
It then uses value iteration to solve the resulting MDPs more efficiently than its closest competitor, and allows for a systematic trade-off of solution quality for speed. Second, to improve the anytime performance of CPH and other continuous resource MDP solvers, I introduce the DPFP algorithm. Rather than using value iteration to solve the problem at hand, DPFP performs a forward search in the corresponding dual space of cumulative distribution functions. In doing so, DPFP discriminates in its policy generation effort, providing only approximate policies for regions of the state space reachable with low probability, yet it bounds the error that such approximation entails. Third, I introduce CR-DEC-MDP, a framework for planning with continuous resources in multi-agent systems, and propose two algorithms for solving CR-DEC-MDPs: The first algorithm (VFP) emphasizes scalability. It performs a series of policy iterations in order to quickly find a locally optimal policy. In contrast, the second algorithm (M-DPFP) stresses optimality; it allows for a systematic trade-off of solution quality for speed by using the concept of DPFP in a multi-agent setting. My results show up to three orders of magnitude speedup in solving single-agent planning problems and up to one order of magnitude speedup in solving multi-agent planning problems. Furthermore, I demonstrate the practical use of one of my algorithms in a large-scale disaster simulation, where it allows for a more efficient rescue operation. 
Keyword  agents; planning under uncertainty; continuous resources; Markov decision process; convolution 
Language  English 
Part of collection  University of Southern California dissertations and theses 
Publisher (of the original version)  University of Southern California 
Place of publication (of the original version)  Los Angeles, California 
Publisher (of the digital version)  University of Southern California. Libraries 
Type  texts 
Legacy record ID  uscthesesm1277 
Contributing entity  University of Southern California 
Rights  Marecki, Janusz 
Repository name  Libraries, University of Southern California 
Repository address  Los Angeles, California 
Repository email  cisadmin@lib.usc.edu 
Filename  etdMarecki20080619 
Archival file  uscthesesreloadpub_Volume32/etdMarecki20080619.pdf 
Description
Title  Page 133 
Contributing entity  University of Southern California 
Repository email  cisadmin@lib.usc.edu 
Full text  maximize the righthandside of Equation (5.13). Furthermore, F must be an element of a set X = fF : (5:15); (5:16); (5:17)g where constraints (5.15), (5.16), (5.17) are defined as follows:9 F(mn);(1)(t) = 1 (5.15) F ;Q(t) = X ml2A( ) F ;Q(ml)(t) + F ;Q(m0)(t) (5.16) F( ;ml);(Q;q)(t) = Z t 0 Pb(ml; t0; q) F ;Q(ml)(t0)pl(t t0)dt0 (5.17) for all agents n = 1; :::; N, sequences 2 n and outcomes Q 2 f0; 1gj j. Here, ( ;ml) is a concatenation of and (ml), (Q; q) is a concatenation of Q and (q) whereas Pb(ml; t0; q) (explained later) is the probability that method ml is enabled before time t0 (in case q = 1) or not enabled before time t0 (in case q = 0). Constraints (5.15), (5.16), (5.17) are explained as follows: Constraint (5.15) ensures that each agent n = 1; ::; N starts the execution from its starting state s0;n. To observe that, recall that the starting state sn;0 of agent n in encoded in the CRDEC MDP framework as sn;0 = (hn; ln;0; ln;0; 1i), that is, a spoof method mn is by default completed successfully (with q = 1) at time ln;0 = 0. Hence the probability F(mn);(1)(t) that method mn will be completed successfully before time t must be 1 for all t 2 [0; ]. Constraint (5.16) can be interpreted as the conservation of of probability mass flow through a method sequence . Applicable only if jA( )j > 0 it ensures that the cumulative distribution function F ;Q is split into cumulative distribution functions F ;Q(ml) for methods 9Constraints (5.15), (5.16), (5.17) are defined for a single method execution time windows [0; ]. Extension to multiple time windows is shown in Equation (5.6) 123 