PROBABILISTIC MAPS: COMPUTATION AND APPLICATIONS FOR MULTI-AGENT PROBLEMS

by

Huseyin Hakan Kizilocak

A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
ELECTRICAL ENGINEERING

May 2007

Copyright 2007 Huseyin Hakan Kizilocak

Table of Contents

List Of Figures
Abstract
1 Introduction
  1.1 Motivation
  1.2 Related Work
    Sensors, Uncertain Measurements
    Map Building
    Coordination of Multi-Agents
  1.3 Summary of Contributions
    1.3.1 Sensor Model
    1.3.2 Mapping Objects
    1.3.3 Motion Model
    1.3.4 Operators for Recursive Map Computation
      1.3.4.1 Fusion Operator
      1.3.4.2 Prediction Operator
    1.3.5 Iterative Computation of Dynamic Probabilistic Maps
      Sequential Case
      Out-of-Sequence Case
    1.3.6 Applications of Probabilistic Maps
    1.3.7 Occupancy maps for objects
    1.3.8 Optimal Path
    1.3.9 Simulation Results
2 Concept and Uses of Map Building
  2.1 Discussion on Sensor Model
  2.2 Definitions of Probabilistic Maps
    Bayesian Filter Maps
    Particle Filter Maps
  2.3 Example Work
    Setup for JFACC Program
    Setup for MICA (Mixed Initiative Control of Automa-teams) Program
  2.4 Discussion on How to Use Probabilistic Maps
    Minimum-Risk Path Planning
    Pursuit-Evasion
3 Sensor and Motion Model for Probabilistic Maps
  3.1 Parametric Sensor Model
    3.1.1 Single-object sensor stationary likelihood functions
    3.1.2 Multiple-object stationary sensor likelihood functions
  3.2 Motion Model
4 Bayesian Filtering
  4.1 Separability of Probabilistic Maps
  4.2 Fusion Operator
  4.3 Prediction Operator
    Single object
    Multiple objects
  4.4 Preferential Bearing Motion Model
  4.5 Mapping Static Objects
  4.6 Mapping Moving Objects
    Sequential Case
    Out-of-Sequence Case
  4.7 Smoothing Operator
  4.8 Simulation Results
5 Particle Filter Maps vs Bayesian Approach
  5.1 Sequential Importance Sampling (SIS)
  5.2 Simulation
  5.3 Comparison: Particle Filter vs Bayesian
6 Minimum Risk Path Planning
  6.1 Mission risk assessment
  6.2 Occupancy maps for objects
  6.3 Optimal Path
  6.4 Simulation Results
7 Probabilistic Pursuit Evasion
  7.1 Probabilistic Approach and Challenges
  7.2 Pursuit Policies
  7.3 Simulation Results: All Policies Considered
  7.4 Constrained Communication Case
  7.5 Simulation: Constrained Communication
  7.6 Effect of Number of Pursuers on Capture Time
8 Conclusions and Future Work
Reference List
Appendix A Simulator
  A.1 Simulation and Modeling for Military Air Operations
    A.1.1 Simulation Architecture
    A.1.2 Model Predictive Control
    A.1.3 Task Group Control
    A.1.4 Map Building
    A.1.5 Minimum-risk Path Planning
    A.1.6 Future Work For Simulation

List Of Figures

1.1 A Typical Probabilistic Map
1.2 Recursive computation of m_r(·, t | Y_k), t ≥ t_k, using the prediction and fusion operators.
2.1 Overall system architecture
2.2 Control System for Teams of Autonomous Agents (CRYSTAL) architecture
2.3 Minimum Risk Path
2.4 Pursuit using the pursuit policy with search paths. The searchers are represented by blue circles, the fugitive is represented by a red circle. The background color of each cell r ∈ R encodes the probability of having a fugitive at r, with a light color for low probability and a dark color for high probability. The search paths are represented by green stars and the search width by yellow stars. This is the Bayesian Filter Maps case.
3.1 Example motion model which considers roads in the terrain.
4.1 Example collision-free-at-any-time motion model which considers roads in the terrain.
4.2 Recursive computation of m_r(·, t | Y_k), t ≥ t_k, using the prediction and fusion operators.
4.3 Example scenario for recursive out-of-sequence computation of m_r(·, t | Y_k), t ≥ t_k, using the prediction and fusion operators.
4.4 Recursive Out-of-sequence Computation: Fusion-on-arrival Algorithm
4.5 Recursive Out-of-sequence Computation: Delayed-fusion Case
4.6 Occupancy Map at 1 sec
4.7 Occupancy Map at 192 sec
4.8 Occupancy Map at 407 sec
4.9 Occupancy Map at 601 sec
4.10 Occupancy Map after 701 sec
4.11 Occupancy Map after 801 sec
5.1 Particle Filter Map with 50 samples at 1 sec
5.2 Particle Filter Map with 50 samples at 192 sec
5.3 Particle Filter Map with 50 samples at 407 sec
5.4 Particle Filter Map with 100 samples at 1 sec
5.5 Particle Filter Map with 100 samples at 192 sec
5.6 Particle Filter Map with 100 samples at 407 sec
5.7 Particle Filter Map with 500 samples at 1 sec
5.8 Particle Filter Map with 500 samples at 192 sec
5.9 Particle Filter Map with 500 samples at 407 sec
6.1 Battlefield
6.2 The occupancy map of radars
6.3 The log-live map of radars
6.4 The example for online optimization
7.1 Pursuit-evasion simulation results using all three policies. Symbols represent the mean value of capture time and light color bars represent the standard deviation.
7.2 Pursuit-evasion simulation results using the pursuit policy with search paths in the unconstrained and constrained communication cases. Symbols represent the mean value of capture time and light color bars represent the standard deviation.
7.3 Pursuit-evasion simulation results for the effect of the number of searchers on the mean and variation of capture time using the local-max policy and the pursuit policy with search paths on a 20x20 map. Symbols represent the mean value of capture time and light color bars represent the standard deviation.
A.1 Overall system architecture
A.2 Simulation Graphical User Interface
A.3 Simulation platform and architecture
A.4 State flow diagram of a battle
A.5 Multiple bases engage in resource allocation and battle
A.6 Occupancy Map
A.7 Minimum Risk Path

Abstract

The computation and application of probabilistic maps to multi-agent problems generally consists of generating maps; distributing paths for each agent that allow the group to minimize a cost function subject to constraints, e.g. the length of the path, the risk involved with the path, coordinated group behavior, time to arrive at the destination, power consumption, network delay, and bandwidth usage; and deciding to either avoid threats or pursue treasures. In this thesis, we describe a formalism that addresses the general problem of detecting objects with imperfect sensors.
This is accomplished by building several probabilistic maps. We develop efficient algorithms to compute probabilistic maps for a large number of static and mobile objects. By a probabilistic map we mean the probability density of the objects' positions, conditioned on noisy sensor measurements. One of the contributions of this work is that, under suitable assumptions, the joint probability distribution of n objects that lie in a region partitioned into N cells can be approximately determined from an aggregate measurement function that can be represented with memory complexity o(N), regardless of the number of objects n, even though the sensor errors are not Gaussian and the sensors are not able to distinguish between individual objects. This is far more compact than an extensive representation of the joint distribution, whose memory complexity is o(N^n). Another contribution is that the algorithms do not have exponential complexity in the number of objects, although the positions of multiple objects are not independent of each other. Three main assumptions are made: the objects are indistinguishable from the sensor viewpoint, there is a minimum distance between any two of them, and, for mobile objects, their motion is essentially independent of each other.

Innovative applications for this kind of work involve search and rescue operations at disaster sites, e.g. hurricanes, fires, earthquakes, floods, and volcanoes, and battlefield strategies, e.g. searching areas for threats or targets, classification of detected objects, pursuit-evasion games, and battle damage assessment. One of our applications is to assess the risk of moving from a source to a destination point by exploiting the probabilistic maps, and even to do minimum-risk path planning. Probabilistic maps of the locations of objects are constantly computed by fusing information collected from all the sensors in the region. Additionally, the objects that we map may induce danger. Therefore the probabilistic maps may be used for intelligence updates and for minimum-risk route planning. They make use of the map-building facility to compute paths for the autonomous mobile robots that minimize the risk of being in the danger zone. The autonomous mobile robots could be air or ground vehicles depending on the scenario. Another application focuses on the problem where we do not want to avoid the objects that we map, but to pursue them. Depending on the particular context, capture may actually mean handling the evader in some particular way, e.g. rescuing it.

The algorithms were implemented on two simulations of military operations, one developed by Honeywell Technology Labs and the other by Boeing.

Chapter 1

Introduction

Systems and Control Theory has been an accepted research area for many decades. It has also served as a communication hub between different applications by putting application problems into a stronger mathematical framework. This research area also establishes a common ground that enables researchers from diverse backgrounds to come together and pass on new ideas, and thereby to contribute to the good of the research community. With this in mind, the first chapter is organized to include a motivation section, a related work section and a summary of contributions section. The motivation section is where the problems that need to be solved are presented in the context of coordinated multi-agent pursuit using probabilistic maps. It also discusses why we need probabilistic maps.
Challenges faced in these kinds of problems are introduced in the motivation section as well. The related work section provides a brief survey of related work. The summary of contributions summarizes the main results in the thesis and lists the main theorems for brief reference.

1.1 Motivation

Coordinated multi-agent pursuit problems generally consist of generating and distributing paths for each agent that allow the group to minimize a cost function subject to constraints, e.g. the length of the path, the risk involved with the path, coordinated group behavior, time to arrive at the destination, power consumption, network delay, and bandwidth usage. Several problems motivate this work. Two stimulating problems are path planning with risk avoidance and search. Risk avoidance deals with how to go from one point to another in a region infested with threat objects while avoiding these dangerous objects. The search problem, on the other hand, studies how to find objects of interest in the shortest time possible. That is, one wants to plan a path that maximizes the probability of finding interesting objects. These problems are motivated by military air operations, where aircraft need to be routed safely and efficiently for search and rescue or search and destroy operations. The common feature of these two problems is that both require locating objects (to be avoided or looked for) using noisy measurements. Because the data collected is noisy, there will be uncertainty when one constructs maps localizing the objects. Therefore, decisions are made based on noisy data collected by the sensors regarding the positions of threat or treasure objects. In simple words, sensors do not produce perfect measurements, and this introduces uncertainty in the map of object positions constructed from sensor data. Hence, one needs to use probabilistic approaches to overcome this problem.

From a formal point of view, a probabilistic map is simply the probability distribution of the position of an object (or objects) over a region (space), conditioned on the data collected by sensors. Figure 1.1 shows a snapshot of a probabilistic occupancy map. The probability of occupancy is encoded in the background color (lighter corresponding to a higher probability) and each x corresponds to a location where an aircraft detected a radar. The two circled light areas in the figure represent the approximate positions of two radars located in the region. Note that several false positives that can be seen in the plot were correctly "interpreted" as erroneous measurements. We deal with dynamic maps, i.e., maps of objects that move. Furthermore, there are actually two times that are relevant for a probabilistic map, i.e. an old map that is computed at time t_1 and a new map that needs to be computed at time t_2, where t_2 > t_1. When we do not have any new data from a sensor until t_2, we have a predicted map.

Figure 1.1: A Typical Probabilistic Map

We are faced with the following challenges:

• Computational complexity to compute probabilistic maps. The main difficulty in computing probabilistic maps is their size. When one is mapping multiple objects whose positions are not independent of each other, and therefore one needs to keep track of the joint probability density of all the objects, the probabilistic map (i.e., the joint probability density) easily becomes very large. This is because, in principle, the memory requirements to represent a joint probability density grow exponentially with the number of random variables.
It should be noted that (conditional) independence of the positions of several targets is not expected in our problem because, e.g., the probability of all the targets being very close to each other is often much smaller (or at least different) than the probability of the targets being spread out to cover a certain perimeter. In this thesis, we develop efficient algorithms to compute probabilistic maps for a large number of mobile objects.

• Memory complexity to store probabilistic maps. It is well known that if the a priori probability density of the objects is jointly Gaussian, their motion can be expressed by a linear model with Gaussian disturbances, and the measurements are obtained by adding Gaussian noise to the true object position, then the probabilistic map is Gaussian. In general, without the linear/Gaussian assumptions mentioned above, there is no finite-dimensional representation of the a posteriori probability densities. A finite discretization of the state space overcomes this difficulty, but one is faced with an exponential explosion in the number of states as the number of objects being mapped grows. This matters because, for any region of reasonable size, the joint probability distribution of n objects that lie in a region partitioned into N cells cannot be represented exhaustively: such a representation has memory complexity o(N^n). In this thesis, we develop a representation of the joint distribution that is far more compact than an extensive one of memory complexity o(N^n).

• Computing optimal paths using probabilistic maps, i.e. optimization of costs. There is more than one criterion in computing optimal paths, which places an additional burden on the complexity of the overall work. The problem at hand, in general, is a two-cost optimization problem, one of the costs being path length and the other the minimum-risk constraint. In our case, the "path length" cost can be replaced by a function of time f(T), i.e. J_1 = f(T). Likewise, the "risk" cost will be a function of the probability of survival p_survive, J_2 = f(p_survive). This can be presented as

  min_{x_1, x_2, ..., x_m : x_1 = x_init, x_m = x_final, J_i ≤ J_i^max} J_j,   i, j ∈ {1, 2}, i ≠ j,

where J_1 = T and J_2 = −log(p_survive) are the costs and J_i^max the cost constraints. The solutions to these kinds of problems are called Pareto-optimal. A Pareto-optimal solution stands out as a reasonable equilibrium solution for cooperative game problems, since it features the property that no other joint decision of the players can improve the performance of at least one of them without degrading the performance of the other. In our case, there is no cost pair other than the Pareto-optimal solution itself that decreases one of the costs without increasing the other; a small illustration of this dominance test is sketched below.

The first two items are studied in Chapter 3 and Chapter 4. Optimization of costs is presented in Chapter 6.
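To make the Pareto-dominance test concrete, the following minimal Python sketch filters a set of candidate (J_1, J_2) cost pairs down to the nondominated ones. The candidate values are hypothetical and serve only as an illustration.

    # Minimal sketch: keep only Pareto-optimal (J1, J2) cost pairs.
    # A pair dominates another when it is no worse in both costs and
    # strictly better in at least one.
    def pareto_front(cost_pairs):
        front = []
        for p in cost_pairs:
            dominated = any(q[0] <= p[0] and q[1] <= p[1] and q != p
                            for q in cost_pairs)
            if not dominated:
                front.append(p)
        return front

    # Three hypothetical paths, summarized as (time T, risk -log p_survive).
    candidates = [(10.0, 2.3), (18.0, 0.4), (19.0, 2.5)]
    print(pareto_front(candidates))  # [(10.0, 2.3), (18.0, 0.4)]; the third is dominated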
1.2 Related Work

This section is a brief survey of the coordinated multi-agent pursuit problem from the Systems and Control point of view, which includes research from electrical engineering, computer engineering, computer science, and mechanical engineering. [44] and [47] are other surveys that view the methodology for programming robots known as probabilistic robotics from an Artificial Intelligence standpoint. Researchers in AI are mainly concerned with the development of robotic software exploiting model-based or reactive examples. From our perspective, the general problem consists of the following research areas:

Sensors, Uncertain Measurements. Sensors, in general, send data in two formats. Encoder-based data, especially in robotics, is data related to the position and orientation of the agents obtained from joint encoders. It turns out that, since the physical parameters of the agent are difficult to determine and change over time (e.g., due to wear and tear), these measurements have low accuracy. Secondly, vision-based data, e.g. camera images and laser range scans, exhibits intrinsic ambiguities that introduce uncertainty into the measurements. Therefore, the main reason to utilize a probabilistic framework for map building is the uncertainty due to the inaccuracies of the sensors used to detect the objects being mapped. These inaccuracies are well known and have been studied in the literature for several types of sensors, such as airborne laser radar imagery [8], two-dimensional passive sensors [40], stereo camera systems [30], etc. Because of the uncertainty arising from sensor limitations, sensor noise and environment dynamics, statistical modeling and evaluation of the performance of obstacle detection sensors have been studied broadly.

Map Building. Mapping is the problem of building maps of the region of interest from sensor readings. Researchers build maps assuming known locations of sensors, and yet it is hard to neutralize the effect of sensor position errors. Localization seems to be a first problem to solve. On the other hand, localization needs a good known map of the environment to work properly. Due to this chicken-and-egg problem, a more robust method that considers uncertainty and errors is needed. Hence, probabilistic map building is becoming an important topic in the robotics literature. [48] addresses the problem of building large-scale geometric maps of indoor environments with mobile robots. It poses the map building problem as a constrained, probabilistic maximum-likelihood estimation problem. It then devises an algorithm for generating the most likely map from data, along with the most likely path taken by the robot. [45] adds a technique that combines fast maximum likelihood map growing with a Monte Carlo localizer that uses particle representations. In our work, to overcome the computational complexity, we introduce a low-dimensional aggregate measurement function to represent the large-scale geometric map. [4] presents a behavior-based technique for incremental online mapping and autonomous navigation for mobile robots, specifically geared toward time-critical indoor exploration tasks. While that work includes the autonomous generation of topological maps of the environment incrementally and locally, our work represents the maps globally, benefiting from the computational ease of a low-dimensional representation.

Localization. Localization consists of determining the position of the object of interest relative to the position of another object or landmark in the region, often assuming that a map of the region is given. The localization problem, like map learning, can be interpreted as a density estimation problem, where a posterior conditional distribution of the positions of the objects of interest (e.g. agents, obstacles, objects in the region of interest) conditioned on the available information is predicted. The posterior probability density is typically estimated exploiting Bayes' Theorem, the Total Probability Theorem and Markovian assumptions. Most probabilistic approaches originate from and are inspired by the Kalman Filter [22]. Especially for indoor robotics, map building is often combined with the problem of localizing the robot within a global coordinate system [9, 26, 43, 48, 4, 3].
[49] applies a set of algorithms for localization based on particle filters, so-called Monte Carlo Localization, which are approximate Bayes filters that use random samples for posterior estimation. [12] proposes Markov localization. This connection between mapping and localization is required since the measurements used to build the map are a function of the position of the robot, which is hard to determine accurately due to accumulated odometry errors. In outdoor robotics, the localization problem can be mitigated through the use of GPS.

Map Learning. A representation of the environment to be used in path planning is built by an agent in a multi-agent system to explore its environment. Such a representation is called a map of the environment, and the process of building a map from a set of measurements is called map learning. [5] considers how to learn the interior of a two-dimensional room. However, systematic map learning is time consuming and computationally hard, even for simple two-dimensional rectilinear environments with each side of the obstacles parallel to one of the coordinate axes.

Maps On a Graph. The graph consists of nodes and edges, where the agent can navigate from one node to another through an edge connecting these two nodes. [6] considers the robot exploration problem on graph maps with homogeneous markers.

Occupancy Grid Maps. Occupancy grid mapping investigates the problem of estimating a map from a set of sensor readings given known locations. Occupancy grid mapping is currently the most widely used class of mapping algorithms for mobile robots (cf., e.g., [43, 52]). Occupancy maps simply provide the probability of each individual cell in the grid being occupied by an object. This information is not sufficient to compute the probability of a particular configuration of objects, unless it is assumed that the existence of objects in different cells corresponds to independent events. As mentioned above, it is difficult to justify this type of independence assumption for the problem at hand.

Coordination of Multi-Agents. Coordinated multi-agent pursuit problems generally consist of generating and distributing paths for each agent that allow the group to minimize a cost function subject to constraints, e.g. the length of the path, the risk involved with the path, coordinated group behavior, time to arrive at the destination, power consumption, network delay, and bandwidth usage. Our work consists of high-level control algorithms to design paths. The designed path consists of waypoints. These waypoints are separated from each other by distances on the order of kilometers. We do not study low-level control, where one might study the actual dynamics of the motion with detailed differential equations. However, low-level control strategies may make use of probabilistic maps for obstacle avoidance. Our probabilistic approach may help the low-level controller compute an obstacle map in this case. Another way to look at the problem of coordination and cooperation in multi-agents comes in the flavor of formations of multi-agents. Much work has been done, ranging from the stability of multi-agent formations to obstacle avoidance in formations. [54] derives a tracking control law to control a satellite formation. [33] presents an approach to obstacle avoidance for a group of unmanned vehicles moving in formation. The main goal is to move agents in a partially known environment with obstacles and reach a destination while maintaining a stable formation. [11], [34] and [31] develop trajectory generation strategies for the coordination of multi-agents.
[39] introduces a decentralized coordination architecture with decentralized formation control strategies. [24] presents a behavior-based approach to maneuvers in multi-agent formations, in which formation maneuvers are broken into a sequence of maneuvers between formation patterns. [25] utilizes an information flow law in multi-agent networks to study the stability of formations. [20] studies a local rule based on the nearest neighbors of an agent in a multi-agent formation to move agents in the same direction via decentralized coordination. [10] presents strategies based on rigidity to keep formations of multi-agents stable.

Considering these facts, we still provide here some example work and approaches for coordination of multi-agents, path planning, and motion planning:

Neural Networks. The neural network approach is mostly taken for learning algorithms. The learning algorithms are used to predict the state of the world, e.g. the behavior of other agents, in the multi-agent domain for search-and-avoid and search-and-find purposes. In [53], agents learn to predict the behavior of other agents using neural networks trained via reinforcement learning. [36] contains the corresponding background work for the search problem and path planning.

Behavior-Based Approach. Behavior-based approaches to the coordination of multi-agents introduce algorithms that focus on behaviors and conditions. Typically, a control algorithm selects behaviors to reduce the large-scale learning domain. [29] presents a behavior-based reinforcement learning algorithm for multi-robot cooperation during mapping. To utilize the coordination between multi-agents, [13] and [35] introduce empirical studies on the task allocation problem. In the AI context, task allocation handles which action is going to be selected for each robot to complete a global task. [13] provides a method of dynamic task allocation for multi-agents, a so-called distributed negotiation mechanism.

Game Theoretic Approaches. Game theoretic approaches are broadly investigated for pursuit-evasion types of games in the multi-agent domain. These approaches are coupled with probabilistic approaches. Greedy control policies are introduced for pursuit games in a probabilistic framework in [17], [19], [37].

Probabilistic Approaches. Probabilistic approaches exploit probability densities to express what the state of the real world is. It turns out that probabilistic approaches can also predict the future state of the world once the world is represented in probability densities. Furthermore, considering the uncertainty present in sensors, they are also particularly useful in decision making. A key concept used in our work is that of a probabilistic map. In usual maps of a region, one finds markings that represent locations of objects (e.g., radars) within the region. By placing one of these markings at a location on the map, one is expressing the fact that the object represented can be found at that particular location with probability one. Because of uncertainty in sensor measurements, when the information used to build a map (or more precisely to locate interesting objects in it) is not perfect, it is not possible to mark specific positions on the map as the location of the object. In this case, the best one can do is to superimpose the probability density of the location of the object in question on the whole map. A formalism that takes care of the general problem of detecting objects with imperfect sensors is presented from here on. This is accomplished by building several probabilistic maps.
We develop efficient algorithms to compute probabilistic maps for a large number of static and mobile objects. It is this probability density that we call the probabilistic map for the object. From a formal point of view, probabilistic maps are actually probability densities of the positions of objects, conditioned on the sensor measurements available. Note that these probabilistic maps do not represent the topology of the geographic region, but are often related to it; e.g., the probabilistic map of a radar will most likely exhibit zero probability over areas covered by a river, because it is unlikely to find an illuminator radar over water.

1.3 Summary of Contributions

We summarize the main results of the thesis here and put them into context with related work.

1.3.1 Sensor Model

Our knowledge about the position of the objects improves with time through observations collected by the sensors over R. These observations are denoted by y_k, k ∈ K := {1, 2, ..., k̄}, and each y_k consists of a triple of the form

  {t_k, a_k, s_k},

where t_k ≥ 0 denotes the time at which the observation was made and a_k ∈ R the position of the sensor. Sensors of this type can be modeled by a likelihood function of the form

  ℓ_n(y; r, t) := P(y_k = y | r(t) = r),   k ∈ K, r(t) ∈ R^n, y ∈ Y,   (1.1)

where t_k = t. We refer to ℓ_n as the n-object sensor likelihood function.

1.3.2 Mapping Objects

The classical problem in partially observable Markov chains is to recover a posterior distribution over the state at any time from all available sensor measurements. This problem is known as the Bayesian filtering problem, also called the optimal filtering or stochastic filtering problem. We propose a solution to this problem by Bayes filters, which compute this posterior recursively. The a posteriori probabilistic map for the objects is defined by

  m_r(r, t | Y_K) := P(r(t) = r | Y_K = Y_K),   r ∈ R^n,

where Y_K denotes the list {y_1, y_2, ..., y_K} of all sensor observations available and typically t ≥ t_K.

Assumption 1.
1. From the sensor point of view, the objects are indistinguishable from each other.
2. The distance between any two objects is larger than twice the sensor range, 2ρ_sensor.

The following is the main result of Section 3.1.2:

Theorem 1. Under Assumption 1, the likelihood function for multiple objects can be written as the multiplication of the single-object likelihood functions.

Additionally, under Assumption 1, the a posteriori probabilistic map for the objects can be written as

  m_r(r, t | Y_K) = ĉ_k ( ∏_{i=1}^{n} f(r_i, t | Y_K) ) p_0(r, t_0, ∅),   r := {r_1, r_2, ..., r_n} ∈ R^n,   (1.2)

where ĉ_k is a normalizing constant, ∅ ∈ Y* denotes the empty sequence (empty observation set) in the a priori joint probability distribution for the objects at time t_0, p_0(r, t_0, ∅), and

  f(x, t | Y_K) := ∏_{j ∈ K} ℓ_1(y_j; x, t),   x ∈ R.

In this work, the function f(· | Y_K), called here the aggregate measurement function, encapsulates the information contained in all the measurements. This is important since, for any region of reasonable size, the posterior probabilistic map m_r(· | Y_K) cannot be represented exhaustively, because such a representation has memory complexity o(N^n), where N is the number of points in R and n is the number of objects mapped. However, one can represent m_r(· | Y_K) using a single function f(· | Y_K), whose exhaustive representation has memory complexity o(N). We will show that the joint probability distribution of n mobile objects that lie in a region partitioned into N cells can be approximately determined from an aggregate measurement function that can be represented with memory complexity o(N), regardless of the number of objects n. This is far more compact than an extensive representation of the joint distribution, whose memory complexity is o(N^n).
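As a concrete illustration of this compactness, the sketch below stores the aggregate measurement function as a single array of length N and folds each measurement in by pointwise multiplication. The single-object likelihood ell_1 is a user-supplied callable standing in for the parametric model of Chapter 3; it is an assumption of this sketch, not the thesis' calibrated sensor model.

    # Minimal sketch: the aggregate measurement function f(. | Y_K) as one
    # length-N array, versus the o(N^n) joint map it summarizes.
    import numpy as np

    def aggregate_measurement(cells, measurements, ell_1):
        """f(x | Y_K) = product over j of ell_1(y_j, x), for every cell x."""
        f = np.ones(len(cells))
        for y in measurements:
            f *= np.array([ell_1(y, x) for x in cells])
        return f

    # Up to normalization and the prior p_0, the joint map of a configuration
    # (r_1, ..., r_n) is then recovered as the product f[r_1] * ... * f[r_n].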
1.3.3 Motion Model

We assume that the motion for t ≥ t_0 is Markovian. The motion model is specified by a known n-object transition probability function of the form

  Φ(r', r; t + dt, t) := P(r(t + dt) = r' | r(t) = r),

r, r' ∈ R^n, dt ≥ 0, t ≥ t_0, with the standing assumption that r(t + dt) is conditionally independent of any random variable at time t or before, given r(t). Section 4.6 considers the following basic assumption:

Assumption 2. The a priori joint probability distribution p_0 and the n-object transition probability function Φ are such that δ(r(t), d_min) > 0 with probability one for all t ≥ t_0, where δ(r, d_min) denotes a function that is zero whenever r := {r_1, r_2, ..., r_n} ∈ R^n has two elements r_i, r_j, i ≠ j, with r_i and r_j closer than d_min.

We assume that the multiple objects essentially move independently of each other, except that we enforce the constraint that two objects will never get closer than d_min from each other. Under Assumption 1 and Assumption 2, the multiple-object transition probability function can be written as the multiplication of the single-object transition probability functions. The following n-object transition probability function captures this:

  Φ(r', r; t + dt, t) = c(r) ( ∏_{k=1}^{n} φ_1(r'_k, r_k; t + dt, t) ) δ(r', d_min),   (1.3)

where r, r' ∈ R^n, φ_1(r'_k, r_k; t + dt, t) is the single-object transition probability function and c(r) is the following normalizing constant:

  c(r)^{-1} = ∑_{r̄ ∈ R^n} ∏_{k=1}^{n} φ_1(r̄_k, r_k; t + dt, t) δ(r̄, d_min).   (1.4)

1.3.4 Operators for Recursive Map Computation

Inspired by the recursive formulas for a Kalman filter, we construct the prediction operator T_pred[t, τ, ·] that takes a probabilistic map at time τ and propagates it forward in time to obtain a map at time t > τ, and the fusion operator T_fuse[y, ·] that takes a map at time t and, given a list of old measurements Y and a new measurement y taken at time t, produces a new map at the same time given both the old and new measurements. With these operators we can compute the probabilistic map m_r(·, t | Y_k), t ≥ t_k, recursively as depicted in Figure 1.2.

Figure 1.2: Recursive computation of m_r(·, t | Y_k), t ≥ t_k, using the prediction and fusion operators.

We say that a probabilistic map m_r is separable if it can be written as

  m_r(r, t | Y) := c ( ∏_{i=1}^{n} f(r_i, t | Y) ) δ(r, d_min),   (1.5)

for all r := {r_1, r_2, ..., r_n} ∈ R^n, where c is a normalizing constant and f is an appropriately defined function.

1.3.4.1 Fusion Operator

Using Bayes' rule and the definition of the sensor likelihood function in (2.1), we obtain

  m_r(r, t_k | Y_k) = c ℓ_n(y_k; r) m_r(r, t_k | Y_{k-1}),   (1.6)

and

  f(r, t_k | Y_k) := ℓ_1(y_k; r) f(r, t_k | Y_{k-1}),   for all r ∈ R,   (1.7)

which defines the fusion operator T_fuse[y_k, ·] and proves that the map given Y_k is still separable, and therefore that T_fuse[y_k, ·] indeed preserves separability. The following was stated:

Theorem 2. The fusion operator that produces the probabilistic map m_r(·, t_k | Y_k) from the measurement y_k := {t_k, a_k, s_k} and the map m_r(·, t_k | Y_{k-1}) is defined by (1.6). Moreover, by Theorem 1, if m_r(·, t_k | Y_{k-1}) is separable, then m_r(·, t_k | Y_k) is also separable, and the aggregate measurement function for the latter can be computed from the one for the former using (1.7).
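In code, the fusion step (1.7) is a single pointwise multiplication of the stored aggregate measurement function. A minimal sketch, assuming the array representation above; ell_1 is again a placeholder likelihood, and the final rescaling is only for numerical stability, since any constant scaling of f is absorbed by the normalizing constant c:

    # Minimal sketch of T_fuse acting on the aggregate measurement function,
    # per equation (1.7): f(x) <- ell_1(y_k, x) * f(x) for every cell x.
    import numpy as np

    def fuse(f, y, cells, ell_1):
        f_new = f * np.array([ell_1(y, x) for x in cells])
        return f_new / f_new.sum()  # rescaling does not change the map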
1.3.4.2 Prediction Operator

By the total probability theorem, we have

  m_r(r', t + dt | Y_k) = ∑_{r ∈ R^n} Φ(r', r; t + dt, t) m_r(r, t | Y_k),   (1.8)

which defines the prediction operator T_pred[t + dt, t, ·] that takes a probabilistic map at time t and propagates it forward in time to obtain a map at time t + dt. We assume that the multiple objects essentially move independently of each other according to a model similar to that in the previous section, except that we enforce the constraint that two objects will never get closer than d_min from each other. This is needed for consistency with Assumption 2. The following n-object transition probability function captures this:

  Φ(r', r; t + dt, t) = c(r) ( ∏_{k=1}^{n} φ_1(r'_k, r_k; t + dt, t) ) δ(r', d_min),   (1.9)

r, r' ∈ R^n, where c(r) is the following normalizing constant:

  c(r)^{-1} = ∑_{r̄ ∈ R^n} ∏_{k=1}^{n} φ_1(r̄_k, r_k; t + dt, t) δ(r̄, d_min).   (1.10)

We assume here that the objects' velocity is bounded and therefore, for small values of dt > 0, φ_1(r, r'; t, t + dt) is nonzero only when r and r' correspond to the same or adjacent cells. Moreover, in the latter case φ_1(r, r'; t, t + dt) is of order dt, i.e., φ_1(r, r'; t, t + dt) ≤ c dt for some finite constant c. This corresponds to the following assumption:

Assumption 3. The n-object transition probability function is given by (1.9), with

  φ_1(r', r; t + dt, t) = 1 − o(dt)  if r = r',
                          o(dt)      if r and r' are adjacent,
                          0          otherwise,   (1.11)

for all r, r' ∈ R, t ≥ t_0, dt ≥ 0.

We show next that under this assumption the probabilistic map is approximately separable at time t + dt, for small dt. Moreover, the error is of order dt^2 for most configurations and of order dt for a small set A_bad ⊂ R^n of "bad" object configurations r for which

1. δ(r, d_min) ≠ 0, i.e., no two objects are closer than d_min from each other; and
2. there exists some configuration r̄ ∈ R^n for which δ(r̄, d_min) = 0 and it is possible to go from configuration r̄ to configuration r by a single object moving to an adjacent cell.

In practice, A_bad denotes the set of configurations for which the objects still satisfy the minimum distance constraint but are very close to violating it.

Theorem 3. The prediction operator that produces the probabilistic map m_r(·, t + dt | Y_k), dt ≥ 0, from the map m_r(·, t | Y_k) is defined by (1.8). Moreover, in case m_r(·, t | Y_k) is separable and Assumption 3 holds, then

  m_r(r', t + dt | Y_k) = c ( ∏_{k=1}^{n} f(r'_k, t + dt | Y_k) ) δ(r', d_min) + ε(r', dt),   for all r' ∈ R^n,

where ε(r', dt) is of order dt for r' ∈ A_bad and of order dt^2 otherwise, and f(·, t + dt | Y_k) is defined as

  f(r', t + dt | Y_k) = ∑_{r ∈ R} φ_1(r', r; t + dt, t) f(r, t | Y_k),   for all r' ∈ R.   (1.12)

In practice, for small dt we can approximate

  m_r(r', t + dt | Y_k) ≈ c ( ∏_{k=1}^{n} f(r'_k, t + dt | Y_k) ) δ(r', d_min).   (1.13)

Moreover, if one wants to compute m_r(r', t' | Y_k) for some t' > t for which t' − t is not small, one can divide the interval [t, t'] into N subintervals of length dt := (t' − t)/N and apply (1.13) N times to propagate the map from t to t'. Outside A_bad, each of the N steps introduces an error of order (t' − t)^2/N^2, which adds up to a total worst-case error of order (t' − t)^2/N. Therefore the error can be made small by making N large, which means that the map at time t' is still approximately separable. For the configurations in A_bad, in general it will not be possible to make the error arbitrarily small. However, in practice A_bad is a small subset of R^n.
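The corresponding prediction step (1.12) is a local averaging of f driven by the single-object transition probabilities. The sketch below assumes, purely for illustration, a rectangular grid and a nearest-neighbor random walk with per-step move probability p_dt as the stand-in for φ_1; boundary cells simply lose the mass that would leave the grid in this simplified version.

    # Minimal sketch of T_pred on a 2-D grid, per equation (1.12): each cell
    # keeps 1 - 4*p_dt of its mass and receives p_dt from each of its four
    # neighbors (p_dt plays the role of the o(dt) term in Assumption 3).
    import numpy as np

    def predict(f, p_dt):
        out = (1.0 - 4.0 * p_dt) * f
        out[1:, :]  += p_dt * f[:-1, :]   # mass arriving from one row up
        out[:-1, :] += p_dt * f[1:, :]    # mass arriving from one row down
        out[:, 1:]  += p_dt * f[:, :-1]   # mass arriving from the left
        out[:, :-1] += p_dt * f[:, 1:]    # mass arriving from the right
        return out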
1.3.5 Iterative Computation of Dynamic Probabilistic Maps

Sensors send all new information to a map computation module, so that the algorithm can fuse new data to construct the most current version of the map. Sensor nodes are connected to the map computation node in a network setting. In general, networks have delays between nodes. This may cause some of the data packages not to arrive ordered by their time stamps. Depending on the application, there are two ways to compute the a posteriori probabilistic map for moving objects recursively:

Sequential Case. In the more regular case, we follow the steps in order. That is, fusion and prediction come one after the other. With these operators we can compute the probabilistic map m_r(·, t | Y_k), t ≥ t_k, recursively. The sequential case can follow the order of operations shown in Figure 1.2.

Out-of-Sequence Case. This is the case where we deal with information arriving out of sequence. That is, newly arriving information might have an old time stamp. We need to store old probabilistic maps to take care of this case. This case may further be handled by two algorithms: fusion-on-arrival and delayed-fusion. The out-of-sequence case will be explained in detail by an example scenario that is also presented in Figure 4.3. Using the setting of this scenario, the fusion-on-arrival algorithm is presented first. In this case, we fuse newly arrived out-of-sequence information as soon as it arrives. The advantage of this algorithm is that the most current map is always available. On the other hand, we may waste computation by trying to fuse every piece of information right after it arrives; the most current map may not be needed as frequently as new information arrives. Considering such cases, the delayed-fusion algorithm is considered next. Here, a pre-set fusion delay is inserted into the system: we do not fuse newly arrived out-of-sequence information at the time it arrives. Rather, we delay the fusion for the pre-set amount of time and then fuse all information that arrives during that delay, in sequential order. Computation is not wasted in this algorithm, but we may not always have the most current version of the probabilistic map. A small sketch of the delayed-fusion loop is given below.
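The following sketch illustrates the delayed-fusion idea: observations are buffered, sorted by time stamp once the pre-set delay expires, and then applied in order with the prediction and fusion operators. The callables predict_to and fuse stand for the operators sketched in Section 1.3.4 and are assumptions of this illustration; it also assumes every buffered time stamp is at or after the current map time, since fusing data older than the map additionally requires the stored old maps mentioned above.

    # Minimal sketch of the delayed-fusion algorithm: replay a buffer of
    # possibly out-of-order observations in time-stamp order.
    from collections import namedtuple

    Obs = namedtuple("Obs", "t a s")  # the (t_k, a_k, s_k) observation triple

    def delayed_fusion(f, t_map, buffered, predict_to, fuse):
        for y in sorted(buffered, key=lambda o: o.t):
            f = predict_to(f, t_map, y.t)  # roll the map forward to time t_k
            f = fuse(f, y)                 # then fuse the observation
            t_map = y.t
        return f, t_map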
1.3.6 Applications of Probabilistic Maps

For the problem at hand, probabilistic maps of the locations of objects are constantly computed by fusing information collected from all the sensors in the region. The objects that we map may induce danger. The probabilistic maps may therefore be used for intelligence updates and for minimum-risk route planning. They make use of the map-building facility to compute paths for autonomous mobile robots that minimize the risk of being in the danger zone. The autonomous mobile robots could be air or ground vehicles depending on the application. In some cases, paths can be calculated in an open-loop fashion. That is, we calculate a minimum-risk path from a source point to a destination point and let the autonomous mobile robots go to the destination point following this path. Using the probabilistic maps introduced in the previous section, it is possible to assess the risk of a specific run from the source to the destination point, and even to do minimum-risk path planning. This can be done without ever explicitly computing the high-dimensional probabilistic map, working simply with aggregate measurement functions.

To determine the probability of success, one needs to compute a danger map. This map assigns to each point in the region the probability that an autonomous mobile robot going over that location will be destroyed. The probability of success of a particular run turns out to be linear in the values of a simple transformation of the danger map, and it is therefore computationally amenable to minimum-risk path planning. We study the case of n static objects and a single robot in this section. The conditional a posteriori probability p_survive that a robot will survive a run through the cells x_1, ..., x_m, given a list of past measurements Y ∈ Y*, can then be computed as

  p_survive = ∏_{j=1}^{m} (1 − m_danger(x_j | Y)),

where one needs to compute the a posteriori danger map. This is defined by

  m_danger(a | Y) := P(K | Y_k = Y, a_k = a),

where Y denotes a list of k observations and K the event that a robot positioned at a with an object within its range will be destroyed as a result. A lower bound on m_danger(a | Y) can be computed by allowing "double-counting" when the robot is within range of multiple objects. When this is done we obtain a "worst-case" bound. This "worst-case" bound is a function of a new probabilistic map, the a posteriori occupancy map for the objects, defined by

  m_o(x | Y) := P(∃i : r_i = x | Y_k = Y),   x ∈ R,

where Y ∈ Y* denotes a list of k observations and r_i the positions of the objects. We actually have equality when the robot has at most one object within its range. This can be known to occur with probability one, e.g., when the prior probability places the objects far apart.
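Numerically, the survival probability of a candidate run is just the product above evaluated along the path cells. A minimal sketch, with hypothetical danger values:

    # Minimal sketch: p_survive along a path, from a danger map over cells.
    def p_survive(path, m_danger):
        p = 1.0
        for x in path:
            p *= 1.0 - m_danger[x]
        return p

    danger = {(0, 0): 0.0, (0, 1): 0.05, (1, 1): 0.2}   # hypothetical values
    print(p_survive([(0, 0), (0, 1), (1, 1)], danger))  # 0.95 * 0.8 = 0.76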
Another use is in the area of pursuit-evasion problems. The pursuit-evasion problem in a region R ⊂ R^2 is to capture a mobile object (called a fugitive) occupying an unknown location in R by a team of pursuers (called searchers). The fugitive is said to be captured if one of the searchers occupies the same position as the fugitive. Unlike minimum-risk path planning applications, we do not want to avoid the objects that we map, but to pursue them. Depending on the particular context, capture may actually mean handling the evader in some particular way, e.g. rescuing it. We do not take into account physical properties of a pursuer/evader, such as shape and size. Therefore a pursuer/evader is represented simply by its position in the region R ⊂ R^2 in which the pursuit takes place. Hence, we assume that a pursuer and an evader can be located at the same position.

Possible scenarios for search include search and rescue operations, localizing and retrieving parts in a warehouse, localizing and neutralizing environmental threats (e.g. forest fires or the development of algae in the ocean), and search and capture missions (e.g. law enforcement). In some of these problems the objects move more or less randomly (e.g. search and rescue), whereas in others they actively avoid detection (e.g. search and capture). The objects we pursue might in general be hidden, moving, or actively avoiding capture. Our solution to the pursuit-evasion problem on R is an algorithm which moves the searchers in R so that the fugitive is eventually captured no matter how the fugitive moves. This is discussed in Chapter 7.

1.3.7 Occupancy maps for objects

The goal of this section is to compute the a posteriori occupancy maps for the objects. The a posteriori occupancy map for the objects is simply the probability of having an object of interest at cell x ∈ R given all information up to that time. It is defined by

  m_o(x | Y) := P(∃i : r_i = x | Y_k = Y),   x ∈ R,

where Y ∈ Y* denotes a list of k observations. From the definition of the probabilistic map m_r(· | Y), the following lemma can be adapted to the setting of this work:

Lemma 1. The occupancy map at time t given Y_k is given by

  m_o(x | Y) = ∑_{r ∈ R^n : ∃i : r_i = x} m_r(r | Y).

Moreover, in case Assumption 1 holds, then

  m_o(x | Y) ≈ c f(x | Y) ( n f_0^{n−1}(Y) + ∑_{k=1}^{n−1} (n! f_0^{n−k−1}(Y) / (n−1−k)!) ∑_{σ ∈ S^(k)} f_{σ_1}(Y) f_{σ_2}(Y) ··· f_{σ_k}(Y) ),

where f_i(Y) := ∑_{r ∈ R_i} f(r | Y), i ∈ {0, 1, ..., m}, with the region R_i defined as above, and c is a normalizing constant chosen so that ∑_{x ∈ R} m_o(x | Y) = n.

Lemma 1 allows us to compute the a posteriori occupancy map m_o(· | Y) directly from the aggregate measurement function f(· | Y), without explicitly computing the a posteriori probabilistic map m_r(· | Y).

1.3.8 Optimal Path

Situations with uncertain or unknown environments are typical in the navigation of autonomous vehicles. Work on path generation under uncertainty comes primarily from studies on the navigation of an autonomous vehicle moving along a two-dimensional surface [2], [42], [32] and from studies on local control of robot arms [28]. In these works, various heuristics are used for path generation in the limited area around the vehicle for which information is available. Within this limited area, the problem is actually treated as one with full information. On the other hand, [27] studies the problem of dynamic path planning for a mobile robot with limited information. The algorithms presented there are nonheuristic and run continuously based on the mobile robot's current position and its sensor feedback.

In this work, the probabilistic maps are used for intelligence updates and for minimum-risk route planning. They make use of the map-building facility to compute paths for the autonomous mobile robots that minimize the risk of being in the danger zone. In some cases, paths can be calculated in an open-loop fashion. That is, we calculate a minimum-risk path from a source point to a destination point and let the autonomous mobile robots go to the destination point following this path. We call such a case Static Optimization. Paths can also be recomputed in a closed-loop fashion, meaning the path is recalculated while the robots are on their way to the destination following the previous path, to divert them from new danger zones. This is called Online Optimization. We can formulate the problem as

  min_{e : e_1 = e_init, e_n = e_final, ∑_i J_2(e_i) ≤ J_2^max} ∑_i J_1(e_i),   i ∈ {1, ..., n},

where e = (e_1, ..., e_n) is the set of edges defining the path. In our case, the vertices are the cells in a region R ⊂ R^2 and the edges are the paths that connect two cells: (v_1, v_2) ∈ E if and only if a robot can go from v_1 to v_2. Let us denote the "time" cost as

  J_1(e) = √2  if e is a diagonal edge in the map,
           1   if e is a vertical or horizontal edge in the map,

and the "risk" cost as

  J_2(e) = (v_i + v_j)/2,   e = (v_i, v_j),

where v_i and v_j represent the probability of danger associated with those vertices. Alternatively, we may choose to use the probability of survival, p_survive, associated with those vertices, in which case it becomes a maximization problem. This case can still be presented as a minimization problem by representing the danger map as a function of the survival map. Because of the way we model the latter cost in this problem, we do not consider half the cost of the first and last vertices of the path in the optimization problem. Yet this does not affect the solution, since the first and last cells of the path are fixed for every possible path. The above problem is solved using the work done in [38]; a generic sketch of such a constrained search is given below.
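One generic way to search for such a path is a label-correcting search over (J_1, J_2) labels with dominance pruning, minimizing total time subject to the accumulated risk budget. The sketch below is a stand-in written for illustration, not a reproduction of the algorithm of [38]; the callbacks neighbors, j1 and j2 are assumed to encode the grid graph and the edge costs defined above.

    # Minimal sketch: minimize sum J1 subject to sum J2 <= j2_max, via a
    # label-correcting search with dominance pruning on (J1, J2) labels.
    import heapq

    def constrained_path_cost(neighbors, j1, j2, start, goal, j2_max):
        labels = {start: [(0.0, 0.0)]}  # nondominated labels per vertex
        heap = [(0.0, 0.0, start)]
        while heap:
            c1, c2, v = heapq.heappop(heap)
            if v == goal:
                return c1               # popped in J1 order, so optimal
            for w in neighbors(v):
                n1, n2 = c1 + j1(v, w), c2 + j2(v, w)
                if n2 > j2_max:
                    continue            # violates the risk budget
                lab = labels.setdefault(w, [])
                if any(a1 <= n1 and a2 <= n2 for a1, a2 in lab):
                    continue            # dominated by an existing label
                lab[:] = [l for l in lab if not (n1 <= l[0] and n2 <= l[1])]
                lab.append((n1, n2))
                heapq.heappush(heap, (n1, n2, w))
        return float("inf")             # no feasible path within the budget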
1.3.9 Simulation Results

The algorithms presented above were implemented in a simulator of military air operations developed by the Honeywell Technology Center.

Chapter 2

Concept and Uses of Map Building

This work is used, in general, for military purposes, physically motivated by military air operations. The future of air operations is evolving towards a scenario where the battlefield is populated with many unmanned air vehicles equipped with a large number of cheap sensors, whose primary mission is to gather information about enemy assets and possibly execute simple missions in an autonomous fashion. In this scenario, the fusion of information coming from a multitude of sensors into forms suitable for mission planning is a major challenge, because the data gathered is likely to be noisy. We develop probabilistic formalisms to process the information coming from aircraft sensors capable of detecting electromagnetic radiation originating from radar, making it suitable for the planning of air operations. The need for a probabilistic framework stems from the fact that the sensors producing the measurements are not perfect and are susceptible to providing erroneous information. Because of this, any inferences made from this data can never be assumed 100% correct.

In usual maps of a particular region, one finds markings that represent locations of objects (e.g., radar) within the region. By placing one of these markings at a particular location on the map, one is expressing the fact that the object represented can be found at that particular location with probability one. However, when the information used to build a map (or more precisely to locate interesting objects in it) is not perfect, it is not possible to mark a specific position on the map as the location of the object. In this case, the best one can do is to superimpose the probability density of the location of the object in question on the whole map. It is this probability density that we call the probabilistic map for the object. From a formal point of view, probabilistic maps are actually probability densities of the positions of objects, conditioned on the sensor measurements available. Note that these probabilistic maps do not represent the topology of the geographic region, but are often related to it; e.g., the probabilistic map of a radar will most likely exhibit zero probability over areas covered by a river, because it is unlikely to find a radar over water. The main difficulty in computing probabilistic maps is their size. When one is mapping multiple objects whose positions are not independent of each other, and therefore one needs to keep track of the joint probability density of all the objects, the probabilistic map (i.e., the joint probability density) easily becomes very large. This is because, in principle, the memory requirements to represent a joint probability density grow exponentially with the number of random variables. It should be noted that (conditional) independence of the positions of several radars is not expected, because the probability that all the radars are very close to each other is often much smaller (or at least different) than the probability that the radars are spread out to cover a certain perimeter. The probabilistic maps discussed in this section are of two types, i.e. Bayesian Filter and Particle Filter maps. Before continuing on with their definitions and uses, let us discuss the sensor model that is used for both approaches.
2.1 Discussion on Sensor Model

Our knowledge about the position of the objects improves with time through observations collected by the sensors over R. These observations are denoted by y_k, k ∈ K := {1, 2, ..., k̄}, and each y_k consists of a triple of the form

  {t_k, a_k, s_k},

where t_k ≥ 0 denotes the time at which the observation was made and a_k ∈ R the position of the sensor. We model the imperfection of the sensors in such a way that the variable s_k can either take the value s_k = ∅ if nothing was detected at that time, or some value s_k ∈ R if an object positioned at s_k is detected. However, we assume that the sensors used to detect objects are not perfect, and therefore s_k = ∅ simply means that the sensors were not able to detect an object; one should consider that this may be a "false negative." Similarly, s_k ∈ R may be caused by a "false positive," or there may be a target close to s_k but not exactly at that location. The set of all possible observation triples is denoted by Y. Sensors of this type can be modeled by a likelihood function of the form

  ℓ_n(y; r, t) := P(y_k = y | r(t) = r),   k ∈ K, r(t) ∈ R^n, y ∈ Y,   (2.1)

where t_k = t. We refer to ℓ_n as the n-object sensor likelihood function. Here, we assume that y_k is conditionally independent of any other measurements, given a particular configuration r(t) = r of the objects. This implicitly assumes that sensor errors occur independently of each other. We will present the definitions of the two types of probabilistic maps in the next section.
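The sketch below gives one possible shape for a single-object likelihood of this type, distinguishing detections, misses and false alarms. The detection probability p_d, false-alarm probability p_fa and range rho_sensor are illustrative parameters of this sketch, not the calibrated parametric model developed in Chapter 3.

    # Minimal sketch of a single-object likelihood ell_1(y, r) with false
    # negatives (missed detections) and false positives (false alarms).
    def dist(u, v):
        return ((u[0] - v[0]) ** 2 + (u[1] - v[1]) ** 2) ** 0.5

    def ell_1(y, r, p_d=0.8, p_fa=0.05, rho_sensor=1.0):
        t, a, s = y                       # time, sensor position, detection
        in_range = dist(a, r) <= rho_sensor
        if s is None:                     # nothing detected (s_k = empty)
            return (1.0 - p_d) if in_range else (1.0 - p_fa)
        if in_range and dist(s, r) <= rho_sensor:
            return p_d                    # detection near the true position
        return p_fa                       # otherwise treat as a false alarm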
The observations $\mathbf{y}_k := \{t_k, a_k, \mathbf{s}_k\}$, $k \in \mathcal{K} := \{0, 1, 2, \ldots, K\}$, are conditionally independent of any other measurements given a particular configuration $r$ for the objects of interest. In short, the model is a hidden Markov model (HMM) described by

$P(\mathbf{r}(0))$ and $P(\mathbf{r}(k) \mid \mathbf{r}(k-1)),$  (2.3)
$P(\mathbf{y}_k \mid \mathbf{r}(k)), \quad k \ge 0.$  (2.4)

The first step in analyzing the data is to perform a complete forward sweep of a Monte Carlo filtering algorithm to produce weighted particles $\{r(k)^i, w(k)^i\}$, $k = 1, 2, \ldots, K$, $i = 1, 2, \ldots, M$, drawn approximately according to the importance function. A simple choice is to take the prior distribution of our hidden Markov model as the importance function,

$\pi(r(k) \mid R_{k-1}, Y_k) = P(r(k) \mid r(k-1)),$  (2.5)

and

$w(k)^i = w(k-1)^i\, P(y_k \mid r(k)^i).$  (2.6)

Given $Y_K$, we want to estimate $P(\mathbf{r}(k) \mid Y_K)$ for any $k = 0, \ldots, K$. At time $K$, the filtering algorithm yields the following approximation of $P(R_k \mid Y_k)$:

$\hat{P}(R_k \mid Y_k) = \sum_{i=1}^{M} \tilde{w}(k)^i\, \delta(R_k^i, R_k).$  (2.7)

2.3 Example Work

Here we describe the setups for the work supported by the DARPA JFACC (Joint Forces Air Component Commander) and MICA (Mixed Initiative Control of Automa-teams) programs through SPAWAR Systems.

Setup for JFACC Program  The Joint Forces Air Component Commander (JFACC) in military air operations controls the allocation of resources (wings, squadrons, air defense systems, AWACS) to different geographical locations in the theater of operations. The JFACC mission is to define a sequence of tasks for the aerospace systems at each location, and to provide feedback control for the execution of these tasks in the presence of uncertainties and a hostile enemy. In a joint effort with Honeywell International for SPAWAR Systems Center, the overall solution includes a tasking controller and a probabilistic threat/survival map indicating high-threat regions for aircraft and suggesting optimal routes to avoid these regions.

A simulation/modeling environment using object-oriented methodologies has been developed to serve as an aid in demonstrating the value of the solution. The simulation/modeling environment is based on an open architecture that enables the integration, evaluation, and implementation of different control approaches. The simulation offers a graphical user interface displaying the battlefield, the control performance, and a probability map displaying high-threat regions. Figure 2.1 illustrates the overall control components and their integration with the simulator. It shows a distribution of the three main control levels of the campaign management system: the Operational Level, the Task Group Level, and the Task Level. The campaign manager at the operational level is responsible for generating the campaign steps, which include task groups, deadlines, and desired probabilities of success (all models are probabilistic, so an objective needs an associated probability to be meaningful). Each task group is assigned to several task controllers. It is the task controller's responsibility to distribute the group's resources among the different tasks. Model predictive control is done at the task level and the task group level: MPC task controllers take a task and determine the aircraft deployments needed to execute it.

Figure 2.1: Overall system architecture

Probabilistic maps of the locations of enemy threats (SAM sites) are constantly computed by fusing information collected from all the aircraft in the battlefield. The probabilistic maps are used for intelligence updates and for minimum-risk route planning.
The map-building facility is used to compute paths for the aircraft that minimize the risk of being attacked by enemy SAM sites. Paths can also be recomputed after a mission has started to divert the aircraft from pop-up threats. The intelligence information is passed on to the model predictive control.

To evaluate these control strategies and integrate the probabilistic threat models, a simulation environment has been developed based on object-oriented methodologies that offers an open architecture and a GUI. The simulator environment simplifies the integration and evaluation of the multiple control approaches described above. It also offers a graphical user interface displaying the battlefield, the control performance, and a probability map displaying high-threat regions.

Setup for MICA (Mixed Initiative Control of Automa-teams) Program  MICA develops sensor and information-system technology and systems with application to battle-space awareness, targeting, command and control, and the supporting infrastructure required to address land-based threats in a dynamic, closed-loop process. MICA is about controlling the cooperative behavior of unmanned vehicles operating as teams to prosecute multi-faceted tasks. The objective is to develop intelligent autonomy algorithms, software, and operator-interface proofs-of-concept for hierarchical battle management and distributed mission control of semi-autonomous systems in an uncertain, adversarial environment. The general approach is closed-loop dynamic resource and task allocation, scheduling, and multi-asset route planning. The main goal is to reduce the mission commander and operator footprint. Another goal is to speed the sensor-to-shooter cycle in the presence of time-sensitive targets: cooperatively coupling sensing and strike, enabling self-reorganization of teams and replenishment, and allowing event-driven dynamic planning and replanning.

The proposed solution, in a joint effort with Honeywell International for SPAWAR Systems Center, is called the Control System for Teams of Autonomous Agents (CRYSTAL). The CRYSTAL architecture is intended to apply to mixed-initiative control of unmanned air or ground vehicles, although in certain details the design is specific to the UCAV-based scenarios defined by the MICA program. Figure 2.2 summarizes the internal architecture of our work, the Cooperative Path Planning Layer (CPPL), and its interaction with the other layers of the CRYSTAL architecture. The CRYSTAL executive's responsibility is to manage the inner-loop control of the UAVs in the MICA operations. The executive is a hierarchical controller, the top level of which is responsible for controlling the package as a whole and conforming to the instructions in the Task Brief received from the Team Composition and Tasking Layer (TCTL). The MICA program has an official Open Experimental Platform (OEP). The MICA OEP is based on the Boeing C4Isim simulation, which models the collection, processing, and dissemination of battlefield information.

Figure 2.2: Control System for Teams of Autonomous Agents (CRYSTAL) architecture

The CPPL is composed of two components: Map Building and Path Planning. Let us first start with map building. CPPL uses a probabilistic framework to combine data arising from Intelligence Preparation of the Battlefield (IPB) with real-time measurements collected by UAVs and wide-area surveillance aircraft. To assess the risk posed by SAMs to aircraft, it computes several probabilistic maps that locate the threats spatially.
The algorithms used are very efficient and scale well with the number of objects (SAMs) being mapped. In particular, their memory complexity grows only linearly with the area being mapped. The proposed algorithm to compute probabilistic maps is presented in [18]. Based on the threat locations provided by the probabilistic maps, CPPL computes the risk they induce for airborne UAVs flying over the region of interest. By risk we mean the probability that a SAM will be able to lock on and kill the UAV. The risk computation takes into account:

1. the position of the UAV with respect to the SAM(s);

2. its orientation with respect to the track radar and its 3-dimensional signature characteristics;

3. the protection offered by jamming capabilities on the UAV being targeted or on other UAVs in the group.

The risk computation can be found in [16]. Let us now continue with the Path Planning component of our work. CPPL uses the threat information provided by the map-building module to compute minimum-risk paths for groups of UAVs carrying out missions (pro-active path planning). The computation of paths for groups of Unmanned Air Vehicles (UAVs) that minimize the risk of being destroyed by ground defenses can be formulated as a shortest-path optimization. We show in [18] that the proposed algorithm can efficiently produce low-risk paths. CPPL offers several functionalities to the higher-level layers:

1. computation of point-to-point paths, in which a higher-level layer requests a minimum-risk path between two given points;

2. computation of point-to-area paths, in which a higher-level layer requests a path from a given point to anywhere within a given area;

3. computation of point-to-areas-to-point paths, in which a higher-level layer requests a path between two given points, passing through a given sequence of areas;

4. computation of safe-harbor loitering areas along a given path; these are typically used to synchronize groups of UAVs so that they arrive at specific locations at the same time.

The CPPL also provides spatial and temporal profiles for real-time UAV evasion in response to radar locks (reactive path planning) [23].

2.4 Discussion on How to Use Probabilistic Maps

We present two uses of probabilistic maps, namely Minimum-Risk Path Planning and Pursuit-Evasion.

Minimum-Risk Path Planning  Using the probabilistic maps, it is possible to assess the risk of specific missions in air operations and even do minimum-risk path planning. As shown in [16], this can be done without ever explicitly computing the high-dimensional probabilistic map, working simply with aggregate measurement functions. To determine the probability of success of a mission, one needs to compute a danger or kill-map. This map assigns to each point in the region the probability that an aircraft flying over that location will be destroyed. The probability of success of a particular flight mission turns out to be linear in the values of a simple transformation of the kill-map and is therefore computationally amenable to minimum-risk path planning. A danger map is shown in Figure 2.3, where darkness indicates danger; a minimum-risk path is also plotted in the figure.

Figure 2.3: Minimum-risk path

In the simulation, aircraft follow minimum-risk paths, thereby actively avoiding high-threat regions. The threat information associated with the minimum-risk path is also provided to the model predictive control at the task level. This information allows the controller to adjust the number of assets needed to successfully complete future battle rounds.
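The shortest-path formulation is easy to make concrete. The sketch below is a minimal illustration, not the algorithm of [18]: it assumes independent per-cell kill probabilities on a 4-connected grid, so that minimizing the additive edge costs $-\log(1 - p_{\text{kill}})$ maximizes the survival probability of the whole path (this is the "simple transformation of the kill-map" that makes the objective linear). The grid size, threat values, and function names are illustrative.

# A minimal sketch of minimum-risk path planning on a kill-map, assuming
# independent per-cell survival probabilities; all values are illustrative.
import heapq
import math

import numpy as np

def min_risk_path(kill_map, start, goal):
    """Dijkstra over edge costs -log(1 - p_kill): the additive shortest-path
    cost then corresponds to the multiplicative survival probability."""
    rows, cols = kill_map.shape
    cost = -np.log(np.clip(1.0 - kill_map, 1e-12, 1.0))   # additive transform
    dist, parent = {start: 0.0}, {}
    queue = [(0.0, start)]
    while queue:
        d, cell = heapq.heappop(queue)
        if cell == goal:
            break
        if d > dist.get(cell, math.inf):
            continue                                       # stale queue entry
        r, c = cell
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + cost[nr, nc]
                if nd < dist.get((nr, nc), math.inf):
                    dist[(nr, nc)] = nd
                    parent[(nr, nc)] = cell
                    heapq.heappush(queue, (nd, (nr, nc)))
    path, cell = [goal], goal                              # walk parents back
    while cell != start:
        cell = parent[cell]
        path.append(cell)
    return path[::-1], math.exp(-dist[goal])

kill_map = np.zeros((20, 20))
kill_map[5:15, 8:12] = 0.3            # a high-threat region to route around
path, p_survive = min_risk_path(kill_map, (0, 0), (19, 19))
print(len(path), round(p_survive, 3))

The returned path detours around the threat block whenever the extra length costs less, in log-survival terms, than crossing it.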
Pursuit-Evasion  The pursuit-evasion problem in a region $\mathcal{R} \subset \mathbb{R}^2$ is to capture a mobile object (called a fugitive) occupying an unknown location in $\mathcal{R}$ with a team of pursuers (called searchers). The fugitive is said to be captured when one of the searchers occupies the same position as the fugitive. Unlike in minimum-risk path planning applications, we do not want to avoid the objects that we map, but to pursue them. Depending on the particular context, capture may actually mean handling the evader in some particular way, e.g., rescuing it. We do not take into account physical properties of a pursuer/evader such as shape and size; a pursuer/evader is therefore represented simply by its position in the region $\mathcal{R} \subset \mathbb{R}^2$ in which the pursuit takes place. Hence, we assume that a pursuer and an evader can be located at the same position. For simplicity we assume that both space and time are quantized: the region $\mathcal{R}$ in which the pursuit takes place is regarded as a finite collection of $N$ points, each representing a cell in $\mathbb{R}^2$, and all events take place on a set of discrete times $\mathcal{T}$. Here, the events include the motion of, and the collection of sensor data by, the pursuers/evaders. For simplicity of notation we take equally spaced event times; in particular, $\mathcal{T} := \{1, 2, \ldots\}$. Note, though, that we may not have sensor readings for every event time.

For each time $t \in \mathcal{T}$ at which we receive a measurement, we denote by $\mathbf{y}(t)$ the set of all measurements taken by the pursuers at time $t$. Every $\mathbf{y}(t)$ is assumed to be a random variable with values in a measurement space $\mathcal{Y}$. At each time $t \in \mathcal{T}$ it is possible to execute a control action $\mathbf{u}(t)$ that, in general, will affect the pursuers' sensing capabilities at times $\tau \ge t$. Each control action $\mathbf{u}(t)$ is a function of the measurements before time $t$ and should therefore be regarded as a random variable taking values in a control action space $\mathcal{U}$. We assume that the motion of each pursuer is constrained so that, in a single time step, it can only move to cells close to its present position. Readers are referred to Chapter 7 for details. The solution to the pursuit-evasion problem on $\mathcal{R}$ is an algorithm that moves the searchers in $\mathcal{R}$ so that the fugitive is eventually captured no matter how it moves. Two snapshots from the simulation of our pursuit policy with search paths, at $t = 9$ and $t = 11$, are shown in Figure 2.4.

Figure 2.4: Pursuit using the pursuit policy with search paths. The searchers are represented by blue circles, the fugitive by a red circle. The background color of each cell $r \in \mathcal{R}$ encodes the probability of having a fugitive at $r$, with a light color for low probability and a dark color for high probability. The search paths are represented by green stars and the search width by yellow stars. This is the Bayesian Filter Maps case.
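For illustration only, here is a minimal greedy pursuit step driven by a gridded fugitive map. This is not the pursuit policy with search paths used in the simulation (see Chapter 7): the uniform prior, the move-toward-the-peak rule, and the five-point diffusion used for the fugitive's motion are all illustrative assumptions, but the sketch shows the basic loop of searching, conditioning the map on "not captured yet," and propagating it.

# A minimal sketch of one greedy pursuit step on a fugitive probability map.
import numpy as np

def pursuit_step(fugitive_map, searchers):
    peak = np.unravel_index(np.argmax(fugitive_map), fugitive_map.shape)
    new_positions = []
    for (r, c) in searchers:
        r += int(np.sign(peak[0] - r))       # move one cell toward the peak
        c += int(np.sign(peak[1] - c))
        new_positions.append((r, c))
        fugitive_map[r, c] = 0.0             # searched cell: fugitive not there
    fugitive_map /= fugitive_map.sum()       # condition on "not captured yet"
    # Diffuse mass to adjacent cells to model one step of fugitive motion.
    padded = np.pad(fugitive_map, 1)
    fugitive_map = (padded[:-2, 1:-1] + padded[2:, 1:-1] +
                    padded[1:-1, :-2] + padded[1:-1, 2:] +
                    padded[1:-1, 1:-1]) / 5.0
    return fugitive_map / fugitive_map.sum(), new_positions

fmap = np.full((10, 10), 1.0 / 100)          # uniform prior over the region
searchers = [(0, 0), (9, 9)]
for _ in range(5):
    fmap, searchers = pursuit_step(fmap, searchers)
print(searchers, float(fmap.max()))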
Chapter 3

Sensor and Motion Model for Probabilistic Maps

Suppose that there exist $n$ objects in a region $\mathcal{R} \subset \mathbb{R}^2$. For now we assume that the positions of these objects are not accurately known, and that the objects may move in time. For simplicity we take $\mathcal{R}$ to be a finite collection of $N$ points, each representing a cell in $\mathbb{R}^2$. We model the location of each object as a stochastic process¹ taking values in $\mathcal{R}$. We denote by $\mathbf{r}(t) := \{\mathbf{r}_1(t), \mathbf{r}_2(t), \ldots, \mathbf{r}_n(t)\} \in \mathcal{R}^n$, $t \ge 0$, the locations of the $n$ objects, where $\mathbf{r}_i(t) \in \mathcal{R}$ denotes the position of the $i$th object at time $t$. The "distance" between two objects is defined by a given metric $d : \mathcal{R} \times \mathcal{R} \to [0, \infty)$ on $\mathcal{R}$. We denote by $\delta(r, d_{\min})$ a function that is zero whenever $r := \{r_1, r_2, \ldots, r_n\} \in \mathcal{R}^n$ has two elements $r_i$, $r_j$, $i \ne j$, with $d(r_i, r_j) < d_{\min}$, and one otherwise.

¹ In this report we denote random processes by bold symbols and their realizations by non-bold ones.

Our knowledge about the positions of the objects improves with time through observations collected by the sensors over $\mathcal{R}$. These observations are denoted by $\mathbf{y}_k$, $k \in \mathcal{K} := \{1, 2, \ldots, \bar{k}\}$, and each $\mathbf{y}_k$ consists of a triple of the form $\{t_k, a_k, \mathbf{s}_k\}$, where $t_k \ge 0$ denotes the time at which the observation was made and $a_k \in \mathcal{R}$ the position of the sensor. The variable $\mathbf{s}_k$ can either take the value $\mathbf{s}_k = \emptyset$ if nothing was detected at that time, or some value $\mathbf{s}_k \in \mathcal{R}$ if an object placed at $\mathbf{s}_k$ is detected. However, we presume that the sensors used to find objects are not perfect, and therefore $\mathbf{s}_k = \emptyset$ simply means that the sensors were not able to pinpoint an object, and one should consider that this may be a "false negative." Similarly, $\mathbf{s}_k \in \mathcal{R}$ may be caused by a "false positive," or there may be an object close to $\mathbf{s}_k$ but not exactly at that point. The set of all possible observation triples is denoted by $\mathcal{Y}$. Sensors of this type can be modeled by a likelihood function of the form

$\ell_n(y; r, t) := P(\mathbf{y}_k = y \mid \mathbf{r}(t) = r), \quad k \in \mathcal{K},\ r \in \mathcal{R}^n,\ y \in \mathcal{Y},$  (3.1)

where $t_k = t$. We refer to $\ell_n$ as the $n$-object sensor likelihood function. Here we assume that $\mathbf{y}_k$ is conditionally independent of any other measurements, given a particular configuration $\mathbf{r}(t) = r$ for the targets. This implicitly assumes that sensor errors occur independently of each other.

This section aims to compute the a posteriori probabilistic map for the objects, defined by

$m_{\mathbf{r}}(r, t \mid Y_K) := P(\mathbf{r}(t) = r \mid \mathbf{Y}_K = Y_K), \quad r \in \mathcal{R}^n,$

where $Y_K$ is the list $\{y_1, y_2, \ldots, y_K\}$ of all sensor observations available and typically $t \ge t_K$. Since we will compute $m_{\mathbf{r}}$ recursively, it will be convenient to build probabilistic maps with just a subset of the observations available. With a slight abuse of notation we use the same symbol $m_{\mathbf{r}}$ for the $k$-observations probabilistic map for the objects, $k \in \mathcal{K}$, defined by

$m_{\mathbf{r}}(r, t \mid Y_k) := P(\mathbf{r}(t) = r \mid \mathbf{Y}_k = Y_k), \quad r \in \mathcal{R}^n,$

where $Y_k$ denotes a list $\{y_1, y_2, \ldots, y_k\}$ with the first $k$ sensor observations. In the sequel we denote by $\mathcal{Y}^*$ the set of all possible lists of measurements of any length.

We assume that an a priori joint probability distribution for the objects at time $t_0$,

$p_0(r; t_0, \emptyset) := P(\mathbf{r}_1(t_0) = r_1, \mathbf{r}_2(t_0) = r_2, \ldots, \mathbf{r}_n(t_0) = r_n), \quad r := \{r_1, r_2, \ldots, r_n\} \in \mathcal{R}^n,$

is known; $\emptyset \in \mathcal{Y}^*$ denotes the empty sequence (empty observation set). We also assume the objects to be a priori indistinguishable, meaning that for any permutation $j := \{j(1), j(2), \ldots, j(n)\}$ of $\{1, 2, \ldots, n\}$ we have

$p_0(\{r_1, r_2, \ldots, r_n\}; t_0, \emptyset) = p_0(\{r_{j(1)}, r_{j(2)}, \ldots, r_{j(n)}\}; t_0, \emptyset).$  (3.2)

3.1 Parametric Sensor Model

In practice, the precise shape of the sensor likelihood function must be estimated experimentally. We next develop a simple parametric model for single-object (i.e., $n = 1$) sensor likelihood functions that could be "best-fitted" to experimental data.

3.1.1 Single-object stationary sensor likelihood functions

Consider the following parameters:

$p_1$: probability of recognizing a real object, given that an object is in range of the sensor;

$p_2$: probability of recognizing a distractor instead of a real object, given that an object is in range of the sensor;

$p_3$: probability of not recognizing anything, given that an object is in range of the sensor;

$p_4$: probability of recognizing a distractor instead of a real object, given that no object is in range of the sensor;

$\rho_{\text{sensor}} > 0$: range over which the sensor can detect an object;
$e_{\text{loc}}$: maximum localization error for the position of the object when the sensor detects its presence. The actual localization error is assumed uniform within the range specified by $e_{\text{loc}}$. We also assume here that $e_{\text{loc}}$ increases linearly with the distance between the sensor and the object.

In general, these parameters may depend on any subset of $\{a, r_1, t\}$, where $r_1$ is the position of the single object.

We study first the case of a measurement $y_k = \{t, a, s\}$ for which an object was found, i.e., $s \in \mathcal{R}$. By definition,

$\ell_1(\{t, a, s\}, r) = P(\mathbf{s}_k = s \mid \mathbf{r}_1 = r, t_k = t, a_k = a)\, P(t_k = t, a_k = a \mid \mathbf{r}_1 = r).$  (3.3)

For simplicity we assume here that the measurements $t_k = t$ of the time and $a_k = a$ of the sensor position are error free, and therefore $P(t_k = t, a_k = a \mid \mathbf{r}_1 = r) = 1$. Using this and the stationarity assumption, we can expand the right-hand side of (3.3) to conclude that

$\ell_1(\{t, a, s\}, r) = P(\mathbf{s}_k = s \mid \mathbf{r}_1 = r, t_k = t, a_k = a, E_1)\, P(E_1 \mid \mathbf{r}_1 = r, t_k = t, a_k = a)$
$\quad + P(\mathbf{s}_k = s \mid \cdot, E_2)\, P(E_2 \mid \cdot) + P(\mathbf{s}_k = s \mid \cdot, E_3)\, P(E_3 \mid \cdot),$

where $\cdot$ abbreviates the same conditioning $\{\mathbf{r}_1 = r, t_k = t, a_k = a\}$, $E_1$ denotes the event that the algorithm recognizes a real object, $E_2$ that the algorithm recognizes a distractor instead of a real object, and $E_3$ that the algorithm did not recognize anything.

Suppose first that $\|r - a\| \le \rho_{\text{sensor}}$. According to our sensor model,

$P(\mathbf{s}_k = s \mid \cdot, E_1) = \begin{cases} \frac{1}{B(\|r - a\|)} & \|s - r\| \le e_{\text{loc}}(\|r - a\|) \\ 0 & \text{otherwise} \end{cases}$

$P(\mathbf{s}_k = s \mid \cdot, E_2) = \begin{cases} \frac{1}{A(a)} & \|a - s\| \le \rho_{\text{sensor}} \\ 0 & \text{otherwise} \end{cases}$

$P(\mathbf{s}_k = s \mid \cdot, E_3) = 0,$

where $B(b)$ is the area of the region $\{\bar{s} \in \mathcal{R} : \|\bar{s} - r\| \le e_{\text{loc}}(b)\}$ and $A(a)$ denotes the area of the region $\{\bar{s} \in \mathcal{R} : \|a - \bar{s}\| \le \rho_{\text{sensor}}\}$. We then conclude that, for $\|r - a\| \le \rho_{\text{sensor}}$,

$\ell_1(\{t, a, s\}, r) = \begin{cases} \frac{p_1}{B(\|r - a\|)} & \|s - r\| \le e_{\text{loc}}(\|r - a\|) \\ 0 & \text{otherwise} \end{cases} + \begin{cases} \frac{p_2}{A(a)} & \|a - s\| \le \rho_{\text{sensor}} \\ 0 & \text{otherwise.} \end{cases}$  (3.4)

Suppose now that $\|r - a\| > \rho_{\text{sensor}}$. In this case the sensor will never find the object, and we have

$\ell_1(\{t, a, s\}, r) = P(\mathbf{s}_k = s \mid \cdot, E_2)\, P(E_2 \mid \cdot) = \begin{cases} \frac{p_4}{A(a)} & \|a - s\| \le \rho_{\text{sensor}} \\ 0 & \text{otherwise.} \end{cases}$  (3.5)

Once the likelihood function is known for each $s \in \mathcal{R}$, its value for $s = \emptyset$ can be obtained using

$\ell_1(\{t, a, \emptyset\}, r) = 1 - \sum_{\bar{s} \in \mathcal{R}} \ell_1(\{t, a, \bar{s}\}, r) = \begin{cases} 1 - p_1 - p_2 & \|r - a\| \le \rho_{\text{sensor}} \\ 1 - p_4 & \text{otherwise.} \end{cases}$  (3.6)

The actual localization error is assumed in this study to be uniform within the range specified by $e_{\text{loc}}$. In general, however, all sensors determine detection by comparing the calculated received signal strength from the object with a predefined minimum detectable signal level; in passive RF detectors, for example, this corresponds to checking for an object's emission in its frequency band, while for an image sensor, signal strength would be related to the number of matching patterns. Therefore, a more realistic model of the sensor requires the following assumption:

• $e_{\text{loc}}$ increases linearly with the distance between the sensor and the object, i.e., $e_{\text{loc}}(\|r - a\|) = e_{\min} \frac{\sqrt{\|r - a\|^2 + h^2}}{h}$, where $h$ denotes the aircraft altitude.

Additionally, a probability distribution other than the uniform distribution within the range specified by $e_{\text{loc}}$ would be needed for the localization error, since distance is, in general, a key factor in detection calculations.
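Before refining the error model, note that the baseline likelihood (3.4)-(3.6) is straightforward to implement. The sketch below assumes Euclidean distances, disc approximations for the areas $A(a)$ and $B(\cdot)$, and the linear growth of $e_{\text{loc}}$ with slant range stated above; all numeric parameter values are illustrative.

# A minimal sketch of the single-object likelihood (3.4)-(3.6) on a grid.
import numpy as np

# Illustrative parameter values; in practice these are fitted experimentally.
P1, P2, P4 = 0.7, 0.05, 0.02              # recognition/distractor probabilities
RHO_SENSOR, E_MIN, ALT = 10.0, 1.5, 5.0   # sensor range, error slope, altitude

def e_loc(dist):
    # Localization error grows linearly with slant range to the object.
    return E_MIN * np.sqrt(dist ** 2 + ALT ** 2) / ALT

def likelihood_1(y, r, cell_area=1.0):
    """l_1(y, r) for y = (t, a, s); s is None when nothing was detected.
    A(a) and B(.) are approximated by disc areas, in cell units."""
    t, a, s = y
    in_range = np.linalg.norm(r - a) <= RHO_SENSOR
    if s is None:                                    # empty measurement, (3.6)
        return (1.0 - P1 - P2) if in_range else (1.0 - P4)
    area_a = np.pi * RHO_SENSOR ** 2 / cell_area     # A(a)
    if not in_range:                                 # false positive only, (3.5)
        return P4 / area_a if np.linalg.norm(a - s) <= RHO_SENSOR else 0.0
    val = P2 / area_a if np.linalg.norm(a - s) <= RHO_SENSOR else 0.0
    e = e_loc(np.linalg.norm(r - a))
    area_b = np.pi * e ** 2 / cell_area              # B(||r - a||)
    if np.linalg.norm(s - r) <= e:                   # true-detection term, (3.4)
        val += P1 / area_b
    return val

y = (0.0, np.array([0.0, 0.0]), np.array([3.0, 4.0]))
print(likelihood_1(y, np.array([3.5, 4.0])))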
Returning to the localization error: the probability that the received signal is detected as originating from a position different from the true source position is nonzero for a number of reasons, e.g., miscalculations in detection, the source signal reflecting from nearby obstacles, etc. Yet, in all cases, this probability should be a decreasing function of the distance from the detected location. The Central Limit Theorem suggests that complicated events like the one at hand, caused by many different simple factors, tend to become Gaussian. Following the same line of reasoning, a good candidate for the job is a zero-mean Gaussian probability distribution, defined as follows. Let the errors in positions $\tilde{x}$ and $\tilde{y}$ be written as $\tilde{z} = \operatorname{col}(\tilde{x}, \tilde{y})$. The function

$p(\tilde{x}, \tilde{y}) = p(\tilde{z}) = \frac{1}{\sqrt{(2\pi)^m |P_z|}} \exp\!\left[-\tfrac{1}{2}\, \tilde{z}' P_z^{-1} \tilde{z}\right]$  (3.7)

is the Gaussian density function with zero mean and error covariance matrix $P_z$ [i.e., $\tilde{z} \sim \mathcal{N}(\tilde{z}; 0, P_z)$], where

$\tilde{z} = z - \hat{z}.$  (3.8)

Here $\hat{z}$ is the observed value of the position and $z$ ranges over the whole map.

3.1.2 Multiple-object stationary sensor likelihood functions

The computation of the likelihood function for multiple objects presented below is based on the following basic assumptions:

Assumption 1.
1. From the sensor's point of view, the objects are indistinguishable from each other.
2. The distance between any two objects is larger than twice the sensor range, $2\rho_{\text{sensor}}$.²

² In case some objects are very close to each other, we can simply regard these objects as a single unit.

The following is the main result of this section:

Theorem 1. Under Assumption 1, for every $k \ge 1$ and every $r \in \mathcal{R}^n$ such that $\delta(r, d_{\min}) > 0$,

$\ell_n(y_k, r) = c \prod_{i=1}^{n} \ell_1(y_k, r_i)$  (3.9)

with probability one, where $c$ is a normalizing constant³ that does not depend on $r$ and $\ell_1$ is the single-object likelihood function defined by (3.4)-(3.5)-(3.6).

³ With some abuse of notation, in this paper we use the same symbol $c$ to denote different normalizing constants when this does not lead to ambiguity.

Proof of Theorem 1. We consider first the case of a measurement $y_k = y := \{t, a, s\} \in \mathcal{Y}$ for which an object was detected, i.e., with $s \in \mathcal{R}$. As before, we can expand

$\ell_n(y, r) = P(\mathbf{s}_k = s \mid \mathbf{r}_{i^*} = r_{i^*}, t_k = t, a_k = a, E_1)\, P(E_1 \mid \mathbf{r} = r, t_k = t, a_k = a)$
$\quad + P(\mathbf{s}_k = s \mid \mathbf{r} = r, \cdot, E_2)\, P(E_2 \mid \mathbf{r} = r, \cdot) + P(\mathbf{s}_k = s \mid \mathbf{r} = r, \cdot, E_3)\, P(E_3 \mid \mathbf{r} = r, \cdot), \quad r \in \mathcal{R}^n,$

where $\cdot$ abbreviates the conditioning $\{t_k = t, a_k = a\}$, $E_1$ denotes the event that the algorithm recognizes a real object, $E_2$ that the algorithm recognizes a distractor instead of a real object, $E_3$ that the algorithm did not recognize anything, $i^*$ the index of the actual object that is in range, and $r_{i^*}$ its position. Reasoning as before, we conclude that

$\ell_n(y, r) = \ell_1(y, r_{i^*}),$

where $\ell_1$ is the single-object likelihood function computed before. Since, under Assumption 1, the sensor can detect at most one object at a time, we also have

$\prod_{i=1}^{n} \ell_1(y, r_i) = \left(\frac{p_4}{A(a)}\right)^{n-1} \ell_1(y, r_{i^*}) = \left(\frac{p_4}{A(a)}\right)^{n-1} \ell_n(y, r), \quad \|a - s\| \le \rho_{\text{sensor}},$

where $r_i$ is the position of the $i$th object. This leads to

$\ell_n(y, r) = c \prod_{i=1}^{n} \ell_1(y, r_i),$  (3.10)

where $c$ is a normalizing constant.

Let us now study the case $y_k = y := \{t, a, \emptyset\} \in \mathcal{Y}$. We can expand

$\ell_n(\{t, a, \emptyset\}, r) = P(\mathbf{s}_k = \emptyset \mid \mathbf{r} = r, t_k = t, a_k = a, E_1)\, P(E_1 \mid \mathbf{r} = r, t_k = t, a_k = a)$
$\quad + P(\mathbf{s}_k = \emptyset \mid \mathbf{r} = r, \cdot, E_2)\, P(E_2 \mid \mathbf{r} = r, \cdot) + P(\mathbf{s}_k = \emptyset \mid \mathbf{r} = r, \cdot, E_3)\, P(E_3 \mid \mathbf{r} = r, \cdot), \quad r \in \mathcal{R}^n,$

where $E_1$, $E_2$, $E_3$ have the same meaning as before. Suppose first that $\|r - a\| \le \rho_{\text{sensor}}$.
From the definitions of the parameters introduced at the beginning of the previous section, we have

$P(\mathbf{s}_k = \emptyset \mid \mathbf{r} = r, t_k = t, a_k = a, E_1) = 0,$
$P(\mathbf{s}_k = \emptyset \mid \mathbf{r} = r, t_k = t, a_k = a, E_2) = 0,$
$P(\mathbf{s}_k = \emptyset \mid \mathbf{r} = r, t_k = t, a_k = a, E_3) = 1.$

Since $p_3 = 1 - p_1 - p_2$, we conclude that

$\ell_n(\{t, a, \emptyset\}, r) = 1 - p_1 - p_2.$

Suppose now that $\|r - a\| > \rho_{\text{sensor}}$. Once again from the definitions of the parameters introduced at the beginning of the previous section, we get

$P(E_1 \mid \mathbf{r} = r, t_k = t, a_k = a) = 0,$
$P(\mathbf{s}_k = \emptyset \mid \mathbf{r} = r, t_k = t, a_k = a, E_2) = 0,$
$P(\mathbf{s}_k = \emptyset \mid \mathbf{r} = r, t_k = t, a_k = a, E_3) = 1.$

Because

$P(E_3 \mid \mathbf{r} = r, a_k = a) = 1 - P(E_1 \mid \mathbf{r} = r, a_k = a) - P(E_2 \mid \mathbf{r} = r, a_k = a) = 1 - P(E_1 \cup E_2 \mid \mathbf{r} = r, a_k = a),$

we determine that

$\ell_n(\{t, a, \emptyset\}, r) = 1 - p_4,$

and hence

$\ell_n(\{t, a, \emptyset\}, r) = \begin{cases} 1 - p_1 - p_2 & \|r - a\| \le \rho_{\text{sensor}} \\ 1 - p_4 & \text{otherwise} \end{cases} = \ell_1(\{t, a, \emptyset\}, r_{i^+}),$

where $i^+$ is the index of the object closest to the sensor. Since, under Assumption 1, only one object can be in range of the sensor at a time, we also have

$\prod_{i=1}^{n} \ell_1(y, r_i) = (1 - p_4)^{n-1}\, \ell_1(y, r_{i^+}) = (1 - p_4)^{n-1}\, \ell_n(y, r).$

We therefore conclude that (3.10) also holds in this case.

3.2 Motion Model

We assume that the motion for $t \ge t_0$ is Markovian. The motion model is specified by a known $n$-object transition probability function of the form

$\Phi(r', r; t + dt, t) := P(\mathbf{r}(t + dt) = r' \mid \mathbf{r}(t) = r), \quad r, r' \in \mathcal{R}^n,\ dt \ge 0,\ t \ge t_0,$

with the standing assumption that $\mathbf{r}(t + dt)$ is conditionally independent of any random variable at time $t$ or before, given $\mathbf{r}(t)$. We assume the following about the sparseness of the objects in the region:

Assumption 2. The a priori joint probability distribution $p_0$ and the $n$-object transition probability function $\Phi$ are such that $\delta(\mathbf{r}(t), d_{\min}) > 0$ with probability one for all $t \ge t_0$.

Figure 3.1 shows an example terrain with a road and an example motion model that takes roads into account for the movement of the objects.

Figure 3.1: Example motion model which considers roads in the terrain.
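A single-object transition function of this kind is easy to tabulate. The sketch below builds a row-stochastic transition matrix over a grid that biases motion toward road cells, in the spirit of Figure 3.1; the stay probability, the 4-neighbor moves, and the road-bias factor are illustrative assumptions, not the model used in the experiments.

# A minimal sketch of a road-aware single-object transition function phi_1.
import numpy as np

def transition_matrix(road_mask, p_move=0.2, road_bias=4.0):
    """Row-stochastic matrix P[i, j] = phi_1(cell j | cell i) over a grid
    flattened row-major; mass p_move is split among the 4 neighbors,
    weighted by road_bias when the destination cell lies on a road."""
    rows, cols = road_mask.shape
    n = rows * cols
    P = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            nbrs = [(r + dr, c + dc)
                    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
                    if 0 <= r + dr < rows and 0 <= c + dc < cols]
            w = np.array([road_bias if road_mask[nr, nc] else 1.0
                          for nr, nc in nbrs])
            P[i, i] = 1.0 - p_move                 # probability of staying put
            for (nr, nc), wk in zip(nbrs, w / w.sum()):
                P[i, nr * cols + nc] = p_move * wk
    return P

road = np.zeros((5, 5), dtype=bool)
road[2, :] = True                      # one horizontal road across the grid
P = transition_matrix(road)
print(P.sum(axis=1)[:3])               # each row sums to one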
Chapter 4

Bayesian Filtering

The classical problem in partially observable Markov chains is to recover the posterior distribution over the state at any time from all available sensor measurements. This problem is known as the Bayesian filtering problem, also called the optimal filtering or stochastic filtering problem. We solve this problem with Bayes filters, which compute the posterior recursively. The a posteriori probabilistic map for the objects is defined by

$m_{\mathbf{r}}(r, t \mid Y_K) := P(\mathbf{r}(t) = r \mid \mathbf{Y}_K = Y_K), \quad r \in \mathcal{R}^n,$

where $Y_K$ denotes the list $\{y_1, y_2, \ldots, y_K\}$ of all sensor observations available and typically $t \ge t_K$. We assume that an a priori joint probability distribution for the objects at time $t_0$,

$p_0(r; t_0, \emptyset) := P(\mathbf{r}(t_0) = r), \quad r \in \mathcal{R}^n,$

is known; $\emptyset \in \mathcal{Y}^*$ denotes the empty sequence (empty observation set). We also assume the objects to be a priori indistinguishable, meaning that for any permutation $j := \{j(1), j(2), \ldots, j(n)\}$ of $\{1, 2, \ldots, n\}$ we have

$p_0(\{r_1, r_2, \ldots, r_n\}; t_0, \emptyset) = p_0(\{r_{j(1)}, r_{j(2)}, \ldots, r_{j(n)}\}; t_0, \emptyset).$  (4.1)

This chapter aims to determine efficient algorithms to compute probabilistic maps for moving objects. Inspired by the recursive formulas for a Kalman filter, we construct a prediction operator $T_{\text{pred}}[t, \tau, \cdot]$ that takes a probabilistic map at time $\tau$ and propagates it forward in time to obtain a map at time $t > \tau$, and a fusion operator $T_{\text{fuse}}[y, \cdot]$ that takes a map at time $t$ and, given a list of old measurements $Y$ and a new measurement $y$ taken at time $t$, produces a new map at the same time given both the old and new measurements. We elaborate on how to use these operators in Section 4.6.

The computation of probabilistic maps can be viewed as a Kalman filtering problem, in which one propagates a probability density in time, with occasional Bayesian sensor fusion of noisy measurements. It is well known that if the a priori probability density of the objects is jointly Gaussian, their motion can be expressed by a linear model with Gaussian disturbances, and the measurements are obtained by adding Gaussian noise to the true object positions, then the probabilistic map is Gaussian and can be computed using a finite-dimensional Kalman filter. Here, however, we are faced with probabilistic maps that are not Gaussian. Several factors contribute to this:

1. The a priori probability densities for the objects' positions are not Gaussian. These densities may be multi-modal and typically have finite support, because the objects are known to be in a pre-specified region.

2. The models for the objects' motion are not necessarily linear with Gaussian disturbances.

3. We want to consider non-Gaussian sensors, e.g., sensors with limited range, sensors that occasionally produce measurements that do not correspond to any object ("false positives"), or that occasionally fail to produce measurements for objects within their range ("false negatives"), etc. In practice, these situations occur with any real sensor.

In general, without the linear/Gaussian assumptions mentioned above, there is no finite-dimensional representation of the a posteriori probability densities. A finite discretization of the state space overcomes this difficulty, but one is then faced with an exponential explosion of the number of states as the number of objects being mapped grows. For example, if one partitions the region where $n$ objects lie into $N$ cells, the memory complexity associated with representing the joint probability distribution of the objects is $O(N^n)$.

4.1 Separability of Probabilistic Maps

Inspired by the work in Section 4.5, we say that a probabilistic map $m_{\mathbf{r}}$ is separable if it can be written as

$m_{\mathbf{r}}(r, t \mid Y) := c \left(\prod_{i=1}^{n} f(r_i, t \mid Y)\right) \delta(r, d_{\min}), \quad \forall r := \{r_1, r_2, \ldots, r_n\} \in \mathcal{R}^n,$  (4.2)

where $c$ is a normalizing constant, $f$ is an appropriately defined function, and $\delta(r, d_{\min})$ is zero whenever two elements $r_i$, $r_j$, $i \ne j$, are closer than some minimum distance $d_{\min} > 0$ and one otherwise. The function $f(\cdot, t \mid Y)$ is called the aggregate measurement function and encapsulates the information contained in all the measurements. Together with $d_{\min}$, it unambiguously defines the probabilistic map. This definition is very useful because, for any region of reasonable size, the posterior probabilistic map $m_{\mathbf{r}}(\cdot, t \mid Y_k)$ cannot be represented exhaustively, as such a representation has memory complexity $O(N^n)$, where $N$ is the number of cells in $\mathcal{R}$. Using (4.2), however, one can represent $m_{\mathbf{r}}(\cdot, t \mid Y_k)$ by a single function $f(\cdot, t \mid Y_k)$, whose representation has memory complexity $O(N)$.

We shall see shortly that both the fusion and the prediction operators preserve separability. This means that if the a priori map is separable, then any a posteriori map will also be separable.
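To make the memory savings concrete, the following sketch contrasts the exhaustive joint representation with the separable one of (4.2); the cell count, object count, and the index-based distance used for $d_{\min}$ are illustrative choices.

# A minimal sketch of the separable representation (4.2).
import itertools

import numpy as np

N, n = 10_000, 4                                   # cells in R, objects
print(f"joint map entries: {float(N) ** n:.1e}")   # O(N^n): hopeless
f = np.full(N, 1.0 / N)                            # aggregate function, O(N)
print(f"separable representation: {float(f.size):.1e} entries")

def separable_map_value(f, positions, d_min=1):
    """m_r(r | Y) up to the normalizing constant c: the product of f over
    the object positions, zeroed when two objects violate d_min (distance
    is taken on cell indices here purely for illustration)."""
    for i, j in itertools.combinations(positions, 2):
        if abs(i - j) < d_min:
            return 0.0                              # delta(r, d_min) = 0
    return float(np.prod(f[list(positions)]))

print(separable_map_value(f, (3, 500, 42, 9000)))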
Moreover, it is possible to apply these operators directly to the aggregate measurement functions, avoiding the explicit computation of the intermediate a posteriori maps. This makes recursive probabilistic map building extremely efficient and scalable. It should be noted, however, that the prediction operator only "approximately" preserves separability; this issue is discussed below.

4.2 Fusion Operator

Consider a list $Y_k := \{y_1, y_2, \ldots, y_k\}$ of $k$ measurements with $y_k := \{t_k, a_k, s_k\}$, corresponding to a positive-probability realization of $\mathbf{Y}_k$. Using Bayes' rule, we conclude that

$m_{\mathbf{r}}(r, t_k \mid Y_k) = c\, P(\mathbf{y}_k = y_k \mid \mathbf{r}(t_k) = r, \mathbf{Y}_{k-1} = Y_{k-1})\, m_{\mathbf{r}}(r, t_k \mid Y_{k-1}),$  (4.3)

where $Y_{k-1} := \{y_1, y_2, \ldots, y_{k-1}\}$, $r \in \mathcal{R}^n$, and $c > 0$ is a normalizing constant chosen so that $\sum_{r \in \mathcal{R}^n} m_{\mathbf{r}}(r, t_k \mid Y_k) = 1$. From this and the definition of the sensor likelihood function in (3.1) we obtain

$m_{\mathbf{r}}(r, t_k \mid Y_k) = c\, \ell_n(y_k, r)\, m_{\mathbf{r}}(r, t_k \mid Y_{k-1}),$  (4.4)

which defines the fusion operator $T_{\text{fuse}}[y_k, \cdot]$. Note that this formula takes a map at time $t_k$ and, given a list of old measurements $Y_{k-1}$ and a new measurement $y_k$ taken at time $t_k$, produces a new map given both the old and new measurements.

Suppose now that (3.9) holds and that the probabilistic map given $Y_{k-1}$ is separable, i.e., that (4.2) holds with $t = t_k$ and $Y = Y_{k-1}$. Then, from (4.4), one concludes that

$m_{\mathbf{r}}(r, t_k \mid Y_k) = c \left(\prod_{i=1}^{n} f(r_i, t_k \mid Y_k)\right) \delta(r, d_{\min}),$

where $c$ is a different normalizing constant and

$f(r, t_k \mid Y_k) := \ell_1(y_k, r)\, f(r, t_k \mid Y_{k-1}), \quad \forall r \in \mathcal{R}.$  (4.5)

This means that the map given $Y_k$ is still separable and therefore that $T_{\text{fuse}}[y_k, \cdot]$ indeed preserves separability. We have proved the following:

Theorem 2. The fusion operator that produces the probabilistic map $m_{\mathbf{r}}(\cdot, t_k \mid Y_k)$ from the measurement $y_k := \{t_k, a_k, s_k\}$ and the map $m_{\mathbf{r}}(\cdot, t_k \mid Y_{k-1})$ is defined by (4.4). Moreover, in case (3.9) holds and $m_{\mathbf{r}}(\cdot, t_k \mid Y_{k-1})$ is separable, then $m_{\mathbf{r}}(\cdot, t_k \mid Y_k)$ is also separable, and the aggregate measurement function for the latter can be computed from that of the former using (4.5).
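In code, the fusion step (4.5) on aggregate measurement functions is a single pointwise multiplication. A minimal sketch, reusing the illustrative likelihood_1 from Section 3.1.1; the rescaling at the end is purely for numerical stability, since the map's normalizing constant absorbs any scale:

# A minimal sketch of the fusion operator T_fuse acting on f, per (4.5).
import numpy as np

def fuse(f, y, cell_centers, likelihood_1):
    """f(., t_k | Y_k) = l_1(y_k, .) * f(., t_k | Y_{k-1}), cell by cell."""
    l1 = np.array([likelihood_1(y, r) for r in cell_centers])
    f_new = f * l1
    return f_new / f_new.sum()         # rescale to avoid numerical underflow

Fusing $k$ measurements is then $k$ such multiplications over $N$ cells, regardless of the number of objects $n$.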
4.3 Prediction Operator

Consider again a list $Y_k := \{y_1, y_2, \ldots, y_k\}$ of $k$ measurements with $y_k := \{t_k, a_k, s_k\}$, corresponding to a positive-probability realization of $\mathbf{Y}_k$, and pick an arbitrary time $t \ge t_k$. By the total probability theorem, we have

$m_{\mathbf{r}}(r', t + dt \mid Y_k) = \sum_{r \in \mathcal{R}^n} P(\mathbf{r}(t + dt) = r' \mid \mathbf{r}(t) = r, \mathbf{Y}_k = Y_k)\, m_{\mathbf{r}}(r, t \mid Y_k).$

Moreover, since $t \ge t_k$, $\mathbf{r}(t + dt)$ is conditionally independent of $\mathbf{Y}_k$ given $\mathbf{r}(t)$. Therefore

$m_{\mathbf{r}}(r', t + dt \mid Y_k) = \sum_{r \in \mathcal{R}^n} \Phi(r', r; t + dt, t)\, m_{\mathbf{r}}(r, t \mid Y_k),$  (4.6)

which defines the prediction operator $T_{\text{pred}}[t + dt, t, \cdot]$ that takes a probabilistic map at time $t$ and propagates it forward in time to obtain a map at time $t + dt$. Suppose now that the probabilistic map at time $t$ is separable, i.e., that (4.2) holds with $Y = Y_k$. Our goal for the remainder of this section is to determine whether the probabilistic map $m_{\mathbf{r}}(r', t + dt \mid Y_k)$ is still separable.

Single object  We start by considering the case $n = 1$, for which the single-object transition probability function

$\varphi_1(r', r; t + dt, t) := \Phi(r', r; t + dt, t), \quad r, r' \in \mathcal{R},\ t \ge t_0,\ dt \ge 0,$

is such that

$\sum_{r' \in \mathcal{R}} \varphi_1(r', r; t + dt, t) = 1, \quad \forall r \in \mathcal{R},\ t \ge t_0,\ dt \ge 0.$  (4.7)

To compute the object's probabilistic map at some time $t + dt$, $dt > 0$, we use (4.6) and the map's separability at time $t$ to obtain

$m_{\mathbf{r}}(r', t + dt \mid Y_k) = c \sum_{r \in \mathcal{R}} \varphi_1(r', r; t + dt, t)\, f(r, t \mid Y_k).$

Note that for a single object $\delta(r, d_{\min})$ is always equal to one. Therefore, the map at time $t + dt$ is still separable and can be written as in (4.2) with $Y = Y_k$ and

$f(r', t + dt \mid Y_k) = \sum_{r \in \mathcal{R}} \varphi_1(r', r; t + dt, t)\, f(r, t \mid Y_k), \quad \forall r' \in \mathcal{R}.$  (4.8)

This is a partial derivation of the prediction operator with aggregate measurement functions; the multiple-object case below completes the derivation.

Multiple objects  We now consider the general case $n \ge 1$ for the derivation of the prediction operator using aggregate measurement functions. We assume that the multiple objects essentially move independently of each other, according to a model similar to that in the previous section, except that we enforce the constraint that two objects never get closer than $d_{\min}$ from each other. This is needed for consistency with Assumption 2. The following $n$-object transition probability function captures this:

$\Phi(r', r; t + dt, t) = c(r) \left(\prod_{k=1}^{n} \varphi_1(r'_k, r_k; t + dt, t)\right) \delta(r', d_{\min}), \quad r, r' \in \mathcal{R}^n,$  (4.9)

where $c(r)$ is the following normalizing constant:

$c(r)^{-1} = \sum_{\bar{r} \in \mathcal{R}^n} \prod_{k=1}^{n} \varphi_1(\bar{r}_k, r_k; t + dt, t)\, \delta(\bar{r}, d_{\min}).$  (4.10)

We assume here that the objects' velocity is bounded and therefore, for small values of $dt > 0$, $\varphi_1(r', r; t + dt, t)$ is nonzero only when $r$ and $r'$ correspond to the same or adjacent cells. Moreover, in the latter case $\varphi_1(r', r; t + dt, t)$ is of order $dt$, i.e., $\varphi_1(r', r; t + dt, t) \le c\, dt$ for some finite constant $c$. This corresponds to the following assumption:

Assumption 3. The $n$-object transition probability function is given by (4.9), with

$\varphi_1(r', r; t + dt, t) = \begin{cases} 1 - o(dt) & r = r' \\ o(dt) & r \text{ and } r' \text{ are adjacent} \\ 0 & \text{otherwise,} \end{cases} \quad \forall r, r' \in \mathcal{R},\ t \ge t_0,\ dt \ge 0.$  (4.11)

We show next that under this assumption the probabilistic map is approximately separable at time $t + dt$, for small $dt$. Moreover, the error is of order $dt^2$ for most configurations, and of order $dt$ only for a small set $\mathcal{A}_{\text{bad}} \subset \mathcal{R}^n$ of "bad" object configurations $r$ for which

1. $\delta(r, d_{\min}) \ne 0$, i.e., no two objects are closer than $d_{\min}$ from each other; and

2. there exists some configuration $\bar{r} \in \mathcal{R}^n$ for which $\delta(\bar{r}, d_{\min}) = 0$ and it is possible to go from configuration $\bar{r}$ to configuration $r$ by a single object moving to an adjacent cell.

In practice, $\mathcal{A}_{\text{bad}}$ denotes the set of configurations for which the objects still satisfy the minimum-distance constraint but are very close to violating it.

Theorem 3. The prediction operator that produces the probabilistic map $m_{\mathbf{r}}(\cdot, t + dt \mid Y_k)$, $dt \ge 0$, from the map $m_{\mathbf{r}}(\cdot, t \mid Y_k)$ is defined by (4.6). Moreover, in case $m_{\mathbf{r}}(\cdot, t \mid Y_k)$ is separable and Assumption 3 holds, then

$m_{\mathbf{r}}(r', t + dt \mid Y_k) = c \left(\prod_{k=1}^{n} f(r'_k, t + dt \mid Y_k)\right) \delta(r', d_{\min}) + \epsilon(r', dt), \quad \forall r' \in \mathcal{R}^n,$

where $f(\cdot, t + dt \mid Y_k)$ is defined as in (4.8), and $\epsilon(r', dt)$ is of order $dt$ for $r' \in \mathcal{A}_{\text{bad}}$ and of order $dt^2$ otherwise.

In practice, for small $dt$ we can approximate

$m_{\mathbf{r}}(r', t + dt \mid Y_k) \approx c \left(\prod_{k=1}^{n} f(r'_k, t + dt \mid Y_k)\right) \delta(r', d_{\min}).$

Moreover, if one wants to compute $m_{\mathbf{r}}(r', t' \mid Y_k)$ for some $t' > t$ for which $t' - t$ is not small, one can divide the interval $[t, t']$ into $N$ subintervals of length $dt := (t' - t)/N$ and apply this approximation $N$ times to propagate the map from $t$ to $t'$. Outside $\mathcal{A}_{\text{bad}}$, each of the $N$ steps introduces an error of order $(t' - t)^2/N^2$, which adds up to a total worst-case error of order $(t' - t)^2/N$. The error can therefore be made small by making $N$ large, which means that the map at time $t'$ is still approximately separable. For the configurations in $\mathcal{A}_{\text{bad}}$, in general it will not be possible to make the error arbitrarily small; in practice, however, $\mathcal{A}_{\text{bad}}$ is a small subset of $\mathcal{R}^n$.
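Before the proof, a minimal sketch of the prediction step on aggregate functions: with the single-object transition tabulated as a matrix (e.g., the illustrative transition_matrix of Section 3.2), the update (4.8) is one matrix-vector product, and a long horizon is propagated by repeating the one-step operator as described above.

# A minimal sketch of the prediction operator T_pred acting on f, per (4.8).
import numpy as np

def predict(f, P):
    """f(r', t+dt | Y_k) = sum_r phi_1(r', r) f(r, t | Y_k); with
    P[i, j] = phi_1(cell j | cell i), this is f @ P."""
    return f @ P

def predict_horizon(f, P, n_steps):
    # Split a long horizon t' - t into n_steps one-step applications so the
    # separability error stays of order (t' - t)^2 / n_steps outside A_bad.
    for _ in range(n_steps):
        f = predict(f, P)
    return f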
Proof of Theorem 3. To compute the objects' probabilistic map at some time $t + dt$, $dt > 0$, we use (4.6), (4.9), and the map's separability at time $t$ to obtain

$m_{\mathbf{r}}(r', t + dt \mid Y_k) = c \sum_{r \in \mathcal{R}^n} c(r) \left(\prod_{k=1}^{n} \varphi_1(r'_k, r_k; t + dt, t)\, f(r_k, t \mid Y_k)\right) \delta(r, d_{\min})\, \delta(r', d_{\min})$
$= c \sum_{r \in \mathcal{R}^n} \left(\prod_{k=1}^{n} \varphi_1(r'_k, r_k; t + dt, t)\, f(r_k, t \mid Y_k)\right) \delta(r', d_{\min}) - \epsilon_1 + \epsilon_2$
$= c \left(\prod_{k=1}^{n} f(r'_k, t + dt \mid Y_k)\right) \delta(r', d_{\min}) - \epsilon_1 + \epsilon_2,$

where

$\epsilon_1 := c \sum_{\substack{r \in \mathcal{R}^n:\ \delta(r, d_{\min}) = 0,\ \delta(r', d_{\min}) \ne 0}} \left(\prod_{k=1}^{n} \varphi_1(r'_k, r_k; t + dt, t)\, f(r_k, t \mid Y_k)\right),$

$\epsilon_2 := c \sum_{\substack{r \in \mathcal{R}^n:\ \delta(r, d_{\min}) \ne 0,\ \delta(r', d_{\min}) \ne 0}} (c(r) - 1) \left(\prod_{k=1}^{n} \varphi_1(r'_k, r_k; t + dt, t)\, f(r_k, t \mid Y_k)\right).$

We show next that both $\epsilon_1$ and $\epsilon_2$ are of order $dt$ for $r' \in \mathcal{A}_{\text{bad}}$ and of order $dt^2$ otherwise. To this effect, we consider three cases separately.

Case 1: For any $r' \in \mathcal{R}^n$ (including $r' \in \mathcal{A}_{\text{bad}}$), we have $r \ne r'$ in all terms of the summation that defines $\epsilon_1$. Therefore, there must always exist at least one $\varphi_1(r'_k, r_k; t + dt, t)$ for which $r_k \ne r'_k$. Because of (4.11), that term is $o(dt)$ and therefore $\epsilon_1 = o(dt)$. As for $\epsilon_2$, note that because of (4.10) and (4.7) we have

$c(r)^{-1} = \sum_{\bar{r} \in \mathcal{R}^n} \prod_{k=1}^{n} \varphi_1(\bar{r}_k, r_k; t + dt, t) - \sum_{\bar{r} \in \mathcal{R}^n:\ \delta(\bar{r}, d_{\min}) = 0} \prod_{k=1}^{n} \varphi_1(\bar{r}_k, r_k; t + dt, t)$  (4.12)
$= \prod_{k=1}^{n} \sum_{\bar{r}_k \in \mathcal{R}} \varphi_1(\bar{r}_k, r_k; t + dt, t) - \sum_{\bar{r} \in \mathcal{R}^n:\ \delta(\bar{r}, d_{\min}) = 0} \prod_{k=1}^{n} \varphi_1(\bar{r}_k, r_k; t + dt, t)$  (4.13)
$= 1 - \sum_{\bar{r} \in \mathcal{R}^n:\ \delta(\bar{r}, d_{\min}) = 0} \prod_{k=1}^{n} \varphi_1(\bar{r}_k, r_k; t + dt, t).$  (4.14)

Moreover, for any $r$ such that $\delta(r, d_{\min}) \ne 0$, all the $\bar{r}$ in the summation must be different from $r$, and therefore there will always be one $\bar{r}_k \ne r_k$. Because of (4.11) we conclude that $c(r)^{-1} = 1 - o(dt)$ and therefore $c(r) - 1 = o(dt)$. From this and the definition of $\epsilon_2$ it follows that $\epsilon_2 = o(dt)$.

Case 2: In case $\delta(r', d_{\min}) = 0$, then $\epsilon_1 = \epsilon_2 = 0$ by definition; therefore $\epsilon_1$ and $\epsilon_2$ are trivially of order $dt^2$.

Case 3: In case $r' \notin \mathcal{A}_{\text{bad}}$ and $\delta(r', d_{\min}) \ne 0$, at least two objects must change their positions to reach $r'$ from any configuration $r$ for which $\delta(r, d_{\min}) = 0$. Therefore, in the definition of $\epsilon_1$, each term of the summation must always have at least two $\varphi_1(r'_k, r_k; t + dt, t)$ for which $r_k \ne r'_k$. Because of (4.11), each such term is of order $dt^2$ and therefore $\epsilon_1 = o(dt^2)$. Regarding $\epsilon_2$, we can rewrite its definition as

$\epsilon_2 := c\,(c(r') - 1) \left(\prod_{k=1}^{n} \varphi_1(r'_k, r'_k; t + dt, t)\, f(r'_k, t \mid Y_k)\right)$
$\quad + c \sum_{\substack{r \ne r':\ \delta(r, d_{\min}) \ne 0,\ \delta(r', d_{\min}) \ne 0}} (c(r) - 1) \left(\prod_{k=1}^{n} \varphi_1(r'_k, r_k; t + dt, t)\, f(r_k, t \mid Y_k)\right).$

The second term is of order $dt^2$ because $c(r) - 1$ is of order $dt$ and, since $r \ne r'$, all terms in the summation are also of order $dt$. As for the first term, because of (4.14), $c(r') - 1$ is of the same order as

$\sum_{\bar{r} \in \mathcal{R}^n:\ \delta(\bar{r}, d_{\min}) = 0} \prod_{k=1}^{n} \varphi_1(\bar{r}_k, r'_k; t + dt, t).$

Using the same argument used to show that $\epsilon_1 = o(dt^2)$, we conclude that this term is also of order $dt^2$ and therefore $\epsilon_2 = o(dt^2)$.

4.4 Preferential Bearing Motion Model

The restrictions of the simple motion model investigated in Theorem 3 give us an idea of what kind of educated motion model would be needed against an intelligent adversary. The characteristics of such a motion model are as follows:

1. The multiple objects should still move essentially independently of each other, according to a model similar to that in the previous section.

2. The constraint that two objects never get closer than $d_{\min}$ from each other should still be enforced. This is needed for consistency with Assumption 2.

3. The objects could be either coordinated or uncoordinated in their motion.
The motion model, therefore, should be able to cover both of these cases.

4. The motion model should be simple and general enough to allow each object to move randomly in any direction. At the same time, it should be flexible enough to allow trajectory scenarios to be defined.

5. The motion model should not be contingent upon the configurations of the objects, i.e., it should not introduce error for any configuration of the objects' positions.

The transition probability function called the $n$-object Preferential Bearing Model does the job. The word bearing refers to an object's heading relative to another object's or obstacle's position; the two important options are fixed and randomized candidate bearings. A preferential bearing could stem from a decision maker or an intelligence source as a fixed scenario for the object trajectory; alternatively, the bearing selection at every time step could be totally randomized. The preferential bearing therefore satisfies characteristics 3 and 4 above. The random selection of bearings is the case studied in this work; an a priori list of bearings represents the deterministic way of assigning preferred directions, e.g., a fixed-scenario ordered list of directions.

We still assume that the motion for $t \ge t_0$ is Markovian, and that the objects do not get closer than a minimum distance $d_{\min}$, which would be 1 in the simplest case. They never occupy the same cell in the map, since they always follow the same preferred direction; characteristics 1 and 2 are therefore satisfied. We do not even have to assume that the objects' velocity is bounded. Since the objects follow preferred directions at all times, the error introduced into the calculations by the previous model disappears, which clears characteristic 5. An example motion model is shown in Figure 4.1. Under the above assumptions, the probabilistic map with the Preferential Bearing Motion Model is separable at time $t + dt$.

Figure 4.1: Example collision-free-at-any-time motion model which considers roads in the terrain.

4.5 Mapping Static Objects

For simplicity, we start with static objects. We now express the positions of the objects by $\mathbf{r}$ (without the time argument), i.e.,

$\mathbf{r}(t) = \mathbf{r}, \quad \forall t,$

and we model the sensors by a stationary likelihood function to treat static objects. This section targets the computation of the a posteriori probabilistic map for static objects, defined by

$m_{\mathbf{r}}(r \mid Y_k) := P(\mathbf{r} = r \mid \mathbf{Y}_k = Y_k), \quad r \in \mathcal{R}^n,$

where $Y_k$ is a list $\{y_1, y_2, \ldots, y_k\}$ with the first $k$ sensor observations.

Pick now some $k \in \mathcal{K}$ and a list $Y_k := \{y_1, y_2, \ldots, y_k\}$ of $k$ measurements. Using Bayes' rule we conclude that, for the sensor model described above,

$m_{\mathbf{r}}(r \mid Y_k) = c_k\, P(\mathbf{y}_k = y_k \mid \mathbf{r} = r, \mathbf{Y}_{k-1} = Y_{k-1})\, m_{\mathbf{r}}(r \mid Y_{k-1})$  (4.15)
$= c_k\, \ell_n(y_k, r)\, m_{\mathbf{r}}(r \mid Y_{k-1}),$  (4.16)

where $Y_k := \{Y_{k-1}, y_k\}$, $r \in \mathcal{R}^n$, and $c_k > 0$ is a constant chosen so that $\sum_{r \in \mathcal{R}^n} m_{\mathbf{r}}(r \mid Y_k) = 1$. Iterating the above from $1$ through $k$ we get

$m_{\mathbf{r}}(r \mid Y_k) = \bar{c}_k \left(\prod_{j=1}^{k} \ell_n(y_j, r)\right) p_0(r, \emptyset), \quad r \in \mathcal{R}^n,$  (4.17)

where $\bar{c}_k$ is a normalizing constant.
Under Assumption 1, it is also straightforward to conclude from (3.2) and (4.17) that the objects remain a posteriori indistinguishable, meaning that for any permutation $j := \{j(1), j(2), \ldots, j(n)\}$ of $\{1, 2, \ldots, n\}$ we have

$m_{\mathbf{r}}(\{r_1, r_2, \ldots, r_n\} \mid Y_k) = m_{\mathbf{r}}(\{r_{j(1)}, r_{j(2)}, \ldots, r_{j(n)}\} \mid Y_k).$

When the a priori probabilistic map precludes configurations where the distance between two objects is smaller than $2\rho_{\text{sensor}}$, the likelihood function $\ell_n$ derived in Section 3.1.2 is valid and we can write (4.17) as

$m_{\mathbf{r}}(r \mid Y_k) = \hat{c}_k \left(\prod_{i=1}^{n} f(r_i \mid Y_k)\right) p_0(r, \emptyset), \quad r := \{r_1, r_2, \ldots, r_n\} \in \mathcal{R}^n,$  (4.18)

where $\hat{c}_k$ is a normalizing constant and

$f(x \mid Y_k) := \prod_{j=1}^{k} \ell_1(y_j, x), \quad x \in \mathcal{R}.$

The function $f(\cdot \mid Y_k)$, called here the aggregate measurement function, encapsulates the information contained in all the measurements. This is important because, for any region of reasonable size, the posterior probabilistic map $m_{\mathbf{r}}(\cdot \mid Y_k)$ cannot be represented exhaustively, as such a representation has memory complexity $O(N^n)$, where $N$ is the number of points in $\mathcal{R}$. Using (4.18), however, one can represent $m_{\mathbf{r}}(\cdot \mid Y_k)$ by a single function $f(\cdot \mid Y_k)$, whose exhaustive representation has memory complexity $O(N)$.

4.6 Mapping Moving Objects

This section addresses the navigation of autonomous vehicles that attempt to execute a mission under partial information. A typical example is that of a vehicle asked to move to a given destination while avoiding objects (or other vehicles) whose positions are only partially known. In another possible example, a vehicle is asked to search for objects whose positions are not accurately known. We consider situations in which noisy sensors are used to gather information about the localization of the objects to be avoided or searched for. This naturally leads to a probabilistic framework, in which sensors produce erroneous measurements with given probabilities.

Sensors send all new information to a map computation module, which fuses the new data to construct the most current version of the map. Sensor nodes are connected to the map computation node in a network setting. In general, networks have delays between nodes, which may cause some of the data packages not to arrive ordered by time stamp. There are accordingly two ways to compute the a posteriori probabilistic map for moving objects recursively, depending on the application.

Sequential Case  In the more regular case, we follow the steps in order: all observation packages arrive ordered by time stamp, so fusion and prediction follow one after the other. With these operators we can compute the probabilistic map $m_{\mathbf{r}}(\cdot, t \mid Y_k)$, $t \ge t_k$, recursively as depicted in Figure 4.2. Formally, the recursion can be written as

$m_{\mathbf{r}}(\cdot, t_0 \mid \emptyset) = p_0(\cdot, t_0),$
$m_{\mathbf{r}}(\cdot, t_i \mid Y_{i-1}) = T_{\text{pred}}[t_i, t_{i-1}, m_{\mathbf{r}}(\cdot, t_{i-1} \mid Y_{i-1})],$
$m_{\mathbf{r}}(\cdot, t_i \mid Y_i) = T_{\text{fuse}}[y_i, m_{\mathbf{r}}(\cdot, t_i \mid Y_{i-1})],$
$m_{\mathbf{r}}(\cdot, t \mid Y_k) = T_{\text{pred}}[t, t_k, m_{\mathbf{r}}(\cdot, t_k \mid Y_k)],$

$i \in \{1, 2, \ldots, k\}$, where $Y_0 := \emptyset$ denotes an empty list of measurements.

Figure 4.2: Recursive computation of $m_{\mathbf{r}}(\cdot, t \mid Y_k)$, $t \ge t_k$, using the prediction and fusion operators.
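A minimal sketch of this recursion, assuming integer time stamps and reusing the illustrative fuse and predict_horizon helpers from the previous sections:

# A minimal sketch of the sequential recursion of Figure 4.2: alternate
# prediction up to each measurement's time stamp with fusion of that
# measurement. Measurements are assumed already ordered by time stamp.
def sequential_map(p0, measurements, P, cell_centers, likelihood_1, t_query):
    f, t = p0.copy(), 0
    for y in measurements:                 # y = (t_k, a_k, s_k), t_k increasing
        t_k = y[0]
        f = predict_horizon(f, P, t_k - t)           # T_pred[t_k, t, .]
        f = fuse(f, y, cell_centers, likelihood_1)   # T_fuse[y_k, .]
        t = t_k
    return predict_horizon(f, P, t_query - t)        # map at t_query >= t_k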
Out-of-Sequence Case  This is the case in which we deal with information arriving out of sequence, i.e., newly arriving information may carry an old time stamp. We need to store old probabilistic maps to handle this case, which may further be addressed by two algorithms: fusion-on-arrival and delayed-fusion. The out-of-sequence case is explained in detail by the following example scenario, also presented in Figure 4.3:

Example Scenario:
• $y_7$ arrives;
• after that, $y_i$ arrives, with $t_6 < t_i < t_7$.

Figure 4.3: Example scenario for recursive out-of-sequence computation of $m_{\mathbf{r}}(\cdot, t \mid Y_k)$, $t \ge t_k$, using the prediction and fusion operators.

Using the setting of this scenario, the fusion-on-arrival algorithm is presented first. In this case, we fuse newly arrived out-of-sequence information as soon as it arrives. To do this, the prediction step after time $t_6$ is revisited and recomputed to predict the probabilistic map at time $t_i$; the newly arrived information is then fused; this map is predicted up to time $t_7$; and, to complete the computation, the fusion step at time $t_7$ is computed again. The advantage of this algorithm is that the most current map is always available; on the other hand, we may waste computation. These steps are formally written as

$m_{\mathbf{r}}(\cdot, t_i \mid Y_6) = T_{\text{pred}}[t_i, t_6, m_{\mathbf{r}}(\cdot, t_6 \mid Y_6)],$
$m_{\mathbf{r}}(\cdot, t_i \mid Y_i) = T_{\text{fuse}}[y_i, m_{\mathbf{r}}(\cdot, t_i \mid Y_6)],$
$m_{\mathbf{r}}(\cdot, t_7 \mid Y_i) = T_{\text{pred}}[t_7, t_i, m_{\mathbf{r}}(\cdot, t_i \mid Y_i)],$
$m_{\mathbf{r}}(\cdot, t_7 \mid Y_7) = T_{\text{fuse}}[y_7, m_{\mathbf{r}}(\cdot, t_7 \mid Y_i)].$

This is depicted in Figure 4.4.

Figure 4.4: Recursive out-of-sequence computation: fusion-on-arrival algorithm.

Considering the timeline of this scenario, the delayed-fusion algorithm is considered next. Here we do not fuse newly arrived out-of-sequence information at the time it arrives. We define a new parameter $T$, called the delay, to represent the maximum network delay. When $y_7$ arrives, no computation is done until $T$ seconds of real time pass; likewise when $y_i$ arrives. At real time $t_i + T$, the algorithm predicts the probabilistic map at time $t_i$ and fuses $y_i$. Using the same idea of delaying the fusion, at real time $t_7 + T$ the algorithm predicts the probabilistic map at time $t_7$ and fuses $y_7$. Computation is not wasted in this algorithm, but this time we may not always have the most current version of the probabilistic map. This is depicted in Figure 4.5.

Figure 4.5: Recursive out-of-sequence computation: delayed-fusion algorithm.
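A minimal sketch of fusion-on-arrival, assuming positive integer time stamps and per-measurement map checkpoints; the class name and storage scheme are illustrative, and the replay loop is exactly the revisit-predict-fuse pattern described above:

# A minimal sketch of the fusion-on-arrival algorithm with map checkpoints.
class FusionOnArrival:
    def __init__(self, p0, P, cell_centers, likelihood_1):
        self.P, self.cells, self.l1 = P, cell_centers, likelihood_1
        self.log = []                   # all measurements, kept time-ordered
        self.maps = [(0, p0.copy())]    # checkpoints (t_k, f(., t_k | Y_k))

    def arrive(self, y):
        self.log.append(y)
        self.log.sort(key=lambda m: m[0])
        # Roll back to the last checkpoint strictly before the new stamp ...
        while self.maps[-1][0] >= y[0]:
            self.maps.pop()
        t, f = self.maps[-1]
        # ... then replay prediction/fusion for every later measurement.
        for m in [m for m in self.log if m[0] > t]:
            f = fuse(predict_horizon(f, self.P, m[0] - t),
                     m, self.cells, self.l1)
            t = m[0]
            self.maps.append((t, f.copy()))
        return f                        # most current map, all data fused

Delayed-fusion needs no replay: it simply buffers arrivals for $T$ seconds and then runs the sequential recursion on the (now ordered) measurements.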
4.7 Smoothing Operator

Filtering, prediction, and smoothing are the three major classes of the general estimation problem. We used a Bayesian approach to build our algorithms for probabilistic map building; filtering and prediction have been discussed briefly in the previous sections, and this section studies a smoothing operator. It is inspired by the optimal fixed-interval smoother using a Bayesian approach of [1]. We investigated the fixed-interval smoothing problem and its proposed solution; the following theorem is the result of that work. In this theorem, we provide a complete recursive algorithm, with recursive relationships for every density used, to determine the smoothed a posteriori density.

For simplicity of notation, from now on the index $k$ will stand for $t_k$ to denote the time index where necessary to avoid confusion, unless otherwise stated. The probabilistic map at time $k$ conditioned on all measurements $\{Y_N\}$ is computed recursively given all the measurements and the stored probabilistic maps; because this filter runs backward in time, it is called a smoothing filter. Once again, consider a list $Y_N := \{y_1, y_2, \ldots, y_N\}$ of $N$ measurements with $y_k := \{t_k, a_k, s_k\}$, corresponding to a positive-probability realization of $\mathbf{Y}_N$.

Assumption 4 (Independence).
1. $\{\mathbf{r}(k+1), \mathbf{y}_{k+1}, \ldots, \mathbf{y}_N\}$, given $\mathbf{r}(k)$, is independent of $\mathbf{y}_i$, $i \le k$.
2. $\{\mathbf{y}_{k+1}, \ldots, \mathbf{y}_N\}$, given $\mathbf{r}(k+1)$, is independent of $\mathbf{r}(k)$ and $\mathbf{y}_i$, $i \le k$.

Theorem 4. Under Assumption 4, the recursive smoothing algorithm that produces the smoothed probabilistic map $m_{\mathbf{r}}(\cdot, t_k \mid Y_N)$ from the predicted and filtered probabilistic maps $m_{\mathbf{r}}(\cdot, t_{k+1} \mid Y_k)$ and $m_{\mathbf{r}}(\cdot, t_k \mid Y_k)$ and the motion model $\varphi_1(r, r'; t_{k+1}, t_k)$ is defined by (4.26). Moreover, in case $m_{\mathbf{r}}(\cdot, t_{k+1} \mid Y_k)$ and $m_{\mathbf{r}}(\cdot, t_k \mid Y_k)$ are separable, then

$m_{\mathbf{r}}(r', t_k \mid Y_N) = c \left(\prod_{k=1}^{n} f(r'_k, t_k \mid Y_N)\right) \delta(r', d_{\min}), \quad \forall r' \in \mathcal{R}^n,$

where $f(\cdot, t_k \mid Y_N)$ is defined as

$f(r', t_k \mid Y_N) = f(r', t_k \mid Y_k) \left(\sum_{r \in \mathcal{R}} \varphi_1(r, r'; t_{k+1}, t_k)\, \frac{f(r, t_{k+1} \mid Y_N)}{f(r, t_{k+1} \mid Y_k)}\right), \quad \forall r' \in \mathcal{R}.$  (4.19)

The recursive algorithm given in (4.19) is a backward algorithm in which $k$ runs from $N - 1$ down to $1$. Initially, the data must be processed and the predicted and filtered probabilistic maps $m_{\mathbf{r}}(\cdot, t_{k+1} \mid Y_k)$ and $m_{\mathbf{r}}(\cdot, t_k \mid Y_k)$, for all $k \in \{1, 2, \ldots, N\}$, must be computed via the fusion-on-arrival algorithm and stored.

Proof. Smoothing, like prediction, starts by exploiting the total probability theorem:

$m_{\mathbf{r}}(r', k \mid Y_N) = \sum_{r \in \mathcal{R}^n} P(\mathbf{r}(k) = r', \mathbf{r}(k+1) = r \mid \mathbf{Y}_N = Y_N).$  (4.20)

Using Bayes' theorem, the term on the right-hand side becomes

$P(\mathbf{r}(k) = r', \mathbf{r}(k+1) = r \mid \mathbf{Y}_N = Y_N)$
$= \frac{P(\mathbf{r}(k) = r', \mathbf{r}(k+1) = r, \mathbf{y}_{k+1} = y_{k+1}, \ldots, \mathbf{y}_N = y_N \mid \mathbf{Y}_k = Y_k)}{P(\mathbf{r}(k+1) = r, \mathbf{y}_{k+1} = y_{k+1}, \ldots, \mathbf{y}_N = y_N \mid \mathbf{Y}_k = Y_k)}\, P(\mathbf{r}(k+1) = r \mid \mathbf{Y}_N = Y_N).$  (4.21)

Now consider the conditional joint density in the numerator:

$P(\mathbf{r}(k) = r', \mathbf{r}(k+1) = r, \mathbf{y}_{k+1} = y_{k+1}, \ldots, \mathbf{y}_N = y_N \mid \mathbf{Y}_k = Y_k)$
$= P(\mathbf{r}(k+1) = r, \mathbf{y}_{k+1} = y_{k+1}, \ldots, \mathbf{y}_N = y_N \mid \mathbf{r}(k) = r', \mathbf{Y}_k = Y_k)\, P(\mathbf{r}(k) = r' \mid \mathbf{Y}_k = Y_k).$

Using Condition 1 of the Independence Assumption 4,

$P(\mathbf{r}(k) = r', \mathbf{r}(k+1) = r, \mathbf{y}_{k+1} = y_{k+1}, \ldots, \mathbf{y}_N = y_N \mid \mathbf{Y}_k = Y_k)$
$= P(\mathbf{r}(k+1) = r, \mathbf{y}_{k+1} = y_{k+1}, \ldots, \mathbf{y}_N = y_N \mid \mathbf{r}(k) = r')\, P(\mathbf{r}(k) = r' \mid \mathbf{Y}_k = Y_k)$
$= P(\mathbf{y}_{k+1} = y_{k+1}, \ldots, \mathbf{y}_N = y_N \mid \mathbf{r}(k+1) = r, \mathbf{r}(k) = r')\, P(\mathbf{r}(k+1) = r \mid \mathbf{r}(k) = r')\, P(\mathbf{r}(k) = r' \mid \mathbf{Y}_k = Y_k).$

Considering Condition 2, we arrive at

$P(\mathbf{r}(k) = r', \mathbf{r}(k+1) = r, \mathbf{y}_{k+1} = y_{k+1}, \ldots, \mathbf{y}_N = y_N \mid \mathbf{Y}_k = Y_k)$
$= P(\mathbf{y}_{k+1} = y_{k+1}, \ldots, \mathbf{y}_N = y_N \mid \mathbf{r}(k+1) = r)\, P(\mathbf{r}(k+1) = r \mid \mathbf{r}(k) = r')\, P(\mathbf{r}(k) = r' \mid \mathbf{Y}_k = Y_k).$  (4.22)

On the other hand, the conditional density in the denominator can be expressed as

$P(\mathbf{r}(k+1) = r, \mathbf{y}_{k+1} = y_{k+1}, \ldots, \mathbf{y}_N = y_N \mid \mathbf{Y}_k = Y_k)$
$= P(\mathbf{y}_{k+1} = y_{k+1}, \ldots, \mathbf{y}_N = y_N \mid \mathbf{r}(k+1) = r, \mathbf{Y}_k = Y_k)\, P(\mathbf{r}(k+1) = r \mid \mathbf{Y}_k = Y_k).$

Using Condition 2, we write

$P(\mathbf{r}(k+1) = r, \mathbf{y}_{k+1} = y_{k+1}, \ldots, \mathbf{y}_N = y_N \mid \mathbf{Y}_k = Y_k)$
$= P(\mathbf{y}_{k+1} = y_{k+1}, \ldots, \mathbf{y}_N = y_N \mid \mathbf{r}(k+1) = r)\, P(\mathbf{r}(k+1) = r \mid \mathbf{Y}_k = Y_k).$  (4.23)

Substituting the two conditional densities (4.22) and (4.23) into (4.21), the density $P(\mathbf{r}(k) = r', \mathbf{r}(k+1) = r \mid \mathbf{Y}_N = Y_N)$ can be expressed as

$P(\mathbf{r}(k) = r', \mathbf{r}(k+1) = r \mid \mathbf{Y}_N = Y_N) = \frac{P(\mathbf{r}(k+1) = r \mid \mathbf{r}(k) = r')\, P(\mathbf{r}(k) = r' \mid \mathbf{Y}_k = Y_k)}{P(\mathbf{r}(k+1) = r \mid \mathbf{Y}_k = Y_k)}\, P(\mathbf{r}(k+1) = r \mid \mathbf{Y}_N = Y_N).$  (4.24)

Finally, substituting (4.24) into (4.20), the a posteriori conditional density $m_{\mathbf{r}}(r', k \mid Y_N)$ is found to be

$m_{\mathbf{r}}(r', k \mid Y_N) = \sum_{r \in \mathcal{R}^n} \frac{P(\mathbf{r}(k+1) = r \mid \mathbf{r}(k) = r')\, P(\mathbf{r}(k) = r' \mid \mathbf{Y}_k = Y_k)}{P(\mathbf{r}(k+1) = r \mid \mathbf{Y}_k = Y_k)}\, P(\mathbf{r}(k+1) = r \mid \mathbf{Y}_N = Y_N).$  (4.25)

We can easily see that smoothing is actually a specific series of fusion and prediction operators. To see this clearly, let us write the formula using the definitions of the probabilistic maps and the motion model as

$m_{\mathbf{r}}(r', k \mid Y_N) = m_{\mathbf{r}}(r', k \mid Y_k) \sum_{r \in \mathcal{R}^n} \Phi(r, r'; k+1, k)\, \frac{m_{\mathbf{r}}(r, k+1 \mid Y_N)}{m_{\mathbf{r}}(r, k+1 \mid Y_k)}.$  (4.26)
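A minimal sketch of the backward recursion (4.19) for a single object, assuming the filtered and one-step-predicted aggregate functions were stored during the forward pass as the theorem requires; the flooring of the denominator is an illustrative numerical guard:

# A minimal sketch of the smoothing recursion (4.19) on aggregate functions.
import numpy as np

def smooth(filtered, predicted, P):
    """filtered[k] = f(., t_k | Y_k); predicted[k] = f(., t_{k+1} | Y_k);
    P[i, j] = phi_1(cell j | cell i). Returns smoothed[k] = f(., t_k | Y_N)
    for k = 0..N-1, running backward from the final filtered map."""
    N = len(filtered)
    smoothed = [None] * N
    smoothed[-1] = filtered[-1]
    for k in range(N - 2, -1, -1):
        ratio = smoothed[k + 1] / np.maximum(predicted[k], 1e-300)
        smoothed[k] = filtered[k] * (P @ ratio)   # sum_r phi_1(r | r') ratio(r)
        smoothed[k] /= smoothed[k].sum()
    return smoothed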
4.8 Simulation Results

Next, a scenario is presented in which the prediction operator is used to continuously provide probabilistic maps. The scenario refers to the tracking of a mobile target with information coming over an unreliable network: even with loss of network connectivity, the probabilistic maps continue to provide information (with a natural loss of accuracy).

We receive measurements every second until 600 s. Then the communication link is lost and we do not receive any further updates; after 600 s, only the prediction operator runs, which is why we see the peaks bleed. The prediction operator is again a convolution operator. In terms of the directional search of the prediction, we predict in only one of the four main directions (southwesterly, southeasterly, northeasterly, northwesterly) after each fusion operation, but the directions are visited counter-clockwise and in order; that is, every 4 seconds we complete 360 degrees of search and prediction.

The uncertainty present in the location is clearly seen in Figure 4.6, which shows the probabilistic map after the first reading. As more readings are received, the target is pinpointed, as in Figure 4.7. Another thing observed in going from Figure 4.7 to Figure 4.8 is that the peaks follow the target to its new location; yet, right after a move, the target in Figure 4.8 is not located as precisely as in Figure 4.7. The peaks bleed from Figure 4.9 to Figure 4.11 under repeated prediction operations, since no updates are received for fusion.

Figure 4.6: Occupancy map for radars at 1 s.
Figure 4.7: Occupancy map for radars at 192 s.
Figure 4.8: Occupancy map for radars at 407 s.
Figure 4.9: Occupancy map for radars at 601 s.
Figure 4.10: Occupancy map for radars at 701 s.
Figure 4.11: Occupancy map for radars at 801 s.

Chapter 5

Particle Filter Maps vs Bayesian Approach

The classical problem in partially observable Markov chains is to recover the posterior distribution over the state at any time from all available sensor measurements. This problem is known as the Bayesian filtering problem, also called the optimal filtering or stochastic filtering problem. In the previous sections, we solved this problem with Bayes filters, which compute the posterior recursively. Other work in automatic control, done during the 1960s and 1970s, was based on sequential Monte Carlo integration methods [15, 14]. Most likely because of the primitive computers available at the time, these algorithms were overlooked. In the late 1980s, the great increase in computational power allowed the rebirth of numerical integration methods for Bayesian filtering.
The uncertainty in the target location is clearly visible in Figure 4.6, the probabilistic map after the first reading. As more readings are received, the target is pinpointed, as in Figure 4.7. Comparing Figure 4.7 with Figure 4.8, one also observes that the peaks follow the target to its new location; yet, right after a move, the target in Figure 4.8 is not located as precisely as in Figure 4.7. From Figure 4.9 to Figure 4.11, the peaks bleed with repeated prediction operations, since no updates are received for fusion.

Figure 4.6: Occupancy map at 1 s
Figure 4.7: Occupancy map at 192 s
Figure 4.8: Occupancy map at 407 s
Figure 4.9: Occupancy map at 601 s
Figure 4.10: Occupancy map at 701 s
Figure 4.11: Occupancy map at 801 s

Chapter 5
Particle Filter Maps vs Bayesian Approach

The classical problem in partially observable Markov chains is to recover a posterior distribution over the state at any time from all available sensor measurements. This is known as the Bayesian filtering problem, also called the optimal filtering or stochastic filtering problem. In the previous sections we proposed a solution to this problem based on Bayes filters, which compute this posterior recursively. Other work in automatic control was done during the '60s and '70s based on sequential Monte Carlo integration methods [15, 14]. Most likely because of the primitive computers available at the time, these algorithms were overlooked. In the late '80s, the great increase in computational power allowed the rebirth of numerical integration methods for Bayesian filtering. Research has since focused on Monte Carlo (MC) integration methods, which have the great advantage of not being subject to any linearity or Gaussianity hypotheses on the model.

Particle filters [7] comprise a broad family of sequential Monte Carlo algorithms for approximate inference in partially observable Markov chains (see [46, 7] for excellent overviews of particle filters and their applications). However, the application of particle filters to robotics problems is not without caveats. A range of problems arise from the fact that, no matter how detailed the probabilistic model, it will still be wrong and, in particular, will make false independence assumptions. Particle filters are approximate techniques for calculating posteriors in partially observable, controllable, discrete-time Markov chains, and they address the more general case of (nearly) unconstrained Markov chains. The basic idea is to approximate the posterior by a set of samples {r(k)^i}, or particles. Each r(k)^i is a concrete state sample of index i, with i ranging from 1 to M, the size of the particle filter. The most basic version of the particle filter is given later in this section.

Particle filters are attractive to roboticists for more than one reason. First and foremost, they can be applied to almost any probabilistic robot model that can be formulated as a Markov chain. Furthermore, particle filters are anytime algorithms: they do not require a fixed computation time; instead, their accuracy increases with the available computational resources. This makes them attractive to roboticists, who often face hard real-time constraints that have to be met using hard-to-control computer hardware. Finally, they are relatively easy to implement. A key property of particle filters is that each particle can be interpreted as a posterior over an entire path, not just the present pose. We study Bayesian estimation for hidden Markov models using importance sampling in the following sections.

5.1 Sequential Importance Sampling (SIS)

Inspired by the work in [7, 46], we develop a particle filter approach to probabilistic map building. For simplicity of notation, from now on the index k will stand for t_k to denote the time index where necessary to avoid confusion, unless otherwise stated. Suppose that there exists an object in a region R ⊂ R². For now we assume that the position of this object is not accurately known and that the object moves in time. For simplicity we take R to be a finite collection of N points, each representing a cell in R². We model the position of the object as a stochastic process taking values in R, and we denote by R^k := {r(0), r(1), ..., r(k)}, k ≥ 0, the positions of the object up to time k. Our knowledge about the position of the object improves with time through observations collected by the sensor over R. These observations are denoted by y_k, k ∈ K := {1, 2, ..., K}, and each y_k consists of a triple of the form

    {t_k, a_k, s_k},

where t_k ≥ 0 denotes the time at which the observation was made, a_k ∈ R the position of the sensor, and s_k the observation. We model the imperfection of the sensor in such a way that the variable s_k can either take the value s_k = ∅, if nothing was detected at that time, or some value s_k ∈ R, if an object positioned at s_k is detected. However, the sensors used to detect objects are not perfect, so s_k = ∅ simply means that the sensors were not able to detect an object, and one should consider that this may be a "false negative." Similarly, s_k ∈ R may be caused by a "false positive," or there may be a target close to s_k but not exactly at that location. The set of all possible observation triples is denoted by Y. In this section, however, we treat the case where at each time we have a positive reading, which may be either a true or a false positive. Y^K denotes the list {y_1, y_2, ..., y_K} of all sensor observations available.

We assume that an a priori joint probability distribution for the object at time t_0,

    p_0(r(0) | ∅) := P(r(0)),   r(0) ∈ R,

is known, where ∅ ∈ Y* denotes the empty sequence (empty observation set). The signal {r(k), k ∈ K := {0, 1, 2, ..., K}}, r(k) ∈ R, is an unobserved (hidden) Markov process with initial distribution p(r(0)) and transition equation p(r(k) | r(k−1)). Each observation y_k := {t_k, a_k, s_k}, k ∈ K, is conditionally independent of all other measurements given a particular configuration r of the objects of interest. In short, the model is a hidden Markov model (HMM) described by

    P(r(0)) and P(r(k) | r(k−1)),   (5.1)
    P(y_k | r(k)),   k ≥ 0.   (5.2)

The first step in analysing the data is to perform a complete forward sweep of a Monte Carlo filtering algorithm (provided below) to produce weighted particles {r(k)^i, w(k)^i}, for k = 1, 2, ..., K and i = 1, 2, ..., M, drawn approximately according to the importance function. We omit the precise details owing to lack of space, but note that this step is now a relatively standard procedure [7]. A simple choice is to use the prior distribution of the hidden Markov model as the importance function; this is the choice made by Handschin and Mayne [14, 15]:

    π(r(k) | R^{k−1}, Y^k) = P(r(k) | r(k−1))   (5.3)

and

    w(k)^i = w(k−1)^i P(y_k | r(k)^i).   (5.4)

Although the prior importance function is not in general the optimal one, in our case the optimal importance function and our choice coincide, because our transition function is not conditioned on the last observation. In general, however, this method is often inefficient in simulations, as the state space is explored without any knowledge of the observations.

Algorithm 1 (Sequential Importance Sampling (SIS)).

1. At time k = 0:
   For i = 1, ..., M, sample r(0)^i from the uniform initial map (a random initial reading is sampled from the same distribution, since an initial reading is needed as well).
   For i = 1, ..., M, evaluate the importance weights up to a normalizing constant:
       w(0)^i = P(y_0 | r(0)^i) P(r(0)^i) / π(r(0)^i | y_0).   (5.5)
   For i = 1, ..., M, normalize the importance weights:
       w̃(0)^i = w(0)^i / \sum_{j=1}^{M} w(0)^j.   (5.6)

2. At time k ≥ 1:
   For i = 1, ..., M, sample r(k)^i from π(r(k) | R^i_{k−1}, Y^k) and set R^i_k = (R^i_{k−1}, r(k)^i).
   For i = 1, ..., M, evaluate the importance weights up to a normalizing constant:
       w(k)^i = w(k−1)^i P(y_k | r(k)^i) P(r(k)^i | r(k−1)^i) / π(r(k)^i | R^i_{k−1}, Y^k).   (5.7)
   For i = 1, ..., M, normalize the importance weights:
       w̃(k)^i = w(k)^i / \sum_{j=1}^{M} w(k)^j.   (5.8)
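With the prior as importance function, Algorithm 1 reduces to propagate-and-reweight, as in the following sketch. The names transition_sample and likelihood stand in for the motion model P(r(k) | r(k−1)) and the sensor model P(y_k | r(k)); they are assumed callables, not part of the text above.

```python
import numpy as np

def sis_filter(Y, n_cells, transition_sample, likelihood, M=500, rng=None):
    """Sequential Importance Sampling (Algorithm 1) with the prior of the
    HMM as importance function, so weights update by the likelihood alone.

    Y                 : list of observations y_1 .. y_K
    n_cells           : number of cells in the (flattened) region R
    transition_sample : function(cell, rng) -> next cell, drawn from P(r(k)|r(k-1))
    likelihood        : function(y, cell)  -> P(y | r(k) = cell)
    """
    rng = rng or np.random.default_rng()
    # k = 0: sample particles from a uniform initial map, uniform weights
    particles = rng.integers(0, n_cells, size=M)
    w = np.full(M, 1.0 / M)
    for y in Y:
        # propagate each particle through the motion model (importance draw)
        particles = np.array([transition_sample(p, rng) for p in particles])
        # reweight by the likelihood of the new observation, Eq. (5.4)
        w *= np.array([likelihood(y, p) for p in particles])
        w /= max(w.sum(), 1e-300)          # normalize, Eq. (5.8)
    return particles, w
```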
Numerous algorithms proposed in the literature are special cases of this general (and simple) algorithm. The numerical complexity of this algorithm is O(M). This matters because in practice we take M ≫ 1, but the algorithm has the great advantage of being parallelizable. In the general case, the memory requirements are O((K+1)M), since it is necessary to keep all M simulated trajectories from time 0 to K. However, if π(r(k) | R^i_{k−1}, Y^k) = π(r(k) | r(k−1)^i, y_k) and one is only interested in the filtering distribution P(r(k) | Y^k), the memory requirements are O(M). In the general case, one obtains at time K the following estimate of the joint posterior distribution:

    \hat P(R^k | Y^k) = \sum_{i=1}^{M} w̃(k)^i δ(R^i_k, R^k).   (5.9)
5.2 Simulation

Let us work with the following scenario to illustrate the use of particle filters. We receive measurements every few seconds until 600 s. The figures below show the particle filter maps with 50, 100, and 500 samples at times 1, 192, and 407 seconds, where one object is being mapped.

Figure 5.1: Particle filter map with 50 samples at 1 s
Figure 5.2: Particle filter map with 50 samples at 192 s
Figure 5.3: Particle filter map with 50 samples at 407 s
Figure 5.4: Particle filter map with 100 samples at 1 s
Figure 5.5: Particle filter map with 100 samples at 192 s
Figure 5.6: Particle filter map with 100 samples at 407 s
Figure 5.7: Particle filter map with 500 samples at 1 s
Figure 5.8: Particle filter map with 500 samples at 192 s
Figure 5.9: Particle filter map with 500 samples at 407 s

5.3 Comparison: Particle Filter vs Bayesian

The two algorithms for the computation of the posterior distribution behave differently in different settings. Based on the work in the previous sections of this chapter, we conclude with the following brief points of comparison for the problems at hand.

Computation of the posterior distribution. The Bayesian approach produces the exact a posteriori probability distribution of the objects' locations, whereas the particle filter only approximates that posterior. It would be an interesting exercise to quantify how far particle filter solutions are from the exact posterior and how this difference varies with the number of objects mapped or with the frequency of the measurements received from the sensors; that study, however, is beyond the scope of this work. We may still note that the main criticism of particle filters has been that, in general, populating an n-dimensional space requires exponentially many particles in n; therefore, most successful applications have been confined to low-dimensional state spaces. There is also a survey of convergence results in [7, 46] and their references. What is meant by a convergence result there is that, under certain assumptions, the rate at which the approximation error converges to zero is independent of the dimension of the state space. Still, one has to use a very large number M of particles; otherwise the filter will not behave as a proper approximation of the posterior distribution.

Computational issues. Particle filters work very fast, at least for the single-object mapping to which we applied them in our problems at hand. We showed that the Bayesian approach is fast for multiple-object mapping as well. In general, both approaches are computationally fast and relatively easy to implement.

Assumptions needed. The particle filter approach needs only a few assumptions about the construction of the map, e.g. the motion model, the sparseness of the area in terms of sensors and objects, and the sensor models. The Bayesian approach requires sparseness of the objects as in Assumption 1 and is built on a specific sensor model.

Memory complexity. If one is only interested in the filtering distribution P(r(k) | Y^k), the memory requirement of the particle filter is O(M), where M is the number of particles used in the algorithm.
The corresponding requirement for the Bayesian filter is O(N), where N is the number of cells in the region. Remembering that one needs a very large number of particles for good convergence results, the Bayesian filter behaves better in this respect.

Numerical complexity. The numerical complexity of the particle filter algorithm is O(M), whereas that of the Bayesian filter is O(N). By the same line of reasoning as in the previous point, the Bayesian filter is the better choice.

Table 5.1 summarizes this comparison.

Table 5.1: Comparison: Particle Filter vs Bayesian Approach

  Criterion                     | Particle Filter                | Bayesian Approach
  ------------------------------+--------------------------------+---------------------------------
  Computation of the posterior  | Provides only an approximation | Computes the exact posterior
  distribution                  | of the exact distribution      | distribution
  Computationally fast          | Yes                            | Yes
  Assumptions                   | Requires few assumptions       | Assumes sparseness of objects;
                                |                                | assumes a specific sensor model
  Memory complexity             | O(M)                           | O(N)
  Numerical complexity          | O(M)                           | O(N)

Chapter 6
Minimum Risk Path Planning

For the problem at hand, probabilistic maps of the locations of objects are constantly computed by fusing information collected from all the sensors in the region. The objects that we map induce danger. The probabilistic maps are used for intelligence updates and for minimum-risk route planning: the map-building facility is used to compute paths for autonomous mobile robots that minimize the risk of being in a danger zone. The autonomous mobile robots could be air or ground vehicles, depending on the application. In some cases, paths can be calculated in an open-loop fashion: we calculate a minimum-risk path from a source point to a destination point and let the autonomous mobile robots travel to the destination along this path. We call this case Static Optimization. Paths can also be recomputed in a closed-loop fashion, meaning the path is recalculated while the robots are on their way to the destination along the previous path, so as to divert them from new danger zones. We call this Online Optimization.

Using the probabilistic maps introduced in the previous chapter, it is possible to assess the risk of a specific run from source to destination and even to do minimum-risk path planning. This can be done without ever explicitly computing the high-dimensional probabilistic map, working simply with aggregate measurement functions. To determine the probability of success, one needs to compute a danger map, which assigns to each point in the region the probability that an autonomous mobile robot going over that location will be destroyed. The probability of success of a particular run turns out to be linear in the values of a simple transformation of the danger map and is therefore computationally amenable to minimum-risk path planning. Robots follow minimum-risk paths and, by doing so, actively avoid high-threat regions.

If the application is military, the autonomous mobile robots are aircraft and the objects are radars co-located with SAM sites (enemy threats); each run from a source point to a destination point is then a mission from one's own base to the enemy base. Another application could be autonomous mobile robots for a fire department: in the case of a fire (e.g. a forest or building fire), a minimum-risk path can be calculated to avoid the high-temperature regions.
6.1 Mission risk assessment

For the rest of the chapter we study the case of n static objects and a single robot. To compute the probability of success of a particular run, suppose now that we are interested in having an autonomous mobile robot go from a source point to a destination point. The trajectory of the robot is defined by m way points {x_1, x_2, ..., x_m}. The conditional a posteriori probability p_survive that the robot will survive this run, given a list of past measurements Y^k = Y ∈ Y*, can then be computed as

    p_survive = P(S_m | Y^k = Y) = \prod_{j=1}^{m} P(S_j | Y^k = Y, S_{j−1}),

where S_j, j ∈ {0, 1, 2, ..., m}, denotes the event that the robot survives up to the j-th way point (by convention, S_0 is the whole sample space). Since P(S_j | Y^k = Y, S_{j−1}) is the conditional probability that a robot going over x_j will not be destroyed at that point, we have

    P(S_j | Y^k = Y, S_{j−1}) = 1 − m_danger(x_j | Y),

where one needs to compute the a posteriori danger map, defined by

    m_danger(a | Y) := P(K | Y^k = Y, a_k = a),

where Y denotes a list of k observations and K the event that a robot positioned at a, with an object within its range, will be destroyed as a result. To compute the danger map we need the parameter ρ_sensor defined before, together with the following one:

    p_destroy ∈ [0, 1]: the probability that an autonomous robot will be destroyed by an object within its range (in general, this could be a function of the robot position and of the other side's intention).

Denoting by R the event that the robot has at least one object within its range, we have

    m_danger(a | Y) := P(K | Y^k = Y, a_k = a, R) P(R | Y^k = Y, a_k = a)
                     = p_destroy \sum_{∃i : ||r_i − a|| ≤ ρ_sensor} m_r(r | Y).

A bound on m_danger(a | Y) can be computed by allowing "double-counting" when the robot is within range of multiple objects. When this is done, we obtain the following "worst-case" upper bound:

    m_danger(a | Y) ≤ p_destroy \sum_{||x − a|| ≤ ρ_sensor} m_o(x | Y),   (6.1)

where one needs to calculate the a posteriori occupancy map of the objects, defined by

    m_o(x | Y) := P(∃i : r_i = x | Y^k = Y),   x ∈ R,

where Y ∈ Y* denotes a list of k observations and r_i the positions of the objects. In the formula above we actually have equality when the robot has at most one object within its range; this can be known to occur with probability one when, e.g., the prior probability places the objects far apart. Therefore

    p_survive = \prod_{j=1}^{m} (1 − m_danger(x_j | Y)).

It is often convenient to express this as a summation, which can be achieved by taking the logarithm of p_survive:

    log(p_survive) = \sum_{j=1}^{m} log_live(x_j, Y),   (6.2)

where

    log_live(x, Y) := log(1 − m_danger(x | Y)),   x ∈ R,

is called the log-live map. Using (6.1) we obtain the following worst-case bound for the log-live map:

    log_live(x, Y) ≥ log( 1 − p_destroy \sum_{||x̄ − x|| ≤ ρ_sensor} m_o(x̄ | Y) ),   x ∈ R.

Note that, because of the monotonicity of the logarithm, optimizing log(p_survive) is equivalent to optimizing p_survive.
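The worst-case bound (6.1) and the log-live map are straightforward to compute from an occupancy map. The sketch below assumes an array of cell-center coordinates and vectorizes the within-range test; all names are illustrative.

```python
import numpy as np

def loglive_map(m_o, cell_xy, rho_sensor, p_destroy):
    """Worst-case log-live map from an occupancy map, Eqs. (6.1)-(6.2).

    m_o        : occupancy map, m_o[x] = P(some object at cell x)
    cell_xy    : (C, 2) array of cell-center coordinates
    rho_sensor : object engagement radius
    p_destroy  : probability of destruction when inside an object's range
    """
    diff = cell_xy[:, None, :] - cell_xy[None, :, :]
    within = np.linalg.norm(diff, axis=-1) <= rho_sensor
    m_danger = p_destroy * (within.astype(float) @ m_o)  # double-counting bound
    m_danger = np.minimum(m_danger, 1.0 - 1e-12)         # keep log finite
    return np.log1p(-m_danger)                           # log(1 - m_danger)

def log_p_survive(path_cells, ll):
    """log p_survive of a run, Eq. (6.2): sum of log-live over way points."""
    return ll[path_cells].sum()
```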
We can therefore compute an m-step path of minimum risk from x_init to x_final using (6.2):

    max_{x_1, x_2, ..., x_m : x_1 = x_init, x_m = x_final} \sum_{j=1}^{m} log_live(x_j, Y),

where the maximization is further constrained by the robot's kinematic model. It is also straightforward to compute paths that are Pareto-optimal with respect to the two costs: probability of survival p_survive and path length m. We address these in Section 6.3.

6.2 Occupancy maps for objects

The goal of this section is to compute the a posteriori occupancy map of the objects, defined by

    m_o(x | Y) := P(∃i : r_i = x | Y^k = Y),   x ∈ R,

where Y ∈ Y* denotes a list of k observations. From the definition of the probabilistic map m_r(· | Y), we conclude that

    m_o(x | Y) = \sum_{{r ∈ R^n : ∃i : r_i = x}} m_r(r | Y).   (6.3)

When the a priori joint probability precludes two objects from being in the same cell, i.e.,

    r_i = r_j, i ≠ j  ⇒  p_0({r_1, r_2, ..., r_n}) = 0,

the same will hold for the a posteriori map m_r, and (6.3) can be written as

    m_o(x | Y) = \sum_{m=1}^{n} \sum_{{r ∈ R^n : r_m = x}} m_r(r | Y)
               = n \sum_{r_j ∈ R, j > 1} m_r({x, r_2, ..., r_n} | Y),   (6.4)

where we also used indistinguishability. Note that, in this case,

    \sum_{x ∈ R} m_o(x | Y) = n.   (6.5)

Suppose now that the only a priori information about the positions of the objects is that there is a minimum distance d_min between any two of them. The a priori joint probability map is then of the form

    p_0({r_1, r_2, ..., r_n}) := { 0  if ∃ i ≠ j : ||r_i − r_j|| ≤ d_min;  c  otherwise },   {r_1, r_2, ..., r_n} ∈ R^n,   (6.6)

where c is a positive constant. In this case, using the a posteriori map given by (4.18), we obtain

    m_o(x | Y) = c̄ f(x | Y) \sum_{r_2 ∈ R[x]} f(r_2 | Y) \sum_{r_3 ∈ R[x] ∩ R[r_2]} f(r_3 | Y) ··· \sum_{r_n ∈ R[x] ∩ R[r_2] ∩ ··· ∩ R[r_{n−1}]} f(r_n | Y),   (6.7)

where c̄ is a normalizing constant chosen so that (6.5) holds, and R[z] := {z̄ ∈ R : ||z̄ − z|| > d_min}, z ∈ R. A worst-case upper bound on the a posteriori occupancy map of the objects can then be computed as

    m_o(x | Y) ≤ c̄ f(x | Y) \sum_{r_2 ∈ R[x]} f(r_2 | Y) \sum_{r_3 ∈ R[x]} f(r_3 | Y) ··· \sum_{r_n ∈ R[x]} f(r_n | Y) = c̄ f̄^{n−1} f(x | Y),

where f̄ := \sum_{z ∈ R[x]} f(z | Y). However, this bound is difficult to use because it is not easy to compute the normalizing constant.

Fix now some Y ∈ Y* and suppose that there exists a partition R_0, R_1, R_2, ..., R_m of R such that:

1. all points in each R_i, i ∈ {1, 2, ..., m}, are no more than d_min away from each other, i.e., ||z − z̄|| ≤ d_min ∀ z, z̄ ∈ R_i; therefore there can be no more than one object in each R_i;

2. all the R_i, i ∈ {1, 2, ..., m}, are more than d_min away from each other, i.e., ||z − z̄|| > d_min ∀ z ∈ R_i, z̄ ∈ R_j, i, j ∈ {1, 2, ..., m}, i ≠ j; therefore distinct objects can be at any positions in distinct R_i, i ∈ {1, 2, ..., m};

3. the values of f(z | Y) are sufficiently uniform over R_0 so that, for any R̄ ⊂ R_0 with small area compared to R_0,

    \sum_{z ∈ R_0 \ R̄} f(z | Y) ≈ \frac{|R_0| − |R̄|}{|R_0|} \sum_{z ∈ R_0} f(z | Y) ≈ \sum_{z ∈ R_0} f(z | Y),   (6.9)

where |R_0| and |R̄| denote the areas of R_0 and R̄, respectively. The first approximation comes from the uniformity of f(z | Y) over R_0, the second from the fact that the area of R̄ is small.

Note that, even when (6.9) does not hold, we still have the following upper and lower bounds on \sum_{z ∈ R_0 \ R̄} f(z | Y):

    ( \sum_{z ∈ R_0} f(z | Y) ) − |R̄| max_{z ∈ R_0} f(z | Y) ≤ \sum_{z ∈ R_0 \ R̄} f(z | Y) ≤ \sum_{z ∈ R_0} f(z | Y).   (6.10)
Pick now some x ∈ R. Using the fact that the R_i form a partition of R, together with indistinguishability, we can write (6.4) as

    m_o(x | Y) = n ( \sum_{r_2, ..., r_n ∈ R_0} m_r({x, r_2, ..., r_n} | Y)
               + \sum_{k=1}^{n−1} \binom{n−1}{k} \sum_{r_2, ..., r_{1+k} ∈ ∪_{i=1}^{m} R_i} \sum_{r_{2+k}, ..., r_n ∈ R_0} m_r({x, r_2, ..., r_n} | Y) ).   (6.11)

Let us denote by σ := {σ_1, σ_2, ..., σ_k} a combination of k elements taken from the set {i : x ∉ R_i} ⊂ {1, 2, ..., m}, and by S(k) the set of all such combinations. Using the fact that there can be no more than one object in each R_i, i ∈ {1, ..., m}, we can further expand (6.11) as

    m_o(x | Y) = n \sum_{r_2, ..., r_n ∈ R_0} m_r({x, r_2, ..., r_n} | Y)
               + n \sum_{k=1}^{n−1} \binom{n−1}{k} k! \sum_{σ ∈ S(k)} \sum_{r_{i+1} ∈ R_{σ_i}, i ∈ {1,...,k}} \sum_{r_{k+2}, ..., r_n ∈ R_0} m_r({x, r_2, ..., r_n} | Y)
               = n \sum_{r_2, ..., r_n ∈ R_0} m_r({x, r_2, ..., r_n} | Y)
               + \sum_{k=1}^{n−1} \frac{n!}{(n−1−k)!} \sum_{σ ∈ S(k)} \sum_{r_{i+1} ∈ R_{σ_i}, i ∈ {1,...,k}} \sum_{r_{k+2}, ..., r_n ∈ R_0} m_r({x, r_2, ..., r_n} | Y).

In this case, using the approximate a posteriori map given by (4.18), we obtain

    m_o(x | Y) = c̄ f(x | Y) ( n \sum_{r_2, ..., r_n ∈ R_0} f(r_2 | Y) ··· f(r_n | Y) p_0({x, r_2, ..., r_n})
               + \sum_{k=1}^{n−1} \frac{n!}{(n−1−k)!} \sum_{σ ∈ S(k)} \sum_{r_{i+1} ∈ R_{σ_i}, i ∈ {1,...,k}} f(r_2 | Y) ··· f(r_{k+1} | Y)
                 \sum_{r_{k+2}, ..., r_n ∈ R_0} f(r_{k+2} | Y) ··· f(r_n | Y) p_0({x, r_2, ..., r_n}) ),   (6.13)

where c̄ is a normalizing constant chosen so that (6.5) holds. However, from (6.6) we conclude that, for x, r_2, ..., r_{k+1} away from each other by more than d_min,

    \sum_{r_{k+2}, ..., r_n ∈ R_0} f(r_{k+2} | Y) ··· f(r_n | Y) p_0({x, r_2, ..., r_n})
      = c \sum_{r_{k+2} ∈ R̄[x, r_2, ..., r_{k+1}]} f(r_{k+2} | Y) \sum_{r_{k+3} ∈ R̄[x, r_2, ..., r_{k+2}]} f(r_{k+3} | Y) ··· \sum_{r_n ∈ R̄[x, r_2, ..., r_{n−1}]} f(r_n | Y),

where R̄[z_1, z_2, ..., z_j] := {z̄ ∈ R_0 : ||z̄ − z_i|| > d_min for every i}, with each z_i ∈ R. When π d²_min is small compared to the total area of R_0, because of (6.9) we further conclude that

    \sum_{r_{k+2}, ..., r_n ∈ R_0} f(r_{k+2} | Y) ··· f(r_n | Y) p_0({x, r_2, ..., r_n}) ≈ c f_0(Y)^{n−k−1}

(when (6.9) is not valid we can still use (6.10) to get upper and lower bounds on the a posteriori occupancy map), where f_i(Y) := \sum_{r ∈ R_i} f(r | Y), i ∈ {0, 1, ..., m}. Replacing this in (6.13), we obtain

    m_o(x | Y) ≈ c̄ f(x | Y) ( n f_0(Y)^{n−1} + \sum_{k=1}^{n−1} \frac{n! f_0(Y)^{n−k−1}}{(n−1−k)!} \sum_{σ ∈ S(k)} f_{σ_1}(Y) f_{σ_2}(Y) ··· f_{σ_k}(Y) ),   (6.14)

where the constant c̄ was redefined to absorb c.

The following lemma can be adapted to the setting of this work.

Lemma 1. The occupancy map at time t given Y^k is given by

    m_o(x | Y) = \sum_{r ∈ R^n : ∃i : r_i = x} m_r(r | Y).

Moreover, in case Assumption 1 holds, then

    m_o(x | Y) ≈ c f(x | Y) ( n f_0(Y)^{n−1} + \sum_{k=1}^{n−1} \frac{n! f_0(Y)^{n−k−1}}{(n−1−k)!} \sum_{σ ∈ S(k)} f_{σ_1}(Y) f_{σ_2}(Y) ··· f_{σ_k}(Y) ),

where f_i(Y) := \sum_{r ∈ R_i} f(r | Y), i ∈ {0, 1, ..., m}, with the regions R_i defined as above, and c is a normalizing constant chosen so that \sum_{x ∈ R} m_o(x | Y) = n.

Lemma 1 allows us to compute the a posteriori occupancy map m_o(· | Y) directly from the aggregate measurement function f(· | Y), without explicitly computing the a posteriori probabilistic map m_r(· | Y).
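Lemma 1 can be evaluated directly, e.g. as in the sketch below, which assumes the partition R_0, R_1, ..., R_m has already been constructed as lists of cell indices; the combinatorial sum over S(k) is taken with itertools. This is an illustrative implementation under those assumptions, not the thesis code.

```python
import math
from itertools import combinations
import numpy as np

def occupancy_from_aggregate(f, regions, n):
    """Occupancy map of Lemma 1 computed directly from the aggregate
    measurement function f, without forming the joint map m_r.

    f       : array over cells, the aggregate measurement function f(.|Y)
    regions : list (R_0, R_1, ..., R_m) of cell-index arrays partitioning R;
              R_1..R_m are the isolated regions, R_0 the background
    n       : number of objects
    """
    fi = np.array([f[R].sum() for R in regions])   # f_i(Y), i = 0..m
    m_o = np.zeros_like(f, dtype=float)
    for i, R in enumerate(regions):
        # combinations sigma are drawn from the regions not containing x
        others = [j for j in range(1, len(regions)) if j != i]
        total = n * fi[0] ** (n - 1)
        for k in range(1, n):                      # k = 1 .. n-1
            coef = math.factorial(n) / math.factorial(n - 1 - k)
            s = sum(np.prod(fi[list(sig)]) for sig in combinations(others, k))
            total += coef * fi[0] ** (n - 1 - k) * s
        m_o[R] = f[R] * total
    return n * m_o / m_o.sum()    # normalize so the map sums to n, Eq. (6.5)
```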
Likewise, "minimum-risk"costwillbeafunctionofprobabilityofsurvival p survive , J 2 =f(p survive ) This can be presented as min x 1 ;x 2 ;:::;x m : x 1 =x init ;x m =x ¯nal J i ·J max i J j ; i;j2f1;2g;i6=j; whereJ 1 =T andJ 2 =¡log(p survive )arethecostsandJ max i costconstraints. The solutionstothesekindsofproblemsarecalled Pareto-optimal. Pareto-optimal so- lution stands out as a reasonable equilibrium solution for the cooperative game problems, since it features the property that no other joint decision of the play- ers can improve the performance of at least one of them, without degrading the performance of the other. In our case, there is no other cost pair than the Pareto- optimal solution itself that might be found to decrease one of the costs without increasing the other. 129 To compute a minimum risk path, we convert the problem into ¯nding an optimal path on a graph. We consider a directed graph G = (V;E) where V is the set of vertices and E is the set of edges. We then consider the problem of ¯nding a path from a source vertex to a destination vertex such that the sum of the costs on the path satisfy the cost constraints. We can formulate the problem as min e e 1 =e init ;em=e ¯nal P i J 2 (e i )·J max 2 X i J 1 (e i ); i2f1;¢¢¢ ;ng; where e = (e 1 ;:::;e n ) is the set of edges de¯ning the path. In our case, the vertices are the cells in a regionR½R 2 and the edges are the paths that connect two cells. (v 1 ;v 2 )2 E if and only if a robot can go from v 1 to v 2 . Let us denote "time" cost as J(e)= 8 > > > < > > > : p (2) e is a diagonal edge in the map 1 e is a vertical or horizontal edge in the map and the "risk" cost as J(e)= v i +v j 2 ; e =(v i ;v j ) 130 where v i and v j represent the probability of danger associated to those vertices. However, we may choose to use the probability of survival, p survive , associated to those vertices, in which case it is a maximization problem. This case may still be presented as a minimization problem by representing danger map as a function of survival map. Because of the way we model the latter cost in this problem, we do not consider half the cost of the ¯rst and last vertex in the path in the optimization problem. Yet this does not e®ect the solution since the ¯rst and last cell in the path are ¯xed for each path possible. The above problem is solved using the work done in [38]. 6.4 Simulation Results The algorithms presented above were implemented in a simulator of military air- operations developed by the Honeywell Technology Center. Figure 6.1 shows a snapshot of the main simulation window displaying the battle¯eld during a mis- sion that consisted of °ying several aircraft from one air-base to another, passing through a region infested by enemy SAMs. In this ¯gure, enemy radars with co-located SAMs are represented by large triangles and the friendly aircraft are represented by small circles or triangles (depending on the type of weapons they are carrying). In this simulation, all aircraft were carrying radar detection equip- ment. 131 Figure 6.1: Battle¯eld Figure 6.2 shows a snapshot of the occupancy map de¯ned in Section 6.2 (this snapshot was taken at the same time as the one in Figure 6.1). The probability of occupancy is encoded in the background color (darker corresponding to a higher probability) and each \x" corresponds to a location where an aircraft \detected" a radar. 
6.4 Simulation Results

The algorithms presented above were implemented in a simulator of military air operations developed by the Honeywell Technology Center. Figure 6.1 shows a snapshot of the main simulation window displaying the battlefield during a mission that consisted of flying several aircraft from one air base to another, passing through a region infested with enemy SAMs. In this figure, enemy radars with co-located SAMs are represented by large triangles, and the friendly aircraft by small circles or triangles (depending on the type of weapons they carry). In this simulation, all aircraft carried radar detection equipment.

Figure 6.1: Battlefield

Figure 6.2 shows a snapshot of the occupancy map defined in Section 6.2 (this snapshot was taken at the same time as the one in Figure 6.1). The probability of occupancy is encoded in the background color (darker corresponding to higher probability), and each "x" corresponds to a location where an aircraft "detected" a radar. The occupancy map shown in Figure 6.2 exhibits large certainty about the positions of the radars, since most probabilities in the occupancy map have already collapsed to values very close to 0 or to 1. This is true in spite of the several false positives that can be seen in the plot.

Figure 6.2: The occupancy map of radars

The log-live map defined in Section 6.1 is plotted in Figure 6.3; this is an example of static optimization. In this figure, darker regions indicate smaller values of the log-live map and therefore more danger, whereas lighter regions correspond to safer areas. The line superimposed on the log-live map corresponds to a path that is Pareto-optimal for the pair of costs probability of survival and path length (with distance measured in the Manhattan sense).

Figure 6.3: The log-live map of radars

Figure 6.4 is an example of online optimization from the same simulation environment. In this run, we recalculated the optimal path for each aircraft during the mission.

Figure 6.4: An example of online optimization

Chapter 7
Probabilistic Pursuit Evasion

The pursuit-evasion problem in a region R ⊂ R² is to capture a mobile object (called a fugitive) occupying an unknown location in R with a team of pursuers (called searchers). The fugitive is said to be captured when one of the searchers occupies the same position as the fugitive. Unlike in minimum-risk path-planning applications, we do not want to avoid the objects we map, but to pursue them. Depending on the particular context, capture may actually mean handling the evader in some particular way, e.g. rescuing it. We do not take into account physical properties of a pursuer or evader, such as shape and size; a pursuer or evader is therefore represented simply by its position in the region R ⊂ R² in which the pursuit takes place. Hence we assume that a pursuer and an evader can be located at the same position.

For simplicity we assume that both space and time are quantized. The region R ⊂ R² in which the pursuit takes place is then regarded as a finite collection of N points, each representing a cell in R², and all events take place on a set of discrete times T. Here the events include the motion of, and the collection of sensor data by, the pursuers and evaders. For simplicity of notation we take equally spaced event times; in particular, T := {1, 2, ...}. That said, we may not have sensor readings at every event time. For each time t ∈ T at which we receive a measurement, we denote by y(t) the set of all measurements taken by the pursuers at time t. Every y(t) is assumed to be a random variable with values in a measurement space Y. At each time t ∈ T it is possible to execute a control action u(t) that, in general, affects the pursuers' sensing capabilities at times τ ≥ t. Each control action u(t) is a function of the measurements before time t and should therefore be regarded as a random variable taking values in a control action space U. We assume that the motion of each pursuer is constrained so that, in a single time step, it can only move to cells close to its present position. Readers are referred to Chapter 3 for details.
7.1 Probabilistic Approach and Challenges

Possible scenarios for search include search-and-rescue operations, localizing and retrieving parts in a warehouse, localizing and neutralizing environmental threats (e.g. forest fires or the development of algae in the ocean), and search-and-capture missions (e.g. law enforcement). In some of these problems the objects move more or less randomly (e.g. search and rescue), whereas in others they actively avoid detection (e.g. search and capture). The objects we pursue might in general be hidden, moving, or actively avoiding capture. Our solution to the pursuit-evasion problem on R is an algorithm that moves the searchers in R so that the fugitive is eventually captured, no matter how the fugitive moves. Our approach is inspired by the works of [19, 17, 37, 41, 51]; [19, 17] are the works most closely related to our study, and their pursuer teams can be described in terms of two actions:

1. Detect potential evaders.
2. Move to locations where potential evaders were detected, to verify their identity and handle them.

Let us examine each related piece of work in relation to ours. [41] develops an algorithm for the pursuit-evasion problem incrementally, by considering variations of the problem of increasing complexity. This is a deterministic approach in which the searchers can see all cells in R and the location of the fugitive is known, whereas our approach is probabilistic, with imperfect sensor models, and we do not know the location of the fugitive a priori. The work in this section is quite different from the hybrid pursuit game in [37], which assumes that the search and capture operations cannot be done at the same time. The framework and the solution of the pursuit-evasion problem in this section also differ from [19, 17]. Those works propose a greedy policy to control mobile agents in pursuit; this amounts to finding the local maximum point of the map, where it is most likely to find an evader nearby. This is also investigated in [51]. This work adds a new pursuit policy in which the global maximum point is calculated, so that the pursuers follow the cell of the map most likely to contain an evader. This is more complicated than the local-max policy, but it works better, since the pursuers do not lose time following local maximum peaks of the map.

7.2 Pursuit Policies

In this section, our pursuit policy with search paths for a team of pursuers is presented. We will compare the performance of this policy to the performance of the other greedy policies (the local-max and global-max policies) in terms of capture time and present the results of the simulations in the next section. For now, let us introduce all the pursuit policies. For each time t ∈ T we denote by Y_t ∈ Y* the sequence of measurements {y(1), ..., y(t)} taken up to time t (given a set A, we denote by A* the set of all finite sequences of elements of A and, given some a ∈ A*, we denote by |a| the length of the sequence a). By a pursuit policy we mean a function g : Y* → U that maps the measurements taken up to some time to the control action executed at the next time instant, i.e.,

    u(t+1) = g(Y_t),   t ∈ T.   (7.1)

We denote by x_e the position of the fugitive and by x_p := {x_{p_1}, x_{p_2}, ..., x_{p_{n_p}}} the positions of the pursuers. Both the fugitive and the pursuers can move, so x_e and x_p are time-dependent quantities; at each time t ∈ T, x_e(t) and each x_{p_i}(t) are random variables taking values in R. At each time t ∈ T, the control action u(t) := {u_1, u_2, ..., u_{n_p}} consists of a list of desired positions for the pursuers at time t. Two pursuers should not occupy the same cell, so each u(t) must be an element of the control action space U,

    U := { {v_1, v_2, ..., v_{n_p}} : v_i ∈ R, v_i ≠ v_j for i ≠ j }.
Local-Max Policy. This policy is also known as the greedy pursuit policy with constrained motion in [19, 17]. Under this policy, each pursuer moves to the cell in its one-step reachable set with the highest probability of containing the fugitive, given the measurements Y_t = Y taken by the pursuers up to t, i.e.,

    g_lm(Y) = arg max_{{v_1, ..., v_{n_p}} ∈ U, v_k one-step reachable} \sum_{k=1}^{n_p} p_e(v_k, Y),   Y ∈ Y*.   (7.2)

Global-Max Policy. This policy searches over the entire map in order to compute the control that maximizes the probability of capturing the fugitive. A navigation policy takes the list of desired positions for the pursuers and produces a list of positions "one step closer" to the desired list. Given the measurements Y_t = Y taken by the pursuers up to t, the global-max pursuit policy is defined as

    g_gm(Y) = nav(x*_p, k*, Y),   Y ∈ Y*,   (7.3)

where the navigation policy is the function nav : U × {1, 2, ..., n_p} × Y* → U defined as follows.

Algorithm 2 (Global-Max Navigation Algorithm (GM-NA)). At every time k, the navigation function

1. computes the cell with maximum probability of capture,

    x* = arg max_{x ∈ R} p_e(x, Y),   Y ∈ Y*;   (7.4)

2. defines the desired positions of the pursuers as

    x*_p = arg max_{{v_1, ..., v_{n_p}} ∈ U} \sum_{k=1}^{n_p} p_e(v_k, Y),   Y ∈ Y*;   (7.5)

3. takes the list of desired positions x*_p, together with the measurements, and produces a position reachable in a single time step that is one step closer to x*_p, where k* is the integer for which x* = x*_{p_{k*}}.

Pursuit Policy with Search Paths. Our pursuit policy is similar to the global-max policy: it also searches over the entire map in order to compute the control that maximizes the probability of capturing the fugitive, but the underlying navigation policy is different. Let dist : R × R → R be the function giving the Euclidean distance between two points in the region R. Given the measurements Y_t = Y taken by the pursuers up to t, the pursuit policy with search paths is defined as

    g_sp(Y) = nav_sp(x*, Y),   Y ∈ Y*,   (7.6)

where the navigation policy with search paths is the function nav_sp : R × Y* → U defined as follows.

Algorithm 3 (Navigation Algorithm with Search Paths (NA-SP)). At every time k, the navigation function defines a search-region-at-k, R_k := R, equal to the whole search region R, and a searcher-list-at-k, x_{s_k} := x_{p_k}, equal to the list of all the pursuers, and then repeats the following actions in a loop until all the pursuers have been moved to their next locations.

1. Find the cell with the maximum probability of capture in the search-region-at-k, R_k:

    x* = arg max_{x ∈ R_k} p_e(x, Y),   Y ∈ Y*.   (7.7)

2. Assign this maximum point x* as the final destination point of the pursuer x_{p_i} ∈ x_{s_k} that is closest to x*, i.e.,

    dist(x_{p_i}, x*) ≤ min_{j : j ≠ i, j ∈ {1, ..., n_p}} dist(x_{p_j}, x*).   (7.8)

3. Compute the search path S_i, the shortest path between the closest pursuer x_{p_i} and the maximum point x*, where the minimization is

    S_i = min_{x_1 = x_{p_i}, x_2, ..., x_m = x*} \sum_{j=1}^{m−1} c_j,   (7.9)

   with m the path length, x_j, j ∈ {1, ..., m}, the cells along the search path, and c_j the cost of the link between cells x_j and x_{j+1}. For simplicity, in our case c_j = c is constant; that is, every link between two cells on the search path has the same cost.

4. Move the closest pursuer x_{p_i} one step closer to the maximum point x* along the search path S_i.

5. Compute the search width W_i for the search path S_i, based on a function g(R, ρ_range) of the size of the map and of the range of the sensors on the pursuer, ρ_range:

    W_i = g(R, ρ_range).   (7.10)

   Then define the do-not-search region R̃ as the set of cells that are on the search path S_i or at most W_i away from S_i, and finally remove the do-not-search region R̃ from the search-region-at-k, R_k, and the closest pursuer x_{p_i} from the searcher-list-at-k, x_{s_k}.

6. Repeat from step 1 until every pursuer has been moved.

Let us now take a look at the simulation results to compare the pursuit policy with search paths to the other two greedy policies.
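Before turning to the simulations, here is a sketch of the assignment loop of Algorithm 3 (steps 1, 2, and 5), with the grid shortest path of (7.9) replaced by a straight segment for brevity. p_e is the map over cells, cell_xy the cell centers, pursuers their coordinates, and width the value of g(R, ρ_range); all are illustrative names under those assumptions.

```python
import numpy as np

def assign_search_paths(p_e, cell_xy, pursuers, width):
    """Greedy NA-SP assignment: repeatedly find the most likely cell in the
    remaining search region, give it to the closest free pursuer, and carve
    a do-not-search band of the given width around the segment
    pursuer -> target (stand-in for the shortest path of Eq. (7.9))."""
    active = np.ones(len(p_e), dtype=bool)      # search-region-at-k
    free = list(range(len(pursuers)))           # searcher-list-at-k
    goals = {}
    while free:
        masked = np.where(active, p_e, -np.inf)
        x_star = int(masked.argmax())                        # step 1
        d = [np.linalg.norm(pursuers[i] - cell_xy[x_star]) for i in free]
        i_star = free.pop(int(np.argmin(d)))                 # step 2
        goals[i_star] = x_star
        # distance from every cell to the segment pursuer -> goal
        a, b = pursuers[i_star], cell_xy[x_star]
        ab = b - a
        t = np.clip(((cell_xy - a) @ ab) / max(ab @ ab, 1e-12), 0.0, 1.0)
        dseg = np.linalg.norm(cell_xy - (a + t[:, None] * ab), axis=1)
        active &= dseg > width                               # step 5
    return goals
```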
7.3 Simulation Results: All Policies Considered

This simulation is based on our MICA setup, that is, a military airborne search-and-destroy operation; the speed of each searcher aircraft is therefore about 180 m/s. The fugitive, a Transporter-Erector-Launcher (TEL), is a mobile unit. We receive measurements of the target location from sensors on board the aircraft. The initial map of the region contains a priori intelligence information about the whereabouts of the fugitive. The navigation policy used for the greedy policy with constrained motion is directly inspired by the Bug2 algorithm in [27]. For simplicity, in our case g(R, ρ_range) is defined as

    g(R, ρ_range) = sum(dim(R))/10,   (7.11)

where sum() is an operator that outputs the summation of the elements of a vector and dim() is an operator that outputs the dimensions of a region. Figure 7.1 shows simulation results on regions with N := {400, 1600, 3600, 6400} cells and 3 pursuers, where the scaling is such that, for a region with N := 8000 cells, each cell has a side of 180 meters, covering an area of about 260 km².

Figure 7.1: Pursuit-evasion simulation results using all three policies. Symbols represent the mean value of the capture time and light-colored bars represent the standard deviation.

Here one can easily see that, as the area of the region gets smaller, all three policies converge to comparable capture times. On the other hand, as the search area gets bigger, the global-max policy becomes obsolete compared even to the local-max policy. This is because the pursuers lose time following false peaks in the map. The intelligent policy, in contrast, performs much better than the other two greedy policies even as the search area gets bigger. In the next section we consider and compare the performance of the intelligent policy in the case of constrained communication.

7.4 Constrained Communication Case

We consider systems in which transmission and perception between the pursuers and the central computation node may be faulty or otherwise randomly varying, or in which lines of communication may randomly disappear. In the above solution to the pursuit-evasion problem, there is a central node where all the computation is done and the outputs are sent to the corresponding pursuers. This requires a communication network with an uplink from every pursuer to the central node and a broadcast-style downlink from the central node to every pursuer. At times, pursuers may not receive new information from the central computation node. We assume there is no delay between sending and receiving information; in other words, as one may recall from Equation (7.1), sending and receiving information happen simultaneously. To investigate this case, one has to adapt the pursuit policy accordingly. In this section, we present the pursuit policy with search paths in the constrained communication case.
In this sec- tion, we are going to present the pursuit policy with search paths in constrained communication case. 145 Pursuit Policy with Search Paths in Constrained Communication Case The minimization and hence computation of search paths, S := fS 1 ;:::;S n p g is still done in the central mode. The navigation algorithm in this case is almost exactly the same as in Algorithm3 where the only di®erence is how the pursuer decides to move. That is, Algorithm 3 is still valid except for a change in Step 4. This step is now on algorithm running on the pursuers' side. Given the measure- ments Y t = Y taken by the pursuer up to t, the control input, u i (k) for pursuer i at time k is de¯ned as follows u i (k)=move constrained (S i ;Y); Y 2Y ¤ : (7.12) where the move constrained :R£Y ¤ !U de¯ned as Algorithm 4 (Move Algorithm: Constrained Communication). At every time k, ² IF (the pursuer receives new information from central mode) pursuer, x p i , moves one-step closer to the maximum point, x ¤ , on the new search path of time k, S i (k). ² ELSE IF (the pursuer has any positive reading from its own sensors) pursuer computes the new search path, S i (k) that is the shortest path between the last position, x p i (k¡1), and the positive reading. 146 pursuer, x p i , moves one-step closer to the positive reading. ELSE pursuer computes the new search path, S i (k) that is the shortest path between the last position, x p i (k¡1), and the destination point on the old search path of time k¡1, S i (k¡1). pursuer, x p i , moves one-step closer to the destination point on the newly computed search path of time k, S i (k). END ² END Let us now take a look at the simulation results to compare the pursuit policy with search paths in the cases of unconstrained and constrained communication. 7.5 Simulation: Constrained Communication Figure 7.2 shows simulation results again on regions with the number of cells, N :=f400;1600;3600;6400gcells, 3 pursuers. Once again, one can easilysee that as the area of the region gets smaller, both cases converge to comparable capture times. On the other hand, as the search area gets bigger, the pursuit policy with search paths in constrained communication performs worse as expected. This is because the pursuers lose time considering old information in the map. On the other hand, one can recall from and compare with previous simulation results to 147 Figure 7.2: Pursuit-evasion simulation results using pursuit policy with search paths in unconstrained and constrained communication cases. Symbols represent mean value of capture time and light color bars represent standard deviation. see that policy policy with search paths compares to the other two greedy policies in unconstrained communication case even in constrained communication case. 7.6 E®ect of Number of Pursuers on Capture Time Our pursuit policy with search paths for a team of pursuers and the compari- son of the performance of this policy to the performance of other greedy policies (local-max,global-max policies)intermsofcapturetimehavebeenpresented. Ad- ditionally, we considered systems in which transmission and perception amongst the pursuers and the central computation mode may be faulty, or otherwise ran- domly varying or where lines of communication may randomly disappear. To 148 complete the research on pursuit-evasion, simulation results of the e®ect of num- ber of pursuers on the capture time is presented next. 
7.5 Simulation: Constrained Communication

Figure 7.2 shows simulation results, again on regions with N := {400, 1600, 3600, 6400} cells and 3 pursuers. Once again, one can easily see that, as the area of the region gets smaller, both cases converge to comparable capture times. On the other hand, as the search area gets bigger, the pursuit policy with search paths under constrained communication performs worse, as expected. This is because the pursuers lose time acting on old information in the map. On the other hand, recalling and comparing with the previous simulation results, one can see that the pursuit policy with search paths under constrained communication still compares well with the other two greedy policies under unconstrained communication.

Figure 7.2: Pursuit-evasion simulation results using the pursuit policy with search paths in the unconstrained and constrained communication cases. Symbols represent the mean value of the capture time and light-colored bars represent the standard deviation.

7.6 Effect of the Number of Pursuers on Capture Time

Our pursuit policy with search paths for a team of pursuers, and the comparison of its performance to that of the other greedy policies (the local-max and global-max policies) in terms of capture time, have been presented. Additionally, we considered systems in which transmission and perception between the pursuers and the central computation node may be faulty or otherwise randomly varying, or in which lines of communication may randomly disappear. To complete the research on pursuit evasion, simulation results on the effect of the number of pursuers on the capture time are presented next.

The pattern of variation of both the mean value and the standard deviation of the capture time was recorded during the simulations. The policies selected are the pursuit policy with search paths and the local-max policy; the global-max policy was not selected, since its performance is not within acceptable limits for large search areas relative to the former two policies. This simulation, as well, is based on our MICA setup. Figure 7.3 shows simulation results on a single region with N := 400 cells, for the cases of 2, 4, 5, 6, and 7 pursuers. The selection of the optimal number of pursuers for a given search region is not within the scope of this research; the numbers of pursuers were therefore chosen arbitrarily and are used only to illustrate how the capture time varies with the number of pursuers. One can easily see that, as the number of pursuers grows, both the mean value and the standard deviation of the capture time decrease. On the other hand, as the pursuers populate the search area, the mean value and the standard deviation do not decrease linearly. The pursuit policy with search paths still performs better than the local-max policy, as expected.

Figure 7.3: Pursuit-evasion simulation results on the effect of the number of searchers on the mean and variation of the capture time, using the local-max policy and the pursuit policy with search paths on a 20x20 map. Symbols represent the mean value of the capture time and light-colored bars represent the standard deviation.

Chapter 8
Conclusions and Future Work

In this report we developed a probabilistic formalism to fuse the information coming from sensors used to detect general objects. Through the computation of several probabilistic maps, we showed how to assess the risk of missions that consist of flying an aircraft over a region infested with enemy radars and co-located SAM launchers.

We proposed an efficient algorithm to compute probabilistic maps. This algorithm scales well with the number of objects being mapped and is characterized by a memory complexity that is linear in the area of the region where the objects lie. To achieve this, we represent the joint probability distribution of the objects' positions approximately, using an appropriately defined aggregate measurement function. In general, the approximation error can be made arbitrarily small. Three main assumptions make this approximation valid: the objects are indistinguishable from the sensors' viewpoint; there is a minimum distance between any two objects; and, in the case of mobile objects, their motion is essentially independent when the objects are far from each other. Hence, probabilistic maps for objects can be economically represented by aggregate measurement functions with memory complexity independent of the number of assets.

Map construction can be done recursively and directly on the aggregate measurement functions. We also showed how to assess the risk of missions in air operations, and even do minimum-risk path planning, without ever explicitly computing the high-dimensional probabilistic map, working simply with the aggregate measurement function.

Because of the assumptions mentioned above, the probabilistic formalism, as discussed here, may not be usable for all applications of pursuit evasion. However, one can easily redefine the sensor model and the motion model for new applications to make use of the probabilistic nature and the computational advantage of the approach.
Future research is needed on the selection of the optimal number of pursuers, on pursuit of multiple evaders, and on the modeling of new types of sensors and motion patterns. Of special interest is modeling the intent of the opposing party in pursuit evasion.

Bibliography

[1] M. Askar and H. Derin. A recursive algorithm for the Bayes solution of the smoothing problem. IEEE Trans. on Automat. Contr., 26(2):558–561, 1981.

[2] B. Bullock, D. Keirsey, J. Mitchell, T. Nussmeier, and D. Tseng. Autonomous vehicle control: An overview of the Hughes project. In Proc. IEEE Comput. Society Conf.; and Trends and Applications, May 1983.

[3] J. A. Castellanos, J. M. M. Montiel, J. Neira, and J. D. Tardos. The SPmap: a probabilistic framework for simultaneous localization and map building. IEEE Trans. Robot. Automat., 15(5):948–952, October 1999.

[4] Goksel Dedeoglu, Maja J. Mataric, and Gaurav S. Sukhatme. Incremental online topological map building with a mobile robot. In Proc. of the SPIE. International Society for Optical Engineering, September 1999.

[5] Xiaotie Deng, Tiko Kameda, and Christos Papadimitriou. How to learn an unknown environment I: The rectilinear case. Journal of the ACM, 45(2), March 1988.

[6] Xiaotie Deng and A. Mirzaian. Competitive robot mapping with homogeneous markers. IEEE Transactions on Robotics and Automation, 12(4), August 1996.

[7] A. Doucet, S. J. Godsill, and C. Andrieu. On sequential Monte Carlo sampling methods for Bayesian filtering. Statist. Comp., 10, 2000.

[8] D. E. Dudgeon, J. G. Verly, and R. L. Delanoy. An experimental target recognition system for airborne laser radar imagery. In Proc. of the IEEE International Conference on Systems Engineering, pages 478–481, 1990.

[9] S. P. Engelson and D. V. McDermott. Error correction in mobile robot map learning. In Proc. IEEE Int. Conf. on Robot. and Automat., pages 2555–2560, May 1992.

[10] T. Eren, P. N. Belhumeur, B. D. O. Anderson, and A. S. Morse. A framework for maintaining formations based on rigidity. Proceedings of the 2002 IFAC Congress, 2002.

[11] J. Alexander Fax and Richard M. Murray. Information flow and cooperative control of vehicle formations. IFAC World Congress (to appear), 2002.

[12] D. Fox, W. Burgard, and Sebastian Thrun. Markov localization for mobile robots in dynamic environments. Journal of Artificial Intelligence Research, 11:391–427, 1999.

[13] Brian P. Gerkey and Maja J. Mataric. Sold!: Auction methods for multi-robot coordination. IEEE Transactions on Robotics and Automation, 18:758–786, 2002.

[14] J. E. Handschin. Monte Carlo techniques for prediction and filtering of non-linear stochastic processes. Automatica, 6, 1970.

[15] J. E. Handschin and D. Q. Mayne. Monte Carlo techniques to estimate the conditional expectation in multi-stage non-linear filtering. Int. J. Cont., 9(5):547–559, 1969.

[16] Joao Hespanha, Yusuf S. Ateskan, and Huseyin H. Kizilocak. Probabilistic map building for aircraft-tracking radars. In Proc. of the 2001 American Contr. Conf., 2001.

[17] Joao Hespanha, Hyoun Jin Kim, and Shankar Sastry. Multiple-agent probabilistic pursuit-evasion games. In Proc. of the 38th Conf. on Decision and Contr., pages 2432–2437, 1999.

[18] Joao Hespanha and Huseyin Kizilocak. Efficient computation of dynamic probabilistic maps. In Proc. of the 10th Mediterranean Conf. on Contr. and Automat., 2002.

[19] Joao Hespanha and Maria Prandini. Optimal pursuit under partial information. In Proc. of the 10th Mediterranean Conf. on Contr. and Automat., 2002.
[20] A. Jadbabaie, J. Lin, and A. S. Morse. Coordination of groups of mobile autonomous agents using nearest neighbor rules. IEEE Transactions on Automatic Control (submitted), 2002.

[21] Jan Jelinek. Predictive models of battle dynamics. Technical report, Honeywell Laboratory, May 2000.

[22] Rudolph Emil Kalman. A new approach to linear filtering and prediction problems. Transactions of the ASME–Journal of Basic Engineering, 82(Series D):35–45, 1960.

[23] Jongrae Kim and Joao Hespanha. Discrete approximations to continuous shortest-path: Application to minimum-risk path planning for groups of UAVs. Submitted to the 42nd Conf. on Decision and Contr., 2003.

[24] Jonathan Lawton, Randal Beard, and Brett Young. A decentralized approach to formation maneuvers. IEEE Transactions on Robotics and Automation (to appear).

[25] Feng-Li Lian and Richard M. Murray. Cooperative task planning of multi-robot systems with temporal constraints. International Conference on Robotics and Automation (submitted), 2003.

[26] F. Lu and E. Milios. Globally consistent range scan alignment for environment mapping. Autonomous Robots, 4:333–349, 1997.

[27] Vladimir J. Lumelsky and Alexander A. Stepanov. Dynamic path planning for a mobile automaton with limited information on the environment. IEEE Trans. on Automat. Contr., 31(11):1058–1063, November 1986.

[28] M. T. Mason. Compliance and force control for computer controlled manipulators. In Robot Motion, 1982.

[29] Maja J. Mataric. Reinforcement learning in the multi-robot domain. Autonomous Robots, 4:73–83, 1997.

[30] L. Matthies and P. Grandjean. Stochastic performance, modeling and evaluation of obstacle detectability with imaging range sensors. IEEE Trans. Robot. Automat., 10:783–792, 1994.

[31] Timothy McLain and Randal Beard. Trajectory planning for cooperative rendezvous of multiple unmanned air vehicles. AIAA Guidance, Navigation and Control Conference, 2000.

[32] H. P. Moravec. The Stanford Cart and the CMU Rover. Proc. IEEE, 1983.

[33] Petter Ogren and Naomi Ehrich Leonard. Obstacle avoidance in formation. Proc. of the IEEE Int. Conf. on Robotics and Automation, 2003.

[34] Reza Olfati-Saber, William B. Dunbar, and Richard M. Murray. Cooperative control of multi-vehicle systems using cost graphs and optimization. American Control Conference (submitted), 2003.

[35] Esben H. Ostergaard, Maja J. Mataric, and Gaurav S. Sukhatme. Multi-robot task allocation in the light of uncertainty. Proceedings, IEEE International Conference on Robotics and Automation, pages 3002–3007, 2002.

[36] Marios M. Polycarpou, Yanli Yang, and Kevin M. Passino. A cooperative search framework for distributed agents. Proc. of the IEEE Int. Symp. on Intelligent Control / IEEE Conf. on Control Applications, 2001.

[37] Maria Prandini, Joao Hespanha, and George J. Pappas. Greedy control for hybrid pursuit games. In Proc. of the 2001 European Contr. Conf., 2001.

[38] Anuj Puri and Stavros Tripakis. Algorithms for routing with multiple constraints. xxx, 10, 1999.

[39] Wei Ren and Randal W. Beard. A decentralized scheme for spacecraft formation flying via the virtual structure approach. AIAA Journal of Guidance, Control and Dynamics (to appear), 2003.

[40] J. A. Roecker. Effectiveness of track monitoring with multiple 2D passive sensors. IEEE Trans. Aerospace and Electronic Syst., 27:941–945, 1991.

[41] K. Sugihara and I. Suzuki. On a pursuit-evasion problem related to motion coordination of mobile robots. In Proc. of the Twenty-First Annual Hawaii International Conference on System Sciences (Applications Track), 4:218–226, 1988.
[42] A. M. Thompson. The navigation system of the JPL robot. In Proc. 5th Joint Int. Conf. Artificial Intelligence, August 1977.

[43] Sebastian Thrun. Learning metric-topological maps for indoor mobile robot navigation. Artificial Intelligence, 99(1):21–71, February 1998.

[44] Sebastian Thrun. Probabilistic algorithms in robotics. AI Magazine, 21(4):93–109, 2000.

[45] Sebastian Thrun. A probabilistic online mapping algorithm for teams of mobile robots. International Journal of Robotics Research, 20(5):335–363, 2001.

[46] Sebastian Thrun. Particle filters in robotics. In Proceedings of the 17th Annual Conference on Uncertainty in AI (UAI), 2002.

[47] Sebastian Thrun. Robotic mapping: A survey. In G. Lakemeyer and B. Nebel, editors, Exploring Artificial Intelligence in the New Millennium. Morgan Kaufmann, 2002. To appear.

[48] Sebastian Thrun, Wolfram Burgard, and Dieter Fox. A probabilistic approach to concurrent mapping and localization for mobile robots. Machine Learning and Autonomous Robots (joint issue), 31(5):29–53, 1998.

[49] Sebastian Thrun, D. Fox, W. Burgard, and F. Dellaert. Robust Monte Carlo localization for mobile robots. Artificial Intelligence, 128(1-2):99–141, 2000.

[50] Jorge Tierno. Distributed, multi battle model predictive control. Technical report, Honeywell Laboratory, June 2000.

[51] Rene Vidal, Omid Shakernia, H. Jin Kim, David Hyunchul Shim, and Shankar Sastry. Probabilistic pursuit-evasion games: Theory, implementation, and experimental evaluation. IEEE Transactions on Robotics and Automation, 18(5), October 2002.

[52] B. Yamauchi and P. Langley. Place recognition in dynamic environments. J. of Robot. Systems, 14:107–120, 1997.

[53] Yanli Yang, Marios M. Polycarpou, and Ali A. Minai. Opportunistically cooperative neural learning in mobile agents. Proceedings of the 2002 IEEE/INNS International Joint Conference on Neural Networks, 2002.

[54] Hsi-Han Yeh, Eric Nelson, and Andrew Sparks. Nonlinear tracking control for satellite formations. IEEE Conference on Decision and Control, 2000.

Appendix A
Simulator

A.1 Simulation and Modeling for Military Air Operations

In the JFACC Program, the Joint Forces Air Component Commander (JFACC) in military air operations controls the allocation of resources (wings, squadrons, air-defense systems, AWACS) to different geographical locations in the theater of operations. The JFACC mission is to define a sequence of tasks for the aerospace systems at each location and to provide feedback control for the execution of these tasks in the presence of uncertainties and a hostile enemy. Honeywell Labs has been developing an innovative method for the control of military air operations. The novel model predictive control (MPC) method extends the models and optimization algorithms used in traditional model predictive control systems. The enhancements include a tasking controller and, in a joint effort with our work, a probabilistic threat/survival map indicating high-threat regions for aircraft and suggesting optimal routes to avoid these regions.

A simulation/modeling environment using object-oriented methodologies has been developed to serve as an aid in demonstrating the value of MPC and to facilitate its development. The simulation/modeling environment is based on an open architecture that enables the integration, evaluation, and implementation of different control approaches. The simulation offers a graphical user interface displaying the battlefield, the control performance, and a probability map displaying high-threat regions.
This section describes the features of the different control approaches and their integration into the simulation environment. Figure A.1 illustrates the overall control components and their integration with the simulator. It shows a distribution of the three main control levels of the campaign management system: the Operational Level, the Task Group Level, and the Task Level. The campaign manager in the operational level is responsible for generating the campaign steps, which include task groups, deadlines, and desired probabilities of success (all models are probabilistic, so an objective needs an associated probability to be meaningful). Each task group is assigned to several task controllers, whose responsibility is to distribute the group's resources among the different tasks. Model predictive control is done at the task level and task group level. MPC task controllers take a task and determine the aircraft deployments needed to execute it. For a given maximum duration for the task (which determines the maximum number of aircraft sorties) and a desired probability of success, the task controller determines the number of aircraft that should be deployed in each sortie. The controllers also provide the Task Commander with the expected requirements needed to accomplish a certain task. The Task Group Level and Task Level controllers are based on feedback control that revises the control strategy after each battle round based on damage reports. For control purposes, each task can be modeled individually. This allows for specialized modeling of tasks of different natures and localizes the information needed for modeling at the most natural place. Estimation and adaptation are done within the task control level.
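As a minimal illustration of this hierarchy (the class and field names below are hypothetical, chosen for this sketch rather than taken from the simulator), a campaign step can be represented as a list of task groups, each owning tasks annotated with objectives, a deadline, and a desired probability of success:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Task:
    """A goal-oriented activity: objectives, timeline, required success probability."""
    objective: str            # e.g., "neutralize target set T3"
    deadline: int             # battle rounds available to complete the task
    desired_p_success: float  # probability of success the task controller must meet

@dataclass
class TaskGroup:
    """Tasks aggregated by nature or geography so they can share resources."""
    tasks: List[Task] = field(default_factory=list)

@dataclass
class CampaignStep:
    """One step generated by the campaign manager at the operational level."""
    task_groups: List[TaskGroup] = field(default_factory=list)
```

A task controller would then receive one `Task` at a time and choose sortie sizes so that the predicted probability of success reaches `desired_p_success` before the deadline expires.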
Probabilistic maps are constantly computed with the locations of enemy threats (SAM sites) by fusing information collected from all the aircraft in the battlefield. The probabilistic maps are used for intelligence updates and for minimum-risk route planning. They make use of the map-building facility to compute paths for the aircraft that minimize the risk of being attacked by enemy SAM sites. Paths can also be recomputed after a mission has started to divert the aircraft from pop-up threats. The intelligence information is passed on to the model predictive control.

Figure A.1: Overall system architecture

In order to evaluate these control strategies and integrate the probabilistic threat models, a simulation environment has been developed based on object-oriented methodologies that offers an open architecture and a GUI. The simulator environment simplifies the integration and evaluation of the multiple control approaches described above. It also offers a graphical user interface displaying the battlefield, the control performance, and a probability map displaying high-threat regions.

The current version of the simulator integrates the Task Level and the Task Group Level. To clarify the integration, let us explain the overall structure of the simulation. The main acting entities in the simulation are forces that can engage in combat. Each force owns assets; typical assets include bases, different types of aircraft, SAM sites, and targets. The task group controller integration is achieved by using the base object as the equivalent of a task. Each base, as part of the task level, owns a model predictive controller that passes information to the task group controller. The bases are then allowed to engage in resource transfers based on the assignments given by the task group control. After the resource transfers have been completed, each of the bases engages in combat with its enemy following its task orders.

A.1.1 Simulation Architecture

This section describes the main building blocks and the overall flow of the simulation. The main acting entities in the simulation are forces that can engage in combat. Each force owns assets; typical assets include bases, different types of aircraft, SAM sites, and targets.

As can be seen in Figure A.2, simulation is performed on a graphical 2-dimensional "game board" that can display any terrain. It shows a red and a blue base (large circles), enemy SAMs (red diamonds), and blue fighters attacking the red base. The graphic display shows a snapshot of the main simulation window displaying the battlefield during a mission that consisted of flying several aircraft from one air base to another, passing through a region infested by enemy SAMs. The number of fighters sent is based on the MPC control model.

The results of the probability map are clearly presented in this snapshot of the simulation. The display on the right shows a snapshot of the occupancy map, where the probability of occupancy is encoded in the background color (darker corresponding to a higher probability) and each "x" corresponds to a location where an aircraft "detected" a radar. In the figure showing the log-live map, darker regions indicate smaller values of the log-live map and therefore more danger, whereas lighter regions correspond to safer areas. The line superimposed on the log-live map corresponds to a path that is Pareto-optimal for the pair of costs: probability of survival and path length (with distance measured in the Manhattan sense). The optimal route avoids the high-threat regions. The information about threats that can be avoided is fed back to the MPC control, which in turn can reduce the optimal number of aircraft sent during the next round simply by being able to avoid the threats altogether. The bar graphs show the assets sent at the beginning of each round (filled bars) versus the assets returning after each round (empty bars).

The simulation/modeling environment is based on object-oriented methodologies. It is delivered as a set of Matlab classes shown in Figure A.3 (top level).

Figure A.2: Simulation Graphical User Interface

The simulator level allows the integration of general functionality or specific control methods, including outside routines. The object-oriented approach allows us to easily override many of the methods and benefit from the general advantages of object-oriented programming. The Matlab class (second level) provides a framework-consistent interface for this program, supervisory functions, export capabilities, and graphics. It can be used on a Unix or Windows NT platform (bottom level). The simulator includes the following classes.

The force class has the functionality that combines all the other classes. Forces are the entities described earlier that can engage in a battle and contain several types of assets. Although the other classes besides the force class can be declared separately, the main actions like firing, movement, and status assessment can be done more efficiently using the functions contained in the force class.

Figure A.3: Simulation platform and architecture

The aircraft class is the parent class of the inherited classes, including fighters, bombers, weasels, and tankers. It includes members and functions that describe common aspects and actions of an aircraft, like movement and fuel consumption. Aircraft-specific actions can be found in the subclasses of each aircraft type.

The base class represents an object characterization of a base, where most of the fighting assets can reside initially. There are no specific fighting functions for a base, but its location, survivability, and the status of the fuel and weapon depots are represented in the class.

The target class represents an object characterization of a target, e.g., a bridge or a tower.

The SAMS class represents a surface-to-air missile object, whose main functionality is to protect a base and shoot at any type of aircraft. Its members are mainly the number of weapons it can have, its probability of being active, its ability to track aircraft, and its health.
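The inheritance structure just described can be summarized in a short sketch. The class names mirror the description above, but the members shown are illustrative stand-ins, not the simulator's actual Matlab interface:

```python
class Asset:
    """State shared by everything a force owns."""
    def __init__(self, position, health=1.0):
        self.position = position
        self.health = health

class Aircraft(Asset):
    """Parent class of the aircraft types; holds their common behavior."""
    def __init__(self, position, speed, fuel):
        super().__init__(position)
        self.speed, self.fuel = speed, fuel

    def move(self, waypoint, dt):
        """Movement and fuel bookkeeping common to all aircraft types."""
        ...

class Fighter(Aircraft): ...   # aircraft-specific actions live in the subclasses
class Bomber(Aircraft): ...
class Weasel(Aircraft): ...
class Tanker(Aircraft): ...

class Base(Asset): ...         # location, survivability, fuel and weapon depots
class Target(Asset): ...       # e.g., a bridge or a tower
class SAMSite(Asset): ...      # weapon count, probability of being active, tracking

class Force:
    """Combines the other classes; firing, movement, and status assessment
    are coordinated through methods of this class."""
    def __init__(self, assets):
        self.assets = assets
```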
The battle simulation is performed on a step-by-step basis, as illustrated in Figure A.4.

Figure A.4: State flow diagram of a battle

In each step the following actions are taken:

Initialization: At the beginning of the simulation, the forces are initialized with their positions, strengths, and more detailed attributes (e.g., number of SAMs, bombers, bases, etc.). After the initialization, the simulation runs in battle rounds. During each round, each force sends a certain set of assets to fight the enemy. After certain conditions are satisfied, the set of assets returns to base and one battle round has been completed, giving way to the next round. The conditions can include lack of fuel or a new task assignment.

Control: At each time step the controller may act on an asset, assigning a mission of one of the following types: attack, defense, air combat, reserve on ground, or reassignment. Based on certain conditions (e.g., the enemy enters the defensive radius of a base), each force and its assets operate based on specific control and strategic actions. One controller is the MPC controller [21] that provides an optimal number of aircraft for each battle round. Other control components include the tasking control [50], which coordinates asset transfers among the bases based on need, and the probability map [16], which provides threat assessment data to the MPC controller.

Move: After each force has been given its specific control action, its assets move a certain distance and direction based on their speed and the path given by the minimum-risk path planner. For details on minimum-risk path planning see Section 5.

Fire: After the move has been completed, the force members check their instructions to fire. Firing instructions include:

Air to air: an aircraft in flight may attack another aircraft in flight.
Air to ground: an aircraft in flight may drop one or more bombs on one or more ground targets.
Ground to air: a SAM site may fire on an enemy aircraft that moves within range.

State Update: After firing, all the states are updated to account for any aircraft lost, changes in the operational status of the bases, armament, etc. This is done based on a probabilistic evaluation.

Game Objective: The game is over when air superiority is reached, meaning that one side has no aircraft or no operational runways. As long as the game is not over, another battle round is started.

As described in the state flow diagram, in each step each fighting member of a force executes the same functionality, i.e., move, fire, and assess and update its health, but can do so according to different parameters based on its parent class and its own characteristics.
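A sketch of this round loop, following the state flow above, might look as follows; the `Force` and controller interfaces are hypothetical stand-ins for the simulator's classes:

```python
def run_battle(blue, red, controllers, max_rounds=100):
    """Run battle rounds until air superiority is reached (illustrative sketch)."""
    for _ in range(max_rounds):
        for force in (blue, red):
            controllers[force].assign_missions(force)   # Control: missions, sortie sizes
        for force in (blue, red):
            for asset in force.deployed_assets():
                asset.move()                             # Move: follow minimum-risk path
        for force, enemy in ((blue, red), (red, blue)):
            for asset in force.deployed_assets():
                asset.fire(enemy)                        # Fire: air-air, air-ground, SAM
        for force in (blue, red):
            force.update_state()                         # State update: probabilistic losses
        if blue.air_superiority() or red.air_superiority():
            break                                        # Game objective reached
    return blue, red
```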
A.1.2 Model Predictive Control

One of the key control components represented in the simulator is the model predictive control (MPC), which operates mainly at the task level. The commonly used Lanchester models approximate only expected attrition rates and were found unsuitable for MPC. Honeywell has developed probabilistic models to capture the complete probabilistic state of the battle [21]. In these models, battles are interpreted as duels between two opponents and consist of a number of rounds. In each round, both sides simultaneously fire a single salvo from some or all of their surviving units. Such battles can be described by a Markov process, whose state distribution is a probability matrix that characterizes all possible asset strengths at a given time. The probabilistic models are defined in terms of the number of assets on each side and their lethality (the probability that a given unit will actually destroy an enemy target it is firing upon). In the simulation, the number of assets and the lethality information are provided by the threat assessment maps, as well as by state and parameter estimation; more details are given in the following sections. The predictive model uses this information to calculate the probability of winning the battle in the remaining rounds. An optimizer is then imposed on the predictive model, providing the optimal number of assets to be deployed during the future battle rounds. The resulting MPC controller proved to be very robust with respect to mismatches between its internal predictive model and reality. The simulations demonstrate remarkable immunity to inaccurate battlefield information caused by blue's under- or overestimation of the lethality or the number of red's assets. In the current version of the simulation, the entities that can own an MPC are the bases.
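To make the duel model concrete, the following sketch propagates the state distribution of the battle Markov chain through one round. It is a toy version in the spirit of [21]: the binomial-salvo assumption (each surviving unit independently destroys one enemy unit with its lethality probability) and the function names are ours, not Honeywell's:

```python
from math import comb

def binom_pmf(k, n, p):
    """P(k successes in n independent shots, each succeeding with probability p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def salvo_step(dist, p_blue, p_red):
    """One round of the duel: dist[b][r] = P(blue has b units, red has r units).
    Both sides fire a simultaneous salvo; losses are capped at the units present."""
    B, R = len(dist), len(dist[0])
    new = [[0.0] * R for _ in range(B)]
    for b in range(B):
        for r in range(R):
            if dist[b][r] == 0.0:
                continue
            for kb in range(b + 1):        # successful blue shots (red losses)
                for kr in range(r + 1):    # successful red shots (blue losses)
                    p = dist[b][r] * binom_pmf(kb, b, p_blue) * binom_pmf(kr, r, p_red)
                    new[max(b - kr, 0)][max(r - kb, 0)] += p
    return new
```

Starting from a point mass at the initial asset counts and iterating `salvo_step`, the probability of winning in the remaining rounds is the total mass on states where the enemy count is zero; an optimizer can then search over the number of assets committed per round.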
A.1.3 Task Group Control

The task group controller [50] addresses the use of decentralized model predictive control to manage a set of related battle tasks. A task is a goal-oriented activity comprising a specification of objectives (e.g., neutralization of a given set of targets), a timeline for achieving those objectives, and a desired probability of success. Battle tasks are considered related when they can share resources in a reasonably affordable way. Tasks are aggregated, either by nature, by geographical location, or by a combination of these and other characteristics, into task groups (see Figure A.5). Tasks within a task group can share resources in a cost-effective way (i.e., without need for a significant logistic burden). Task groups are aggregated according to a strategy into campaign steps.

The controller is meant to operate autonomously. It attempts to manage conditions in which success in all its objectives cannot be guaranteed in one step. It is also meant to work in a distributed fashion. This implies that complete knowledge of the state of all tasks is not needed at a central level. The task control assigns resources to the different tasks under its supervision using market-based algorithms. The objectives are to achieve the desired probability of success for all targets, to minimize combat exposure of the assets, and to maximize the scoring function.

The simulator integrates the task group controller by using the base object as the equivalent of a task. Each base then owns a model predictive controller that communicates with the task group controller. The bases are allowed to engage in resource transfers based on the assignments given by the task group control. The resources can consist of any aircraft types or SAMs.

Figure A.5: Multiple bases engage in resource allocation and battle

A.1.4 Map Building

The future of air operations is evolving towards a scenario where the battlefield is populated with many unmanned air vehicles equipped with a large number of cheap sensors whose primary mission is to gather information about enemy assets and possibly execute simple missions in an autonomous fashion. In this scenario, the fusion of information coming from a multitude of sensors into forms suitable for mission planning is a major challenge, because the data gathered is likely to be noisy. We develop a probabilistic formalism to process the information coming from aircraft sensors capable of detecting electromagnetic radiation originating from radar, putting it in a form suitable for the planning of air operations. The need for a probabilistic framework stems from the fact that the sensors producing the measurements are not perfect and are susceptible to providing erroneous information. Because of this, any inference made from this data can never be assumed 100 percent correct.

A key concept used here is that of a probabilistic map [16]. On an ordinary map of a particular region, one finds markings that represent locations of objects (e.g., radar) within the region. By placing one of these markings at a particular location on the map, one expresses the fact that the object represented can be found at that location with probability one. However, when the information used to build a map (or, more precisely, to locate interesting objects in it) is not perfect, it is not possible to mark a specific position on the map as the location of the object. In this case, the best one can do is to superimpose the probability density of the location of the object in question on the whole map. It is this probability density that we call the probabilistic map for the object. From a formal point of view, probabilistic maps are probability densities of the positions of objects, conditioned on the sensor measurements available. Note that these probabilistic maps do not represent the topology of the geographic region, but are often related to it; e.g., the probabilistic map of a radar will most likely exhibit zero probability over areas covered by a river, because it is unlikely to find a radar over water.

The main difficulty in computing probabilistic maps is their size. When one is mapping multiple objects whose positions are not independent of each other, one needs to keep track of the joint probability density of all the objects, and the probabilistic map (i.e., the joint probability density) easily becomes very large. This is because, in principle, the memory requirements to represent a joint probability density grow exponentially with the number of random variables. It should be noted that (conditional) independence of the positions of several radars is not expected, because the probability that all the radars are very close to each other is often much smaller than (or at least different from) the probability that the radars are spread to cover a certain perimeter. As shown in [16], for the sensor models used here, the probabilistic map of a large number of radars can be represented by a function called the aggregate measurement function, defined in a low-dimensional space (two-dimensional for our specific setup).

Another type of probabilistic map is the so-called occupancy map. These are maps that assign to each point in the region the (conditional) probability of there being a radar at that location, given the set of measurements gathered so far. Occupancy maps can also be computed directly, and very efficiently, from the aggregate measurement functions. Figure A.6 shows a snapshot of an occupancy map. The probability of occupancy is encoded in the background color (lighter corresponding to a higher probability) and each "x" corresponds to a location where an aircraft detected a radar. The two light areas in the figure represent the approximate positions of two radars located in the region. Note that several false positives that can be seen in the plot were correctly "interpreted" as erroneous measurements.

Figure A.6: Occupancy Map
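For intuition, the following per-cell Bayes update shows how a single noisy detection changes the occupancy probability of one grid cell. It assumes cells are independent, which is exactly the simplification that the aggregate measurement functions of [16] avoid, and the sensor parameters are illustrative:

```python
def update_occupancy(prior, detected, p_d=0.9, p_fa=0.1):
    """Bayes update of one cell's occupancy probability after one sensor pass.
    p_d  = P(detection | radar present in the cell)
    p_fa = P(detection | no radar in the cell), i.e., the false-alarm rate."""
    if detected:
        num = p_d * prior
        den = p_d * prior + p_fa * (1 - prior)
    else:
        num = (1 - p_d) * prior
        den = (1 - p_d) * prior + (1 - p_fa) * (1 - prior)
    return num / den
```

For example, with a prior of 0.1, one detection with p_d = 0.9 and p_fa = 0.1 raises the cell's occupancy probability to 0.5, and a second independent detection raises it to 0.9; this is how repeated observations suppress the isolated false positives visible in Figure A.6.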
A.1.5 Minimum-risk Path Planning

Using the probabilistic maps introduced in the previous section, it is possible to assess the risk of specific missions in air operations and even do minimum-risk path planning. As shown in [16], this can be done without ever explicitly computing the high-dimensional probabilistic map, working simply with the aggregate measurement functions. To determine the probability of success of a mission, one needs to compute a danger or kill-map. This map assigns to each point in the region the probability that an aircraft flying over that location will be destroyed. The probability of success of a particular flight mission turns out to be linear in the values of a simple transformation of the kill-map, which makes minimum-risk path planning computationally tractable. A danger map is shown in Figure A.7, where darkness indicates danger. A minimum-risk path is also plotted in the figure.

Figure A.7: Minimum Risk Path

In the simulation, aircraft follow minimum-risk paths. By doing so, they actively avoid high-threat regions. The threat information associated with the minimum-risk path is also provided to the model predictive control at the task level. This information allows the controller to adjust the number of assets needed to successfully complete future battle rounds.
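The "simple transformation" can be illustrated on a grid: the probability of surviving a path is the product of per-cell survival probabilities, so minimizing its negative logarithm is an additive shortest-path problem. A minimal sketch under these assumptions (4-connected grid, illustrative kill-map; a simplified stand-in for the algorithms of [16, 23], not the thesis code):

```python
import heapq
import math

def min_risk_path(kill_map, start, goal):
    """Dijkstra on a grid, minimizing -log P(survival) along the path.
    kill_map[i][j] = P(aircraft destroyed while crossing cell (i, j))."""
    rows, cols = len(kill_map), len(kill_map[0])

    def cost(i, j):
        # -log P(survive cell); clipped to avoid log(0) on certain-kill cells
        return -math.log(max(1.0 - kill_map[i][j], 1e-12))

    dist = {start: cost(*start)}
    prev = {}
    heap = [(dist[start], start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == goal:
            break
        if d > dist[node]:
            continue                      # stale heap entry
        i, j = node
        for nxt in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
            ni, nj = nxt
            if 0 <= ni < rows and 0 <= nj < cols:
                nd = d + cost(ni, nj)
                if nd < dist.get(nxt, float("inf")):
                    dist[nxt] = nd
                    prev[nxt] = node
                    heapq.heappush(heap, (nd, nxt))
    # Reconstruct the path; P(survive best path) = exp(-dist[goal]).
    path, node = [goal], goal
    while node != start:
        node = prev[node]
        path.append(node)
    return path[::-1]
```

Adding a constant per-step term to `cost` trades survival probability against path length, recovering the Pareto-optimal family of paths mentioned in Section A.1.1.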
A.1.6 Future Work For Simulation

The simulation and modeling environment serves as an aid to demonstrate the value of the different novel control components that Honeywell developed for JFACC. It simplified the integration, evaluation, and implementation of these approaches. The simulation offers a graphical user interface displaying the battlefield, the control performance, and a probability map displaying high-threat regions. Additional functionality, however, could always be added to simulations of this kind. It would be advantageous to add user interface capabilities for initializing the simulation and for data collection. These may include drop-down tables for adding assets or tables that simplify the initialization in general.

The simulation could be extended to include campaign control components, as shown in Figure A.1, and more logistic capabilities. If the simulation is to be used for scientific purposes, such as running Monte Carlo simulations for data collection, then the UI should be disconnected from the simulation and the simulation needs to be implemented more efficiently for speed. A possible conversion to C code may be the solution.
Abstract
Computation and application of probabilistic maps to multi-agent problems generally consists of generating maps; distributing paths among the agents that allow the group to minimize a cost function subject to constraints, e.g., the length of the path, the risk involved with the path, coordinated group behavior, time to arrive at the destination, power consumption, network delay, and bandwidth usage; and deciding to either avoid threats or pursue treasures. In this thesis, we describe a formalism to address the general problem of detecting objects with imperfect sensors. This is accomplished by building several probabilistic maps. We develop efficient algorithms to compute probabilistic maps for a large number of static and mobile objects. By a probabilistic map we mean the probability density of the objects' positions, conditioned on noisy sensor measurements.