Macroscopic approaches to control: multi-robot systems and beyond
MACROSCOPIC APPROACHES TO CONTROL: MULTI-ROBOT SYSTEMS AND BEYOND
by
Dylan A. Shell

A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(COMPUTER SCIENCE)

August 2008

Copyright 2008 Dylan A. Shell

Table of Contents

List Of Tables
List Of Figures
Abstract
Chapter 1: Introduction
  1.1 Motivating example: manipulation of an ant colony
  1.2 Swarm synthesis as a macroscopic control problem
  1.3 Beyond robot swarms: optimizing collective evacuation
  1.4 Dissertation contributions
  1.5 Dissertation outline
Chapter 2: Controlled transport in an ant colony
  2.1 Ant-to-ant interactions
  2.2 Controlled transport
  2.3 Experiment design and outcome
  2.4 Discussion
  2.5 Summary
Chapter 3: Background and related work
  3.1 Motivation for multi-robot and swarm systems
  3.2 Defining the behavioral synthesis problem for multi-robot systems
    3.2.1 Intrinsic challenges
    3.2.2 Problem complexity
  3.3 Approaches to multi-robot and swarm synthesis
    3.3.1 Coordination methods and philosophies
      3.3.1.1 Implicit coordination
      3.3.1.2 Explicit coordination
      3.3.1.3 Implicit versus explicit coordination
      3.3.1.4 Multi-agent techniques
      3.3.1.5 Control theoretic coordination methods
    3.3.2 Control architectures
      3.3.2.1 Reactive control
      3.3.2.2 Hybrid architectures
      3.3.2.3 Behavior-based control
    3.3.3 Multi-robot formalisms
  3.4 Relationship between dissertation research and prior work
    3.4.1 Multi-robot system size
    3.4.2 Minimalist coordination primitives
    3.4.3 Sufficient conditions for predictable processes
    3.4.4 Equilibrium models
    3.4.5 Exploiting models for synthesis
    3.4.6 Macroscopic bases
    3.4.7 Entropy maximization
    3.4.8 Token passing for coordination processes
    3.4.9 Behavior on measure spaces
    3.4.10 Randomness and stochastic behavior
    3.4.11 On-line use of macroscopic models
    3.4.12 Evaluation domains
    3.4.13 Multiscale non-synchronous computation
  3.5 Reoccurring macroscopic properties
  3.6 Summary
Chapter 4: Synthesis methodology
  4.1 Microscopic process description
    4.1.1 Feasible configurations
    4.1.2 Dynamics
    4.1.3 Time and phase-space averages
    4.1.4 Parameterized processes
    4.1.5 Examples of ergodicity
  4.2 Macroscopic process description
    4.2.1 Macrostates
    4.2.2 Ensemble properties
  4.3 Synthesis by coupling processes
  4.4 The MMMC analysis tool
  4.5 Constructing the toolbox
  4.6 Ergodicity and mixing in existing robotics work
  4.7 Summary
Chapter 5: Multi-robot case studies
  5.1 Strategy selection
    5.1.1 Motivation and problem definition
    5.1.2 Solution
      5.1.2.1 Formal specification of Process A
      5.1.2.2 The intuition for Process A
      5.1.2.3 The analysis of Process A
      5.1.2.4 Formal specification of Process B
      5.1.2.5 The intuition for Process B
      5.1.2.6 The analysis of Process B
      5.1.2.7 Coupling the processes together
      5.1.2.8 Relating the processes to the task
    5.1.3 Evaluation and discussion
      5.1.3.1 Methodological insights
  5.2 Division of labor
    5.2.1 Motivation and problem definition
    5.2.2 Solution
    5.2.3 Evaluation and discussion
  5.3 Additional examples
    5.3.1 Sequencing
      5.3.1.1 Problem definition
      5.3.1.2 Evaluation and discussion
      5.3.1.3 Comparative discussion
    5.3.2 Distributed time-stamping
      5.3.2.1 Problem definition
      5.3.2.2 Solution
      5.3.2.3 Evaluation and discussion
  5.4 Capabilities and limitations
  5.5 Summary
Chapter 6: Large-scale simulation
  6.1 Related simulators
  6.2 Data structures
    6.2.1 Definitions
    6.2.2 Local range sensing
    6.2.3 Planarity
    6.2.4 Triangulation creation and updates
      6.2.4.1 Creation and insertion
      6.2.4.2 Deletion
    6.2.5 Dynamic update
    6.2.6 Triangulation conditions
      6.2.6.1 Condition 1
      6.2.6.2 Condition 2
  6.3 Models
    6.3.1 Sensory models
    6.3.2 Communication models
      6.3.2.1 Integrated model
      6.3.2.2 Rene mote model
    6.3.3 Velocity error models
  6.4 Example simulations: parametrized foraging strategies
    6.4.1 Controller: parametrized bucket brigading
    6.4.2 Simulations
  6.5 Summary
Chapter 7: Human collective behavior
  7.1 Models of human collective behavior
    7.1.1 Modeling and simulation
      7.1.1.1 Core problems
      7.1.1.2 Macroscopic approaches
      7.1.1.3 Microscopic approaches
      7.1.1.4 Space Syntax Analysis
    7.1.2 Application of models
  7.2 Laser-based pedestrian tracking: micro- and macroscopic views
    7.2.1 Microscopic view: short trajectories
    7.2.2 Macroscopic view: exhibit averages
  7.3 Robotics-aided disaster response
  7.4 Assistive audio-beacon deployment
    7.4.1 Directional audio
    7.4.2 Deployment algorithm
      7.4.2.1 Task-allocation for coordination
      7.4.2.2 Where to deploy
      7.4.2.3 Whom to deploy
    7.4.3 Evacuation dynamics models in utility calculation
      7.4.3.1 Space syntax measures
    7.4.4 Experimental validation
      7.4.4.1 Robustness
      7.4.4.2 Effectiveness
  7.5 Human-based environmental metrics
  7.6 Summary
Chapter 8: Dissertation summary
Bibliography

List Of Tables

3.1 Repeated qualitative properties across natural distributed systems.
5.1 Classification of the simulated robots used for experimental validation.
6.1 Summary of operations on the Delaunay triangulation.
6.2 Simulated sensor model parameters.
6.3 Rene mote communications model parameters.
6.4 Qualitative summary of communications behavior by range.
6.5 Coefficients of the quadratic planes fitted to model control and odometric error.

List Of Figures

1.1 An ant transporting a sliver of paper.
1.2 The distinction between microscopic and macroscopic views of multi-robot systems becomes more marked with increasing system size. We employ a technique that exploits this fact in the modeling and analysis of distributed processes. These processes are used as a building block in the construction of coordinated robot controllers.
1.3 The process toolbox is a set of process descriptions and associated macroscopic characterizations. The toolbox is produced by analyzing individual processes in isolation in order to construct a macroscopic model. The toolbox provides the primitive building-blocks from which complex coordinated behavior is synthesized.
1.4 The level of description used by the system designer is elevated away from microscopic minutiæ toward collective properties. Analysis of individual processes produces a toolbox that connects microscopic and macroscopic descriptions, which informs the designer during controller synthesis.
1.5 Macroscopic-level models have a broader applicability in robotics. This dissertation describes a multi-robot system that uses models of crowd behavior in order to minimize expected egress times.
2.1 Temnothorax rugatulus ants, the species used in the transport example below. (Photo Credit: Alex Wild.)
2.2 A nest designed to have configurable desirability; the inset shows the hinged roof activated, making the nest an undesirable site. The entire nest is 75mm×52mm with a 32mm×24mm cavity. The roof and floor are made from glass microscope slides 1.2mm thick and the wooden partition is 2mm thick. The nest entrance has a diameter of 5mm. A push-rod connects the control horn to a standard servo motor. Photo shows nest inhabited by the colony of T. rugatulus used to carry out the transport scenario.
2.3 The environment constructed to manipulate an ant colony's nest site selection in order to elicit controlled transport. The inset shows the paper marker that the colony transports when the desirability of the nests is adjusted by following the sequence of states shown along the bottom.
4.1 Overview of the modeling procedure used to produce the toolbox of processes and associated macroscopic characterizations.
4.2 Representation of the system phase-space ~S and the dynamics function ~Φ. The dynamics show an unstructured walk through the space suggesting a measure P(·) and hence the ergodic property.
4.3 Rather than requiring the characterization of a complex process, as in (a), the compositional approach makes analysis of aggregate processes feasible because macroscopic properties can be "summed" as in (b). (Arrows represent the statistical mechanical analysis.)
4.4 Three different views of the system dynamics. (a) A trajectory within the entire behavioral phase-space; (b) A hypothetical projection of the phase-space showing planar subspaces for behavior not part of task states S and the actions connecting these spaces; (c) The abstracted S representation as used by the formalism.
5.1 Plot of the number of pucks within home region versus time for both foraging strategies (with 250 robots). Left figure has low puck density, right has high puck density. Plots show mean and standard deviation for 5 independent simulation runs. (See Section 6.4 for a more complete empirical treatment of the foraging strategies.)
5.2 A subgraph of the global communication graph. Vertices represent robots and edges represent communication links. Numbers depict Process A state.
Two robots exchange a random value (12 here). No state information from the robots in gray is necessary; this is a strictly local interaction.
5.3 The density function describing the probability of Process A (n = 100 and Z = 10^5) being in a particular state at a random time. The plot of the full domain runs along the top, and the broken line delineates 97% of the probability mass, showing the sharp peak in the distribution.
5.4 Plots from ten experimental runs showing the state of a single robot (n = 100 and Z = 10^5) over time. In all cases, the plotted robot moved quickly into states well characterized by the theoretical mean and standard deviation.
5.5 A Process B transition. Filled vertices represent state +1, empty ones −1; solid edges depict alignment, broken edges misalignment. The number of solid and broken edges is conserved across the global graph. Within the local neighborhood there remain 4 aligned edges and 5 misaligned ones.
5.6 Plots of the entropy surface for a finite, albeit idealized, P_B. Here S is the entropy, M = m/n, and E is e normalized so as to fall between −1 and 1. Since S gives the log probability, these probability density functions are extremely peaked, i.e., allowing good prediction of average behavior.
5.7 Both Process A and Process B are coupled to produce an aggregate controller.
5.8 Plot of the proportion of robots using the bucket brigading and homogeneous foraging strategies.
5.9 Plot of the number of pucks foraged over time.
5.10 An aggregate controller is constructed to perform division of labor.
5.11 Performance of task allocation processes. The vertical axis gives the proportion of tasks (for the broken line), and the division of robots among tasks (the solid lines). Plots show mean and standard deviation for 5 runs.
5.12 Screen-shots of a simulation with 441 robots running for ∼520 seconds.
5.13 The number of neighboring robots within communication range for 441 robots flocking North-East then South-West. The plot shows the average over all robots plus/minus one standard deviation. There is a high density of robots; the communication is characterized by a falloff such that good connectivity can be had within a radius of 2m, and mixed performance can be had until 4m. See Section 6.3.2 for details.
5.14 Plot of the synchronization process's internal state for the first 400 seconds of a run with 441 robots. The transition of M values is clearly visible.
5.15 An overview of the time-stamp synchronization approach. Blue lines represent the behavior of non-ideal clocks; the thick red regions show when a node is holding the token for what it believes to be a fixed period of time. The dotted red line shows the effect of the ensemble average clock. The green bars, reflecting the durations that the token is held for, increasingly approach the true time.
5.16 The simulated wireless sensor network used for the distributed time-stamping problem.
5.17 Results for the simulated wireless sensor network used for the distributed time-stamping problem. The node number appears in the upper-left corner of each plot. For each node, we plot error in time estimate of the ensemble averaged clock as received as a token at the node (red) and the local clock (blue) versus time. Both axes are in units of seconds.
5.18 Plot of the variance in the simulated clocks versus time. Node 18 is excluded in these calculations.
The linear clocks show the expected quadratic increase, but the token averaged clock is a significant improvement, having a maximum variance of 4.5266×10^−07.
6.1 A schematic representation of the environmental realism offered by other simulators and their intended system size.
6.2 An arrangement of robots and an obstacle with a portion of the triangulation used by the simulator. Edges in the triangulation are shown as solid lines; circles circumscribe triangles and are guaranteed to contain no vertices if the triangulation is Delaunay. The region is a star-shaped polygon produced by joining edges incident to the central vertex, and is always free of other vertices.
6.3 The flip operation switches the middle edge within a quadrilateral. In the case shown it restores the Delaunay property.
6.4 Plot of the mean degree of the vertices versus time in a 1500-robot simulation. The robots were expanding to fill an environment and the degree decreased over time accordingly.
6.5 A scenario in which the central robot is about to undergo a motion. Only part of the global triangulation is shown. Three regions are shaded, each having different implications for the update of the underlying triangulation.
6.6 The map of the building used in the examples that follow; it represents an area of approximately 40m×20m.
6.7 Path attenuation factors (PAF) from a transmitter are calculated by ray-casting and integrating values along the connecting ray. Partition values for brick, drywall and plumbing are from those reported in Rappaport (2001).
6.8 Ray-tracing allows the reflected components of the received power to be calculated.
This is used to construct an angular distribution of the power, which is summarized with the three shape-factors (for clarity, the power produced by the direct path is omitted from the figure). The resulting shape-factors for the shown points are: Λ_a = 0.985525, γ_a = 0.590345, Θ_a = 0.291113; Λ_b = 0.662603, γ_b = 0.843989, Θ_b = 0.046750; Λ_c = 0.762548, γ_c = 0.388172, Θ_c = −0.191796.
6.9 Tests of the radio model have a static transmitter marked T. A test receiver moves from 1 to 2. Interference from transmissions by a single radio at a, b, c and d was also considered.
6.10 Radio transmission to a moving test receiver.
6.11 Radio transmission in the presence of interferers, at (a), (b), (c) and (d) respectively.
6.12 The robot with three laser fiducials, observed by a second stationary robot as used to collect the data for this model.
6.13 Measured values for given commanded ẋ and θ̇. Without noise the graphs would be inclined planes.
6.14 Performance of parametrized bucket brigading controller across system sizes. Top half of figure (five plots) gives data on low puck density case (0.781 pucks/m^2), bottom half the high density case (3.125 pucks/m^2). Two large plots on the left give performance (number of pucks foraged after 2000s) for robot groups ranging from 10 to 500 robots. Medium sized plots on the right give performance per robot for each of the D parameters. The six small plots, three along the top and three along the bottom, give time series data for 100, 250 and 400 robots. All data are averages of 5 independent runs; error bars show one standard deviation.
7.1 The constrained environment we consider is the life tunnel and cell theater exhibit. Both laser range finders and cameras were placed; only the lasers are useful for automated tracking.
7.2 Video stills from a ceiling mounted fish-eye camera and the simultaneous laser range reading. Non-background pixels are identified as red blobs within the laser scan.
7.3 Tracking output from both microscopic and macroscopic models.
7.4 A decomposition of the task space for disaster recovery robotics. The small and large blocks represent evacuation assistance and urban search and rescue tasks, respectively. The space spanned by the axes indicates conditions of the survivors; some regions are infeasible, and no survivors are assumed to exist outside the axes.
7.5 Two directional beacons used to deliver audio signals so that the source's location is easily identifiable. Manufactured by Brigade PLC and Klaxon Signals PLC respectively, the shrieker on the right was explicitly designed for evacuation scenarios.
7.6 A representation of one of the multi-floored test environments used for beacon deployment. The dots mark pieces of information added by the operator, in this case emergency exits and stairwell links. The image on the left includes a topological overlay for the environment. On the right, the emergency exits are numbered 1–5 and the connection stairways 6 and 7 (the second floor is only partially shown).
7.7 The map used with physical robots. Exits marked 1, 2. The connection labeled 3 is a stairwell. The ActivMedia Pioneer DX2 is shown moving toward the top of the stairwell.
7.8 The distances of each robot from its assigned destination. The reassignment occurs at the 300 second mark, resulting in the discontinuity.
7.9 All four of these plots give the mean distance an evacuee will travel (based on the simple flow model described in the text) as a function of the number of deployed beacons. This is a function of the locations of the beacons, too; this plot is for the five locations chosen by the described allocation algorithm. The two plots on the left have initial locations of simulated pedestrians assigned uniformly over the environmental area; the right plots use the value of the global integration as the statistical weight. The top two plots show generated data, mean and standard deviations (error bar width = one standard deviation) of simulated runs. The bottom two plots show the range of variation as the bias parameter is varied from 0 (solid line that is upper bound) to 1 (lower bounding solid line).

Abstract

Swarm multi-robot systems consist of groups of robots with simple interaction rules that exploit local information to collectively perform task-directed activities. Behavioral prediction of such systems is challenging; consequently, most existing methods for programming swarms lack principled design criteria. In this work we consider a particular class of swarm robot systems with hundreds of robots. The systems in this class are significantly larger than currently deployed systems. We show that the large size of such swarms makes them amenable to statistical analysis.

This work makes use of a mathematical framework based on equilibrium thermodynamic and statistical mechanical methods to enable a principled controller synthesis methodology for homogeneous robot swarms. This involves a two-step procedure. The first step is the construction of a toolbox, a set of distributed processes and their associated macroscopic characterizations. Analytical or numerical statistical mechanics and thermodynamics methods are used to determine these characterizations. The second step consists of construction of the robots' controllers by combining distributed processes while, simultaneously, coupling the processes' associated macroscopic characterizations. This approach aims to allow system designers to think about controller synthesis as the problem of combining macroscopic templates rather than as manipulation of low-level controllers, which are often sensitive to changes. Thus, this work is a step toward elevation of the level of description used while programming swarm behaviors.

General prediction of distributed behavior remains a difficult problem, though this research makes explicit a dynamical systems property, termed ergodicity, that makes behavioral prediction tractable. The toolbox is constructed from ergodic processes because they admit qualitative descriptions of performance, provide time-invariant equilibrium states, foster an understanding of behavioral regimes, and permit stability analysis. This work outlines several constructive principles and guidelines that aid in establishing a link between these formal properties and the realities of task directed robot system design. Examples of such guidelines include exploiting distinct time scales, and seeking conservation properties within task behavior.

We show that, taken together, these tools enable principled design of multi-robot systems. Controllers based on ergodic processes are produced for a number of classic coordination and collective decision making problems in the multi-robot literature. The synthesized controllers are validated with a special simulation tool designed for experiments with large-scale systems. The results show that, despite the simplicity of the individual processes, non-trivial collective behavior can be successfully synthesized. Moreover, the focus on ergodic dynamics and topological characteristics rather than algorithmic properties leads to novel solutions for these and related tasks.
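
The ergodic property mentioned above can be illustrated with a minimal sketch. It is not taken from the dissertation: the two-state chain, its transition probabilities `p01` and `p10`, and every function name below are assumptions made purely for illustration. The fragment compares the long-run time average of a single process with the average over an ensemble of independent copies; for an ergodic process both should approach the same stationary value.

```python
import random


def step(state, p01, p10, rng):
    """Advance a hypothetical two-state Markov chain by one step."""
    if state == 0:
        return 1 if rng.random() < p01 else 0
    return 0 if rng.random() < p10 else 1


def time_average(steps, p01, p10, seed=0):
    """Fraction of time a single long trajectory spends in state 1."""
    rng = random.Random(seed)
    state, visits = 0, 0
    for _ in range(steps):
        state = step(state, p01, p10, rng)
        visits += state
    return visits / steps


def ensemble_average(copies, steps, p01, p10, seed=1):
    """Fraction of independent copies found in state 1 after `steps` steps."""
    rng = random.Random(seed)
    total = 0
    for _ in range(copies):
        state = 0
        for _ in range(steps):
            state = step(state, p01, p10, rng)
        total += state
    return total / copies


def stationary_probability(p01, p10):
    """Exact stationary probability of state 1 for the two-state chain."""
    return p01 / (p01 + p10)


if __name__ == "__main__":
    p01, p10 = 0.2, 0.3
    print("stationary:", stationary_probability(p01, p10))
    print("time average:", time_average(200_000, p01, p10))
    print("ensemble average:", ensemble_average(5_000, 100, p01, p10))
```

With `p01 = 0.2` and `p10 = 0.3` the stationary probability of state 1 is 0.4, and both estimates should fall close to it. That agreement is what lets a single long run stand in for the statistics of the whole ensemble.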

More broadly, the research shows that macroscopic models provide an appropriate abstraction for synthesizing useful behavior in large-scale loosely-coupled distributed systems. For example, we demonstrate that by manipulating conditions in a colony of Temnothorax rugatulus, commonly studied social ants, a directed transport behavior can be induced. We also demonstrate how models of crowd behavior can be used in the design and validation of a coordinated multi-robot system to minimize egress path length and collective evacuation times.

Chapter 1
Introduction

This chapter provides an overview of the problems of modeling, synthesis and control of distributed systems addressed in this dissertation. It places the research into a broader context, explaining the scientific significance of these problems. We outline an example of collective control to ground the dissertation's approach to the problem of swarm synthesis, and list the contributions of the research.

Although the world around us has always contained many examples of distributed phenomena, only recently have engineered systems become fundamentally distributed. An observer of nature may well express wonder at complexity, having arisen from the local interactions of many seemingly simple constituents, but that same source serves well as a reminder of inadequacy and ignorance to the engineer. Indeed, a deep and general understanding of the relationship between phenomena across disparate temporal and spatial scales remains a fundamental scientific problem of our time. This fact is widely recognized, but any doubt as to the problem's significance diminishes when one considers the scientist's response: the birth of "complexity sciences," described not only in technical but also in popular accounts (Prigogine and Stengers 1984; Resnick 1994). One philosophical response is to proclaim defeat of reductionism (Horgan 1997). Perhaps, then, it is not surprising that those concerned with the construction and study of artificial systems might largely choose to ignore collective dynamics or self-organization as a means toward an end, and instead consider it something to be avoided. For example, several robot control architectures that are used for multi-robot systems are straightforward extensions to those designed for single robot control (Arkin 1989). While justified for small teams of robots, such a view becomes questionable with increasing numbers of robots.

The starting point of this work is the idea that as distributed systems grow in size, the distinction between local interactions and group-wide behavior becomes increasingly important; while the interactions remain programmatic rules, the aggregate behaviors become truly macroscopic phenomena. For example, when a large-scale multi-robot system consists of hundreds of robots, it is meaningful to speak of ensemble behavior, and one is justified in the construction of models that explicitly capture the multi-scale nature of the system's behavior. From such a perspective, one finds several non-traditional ways of approaching oft-studied problems in distributed system research. This will be demonstrated several times by considering archetypal distributed tasks in the pages that follow.

Figure 1.1: An ant transporting a sliver of paper.

The underlying hypothesis of this research is that macroscopic models provide an appropriate abstraction for synthesizing useful behavior in large-scale distributed systems. Although a significant portion of the work is concerned with defining the notion of a suitable macroscopic model, one may consider these models as high-level qualitative descriptions of a distributed system's aggregate properties. Appropriateness is primarily supported through demonstrations of sufficiency.
When taken together, however, these demonstrations also suggest that such descriptions are relatively robust to modeling uncertainty and approximation, at least for the loosely coupled sorts of distributed systems described in this work.

The following example serves to reify the preceding overview. The example is a demonstration of macroscopic control of a distributed biological system and, although simple, it captures many of the main ideas of the dissertation work. It suitably exemplifies what constitutes a macroscopic model and what is meant by macroscopic control.

1.1 Motivating example: manipulation of an ant colony

Ants of the genus Temnothorax are well-known social insects. The mechanisms by which these ants make collective decisions have been the subject of much research (Pratt et al. 2002; Franks et al. 2003; Pratt 2005). One example of particularly robust colony-level collective decision-making is employed by the ants in nest site selection. While scouting, an individual ant may discover a location that would make an ideal nest site. Being sheltered without too much bright light, for example, is important for these ants. Once such a site has been found, the individual ant shares this information with others in the colony. The colony then has a choice. It can remain in the current nest, or emigrate to a new location that scouts have found. The ant colony uses several mechanisms to achieve consensus and ensure that it is not split across multiple nest sites.

Although the mechanisms by which the ants reach consensus are fascinating, it is only necessary to know that the colony can reach consensus in order to exploit the behavior toward useful ends. One such example is directed transport. Under certain circumstances, the ants will transport particular debris and items to their nest. It is possible to manipulate the environment so as to cause the ant colony to emigrate from one location to another. By doing so, we show that a colony of Temnothorax rugatulus can be used to transport artificial items in a directed way.
Figure 1.1 is a photo of a scout transporting a sliver of paper toward the nest. This example is presented in detail in Chapter 2.

The effective and efficient optimization of nest sites by these ant colonies stems from the interactions of the constituent individuals. Although they each may have only limited information, together they achieve a robust, predictable behavior. Because the colony’s collective behavior is reliable, it can be usefully exploited. Thus, models of this sort, which are concerned less with the low-level microscopic interactions than with collective information processing capabilities, are useful abstractions for the control of distributed systems.

The next section describes how synthesis for multi-robot systems can be considered a similar control problem. A new methodology is produced by seeking to leverage descriptions of system-wide behavior during controller design.

1.2 Swarm synthesis as a macroscopic control problem

This research is concerned with controlling distributed systems consisting of many loosely interacting autonomous individuals. The preceding example and subsequent discussion have introduced the idea that macroscopic models can serve as a useful abstraction for achieving collective control. This dissertation explores the idea more thoroughly in the context of robotics. The objective will be to view tasks as macroscopic control problems and to exploit such models during the construction of suitable robot controllers. Thus, the focus is the connection between programmable interactions at the level of the individual and structured behavior exhibited by the system as a whole. Unlike in natural systems, the designer is free to limit the interaction rules to only those that allow a suitable local-to-global connection. Macroscopic models are accorded first-class status in the developed synthesis methodology, which results in deliberately forgoing expressibility at the local level so as to achieve global predictability.
Swarm robotics and its key challenges

Swarm robotics has been suggested as a paradigm for designing multi-robot systems (Beni and Wang 1989; Bonabeau et al. 1999). Its advocates envision large multi-robot systems with many autonomous agents, each individual being simple, small and, ideally, cheap. Despite the limitations of the constituents, large groups of such robots may perform useful tasks by exploiting cooperation and synergism. The aim is to construct a robot group with capabilities that greatly exceed those of the individuals. Such an arrangement could have numerous advantages over more traditional systems, including fault tolerance through sheer redundancy, the potential to exploit economies of scale, and the ability to imitate and adapt nature’s mechanisms.

Several outstanding problems need to be solved if the vision of robotic swarms is to become a reality. Fortunately, significant progress has already been made. Processor design, manufacturing and miniaturization developments continue to produce devices that are faster, smaller and more energy-efficient, with greater data densities than ever before. Recent years have seen a proliferation of research on short-range wireless communication technologies, fueled by the growing consumer market for portable devices. Much of this work is already being leveraged for robotic applications. But further research is needed to improve the energy capacity, sensor ranges and fidelity, actuator precision, and power consumption. Significant problems in systems integration and manufacturing remain. Although it will still be some years before systems with thousands of robots are deployed in unstructured environments, we are currently able to explore important aspects of swarm systems in laboratory settings. There are a few published experiments with more than fifty robots and such instances are notable for being pioneering efforts (see, for example, Howard et al. (2004); Konolige et al. (2004); Schwager et al. (2006)).
This work is evidence of the progress that has already been made, and suggests that hundreds of robots will be used in a research setting in the near future.

Figure 1.2: The distinction between microscopic and macroscopic views of multi-robot systems becomes more marked with increasing system size. We employ a technique that exploits this fact in the modeling and analysis of distributed processes. These processes are used as a building block in the construction of coordinated robot controllers.

As such systems become more common, the lack of principled approaches for achieving a desired swarm behavior must be addressed. Much existing work on synthesizing, modeling and analyzing traditional single or centralized systems lacks decentralized counterparts. This deficiency is exacerbated in the case of robot swarms (versus, say, explicitly coordinated teams) because they represent the decentralized mindset taken to the extreme. The swarm philosophy plays down the importance of precision and efficiency in the actions of individual robots, as long as the group can operate in a harmonious fashion to achieve its collective goals.

The core difficulty in attacking the design problem is the range of factors that contribute to the collective behavior exhibited by the swarm. Many aspects interact to produce the overall behavior, and these interactions may cross logical system boundaries. For example, the shape of a robot can have an effect on the robot’s environmental interactions, producing regularity in the world that, in turn, affects the controller responses. Feedback and dynamics from the robots and environment interact to produce globally observable behavior. The notions of emergence and self-organization have been invoked to describe the production of high-level spatio-temporal pattern from simple interactions (Fukuda et al. 1988; Steels 1990; Theraulaz et al. 1990).
Successful a posteriori explanations of system behavior often involve small details, highlighting the results of unexpected side-effects that have been multiplied by positive feedback, rather than conventional elements playing a direct causal role.¹ Despite these challenges, phenomenological models have met with some success (see, for example, Sekiyama and Fukuda (1996); Sugawara and Sano (1997); Lerman and Galstyan (2001); Martinoli et al. (2004)). Such models have been used to analyze existing systems and inform design through iterative refinement.

¹ See Holland and Melhuish (1999), who give a comparison with earlier results by Beckers et al. (1994), including several examples of small details that have implications on global behavior.

Process composition for swarm controller synthesis

This research tackles the problem of controller design by introducing a compositional approach in which macroscopic models guide the design process incrementally. This approach arose from a lack of existing synthesis techniques that address the distinctive levels of detail appropriate for large-scale multi-robot (and, more generally, distributed) systems. In groups with several hundred robots, the type of description appropriate for characterizing aggregate behavior is quite distinct from the information used to describe individual robots. Figure 1.2 gives an example of separate levels of detail in a large robot swarm: at the microscopic level, each robot has short-range sensors and local communications capabilities; at the macroscopic level, a function represents the average density of robots in a region of space. Rather than attempting to model a complete existing system end-to-end, we begin with primitive elements that satisfy conditions which allow for a type of macroscopic model to be constructed. Figure 1.3 sketches how this prediction enables construction of a process toolbox. Controllers are constructed by combining elements from this toolbox. The manner of this construction ensures that an equilibrium thermodynamic characterization of the collective behavior can be produced in parallel. More formal definitions and development are in Chapter 4.

Figure 1.3: The process toolbox is a set of process descriptions and associated macroscopic characterizations. The toolbox is produced by analyzing individual processes in isolation in order to construct a macroscopic model. The toolbox provides the primitive building-blocks from which complex coordinated behavior is synthesized.
(a) A single process is modeled by assuming it is executing in isolation. Analytical and numerical tools produce an entropy-based macroscopic description over far fewer dimensions.
(b) Repeated modeling for different processes allows construction of a set of processes and associated macroscopic descriptions. These tuples constitute the process toolbox.

We term the approach model-centric because the programming method is shaped by the need to accommodate modeling restrictions. Thus, the generality of the approach depends on the applicability of the modeling assumptions. Part of the present work is an empirical exploration of the capabilities of this toolbox-oriented synthesis approach and is based on the belief that the developed model is sufficient for producing a range of useful behavior. This research shows that task-oriented coordinated behavior can be produced in large-scale robot swarms through combinations of simple computational processes, each possessing the ergodic property which permits tractable construction of a macroscopic model. Ultimately this enables behavioral synthesis to proceed with an informed view of the swarm’s collective behavior (see Figure 1.4).

Figure 1.4: The level of description used by the system designer is elevated away from microscopic minutiæ toward collective properties. Analysis of individual processes produces a toolbox that connects microscopic and macroscopic descriptions, which informs the designer during controller synthesis.

Instead of large distributed systems being viewed as a challenge for algorithmic scalability, we propose viewing the system size as a factor to be usefully exploited. For example, with hundreds of robots, continuous models become good approximations to reality, despite the underlying substrate being fundamentally discrete. Also, the properties of individuals averaged across the system can become increasingly representative with increases in system size. This work can be viewed as an exploration of the alternatives that become possible in large robotic systems with hundreds of robots by embracing the belief that it is essential to understand the information processing capabilities at the individual and collective levels, as well as the relationship between the two.

1.3 Beyond robot swarms: optimizing collective evacuation

Figure 1.5: Macroscopic-level models have a broader applicability in robotics. This dissertation describes a multi-robot system that uses models of crowd behavior in order to minimize expected egress times.

The utility of macroscopic models is considerably extended when robots use the models in an online fashion. To apply this idea, and to demonstrate that the macroscopic control concept generalizes beyond the swarm systems targeted by the synthesis methodology, we consider the problem of designing a multi-robot team to reduce the time required for emergency evacuation from a given building. This is achieved by viewing one distributed system (the robots) as controlling the other (the crowds of evacuees). The robots affect evacuee behavior by carrying and activating directional audio beacons (see, e.g., Withington (2000)). The group of robots collectively deploy themselves and coordinate so as to maximize the efficacy of all the audio cues. They do this by using a model of crowd way-finding behavior in a distributed fashion. Figure 1.5 gives a graphical overview of the idea.
In contrast to the swarm synthesis work, the robots use an explicit coordination algorithm, demonstrating that models of aggregate behavior can also be effective when used with this class of coordination schemes. The few robots involved also mean that physical robots could be used for validation of the algorithm. See Chapter 7 for details.

Modeling human collective behavior allows for broader applicability of macroscopic control, but the assumptions of ergodicity and distinct timescales, which apply to robot swarms, are not generally justified for crowds. As a result, this work supplements an existing graph-based model of evacuation dynamics in order to model the effect of directional audio beacons. The extension is necessary because we are unaware of any existing mathematical models of their effect on group way-finding. Figure 1.5 labels this model “swarm-inspired” because it represents a high-level description that allows for a simple deployment heuristic. It is closer in flavor to the model of ant nest-site selection than the entropy-based models of the toolbox processes.

Macroscopic models for affecting crowd behavior complement the swarm control problem because, while a person’s range of behavior is extreme, several existing macroscopic models and theories suggest that, collectively, human crowds oftentimes act in a few, relatively predictable ways (Smelser 1962; Hillier 1996; Helbing et al. 2000). This counterpoints insect behavior, for example, where the collective regularity may be seen to be of greater complexity, or reflect greater capability, than the individual (Seeley 1989; Franks 1989). Both cases reflect the belief that macroscopic characterizations should play a key role in autonomous robot systems.
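To make the notion of a graph-based egress model concrete, the following is a minimal sketch and not the evacuation model used in this dissertation. It assumes the building is given as a weighted undirected graph whose nodes are rooms or corridor junctions and whose edge weights are walking distances, and it computes the shortest path length from every node to its nearest exit. The function name, the graph layout and the distances are hypothetical, and the effect of audio beacons is not modeled here.

```python
import heapq


def egress_lengths(graph, exits):
    """Shortest distance from every reachable node to its nearest exit.

    graph: dict mapping node -> {neighbor: edge_length} (assumed symmetric).
    exits: iterable of exit nodes. Uses multi-source Dijkstra search.
    """
    dist = {exit_node: 0.0 for exit_node in exits}
    heap = [(0.0, exit_node) for exit_node in dist]
    heapq.heapify(heap)
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry; a shorter path was already found
        for neighbor, length in graph[node].items():
            candidate = d + length
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist


# Hypothetical three-node floor plan: room -- hall -- exit.
floor_plan = {
    "room": {"hall": 5},
    "hall": {"room": 5, "exit": 10},
    "exit": {"hall": 10},
}
lengths = egress_lengths(floor_plan, ["exit"])
```

Averaging such lengths over the nodes where evacuees start gives one simple aggregate quantity that a deployment heuristic could try to reduce; how beacons alter the routes people actually take is a separate modeling question.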
1.4 Dissertation contributions

Novel aspects of this work include the choice of primitive elements from which programming proceeds, the use of a particular time-invariant characterization of each primitive, the type of combinational operations carried out and, more generally, the style of incremental, bottom-up, formal analysis performed, and its use as an integral part of the synthesis process.

The following are the main contributions of this dissertation:

1. Development of an equilibrium statistical physics inspired formalism that allows for a more principled synthesis of large-scale minimalist multi-robot systems on the basis of composition of individually analyzed processes. We identify the properties necessary to make prediction of individual processes and (recursive) compositions of the processes possible and tractable. An important part of this contribution is the precise definition of microscopic, macroscopic and ensemble system views, and the specific interpretations of these descriptive tools from the perspective of multi-robot systems. These definitions delineate the underlying local-to-global problem, and the formalism tackles the question of reconciling multiple levels of detail directly.

2. Development of prototypical components that can be combined in a predictable way, and also identification of a number of design principles which lead to easily analyzable systems. The items within this toolbox can be reused in further multi-robot controllers for a variety of domains.

3. Application of the formalism to coordination problems in the multi-robot coordination domain. This is a demonstration of the expressive power of the prototypical processes within the toolbox. These demonstrations are of robots performing either classical tasks, or variants of such tasks. For each task, the data are presented for significantly larger groups of robots than have been published in the literature.

The following are auxiliary contributions:

1. Explicit identification of ergodicity sheds new light on a technique that reoccurs in the multi-robotic literature.
We identify and discuss examples of researchers who have fortuitously used ergodic behavior, without necessarily recognizing the formal notion of it.

2. Development of infrastructure was necessary in order to perform realistic large-scale simulations, as well as for instrumentation of the California Science Center, the space used for the human crowd modeling and control research.

1.5 Dissertation outline

The remainder of the document is organized as follows:

Chapter 2: Controlled transport in an ant colony describes the motivating example, outlined above, in greater depth.

Chapter 3: Background and related work describes existing multi-robot research and positions the current work within this larger context.

Chapter 4: Large-scale simulation describes the infrastructure developed and used for this research.

Chapter 5: Synthesis methodology introduces the process modeling techniques used for synthesis of multi-robot controllers.

Chapter 6: Multi-robot case studies describes controllers for particular tasks, used to evaluate the robot synthesis methodology.

Chapter 7: Human collective behavior presents an overview of work related to macroscopic modeling of the collective behavior of crowds, and the evacuation assistance system.

Chapter 8: Dissertation summary concludes the dissertation.

Chapter 2: Controlled transport in an ant colony

This chapter describes a system constructed in order to control a colony of ants and serves as an example of macroscopic control. It concretely shows how qualitative macroscopic models can be used as an abstraction in the design of task-achieving distributed systems.

Ants of the genus Temnothorax (formerly Leptothorax) are social insects known for their unusually small colonies, often comprising a few hundred workers and a half dozen queens, and also for being particularly hardy (Rüppell and Kirkman 2005). It is a species-rich genus with more than 75 species identified to date (AntWeb Field Guide, 2008). Figure 2.1 shows representative members of the T. rugatulus species in their natural surroundings.
These ants inhabit existing cavities like cracks between rocks, seed husks and hollowed branches (Pratt 2005). Since these sites are relatively vulnerable, a portion of the colony’s workers (measurements for T. albipennis suggest about one third (Pratt et al. 2002)) are continually scouting for potential nesting sites. Depending on the suitability of a discovered site, the colony may choose to emigrate to a new home. But emigration is a dangerous, time- and energy-consuming process that risks predation of brood items and queens. Consequently, the ants demonstrate considerable deftness in nest selection. The ants are known to exhibit distinct preferences in terms of nest volume, entrance size, and light intensity (Franks et al. 2003). A colony will ignore sites of lesser quality than the nest it already occupies. This is a colony-level behavior that is robust despite the fact that occasionally a scout may mistakenly believe she has discovered a more suitable site.

Figure 2.1: Temnothorax rugatulus ants, the species used in the transport example below. (Photo credit: Alex Wild.)
(a) Workers are approximately 3 mm in length excluding antennae.
(b) An example of a natural nesting site. Their colonies typically consist of 50–150 ants.

With several ants scouting for locations simultaneously, it is common for the colony to have multiple promising nest sites to choose from. Splitting the colony would be detrimental to the survival of each of the individuals and the colony itself.¹ Several species of Temnothorax have evolved a mechanism for collective decision-making that ensures the colony will not divide itself among the candidate sites. Both the comparison of candidate sites and the method of emigration are solved together in a fascinating manner. Comparison of sites is itself non-trivial because most often each of the potential locations will have been observed by scouts that have not seen all the other candidates.
An individual ant will begin actively recruiting others to support the site she has discovered, but may be recruited to support emigration to other more promising sites if other ants recruit her vigorously enough. The ant will act once she believes a site to be suitable, but the mechanism succeeds collectively in spite of her individual ignorance of the quality of other sites, i.e., under partial observability. These local interactions result in macroscopic dynamics which solve the selection problem.

¹ Many have suggested that genetic selection occurs at the colony level but much controversy remains as to which genes are involved, how mechanisms vary with sex-ratios, and so forth. Bourke and Franks (1995) use the term “minefield” to describe the group selection debate. A currently favored theory suggests that both colony-level and individual selection occur (Owen 1989).

2.1 Ant-to-ant interactions

The following detailed description of the mechanisms employed follows Pratt et al. (2002) and Mallon et al. (2001). A scout that has discovered an interesting site recruits her nest-mates to it after a delay that is shorter the better the site is. Initially she recruits by performing a tandem run with another ant, a process in which one ant leads another along the route to the nest. The leader ant takes a few steps before stopping to wait for the follower, and the follower acknowledges that she is behind the leader by tapping the leader’s legs with her antennae. Although a slow process, it allows the recruited ant to learn the route to the nest site and is effective because thereafter she can perform recruitment herself. In addition to the tandem runs, a scout may recruit nest-mates by performing transports in which she carries other ants to the new site. These transports are significantly faster than tandem runs and are the only way passive ants, queens and brood items will move from one nest to another. A scout will switch from tandem running to transports with a likelihood that depends on the number of ants at the new nest site.
This is the so-called quorum rule. Tactile sensing at the nest site allows this quorum to be sensed. Pratt (2005) writes: “collective comparison [of candidate nests] emerges from competition between independent recruitment efforts at each site.” The site-dependent latency ensures that the rate of increase, i.e., positive feedback, at a site reflects the quality of the site. The quorum rule serves to accelerate the process through amplification. Although the first mechanism may suffice to select the superior site, it is advantageous to shorten transients which otherwise would involve many transports of passive ants and brood items.

Figure 2.2: A nest designed to have configurable desirability; the inset shows the hinged roof activated, making the nest an undesirable site. The entire nest is 75 mm × 52 mm with a 32 mm × 24 mm cavity. The roof and floor are made from glass microscope slides 1.2 mm thick and the wooden partition is 2 mm thick. The nest entrance has a diameter of 5 mm. A push-rod connects the control horn to a standard servo motor. The photo shows the nest inhabited by the colony of T. rugatulus used to carry out the transport scenario.

The mechanisms these ants employ during recruitment, as well as their other behaviors, are well understood because Temnothorax may be successfully kept for study in the laboratory. Nests are easily manufactured by cutting a hole in a balsa wood spacer, sandwiching this between two microscope slides, and drilling through the top slide to provide a nest entrance. The desirability of such a nest can be adjusted by varying the entrance hole diameter: colonies prefer smaller entrances so long as the diameters exceed a minimum of about 0.75 mm (cf. Franks et al. (2003)). If the colony is confined to an environment containing only a single nest with a larger than desired entrance hole, then the scouts attempt to improve the currently occupied nest. They do this by finding items that they can use to barricade the entrance.
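The recruitment mechanism described in this section can be illustrated with a toy simulation. The sketch below is not the agent-based model from the cited literature. It is a minimal illustration that assumes site quality can be treated as a per-step recruitment probability (standing in for the site-dependent latency) and that a fixed head count serves as the quorum. The function name and all parameter values are hypothetical.

```python
import random


def simulate_emigration(site_quality, n_scouts=20, n_passive=40,
                        quorum=8, steps=5000, seed=0):
    """Toy quorum-rule model; returns the head count at each candidate site.

    site_quality maps a site name to a value in (0, 1]. A higher value stands
    in for a shorter recruitment latency (assumption: modeled here as a higher
    chance of recruiting on any given step).
    """
    rng = random.Random(seed)
    sites = list(site_quality)
    committed = {}                        # scout id -> site she supports
    count = {site: 0 for site in sites}   # scouts plus carried ants per site
    passive_left = n_passive

    for _ in range(steps):
        if passive_left == 0:
            break
        scout = rng.randrange(n_scouts)
        site = committed.get(scout)
        if site is None:
            # An uncommitted scout discovers one of the candidate sites.
            site = rng.choice(sites)
            committed[scout] = site
            count[site] += 1
            continue
        if rng.random() > site_quality[site]:
            continue  # still waiting out the site-dependent delay
        if count[site] >= quorum:
            # Quorum sensed: switch to fast transport of a passive ant.
            count[site] += 1
            passive_left -= 1
        else:
            # Below quorum: slow tandem run that recruits another scout.
            follower = rng.randrange(n_scouts)
            previous = committed.get(follower)
            if follower == scout or previous == site:
                continue
            if previous is not None:
                count[previous] -= 1
            committed[follower] = site
            count[site] += 1
    return count


result = simulate_emigration({"good": 0.9, "poor": 0.2}, seed=1)
```

Varying the quorum threshold or the gap between site qualities in this toy model is one way to explore the trade-off mentioned above between waiting for the slower mechanism and shortening transients through amplification.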
2.2 Controlled transport

A Temnothorax colony, taken collectively, will express a preference for some nest site, and will act in a coordinated fashion to inhabit the preferred site. The discovery, evaluation and relocation all rely on distributed mechanisms, but the result is a simple-to-state group behavior. Most importantly, this group behavior is robust, stable, and ultimately successful over a wide range of low-level microscopic details: a colony will exhibit repeatable behavior in the face of changing environmental conditions, colony constitution, and uncertainty. It is meaningful to speak of a group behavior precisely because this repeatability is a statement of regularity in the system’s response. As with other computing, it is functionality that can be used repeatedly over a range of low-level details that makes for a useful abstraction.

From a synthesis perspective, one need only manipulate nest sites so as to have a colony change its preferences in order to move that colony. Or, more precisely, one may have the colony move itself. But causing colony emigration for its own sake is not particularly useful, and synthesis sets out to achieve some useful task. However, by introducing a suitable artificial item into the ants’ environment, one can have the ants transport the item along a chain of nest sites in a predictable and directed manner. We call this macroscopic control because the only real knowledge and influence that the designer has with regard to the system is at the collective level.

2.3 Experiment design and outcome

In order to demonstrate collective control, we design an apparatus that allows the ants’ environment to be dynamically shaped to suit a transport task. We design a network of reconfigurable nest sites, with each site having an actively controllable level of desirability. By triggering different nest configurations, the colony’s nest selection is biased so as to have the colony emigrate from its current nest to one of our choosing. The transport of artificial items follows by convincing the ants to move items to improve their new nest.
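The control loop implied by this design can be sketched as follows. This is an illustration only: the nest labels, the assumption that exactly one nest is left desirable at a time, and the idea of advancing once some sensor reports the transported item at the target nest are all assumptions made for the example, and the class and function names are hypothetical.

```python
NESTS = ("A", "B", "C")  # hypothetical labels for three reconfigurable nests


def roof_states(target):
    """Leave only the target nest desirable; True means 'make undesirable'."""
    return {nest: nest != target for nest in NESTS}


class TransportController:
    """Steps through a list of target nests, one emigration at a time."""

    def __init__(self, targets):
        self.targets = list(targets)
        self.index = 0

    @property
    def target(self):
        # Wrap around so the sequence of targets can repeat indefinitely.
        return self.targets[self.index % len(self.targets)]

    def update(self, item_location):
        """Advance once the item is reported at the current target nest.

        item_location is the nest where the item was last detected, or None.
        Returns the configuration to apply to the nests.
        """
        if item_location == self.target:
            self.index += 1
        return roof_states(self.target)


controller = TransportController(["A", "B", "C", "B"])
first = controller.update(None)    # item not yet seen: keep nest A desirable
second = controller.update("A")    # item reached A: bias the colony toward B
```

Note that the controller never reasons about individual ants. Its only inputs and outputs are colony-level quantities, which is the sense in which the designer's knowledge and influence are confined to the collective level.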
Figure 2.2 shows the basic nest design modified so as to have a hinged roof. The control horn mounted on the roof allows a standard hobbyist servo motor to open the roof and expose the nest’s contents. Once the roof has been opened, the desirability of the site to the colony is greatly reduced. Notice, also, that the nests have an entrance hole with a diameter of 5 mm.

We arranged three hinged nests within an inescapable 37 cm × 30 cm container. A colony of about sixty T. rugatulus was placed within the apparatus. To this we added a single colored 7.5 mm × 2.5 mm sliver of paper intended to be transported by the ants. This environment was designed to be extremely sparse, having no dirt, soil, or dust. The only thing available for scouts to use in protecting the nest entrance is the provided sliver of paper. A camera was positioned over each of the nests in order to estimate the number of ants at each site and to detect the presence of the paper sliver at a particular nest entrance. The bottom of Figure 2.3 shows the sequence of states that the apparatus was designed to take. The photo shows the apparatus approximately six hours after a transition from the first state to the second. The magnified inset shows the ants occupying the central nest along with the sliver successfully positioned by an ant to protect the nest’s entrance.

Figure 2.3: The environment constructed to manipulate an ant colony’s nest site selection in order to elicit controlled transport. The inset shows the paper marker that the colony transports when the desirability of the nests is adjusted by following the sequence of states (State 1, State 2, State 3) shown along the bottom.

Repeated manipulation of the nests through the sequence state 1, state 2, state 3, state 2, state 1, state 2, ... has shown that both the ant nest selection and the transport of the paper item are repeatable and reliable. Initially emigration from one site to the next took about 8 hours, but with repeated relocations the ants became faster until the colony was averaging about 4.5 hours. (This speed-up has been observed elsewhere, cf. Pratt (2005).)
The time reported is approximate because it is difficult to determine exactly when the emigration should be considered complete. The photo in Figure 2.3 shows a good example: even many hours after relocation to the center nest, scouts are visiting the previous location. It is unclear why this occurs, but it is possible that lingering pheromones play a role. We consider emigration complete once all of the passive members of the colony are believed to have been transported.

Unlike emigration completion, the single sliver of paper allows the cameras to determine when nest roofs must be adjusted in order to proceed to the next state. Observations showed that, once the colony had emigrated, the time taken to transport the sliver was highly variable. Sometimes as few as 2 hours were needed but, occasionally, the ants were left overnight. No more than 24 hours were ever needed.² The nest emigration is a response to a radical alteration of the colony’s environment, whereas transport of the sliver requires recognition of the continued exposure of the nest entrance. While passive nest members are diligently carried to a new nest site after quorum has been reached, the scouts will entirely ignore the paper sliver. Relocation after a nest is destroyed is clearly a much higher priority than improving a currently inhabited one. Furthermore, observations suggest that the scout actually rediscovers the sliver each time it is transported, rather than learning of its applicability more generally. An explanation of the variability in sliver transport times will need to consider the effect of priorities on the synchronization of activity because long periods of inactivity are observed after relocation but before the paper item is transported. Cole (1991) studied such activity cycles in T. allardycei.

2.4 Discussion

This simple if somewhat unusual example possesses several aspects that recur throughout this dissertation. It demonstrates how a high-level description of the mode of system behavior can be used to guide programming a system.
A detailed parameterizable agent-based model (Sumpter et al. 2001) of nest-site selection exists, and a quantitative agreement is established with T. albipennis (Pratt et al. 2002) and T. curvispinosus (Pratt 2005). Qualitatively, these two species employ the same mechanism, but quantitative differences suggest that T. albipennis places greater emphasis on speed than T. curvispinosus, which instead trades speed for accuracy (Pratt 2005). These interspecific data are clearly of great use to system designers who implement similar mechanisms and wish to optimize their system’s performance. But this research embodies the idea that a method for synthesizing behavior in large-scale systems that is generalizable (e.g., through composition) despite having only very coarse descriptions of the system behavior, can serve where a precise but specific theory cannot. Although the T. rugatulus species used in the example above has not yet been quantitatively modeled, the qualitative model was sufficient to induce transport.

² Somewhat ironically, as the ant behavior becomes reliable on the timescale of days, the electronic cameras would fail over such periods and require a physical reset of the universal serial bus host.

Although fascinating discoveries, the detailed mechanics of recruitment via delayed transports and tandem running were unimportant in the design of the directed transport capability. There are several other ways that the colony might have solved that same nest selection problem, each of which would have resulted in the same outcome of directed transport of the paper sliver. Of course this need not be the case, but the example shows that the mechanism was independent of the other task operations. Such independence (or orthogonality among several behaviors) is an important design goal, although less practicable when considering a natural distributed system. It should be stressed that potential ignorance of the underlying mechanics is more than advocacy for phenomenology.
While phenomenological models may suffice for control, nothing precludes more descriptive methods, but, in either case, the model must capture a dynamic system-wide response across a range of inputs or parameters.[3]

Even though this dissertation develops an equilibrium statistical mechanics-inspired formalism primarily for synthesizing large-scale robot swarms, the idea of macroscopic control is, as the preceding ant example shows, more widely applicable.

[Footnote 3: A descriptive model does have an advantage because it may provide evidence for the robustness or independence (or both) of the behavior, which is more challenging with phenomenological models.]

2.5 Summary

A colony of Temnothorax ants will collectively produce nest selection behavior that is predictable and robust. We are able to exploit this group-level behavior to induce transport of artificial items in the environment. Directing this transport requires only a qualitative model of this colony-level behavior, which is robust to variants of underlying local behavior. As a motivating example, it shows that high-level macroscopic models can be used as an abstraction in the design of task-achieving distributed systems.

Chapter 3: Background and related work

The purpose of this chapter is to elaborate on the modeling and synthesis problems central to this work, further motivating their significance, and to establish the rationale for the proposed approach. The nature of the topic dictates that work from several disciplines be examined, including Computer Science, Cybernetics, Robotics, Artificial Intelligence and Statistical Physics. The chapter seeks to provide a context for the current research as well as a review of recent results and their relationship with the present work.

The previous chapter began by considering the fundamental scientific problem concerned with understanding how complexity arises from the interactions of constituent parts.
Anderson (1972) has pointed out that the division of the sciences into the disciplines we now use is meaningful only because the universe is filled with structure that does not trivially follow from lower-level descriptions. This chapter begins with Section 3.1, which outlines why multi-robot systems are suited to the study of complexity and describes how the synthesis problem in large-scale systems is really a model of the much grander, fundamental problem. The section also describes the motivation for studying multi-robot systems in greater detail than afforded by the previous chapter. It shows that the controller synthesis problem arises in many contexts. Section 3.2 defines the controller design and the synthesis problem precisely. Section 3.3 provides a systematic presentation of existing approaches to designing coordinated multi-robot systems. The section is devoted to categorizing and classifying multi-robot systems work, while comparing it with the dissertation research. Section 3.4 serves to further clarify the relationship, discussing several aspects that do not fit in the preceding sections. The penultimate section describes background research that motivates the choice of toolbox processes by showing that comparable operations are found across a wide range of natural distributed systems.

3.1 Motivation for multi-robot and swarm systems

Robots are complex systems whose long-term dynamics are difficult to predict. The difficulty stems from the many aspects of the system and their interactions that must be understood. For example, feedback effects can result from robots altering a shared environment. The behavior exhibited by a multi-robot system is often more complex than would be anticipated by a naïve analysis of the individual robots.

In addition to the characteristics of typical complex computational systems, several distinguishing features make robots unique tools for studying complexity.
Robots balance two critical properties: (1) they are controllable because they attempt to execute commands, while (2) inexorable physical laws place constraints on system behavior. Although robots can be instructed to perform operations, they do not always oblige. When a robot is commanded, the robot’s construction, and ultimately the underlying physics, dictate how quickly or accurately that command is executed. Informational constraints mean that no sensor provides error-free readings, nor can any perfect and complete model of the real world be produced. In robots, unlike computers, instructions are not always carried out perfectly and the world’s state cannot be perfectly queried.[1] Robots are interesting because they fall between the two extremes of being perfectly controllable and being entirely constrained.

[Footnote 1: Memory reads and writes do fail, but so infrequently that the underlying theoretical model ignores such failures.]

Research on multi-robot systems is proceeding on fronts that range from addressing the needs of particular applications to more foundational questions about the nature of computation in physical systems. What follows is a survey of the major motivations for studying multi-robot systems and how this dissertation relates to them. The presentation serves to highlight that the synthesis problem is central to an extremely wide range of research.

Current and potential applications. Although few multi-robot systems are currently deployed outside research settings, most current engineering is concerned with achieving useful tasks in an autonomous or semi-autonomous way. From an applications standpoint, a multi-robot system may have advantages over single robot systems: some tasks may be impossible for a single robot, while other tasks may be feasible for a single robot but be performed more efficiently or robustly by multiple robots. Examples of current applications of multi-robot systems include urban search and rescue (Kobayashi and Nakamura 1983; Murphy et al. 2001),
mine-clearing (Long et al. 2005), hazardous waste clean-up (Parker 1998), military reconnaissance (Gage 1992; Oh and Green 2004), automated mapping (Thrun 2002; Fox et al. 2006; Howard et al. 2006), etc. Important applications exist in the service of science, for example, space exploration and remote habitat construction (Brooks and Flynn 1989; Thomas et al. 2005), dynamic instrumentation and sensing for long-term study of spatially-extended phenomena through distributed sensor (Estrin et al. 1999) and sensor/actuator networks (Rahimi et al. 2004; Sukhatme et al. 2006). Increasingly, the biological sciences are taking advantage of robotics technologies (Balch et al. 2006). Several applications are envisioned for increasingly large multi-robot systems, for example, disposable smart-dust (Kahn et al. 1999) and nano-scale robot swarms (Lewis and Bekey 1992; Requicha 2003; Hogg and Sretavan 2005). The macroscopic model-centric synthesis method developed in this dissertation has potential for such applications.

[Footnote 1, continued: At timescales short compared with an electron’s movement, transients can be observed, but they are insignificant to the programmer, whose code runs far more slowly. This is an instructive example as it is analogous to the use of macroscopic models as abstractions for synthesis.]

Robot systems as models of biological systems. Several researchers have used robots as tools with which to test hypotheses about biological systems. Scientific observation of biological behavior results in plausible mechanisms being postulated, which can be implemented on robots (physical or simulated) and interactively studied. This is an effective way to ensure that proposed explanations of behavior observed in nature include the necessary causal elements, and do not make unnecessary assumptions (Webb 2001). A variety of natural behaviors ranging from general models of reflexive behavior selection (Connell 1990), to navigation (Matarić 1990), as well as models of evolutionary processes for single (Nolfi et al.
1994) and multi-robot (Agah and Bekey 1997) systems have been studied in this manner. Subjects include crickets (Webb 1994), lobsters (Ayers and Crisman 1992; Grasso et al. 1996), frogs (Arbib 1989; Arkin 1989; Arbib and Liaw 1995), flies (Cliff 1990), as well as social insects, such as cockroaches (Garnier et al. 2005; Halloy et al. 2007), and ants (Deneubourg et al. 1990; Beckers et al. 1994; Kube and Zhang 1993). Several researchers conduct experiments in reproducing elements of “higher-order” intelligent behavior, e.g., social symbol grounding (Billard and Dautenhahn 1999; Cangelosi and Riga 2006), and language acquisition (Steels and Vogt 1997; Marocco et al. 2003; Steels 2003).

Other researchers are less interested in evaluating hypotheses than in adapting nature’s techniques for engineering purposes, for example, Payton et al. (2001). This typically results in the appropriation of an abstracted version of nature’s local interactions in order to reproduce the collective behavior. As in the Temnothorax transport example in the preceding chapter, nature may provide useful local rules, but what is different about the work outlined in this dissertation is its focus on macroscopic reuse.

It is worth noting that a robot’s physical embodiment can be important in the modeling process, as discussed in Section 3.3.1.1.

Robot systems exploiting models of biological systems. A less common way in which robot systems are used to study biological systems is by programming robots to interact with the biological system being investigated. When a particular model of the biological system is used to shape a robot’s interactions, then observation of the behavior of both the robot and the biological system can be particularly informative, especially under controlled environmental conditions. Vaughan et al. (1998, 2000b) conducted the first experiments in robot-animal interaction by demonstrating control of a flock of ducks by an autonomous robot equipped with an overhead tracking system.
The work showed that a simple model suffices in the case of a group of ducks. Böhlen (1999) analyzed interactions between a robot and three chickens, identifying techniques to reduce the animals’ anxiety. Caprari et al. (2005) outlined ongoing work in controlling a cockroach group by having robots accepted as congeners. Much ongoing human-robot interaction research (see Fong et al. (2003) for a review) does not directly fit the characterization given above, but there are exceptions, such as Breazeal et al. (2004), who applied Joint Intentions Theory (Cohen and Levesque 1991) to structure human and robot interactions. The modeling and control of human crowds described in this dissertation fits this characterization. So does the Temnothorax transport example if the nest apparatus is considered a robot.

Multi-robot systems for the study of cooperation, coordination and competition. Problems related to cooperation, coordination and competition are the subject of inquiry in disciplines ranging from sociology to economics (Smelser 1962; Brown 1965). As above, multi-robot systems can serve to test hypotheses. A robotic implementation ensures that frameworks and theories are precisely specified, which can be particularly useful, and challenging, for the social sciences. For example, testing the proposal of Cohen and Levesque (1991) on physical robots would be a stringent test of the underlying psychology, namely Bratman (1987). In addition to ensuring algorithmic rigor, robotic systems quickly highlight unreasonable assumptions. The move toward “embodied artificial intelligence” crystallized around the demonstration that off-line symbolic planning, neglecting uncertainty while assuming a static world, was inappropriate for agents expected to inhabit the real world (Brooks 1986).

For work on cooperation and competition of a more theoretical nature, multi-robot systems offer a concrete application area with circumstances that challenge existing assumptions and suggest areas worthy of exploration. Perhaps the best example is in economics.
Market-based approaches have become popular for coordinating robots (Gerkey and Matarić 2002; Zlot et al. 2002; Vig and Adams 2006; Dias et al. 2006). The result is that several important (as yet unanswered) questions have been raised by the robotics community, such as: What principled methodology can be used to calculate utility or price within robotic systems (Gerkey and Matarić 2004b)? Why should a competitive mechanism (e.g., a free-market auction) be appropriate when designing cooperative and potentially altruistic agents? (cf. Simon (1996, pp. 30–41)).

Autonomous agent and distributed artificial intelligence research stands to benefit from robotics in ways similar to both of the preceding examples. Robots help identify unrealistic assumptions by providing an experimental testbed, and suggest directions for further refinement. Multi-robot systems continue to be an important motivation and application area for such research, with several aspects of multi-agent research offering much potential for roboticists (for a further discussion see Gerkey and Matarić (2004b)). Section 3.3.1.4 provides examples and a discussion of multi-agent techniques applied to robotics.

Robots as a systems testbed for distributed algorithms. The wide range of unpredictable events that occur in the real world means that robots are subject to a host of failures. Some examples include the permanent failure of moving parts, erratic network behavior, power discontinuities, and other temporary failures. On the other hand, most traditional distributed algorithms are studied under the assumption of moderately stable, reliable and relatively benign computing infrastructure (cf. Foster and Kesselman (2003)). For a multi-robot system to be treated successfully as a distributed system, the software implementations must be extremely robust (Section 3.3.1.2 lists particular examples). At present, few general systems principles, comparable to Saltzer et al. (1984), for example, have been identified for such systems.
Several visions for the future, including the proposed ubiquitous computing paradigm (e.g., Weiser (1993)), will require the development of distributed programming techniques capable of coping with dynamic, heterogeneous and sometimes hostile computing environments.

Robots in the development of theories of situatedness, emergence, and commutable resources. The complex phenomena associated with systems of robots and robot-like agents have inspired several theoretical developments. Robot systems highlight the distinction between agency and environment (see Franklin and Graesser (1996)). This situatedness is important, requiring that robots, along with other situated agents, be studied in conjunction with the environments they inhabit. Kaelbling and Rosenschein (1990) explicitly model and consider the importance of situatedness in intelligent agents. The environment is of particular importance in multi-robot systems because rich inter-robot interactions can be mediated by the environment, and even a relatively simple agent interacting with a complex environment exhibits complex behavior (Simon 1996, pp. 63). (See also Moore (1990) for a formal result.) The adjective emergent is used to describe pattern or behavior, typically novel and unexpected, stemming from interactions between entities rather than any particular entity itself. Steels (1990) writes: “Emergent functionality has become one of the main themes in research on Artificial Life and autonomous agents.” This concept has also been the subject of theoretical models (see, for example, Crutchfield (1994); Darley (1994); Hogg and Huberman (1991)).

The decentralized nature of multi-robot systems has stimulated theoretical work that seeks to understand generalized resource correspondences and trade-offs. This likely stems from the fact that distributed solutions to problems typically require factorization or decomposition: a process that exposes structure inherent to the problem and leads to the question of resource requirements.
Several researchers have examined particular robotic tasks (or classes of tasks) and shown that communication, internal state, sensing capabilities, and even the numbers of agents are variables that can occasionally be traded off in favor of others (Blum and Kozen 1978; Jones 2005; O’Kane and LaValle 2006). Donald (1995) introduced the theory of information invariants, which has a stronger focus on the capabilities of distributed sensori-computational elements rather than tasks, and considers a geometric model for commuting resources. Work has also been done on defining formal measures and asymptotic complexity-classes for particular resources (see, for example, Klavins (2003a)).

Robots and questions of a philosophical or organizational nature. Robots have been used as evidence to support a variety of claims and counter-claims in philosophical debate. Undoubtedly the best known are the vocal questioning, in Brooks (1991, 1990), of the physical symbol hypothesis (Newell and Simon 1976); the rejection of traditional knowledge representation in AI; and the proposal to study intelligence in a bottom-up fashion by first reproducing the behavior befitting simpler animals (Ashby 1966), before examining higher-level intelligence.

When emergence and situatedness are taken to an extreme, doubt is cast on the efficacy of any decompositional techniques for studying robot behavior. Indeed, Darley (1994) proposes that systems with emergent behavior can only be truly predicted by direct simulation. A question remains as to what physics should be simulated: the level of description can have major implications for what is tractable (Feynman 1982) or even theoretically simulable (Smith 2006). Ultimately such reasoning leads to the conclusion that the simplest representation of some physical systems, at least in terms of reproducing behavior, may intrinsically be the physical system itself.

Taking a less radical view of emergence, one may still assume interaction side-effects result in significant system functionality. See Section 3.3.1.1 for an example of minutiæ playing an important role.
Such cases provide a challenge to the top-down methodology which has been critical for the success of the natural sciences; Simon (1996, pp. 16) notes that, despite the preoccupation with deductive methods, this is equally true for works of mathematical rigor and dates back at least to Whitehead and Russell (1910, pp. v–vi). Simon claims that because artificial systems are particularly suited to simulation, the process of abstraction and simplification should be tackled by simulation. Popularizations have touted such computational approaches as a new method of scientific inquiry (for example, see Resnick (1994), which, in addition, emphasizes a decentralized mindset).

These ideas predate the modern digital computer, with early cyberneticists constructing physical or analogue electronic circuits in order to model phenomena of interest; see Wiener (1948, 1962); Walter (1950, 1951, 1953); Ashby (1966, 1956); Pask (1961). Later developments turned the focus of cybernetics inward, studying its own processes and explicitly considering the role played by observers and system modelers. The result, so-called Second-Order Cybernetics, grew into Generalized Systems Theory. Each of these developments de-emphasized reductionism while claiming to be applicable to an ever larger class of systems. Such work has been criticized, along with modern theories of a similar flavor (e.g., see Bak et al. (1987); Bak (1996)), as claiming to provide a unified view and explanation of everything, but failing to produce any useful predictions (Horgan 1997, chapt. 8).

3.2 Defining the behavioral synthesis problem for multi-robot systems

The single recurring problem, fundamental to each of the preceding motivations, is the synthesis problem. Whether it be applications, hypothesis testing, or even robots executing generative models, there is some expectation of the way the robots ought to behave. Synthesis involves prescribing algorithms or rules to individual robots in order to achieve a desired group behavior.
For applications, the desired behavior is a task-specification, and consequently, the synthesis problem is one of producing the interactions that lead to task achievement. This may be done by decomposing and restructuring the task, for example, by defining higher-level competencies, positing roles, and having robots dynamically share organizational information. Alternatively, robots may be programmed to achieve the task by providing each robot with simple operations that only require communication with nearby robots. Either case is a constructive process that leads to a pre-specified behavior, which is behavioral synthesis.[2] Synthesis is the process of constructing control laws and interaction policies for individual robots so as to achieve some pre-specified group behavior.

Synthesis and analysis are typically coupled and used iteratively to approach some desired group behavior. Hypothesis testing, for example, is an incremental procedure involving both synthesis and analysis. After some behavior is observed in a natural system, an hypothesis is formulated as to the local-rules responsible for this behavior. A robotic implementation that embodies this hypothesis is studied by comparing observations of the robotic and natural systems. These observations form the basis for accepting or rejecting the hypothesis. However, additional analysis will often be necessary; for example, different controller parametrizations may need to be explored. An understanding of the modeled behavior is achieved by iterating the process: proposing, testing and refining hypotheses. Synthesis plays its part in this process whenever the hypothesis is to be updated, either because of contradictory observations or serendipitous discoveries. In such cases, the natural behavior to be reproduced is known and the objective becomes to produce a suitable controller. Suitability in these cases may require satisfaction of additional constraints. For example, the research direction may assume a connectionist model.
But the synthesis is informed through experience with prior hypotheses, and any results of the system analysis. Thus, synthesis and analysis complement one another.

[Footnote 2: We differ from Brooks (1991) in that we do not require an automated procedure for synthesis, but term such a case “automated synthesis.”]

3.2.1 Intrinsic challenges

The synthesis problem for multi-robot systems is challenging because the design space of controllers remains poorly understood. The practitioner is faced with many different design decisions and few guidelines or explicitly identified trade-offs. Much existing experience is drawn from knowledge of particular problem domains, and identifying general insights with broad applicability remains difficult. This stems, at least partially, from the large number of factors that can alter system behavior. Researchers using different robot hardware platforms, or minor variations in a task, in different environments may find that previous insights fail to apply directly. As an expository tool, it is helpful to think of controller synthesis as a search problem. If synthesis is the search for a satisficing controller, then the space of possible controllers has many dimensions.

The same robot with a particular controller operating in two different environments can result in very different behavior. Programmers may inadvertently make assumptions about the robot’s environment, resulting in a controller that implicitly encodes assumptions about the structure of the environment. Also, the robot’s physical embodiment plays a role in producing the behavior exhibited by the system. Mass, size and shape all affect the robot’s dynamics, which can be significant. When synthesis is viewed as a search problem, the space of possible controllers has several dimensions that are non-trivial to characterize, and are often beyond the programmer’s control.
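
The search view of synthesis can be made concrete with a toy sketch. What follows is an illustration under stated assumptions, not a model taken from this dissertation: a hypothetical robot on a line uses a proportional controller with a single parameter (gain), and the environment is reduced to a single assumed traction coefficient that scales how much of each commanded move is realized. A controller satisfices if the robot ends within a tolerance of the goal after a fixed number of steps. All names and numerical values are invented for the example.

```python
# Illustrative sketch only: a hypothetical 1-D robot, not a model from the dissertation.

def final_error(gain, traction, goal=10.0, steps=10):
    """Distance from the goal after `steps` control updates.

    The controller commands a move proportional to the remaining error;
    `traction` (an assumed environment property) scales how much of the
    commanded move is actually realized.
    """
    x = 0.0
    for _ in range(steps):
        x += gain * (goal - x) * traction
    return abs(goal - x)


def satisfices(gain, traction, tolerance=0.1):
    """A controller is 'good enough' if it ends within `tolerance` of the goal."""
    return final_error(gain, traction) < tolerance


def first_satisficing(gains, traction):
    """Linear search over candidate gains; returns None if none suffices."""
    for gain in gains:
        if satisfices(gain, traction):
            return gain
    return None


# Assumed traction values standing in for two different environments.
FIRM, SLIPPERY = 1.0, 0.3

if __name__ == "__main__":
    for gain in (0.5, 2.0):
        print(gain, satisfices(gain, FIRM), satisfices(gain, SLIPPERY))
    print(first_satisficing([0.25, 0.5, 0.75, 1.0], FIRM))
```

In this toy setting a gain of 0.5 satisfices on the firm surface but not on the slippery one, while a gain of 2.0 does the reverse. The environment therefore acts as a dimension of the search space that the programmer does not choose, and a search over a fixed list of candidates may return nothing at all.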
Synthesis is an inverse problem: many controllers may result in the desired behavior, with two effective controllers achieving the same task in altogether different ways (see Section 3.3.1.3 for examples of contrasting solutions to the same problem). At present, philosophical or aesthetic reasons may cause one solution to be favored (or discovered) over another. Design of controllers remains an under-constrained problem because of the lack of principled guidelines. In terms of a search, multiple suitable controllers may exist.

The converse situation also arises because circumstances may exist under which robots are incapable of performing a task. The literature contains few scenarios in which necessary conditions have been identified in order for robots to perform a task. Those few are the result of a demonstration of the sufficiency of a system of minimalist robots. Minimalist robots typically use few sensing and computational resources, but these may not be the fewest possible. It remains challenging to define “fewest possible” because there is no general agreement on how resources of one type may be offset by resources of another type. This means that despite searching, a satisficing controller is not guaranteed to exist.

A number of researchers have identified cases in which differences in small details can have a major effect on the system’s global behavior. Examples of this marked sensitivity include properties of the environment and adjustable controller parameters. One source for this type of non-linear behavior in distributed robot and multi-agent systems is the multiplicative power of positive-feedback produced by agent to agent interactions. As a consequence of these difficulties, the space of possible controllers may have dimensions that are non-linear with small differences being magnified.

Robots inhabit a continuous world. Several researchers discovered problems while using a direct discretization in order to provide a robot with an internal world model.
Such models may become brittle when small errors cause a robot’s beliefs to diverge from reality, resulting in execution of inappropriate actions. Some form of error is unavoidable because robot sensors are noisy and actuators are imprecise. Further uncertainty arises from robots inhabiting a dynamic and partially observable world. Sometimes, circumstances will arise in which a robot is endangered, and must act with speed. All of these constraints mean that controllers must be robust to uncertainty, they must be pragmatic about modeling the world, and they must operate on a timescale comparable to the needs of the world. Good controllers are hard to find.

3.2.2 Problem complexity

Little is formally known about the complexity of general synthesis for multi-robot systems, but it is widely recognized as a formidable problem. A less general problem involves synthesizing what are termed “social potential fields.” Potential fields are a widely used approach for obstacle avoidance in multi-robot systems, in which obstacles generate a repulsive virtual field and the goal position is represented as an attractive field (Khatib 1986). Social potentials are a simple extension in order to deal with multi-robot systems: pairs of robots within sensing range of one another are modeled as introducing additional (but not necessarily symmetric) virtual fields. Reif and Wang (1999) conjectured that the problem of determining a set of social potentials in order to produce a general prespecified global outcome is undecidable. The broader synthesis problem that must consider more than just spatio-temporal patterns is at least as difficult as the social potential case.

3.3 Approaches to multi-robot and swarm synthesis

Several authors have written reviews of multi-robot systems by offering categorizations and classifying the multi-robot literature. The most complete taxonomies to date are Cao et al. (1997) and Dudek et al. (2002),[3] both of which use similar, but not identical, axes to organize the existing work.
We combine and summarize those reviews, which form the starting point for detailed discussion of the current approaches to the multi-robot synthesis problem. These axes will be used to specify the sorts of robot team involved in later validation experiments. For references and several examples of work that fit the definitions of the axes well, the reader is referred to the original surveys.

[Footnote 3: The publication dates are somewhat misleading as both taxonomies were independently developed around the same time. They are updated and extended versions of the earlier reviews, Cao et al. (1995) and Dudek et al. (1996), respectively.]

By considering the following dimensions one may explicitly identify many aspects of multi-robot systems.

Group Composition: Deals with the size and makeup of the group.
    Number of robots: Multi-robot systems range from small teams with two or three robots working in concord to swarms with many tens, but more than ten is rare.
    Physical diversity: Robot groups can be designed to be homogeneous with identical (or near-identical) individuals, or heterogeneous with robots having distinct sensing, communication or actuation capabilities.
    Behavioral diversity: Groups may be homogeneous in that all robots have the same programming, or heterogeneous in the design of their controller.

Communication Properties: Deals with mechanisms for meaningful communicative interactions between robots.
    Mechanism: Specialized wireless communications hardware may be used to communicate. Alternatively, one robot may be within the sensing range of another robot, allowing actions to be directly observed. Finally, communication may be mediated by the environment. A robot’s actions may alter the environment in a way that is then observed by another robot.
    Range: Communication hardware may allow global, or only local communication. Alternatively, communications hardware may be unavailable.
    Topology: Use of a particular communications channel may require that every robot receive sent messages, that is, broadcast only communications.
    Alternatively, it may be possible to explicitly specify recipients for communications, called addressable communications. Between these two extremes exist other communications graphs, for example, a tree topology.
    Cost: Communication costs may vary. Communication may be free, that is, be considered to have zero cost; or be prohibitive beyond the worth of communication, or have infinite cost. Cost may be positive and finite, in which case it is important to assess whether communications cost more than motion or less.

Individual Robot Capabilities: Considers the capabilities each robot has in order to function as part of a group.
    Computational (theoretical): Individual robots may have hardware or designer imposed limitations. Some robot controllers are modeled on simple automata or connectionist models, in which case it is important to know whether robots have no memory, limited finite states, a single stack, or complete Turing Machine capabilities.
    Computational (practical): Physical robots typically have on-board computers. It is worth distinguishing those with relatively limited capacity (e.g., less than 100Kb main memory) from those with capabilities approaching those of a desktop computer.
    Peer recognition: Some robots have sensing capabilities that allow other robots to be recognized and distinguished from moving obstacles.
    Peer identification: Some robots have the ability to match a physically recognized robot with a symbolic name. Without this capability the system is limited to deictic (Agre and Chapman 1990) representations, which can be difficult when explicitly communicating through devices that hide locality information (such as wireless infrastructure).

Mechanics of Cooperation: Deals with underlying aspects for the design of cooperative behavior.
    Origins: Cooperative effects may result from explicit computation and communication, or less directly through self-organization.
    Structure: Computation can be achieved in a number of ways depending on the informational and logical organization of the system.
    A basic distinction is between centralized and decentralized means. Occasionally authors will differentiate distributed from decentralized, the former meaning that each robot performs an equal share of the computational load, while the latter merely means that no single robot performs all the computing.
    Architecture: Multi-robot systems are often distinguished by the underlying control architectures. Such architectures exist in order to assist the system designer, and will encode a set of assumptions (either implicitly or explicitly) about appropriate organization or methods for achieving cooperative behavior.

These dimensions provide a logical basis for arranging the aspects and attributes of most multi-robot systems. In later chapters these same dimensions will be used to define the robot systems in the research. This organization has at least one significant drawback: it does not express the interdependence among the dimensions. Using only the four dimensions above to delineate the field of multi-robot research can be misleading. Consider, for example, a large multi-robot system. Such a system must employ a communication strategy that is scalable, which affects the coordination style and, in turn, the choice of architecture. This organization fails to capture the mutual information between any two dimensions.

One way to highlight the connections is to structure a taxonomy in a more hierarchical fashion. Farinelli et al. (2004) introduced a multi-robot taxonomy with four coordination levels, each level containing the subsequent levels as a subset. This produces better decomposition than the four flat dimensions listed above, but complete containment at each level means that the “hierarchy” is a linear structure, which is certainly an unbalanced tree. The four levels are:

Cooperation Level: This identifies the most basic notion that cooperative systems consist of robots which operate together in order to achieve a collective task, and can be distinguished from non-cooperative systems.
The remaining levels are only concerned with cooperative robot systems.
Knowledge Level: This level distinguishes robots that are aware of their peer robots from systems in which multiple robots are unaware of one another. The two levels below are concerned with robots that are aware of their peers.
Coordination Level: This level is concerned with the mechanism for achieving cooperation. The authors discriminate between strongly, weakly and not coordinated systems. Strong coordination is distinguished from weak coordination in that the former requires an explicit protocol to govern how robots interact in the environment.
Organization Level: The decision-making needed to achieve strong coordination can be realized in a number of ways. This level compartmentalizes the strongly centralized, weakly centralized and distributed approaches. The centralized approaches are distinguished from distributed approaches by requiring a leader. The difference between strong and weak centralization is that the former has a single robot that persists as leader, while the latter allows this role to change.
Again, the reader is referred to the original work for archetypal research on each level. The authors have a greater focus on control theoretic research than either Cao et al. (1997) or Dudek et al. (2002).
Farinelli et al. (2004) state that these four levels were based on relationships observed in multi-agent systems research. Similarly, Stone and Veloso (2000) review the literature from a multi-agent research perspective, but with a focus on machine learning. The type of implicit coordination observed in self-organized systems is either entirely ignored or barely mentioned in such agent-inspired taxonomies, and it remains unclear how these taxonomies could be expanded to accommodate this important class of work.
The sections that follow attempt to better illustrate the interrelationships between the four dimensions described above.
This dissertation employs an alternative to hierarchical decomposition, instead identifying the themes that underlie the interdependencies among the dimensions. These themes reflect different constraints, perspectives and philosophies rather than the particularities of the robots being used. The themes are much closer to the social aspects of the research, as shared views seem to produce better “principal components” than merely technical aspects.
For the sake of completeness, we note here that in Parker (2000, 2003), Cao et al. (1997) and Dudek et al. (1996) several broad topic areas are identified as “generating considerable interest” or as having “resulted in significant levels of study.” The union of these topics includes: biological inspirations, task planning and allocation, addressing resource conflicts, and learning. They are applied to domains and problems such as: multi-robot path planning, localization, mapping and exploration (both topological and metric representations), distributed search and sensor coverage, foraging, cooperative object transport and manipulation, collective navigation, motion coordination, formation and marching problems, competitive activities and games, and self-reconfigurable robots.
3.3.1 Coordination methods and philosophies
This section distinguishes four approaches for achieving coordinated behavior: implicit, explicit, multi-agent, and control theoretic approaches. Most important for the current work are implicit and explicit coordination methods. Because the dissertation’s synthesis approach fits somewhere between the two extremes, we also describe the (sparse) work that has been done to better understand the relationship between the two methods.
3.3.1.1 Implicit coordination
Implicit coordination is the name given to coordination that emerges in robot systems from nothing more than simple local interaction rules.
Typically such coordination is performed by several simple robots, using only limited computation (perhaps a small finite state machine controller) and the sensing necessary for achieving the task at hand. Such robots often perform task-oriented behavior in a more or less independent fashion, but the group succeeds in making incremental progress collectively. A robot may, through task-related work alone, structure the environment in such a way that other robots are biased toward assisting in ongoing activities, or adding to previous work. This process can occur without any explicit communication: all that is necessary is for the robots to sense the environment directly. This type of environmental interaction is called stigmergy (Beckers et al. 1994; Holland and Melhuish 1999). It is also common for the environment to play the role of a collective memory, and it may be the primary mechanism used for persistence. In addition to these extremely simple, but powerful, task-related environmental interactions, robots will sometimes use infra-red or wireless radio communication, typically through a low-bandwidth, limited-range, local broadcast device. In such cases typically only a few bits are transmitted, and only between neighboring robots.
Implicitly coordinated robot systems are usually studied by researchers who view the physical and interaction dynamics of a robot system as the primary determinant of, and contributor to, the overall behavior. Whenever possible, such research will try to exploit the underlying physical dynamics in order to assist task achievement. Each robot’s controller typically operates on a single timescale which, because of the modest computational requirements, can be a high-frequency control loop.
Situatedness and embodiment are often exploited for task-related purposes in such systems. The following is a notable example. Deneubourg et al. (1990) considered the brood sorting behavior exhibited by several species of ants.
They proposed a set of interaction rules based on the sensing and short-term memory capabilities one would expect of an individual ant. Simulations of these rules successfully reproduced many features of the clustering behavior observed in real ant nests. Beckers et al. (1994) set out to reproduce the same behavior on physical robots by having them cluster pucks within an arena. In working with the robots, the authors unexpectedly discovered that simpler sensing capabilities sufficed to form clusters. They also found that no memory was required. A detailed investigation into the robot system’s behavior led to the following explanation: clusters were formed through gradual accretion, with robots randomly transporting pucks from one cluster to another. The design of each robot’s scoop meant that it could only remove items by striking an existing cluster tangentially, whereas items would be added whenever a puck-carrying robot struck a cluster more directly. As a result, the chance of a cluster gaining or losing pucks depends on the size of that cluster. The larger a cluster is, the more likely it is to continue to grow. Running the robots around the arena eventually causes a single large cluster to be formed. Thus, the physics of the system played an important part in producing the overall behavior and achieving the task of puck clustering. Although the a posteriori explanation is simple, prediction of this form of behavior is extremely challenging. Emergence can produce surprising results.
The following is another example with different underlying mechanics, but a similar pattern. Kube and Zhang (1993) describe experiments in a box-pushing domain, modeled after studies of ant prey transport. Five simple robots are tasked with pushing a box that is too heavy to be pushed by any single robot acting alone. Multiple robots must collaborate in order to make any progress. In the spirit of implicit coordination, the robots communicate by performing the task, that is, simply by pushing the box.
The robots independently navigate toward the box and begin to push by applying a force to the adjacent edge. While pushing, each robot monitors whether or not it is being pushed backwards by its peers. If a robot detects that it is being pushed backwards, it stops pushing, moves away from the box, and navigates to a random place on the box perimeter. The result is successful box transport with robots acting together to push the box. The underlying mechanism relies on randomness to avert deadlock situations, ensuring that eventually one direction wins out. The directional symmetry is broken by interactions of the group using nothing more than the box. Further demonstrations showed that this process can be effectively biased to have the robots move the box to a particular goal (Kube 1997).
Both of the preceding examples take inspiration from social insect systems. Roboticists have learned lessons from other biological systems too. Vaughan et al. (2000a) describe a game that exploits situatedness in order to resolve inter-robot resource conflicts. The authors generalize a mechanism that is ubiquitous in nature: demonstrations of aggression. Disputes over shared resources can be resolved if and when they arise, rather than enforcing a group-wide hierarchy or dominance relationship. Overwhelmingly, demonstrations of aggression stop short of causing serious harm because the animals involved assess the value of the contested resource. Analogously, the robots pick their level of aggression depending on their need at that instant of time, resulting in dynamism that can mitigate resource starvation. Vaughan et al. (2000a), and the follow-up work of Zhang and Vaughan (2006), use aggression games in order to reduce interference in a narrow stretch of corridor: their robots play a stylized version of “chicken” with the winning robot permitted to remain in the corridor. The work is particularly admirable because of the way the authors adapted a previously observed mechanism to suit their robots.
Rather than constructing an abstract virtual game to be played over some communications network (perhaps the more obvious choice), the authors use the physical nature of the robot while elegantly having the game fit the task that needs to be performed.
Large-scale multi-robot systems composed of implicitly coordinated individuals have been called swarms by Beni and Wang (1989) and collectives by Kube and Zhang (1993). The social insects that inspire the work and the style of communication suggest that implicitly coordinated robot systems will scale well. It is important, however, to distinguish swarm systems from implicitly coordinated robot systems. Scalability is a feature of implicitly coordinated systems, but is not a condition. Not all researchers believe that scalability is possible only through implicit coordination. For example, Schwager et al. (2006) describe a robot system as a swarm (we presume because of the number of relatively simple robots) despite the robots employing decidedly non-implicit coordination techniques. Also, Werger (1999) describes an implicitly coordinated soccer team with few robots. His work, too, is an example of implicit coordination without being an implementation of some biological mechanism. The work is certainly minimalist, and exploits both embodiment and situatedness. In this dissertation we distinguish robot swarms from general implicitly coordinated robot systems by requiring a swarm to be a large-scale minimalist multi-robot system with limited capabilities and simple controllers, but we do not necessarily require that coordination be a byproduct of the robots’ physical aspects.
One drawback of implicitly coordinated groups, and particularly implicitly coordinated swarm systems, is the difficulty of analyzing such systems. This was highlighted in the example of the ant-inspired puck clustering of Beckers et al. (1994). Some years later those original experiments were reproduced by Holland and Melhuish (1999).
The new study showed that the overall puck clustering performance depends critically on several physical features of the experimental setup, including the physical shape of the robots. Unbeknownst to the authors at the time, several of these aspects had been fortuitously ideal in the first experiment. Unfortunately, formalizing such aspects can be extremely difficult. Simply noting that coordination is an emergent property of the system falls appreciably short of suggesting a method for predicting the system’s behavior. Similarly, the notion of self-organization, while providing a broad explanation of the robots’ collective behavior, has not yet resulted in any formal understanding of the systems. It is still worth pointing out that the preceding examples do have the properties that Bonabeau et al. (1999) give as ingredients for self-organization: positive feedback, negative feedback, amplification of fluctuations, and multiple interactions. Section 3.3.3 describes the phenomenological models that have had some success in modeling implicitly coordinated multi-robot swarms.
Implicitly coordinated multi-robot swarms have had some overlap with artificial life research, which considers a variety of problems, like multi-level dynamics, emergence, the roles of interaction and the local to global problem (see for example Sato and Crutchfield (2003)). Artificial life studies typically consider agents that are not embodied in a world with its own dynamics but, occasionally, the environment is a simulation that reproduces a few chosen aspects of the real world. It is common for such agents to inhabit discrete worlds, a setting directly related to the broader cellular automata research community. Despite these aspects being unrealistic by robotics standards, non-embodied systems are frequently inspired by observable cooperative behavior in the real world; for examples, see the colonies of Dorigo et al. (1996) and the computational ecosystems of Hogg and Huberman (1991).
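The size-dependent accretion reported for the puck-clustering robots of Beckers et al. (1994) is an instance of the positive feedback ingredient, and it can be illustrated with a toy simulation. The sketch below is not the authors' model: the uniform pick-up rule, the size-proportional deposit rule, and the names `cluster_step` and `simulate` are all assumptions made purely for illustration.

```python
import random


def cluster_step(sizes, rng):
    """Move one puck between clusters; deposits favor larger clusters."""
    occupied = [i for i, s in enumerate(sizes) if s > 0]
    if len(occupied) < 2:
        return sizes  # a single cluster remains; nothing left to move
    # Assumption: a robot picks up from any occupied cluster with equal chance.
    src = rng.choice(occupied)
    # Assumption: the chance of a deposit grows in proportion to cluster size.
    others = [i for i in occupied if i != src]
    dst = rng.choices(others, weights=[sizes[i] for i in others])[0]
    updated = list(sizes)
    updated[src] -= 1
    updated[dst] += 1
    return updated


def simulate(sizes, steps, seed=0):
    """Apply cluster_step repeatedly with a seeded random generator."""
    rng = random.Random(seed)
    for _ in range(steps):
        sizes = cluster_step(sizes, rng)
    return sizes


initial = [5] * 8  # eight hypothetical clusters of five pucks each
final = simulate(initial, steps=5000)
```

Under these assumed rules the puck count is conserved and an emptied cluster can never be re-seeded, so the number of clusters can only shrink over time. The sketch says nothing about how quickly that happens on physical robots, which is exactly the prediction problem noted above.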
Finally, in addition to the multi-robot surveys described in the preceding section, implicitly coordinated and swarm systems are also (briefly) reviewed in Fong et al. (2003), which is a survey focusing on social aspects of robot systems.
3.3.1.2 Explicit coordination
In stark contrast to the implicit coordination described above, explicitly coordinated approaches focus on intentional cooperation (Parker 1996). The primary interactions between robots are explicit communicative acts. Researchers concentrate on a set of abstract distributed problems, like task allocation, role-assignment or scheduling, in order to produce a set of algorithmic solutions under a variety of assumptions. Tasks are decomposed into these archetype problems and a suitable algorithm applied. Usually this algorithm is distributed so that failure of a single robot does not debilitate the group. The types of environmental interactions and the physical dynamics that are critical for implicitly coordinated systems are not usually exploited or even directly modeled. Usually the coordination algorithms will solve static (or short finite time window) problems repeatedly so as to ensure the symbolically represented information that is used during the calculation is current.
The communication infrastructure is generally assumed to be layered over the multi-robot system so that global broadcast and/or uniquely addressed messages can be relayed to the respective recipients. The coordination algorithm is usually viewed as one of several software components. Since the coordination algorithm integrates information from several robots, it will usually be updated less frequently than the lower-level controller which must respond in a timely fashion to keep the robot safe. It is not uncommon for these distinct timescales to be used as an organizational mechanism for separating the various software components. See Section 3.3.2 dealing specifically with control architectures.
Significant progress has been made by identifying abstract coordination problems that can be analyzed and solved.
The multi-robot task allocation problem is probably the best example, with several researchers proposing solutions to various forms of the problem. Gerkey and Matarić (2004a) introduced a taxonomy with which to frame task allocation variants. They analyzed the computational complexity and known algorithms of particular variants. Briefly stated, the task allocation problem involves assigning robots to tasks so as to maximize some notion of expected reward or utility. Many instances of this problem cannot be solved optimally in tractable time. Particular restrictions, like assignments that permit at most one robot per task and at most one task per robot, can be solved optimally and efficiently. Examples of task allocation methods include: the ALLIANCE architecture (Parker 1998), Broadcast of Local Eligibility (Werger and Matarić 2000), MURDOCH (Gerkey and Matarić 2002) and Traderbots (Zlot et al. 2002). The last two are market-based approaches that treat tasks as items in an auction. The robots bid to perform tasks, each robot striving to maximize individual profit. The intention is for the market to produce an efficient allocation despite the lack of global knowledge. See Dias et al. (2006) for a recent review of market-based coordination strategies. Market-based methods have also been used in order to produce systems with more general types of allocations through combinatorial auctions (Berhault et al. 2003), and coalition formation (Vig and Adams 2006).
Although task allocation is a popular area, other researchers have proposed different ways of explicitly achieving coordination. For example, Jennings and Kirkwood-Watts (1998) consider the notion of fluidity as a guiding principle. They introduced a programming framework for their method of “dynamic teams” which included particular primitive statements for recruiting and substituting members of sub-teams.
Although the process of abstraction leads to well-defined coordination problems, it can also introduce some challenges.
Because many robot-specific details are necessarily absent from the abstract coordination problem, any information about the state of the world must be provided as input. In the case of multi-robot task allocation, this is typically done by specifying expected utilities for given robot-to-task assignments. The specification can be both general and convenient since, for example, heterogeneous teams can be easily accommodated, and these expected utilities form a clean separation of the environmental and robotic attributes from the coordination algorithm. But they can make it difficult to ensure that all necessary information is captured. In the particular case of utilities, the final numbers do not directly convey uncertainty information [footnote 4]. Gerkey and Matarić (2004b) give an in-depth discussion of this issue, while Gerkey and Matarić (2004a) describe the related and important issue of interrelated utilities. Paradoxically, these issues seem to be mitigated by one of the following diametrically opposed techniques. The first is more complex and complete modeling of the underlying circumstances, for example, capturing more information about the uncertainty. This is done by the multi-agent community, see Section 3.3.1.4. The second is to embrace the robot’s embodiment and situatedness directly rather than modeling their effects, as is done in implicit coordination.
3.3.1.3 Implicit versus explicit coordination
The majority of existing work employs techniques that fall close to either implicit or explicit coordination methods. Little work has been done to better understand how features can be combined from each of the methods in a useful middle ground. In fact, there are few direct comparisons of implicit and explicit coordination methodologies. In a notable study, Kalra and Martinoli (2006) performed an empirical comparison of threshold-based and market-based task allocation methodologies in an event-handling scenario.
Market-based allocation is an explicit coordination technique, while threshold-based allocation is a minimalist approach considered representative of implicit coordination. Threshold-based allocation involves each robot assigning itself to the most urgent task within the local sensory region. The market-based allocation has each robot estimating the cost to perform a task on the basis of distance, and bidding on tasks in a manner that follows the description in the preceding section. The authors found that with accurate information the market-based solution found more efficient allocations, but when information was corrupted by significant noise, both allocation schemes performed comparably, with the threshold-based allocation using fewer resources to do so.
Footnote 4: We say “directly” here, as it is the expected utility, and thus each possible outcome should be considered in light of the uncertainty of the outcome arising. Generalizations could productively use higher-order moments.
The author knows of only one other piece of work, namely Campos et al. (2001), which attempts to compare the two coordination philosophies, and deals with a situation that is surprisingly similar to that of Kalra and Martinoli (2006). We posit several reasons why comparative studies are in short supply. Firstly, researchers tend to embrace a particular approach and use only that type of approach within their research. The reasons for this may include: the expertise necessary for studying coordination approaches has limited overlap across approaches; there are infrastructure considerations as some methods suit some robots better than others; research objectives and funding considerations are significant because the approaches fit some motivations better than others (see Section 3.1). Secondly, there is the challenge of finding problem domains fair to both implicit and explicit methods.
Kalra and Martinoli (2006) compare task allocation methods that ignore the particulars involved in carrying out the tasks: the robots are simply said to service “events.” This raises the question of whether it is fair to compare coordination methods without considering problem domain specifics. These specifics can be important in placing constraints on the robot, or permitting embodiment and situatedness to be exploited. The multitude of reasons that influence the choice of coordination methodology is why this section is headed “philosophies.” It reflects two cultures (cf. Snow (1959)) rather than a detailed understanding of which methods are applicable where.
Distributed box-pushing is one task that has been studied by researchers from both perspectives, although we are unaware of any work serving to compare the two. As discussed above, Kube (1997) demonstrated box-pushing with simple implicitly coordinated robots. Gerkey and Matarić (2002) used cooperative box pushing as a test domain for MURDOCH, their market-based explicit coordination approach. Donald et al. (1997) began with robots explicitly communicating about the object being pushed and, through a series of reductions, produced a system that uses the object itself for control feedback. They analyze this reduction from the perspective of Information Invariance (Donald 1995), concluding that the earlier explicit communication had been translated into information captured in the task mechanics. The work is notable for being the sole example of a principled transformation of one coordination method to the other, although the work was mostly motivated by achieving robustness rather than as a comparison of the methodologies. Increases in robustness are expected to follow because, as Erdmann (1989) notes, reducing the knowledge requirements of a strategy reduces its brittleness.
The reduction of knowledge requirements as a means toward robustness highlights an important distinction that should be drawn between minimalism and implicit coordination.
Minimalism concerns the resources available or used by the robots. Minimalist robots use simple and limited sensing, basic computation, and limited or no communications. Implicit coordination is primarily distinguished by the role of interaction dynamics: implicit coordination views the dynamics as complex, but critical for achieving cooperation; explicit coordination disregards the role of those dynamics. Implicitly coordinated robot systems are almost always minimalist. Jones (2005) provides an example of minimalist robots that are not implicitly coordinated. The work introduces a method for automated synthesis of robot controllers for tasks with an acyclic structure, the resultant controllers having few (and in some cases minimal) communication and computational requirements. The work is not an example of implicit coordination because the robot controllers do not make use of the interaction and environmental dynamics. While the work does explicitly consider environmental structure and how it may change, it does so by describing the work in terms of discrete states. The timescale at which the state changes occur is irrelevant to Jones’s formalism, and consequently we say that the work uses environmental statics. This is mainly true of the present research, although interference effects are treated dynamically in Chapter 5.
3.3.1.4 Multi-agent techniques
Multi-agent research is concerned with achieving cooperation in a range of systems, with multi-robot systems being one particular instance. Like explicit coordination, physical dynamics and other aspects related to embodiment and situatedness are ignored. Uncertainty is tackled directly in Markov Decision Theoretic methods but may be ignored in other approaches. The challenges involved in the calculation of utilities, as described above for explicit coordination, occur in decision theoretic multi-agent research too.
Unlike other coordination techniques, several multi-agent methods require significant off-line processing.
This represents a significant philosophical difference and probably stems from a close association with earlier single agent AI. Well-known methods like Situated Automata (Kaelbling and Rosenschein 1990; Rosenschein and Kaelbling 1995) and Partially Observable Markov Decision Processes (POMDPs) (Kaelbling et al. 1998) take a complete specification of the problem (including the environment, the agent’s capabilities and its goals) and process it with algorithms that produce a simple controller for the agent. Markov-decision-based approaches have the useful property that the resultant controller will optimally balance actions that generate reward with those actions that reduce uncertainty. Generalizations to deal with multi-agent scenarios, like the Communicative Multi-agent Team Decision Problem (COMMTDP) (Pynadath and Tambe 2002) and Partially Observable Stochastic Games (POSGs) (Emery-Montemerlo et al. 2005), are solved in a similar way, but with a significantly larger search space. The output of such systems is a reactive policy that maps observations to actions (including communicative acts). Some researchers focus on execution-time decisions in order to simplify the model that must be solved off-line. For example, Roth et al. (2006) evaluate what information should be communicated at runtime in a multi-agent POMDP.
Even simple problems result in large state-spaces. Data-driven compression is an approach that is used to address this problem (Roy et al. 2005) but has not, as of yet, been applied to multi-agent decision problems.
Guestrin et al. (2002) describe a framework for particular types of (jointly-observable) problems that can be maintained in a factorized form while respecting the independence of the agents within a task context. Still other agent-based research operates entirely on-line. Examples include distributed constraint satisfaction (Modi et al. 2005), and token-based methods (Xu
et al. 2005). This latter work has some similarities to one of the processes developed for the synthesis toolbox, although their model requires that the tokens carry additional information. Their method relies on the independent random walks of tokens over the communication graph and, though they do not identify the fact, relies on an assumption of ergodicity.
The multi-agent community currently has a stronger focus on analysis and theoretical models than the multi-robot community does. Theoretical contributions may have a significant impact on the robotics community in the future, provided the models address uncertainty adequately and scale up to multi-robot domains.
3.3.1.5 Control theoretic coordination methods
Control theoretic approaches to multi-robot coordination typically involve constructing a model of the system and converting the task specification into an objective function to be optimized. The models used in the approach typically consider the rigid-body dynamics of the robots and assume that masses and other parameters are perfectly (or very well) known. The optimization may be performed analytically, providing control laws for individual robots, or the optimization may need to be explicitly solved through iteration at runtime. Formalisms may allow the designer to prove controllability, stability, asymptotic convergence, or related properties of a set of control laws. Most often these techniques are applied to constructing motor control policies. Such methods can be used in conjunction with the preceding coordination philosophies, which often assume that motor control is managed by some lower-level component. Oftentimes the preceding methods may view the coordination problem as a fundamentally discrete problem, while, in contrast, control theoretic methods are almost always continuous in flavor [footnote 5]. Thus, it is not surprising that such methods are applied to problems like formation control (Belta and Kumar 2004), target tracking (Spletzer and Taylor 2003), boundary detection (Marthaler and Bertozzi 2003) and box-pushing (Pereira et al. 2004).
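The pattern of turning a task into an objective function and deriving a per-robot control law can be sketched for a simple rendezvous task. This is a minimal illustration and not a method from any of the cited works: the objective J (half the sum of squared distances over communicating pairs), the ring communication graph, the gain, and the function names are all assumptions chosen for the example, and the robots are treated as ideal points with no dynamics.

```python
def consensus_step(positions, neighbors, gain):
    """One gradient-descent step on J = 1/2 * sum over edges |p_i - p_j|^2."""
    new_positions = []
    for i, (xi, yi) in enumerate(positions):
        # Negative gradient of J with respect to robot i's position:
        # the sum of displacement vectors toward each neighbor.
        ux = sum(positions[j][0] - xi for j in neighbors[i])
        uy = sum(positions[j][1] - yi for j in neighbors[i])
        new_positions.append((xi + gain * ux, yi + gain * uy))
    return new_positions


def rendezvous(positions, neighbors, gain=0.2, steps=200):
    """Iterate the control law; every robot uses only its neighbors' positions."""
    for _ in range(steps):
        positions = consensus_step(positions, neighbors, gain)
    return positions


# Hypothetical four-robot team on a ring communication graph.
start = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
final = rendezvous(start, ring)
```

With this undirected graph the updates cancel in the sum, so the team centroid stays fixed and the robots converge toward it. A gain below one over the largest neighbor count is sufficient to keep the iteration stable; real robots would add the rigid-body dynamics and parameter knowledge discussed above.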
Currently, it is common for control theoretic approaches to use some degree of centralization, e.g., electing a leader robot. While many aspects of the physical systems are often modeled, it remains uncommon for modeling uncertainty to be explicitly taken into account. Sensing uncertainty is often tackled through application of estimation theory, which can be elegantly and constantly integrated into the formalism, e.g., the treatment of Stengel (1994). Estimation theory has had, and continues to have, a large impact on robotics (Thrun et al. 2005).
3.3.2 Control architectures
The following discussion is based on a division of existing work into classes of control architectures. Although there is a strong correlation between certain architectures and coordination approaches, there is sufficient independence to justify the following short summary. Additionally, this division permits a more natural description of the historical development than the discussion thus far.
Control architectures serve an organizational purpose: the architectures aid in the process of synthesizing robot behavior by providing system designers with structure. Like a programming language, a control architecture typically embodies a particular philosophy about the process of design. Features that are deemed important may be emphasized at the expense of others. One dimension often used to differentiate architectures is their use of timescales. Layering of timescales plays an important role in this dissertation because it is necessary for the equilibrium thermodynamic formalism to provide useful predictions. Thus, we provide a brief overview of architectures describing their organization in terms of operations that occur at different timescales.
Footnote 5: Which is to say that they use techniques of calculus, for example, over continuous fields. The implementation will be discrete and this may be explicitly considered within the formalism, cf. Bullo and Lewis (2004).
Enough control architectures have been proposed to warrant classification.
This is the result of a time when architectures were deemed to be one of the most important questions in mobile robotics, resulting in a proliferation of architectures. Fortunately control architectures may be easily classified into categories, and classifications are common (Arkin 1998; Matarić 2002). We present a general overview of each category. The interested reader is referred to Cao et al. (1995) who, additionally, discuss several important control architectures omitted here.
3.3.2.1 Reactive control
The first modern multi-robot system can be fairly attributed to Grey Walter (Walter 1950, 1951, 1953), whose observation of a single robot suggested to him the potential richness of multiple interacting robots. After attaching an electric light bulb to a robot for photographic tracking purposes, he observed that the robot would occasionally respond to its own reflection in a mirror and that this form of feedback increased the behavioral repertoire of the robot. Walter then built a second robot and carried out experiments with two robots. These two robots, now widely known by their names Elmer and Elsie, showed how interacting individuals may induce changes in behavioral performance on one another, producing complex patterns of behavior (Walter 1950) and adaptation (Walter 1951).
Reactive controllers have no memory, and Walter’s two robots were a demonstration that even reactive controllers could produce complex behavior. As may be surmised from the discussion of Walter’s robots, multi-robot systems employing reactive controllers usually fall into the class of implicitly coordinated systems.
3.3.2.2 Hybrid architectures
Hybrid architectures allow for a composite controller. Such controllers are typically layered: the bottom layer being a fast low-level controller, and the top layer performing slower high-level symbolic processing. A variable number of intermediate layers may sit between these two extremes. The low-level controller is similar to a reactive controller, but must have some method of communicating with the other higher-level components.
Hybrid controllers are often used in explicitly coordinated multi-robot systems, where the higher-level reasoning system is responsible for communicating and executing coordination algorithms. The fundamental organizational feature of hybrid systems is the use of timescales to layer the components. Connell (1992) pointed out that algorithms running at the top level often operate in discrete time and space models, while low-level controllers are viewed as continuous in time and space.

3.3.2.3 Behavior-based control

Behavior-based controllers consist of a set of concurrently executing modules, called behaviors, each of which processes input and produces commands for the robot’s actuators or for other behaviors in the system. Unlike hybrid systems, behavior-based systems are decomposed by activity (Brooks 1991), ensuring that a concrete connection between perception and action is maintained. This dissertation considers concurrently executing and interacting processes somewhat similar to a behavior-based controller, although we do not require that the processes couple perception to action. However, the processes that the synthesis methodology uses are limited in that they are required to possess the ergodic property.

Behavior-based controllers are organized into activities, which loosely follow discoveries in biology. Evidence from Mussa-Ivaldi and Giszter (1992) and Giszter et al. (1993), which provides the foundation for the theory of motor primitives, points to the existence of spinal field motor primitives that encode a complete movement or behavior. Experiments on spinalized frogs and rats have shown that when an individual field is activated, the limb executes a complete behavior, such as reaching or wiping. Behaviors may also be designed to activate when particular actions are perceived and thus follow from the “mirror neuron” hypothesis of the functional connection between the vision and motor control system (Rizzolatti et al. 1996).

The biological inspirations of the control architecture mean that it is often employed in experiments that attempt to reproduce biological systems.
For example, Beckers et al. (1994) describe the fit between stigmergy and behavior-based systems as an “excellent fit.” In multi-robot cases, behavior-based systems often use implicit coordination methods, but nothing limits a behavior-based robot system from using explicit coordination. On the other hand, multi-robot task allocation as implemented in the ALLIANCE architecture (Parker 1998) and Broadcast of Local Eligibility (Werger and Matarić 2000) assume that the robots are using behavior-based controllers.

The synthesis methodology developed in subsequent chapters of this dissertation uses the interactions of concurrently executing distributed processes in order to produce task-related global behavior. While behaviors map perception to action, the synthesis processes perform a coordination role, mapping local communicative actions into globally predictable behavior changes. The main demonstrations will be of instances in which the primitives achieve predictable macroscopic changes in internal state; those changes are made observable (and useful) by being reflected in the actions of the robots (e.g., in switching on or off behaviors in a behavior-based controller). Another important distinction is that the toolbox processes may have to execute on different timescales in order for their couplings to be predictable. Oftentimes all the behaviors in a behavior-based controller will execute with some comparable update frequency.

3.3.3 Multi-robot formalisms

The vast majority of work in multi-robot systems is of an empirical nature, and few general insights have been gained in order to assist the system designer. More theoretical work is needed in order to fully address the synthesis problem. In particular, modeling methods are needed in order to better understand which of the many factors that affect system performance dominate in particular circumstances.
Formal models are also required to better characterize the resources necessary in order to perform particular tasks, and methods are needed to predict aspects of system performance without the cost and time involved in running experiments.

In recent years, some progress toward these goals has been made in the modeling of minimalist multi-robot systems with stochastic models. This non-determinism reflects the fact that it is infeasible to model all the aspects that affect a robot’s behavior with sufficient detail. There are generally two types of model named for the level of description they consider, namely microscopic and macroscopic models.

Microscopic models focus on individual robots, keeping track of the states of each of the robots, but without providing full simulations of the sensing or actuation of each robot. State transitions within each controller are updated based on stochastic approximations of actual interactions. See, for example, Martinoli et al. (2001).

Macroscopic models describe the complete robot system. Rather than capturing the state of each robot, macroscopic models describe the total numbers of robots within a given state. Transitions from one state to another are governed by so-called rate equations, where the actual number of model parameters depends on the complexity of the system being studied. As this is most closely related to the current work, we give several examples below.

It should be noted that both microscopic and macroscopic models are related through the Master Equation (Lerman and Galstyan 2001), which describes the evolution of a probability density. By averaging a description of the change of this density, one may obtain the macroscopic model. This process is not always tractable, and consequently additional assumptions may be introduced and a phenomenological macroscopic model produced.

The first example of macroscopic modeling in robotics is the work of Sekiyama and Fukuda (1996). They consider the problem of opinion formation in a group of robots.
Sugawara and Sano (1997) and Sugawara et al. (1999) construct a similar description of a group of robots performing the task of foraging, and use geometric considerations in order to estimate model parameters. Lerman and Galstyan (2001) described how this type of modeling can be applied more generally for a given specification of a reactive controller, considering multi-robot foraging as an example. Lerman and Galstyan (2002) employed these methods to provide a model for the effect of interference in multi-robot foraging. While the preceding examples demonstrated qualitative predictions, Lerman et al. (2001) achieved a quantitative agreement for large robot systems in a collaborative stick pulling scenario. Martinoli et al. (2004) describe the same scenario, but using both microscopic and macroscopic models. In contrast to the preceding examples, they construct a discrete time model (i.e., the rate equations are difference equations as opposed to differential equations), and compare models of the same underlying controller at different levels of detail.

The models used above were successful in the case of reactive robots. Lerman et al. (2006) generalized the methods to account for memory. They study dynamic task allocation in a variation of the foraging problem in which there are multiple types of pucks to forage. The robots maintain a memory of some past (finite) history of observations which is fed into a “transition function” that affects the tasks a given robot performs. The model shows excellent agreement and allows, for example, transition functions to be explored in the macroscopic model, before having to be simulated. In other words, this model informs the designer in addressing the synthesis problem. We discuss the relationship in Section 3.4. The same multi-puck foraging task is used to validate the synthesis methodology in Section 5.2.
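The distinction between the two levels of description can be made concrete with a small sketch. The following Python fragment is illustrative only: it assumes a hypothetical two-state controller (searching and homing) with made-up per-step transition probabilities `alpha` and `beta`, which do not come from any of the cited models. The macroscopic function iterates a discrete-time rate equation (a difference equation) for the expected number of searching robots, while the microscopic function tracks each robot's state with coin flips standing in for real interactions.

```python
import random


def macroscopic(n_robots, alpha, beta, steps):
    """Discrete-time rate equation for the expected number of searching robots.

    alpha: assumed per-step probability of switching searching -> homing.
    beta:  assumed per-step probability of switching homing -> searching.
    """
    searching = float(n_robots)
    history = [searching]
    for _ in range(steps):
        homing = n_robots - searching
        searching = searching - alpha * searching + beta * homing
        history.append(searching)
    return history


def microscopic(n_robots, alpha, beta, steps, rng):
    """Track every robot's state; transitions are stochastic approximations."""
    states = [True] * n_robots  # True = searching, False = homing
    history = [n_robots]
    for _ in range(steps):
        for i in range(n_robots):
            if states[i]:
                if rng.random() < alpha:
                    states[i] = False
            elif rng.random() < beta:
                states[i] = True
        history.append(sum(states))
    return history


if __name__ == "__main__":
    macro = macroscopic(1000, 0.1, 0.3, 200)
    micro = microscopic(1000, 0.1, 0.3, 200, random.Random(0))
    print(macro[-1], micro[-1])
```

Under these assumptions the rate equation settles at n_robots * beta / (alpha + beta) searching robots, and the microscopic counts fluctuate around that value. The sketch says nothing about how transition rates would be obtained for a real controller.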
Traditional computer science theories for formal specification and verification of concurrent systems (e.g., CSP (Hoare 1978), CCS (Milner 1980), π-calculus (Milner 1999)) have not been widely adopted by multi-robot researchers, despite the fact that such analysis may be critical for fault tolerant distributed systems. One exception is the Communication and Control Language (CCL) (Klavins 2003b), which includes features typical of process calculi, like an associated logic (with which model checking or theorem proving could be implemented) and a language interpreter. Klavins (2003a) presents one of the first applications of CCL in which controllers are characterized in terms of communication complexity.

Donald (1995), previously mentioned in the context of box-pushing, introduced the geometric theory of information invariants in order to characterize information implicitly encoded in various aspects of robot systems. The theory allows for comparison of distributed sensori-computational units, describing how much information is encoded in particular fixed spatial relationships, in sensor calibration, etc. His use of “information” as the common currency allows for a wide range of resources to be treated together and compared. For example, the theory allows one to explore circumstances in which inter-robot communication can be traded off for better sensors. Donald (1995) also showed that several aspects of the theory could, at least in principle, be automated.

Another approach is to define a formal measure for a particular resource, and define associated complexity classes that characterize scaling properties for the resource. This is analogous to space or time complexity in traditional algorithmic analysis. A notable example is Balch (1998), who introduced an information theoretic measure for describing heterogeneity.

Jones (2005) introduced a formalism and suite of tools for automatically synthesizing minimalist controllers in order to perform non-cyclic tasks.
His work considers circumstances in which memory and communication can be traded off. Given a task specification and some controller parametrization (e.g., how many bits of state can the controller use), his formalism will either produce a controller, or construct an example showing why none exists. We employ a similar style of individual robot controller, although our focus is not on automated synthesis. There is a deeper connection between Jones and Matarić’s formalism and the work presented in this dissertation; this is discussed in Section 4.6 after suitable definitions have been provided.

3.4 Relationship between dissertation research and prior work

3.4.1 Multi-robot system size

A significant distinction between most previous work and this research is that we explicitly seek to exploit system size in modeling the system. The developed formalism aims to describe a particular type of macroscopic behavior which is only feasible when one considers large groups of robots. The statistical physics basis for the analysis methods means that they scale well in that they become increasingly predictive with larger group sizes: or, more precisely, fluctuations that cause the system to behave differently from the expected mean behavior become less likely. The design of processes is often helped by thinking about behavior in the limit of large size. For this thinking to reflect what really occurs in the multi-robot systems we require that edge-effects do not dominate, which requires hundreds of robots. At such sizes, continuous models are often a useful simplification of the discrete reality.

In order to carry out this research, new techniques were developed to realistically simulate several hundred robots. See Chapter 6 for a detailed discussion.

3.4.2 Minimalist coordination primitives

The vast majority of work on minimalist robotics relies on implicit coordination mechanisms; we (following Jones (2005)) believe that there are several advantages to using simple robots with simple controllers, but without directly seeking to exploit environmental dynamics.
Coordination results less from emergence than from explicit design.

The focus of this dissertation work is on solving particular coordination problems via processes that produce some collective behavior. These collective behaviors can be viewed as coordination primitives for robot swarms. This is similar to explicit coordination approaches that focus on standard abstract coordination problems (e.g., task allocation, scheduling, etc.). However, the basic coordination primitives that comprise the process toolbox are simpler than those typically studied for explicitly coordinated systems. Yet demonstrations of the synthesis methodology show that these simple primitives can be used to solve standard coordination problems like, for example, division of labor. Additionally, as Section 3.5 describes, existing complex distributed systems were used as a cue to suggest which macroscopic behaviors ought to be present in the process toolbox.

3.4.3 Sufficient conditions for predictable processes

Unlike most robotics research, the present study aims to identify sufficient conditions for the prediction of system behavior, and explores controllers that satisfy these constraints. The vast majority of multi-robot research either seeks to find suitable modeling techniques in order to analyze existing systems, or performs synthesis with rich programmatic constructs or Turing complete programming languages. The unique aspect of this work is the focus on processes which satisfy the ergodic property and, thus, have little temporal structure microscopically. We show, however, that temporal structure can be produced, but instead at the collective level. (See Section 5.3.1.)

3.4.4 Equilibrium models

A critical distinction between this work and previous macroscopic modeling work is that we explore the use of equilibrium state descriptions, whereas rate-equations consider the “kinetics” of changes in states. This dissertation considers only the steady-state (or equilibrium) behavior of the interacting processes, not the transients or the route toward equilibrium.
Investigating this relationship mathematically is a potentially useful direction for future work.

3.4.5 Exploiting models for synthesis

An important difference between this research and existing macroscopic modeling work is that this work aims to use the macroscopic descriptions during design time. Existing macroscopic modeling work attempts to inform the system designer once a controller has been given, rather than directly during synthesis. As compared with existing modeling methods, we do not make any claims about the suitability of end-to-end modeling with stochastic processes. Instead, we give sufficient rules for predicting “primitive” behavior, and principles for constructing further behavior (e.g., exploiting different timescales and conservation constraints) to allow incremental modeling for more complex behavior. We believe that it should be possible for the two methods to complement one another.

3.4.6 Macroscopic bases

The focus on constructing a toolbox of processes is similar in many ways to the use of basis behaviors to structure and simplify design of behavior-based systems (Matarić 1995). Basis behaviors are a set of behaviors that are independent in the sense that each either achieves, or helps achieve, a relevant goal that cannot be achieved by other behaviors. The large space of feasible actions is factorized into a smaller set of basis behaviors which has the purpose of spanning the action space while reflecting its structure. Although the nomenclature is linear algebraic, and the behaviors themselves are not, the notion can remain useful when interpreted as guidelines for sets of behaviors. The processes considered in this work serve to produce predictable reusable macroscopic behaviors. The linear algebraic notions are similarly useful, if somewhat lacking in formal applicability in this research. The properties that make the processes predictable and combinable mean that it is more meaningful to envision the notions of independence and span in the macroscopic space, rather than locally (e.g., at the level at which behaviors are implemented on a robot).
In fact, since behavior-based systems produce complex emergent behavior through non-linear interactions of concurrently executing behaviors (Matarić 1995) whereas the ergodic process-based synthesis methods use predictable process couplings across timescales, the notion of a basis is arguably more applicable for ergodic processes than for behaviors.

An important difference between behavior-based systems more generally and this work is that the processes we consider perform abstract computation; in other words, there is no explicit requirement for them to map directly to actions. The processes execute on each robot, communicating with the same process on neighboring robots. In this regard, they appear superficially similar to the port-based mechanism of Werger and Matarić (2000).

3.4.7 Entropy maximization

Spletzer and Taylor (2003) describe an optimization-based method for coordinating robot systems. In a sense this work is similar because execution of the processes can be viewed as an optimization process since the system will tend to produce an increase in entropy. It can be useful to view the coordination that results from this perspective during synthesis. The system designer may ask him or herself how the task may be achieved by exploiting an increase in entropy and constructing processes that do this (see Howard et al. (2002) for an example task where this occurs). However, an important difference is that Spletzer and Taylor (2003) construct the function to be optimized by considering the task and determining an expression to be optimized. In contrast, the entropy surface captures the structure of the processes, or more precisely, the structure of the phase-space of the global system, but is derived through analysis of processes.[6] There are limits to the type of entropy surface that can be constructed. We have a homogeneous robot system which must result in a type of symmetry in the surface itself.
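The tendency toward increasing entropy can be illustrated with a toy calculation. The sketch below is an assumption-laden illustration rather than the entropy surface analyzed in this dissertation: it considers a single hypothetical message hopping between robots arranged on a ring, staying put with probability 1/2 and moving to either neighbor with probability 1/4 each. Because this transition rule is doubly stochastic, the Shannon entropy of the distribution over the message's location cannot decrease from one step to the next, and it approaches log(n) as the distribution becomes uniform.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(x * math.log(x) for x in p if x > 0.0)


def lazy_ring_step(p):
    """Propagate a location distribution one step on a ring of len(p) robots.

    Assumed rule: stay with probability 1/2, hop left or right with 1/4 each.
    """
    n = len(p)
    return [0.5 * p[i] + 0.25 * p[(i - 1) % n] + 0.25 * p[(i + 1) % n]
            for i in range(n)]


def entropy_trace(n, steps):
    """Entropy over time, starting with the message known to be at robot 0."""
    p = [0.0] * n
    p[0] = 1.0
    trace = [entropy(p)]
    for _ in range(steps):
        p = lazy_ring_step(p)
        trace.append(entropy(p))
    return trace


if __name__ == "__main__":
    trace = entropy_trace(8, 200)
    print(trace[0], trace[-1], math.log(8))
```

The ring topology and hop probabilities are arbitrary choices made for the example; the point is only that a designer can reason about where such a process ends up without following its individual transitions.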
3.4.8 Token passing for coordination processes

Some of the processes used in this work rely on passing messages between robots that are never destroyed, instead being passed from one robot to the next. We term such messages “tokens.” A similar technique has been described by Xu et al. (2005) in order to coordinate a multi-agent system. Their method places coordination related information into token-like data structures that are passed around a communication network. Xu et al. make use of sophisticated decision theoretic methods in order to make decisions about how tokens should be routed within the system. In contrast, the related processes within this work use very simple techniques, like picking a random neighbor, to distribute tokens. Nevertheless both Xu et al. (2005) and the present work rely on the tokens dynamically exploring the connected sections of the robot system. Our focus on tokens concerns their use in processes which can be analytically modeled at the macroscopic level, whereas their methods have yet to be.

Footnote 6: The generation of processes to produce a given surface is a problem we are deeply interested in but we know of no results in this regard.

3.4.9 Behavior on measure spaces

Donald (1995) uses topological spaces as the underlying mathematical structure for his theory of information invariants. For example, sensors and/or computational resources are represented as graph immersions (i.e., embeddings that are not necessarily injective) within the configuration space. Structural constraints, like co-location or separation requirements, are captured by constructing a quotient space. Our formalism is similar because equivalence relations play an important role in the definition of macroscopic properties. Additionally, we find it useful to think of aspects of the robot systems in terms of the “space” of values they can take, rather than their specific values at a particular time.
This research makes assumptions about system size, which are missing from Donald (1995) and, because we assume the underlying dynamics of the processes are ergodic, we endow our representation with a measure. This measure allows for the calculation of phase-space averages, which greatly expands the utility of the model.

3.4.10 Randomness and stochastic behavior

We adopt an idea described by Erdmann (1989) in his motivation for the use of random actions: “...actively randomizing its actions a [robot] system can blur the significance of modeled or uncertain parameters.” We believe the same to be true for multi-robot systems and that such randomness can help mitigate feedback effects which multiply unmodeled and unwanted behavior (cf., also, Deneubourg et al. (1983)). The emergence of higher-level regularity in implicitly coordinated multi-robot systems relies on positive feedback rooted in structured temporal dynamics. The mixing properties and weak temporal dynamics of ergodic processes support this conjecture, which is an extension of the original intuition observed in manipulator robots (Erdmann 1989).

The exploitation of stochasticity is also apparent in a machine learning method based on Boltzmann Machines described by Ackley et al. (1985). The processes used in the current work are related in that they behave in a similar stochastic fashion, although we do not require the processes be limited to only two states. The common feature is that both encode information through their temporal behavior. For example, a Boltzmann Machine is only capable of producing either a 0 or a 1. In order to represent a value like 0.25, the machine will produce 0 as output 3/4 of the time, with a 1 being produced the remaining 1/4 of the time. Similar Monte Carlo behavior can be exhibited in the processes used to make swarm controllers. While behaviorally they may appear similar, and both make use of thermodynamic notions like a temperature, the purposes of the Boltzmann Machines and our processes are quite distinct.
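The idea of encoding a value in temporal behavior can be sketched in a few lines. The following Python fragment is a minimal illustration and not an implementation of a Boltzmann Machine: `stochastic_unit` is a hypothetical binary unit that emits 1 with a fixed probability, and `time_average` recovers the encoded value by averaging many outputs.

```python
import random


def stochastic_unit(p_one, rng):
    """Emit 1 with probability p_one, otherwise 0."""
    return 1 if rng.random() < p_one else 0


def time_average(p_one, n_samples, rng):
    """Estimate the encoded value as the fraction of 1s over n_samples outputs."""
    total = sum(stochastic_unit(p_one, rng) for _ in range(n_samples))
    return total / n_samples


if __name__ == "__main__":
    # A unit encoding 0.25 outputs 0 roughly 3/4 of the time.
    print(time_average(0.25, 100000, random.Random(1)))
```

The estimate sharpens as more outputs are averaged, which is one reason the timescale over which a process is observed matters when such encodings are used.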
3.4.11 On-line use of macroscopic models

This research is related to Goldberg (2001) and Goldberg and Matarić (2003), because that work demonstrates how a macroscopic model, based on augmented Markov Models, can be used by a group of robots during their task execution. This dissertation shows how thermodynamic-based models can be used at design time to trigger transitions in collective strategy, similar in principle but different in intention and implementation. The models allow us to pre-design a repertoire of task behaviors (e.g., distinct foraging strategies) and perform macroscopic control by switching among them. Much of the modeling information is thus used at design time. The evacuation assistance work has an additional similarity in its use of online models.

The Temnothorax transport example and evacuation assistance work both demonstrate how macroscopic models can be used as a control abstraction. In this regard the present work is close to Vaughan et al. (1998) and their control of ducks; the distinction in the case of evacuation is that a multi-robot system (rather than a single agent) controls the second group.

3.4.12 Evaluation domains

This research relates to Goldberg (2001) and Goldberg and Matarić (2003) in another way too. Goldberg used models of collective dynamics in order to change strategies and improve performance of a group of robots. We consider the same problem, namely mitigating interference in foraging, using macroscopic coordination done using controllers composed of ergodic processes. The focus on interference is particularly relevant to large-scale systems, where it is amplified and limits the possible performance of the system. Goldberg’s initial work on interference provided an understanding that different foraging strategies have qualitative properties that can be usefully exploited. We started with this idea, but also needed to perform experiments in order to consider large-scale robot systems (results are reported in Section 6.4).
The preceding discussion is a particular instance of the fact that, more generally, this dissertation attempted to evaluate the synthesis methodology on domains that had already been identified as useful for multi-robot systems. So, in addition to the foraging strategy selection, we considered a division of labor problem studied elsewhere by Jones and Matarić (2003); Suseki et al. (2005); Lerman et al. (2006). The sensor-network distributed time-stamping problem is similar to Xu et al. (2004). Further details are provided in the discussion of each evaluation domain.

3.4.13 Multiscale non-synchronous computation

Finally, we note that the local-to-global problem is concerned with how interactions at one spatial and temporal scale result in patterns at a larger and longer scale. Typically this is studied in terms of local interactions among discrete finite automata style agents (see, for example, Resnick (1994); Berlekamp et al. (1982)). These local interaction rules are termed “simple.” However, whether synchronous, discrete, error-free operations are actually natural for physical systems remains questionable. After all, it took many years before machines could be constructed so as to behave predictably and reliably at the appropriate scales. We consider this work to be related to the original cybernetic tradition, adopting methods similar to the original proposal by Wiener (1948) to exploit Gibbs’s statistical mechanical techniques. The examples in the following chapters also have a flavor close to the homeostat (Ashby 1966), with the entire system collectively equilibrating.

3.5 Reoccurring macroscopic properties

One of the primary reasons that this dissertation is concerned with exploiting models of macroscopic behavior is the observation that the vast majority of biologically-inspired robotics is really an application of some low-level biologically studied mechanism to robots. When multi-robot systems are described as “biologically-inspired,” this most often means that some local interactions have been appropriated in order to achieve some global behavior.
Unfortunately, this appropriation of mechanism often fails to be particularly deep. We find, for example, that little is successfully abstracted away from Nature’s implementation, or reused for different tasks.

Consider the trail-formation in foraging networks by ants (e.g., Monomorium pharaonis (Sudd 1960a), or Linepithema humile (Goss et al. 1989)). As the ants move through the environment they deposit pheromones. The greater the number of ants, the stronger the odor. Shorter routes from the nest to food sources can be travelled more quickly than longer ones; this results in more frequent laying of pheromones. The ants bias their navigation based on odor strength, and hence a positive feedback loop is created. This behavior successfully finds the shortest route from the nest to the source.

This optimization mechanism is probably the single entomological phenomenon that most stimulated computer science and robotic research in recent years. Research ranges from robots that lay physical chemical trails (Russell 1999), to those that perform trail-based transport (Vaughan et al. 2000c), to those that share trails in a shared virtual space (Vaughan et al. 2002), to agents optimizing abstract functions over a virtual graph (Dorigo et al. 1996), to placement of computational devices within the environment and generalization to “pheromone robotics” (Payton et al. 2001). Each of these examples involves extracting an aspect of the microscopic mechanism employed by real ants, and this is done at different levels of abstraction, but what remains is still strikingly similar to the original ant problem. Each of these preceding examples shows how the biological research can be applied at what Marr (1982) terms the algorithmic and implementation levels. While useful for finding out how to solve some particular problem, it tells little about how the procedures can be adapted or combined with others to solve radically different problems.

But what mechanism-independent lessons are to be learned from the set of global behaviors exhibited across natural systems?
We approach this question in order to direct the design of processes for the synthesis toolbox, reasoning that if the same class of macroscopic behavior occurs across widely disparate systems, then it is likely that it represents a useful primitive. Table 3.1 describes a number of qualitative commonalities that have been identified across systems. It should be noted that several of these instances have required a degree of interpretation. For example, “phase transition” is interpreted as meaning an abrupt transition between distinct forms of behavior, which is a far cry from the formal thermodynamic definition. We are forced to broaden the interpretation in this way because of the range of systems involved. (Fortunately, these terms are often abused by the referenced work, so this author does not have to accept the full responsibility for this abuse.)

This selection of examples suggests that phase transitions and symmetry breaking types of behaviors are common in natural systems. Although the synthesis methodology does not claim to make use of any of the same mechanisms, it does produce these higher-level behaviors.

The preceding sections have described how research on explicitly coordinated approaches has attempted to solve oft-reoccurring coordination problems with efficient algorithms, and then to frame new tasks as variants of these existing problems. Similarly, the underlying idea of basis behaviors is an attempt to produce, understand and exploit repeated task-structure. This has a bearing on this dissertation research, because the toolbox is a collection of processes, and an important question is what sorts of macroscopic regularity should be produced by the set of processes. The related question is whether the processes (or behaviors, or coordination problems) suffice for all necessary tasks, or failing that, what range of tasks can be produced. We are unaware of any results in this regard.

System | Instance | Macro-Behavior | Notes/Description
Social ants | Nest selection | Symmetry breaking | Various Temnothorax (Mallon et al. 2001; Pratt 2005)
 | | Hysteresis | T. rugatulus with equal quality nests [7]
 | | Synchronization | Temnothorax allardycei (Cole 1991)
 | Prey transport | Symmetry breaking | Several, e.g., Pheidole crassinoda (Sudd 1960b) †
 | Brood sorting | Symmetry breaking | Various Temnothorax (Franks and Sendova-Franks 1992) †
 | Corpse clustering | Symmetry breaking | Several, e.g., Matthaea sancta (Theraulaz et al. 2002) †
 | Trail formation | Symmetry breaking | More than one, e.g., Linepithema humile (Goss et al. 1989) †
Cockroaches | Shelter selection | Symmetry breaking | Periplaneta americana (Halloy et al. 2007) † ?
Wasps | Construction | Stigmergy | Polistes dominulus (Theraulaz et al. 1990)
Fireflies & Glow-worms | Mate attraction | Synchronization | e.g., Photuris pennsylvanica (Buck 1938) †
Fish | Schooling | Phase transitions | Couzin et al. (2002) captures transitions and toroidal forms exhibited by barracuda, jack and tuna.
Humans | Vehicular traffic | Phase transitions & Nucleation | See “Fundamental diagram” in Kerner and Rehborn (1997)
 | Pedestrian dynamics | Symmetry breaking | Spontaneous lane formation, see Helbing et al. (2001a)
 | Panic/Evacuation | Phase transitions | The “faster is slower” effect (Parisi and Dorso 2005)
 | | Symmetry breaking | Irrational herding behavior, see Batty (1997)
 | Opinion | Phase transitions | Le Bon (1895) describes a zero-one-type rule.

† Signifies that instance has inspired a robotic (or related) implementation. ? Experiment design actually shows robotic control.

Table 3.1: Repeated qualitative properties across natural distributed systems.

3.6 Summary

This chapter defined the synthesis problem and explained why it is important and challenging. Existing coordination approaches were reviewed, with a focus on formalisms for multi-robot systems.
We introduced a slightly augmented taxonomy for describing robot systems, which will be used to describe the robots in this research. A broad discussion described how this research relates to prior work. Further exploration of related work appears in the chapters that follow, as meaningful discussion of these works requires further technical definitions. Additionally, work that does not relate directly to the synthesis problem in multi-robot systems, but is still related to the present work, is introduced in the appropriate places throughout the dissertation.

Chapter 4

Synthesis methodology

This chapter defines and describes the equilibrium statistical mechanics and thermodynamics methods we use for analysis of individual processes. We present the high-level ideas and point out the modeling choices that are unique to this work. Specifically, we define ergodic processes, explaining why they form a useful micro-level building block. Next, we introduce the thermodynamic basis for characterizing the macroscopic behavior of a process. The operations used for coupling processes to form aggregate ergodic processes are defined. When two processes are coupled, the two microscopic specifications and the two macroscopic characterizations must be merged in order to describe the resultant aggregate process at both the microscopic and macroscopic levels. These operations are defined and explained.

Our multi-robot controllers are constructed by coupling processes that have been individually analyzed in order to have a well-defined macroscopic characterization. In order to realize this, several definitions are necessary. First, we must define the differences between microscopic, macroscopic and ensemble levels of description. This allows the macroscopic characterization of a process to be meaningfully explained. Secondly, we must describe how methods can be used to establish a link between the different levels of description for individual processes. Although the focus is definitional, the chapter includes some high-level discussion of why such techniques are workable.
Thirdly, we must define methods for coupling processes in order to construct aggregate processes. Such a coupling is realized at the controller level by constraining two processes; we describe how such constraints can be constructed using only local information in the multi-robot system. An aggregate process created by coupling two elementary processes has its own macroscopic characterization defined in terms of the macroscopic characterizations of the elementary processes.

Central to our synthesis methodology is a toolbox consisting of a collection of reusable processes and their reusable macroscopic characterizations. Figure 4.1 shows the iterative procedure used to construct the process toolbox. This chapter provides formal definitions that describe this procedure. The next section defines the microscopic description and identifies the processes we consider. The section thereafter gives definitions for macroscopic descriptions. In terms of the figure, these represent the controller (green cog) and characterization (green jigsaw piece) on the arrows. The details of the middle step labelled "analysis" are deferred until Section 4.4. The following chapter will present concrete applications.

4.1 Microscopic process description

We propose that robot controllers be constructed through composition of simple processes. A process $P$ is described by a state-space definition and a dynamics function, writing $P \triangleq \langle S, \Phi \rangle$, where $S$ is the (possibly infinite, potentially uncountable) set of states the process can occur in, and $\Phi$ is a dynamics that need not be deterministic. The process produces a sequence of state values $s(t) \in S$ for $t \geq 0$, called the process trajectory.

Figure 4.1: Overview of the modeling procedure used to produce the toolbox of processes and associated macroscopic characterizations.

Now we model $n$ robots each executing a single process $P = \langle S, \Phi \rangle$.
A single global view of the system state is constructed by considering
\[ \vec{s}(t) \triangleq [s_1(t), s_2(t), \cdots, s_{n-1}(t), s_n(t)]. \tag{4.1} \]
At any instant in time, $\vec{s}(t)$ provides the complete unambiguous description of the process $P$ running on each robot and is called a microstate. The trajectory of $\vec{s}(t)$ gives the entire microscopic description of all the processes $P$ over time.

4.1.1 Feasible configurations

In Equation 4.1 clearly $\forall t,\ \vec{s}(t) \in S^n \triangleq S \times S \times \cdots \times S$. Typically the global state of a system of $n$ processes is not simply a copy of $n$ independent states, because there are constraints among the processes executing on different robots. In practice we will consider circumstances in which one (or more) quantities are globally conserved across the robots. Thus, we seek a smaller set than $S^n$ which includes all $\vec{s}(t)$, and denote this set $\vec{S}$, the phase-space. Systems for which $\vec{S} = S^n$ have no constraints.

What follows is a dynamical systems formalism. Although the development is in terms of distributed computational processes, the same system description could apply to modeling robots directly. In such a case, the state-space would reflect the robot's degrees-of-freedom and $\Phi$ the Newtonian rigid-body dynamics. In fact, our focus is not on the dynamics of the robots themselves because we intend to build general coordination primitives that serve an information processing purpose, not a model of the system's physics. However, in the case where the set $S$ is actually the configuration space of the robot, there are natural constraints on $\vec{s}(t)$, as no two robots may occupy the same place at the same time. Donald (1995) gives a related description for configurations of arbitrary resources and the subsequent subtraction of the "diagonals." Analysis of the structure of the subspace in which $\vec{s}(t)$ evolves can be illuminating, but because we construct the process, we typically have some notion of the structure anyway.
The goal of understanding the phase-space structure is tackled by considering a coarser description, as will be described in the following sections.

Figure 4.2: Representation of the system phase-space $\vec{S}$ and the dynamics function $\vec{\Phi}$. The dynamics show an unstructured walk through the space, suggesting a measure $P(\cdot)$ and hence the ergodic property.

A notational distinction is made between microstates represented as $\vec{s}$ (or something analogous) when quantified over the entire $\vec{S}$ (or subsets thereof), and $\vec{s}(t)$ or similar variables bearing a temporal index when a temporal sequence of elements from $\vec{S}$ is intended. Both methods will be used. The universal quantification over $t$ above does not imply a single temporal evolution, but rather considers all possible arrangements.

4.1.2 Dynamics

The system-wide dynamics is constructed in a similar componentwise fashion to the construction of the system-wide state space. Generally, we may consider a global dynamics function $\vec{\Phi}$ such that $\vec{s}(t + \delta t) \triangleq \vec{\Phi}[\vec{s}(t)]$ for some suitably small temporal step-size $\delta t$ (or one may assume a family of functions for continuous time, but this is insignificant for presentation of the underlying ideas). Figure 4.2 shows this pictorially. Considering this componentwise, we may get:
\[ \vec{\Phi}([s_1(t), \ldots, s_n(t)]) = [\Phi(s_1(t), s_{j \in A_1}(t)),\ \Phi(s_2(t), s_{j \in A_2}(t)),\ \ldots,\ \Phi(s_n(t), s_{j \in A_n}(t))] \tag{4.2} \]
The sets $A_i$ for $i = 1, \ldots, n$ define the extent of information required about the state of others in order to calculate the future time step for each $i$. Most generally each $A_i$ will change with time too. The sets represent the fact that while the robots execute processes individually, the processes are usually not entirely decoupled. The decoupled case is the simplest, having an entirely separable dynamics update $\vec{\Phi}$ where $A_i = \emptyset$ for $i = 1, \ldots, n$. Although simple, this could model independent clocks executing on each robot.
At the other extreme, the global dynamics function $\vec{\Phi}$ may depend on the full $\vec{s}$ for the update of each $s_i(t)$, so $A_i = \{1, \ldots, n\} \setminus \{i\}$ for $i = 1, \ldots, n$. Such a dynamics is a global centralized computation, requiring information from every process.

We will consider componentwise dynamics functions strictly between the two preceding cases, where $A_i \subset \{1, \ldots, n\}$ for $i = 1, \ldots, n$, and $\sum_{i=1}^{n} |A_i| = kn$ where $k$ is a small positive constant. In particular, the set $A_i$ below will involve a spatial neighborhood centered on robot $i$'s position within a plane. (Also, experimental results show that $|A_i| < k_0$, $i = 1, \ldots, n$ for a small $k_0$, see Figure 6.4 on page 170.) Thus, the $A_i$ sets will represent a potentially dynamic communications infrastructure. Because these sets usually change as the robots move, they are one of the ways the behavior of the physical robots impinges on the process execution.

In order to provide meaningful predictions of system behavior without direct simulation of $\vec{\Phi}$, we introduce the notion of ergodicity (Petersen 1983). The time evolution of the system is a trajectory through $\vec{S}$. A system that exhibits ergodic dynamics has the properties that:

1. There exists a measure $P : \vec{S} \to (0, 1]$ having the property that
\[ \int_{\vec{s} \in \vec{S}} P(\vec{s})\, d\vec{s} = 1. \tag{4.3} \]

2. For any initial conditions from $\vec{S}$, the system will evolve exploring $\vec{S}$ such that, given sufficient time, the probability of finding the system in states $\vec{B} \subseteq \vec{S}$ is given by
\[ \Pr(\vec{s}(t) \in \vec{B}) = \int_{\vec{r} \in \vec{B}} P(\vec{r})\, d\vec{r}, \tag{4.4} \]
that is, the probability mass that lies within the sub-volume $\vec{B}$ of the phase space.

Such a dynamics wanders through the phase space, capable of visiting regions from any location, given sufficient time. In many cases, long-term history and initial conditions may be unimportant in predicting the system's trajectory, but even when this is not the case, the probability measure allows average behavior to be analyzed. Most dynamics functions produce short-term temporal regularity. In such cases, analysis is only reasonable for durations of sufficient length.
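The componentwise update of Equation 4.2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the controllers used in this work: the ring-shaped neighbor sets, the four-tick clock and the parity rule are all hypothetical choices made only to show how the sets $A_i$ limit the information each robot needs.

```python
from typing import Callable, List, Sequence

State = int
LocalRule = Callable[[State, Sequence[State]], State]


def global_step(states: List[State],
                neighbors: List[List[int]],
                phi: LocalRule) -> List[State]:
    """One synchronous application of the global dynamics (cf. Equation 4.2).

    neighbors[i] plays the role of the set A_i: the indices whose states
    robot i must know in order to compute its own next state.
    """
    return [phi(states[i], [states[j] for j in neighbors[i]])
            for i in range(len(states))]


def ring_neighbors(n: int, radius: int = 1) -> List[List[int]]:
    """Hypothetical neighbor sets: each robot sees `radius` robots on each side."""
    return [[(i + d) % n for d in range(-radius, radius + 1) if d != 0]
            for i in range(n)]


def clock(s: State, others: Sequence[State]) -> State:
    """Decoupled case (A_i empty): an independent four-tick clock per robot."""
    return (s + 1) % 4


def parity(s: State, others: Sequence[State]) -> State:
    """Hypothetical local rule: parity of own state plus neighbor states."""
    return (s + sum(others)) % 2


# Decoupled dynamics: every A_i is empty.
clocks_next = global_step([0, 1, 3], [[], [], []], clock)

# Local dynamics on a ring of five robots: sum |A_i| = k * n with k = 2.
ring = ring_neighbors(5)
total_links = sum(len(a) for a in ring)
ring_next = global_step([1, 0, 0, 0, 0], ring, parity)
```

With `radius=1` every robot consults exactly two others, so the total amount of shared information grows linearly in the number of robots, which is the regime between the decoupled and fully centralized extremes described above.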
4.1.3 Time and phase-space averages

Systems with ergodic dynamics induce a notion of equilibrium because a time-independent description of the system can be provided despite its continual evolution. Operationally, this is significant because time averages of system properties are equal to configuration-space averages (also called ensemble averages). In the preceding sentence, the "system property" simply means a function of the state of the microscopic system,¹ for example, an arbitrary $G : \vec{S} \to \mathbb{R}$. In physics the underlying microstates cannot be measured directly, so functions are used to represent properties of a particular system that are observable. One calculates the time-average of the system property $G$ by
\[ G_T \triangleq \frac{1}{T} \int_0^T G\left(\vec{\Phi}^t(\vec{s}(0))\right) dt, \tag{4.5} \]
where $\vec{\Phi}^t(\vec{s}(0)) = \underbrace{\vec{\Phi} \circ \vec{\Phi} \circ \cdots \circ \vec{\Phi}}_{t\ \text{times}}(\vec{s}(0))$ generates a single trajectory from initial condition $\vec{s}(0)$ in the typical iterated-function-system style. In physics this expression models an experimental observation of some property for a period $T$. The information necessary to carry out this integration essentially requires full microscopic simulation, and the calculation of $G_T$ is challenging.

¹ In statistical mechanics this is termed a mechanical property, as distinguished from ensemble properties.

The existence of a measure $P$ allows for the phase-space average of $G$ to be defined as
\[ \langle G \rangle \triangleq \int_{\vec{s} \in \vec{S}} P(\vec{s})\, G(\vec{s})\, d\vec{s}. \tag{4.6} \]

Reconsider Figure 4.2. The value for $G_T$ would be calculated by integrating $G$ along the trajectory shown. On the other hand, $\langle G \rangle$ ignores the particular trajectory, instead averaging over $\vec{S}$. Ergodic processes have the property that, in the limit of long time windows, time-averages equal phase-space averages,
\[ \lim_{T \to \infty} G_T = \langle G \rangle. \]
(4.7)

Many authors have demonstrated distributed systems of simple interacting robots that exhibit a variety of non-ergodic global dynamics ranging from point and periodic attractors to chaos (Hogg and Huberman 1991), and even classic examples of those capable of universal computation (Berlekamp et al. 1982; Wolfram 1984). From this perspective, the requirement of ergodicity places a significant restriction on the set of feasible global dynamics available to the system designer. Importantly, however, this property enables prediction of system behavior without having to resort to the more traditional integration of the microscopic dynamics. It establishes a link between multiple levels of description at different timescales.

Several researchers have explored minimalist multi-robot systems by reducing some particular resource, like memory or communication. We consider simple robot systems too, but we add the additional constraint of using components that have extremely limited temporal structure because they involve ergodic dynamics. Crutchfield (1994) introduced the computational mechanics measure of excess entropy in order to characterize structure "hidden" within dynamical systems. The robot controllers considered in this work are minimalist in the sense that they are built from processes with zero excess entropy.

An important question that remains is whether the severe limitations placed on ergodic dynamics mean that they are too weak to perform useful tasks. This dissertation work is an instance of minimalist robotics that involves the demonstration of sufficiency for particular classes of tasks. The approach we take requires the introduction of additional complexity at higher levels of description and at longer timescales. In order to achieve this we introduce the notion of macrostates as a description of the system that is coarser than that provided by microstates.
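The equality of time and phase-space averages in Equations 4.5 to 4.7 can be checked numerically on a toy process. The sketch below is an illustration under stated assumptions rather than one of the processes used in this work: it takes a single hypothetical process whose state is a position on a five-cell ring, uses a lazy symmetric random walk as the dynamics (for which the uniform measure is stationary), and replaces the integrals with discrete sums.

```python
import random

M = 5                      # number of states on the (hypothetical) ring
rng = random.Random(0)     # fixed seed so the run is repeatable


def phi(s: int) -> int:
    """Lazy symmetric random walk: stay, or step left/right, each w.p. 1/3."""
    return (s + rng.choice((-1, 0, 1))) % M


def G(s: int) -> float:
    """An arbitrary observable of the state (here simply its index)."""
    return float(s)


def P(s: int) -> float:
    """Uniform measure over the ring; stationary for the walk above."""
    return 1.0 / M


def time_average(steps: int, s0: int = 0) -> float:
    """Discrete analogue of Equation 4.5: average G along one trajectory."""
    s, total = s0, 0.0
    for _ in range(steps):
        total += G(s)
        s = phi(s)
    return total / steps


def phase_average() -> float:
    """Discrete analogue of Equation 4.6: average G over the whole space."""
    return sum(P(s) * G(s) for s in range(M))


time_avg = time_average(200_000)
phase_avg = phase_average()
print(f"time average  = {time_avg:.3f}")
print(f"phase average = {phase_avg:.3f}")
```

The phase-space average needs only the measure and the observable, whereas the time average requires simulating the trajectory; for this ergodic walk the two agree increasingly well as the number of steps grows.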
4.1.4 Parameterized processes

Two aspects contribute to the overall behavior of a system with many interacting degrees of freedom. The first is the phase-space $\vec{S}$, the range of conceivable states available to the system. The second is the spatio-temporal dynamics $\vec{\Phi}$ through which the system explores these states. Currently, the dominant view (albeit implicitly held) is that synthesis of collective behavior should focus on the latter. We propose, instead, that controllers be constructed from simple dynamics, but that useful behavior result from changes in the phase-space. These phase-space changes lead to task-oriented computation in an entirely decentralized manner.

The phase-space is changed through the use of parameterized processes. The preceding description begins by defining $P \triangleq \langle S, \Phi \rangle$. In actuality we care about processes of the form $Q(x_1, x_2, \cdots, x_m) \triangleq \langle S(x_1, x_2, \cdots, x_m), \Phi \rangle$ where the $x_i$, $i = 1, \ldots, m$ are termed parameters that allow $\vec{S}$ to be variable, and to depend explicitly on each of these variables. Although $x_1, x_2, \cdots, x_m$ characterize $\vec{S}$, we write $S(x_1, x_2, \cdots, x_m)$ because the structure that results in $\vec{S}$ must be produced by the processes themselves. Full generality should allow processes with dynamics functions to be parameterized as well, but in elevating the level of description, we wish to reason about changes in phase-space size and topology rather than about the dynamics themselves. A single-dimensional example of a parameterized process is $P_a(z) \triangleq \langle S_a(z), \Phi_a \rangle$, where $\Phi_a$ is ergodic. With $n$ robots the result is a trajectory $\vec{s}_a(t, z)$ through $\vec{S}_a(z)$. The preceding discussion with different types of averages applies identically; we need only keep track of the additional parameter $z$ in each of the equations. The previous sections introduced processes without parameters because when designing and analyzing processes, we will assume fixed parameters. We do this because any changes in those phase-space parameters are assumed to be slow compared with the dynamics $\Phi$.
But the parameters are key to coupling processes across timescales. We will consider aggregate process behavior to be achieved through quasi-static changes that maintain equilibrium in each of the constituent processes.

In the processes that we will define, the parameterization describes a conservation constraint (based on observations of successes in statistical mechanics with Hamiltonian systems). By a constraint we mean a postulated property that reduces the total degrees of freedom in the global process state, that is, it places restrictions on $\vec{S}$ so that $\vec{S} \subset S^n$. This means, for example, that a robot executing a process will need to consult with other robots during its execution. In spite of being global constraints, we consider instances that are locally enforceable. Reasonable pre- and post-conditions on robot interactions can maintain these types of global constraints. This becomes clearer in the following chapter, where specific examples are used. It is noteworthy that such constraints operate on $\vec{S}$; that is, they constrain the set of feasible configurations, not the system's dynamical evolution within that space.

4.1.5 Examples of ergodicity

A natural question to ask is why the preceding formalism should use ergodicity as the fundamental requirement of the processes. There are other instances of state transition systems that have probabilistic representations. Generality is the primary reason: although we next consider processes that are essentially finite Markov chains, application of the framework to the physical robots can benefit from proofs of ergodicity in the classical many-body treatment of systems with Hamiltonian dynamics, which preserve measure over constant-energy manifolds (cf. Arnold and Avez 1968). Crucially, this particular case serves to remind that the dynamics can in fact be deterministic. Petersen (1983) describes how stationary stochastic processes and aperiodic and positive recurrent Markov chains are ergodic (to do so he treats the Markov shift as the dynamics).
Petersen similarly shows that Bernoulli shifts (i.e., finite-valued stationary stochastic processes with values that are i.i.d.) are also ergodic. Indeed, we would expect as much, as the KAM (Kolmogorov-Arnold-Moser) hierarchy of systems distinguishes Bernoulli systems as a subset of Mixing systems, themselves being a subset of the Ergodic ones (Lebowitz and Penrose 1973).

4.2 Macroscopic process description

The microscopic view provides an over-specification of the process behavior because it distinguishes states that to an external observer are identical, or to the system designer ought to be considered identical. In the context of robots, we do not usually have system requirements that specify that robot $i$ should be in state $s_i$, and robot $j$ should be in state $s_j$, and so on. Instead we have some broad features of the behavior with which we are concerned. A function $G : \vec{S} \to \mathbb{R}$ is used to "collapse" an entire set of microstates into a single macrostate. Meaningful choice of $G$ depends on the process itself.

4.2.1 Macrostates

A function of the global system state, $G : \vec{S} \to \mathbb{R}$, produces an associated equivalence relation
\[ \vec{s} \sim_G \vec{r} \iff G(\vec{s}) = G(\vec{r}). \]
Thus, each $G$ partitions $\vec{S}$ into equi-$G$-valued equivalence classes. Let $\vec{S}_G$ be the class representatives, that is, the set containing a single archetype $\vec{s} \in \vec{S}$ for each equivalence class. Each class is thus defined as:
\[ \vec{S}_{G(\cdot)=k} \triangleq \{\vec{s} \mid \vec{s} \in \vec{S} \wedge G(\vec{s}) = k\}, \]
and we call each set a macrostate with respect to $G$. Two microstates within the same class are identical to an observer capable of measuring only the value of function $G$.

If $P(\cdot)$ values are equal for all elements within an equivalence class then the phase-space mean (Equation 4.6) can be rewritten over macrostates:
\[ \langle G \rangle = \int_{\vec{s} \in \vec{S}} P(\vec{s})\, G(\vec{s})\, d\vec{s} = \int_{\vec{r} \in \vec{S}_G} \int_{\vec{q} \in \vec{S}_{G(\cdot)=G(\vec{r})}} P(\vec{q})\, G(\vec{q})\, d\vec{q}\, d\vec{r} = \int_{\vec{r} \in \vec{S}_G} P(\vec{r})\, \Omega_G(G(\vec{r}))\, G(\vec{r})\, d\vec{r}, \quad \text{where } \Omega_G(x) = \|\vec{S}_{G(\cdot)=x}\|. \tag{4.8} \]
Here $\|\cdot\|$ denotes the "size" (in a Lebesgue sense) or "cardinality" (if $\vec{S}$ is finite) of the argument, $\|\vec{X}\| = \int_{\vec{s} \in \vec{X}} d\vec{s}$.
Thus, $\Omega_G(x)$ measures the size of the equivalence class with $G$ taking value $x$. (Alternatively, Equation 4.8 can be used to define $\|\cdot\|$, which can then be extended to any subset of $\vec{S}$ by defining a suitable indicator function for the given subset.)

Despite appearing to simply be a rewritten form of the phase-space average, the macrostate average in Equation 4.8 can be quite useful. Of course, the actual utility of this variant depends on the specific $G$ for the underlying processes. Important cases do exist where the probability density function is characterized by a very sharp peak at a single macrostate and equilibrated system behavior is strongly characterized by that macrostate. Put another way, the system quickly evolves toward this single macrostate and then explores the set of microstates within that equivalence class. Also, this gives our first glimpse at the compositional nature of the macroscopic descriptions. To explain why, one may think of $G$ as acting to "abstract" the state of the system. If $G$ is a partial "abstraction", we can define a higher-level abstraction $F$ as $F \triangleq H \circ G$. In other words, $F$ clusters $G$-valued macrostates together, through $H$. Now, we can calculate
\[ \langle F \rangle = \int_{\vec{s} \in \vec{S}_F} P(\vec{s})\, \|\vec{S}_{F(\cdot)=F(\vec{s})}\|\, F(\vec{s})\, d\vec{s} = \int_{\vec{r} \in \vec{S}_G} P(\vec{r})\, \|\vec{S}_{G(\cdot)=G(\vec{r})}\|\, H \circ G(\vec{r})\, d\vec{r}, \tag{4.9} \]
so one can perform calculations at the higher level of abstraction by integration over the lower-level, but still macrostate, representation. To do so, one need only understand how $P(\vec{s})\, \|\vec{S}_{F(\cdot)=F(\vec{s})}\|$ and $P(\vec{r})\, \|\vec{S}_{G(\cdot)=G(\vec{r})}\|$ relate, which requires no lower-level detail. These factors suffice to characterize the macroscopic properties of the system and form a useful abstraction by hiding lower-level information.

4.2.2 Ensemble properties

Next, we consider coupling pairs of processes.
We will assume that the processes have $P(\vec{s}) = k$, $\forall \vec{s} \in \vec{S}$, but this does not limit generality since, given processes with arbitrary density functions at the microstate level, we can define a sub-micro-level with a uniform density and a function, like $G$, that produces the desired density at the microstate level. With uniform density, the critical element in calculating the phase-space average is the $\|\cdot\|$ factor. Given an arbitrary sub-region of the phase-space, we make use of the entropy, a log function of the "size" of a set of microstates
\[ S(\vec{X}) \triangleq \ln \|\vec{X}\|, \tag{4.10} \]
which is Boltzmann's form in natural units. One may think of $\|\vec{X}\|$ as a weighting, that is, an unnormalized probability. In such a case, the entropy is similar to a log weighting or log of the unnormalized probabilities. A statement like $S(\vec{U}) \leq S(\vec{V})$ implies that $\vec{V}$ is a larger region of phase-space than $\vec{U}$, and the system dynamics will spend proportionally more time in $\vec{V}$ than $\vec{U}$ (from Equation 4.4). The entropy is defined in this way so that operations that multiply phase-space volume can be reasoned about as simply adding entropy.

Several definitions for entropy become identical in the thermodynamic limit. We will not be taking this limit, as our robot systems are small by thermodynamic standards, and so the choice is quite important. Rather than Gibbs's form, for example, we use Boltzmann's form, which has a mechanical basis and hence is meaningful for small finite systems (Gross 2001).

The definition of entropy in Equation 4.10 is for an arbitrary subset of phase-space, but it is not typically used in this form. The system's entropy is usually specified as a function of macrostates. When considering a macrostate, one is really specifying some set of microstates, and the associated (Boltzmann) entropy may be calculated. Occasionally, however, one does not have a particular value of the macrostate of interest, but instead wishes to understand how some property functionally depends on the value of the macrostate.
In such a case, one might imagine a set of possible macrostate values, each representing a microstate set from which the property can be calculated. Consider some function of this set of macrostates. It can be envisioned as a function of a set of separate systems. Such properties are called ensemble properties.

(a) Characterizing the macroscopic behavior of a complex controller directly. (b) Complex controller analyzed via characterizations of constituent processes.

Figure 4.3: Rather than requiring the characterization of a complex process, as in (a), the compositional approach makes analysis of aggregate processes feasible because macroscopic properties can be "summed" as in (b). (Arrows represent the statistical mechanical analysis.)

4.3 Synthesis by coupling processes

Now that we have formally grounded the definitions of microscopic and macroscopic behavior, we return to the focus of the formalism: synthesis through the combination of processes in order to maintain a view of the system's global behavior. The basic rationale for the constructionist approach, and specifically how it contrasts with existing macroscopic analysis methodologies, is shown in Figure 4.3. The compositional approach constructs the macroscopic description from macroscopic descriptions of the components.

Suppose the robots are executing two ergodic processes, each with a single parameter: $P_0(m) = \langle S_0(m), \Phi_0 \rangle$, $P_1(n) = \langle S_1(n), \Phi_1 \rangle$. The complete state of this system is given by two state vectors $\vec{s}_1(t)$ and $\vec{s}_2(t)$, one for each of the processes. Independent application of Equation 4.6 for a particular parametrization (e.g., $n = n_0$, $m = m_0$) yields the system's time-invariant behavior.

Parameterized processes may be coupled by the introduction of a constraint on the process parameters. For example, the previous processes could be constrained so that $m + n = C$. When $C$ is known, a single degree of freedom describes the parametrization of both processes, and we call this a macroscopic degree of freedom.
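To make the idea of a single macroscopic degree of freedom concrete, the following Python sketch scans every split of a conserved total between two coupled processes and reports the split whose combined phase-space is largest. It is a toy under stated assumptions, not one of the processes analyzed in this work: each hypothetical process is run by a fixed number of two-state robots, its parameter is the number of robots in state 1, and the measure is uniform, so the size of its phase-space is a binomial coefficient. Sizes multiply across the two processes, so the Boltzmann entropies of Equation 4.10 add.

```python
from math import comb, log
from typing import Tuple


def entropy(num_robots: int, count: int) -> float:
    """Boltzmann entropy ln ||S(count)|| of a hypothetical two-state process.

    ||S(count)|| is the number of ways to place `count` robots in state 1.
    """
    return log(comb(num_robots, count))


def most_likely_split(n0: int, n1: int, c: int) -> Tuple[int, int]:
    """Return (m, n) with m + n = c that maximizes the summed entropy.

    n0 and n1 are the (assumed) numbers of robots running each process.
    """
    lo, hi = max(0, c - n1), min(n0, c)          # feasible values of m
    best_m = max(range(lo, hi + 1),
                 key=lambda m: entropy(n0, m) + entropy(n1, c - m))
    return best_m, c - best_m


m_star, n_star = most_likely_split(n0=30, n1=10, c=20)
print(f"most likely parametrization: m = {m_star}, n = {n_star}")
```

Because the constraint removes one parameter, the search is over a single variable. Under the uniform-measure assumption, the split with the largest summed entropy is the one the coupled system occupies most often.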
A slow dynamics operating on the parameters $m$ and $n$, while maintaining the conservation constraint, gives a composite process with a combined phase-space. Prediction of composite process behavior (on long timescales compared with the coupling dynamics) can then be approached the same way, calculating the phase-space average over the combined space. This recursive property gives the method its power.

We consider ergodic processes for which $P(\vec{s}) = k$. This permits the phase-space average to be calculated as a function of the state entropy, $S$. The most likely state of the system is obtained by maximizing this function. For two coupled processes, like the two just described, the maximization must be carried out over all parametrizations consistent with any constraints.²

² This has sketched the Microcanonical strategy for understanding the coupled processes. There are other techniques for when one of the processes has a "small" phase-space and the other a "large" one.

4.4 The MMMC analysis tool

Statistical mechanics has both analytical and numerical tools. We use only elementary analytical tools in the following chapter. Next we outline MMMC, the numerical technique we implemented and used.³

³ In order to rapidly compute entropy surfaces for processes, we implemented a distributed version of the algorithm. This may be novel, but we cannot be sure.

Microcanonical Metropolis Monte-Carlo (MMMC) is a numerical method described by Gross (1997) as a variation of the famous Metropolis-Hastings algorithm (Metropolis et al. 1953). The original method required adaptation because it does not deal well with calculation of the entropy surface as a function of purely conserved variables (termed extensive thermodynamic variables). We used MMMC to construct the entropy surface for a finite Ising model in the next chapter. Even with 100 robots it is infeasible to explore all available states, so a randomized algorithm must be used to explore the space.
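The scale of the problem can be made concrete with a small calculation. The sketch below is not the MMMC algorithm and not the Ising model of the next chapter; it assumes a hypothetical system of 100 non-interacting two-state robots whose macrostate is the number of robots in state 1. It counts the microstates, evaluates the Boltzmann entropy of Equation 4.10 per macrostate, and shows how little of the space the extreme macrostates occupy.

```python
from math import comb, log

N = 100                          # number of (hypothetical) two-state robots
TOTAL_MICROSTATES = 2 ** N       # size of the unconstrained phase-space


def omega(k: int) -> int:
    """Number of microstates with exactly k robots in state 1."""
    return comb(N, k)


def entropy(k: int) -> float:
    """Boltzmann entropy ln(omega(k)) of macrostate k (cf. Equation 4.10)."""
    return log(omega(k))


def fraction_at_most(k_max: int) -> float:
    """Fraction of all microstates whose macrostate value is <= k_max."""
    return sum(omega(k) for k in range(k_max + 1)) / TOTAL_MICROSTATES


tail_fraction = fraction_at_most(10)
print(f"microstates in total: {TOTAL_MICROSTATES:.3e}")
print(f"fraction with k <= 10: {tail_fraction:.3e}")
print(f"entropy at k = 50 vs k = 10: {entropy(50):.2f} vs {entropy(10):.2f}")
```

Under these assumptions, uniform sampling of microstates almost never visits macrostates far from the entropy maximum, so their entropy values cannot be estimated that way. This motivates estimating local derivatives of the surface at chosen parameter values instead.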
Naïvely sampling the state space does not give sufficient data to provide the complete surface. The MMMC algorithm decomposes the surface into knots and calculates partial derivatives at each knot. The entire surface is then constructed through integration of these values.

Values at each knot are calculated by collecting statistics using a pseudo-dynamics simulation of the model from a given initial value within the knot. The emphasis is on collecting numerical data based on local transitions. The simulation does not attempt to construct long temporal explorations of the states as the implementation would do. We believe that this approach, rather than simulation for plausible runs, is important for characterizing the processes. From the perspective of the entropy surface, such an approach provides more information and is a more complete characterization.

A complete characterization of a process's behavior is important because the system designer seeking to combine the process needs to know how the process will behave across all parameterizations rather than simply the most likely variants of particular macrostates. In other words, a good characterization should not make assumptions about which variables are controlled and which are free; instead, the entire entropic surface should be computed. The equivalent idea from a probability theory point of view is that a specification of the joint probability function can be used to condition out particular events and, thus, is most general. This idea is equivalent to the earlier statement that macroscopic models must capture the behavior across a range of input parameters.

4.5 Constructing the toolbox

Constructing the process toolbox is not an automated procedure. Figure 4.1 shows that the first step requires the toolbox designer to specify microscopic rules that are then analyzed. This can be a challenge. However, a technique that can be used (and is used in the following chapter) is to take inspiration from examples of structured macroscopic behavior in the natural world.
Section 3.5 has listed some biological examples of repeated structures, showing that symmetry breaking occurs in many natural distributed systems; as shown in the next chapter, this behavior is a particularly useful coordination primitive for tasks. Although not biological, chemical systems are natural systems that exhibit behavior indicative of symmetry breaking, meta-stable states, and nucleation processes. We were able to construct processes to reproduce these global properties by applying the analysis originally used to understand these bulk properties in equilibrium chemistry. Finally, these properties were shown to be reusable for synthesis of task-achieving behavior in robot systems.

Figure 4.4: Three different views of the system dynamics. (a) A trajectory within the entire behavioral phase-space; (b) A hypothetical projection of the phase-space showing planar subspaces for behavior not part of task states S and the actions connecting these spaces; (c) The abstracted S representation as used by the formalism.

4.6 Ergodicity and mixing in existing robotics work

The use of randomization is popular in robotics. In multi-robot systems, it is not uncommon for robots to be programmed to perform random walks in order to explore an environment. Such randomness is used because it can be implemented easily on physical robots. Randomized action selection has some useful properties, because it may overpower other systematic biases or address uncertainty (Erdmann 1989).

Jones and Matarić considered the problem of automated synthesis of controllers for minimalist robot systems. Both controllers with memory (Jones and Matarić 2004b) and ones endowed with communication capabilities (Jones and Matarić 2004a) were demonstrated in simulation and on physical hardware in a construction domain. The task involves the sequential placement of colored cubes into a planar arrangement.
Their input specification is a sequence $T = \{S_1, S_2, \cdots, S_n\}$ which contains the required evolution of the structure, with actions $A = \{a_1, a_2, \cdots, a_m\}$ being the placement of an individual cube. The specification does not place any requirements on non-task related actions.

We can apply the same notions already described in this chapter for processes to the complete robot system. Consider a "behavioral phase-space" as the physical configuration space of a robot augmented to include additional dimensions for each of the robot's internal control variables that result in observable behavior. This is a useful mental representation for a multi-robot system and for reasoning about the overall system dynamics, not something one would ever need to construct explicitly. In the construction domain, each of the world states in $T$ represents an entire subspace of the overall system's space. Figure 4.4 shows the entire behavioral phase-space in this light. A hypothetical projection of this entire space separates the configurations into subspaces, each subspace representing a single state in $T$. Actions (from $A$) evolve the world state, transiting the system from one subspace to another. The system is designed so that within each subspace the dynamics are ergodic. The evolution of the whole system can be understood in terms of an ergodic and non-ergodic factorization; the ergodic factor captures much of the physical uncertainty.

The analytical techniques developed in order to predict task execution are aided by the ergodic components of the robots' behavior in this domain. One example is in the macroscopic model (Jones and Matarić 2004b, pp. 4–5) applied to this formal framework. Their model calculates the probability of successful task completion by calculating a large multiplication over all possible memory states, in each possible world state, after each possible observation, calculating the probability that only the correct action will result. They include terms for when actions may result in other, or null, world transitions.
A fundamental assumption for this calculation is that no "structure" in the world results in observation and action sequences that correlate. When endowed with navigational controllers that have ergodic dynamics, we know that this is true because the observation of an ergodic system at $N$ arbitrary instants in time is statistically the same as $N$ arbitrary points within the behavioral space (McQuarrie 1976, p. 554).

As another example, in their work on self-reconfigurable robots, White et al. (2005) make an assumption about the nature of the environmental dynamics that amounts to ergodicity. No work so far, however, has explicitly recognized the connection with ergodic theory. In Shell et al. (2005) we suggest that systems should be explicitly designed to exploit the ergodicity, rather than considering ergodicity just as a descriptive tool. This dissertation generalizes that philosophy by considering the process toolbox through which relatively easy steps can be performed in order to analyze composite processes.

4.7 Summary

We believe that, whenever possible, large-scale robot system programming should operate on macroscopic and system-wide constructs. The microscopic details of a system are often unnecessary during controller construction, and, conversely, designing a controller by aiming for a single trajectory will result in over-specification. Instead, a given global task should be decomposed into processes so that task-related activities map to macrostates.

In this chapter, we have provided descriptions of the system at both the microscopic and macroscopic levels of detail:

Microscopic: We have formalized the notion of a process, describing ergodicity, and how the property can be exploited.

Macroscopic: We have defined the notion of macrostates, which provides an abstracted view of the process's global behavior. We outlined how the entropy gives useful information similar to a probabilistic description.
Finally, the chapter explained how two descriptions of processes with conservation constraints can be coupled and gave an entropic description of the aggregate process defined in terms of the entropy of the two constituents.

The synthesis methodology which forms the central contribution of this dissertation involves the construction of a toolbox of well-understood processes, each with a formal characterization of expected macroscopic behavior, that is used to program controllers for new problems. This chapter has defined what is meant by a process and its macroscopic characterization. It has described how low-level coupling of processes can be achieved. In sum, the chapter has detailed the mechanisms used for constructing controllers by coupling processes at the microscopic level, while simultaneously combining macroscopic characterizations. Both are necessary for the goal of simplifying controller synthesis and maintaining a link between micro- and macro-level detail.

The next chapter will introduce specific processes and demonstrate their effectiveness as used in task-achieving robot controllers.

Chapter 5

Multi-robot case studies

This chapter applies the methods described in the previous chapter to synthesize controllers for multi-robot tasks. Coupled ergodic processes perform the coordination and collective decision making aspects for the task. For each task, we specify the processes used as well as the procedure used in order to analyze the individual processes. For each controller, the method of coupling the processes is detailed. Large-scale microscopic simulations are used to validate the controllers.

This chapter demonstrates that the formalism described in the previous chapter can be successfully applied to produce controllers for multi-robot systems, and to validate those controllers through realistic simulation. The application of the methods will result in concrete examples of controller processes.

The main experimental validation is within the multi-robot foraging domain.
Foraging is a canonical task in distributed robotics, and one of the most widely studied problem domains (Arkin et al. 1993; Goldberg 2001; Parker 1998). It is an entomologically-inspired task which requires that robots locate items (called pucks) scattered throughout a planar environment, and transport them back to a central location (called the home region). The domain is an idealization of collection and transport tasks and has several applications including mine-clearing, hazardous waste clean-up, and search and rescue.

Two complementary variations of the traditional foraging problem are considered in this work. Both increase the complexity of the task by requiring particular high-level coordinated behavior. These are introduced in the next two sections, after which we describe an experiment for sensor-actuator network sequencing, which reuses processes developed in the foraging domains, and consider a distributed sensor network time-stamping problem. We discuss the implications of the experiments with these domains.

5.1 Strategy selection

Multiple robots operating concurrently offer advantages over a single robot, but the exact advantages differ from task to task. Broadly speaking, there are two (overlapping) classes of task in which these advantages become apparent: (1) Tasks that are impossible for an individual robot but feasible for a group; (2) Tasks in which the addition of robots improves performance, although typically only up to a point. Foraging is the archetypal task from the second category. It is possible to have a single robot forage. An additional robot can be added to forage without being aware of the first robot. Often the second robot will almost double the foraging performance. But further addition of robots does not produce a linear speedup. Beyond the first few, additional robots have an increasingly smaller positive effect on performance. Further addition of robots will then lead to a net decrease in performance.
This behavior was first demonstrated by Arkin et al. (1993) in simulations of a foraging-like domain, and it has remained the primary task in which resource contention is studied; see, for example, Lerman and Galstyan (2002) for a mathematical model.

5.1.1 Motivation and problem definition

The scalability of robot systems is fundamentally inhibited by resources. Most often the addition of robots to a task-achieving multi-robot system will aggravate conflicts over the finite resources (Goldberg and Matarić 1997). Coordinated allocation and management policies must be introduced in order to increase the benefits of multi-robot systems. This is an important consideration if the proposed large-scale swarms are to be useful. Within the foraging domain, the shared resource is space. The resource contention manifests itself as spatio-temporal interference (Matarić 1992; Goldberg and Matarić 1997). Once an area is saturated the robots must spend more time on avoiding each other than on task related activities. It is interesting to note that this is a consequence, fundamentally, of robots being embodied. Thus the negative effects of interference may be mitigated, but never completely removed.

Interference in multi-robot foraging was systematically studied by Goldberg (2001) with physical robots as well as in simulation. He considered the following foraging strategies:

Homogeneous Foraging The traditional foraging strategy has each robot searching for pucks and independently transporting them to the home region. Once any of the robots searching for a puck finds one, that robot will deliver the puck to the home region; the robot then begins searching anew. As one might expect, such an approach produces spatio-temporal interference that is strongly concentrated around the home region; many robots will attempt to enter the same space and are forced to issue obstacle avoidance commands. Thus, rather than performing task-related actions, additional robots may hamper the collective effort.
Bucket-brigading In order to increase the effectiveness of a system with a high density of robots, the bucket brigading strategy was proposed by Goldberg (2001). (There is biological precedent for this strategy too, cf. Hubbell et al. (1980).) This strategy requires that each robot focus on a sub-region of the total work arena. Any robot finding a puck will transport that puck to the neighboring sub-region in the direction of the home region. Thus, pucks are passed from robot to robot along toward the home region and overcrowding is reduced.

Østergaard et al. (2001) showed that bucket brigading behavior could emerge from simple rules that constituted a policy for when pucks should be dropped. They studied foraging in a complex environment, whereas most research, the present work included, considers open areas. Their findings were also that the strategy could reduce deadlock and overcrowding, suggesting that these strategies are quite general.

[Figure 5.1: two panels titled “Comparison of foraging performance (250 robots, 30×30 pen)”, one for 500 pucks and one for 2000 pucks; x-axis: Time (s); y-axis: Number of pucks within home region; series: Bucket brigading, Homogeneous strategy.]

Figure 5.1: Plot of the number of pucks within home region versus time for both foraging strategies (with 250 robots). Left figure has low puck density, right has high puck density. Plots show mean and standard deviation for 5 independent simulation runs. (See Section 6.4 for a more complete empirical treatment of the foraging strategies.)

In his thesis, Goldberg (2001) identified four categories of multi-robot interactions: SPST, DPST, SPDT, and DPDT (where S ≡ same, D ≡ different, P ≡ place, and T ≡ time). Goldberg notes that interference around the foraging home region is physical interference, a characteristic SPST interaction. Such interference can be reduced by arbitrating resources:

1. Robots may be forced to stay in different places for all time (DPST).
2.
Scheduling is performed so as to ensure no location is used by two robots simultaneously (SPDT).

Bucket brigading is arbitration of the first type, while homogeneous foraging does not implement arbitration. We do not consider SPDT explicitly, but point out that if robots are subdivided into small sub-groups, simple scheduling can be performed by using a sequencing mechanism (see below).

The performance differences of these strategies are typically studied by varying the number of robots used within an experiment. Instead, we considered a fixed number of robots, but varied the puck density. Figure 5.1 shows that the two strategies display different behavior depending on puck density. In both cases the homogeneous strategy is characterized by good initial performance, since the rate of puck foraging is steep, but this flattens out after a time.

Group Composition
  Number of robots : Several hundred (100–500).
  Physical diversity : All robots are physically identical.
  Behavioral diversity : All robots execute the same controller.
Communication Properties
  Mechanism : Local communication modeled after Rene motes. (Details of the model are provided in Section 6.3.2.)
  Range : Local range characterized by a probabilistic model based on distance.
  Topology : Local broadcast, no routing necessary.
  Cost : Communication is considered cheap. (We do not evaluate the energy consumption.)
Individual Robot Capabilities
  Computational (theoretical) : Low-level controller: finite-state machine. Coordination: coupled ergodic processes.
  Computational (practical) : Extremely limited, a few bits of memory.
  Peer recognition : Other robots sensed as obstacles.
  Peer identification : None.
Mechanics of Cooperation
  Origins : Macroscopic level structure implicit within ergodic processes.
  Structure : Entirely distributed.
  Architecture : Hybrid/Behavior-based.
Four level classification (Farinelli et al. 2004)
  Cooperation level : Cooperative.
  Knowledge level : Aware.
  Coordination level : Strong.
  Organizational level : Distributed.
Table 5.1: Classification of the simulated robots used for experimental validation.

This behavior is the result of the initial random locations of the robots. As pucks are found, the robots move toward the home region, producing what amounts to a traffic jam. The bucket brigading strategy has a different performance. While bucket brigading, the time the robots spend finding pucks within their local area depends on the number of pucks. At low puck density, the robots are simply encountering them too infrequently. The plots show that if the robot system will be required to forage for a length of time greater than 1000 seconds, then the best strategy depends on the puck density.

We define the problem of collective foraging strategy selection as the process of having the swarm of robots switch strategies depending on the puck density. We require that the system use a pure strategy; that is, we want all the robots to employ the same strategy and, when environmental conditions dictate, the robots should all transition from one strategy to the other simultaneously.

Collective strategy selection poses a number of challenges. The puck density used to trigger the transition may not be spatially homogeneous; in fact, we know that the bucket brigading robots result in pucks in the environment having higher density closer to the home region. This inhomogeneity means that different robots will have different beliefs about the appropriate strategy if they only consult their own sensing. The robots we study have only local communication, and so there is a propagation time associated with a message. Rather than elect a single leader, we seek an entirely distributed solution in which robots play an equal share. In the context of the present work, the big question is whether ergodic processes can produce this high-level coordination, or if the requirement for ergodicity precludes this sort of computation.

On the other hand, this sort of strategy selection has the potential to improve the task performance for the robot system.
It is natural to expect that as further multi-robot problem domains are studied, several trade-offs between different algorithms will be identified. The foraging domain is one example where this has occurred, but strategy selection can play an important role in building general robot systems that are adaptive. A method for achieving this coordination allows robot systems to reach global consensus, which is a primitive that underlies much collective decision making.

The problem can be cast as a multi-robot task allocation problem, by considering each strategy as a task that must be assigned to all robots. There are multiple ways of considering strategy selection as an assignment problem; the most natural is probably to treat each strategy as a task. Within the taxonomy of Gerkey and Matarić (2004a) this would be characterized as an ST-MR-IA instance, which means:

(ST) Single-task Robots: Each strategy is taken as a task. Each robot can execute only one strategy at a time.

(MR) Multi-robot Tasks: We require that all the robots execute the same strategy. In this interpretation the tasks require multiple robots.

(IA) Instantaneous Assignment: The allocation responds to the immediate environmental conditions, and does not predict or model later conditions.

This view of the problem is not particularly enlightening because, once information from across the group of robots has been integrated, the best strategy is easy to select. The major difficulty, from our perspective, is integrating observations from each of the robots and ensuring the group reaches consensus about the choice of allocation. These problems are typically assumed to be solved when dealing with explicit coordination techniques because networking infrastructure, for example, provides a global broadcast. This highlights the minimalist nature of the robots we employ.

5.1.2 Solution

The preceding definition of the strategy selection problem for multi-robot foraging is an example of a task specified in terms of some desirable global structure.
It is left to the system designer to synthesize a solution that will achieve this desirable collective behavior through local interaction rules. Strategy selection is a case where what is available, accessible and feasible for the individual robot does not point toward obvious interaction rules to achieve the global task.

The procedures described in the preceding chapter are followed in order to construct an aggregate controller for performing strategy selection. First, a toolbox with two processes is constructed. We formally specify, describe and analyze each process individually. Second, the processes are coupled together to form an aggregate controller capable of achieving the behavior necessary. Although presented in succession, bear in mind that these two steps are quite distinct. The processes within the toolbox can be reused. Additionally, once the controller has been constructed and analyzed, this aggregate process can be added to the toolbox.

5.1.2.1 Formal specification of Process A

Let $P_A(n, Z) \triangleq \langle S_A(n, Z), \Phi_A \rangle$ where $S_A(Z) = \{0, 1, \ldots, Z\}$ with the added global constraint

$$\vec{S}_A = \left\{ \vec{s} = [s_1, \ldots, s_n] \;\middle|\; s_1, \ldots, s_n \in S_A \wedge \sum_{i=1}^{n} s_i = Z \right\}. \tag{5.1}$$

Using the componentwise factorization in Equation 4.2, we write the interaction rule as

$$\Phi[s_k(t), s_{l \in A_k}(t)] = s_k(t) - \sum_{i \in A_k} \omega_{k,i}(s_k(t), s_i(t)) + \sum_{j \in A_k} \omega_{j,k}(s_j(t), s_k(t)) \tag{5.2}$$

where $\omega_{k,i}(s_k(t), s_i(t))$ for $i \in A_k$ are random variables with the constraint that

$$\sum_{i \in A_k} \omega_{k,i}(s_k(t), s_i(t)) \le s_k(t).$$

We require symmetrical neighborhood sets so that $i \in A_k \implies k \in A_i$. The $\omega_{k,i}(\cdot)$ values can be seen as weights on a graph connecting robots, with the first summation term being some flow of robot $k$’s quantity of “$Z$” to its neighbors and the second term being the symmetrical increase because of the flows from $k$’s neighbors to $k$.

If the $\omega_{k,i}(\cdot)$ values are generated from the intervals between $\|A_k\|$ random variables drawn uniformly from the set $\{0, \ldots, s_k(t)\}$ then the entire phase-space $\vec{S}$ can be visited given sufficient time.
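The interaction rule of Equation 5.2 can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the ring topology, the synchronous update, and the names `process_a_step` and `ring_neighbors` are assumptions made for illustration. The flows are taken as the intervals between sorted uniform draws, measured from zero, which is one possible reading of the text and guarantees that a robot never gives away more than it holds.

```python
import random


def ring_neighbors(n):
    """Hypothetical symmetric neighborhood sets A_k: a ring of n >= 3 robots."""
    return [[(k - 1) % n, (k + 1) % n] for k in range(n)]


def process_a_step(state, neighbors, rng):
    """Apply Equation 5.2 once to every robot (synchronous update, an assumption).

    state[k] is s_k(t); the returned list is s(t + 1).
    """
    outflow = {}
    for k, s_k in enumerate(state):
        # Draw |A_k| values uniformly from {0, ..., s_k} and sort them.
        cuts = sorted(rng.randint(0, s_k) for _ in neighbors[k])
        prev = 0
        for i, cut in zip(neighbors[k], cuts):
            # omega_{k,i}: the interval between successive draws.
            # The intervals sum to cuts[-1] <= s_k, as the constraint requires.
            outflow[(k, i)] = cut - prev
            prev = cut
    new_state = list(state)
    for (k, i), w in outflow.items():
        new_state[k] -= w  # first summation: flow from k to its neighbors
        new_state[i] += w  # second summation, seen from the receiver's side
    return new_state


if __name__ == "__main__":
    rng = random.Random(0)
    neighbors = ring_neighbors(10)
    state = [1000] + [0] * 9  # all of Z held by a single robot initially
    for _ in range(200):
        state = process_a_step(state, neighbors, rng)
    print(state, sum(state))
```

Every unit subtracted from one robot is added to exactly one neighbor, so the global sum stays at $Z$ regardless of the random draws; this is the conservation constraint of Equation 5.1.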
We assume that the $A_k$’s remain reasonable so that $P(\vec{s}) = k''$ (where $k''$ is a constant) describes the system behavior. By this we mean that despite a non-fixed communication topology, which may not be globally connected for all $t$, the robots return to being connected often enough that the dynamics in Equation 5.2 may wander through the entire space.

Figure 5.2: A subgraph of the global communication graph. Vertices represent robots and edges represent communication links. Numbers depict Process A state. Two robots exchange a random value (12 here). No state information from the robots in gray is necessary; this is a strictly local interaction.

5.1.2.2 The intuition for Process A

The basic idea is that the process is described by parameter $Z$. Each robot has a variable that can take values that range from zero to $Z$, but the variable may not take just any value because there is a constraint on $\vec{s}(t)$: the global constraint says that summing the values $s_1(t), \ldots, s_n(t)$ over the whole robot swarm should come to $Z$. So, for example, each robot may have a value $Z/n$ in its variable (provided $Z$ divides exactly by $n$). Equally valid is the state with $n-1$ robots having value 0 with the final remaining robot having $Z$. The set of all states satisfying this constraint is $\vec{S}$.

The dynamics $\Phi(\cdot)$ are shown visually in Figure 5.2. The idea is that any two robots that can communicate may exchange some portion of the number within their variable. In the figure the central robot gives up 12 to its North-Eastern neighbor. In reality, many such values are being exchanged asynchronously across the entire robot swarm. So long as this is done correctly, the global sum over all the robots will remain $Z$.

[Figure 5.3: x-axis: States; y-axis: Probability of state; a second axis along the top shows the complete set of states.]

Figure 5.3: The density function describing the probability of Process A ($n = 100$ and $Z = 10^5$) being in a particular state at a random time.
The plot of the full domain is shown along the top, and the broken line delineates 97% of the probability mass, showing the sharp peak in the distribution.

Each robot picks a random value to exchange with its neighbors. It follows from this that the entire $\vec{S}$ can be explored, and that the information in the system’s initial conditions is lost almost instantaneously. This process is ergodic.

5.1.2.3 The analysis of Process A

This process is so simple that we can characterize its behavior in analytical form. This system has two defining parameters $n$ and $Z$. The number of possible ways of distributing the $Z$ over the $n$ robots is

$$\|\vec{S}(Z, n)\| = \binom{Z + n - 1}{n - 1}. \tag{5.3}$$

We consider the case where $n \ll Z$. By defining $G_j([s_1, \ldots, s_n]) = s_j$, we get equivalence classes $\vec{S}_{G_j(\cdot) = k}$ consisting of states of the form $[s_1, \ldots, s_j, \ldots, s_n]$ with $s_j = k$. Figure 5.3 shows that for reasonable values of $n$ and $Z$ only a small proportion of the system states amount to 99% of the probability mass. Perhaps more significant is that as the number of robots increases, the probability density function becomes more peaked and hence proportionally fewer states are necessary to characterize the system’s equilibrium behavior. (Note that Stirling’s approximation was used in the evaluation of large factorials necessary for the figure.)

[Figure 5.4: x-axis: Time (seconds); y-axis: States; legend: Experimental data, Theoretical mean state = 1000, Theoretical standard deviation = 990.6; annotation: Initial transient behavior.]

Figure 5.4: Plots from ten experimental runs showing the state of a single robot ($n = 100$ and $Z = 10^5$) over time. In all cases, the plotted robot moved quickly into states well characterized by the theoretical mean and standard deviation.

We can calculate both the expected value of site $j$ and the standard deviation. The first involves calculating the integral in Equation 4.8 for $G_j(\cdot)$, while the second involves similar integration of $G_j^2(\cdot) = (G_j(\cdot))^2$.
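As a numerical cross-check of these moments, the following Python sketch sums over the equivalence classes directly. It assumes that every state counted by Equation 5.3 is equally likely at equilibrium, so the probability of $s_j = k$ is the number of ways of distributing $Z - k$ over the remaining $n - 1$ robots divided by the total count. The function names are hypothetical, and `math.lgamma` stands in for the Stirling approximation mentioned above.

```python
import math


def log_binom(a, b):
    """Natural logarithm of the binomial coefficient C(a, b)."""
    return math.lgamma(a + 1) - math.lgamma(b + 1) - math.lgamma(a - b + 1)


def site_moments(n, Z):
    """Mean and standard deviation of one robot's value s_j (requires n >= 2).

    Assumes a uniform distribution over all states satisfying the
    sum-to-Z constraint.
    """
    log_total = log_binom(Z + n - 1, n - 1)  # Equation 5.3
    mean = 0.0
    second = 0.0
    for k in range(Z + 1):
        # Size of the class with s_j = k: Z - k spread over n - 1 robots.
        p = math.exp(log_binom(Z - k + n - 2, n - 2) - log_total)
        mean += k * p
        second += k * k * p
    return mean, math.sqrt(second - mean * mean)


if __name__ == "__main__":
    mean, std = site_moments(100, 10**5)
    print(mean, std)
```

With $n = 100$ and $Z = 10^5$ (the settings of Figures 5.3 and 5.4) the result should land close to the theoretical mean and standard deviation quoted in Figure 5.4.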
For the values $n = 100$ and $Z = 10^5$,

$$E[s_j] = 1000$$
$$\sigma[s_j] = \sqrt{E[(s_j)^2] - E[s_j]^2} = 990.6$$

This standard deviation is small when one considers that under these parameter settings there are $Z = 100{,}000$ feasible states, so that $2\sigma$ accounts for about 0.3% of the accessible states.

Equation 5.3 is the macroscopic characterization of Process A.

5.1.2.4 Formal specification of Process B

Let $P_B(Z) \triangleq \langle \{-1, +1\}, \Phi_B \rangle$. With $n$ robots we define

$$e(\vec{s}(t)) \triangleq \sum_{j,k \text{ s.t. } j \in A_k} -s_j(t)\, s_k(t),$$

as a measure of the “frustration” in the system, which grows with the number of instances in which two neighboring robots have different process values.

The dynamics $\Phi_B$ is constructed so that $e(\vec{s}(t))$ remains constant. Given two robots, $i$ and $j$, such that $i \in A_j$ and $j \in A_i$, that is, neighboring one another, the following two values are calculated:

$$\delta e_i(s_i(t), s_{l \in A_i}(t)) = \sum_{k \text{ s.t. } k \in A_i \setminus \{j\}} s_i(t)\, s_k(t), \tag{5.4a}$$
$$\delta e_j(s_j(t), s_{l \in A_j}(t)) = \sum_{k \text{ s.t. } k \in A_j \setminus \{i\}} s_j(t)\, s_k(t). \tag{5.4b}$$

The interaction rule is represented as

$$\vec{\Phi}^{(i,j)}_B(\vec{s}) = \begin{cases} [s_1, \ldots, -s_i, \ldots, -s_j, \ldots, s_n] & \text{if } \delta e_i = -\delta e_j \\ \vec{s} & \text{otherwise.} \end{cases} \tag{5.5}$$

Figure 5.5: A Process B transition. Filled vertices represent state $+1$, empty ones $-1$; solid edges depict alignment, broken edges misalignment. The number of solid and broken edges is conserved across the global graph. Within the local neighborhood there remain 4 aligned edges and 5 misaligned ones.

At a timescale in which the $A_j$ sets remain constant, executing this process ensures that

$$e(\vec{s}(t + \delta t)) = e(\vec{s}(t)) + 2\delta e_i + 2\delta e_j = e(\vec{s}(t)).$$

In the robot system, the transitions between states of robots $i$ and $j$ occur asynchronously with other transitions for other robots, hence $\vec{\Phi}_B$ is composed of several $\vec{\Phi}^{(i,j)}_B$ type transitions.

5.1.2.5 The intuition for Process B

For this process, each robot has a variable that can be in one of two states, either $-1$ or $+1$. The transition from one state to another is called a flip. Two robots within communication range, say $i$ and $j$, calculate $\delta e_i$ and $\delta e_j$; these are the changes in $e(\vec{s})$ that would result if robots $i$ and $j$ flip states.
Both can be calculated using local information. If $\delta e_i = -\delta e_j$ then the two robots carry out this flip operation.

Figure 5.5 gives a visual representation of the $\delta e_i$ and $\delta e_j$ calculations. Again this represents part of the communication graph, with vertices representing robots and edges communication links. In the figure, filled vertices represent a $+1$ while empty ones a $-1$. The grayed vertices and edges are examples of information not necessary for the calculation. The conserved property here is a property of the edges: the number of aligned edges (solid) and misaligned edges (broken) is conserved. Robot $i$ calculates the effect of a flip by considering how many edges would change (Equation 5.4a); robot $j$ does the same thing (Equation 5.4b). Both flips occur when the effects, together, maintain $e(\vec{s})$.

5.1.2.6 The analysis of Process B

We define macrostates through $m : \vec{S} \to \mathbb{R}$, where:

$$m(\vec{s}(t)) \triangleq \sum_{i=1}^{n} s_i(t).$$

The value of $m(\cdot)$ gives a measure of “agreement” among robots, with $m(\vec{s}(t)) = -n$ or $m(\vec{s}(t)) = +n$ describing the two states $\vec{s}$ in which each robot has the same state for its executing Process B.

Analysis of $P_B$’s macroscopic behavior is less obvious than $P_A$’s. The process has the same structure as the Ising ferromagnetic model (Fisher 1967), which has been well studied in the limit of infinite system size, and under controlled temperature conditions. In order to study a finite system under conserved $e$ conditions, we employed a numerical technique called Microcanonical Metropolis Monte-Carlo (see Section 4.4).

We simplify the problem by considering a model of robots placed on a $21 \times 21$ square lattice with each of the 441 robots placed on the grid and connected to nearest neighbors. In order to characterize the macroscopic behavior of Process B we must consider the volume of the phase-space (that is, the number of states) across the range of $m$ and $e$ values. In accordance with the discussion in the previous chapter, these can be viewed either as parameters of the process that are controlled, i.e., conservation constraints which must remain a particular value, or as macroscopic states which represent observationally equivalent classes. By considering the complete range of feasible $m$ and $e$ values we can interpret these properties either way.

[Figure 5.6: two surface plots, (a) and (b), each with axes E, M and S.]

(a) Plot of entropy when the system is free to choose $e$. Low $e$ have negative curvature, the signal of a phase transition within a finite system.

(b) Entropy normalized across $m$ for a given $e$. Symmetry-breaking is clearly shown.

Figure 5.6: Plots of the entropy surface for a finite, albeit idealized, $P_B$. $S$ is the entropy, $M = m/n$, and $E$ is $e$ normalized so as to fall between $-1$ and $1$. Since $S$ gives the log probability, these probability density functions are extremely peaked, i.e., allowing good prediction of average behavior.

Figure 5.6(a) is a plot of the entropy associated with any feasible $m$ and $e$ value. In the figure, the axes are labeled $S$, simply the entropy, $M = m/n$, and $E$, which is $e$ normalized so as to fall between $-1$ and $1$. Thus, $M$ is calculated for each configuration and must lie on $[-1, 1]$. The two extremal values are given for ordered states; those midway ($M \sim 0$) are disordered states. Observe that at high $e$, or equivalently $E$, the process exhibits only disordered states. The ordered states are only feasible for relatively low $e$. We can interpret the entropy as a probability: the most likely states are with $E \sim 0$ and $M \sim 0$. Since the entropy is a log function of the number of states, and the number of states is proportional to the probability of some particular state occurring, this is quite a sharply peaked distribution.

Recall that the dynamics function described above in Equation 5.5 maintains constant $e$. One can, with the help of Figure 5.6(a), imagine the process constrained with a given $E$.
The dynamics mean that as the process evolves, its $M$ changes, and the process execution results in an exploration of the $M$ axis at the fixed $E$. Of course, the underlying dynamics is resulting in a change of the $-1$ and $+1$ values on each robot; this produces a random walk up and down the $M$ axis, which is a macroscopic description of the underlying events. Process B has an interesting feature for negative values of $E$ less than about $-0.75$. The surface has negative curvature, which can be seen by the convex portions of lines of constant $E$ in the mesh.

Figure 5.6(b) is a plot of each $S$ for each $E$ and $M$ value, but is normalized for a given $E$. Put another way, the constant $E$ slices of Figure 5.6(a) are normalized so that the $S$ sums to 1.0. Although not physically meaningful, it gives an intuition for the behavior of the process for fixed $E$ values: for high $e$ the system has $M \to 0$, or an approximately equal number of robots (on average) with state $+1$ as $-1$. At low $e$ the system has two highly probable states, either complete agreement with state $+1$ or agreement with state $-1$. Now suppose that, instead of being fixed, the $E$ value may be changed. The system is said to undergo a symmetry-breaking phase transition as it moves from a state with $M \to 0$ to one with $M \to \pm 1$. It is important to note that this phase transition is the result of structure within $\vec{S}_B$, not the dynamics themselves.

Robots executing this process will be used in order to transition from one foraging strategy to another. But before this is discussed through the coupling of the processes, we must first revisit some of the modeling assumptions made for the above characterization. As the robots move about, the network connectivity changes, and the neighbors (in the $A_k$ sets) change. It has been assumed that these changes are slow compared with the execution (and equilibration) time of Process B. It is important to separate the analysis of the particular details of the process, and the associated assumptions, from the realities of the robot system. The robots do not have only neighbors on a lattice, nor will the experiments necessarily involve 441 robots.
The crucial idea here is that the process exhibits a symmetry breaking transition, not exactly where the critical point is, that is, where the transition occurs. In a sense, it is the topological features that we believe matter, and it is these qualitative properties that are robust to a variety of modeling assumptions.

5.1.2.7 Coupling the processes together

In order to perform the collective strategy selection with ergodic processes, we construct a controller by coupling an executing instance of Process A with an instance of Process B. The idea is that Process A and Process B run at a particular timescale. We introduce a slower coupling dynamics that operates on the $Z$ and $e$ parameters of the respective processes. Figure 5.7(a) shows this diagrammatically. The $P_B$ on robot $i$ may flip with a resulting $\delta e_i$, provided that this can be compensated by subtracting an equivalent value from the state of $P_A$. Similarly, that value of $P_A$ (and hence the global $Z$) can be increased by an appropriate flip from $P_B$. Because this coupling dynamics operates at a different timescale, the two processes are treated as reaching equilibrium between these transitions; see Figure 5.7(b).

The coupling process operates in an entirely reversible fashion, that is, it is symmetrical in that it converts $P_A$ state into $P_B$ state and does the reverse operation too. The direction in which this transition proceeds can be understood in terms of the process of global increase in entropy. The total ergodic dynamics that results from the coupled process will attempt to explore the aggregate phase-space. Flips are exchanged for $P_A$ state on average more frequently than the opposite only if that leads to larger areas in the phase-space. This is analogous to the expansion of a gas to fill a room, but in the configurations of the processes.

(a) Both Process A and Process B are distributed in the sense that they involve communication with the neighbors. The coupling between the two processes is local to each robot.
(b) The dynamics operate at two separate timescales: (1) the basic ergodic dynamics that explore the $\vec{S}_B$, and (2) the slower dynamics that operate on global constraints that alter $\vec{S}_B$.

Figure 5.7: Both Process A and Process B are coupled to produce an aggregate controller.

If the execution of $P_B$ is visualized through Figure 5.6(b) as before, then the coupling dynamics is responsible for alterations in the $e$ value. Rather than introducing these changes to $e$ by directly altering the state (say, through flips) this is achieved by subtracting or adding some number from $P_A$. This is useful in two ways. Firstly, it is easier to decrease the local value of $P_A$ on the robot than it is to alter the $e$ in a local neighborhood due to $P_B$. The analysis of $P_A$ shows that robots will typically have, over reasonable lengths of time, a state described as having value $Z/n$. Addition or subtraction is easily done by executing a local rule. Secondly, the $P_A$ has a smoothing effect, to help deal with inhomogeneities (this aspect is also exploited in Section 5.2 below).

Once this coupling dynamics has been introduced, the symmetry breaking operation performed by Process B is controlled through adjustment of the $Z$ values. A large $Z$ value results in a large $e$ and hence $M \sim 0$, whereas a small $Z$ results in $M \to \pm 1$.

5.1.2.8 Relating the processes to the task

Each robot has a behavior-based controller that takes care of essential low-level aspects, like obstacle avoidance, searching, etc. The controller includes an implementation of both foraging strategies, with a binary variable to control the one currently in use. This variable provides an interface to the two coupled ergodic processes. A simple behavior monitored the state of $P_B$ at low frequency. We only need consider two states, so we use $M = 0$ and $|M| = 1$. The individual robots do not have a direct measure of $M$ (as this is a macroscopic property) so they average the $m$ within their local neighborhood over time, relying on the ergodicity to ensure homogeneity.
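The Process B state that this monitoring behavior reads evolves under the pairwise flip rule of Equation 5.5. The Python sketch below shows that rule together with a single-instant neighborhood estimate of $M$. It assumes a toroidal square lattice with fixed neighbor sets and random sequential pair selection; the function names (`lattice_neighbors`, `process_b_step`, `local_agreement`) are hypothetical and the code is not taken from the dissertation's simulator.

```python
import random


def lattice_neighbors(side):
    """Four nearest neighbors on a side x side torus (side >= 3), an idealization."""
    def idx(r, c):
        return (r % side) * side + (c % side)
    return [[idx(r - 1, c), idx(r + 1, c), idx(r, c - 1), idx(r, c + 1)]
            for r in range(side) for c in range(side)]


def frustration(state, neighbors):
    """e(s): sum of -s_j * s_k over all ordered neighboring pairs."""
    return sum(-state[j] * state[k]
               for k in range(len(state)) for j in neighbors[k])


def delta_e(state, neighbors, i, j):
    """Equation 5.4: sum of s_i * s_k over i's neighbors, excluding j."""
    return sum(state[i] * state[k] for k in neighbors[i] if k != j)


def process_b_step(state, neighbors, rng):
    """Pick a random neighboring pair and apply Equation 5.5 in place."""
    i = rng.randrange(len(state))
    j = rng.choice(neighbors[i])
    if delta_e(state, neighbors, i, j) == -delta_e(state, neighbors, j, i):
        state[i] = -state[i]
        state[j] = -state[j]
        return True
    return False


def local_agreement(state, neighbors, k):
    """Robot k's instantaneous estimate of M from itself and its neighbors."""
    values = [state[k]] + [state[i] for i in neighbors[k]]
    return sum(values) / len(values)


if __name__ == "__main__":
    rng = random.Random(1)
    neighbors = lattice_neighbors(6)
    state = [rng.choice((-1, 1)) for _ in range(36)]
    e_start = frustration(state, neighbors)
    for _ in range(5000):
        process_b_step(state, neighbors, rng)
    print(e_start, frustration(state, neighbors), local_agreement(state, neighbors, 0))
```

Because the shared edge between $i$ and $j$ keeps its alignment when both flip, and the remaining edge changes cancel whenever $\delta e_i = -\delta e_j$, the value returned by `frustration` is the same before and after any number of steps. A robot would average `local_agreement` over many such steps to obtain the time-averaged estimate described above.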
During the foraging task, local sensing of task progress enables the state of $P_A$ to be adjusted, thus adjusting the global $Z$, which in turn would influence the behavior of $P_B$. While bucket brigading, each robot kept track of the time between puck discoveries, a noisy local measurement of puck density. For homogeneous foraging, interference was estimated directly by measuring progress, simply as the measure of odometric distance traveled for a short time. Both of these decreased the local value of $P_A$ on each robot. Each time a puck was dropped over the home region the local state for $P_A$ was increased. The system achieves a steady-state between $P_A$ increments (through interference, obstacle avoidance) and $P_A$ decrements (from pucks homed). Different puck densities have different steady-states, and a mismatch of puck density and strategy is sufficient to drive the phase transition of $P_B$. In this way, the robot swarm uses a homeostasis-like mechanism to adjust the ergodic processes locally, and the ergodic processes do the collective computation.

5.1.3 Evaluation and discussion

We used simulation methods in order to explore and validate the controller. Table 5.1 (on page 116) details the class of robots used for the experiments using the taxonomic axes described in Section 3.3. We employed our own custom simulator specifically designed for large-scale multi-robot systems in order to evaluate and validate the preceding controller; see Chapter 6 for details. The simulated robots have physical dimensions 35 cm × 10 cm; pucks have radius 3 cm. All experiments were conducted within a square arena free from obstacles. The home region was a quarter-disk with radius 3 m positioned in the North-East corner of the arena.

Each robot was simulated as having a forward pointing puck scoop that holds at most one puck. The robot was controlled through a velocity control interface with the controller updating commanded linear ($\dot{r}_c$ in meters per second) and angular ($\dot{\theta}_c$ in degrees per second) velocities.
Positions were asynchronously propagated in a randomized order, with the probability of each robot being updated given by $P_{\text{update}} = 0.8$. Velocity commands, local communication and range sensors were all corrupted by noise, described in detail in Section 6.3. The sensing and communication capabilities are only outlined below.

Robots were provided with a distance sensor with 12 radial rays which returns noisy values for distances to obstacles and robots (but not pucks) up to a maximum distance of 0.5 m. The sensor did not distinguish between distance readings from obstacles and those from nearby robots. Robots were equipped with a compass that gives four bits of information and has added noise $\mathcal{N}(0, 15.0^2)$ and added bias $\mathcal{N}(0, 2.0^2)$ selected once for each robot at initialization. Each robot had a single-bit sensor that detects passing into the home region. False negatives were returned with probability 0.15 and false positives with probability 0.08. Finally, each of the robots had a binary sensor to detect the presence of a puck within the scoop, but which gives false negatives and false positives with probability $P_{\text{scoop}} = 0.05$. Robots had no ability to detect puck presence or type at a distance. The robots using the bucket brigading strategy used their odometric readings to maintain their own independent regions of the environment. They had no localization information.

The simulator modeled local-broadcast communications. Messages were delivered with probability modeled on the Rene mote model described in Section 6.3.2. Local point-to-point communication allowed a robot to specify a recipient. This was provided by performing a local broadcast, but with non-matching recipients automatically discarding messages. The sender is notified of a successful transmission, as is implemented in protocols with sequence number acknowledgments.

We initialized $P_B$ within a random state so that $m \approx 0$, and $P_A$ with $Z = 500000$. We present a scenario that shows the effective transition of foraging strategies. We used $n = 250$ in a 25 m × 25 m arena.
Robots began in the bucket brigading strategy. We altered the number of pucks from high density (2000) to low density (500). Figure 5.8 shows the result: the system undergoes a phase transition to the homogeneous strategy. Figure 5.9 shows the change in foraging performance due to the strategy switch. After the transition, there is a significant increase in performance (highlighted with a green dashed line). However, this is not sustained. In fact, at $t \geq 1450$ the projection of the original bucket brigading strategy seems to outperform the homogeneous foraging. This is misleading because the bucket brigading was foraging with a higher puck density. In terms of Figure 5.1, this is like comparing the performance of the bucket brigading with 2000 pucks (on the right) to the homogeneous strategy with 500 pucks (on the left). It is interesting to note that the performance gain after switching appears similar to the initial performance in Figure 5.1; that is, an initial boost in foraging can be gained because the robots are homogeneously spread throughout the environment, and only after they have all collected pucks and moved toward the home region does performance-limiting interference occur. (This led us to hypothesize about the possibility of mixed strategies, which we further investigated; see Section 6.4.)

Figure 5.8: Plot of the proportion of robots using the bucket brigading and homogeneous foraging strategies. (Axes: time in seconds, 0 to 1500; proportion of robots bucket brigading, 0.0 to 1.0.)

Figure 5.9: Plot of the number of pucks foraged over time. (Axes: time in seconds, 0 to 1500; number of foraged pucks, 0 to 200.)

Figure 5.8 shows that communication delays mean that the phase transition cannot occur instantaneously. It does occur relatively quickly, as we show in the next problem domain, where task adaptation is involved.

5.1.3.1 Methodological insights

The performance of the two foraging strategies, summarized in Figure 5.1, depends on several decisions made in the design of the system.
Most critical was probably the fact that the robots were unable to sense pucks at a distance. The only way for them to find the pucks was to move around the environment until the movement of the robot caused it to scoop up a puck. In the simulations above, this meant that bucket brigading only became feasible at relatively high densities. At low densities a robot would, after all the searching that was necessary in order to discover a puck, pass the puck to a "blind" neighbor who was very unlikely to find it again. At higher densities, the passed pucks serve the purpose of maintaining a steady flow of pucks to preserve sufficient local density, rather than the idea of handing pucks between robots as suggested by the colorful name. Sensing pucks at a distance would critically change the densities at which bucket brigading would outperform homogeneous foraging. Of course, having such range sensing would also have altered the homogeneous performance.

Despite the reliance on the underlying mechanics, we believe that a change in the sensing, or other similar change, would preserve the same general qualitative features. Homogeneous foraging would still outperform bucket brigading at low densities, with some density at which there is a performance cross-over. This belief is supported by the fact that various studies of interference in foraging with different robots, even different analysis techniques, all result in the same general trend. These types of research results are similar to the laws of qualitative structure of Newell and Simon (1976).

Our philosophy for constructing coordinated systems reflects an emphasis of the qualitative over the quantitative. We seek general methods for producing functionality at the collective level. In the preceding processes, we care more that the processes produce a phase transition than exactly where in the phase-space this occurs.
Even the method of predicting broad changes based on changes in the phase-space, that is, the space in which processes evolve, rather than the underlying dynamics, reflects this idea. After all, the roots of this idea predate Gibbs and Boltzmann, going back to Poincaré (1892) with his questioning of the dominant opinion of the time that progress in celestial mechanics simply meant better ways of carrying out the temporal integration. Our experiences with physical robot systems suggest that some low-level parameter tuning is always necessary: to make progress on formalisms for robotics, one must be realistic about the detail that can be placed on paper to describe such systems. Our outlook is positive, because it is our belief that only qualitative descriptions are necessary. Furthermore, it is our thesis that such broad qualitative descriptions at the macroscopic level suffice for use during synthesis as well as for control.

5.2 Division of labor

In contrast to the previous domain, in which the swarm was expected to collectively transition entirely from one behavior to the next, we consider an extension to multi-robot foraging in which the robots organize their task assignments so as to reflect a continuous range of mixed strategies.

5.2.1 Motivation and problem definition

Next we consider the division of labor variation of multi-robot foraging, in which two varieties of puck exist; call them red and green. We consider the case in which each robot may forage only one variety of puck at a time. The problem involves switching robots from one variety to another in order to emulate the fractional distribution of pucks within the environment. Various authors have considered the problem, sometimes calling it division of labor (Jones and Matarić 2003), proportion regulation (Suseki et al. 2005), and task allocation (Lerman et al. 2006).

The task typically has associated with it some cost for the robot switching from foraging one puck type to another. Thus, there is an emphasis on reducing the number of spurious transitions.
We consider this as the problem of having task transitions occur at a different timescale to the basic observations. Robots may make observations of pucks frequently, but information at different times should be integrated in the final decision. Information from other robots may be used in order to make more informed decisions. Robots may include information about pucks that other robots have seen, or about the choice of foraging strategy currently being employed by other robots. The spatial distribution of the pucks can also be important. For example, if pucks of a particular variety occur close to one another, then information from more distant robots may be worth less. On the other hand, without a priori information it may be best to assume all pucks are independently placed, in which case the observations of far-off robots can be useful to smooth local inhomogeneities.

Similar to strategy selection, we can cast the problem of division of labor into an allocation problem, in order to place it within the multi-robot task allocation taxonomy of Gerkey and Matarić (2004a). This task can be considered an ST-SR-IA instance.

(ST) Single-task Robots: Robots are permitted to collect only one type of puck.
(SR) Single-robot Tasks: The collection of pucks of one color or another does not require other robots, but the overall utility is dependent on matching the environmental distribution.
(IA) Instantaneous Assignment: The allocation responds to immediate environmental conditions.

When cast as a task allocation problem this way, the division of labor problem suffers from the interrelated utilities problem (Gerkey and Matarić 2004a). Essentially this means that the utility of the allocation of a particular individual depends on the task assignments of other robots. Division of labor is unique because the overall utility is not dependent on other robots in an arbitrary way, but instead only depends on the number of robots allocated to particular tasks.
5.2.2 Solution

As before, we construct a controller by building a low-level controller with traditional behaviors like obstacle avoidance, searching, etc. Two processes are layered above these behaviors in order to provide each robot with a local strategy for choosing the variety of puck to forage. The two processes are both instances of a type already defined and analyzed. Although each of these is ultimately defined in terms of the local rules, they are sufficient to guide foraging and to ensure that the proportion of pucks within the environment alters the multi-robot system's collective division of labor.

(a) The division of labor controller consists of execution of two Process B type processes $P_r$ and $P_g$.
(b) Observation of red or green pucks changes the process phase space.

Figure 5.10: An aggregate controller is constructed to perform division of labor.

The two processes used are:

$$P_r(n, Z_r) \triangleq \langle S_A(n, Z_r), \Phi_A \rangle \quad \text{and} \quad P_g(m, Z_g) \triangleq \langle S_A(n, Z_g), \Phi_A \rangle,$$

where the state space and dynamics definitions follow from the original definitions for Process A in Equation 5.1 and Equation 5.2.

Each robot runs the local rules associated with the processes. The dynamics $\Phi_A$ explore their respective state spaces with expected state calculable in terms of $n$, $Z_r$ and $Z_g$, producing trajectories $\vec{s}_r(t)$ and $\vec{s}_g(t)$ for processes $P_r$ and $P_g$ respectively. The system is initialized with $Z_r = 10^5$, $Z_g = 10^5$. These $Z_r$ and $Z_g$ values are adjusted on each puck observation, when the robot alters the local state of the processes, $s_r(t)$ and $s_g(t)$. Observation of a red puck results in the following update:

$$s_r(t + \delta t) = \gamma s_r(t) + s_g(t), \tag{5.6a}$$
$$s_g(t + \delta t) = (1 - \gamma) s_g(t), \tag{5.6b}$$

where $\gamma$ is a tunable parameter that weights the importance of an observation. When a green puck is observed, analogous updates transfer state from $P_r$ to $P_g$.

Figure 5.10 shows the controller constructed from these two processes, and how observations of pucks decrease the volume.
At some (low) frequency, each robot independently decides which type of puck it will forage using $s_r(t)$ and $s_g(t)$, the local states of the two processes. Green pucks are chosen to be foraged with the probability given by:

$$\mathrm{pr}_{\text{green-puck}} = \frac{s_r(t)}{s_r(t) + s_g(t)}.$$

The transitions after each observation simply skew the probability by a factor of $\gamma$. Robots randomly encounter either red or green pucks, making observations of each type in proportion with the puck distribution. These observations are smoothed by the dynamics of the two processes. Low-probability observations (e.g., observing ten pucks of the minority type) are averaged out over the entire group of robots.

Figure 5.11: Performance of task allocation processes. The vertical axis gives the proportion of tasks (for the broken line), and the division of robots among tasks (the solid lines). Plots show mean and standard deviation for 5 runs. (Axes: time in seconds, 0 to 20000; proportion of red pucks, 0.0 to 1.0. Legend: Environment; Robots [$\gamma = 0.25$]; Robots [$\gamma = 0.5$]; Robots [$\gamma = 0.75$].)

5.2.3 Evaluation and discussion

We follow the example of Jones and Matarić (2003) by conducting experiments in a simulated environment in which we introduce radical changes to the underlying puck distribution. We measure the effectiveness of the allocation by comparing the distribution of pucks within the environment with the proportion of the robots foraging each type.

We simulated 100 robots within a 64 m × 64 m arena. Initially 3000 pucks were randomly scattered throughout the arena. Pucks started in an initial 50%/50% distribution. The robots had identical capabilities to those in the strategy selection experiments, but were additionally endowed with the ability to distinguish red pucks from green ones, with misclassifications occurring with probability $P_{\text{mis}} = 0.05$. Robots explored from random initial locations. After stumbling on an appropriate puck, the robot would transport it to the home region. The puck density was maintained throughout because for each puck successfully foraged, a new one, of the same type, was introduced at a random location.
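The comparison between the environmental puck distribution and the robots' allocation can be made concrete with a small sketch. The text does not state the exact measure used, so the function `allocation_error` below, an absolute difference of red fractions, is an assumed, hypothetical metric chosen only for illustration.

```python
def red_fraction(red, green):
    """Fraction of a two-colour population that is red."""
    total = red + green
    if total <= 0:
        raise ValueError("population must be non-empty")
    return red / total


def allocation_error(pucks_red, pucks_green, robots_red, robots_green):
    """Absolute mismatch between the red puck fraction in the environment and
    the fraction of robots currently foraging red pucks.

    Hypothetical metric: 0.0 means the allocation mirrors the distribution.
    """
    environment = red_fraction(pucks_red, pucks_green)
    robots = red_fraction(robots_red, robots_green)
    return abs(environment - robots)
```

For instance, with 3000 pucks split 95%/5% and 100 robots still split evenly, this measure gives a mismatch of roughly 0.45, which shrinks toward zero as the robots reallocate.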
The puck distribution was altered at three stages: at $t = 2000$ it was changed to 95%/5%, at $t = 8000$ to 5%/95%, and at $t = 13000$ to 75%/25%. In Figure 5.11, the dotted line shows the puck distribution. The plot shows experimental runs with three different settings for the $\gamma$ parameter. The system shows hysteresis and a response time dependent on $\gamma$. In all cases, however, the system adapts so as to find a distribution applicable for the environmental conditions.

It is worth noting the difference in response times between the phase transition in Figure 5.8 and Figure 5.11, the former taking an order of magnitude less time to transition.

The two foraging domains were carefully chosen to be complementary. The decentralized division of labor domain required estimation of a continuous quantity. It is typical of the calculations and optimization that might be performed by an implicitly coordinated swarm system. Small local estimates can be used to make a decision, and that can be shared with other agents easily. This notion seems similar to the Downhill Principle (Vergis et al. 1986), but in a distributed sense.

On the other hand, explicitly coordinated systems often solve discrete problems with hard constraints. For such problems gradient techniques are far less useful. The collective strategy selection domain was intended to demonstrate that discrete notions can also be feasibly tackled with the ergodic process approach. Since communication times are not zero, the system does take time to transition between states.

5.3 Additional examples

In addition to the work described above, we used a controller constructed from the two processes, Process A and Process B, in order to achieve sub-task sequencing. We consider this in an actuated sensor network scenario. A second additional example addresses distributed time-stamping in a sensor network by exploring ergodicity and statistical ensemble-based techniques distinct from the synthesis methodology.
This serves to show the applicability of the general analysis tools of the research, rather than demonstrating applicability of the toolbox itself.

5.3.1 Sequencing

We have shown that coupled ergodic processes can be used to achieve a basic consensus operation. Here we consider another task which makes use of the same collective decision making processes. This work supplements the strategy selection example.

5.3.1.1 Problem definition

Without inherent temporal structure, ergodic processes appear inadequate for robot controller design. Restricting design to ergodic processes certainly represents a significant shift in perspective, because typically programming involves a decomposition of problems into a sequence of steps. Such sequencing is also of direct importance in robotic applications. Jennings and Kirkwood-Watts (1998), in their method of dynamic teams for multi-robot coordination, introduced the following definition:

"A set of agents perform action B synchronously in the sequential task {A; B; C} if
(i) no agent starts B until all agents have finished A and
(ii) no agent starts C until all agents have completed B."

Figure 5.12: Screen-shots of a simulation with 441 robots running for ∼520 seconds.

We consider a sequential inspection scenario in which several sites must be visited in order to be sensed by a robot swarm. An ergodic controller can impose temporal ordering and, what is particularly interesting, this is done at a strictly macroscopic level.

5.3.1.2 Evaluation and discussion

Experiments were performed in a 50 m × 50 m arena with two disks of radius 3 m placed at the North-East (NE) and South-West (SW) corners, 2 m from the sides. The disks represent the sites of interest and can be sensed by robots that are positioned over them. Without localization information and equipped with a noisy 4-bit compass, many robots reach the arena corner having missed or failed to sense the disk. Independent sensing by the many robots within the swarm lessens the effects of the position and sensor uncertainty.
Robots were initially placed in the arena center and were tasked with visiting first the NE site, then the SW site. The critical issue is in synchronization of the decision to advance from one site to the next.

Synchronization is achieved by coupling Processes A and B together in a fashion similar to that used for foraging strategy selection. Those aspects used to relate the processes to the task differ in the case of sequential sensing. On each robot the low-level controller (for obstacle avoidance, navigation, sensor processing, etc.) uses a variable to track task state. The value of this variable affects the interpretation of the compass readings and steering.

Figure 5.13: The number of neighboring robots within communication range for 441 robots flocking North-East then South-West. The average over all robots plus/minus one standard deviation. There is a high density of robots; the communication is characterized by a falloff that means good connectivity can be had within a radius of 2 m, and mixed performance can be had until 4 m. See Section 6.3.2 for details. (Axes: time in seconds, 0 to 500; number of neighbors, 0 to 80.)

The low-level controller is coupled to the two synchronization processes in two ways:

1. Input through gradual perturbation of Process A's state space $\vec{S}$. A robot detecting that it is over a site will increase the value of its Process A state $s_A(t)$ (by 200 units). Observation of thrashing or lack of progress (by measuring odometry movement of less than 0.25 m in 5 s) results in Process A's state being reset to zero.

2. Output produced by monitoring average values of Process B's slow-changing state variables. Values are averaged over a 12 second window. When the value is within a threshold of zero (we used 0.02), a flag was set indicating that the task-state variable would soon change. When the mean approaches either −1 or 1 (we used 0.98), the task-state variable is altered to show the beginning of the next task (on −1) or return to the previous task (on 1).
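The two coupling rules enumerated above can be sketched as follows. This is a hedged illustration, not the dissertation's controller: the class `SequencingCoupler` and its method names are hypothetical, the numeric constants are the ones quoted in the rules, and the sign convention (next task on −1, previous task on +1) follows the enumeration as stated. Two details are assumptions of this sketch: a stall takes precedence over a site detection, and a task change is only acted on after the near-zero flag has been raised, so that a mean lingering near ±1 does not advance the task repeatedly.

```python
from dataclasses import dataclass

SITE_INCREMENT = 200.0    # units added to the Process A state over a site
STALL_DISTANCE_M = 0.25   # less odometric travel than this counts as no progress
NEAR_ZERO = 0.02          # |mean| within this of zero flags a pending change
NEAR_ONE = 0.98           # |mean| beyond this commits the change


@dataclass
class SequencingCoupler:
    process_a_state: float = 0.0
    task_index: int = 0
    transition_pending: bool = False

    def sense(self, over_site: bool, odometry_m_last_5s: float) -> None:
        """Rule 1 (input): perturb the local Process A state."""
        if odometry_m_last_5s < STALL_DISTANCE_M:
            # Thrashing or lack of progress: reset (precedence is an assumption).
            self.process_a_state = 0.0
        elif over_site:
            self.process_a_state += SITE_INCREMENT

    def act(self, mean_b: float) -> None:
        """Rule 2 (output): mean_b is the 12 s window average of Process B."""
        if abs(mean_b) <= NEAR_ZERO:
            self.transition_pending = True
        elif self.transition_pending and mean_b <= -NEAR_ONE:
            self.task_index += 1                           # begin the next task
            self.transition_pending = False
        elif self.transition_pending and mean_b >= NEAR_ONE:
            self.task_index = max(0, self.task_index - 1)  # previous task
            self.transition_pending = False
```

The window averaging itself is left to the caller here; only the thresholding of the averaged value is shown.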
Figure 5.14: Plot of the synchronization processes' internal state for the first 400 seconds of a run with 441 robots. The transition of $M$ values is clearly visible. (Axes: time in seconds, 0 to 400; value of $M$, −1 to 1, for Trials 1 to 3; total $Z$, 0 to 20000.)

Consider the scenario in Figure 5.12: the robots began in the center of the arena, moved to the first inspection site, then moved to the second. Figure 5.13 gives connectivity information for the same run. The increasing density at each inspection location had a marked effect on the number of neighbors each robot had. Figure 5.14 gives a plot of $Z$ and the average $M$ value for three experimental runs. The figure shows Process B's state being switched from −1 to +1 throughout the system. These three runs show similar behavior because the symmetry was broken in identical ways for each case. Of 10 total runs, 4 cases transitioned back to −1 and the robots stayed at the NE site. This is expected, as symmetry breaking occurred in an unbiased fashion; adding a bias would stop such cases from having to undergo the phase transition multiple times in order to reach the decision to explore the next site.

Figure 5.13 suggests that a smaller communication disk may suffice for the sequential inspection in the environment we considered. An alternative approach (perhaps without ergodic processes) could use the number of robots within communication range in order to switch behavior directly. An interesting question is whether such a switch would occur as abruptly as in our approach.

5.3.1.3 Comparative discussion

It is worth noting an important difference between the application of the symmetry breaking phase transition of $P_B$ to foraging strategy selection and to sensing task sequencing. Foraging involves the robots exploring the environment; the robots move in and out of communication range of one another, and one may reasonably expect temporarily disconnected parts of the communication graph.
In contrast, the sensing task in which we evaluated the sequencing mechanism has the robots move in relatively regular formation, as can be clearly seen in Figure 5.12.

Recall that the analysis of $P_B$ considers a fixed $e(\vec{s}(t))$ characterization. But $e$ is dependent on the number of matched and mismatched state variables between robots. As the communication graph changes, the values for $e$ change. In the case of foraging, there are two possibilities. The first, perhaps ideal, situation has robots roaming freely about the environment; while $e$ may be spontaneously created, it may also be destroyed. If the robots are wandering throughout the environment then there is little reason to believe that, on the whole, these two aspects will not cancel one another out.¹ The second circumstance is that the task or environment causes some structure in the communication graph. Robot interference near the home region might be an example. This second, structured circumstance is similar to the well-behaved communication graph in the sequencing task although, in this case, there are peaks at the times when robots are sensing in the corners of the arena.

If there are changes in the communications graph, $e$ may be altered. We have already said that changes in the network, that is, the $A_k$ sets, are assumed to be slow compared with equilibration of $P_B$. When this assumption does not hold, $P_A$ and $P_B$ may fail to behave as predicted by their equilibrium characterizations. In practice, this means that the frequencies at which the processes can be executed depend on how well behaved the communication graph is. Comparing Figure 5.8 with Figure 5.14 shows a phase transition time of 400 seconds versus 100 seconds. The more rigid structure of the communications in the sequencing task meant that the processes equilibrated faster. This means that the constraints that change $\vec{S}$ on the basis of task sensing could be run faster, ultimately yielding a faster phase transition.
5.3.2 Distributed time-stamping

While sequencing represents an example of high-level temporal structure, those state transitions occur at a timescale slow compared with that of the robot interactions. Next, we consider a wireless sensor network task which requires explicit consideration of timing and which, consequently, poses a challenge for the synthesis methodology developed thus far. We propose a solution by considering the problem more broadly and applying statistical techniques similar in flavor to those used in the development of the synthesis methodology.

¹ It is interesting to note that if we know the value for $M$, then the entropy characterization in Figure 5.6(a) gives a way of calculating the expected difference in $e$ as robots discover (or lose) neighbors.

5.3.2.1 Problem definition

Wireless sensor networks allow for the long-term monitoring of distributed spatio-temporal phenomena. A precise idea of time within such networks can be extremely useful: integrating multiple disparate observations of the same event, or understanding causation between different events, is difficult without a reliable and shared notion of time. Unfortunately the clocks available on such low-cost devices are often poor and will diverge from one another after a short time. Krishnamachari (2005, Chapter 4) describes several approaches to time synchronization in wireless sensor networks; we consider a problem of data synchronization similar to that used by Xu et al. (2004). Sensors capture and record data over a long period of time. The data are stored locally with readings from a local clock. Later, when data are aggregated or when queries are run on the network itself in a data-centric manner (Krishnamachari 2005, Chapter 9), each node retrieves the recorded sensor measurements and reports a time.

We are concerned with the problem of having each wireless node learn about the properties of its own local clock in order to correct previously recorded local times.
By doing so the node may return time-stamped data in which the time better reflects the global time than the node's own recorded local clock time-stamp. Next we introduce a distributed protocol for achieving this.

5.3.2.2 Solution

We follow the linear model of clock non-ideality presented by Krishnamachari (2005). Clock $i$ has a clock offset $\alpha_i$ and drift $\beta_i$ so that $C_i(t) = \alpha_i + \beta_i t$. We use a token passing mechanism, similar to Process B described above. However, each node holds the token for some amount of time that is fixed within the local time reference frame. Each will attempt to hold the token for 5 seconds, say. Those nodes with $\beta > 1$ will underestimate the true time, while those with $\beta < 1$ will do the reverse. As the token is passed around the network, sometimes it underestimates, sometimes it overestimates, but given a suitable timescale to explore the entire network, it produces an average estimate. When each node receives the token, it is receiving an ensemble averaged clock and, provided that $E[\beta_i] = 1$, this average will converge to the true time. Absolute time accuracy may or may not be important depending on the application.

Figure 5.15: An overview of the time-stamp synchronization approach. Blue lines represent the behavior of non-ideal clocks; the thick red regions show when a node is holding the token for what it believes to be a fixed period of time. The dotted red line shows the effect of the ensemble average clock. The green bars, reflecting the durations that the token is held for, increasingly approach the true time.

Figure 5.15 gives a sketch of the approach. The different blue lines represent different non-ideal clocks. The red bars show regions in which the sensor node is holding the token for what it believes to be a fixed duration. The dotted red line shows how the averaged clock estimates time. Note that the transmission time must be factored into this calculation. The token times cannot be added directly because packet delay times must be included.
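The token mechanism described above can be illustrated with a minimal sketch based on the linear clock model $C_i(t) = \alpha_i + \beta_i t$. The function names `local_clock` and `pass_token` are hypothetical, and the treatment of packet delay is an assumption of this sketch: each hop is given a known, deterministic delay `hop_delay` (in true seconds) that the token accounts for exactly. Under the linear model, a node that holds the token for `hold_local` seconds of its own clock holds it for `hold_local / beta` seconds of true time.

```python
from typing import Iterable, Tuple


def local_clock(alpha: float, beta: float, t: float) -> float:
    """Linear clock model from the text: C_i(t) = alpha_i + beta_i * t."""
    return alpha + beta * t


def pass_token(betas: Iterable[float], hold_local: float = 5.0,
               hop_delay: float = 0.0) -> Tuple[float, float]:
    """Pass a token through nodes with the given drifts, in order.

    Returns (token_time, true_time): the time the token has accumulated,
    each node crediting hold_local seconds, and the true time elapsed.
    """
    token_time = 0.0
    true_time = 0.0
    for beta in betas:
        if beta <= 0.0:
            raise ValueError("clock drift must be positive")
        token_time += hold_local + hop_delay
        # A local interval of hold_local corresponds to hold_local / beta
        # seconds of true time under the linear model.
        true_time += hold_local / beta + hop_delay
    return token_time, true_time
```

Calling `pass_token` with a long sequence of drifts scattered closely around 1 lets one compare the accumulated token time against the true elapsed time for a chosen visiting order.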
Ganeriwal et al. (2003) and Maróti et al. (2004) analyze the packet delays, decomposing them into sender, transmission, and receiver delays. By employing MAC time-stamping procedures described in that work, the total time can be considered composed of a deterministic component and a relatively small variable component.² The head-to-tail integration of the propagated token, as shown above, requires that the node know this deterministic component. The vertical green bars to the left of the figure are the same green bars shown along the bottom of the true time axis, stacked together to show how their mean approximates the same true time duration (the vertical black bar).

² We took the actual variance to be the worse estimate of Ganeriwal et al. (2003) and Maróti et al. (2004), i.e., ∼5 μs.

Returning to Figure 5.15, suppose the token is returned to the middle clock at the time marked B. It computes a comparison between local and average clocks at the time marked A, with those at time B. If prompted to report the time of any data recorded within that interval, it can simply use the local time readings and the averaged times to interpolate a value. This amounts to constructing an estimate of its own $\beta$ value.

5.3.2.3 Evaluation and discussion

Figure 5.16: The simulated wireless sensor network used for the distributed time-stamping problem. (Layout of nodes 1 to 20; horizontal axis −20 to 20, vertical axis −10 to 10.)

Figure 5.16 shows the simulated sensor network used to evaluate the approach. The darkness of the connecting arrows reflects the packet reception rate (this is simulated using the model described in Section 6.3.2). Figure 5.17 shows the resulting clock behavior for each of the clocks. Notice that node 18 never receives the token, as a consequence of its low packet reception rate. It has no information other than its local clock. Notice that the ensemble clock, after executing for above ∼25000 s, is (on average) 1.28 seconds slow. This constitutes about five parts in a million. More significant is that across nodes, the averaged clocks are extremely close, having a variance of 0.4 μs (node 18 excluded).

Figure 5.17: Results for the simulated wireless sensor network used for the distributed time-stamping problem. The node number appears in the upper-left corner of each plot. For each node, we plot error in time estimate of the ensemble averaged clock as received as a token at the node (red) and the local clock (blue) versus time. Both axes are in units of seconds. (Twenty panels, one per node; horizontal axis: true time, 0 to $2 \times 10^5$; vertical axis: error, −5 to 5.)

Figure 5.18: Plot of the variance in the simulated clocks versus time. Node 18 is excluded in these calculations. The linear clocks show the expected quadratic increase, but the token averaged clock is a significant improvement, having a maximum variance of $4.5266 \times 10^{-7}$. (Axes: true time, 0 to $2.5 \times 10^5$; clock variance, 0 to 30. Legend: local clocks; token clocks.)

Suppose that the communication is reliable in the sense that the token is never lost (i.e., should a neighboring node fail to receive the packet containing the token, the sender is always notified, rather than assuming delivery with probability 1). Provided all nodes are able to communicate (not partitioned, as with node 18), the system is trivially modeled as an irreducible Markov chain over estimates of the $\beta$ parameter.
When failures are detected so that the token remains at the sending node, then the Markov chain is aperiodic, having an asymptotic stationary distribution which represents a mean value of $\beta$. Once the transition probabilities are known, one can use a variance estimator (like the one described by Horvitz and Thompson (1952)) to estimate how large, for that given communications graph, the deviation from true time is expected to be in the limit of long timescales.

In practice, occasional packet losses in our simulated communication mean that a node will fail to receive the token or fail to confirm its acceptance. In order to mitigate such circumstances, tokens are periodically reinitialized. This removes the notion of independent, albeit weighted, observations of $\beta$ values and invalidates the sort of analysis sketched above.

If more information is known about the distribution of $\beta$ values and the distribution of the transmission time non-determinism, then other useful properties can be derived. For example, if these are known to be normally distributed, then it follows that the variance in estimates of $\beta$ after long times scales with $\frac{1}{n}$, where $n$ is the number of clocks in the system. While Krishnamachari (2005) gives bounds on the $\beta$ values for Mica mote clocks, we are unaware of any published data on the distribution of these values.

In the multi-robot foraging tasks, processes could use parameters that varied because they were coupled by observations to the environment. Changes in these parameters resulted in particular behavior because of the increase in entropy inherent to the processes. In contrast, the time (or data) synchronization problem arises because clocks have their own dynamics, which are noisy. This can almost be considered a form of entropy production.

5.4 Capabilities and limitations

The application of symmetry breaking toward task sequencing, or more generally synchronization, allows for some understanding of the capabilities of robots with ergodic processes. Following Jones (2005), we describe the system in terms of the tasks that they can perform.
The consensus mechanism is used during sequencing to enable the swarm of robots to decide whether or not to transition to the next state. This same mechanism could be applied to make decisions in more structured tasks. If the task can be decomposed into sub-tasks with a finite state automaton description, then the same consensus mechanism can be used to ensure that the swarm of robots moves from state to state together. Each robot in the swarm has the finite state automaton description and each knows the starting state. Whenever the requirements of the task are met, the swarm collectively reaches consensus to update the robots' internal states. In other words, the consensus mechanism allows for a collective representation of finite state automaton tasks.

Since the notion of sub-tasks is general, in principle it could specify nearly arbitrary temporal constraints among robots. The finite state representation may result in over-specification, with tasks that could be done in parallel forced to be performed serially. The point is that such a representation could be supported by ergodic properties. An important consideration is which observations of the world state are necessary to make the decision to transition to the next state, and when such observations suffice to disambiguate the state to decide the next course of action. This was exactly a question addressed in Jones (2005), and the same tasks achievable by his systems (with memory or communication) can be theoretically realized. (In fact, if we ignore the question of state disambiguation, a general finite state automaton is more general than the model he considers.)

On the other hand, the implementations necessary to carry out complex tasks require many different time scales in order to allow the processes to reach equilibrium between changes. This is a limitation of the method that we have identified. In a sense, the number of time scales places a limit on the "depth" of the computation (and the state machines that can be considered). Ultimately, for a given task there are expected input and output reaction times.
If many processes are required, they must be executed at high frequency, at increasing computation and communications cost.

The assumption of perfect relaxation of the ergodic dynamics is just an approximation. The internal relaxation times are bounded by technology; experience with the processes has shown that if two processes are coupled insufficiently slowly, the equilibrium predictions break down. For example, if $P_A$ is coupled with $P_B$, and the $Z$ values are decreased too quickly, then $P_B$ may occupy a meta-stable state. Essentially the robots are split into "domains" with different states. In other words, global consensus can fail to occur.

An important question for a particular controller with ergodic processes is how to quantify, or at least describe, the information processing performed by the non-ergodic couplings. For example, in the division of labor task, the two ergodic processes integrate the observations from across the robot swarm so that inhomogeneities are smoothed. But each observation uses a coupling that is non-ergodic. (Although it may not necessarily require this.) The question remains as to what is appropriate for ergodic processes and what is not.

5.5 Summary

This chapter described problem domains that require collective coordination and developed controllers based on ergodic processes to perform this coordination. The ergodic processes were formally specified and analyzed. Controllers were constructed by coupling processes, and in each case an explanation for their expected behavior was given on the basis of the analysis. Large-scale microscopic simulations were used to validate the controllers, showing that coupled ergodic processes are capable of performing the distributed computation necessary for such coordinated decision making in both cases. We showed that the symmetry breaking provided by processes in the toolbox is a relatively general coordination primitive by describing its use in a sequencing task which, when generalized, gives an indication of the capabilities of the processes.
Chapter 6

Large-scale simulation

This chapter describes a simulator that was developed to allow efficient high-fidelity simulation of large multi-robot systems.

In order to validate the synthesized controllers for large swarms of robots, we developed a custom simulation environment. In this chapter, we first position the newly developed simulation tool within the space of existing simulators and describe the underlying data structures that allow for efficient simulation; second, we give full details of the noise models used; third, we present example results from experiments that would have been difficult to produce on any other simulator.

6.1 Related simulators

Many simulators exist as the result of a variety of research niches with different assumptions and goals. Figure 6.1 shows a number of popular simulators on a qualitative continuum representing physical realism (as perceived by the robot). Also included is an estimate of the number of robots that can be simulated by each of these systems (relative to one another) in a single hypothetical period. At one extreme fall those simulators that aim for the highest fidelity possible, where simulation steps include approximation of Newtonian dynamics within the world. Gazebo (Koenig and Howard 2004) and Webots (Michel 2004) have these capabilities while providing vision, range, and other sensors. Both also have limits on the number of robots that can be simulated in a timeframe suitable for an implement-test-debug cycle. Most often the performance degrades gradually, rather than there being a hard maximum number of robots. We estimate that these simulators are best suited to systems on the order of ten robots.

Figure 6.1: A schematic representation of the environmental realism offered by other simulators and their intended system size. In order of increasing physical realism, the strata are: cellular automata; NetLogo (Wilensky 1999) and Swarm (Minar et al. 1999); the developed simulator; Stage (Gerkey et al. 2003) and TeamBots (Balch 1998); Gazebo (Koenig and Howard 2004) and Webots (Michel 2004). The second axis gives the number of robots/agents, marked $O(10^6)$, $O(10^4)$, $O(10^2)$ and $O(10)$.
The next grouping represents the class of simulators such as Stage (Gerkey et al. 2003) and TeamBots (Balch 1998). The world is typically a planar environment with a bitmap overlay providing obstacles, and the "dynamics" involve constant velocities until robots update velocity values, or something similar. Typical range sensors and others may be provided.

The two leftmost strata represent tools typically considered too simplistic to capture the complexities necessary for simulation of physical robots. NetLogo (Wilensky 1999) and Swarm (Minar et al. 1999) are general purpose multi-agent modeling tools which, when used to model multi-robot systems, typically result in block-world simulations that introduce undesirable side-effects.

As suggested by Figure 6.1, the methods described in this chapter do not offer quite the same realism as the second-tier simulators, because the range sensor readings are approximated as described in Section 6.2.2. This trade-off permits larger systems of robots to be simulated at comparable run-time frequencies.

6.2 Data structures

Calculating the sensory readings for a robot equipped with an active range sensor, such as a sonar ring or sweeping laser, typically involves a ray-casting operation. With sufficient robots, even an efficient implementation of the operation can dominate the simulator's processing time. Typical simplifications involve decreasing sensor resolution by casting fewer rays, but this can result in scenarios where obstacles or other robots remain unsensed in the simulator that would have been perceived by physical robots.

We propose a range sensor for the simulated robots that has, in effect, a dynamic resolution and ensures that each of the robots has sufficient information about nearby objects to convey the surrounding free space. An efficient implementation of such a sensor makes those elements to be sensed (i.e., robots and environmental features) rapidly accessible from the spatial query point that represents the sensor.
Tree decompositions of the plane (Samet 1984) are efficient for queries of nearest points, while Delaunay triangulations (Guibas and Stolfi 1985) (and their dual, Voronoi diagrams) are useful for finding the nearest points in given directions. We employ a Delaunay triangulation taken over the plane in which robots and discretized edges of obstacles form vertices.

6.2.1 Definitions

The Delaunay triangulation over a set of points in the Euclidean plane is a unique triangulation (up to cocircularity) that satisfies the Delaunay criterion, viz. no point lies within the circumcircle of any triangle in the triangulation (Guibas and Stolfi 1985). Such a triangulation can be efficiently constructed from a set of points (see the "creation" line in Table 6.1). Once constructed, the graph facilitates rapid query of the triangle containing a particular point (see the "point & vertex location" line in the same table) and provides fast neighbor access from each vertex. We maintain the distinction between points that simply exist on the plane and vertices that are nodes within the triangulation graph; a vertex has an associated point within the graph's planar embedding that represents the spatial arrangement of robots and obstacles.

The query points used for generating sensor readings are actually vertices. Each robot has a vertex associated with it. When a sensor reading is presented to the robot, the location operation need not be performed because the associated vertex is used directly. Accessing the information for a sensor involves using a vertex to access its $k$ neighbors (and hence the $k$ incident triangles); this takes only $O(k)$ time.

Figure 6.2: An arrangement of robots and an obstacle with a portion of the triangulation used by the simulator. Edges in the triangulation are shown as solid lines; circles circumscribe triangles and are guaranteed to contain no vertices if the triangulation is Delaunay. The shaded region is a star-shaped polygon produced by joining the edges incident to the central vertex, and is always free of other vertices.
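The Delaunay criterion defined above can be checked with the standard in-circle determinant. The following is a minimal sketch rather than the simulator's implementation: the function names, the tuple representation of points, and the four example points are assumptions made for illustration. The two triangulations of the same quadrilateral differ by a single edge flip, and only one of them satisfies the criterion.

```python
def in_circumcircle(a, b, c, p):
    """True if point p lies strictly inside the circle through a, b and c."""
    # Translate so that p is the origin; the test reduces to a 3x3 determinant.
    ax, ay = a[0] - p[0], a[1] - p[1]
    bx, by = b[0] - p[0], b[1] - p[1]
    cx, cy = c[0] - p[0], c[1] - p[1]
    det = ((ax * ax + ay * ay) * (bx * cy - cx * by)
           - (bx * bx + by * by) * (ax * cy - cx * ay)
           + (cx * cx + cy * cy) * (ax * by - bx * ay))
    # The sign of det depends on the orientation of (a, b, c), so normalize by it.
    orient = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return det * orient > 0


def is_delaunay(points, triangles):
    """Check the Delaunay criterion for triangles given as index triples."""
    for i, j, k in triangles:
        for m, p in enumerate(points):
            if m in (i, j, k):
                continue  # a triangle's own corners lie on its circumcircle
            if in_circumcircle(points[i], points[j], points[k], p):
                return False
    return True


# Hypothetical example: four vertices forming a quadrilateral.
points = [(0, 0), (4, 0), (2, 1), (2, -1)]
long_diagonal = [(0, 1, 2), (0, 3, 1)]    # triangles sharing edge 0-1
short_diagonal = [(0, 3, 2), (3, 1, 2)]   # triangles sharing edge 2-3
print(is_delaunay(points, long_diagonal))
print(is_delaunay(points, short_diagonal))
```

This brute-force check visits every point for every triangle, so it is only suitable for small illustrative inputs; the per-vertex $O(k)$ neighbor access described above is what makes the test local in practice.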
6.2.2 Local range sensing

Assume that the simulator has a Delaunay triangulation over the points representing robots and environmental features as described above. The key question, addressed here, is whether neighbors within the triangulation have any useful properties within the context of robot sensor simulation.

Consider the situation shown in Figure 6.2; the central robot forms the hub from which six spoke edges connect to other vertices, resulting in six triangles. The figure shows a Delaunay triangulation over vertices 0 to 6; two of the vertices are from an obstacle and the remaining five are from robots. It is worth noting that if robot $R'$ in the lower left is to be added to the shown triangulation, it cannot be connected to the central robot (vertex 0) because the circumscribing circle from the triangle formed with vertex 4 would encompass vertex 3 (and similarly the triangle with vertex 3 would contain vertex 4). The vertex at $R'$ must thus be connected to vertices 3 and 4.

The edges in the triangulation do not fit exactly the readings that would be returned by a physical sensor: in an identical scenario involving physical robots, robot $R'$ may have been sensed by the robot at vertex 0 (if the sensor had sufficient range). But as just demonstrated, robot $R'$ cannot share an edge with vertex 0. To better understand the set of vertices that could conceivably have been sensed by a robot but do not appear as neighbors within the triangulation, consider the following more abstract scenario.

Suppose $v^*$ is one such "ignored" vertex because it does not share an edge with vertex $v_c$. Let the point on the plane associated with vertex $v$ be denoted by $E(v)$.
It follows that two other vertices $v_l$ and $v_r$ exist that share an edge with $v_c$, such that the position of $v^*$ on the plane, $E(v^*)$, lies between the lines

\[ x_l(\alpha) = \alpha E(v_l) + (1-\alpha) E(v_c), \quad \alpha \in [0, \infty) \]
\[ x_r(\beta) = \beta E(v_r) + (1-\beta) E(v_c), \quad \beta \in [0, \infty) \]

and

\[ |E(v^*) - E(v_c)| \ge D, \quad \text{where} \quad D = \min\left(|E(v_l) - E(v_c)|,\ |E(v_r) - E(v_c)|\right). \]

Thus those features that are further away from the sensor are, in a sense, occluded by other closer ones. As a corollary to this fact, if a point $p = E(v_p)$ is closer to some other point $q = E(v_q)$ than all others, then an edge connects the vertices $v_p$ and $v_q$.

Because each of the triangles is itself contained within its respective circumcircle, the Delaunay property ensures that it will be free of points. This is shown as the shaded polygon in Figure 6.2 and substantiates the earlier assertion that the sensor enables dependable readings of the free space around the robot. The triangulation also ensures that robots with complex sensor areas have higher degree.

Realism dictates that the simulator should not simply pass each robot the complete list of its neighbors within the triangulation. The edges incident to a vertex do not necessarily represent plausible readings from a range sensor. A set of readings can be constructed with decreasing accuracy but increasing efficiency by sweeping or sampling the area made from the union of circumcircles, the empty star-shaped polygonal region, or simply taking the set of edges themselves. The readings must still be subjected to filtering for elements that are greater than a maximum range. Directional sensors must also be modeled by discarding appropriate components of the readings.

Figure 6.3: The flip operation switches the middle edge within a quadrilateral. In the case shown it restores the Delaunay property.

Readings generated in this manner produce values that may underestimate large distances but never overlook close obstacles.
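The cheapest of the options above, taking the incident edges themselves and then filtering by maximum range and sensing direction, might look like the following. This is a hedged sketch and not the simulator's code: the function name `neighbor_readings`, the pose and neighbor representations, and the example values are all assumptions introduced here.

```python
import math


def neighbor_readings(pose, neighbors, max_range, fov=2.0 * math.pi):
    """Turn triangulation neighbors into (bearing, distance) range readings.

    pose      -- (x, y, heading) of the sensing robot, heading in radians
    neighbors -- positions of the vertices sharing an edge with the robot
    max_range -- readings beyond this distance are discarded
    fov       -- total angular field of view, centered on the heading
    """
    x, y, heading = pose
    readings = []
    for nx, ny in neighbors:
        dist = math.hypot(nx - x, ny - y)
        if dist > max_range:
            continue  # filter elements beyond the maximum range
        bearing = math.atan2(ny - y, nx - x) - heading
        # Wrap the relative bearing into (-pi, pi].
        bearing = math.atan2(math.sin(bearing), math.cos(bearing))
        if abs(bearing) > fov / 2.0:
            continue  # directional sensor: discard readings outside the view
        readings.append((bearing, dist))
    return sorted(readings)


# Hypothetical example: a robot at the origin facing along +x with a
# forward-facing 180-degree sensor of 3 m range.
example = neighbor_readings(
    (0.0, 0.0, 0.0),
    [(1.0, 0.0), (1.0, 1.0), (-0.5, 0.0), (5.0, 0.0)],
    max_range=3.0,
    fov=math.pi,
)
```

In this example the neighbor behind the robot and the one beyond the maximum range are dropped, leaving two readings. The cost is linear in the vertex degree, in keeping with the $O(k)$ neighbor access described earlier.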
Trading long-range accuracy for short-range reliability in this way is reasonable in controllers where such a simplification is acceptable, as in potential field approaches where the field strength falls off sharply, or in methods that require only neighbor-to-neighbor interactions such as those employed in mobile sensor networks.

6.2.3 Planarity

Robots and obstacles are represented as 2D planar entities, allowing the use of Euler's polyhedron formula to show that the mean vertex degree within the triangulation obeys the following:

\[ \langle d_i \rangle = \frac{1}{n} \sum_{j=1}^{n} d_j \le 6 - \frac{12}{n} \le 6. \]

(A planar graph on $n \ge 3$ vertices has at most $3n - 6$ edges, and the vertex degrees sum to twice the number of edges, which gives the bound.) Stated another way, the number of edges within the graph is linear in the number of vertices. The asymptotic complexities shown in Table 6.1 depend on planarity, as operations performed over either vertices or edges result in the same complexity. Further, storage of triangle or edge structures or both is only $O(n)$ for $n$ vertices.

6.2.4 Triangulation creation and updates

Although calculation of the triangulation may only involve $O(n \log n)$ steps (Fortune 1986), any reasonable multi-robot simulator must provide asynchronous updates. Recomputing the triangulation after each robot's actions would entail more work than naïve algorithms. Factoring the graph into independent regions for updates is possible, but choosing how to do so in a fair fashion is difficult: a straightforward greedy approach may repeatedly favor certain vertices.

Another option is to allow the robots to locomote while displacing their associated vertices, conceding to only an "approximate Delaunay" triangle mesh with full recalculations occurring at a fixed yet slower rate. This approach makes use of the observation that, for the time-steps used during simulation, the robots will not have the opportunity to undergo gross position changes, so the triangulation will change very slightly if at all. In fact, recomputation of the triangulation may result in a mesh identical to the previous one. This raises the question of exactly when the graph needs updating to maintain the Delaunay property.

The update algorithm described below performs different actions depending on the size of the movement that the relevant vertex undergoes.
Specific regions within the triangle mesh are identified and appropriate procedures prescribed, none of which involve complete triangulation. The empirical measurements detailed in the sections that follow measure both the locality of each of the update procedures and the relative frequency of each.

6.2.4.1 Creation and insertion

Even with an efficient update mechanism, an initial Delaunay triangulation is required. Although the sweepline algorithm is more efficient, our implementation uses the Bowyer-Watson algorithm (Watson 1981) because an insertion capability is necessary for the dynamic update described below. The algorithm simply calls the insertion operation for each point. The triangulation is only created once so, at worst, it results in a slightly longer initialization time.

The Bowyer-Watson insertion operation (see "insertion" in the table) finds and deletes the triangles whose circumcircle includes the point to be added. The resulting hole within the mesh is retriangulated to include the inserted point.

6.2.4.2 Deletion

The deletion operation is used in the dynamic update described below. Devillers (1999) describes a variety of methods for deletion of vertices within an n-dimensional triangulation and provides a recommendation for the most desirable algorithm for a given input size; this is reproduced in the "deletion" line of the table. The input size in this case is the degree of the vertex to be deleted. The table describes a flipping algorithm based on the flip operation of Guibas and Stolfi (1985), the essence of which is shown in Figure 6.3. In general the flipping algorithm only works in 2D (although Guibas and Russel (2004) describe a generalization). This operation is used extensively below.

Figure 6.4 shows a representative experimental run in the simulator in order to assess both the mean and variance of the vertex degree. The robots are using a potential field algorithm (similar to that described in Howard et al. (2002)) to cover an environment while maintaining connectivity. The average degree decreases before leveling out, as expected, but the near-constant standard deviation was unforeseen. The implication is that an edge-flipping algorithm should be used to handle deletion; it operates by removing a vertex from the graph, freely retriangulating the resulting hole, and flipping the resulting edges until the Delaunay property is restored.

Figure 6.4: Plot of the mean degree of the vertices (number of neighbors) versus time in a 1500-robot simulation, showing the mean degree and the deviation (width 2σ). The robots were expanding to fill an environment and the degree decreased over time accordingly.

We have argued that asymptotic complexity should be minimized, yet in this section we have prescribed a suboptimal but competitive algorithm. The reason is our target domain of simulating large-scale systems of robots, with $n \to \infty$. Initial indications are that, even as $n$ increases, the vertex degree remains $k \approx 6$. The deletion algorithms thus need not be efficient for $k$ in the limit.

6.2.5 Dynamic update

In order to maintain the Delaunay property as the vertices underlying the triangulation move, we need to consider the types of motions that can occur and the restoration steps they necessitate. Consider Figure 6.5, in which the position of the point $P$ in the center is about to be updated to $P'$ due to a motion of the associated robot. Three areas have been labeled.

Figure 6.5: A scenario in which the central robot is about to undergo a motion. Only part of the global triangulation is shown. Three regions are shaded, each having different implications for the update of the underlying triangulation.

Suppose for the moment that $P'$ is somewhere within the area of Region A ∪ Region B. If the edges are preserved in spite of this action, then the new position can be interpreted as a distortion or corruption of the 7 triangles of which $P$ forms the hub. The Delaunay property can be destroyed in two ways (for details see Section 6.2.6).

The first way arises in adjacent triangles which include a spoke edge off $P$ that grows longer as $P \to P'$.
Put another way, triangles off $P$ that are behind the direction of motion can change so as to violate the Delaunay property. The figure shows an example of such an occurrence. The two leftmost triangles that have $P$ as a corner have circumcircles that almost coincide. If $P'$ is to the right of $P$, then the circles will expand until the top (bottom) point falls within the bottom (top) circle and the Delaunay property is destroyed. The resolution in such cases is to flip the edge between the triangles (which connects to $P'$) so that the quadrilateral is split horizontally into triangles rather than vertically. We call this a rearward flip.

The second situation in which the property is violated occurs when the point enters a circumcircle on the perimeter of $P$'s subtended polygon or, from the perspective of the appropriate spoke edge, when that edge is shortened by the movement of $P$. This occurs in edges radiating from $P$ in the same direction as that of the motion and can be interpreted within Figure 6.5 as those cases in which $P'$ falls in Region B. In these cases the Delaunay property can be restored by flipping the edge on the periphery into a spoke edge that connects $P'$ and the far point directly. We term this a forward flip.

If $P$ has degree $v$, then dealing with both conditions requires $O(v)$ circularity tests before possibly flipping the respective edges. A flip may require further edge flips to restore the Delaunay property. The four edges of a quadrilateral involved in a flip are queued for testing and, if necessary, flipped themselves (and the surrounding edges queued). Through this recursive process the forward or rearward flips could conceivably trigger a large number of updates. In practice the affected portions of the mesh remain highly localized patches surrounding the initial flip. Measurements of this size are provided by the empirical data below.

Turning to Region C and the exterior area, we note that when $P$ moves to such a $P'$, the graph no longer remains planar.
Guibas and Russel (2004) point out that under such circumstances flip-based techniques are inadequate, as there exist graphs for which the Delaunay property cannot be restored. Accordingly, when $P'$ falls within Region C (or beyond), an update is performed through the deletion of $P$ and insertion of $P'$.

Guibas and Russel (2004) also consider the problem of updating a triangulation, but consider the problem in higher dimensions where the simple solution provided here will not succeed. They describe the use of kinetic information for maintaining the Delaunay triangulation for enduring trajectories. Our implementation uses robots with controllers that use velocity-based control, and the controller executes at a rate similar to the simulation itself. The complexity of paths generated under these circumstances suggests that the kinetic information would be of little use.

6.2.6 Triangulation conditions

Suppose, as in Section 6.2.5, that a point $P$ is moving to another position $P'$ which falls within the star-shaped polygon formed by the triangles that have $P$ as a corner (i.e., Region A ∪ Region B as shown in Figure 6.5). These triangles can be distorted by the movement of $P$ to $P'$ so as to destroy the Delaunay property in two ways:

1. The circumcircle of one of the corrupted triangles now includes one of $P$'s other neighbors.

2. The circumcircle of one of the corrupted triangles now includes another point which is not a neighbor of $P$.

Below it is shown that these cases occur when spoke edges are lengthened or shortened respectively.

Table 6.1: Summary of operations on the Delaunay triangulation and their respective efficiencies. Here $n$ is the number of points in the triangulation and $k$ is the degree of the vertex being deleted.

| Operation | Algorithm | Asymptotic complexity | Notes |
| Creation | Fortune's sweepline algorithm (Fortune 1986) | Worst-case $O(n \log n)$ | Non-incremental algorithm. Faster than any incremental algorithm. |
| Creation | Bowyer-Watson (Watson 1981) | Worst-case $O(n^2)$ in 2D; average-case $\sim O(n)$ reported (Filipiak 1996) | Incremental algorithm. |
| Point & vertex location | Guibas and Stolfi's walking method (Guibas and Stolfi 1985) | Worst-case $O(n)$; average-case $O(\sqrt{n})$ | Directed walk performed over Delaunay structures. |
| Point & vertex location | Balanced quadtree (Samet 1984) | Worst-case $O(\log n)$ | Independent of triangulation; requires a separate tree constructed in $O(n \log n)$. |
| Insertion | Bowyer-Watson insertion (Watson 1981) | Worst-case $O(n)$; average-case dominated by "point location" | Single step in the algorithm listed under "creation" above. |
| Deletion | Aggarwal et al. (1989) | Worst-case $O(k)$ in 2D | Preferred for $k \gg 9$. |
| Deletion | Devillers' shelling algorithm (Devillers 1999) | Worst-case $O(k \log k)$ in 2D | Competitive for $9 \le k$. |
| Deletion | Edge-flipping based algorithm | Worst-case $O(k^2)$ | Competitive for $k < 9$. Flipping to restore the Delaunay property is only valid in 2D. |

6.2.6.1 Condition 1

In the first case above, note that this can only arise if somewhere between $P$ and $P'$ lies a point $P^*$ that is cocircular with 3 or more of $P$'s neighbors. Suppose that the first three of these neighbors are $Q_1, Q_2, Q_3$. They must themselves form adjacent triangles around $P^*$, because a vertex that produces a hypothetical intervening triangle would fall within the circle circumscribing $P^*, Q_1, Q_2$ and $Q_3$, having its own cocircularity with $P^*$ before $Q_1, Q_2$ and $Q_3$. Since the Delaunay property held until $P^*$, the circumscribing circles of $\triangle P Q_1 Q_2$ and $\triangle P Q_2 Q_3$ did not include $Q_3$ and $Q_1$, respectively. Thus, the movement from $P$ to $P'$ must expand the circles on the same side of edges $Q_1 Q_2$ and $Q_2 Q_3$ as $P$ falls. This only occurs if $P'$ is further from $Q_1, Q_2, Q_3$ than $P$ is.

6.2.6.2 Condition 2

For the second possibility above, we have a similar situation with a $P^*$ at the first case of cocircularity with $Q_1, Q_2$ and a third point $R$ which is not a neighbor of $P$. We may assume that this third point $R$ forms a triangle with $Q_1$ and $Q_2$, because if it does not, and instead $R'$ forms a triangle with $Q_1, Q_2$, then $R'$ is also cocircular with $P^*, Q_1, Q_2$.
The point $R$ could never be cocircular with $P^*, Q_1, Q_2$ without such a cocircular $R'$, since it would then fall within the circumcircle of $\triangle Q_1 Q_2 R'$. But $P^*$ can only become cocircular with $\triangle Q_1 Q_2 R$ if $P'$ is moving toward the edge $Q_1 Q_2$.

Within Figure 6.5, this condition only occurs when $P$ moves to a $P'$ that falls within Region B, and cocircularity occurs at the boundary between Region A and Region B.

6.3 Models

Due to the lack of realistic noise models within existing simulators, we developed the following noise models for use within our simulator.

6.3.1 Sensory models

Each simulated robot has a ring of 12 infrared sensors for distance readings. We implemented the model proposed by Benet et al. (2002), which is based on a physical IR device comprising two IR LEDs and a PIN photodiode. This required a minor modification to the underlying data structure presented in the previous section: each item stored within the Delaunay mesh also needed to include information in order to calculate a surface normal for a given ray.

Table 6.2: Simulated sensor model parameters.

| Parameter | Value | Description |
| Sensing slices | 12 | A forward facing IR with sensors every 30°. |
| Maximum radius | 1 m | This radius is tuned for a 10 bit DAC. |
| $\alpha$ | 0.045 | This is a material and color dependent number. (Value is for light gray.) |
| $\sigma_Y$ | 0.006 | Noise in the range direction (fits author's physical device). |
| $L$ | 0.08 | Distance between LEDs. |

6.3.2 Communication models

Many simulators provide no model of network failures or unreliability. Some authors address this by using a disc centered at the robot. Within the sensor network community, single communication radii are regarded as unrealistic, even for theoretical models (Kuhn et al. 2003). We developed two communications models: the first is a high-fidelity model of wireless propagation that integrates several existing models from the wireless communications literature; the second is a model based on published Rene mote data that is extremely efficient to use.
The first model accounts for environmental features, including fading (large and short-scale, and multipath), link-layer models, and interference. The second model is intended for simulating robots within large open arenas efficiently, as applicable, e.g., in the foraging experiments.

6.3.2.1 Integrated model

Although many radio and wireless communication models are available in the literature, we were unaware of any single model that adequately captured the details of the environmental effects on signal propagation, fading, multipath, interference, as well as link-level properties. We integrated several models to account for these aspects in the employed simulator.

Kotz et al. (2003) provide a list of inaccurate and unrealistic properties that are often mistakenly assumed of wireless communications. As a starting point we set out to ensure the integrated model would capture sufficient richness to avoid these pitfalls, but added an extra requirement: whereas many wireless networking models (cf. Krishnamachari (2005, Section 5.2.2, pp. 73)) make use of simplistic probabilistic factors to account for channel properties (e.g., log-normal shadowing), this is inadequate for a robot. Since a robot may navigate around the space, the communications model must reflect the changes in transmission quality (or received signal strength) that result from that movement in the environment. Obviously not all of the variance can be accounted for, so a random term must remain, but it is inappropriate to generate communications properties anew every time the robot moves. Of course, if one has a static wireless network (e.g., for a sensor network application), then the application of such simplistic models can be better justified, as the simulated set-up represents just a single sample from the ensemble of possible configurations. With moving nodes, however, the relationship between communications at a node's current position and at a subsequent one can be important (for a robotic application of this fact see Lindhé et al. (2007)).
Thus, we implemented the link-layer model of Zuniga and Krishnamachari (2004), a version of the signal-to-interferer-plus-noise model of Son et al. (2005), and a channel model based on primary-ray casting and shape factor models of fading described by Rappaport (2001).

We model the wireless channel by combining a term for large-scale path loss and a second term for small-scale multipath effects and fading. The log-normal shadowing model is an easy and widely agreed upon model (Krishnamachari 2005) that serves as a reasonable starting point:

\[ PL(d) = PL(d_0) + 10\, n \log_{10}\!\left(\frac{d}{d_0}\right) + X_\sigma. \]

The log-normal shadowing model gives the average power lost over a path of length $d$, provided we have measurements at some $d_0$. It gives a measure in dB and, it should be noted, is only valid for $d \ge d_0$. Note that the additional zero-mean normal noise $X_\sigma$ may change with time, and is used to capture unmodeled effects.

Figure 6.6: The map of the building used in the examples that follow; it represents an area of approximately 40 m × 20 m.

Figure 6.7: Path attenuation factors (PAF) from a transmitter are calculated by ray-casting and integrating values along the connecting ray. Partition values for brick, drywall and plumbing are from those reported in Rappaport (2001).

As stated above, our philosophy is to seek models that are possibly computationally expensive but better than resigning ourselves to purely stochastic ones. Following Rappaport (2001, Section 4.11.5), we include a term for shadowing by adding path obstructions through so-called "primary ray tracing", so that partition attenuation factor (PAF) values model power loss due to passing through an obstacle:

\[ PL(d) = PL(d_0) + 10\, n \log_{10}\!\left(\frac{d}{d_0}\right) + \sum \mathrm{PAF}\,[\mathrm{dB}] + X_{\sigma'}. \]

(Note that the variance of the additional zero-mean normal noise $X_{\sigma'}$ is decreased because large-scale path loss is well accounted for by the attenuation factor (Rappaport 2001, pp. 163).)

Figure 6.6 is a map of a building used in the examples that follow.
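The path-loss expression with partition attenuation factors can be evaluated directly once the ray-caster has returned the partitions crossed. The sketch below is illustrative only: the function name, its parameters, and every numeric value in the example (reference loss, exponent $n$, partition losses) are hypothetical placeholders and are not taken from the dissertation or from Rappaport (2001).

```python
import math
import random


def path_loss_db(d, pl_d0, n, d0=1.0, pafs=(), sigma=0.0, rng=random):
    """Path loss in dB at distance d using log-normal shadowing plus PAF terms.

    d      -- transmitter-receiver separation (same units as d0)
    pl_d0  -- measured path loss in dB at the reference distance d0
    n      -- the exponent multiplying the log-distance term
    pafs   -- attenuation in dB of each partition crossed by the primary ray
    sigma  -- standard deviation in dB of the zero-mean normal noise term
    """
    if d < d0:
        raise ValueError("the model is only valid for d >= d0")
    loss = pl_d0 + 10.0 * n * math.log10(d / d0) + sum(pafs)
    if sigma > 0.0:
        loss += rng.gauss(0.0, sigma)  # the residual unmodeled effects
    return loss


# Hypothetical values: 40 dB at 1 m, exponent 3, receiver 10 m away.
open_space = path_loss_db(10.0, pl_d0=40.0, n=3.0)
# The same link with the primary ray crossing two partitions.
two_walls = path_loss_db(10.0, pl_d0=40.0, n=3.0, pafs=(3.0, 2.5))
```

With the noise term switched off, the second call differs from the first by exactly the summed partition losses, which is the deterministic, position-dependent part of the model that motivates ray-casting over a purely stochastic shadowing term.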
Figure 6.7 graphically shows attenuation factor values for a transmitter in the upper-left corner: these are constructed by ray-casting to sum the attenuation factor values along a path from the transmitter to each part of the environment. (A slight modification like that in Devasirvatham et al. (1990) handles multi-floored environments easily.)

Small-scale fading is produced by two multipath effects: time-delay spread and Doppler spread (Rappaport 2001, pp. 206). In our scenarios we consider flat fading as the time-delay element, and what is termed slow Doppler fading. Oftentimes these details will be ignored and either Rayleigh or Ricean distributions used to model the small-scale fading. Rayleigh fading results when there is no line-of-sight between the transmitter and receiver; the receiver is assumed to have power arriving from all angles. Ricean fading is similar, having power arriving from all angles, but additionally includes a power peak to account for a line-of-sight transmission. As with the use of a normal distribution to model large-scale path loss and shadowing, we were unsatisfied with the absence of environmental attributes in these distributions. Thus, we implemented a multipath shape-factor model, which is able to produce Rayleigh and Ricean distributions as special cases when the arriving power fits those described above. The following description is based on Durgin and Rappaport (2000).

The multipath shape-factor model accounts for the fading statistics by modeling the angular distribution of the power, $p(\theta)$, that arrives at a receiver. To calculate this distribution, we use a deterministic ray tracer that treats obstacles as sources of reflection. Obstacles are modelled as imperfect dielectrics. Assuming an E-field normal to the edge of the obstacle (which holds when the E-field is horizontally polarized and the antennas are placed vertically), if an incident wave entering the obstacle at an angle $\theta_i$ has electric field $E_i$, then the reflected wave is

\[ E_r = \frac{\sin\theta_i - \sqrt{\epsilon_r - \cos^2\theta_i}}{\sin\theta_i + \sqrt{\epsilon_r - \cos^2\theta_i}}\, E_i. \]
(The relative permittivity $\epsilon_r$ is taken to have the value 4.44, as is suitable for brick.)

The shape-factors are based on the Fourier coefficients of the power distribution. Once $p(\theta)$ has been calculated for a particular transmitter and receiver pair, we calculate:

$$F_n = \int_0^{2\pi} p(\theta)\,e^{-in\theta}\,d\theta.$$

From these the following three shape-factors are defined:

Angular spread is a measure of the concentration of power about a single azimuthal direction:

$$\Lambda = \sqrt{1 - \frac{|F_1|^2}{F_0^2}}.$$

Angular constriction is a measure of the concentration of power about two azimuthal directions:

$$\gamma = \frac{|F_0 F_2 - F_1^2|}{F_0^2 - |F_1|^2}.$$

Azimuthal direction of maximum fading is given by:

$$\Theta = \frac{1}{2}\arg\{F_0 F_2 - F_1^2\}.$$

The received power (in volts squared) is equal to the magnitude squared of the complex voltage. The radio's transmit power less the path loss above provides $P_R$, an average of the local received power, and the variance (from Rappaport (2001, p. 233)) of the complex voltage for a receiver travelling in direction $\beta$ is:

$$\sigma^2_{\bar{V}'} = \frac{2\pi^2 \Lambda^2 P_R}{\lambda^2}\,\bigl(1 + \gamma\cos(2\beta - 2\Theta)\bigr),$$

(where $\lambda$ is the wavelength of the carrier frequency).

In order to do this position-dependent calculation efficiently, we precomputed the shape-factors between any two points for the whole environment; the required variance that results from small-scale fading can then be computed easily.

Next, we implemented the link-layer model developed by Zuniga and Krishnamachari (2004); the following is an overview. As mentioned, $P_R = P_T - PL_T(d, \Lambda, \gamma, \Theta, \beta)$, where $P_T$ is the radio's transmit power, and $PL_T(\cdot)$ is the sum of $PL(d)$ and the appropriate noise generated as a function of the shape factors just described. The probability of a bit error is modeled under the assumption of NRZ encoding with NCFSK modulation. This in turn leads to the following expression for the probability of receiving a packet correctly:

$$PRR = \left(1 - \frac{1}{2}\,e^{-\alpha/2}\right)^{8f},$$

where $f$ is the frame length; we take $f = 50$ bytes.
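The shape-factor definitions and the packet-reception expression above can be sketched as follows. Several things here are assumptions made for illustration: the arriving power is represented as a finite set of discrete rays (so the integral for $F_n$ becomes a sum), the function names are hypothetical, and $\alpha$ is applied exactly as written in the expression, without any unit conversion.

```python
import cmath
import math


def fourier_coefficient(n, powers, angles):
    """F_n for power arriving as discrete rays: sum_k p_k * exp(-i n theta_k)."""
    return sum(p * cmath.exp(-1j * n * th) for p, th in zip(powers, angles))


def shape_factors(powers, angles):
    """Return (angular spread, angular constriction, direction of max fading)."""
    f0 = fourier_coefficient(0, powers, angles).real  # total power, real-valued
    f1 = fourier_coefficient(1, powers, angles)
    f2 = fourier_coefficient(2, powers, angles)
    spread = math.sqrt(max(0.0, 1.0 - abs(f1) ** 2 / f0 ** 2))
    cross = f0 * f2 - f1 ** 2
    denom = f0 ** 2 - abs(f1) ** 2
    # The ratio is ill-conditioned when almost all power arrives from one
    # direction (denom near zero); returning 0.0 there is a choice of this sketch.
    constriction = abs(cross) / denom if denom > 0 else 0.0
    direction = 0.5 * cmath.phase(cross)
    return spread, constriction, direction


def packet_reception_rate(alpha, frame_bytes=50):
    """PRR = (1 - 0.5 * exp(-alpha / 2)) ** (8 * f), with f in bytes."""
    return (1.0 - 0.5 * math.exp(-alpha / 2.0)) ** (8 * frame_bytes)
```

As a sanity check on the definitions, two equal rays arriving from opposite directions (`shape_factors([1.0, 1.0], [0.0, math.pi])`) give an angular spread and an angular constriction both close to 1, which matches the description of constriction as concentration about two azimuthal directions.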
In Zuniga and Krishnamachari (2004), $\alpha$ is the signal-to-noise ratio, but we extend their model by introducing the concurrent transmission model based on Son et al. (2005). Basically, this involves including the power that is received from concurrent transmissions and treating it as adding to the noise floor:

$$\alpha = P_R - P_n - P_I.$$

Both the noise floor $P_n$ and interference are sources of asymmetry in the communications network. We follow Zuniga and Krishnamachari (2004) and generate $P_T$ and $P_n$ together, using a covariance matrix that is representative of MICA2 radios. To model $P_I$, we simulate packet transmission times, and treat a collision on the medium by adding the power produced by the interfering transmitter. Son et al. (2005) show that simply adding such noise will over-estimate packet losses. Thus, additional interference contributions are halved, based on the data in Figure 13 of Son et al. (2005). Our experiments suggest that, as the number of interferers increases, packet reception is adversely affected more by the increased frequency of collisions than by the interferers' contribution to the noise floor.

Figure 6.8: Ray-tracing allows the reflected components of the received power to be calculated. This is used to construct an angular distribution of the power, which is summarized with the three shape-factors (for clarity, the power produced by the direct path is omitted from the figure). The resulting shape-factors for the points shown are: $\Lambda_a = 0.985525$, $\gamma_a = 0.590345$, $\Theta_a = 0.291113$; $\Lambda_b = 0.662603$, $\gamma_b = 0.843989$, $\Theta_b = 0.046750$; $\Lambda_c = 0.762548$, $\gamma_c = 0.388172$, $\Theta_c = -0.191796$.

Figure 6.9: Tests of the radio model have a static transmitter marked T. A test receiver moves from 1 to 2. Interference from transmissions by a single radio at a, b, c and d was also considered.

Next, we briefly present some controlled examples in order to show the resulting simulated radio behavior. Figure 6.9 shows the set-up. A static transmitter at T broadcasts test packets. A test receiver is moved from 1 to 2, which is a distance of about 8 metres.
Figure 6.10 shows the resulting shape-factors, variance in power, total fading, and packet reception. It is interesting to observe that passing through the doorway from the corridor into the room results in packet reception in the transitional regime of behavior. Notice also how the small-scale effects (see Figure 6.10(b)) result in marked changes in the packet-reception rates shown in Figure 6.10(d).

The same arrangement with the moving receiver was repeated, but we included a transmitting interferer at each of (a), (b), (c) and (d) respectively. Figure 6.11 shows that the effect collisions have on the reception rate varies with position (and hence, received interferer power), as one might expect.

Figure 6.10: Radio transmission to a moving test receiver. (a) Shape-factors ($\Lambda$, $\gamma$, $\Theta$) calculated along the trajectory. (b) Variance in received power along the trajectory. (c) Superposition of bulk and small-scale fading effects (RSSI in dBm). (d) Packet-reception rate along the trajectory. [Four plots; the horizontal axis of each is the distance moved in metres.]

6.3.2.2 Rene mote model

We fitted a polynomial to the recorded data for Rene motes (in low power mode) provided in Ganesan et al. (2002). We used a fourth-order polynomial as it appeared to give a good match. The following is the expression used to calculate the probability of a packet being delivered as a function of distance $x$:

$$\Pr(\text{packet-delivered} \mid x) = \alpha + \beta x + \gamma x^2 + \rho x^3 + \tau x^4$$

Parameter   Value
α           0.8884
β           0.0122
γ           −0.10545
ρ           0.0195631
τ           −0.001036

Table 6.3: Rene mote communications model parameters.
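The fitted polynomial is cheap to evaluate, which the following sketch illustrates using the coefficients of Table 6.3. Two details are assumptions of this sketch rather than statements from the text: the distance is taken to be in metres, and the result is clamped to the interval [0, 1] so that it remains a valid probability outside the range of the fitted data. The names `RENE_COEFFS` and `rene_delivery_probability` are hypothetical.

```python
# Coefficients (alpha, beta, gamma, rho, tau) from Table 6.3, lowest order first.
RENE_COEFFS = (0.8884, 0.0122, -0.10545, 0.0195631, -0.001036)


def rene_delivery_probability(x, coeffs=RENE_COEFFS):
    """Probability that a packet is delivered at distance x (assumed metres)."""
    value = 0.0
    for c in reversed(coeffs):  # Horner's rule, highest order first
        value = value * x + c
    # Clamping is an assumption of this sketch, not part of the fitted model.
    return min(1.0, max(0.0, value))
```

A simulator could then decide delivery per packet with a single uniform draw, for example `random.random() < rene_delivery_probability(distance)`.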
Figure 6.11: Radio transmission in the presence of interferers, at (a), (b), (c) and (d) respectively. [Four plots of PRR against distance, each comparing the no-collision and channel-collision cases.]

Ganesan et al. (2002) identify three types of network behavior:

Pr(·)        Radius     Type
0.65–1.0     0m–1m      Reliable
0.25–0.65    2m–4m      Intermittent
0.0–0.25     4m–        None

Table 6.4: Qualitative summary of communications behavior by range.

This simple model correctly captures these three, including the so-called “transitional region.”

6.3.3 Velocity error models

We [1] were unable to find published noise models for velocity-controlled robots. While there is no shortage of data on odometric drift, we were specifically interested in the precision of commanded linear and angular velocities.

[1] The assistance of M. Cordeiro is acknowledged.

In order to fit a model we placed laser fiducials on one differential drive ActivMedia Pioneer 2 robot and had a second (stationary) robot record the trajectory of the first; see Figure 6.12.

Figure 6.12: The robot with three laser fiducials, observed by a second stationary robot, as used to collect the data for this model.

The first robot was given velocity commands from:

$$\{0.0\,\mathrm{rad/s},\ 0.15\,\mathrm{rad/s},\ 0.30\,\mathrm{rad/s},\ \cdots,\ 1.5\,\mathrm{rad/s}\} \times \{0.0\,\mathrm{m/s},\ 0.1\,\mathrm{m/s},\ 0.2\,\mathrm{m/s},\ \cdots,\ 0.6\,\mathrm{m/s}\}.$$

Five trials were run of each set. Figure 6.13 shows the recorded data, and the meshes fit the mean and standard deviation at each point. Next we fitted quadratic planes of the form
$$x_\mu(\dot{x}, \dot{\theta}) = u_j + u_i\dot{x} + u_h\dot{x}^2 + u_g\dot{x}^3 + u_f\dot{\theta} + u_e\dot{x}\dot{\theta} + u_d\dot{x}^2\dot{\theta} + u_c\dot{\theta}^2 + u_b\dot{x}\dot{\theta}^2 + u_a\dot{x}^2,$$

$$x_\sigma(\dot{x}, \dot{\theta}) = v_j + v_i\dot{x} + v_h\dot{x}^2 + v_g\dot{x}^3 + v_f\dot{\theta} + v_e\dot{x}\dot{\theta} + v_d\dot{x}^2\dot{\theta} + v_c\dot{\theta}^2 + v_b\dot{x}\dot{\theta}^2 + v_a\dot{x}^2,$$

$$\theta_\mu(\dot{x}, \dot{\theta}) = r_j + r_i\dot{x} + r_h\dot{x}^2 + r_g\dot{x}^3 + r_f\dot{\theta} + r_e\dot{x}\dot{\theta} + r_d\dot{x}^2\dot{\theta} + r_c\dot{\theta}^2 + r_b\dot{x}\dot{\theta}^2 + r_a\dot{x}^2,$$

$$\theta_\sigma(\dot{x}, \dot{\theta}) = s_j + s_i\dot{x} + s_h\dot{x}^2 + s_g\dot{x}^3 + s_f\dot{\theta} + s_e\dot{x}\dot{\theta} + s_d\dot{x}^2\dot{\theta} + s_c\dot{\theta}^2 + s_b\dot{x}\dot{\theta}^2 + s_a\dot{x}^2,$$

for the mean and standard deviation given a commanded $\dot{x}$ and $\dot{\theta}$.

The simulation uses the functions $x_\mu$ and $x_\sigma$ to construct a normal distribution from which a sample is taken to update the linear position. The pose angle is treated similarly, using $\theta_\mu$ and $\theta_\sigma$.

Figure 6.13: Measured values for given commanded $\dot{x}$ and $\dot{\theta}$. Without noise the graphs would be inclined planes. (a) Measured linear velocity. (b) Measured angular velocity.

Subscript   u           v           r          s
a           −0.14864    −0.26742    −0.47467   −0.02433
b           0.16633     0.05992     0.35505    0.17781
c           0.25663     0.05603     0.52721    −0.02246
d           −0.058494   −0.064967   −2.4673    −0.2129
e           −0.78439    −0.09247    0.36098    −0.20900
f           0.17305     −0.00232    1.237      0.1424
g           −1.5012     0.0931      3.3941     0.7099
h           0.52956     −0.11796    −3.6025    −0.7784
i           1.2513      0.1003      1.2862     0.3257
j           −0.040279   −0.006879   −0.12206   −0.037818

Table 6.5: Coefficients of the quadratic planes fitted to model control and odometric error. (Each entry is the coefficient with the given letter and subscript, e.g., the first entry is $u_a$.)

6.4 Example simulations: parametrized foraging strategies

The following is an example of a set of experiments that was feasible because of the high performance offered by the simulator.
The data also consider a much wider range of system sizes than found elsewhere in the literature.

Two foraging strategies are explored in this dissertation. They are described in detail in Section 5.1.1. The following description suffices here:

Traditional homogeneous foraging has each robot searching for pucks and independently transporting them to the home region.

The bucket brigading strategy requires that each robot focus on a sub-region of the total work arena, passing pucks in the direction of the home region.

Observe that the bucket brigading strategy can be parametrized by the size of the sub-region assigned to each robot. By permitting the sub-regions assigned to each robot to overlap, we are able to consider bucket brigading and homogeneous foraging as two ends of a single strategy continuum. We treat the strategies as different in degree, rather than of different types.

6.4.1 Controller: parametrized bucket brigading

We define the parametrized controller as having three states: searching, homing, and returning. The key difference is in the use of integrated odometric data. The controller is parametrized by $D$, the radius of each robot's sub-region within odometric space (in meters). The searching state randomly explores the arena, but once outside the disk of radius $D$ (calculated from odometric data), the controller transitions to the returning state. If a puck is found within the bounds of the sub-region, the homing state is triggered. The returning state navigates the robot back to a position within the sub-region. Our implementation is conservative, ensuring that the robot is within the disk of radius $D/2$ before transitioning to searching. The homing state simply moves the robot toward the home region. If the robot detects that it is over the home region, the puck is deposited exactly as in homogeneous foraging; if the robot exits the $D$-disk, the puck is deposited and returning commences. Thus, bucket brigading with $D = \infty$ corresponds to the homogeneous controller.
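The three-state controller described above can be sketched as a pure transition function. This is a hedged illustration rather than the controller used in the experiments: the names `State` and `next_state` are hypothetical, the distance from the sub-region's centre is assumed to be supplied from integrated odometry, and the assumption that a robot resumes searching after depositing a puck at the home region is ours.

```python
import math
from enum import Enum, auto


class State(Enum):
    SEARCHING = auto()
    HOMING = auto()
    RETURNING = auto()


def next_state(state, dist_from_centre, radius_d, found_puck=False, over_home=False):
    """Return (new_state, drop_puck) for one control step.

    dist_from_centre : odometric distance from the centre of the sub-region
    radius_d         : the parameter D; math.inf gives homogeneous foraging
    """
    outside = dist_from_centre > radius_d
    if state is State.SEARCHING:
        if outside:
            return State.RETURNING, False
        return (State.HOMING, False) if found_puck else (State.SEARCHING, False)
    if state is State.HOMING:
        if over_home:
            # Assumed: after depositing at home the robot resumes searching.
            return State.SEARCHING, True
        if outside:
            return State.RETURNING, True  # puck is left for a neighbour
        return State.HOMING, False
    # RETURNING: conservative re-entry, only once inside the disk of radius D/2.
    if dist_from_centre <= radius_d / 2.0:
        return State.SEARCHING, False
    return State.RETURNING, False
```

With `radius_d=math.inf` the `outside` test is never true, so the returning state is never entered and the function reduces to search-then-home behavior, mirroring the observation that $D = \infty$ corresponds to the homogeneous controller.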
Robots have access only to the commanded values $\dot{r}_c$ and $\dot{\theta}_c$ (and to neither $\dot{r}$ nor $\dot{\theta}$), so each robot's odometric estimate of its sub-region drifts over time. Since all robots' estimates are drifting, the regions themselves perform (slow) independent random walks over the arena. This has the added effect of ensuring coverage of otherwise neglected areas (and adds robustness in the case of robot failures).

6.4.2 Simulations

In order to assess the effect of parameter $D$, we simulated bucket brigading with $D$ values {5m, 10m, 20m, 30m, 40m, 50m}, which is a wide range of values considering that the arena was 64m across (these values must always be interpreted relative to the size of the environment). Again we considered group sizes from 10 to 500 robots in increments of 10, and low and high puck density conditions.

Results from these simulations are displayed in Figure 6.14. The plots show the collective foraging performance for a range of $D$ values; homogeneous foraging is also shown for reference purposes. The two large plots give collective performance across the range of robot system sizes. The six surrounding smaller plots show time-series data for systems with 100 robots (left two), 250 robots (middle two), and 400 robots (right two). The upper plots are for a low puck density; the lower plots have a puck density that is four times higher.

These graphs display several results. Bucket brigading is shown to be more sensitive to puck densities than homogeneous foraging. The robots must have sufficient probability of finding another puck in order to drop one at a neighbor's sub-region. This effect depends on puck density, and the minimalist nature of the simulated robots, specifically the lack of puck sensing at a distance, requires higher densities than found elsewhere. Bucket brigading (especially with small $D$) has gradual performance increases, and is inefficient with few robots, where, for example, disks of radius $D$ may be insufficient to cover the arena.
With increasing $D$ values the bucket brigading controller approaches the performance of the homogeneous controller (which is $D = \infty$). This is as expected, since it was anticipated during the design of the parametrized controller. What is unexpected, however, is that even large $D$ values (40m is large, relative to the size of the arena) can have a marked interference-reducing effect. For large $D$ values, several robots will still attempt to transport a puck toward the home region, but they do not remain in the crowded region as long as homogeneous robots. We believe that this is because odometric drift forces some robots to yield, in the belief that they have passed outside the disk of radius $D$.

[Plot legend: homogeneous foraging and $D$ = 5m, 10m, 20m, 30m, 40m, 50m. Axes: number of robots (0 to 500) or time in seconds (0 to 2000) against pucks foraged, or pucks per robot, after 2,000s.]

Figure 6.14: Performance of the parametrized bucket brigading controller across system sizes. The top half of the figure (five plots) gives data on the low puck density case (0.781 pucks/m²), the bottom half the high density case (3.125 pucks/m²). The two large plots on the left give performance (number of pucks foraged after 2000s) for robot groups ranging from 10 to 500 robots.
The medium-sized plots on the right give performance per robot for each of the $D$ parameters. The six small plots, three along the top and three along the bottom, give time series data for 100, 250 and 400 robots. All data are averages of 5 independent runs; error bars show one standard deviation.

For each value of $D$, the width of the performance curve depends on puck density. Zhang and Vaughan (2006) provide an excellent discussion of the significance of general performance versus population curves. For a given minimum performance level, they interpret the width (i.e., where the curve intersects a line of constant performance) as the range of robot group sizes that can achieve the task with at least the given performance. Greater width can be useful when wanting to deploy a system with some redundancy. A trade-off is apparent in Figure 6.14: in choosing the $D$ parameter value one can either have high performance with few robots, or consistent performance across a range of medium to large numbers of robots. In the latter case, performance depends critically on the puck density. A value like $D$ = 30m seems like a good compromise: a steep initial slope for between 10 and 100 robots, but gradually decreasing performance beyond the optimal number.

The time series plots in Figure 6.14, particularly for 400 robots, show an initial rate of puck delivery that is unsustainable. The initially steep performance graph flattens out after approximately 200 seconds. This occurs because, initially, the robots are randomly placed within the environment. (The robots were simulated within an open arena, using the efficient polynomial model of the Rene motes.) When early pucks are delivered, there is little interference, because few robots have arrived at the home region. Later deliveries must face exiting robots, and if there are sufficient robots, overcrowding results in a traffic jam scenario. This is the reason for the performance curves flattening with increasing numbers.

6.5 Summary

This chapter described the simulation infrastructure developed in order to validate the synthesis methodology.
We have described algorithms for efficient simulation of large multi-robot systems and the underlying models used. While suitable data for infra-red sensors were found in the literature, no suitable model of velocity controller errors could be found. Our own measurements were taken using an ActivMedia Pioneer 2 robot. Two communications models were developed: the first integrates published channel, link-layer, and interference models in order to provide accurate network simulation by modeling the underlying environmental effects; the second, based on Rene mote data, is intended for realistically modeling open space radio communications. Finally, we demonstrated this simulator by simulating variations of foraging behavior by hundreds of robots.

Chapter 7

Human collective behavior

This chapter describes existing work on modelling of crowd and pedestrian collective behavior, and describes an assistive multi-robot system developed in order to reduce expected egress time. The system involves deployment of directional audio-beacons using a model of the evacuation behavior.

The synthesis methodology described and evaluated in preceding chapters places constraints on the types of robot-to-robot interaction. It is an example of trading away power at the level of local rules in order to achieve predictable macroscopic behavior. We have already said, and shown in Chapter 2, that macroscopic models have broader applicability than the equilibrium statistical mechanics synthesis methods. This chapter will further demonstrate this fact by showing how such models can be used by a team of robots during their coordination. Much like the controlled transport example with ants, the macroscopic model will be treated like an interface to the distributed system.

We also show that exploiting macroscopic models of behavior is not the exclusive purview of minimalist robotic systems. This chapter describes an explicit coordination algorithm based on the Optimal Assignment Problem (Gerkey 2003; Gerkey and Matarić 2004a).
The alternative class of coordination algorithms and the application of more general models to crowds are both a movement toward broader applicability of the macroscopic control method.

This chapter first presents a review of models of human collective behavior. Then we describe a novel tracking algorithm and data collected from museum visitors. Next, we distinguish the present work from the broader research area called robotic disaster response. Finally, we describe the audio-beacon deployment algorithm and the validation of the controller on physical robots.

7.1 Models of human collective behavior

The study of human collective behavior falls within the scope of several disciplines. Brown (1965, Chapter 14) describes, from a social psychology perspective, several interesting examples of collective behavior including crazes, mass movements, riots, and panics of escape. He describes several failed experiments, which highlight the challenges of studying collective behavior in an empirical way. One successful study was conducted by Mintz (1951), who showed in a simple controlled game situation that, by increasing pressure, individuals could be induced to act greedily and sub-optimally for the group. Once one individual acts in a particular way, other people quickly adopt similar tactics which, as in a game of the prisoner's dilemma, results in payoffs that are worse for everyone.

Smelser (1962) presents a sociological theory on the basis of values and norms. The work is a systematic survey of many kinds of collective behavior, identifying necessary conditions for producing various forms of collective behavior.
Two of the conditions that he identifies are “structural conduciveness” and “precipitating factors.” These conditions capture the idea that for spontaneous collective behavior of a particular form to occur, the complete system must be placed in a state so that it is ready for such a change, which typically occurs slowly over time, and some small triggering event must occur before a rapid outbreak causes some radical and identifiable collective behavior. Many of these features are similar in explanation to a phase transition, with a small nucleating event causing a rapid transition under appropriate conditions. This has also been noted elsewhere, e.g., Callen and Shapero (1974). On the other hand, the mechanisms are radically different. The question we are concerned with is whether the same (or a very similar) model can qualitatively capture sufficient information with which to do collective control. Helbing (1995) expresses skepticism toward models that require assumptions (e.g., a closed system) that are known not to hold in social circumstances, urging that physics methods rather than models be used.

7.1.1 Modeling and simulation

There has been considerable interest in vehicular traffic since the 1950s (Helbing 2001), but only recently have physicists applied their methods to pedestrian movement; influential works include those by Biham et al. (1992), Nagel and Schreckenberg (1992), Kerner and Konhäuser (1993) and Helbing et al. (2000). Congestion effects and their causes are studied in both pedestrian and vehicular traffic, but models of human crowds tend to focus on two particular regimes: 1) ordered traffic flow and 2) panicked behavior as may occur in an evacuation scenario.

7.1.1.1 Core problems

Existing models of crowd behavior often single out an interesting scenario (e.g., evacuees exiting a burning room) and focus on producing predictions that match some given data set. Helbing et al. (2007) is a recent example; they compare model results with video of the crowd disaster at Makkah during the Hajj in January 2006.
While several empirical studies exist, experimental studies are rare, cf. Kretz et al. (2006). Of the many existing models, most have not been subjected to rigorous experimental verification.

Recent work has demonstrated that very simple local rules are sufficient to produce a range of collective dynamics that are identifiable in pedestrian flows (e.g., the self-organizing lanes in pedestrian traffic (Helbing et al. 2001b)). Helbing and collaborators consider a number of very simple environments with carefully selected circumstances (e.g., T-junctions, intersections) that are sufficient to demonstrate the diverse properties reproduced by their model. Other work, for example Still (2000) and Galea (2001), considers more environmental details and multiple models depending on the features involved. The so-called “Opera Problem” is a particular evacuation scenario based on an opera theater building; Crespi et al. (2002) proposed a potential field that gives an effective collective motion plan, while Burstedde et al. (2001) demonstrated their model of evacuees in an environment similar to an opera house.

7.1.1.2 Macroscopic approaches

Macroscopic approaches do not target the behavior of a single pedestrian but focus instead on aggregate properties; these models are commonly based on network flow concepts. We describe models used by Hamacher and Tjandra (2001), which are largely representative of this approach. In those, a graph is constructed that represents the environmental constraints, and calculations are performed to evaluate aspects of the graph and hence the implications of the environmental constraints on evacuation time.

The graph as an abstraction of the environment is most often used to represent a single building. A typical construction would have a single graph vertex per room, with an edge placed between the vertices of any two adjoining rooms; a single additional goal vertex then represents the area external to the building.
The edges of the graph are labeled with various route details, such as edge capacity and travel time. Vertices are also labeled, such as with initial contents and capacity. Graphs that capture dynamics (e.g., spreading fire fronts) through duplicates for each time-step are called dynamic networks.

Flow takes place from vertex to vertex, but cannot exceed the capacities defined in either the edges or the destination. Max flow represents the maximum throughput in the represented environment, and its optimization is typically solved using specialized network flow algorithms. Max flow and other calculations, such as quickest flow, provide a characterization of the environment, identification of bottlenecks and potential trouble areas, and allow engineers to answer “what if” questions regarding building structure. Hamacher and Tjandra (2001) claim that these macroscopic models are typically used to find good lower bound estimates for evacuation scenarios.

7.1.1.3 Microscopic approaches

While the described macroscopic models are useful, they are unable to take into consideration complex effects related to interactions and interference among people. Microscopic approaches, in contrast, can capture those effects, and thus focus on the individuals and the rules that dictate their inter-personal behavior. Evacuees are simulated as simple rule-governed entities in the environment that collectively generate representative crowd behavior. While rules that describe human response on the basis of distance are not new (see, for example, proxemics, the study of the personal use of space (Hall 1959)), attempting to understand the resultant collective dynamics through computational modeling is relatively recent.

One of the most widely used microscopic modeling approaches is the “social forces” model of Helbing et al. (2000). In it, people are modeled as particles (sizes sampled from an appropriate distribution function) that move when acted upon by virtual forces; force terms act between people and also between people and the environment.
Attractive forces bring groups of acquaintances together, while repulsive forces act to keep personal distances; the walls have a repulsive action, but a number of displays or other items of interest may have attractive forces. Additional frictional terms, inspired by granular flow, are used for high crowd densities.

The “social force” model has a single agitation parameter; by varying its value, a range of behavioral dynamics can be produced, from regular pedestrian flows to panicked behavior. The use of forces also allows the applied pressure to be calculated, and hence fatalities caused by crushing. In the most general case, force terms may depend on time and velocity, and may be anisotropic. This allows for fire-fronts and other dynamic elements to be modeled.

When discrete models of space are used in these models, Cellular Automata (CA) models and algorithms can be applied. These models are popular both for representing pedestrians (Keßel et al. 2001) and evacuees (Schadschneider 2001a,b) due to their simplicity and simulation speed. Models with continuous space and time also exist, such as Hoogendoorn et al. (2001). Helbing et al. (1997) present one of the few existing models where microscopic behavior can be linked with collective dynamics.

7.1.1.4 Space Syntax Analysis

The models described above are targeted either at pedestrians or evacuation traffic. However, more general models of space may also be useful in the context of evacuation. Space syntax theory is centered on the notion of environmental connectivity. Environments are described in terms of topological representations, and environmental complexity is measured through connections to neighbors. This makes the model scale-invariant, allowing the theory to be applied to structures as small as a single building (which we will focus on) and as large as an entire city (Hillier and Hanson 1984; Hillier 1996; Read 2002).
Jiang and Claramunt (2000) calculated a measure called the “local integration” for rooms within a museum, and found a clear correlation with people's observed routes in the first ten minutes spent in the museum. See Section 7.4.3.1 for a more detailed description of how to calculate integration values. Hillier (1996) described motorized traffic patterns in a portion of central London, showing that rush-hour and mid-day traffic flows along roads represented with axial lines significantly correlate with global and local integration values, respectively. This analysis has subsequently been performed in dozens of other cities with similar results (Read 2002). Collectively, this empirical evidence gave rise to the central theories of space syntax (Hillier 1996).

We make use of these models within our multi-robot system's coordination algorithm, as will be detailed next.

7.1.2 Application of models

The modeling and simulation methods described are typically used to study given designs in order to analyze bottlenecks and potential problem spots. Practitioners use the models to evaluate particular architectural designs, and to address “what if...” style questions. We are not aware of any work that has used these methods for on-line control.

7.2 Laser-based pedestrian tracking: micro- and macroscopic views

Despite the number of models and simulation techniques, there remains a dearth of real world data. We have studied different areas within the California Science Center in Los Angeles and instrumented an area called the Life-Tunnel and Cell Theatre exhibit. Figure 7.1 shows this relatively constrained environment and the way it was instrumented with laser range-scanners and overhead cameras. Museum visitors are free to walk across the life tunnel (the tunnel at the top) or through the main exhibit itself. Entrances and exits are on the left, but people do flow through the area at the bottom left, and between the exhibit and the life tunnel. The blue overlay shows laser range scans from two laser devices.
Figure 7.1: The constrained environment we consider is the life tunnel and cell theater exhibit. Both laser range finders and cameras were used; only the lasers are useful for automated tracking.

The cameras ordinarily have a limited field of view, and were fitted with fish-eye lenses. This allows for a view of the exhibits from both cameras, but extreme distortion and occlusion result from the low ceiling. Figure 5.12 shows laser scans and camera images. The images clearly show the distortion from the lens; any people near the entrance and exits are severely occluded. In addition to these physical constraints, we were legally forbidden from capturing data that did not maintain people's natural anonymity. We concluded that human annotation would be the most successful method for processing this video.

We implemented a simple background learning and clustering mechanism in order to process the laser data before tracking. The detected foreground from the laser scans is highlighted in red in Figure 5.12. Notice that in this particular case people are easy to distinguish from the background, but even so, it is extremely challenging to estimate group size simply from the area of sensor readings. Three tendencies can make foreground detection difficult in this domain: (1) once observing an exhibit, a visitor can stay still reading the exhibit text for several minutes, which can result in them being learned as part of the static background; (2) the exhibit includes chairs that are part of the background, but that can move; (3) the exhibits (i.e., posters and displays) are along the back wall and so visitors are provided with an incentive to move toward the background threshold and movable chairs.

Several standard Bayes' filter techniques for tracking individuals (e.g., Arbuckle et al. (2002); Fod et al. (2002)) have been applied to the captured data. Each has failed due to the severity and frequency of the occlusions.
Also, since our lasers were mounted at knee height, the data association became challenging: a set of foreground points can be produced by several people or one. This many-to-many relationship is challenging to represent.

7.2.1 Microscopic view: short trajectories

Our most successful solution is based on the gait and stride model developed by Zhao and Shibasaki (2005). They exploit the fact that a laser placed low enough to generate separate readings for each leg results in a recognizable spatio-temporal pattern. However, unlike the results reported for railway station foot traffic, the technique is only reliable over short trajectories. Occlusions and misclassification of background and foreground points mean that even the sophisticated temporal correlation algorithm employed will confuse pedestrians, mistaking two people for one or vice versa. Our solution to this was to employ this algorithm to estimate inter-exhibit flux, rather than trajectories.

Figure 7.2: Video stills, panels (a) to (d), from a ceiling-mounted fish-eye camera and the simultaneous laser range reading. Non-background pixels are identified as red blobs within the laser scan.
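The background learning and clustering step that produces the red foreground blobs in Figure 7.2 is not spelled out in the text. The Python sketch below shows one plausible version, assuming a per-beam median background model and a fixed range margin. The function names, the 0.3 m margin, and the gap tolerance are illustrative assumptions, not the implementation used in the museum.

```python
from typing import List, Tuple


def learn_background(scans: List[List[float]]) -> List[float]:
    """Per-beam background range: the median range seen over the training scans."""
    n_beams = len(scans[0])
    background = []
    for beam in range(n_beams):
        ranges = sorted(scan[beam] for scan in scans)
        background.append(ranges[len(ranges) // 2])
    return background


def foreground_clusters(
    scan: List[float],
    background: List[float],
    margin: float = 0.3,   # assumed threshold, in metres
    max_gap: int = 1,      # assumed tolerance, in beams
) -> List[Tuple[int, int]]:
    """Return (first_beam, last_beam) pairs for runs of foreground beams.

    A beam is foreground when its range is at least `margin` shorter than the
    learned background. Runs separated by at most `max_gap` background beams
    are merged into a single cluster.
    """
    clusters = []
    start = last = None
    for i, (rng, bg) in enumerate(zip(scan, background)):
        if bg - rng >= margin:
            if start is None:
                start = i
            elif i - last - 1 > max_gap:
                # Gap too wide: close the current cluster and open a new one.
                clusters.append((start, last))
                start = i
            last = i
    if start is not None:
        clusters.append((start, last))
    return clusters
```

A median model of this kind shares the first weakness listed earlier: a visitor who stands still for most of the training window is absorbed into the background. It also returns clusters of beams, not people, so one cluster may correspond to several visitors or to a single leg.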
Figure 7.3: Tracking output from both microscopic and macroscopic models. [Plots: probability versus number of people for Exhibits #1 to #5 and the Cell Theatre, shown for two successive snapshots.]

7.2.2 Macroscopic view: exhibit averages

We implemented a Bayes’ filter-based tracker that operates at the granularity of entire exhibits. It takes estimates of flux produced by the lower-level modified Zhao and Shibasaki (2005) tracker.¹ While the flux measurements are useful in propagating states (i.e., as an action model), they ultimately result in greater and greater variance in the estimates. This was complemented by an update that uses measurements of the total number of foreground pixels and the degree of occlusion within the exhibit. Figure 7.3 provides snapshots of both trackers operating together. The right image reflects the state a few seconds after the left image. Since no direct foreground measurements can be made of the cell theatre (bottom exhibit), the estimate is based directly on fluxes. The result is a much larger variance than the other exhibits.

7.3 Robotics-aided disaster response

The idea of using robots to aid in human rescue is not new; see for example the early paper by Kobayashi and Nakamura (1983).
Generally, the work focuses on urban search and rescue tasks that require robots to overcome highly uncertain (i.e., potentially drastically altered) environments in order to locate, treat, and possibly transport incapacitated people (Casper et al. 2000). Urban search and rescue, however, is not the only interesting and rich area of rescue robotics. To visualize the larger picture, in Figure 7.4, we propose a space of rescue robotics tasks and problems. The three axes of the space are: urban structural disintegration, diminishing human mobility, and expected response time.

In contrast to search and rescue (represented by the green volume in the diagram), we define the evacuation assistance task (represented by the blue volume), whose conditions are significantly different. The environment is typically in a much less affected state, evacuees are assumed to be self-mobile (but possibly impaired to an unknown degree and number), and the shorter evacuation phase occurs immediately, before the rescue interval. While the volumes occupied by each of these tasks clearly differ, the number of human lives saved in both scenarios is significant.

Løvås (1998) offers three reasons why evacuation can take longer than it should: delayed initial response, non-optimal selection of escape routes, and congestion while traveling. In cases of fire, initial response is often delayed because of underestimation of the effects of smoke (Proulx 1997). It has been widely noted that non-optimal selection of routes often occurs because people typically exit a building the same way they entered it. Thus, congestion occurs due to architectural shortcomings, identical choices of exit routes by people, and self-reinforcing herding effects.

¹ We thank Muhammad Emad-ud-din for his implementation of this tracker and the modifications he devised.
Low-visibility circumstances can be particularly dangerous because people tend to use visual cues as the primary means for spatial localization, and hence in escape route planning and navigation. The most tragic examples are of deaths that occur because victims are unable to locate suitable emergency exits in time. For example, at the Düsseldorf Airport Disaster (Düsseldorf Airport Disaster, 10 April 1997), asphyxiated bodies were discovered only three meters away from an emergency exit. Visual signs remain the most frequently used means to mark fire escape routes, in spite of the known rapid deterioration of their effectiveness with smoke accumulation.

7.4 Assistive audio-beacon deployment

Audio beacon deployment involves the dynamic distribution of beacons designed as human navigational cues throughout a building. As an application domain, it represents a simple proof of concept, showing that it is feasible for a robot system to use an on-line model of collective human behavior in order to make decisions. First we motivate the use of such beacons.

7.4.1 Directional audio

One solution to the evacuation problems described in the previous section is the use of directional audio beacons (Withington 2000, 2001, 2003). These beacons (see Figure 7.5) emit sound in a range of frequencies known to be well-suited for localization by the human ear and composed of fused sounds that minimize ambiguous “blind spots.” The beacons are called directional because their bearing can be determined with high certainty by a listener even when many beacons are active simultaneously.

Multi-floored buildings allow beacons to play an enhanced role. In addition to the standard pulsing broadband sound, beacons located near stairwells can play a sweeping melodic tone, with ascending tonality indicating that subjects should travel upward, and descending tonality meaning the reverse. Withington (2001) describes the use of such beacons to direct a group of people during a navigation task through one such complex environment. The subjects were not informed about the meaning of the sounds, but nevertheless not a single participant in any of the trials took a wrong turn or mistakenly ended up in the incorrect room.

Trials conducted with human participants (both with and without decreased visibility) established the overwhelming effectiveness of the beacons in aiding evacuation. For instance, in one trial, people got lost in an environment with 100% visibility. The participants had successfully evacuated the same environment, while smoke-filled, only minutes before with the aid of audio beacons (Withington 2001).

Figure 7.4: A decomposition of the task space for disaster recovery robotics. The small and large blocks represent evacuation assistance and urban search and rescue tasks, respectively. The space spanned by the axes indicates conditions of the survivors; some regions are infeasible, and no survivors are assumed to exist outside the axes.

Figure 7.5: Two directional beacons used to deliver audio signals so that the source’s location is easily identifiable. Manufactured by Brigade PLC and Klaxon Signals PLC respectively, the shrieker on the right was explicitly designed for evacuation scenarios.

7.4.2 Deployment algorithm

Consider the following motivating scenario: a fire has started in an office building, an alarm is triggered, firefighters are notified, and suitable vehicles deployed. In addition, a squad of robot operators arrives at the scene, with a team of robots (equipped with beacons). The squad deploys the robots into the building at the available entrances, including doors and windows. If a map of the building is known, it is provided to the robots ahead of time; if not, the robots perform an autonomous mapping phase.

Next, an operator uses a console to provide some useful information to the robot team. Specifically, he/she indicates which locations in the building constitute suitable exits, including existing doorways but also any emergency exits and newly-created exit points such as windows with ladders, etc.
In addition, the operator also provides any necessary navigation information the robots may not have acquired, such as stairwell connections. The robots then compute suitable deployment positions and navigate towards them. Upon arrival, they activate the beacons they carry. The beacon sounds serve two purposes; they 1) provide information about escape routes, and 2) spur any occupants who have missed or ignored prior evacuation warnings.

The above scenario involves a set of fundamental capabilities of the robots: 1) localization, navigation, and mapping, 2) wireless communication, and 3) task allocation for coordination. Standard techniques are used for each of these. The novel aspect of the system is the use of crowd models in order to provide domain knowledge for the task allocation mechanism; we describe this next.

7.4.2.1 Task-allocation for coordination

A direct way to cast the deployment problem into the task allocation framework is to consider the potential deployment locations as navigational tasks for each of the robots. This requires the construction of a utility matrix, with each element being an estimate of the utility, or expected worth, of a particular assignment. The (i, j)th utility matrix entry is for robot i being deployed to location j, and can be considered the reward for having a beacon at location j, less the cost expected to be incurred by that robot during navigation to that location. Once the full matrix has been constructed, robots can be assigned to tasks. To decentralize the process, we distribute the utility estimates and have the robots compute their own allocations locally.

Unfortunately, the optimal assignment problem formalism (Gerkey and Matarić 2004a) requires fixed non-interrelated utilities, a requirement that is not satisfied by this example. The reward obtained by having a beacon at a particular location depends not only on the environmental constraints, but also on where the other beacons are positioned.
Imagine that one exit is in an excellent location; assigning two robots close to that exit is inefficient, because once a robot has been deployed to a location, the utility of having others nearby is greatly decreased.

To avoid this problem we permit a task to include a set of locations within some local neighborhood; a robot assigned to such a task is assured that no other robots will be deployed within that neighborhood. Next, we discuss the problem in two phases: 1) choosing a list of locations suitable for deployment and clustering them into tasks, 2) assigning robots to those tasks. Each of these is discussed in turn.

7.4.2.2 Where to deploy

Feasible locations are selected from the set of emergency exits and stairways, which are strategic locations consistent with those used in validation experiments with the beacons. These locations are clustered based on distance. Assigning robots to entire clusters separates deployed beacons by a minimum threshold distance. The difficult problem involves finding the maximum such threshold.

Consider, for example, the two-floor environment shown in Figure 7.6. When deploying a single robot we consider the partition of feasible locations to be the set of all possible destinations, $S^{1}_{1} = \{1, \ldots, 7\}$. If another robot is added, then the environment is partitioned into two sets of locations, $S^{2}_{0} = \{1, \ldots, 4, 6\}$ and $S^{2}_{1} = \{5, 7\}$. This is performed using an approach similar to Kruskal’s algorithm (Cormen et al. 2001) for construction of a minimum spanning tree. The last step of Kruskal’s algorithm merges the two sets of vertices, $S^{2}_{0}$ and $S^{2}_{1}$, to form $S^{1}_{1}$. Rather than constructing the full spanning tree, the last merger need not be performed, resulting in a forest of size two. More generally, the set merger step is applied only until the correct number of tasks remains; the length of the last edge added to the minimum spanning tree is effectively the cluster threshold.

Figure 7.6: A representation of one of the multi-floored test environments used for beacon deployment (first and second floors). The dots mark pieces of information added by the operator, in this case emergency exits and stairwell links. The image on the left includes a topological overlay for the environment. On the right, the emergency exits are numbered 1–5 and the connecting stairways 6 and 7 (the second floor is only partially shown).

Figure 7.7: The map used with physical robots (first and second floors). Exits are marked 1 and 2. The connection labeled 3 is a stairwell. The ActivMedia Pioneer 2-DX is shown moving toward the top of the stairwell.

7.4.2.3 Whom to deploy

Each of the robots uses Kuhn’s Hungarian Method (Gerkey and Matarić 2004a) to calculate the assignments; in our implementation the calculation of utility matrix entries was far more computationally intensive than actually performing the assignment.

7.4.3 Evacuation dynamics models in utility calculation

The utility for robot m being positioned at location n is calculated as:

$$U_{m,n} = \max\left(0,\; \alpha\,(1 - RA^{\infty}_{n}) - \beta\,D^{n}_{m}\right) \qquad (7.1)$$

where $RA^{\infty}_{n}$ is the global integration from space syntax analysis. This gives us a value for how well connected the exit is; if the exit is well connected, $RA^{\infty}_{n} \approx 0$. We define $D^{n}_{m}$ to be the distance from the current pose of robot m to the final destination n; $\alpha$ and $\beta$ are simply weightings.

Figure 7.8: The distances of each robot (Nodes 1 to 5) from its assigned destination, plotted as distance travelled against time. The reassignment occurs at the 300 second mark, after the simulated failure, resulting in the discontinuity.

7.4.3.1 Space syntax measures

Hillier (1996) introduced the theory of space syntax to describe architectural configurations, and it was later found that this spatial information alone can be useful in predicting people’s use of space. The key measure is the measure of integration, whose calculation involves several steps. First the environment is skeletonized into a graph that captures the topological structure.
The mean depth of vertex i (calculated to radius r) is given as:

$$MD^{r}_{i} = \frac{1}{\|N_i\| - 1} \sum_{\substack{j \in N_i \\ j \neq i}} d(i,j), \quad \text{where } N_i = \{\forall k : d(i,k) \le r\} \qquad (7.2)$$

where d(i, j) is the minimum distance from vertex i to j, measured in numbers of edges of the adjacency graph. The set $N_i$ represents all vertices with a minimum distance no farther than radius r from vertex i (including i itself, which is why the sum excludes j = i). Frequently, r is taken to be as large as the entire graph, so that $N_i$ includes all vertices; this case is written $MD^{\infty}_{i}$. Another common case is r = 3, easily represented by $MD^{3}_{i}$. The mean depth of a vertex indicates how directly it is connected to the regions around it. The relative asymmetry of vertex i (calculated to radius r) is defined as:

$$RA^{r}_{i} = \frac{2\,(MD^{r}_{i} - 1)}{\|N_i\| - 2}, \quad \text{where } N_i = \{\forall k : d(i,k) \le r\} \qquad (7.3)$$

The case where the radius includes all elements in the graph is written $RA^{\infty}_{i}$; this is the global integration value used in Equation 7.1.

7.4.4 Experimental validation

We evaluated our implementation along two dimensions: 1) robustness with respect to failure and 2) effectiveness in terms of impact on the evacuees. A desirable system must meet both of these criteria to a satisfactory degree; we attempt to demonstrate that this is the case for our algorithm.

7.4.4.1 Robustness

We present a representative sample run where the system behaved reasonably. The run is one of ten instances in which beacons were deployed into the Figure 7.6 environment and a failure purposefully induced. Similar runs on three additional environments were also performed; in all cases the system demonstrated similar reliability, namely detection and reassignment to compensate for the failure. The forced failure was performed by shutting down a robot after it had been deployed. The exact time of this operation (and its detection) differed from run to run, which unfortunately makes it difficult to present a single visually meaningful summary of the data from all the runs.

Figure 7.8 shows the distance of each robot from its assigned goal destination for the sample run, over a range of times. (Robots 1 and 2 have line segments very close to one another, and are indistinguishable unless magnified.)
By time t = 200, all of the beacons have been deployed and the system could remain in this state indefinitely, with activated sound beacons. Next, robot 5 was shut down. The system remained in this state, with only four active beacons, until a time-out elapsed, resulting in a robot sending a request for another’s pose information. This caused a cascade of further requests. The remaining four robots discovered that robot 5 was no longer functioning, and that robot 4 would serve as a better beacon if it moved to the place previously assigned to the failed robot. This is observable as the spike in the graph around t = 300 for robot 4, because its destination changed. The case of a network partition is dealt with similarly; those robots that can communicate do so, and coordinate their efforts accordingly.

An additional ten runs were attempted on three physical robots on the first two floors of our laboratory building (see map in Figure 7.7). Our experimental platform consisted of three Pioneer 2-DX robots equipped with laser range finders. The first two robots were deployed from the second floor; they moved toward the top of the stairs (bottom left corner, Figure 7.7) and the exit marked 2 (upper right corner, Figure 7.7), respectively. A third robot was deployed on the bottom floor, and moved toward the exit marked 1. A failure was induced in the robot located at exit 2; after being detected, the robot above the stairwell moved toward exit 2, resulting in a better spread. After six consecutive successful trials, the experiment was halted due to a hardware failure.

Figure 7.9 (a): Distance per evacuee versus number of deployed beacons (with uniform sampling). Expected distance traveled per evacuee, when the simulation uses uniform (over area) sampling for initial locations. Both variance and mean decrease significantly after the introduction of a single beacon. Beacon bias parameter = 0 with 1,000 simulated pedestrians for each of the 6 beacon values.
Figure 7.9 (b): Distance per evacuee versus number of deployed beacons (with space-syntax sampling). Expected distance traveled per evacuee, when the simulation uses $RA^{\infty}_{i}$ as a statistical weight for initial locations i. The trend is consistent with (a). Parameter values are identical to those in (a).

Figure 7.9 (c): Bias parameter plot, mean distance versus number of deployed beacons (with uniform sampling). The effect of the beacon bias parameter on the mean escape distance. Solid lines represent the extreme values of 0 and 1, as upper and lower bounds, respectively. The bias parameter captures the ability of an evacuee to pick out the loudest (nearest) beacon; as expected, when this is done flawlessly (value = 1), egress distance and time are minimized.

Figure 7.9 (d): Bias parameter plot, mean distance versus number of deployed beacons (with space-syntax sampling). Same as (c), the only difference being that the initial sampling was based on $RA^{\infty}_{i}$ instead of uniform. Even if exactly the nearest beacon cannot be located, modeled as bias parameter value = 0, the mean egress time is not devastatingly affected.

Figure 7.9: All four of these plots give the mean distance an evacuee will travel (based on the simple flow model described in the text) as a function of the number of deployed beacons. This is a function of the locations of the beacons, too; this plot is for the five locations chosen by the described allocation algorithm. The two plots on the left have initial locations of simulated pedestrians assigned uniformly over the environmental area; the right plots use the value of the global integration as the statistical weight. The top two plots show generated data, mean and standard deviations (error bar width = one standard deviation) of simulated runs.
The bottom two plots show the range of variation as the bias parameter is varied from 0 (the solid line that is the upper bound) to 1 (the lower bounding solid line).

7.4.4.2 Effectiveness

There is no formal description for where the robots should best be deployed, or which beacons should be relocated when a beacon failure occurs. We have embraced the standard practice of using exits and stairways, but evaluation of our location selection mechanism is difficult.

We introduce a metric that attempts to capture the efficacy of a given arrangement of beacons. The metric is based upon a simulation of expected evacuee flow with and without beacons. We are unaware of any other existing simulation of evacuees (or pedestrians) being aided by external signals intended to improve movement.

We define a choice point as a location where there is a fork in the topological structure of the environment, i.e., where the evacuee (or robot) must make a navigational choice. These are all the vertices with degree three or larger. We consider these choice points as the locations wherein the navigational effects of audio (or other signals) occur. In order to simulate human behavior in an unknown environment, we assume that all unvisited options at any choice point are taken with equal probability. The audio signals bias the navigational decision taken at choice points.

A graph-based representation of the environment allows for the identification of choice points. In order to estimate macroscopic flow, we generated evacuees, placed them randomly on the graph, and simulated their navigational decisions. The distances traveled by each simulated person were recorded, permitting analysis.

Figure 7.9 shows the relationship between expected evacuation distance (or time) and the number of robots deployed into a particular environment. The graph was plotted with 1000 navigational instances for each of 6 values of “number of beacons.” Error bars in the top two plots give a measure for the variance of the distances traveled.

The four plots in Figure 7.9 summarize two fundamental variations explored with the model.
The first is stability with respect to the initial distribution of people. The plots on the left assume people are uniformly spread throughout the environment; this is quite far from true in everyday use of space. To address this fact, we used space syntax analysis and calculated the global integration throughout the environment. This was used as a probabilistic weight to generate locations (higher integrations resulting in more pedestrians); results are shown in the two plots on the right. As expected, this resulted in a slightly lower mean value for each of the evacuation times. However, the model has shown that this difference is insignificant when compared with the time variance.

The lower two plots demonstrate the second variation, the effect of the beacon bias parameter on the mean distances. The bias parameter captures the ability of an individual to infer distances to the beacons, and choose the route toward the nearest one or, said another way, the ability of the nearest beacon to bias the evacuee’s choice of route. A value of zero represents the case when route choice is based entirely on distance from the appropriate choice point to the beacons. As the value is increased toward unity, this models the person’s ability to pick the shortest route. The lower solid lines in plots (c) and (d) are for the parameter set to 1, and represent a lower bound on mean distance.

In the plot the robots were deployed from the same initial locations as the sample run from the previous section. The environment shown in Figure 7.6 is used here because it is the most complex of the environments we considered, enabling more robots to be deployed and hence more resulting data. The trend is, however, consistent with the other simulated deployment runs mentioned in the previous section: a very rapid decrease in effectiveness per beacon. This indicates that even very few beacons are likely to be effective. Both the mean and variance in egress times are lower in the presence of audio beacons than without. This is significant, because a decrease in one without the other has limited value.
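The evacuee flow simulation described in this section can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the model used to produce Figure 7.9: the environment is a connected adjacency-list graph, an evacuee stops on reaching any beacon vertex, distance is counted in graph edges, and `bias` is interpreted here as the probability of stepping toward the nearest beacon (otherwise an unvisited neighbour is chosen uniformly at random). The function names and the toy graph are hypothetical.

```python
import random
from collections import deque


def hops_to_nearest(graph, targets):
    """Multi-source BFS: edge count from each vertex to its nearest target."""
    dist = {t: 0 for t in targets}
    queue = deque(targets)
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def simulate_evacuee(graph, start, beacons, bias, rng, max_steps=10_000):
    """Walk from `start` until a beacon vertex is reached; return edges walked."""
    dist = hops_to_nearest(graph, beacons)
    visited = {start}
    node, steps = start, 0
    while node not in beacons and steps < max_steps:
        neighbours = graph[node]
        if rng.random() < bias:
            # Assumed beacon effect: step toward the nearest beacon.
            nxt = min(neighbours, key=lambda v: dist[v])
        else:
            # Unvisited options are equally likely; backtrack only at dead ends.
            unvisited = [v for v in neighbours if v not in visited]
            nxt = rng.choice(unvisited or neighbours)
        visited.add(nxt)
        node, steps = nxt, steps + 1
    return steps


def mean_distance(graph, beacons, bias, n_evacuees=1000, seed=0):
    """Average edges walked by evacuees placed uniformly at random on the graph."""
    rng = random.Random(seed)
    starts = list(graph)
    total = sum(
        simulate_evacuee(graph, rng.choice(starts), beacons, bias, rng)
        for _ in range(n_evacuees)
    )
    return total / n_evacuees


# Toy corridor graph (hypothetical): vertex 1 is a choice point, 2 is a dead end.
GRAPH = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1, 4], 4: [3]}
```

Calling `mean_distance(GRAPH, {4}, bias)` for several bias values, and again with an extra beacon such as `{2, 4}`, gives a small-scale analogue of the curves in Figure 7.9. Weighting the initial locations by global integration instead of sampling uniformly would require replacing `rng.choice(starts)` with a weighted draw.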
7.5 Human-based environmental metrics

There are other robotics applications for ideas from the pedestrian and crowd modeling community. For example, human-based environmental metrics can provide a standardized yardstick with which to compare environments.

As is well known, the structure and complexity of the environment can have a major influence on robot task performance. Consider, for example, the task of target tracking. Jung and Sukhatme (2002) note that “how obstructed [the environment] is, seems to be significant.” They then use a metric based on visibility between points in the environment to provide a measure of occlusion. Their measures (from the computer graphics community, originally proposed by Cazals and Sbert (1997)) are representative of a common focus on sensing as the baseline. The environment affects not only sensing, but also action, i.e., motion. Since motion is something that mobile robots must do, regardless of task, the effects of the environment that influence freedom of motion are particularly relevant.

Indoor environments are designed for human use, and so measures of human mobility are appropriate for assessing environmental structure. Of course, robots are not people, and so it is questionable how meaningful those measures are when applied to robot movement in the same environments. Nevertheless, focusing on the movements and capabilities of people does provide a robot-neutral, and hence relatively objective, measure.

7.6 Summary

This chapter has reviewed existing literature for modeling human collective behavior and described an algorithm for assistive beacon deployment that uses a space-syntax model during deployment. This qualitative model of the way humans collectively use space was used in the utility calculation procedure of the system. We validated this controller in simulation and on physical robots. An extension to a published model shows that even a few robots can have a positive effect on expected egress time.
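To make the utility calculation procedure concrete, Equations 7.1 to 7.3 can be evaluated on a small adjacency graph. The Python sketch below is illustrative only: the helper names, the example graph, and the weights α = 1 and β = 0.1 are assumptions, and it presumes a connected graph with more than two vertices inside the radius (otherwise Equation 7.3 divides by zero).

```python
from collections import deque


def hop_distances(graph, source):
    """BFS edge counts d(source, k) for every vertex k reachable from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def mean_depth(graph, i, radius=None):
    """Equation 7.2; radius=None means the whole graph (the infinite-radius case).

    Returns (MD, |N_i|), where N_i includes vertex i itself.
    """
    dist = hop_distances(graph, i)
    within = [d for d in dist.values() if radius is None or d <= radius]
    return sum(within) / (len(within) - 1), len(within)


def relative_asymmetry(graph, i, radius=None):
    """Equation 7.3: close to 0 for a well-connected vertex."""
    md, n = mean_depth(graph, i, radius)
    return 2.0 * (md - 1.0) / (n - 2)


def utility(graph, location, travel_distance, alpha=1.0, beta=0.1):
    """Equation 7.1: reward for a well-connected location minus travel cost."""
    ra = relative_asymmetry(graph, location)
    return max(0.0, alpha * (1.0 - ra) - beta * travel_distance)


# Hypothetical three-room corridor: room 1 connects rooms 0 and 2.
CORRIDOR = {0: [1], 1: [0, 2], 2: [1]}
```

In this toy corridor the middle room has relative asymmetry 0 and keeps a utility of 0.8 for a robot two distance units away, while each end room has relative asymmetry 1 and a utility of 0. A full utility matrix built this way, one row per robot and one column per task, is the input that an assignment method such as the Hungarian Method would consume.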
The work described in this chapter complements the preceding work in the following ways:

• It moves away from systems where the underlying causal mechanisms are known. Without knowing the full microscopic details of the underlying system, analysis of people provides a smaller formal contribution, but a more significant experimental one. This has involved significantly different systems challenges to the ones related to swarms.
• It demonstrated the direct use of macroscopic control in applications, rather than merely as an alternative to other synthesis methods.
• It demonstrated the use of macroscopic models in an online fashion.
• It showed that macroscopic models can be used with explicit coordination methods and need not be solely the purview of researchers concerned with implicitly coordinated robot systems.
• It considered a new system: namely crowds. The problem of evacuation assistance is a novel form of human-robot interaction. We are not aware of any other existing work that seeks to interact with humans macroscopically rather than as individuals, as in this work.

Chapter 8: Dissertation summary

This dissertation has developed a macroscopic model-based strategy for producing desirable behavior in distributed systems comprised of many autonomous interacting individuals. The focus was on methods that permit multiple levels of description to be reconciled in a useful manner. These ideas have been applied to the study of large-scale multi-robot systems that must coordinate in order to achieve some collective task. The methods described in this document produce descriptive models that inform the programmer by providing a characterization of the distributed system’s collective behavior during design time. Such models, whether produced as part of the synthesis methodology or not, can also be used to provide robots with an idea of the impact of their actions on other agents (e.g., humans or robots) at execution time. This research has shown that such a macroscopic-centric approach is an effective way of programming large multi-robot systems.
The idea underlying this work is that one may design and control large loosely-coupled distributed systems while reasoning almost exclusively in terms of collective properties. One of the original motivations for this hypothesis was the observation that little existing robotics work tackles the local-to-global problem directly by considering sufficient conditions for predictability. This dissertation has aimed to do exactly that, by restricting the set of basic primitives to ergodic processes, and showing that these are still sufficiently powerful to be useful. The following key properties, which are useful for system synthesis, have been demonstrated.

1. Formal predictability. There exist tractable statistical mechanical methods for describing the processes and characterizing their long-term average behavior.
2. Compositional capabilities. These processes can be layered and combined with other processes. The behavior of two coupled ergodic processes can result in a new process whose behavior may be predicted and described.
3. Computational expressiveness. The processes are sufficiently powerful to perform non-trivial collective computation that is useful in optimizing and producing adaptive behavior in swarms of robots.
4. Applicability. The processes are appropriate for large-scale multi-robot systems in which important issues arise including asynchronicity, sensor noise, uncertainty, etc.

The results show that the deliberate trading away of expressibility at the local level so as to achieve predictability of the resulting macroscopic behavior can be justified. In the case of ergodic processes, what is lost at the micro-level (e.g., rich temporal dynamics) can be regained through composition at the macro-level (e.g., a temporal dynamics in sequencing). Generality is gained through the compositional approach we have described. However, an important limitation of the method is that longer and longer timescales are required as more processes are coupled.
This places a practicable limit on the synthesis methodology, because without sufficient equilibration times, the transient dynamics of the processes come to dominate the produced behavior. When all the processes interact on short timescales, the result resembles a behavior-based controller, and other methods (e.g., Harper and Winfield (2006)) are required.

More philosophically, this dissertation represented three broad shifts in thinking:

• First, this work has been an attempt to bridge the cultural divide between those robotics researchers who study biologically-inspired, implicitly coordinated, multi-robot systems and those who produce novel application-orientated systems. Whilst the former have a strong focus on model building, sometimes almost as an end in itself, the latter frequently work without models of collective behavior or concern for the consequences of interaction dynamics. This work is a demonstration of model-orientated programming of systems. Additionally, the coarse models used show that simple phenomenological models are extremely useful during synthesis.
• Secondly, this work outlines an alternative way that multi-robot systems can use research from the biological sciences. Frequently multi-robot systems are biologically-inspired in the sense that local interaction rules, known to produce some useful or interesting collective behavior in social insects, are adapted for use by robots in order to produce comparable collective behavior. This inspiration is really an appropriation of mechanism. Unfortunately, this has led to few significant insights that are generalizable beyond particular tasks. Because this work uses abstractions at the macroscopic level, it is less concerned with local mechanisms than with the types of global structures that arise. Problems can be solved by combining known macroscopic templates rather than by seeking wholly new local mechanisms to achieve tasks.
If the abstractions that programmers use describe global behavior rather than local interactions, then perhaps those abstractions may be reused more successfully. The most important lessons to learn from other disciplines may be about the macroscopic structures that occur across natural systems and not the specific way that they are implemented.
• Thirdly, this work is a response to the despair which, as outlined in the first few paragraphs of the introduction, can follow from the lack of a general theory of complexity and collective dynamics. This research shows that even limited knowledge can helpfully guide the designer engaged in system synthesis.¹ Furthermore, it suggests that the commonalities already identified across a wide range of natural many-body systems should serve as a guide to the high-level properties that are useful for macroscopic control.
¹ The epistemological implications of this shift in the role of models have an interesting similarity to Ernst Mach’s reaction to the growing “scientific realism” during the time Ludwig Boltzmann published his theory of gases. The positivist position is summed up with Mach’s famous assertion that the basic purpose of science is “economy of thought” rather than production of descriptive theories based on hypothetical concepts (Boltzmann 1896 & 1898, pp. 13–14, Translator’s Introduction).
Bibliography
David H. Ackley, Geoffrey E. Hinton, and Terrence J. Sejnowski. A Learning Algorithm for Boltzmann Machines. Cognitive Science, 9(1):147–169, January–March 1985.
Arvin Agah and George A. Bekey. Phylogenetic and ontogenetic learning in a colony of interacting robots. Autonomous Robots, 4(1):85–100, March 1997.
Alok Aggarwal, Leonidas J. Guibas, James B. Saxe, and Peter W. Shor. A linear-time algorithm for computing the Voronoi diagram of a convex polygon. Discrete & Computational Geometry, 4(6):591–604, 1989.
Philip Agre and David Chapman. What are Plans For. Robotics and Autonomous Systems: Special Issue on Designing Autonomous Agents, 6(1–2):17–34, June 1990.
Philip W. Anderson.
More is Different. Science, 177:393–396, 1972.
AntWeb Field Guide, 2008. AntWeb field guide to Temnothorax of The World. Made available by the California Academy of Sciences from http://www.antweb.org, 2008. Last access: 25 April 2008.
Michael A. Arbib. Visuomotor coordination: Neural models and perceptual robotics. In Visuomotor Coordination: Amphibians, Comparisons, Models and Robots, pages 121–171. Plenum Press, 1989.
Michael A. Arbib and Jim-Shih Liaw. Sensorimotor transformations in the worlds of frogs and robots. Artificial Intelligence: Special Volume on Computational Research on Interaction and Agency, Part 1, 72(1–2):53–79, January 1995.
Daniel Arbuckle, Andrew Howard, and Maja J. Matarić. Temporal Occupancy Grids: a method for classifying spatio-temporal properties of the environment. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 409–414, Lausanne, Switzerland, September 2002.
Ronald C. Arkin. Motor Schema-Based Navigation for a Mobile Robot. International Journal of Robotics Research, 8(4):92–112, August 1989.
Ronald C. Arkin. Behavior-Based Robotics. MIT Press, Cambridge, MA, U.S.A., 1998.
Ronald C. Arkin, Tucker R. Balch, and Elizabeth Nitz. Communication of Behavioral State in Multi-agent Retrieval Tasks. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA’93), pages 588–594, Atlanta, GA, U.S.A., May 1993.
Vladimir I. Arnold and André Avez. Ergodic Problems of Classical Mechanics. W. A. Benjamin, Inc., New York, NY, U.S.A., 1968.
W. Ross Ashby. An Introduction to Cybernetics. Methuen, London, U.K., 1956.
W. Ross Ashby. Design for a Brain: The origin of adaptive behavior. Wiley & Sons, Inc., New York, NY, U.S.A., second edition, 1966. (original edition, 1952).
Joseph Ayers and Jill Crisman. The Lobster as a Model for an Omnidirectional Robotic Ambulation Control Architecture. In Randall D. Beer, Roy E.
Ritzmann, and Thomas McKenna, editors, Biological Neural Networks in Invertebrate Neuroethology and Robots, pages 287–316. Academic Press, New York, NY, U.S.A., 1992.
Per Bak. How Nature Works: The Science of Self-Organized Criticality. Copernicus, New York, NY, U.S.A., 1996.
Per Bak, Chao Tang, and Kurt Wiesenfeld. Self-organized criticality: an explanation of 1/f noise. Physical Review Letters, 59:381–384, 1987.
Tucker R. Balch. Behavioral Diversity in Learning Robot Teams. PhD thesis, Georgia Institute of Technology, College of Computing, December 1998.
Tucker R. Balch, Frank Dellaert, Adam Feldman, Andrew Guillory, Charles L. Isbell Jr., Zia Khan, Stephen C. Pratt, Andrew N. Stein, and Hank Wilde. How Multirobot Systems Research Will Accelerate Our Understanding of Social Animal Behavior. Proceedings of the IEEE: Special Issue on Multi-robot Systems, 94(7):1445–1463, July 2006.
Michael Batty. Predicting where we walk. Nature, 388:19–20, July 1997.
Ralph Beckers, Owen E. Holland, and Jean-Louis Deneubourg. From Local Actions to Global Tasks: Stigmergy and Collective Robotics. In Artificial Life IV: Proceedings of the fourth International Workshop on the Synthesis and Simulation of Living Systems, pages 181–189, Cambridge, MA, U.S.A., July 1994.
Calin Belta and R. Vijay Kumar. Abstractions and Control Policies for a Swarm of Robots. IEEE Transactions on Robotics, 20(5):865–875, 2004.
Gines Benet, Francisco Blanes, José E. Simó, and Pascual Pérez. Using infrared sensors for distance measurement in mobile robots. Robotics and Autonomous Systems, 40(4):255–266, September 2002.
Gerardo Beni and Jing Wang. Swarm Intelligence. In Proceedings of the seventh Annual Meeting of the Robotics Society of Japan, pages 425–428, Tokyo, Japan, 1989.
Marc Berhault, He Huang, Pinar Keskinocak, Sven Koenig, Wedad Elmaghraby, Paul Griffin, and Anton J. Kleywegt. Robot Exploration with Combinatorial Auctions. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS’03), pages 1957–1962, Las Vegas, NV, U.S.A., October 2003.
Elwyn R.
Berlekamp, John H. Conway, and Richard K. Guy. Winning ways for your mathematical plays. Academic Press, New York, NY, U.S.A., 1982.
Ofer Biham, A. Alan Middleton, and Dov Levine. Self-organization and a dynamical transition in traffic-flow models. Physical Review A, 46(10):R6124–R6127, 1992.
Aude Billard and Kerstin Dautenhahn. Experiments in learning by imitation: Grounding and Use of Communication in Robotic Agents. Adaptive Behavior, 7(3/4):415–438, Winter 1999.
Manuel Blum and Dexter Kozen. On the power of the compass (or, why mazes are easier to search than graphs). In Proceedings of the Nineteenth Annual Symposium on Foundations of Computer Science (FOCS’78), pages 132–142, October 1978.
Marc Böhlen. A Robot in a Cage: Exploring Interactions between Animals and Robots. In Proceedings of the IEEE International Symposium on Computational Intelligence in Robotics and Automation (CIRA’99), pages 214–219, Monterey, CA, U.S.A., November 1999.
Ludwig Boltzmann. Vorlesungen über Gastheorie. Dover, New York, NY, U.S.A., 1896 & 1898. Translated and edited by Stephen G. Brush as “Lectures on Gas Theory” in two volumes, University of California Press, Berkeley, CA, 1964.
Eric Bonabeau, Marco Dorigo, and Guy Theraulaz. Swarm Intelligence: From Natural to Artificial Systems. Oxford University Press, Inc., New York, NY, U.S.A., 1999.
Andrew F. G. Bourke and Nigel R. Franks. Social Evolution in Ants. Princeton University Press, Princeton, NJ, U.S.A., 1995.
Michael E. Bratman. Intentions, Plans, and Practical Reason. Harvard University Press, Cambridge, MA, U.S.A., 1987.
Cynthia Breazeal, Andrew Brooks, David Chilongo, Jesse Gray, Guy Hoffman, Cory Kidd, Hans Lee, Jeff Lieberman, and Andrea Lockerd. Working Collaboratively with Humanoid Robots. In Proceedings of the fourth IEEE/RAS International Conference on Humanoid Robotics, pages 253–272, Los Angeles, CA, U.S.A., November 2004.
Rodney A. Brooks. Cambrian Intelligence: The Early History of the New AI. MIT Press, Cambridge, MA, U.S.A., 2001.
Rodney A. Brooks. A Robust Layered Control System for a Mobile Robot.
IEEE Transactions on Robotics and Automation, 2(1):14–23, April 1986. Reprinted in Brooks (2001) pages 3–26.
Rodney A. Brooks. Elephants Don’t Play Chess. Robotics and Autonomous Systems: Special Issue on Designing Autonomous Agents, 6(1–2):3–15, June 1990. Reprinted in Brooks (2001) pages 111–132.
Rodney A. Brooks. Intelligence Without Reason. In Proceedings of the twelfth International Joint Conference on Artificial Intelligence (IJCAI-91), pages 569–595, Sydney, Australia, 1991. Reprinted in Brooks (2001) pages 79–101.
Rodney A. Brooks and Anita M. Flynn. Fast, Cheap, and Out of Control: a robot invasion of the solar system. Journal of the British Interplanetary Society, 42(10):478–485, October 1989.
Roger Brown. Social Psychology. The Free Press, New York, NY, U.S.A., first edition, 1965.
John Bonner Buck. Synchronous Rhythmic Flashing of Fireflies. The Quarterly Review of Biology, 13(3):301–314, September 1938. (cited by Mirollo and Strogatz (1990), pp. 1).
Francesco Bullo and Andrew D. Lewis. Geometric Control of Mechanical Systems: Modeling, Analysis, and Design for Simple Mechanical Control Systems. Springer, Berlin, Germany, 2004.
Carsten Burstedde, Ansgar Kirchner, Kai Klauck, Andreas Schadschneider, and Johannes Zittartz. Cellular automaton approach to pedestrian dynamics - applications. In Proceedings of the International Conference on Pedestrian and Evacuation Dynamics, pages 87–97, Duisburg, Germany, 2001.
Earl Callen and Don Shapero. A Theory of Social Imitation. Physics Today, 27:23–28, July 1974.
Mike Campos, Eric Bonabeau, Guy Theraulaz, and Jean-Louis Deneubourg. Dynamic scheduling and division of labor in social insects. Adaptive Behavior, 8(2):83–94, 2001.
Angelo Cangelosi and Thomas Riga. An Embodied Model for Sensorimotor Grounding and Grounding Transfer: Experiments with Epigenetic Robots. Cognitive Science, 30(4):673–689, 2006.
Y. Uny Cao, Alex S. Fukunaga, Andrew B. Kahng, and Frank Meng. Cooperative mobile robotics: Antecedents and directions.
In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS’95), pages 226–234, Pittsburgh, PA, U.S.A., August 1995.
Y. Uny Cao, Alex S. Fukunaga, and Andrew B. Kahng. Cooperative mobile robotics: Antecedents and directions. Autonomous Robots, 4(1):7–27, March 1997.
Gilles Caprari, Alexandre Colot, Roland Siegwart, José Halloy, and Jean-Louis Deneubourg. Animal and Robot Mixed Societies: Building Cooperation Between Microrobots and Cockroaches. IEEE Robotics & Automation Magazine, 12(2):58–65, June 2005.
Jennifer Casper, Mark Micire, and Robin R. Murphy. Issues in intelligent robots for search and rescue. SPIE Ground Vehicle Technology II, 4:41–46, April 2000.
F. Cazals and M. Sbert. Some integral geometry tools to estimate the complexity of 3D scenes. Technical Report RR-3204, Institut National de Recherche en Informatique et en Automatique (INRIA), July 1997.
Dave Cliff. The computational hoverfly: a study in computational neuroethology. In Jean-Arcady Meyer and Stewart W. Wilson, editors, From Animals to Animats: Proceedings of the first International Conference on Simulation of Adaptive Behavior (SAB’90), pages 87–96, Paris, France, September 1990.
Phillip R. Cohen and Hector J. Levesque. Teamwork. Noûs: Special Issue on Cognitive Science and Artificial Intelligence, 25(4):487–512, September 1991.
Blaine J. Cole. Short-Term Activity Cycles in Ants: Generation of Periodicity by Worker Interaction. The American Naturalist, 137(2):244–259, February 1991.
Jonathan H. Connell. Minimalist Mobile Robotics: A Colony Architecture for an Artificial Creature. Academic Press, Boston, MA, U.S.A., 1990.
Jonathan H. Connell. SSS: A Hybrid Architecture Applied to Robot Navigation. In Proceedings IEEE/RSJ International Conference on Robotics and Automation (ICRA’92), pages 2719–2724, Nice, France, May 1992.
Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Cliff Stein. Introduction to Algorithms. MIT Press, Cambridge, MA, U.S.A., 2nd edition, 2001.
Iain D. Couzin, Jens Krause, Richard James, Graeme D. Ruxton, and Nigel R.
Franks. Collective memory and spatial sorting in animal groups. Journal of Theoretical Biology, 218:1–11, 2002.
Valentino Crespi, George Cybenko, Daniela Rus, and Massimo Santini. Decentralized Control for Coordinated flow of Multiagent Systems. In Proceedings of the International Joint Conference on Neural Networks, pages 2604–2609, Honolulu, HI, U.S.A., May 2002.
James P. Crutchfield. The Calculi of Emergence: Computation, Dynamics, and Induction. Physica D: Nonlinear Phenomena, 75(1–3):11–54, August 1994.
Vince Darley. Emergent Phenomena and Complexity. In Artificial Life IV: Proceedings of the fourth International Workshop on the Synthesis and Simulation of Living Systems, pages 411–416, Cambridge, MA, U.S.A., July 1994.
Jean-Louis Deneubourg, Jacques M. Pasteels, and Jean-Claude Verhaeghe. Probabilistic Behavior in Ants: A Strategy of Errors. Journal of Theoretical Biology, 105:259–271, 1983.
Jean-Louis Deneubourg, Simon Goss, Nigel R. Franks, Ana B. Sendova-Franks, Claire Detrain, and Laeticia Chrétien. The Dynamics of Collective Sorting: Robot-Like Ants and Ant-Like Robots. In Jean-Arcady Meyer and Stewart W. Wilson, editors, From Animals to Animats: Proceedings of the first International Conference on Simulation of Adaptive Behavior (SAB’90), pages 356–363, Paris, France, September 1990.
Daniel M. J. Devasirvatham, C. Banerjee, M. J. Krain, and D. A. Rappaport. Multi-Frequency Radiowave Propagation Measurements in The Portable Radio Environment. IEEE Conference on Communications, 4:1334–1340, 1990.
Olivier Devillers. On deletion in Delaunay triangulations. In Proceedings of the fifteenth annual symposium on Computational Geometry, pages 181–188, Miami Beach, FL, USA, June 1999.
M. Bernardine Dias, Robert Zlot, Nidhi Kalra, and Anthony Stentz. Market-based multirobot coordination: a survey and analysis. Proceedings of the IEEE: Special Issue on Multi-robot Systems, 94(7):1257–1270, July 2006.
Bruce Randall Donald. On Information Invariants in Robotics.
Artificial Intelligence: Special Volume on Computational Research on Interaction and Agency, Part 1, 72(1–2):217–304, January 1995.
Bruce Randall Donald, Jim Jennings, and Daniela Rus. Minimalism + Distribution = Supermodularity. Journal of Experimental and Theoretical Artificial Intelligence, 9(2–3):293–321, April 1997.
Marco Dorigo, Vittorio Maniezzo, and Alberto Colorni. The Ant System: Optimization by a Colony of Cooperating Agents. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 26(1):29–41, February 1996.
Gregory Dudek, Michael Jenkin, Evangelos Milios, and David Wilkes. A Taxonomy for Multi-Agent Robotics. Autonomous Robots, 3(4):375–397, 1996.
Gregory Dudek, Michael Jenkin, and Evangelos Milios. A Taxonomy of Multirobot Systems. In Tucker Balch and Lynne E. Parker, editors, Robot Teams: From Diversity to Polymorphism, chapter 1, pages 3–22. A. K. Peters, Natick, MA, U.S.A., April 2002.
Gregory D. Durgin and Theodore S. Rappaport. Theory of multipath shape factors for small-scale fading wireless channels. IEEE Transactions on Antennas and Propagation, 48(5):682–693, May 2000.
Düsseldorf Airport Disaster, 10 April 1997. Official Prosecutor Investigative Report, 1997.
Rosemary Emery-Montemerlo, Geoff Gordon, Jeff Schneider, and Sebastian Thrun. Game Theoretic Control for Robot Teams. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA’05), pages 1163–1169, Barcelona, Spain, April 2005.
Michael A. Erdmann. On Probabilistic Strategies for Robot Tasks. PhD thesis, Massachusetts Institute of Technology, Electrical Engineering and Computer Science Department, 1989.
Deborah Estrin, Ramesh Govindan, John Heidemann, and Satish Kumar. Next century challenges: Scalable coordination in sensor networks. In Proceedings of the ACM/IEEE International Conference on Mobile Computing and Networking (MOBICOM’99), pages 263–270, Seattle, WA, U.S.A., August 1999.
Alessandro Farinelli, Luca Iocchi, and Daniele Nardi. Multirobot Systems: A Classification Focused on Coordination.
IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 34(5):2015–2028, October 2004.
Richard P. Feynman. Simulating physics with computers. International Journal of Theoretical Physics, 21:467–488, 1982.
Mark Filipiak. Mesh Generation. Edinburgh Parallel Computing Centre Technology Watch Report, 1996.
Michael E. Fisher. The theory of equilibrium critical phenomena. Reports on Progress in Physics, 30:615–730, 1967.
Ajo Fod, Andrew Howard, and Maja J. Matarić. A Laser-based People Tracker. In Proceedings IEEE/RSJ International Conference on Robotics and Automation (ICRA’02), pages 3024–3029, Washington DC, U.S.A., May 2002.
Terrence Fong, Illah Nourbakhsh, and Kerstin Dautenhahn. A survey of socially interactive robots. Robotics and Autonomous Systems, 42:143–166, 2003.
Steven Fortune. A sweepline algorithm for Voronoi diagrams. In Proceedings of the second annual symposium on Computational Geometry, pages 313–322, Yorktown Heights, NY, USA, June 1986.
Ian Foster and Carl Kesselman. The Grid 2: Blueprint for a New Computing Infrastructure. Morgan Kaufmann, second edition, 2003.
Dieter Fox, Jonathan Ko, Kurt Konolige, Benson Limketkai, Dirk Schulz, and Benjamin Stewart. Distributed Multi-robot Exploration and Mapping. Proceedings of the IEEE: Special Issue on Multi-robot Systems, 94(7):1325–1339, July 2006.
Stan Franklin and Art Graesser. Is it an Agent, or Just a Program?: A Taxonomy for Autonomous Agents. In Proceedings of the third Workshop on Intelligent Agents: Agent Theories, Architectures, and Languages at the twelfth European Conference on Artificial Intelligence (ECAI’96), pages 21–35, Budapest, Hungary, August 1996.
Nigel R. Franks. Army Ants: A Collective Intelligence. American Scientist, 77(2):139–145, March 1989.
Nigel R. Franks and Ana B. Sendova-Franks. Brood sorting by ants: distributing the workload over the work-surface. Behavioral Ecology and Sociobiology, 30(2), March 1992.
Nigel R. Franks, Eamonn B. Mallon, Helen E. Bray, Mathew J. Hamilton, and Thomas C. Mischler.
Strategies for choosing between alternatives with different attributes: exemplified by house-hunting ants. Animal Behaviour, 65(1):215–223, January 2003.
Toshio Fukuda, Seiya Nakagawa, Yoshio Kawauchi, and Martin Buss. Self organizing robots based on cell structures: CEBOT. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS’88), pages 145–150, Tokyo, Japan, November 1988.
Douglas W. Gage. Command Control for Many-Robot Systems. Unmanned Systems Magazine, 10(4):28–34, Fall 1992.
Edwin R. Galea. Simulating evacuation and circulation in planes, trains, buildings and ships using the exodus software. In Proceedings of the International Conference on Pedestrian and Evacuation Dynamics, pages 203–225, Duisburg, Germany, 2001.
Saurabh Ganeriwal, Ram Kumar, and Mani B. Srivastava. Timing-sync protocol for sensor networks. In Proceedings of the International Conference on Embedded Networked Sensor Systems (SenSys’03), pages 138–149, Los Angeles, California, USA, November 2003.
Deepak Ganesan, Deborah Estrin, Alec Woo, and David Culler. Complex Behavior at Scale: An Experimental Study of Low-Power Wireless Sensor Networks. Technical Report 02-0013, Computer Science Department, University of California, Los Angeles, 2002.
Simon Garnier, Christian Jost, Raphaël Jeanson, Jacques Gautrais, Masoud Asadpour, Gilles Caprari, and Guy Theraulaz. Collective Decision-Making by a Group of Cockroach-Like Robots. In Proceedings of the IEEE Swarm Intelligence Symposium (SIS’05), pages 233–240, Pasadena, CA, U.S.A., June 2005.
Brian P. Gerkey. On Multi-Robot Task Allocation. PhD thesis, University of Southern California, Computer Science Department, August 2003.
Brian P. Gerkey and Maja J. Matarić. Sold!: Auction methods for multi-robot coordination. IEEE Transactions on Robotics and Automation: Special Issue on Multi-robot Systems, 18(5):758–768, October 2002.
Brian P. Gerkey and Maja J. Matarić. A formal analysis and taxonomy of task allocation in multi-robot systems.
International Journal of Robotics Research, 23(9):939–954, September 2004a.
Brian P. Gerkey and Maja J. Matarić. Are (explicit) multi-robot coordination and multi-agent coordination really so different? In Proceedings of the AAAI Spring Symposium on Bridging the Multi-Agent and Multi-Robotic Research Gap, pages 1–3, Palo Alto, CA, U.S.A., March 2004b.
Brian P. Gerkey, Richard T. Vaughan, and Andrew Howard. The Player/Stage Project: Tools for Multi-Robot and Distributed Sensor Systems. In Proceedings of the International Conference on Advanced Robotics (ICAR’03), pages 317–323, Coimbra, Portugal, July 2003.
Simon F. Giszter, Ferdinando A. Mussa-Ivaldi, and Emilio Bizzi. Convergent force fields organized into the frog’s spinal cord. Journal of Neuroscience, 13(2):467–491, 1993.
Dani Goldberg. Evaluating the Dynamics of Agent-Environment Interaction. PhD thesis, University of Southern California, Computer Science Department, 2001.
Dani Goldberg and Maja J. Matarić. Maximizing Reward in a Non-Stationary Mobile Robot Environment. Autonomous Agents and Multi-Agent Systems Journal, 6(3):287–316, 2003.
Dani Goldberg and Maja J. Matarić. Interference as a Tool for Designing and Evaluating Multi-Robot Controllers. In Proceedings of the fourteenth AAAI National Conference on Artificial Intelligence (AAAI’97), pages 637–642, Providence, RI, U.S.A., July 1997.
Simon Goss, Serge Aron, Jean-Louis Deneubourg, and Jacques M. Pasteels. Self-organized shortcuts in the Argentine ant. Naturwissenschaften, 76:579–581, 1989.
Frank Grasso, Thomas Consi, David Mountain, and Jelle Atema. Locating odor sources in turbulence with a lobster-inspired robot. In Pattie Maes, Maja J. Matarić, Jean-Arcady Meyer, Jordan Pollack, and Stewart W. Wilson, editors, From Animals to Animats: Proceedings of the fourth International Conference on Simulation of Adaptive Behavior (SAB’96), pages 101–112, Paris, France, September 1996.
Dieter H. E. Gross. Microcanonical Thermodynamics: Phase Transitions in “Small” Systems.
World Scientific Lecture Notes in Physics - Volume 66. World Scientific, Singapore, 2001.
Dieter H. E. Gross. Microcanonical thermodynamics and statistical fragmentation of dissipative systems: the topological structure of the n-body phase space. Physics Reports, 279:119–202, 1997.
Carlos Guestrin, Shobha Venkataraman, and Daphne Koller. Context-Specific Multiagent Coordination and Planning with Factored MDPs. In Proceedings of the eighteenth AAAI National Conference on Artificial Intelligence (AAAI’02), pages 253–259, Edmonton, Alberta, Canada, July 2002.
Leonidas J. Guibas and Daniel Russel. An empirical comparison of techniques for updating Delaunay triangulations. In Proceedings of the twentieth annual symposium on Computational Geometry, pages 170–179, Brooklyn, NY, USA, June 2004.
Leonidas J. Guibas and Jorge Stolfi. Primitives for the Manipulation of General Subdivisions and the Computation of Voronoi Diagrams. ACM Transactions on Graphics, 4(2):74–123, 1985.
Edward T. Hall. The Silent Language. Doubleday, Garden City, NY, U.S.A., 1959.
José Halloy, Gregory Sempo, Gilles Caprari, Colette Rivault, Masoud Asadpour, Fabien Tâche, I. Saïd, Virginie Durier, Stéphane Canonge, Jean Marc Amé, Claire Detrain, Nikolaus Correll, Alcherio Martinoli, Francesco Mondada, Roland Siegwart, and Jean-Louis Deneubourg. Social integration of robots into groups of cockroaches to control self-organized choices. Science, 318(5853):1155–1158, November 2007.
Horst W. Hamacher and Stevanus A. Tjandra. Mathematical modelling of evacuation problems: A state of the art. In Proceedings of the International Conference on Pedestrian and Evacuation Dynamics, pages 227–266, Duisburg, Germany, 2001.
Christopher J. Harper and Alan F. T. Winfield. A methodology for provably stable behaviour-based intelligent control. Robotics and Autonomous Systems, 54(1):52–73, January 2006.
Dirk Helbing. Traffic and related self-driven many-particle systems. Reviews of Modern Physics, 73(4):1067–1141, 2001.
Dirk Helbing.
Quantitative Sociodynamics. Stochastic Methods and Models of Social Interaction Processes. Kluwer Academic, Dordrecht, Netherlands, 1995.
Dirk Helbing, Frank Schweitzer, Joachim Keltsch, and Péter Molnár. Active walker model for the formation of human and animal trail systems. Physical Review E, 56(3):2527–2539, September 1997.
Dirk Helbing, Illés Farkas, and Tamás Vicsek. Simulating Dynamical Features of Escape Panic. Nature, 407:487–490, September 2000.
Dirk Helbing, Illés Farkas, Peter Molnár, and Tamás Vicsek. Simulation of pedestrian crowds. In Proceedings of the International Conference on Pedestrian and Evacuation Dynamics, pages 21–58, Duisburg, Germany, 2001a.
Dirk Helbing, Peter Molnár, Illés Farkas, and Kai Bolay. Self-organizing pedestrian movement. Environment and Planning B: Planning and Design, 28(3):361–383, May 2001b.
Dirk Helbing, Anders Johansson, and Habib Zein Al-Abideen. Dynamics of crowd disasters: An empirical study. Physical Review E, 75(4):040101, April 2007.
Bill Hillier. Space is the Machine: A Configurational Theory of Architecture. Cambridge University Press, Cambridge, U.K., 1996.
Bill Hillier and Julienne Hanson. The Social Logic of Space. Cambridge University Press, Cambridge, U.K., 1984.
Charles A. R. Hoare. Communicating Sequential Processes. Communications of the ACM, 21(8):666–677, August 1978.
Tad Hogg and Bernardo A. Huberman. Controlling Chaos in Distributed Systems. IEEE Transactions on Systems, Man, and Cybernetics: Special Section on Distributed AI, 21(6):1325–1332, November/December 1991.
Tad Hogg and David W. Sretavan. Controlling Tiny Multi-Scale Robots for Nerve Repair. In Proceedings of the twentieth AAAI National Conference on Artificial Intelligence (AAAI’97), pages 1286–1291, Pittsburgh, PA, U.S.A., July 2005.
Owen E. Holland and Chris Melhuish. Stigmergy, Self-Organization, and Sorting in Collective Robotics. Artificial Life, 5(2):173–202, 1999.
Serge P. Hoogendoorn, Piet H. L. Bovy, and Winnie Daamen. Microscopic pedestrian wayfinding and dynamics modelling.
In Proceedings of the International Conference on Pedestrian and Evacuation Dynamics, pages 123–154, Duisburg, Germany, April 2001.
John Horgan. The End of Science. Broadway, New York, NY, U.S.A., 1997.
Daniel G. Horvitz and Donovan J. Thompson. A generalization of sampling without replacement from a finite universe. Journal of the American Statistical Association, 47:663–685, 1952.
Andrew Howard, Maja J. Matarić, and Gaurav S. Sukhatme. Mobile Sensor Network Deployment using Potential Fields: A Distributed Scalable Solution to the Area Coverage Problem. In Proceedings of the sixth International Conference on Distributed Autonomous Robotic Systems (DARS’02), pages 299–308, Fukuoka, Japan, June 2002.
Andrew Howard, Lynne E. Parker, and Gaurav S. Sukhatme. The SDR Experience: Experiments with a Large-Scale Heterogeneous Mobile Robot Team. In Proceedings of the ninth International Symposium on Experimental Robotics (ISER’04), Singapore, June 2004.
Andrew Howard, Gaurav S. Sukhatme, and Maja J. Matarić. Multi-Robot Mapping using Manifold Representations. Proceedings of the IEEE: Special Issue on Multi-robot Systems, 94(7):1360–1369, July 2006.
Stephen P. Hubbell, Leslie K. Johnson, Eileen Stanislav, Berry Wilson, and Harry Fowler. Foraging by Bucket-Brigade in Leaf-Cutter Ants. Biotropica, 12(3):210–213, September 1980.
Jim Jennings and Chris Kirkwood-Watts. Distributed Mobile Robotics by the Method of Dynamic Teams. In Proceedings of the International Symposium on Distributed Autonomous Robotic Systems, pages 46–56, Karlsruhe, Germany, May 1998.
B. Jiang and C. Claramunt. Extending Space Syntax towards an Alternative Model of Space within GIS. In 3rd European Agile Conference on Geographic Information Science, May 2000.
Chris V. Jones. A Principled Design Methodology for Minimalist Multi-Robot System Controllers. PhD thesis, University of Southern California, Computer Science Department, August 2005.
Chris V. Jones and Maja J. Matarić. Adaptive Division of Labor in Large-Scale Minimalist Multi-Robot Systems.
In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS’03), pages 1969–1974, Las Vegas, NV, U.S.A., October 2003.
Chris V. Jones and Maja J. Matarić. Automatic Synthesis of Communication-Based Coordinated Multi-Robot Systems. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS’04), pages 381–387, Sendai, Japan, September 2004a.
Chris V. Jones and Maja J. Matarić. Synthesis and Analysis of Non-Reactive Controllers for Multi-Robot Sequential Task Domains. In Proceedings of the ninth International Symposium on Experimental Robotics (ISER’04), Singapore, June 2004b.
Boyoon Jung and Gaurav S. Sukhatme. Tracking Targets using Multiple Robots: The Effect of Environment Occlusion. Autonomous Robots, 13(3):191–205, November 2002.
Leslie Pack Kaelbling and Stanley J. Rosenschein. Action and planning in embedded agents. Robotics and Autonomous Systems: Special Issue on Designing Autonomous Agents, 6(1–2):35–48, June 1990.
Leslie Pack Kaelbling, Michael L. Littman, and Anthony R. Cassandra. Planning and Acting in Partially Observable Stochastic Domains. Artificial Intelligence, 101:99–134, May 1998.
Joseph M. Kahn, Randy H. Katz, and Kristofer S. J. Pister. Next century challenges: Mobile Networking for Smart Dust. In Proceedings of the ACM/IEEE International Conference on Mobile Computing and Networking (MOBICOM’99), pages 271–278, Seattle, WA, U.S.A., August 1999.
Nidhi Kalra and Alcherio Martinoli. A Comparative Study of Market-Based and Threshold-Based Task Allocation. In Proceedings of the eighth International Conference on Distributed Autonomous Robotic Systems (DARS’06), pages 191–101, Minneapolis, MN, U.S.A., July 2006.
Boris S. Kerner and Peter Konhäuser. Cluster effect in initially homogeneous traffic flow. Physical Review E, 48(4):R2335–R2338, 1993.
Boris S. Kerner and Hubert Rehborn. Experimental properties of phase transitions in traffic flow. Physical Review Letters, 79(20):4030–4033, November 1997.
Andreas Keßel, Hubert Klüpfel, Joachim Wahle, and Michael Schreckenberg. Microscopic simulation of pedestrian crowd motion. In Proceedings of the International Conference on Pedestrian and Evacuation Dynamics, pages 193–200, Duisburg, Germany, April 2001.
Oussama Khatib. Real-time obstacle avoidance for manipulators and mobile robots. International Journal of Robotics Research, 5(1):90–98, Spring 1986.
Eric Klavins. Communication Complexity of Multi-Robot Systems. In Jean-Daniel Boissonnat, Joel Burdick, Ken Goldberg, and Seth Hutchinson, editors, Algorithmic Foundations of Robotics V, volume 7 of Springer Tracts in Advanced Robotics, pages 275–292. Springer, Berlin, Germany, 2003a.
Eric Klavins. A Formal Model of a Multi-Robot Control and Communication Task. In Proceedings of the IEEE Conference on Decision and Control (CDC’03), pages 4133–4139, Maui, HI, U.S.A., December 2003b.
Akihiro Kobayashi and Katsumi Nakamura. Rescue robots for Fire Hazards. In Proceedings of the First International Conference on Advanced Robotics (ICAR’83), pages 91–98, September 1983.
Nathan Koenig and Andrew Howard. Design and Use Paradigms for Gazebo, An Open-Source Multi-Robot Simulator. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS’04), pages 2149–2154, Sendai, Japan, September 2004.
Kurt Konolige, Dieter Fox, Charlie Ortiz, Andrew Agno, Michael Eriksen, Benson Limketkai, Jonathan Ko, Benoit Morisset, Dirk Schulz, Benjamin Stewart, and Regis Vincent. Centibots: Very large scale distributed robotic teams. In Proceedings of the ninth International Symposium on Experimental Robotics (ISER’04), Singapore, June 2004.
David Kotz, Calvin Newport, and Chip Elliott. The mistaken axioms of wireless-network research. Technical Report TR2003-467, Dartmouth College, Computer Science, Hanover, NH, July 2003.
Tobias Kretz, Anna Grünebohm, Maike Kaufman, Florian Mazur, and Michael Schreckenberg.
Experimental study of pedestrian counterflow in a corridor. Journal of Statistical Mechanics: Theory and Experiment, page P10014, October 2006.
Bhaskar Krishnamachari. Networking Wireless Sensors. Cambridge University Press, Cambridge, U.K., December 2005.
C. Ronald Kube. Collective Robotics: From Local Perception to Global Action. PhD thesis, University of Alberta, Department of Computing Science, 1997.
C. Ronald Kube and Hong Zhang. Collective Robotics: From Social Insects to Robots. Adaptive Behavior, 2(2):189–219, Fall 1993.
Fabian Kuhn, Roger Wattenhofer, and Aaron Zollinger. Ad-Hoc Networks Beyond Unit Disk Graphs. In Proceedings of the first ACM Joint Workshop on Foundations of Mobile Computing (DIALM-POMC), San Diego, CA, U.S.A., September 2003.
Gustave Le Bon. The crowd. Viking, New York, 1895.
Joel L. Lebowitz and Oliver Penrose. Modern Ergodic Theory. Physics Today, pages 155–175, February 1973.
Kristina Lerman and Aram Galstyan. A General Methodology for Mathematical Analysis of Multi-Agent Systems. Technical Report ISI-TR-529, USC Information Sciences Institute, 2001.
Kristina Lerman and Aram Galstyan. Mathematical Model of Foraging in a Group of Robots: Effect of Interference. Autonomous Robots, 13(2):127–141, September 2002.
Kristina Lerman, Aram Galstyan, Alcherio Martinoli, and Auke Jan Ijspeert. A Macroscopic Analytical Model of Collaboration in Distributed Robotic Systems. Artificial Life, 7(4):375–393, 2001.
Kristina Lerman, Chris V. Jones, Aram Galstyan, and Maja J. Matarić. Analysis of Dynamic Task Allocation in Multi-Robot Systems. International Journal of Robotics Research, 25(4):225–242, March 2006.
M. Anthony Lewis and George A. Bekey. The Behavioral Self-Organization of Nanorobots Using Local Rules. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS’92), pages 1333–1338, Raleigh, NC, U.S.A., July 1992.
Magnus Lindhé, Karl Henrik Johansson, and Antonio Bicchi. An experimental study of exploiting multipath fading for robot communications.
In Proceedings of Robotics: Science andSystems,Atlanta,GA,USA,June2007. MattLong,AaronGage,RobinMurphy,andKimonValavanis. ApplicationoftheDistributed Field Robot Architecture to a Simulated Demining Task. In Proceedings of the IEEE Inter- national Conference on Robotics and Automation (ICRA’05),pages3193–3200,Barcelona, Spain,April2005. G.G. Løv˚ as. Models of wayfinding in emergency evacuations. European Journal of Opera- tionalResearch,105:371–389,1998. EamonnB.Mallon,StephenC.Pratt,andNigelR.Franks. Individualandcollectivedecision- making during nest site selection by the ant Leptothorax albipennis. Behavioral Ecology andSociobiology,50(4):352–359,September2001. Davide Marocco, Angelo Cangelosi, and Stefano Nolfi. The emergence of communication in evolutionary robots. Philosophical Transactions of the Royal Society A: Mathematical, PhysicalandEngineeringSciences,351(1811):2397–2421,October2003. Mikl´ osMar´ oti,BranislavKusy,GyulaSimon,and ´ AkosL´ edeczi. TheFloodingTimeSynchro- nizationProtocol. InProceedingsoftheInternationalConferenceonEmbeddedNetworked SensorSystems(SenSys’04),pages39–49,Baltimore,MD,USA,2004. DavidMarr. Vision. W.H.FreemanandCo.,1982. (citedbyWebb(2001),pp.9). 245 DanielMarthalerandAndreaL.Bertozzi. CollectiveMotionAlgorithmsforDeterminingEn- vironmentalBoundaries. TechnicalReport03-17,ComputationalandAppliedMathematics, UniversityofCalifornia,LosAngeles,April2003. AlcherioMartinoli,KjerstinI.Easton,andWilliamAgassounon. ModelingofSwarmRobotic Systems: ACaseStudyinCollaborativeDistributedManipulation. InternationalJournalof RoboticsResearch—SpecialIssueonExperimentalRobotics,23(4):415–436,2004. AukeJanIjspeertAlcherioMartinoli,AudeBillard,andLucaMariaGambardella. Collabora- tionthroughtheExploitationofLocalInteractionsinAutonomousCollectiveRobotics: The StickPullingExperiment. AutonomousRobots,11(2):149–171,September2001. Maja J. Matari´ c. Situated Robotics. In Encyclopedia of Cognitive Science. 
Nature Publishing Group,MacmillanReferenceLimited,November2002. MajaJ.Matari´ c. NavigatingWithaRatBrain: ANeurobiologically-InspiredModelforRobot Spatial Representation. In Jean-Arcady Meyer and Stewart W. Wilson, editors, From Ani- mals to Animats: Proceedings of the first International Conference on Simulation of Adap- tiveBehavior(SAB’90),pages169–175,Paris,France,September1990. MajaJ.Matari´ c.IntegrationofRepresentationintoGoal-DrivenBehavior-BasedRobots. IEEE TransactionsonRoboticsandAutomation,8(3):304–312,June1992. MajaJ.Matari´ c. DesigningandUnderstandingAdaptiveGroupBehavior. AdaptiveBehavior, 4(1):51–80,Summer1995. DonaldA.McQuarrie. Statistical Mechanics. HarperandRow,1976. reprintedbyUniversity ScienceBooks,Sausalito,CA,U.S.A.in2000. NicholasMetropolis,AriannaW.Rosenbluth,MarshallN.Rosenbluth,AugustaH.Teller,and Edward Teller. Equation of State Calculations by Fast Computing Machines. Journal of ChemicalPhysics,6(21):1087–1092,1953. Olivier Michel. Webots: Professional Mobile Robot Simulation. International Journal of AdvancedRoboticSystems,1(1):39–42,2004. RobinMilner.ACalculusofCommunicatingSystems,volume92ofLectureNotesinComputer Science. Springer-Verlag,Berlin,1980. Robin Milner. Communicating and Mobile Systems: the π-calculus . Cambridge University Press,Cambridge,England,1999. Nelson Minar, Roger Burkhart, Chris Langton, and Manor Askenazi. The Swarm Simulation System: A Toolkit for Building Multi-agent Simulations . Santa Fe Institute Working Paper 96-06-042,SantaFe,1999. http://www.swarm.org,Lastaccess: 31October2006. AlexanderMintz. Non-adaptivegroupbehavior. TheJournalofAbnormalandNormalSocial Psychology,46:150–159,1951. 246 Renato E. Mirollo and Steven H. Strogatz. Synchronization of Pulse-Coupled Biological Os- cillators. SIAMJournalonAppliedMathematics,50(6):1645–1662,December1990. Pragnesh Jay Modi, Wei-Min Shen, Milind Tambe, and Makoto Yokoo. ADOPT: Asyn- chronous Distributed Constraint Optimization with Quality Guarantees. 
Artificial Intelli- gence,161(1–2):149–180,January2005. Cristopher Moore. Unpredictability and Undecidability in Dynamical Systems. Physical Re- viewLetters,64:2354–2357,1990. Robin R. Murphy, Jennifer Casper, and Mark Micire. Potential Tasks and Research Issues of MobileRobotsinRoboCupRescue. InPeterStone,TuckerR.Balch,andGerhardK.Kraet- zschmar, editors, RoboCup-2000: Robot Soccer World Cup IV , pages 339–344. Springer Verlag,Berlin,2001. FerdinandoA.Mussa-IvaldiandSimonF.Giszter.Vectorfieldapproximation: acomputational paradigmformotorcontrolandlearning. BiologicalCybernetics,67(6):491–500,1992. KaiNagelandMicaelSchreckenberg. Acellularautomatonmodelforfreewaytraffic. Journal dePhysiqueIFrance,2:2221–2229,1992. Allen Newell and Herbert A. Simon. Computer Science as Empirical Inquiry: Symbols and Search. CommunicationsoftheACM,19(3):113–126,1976. Stefano Nolfi, Dario Floreano, Orazio Miglino, and Francesco Mondada. How to evolve au- tonomousrobots: Differentapproachesinevolutionaryrobotics.InProceedingsofthefourth InternationalWorkshopontheSynthesisandSimulationofLivingSystems(ALife-IV) ,pages 190–197,Cambridge,MA,U.S.A.,July1994. PaulY.OhandWilliamE.Green. CQAR:ClosedQuarterAerialRobotDesignforReconnais- sance, Surveillance and Target Acquisition Tasks in Urban Areas. International Journal of ComputationalIntelligence,1(4):353–360,2004. Jason M. O’Kane and Steven M. LaValle. On Comparing the Power of Mobile Robots. In Proceedings of Robotics: Science and Systems (RSS’06), Philadelphia, PA, U.S.A., August 2006. Esben H. Østergaard, Gaurav S. Sukhatme, and Maja J. Matari´ c. Emergent Bucket Brigad- ing—ASimpleMechanismforImprovingPerformanceinMulti-RobotConstrained-Space ForagingTasks. InProceedingsofthefifthInternationalConferenceonAutonomousAgents (Agents’01),pages29–30,Montreal,Quebec,Canada,May2001. Robin E. Owen. Colony-level selection in the social insects: single locus additive and nonad- ditivemodels. TheoreticalPopulationBiology,29:198–234,1989. Daniel R. 
Parisi and Claudio O. Dorso. Why “Faster is Slower” in Evacuation Process. In ProceedingsoftheInternationalConferenceonPedestrianandEvacuationDynamics,pages 341–346,Vienna,Italy,2005. 247 Lynne E. Parker. Current State of the Art in Distributed Robot Systems. In Proceedings of the fifth International Conference on Distributed Autonomous Robotic Systems (DARS’00), pages3–12,Knoxville,TN.,U.S.A.,October2000. Lynne E. Parker. Current research inmultirobot systems. Artificial Life and Robotics, 7(1–2): 1–5,March2003. Lynne E. Parker. On the design of behavior-based multi-robot teams. Advanced Robotics, 10 (6):547–578,1996. Lynne E. Parker. Alliance: An architecture for fault-tolerant multi-robot cooperation. IEEE TransactionsonRoboticsandAutomation,14(2):220–240,1998. Gordon Pask. An Approach to Cybernetics. Harper & Brothers Publishers, New York, NY, U.S.A.,1961. David Payton, Mike Daily, Regina Estowski, Mike Howard, and Craig Lee. Pheromone Robotics. AutonomousRobots,11(3):319–324,November2001. Guilherme A. S. Pereira, Mario F. M. Campos, and Vijay Kumar. Decentralized Algorithms for Multi-Robot Manipulation via Caging. International Journal of Robotics Research, 23 (7–8):783–795,2004. KarlPetersen. ErgodicTheory. CambridgeUniversityPress,Cambridge,England,1983. Henri Poincar´ e. Les M´ ethodes nouvelles de la m´ ecanique c´ eleste, volume 1. Gauthier-Villars et fils, Paris, France, 1892. Translated and edited by Daniel L. Goroff as “New Methods in CelestialMechanics”,AmericanInstituteofPhysics,Woodbury,NY,1993. StephenC.Pratt. Behavioralmechanismsofcollectivenest-sitechoicebytheant Temnothorax curvispinosus. InsectesSociaux,52:383–392,2005. Stephen C. Pratt, Eamonn B. Mallon, David J.T. Sumpter, and Nigel R. Franks. Quorum sensing, recruitment, and collective decision-making during colony emigration by the ant Leptothoraxalbipennis. BehavioralEcologyandSociobiology,52(2):117–127,July2002. Ilya Prigogine and Isabelle Stengers. 
Order out of Chaos: Man’s New Dialogue with Nature. BantumBooks,NewYork,NY,U.S.A.,1984. G. Proulx. Misconceptions about human behavior in fire emergencies. Canadian Consulting Engineer,pages36–38,March1997. David V. Pynadath and Milind Tambe. Multiagent Teamwork: Analyzing the Optimality and Complexity of Key Theories and Models. In Proceedings of the first International Confer- enceonAutonomousAgentsandMultiagentSystems(AAMAS’02),pages873–880,Bologna, Italy,July2002. 248 Mohammad H. Rahimi, Richard Pon, William Kaiser, Gaurav S. Sukhatme, Deborah Estrin, andManiSrivastava. AdaptiveSamplingforEnvironmentalRobotics. InProceedingsofthe IEEE International Conference on Robotics and Automation (ICRA’04), pages 3537–3544, NewOrleans,LA,U.S.A.,April2004. Theodore S. Rappaport. Wireless Communications: Principles and Practice. Prentice-Hall, Inc.,UpperSaddleRiver,NJ,U.S.A.,secondedition,December2001. Stephen Read. The grain of space in time: the spatial/functional inheritance of amsterdam’s centre. UrbanDesignInternational,5(3):209–220,December2002. JohnH.ReifandJongyanWang. SocialPotentialFields: ADistributedBehavioralControlfor AutonomousRobots. RoboticsandAutonomousSystems,27(3):171–194,May1999. AristidesA.G.Requicha. Nanorobots,NEMSandNanoassembly. ProceedingsoftheIEEE— SpecialIssueonNanoelectronicsandNanoprocessing,91(11):1922–1933,November2003. Mitchel Resnick. Turtles, Termites, and Traffic Jams: Explorations in Massively Parallel Mi- croworlds. MITPress,Cambridge,MA,U.S.A.,1994. GiacomoRizzolatti,LucianoFadiga,VittorioGallese,andLeonardoFogassi. Premotorcortex andtherecognitionofmotoractions. CognitiveBrainResearch,3(2):131–141,March1996. Stanley J. Rosenschein and Leslie Pack Kaelbling. A Situated View of Representation and Control. ArtificialIntelligence—SpecialVolumeonComputationalResearchonInteraction andAgency,Part2,73(1–2):149–173,February1995. Maayan Roth, Reid Simmons, and Manuela Veloso. What to Communicate? Execution-Time Decision in Multi-agent POMDPs. 
In Proceedings of the eighth International Conference onDistributedAutonomousRoboticSystems(DARS’06),pages177–186,Minneapolis,MN, U.S.A.,July2006. Nicholas Roy, Geoffrey Gordon, and Sebastian Thrun. Finding Approximate POMDP solu- tions Through Belief Compression. Journal of Artificial Intelligence Research, 23:1–40, January–June2005. OlavR¨ uppellandRexKirkman. ExtraordinarystarvationresistanceinTemnothoraxrugatulus (Hymenoptera,Formicidae)colonies: Demographyandadaptivebehavior.InsectesSociaux, 52:282–290,2005. R. Andrew Russell. Ant trails - an example for robots to follow? In Proceedings IEEE International Conference on Robotics and Automation (ICRA), pages 2698–2703, Detroit, MI,May1999. Jerome H. Saltzer, David P. Reed, and David D. Clark. End-To-End Arguments in System Design. ACMTransactionsonComputerSystems,2(4):277–288,November1984. 249 Hanan Samet. The Quadtree and Related Hierarchical Data Structures. ACM Computing Surveys,16(2):187–260,June1984. Yuzuru Sato and James P. Crutchfield. Coupled Replicator Equations for the Dynamics of LearninginMultiagentSystems. PhysicsReviewE,1(67):40–43,2003. AndreasSchadschneider. Cellularautomatonapproachtopedestriandynamics–applications. In Proceedings of the International Conference on Pedestrian and Evacuation Dynamics, pages87–97,Duisberg,Germany,April2001a. Andreas Schadschneider. Cellular automaton approach to pedestrian dynamics – theory. In ProceedingsoftheInternationalConferenceonPedestrianandEvacuationDynamics,pages 75–85,Duisberg,Germany,April2001b. Mac Schwager, James McLurkin, and Daniela Rus. Distributed Coverage Control with Sen- sory Feedback for Networked Robots. In Proceedings of Robotics: Science and Systems (RSS’06),Philadelphia,PA,U.S.A.,August2006. Thomas D. Seeley. The Honey Bee Colony as a Superorganism. American Scientist, 77:546– 553,November-December1989. KosukeSekiyamaandToshioFukuda. 
ModelingandControllingofGroupBehaviorBasedon Self-OrganizingPrinciple.In ProceedingsoftheIEEEInternationalConferenceonRobotics andAutomation(ICRA’96),pages1407–1412,Minneapolis,MN,U.S.A.,April1996. Dylan A. Shell, Chris V. Jones, and Maja J. Matari´ c. Ergodic dynamics by design: a route to predictable multi-robot systems. In Lynne E. Parker, Frank E. Schneider, and Alan C. Schultz, editors, Multi-Robot Systems. From Swarms to Intelligent Automata, Volume III , pages291–297,TheNetherlands,March2005.Springer. Herbert A. Simon. The Sciences of the Artificial. MIT Press, Cambridge, MA, U.S.A., third edition,1996. NeilJ.Smelser. TheoryofCollectiveBehavior. TheFreePress,NewYork,NY,U.S.A.,1962. Warren D. Smith. Church’s thesis meets the N-body problem. Applied Mathematics and Computation—SpecialIssueonHypercomputation,178(1):154–183,July2006. CharlesP.Snow. TheTwoCulturesandtheScientificRevolution. CambridgeUniversityPress, Cambridge,U.K.,1959. Dongjin Son, Bhaskar Krishnamachari, and John Heidemann. Analysis of Concurrent Packet TransmissionsinLow-PowerWirelessNetworks. TechnicalReportISI-TR-2005-609,USC InformationSciencesInstitute,November2005. John R. Spletzer and Camillo J. Taylor. Dynamic Sensor Planning and Control for Optimally TrackingTargets. InternationalJournalofRoboticsResearch,22(1):7–20,January2003. 250 Luc Steels. Evolving grounded communication for robots. Trends in Cognitive Science, 7(7): 308–312,July2003. LucSteels. Towardsatheoryofemergentfunctionality. InJean-ArcadyMeyerandStewartW. Wilson,editors,FromAnimalstoAnimats: ProceedingsofthefirstInternationalConference on Simulation of Adaptive Behavior (SAB’90), pages 451–461, Paris, France, September 1990. Luc Steels and Paul Vogt. Grounding adaptive language games in robotic agents. In Pro- ceedings of the fourth European Conference on Artificial Life (ECAL’97), pages 474–482, Brighton,U.K.,July1997. RobertF.Stengel. OptimalControlandEstimation. Dover,NewYork,NY,U.S.A.,1994. G. Keith Still. 
Crowd Dynamics. PhD thesis, Warwick University, Mathematics department, August2000. Peter Stone and Manuela Veloso. Multiagent Systems: A survey from a machine learning perspective. AutonomousRobots,8(3):345–383,July2000. John H. Sudd. The foraging method of Pharaoh’s ant, Monomorium pharaonis (Linnaeus). AnimalBehaviour,8:67–75,1960a. John H. Sudd. The transport of prey by an ant Pheidole crassinoda. Behaviour, 16:295–308, 1960b. Ken Sugawara and Masaki Sano. Cooperative acceleration of task performance: Foraging behavior of interacting multi-robots system. Physica D: Nonlinear Phenomena, 100(3/4): 343–354,February1997. KenSugawara,MasakiSano,IkuoYoshihara,KenichiAbe,andToshinoriWatanabe. Foraging Behavior of Multi-robot System and Emergence of Swarm Intelligence. In Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics (SMC’99), pages 257–262,Tokyo,Japan,October1999. Gaurav S. Sukhatme, Amit Dhariwal, Bin Zhang, Carl Oberg, Beth Stauffer, and David A. Caron. The Design and Development of a Wireless Robotic Networked Aquatic Microbial ObservingSystem. EnvironmentalEngineeringScience,24(2):205–215,March2006. DavidJ.T.Sumpter,GuyB.Blanchard,andDavidS.Broomhead. Antsandagents: Aprocess algebra approach to modelling ant colony behaviour. Bulletin of Mathemtical Biology, 63: 951–980,2001. Kohei Suseki, Ken Sugawara, Tsuyoshi Mizuguchi, and Kazuhiro Kosuge. Proportion Reg- ulation for Division of Labor in Multi-Robot System. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS’05), pages 2339–2344, Edmonton,Alberta,Canada,August2005. 251 Guy Theraulaz, Simon Goss, Jacques Gervet, and Jean-Louis Deneubourg. Task Differentia- tion in Polistes Wasp Colonies: A Model for Self-Organizing Groups of Robots. In Jean- Arcady Meyer and Stewart W. 
Wilson, editors, From Animals to Animats: Proceedings of thefirstInternationalConferenceonSimulationofAdaptiveBehavior(SAB’90),pages346– 355,Paris,France,September1990. Guy Theraulaz, Eric Bonabeau, Stamatios C. Nicolis, Ricard V. Sol´ e, Vincent Fourcassi´ e, St´ ephane Blanco, Richard Fournier, Jean-Louis Joly, Pau Fern ´ andez, Anne Grimal, Patrice Dalle,andJean-LouisDeneubourg. Spatialpatternsinantcolonies. ProceedingsoftheNa- tionalAcademyofSciencesoftheUnitedStatesofAmerica,99(15):9645–9649,July2002. George Thomas, Ayanna M. Howard, Andrew B. Williams, and Aryen Moore-Alston. Multi- Robot Task Allocation in Lunar Mission Construction Scenarios. In Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics (SMC’05), pages 518– 523,Waikoloa,HI,U.S.A.,October2005. SebastianThrun. RoboticMapping: ASurvey. InGerhardLakemeyerandBernhardNebel,ed- itors,ExploringArtificialIntelligenceintheNewMillenium,pages1–35.MorganKaufmann PublishersInc.,SanFrancisco,CA,U.S.A.,2002. SebastianThrun,WolframBurgard,andDieterFox. ProbabilisticRobotics. MITPress,Cam- bridge,MA,U.S.A.,2005. RichardT.Vaughan,NeilSumpter,AndyFrost,andStephenCameron.Robotsheepdogproject achieves automatic animal control. In Rolf Pfeifer, Bruce Blumberg, Jean-Arcady Meyer, and Stewart W. Wilson, editors, From Animals to Animats: Proceedings of the fifth Inter- nationalConferenceonSimulationofAdaptiveBehavior(SAB’98),pages494–498,Zurich, Switzerland,August1998. RichardT.Vaughan,KasperStøy,GauravS.Sukhatme,andMajaJ.Matari´ c. Goahead,make my day: Robot conflict resolution by aggressive competition. In Jean-Arcady Meyer, Alain Berthoz,DarioFloreano,HerbertL.Roitblat,andStewartW.Wilson,editors,FromAnimals to Animats: Proceedings of the sixth International Conference on Simulation of Adaptive Behavior(SAB’00),pages491–500,Paris,France,August2000a. Richard T. Vaughan, Neil Sumpter, Jane Henderson, Andy Frost, and Stephen Cameron. Ex- periments in Automatic Flock Control. 
Journal of Robotics and Autonomous Systems, 31: 109–117,2000b. Richard T. Vaughan, Kasper Støy, Gaurav S. Sukhatme, and Maja J. Matari´ c. Blazing a trail: insect-inspired resource transportation by a robot team. In Proceedings of the International Symposium on Distributed Autonomous Robotic Systems, pages 111–120, Knoxville, TN, October2000c. Richard T. Vaughan, Kasper Støy, Maja J. Matari´ c, and Gaurav S. Sukhatme. LOST: Localization-Space Trails for Robot Teams. IEEE Transactions on Robotics and Automa- tion,18(5):796–812,October2002. 252 AnastasiosVergis,KennethSteiglitz,andBradleyDickinson.TheComplexityofAnalogCom- putation. MathematicsandComputersinSimulation,28:91–113,1986. Lovekesh Vig and Julie A. Adams. Multi-robot coalition formation. IEEE Transactions on RoboticsandAutomation,22(4):637–649,August2006. W.GreyWalter. Animitationoflife. ScientificAmerican,182(5):42–45,May1950. W.GreyWalter. Amachinethatlearns. ScientificAmerican,185(2):60–63,August1951. W. Grey Walter. The Living Brain. W. W. Norton & Company Inc., New York, NY, U.S.A., 1953. David F. Watson. Computing the n-dimensional Delaunay tessellation with application to Voronoipolytopes. TheComputerJournal,24(2):167–172,1981. BarbaraWebb. Canrobotsmakegoodmodelsofbiologicalbehaviour? BehaviouralandBrain Sciences,24(6):1033–1050,2001. Barbara Webb. Robotic experiments in cricket phonotaxis. In Dave Cliff, Philip Husbands, Jean-ArcadyMeyer,andStewartW.Wilson,editors, FromAnimalstoAnimats: Proceedings of the third International Conference on Simulation of Adaptive Behavior (SAB’94), pages 45–54,Brighton,U.K.,August1994. Mark Weiser. Some Computer Science issues in Ubiquitous Computing. Communications of theACM,36(7):75–84,1993. Barry B. Werger. Cooperation without deliberation: A minimal behavior-based approach to multi-robotteams. ArtificialIntelligence,110(2):293–320,1999. Barry B. Werger and Maja J. Matari´ c. Broadcast of Local Eligibility for Multi-Target Ob- servation. 
In Proceedings of the fifth International Conference on Distributed Autonomous RoboticSystems(DARS’00),pages347–356,Knoxville,TN.,U.S.A.,October2000. Paul White, Victor Zykov, Josh Bongard, and Hod Lipson. Three Dimensional Stochastic Reconfiguration of Modular Robots. In Proceedings of Robotics: Science and Systems (RSS’05),pages161–168,Cambridge,MA,U.S.A.,June2005. Alfred North Whitehead and Bertrand Russell. Principia Mathematica, Volume 1, Preface. CambridgeUniversityPress,Cambridge,U.K.,November1910. Norbert Wiener. Cybernetics, or Control and Communication in the animal and the machine. Wiley&Sons,Inc.,NewYork,NY,U.S.A.,firstedition,1948. Norbert Wiener. Cybernetics, or Control and Communication in the animal and the machine. MITPress,CambridgeMA,secondedition,1962. 253 Uri Wilensky. NetLogo. Center for Connected Learning and Computer- Based Modeling, Northwestern University. Evanston, IL, U.S.A., 1999. http://ccl.northwestern.edu/netlogo,Lastaccess: 31October2006. Deborah J. Withington. Faster evacuation from ferries using sound beacons. Fire: Journal of thefireprotectionprofession,39,March2000. DeborahJ.Withington. Lifesavingapplicationsofdirectionalsound. InProceedingsoftheIn- ternationalConferenceonPedestrianandEvacuationDynamics,pages277–298,Duisberg, Germany,April2001. Deborah J. Withington. Directional sound: an important new ship-evacuation aid. The Naval Architect, Journal of the The Royal Institution of Naval Architects, pages 6–7, July/August 2003. Stephen Wolfram. Universality and complexity in cellular automata. Physica D: Nonlinear Phenomena,10(1–2):1–34,January1984. NingXu,SumitRangwala,KrishnaKantChintalapudi,DeepakGanesan,AlanBroad,Ramesh Govindan, and Deborah Estrin. A Wireless Sensor Network for Structural Monitoring. In ProceedingsoftheInternationalConferenceonEmbeddedNetworkedSensorSystems(Sen- Sys’04),pages13–24,Baltimore,MD,USA,2004. Yang Xu, Paul Scerri, Bin Yu, Steven Okamoto, Michael Lewis, and Katia Sycara. 
An Inte- grated Token-Based Algorithm for Scalable Coordination. In Proceedings of the fourth In- ternationalConferenceonAutonomousAgentsandMultiagentSystems(AAMAS’05),pages 407–414,Utrecht,TheNetherlands,July2005. Yinan Zhang and Richard T. Vaughan. Ganging up: Team-Based Aggression Expands the Population/Performance Envelope in a Multi-Robot System. In Proceedings IEEE/RSJ In- ternational Conference on Robotics and Automation (ICRA’06), pages 589–594, Orlando, FL,U.S.A.,May2006. Huijing Zhao and Ryosuke Shibasaki. A novel system for tracking pedestrians using multiple single-rowlaser-rangescanners. IEEETransactionsonSystems,ManandCybernetics,Part A,35(2):283–291,March2005. RobertZlot,AnthonyStentz,M.BernardineDias,andScottThayer. Multi-RobotExploration Controlled by a Market Economy. In Proceedings IEEE/RSJ International Conference on RoboticsandAutomation(ICRA’02),pages3016–3023,WashingtonDC,U.S.A.,May2002. Marco Zuniga and Bhaskar Krishnamachari. Analyzing the transitional region in low power wireless links. In Proceedings of the IEEE Communications Society Conference on Sensor andAdHocCommunicationsandNetworks(SECON’04),pages517–526,October2004. 254
Abstract
Swarm multi-robot systems consist of groups of robots with simple interaction rules that exploit local information to collectively perform task-directed activities. Behavioral prediction of such systems is challenging
Asset Metadata
Creator
Shell, Dylan A. (author)
Core Title
Macroscopic approaches to control: multi-robot systems and beyond
School
Viterbi School of Engineering
Degree
Doctor of Philosophy
Degree Program
Computer Science
Publication Date
07/31/2008
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
macroscopic control, multi-robot, OAI-PMH Harvest, swarms
Language
English
Advisor
Matarić, Maja J. (committee chair), Bickers, Nelson Eugene, Jr. (committee member), Lerman, Kristina (committee member), Requicha, Aristides A. G. (committee member), Sukhatme, Gaurav S. (committee member)
Creator Email
dylan.shell@gmail.com
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-m1479
Unique identifier
UC1193029
Identifier
etd-Shell-20080731 (filename), usctheses-m40 (legacy collection record id), usctheses-c127-102839 (legacy record id), usctheses-m1479 (legacy record id)
Legacy Identifier
etd-Shell-20080731.pdf
Dmrecord
102839
Document Type
Dissertation
Rights
Shell, Dylan A.
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Repository Name
Libraries, University of Southern California
Repository Location
Los Angeles, California
Repository Email
cisadmin@lib.usc.edu