Close
About
FAQ
Home
Collections
Login
USC Login
Register
0
Selected
Invert selection
Deselect all
Deselect all
Click here to refresh results
Click here to refresh results
USC
/
Digital Library
/
Computer Science Technical Report Archive
/
USC Computer Science Technical Reports, no. 902 (2009)
(USC DC Other)
USC Computer Science Technical Reports, no. 902 (2009)
PDF
Download
Share
Open document
Flip pages
Contact Us
Contact Us
Copy asset link
Request this asset
Transcript (if available)
Content
Scalable and Practical Pursuit-Evasion
Marcos A. M. Vieira
Department of Computer Science
University of Southern California
mvieira@usc.edu
Ramesh Govindan
Department of Computer Science
University of Southern California
ramesh@usc.edu
Gaurav S.Sukhatme
Department of Computer Science
University of Southern California
gaurav@usc.edu
Abstract—
In this paper, we consider the design and implementation of
practical, yet near-optimal, pursuit-evasion games. In prior work,
we developed, using the theory of zero-sum games, minimal
completion-time strategies for pursuit-evasion. Unfortunately,
those strategies do not scale beyond a small number of robots. In
this paper, we design and implement a partition strategy where
pursuers capture evaders by decomposing the game into multiple
multi-pursuer single-evader games. Our algorithm terminates,
has bounded capture time, is robust, and is scalable in the number
of robots. In our implementation, a sensor network provides
sensing-at-a-distance, as well as a communication backbone that
enables tighter coordination between pursuers. Our experiments
in a challenging office environment suggest that this approach is
near-optimal, at least for the configurations we have evaluated.
Overall, our work illustrates an innovative interplay between
robotics and communication.
I. INTRODUCTION
We are motivated by practical problems in security and
monitoring for large, structured, spaces (e.g., to ensure the
integrity of a large building or complex). The problem we
focus on is pursuit-evasion wherein robots must pursue and
catch evaders.
In Pursuit-Evasion Games (PEGs), multiple robots (the
pursuers) collectively determine the location of one or more
evaders, and try to corral them. The game terminates when
every evader has been corralled by one or more robots. Several
versions of the problem exist. In certain frameworks, it is
acceptable to merely “sight” an evader for it to be “located”,
in others, a precise coordinate must be reported. Other for-
mulations insist on a certain speed of convergence with fewer
constraints on accuracy. Finally, formulations vary depending
on whether the multi-robot control algorithm is required to
have provably correct behavior, whether the number of evaders
is known a priori, and whether they are malicious or benign.
Each variation of the problem brings with it a different set of
challenges, and several of these variations have been solved
to varying degrees.
We consider the class of PEGs played on a discrete graph.
Specifically, we use a topological map of the environment,
whose nodes correspond to coarse-grained regions and whose
links connect neighboring regions [1, 2]. Discrete graph based
games are acceptable for many uses of pursuit-evasion (e.g.,
surveillance, finding survivors). Of course, the physical multi-
robot games run on a continuous space, but we discretize the
environmental for our localization and game model.
In this paper, we consider a version of the game in which p
pursuers collectively attempt to capture r evaders (Section II).
We are interested in the convergence time of the game (i.e.,
the minimum number of steps for the pursuers to capture the
evaders). We decompose the multi-player game into multiple
multi-pursuer single-evader games. We prove that our algo-
rithm terminates, has bounded captured time, is robust, and is
scalable in the number of robots, being suited for practical
applications. Based on our previous work, where we have
designed the optimal policy that pursuers should use in order to
capture evaders with the minimum number of steps, we design
an assignment algorithm that optimally decomposes the game.
An embedded network provides sensing, communication
and computational resources to the robots. The pursuers make
use of the resources of an environment-embedded network,
wherein robot sensing and communication is enhanced by the
network, to have full knowledge of the game.
We present results from running an implementation of our
algorithm on a physical robot testbed (Section V).
II. ASSUMPTIONS, TERMINOLOGY AND DEFINITIONS
In this section, we start by stating the sensing and com-
munication assumptions for our PEGs, then discuss the class
of games we are interested in. We then lay down some
terminology, and formally define the objective of our PEGs.
This sets the stage for our main contribution, a scalable near-
optimal algorithm for PEG, which is discussed in the next
section.
We focus on PEGs in bounded, spatially complex, environ-
ments similar to today’s office environments. In such environ-
ments, we can assume imperfect geometric regularity (e.g.,
the presence of corridors and 90-degree turns, but possibly
a regular placement of doorways or elevator exits). Because
such environments are obstructed, they present limited line-of-
sight visibility. However, it is increasingly true that such en-
vironments are well provisioned with wireless communication
capability, and that many such environments will likely have
dense embedded sensing (for surveillance or environmental
control).
In this paper, we assume such network-assisted environ-
ments; these environments provide sensing-at-a-distance to
circumvent line-of-sight limitations. Moreover, they provide a
network communication capability that enables much tighter
coordination than would have been possible otherwise. More
specifically, the network a) contains sensors that are able
to approximately localize all participants, and b) provides a
communication backbone that enables participants to exchange
game state.
Based on these assumptions, we consider the class of
PEGs in which all participants have complete (but possibly
imprecise) knowledge of the positions of all participants.
Furthermore, our PEGs are played on a discrete space: we
model the environment as a topological map, whose nodes
correspond to coarse-grained regions and whose links connect
neighboring regions [1, 2]. Our discrete-space assumption is
acceptable for many uses of pursuit-evasion (e.g., surveillance,
finding survivors) and we argue that more precise capture
can be implemented with the addition of a simple proximity
sensor (e.g., a low resolution camera) if necessary. We are
interested in the class of games where enough pursuers (we
make this more precise in the next section) exist to guarantee
termination. Finally, we also assume that pursuers and evaders
move at the same speed (more precisely, we assume that they
move exactly one hop in the topology at each time step); we
have left a relaxation of this assumption to future work.
Within this framework, we are interested in the optimal
strategy that pursuers and evaders should play, where our
measure of optimality is the capture time (defined below).
Before we discuss this, we lay down some terminology.
Let G=(V,L) be a finite connected undirected graph with
V vertices and L links or edges. There are two sets of players
called pursuers P and evaders E. Initially, P and E occupy
some vertices of G. In describing the algorithm, we assume
that time is discrete and increments at steps of 1; in our
implementation, of course, we make no such assumption. At
each time step, all pursuers and evaders are given the positions
of all participants. Both teams play a game on G according
to the following rule. At each step, each pursuer chooses a
neighboring vertex of G to move to, then the evaders do
the same. They then move to the corresponding vertex in
G, as defined in [3], and repeat the previous step. The team
of pursuers P wins if it “captures” all evaders. If an evader
can avoid capture indefinitely, then the evader team wins the
game. In the literature [4], the necessary number of pursuers
to capture an evader in a graph G is denoted by (c(G)).
Let p=|P|, r=|E|, v=|V|. Let P
i
be the current position of
the i
th
pursuer and E
i
be the current position of the i
th
evader.
The tuple a=< P
0
,...,P
p
,E
0
,...,E
e
> represents the current
position of all participants. We define a boolean variable
T(turn) to denote if it is the pursuers’ turn to move or not
(recall that, in our algorithm, pursuers and evaders alternate at
each time step
1
). We say that the tuple < a,T > encodes the
state s of a game.
We can now define our game more formally as follows:
Input Coarse estimated positions of p robots and r evaders in
a bounded environment E.
Output Motion commands for p robots.
Goal Minimize the capture time of the evaders.
1
This assumption enables us to analyze our algorithm. Of course, in real
world experiments, it is difficult to ensure this synchrony.
Restriction No motion model for the evaders available to
pursuers.
The game terminates when a capture state is reached. In a
capture state, at least one pursuer occupies the same vertex
in which an evader resides. There exists a different definition
of termination: if, during the evolution of the game, a pursuer
reaches an evader’s position, the evader exits the game. It is
easy to see that our definition results in a game that is strictly
harder than this variant.
III. PURSUIT-EVASION STRATEGIES
In defining the game, we have avoided mention of the
particular strategy that the pursuers and evaders use. In this
section, we discuss this strategy, which is the main focus of
our paper. We start by discussing the optimal strategy, which
is computationally intractable. We then discuss a computa-
tionally feasible strategy that, as we show experimentally is
near-optimal.
A. Optimal Strategy
In our game, we have assumed that both pursuers and
evaders have complete information about the positions of all
players. Consider now the optimal strategy for the pursuers and
evaders: for the former, to capture the evaders in the shortest
time, and for the latter, to avoid capture for the longest possible
time. To formalize this intuition, we turn to zero-sum games.
Pursuit-Evasion is a zero-sum game since the pursuers’ gain
or loss is exactly balanced by the losses or gains of the evader.
The evader’s goal is to escape as long as possible whereas the
pursuers have to capture the evaders as fast as possible. Zero-
sum games have been extensively studied in the game theory
literature, and our solution models a PEG as a zero-sum game
that uses the minimax algorithm [5]. This algorithm minimizes
the maximum possible loss for each player in the game.
To describe this algorithm, consider first that the evolution
of any PEG can be represented by a game graph, a directed
graph with possible cycles. The start state (as defined by
the starting configuration of the pursuers and evaders) has a
directed edge from itself to all possible next states that the
pursuers can make from the start state. (In our game, we
assume that pursuers and evaders move alternatively). In turn,
from each of these states, there is a directed edge to all possible
next states resulting from evaders’ moves from that state. The
graph can thus be recursively defined. In general, a game is a
traversal on this graph. If this traversal ends in a capture state,
the pursuers win the game. However, it is also possible for
the traversal to repeat states: such a traversal will result in a
non-terminating game and the evaders win.
We construct the game graph by generating all states and
all possible transitions between states. For each state, we can
calculate the cost to reach a capture state using a bottom-up
approach. Then, this is the strategy that pursuers and evaders
follow.
S . Each pursuer’s strategy is to move to the vertex
dictated by that neighboring state whose cost is
least. Each evader’s strategy is to move to the vertex
dictated by that neighboring state whose cost is most
among all neighbors.
In previous work [6], we proved the following theorem:
Theorem 3.1: For any given topological graph G and any
given initial configuration of pursuers and evaders,S termi-
nates in the minimal number of steps.
In practice, to play the game, we first pre-compute a com-
plete state transition diagram offline. This is pre-loaded on all
the robots, and each pursuer or evader makes a decentralized
local state transition decision, given the current state (the
positions of all the robots), to calculate next state.
The complexity of this precomputation is O(v
p+r
). Un-
fortunately, the computational cost of enumerating the state
transition diagram is not practical; for instance, using a modern
desktop, we have been unable to compute the robot’s strategy
for 9 robots.
This motivates the work presented in this paper. In the next
section, we present a scalable algorithm, where we partition a
PEG into multiple multi-pursuer, single-evader games.
B. Partition Strategy
In our partitioning strategy, the pursuers divide the evaders
amongst themselves and play c(G)-1 sub-games (recall that
c(G) is the minimum number of pursuers required to guarantee
termination on a graph G). In each of these sub-games,
the pursuers and the evader each play the optimal strategy
discussed in Section III-A, and the goal is still to minimize the
time to capture the evader. The key insight is that computing
the state transition diagram for these partitioned games is
computationally feasible. This section discusses an assignment
algorithm that allocates c(G) pursuers to each evader.
Our assignment algorithm assumes p>= c(G)∗ r, i.e, that
the number of pursuers is at least that required to ensure that
c(G) pursuers can be assigned to each evader. The inputs
to the assignment algorithm include the graph G and the
initial positions of all pursuers and evaders. The assignment
algorithm outputs for each pursuer which evader to pursue
(and eventually capture).
We model the assignment problem of r teams and r evaders
as a matching problem. Consider the bipartite graph G =
(T,E,L). The set T contains nodes, each of which represents a
team of pursuers: each team represents a distinct combination
of pursuer robots. Each node in the set E represents a single
evader. Finally, each edge l =(u,v) from the set L represents
the assignment of team u to evader v. Each edge l has a cost
c
uv
which is the time to capture evader v with team u. It is
possible to compute c
uv
, the expected time to capture evader v
by team u, by evaluating a c(G)−1 game. Realize that we can
pre-compute this cost and we only need to verify one game
to know the cost of every position. We represent the cost c
uv
in a matrix C.
The goal is to find the assignment that minimizes the
maximum time to capture all evaders.
An assignment can be represented as a matrix X = [x
i j
]
where x
i j
is equal to 1 if team i is assigned to evader j,
and equal to 0 otherwise. The assignment problem is written
formally as follows:
min max
i, j=1,..,N
c
i j
x
i j
subject to
n
∑
j=1
x
i j
= 1 ∀ i∈ N
n
∑
i=1
x
i j
= 1 ∀ j∈ N
x
i j
∈{0,1} ∀ (i, j)∈ N .
This problem is called linear bottleneck assignment
(LBAP) [7] and can be solved optimally by any of the
polynomial-time algorithms based on network flow theory [8].
The assignment algorithm described above is summarized
in Algorithm 1.
Algorithm 1 Assignment of pursuers to evaders
1: Compute a cost metric for a c(G)− 1 game.
2: Generate all possible team configurations, with r teams which
has c(G) pursuers and one pursuer does not belong to more than
one team.
3: for all feasible team configuration do
4: calculate matching cost as a LBAP.
5: update assignment with minimum max game cost.
6: end for
7: return min max assignment.
Algorithm 1 assigns c(g)∗ r pursuers to r evaders. If p>
c(g)∗r, the unassigned pursuers chose randomly which evader
to capture. This may modify the minimum capture time of all
evaders, and adds robustness to our algorithm in case of robot
failure.
Our algorithm solves the global optimization problem de-
scribed above even with the restriction that a pursuer can
participate in at most one team. Would a simple greedy
algorithm have worked? A greedy assignment strategy can
result in a non-terminating game. Consider a 2−2 game where
the c(G)= 1 and the cost matrix C =
1 2
3 ∞
. The optimal
assignment that minimizes the time to capture of all evaders for
this matrix is pursuer 1 to evader 2 and pursuer 2 to evader 1,
which gives max time to capture c= 3. The greedy assignment
would instead assign pursuer 1 to evader 1 and pursuer 2 to
evader 2, with cost c= ∞. Figure 1 illustrates such a game.
e1 p1
e2
p2
Fig. 1. An instance of a bad game for greedy strategy.
Each pursuer robot runs the assignment algorithm locally.
The inputs to the algorithm are the current positions of all
pursuers and evaders; this information is obtained from the
network (and may, of course, be inexact because of sensing
noise). Each pursuer executes the assignment only once, and
sticks with the assignment until the game terminates.
1) Complexity: The number of evaluated team configura-
tions x can be modeled as the number of ways to put p
distinct balls in r identical boxes. Each distinct ball represents
a pursuer and each identical box represents an evader. The
boxes are identical because we do not need to determine the
identity of the evaders, the matching will determine this. For
instance, if c(G)= 3, r= 2, p= 6, configuration< p
1
p
2
p
3
><
p
4
p
5
p
6
> is equivalent to < p
4
p
5
p
6
>< p
1
p
2
p
3
> in our
matching input. Moreover, each box should have c(G) balls.
The number of ways to put p distinct balls in r identical
boxes is to first imagine the boxes as distinct and then deter-
mine what we over-counted. The number of distributions of a
set of p distinct balls into a set of r distinct boxes if each box
is to hold a specified number of balls is
p
p
1
,p
2
,...,p
r
=
p!
∏
r
i=1
p
i
!
where p
1
+ p
2
+...+ p
r
= p. In our case, since all p
i
= c(G),
we have
p!
(c(G)!)
r
. Due the fact that the boxes are identical, we
over-counted r!. Hence, the number of configuration x we have
is x=
p!
(c(G)!)
r
∗(r!)
.
Given that n! ≃ n
n
(Stirling’s approximation), we have
x≃
p
p
(c(G))
c(G)∗r
∗(r
r
)
. If p = c(G)∗ r, we have the number of
evaluated configuration x≃ r
p−r
. The complexity of the Linear
Bottleneck Assignment Problem (LBAP) with N nodes is
O(N
2
) [9]. Thus, this part of our algorithm is O(x
2
). We
also need to evaluate a c(G)− 1 game to know the cost
metric, which has complexity O(|V|
c(G)+1
) (from Section III-
A). Thus, our overall complexity is O(x
2
+|V|
c(G)+1
), where
x =
p!
(c(G)!)
r
∗(r!)
≃ r
p−r
instead of previous O(|V|
p+r
). Since
r<<|V|, this algorithm is significantly more computationally
efficient.
2) Properties: Here we enumerate the properties of the
partition strategy.
Termination: Lemma 3.2 guarantees game will terminate,
assuming no robot fails.
Lemma 3.2: Our algorithm guarantees that the p− r game
will terminate.
Proof: Since we decompose the p−r game into r parallel
c(G)−1 sub-games, and each of these games is guaranteed to
terminate by Theorem 3.1, the overall p− r game terminates.
Optimality of partitioning: Our algorithm optimally par-
titions the evaders across the pursuers, subject to the cost
metric. This property follows from the optimality of the LBAP
assignment algorithm.
Bounded capture time: The completion time for the p− r
game is the maximum completion time across the c(G)− 1
sub-game. However, is it still an open question how far off
from the optimal completion time this partition strategy is.
Scalability: Our algorithm scales better than the optimal
strategy, since its scaling is dominate by r
p−r
, while the
optimal scales as |V|
p+r
, and usually r<<|V|.
Robustness: If p> c(G)∗r, our algorithm can assign extra
pursuers to evaders to ensure robustness to robot failures.
To summarize, our algorithm works as follows. In a de-
centralized manner, each robot pre-computes a c(G)− 1 state
transition diagram. Then, after being informed by the network
of the positions of all robots, each robot runs the assignment
algorithm described above once. Thereafter, using the pre-
computed state transition diagram and network localization
updates, all robots continuously play the game until all evaders
are captured.
IV. DESIGN
In this section, we discuss the hardware and network testbed
which form the basis for our pursuit-evasion experiments. We
then describe, in some detail, our PEG software design and
implementation. This sets the stage for our system evaluation,
which is discussed in the next section.
A. Platform
The Robot Platform. We use a commoditized robotics plat-
form and made minimal modifications to it using commercial
off-the-shelf products. Our platform consists of an iRobot
Create and a small embedded computer mounted on top of
it (Figure 2(b)).
The Create, a differential drive robot, has a round chassis of
33 cm diameter. The robot essentially has two kinds of sensors.
First, a pair of tactile sensors that, together with a bumper, can
help determine whether a robot hits an obstacle and the angle
at which it does so. Second, a suite of infrared (IR) sensors:
the bumper contains an IR wall sensor on the right and an
omnidirectional IR receiver in the top, and four additional IR
sensors mounted underneath the bumper facing down. We do
not add additional sensing hardware to the Create.
The embedded computer, the Ebox 3854, is an 800MHz
embedded PC with 256MB shared DDR memory, and supports
a 1280x1024 VGA interface, one 10/100 LAN, and USB,
mini PCI and compact flash sockets. We chose the EMP-
8602 mini-PCI 802.11 a/b/g wireless card to provide network
connectivity. This choice of platform is primarily designed to
enable ease of programming (the Ebox runs standard Linux),
and has enough ports to support various connectivity options.
The embedded computer is powered by the Create’s battery
through a DC-DC power adapter (picoPSU).
The embedded computer runs Linux Fedora Core 6 as the
operating system. For sensing and control, we developed a
Create driver for Player [10], using which we are able to move
the robot, turn on/off LEDs, read the bumpers, buttons and IR
sensors. We set the nominal speed to 0.2 m/s.
The Network. The robots use the network shown in Fig-
ure 2(a). This network is deployed above the false ceiling
on one floor of a large office building and consists of two
tiers: an upper tier containing 6 Stargates (embedded com-
puters running Linux), and a lower-tier containing 56 TelosB
motes (tiny commercial sensor nodes). The sensor nodes run
an 8MHz Texas Instruments MSP430 microcontroller, have
10KB RAM and a 2.4 GHz IEEE 802.15.4 Chipcon Wireless
Transceiver with a nominal bit rate of 250 Kbps.
The network provides the robots two capabilities. The
upper-tier nodes form a wireless communication backbone that
can be used for inter-robot communication and coordination.
In the absence of this backbone, communication between the
robots can be significantly delayed as a result of transient
network outages caused by robot mobility. The presence of
a static network infrastructure ensures that robots can quickly
communicate with each other, and thereby coordinate more
effectively. The lower-tier nodes provide the capability of a
virtual position sensor, and can sense the position of pursuers
and evaders. We describe the details of this capability below.
(a) (b)
Fig. 2. a)Layout of the network testbed b)robot platform
B. Network-Assisted Localization
Our robots do not have extra hardware such as sonar or
lasers to determine robot poses. In our system, the network
estimates robot position. As we have discussed above, we
use a topological map as a representation of the environment.
The goal of our network-assisted localization subsystem is to
approximately place each robot at a node in this topological
map.
This subsystem uses the signal strength of periodic “beacon”
messages emitted roughly once every second (we add some
randomization to the interval to reduce collisions) by the
robots themselves.
The second-tier motes receive these beacons, and report all
beacons whose signal strength is above a certain threshold.
Tenet [11], a readily available open-source software package
for programming wireless sensor networks, is the software
that collects the beacon signal strength. A centralized robot
location server then applies a voting scheme on a sliding
window of reports (the width of the window is 7) to generate
a location estimate. Since radio propagation characteristics
can vary significantly even with small displacements, we
use a simple heuristic filter to smooth the estimate. This
filter essentially updates a robot’s current position only after
n consecutive reports are received. Clearly, this trades off
increased latency for a smoother evolution off the estimate.
Figure 3(a) motivates the need for smoothing the estimate,
and shows the performance of the filter on our testbed.
Our architecture also allows for a decentralized location sys-
tem. Instead of retrieving location information from the server,
this information is flooded on the upper-tier network, of which
the robots are a part. We have experimented with this version
as well, and the performance of the system is comparable to
the centralized system, as we show in Section V.
C. Wall-following
Given the paucity of sensors on the Create (and our coarse
localization technique), we use a simple wall-following
behavior using the IR sensors to traverse the environment.
Clearly, wall-following may not generalize to other
kinds of environments. It suffices for our purposes as a simple,
even primitive, traversal behavior; more sophisticated behav-
iors can only improve the performance of our system. That
said, our wall-following component includes several
optimizations to ensure robustness: in our environment, at
least, it always results in forward progress.
The robot continuously emits IR signals and reads the IR
receiver continuously. As long as the IR values are between
two thresholds and none of the bumpers are activated, the
robot will move forward parallel to the wall using simple PD
control.
If the IR reading falls below a threshold, the robot assumes it
has “lost” the wall and attempts tosearch for it. It does this
by executing a spiral (simultaneous rotation and translation)
until the bumper senses contact (assumed to be with the wall).
To avoid open office or elevator doors in the experiments, we
use a “virtual wall” (essentially, an IR transmitter) available
from iRobot. When the virtual wall is detected, the robot tries
to avoid going through it by going back and turning to the
left for a while without bumping, then going forward while
turning to the right also without bumping. If the bump sensor
is activated and the virtual wall is simultaneously detected, it
invokes the escape behavior described below.
The escape behavior is invoked whenever the robot is
stuck. In addition to the scenario discussed above, this be-
havior is invoked whenever the power consumption of the
robot is above a predetermined threshold over a predefined
time window (this might happen when the robot encounters a
small obstacle that does not activate the bumper). To execute
the escape behavior, the robot simply moves backwards for a
short distance and then resumes wall-following.
Our robot’s wall following behavior uses the IR transmitter
near the right side of the robot. As such, it keeps the wall to
its right. To reverse direction, the robot implements adetach
behavior which enables it to cross a corridor and find the
opposite wall. The detach behavior is simple: the robot
swivels slightly to the left and then moves in an arc until
a bumper is activated. However, if the robot hits a wall in less
than a preset time, it repeats this maneuver since it is unlikely
to have crossed the corridor in that (short) time.
D. Navigation
The navigation component calculates the goal position, and
invokes wall-following to move from one topological
node to the adjacent one. Navigation is executed every time
the robot changes its position (as well as when any evader
changes its position), so the robot continuously updates its
trajectory.
The navigation component calculates the goal position given
its team’s position and that of its assigned evader. Using [6],
we pre-compute a state transition diagram for a given team
configuration. This state transition diagram is pre-loaded on all
the robots, and each pursuer or evader makes a decentralized
local state transition decision, given the current state. This
decision tells the robot which topological node to move to
next.
However, there exists an important subtlety in the navigation
algorithm imposed by the minimality of our platform. Our
robot has no inherent proprioception capability, it can only
travel parallel to a wall, keeping the wall to its right (since
the robot has only one IR wall sensor positioned on its right).
As a result, the robot might actually move in a direction
opposite to that intended by the navigation component. To
rectify this, we add a simple a posteriori correction to the
navigation component. If the wall-follower moves the
robot to a node that it does not expect to arrive at, it invokes
the detach behavior described above to reverse direction.
V. EVALUATION
In this section, we describe the results from several games
played on the physical robot testbed described in Section IV.
We play the games on the floor plan shown in Figure 2(a),
using the network whose nodes are shown in that figure. While
our games are played only in the one environment shown (and
future work needs to validate the generality of the claims we
make later in this section), we emphasize two important points
about the environment that lead us to believe that our results
would hold generally in other similar environments. First, our
building floor is representative of many office buildings (i.e,
it does not have any unusual features that would materially
affect our conclusions) and we do not alter the environment
in any way other than to place virtual walls in some locations
2
.
Second, our network design is not optimized in any way for
this particular application. Rather, the network was designed
to mimic harsh wireless communication conditions to stress
test wireless systems and applications.
The convergence time of a game depends on the initial
configuration. We play our games using a worst-case initial
configuration (there can be many). To calculate the worst-case
configuration, we implemented an idealized game simulator,
which models time in discrete steps and implements the
strategy discussed in Section IV. We exhaustively enumerate
all configurations to compute a worst-case configuration.
We can also use the simulator to compare the partition
strategy with the optimal (Table I) for various topologies.
The first column is the topology. The second column is the
necessary number of pursuers to guarantee the termination of
the game for that topology. The third column is the maximum
number of steps to terminate the game (t) with 1 evader, which
of course is the same in both strategies. The fourth and fifth
columns give the maximum number of steps to terminate the
game using optimal and partition strategy for two evaders. For
these topologies, the partition strategy has the same completion
time as optimal strategy.
2
We place a virtual wall in front of the elevator doors to prevent the robot
from entering an open elevator. We also place a virtual wall in front of the
doors whose thresholds are lower than the height of the Create bumper.
The drawback of our partition strategy is we might use more
pursuers than the minimum necessary. In a ring topology, 3
pursuers are sufficient to capture 2 evaders. In the partition
strategy, we need 4 pursuers. We believe that this tradeoff
is acceptable, since the partition strategy enables efficient
capture.
It is not enough, however, to use the simulator alone. The
simulator plays an idealized game and differs from the real
environment in three ways. First, it is not affected by local-
ization error, which can shorten or prolong a game, depending
on whether the error affects the pursuer or the evader. Second,
while our robots only follow walls in one direction (and
must cross the corridor to reverse direction) and therefore
must correct their orientation a posteriori (Section IV), our
simulator does not mimic this constraint. The third difference
arises from the synchronous execution in the simulator. In
the real world, the pursuer and evader decision-making is not
synchronized, and the game may terminate (but infrequently)
earlier than in a simulator. For this reason, we played a few
(specifically, 2− 1, 4− 2, and 6− 3) games on our real world
testbed, and we discuss the results below.
Topology c(G) t(c(G)-1 game) t(2c(G)-2 game)
Optimal Partition
Grid 2D 3x3 2 4 4 4
Cylinder Grid 3x3 2 3 3 3
Torus Grid 4x4 3 4 4 4
TABLE I
GAMES AND THEIR PROPERTIES
An Illustrative Game Instance. Figure 3(b) shows the
trajectory of each pursuer during one instance of a 4− 2
game. The nodes and solid undirected edges connecting them
represent the topological map. The solid directed lines (lines
with arrows) show the pursuer path as determined by the
localization system. The dashed lines illustrate the evader path.
The edge labels represent the time sequence of the robot.
The pursuer and evader’s initial positions are indicated by the
corresponding icons.
The game evolves as follows. Pursuers 1 to 4 start at
position 3, 3, 4, and 4 respectively, and all evaders start at
position 8 (this is a worst-case configuration, as determined
by simulation). Pursuers execute the partition algorithm and
independently decide on their teams. Pursuers 1 and 2 try to
capture evader 1 and pursuers 3 and 4 will chase evader 2.
Pursuers 2 and 4 try to corral the corresponding evader by
approaching them through the alternate path in the topology,
while pursuers 1 and 3 try to capture evaders by moving
directly towards them. At time 2, pursuers 1 and 3 are at
location 2. They do not have to move since either option will
not affect the game convergence time. At the end of time 2,
all pursuers are at near location 2. All pursuers keep moving
and are at location 5. At time 4, pursuers 1 and 3 switch their
orientation (turn around to another wall invokingdetach). At
time 5, evaders move to location 0 to be as far away from all
pursers as possible, otherwise pursuer 1 would have captured
them. Pursuers are at 2, 2, 7,and 7. Finally, at steps 6 and 7,
pursuers go toward the evaders and corral them since evaders
(a)
1 3 4
6 7 8
5
4
2
3
5 6
e1
p1 p3
p4
e2
0
1
6 7
3
5
2
5
p2
1
2
4
7
(b) (c)
Fig. 3. Results: a) Network-assisted localization estimates: unfiltered estimates (bottom) are noisy, filtered estimates (top) eliminate noise. b) A 4 pursuer, 2
evader game. c) Capture Time
do not have any option to escape.
In simulation, the capture time for the described game is
5 steps. Our results took 7 steps because our robots do not
have a sense of direction and can only reverse direction by
going in the wrong direction to determine that they have done
so. This behavior is not capture in the simulator. Despite this
(and other) non-idealities, our implementation works well in
the real-world.
Results across Multiple Games. Finally, Figure 3(c) depicts
the capture time across multiple games. The capture time is
similar between the games, as one might expect (since each
game is partitioned into a 2− 1 game). However, there are
small differences in capture times. For a 4−2 game, even with
a pursuer failure in localization, the game terminated since
only 2 robots are needed in our topology. Our 6−3 game had
a longer convergence time because robots are not able to move
at nominal speed continuously as a results of floor surface
differences and because (as it happens in our environment)
pursuers bump into each other and are slowed down during
obstacle avoidance.
A video demonstrating a game is available online [12]. This
video was taken by attaching a video-capable cellphone to
each robot, so it shows the progress of the game from the
perspective of each robot. One can see the wall-following and
detach behaviors at work.
VI. RELATED WORK
Our broad interest is in the theme of using pervasive
sensing and communication infrastructure with actuation. In
the robotics and sensor networking communities this area
has burgeoned recently. Examples include investigations on
sensor-network guided robot localization [13], sensor-network
guided robot navigation [14, 15, 16, 17] and exploration [18],
robot-assisted sensor network localization [13], robot-assisted
sensor network deployment [19, 20, 18] and sensor-network
guided robot pursuit-evasion [21].
To situate our work in the existing literature, we classify
the type of PEGs using seven criteria: the ratio of the number
of pursuers to the number of evaders; whether pursuers and
evaders have full and/or global visibility, or whether they can
only see within a threshold distance or until occluded by
an obstacle (usually modelled by the edge of a polygon in
2D); what additional information robots have with respect to
the opponents’ strategy or planning algorithm; whether the
environment is modelled as a graph (discrete) or a polygon
(continuous half-space with lines in 2D as boundaries); how
the evader is captured, whether by being surrounded, seen or
sensed by the pursuer, or approached within a certain distance,
or physically contacted; the relative speed between the pursuer
and evader;and, if the time to capture is important.
Table II shows a classification of the related work in the
literature along these dimensions. Our work is distinct from
several pieces of prior work, as shown in Table II. The
novelty of our proposed work is clear in several dimensions:
while other work has explored theoretical bounds on eventual
capture [22, 23], or pursuit-evasion under constrained geome-
tries [24, 27, 28, 29], or has examined sophisticated control
strategies [26, 21], our proposed work attempts to minimize
the time of captured of a multi-pursuer multi-evader under the
pragmatic realization of physical multi-robot games.
In [4], an algorithm to determine if K pursuers are sufficient
to capture an evader is presented. They also showed that
every graph is topologically equivalent to a graph with pursuer
number at most two.
In the survey presented by Alspach [30], a number of
references on the necessary number of pursuers for a given
graph class can be found. Aigner and Fromme [22] proved
that in a planar graph G, 3 pursuers are sufficient for the
pursuers to win the game. Quilliot [31] extended this result,
giving an upper bound to the number of pursuers depending
on the genus of the graph G. In [32], the necessary number
of pursuers is studied under three graph product operations.
In [33], given certain conditions, the complexity of pursuit on
a graph is Exptime-complete.
To the best of our knowledge, we are the first to present a
scalable algorithm to minimize the time to capture of a multi-
pursuer multi-evader game, and to validate it in a real-world
implementation.
VII. CONCLUSIONS
We presented an assignment algorithm that guarantees
the game terminates, has bounded captured time, is robust,
and is scalable in the number of robots, being suited for
practical applications. We have reported on the design and
experimental characterization of a nontraditional mobile robot-
based pursuit evasion system. In our system robot sensing
[22] [23] [24] [25] [26] [21] Our work
Pursuer to Evader Ratio ≥ 1 1 ≥ 1 ≥ 1 ≥ 1 1 ≫ 1
Pursuer Visibility full full local local local full full
Evader Visibility full full full full local none full
Information full full full full no pursuer full full
Environment Graph Polygon Polygon Polygon Polygon Polygon Graph
Capture touch touch see touch touch touch touch
Speed (faster entity) same same same evader any same same
Time to Capture no no no no no no yes
Robot Implementation None None None None Partial Partial Full
TABLE II
RELATED WORK IN PURSUIT-EVASION
and communication is enhanced by an environment-embedded
network. We have validated the feasibility of our algorithm
by experimentally playing mobile robot-based pursuit evasion
games on a physical testbed.
REFERENCES
[1] B. Kuipers and Y .-T. Byun, “A robot exploration and mapping strategy
based on a semantic hierarchy of spatial representations,” Tech. Rep.
AI90-120, 1, 1990.
[2] M. J. Matari´ c, “Integration of representation into goal-driven behavior-
based robots,” IEEE Transactions on Robotics and Automation, no. 3,
June 1992.
[3] R. J. Nowakowski and P. Winkler, “Vertex-to-vertex pursuit in a graph,”
Discrete Mathematics, vol. 43, no. 2-3, pp. 235–239, 1983.
[4] A. Berarducci and B. Intrigila, “On the cop number of a graph,” Adv.
Appl. Math., vol. 14, no. 4, pp. 389–403, 1993.
[5] S. J. Russell and P. Norvig, Artificial Intelligence: A Modern Approach.
Pearson Education, 2003.
[6] M. A. M. Vieira, R. Govindan, and G. S. Sukhatme, “Optimal policy
in discrete pursuit-evasion games,” Department of Computer Science,
University of Southern California, Tech. Rep. 08-900, 1, 2008.
[7] R. Jonker and A. V olgenant, “A shortest augmenting path algorithm for
dense and sparse linear assignment problems,” Computing, vol. 38, no. 4,
pp. 325–340, 1987.
[8] R. E. Burkard and E. Cela, “Linear assignment problems and exten-
sions,” Handbook of Combinatorial Optimization, vol. 4, 1999.
[9] U. Pferschy, “Solution methods and computational investigations for the
linear bottleneck assignment problem,” Computing, vol. 59, no. 3, pp.
237–258, 1997.
[10] B. Gerkey, R. Vaughan, and A. Howard, “The player/stage project: Tools
for multi-robot and distributed sensor systems,” in 11th Int. Conf. on
Advanced Robotics (ICAR 2003), June 2003.
[11] O. Gnawali, B. Greenstein, K.-Y . Jang, A. Joki, J. Paek, M. Vieira,
D. Estrin, R. Govindan, and E. Kohler, “The TENET Architecture for
Tiered Sensor Networks,” in Proceedings of the ACM Conference on
Embedded Networked Sensor Systems, November 2006.
[12] “Pursuit-evasion game video,” 2008. [Online]. Available:
http://flipflop.usc.edu/roboComm/
[13] P. Corker, R. Peterson, and D. Rus, “Localization and navigation assisted
by networked cooperating sensors and robots,” International Journal of
Robotics Research, vol. 24, no. 9, pp. 771–786, 2005.
[14] Q. Li and D. Rus, “Navigation protocols in sensor networks,” ACM
Tansactions on Sensor Networks, vol. 1, no. 1, pp. 3–35, August 2005.
[15] K. J. O’Hara, V . Bigio, E. Dodson, A. Irani, D. Walker, and T. Balch,
“Physical path planning using the gnats,” in IEEE International Confer-
ence on Robotics and Automation, 2005.
[16] G. Alankus, N. Atay, C. Lu, and B. Bayazit, “Adaptive embedded
roadmaps for sensor networks,” in IEEE International Conference on
Robotics and Automation, 2007.
[17] M. Batalin and G. S. Sukhatme, “Coverage, exploration and deployment
by a mobile robot and communication network,” in Proceedings of the
International Workshop on Information Processing in Sensor Networks,
Palo Alto Research Center (PARC), Palo Alto, Apr 2003, pp. 376–391.
[18] Maxim Batalin and Gaurav S. Sukhatme, “The Design and Analysis of
an Efficient Local Algorithm for Coverage and Exploration Based on
Sensor Network Deployment,” IEEE Transactions on Robotics, vol. 23,
no. 4, pp. 661–675, Aug 2007.
[19] A. Howard, M. J. Matari´ c, and G. S. Sukhatme, “An incremental self-
deployment algorithm for mobile sensor networks,” Autonomous Robots,
vol. 13, no. 2, pp. 113–126, 2002.
[20] P. I. Corke, S. E. Hrabar, R. Peterson, D. Rus, S. Saripalli, and G. S.
Sukhatme, “Deployment and connectivity repair of a sensor net,” in 9th
International Symposium on Experimental Robotics 2004, 2004.
[21] S. Oh, L. Schenato, P. Chen, and S. Sastry, “Tracking and coordination
of multiple agents using sensor networks: system design, algorithms and
experiments,” Proceedings of the IEEE, vol. 95, no. 1, Jan 2007.
[22] F. M. Aigner. M, “A game of cops and robber,” Tech. Rep., 1984.
[23] J. Sgall, “Solution of David Gale’s lion and man problem,” Theor.
Comput. Sci., vol. 259, no. 1-2, pp. 663–670, 2001.
[24] L. J. Guibas, J.-C. Latombe, S. M. LaValle, D. Lin, and R. Motwani,
“Visibility-based pursuit-evasion in a polygonal environment,” in WADS
’97: Proceedings of the 5th International Workshop on Algorithms and
Data Structures. London, UK: Springer-Verlag, 1997, pp. 17–30.
[25] V . Isler, S. Kannan, and S. Khanna, “Randomized pursuit-evasion in a
polygonal environment,” IEEE Transactions on Robotics, vol. 5, no. 21,
pp. 864–875, 2005.
[26] R. Vidal, O. Shakernia, H. J. Kim, D. H. Shim, and S. Sastry, “Proba-
bilistic pursuit-evasion games: theory, implementation, and experimental
evaluation,” Robotics and Automation, IEEE Transactions on, vol. 18,
no. 5, pp. 662–669, 2002.
[27] R. Murrieta-Cid, T. Muppirala, A. Sarmiento, S. Bhattacharya, and
S. Hutchinson, “Surveillance strategies for a pursuer with finite sensor
range,” Int. J. Rob. Res., vol. 26, no. 3, pp. 233–253, 2007.
[28] S. Bhattacharya, S. Candido, and S. Hutchinson, “Motion strategies for
surveillance,” in Robotics: Science and Systems, 2007.
[29] W. Cheung, “Constrained pursuit-evasion problems in the plane,” Master
Thesis, U.British Columbia, 2005.
[30] B. Alspach, “Searching and sweeping graphs: a brief survey,” Le
Matematiche (Catania), vol. 59, pp. 5–37, 2004.
[31] A. Quilliot, “A short note about pursuit games played on a graph with a
given genus,” J. Comb. Theory, Ser. B, vol. 38, no. 1, pp. 89–92, 1985.
[32] S. Neufeld and R. Nowakowski, “A game of cops and robbers played
on products of graphs,” Discrete Math., vol. 186, no. 1-3, 1998.
[33] A. S. Goldstein and E. M. Reingold, “The complexity of pursuit on a
graph,” Theor. Comput. Sci., vol. 143, no. 1, pp. 93–112, 1995.
Linked assets
Computer Science Technical Report Archive
Conceptually similar
PDF
USC Computer Science Technical Reports, no. 900 (2008)
PDF
USC Computer Science Technical Reports, no. 692 (1999)
PDF
USC Computer Science Technical Reports, no. 796 (2003)
PDF
USC Computer Science Technical Reports, no. 907 (2009)
PDF
USC Computer Science Technical Reports, no. 683 (1998)
PDF
USC Computer Science Technical Reports, no. 706 (1999)
PDF
USC Computer Science Technical Reports, no. 906 (2009)
PDF
USC Computer Science Technical Reports, no. 669 (1998)
PDF
USC Computer Science Technical Reports, no. 723 (2000)
PDF
USC Computer Science Technical Reports, no. 910 (2009)
PDF
USC Computer Science Technical Reports, no. 732 (2000)
PDF
USC Computer Science Technical Reports, no. 746 (2001)
PDF
USC Computer Science Technical Reports, no. 704 (1999)
PDF
USC Computer Science Technical Reports, no. 745 (2001)
PDF
USC Computer Science Technical Reports, no. 921 (2011)
PDF
USC Computer Science Technical Reports, no. 841 (2005)
PDF
USC Computer Science Technical Reports, no. 873 (2005)
PDF
USC Computer Science Technical Reports, no. 786 (2003)
PDF
USC Computer Science Technical Reports, no. 603 (1995)
PDF
USC Computer Science Technical Reports, no. 888 (2007)
Description
Marcos A. M. Vieira, Ramesh Govindan, Gaurav S. Sukhatme. "Scalable and practical pursuit-evasion." Computer Science Technical Reports (Los Angeles, California, USA: University of Southern California. Department of Computer Science) no. 902 (2009).
Asset Metadata
Creator
Govindan, Ramesh (author), Sukhatme, Gaurav S. (author), Vieira, Marcos A. M. (author)
Core Title
USC Computer Science Technical Reports, no. 902 (2009)
Alternative Title
Scalable and practical pursuit-evasion (
title
)
Publisher
Department of Computer Science,USC Viterbi School of Engineering, University of Southern California, 3650 McClintock Avenue, Los Angeles, California, 90089, USA
(publisher)
Tag
OAI-PMH Harvest
Format
8 pages
(extent),
technical reports
(aat)
Language
English
Unique identifier
UC16270824
Identifier
09-902 Scalable and Practical Pursuit-Evasion (filename)
Legacy Identifier
usc-cstr-09-902
Format
8 pages (extent),technical reports (aat)
Rights
Department of Computer Science (University of Southern California) and the author(s).
Internet Media Type
application/pdf
Copyright
In copyright - Non-commercial use permitted (https://rightsstatements.org/vocab/InC-NC/1.0/
Source
20180426-rozan-cstechreports-shoaf
(batch),
Computer Science Technical Report Archive
(collection),
University of Southern California. Department of Computer Science. Technical Reports
(series)
Access Conditions
The author(s) retain rights to their work according to U.S. copyright law. Electronic access is being provided by the USC Libraries, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright.
Repository Name
USC Viterbi School of Engineering Department of Computer Science
Repository Location
Department of Computer Science. USC Viterbi School of Engineering. Los Angeles\, CA\, 90089
Repository Email
csdept@usc.edu
Inherited Values
Title
Computer Science Technical Report Archive
Description
Archive of computer science technical reports published by the USC Department of Computer Science from 1991 - 2017.
Coverage Temporal
1991/2017
Repository Email
csdept@usc.edu
Repository Name
USC Viterbi School of Engineering Department of Computer Science
Repository Location
Department of Computer Science. USC Viterbi School of Engineering. Los Angeles\, CA\, 90089
Publisher
Department of Computer Science,USC Viterbi School of Engineering, University of Southern California, 3650 McClintock Avenue, Los Angeles, California, 90089, USA
(publisher)
Copyright
In copyright - Non-commercial use permitted (https://rightsstatements.org/vocab/InC-NC/1.0/