TECHNIQUES FOR EFFICIENT INFORMATION TRANSFER
IN SENSOR NETWORKS
by
Fang Bian
A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(COMPUTER SCIENCE)
August 2007
Copyright 2007 Fang Bian
Dedication
To my family
Acknowledgments
I would like to express my appreciation to my adviser Prof. Ramesh
Govindan for his support, patience, and encouragement throughout my
graduate studies.
Table Of Contents
Dedication
Acknowledgments
List Of Tables
List Of Figures
Abstract
Introduction
Chapter 1 Utility-based Sensor Selection
1.1 Introduction
1.2 Network and Problem Setup
1.2.1 Network Model
1.2.2 Utility-Based Sensor Selection
1.3 Submodular Utility Functions
1.4 Supermodular Utility Functions
1.4.1 Set-weighted functions
1.4.2 Set-weighted single set selection
1.4.3 Greedy Algorithms
1.5 Geometric Penalty Functions
1.6 Related Work
1.7 Summary
Chapter 2 Energy-Efficient Broadcasting in Wireless Ad Hoc Networks: Lower Bounds and Algorithms
2.1 Introduction
2.1.1 Network and Energy Cost Model
2.1.2 The Problem
2.1.3 Our Contributions
2.2 Theoretical Lower Bound for broadcasting
2.3 Our GBA algorithm For The broadcasting problem
2.3.1 Description of GBA
2.3.2 Analysis of GBA
2.4 Evaluation
2.4.1 GBA vs BIP
2.4.2 Practical Lower Bounds
2.4.3 A GBA-weak network
2.5 Summary
Chapter 3 QCRA: Quasi-static Centralized Rate Allocation for Sensor Networks
3.1 Introduction
3.2 Related Work
3.3 Quasi-static Centralized Rate Allocation
3.3.1 Overview
3.3.2 Rate Allocation
3.3.3 Rate Adaptation
3.3.4 Extensions
3.3.5 Limitations and Discussions
3.4 Evaluation
3.4.1 Performance of QCRA
3.4.2 Comparison with IFRC
3.4.3 Evaluation of Extensions
3.4.4 Comparison with other sophisticated heuristics
3.4.5 Performance of QCRA with dynamic routing
3.5 Summary
Chapter 4 HLR: Using Hierarchical Location Names for Scalable Routing and Rendezvous in Wireless Sensor Networks
4.1 Introduction
4.2 Overview and Related Work
4.2.1 Feasibility Discussion
4.2.2 Overview
4.2.3 Related Work
4.3 HLR Details
4.3.1 Overview
4.3.2 Automatic Route Aggregation
4.3.3 Dealing with Route Changes
4.3.4 Relaxing the Connectivity Assumption
4.3.5 Discussion
4.4 Routing and Rendezvous Primitives
4.4.1 Unicast
4.4.2 Area Broadcast and Area Anycast
4.4.3 Rendezvous Based on Random Hashing
4.4.4 Data-Locality Preserving Hashing
4.4.5 Summary
4.5 Performance Evaluation Through Simulations
4.5.1 Methodology and Metrics
4.5.2 Results
4.6 Implementation
4.7 Summary
Conclusions and Future Work
References
List Of Tables
3.1 Values of rate adaptation parameter C
3.2 Average goodput of weighted fairness evaluation
List Of Figures
1 Design Space for Efficient Information Delivery in WSN
1.1 MAXIMUM KNAPSACK graph with $|U| = 2$
1.2 SETCOVER graph with n = 4
1.3 O(log n) Integrality Gap Example
2.1 Broadcasting in the wireless network: there is an obstacle between $V_{root}$ and $x_2$
2.2 Network for the associated set-cover problem
2.3 GBA vs. BIP on 500m x 500m practical networks
2.4 GBA vs. BIP on 1000m x 1000m practical networks
2.5 GBA vs. BIP on BIP-weak networks
2.6 GBA vs. BIP on GBA-weak networks
2.7 2-SPT: the paths from s to z, from $z_1$ to x, and from $z_2$ to y are shortest paths. $z_1$ and $z_2$ can be reached from z by one broadcasting.
2.8 1-SPT Lower Bounds for 500m x 500m practical networks
2.9 1-SPT Lower Bounds for 1000m x 1000m practical networks
2.10 2-SPT Lower Bounds for 500m x 500m practical networks
2.11 2-SPT Lower Bounds for 1000m x 1000m practical networks
2.12 A GBA-weak network with optimal cost for broadcasting: $f_c + v_c r_n^2$
2.13 Broadcast tree built by GBA for the GBA-weak network shown in Fig. 2.12
3.1 Example of contention
3.2 Layout of the testbed
3.3 Example link quality changes on the testbed during daily time and night time
3.4 Evaluation of QCRA
3.5 Routing Trees
3.6 Performance of QCRA and IFRC
3.7 Local rate on each node in IFRC
3.8 Evaluation of multiple sinks extension
3.9 Evaluation of weighted fairness extension
3.10 Evaluation of per-link rate adaptation
3.11 Comparison QCRA with Lower-Bound heuristic
3.12 Route changes during QCRA performance evaluation with dynamic routing
3.13 Performance of QCRA with Dynamic Routing
4.1 Example: a sensor network with HLI and routing table of node 2:2:1 built by HLR
4.2 The same sensor network in Figure 4.1 with more details in area 1 shown.
4.3 Example: mapping from 2-d data space to the network shown in Figure 4.2
4.4 Comparison of average path length in GPSR and HLR on networks with density 20
4.5 Comparison of average query cost in Diffusion on networks with density 20
4.6 Comparison of average query cost in DHT over HLR and GHT on networks with density 20
4.7 Comparison of average insertion cost in DIM on networks with density 20
4.8 Comparison of average query cost in DIM on networks with density 20
4.9 Comparison of average query cost in DIM on networks with density 10
4.10 Average number of routing table changes under single node failure on networks with size 50, 100, 150, 200 and density 20.
4.11 Average number of control packets to re-converge under single node failure on networks with size 50, 100, 150, 200 and density 20.
4.12 HLR Software Architecture
4.13 HLR Experiment Topology
4.14 HLR Experiment Result with the topology shown in Figure 4.13
Abstract
Efficient information delivery is a major research area in the sensor
network community. Along one dimension, efficiency of information
delivery in sensor networks comprises energy efficiency, capacity
efficiency, and scalability. Along another dimension, efficiency can be
achieved at different network layers: the application layer, the
transport layer, or the routing layer. These two dimensions describe a
design space for efficient information delivery in wireless sensor
networks.
We have made several contributions to this design space. First, we have
proposed and studied a utility-based sensor selection framework that
maximizes the lifetime of sensor networks at all three layers. Then, we
have theoretically studied the problem of energy-efficient broadcasting
in a realistic network model. Next, we have proposed and evaluated QCRA,
a quasi-static centralized rate control scheme for sensor networks,
which achieves capacity efficiency at the transport layer. Finally, we
have proposed and evaluated HLR, a data-centric routing protocol that
achieves scalability at the routing layer without requiring accurate
geographic locations.
Introduction
Sensor networks consist of many small sensing devices that monitor an
environment and communicate over wireless links. The operation of
sensor networks relies heavily on efficient information transfer, either
from the sensor network to a control center or from one sensor node to
another. For example, a security monitoring sensor system for burglaries
or fires requires delivery of the sensed alarm to a control center, and
a structural health monitoring sensor system requires sending the sensed
data back to an analysis center for scientific analysis of the health of
the structures. Without efficient information delivery, most sensor
networks cannot function properly. Therefore, efficient information
transfer is a major goal of sensor network research.
Along one dimension, different systems have different efficiency
requirements: energy efficiency, capacity efficiency, and scalability.
Along another dimension, efficiency can be achieved at various layers:
the application, transport, and routing layers.
Battery-powered sensor networks for long-running monitoring systems
require energy efficiency. Energy efficiency at the application layer
maximizes the lifetime of the sensor network from the application's
point of view. Energy efficiency at the transport layer minimizes the
energy consumed by transport techniques such as reliable transmission.
Energy-efficient routing minimizes the total energy consumed for message
delivery through the choice of delivery path.
For systems where message delivery latency is critical, it is important
to make the most of the limited capacity of sensor networks. Capacity
efficiency at the application layer makes maximum use of network
capacity by selecting the data to be delivered, as in QoS for sensor
networks. Capacity efficiency at the transport layer controls the data
delivery rate so as not to cause congestion in the network. Capacity
efficiency at the routing layer improves network capacity through path
selection; multi-path routing is one such technique.
Because sensor nodes have limited storage resources, scalability in
memory size is another important efficiency goal for information
delivery. The goal of scalability at the application layer is to be able
to design and implement applications of various scales within the
resource limitations. Scalability at the transport layer requires using
a limited amount of memory for transport layer functions such as
retransmission. Scalability at the routing layer requires a small
routing table regardless of the scale of the network.
A large design space is constructed with the above two dimensions. In
this thesis, we present our contributions within this design space.
Energy Efficiency
First, we look at the problem of energy efficiency. For systems where
the sensors are battery powered, one of the most important efficiency
requirements is prolonging the lifetime of the network. Many techniques
have been proposed to do so. One thread of work extends lifetime at the
routing or transport level, for example through routing or topology
control, while another thread maximizes lifetime at the application
level by carefully selecting what data, and how much of it, to retrieve.
Since the decision of what and how much data to retrieve determines the
traffic pattern in the network, which in turn affects the performance of
routing and topology control, it is important for lifetime management to
study jointly what data to retrieve, how much, and how.
Further, the lifetime of a sensor network should not be defined
uniformly by a single standard, such as the total amount of data.
Rather, it should be defined from the application's point of view. For
example, in a sensor network that monitors the growth of plants, sensors
for measuring humidity, acidity, and temperature may be deployed, and it
is important for the application to be able to retrieve all three kinds
of data. In this case, the lifetime should be the period during which
all three kinds of sensors are alive and able to transmit their sensed
data back to the base station.
Each application has its own specific view of the problem. Therefore, it
is important to find one unified lifetime management framework that
achieves energy efficiency at all three layers: application, transport,
and routing. In other words, it is important to define one lifetime
management framework that lets an application specify how it views the
lifetime of the network, while also taking into account energy
efficiency at the transport and routing layers, which determine how much
data is sent along which path.
Prior work on this thread has centered on one of two paradigms: (1)
maximize the total amount of data collected, or (2) collect data from
all sensors for as long as possible. The first assumes that all data is
equally interesting or important to the application. In particular,
optimizing for the total amount of data gives undue preference to data
measured close to a base station, as that data can be extracted at
relatively low cost. The second objective, on the other hand, assumes
that data collection is only worthwhile if data from all sensors is
being collected, so that as soon as one node's energy is depleted, the
network may as well not collect any data.
Clearly, both approaches are oversimplifications of reality. Which data
is useful depends heavily on the specific application and its needs. In
many cases, individual sensor readings will be of little use; the appeal
of sensor networking lies in the ability to aggregate and correlate
sensor readings from different locations. In other scenarios, readings
from a group of proximate sensors may be considered redundant, and it
would suffice to obtain a reading from one of them.
Here, we argue that algorithms for deciding which data to retrieve
should be sufficiently generic to let the application specify how useful
particular measurements are. We propose a framework in which the
application can specify the utility of measuring data (nearly)
concurrently at each set of sensors. The goal is then to select a
sequence of sets to measure whose total utility is maximized, while not
exceeding the available energy. Alternatively, we may look for the most
cost-effective sensor set, maximizing the product of utility and system
lifetime. This approach is very generic and permits us to model many
applications of sensor networks.
Formally, we study the following problem: given a network topology and
an application-specified utility function u(·), it is the algorithm's
task to trade off the utility and energy consumption of sensor sets in
an optimal way, maximizing the total utility extracted from the network
until the network ceases to function. Such a utility-based sensor
selection approach has also been proposed recently by Byers and Nasser [6].
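One way to make this problem statement concrete is the optimization sketch below. The symbols $V$, $S_i$, $t_i$, $c_v(S_i)$ and $E_v$ are notation assumed here purely for illustration; they are not taken from the text above, and the precise model is the one developed in Chapter 1.

```latex
\begin{align*}
\text{maximize}   \quad & \sum_{i} t_i \, u(S_i) \\
\text{subject to} \quad & \sum_{i} t_i \, c_v(S_i) \le E_v
                          \quad \text{for every node } v \in V, \\
                        & t_i \ge 0, \quad S_i \subseteq V .
\end{align*}
```

Here $S_i$ is a set of sensors measured together, $t_i$ is the length of time for which it is measured, $c_v(S_i)$ is the energy that node $v$ spends per unit time while $S_i$ is being measured (including any relaying it does for others), and $E_v$ is the initial energy of $v$. When only a single set is used, the objective reduces to utility multiplied by lifetime, which is the cost-effectiveness criterion mentioned earlier.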
Ideally, one would like to find an (approximately) optimal sequence of
sets to measure, and an associated communication scheme, for arbitrary
monotone utility functions. This goal seems very ambitious: as we show,
the problem of selecting an optimal sequence of sets is NP-hard in many
settings. In this thesis, we explore three natural and practically
important classes of utility functions in more detail. Specifically, we
focus on submodular functions (with returns for additional sensors
diminishing for larger sets), supermodular functions (with returns
increasing for larger sets), and a general framework of geometric
covering objectives. In the latter case, we show that the utility-based
approach is analogous to a penalty-based approach that characterizes the
penalty of a sensor set as its collective distance from the targets to
be measured.
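The distinction between diminishing and increasing returns can be checked by brute force on a tiny example. The sketch below is illustrative only: the sensor names, the coverage map, and both utility functions are hypothetical, and the exhaustive check is exponential in the number of sensors, so it is a way to build intuition rather than part of any algorithm in this thesis.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of items as a frozenset."""
    items = list(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def marginal_gain(u, s, x):
    """Extra utility obtained by adding sensor x to the set s."""
    return u(s | {x}) - u(s)


def is_submodular(u, ground):
    """True if, for all A subset of B and x outside B, gain(A, x) >= gain(B, x)."""
    ground = frozenset(ground)
    all_sets = list(subsets(ground))
    for a in all_sets:
        for b in all_sets:
            if a <= b:
                for x in ground - b:
                    if marginal_gain(u, a, x) < marginal_gain(u, b, x) - 1e-12:
                        return False
    return True


def is_supermodular(u, ground):
    """A function is supermodular exactly when its negation is submodular."""
    return is_submodular(lambda s: -u(s), ground)


# Hypothetical coverage map: each sensor observes a few numbered regions.
COVERAGE = {"a": {1, 2}, "b": {2, 3}, "c": {3, 4}}


def coverage_utility(s):
    """Number of distinct regions observed: overlapping sensors add less."""
    covered = set()
    for sensor in s:
        covered |= COVERAGE[sensor]
    return len(covered)


def all_or_nothing_utility(s):
    """Useful only when all three sensors report, as in the plant example."""
    return 1.0 if len(s) == 3 else 0.0


print(is_submodular(coverage_utility, COVERAGE))
print(is_supermodular(all_or_nothing_utility, COVERAGE))
```

The coverage utility models redundant nearby sensors, while the all-or-nothing utility models an application that needs every kind of reading at once.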
We show that the optimum sequence of sets for submodular functions
can be found in polynomial time, while optimizing the cost-effectiveness
of supermodular functions is NP-hard. For a practically important sub-
class of supermodular functions, we present an LP-based solution if
nodes can send for different amounts of time, and show that we can
achieve an O(log n) approximation ratio if each node has to send for the
same amount of time. Finally, for geometric covering objectives, we show
that finding the best sensor set is NP-hard, and unless P=NP, the
optimum solution cannot be approximated
With the utility-based lifetime management framework, one can decide
which data, and how much, to retrieve from the sensor network. Because
the approach depends on application-specified utility functions, it is
natural for such a decision to be made centrally, either on a back-end
server or on a powerful base station node. This in turn requires the
ability to broadcast a command efficiently into the network. Next in
this thesis, we theoretically study energy-efficient broadcasting in
wireless sensor networks.
We consider the general case where each node can control its power
level, which is true of most off-the-shelf sensor nodes today, and study
how to broadcast a message energy-efficiently from the control center or
base station to all of the nodes. Most prior work has used a simpler
model of the energy cost of wireless communication, accounting only for
the analog radiation cost of transmission and ignoring the fixed cost of
the electronics in the transmission and reception circuitry of the
nodes. Furthermore, some node pairs in a network may be unable to
communicate directly, even though they are within each other's radio
range, because of obstacles in the terrain. Here, we consider a network
of nodes with obstacles, use a more general model of the energy cost of
communication, and develop lower bounds and algorithms for one-to-all
broadcasting. We show a theoretical lower bound of $\Omega(\log N)$ on
the approximation ratio of any polynomial-time algorithm for this
problem unless P = NP, where N is the number of nodes in the network. We
present a broadcasting algorithm, called GBA, which meets this lower
bound up to a constant factor. This also improves upon a recently
published result for broadcasting in a network where nodes have k power
levels, from an $O(\log^3 N)$ approximation ratio [42] to $O(\log N)$
for the symmetric cost case.
For practical network scenarios consisting of a few tens to several
hundred nodes, we calculate two simple lower bounds for broadcasting:
the longest shortest-path cost, and the cost of multicasting to two
distant destinations. We show through extensive simulations that our GBA
algorithm is within a small factor of these lower bounds, implying that
it performs well in practice.
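The first of these bounds can be sketched in a few lines. Any broadcast must in particular deliver the message to the node that is most expensive to reach, so the cheapest path to that node costs no more than the cheapest broadcast. The cost model below (a fixed transmit cost, a fixed receive cost, and a radiation term that grows with distance) follows the description above in spirit, but every constant, the path-loss exponent, and the function names are assumptions made for illustration, not values from the thesis.

```python
import heapq
import math

# Assumed cost parameters (illustrative values only).
TX_FIXED = 1.0     # fixed electronics cost per transmission
RX_FIXED = 0.5     # fixed electronics cost per reception
RADIATION = 0.01   # coefficient of the distance-dependent radiation cost
PATH_LOSS = 2.0    # path-loss exponent


def link_cost(p, q):
    """Energy to send one message from position p and receive it at q."""
    return TX_FIXED + RADIATION * math.dist(p, q) ** PATH_LOSS + RX_FIXED


def shortest_path_costs(positions, links, root):
    """Dijkstra over usable links; pairs blocked by obstacles are not listed."""
    adjacency = {node: [] for node in positions}
    for a, b in links:
        cost = link_cost(positions[a], positions[b])
        adjacency[a].append((b, cost))
        adjacency[b].append((a, cost))
    best = {root: 0.0}
    heap = [(0.0, root)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > best.get(node, math.inf):
            continue  # stale heap entry
        for nxt, step in adjacency[node]:
            candidate = cost + step
            if candidate < best.get(nxt, math.inf):
                best[nxt] = candidate
                heapq.heappush(heap, (candidate, nxt))
    return best


def longest_shortest_path_bound(positions, links, root):
    """Cost of the cheapest path to the node that is most expensive to reach."""
    costs = shortest_path_costs(positions, links, root)
    if len(costs) < len(positions):
        raise ValueError("some nodes are unreachable from the root")
    return max(costs.values())


# Hypothetical 4-node layout; the root-c link is omitted to model an obstacle.
POSITIONS = {"root": (0, 0), "a": (10, 0), "b": (20, 0), "c": (10, 10)}
LINKS = [("root", "a"), ("a", "b"), ("a", "c")]
print(longest_shortest_path_bound(POSITIONS, LINKS, "root"))
```

The bound is loose by design: it ignores the cost of covering all the other nodes, which is why it is cheap to compute and useful only as a sanity check on a broadcast tree's cost.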
Capacity Efficiency
We next examine the problem of achieving capacity efficiency. For
systems where sensors are deployed to collect high-rate data, such as
structural health monitoring, seismic monitoring, and volcanic
monitoring systems, the most important requirement is to deliver data as
fast as possible. However, to use the network capacity efficiently, the
sending rate of the sensors must be controlled. Because network capacity
is limited, injecting too much data into the network simply causes
congestion and leads to packet loss. In addition, network capacity
varies with time because of environmental dynamics. An efficient rate
control scheme that can accurately estimate the maximum safe rate is
therefore very important in these systems.
Rate control for congestion mitigation and avoidance has received
significant attention in the sensor networks literature [58, 29, 83, 16].
Rate control can be either distributed or centralized. In distributed
rate control, each node locally determines a safe transmission rate that
avoids network congestion. For example, IFRC [58], a recently proposed
technique, employs a novel congestion sharing mechanism and standard
AIMD rate adaptation to achieve fair and efficient rates in a
distributed manner in a wireless sensor network. However, techniques
such as IFRC can converge on rates that are slightly less than optimal,
require careful parameter tuning to perform well in a given environment,
and make the software on sensor nodes more complex.
Centralized techniques, on the other hand, have the advantages of
simplicity and optimality, but face the challenge of dealing with
network dynamics. A significant thread in the literature has examined
centralized rate allocation. This body of work seeks to determine
optimal transmission rates, given information about topology, link loss
rates, and the communication pattern. In many settings, determining the
optimal rate is computationally intractable [32, 38], and many
heuristics have been proposed for obtaining near-optimal rate
allocations. To our knowledge, however, no work has examined whether
centralized rate allocation is actually feasible in a real-world
wireless network.
In this thesis, we take a step back and ask: is near-optimal
quasi-static centralized rate allocation even feasible for wireless
sensor networks? This approach is attractive because of its potential
efficiency and simplicity: near-optimal rate assignments would lead to
higher overall network efficiency, and centralized rate assignment would
reduce the complexity of sensor node software. However, intuition
suggests that quasi-static centralized rate allocation is unlikely to
work well in practice, since wireless loss rates vary dynamically and
heuristics for near-optimal rate allocation are computationally
expensive.
Formally, we study the following problem: given a network of N wireless
sensor nodes, each using a CSMA medium access layer and each having a
backlog of data to send to a base station, and given the routing tree
used by these nodes to send traffic to the base station, we seek to
centrally (at the base station) allocate a fair and efficient rate to
each sensor such that the goodput achieved at the base station in a
real-world network matches the assigned rate for each sensor.
Furthermore, our rate allocation is quasi-static: rate assignments are
recomputed once every epoch, where the duration of an epoch is on the
order of tens of minutes.
Clearly, a purely static centralized rate allocation is infeasible,
since wireless propagation characteristics vary widely with time; the
next best approach is to consider infrequent rate adjustments, as we do
with quasi-static rate allocation. Our problem statement does simplify
one aspect of reality: we do not consider routing changes due to node
failures. Our methods can be adapted to trigger rate allocation
decisions when a node failure is detected by the base station. Such an
adaptation will work well in environments where node failures do not
occur at shorter time scales than rate adaptation.
First, we design a fast rate assignment heuristic that, given the
topology of the wireless sensor network and the loss rates on the
routing tree links, assigns a fair and efficient rate to each node. This
heuristic essentially computes a coarse-grained TDMA schedule among all
neighbors of a node in order to determine the traffic level in the
vicinity of that node. The node with the highest traffic level
determines the fair rate allocation; recursive applications of the
heuristic allow nodes unconstrained by the bottleneck node to achieve
higher rates. Second, we show how a single parameter C suffices to adapt
the rate assignments to account for the facts that, in practice,
wireless sensor radios use CSMA, wireless loss rate estimation
techniques can be imperfect, and wireless loss rates vary even over time
scales of tens of minutes. The rate adaptation parameter C is
empirically derived by observing the behavior of the network during an
epoch.
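The bottleneck idea can be illustrated with a deliberately simplified, single-pass sketch. It is not the heuristic of Chapter 3: it assumes that every lost packet is retransmitted until it succeeds (so a link with loss rate p needs 1/(1-p) transmissions per packet on average), that a node contends only with the neighbors listed for it, that the base station itself adds no load, and it omits the recursive step that lets unconstrained nodes send faster. All function and parameter names are hypothetical.

```python
def subtree_sizes(parent):
    """Number of sources whose traffic each node forwards, itself included."""
    sizes = {node: 1 for node in parent}
    for node in parent:
        hop = parent[node]
        while hop in parent:  # stop at the base station, which is not a key
            sizes[hop] += 1
            hop = parent[hop]
    return sizes


def airtime_demand(parent, loss):
    """Expected transmissions per node if every source sends one packet."""
    sizes = subtree_sizes(parent)
    return {node: sizes[node] / (1.0 - loss[node]) for node in parent}


def fair_rate(parent, loss, neighbors, capacity, c=1.0):
    """Return (common per-source rate, bottleneck node).

    parent    -- routing tree: node -> next hop toward the base station
    loss      -- loss rate of each node's link to its parent, in [0, 1)
    neighbors -- nodes that share the channel with each node
    capacity  -- transmissions per second the shared channel can carry
    c         -- scaling factor playing the role of the adaptation parameter
    """
    demand = airtime_demand(parent, loss)
    load = {
        node: demand[node]
        + sum(demand[m] for m in neighbors[node] if m in demand)
        for node in parent
    }
    bottleneck = max(load, key=load.get)
    return c * capacity / load[bottleneck], bottleneck


# Hypothetical 3-node chain: sink <- a <- b <- c, with a lossy b -> a link.
PARENT = {"a": "sink", "b": "a", "c": "b"}
LOSS = {"a": 0.0, "b": 0.5, "c": 0.0}
NEIGHBORS = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(fair_rate(PARENT, LOSS, NEIGHBORS, capacity=100.0))
```

In this toy chain the middle node is the bottleneck because it both forwards traffic over a lossy link and hears both of its neighbors; lowering `c` below 1 backs the assigned rate off from the estimate, which is the role the text assigns to the parameter C.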
Contrary to intuition, we find through extensive experiments that our
quasi-static centralized rate allocation (QCRA) performs well on a
40-node wireless testbed even under harsh conditions, and is sometimes
even near optimal.
Scalability
Finally, we study a technique for achieving scalability in routing for
sensor networks, especially for data-centric routing. For large systems
that need in-network data-centric storage, it is important to design a
scalable routing protocol that provides data-centric routing primitives.
Prior work on data-centric routing uses either flooding or geometric
routing. Anecdotal evidence from current deployments suggests that
flooding adversely impacts performance even in networks with tens of
nodes. The alternative, geometric routing, is currently infeasible
because it requires accurate geographic locations.
In this thesis, we consider an alternative approach to providing routing
primitives for data-centric abstractions. We observe that until
practical ad hoc localization systems are developed, early deployments
of wireless sensor networks will manually configure location information
in network nodes in order to assign spatial context to sensor readings.
We argue that such deployments will use hierarchical location names (for
example, a node in a habitat monitoring network might be said to be node
number N in cluster C of region R), rather than positions in a two- or
three-dimensional coordinate system. In fact, we know of at least two
deployments [79, 1] that use such a naming scheme to assign spatial
context to sensor readings.
We consider deployments where nodes are configured with hierarchical
location identifiers (HLIs). An HLI is simply a machine-readable form of
a hierarchical location name. The central idea is that these HLIs can be
used to build a scalable routing system (which we call HLR) for sensor
networks. In HLR, the location naming hierarchy implicitly defines an
area hierarchy. Nodes in an area hierarchy maintain detailed routing
information about nodes within their area, and less detail about nodes
outside their area.
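The "more detail inside, less detail outside" rule can be sketched as a lookup on HLI prefixes. The sketch assumes that an HLI is a fixed-depth tuple such as (region, cluster, node) and that a node keeps one routing entry per sibling area at each level of the hierarchy, as in classical area-hierarchy routing; the actual table layout used by HLR is the one described in Chapter 4, and the function names here are hypothetical.

```python
def route_key(own_hli, dest_hli):
    """Routing-table key a node would consult for a destination HLI."""
    common = 0
    for mine, theirs in zip(own_hli, dest_hli):
        if mine != theirs:
            break
        common += 1
    # One component past the shared prefix names the sibling area, or the
    # destination node itself inside the innermost shared area.
    return dest_hli[: common + 1]


def table_keys(own_hli, all_hlis):
    """Distinct entries a node needs in order to reach every other node."""
    return {route_key(own_hli, hli) for hli in all_hlis if hli != own_hli}


# Hypothetical network: 2 regions x 2 clusters x 2 nodes.
NODES = [(r, c, n) for r in (1, 2) for c in (1, 2) for n in (1, 2)]
OWN = (2, 2, 1)

print(route_key(OWN, (2, 2, 2)))   # same cluster: full HLI
print(route_key(OWN, (2, 1, 2)))   # same region, other cluster
print(route_key(OWN, (1, 2, 2)))   # other region
print(len(table_keys(OWN, NODES)), "entries for", len(NODES) - 1, "other nodes")
```

With a balanced hierarchy the number of entries grows with the depth of the hierarchy and the fan-out at each level rather than with the total number of nodes.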
HLR constructs and maintains these routing tables using a variant of the
distance-vector routing protocol DSDV [55]. While the basic design of
HLR borrows heavily from the routing literature for wired networks, it
incorporates two novel features. The first is a technique for
automatically aggregating routing entries at area boundaries, which
allows neighboring areas to maintain summarized views of an area.
Assuming the hierarchy of the network is appropriately designed, the
size of the routing table can grow logarithmically with network size.
The second is a mechanism for routing to partitioned areas; classical
area-hierarchy-based algorithms assume that areas are connected.
In addition to supporting unicast, HLR could also be used to support
area-based multicast or anycast. Unicast can be used for tasking
individual nodes. Area-based multicast enables any node in the network
to deliver a message to the subset of nodes that share a common prefix
in their HLIs. For example, area-based multicast can be used in TinyDB
to deliver a query to a set of nodes around a monitored plant to start
collecting data, without having to localize any node within the queried
area.
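Membership in an area-based multicast group is then a prefix test on the HLI. The following minimal sketch again assumes tuple-valued HLIs; the function names and the example network are hypothetical.

```python
def in_area(hli, area_prefix):
    """True if the node named by hli lies inside the area named by the prefix."""
    return hli[: len(area_prefix)] == area_prefix


def multicast_targets(nodes, area_prefix):
    """All nodes that should receive a message addressed to the area."""
    return [hli for hli in nodes if in_area(hli, area_prefix)]


# Hypothetical network: 2 regions x 2 clusters x 2 nodes.
ALL_NODES = [(r, c, n) for r in (1, 2) for c in (1, 2) for n in (1, 2)]

# Address every node in cluster 2 of region 1, e.g. the sensors around one plant.
print(multicast_targets(ALL_NODES, (1, 2)))
```

An area anycast would use the same test but stop at the first matching node that the message reaches.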
Using the routing tables that HLR constructs, it is possible to provide
a variety of packet routing primitives: unicast to a specified node
within the network, broadcast or anycast to a specified area, and
rendezvous using a random hash or a locality-preserving mapping.
Particularly novel in HLR is the design of the rendezvous primitives,
since previous designs of such primitives for sensor networks leveraged
geographic positioning. These primitives can be used for data-centric
routing systems like Diffusion and TinyDB, as well as for data-centric
storage systems like GHT [62] and DIM [41].
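A random-hash rendezvous needs a function that maps a data name to a location without geographic coordinates. One possible construction, given here only as an assumption-laden sketch, hashes the name and uses the digest to pick one child at each level of the naming hierarchy. It presumes that every node knows the shape of the hierarchy (represented below as nested dictionaries with empty dictionaries at the leaves); the mapping actually used by HLR is described in Chapter 4 and may differ.

```python
import hashlib


def rendezvous_hli(key, hierarchy):
    """Map a data name to a leaf HLI by hashing down the naming hierarchy."""
    digest = int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest(), "big")
    path = []
    level = hierarchy
    while level:  # an empty dict marks a leaf (an individual node)
        children = sorted(level)
        choice = children[digest % len(children)]
        digest //= len(children)
        path.append(choice)
        level = level[choice]
    return tuple(path)


# Hypothetical hierarchy: 2 regions, each with 2 clusters of 2 nodes.
HIERARCHY = {
    region: {cluster: {node: {} for node in (1, 2)} for cluster in (1, 2)}
    for region in (1, 2)
}

# Producers and consumers of "temperature" compute the same HLI independently.
print(rendezvous_hli("temperature", HIERARCHY))
```

Because the result depends only on the name and the hierarchy, a node storing data and a node querying for it arrive at the same rendezvous point and can reach it with ordinary unicast.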
We have implemented HLR in TinyOS, and have implemented simplified
versions of data-centric routing and storage systems that use HLR's
routing primitives. We use extensive simulations to compare the
performance of HLR-based data-centric routing and storage with systems
that use geographic routing. We find that the performance of the two
classes of systems is comparable: while aggregated route entries
increase the average path length in HLR, geographic-routing-based
rendezvous sometimes incurs significant overhead in walking the outer
perimeter. We also evaluate the behavior of HLR under dynamics, finding
that route changes caused by link failures can often be confined to a
small area and are not propagated throughout the network. Finally, we
report experiences from running HLR on a small network of Mica2 motes.
Taken together, these results imply that HLR is a viable routing layer
for many kinds of sensor networks and can be employed immediately in
near-term sensor network deployments.
Our reliance on configured node addresses may seem awkward,
given the networking community's experience with manual configuration
in the Internet context. We make two observations in our defense. First,
many Internet components (the backbone routing system, the name sys-
tem) are still manually configured. Second, unlike the Internet which is
comprised of different administrative organizations, sensor networks are
likely to be managed and deployed by one organization. Furthermore,
until precise self-localization technology is deployed, we expect that sen-
sor network deployments will need to be carefully planned, with human
involvement in identifying each node's position. Given this, it is a small
step to configure these positions on nodes (and techniques can be devel-
oped to reduce the error in this configuration step).
Our Contributions
Figure 1: Design Space for Efficient Information Delivery in WSN
Figure 1 shows the design space for efficient information delivery in wire-
less sensor networks and our contributions within this design space. We
will discuss our work on utility-based lifetime management in chapter 1,
energy-efficient broadcasting in chapter 2, QCRA in chapter 3, and HLR
in chapter 4.
Many other works in the literature contribute to the other regions of
the efficiency design space. The recently proposed Tenet architecture
[25] fills in the space of scalability at the application layer. Reliable
routing techniques [13, 23] fill in the space of capacity efficiency at
the routing layer. Energy-aware reliable transport such as ESRT [66]
fills in the space of energy efficiency at the transport layer. For the
spaces where we have contributions, there are also many other works in
the literature, and we discuss each in detail in the related work section
of each chapter.
Chapter 1
Utility-based Sensor Selection
1.1 Introduction
Sensor networks consist of many small sensing devices that monitor an
environment and communicate using wireless links. The limited en-
ergy of sensor nodes curtails the lifetime of sensor network deployments.
Therefore, a large body of literature has examined methods for extend-
ing sensor network lifetime by carefully managing communication and
computation resources.
One thread has focused on techniques to maximize sensor network life-
time by using routing or topology control [15, 67, 40, 9, 88, 68, 34, 17].
Somewhat orthogonal to this thread is the approach of asking which
and how much data can be collected over the lifetime of a network. Work
on this question is usually centered around one of two paradigms: (1)
Maximize the total amount of data collected, or (2) Collect data from all
sensors for as long as possible. The first of these assumes that all data
is equally interesting or important to the application. In particular, op-
timizing for the total amount of data will give undue preference to data
measured close to a base station, as that data can be extracted at rel-
atively low cost. On the other hand, the second objective is based on
the assumption that data collection is only worthwhile if data from all
sensors is being collected, and as soon as one node's energy is depleted,
the network may as well not collect any data.
Clearly, both approaches are oversimplifications of reality. Which data
is useful depends heavily upon the specific application and its needs. In
many cases, individual sensor readings will be of little use; the appeal
of sensor networking lies in the ability to aggregate and correlate sensor
readings from different locations. In other scenarios, readings from a
group of proximate sensors may be considered redundant, and it would
suffice to obtain a reading from one of them.
In this chapter, we argue that the algorithms for deciding which data to
retrieve should be sufficiently generic to let the application specify how
useful particular measurements are. We call the usefulness of a concur-
rently measured sensor set S its utility u(S). Given a network topology
and application-specified utility function u(), it is then the algorithm's
decision how to trade off the utility and energy consumption of sensor
sets in an optimal way, and maximize the total utility extracted from
the network until the network ceases to function. Such a utility-based
sensor selection approach has also been proposed recently by Byers and
Nasser [6].
Ideally, one would like to be able to find an (approximately) optimal se-
quence of sets to measure, together with an associated communication
scheme, for arbitrary monotone utility functions. This goal seems very
ambitious: as we show, the problem of selecting an optimal sequence of
sets is NP-hard in many settings. In this chapter, we explore three natural
and practically important classes of utility functions in more detail.
Specifically, we focus on submodular functions (with returns for additional
sensors diminishing for larger sets), supermodular functions (with returns
increasing for larger sets), and a general framework of geometric covering
objectives. In the latter case, we show that the utility-based approach is
analogous to a penalty-based approach that characterizes the penalty of
a sensor set as its collective distance from the targets to be measured.
We show that the optimum sequence of sets for submodular functions
can be found in polynomial time, while optimizing the cost-effectiveness
of supermodular functions is NP-hard. For a practically important sub-
class of supermodular functions, we present an LP-based solution if
nodes can send for different amounts of time, and show that we can
achieve an O(log n) approximation ratio if each node has to send for the
same amount of time. Finally, for geometric covering objectives, we show
that finding the best sensor set is NP-hard, and unless P=NP, the opti-
mum solution cannot be approximated to within better than a multi-
plicative factor of 1.822.
The chapter is structured as follows. Section 1.2 formally describes the
utility-based sensor selection problem. Sections 1.3 and 1.4 then exam-
ine the feasibility of solving several variants of the utility-based sensor
selection problem for submodular and supermodular functions. Sec-
tion 1.5 discusses geometric coverage objectives. Finally, Section 1.6
discusses related work and Section 1.7 summarizes our contributions.
1.2 Network and Problem Setup
1.2.1 Network Model
Formally, our network model can be described as follows. The network
is considered as a directed graph $G = (V, E)$ with $n = |V|$ nodes including
a special root node $r \in V$. Each non-root node $v$ has a finite initial energy
$E_v$. If node $v$ is to sense data, it incurs a sensing cost $\sigma_v$. To transmit
data along the link $e = (v, w) \in E$, node $v$ incurs a transmission cost of
$\tau_e$ per unit of data. The total of sensing cost and transmission cost may not
exceed a node's energy. When the two are equal, we say that the node's
energy is depleted.
This network model is consistent with current practice in sensor network
deployment. Most existing deployments, including the James Reserve
habitat monitoring network [1], the Great Duck Island network [79, 80],
and the Extreme Scaling network [2] are tiered: they consist of a large
number of small battery-powered motes, sending data (perhaps after
some local processing) to a smaller number of well endowed upper-tier
32-bit embedded nodes (e.g. Stargates). The small-form-factor motes
are most often battery equipped, and thus constrain the system lifetime,
while the upper-tier nodes are much more powerful, and rarely impose
additional constraints. For our purposes, this means that once data has
reached the upper-tier, it can be extracted from the network. This al-
lows us to consider all upper-tier nodes as one base station or root, for
the purposes of theoretical treatment.
This network model is also consistent with at least one proposed sensor
network architecture [24]. That architecture proposes to place application-
specific functionality in the upper-tier nodes. Applications can generi-
cally task collections of motes. Each mote senses the environment, pro-
cesses the sensed data as specified in a task, and returns the data to
an upper-tier node. Indeed, our problem formulation is motivated by
this architecture: applications may use our utility-based sensor selec-
tion framework to determine which set(s) of sensors to task.
1.2.2 Utility-Based Sensor Selection
We are now ready to characterize the optimization problem of utility-
based sensor selection. In addition to the network with costs and initial
energies for nodes, we are given a monotone utility function
$u : 2^V \to \mathbb{R}^+$.
The meaning of this function is that if all nodes of $S$ are measured
(nearly) concurrently, then the application derives a benefit of $u(S)$ from
this measurement. The goal is to calculate a sequence of measurements
that gives the maximum total benefit until the sensor nodes' energy is de-
pleted.
Utilities naturally capture application requirements in many practical
sensor network scenarios. For example, in a beam-forming application, a
non-collinear set of sensors has a higher utility than one that is collinear.
In a surveillance application, a set of (approximately) evenly spaced sen-
sors around the periphery of the sensed area has a higher utility than
a clustered set of sensors in one corner of the sensed area. In a struc-
tural monitoring application, a set of sensors in which none are on the
nodes (nulls in the structural mode shapes) has a higher utility than a
set where some are.
More formally, a feasible solution consists of a collection
$\mathcal{S} = \{S_1, \ldots, S_k\}$ of subsets of $V$ to be measured, with a
non-negative amount of data $t_i$ to be transmitted from each set $S_i$. In
addition, the solution must specify a flow $f$ of value
$\sum_{i : v \in S_i} t_i$ from each node $v$ to the root node $r$, such that the
total cost of flows and sensing does not exceed any node's energy. (This
flow captures the delivery of all relevant data from node $v$ to the root.)
The goal is then to find a feasible solution of maximum total utility.
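We can state this as a linear program (with exponentially many variables)
as follows. For each set $S \subseteq V$, we have a variable $t_S$ specifying the
amount of data sent from $S$ to the root. (Here, we denote by $\delta^+(v)$ and
$\delta^-(v)$ the set of outgoing and incoming edges of $v$, respectively.)
$$
\begin{array}{lll}
\text{Maximize} & \sum_{S \subseteq V} u(S)\, t_S & \\
\text{subject to} & \sum_{e \in \delta^+(v)} f_e - \sum_{e \in \delta^-(v)} f_e = \sum_{S : v \in S} t_S & \forall v \neq r \\
 & \sum_{e \in \delta^+(v)} \tau_e f_e + \sum_{S : v \in S} \sigma_v t_S \le E_v & \forall v \neq r.
\end{array}
$$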
The first constraint characterizes flow conservation: at every node, the
total amount of flow leaving exceeds the amount entering exactly by the
amount of data that the node itself is sending. The second constraint
simply states that, for every node, the total of transmission and sensing
cost does not exceed the available energy.
Notice that the LP formulation does not contain a “time” component.
Indeed, the order in which the sets transmit their data does not affect
the total utility derived from a sequence. It is also not difficult to see that
the flow $f$ can be decomposed into different flows $f^S$ for each sending set
$S$, while keeping the total amount $\sum_{S \subseteq V} f^S_e$ transmitted along any link
$e$ the same as $f_e$. Hence, we are justified in considering the simplified
formulation above.
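The two constraints of the LP can be checked mechanically for a candidate solution. The following is a minimal Python sketch under stated assumptions, not the implementation used in this work: the function names `is_feasible` and `total_utility`, the dictionary layout, and the three-node line network are all hypothetical and chosen only for illustration.

```python
def is_feasible(tx_cost, energy, sense_cost, t, flow, tol=1e-9):
    """Check flow conservation and the energy budget for every non-root node.

    tx_cost:    {(v, w): transmission cost per unit of data on link (v, w)}
    energy:     {v: initial energy} for the non-root nodes
    sense_cost: {v: sensing cost per unit of data}
    t:          {frozenset of nodes: amount of data sent by that set}
    flow:       {(v, w): amount of flow on link (v, w)}
    """
    for v in energy:
        sent = sum(amount for nodes, amount in t.items() if v in nodes)
        out_flow = sum(f for (a, _), f in flow.items() if a == v)
        in_flow = sum(f for (_, b), f in flow.items() if b == v)
        # Flow conservation: outgoing minus incoming equals the node's own data.
        if abs(out_flow - in_flow - sent) > tol:
            return False
        # Energy: transmission cost plus sensing cost must fit in the budget.
        spent = sum(tx_cost[e] * f for e, f in flow.items() if e[0] == v)
        if spent + sense_cost[v] * sent > energy[v] + tol:
            return False
    return True


def total_utility(u, t):
    """Objective value: utility of each set times the amount of data it sends."""
    return sum(u(nodes) * amount for nodes, amount in t.items())


# Hypothetical line network b -> a -> r with unit transmission costs.
TX_COST = {("b", "a"): 1.0, ("a", "r"): 1.0}
ENERGY = {"a": 4.0, "b": 2.0}
SENSE_COST = {"a": 0.5, "b": 0.5}

ok_t = {frozenset({"a", "b"}): 1.0}
ok_flow = {("b", "a"): 1.0, ("a", "r"): 2.0}
too_much_t = {frozenset({"a", "b"}): 2.0}
too_much_flow = {("b", "a"): 2.0, ("a", "r"): 4.0}
```

In this toy instance node `a` relays the data of `b`, so doubling the amount of data exhausts the energy budget and the second candidate is rejected.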
We term the problem as formalized above the Utility-based Sensor Selec-
tion Problem (USSP). Notice that the formulation is very similar to that
proposed by Byers and Nasser [6], but the focus of our work is signifi-
cantly different from theirs (Section 1.6). Several natural variants of the
problem may be of practical importance:
- In the Integral Utility-based Sensor Selection Problem (IUSSP), all
  data transmission amounts $t_S$ are required to be integral. This vari-
  ant is important in applications which require data collection in dis-
  crete data units (for example, [51] describes a scenario in which
  a structure's response needs to be sampled for no less than 1
  second at high enough frequency in order to have enough samples
  to compute an FFT).
- In the Utility-based One Set Sensor Selection Problem (UOSSP), the
  amount of data is allowed to be positive for at most one set, i.e.
  we are interested in collecting data only from one set. This case is
  of importance in determining the set with the best utility-to-energy
  tradeoff.
- Naturally, the previous two variants can be combined, by requir-
  ing that only a single set transmit data, and do so for an integral
  amount of time. We then obtain the Integral Utility-based One Set
  Sensor Selection Problem (IUOSSP).
Notice that even the LP for the basic USSP problem is unlikely to be
solvable explicitly for arbitrary utility functions, due to its exponential
size. Indeed, even specifying a utility function completely would require
giving a value for each set S, and thus space exponential in the number n
of sensors. These two observations motivate studying restricted classes
of utility functions of practical importance. In the following sections, we
investigate more closely submodular and supermodular functions, as
well as utility functions applicable to geometric coverage settings.
1.3 Submodular Utility Functions
In game theory, submodular functions are frequently studied as a natu-
ral restriction of utility functions [82]. Recall that a function
$u : 2^V \to \mathbb{R}$ is submodular if
$u(S_1 \cup S_2) + u(S_1 \cap S_2) \le u(S_1) + u(S_2)$ for all sets $S_1, S_2$.
The practical importance of submodular functions is more evident from
an equivalent characterization: $u$ is submodular if and only if
$u(S_1 + v) - u(S_1) \le u(S_2 + v) - u(S_2)$ for all sets $S_2 \subseteq S_1$ and
elements $v \notin S_1$. This inequality captures the idea of “diminishing
returns”: the benefit of adding the element $v$ to a set is non-increasing
as the set that $v$ is added to grows. In sensor networks, utility functions
will be submodular if the data measured by different sensors is (partially)
redundant; they will cease to be so if multiple measurements yield
superlinear benefit.
For example, when sensors are deployed for measuring the average tem-
perature, one natural notion of utility is the expected reduction in vari-
ance of the estimate of average temperature. Under some natural as-
sumptions, this function is submodular. In order to maximize the sys-
tem lifetime utility, one will then be trading off accuracy in the estimate
against taking measurements for a longer time.
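For small ground sets, the defining inequality of submodularity can be verified by brute force. The sketch below is illustrative only: it swaps the variance-reduction utility of the example for a simpler, hypothetical coverage utility (the number of distinct regions observed), and the names `COVERAGE`, `coverage_utility`, and `squared_size_utility` are assumptions, not part of this work.

```python
from itertools import combinations


def powerset(items):
    """All subsets of items, as frozensets."""
    items = list(items)
    return [frozenset(c)
            for size in range(len(items) + 1)
            for c in combinations(items, size)]


def is_submodular(u, ground):
    """Brute-force check of u(A | B) + u(A & B) <= u(A) + u(B) for all A, B."""
    subsets = powerset(ground)
    return all(u(a | b) + u(a & b) <= u(a) + u(b)
               for a in subsets for b in subsets)


# Hypothetical deployment: each sensor observes a set of regions.
COVERAGE = {"a": {1, 2}, "b": {2, 3}, "c": {3}}


def coverage_utility(sensors):
    """Number of distinct regions covered (redundant readings add nothing)."""
    covered = set()
    for sensor in sensors:
        covered |= COVERAGE[sensor]
    return len(covered)


def squared_size_utility(sensors):
    """Superlinear benefit from joint measurements; not submodular."""
    return len(sensors) ** 2
```

The coverage utility passes the check because overlapping regions are counted once, while the squared-size utility fails it, matching the remark that superlinear benefit breaks submodularity.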
In this section, we show that USSP can be solved in polynomial time for
arbitrary submodular functions. The key observation is that, without
loss of generality, an optimal solution never measures two sensors at
the same time. This observation then allows us to reduce the size of the
linear program to polynomial, and solve it explicitly.
Lemma 1 Without loss of generality, the optimum solution to USSP only
retrieves data from singleton sets, i.e. $t_S = 0$ for all non-singleton sets $S$.
Proof. Let $((t_S), (f_e))$ be an optimal solution for USSP, and assume that
the set $S = \{v_1, v_2, \ldots, v_k\}$ (with $k \ge 2$) transmits $t_S > 0$ amount of data.
Consider a solution which sets $t'_S := 0$, and instead increases
$t'_{\{v_i\}} := t_{\{v_i\}} + t_S$ for $i = 1, \ldots, k$. Each node $v$ in the new
solution still needs to transmit the same total amount of data as before,
so $f$ is still a valid flow, and the new $((t'_S), (f_e))$ is a feasible solution.
From the diminishing returns property, we can inductively derive that
$u(S) \le \sum_{i=1}^{k} u(\{v_i\})$, and as the $t_{S'}$ values for all other sets $S'$
remain unchanged, the total utility of the solution changes by exactly
$t_S \left( \sum_{i=1}^{k} u(\{v_i\}) - u(S) \right) \ge 0$. In particular, the new
solution is at least as good as the original
one. By continuing in this way, we eventually arrive at a feasible solution
in which only singleton sets have non-zero amounts of data transmitted,
and which has at least the same total utility.
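The exchange step in the proof can be sketched in a few lines of Python. This is a hedged illustration rather than the author's code: `split_into_singletons`, `node_load`, and the coverage utility used to exercise them are hypothetical names and data.

```python
def split_into_singletons(t):
    """Move every set's data amount onto the singletons of its members."""
    result = {}
    for nodes, amount in t.items():
        for v in nodes:
            key = frozenset([v])
            result[key] = result.get(key, 0) + amount
    return result


def node_load(t):
    """Total amount of data each node has to send under solution t."""
    load = {}
    for nodes, amount in t.items():
        for v in nodes:
            load[v] = load.get(v, 0) + amount
    return load


def total_utility(u, t):
    return sum(u(nodes) * amount for nodes, amount in t.items())


# Hypothetical submodular utility: number of distinct regions covered.
COVERAGE = {"a": {1, 2}, "b": {2, 3}, "c": {3}}


def coverage_utility(sensors):
    covered = set()
    for sensor in sensors:
        covered |= COVERAGE[sensor]
    return len(covered)


original = {frozenset({"a", "b"}): 2, frozenset({"c"}): 1}
singletons = split_into_singletons(original)
```

Because every node sends the same amount of data before and after the split, the same flow remains feasible, and for a submodular utility the objective can only stay the same or grow.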
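Based on the observation in the lemma, we can remove all variables
for non-singleton sets from the LP for USSP, arriving at the following
modified LP:
$$
\begin{array}{lll}
\text{Maximize} & \sum_{v \in V} u(\{v\})\, t_v & \\
\text{subject to} & \sum_{e \in \delta^+(v)} f_e - \sum_{e \in \delta^-(v)} f_e = t_v & \forall v \neq r \\
 & \sum_{e \in \delta^+(v)} \tau_e f_e + \sigma_v t_v \le E_v & \forall v \neq r.
\end{array}
$$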
This new LP has polynomial size, and can thus be solved explicitly in
polynomial time. Notice also that it is enough for the application to
specify the utilities of singleton sets. Thus, our observation allows us
to circumvent the problem of needing exponential space to describe the
utility function completely; a complete description is not necessary to
find the optimum solution.
Figure 1.1: MAXIMUM KNAPSACK graph with $|U| = 2$
Although USSP with submodular functions is solvable in polynomial time
through LP, we can prove that the version requiring integral amounts of
data is NP-hard by a reduction from the maximum knapsack problem.
Theorem 2 IUSSP with modular utility functions is NP-hard.
Proof. We will give a reduction from the MAXIMUM KNAPSACK problem: given
a finite set $U$, for each $u \in U$ a size $S_u \in \mathbb{Z}^+$ and a value
$\mathrm{value}(u) \in \mathbb{Z}^+$, and a positive integer $B \in \mathbb{Z}^+$, find a
subset $U' \subseteq U$ maximizing the total value of the chosen elements, i.e.,
$\sum_{u \in U'} \mathrm{value}(u)$, such that $\sum_{u \in U'} S_u \le B$.
Given an instance of MAXIMUM KNAPSACK, we construct an instance of
IUSSP with a submodular utility function. The graph $G = (V, E)$ has a
node $v_i$ and a carry-over node $c_i$ for each $i \in U$, as well as a bottleneck
node $\hat{v}$ and the root $r$. There is a direct link from each $v_i$ to $c_i$, from
each $v_i$ to $\hat{v}$, from each $c_i$ to $r$, and from $\hat{v}$ to $r$. All links have unit
transmission cost and all nodes have zero sensing cost. The initial energy
of each $v_i$ and of $\hat{v}$ is 1, while the initial energy of each node $c_i$ is
$1 - \frac{S_i}{B}$. The resulting graph for a finite set with two elements is
shown in Figure 1.1.
Here we can assume any submodular function as long as the utility of
each node $v_i$ is $\mathrm{value}(i)$ and the utility of all other nodes is zero. Similar
to the argument for USSP with submodular functions, the optimal solution
for IUSSP with a submodular function only retrieves data from singleton
sets. With our construction, each source node can send at most
one unit of data. Any selected node $v_i$ will send exactly $1 - \frac{S_i}{B}$
through the carry-over node $c_i$, and will send the remaining $\frac{S_i}{B}$
through the bottleneck node $\hat{v}$. Due to the initial energy limitation at the
bottleneck node $\hat{v}$, the total size of the selected nodes will not exceed $B$.
The total utility of the constructed IUSSP problem is equal to the sum of
the values of the selected singleton sets. Hence the solution of the con-
structed instance of IUSSP with the submodular function is exactly the
solution to the given instance of MAXIMUM KNAPSACK, which completes
the proof.
Since modular utility functions are a special case of both submodular
and supermodular utility functions, the above theorem also proves
that IUSSP is NP-hard for both submodular and supermodular utility
functions.
The hardness proof for IUSSP with submodular functions does
not directly carry over to UOSSP with submodular functions, where one
set of nodes is to be selected with a fractional amount of data. However,
we can prove that UOSSP is also NP-hard with a small addition to the
proof of Theorem 2. The key idea is to add a gadget that forces
the selected node set to send one unit of data.
Theorem 3 UOSSP with submodular utility functions is NP-hard.
Proof. Similar to what we did in Theorem 2, we will give a reduction
from the MAXIMUM KNAPSACK problem.
Given an instance of MAXIMUM KNAPSACK, we construct an instance of
UOSSP with a submodular utility function, similar to the construction
in Theorem 2, except that we now add one more bottleneck node
$\hat{v}_2$, which is directly connected to the root $r$. The initial energy
of node $\hat{v}_2$ is 1, the transmission cost from node $\hat{v}_2$ to node $r$ is also 1,
and the sensing cost is zero. Let
$\mathrm{Utility}(\{\hat{v}_2\}) = 1 + \sum_{u \in U} \mathrm{value}(u)$. Define the
utility of a set to be the sum of the utilities of the elements in the
set.
If the selected optimal solution to the constructed instance of UOSSP sends
less than one unit of data, then the total utility is less than
$\sum_{u \in U} \mathrm{value}(u)$, while just selecting node $\hat{v}_2$ would give us at
least $1 + \sum_{u \in U} \mathrm{value}(u)$. Therefore, the resulting selected set must
send one unit of data.
As we have proved in Theorem 2, when the set of nodes has to send one
unit of data, the solution of the constructed instance is the solution to
the original MAXIMUM KNAPSACK problem.
It is obvious that IUOSSP with submodular functions is also NP-hard.
Notice that in our proof in Theorem 3, the design of the utility function
is in fact modular. The hardness for UOSSP and IUOSSP with strict
submodular functions is still unknown.
1.4 Supermodular Utility Functions
Supermodularity in a sense captures the opposite of submodularity: the
benefit of individual nodes is larger as they are added to an already larger
set; more generally, the benefit from combining two (disjoint) sets is at
least as large as the sum of the individual benefits. Formally, we say
that a function $u : 2^V \to \mathbb{R}$ is supermodular if
$u(S_1 \cup S_2) + u(S_1 \cap S_2) \ge u(S_1) + u(S_2)$ for all sets
$S_1, S_2 \subseteq V$. In sensor network applications, su-
permodular functions will occur when the application requires the com-
bination of diverse data measured (nearly) simultaneously.
For example, in an aquatic monitoring application, sensors of differ-
ent types (temperature, light, chlorophyll, ...) may be deployed in an
expanse of water. An application may be interested in a correlated mea-
surement of all light sensors, a correlated measurement of all sensors
at depth 10ft, and a measurement of all sensors that had measured a
certain phenomenon the previous day. Naturally, there may be overlap
between these sets: for instance, a light sensor may be at a depth of
10ft, and have measured the phenomenon the previous day. If the ap-
plication is such that missing even one of the relevant sensors makes
the measurement (nearly) useless, for instance because correlations are
to be analyzed in detail, then the utility derived from a particular set
S is simply the number of categories (light, 10ft, phenomenon) that are
completely contained in S. This function is supermodular - in fact, it is
a special case of the category of set-weighted functions studied in more
detail below.
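The category-counting utility from this example is easy to write down and test on a small instance. The sketch below is an illustration under assumptions: the sensor names `s1` to `s4` and their assignment to the three categories are hypothetical, and the supermodularity check is a brute-force enumeration that only scales to tiny ground sets.

```python
from itertools import combinations

# Hypothetical membership of four sensors in the three categories.
CATEGORIES = {
    "light": frozenset({"s1", "s2"}),
    "depth_10ft": frozenset({"s2", "s3"}),
    "phenomenon": frozenset({"s2", "s4"}),
}
SENSORS = ["s1", "s2", "s3", "s4"]


def category_utility(sensors):
    """Number of categories whose sensors are all contained in the set."""
    sensors = frozenset(sensors)
    return sum(1 for members in CATEGORIES.values() if members <= sensors)


def powerset(items):
    items = list(items)
    return [frozenset(c)
            for size in range(len(items) + 1)
            for c in combinations(items, size)]


def is_supermodular(u, ground):
    """Brute-force check of u(A | B) + u(A & B) >= u(A) + u(B) for all A, B."""
    subsets = powerset(ground)
    return all(u(a | b) + u(a & b) >= u(a) + u(b)
               for a in subsets for b in subsets)
```

Missing a single member of a category drops that category's contribution to zero, which is what makes adding a sensor more valuable the larger the set already is.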
Notice that a function can be submodular and supermodular at the
same time; this happens exactly when the function is of the form
$u(S) = \sum_{v \in S} u(\{v\})$, i.e. the utility of any set is simply the sum of
utilities of the individual elements in the set. Though one can solve the
same LP used for the submodular case here, notice that the resulting
optimum singletons can be combined freely to get another optimum solution.
The property of supermodularity allows us to infer an interesting prop-
erty of the optimal solution: without loss of generality, the sending sets
are nested. In other words, if $S_1, S_2$ are sets sending data, then either
$S_1 \subseteq S_2$ or $S_2 \subseteq S_1$.
Lemma 4 Without loss of generality, the optimum solution for USSP with
a supermodular utility function consists of nested sets.
Proof. Let $((t_S), (f_e))$ be an optimal solution for USSP, and assume that
there are sets $S_1, S_2$ such that neither $S_1 \subseteq S_2$ nor $S_2 \subseteq S_1$,
yet $t_{S_1} > 0$ and $t_{S_2} > 0$. Without loss of generality, assume that
$t_{S_1} \ge t_{S_2}$. Consider a solution which instead sets $t'_{S_2} = 0$,
$t'_{S_1} = t_{S_1} - t_{S_2}$, and $t'_{S_1 \cup S_2} = t_{S_1 \cup S_2} + t_{S_2}$,
as well as $t'_{S_1 \cap S_2} = t_{S_1 \cap S_2} + t_{S_2}$. In the new solution, each
node still needs to transmit the same amount of data, so $((t'_S), (f_e))$ is
still feasible.
Since each set other than $S_1$, $S_2$, $S_1 \cup S_2$, and $S_1 \cap S_2$ still
transmits the same amount of data, the change in the total system utility is
$$
\begin{aligned}
 & u(S_1 \cup S_2)\, t_{S_2} + u(S_1 \cap S_2)\, t_{S_2} - u(S_1)\, t_{S_2} - u(S_2)\, t_{S_2} \\
 &= t_{S_2} \left( u(S_1 \cup S_2) + u(S_1 \cap S_2) - u(S_1) - u(S_2) \right) \\
 &\ge 0,
\end{aligned}
$$
by the supermodularity of $u$. This reduces the number of pairs $(S_1, S_2)$ of
non-nested sets by one, as it can be easily verified that whenever
$S_1 \cup S_2$ is not nested with some set $S'$, then at least one of $S_1, S_2$ is
not nested with $S'$, either. Thus, we can continue this process inductively,
and arrive at a nested solution of at least the same total utility.
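One uncrossing step of the proof can be sketched as follows. This is a minimal illustration, not the author's code: the helper names (`uncross`, `is_nested`, `node_load`) and the three-pair set-weighted utility used to exercise them are assumptions.

```python
def set_weighted_utility(weights):
    """Utility that sums the weights of all listed sets contained in the input."""
    def u(nodes):
        return sum(w for members, w in weights.items() if members <= nodes)
    return u


def uncross(t, s1, s2):
    """Shift the smaller amount of two crossing sets to their union and intersection."""
    big, small = (s1, s2) if t[s1] >= t[s2] else (s2, s1)
    shift = t[small]
    new_t = dict(t)
    new_t[big] = t[big] - shift
    new_t[small] = 0
    for merged in (big | small, big & small):
        new_t[merged] = new_t.get(merged, 0) + shift
    # Drop sets that no longer send any data.
    return {nodes: amount for nodes, amount in new_t.items() if amount > 0}


def total_utility(u, t):
    return sum(u(nodes) * amount for nodes, amount in t.items())


def node_load(t):
    load = {}
    for nodes, amount in t.items():
        for v in nodes:
            load[v] = load.get(v, 0) + amount
    return load


def is_nested(t):
    sets = list(t)
    return all(a <= b or b <= a for a in sets for b in sets)


# Hypothetical supermodular utility: one unit of weight per sensor pair.
WEIGHTS = {
    frozenset({"s1", "s2"}): 1,
    frozenset({"s2", "s3"}): 1,
    frozenset({"s1", "s3"}): 1,
}
u = set_weighted_utility(WEIGHTS)

A = frozenset({"s1", "s2"})
B = frozenset({"s2", "s3"})
before = {A: 3, B: 2}
after = uncross(before, A, B)
```

In this toy instance the per-node load is unchanged, the resulting sets form a chain, and the total utility rises from 5 to 7 because the union contains all three weighted pairs.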
While this lemma shows that the optimum solution gathers data from
at most n sets, it unfortunately does not restrict the class of sets that these
could be drawn from, and thus, does not restrict the size of the LP.
Hence, it does not suggest a polynomial-time algorithm. Indeed, we con-
jecture that USSP is NP-complete for supermodular functions. While we
currently have no proof of this conjecture, we can show that the ver-
sion requiring integral data amounts to be sent is not only NP-hard, but
cannot be approximated unless P=NP.
Figure 1.2: SETCOVER graph with $n = 4$
Theorem 5 IUSSP with supermodular utility functions is NP-hard. Unless
P=NP, it cannot be approximated to within any multiplicative factor.
Proof. We will prove that it is NP-hard to decide if the system admits
a solution with positive total utility. That proves both the NP-hardness
and approximation hardness result. We reduce from the decision version
of SETCOVER: Given a universe $U = \{1, \ldots, n\}$, and a collection of subsets
$S_i \subseteq U$ for $i = 1, \ldots, m$, is there a collection of $k$ sets $S_{i_j}$
covering the entire universe, i.e. such that $\bigcup_{j=1}^{k} S_{i_j} = U$?
Given an instance of SETCOVER, we construct an instance of IUSSP with
a supermodular utility function. The graph has one node $v_i$ for each set
$S_i$, as well as a bottleneck node $\hat{v}$ and the root $r$. There is a directed
link from each $v_i$ to $\hat{v}$, and from $\hat{v}$ to $r$. All links $e$ have
transmission cost $\tau_e = 1$, and all sensing costs are 0. The initial energy
of each $v_i$ is 1, while the initial energy of $\hat{v}$ is $k$. The resulting graph
for four sets is shown in Figure 1.2.
To complete the reduction, we need to specify the utility function. For
any node set $T \subseteq \{v_1, \ldots, v_m\}$, we define the utility
$u(T) := 2^{|T|}$ if $\bigcup_{i : v_i \in T} S_i = U$, and $u(T) := 0$ otherwise.
This reduction can be computed in polynomial time. The utility function,
although defining the value for $2^m$ subsets, can be specified in polynomial
space (and time) by giving the collection of sets $S_i$. It is also not
difficult to verify that $u$ is in fact supermodular, by considering the
three cases that neither $T_1$ nor $T_2$ defines a
set cover, that exactly one of them does, and that both of them do.
We now claim that a positive total utility can be achieved in this setting
if and only if the SETCOVER instance has a solution of size at most $k$. If
$\{S_{i_j}\}$ is a set cover of size at most $k$, then it is easy to see that the
node set $\{v_{i_j}\}$ leads to a feasible solution, of utility $2^k > 0$.
Conversely, if $\{v_{i_j}\}$ is a node set contributing positive utility to an
IUSSP solution, we first observe that by feasibility, it has size at most $k$.
Furthermore, since the utility is positive, $\{S_{i_j}\}$ must be a set cover by
the definition of $u$. Thus, we have
established the claim, proving both hardness results.
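The utility function of this reduction can be written compactly and checked on a small instance. The sketch below is illustrative only: the universe, the four subsets, and all function names are hypothetical, node sets are represented by the indices of their corresponding subsets, and both checks enumerate exhaustively.

```python
from itertools import combinations


def make_cover_utility(universe, subsets):
    """u(T) = 2^|T| if the chosen subsets cover the universe, else 0."""
    universe = frozenset(universe)

    def u(chosen):
        covered = frozenset().union(*(subsets[i] for i in chosen))
        return 2 ** len(chosen) if covered == universe else 0

    return u


def has_positive_solution(u, indices, k):
    """Is there a node set of size at most k (the bottleneck budget) with u > 0?"""
    return any(u(frozenset(c)) > 0
               for size in range(1, k + 1)
               for c in combinations(indices, size))


def is_supermodular(u, indices):
    indices = list(indices)
    subsets = [frozenset(c)
               for size in range(len(indices) + 1)
               for c in combinations(indices, size)]
    return all(u(a | b) + u(a & b) >= u(a) + u(b)
               for a in subsets for b in subsets)


# Hypothetical SETCOVER instance with four subsets.
UNIVERSE = {1, 2, 3, 4}
SUBSETS = {1: {1, 2}, 2: {3, 4}, 3: {2, 3}, 4: {1}}
cover_utility = make_cover_utility(UNIVERSE, SUBSETS)
```

Here subsets 1 and 2 together cover the universe, so a positive utility is reachable with a budget of two nodes but not with a budget of one.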
Notice that the same hardness results carry over to the case when we are
seeking a single set to send an integral amount of data, i.e., the integral
version of UOSSP.
1.4.1 Set-weighted functions
While Theorem 5 shows a very strong hardness result for IUSSP with
supermodular functions, the functions it uses in the reduction are ar-
guably not very natural in the context of sensor networks. In particular,
one would assume that most utility functions specified by applications
have a more explicit compact representation. Here, we focus on a further
restriction on supermodular functions.
We term the class of functions set-weighted utility functions. A set-
weighted utility function is characterized by a collection $\mathcal{P}$ of sets,
and for each set $P \in \mathcal{P}$ a non-negative weight $w_P$. The utility of a
set $S$ is then defined as $u(S) = \sum_{P \subseteq S,\, P \in \mathcal{P}} w_P$. If the
collection $\mathcal{P}$ is small enough (for instance, polynomial in $n$), then it
provides a natural and compact representation of $u$. Also notice that any
function $u$ defined in this way is always supermodular.
For set-weighted utility functions, we can in fact find the optimum so-
lution to USSP by solving a linear program. The insight for this linear
program is based on Lemma 4: the optimum solution for any super-
modular utility function without loss of generality consists of a nested
collection of sets. In particular, we can think of this collection as being
ordered by decreasing size. Then, any solution can be fully character-
ized by specifying how much data each node sends, or, equivalently, how
long each node sends. Suppose that up to some time $t$, all nodes in $S$
still send data (or, the nodes in $S$ each send $t$ or more units of data).
Then, these nodes up to time $t$ contribute utility
$t \cdot u(S) = t \cdot \sum_{P \subseteq S,\, P \in \mathcal{P}} w_P$.
Based on this observation, and changing the order of summation in the
objective function, we can rewrite the linear program for USSP as follows:
$$
\begin{array}{lll}
\text{Maximize} & \sum_{P \in \mathcal{P}} w_P\, t_P & \\
\text{subject to} & t_P \le s_v & \forall P \in \mathcal{P},\ v \in P \\
 & \sum_{e \in \delta^+(v)} f_e - \sum_{e \in \delta^-(v)} f_e = s_v & \forall v \neq r \\
 & \sum_{e \in \delta^+(v)} \tau_e f_e + \sigma_v s_v \le E_v & \forall v \neq r.
\end{array}
$$
Notice that we are introducing variables $s_v$ for the amount of data sent by
each node $v$. By expressing the utility function in terms of contributions
by subsets, we managed to eliminate the terms for sets other than the
$P \in \mathcal{P}$. By solving this polynomial-sized LP, we can find an optimal
solution to USSP in polynomial time.
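The change in the order of summation can be checked numerically: for fixed per-node amounts, the utility accumulated by the nested sending sets equals the weighted sum over the sets in the collection of the smallest amount sent by any member. The Python sketch below illustrates this identity under assumptions; the function names, weights, and amounts are hypothetical, and no LP is solved here.

```python
def nested_level_sets(sent):
    """Turn per-node amounts into nested (set, duration) pairs, largest set first."""
    levels = sorted({amount for amount in sent.values() if amount > 0})
    result, previous = [], 0
    for level in levels:
        active = frozenset(v for v, amount in sent.items() if amount >= level)
        result.append((active, level - previous))
        previous = level
    return result


def utility_from_sets(sent, weights):
    """Sum over the nested sets of duration times set-weighted utility."""
    total = 0
    for active, duration in nested_level_sets(sent):
        total += duration * sum(w for members, w in weights.items()
                                if members <= active)
    return total


def utility_from_lp_objective(sent, weights):
    """Objective of the rewritten LP, with each set's amount at its upper bound."""
    return sum(w * min(sent[v] for v in members)
               for members, w in weights.items())


# Hypothetical weights and per-node amounts.
WEIGHTS = {
    frozenset({"a", "b"}): 1,
    frozenset({"b", "c"}): 2,
    frozenset({"a", "b", "c"}): 5,
}
SENT = {"a": 3, "b": 2, "c": 1}
```

Both computations give 9 on this instance: the full set is active for one unit of time and contributes 8, the pair of `a` and `b` adds 1, and the final singleton adds nothing.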
Although USSP for set-weighted supermodular functions is solvable
through LP, the version of the problem requiring integral data amounts is
NP-hard.
Theorem 6 IUSSP with set-weighted supermodular functions is NP-hard.
Proof. We will give a reduction from the densest k-subgraph problem
[21, 22]: Given a graph $G = (V, E)$ and a number $k$, as well as a density
requirement $\alpha$, is there a subset $S \subseteq V$ of size at most $k$ containing
at least $\alpha k$ edges? (Here, we say that $S$ contains an edge $e$ if it
contains both of its endpoints.)
Given an instance of the densest k-subgraph problem, we construct an
instance of IUSSP with a set-weighted utility function as follows: The
graph $G' = (V', E')$ consists of a sink node $r$, one bottleneck node
$\hat{v}_1$, and a node $v$ for each $v$ of the given graph $G$. Each such node $v$ is
connected to the bottleneck node $\hat{v}_1$, and the bottleneck node is connected
to the sink $r$. All sensing costs are 0, and all transmission costs are 1. All
nodes have initial energy of 1, with the exception of node $\hat{v}_1$, which has
initial energy of $k$.
The collection $\mathcal{P}$ contains each pair $(u, v)$ of $G'$ such that
$(u, v) \in E$ is an edge of the given graph $G$, and assigns them a weight
of 1. We claim that the given graph contains a k-subgraph of density at
least $\alpha$ if and only if the constructed IUSSP gives total utility at
least $\alpha k$.
First we observe that each node can send at most 1 unit of data. Due to
the integrality requirement of IUSSP, if a node is chosen, then it will send
exactly 1 unit of data. Due to the nesting property of the optimal solution,
only one set of nodes will be selected. The utility of the set equals the
number of edges included in this set; therefore, if the constructed IUSSP
gives total utility at least $\alpha k$, then the original graph has a
k-subgraph of density at least $\alpha$.
In the reverse direction, if the original instance of the densest k-subgraph
problem has a k-subgraph with density $\alpha$, then there exist $k$ nodes which
can each send one unit of data through the bottleneck node, for a utility
of $\alpha k$.
The above two statements prove the correctness of the reduction, and
hence the NP-hardness of IUSSP with set-weighted supermodular
functions.
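On a small graph, the correspondence used in this reduction can be checked by exhaustive search: the best integral utility equals the largest number of edges contained in any node set of size at most k. The sketch below is an assumption-laden illustration; the example graph, the function names, and the parameter name `alpha` for the density requirement are all chosen here for the sketch.

```python
from itertools import combinations


def best_integral_utility(nodes, edges, k):
    """Largest number of edges inside any node set of size at most k.

    In the constructed instance each chosen node sends one unit of data
    through a bottleneck of energy k, and every edge has weight 1.
    """
    best = 0
    for size in range(k + 1):
        for chosen in combinations(nodes, size):
            chosen = set(chosen)
            inside = sum(1 for a, b in edges if a in chosen and b in chosen)
            best = max(best, inside)
    return best


def has_dense_subgraph(nodes, edges, k, alpha):
    """Decision version: is there a set of at most k nodes with alpha * k edges?"""
    return best_integral_utility(nodes, edges, k) >= alpha * k


# Hypothetical graph: a triangle on nodes 1, 2, 3 with a pendant node 4.
NODES = [1, 2, 3, 4]
EDGES = [(1, 2), (2, 3), (1, 3), (3, 4)]
```

The exhaustive search takes time exponential in k, which is consistent with the hardness result; it is meant only to make the correspondence concrete.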
1.4.2 Set-weighted single set selection
In the context of understanding the solutions to the USSP problem, an
important question is which single set is most cost-effective, i.e. gives
the largest utility for its energy consumption. In other words, which
set S maximizes the product of the set's utility u(S) and the lifetime of
the system if only S sends data? This is the problem termed UOSSP
above. Here, we show that even for the special case of set-weighted
supermodular functions, the UOSSP problem is NP-hard.
Theorem 7 UOSSP with set-weighted supermodular functions is NP-hard.
Proof. We will give a reduction from the densest k-subgraph problem [21, 22]: Given a graph $G = (V, E)$ and a number $k$, as well as a density requirement $\alpha$, is there a subset $S \subseteq V$ of size at most $k$ containing at least $\alpha k$ edges? (Here, we say that $S$ contains an edge $e$ if it contains both of its endpoints.)
Given an instance of the densest k-subgraph problem, we construct an instance of UOSSP with a set-weighted utility function as follows: The graph $G' = (V', E')$ consists of a sink node $r$, two bottleneck nodes $\hat{v}_1, \hat{v}_2$, and a node $v$ for each $v$ of the given graph $G$. Each such node $v$ is connected to the bottleneck node $\hat{v}_2$, and both bottleneck nodes are connected to the sink $r$. All sensing costs are 0, and all transmission costs are 1. All nodes have initial energy of 1, with the exception of node $\hat{v}_2$, which has initial energy of $k$.
The collection $\mathcal{P}$ contains each pair $(u, v)$ of $G'$ such that $(u, v) \in E$ is an edge of the given graph, and assigns them a weight of 1. In addition, the node $\hat{v}_1$ is in $\mathcal{P}$ as a singleton set, having very large weight $1 + |E| \cdot k$. We claim that the given graph contains a k-subgraph of density at least $\alpha$ if and only if there is a node set to sense that gives total utility at least $1 + k \cdot (|E| + \alpha)$.
For the first direction, observe that if the set $S$ contains at most $k$ nodes and has density $\alpha$, then sending one unit of data each from node $\hat{v}_1$ and from all nodes $v \in S$ gives the desired total utility. Conversely, we first observe that the maximum utility that can be obtained without including $\hat{v}_1$ is at most $k|E|$, as the nodes can send at most $k$ units of data, and the utility of any such set of nodes is at most $|E|$. Thus, the optimum solution will always send one unit of data from $\hat{v}_1$, and is thus also forced to send at most one unit of data from each other node $v$. But then, it will always send data from exactly $k$ nodes other than $\hat{v}_1$ (else, some of the energy of $\hat{v}_2$ would go unused). Since the total utility was assumed to be at least $1 + k \cdot (|E| + \alpha)$, the utility of the set $S$ of nodes other than $\hat{v}_1$ must be at least $\alpha k$. This in turn means that the set $S$ contains at least $\alpha k$ edges in $G$. This proves the correctness of the reduction, and thus the NP-hardness of the problem.
Notice that set-weighted supermodular functions are a special class of supermodular functions, so the hardness result carries over to general supermodular functions. Also notice that in the proof of Theorem 7, each node sends an integral amount of data (one unit); therefore, the NP-hardness carries over to the IUOSSP with set-weighted supermodular functions.
While the UOSSP problem for set-weighted supermodular functions is
NP-hard to solve optimally, it can be approximated to within a factor of
O(log n) within polynomial time, using an LP-rounding based approach.
The algorithm first solves the corresponding linear program for USSP (which is also the natural linear program for UOSSP) in polynomial time, and then considers all candidate node sets $C_x$ of the form $C_x = \{v \in V \mid s_v \ge x\}$. Among all such sets $C_x$, it chooses the one maximizing the product of $u(C_x)$ and the system lifetime if all of $C_x$ sends. Notice that the algorithm only needs to look at such sets $C_x$ for values $x = s_v$ for some $v$. Thus, at most $n$ different sets need to be considered. For each set, calculating the system lifetime is done in turn by solving the corresponding flow LP.
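The threshold step of this rounding procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the dictionary `s` stands in for the per-node amounts of an LP solution that is assumed to be given, and `utility` and `lifetime` are hypothetical oracles (in the text, the lifetime of a set comes from solving a flow LP, which is replaced here by a closed form for a single shared bottleneck).

```python
def threshold_round(s, utility, lifetime):
    """Try every candidate set C_x = {v : s[v] >= x}, where x ranges over the
    distinct positive values of s, and return the best (value, set) pair."""
    best_value, best_set = 0.0, frozenset()
    for x in sorted(set(s.values()), reverse=True):
        if x <= 0:
            continue
        candidate = frozenset(v for v, amount in s.items() if amount >= x)
        value = utility(candidate) * lifetime(candidate)
        if value > best_value:
            best_value, best_set = value, candidate
    return best_value, best_set


# Hypothetical set-weighted utility: a set earns the weight of every
# collection member it fully contains.
weights = {frozenset({"a"}): 3.0, frozenset({"b", "c"}): 2.0}


def utility(selected):
    return sum(w for members, w in weights.items() if members <= selected)


def lifetime(selected):
    # Assumed topology: unit-energy nodes behind one bottleneck of energy 2.
    return min(1.0, 2.0 / len(selected))


s = {"a": 1.0, "b": 0.5, "c": 0.5}  # assumed fractional amounts per node
best_value, best_set = threshold_round(s, utility, lifetime)
print(best_value, sorted(best_set))
```

Only two thresholds occur in this example (1.0 and 0.5), so only two candidate sets are evaluated.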
The proof of the O(log n) approximation guarantee is based on the following useful lemma.
Lemma 8 If $x_1 \ge x_2 \ge x_3 \ge \ldots \ge x_k \ge 0$ are $k$ real numbers with associated non-negative rational weights $w_1, w_2, \ldots, w_k$, then $\max_{i=1}^{k} \left( x_i \cdot \sum_{j=1}^{i} w_j \right) \ge \frac{\sum_{i=1}^{k} w_i x_i}{H_k}$, where $H_k = \sum_{i=1}^{k} 1/i$ denotes the $k$-th Harmonic number.
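Proof. To prove the claim, we first consider the case when all $w_i = 1$ (thus, $\sum_{j=1}^{i} w_j = i$). Let $i^*$ be the index maximizing $x_i \cdot i$. Thus, we have that $x_i \le (i^* \cdot x_{i^*})/i$, and
\[ \frac{\sum_{i=1}^{k} x_i}{H_k} \le \frac{i^* \cdot x_{i^*} \cdot \sum_{i=1}^{k} 1/i}{H_k} = i^* \cdot x_{i^*} = \max_i (i \cdot x_i). \]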
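The unit-weight case just proved can be checked numerically. The sketch below is an illustration only (the helper names are hypothetical): it compares $\max_i (i \cdot x_i)$ with $\sum_i x_i / H_k$ on random non-increasing sequences, and then on the sequence $x_i = 1/i$, for which both sides coincide up to floating-point rounding.

```python
import random


def harmonic(k):
    """k-th Harmonic number H_k = 1 + 1/2 + ... + 1/k."""
    return sum(1.0 / i for i in range(1, k + 1))


def check_unit_weight_bound(xs):
    """Return (lhs, rhs) for the unit-weight inequality
    max_i (i * x_i) >= (sum_i x_i) / H_k, after sorting xs non-increasingly."""
    xs = sorted(xs, reverse=True)
    lhs = max(i * x for i, x in enumerate(xs, start=1))
    rhs = sum(xs) / harmonic(len(xs))
    return lhs, rhs


random.seed(0)
for _ in range(1000):
    k = random.randint(1, 20)
    sample = [random.random() for _ in range(k)]
    lhs, rhs = check_unit_weight_bound(sample)
    assert lhs >= rhs - 1e-12  # small slack for floating-point rounding

# Extremal sequence x_i = 1/i: every product i * x_i equals 1.
tight_lhs, tight_rhs = check_unit_weight_bound([1.0 / i for i in range(1, 6)])
print(tight_lhs, tight_rhs)
```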
If the $w_i$ are not all equal to 1, then we first multiply all of them by the common denominator to make them integers (notice that multiplying all weights by the same constant does not affect the validity of the inequality to be proved). Then, if the number $x_i$ has weight $w_i \in \mathbb{N}$, we replace it by $w_i$ copies of the number $x_i$, each with weight 1. Notice that this leaves both the left-hand side and right-hand side in the inequality $\max_{i=1}^{k} \left( x_i \cdot \sum_{j=1}^{i} w_j \right) \ge \frac{\sum_{i=1}^{k} w_i x_i}{H_k}$ unchanged; for the right-hand side, this follows because $w_i x_i$ is replaced by $\sum_{j=1}^{w_i} x_i$, and for the left-hand side, it can be seen that the maximum after the replacement is attained at an index $i'$ such that $x_{i'}$ is the $w_i$-th copy of the number $x_i$.
Using this lemma, we can state and prove the approximation guarantee
for the LP-rounding algorithm.
Theorem 9 The above approximation algorithm achieves an O(log n) ap-
proximation ratio.
Proof. First, the utility of a fractional solution that can use at most
one set is certainly upper-bounded by the solution that is allowed to use
multiple sets, i.e. the value of the LP-solution of USSP is an upper bound
on the optimum UOSSP value.
If the algorithm selects a set $S_x$, then the lifetime of $S_x$ is at least $x$ (since the fractional solution was able to send at least $x$ units of data for each node in $S_x$). Thus, the algorithm obtains total utility at least
\[ x \cdot u(S_x) = x \cdot \sum_{P \subseteq S_x,\; P \in \mathcal{P}} w_P. \]
If we sort the sets $P \in \mathcal{P}$ by non-increasing $t_P$ values, numbering them $P_1, \ldots, P_k$, then we can rewrite and lower-bound the algorithm's total utility as $\max_{i=1}^{k} t_{P_i} \cdot \sum_{j=1}^{i} w_{P_j}$.
On the other hand, the optimum fractional solution obtains total utility $\sum_{i=1}^{k} t_{P_i} w_{P_i}$. By the above lemma, the ratio between these two quantities is bounded by $H_k = O(\log k)$, which in turn is $O(\log n)$ whenever the size of the collection $\mathcal{P}$ is polynomial in $n$.
If we want to improve the approximation guarantee beyond the factor O(log n) proved above, the approach will have to be based on a different upper bound from the one given by the USSP LP. (Notice that the USSP LP is also the natural LP relaxation for UOSSP.) The family of examples depicted in Figure 1.3 exhibits an O(log n) integrality gap in the LP. The graph has $n$ nodes in addition to the root. Each node is connected to the root through an edge of transmission cost 1. All of the $n$ sensor nodes have sensing cost 0 and unit weight, and their initial energies are $1, 1/2, 1/3, \ldots, 1/n$ (as labeled in Figure 1.3). The fractional solution sends $1/i$ units of data from each node $i$, for a total utility of $H_n$. On the other hand, the UOSSP solution can only choose one set, all of whose nodes have to send the same amount of data. Since any set of $i$ nodes can send at most $1/i$ units (one of them must be a node with energy at most $1/i$), the maximum utility for UOSSP is 1, proving an integrality gap of $H_n = \Omega(\log n)$.
Figure 1.3: O(log n) Integrality Gap Example (the root $r$ is connected by edges of transmission cost 1 to $n$ nodes with energies $E: 1$, $E: 1/2$, $E: 1/3$, ..., $E: 1/n$)
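The gap in this example can be computed exactly for small $n$. The following sketch is an illustration under the assumptions read off Figure 1.3 (node $i$ has energy $1/i$, unit weight, and a unit-cost edge to the root); the function names are hypothetical.

```python
from fractions import Fraction


def fractional_utility(n):
    """Every node i sends its whole energy 1/i; with unit weights the total
    utility is the Harmonic number H_n."""
    return sum(Fraction(1, i) for i in range(1, n + 1))


def best_single_set_utility(n):
    """All nodes of the chosen set send the same amount, limited by the
    smallest energy in the set. For a given size it is best to take the
    nodes with the largest energies, so only n prefixes must be tried."""
    energies = [Fraction(1, i) for i in range(1, n + 1)]  # non-increasing
    best = Fraction(0)
    for size in range(1, n + 1):
        amount = energies[size - 1]  # smallest energy among the first `size`
        best = max(best, size * amount)
    return best


for n in (1, 4, 16, 64):
    gap = fractional_utility(n) / best_single_set_utility(n)
    print(n, float(gap))
```

The printed ratio equals $H_n$, which grows logarithmically in $n$.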
1.4.3 Greedy Algorithms
The O(log n) LP-rounding algorithm given above implicitly shows that se-
lecting the best single set gives utility within O(log n) of selecting the best
sequence. For many types of problems (such as maximizing the value
of a submodular monotone function f(S) on sets S), simple greedy al-
gorithms based on adding or removing one element at a time lead to
optimal solutions or good approximations. Here, we show that neither
the greedy addition nor removal algorithm will give a good approximation
for supermodular functions.
Specifically, we consider the following two algorithms. In the description, we let $t_S$ be the maximum amount of data that the set $S$ can send if no other nodes are sending data.
Addition Algorithm: Start with the empty set $S_0 = \emptyset$. In each iteration $i$ (until all the nodes are selected), always add the node $v_i$ such that the total utility $t_{S_{i-1} \cup \{v_i\}} \cdot u(S_{i-1} \cup \{v_i\})$ is maximized. (That is, set $S_i := S_{i-1} \cup \{v_i\}$.) Output the best of all the $S_i$.
Removal Algorithm: Start with all nodes $S'_0 = V$. In each iteration $i$ (until no nodes are left), always remove the node $v'_i$ such that the total utility $t_{S'_{i-1} \setminus \{v'_i\}} \cdot u(S'_{i-1} \setminus \{v'_i\})$ is maximized. (That is, set $S'_i := S'_{i-1} \setminus \{v'_i\}$.) Output the best of all the $S'_i$.
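The two greedy procedures can be sketched directly from their descriptions. This is a minimal illustration, assuming that `utility(S)` and `capacity(S)` are caller-supplied oracles for $u(S)$ and $t_S$ (both names are hypothetical), that ties are broken by node order, and that the empty set has value 0.

```python
from fractions import Fraction


def _value(s, utility, capacity):
    # Total utility t_S * u(S); the empty set is given value 0 by convention.
    return capacity(s) * utility(s) if s else 0


def greedy_addition(nodes, utility, capacity):
    """Grow a set one node at a time, always adding the node that maximizes
    t_S * u(S); return the best set seen along the way."""
    current, remaining = frozenset(), set(nodes)
    best_set, best_value = current, 0
    while remaining:
        pick = max(sorted(remaining),
                   key=lambda v: _value(current | {v}, utility, capacity))
        current = current | {pick}
        remaining.remove(pick)
        value = _value(current, utility, capacity)
        if value > best_value:
            best_set, best_value = current, value
    return best_set, best_value


def greedy_removal(nodes, utility, capacity):
    """Shrink the full set one node at a time, always removing the node whose
    removal maximizes t_S * u(S); return the best set seen along the way."""
    current = frozenset(nodes)
    best_set, best_value = current, _value(current, utility, capacity)
    while current:
        drop = max(sorted(current),
                   key=lambda v: _value(current - {v}, utility, capacity))
        current = current - {drop}
        value = _value(current, utility, capacity)
        if value > best_value:
            best_set, best_value = current, value
    return best_set, best_value


# Hypothetical example: additive utility, shared bottleneck of energy 1.
node_weight = {1: 5, 2: 1, 3: 1}
utility = lambda s: sum(node_weight[v] for v in s)
capacity = lambda s: Fraction(1, len(s))
print(greedy_addition([1, 2, 3], utility, capacity))
print(greedy_removal([1, 2, 3], utility, capacity))
```

On this benign (additive) example both procedures find the single heavy node; the text goes on to show that this need not happen for supermodular utilities.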
Even if the two algorithms are combined (i.e. the better of their solutions is output), the approximation guarantee can be as bad as $\Omega(n)$. The example has a network consisting of the sink $r$, a bottleneck node $\hat{v}$, and $n$ other nodes $v_i$. The bottleneck $\hat{v}$ is connected to all other nodes with edges of transmission cost 1, and it has an energy of 1. No other edges exist, and the energy of each $v_i$ is also 1. Then, it is easy to see that the amount of data that can be sent by a set $S$ is $1/|S|$, and hence, the optimum solution will be the one maximizing $\frac{u(S)}{|S|}$.
The supermodular utility function $u$ is defined as follows: We partition the nodes $v_1, \ldots, v_n$ into $S_1 := \{v_1, v_2\}$ and $S_2 := \{v_3, \ldots, v_n\}$. Their utilities are $u(S_1) = \frac{1}{2} - n\epsilon$ for some very small constant $\epsilon$, and $u(S_2) = \frac{1}{2} + n\epsilon$. The utility of the entire set is $u(\{v_1, \ldots, v_n\}) = 1$.
Each node in $S_1$ has individual utility $\epsilon/2$, and each node in $S_2$ has utility $\epsilon$. Any set $S \supseteq S_1$ has utility $u(S) = u(S_1) + \epsilon |S \cap S_2|$, and any set $S \supseteq S_2$ has $u(S) = u(S_2) + \epsilon/2 \cdot |S \cap S_1|$. Finally, all other sets $S$ have utility $u(S) = \epsilon |S \cap S_2| + \epsilon/2 \cdot |S \cap S_1|$. It is not difficult to verify that this function $u$ is indeed supermodular.
The greedy addition algorithm will add all nodes from $S_2$ first, never encountering the set $S_1$ as a possible solution. Similarly, the removal algorithm will remove the nodes from $S_1$ first. Hence, the output of any solution based on the greedy algorithm will be the set of all nodes, giving a total utility of $1/n$. On the other hand, the set $S_1$ can send half a unit of data, and thus achieves utility $\frac{1}{2} (1/2 - n\epsilon) \approx \frac{1}{4}$, which is better than the greedy solution by a factor of $\Omega(n)$.
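The numbers in this counterexample can be reproduced for a small instance. The sketch below is an illustration only: it assumes $n = 8$ and $\epsilon = 1/1000$ (both chosen arbitrarily), encodes the utility function as reconstructed above, uses $t_S = 1/|S|$ for the shared bottleneck, and finds the optimum by exhaustive search.

```python
from fractions import Fraction
from itertools import combinations


def make_utility(n, eps):
    """Build the example utility on nodes 1..n with S1 = {1, 2}, S2 = {3..n}."""
    s1 = frozenset({1, 2})
    s2 = frozenset(range(3, n + 1))
    everything = s1 | s2
    half = Fraction(1, 2)

    def u(s):
        s = frozenset(s)
        if s == everything:
            return Fraction(1)
        if s >= s1:  # contains all of S1
            return half - n * eps + eps * len(s & s2)
        if s >= s2:  # contains all of S2
            return half + n * eps + eps / 2 * len(s & s1)
        return eps * len(s & s2) + eps / 2 * len(s & s1)

    return u, s1, s2, everything


def value(u, s):
    """Total utility t_S * u(S) with t_S = 1 / |S|."""
    return u(s) / len(s)


n, eps = 8, Fraction(1, 1000)  # assumed parameters for the illustration
u, s1, s2, everything = make_utility(n, eps)

# Exhaustive search over all non-empty node sets (2^n - 1 candidates).
all_sets = (frozenset(c) for size in range(1, n + 1)
            for c in combinations(range(1, n + 1), size))
optimum = max(all_sets, key=lambda s: value(u, s))

print("all nodes:", value(u, everything))
print("S1:       ", value(u, s1))
print("optimum:  ", sorted(optimum))
```

For these parameters the full node set is worth 1/8 while $S_1$ is worth slightly less than 1/4, so the ratio is close to $n/4$.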
1.5 Geometric Penalty Functions
In the previous two sections, we investigated the special cases of submodular and supermodular utility functions. One of our main goals was to find a compact representation for utility, as arbitrary utility functions require specifying up to $2^n$ different function values.
Another approach to obtain compactly represented utility functions is
based on the observation that the real-world application of sensor net-
works frequently involves monitoring geometric entities such as struc-
tures or habitats. In those cases, the quality of a solution can often be
characterized in terms of the average or maximum distance of “interest-
ing” points on the geometric entity from the measuring sensors.
Specifically, let $A \subseteq \mathbb{R}^2$ denote an area to be monitored, and $w : A \to \mathbb{R}^+$ a weight function measuring how “important” it is to measure any given point $x \in A$. Notice that $A$ may be discrete or continuous. We let $d_{x,y}$ be a distance metric measuring how effectively a node at point $x$ can measure the point $y$; most frequently, we will use the Euclidean distance $d_{x,y} = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2}$, but our framework also permits using different metrics, or distance measures that are not truly metrics (such as the Euclidean distance with the additional provision that $d_{x,y} = \infty$ if $x$ and $y$ are too far apart). Given a set $S$ of sensors, we use $d_{S,x}$ to denote the distance of the point $x$ from the sensor set, i.e. $d_{S,x} = \min_{y \in S} d_{y,x}$. Based on this notion of distance, we can define the weighted maximum and average distance penalty as $\hat{p}(S) = \max_{x \in A} d_{S,x}$, and $p(S) = \int_{x \in A} w_x d_{S,x} \, dx$. (If $A$ is a discrete set, the integral should be replaced by a sum.)
Based on this penalty, we can now define the utility of a set to be inversely proportional to its penalty, i.e. $u(S) := 1/\hat{p}(S)$ or $u(S) := 1/p(S)$. Alternatively, we can phrase the problem as a penalty minimization instead of a utility maximization. In particular, in many monitoring applications, the goal will be to choose one best set of sensors that will monitor the environment. Thus, the goal would be to choose a set $S$ such that $\hat{p}(S)/t_S$ or $p(S)/t_S$ is minimized subject to the constraint that the set $S$ can indeed send $t_S$ units of data.
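For a discrete area $A$, the two penalties reduce to a maximum and a weighted sum. The sketch below is an illustration under assumptions: points are coordinate pairs, the distance is Euclidean, and the function names and the example points are hypothetical.

```python
import math


def dist_to_set(sensors, x):
    """d_{S,x}: distance from point x to the nearest sensor in S."""
    return min(math.dist(y, x) for y in sensors)


def max_penalty(sensors, area):
    """Maximum distance penalty: the largest d_{S,x} over all points of A."""
    return max(dist_to_set(sensors, x) for x in area)


def avg_penalty(sensors, area, weight):
    """Weighted distance penalty for discrete A: sum of w_x * d_{S,x}."""
    return sum(weight[x] * dist_to_set(sensors, x) for x in area)


# Hypothetical discrete area of three collinear points.
area = [(0, 0), (3, 4), (6, 8)]
weight = {(0, 0): 1, (3, 4): 1, (6, 8): 2}

for sensors in ([(0, 0)], [(3, 4)]):
    print(sensors, max_penalty(sensors, area), avg_penalty(sensors, area, weight))
```

Placing the single sensor at the middle point halves the maximum penalty in this example; dividing either penalty by $t_S$ gives the objective discussed in the text.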
Unfortunately, the problem of choosing the best single set is NP-hard for
both of these objectives, at least if the amount of data sent by all nodes
must be integral.
Theorem 10 If $t_S$ is required to be integral, then finding the set $S$ minimizing $\hat{p}(S)/t_S$ or $p(S)/t_S$ is NP-hard. In fact, unless P=NP, $\hat{p}(S)/t_S$ cannot be approximated to within better than 1.822.
Proof. We give reductions from the Euclidean k-median and Euclidean k-center problems, respectively. The former was shown to be NP-hard by Papadimitriou [53], and the latter was shown to be hard to approximate to within 1.822 by Feder and Greene [18]. In both cases, we are given a finite set of points $A \subseteq \mathbb{R}^2$, and are asked to find a subset $S \subseteq A$ of size at most $k$. For the k-median problem, the objective is to minimize $\sum_{x \in A} d_{S,x}$, and for the k-center problem, to minimize $\max_{x \in A} d_{S,x}$.
For the reduction, in both cases, we let $A$ be the area to be covered by sensors. There is a sensor node $v_x$ with energy 1 located at each point $x \in A$, and all the $v_x$ are connected to a bottleneck node $\hat{v}$ with edges of sending cost 1. The bottleneck in turn is connected to the root $r$ via an edge of sending cost 1, and has energy $k$.
To prove that this is a correct (and approximation-preserving, in the case of k-center) reduction, first notice that if $S$ is a solution for k-center (or k-median), then having each node $v_x$ with $x \in S$ send one unit is feasible, and gives exactly the same objective value. Conversely, if $S$ is the set of nodes sending data in a solution to the sensor selection problem, then by integrality and the energy constraints of 1 on nodes $v_x$, each node in $S$ sends exactly one unit of data. But then, the objective function value of the set $S$ as a solution to k-center or k-median is exactly its penalty in the corresponding sensor selection problem. This completes the proof.
Using a technique similar to the one used in the proof of Theorem 3, namely adding one node which essentially constrains the network to send exactly one unit of data instead of any fractional amount of data, we can prove that UOSSP with geometric utility function $p(S)$ is also NP-hard.
Theorem 11 Finding one set $S$ minimizing $p(S)/t_S$ is NP-hard.
Proof. The key idea of the proof is to add one more sensor such that in an optimal solution this sensor must send exactly one unit of data, which in turn constrains the amount of data the network can send. We use the same reduction as in the proof of Theorem 10, with the exception that now we add one more point $\hat{z}$ such that its minimum distance to any other point in $A$ is $2 + \max_{x \in A, y \in A} d_{x,y}$, and we add a new sensor node $v_{\hat{z}}$ at unit distance from point $\hat{z}$. Node $v_{\hat{z}}$ has unit energy and unit sending cost, and is directly connected to the sink node $r$. Let the weight of point $\hat{z}$ be $w_{\hat{z}} = k \cdot (1 + \sum_{x \in A} \max_{y \in A} d_{x,y})$.
Now we claim that in the optimal solution, sensor node $v_{\hat{z}}$ must be selected and all of the selected nodes send exactly one unit of data. First, due to the far-away placement of the node $v_{\hat{z}}$, it is easy to see that $\forall x \in A : d_{S,x} = d_{S \setminus v_{\hat{z}},\, x}$. Second, because of the large weight of point $\hat{z}$, the nearest sensor node $v_{\hat{z}}$ must be selected in the optimal solution. Now assume that we have an optimal solution $S'$. Denote $S'' = S' \setminus v_{\hat{z}}$. Without loss of generality, we have $|S''| \ge k$. Assume that $|S''| = k + c$ where $c \ge 1$; we have:
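\[
\begin{aligned}
p(S')/t_{S'} &= \Big( w_{\hat{z}} + \sum_{x \in A} d_{S'',x} \Big) \Big/ \Big( \frac{k}{k+c} \Big) \\
&\ge (k + c) \cdot w_{\hat{z}} / k \\
&\ge w_{\hat{z}} + c \cdot w_{\hat{z}} / k \\
&= w_{\hat{z}} + c \cdot \Big( 1 + \sum_{x \in A} \max_{y \in A} d_{x,y} \Big) \\
&> w_{\hat{z}} + c \cdot \sum_{x \in A} \max_{y \in A} d_{x,y} \\
&\ge w_{\hat{z}} + c \cdot \sum_{x \in A} d_{S,x} \\
&\ge w_{\hat{z}} + \sum_{x \in A} d_{S,x}
\end{aligned}
\]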
Therefore, any solution $S$ with cardinality $k$ would be better than the solution $S'$, contradicting the assumption that $S'$ is the optimal solution. Therefore, an optimal solution to the constructed UOSSP must send one unit of data.
As in the proof of Theorem 10, we obtain a correct reduction from the k-median problem, which proves the NP-hardness of the UOSSP with geometric function $p(S)$.
This result does not rule out the possibility that sending fractional amounts of data may make the problem for the objective $\hat{p}(S)/t_S$ tractable, or the existence of good approximation algorithms for the objective $p(S)/t_S$. Indeed, as shown by Arora et al. [3], there is a Polynomial-Time Approximation Scheme (PTAS) for the Euclidean k-median problem.
1.6 Related Work
Energy-conservation in sensor networks has received a tremendous amount
of attention in the literature. For example, there is extensive work on
overhearing avoidance and node duty-cycling at the MAC layer [56, 12,
90, 43]. As well, several pieces of work have focused on topology con-
trol [4, 8, 10, 89] to conserve energy in dense deployments. Our utility-
based sensor selection is largely complementary, in that applications,
rather than the system, indicate which set of sensors should be active.
Also complementary, for a similar reason, is work on increasing network
lifetime using energy-aware routing [15, 67, 40, 9, 88, 68, 34, 17].
Utility functions have significant prior history in networking, having been used to model architectural questions relating to network QoS [69] and network pricing [36]. In the sensor networks literature, however, we have found relatively few applications thereof. Perhaps closest to our work is that of Byers and Nasser [6], who first proposed utility-based sensor selection. However, they have analytically examined only the special class of utility functions that depend exclusively on the size of the set. More recently, Isler and Bajcsy [31] used utility functions to model the sensor selection problem for target localization. By contrast to both these pieces of work, we examine a broader class of utility functions. Finally, the work of Mainland et al. [46] is tangentially relevant to ours. They proposed a price-based resource management scheme for sensor networks. Their focus is more on the mechanistic aspects of resource management, whereas ours is algorithmic.
1.7 Summary
In this chapter, we have examined a utility-based sensor selection approach that enables sensor network applications to express utilities associated with retrieving data from sets of sensors. We study the feasibility of determining, subject to network lifetime constraints, the sequence of sets whose data has the maximum total utility.
We explore fractional and integral variants of the utility-based sensor selection problem, as well as single-set variants thereof, on three important classes of utility functions. We find that many variants are NP-hard, and some hard to approximate. On the other hand, submodular functions can be optimized efficiently, and an important subclass of supermodular functions admits a fractional solution via solving an LP, and an O(log n)-approximation when nodes are constrained to send the same amount of data. Finally, we show that geometric utilities can be cast into a penalty framework for which we are able to prove preliminary hardness results.
Chapter 2
Energy-Efficient Broadcasting in
Wireless Ad Hoc Networks:
Lower Bounds and Algorithms
2.1 Introduction
Wireless ad hoc networks are useful in any situation where temporary network connectivity is needed, such as in the battlefield and in disaster relief. In such a multi-hop wireless network, every node may be required to perform routing in order to achieve end-to-end communication among nodes. We consider networks where each node has transmit power control and an omni-directional antenna and therefore can adjust the area of coverage with its transmission.
Wireless communications consume significant amounts of battery power [33], and therefore, energy efficient operations are critical to enhance the life of such networks. Some amount of power is lost even when a node is in idle mode. A recent study [19] shows that the power consumed in transmitting and receiving packets in standard WaveLAN cards ranges from 800 mW to 1200 mW. During the past few years, there has been increasing interest in the design of energy efficient protocols for wireless ad hoc networks [65, 77, 71, 73, 74].
Broadcasting is an important operation in wireless ad hoc networks, where a message from a given source must reach all other nodes. Each node has local broadcast capability, that is, it can transmit a message to reach all nodes that are reachable with a certain amount of transmitted energy. There are several works on energy efficient protocols where all nodes are assumed to have fixed transmission range [33, 78, 71, 72, 73]. With power control a node can adjust its range to transmit a message to reach one or more nodes [65]. The problem of minimizing the energy cost for broadcasting from a given source to all other nodes in a network with power control was presented in [85]. The approach taken in [85] is to build a source-rooted spanning tree by adjusting transmit powers of nodes, followed by a sweep operation to remove redundant transmissions. Since finding the optimal broadcast tree is difficult, as it requires all possible spanning trees to be evaluated, the authors presented several heuristics. In [84], Wan et al. proved that the BIP algorithm in [85] has a constant ratio to the optimal solution given that there are no obstacles in the network and that the fixed energy cost for electronics is negligible. Therefore, this result will not hold when there are obstacles in the network.
2.1.1 Network and Energy Cost Model
In this chapter we consider the one-to-all broadcasting problem in wireless networks, where nodes have power control and any node can be the source. A node can adjust its transmission radius, but can reach all nodes within that radius only if there are no obstacles. That is, if there is an obstacle between two nodes, then these nodes cannot communicate directly. Our network model is more realistic as many ad hoc networks operate in buildings and in the field where there are many obstacles that block radio signals. We assume that nodes are fixed and the obstacles in the field do not change.
We will also consider a more general model for the energy cost of communication. Energy consumed in radio transmissions depends on several factors including the number of bits sent, the range of transmission, and the losses in the environment [60]. The ratio of signal strength to the noise level at a receiver must be above a certain threshold for reliable detection. There are two primary components of the energy cost for communication between a pair of sensor nodes: a fixed component of energy consumed in electronic circuits when transmitting or receiving a packet, and a variable component when transmitting a packet which depends on the distance of transmission. Typically, this variable part of the energy cost in transmission is proportional to $d^{\alpha}$, where $\alpha$ ranges from 2 to 4 [60], and this is the only cost considered in most of the previous works [85, 78].
We will use the following model for transmission energy cost:
\[ p = f_c + v_c \cdot d^2 \qquad (2.1) \]
where $d$ is the distance for transmission, $f_c$ is the cost for signal processing and amplification, which is needed for both reception and transmission, and $v_c$ is the cost associated with the radiation part of transmission. We assume that the radio channel is symmetric, that is, the energy required to transmit a message from node $i$ to node $j$ is the same as that from node $j$ to node $i$.
Figure 2.1: Broadcasting in the wireless network: there is an obstacle between $V_{root}$ and $x_2$
2.1.2 The Problem
We abstract our broadcasting problem as follows. We are given a graph $G = (V, E)$, where each vertex represents a wireless node, and any edge $(u, v) \in E$ means that node $u$ and node $v$ can communicate with each other directly by using the length of the edge $(u, v)$ as the transmission radius. A missing edge between a pair of nodes implies that they cannot communicate directly due to obstacles. The energy cost for packet transmission is given by equation (2.1), consisting of a fixed and a variable cost. For receiving a packet, there is a fixed energy cost due to the node electronics. The energy cost for one-to-all broadcasting will include all the reception costs and intermediate retransmission costs. For example, in Fig. 2.1, if node $V_{root}$ is the source of a broadcast, and nodes $S_1, S_2$ are chosen as forwarding nodes with the transmission radius shown by the circles, then the total cost includes the transmission costs and reception cost.
In general, the total energy cost for broadcasting, written as $C(s)$, in an $N$-node network with $k$ nodes transmitting the message is:
\[ C(s) = (N - 1) \cdot f_c + \sum_{i=1}^{k} p_i = (N - 1) \cdot f_c + \sum_{i=1}^{k} (f_c + v_c \cdot d_i^2) \]
where $k$ is the number of transmitting nodes (including the source), $p_i$ is the energy cost used by the $i$-th forwarding node, and $d_i$ is the transmission radius chosen by the $i$-th forwarding node.
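The cost formula can be evaluated directly. The sketch below is a small illustration of equation (2.1) and of $C(s)$; the function names and the numeric values of $f_c$, $v_c$ and the radii are made up for the example.

```python
def transmission_cost(d, f_c, v_c):
    """Energy for one transmission of radius d, as in equation (2.1)."""
    return f_c + v_c * d ** 2


def broadcast_cost(num_nodes, radii, f_c, v_c):
    """Total broadcast cost C(s): every node except the source pays the fixed
    reception cost f_c, and each transmitting node pays f_c + v_c * d_i^2."""
    reception = (num_nodes - 1) * f_c
    transmission = sum(transmission_cost(d, f_c, v_c) for d in radii)
    return reception + transmission


# Hypothetical network: 5 nodes, the source transmits with radius 2.0 and one
# forwarding node transmits with radius 1.0.
total = broadcast_cost(num_nodes=5, radii=[2.0, 1.0], f_c=1.0, v_c=0.5)
print(total)
```

Here the reception part is 4 and the two transmissions cost 3 and 1.5, for a total of 8.5.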
2.1.3 Our Contributions
The main contributions of this chapter are the following. For the wireless ad hoc network model with obstacles we show a lower bound of $\Omega(\log N)$ on the approximation ratio of any polynomial time algorithm for this problem, unless P = NP. We then present a greedy broadcasting algorithm (GBA) which provably meets the O(log N) performance bound.
Although we can prove that the worst case approximation ratio of the broadcasting problem in general wireless ad hoc networks is bounded by a factor O(log N), it is possible that for practical networks the BIP or GBA algorithm may give much better solutions compared to the optimal solution. Since we will not know the optimum solution for a given practical network, we developed lower bounds for the energy cost of broadcasting for comparing the effectiveness of GBA solutions. This evaluation is performed through extensive simulations using two lower bounds (longest shortest path cost and multicasting cost to two distant nodes) for many different size networks. The results show that the GBA algorithm performs well in practical networks.
We can observe several interesting properties about our GBA algorithm:
• The O(log N) bound on the performance of the GBA algorithm holds not just for the energy cost model in (2.1), but for any energy cost model for communication as long as the cost is symmetric.
• Recently, Liang [42] gave a polynomial time algorithm for broadcasting in wireless ad hoc networks where all nodes have $k$ power levels and proved that the approximation ratio has a bound of $O(\log^3 N)$, for the symmetric case. Because GBA works for any symmetric cost structure, we improve their approximation ratio to O(log N).
• GBA produces a single shared tree which is an O(log N) approximation to the optimum for broadcasting from any source.
This chapter is organized as follows. In Section 2 we present our theo-
retical bounds and establish the O(log N) lower bound result. In Section
3, we describe the details of our GBA algorithm. In Section 4 we develop
lower bounds for energy cost for broadcasting in practical networks and
present our simulation results. In this section we also compare our re-
sults with the BIP algorithm which is closest to our work. Our results
show that GBA, which meets the theoretical bound for general networks,
has good performance compared to BIP algorithm and for many practical
networks, performs better than BIP algorithm.
2.2 Theoretical Lower Bound for broadcasting
In [7], it is formally proved that the broadcasting problem in a wireless ad hoc network with obstacles is an NP-complete problem. Our objective is to design a practical approximation algorithm with good performance. In this section, we will prove a lower bound of $\Omega(\log N)$ on the approximation ratio of any polynomial time algorithm for this problem, unless P = NP. We will prove this lower bound by reducing the set cover problem to the problem of broadcasting with obstacles in a wireless network.
The Minimum Set Cover Problem is defined as: Given a universe $U$ of $m$ elements and a collection $S_1, S_2, \ldots, S_n$ of subsets of $U$, with cost 1 specified for each subset, the minimum set cover problem asks for a minimum cost collection of sets whose union is $U$.
Figure 2.2: Network for the associated set-cover problem (the root reaches the subset nodes $S_1, S_2, \ldots, S_n$ at cost $\epsilon$; each subset node reaches its element nodes $X_1, \ldots, X_6$ at cost 1)
Now we give a reduction from the Minimum Set Cover problem to our problem as follows (Fig. 2.2): Construct an undirected graph $G = (V, E)$, where
\[ V = \{root\} \cup \{X_1, X_2, \ldots, X_m\} \cup \{S_1, S_2, \ldots, S_n\} \qquad (2.2) \]
and for each subset $S_i$, add an edge between node $root$ and node $S_i$, and for each element $X_j$ in the subset $S_i$, add an edge between node $X_j$ and node $S_i$ in graph $G$. The energy cost to reach all of the $S_i$ nodes from $root$ is $\epsilon$, where $\epsilon$ is a small constant greater than zero. In addition, the energy cost of each node $S_i$ to reach all of its child nodes $X_j$ is 1. In our construction, any missing edge implies that there is an obstacle between the corresponding nodes. Finding the minimum set cover in the original problem now becomes finding a minimum-energy broadcasting scheme from $root$.
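The construction can be sketched in a few lines. The code below is an illustration only, assuming a dictionary representation of the set cover instance; the function names and the example instance are hypothetical, and the exhaustive search over forwarding sets is exponential and serves only to show the correspondence between covers of size $K$ and broadcast schemes of cost $K + \epsilon$.

```python
from fractions import Fraction
from itertools import combinations


def build_broadcast_graph(subsets):
    """Edges of the reduction graph: root - S_i for every subset, and
    S_i - X_j for every element X_j contained in S_i."""
    edges = set()
    for name, members in subsets.items():
        edges.add(("root", name))
        for x in members:
            edges.add((name, x))
    return edges


def scheme_cost(forwarders, universe, subsets, eps):
    """Cost of letting root transmit (eps) and the chosen subset nodes forward
    (1 each); None if some element node would not receive the message."""
    covered = set().union(*(subsets[name] for name in forwarders))
    if covered != set(universe):
        return None
    return eps + len(forwarders)


def cheapest_scheme(universe, subsets, eps):
    """Exhaustive search over forwarding sets, smallest first."""
    names = sorted(subsets)
    for size in range(len(names) + 1):
        for forwarders in combinations(names, size):
            cost = scheme_cost(forwarders, universe, subsets, eps)
            if cost is not None:
                return forwarders, cost
    return None


# Hypothetical set cover instance with 6 elements and 4 subsets.
universe = {1, 2, 3, 4, 5, 6}
subsets = {"S1": {1, 2, 3}, "S2": {3, 4}, "S3": {4, 5, 6}, "S4": {1, 6}}
eps = Fraction(1, 100)
edges = build_broadcast_graph(subsets)
print(len(edges), cheapest_scheme(universe, subsets, eps))
```

In this instance the minimum cover has two sets, so the cheapest broadcast scheme costs $2 + \epsilon$.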
Lemma 12 In an optimum solution to the broadcasting problem specified in Fig. 2.2, any element node $X_i$, $i = 1, 2, \ldots, m$, will not be chosen as a forwarding node.
Proof. Since the root node is the source of broadcasting, and all of the subset nodes $S_i$, $i = 1, 2, \ldots, n$, must get the broadcast message, obviously, the cost $\epsilon$ must be paid to reach all the subset nodes. Since forwarding of the message from any element node can only reach some of the subset nodes $S_i$, which already have the broadcast message, such transmissions are unnecessary. Therefore, in the optimum solution, element nodes $X_i$, $i = 1, 2, \ldots, m$, will not be chosen as forwarding nodes.
Lemma 13 If we can find a solution in the original minimum set cover problem with cost $K$, then we can find a solution in the corresponding problem of broadcasting in a wireless network with obstacles with cost $K + \epsilon$.
Proof. If we could find a solution in the original minimum set cover problem with cost $K$, then in the corresponding broadcasting problem, we could just construct a solution by choosing these $K$ subsets, solved from the minimum set cover problem, as the forwarding nodes, and make root responsible for forwarding messages to all the subset nodes. Since the union of the $K$ subsets is the universe $U$, this solution is a complete broadcast scheme with energy cost $K + \epsilon$.
Lemma 14 If we could find a solution to the broadcasting problem with obstacles in the wireless network specified in Fig. 2.2 with cost $K + \epsilon$, then we can find a solution in the original minimum set cover problem with cost $K$.
Proof. If we could find a solution to the broadcasting problem with cost $K + \epsilon$, then we could construct a solution in the minimum set cover problem by choosing those $K$ subsets whose representative nodes in Fig. 2.2 are chosen as forwarding nodes. Obviously, all of the nodes in Fig. 2.2 get the broadcast message, which implies that the union of these $K$ subsets is $U$.
In [20] and [63], it is proved that the lower bound on the approximation ratio of any polynomial time algorithm for the minimum set cover problem is $\Omega(\log N)$. By Lemmas 13 and 14, the same lower bound holds for our broadcasting problem.
Theorem 15 The lower bound on the approximation ratio of any polynomial time algorithm for the broadcasting problem in wireless networks with obstacles is $\Omega(\log N)$, given that P $\ne$ NP.
2.3 Our GBA Algorithm for the Broadcasting Problem
From the lower bound proof, we can see the similarities between the
set cover problem and the broadcasting problem in wireless networks
with obstacles. There is a simple algorithm which achieves an
approximation ratio of O(log N) for the set cover problem. Therefore, it
seems reasonable to look for a simple greedy algorithm which
obtains a similar approximation ratio for our problem. In this section,
we present an O(log N)-approximation to our problem using a simple
polynomial time algorithm that we call the Greedy Broadcasting
Algorithm (GBA). From Section 2, it follows that this is the best possible (up
to a constant factor) approximation ratio that can be achieved in
polynomial time unless P = NP.
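As a point of comparison, the simple greedy algorithm for set cover mentioned above can be sketched as follows. This is a minimal illustration that assumes unit-cost subsets given as Python sets; the function name `greedy_set_cover` is hypothetical and not part of the original text.

```python
def greedy_set_cover(universe, subsets):
    """Greedy set cover: repeatedly pick the subset that covers the most
    still-uncovered elements. Returns the indices of the chosen subsets."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Index of the subset with the largest overlap with the uncovered set.
        best = max(range(len(subsets)), key=lambda i: len(uncovered & subsets[i]))
        if not uncovered & subsets[best]:
            raise ValueError("subsets do not cover the universe")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen


cover = greedy_set_cover({1, 2, 3, 4, 5}, [{1, 2, 3}, {2, 4}, {3, 4}, {4, 5}])
```

GBA follows the same pattern, with "elements covered per subset" replaced by "trees reached per unit of transmission power".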
We will assume that we are given a set of nodes $V = \{v_1, v_2, \ldots, v_N\}$. We
will further assume that we are given a non-negative cost $c(v_i, v_j)$
between any pair of nodes $v_i, v_j$. This cost represents the amount of power
required to be able to send information from $v_i$ to $v_j$. The cost can be
infinite for nodes which cannot communicate directly (e.g., because of
an obstacle). We will assume that each node has a path of non-infinite
cost to any other node. We will further assume that the cost function is
symmetric, i.e., the cost of a transmission from $v_i$ to $v_j$ is the same as the
cost of a transmission from $v_j$ to $v_i$. If a node v transmits with power $\rho$, it
can transmit to every node $x \in V$ such that $c(v, x) \le \rho$.
2.3.1 Description of GBA
The algorithm always maintains a collection $\mathcal{C} = \{T_1, T_2, \ldots, T_J\}$ of trees.
Let V(T) denote the set of nodes in tree T. The algorithm ensures that
the sets $V(T_1), V(T_2), \ldots, V(T_J)$ are disjoint, and that their union is the set
V. The number of trees in $\mathcal{C}$ keeps decreasing as the algorithm advances.
Before specifying the algorithm, we need to establish some notation.
The cost c(v, T) of transmitting from a node v to a tree T is defined as
$\min_{x \in V(T)} c(x, v)$. We use the term x(v, T) to denote the node in V(T) which
achieves this minimum (if two or more nodes in V(T) achieve this
minimum, then we break the tie arbitrarily).
At any given time during the execution of the algorithm, define $n(v, \rho)$ as
the number of trees T in the collection $\mathcal{C}$ such that $v \notin T$ and $c(v, T) \le \rho$.
Informally, $n(v, \rho)$ measures the number of trees which can be reached
from v using a single transmission of power $\rho$.
The profit ratio $\gamma(v)$ of a node v is defined as $\min_{\rho} \rho / n(v, \rho)$. Let $\rho(v)$ refer to
the transmission power which achieves this minimum; again, ties are
broken arbitrarily. Intuitively, the profit ratio of a node is the best (lowest) ratio
of the power used to the number of trees covered.
Note there are at most N possible values of $\rho$ that are “interesting” for
node v. The quantities $\gamma(v)$, $\rho(v)$, x(v, T) etc. can all be calculated in
polynomial time. It is also important to note that the quantities $\gamma(v)$ etc. are
not static but change as the class $\mathcal{C}$ changes during the course of the
algorithm.
We are now ready to describe GBA:
The Initialization Step: Let $T_i$ be the tree consisting of a single node
$v_i$. The class $\mathcal{C}$ consists of the trees $T_1, T_2, \ldots, T_N$. Also, initialize C to 0.
The Greedy Step:
1. If the class $\mathcal{C}$ has a single tree T then output this as the shared
tree and exit, else go to step 2.
2. Find the node v with the minimum value of $\gamma(v)$. Let T be the
tree in $\mathcal{C}$ which contains v. Then for each tree $T' \neq T$ in $\mathcal{C}$ such
that $c(v, T') \le \rho(v)$, do the following:
(a) Add an edge from v to $x(v, T')$. The trees T and $T'$ have
now merged into a single tree; we continue to refer to this
merged tree as T.
(b) Remove $T'$ from $\mathcal{C}$; the merged tree T continues to remain
in $\mathcal{C}$.
3. $C \leftarrow C + \rho(v)$.
4. Start another iteration of the greedy step.
One nice feature of GBA is that it returns a single tree which can be used
by any source to perform a broadcast; in the next section we show that
this tree is within a factor O(log N) of the optimum irrespective of the
source. Notice that the quantity C is maintained by the above algorithm
but is never used; this quantity will help us in our analysis.
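The greedy step above can be sketched in Python as follows. This is an illustrative sketch rather than the original implementation: it assumes the costs are supplied as a symmetric N x N matrix with `math.inf` marking pairs that cannot communicate directly, and the names `gba`, `cost`, and `tree_of` are hypothetical. It recomputes every profit ratio in each iteration, which is polynomial but not tuned for speed.

```python
import math


def gba(cost):
    """Greedy Broadcasting Algorithm sketch.

    cost: symmetric N x N matrix; math.inf means no direct link (assumed format).
    Returns (edges, total): the shared-tree edges and the accumulated power C.
    """
    n = len(cost)
    tree_of = list(range(n))  # tree_of[v] is the id of the tree containing v
    edges, total, num_trees = [], 0.0, n
    while num_trees > 1:
        best = None  # (profit ratio, node, power, reached nodes)
        for v in range(n):
            # nearest[t] plays the role of x(v, T) for every other tree T.
            nearest = {}
            for x in range(n):
                t = tree_of[x]
                if t != tree_of[v] and cost[v][x] < math.inf:
                    if t not in nearest or cost[v][x] < cost[v][nearest[t]]:
                        nearest[t] = x
            # Only the values c(v, T) are "interesting" power levels.
            for power in sorted({cost[v][x] for x in nearest.values()}):
                targets = [x for x in nearest.values() if cost[v][x] <= power]
                ratio = power / len(targets)
                if best is None or ratio < best[0]:
                    best = (ratio, v, power, targets)
        if best is None:
            raise ValueError("network is not connected")
        _, v, power, targets = best
        for x in targets:
            edges.append((v, x))  # step 2(a): connect v to x(v, T')
            old, new = tree_of[x], tree_of[v]
            for u in range(n):  # step 2(b): T' is absorbed into T
                if tree_of[u] == old:
                    tree_of[u] = new
            num_trees -= 1
        total += power  # step 3
    return edges, total
```

For a three-node line network in which the middle node reaches both ends at cost 1, the sketch picks the middle node first because a single transmission of power 1 merges two trees.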
2.3.2 Analysis of GBA
In order to use the shared tree T for a given source s, we reorient the
tree so that s becomes the root. Then we start at the root, and transmit
the message with sufficient power to reach all the children of the root.
From each of the children, we recursively follow the same process till the
message gets to all the nodes in T. Let C(s) denote the total power used
by the above strategy.
Lemma 16 $C(s) \le 2C$
The proof of Lemma 16 is straightforward, and is omitted. This lemma
is very useful, as we now need to only analyze the cost C and not worry
about the source.
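The rerooting strategy that defines C(s) can be illustrated with a short sketch. It assumes the shared tree is given as a list of undirected edges and the costs as a matrix indexed by node; the function name `broadcast_cost` is hypothetical.

```python
from collections import defaultdict


def broadcast_cost(edges, cost, source):
    """Total power C(s) when the shared tree is rooted at `source` and each
    node transmits with just enough power to reach all of its children."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    total = 0.0
    stack = [(source, None)]
    while stack:
        node, parent = stack.pop()
        children = [w for w in adj[node] if w != parent]
        if children:
            # One transmission reaches every child; pay for the farthest one.
            total += max(cost[node][w] for w in children)
        stack.extend((w, node) for w in children)
    return total
```

On a three-node line with unit costs whose tree was built by a single transmission from the middle node (C = 1), an end node used as source gives C(s) = 2, which meets the bound of Lemma 16 with equality.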
Let $C^*(s)$ denote the minimum power needed to broadcast a message
from source s to all the other nodes in the network. Observe that we
do not know how to efficiently compute $C^*(s)$; we will use $C^*(s)$ only
as an artifact in our analysis. Define $C^* = \min_{s \in V} C^*(s)$, and define the
associated optimum broadcasting tree as $T^*$. Assume that GBA went
through the greedy step K times. Let $v_i$ denote the chosen vertex, let $C_i$
denote the value of $\rho(v_i)$, and let $n_i$ denote the value of $n(v_i, \rho(v_i))$ during
the i-th iteration of the greedy step. For each node v, define $\rho^*(v)$ as its
transmitting power used in $T^*$ and define $n_i^*(v)$ as the number of distinct
clusters its children belong to during the i-th iteration of the greedy step.
Notice that $C^* = \sum_v \rho^*(v)$ and $\gamma_i(v) \le \rho^*(v) / n_i^*(v)$ due to the definition of $\gamma$.
Observe that the number of trees in $\mathcal{C}$ decreases by exactly $n_i$ during the
i-th iteration. $\mathcal{C}$ has N trees at the beginning of the algorithm, and only
one at the end. Hence $\sum_{i=1}^{K} n_i = N - 1$. Also, $C = \sum_{i=1}^{K} C_i$. We will use
the term $m_i$ to denote the size of $\mathcal{C}$ at the beginning of the i-th iteration.
Notice that $C_i / n_i = \gamma(v_i) \le \gamma_i(v) \le \rho^*(v) / n_i^*(v)$, and $m_i - m_{i+1} = n_i$.
Lemma 17 $\sum_v n_i^*(v) \ge (m_i - 1)$
Proof. For any node v and cluster X such that v does not belong to X,
define $Z^*(v, X) = 1$ if there exists a node $u \in X$ which is a child of v in
$T^*$, and define $Z^*(v, X) = 0$ otherwise. Let $Z^*(X) = \sum_{v \notin X} Z^*(v, X)$. Let
$\mathcal{C}_i$ denote the set of clusters at the beginning of step i. Now,
$\sum_v n_i^*(v) = \sum_{X \in \mathcal{C}_i, v \notin X} Z^*(v, X) = \sum_{X \in \mathcal{C}_i} Z^*(X)$. Consider any cluster X which does not
contain the root of $T^*$. There must be a node in X whose parent does not
belong to X. Hence $Z^*(X) \ge 1$. Since only one cluster in $\mathcal{C}_i$ contains the
root, $\sum_{X \in \mathcal{C}_i} Z^*(X) \ge (m_i - 1)$, which proves this lemma.
The above lemma is crucial to the following lemma:
Lemma 18
$\exists v$ s.t. $\rho^*(v) / n_i^*(v) \le C^* / (m_i - 1)$ (2.3)
Proof. Suppose that $\forall v$, $\rho^*(v) / n_i^*(v) > C^* / (m_i - 1)$. Then we have
$C^* = \sum_v \rho^*(v) > \sum_v n_i^*(v) \, C^* / (m_i - 1)$. By Lemma 17, the right-hand side
is at least $C^*$, so we get $C^* > C^*$,
which is a contradiction; this proves the lemma.
Since at each iteration we choose a node v with the best profit ratio, we can
therefore conclude that:
$C_i / n_i = \gamma(v_i) \le \gamma_i(v) \le \rho^*(v) / n_i^*(v) \le C^* / (m_i - 1)$. (2.4)
Now let $H(N) = \sum_{j=1}^{N} \frac{1}{j}$ denote the N-th harmonic number.
Lemma 19 $C \le C^* H(N)$.
Proof. Equation (2.4) is the crucial property of GBA which will
result in the desired bound. Now,
$$
\begin{aligned}
C = \sum_{i=1}^{K} C_i &\le \sum_{i=1}^{K} n_i \, \frac{C^*}{m_i - 1} && \text{[Using equation 2.4]} \\
&= C^* \left( \sum_{i=1}^{K} \frac{n_i}{m_i - 1} \right) \\
&= C^* \left( \sum_{i=1}^{K} \sum_{j=1}^{n_i} \frac{1}{m_i - 1} \right) \\
&\le C^* \left( \sum_{i=1}^{K} \sum_{j=1}^{n_i} \frac{1}{m_i - j} \right) \\
&= C^* \sum_{i=1}^{N-1} \frac{1}{i} && \text{[Since } m_i - n_i = m_{i+1} \text{]} \\
&\le C^* H(N)
\end{aligned}
$$
The following theorem is now immediate from Lemmas 16 and 19.
Theorem 20 $C(s) \le 2 C^*(s) H(N)$
It is well known that $H(N) \le 1 + \ln N$, which gives us an upper bound
of O(log N) on the performance of GBA, irrespective of the source. It is
worth observing that the constant hidden inside the O-notation is quite
small.
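To get a feel for how small the guaranteed factor is at the network sizes considered in this chapter, the harmonic number can simply be evaluated. The snippet below is an illustrative check only; the helper name `harmonic` is hypothetical.

```python
import math


def harmonic(n):
    """N-th harmonic number H(N) = 1 + 1/2 + ... + 1/N."""
    return sum(1.0 / j for j in range(1, n + 1))


for n in (10, 50, 150):
    h = harmonic(n)
    # Theorem 20 bounds C(s) by 2 * H(N) times the optimum; H(N) <= 1 + ln N.
    print(n, round(h, 3), round(2 * h, 3), h <= 1 + math.log(n))
```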
2.4 Evaluation
We use extensive simulations to evaluate the performance of our GBA
broadcasting algorithm for a number of different network scenarios. We
also use simulations to compute the lower bounds for practical
networks. For our simulations, we chose randomly generated undirected
connected graphs as network topologies. Such a graph represents a
realistic network with obstacles, since only those pairs of nodes that can
communicate directly have edges between them. In order to see the
effect of the density of nodes, we chose two different network sizes: one
is a network in a 500m x 500m square area; the other is a network in
a 1000m x 1000m square area. We generated networks with the number
of nodes ranging from 10 to 150 in increments of 10. For each size, we
generated 50 different instances (on average, the number of edges is 75%
of that of the full mesh graph). So in total, we have
1500 different networks for our simulations.
[Figure 2.3: GBA vs. BIP on 500m x 500m practical networks. Plotted against the number of nodes: MAX BIP/GBA, MAX GBA/BIP, AVG BIP/GBA.]
[Figure 2.4: GBA vs. BIP on 1000m x 1000m practical networks. Plotted against the number of nodes: MAX GBA/BIP, MAX BIP/GBA, AVG BIP/GBA.]
2.4.1 GBA vs BIP
Since the BIP [85] broadcasting heuristic is the most widely cited algorithm
for this problem, we compare the performance of GBA and BIP on the
various practical networks. We are interested in the average energy cost
for broadcasting from any node. For this purpose, we iteratively chose
every node as the source for broadcasting, calculated the energy cost
with the GBA scheme and the BIP scheme, and computed the cost
ratio. For a given N, we calculated the energy cost for broadcasting with
these two algorithms for 50 different instances of network topologies.
The maximum and average costs are computed for the various
networks, and for comparison purposes, we plot the maximum cost ratio of
GBA over BIP and BIP over GBA as well as the ratio of the average cost.
Fig. 2.3 and Fig. 2.4 show the maximum and average ratio results for
network sizes ranging from 10 nodes to 150 nodes. Our results clearly
show that GBA performs better on the average and in many instances
it performs significantly better than the BIP scheme. With increase in
network size, the maximum energy cost ratio tends to settle down with
GBA performing better than BIP by at least 20%.
[Figure 2.5: GBA vs. BIP on BIP-weak networks. Plotted: AVG BIP/GBA.]
[Figure 2.6: GBA vs. BIP on GBA-weak networks. Plotted: AVG BIP/GBA.]
In addition, we conducted simulation runs on special networks which
are known to be bad for the GBA or BIP schemes and computed the average
energy cost for broadcasting in such networks. We evaluated the
performance of both algorithms on some specially designed networks given
in [84] which are weak for BIP, and Fig. 2.5 shows the simulation
results. We also designed a family of networks that we believe gives bad
performance for GBA; we call these networks GBA-weak (their structure
is described in Section 2.4.3), and Fig. 2.6 shows our results. The plots
in these figures show the ratio of average energy cost for broadcasting
from any source with the BIP and GBA schemes for several different
BIP-weak and GBA-weak networks. Clearly, GBA performs better than
BIP for both classes of weak networks in most cases.
In summary, our simulation results show that:
- GBA works better than BIP on the average.
- In some cases, GBA works significantly better than BIP.
- For both BIP-weak and GBA-weak networks, GBA outperforms BIP.
- When the density of the network becomes high, the performance of
GBA and BIP converges.
2.4.2 Practical Lower Bounds
In the worst case, our algorithm (or any polynomial time algorithm) can
be $\Omega(\log N)$ times more expensive than the optimum, unless P = NP.
However, it is possible that the performance of our algorithm may be
much better on realistic scenarios. In this section, we present two
efficient methods of computing a lower bound on the cost of the optimal
solution. (A thorough exploration of the lower bound is beyond the scope
of this chapter.) The ratio of the cost of our algorithm to the lower bound on
any given instance gives a bound on the approximation ratio of our
algorithm for that instance.
[Figure 2.7: 2-SPT: the paths from s to z, from $z_1$ to x, and from $z_2$ to y are
shortest paths. $z_1$ and $z_2$ can be reached from z by one broadcasting.]
a. 1-SPT (SPT stands for shortest path tree): Find the node v such that the
cost of unicasting to v from source s is the highest. Use this cost as a
lower bound. This node v and the corresponding cost can easily be found
using one invocation of Dijkstra's algorithm; we omit the details.
b. 2-SPT: Suppose we are given a source s and two nodes $x, y \in V \setminus \{s\}$.
The energy-optimum tree for multicasting from s to the set $\{x, y\}$ must
have the following form: there is a node $z \in V$ such that s is connected to
z via the shortest path from s to z, z is connected to $z_1$ and $z_2$, and then $z_1$
is connected to x via the shortest path from $z_1$ to x and $z_2$ is connected to
y via the shortest path from $z_2$ to y, as illustrated in Fig. 2.7. Let $C^*(x, y)$
be the cost of this tree. Two observations can now be made:
[Figure 2.8: 1-SPT Lower Bounds for 500m x 500m practical networks. Plotted against the number of nodes: MAX, MIN, and AVG GBA/lbound.]
[Figure 2.9: 1-SPT Lower Bounds for 1000m x 1000m practical networks. Plotted against the number of nodes: MAX, MIN, and AVG GBA/lbound.]
[Figure 2.10: 2-SPT Lower Bounds for 500m x 500m practical networks. Plotted against the number of nodes: MAX, MIN, and AVG GBA/lbound.]
[Figure 2.11: 2-SPT Lower Bounds for 1000m x 1000m practical networks. Plotted against the number of nodes: MAX, MIN, and AVG GBA/lbound.]
1. $C^*(x, y)$ can be computed efficiently by guessing z and its
transmission radius (by trying out all values of z and the transmission
radius, for example). We omit the details.
2. $C^*(x, y)$ is a lower bound on the optimum broadcast cost.
The above observations could be used to compute $\max_{x, y \in V} C^*(x, y)$, which
would be a valid lower bound. In our experience, this turns out to be
prohibitively expensive computationally for our topologies. Instead, we
use a heuristic to choose values for x and y as follows: first, choose
x as the node that has the highest shortest path cost from source s.
Then choose y as the node such that the product of its shortest path
costs from s and x is the highest. We could analogously define $C^*(w, x, y)$,
$C^*(u, w, x, y)$, etc. and use these as lower bounds, but the computational
complexity of these larger problems prohibits these generalizations. To
see how well our scheme works for practical networks, we calculated
the lower bound energy cost on the 1500 randomly generated networks.
We calculated the ratio of the GBA cost over the lower bound cost for
practical networks. Fig. 2.8 and Fig. 2.9 show the results with
respect to the 1-SPT lower bound, and Fig. 2.10 and Fig. 2.11
show the results with respect to the 2-SPT lower bound.
Our simulation results show that the 2-SPT lower bound is tighter than
1-SPT. The maximum ratio of GBA/lower bound is between 3 and 4 in
different size networks. This ratio seems to converge to a small factor of 2
to 2.5 with increase in the density of nodes. We have chosen very simple
lower bounds for broadcasting and our results show that the cost ratio
is very small. With an even tighter lower bound, we can expect this ratio to
be much smaller in practical networks. Therefore, we can conclude that
the GBA algorithm will perform well for broadcasting in practical networks
of a few hundred nodes in a 1000m x 1000m area.
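The 1-SPT bound and the heuristic choice of x and y for 2-SPT described in this section can be sketched as follows. The sketch assumes that the unicast cost of a path is the sum of its per-hop costs and that the network is given as a cost matrix with `math.inf` for missing links; the function names are hypothetical, and the computation of the 2-SPT cost itself is not shown.

```python
import heapq
import math


def shortest_path_costs(cost, source):
    """Dijkstra's algorithm over the cost matrix; returns cost from source to each node."""
    n = len(cost)
    dist = [math.inf] * n
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for w in range(n):
            nd = d + cost[u][w]
            if w != u and nd < dist[w]:
                dist[w] = nd
                heapq.heappush(heap, (nd, w))
    return dist


def one_spt_lower_bound(cost, source):
    """1-SPT: the highest unicast cost from the source to any node."""
    return max(shortest_path_costs(cost, source))


def pick_xy(cost, source):
    """Heuristic choice of x and y for 2-SPT (needs at least three nodes)."""
    from_s = shortest_path_costs(cost, source)
    others = [v for v in range(len(cost)) if v != source]
    x = max(others, key=lambda v: from_s[v])
    from_x = shortest_path_costs(cost, x)
    # y maximizes the product of its shortest path costs from s and from x.
    y = max((v for v in others if v != x), key=lambda v: from_s[v] * from_x[v])
    return x, y
```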
2.4.3 A GBA-weak network
It would be useful to find some specially designed networks which are
“bad” for GBA. We call these networks GBA-weak networks. In this
section, we give one construction for such a class of networks. In order
to make a network bad for GBA, the construction should proceed as
follows:
- Start with a network with a known optimal solution.
- It has N + 1 nodes, among which N nodes are forwarding nodes.
- The kth forwarding node will spend an energy cost of $\frac{OPT}{N-k+1}$ for the
transmission.
We can see that in such a network, the energy cost of the broadcasting
scheme produced by GBA is $O(\log N) \cdot OPT$, which is really bad. The network
is shown in Fig. 2.12.
[Figure 2.12: A GBA-weak network with optimal cost for broadcasting: $f_c + v_c r_N^2$]
[Figure 2.13: Broadcast tree built by GBA for the GBA-weak network shown in Fig. 2.12]
In this network, nodes $P_1, P_2, \ldots, P_N$ are distributed on concentric circles
with the same center $P_0$. Node $P_0$ can communicate with all other nodes,
yet node $P_i$, $1 \le i \le N$, can only communicate with its nearest neighbors
$P_{i-1}$ and $P_{i+1}$. By carefully tuning the distance ($r_i$) between the nearest
neighboring nodes and the radius ($d_i$) of the circles, we can make GBA
find a tree as shown in Fig. 2.13. The following conditions for the $d_i$s and $r_i$s
will make GBA perform poorly on this network:
1. For each $r_k$, let $f_c + v_c r_k^2 = \frac{OPT}{N-k+1}$, where $OPT = f_c + v_c r_N^2$,
and thus we get $r_k = \sqrt{\frac{2 r_N^2 - N + k}{2(N-k+1)}}$, where $r_1 = d_1$
2. For each k, $\frac{f_c + v_c d_k^2}{k} > \frac{f_c + v_c d_N^2}{N}$
3. For each k, $d_{k-1} \le d_k \le d_{k+1}$
4. $(d_k - d_{k-1}) \le r_k \le (d_k + d_{k-1})$
We can see that all of the above conditions can hold for certain values of
$d_1, d_2, \ldots, d_N$. Once condition 2 holds, GBA will never choose
$P_0$ as the cluster head with any $d_k$ as the transmission radius.
Thus we have the following conclusion:
$$\frac{(f_c + v_c r_{k+1}^2)/2}{(f_c + v_c r_k^2)/1} = \frac{OPT / (2(N-k))}{OPT / (N-k+1)} = \frac{1}{2} + \frac{1}{2(N-k)} < 1 \quad (2.5)$$
and
$$f_c + v_c r_1^2 = \frac{OPT}{N} < \frac{f_c + v_c (r_N + \epsilon)^2}{N} \quad (2.6)$$
This in turn leads to the following conclusion: the total energy cost
required by the broadcasting scheme produced by GBA would be:
$$\sum_{k=1}^{N} \frac{1}{N-k+1} \, OPT = H(N) \cdot OPT = O(\log N) \cdot OPT \quad (2.7)$$
The shared broadcasting tree built by GBA is shown in Fig. 2.13.
2.5 Summary
In this chapter, we considered realistic wireless ad hoc networks with
obstacles and established a theoretical lower bound on the approximation
ratio of the energy cost for broadcasting. With a more general network
model with obstacles and a more general energy cost model, we showed that no
polynomial time algorithm can achieve an approximation ratio better than
O(log N), unless P=NP. We developed and presented a broadcasting
algorithm, called GBA, and proved that this algorithm guarantees an O(log N)
approximation ratio and thus is an optimal polynomial
time approximation algorithm for energy efficient broadcasting in
wireless ad hoc networks with obstacles. In practical networks, we showed
that the GBA algorithm performs quite well and on the average performs
better than the BIP scheme by at least 20%. Using extensive
simulations we compared the performance of GBA with respect to simple lower
bounds for broadcasting and showed that it is within a factor of 4 or
less of this lower bound for networks of up to a few hundred
nodes. With this result, we established that, for a given practical network,
GBA can achieve broadcasting at a cost that is at most 4 times the optimum
cost for broadcasting.
Chapter 3
QCRA: Quasi-static Centralized Rate Allocation for Sensor Networks
Rate control for congestion mitigation and avoidance has received significant
attention in the sensor networks literature [58, 29, 83, 16]. These
techniques are focused on distributed rate adaptation, whereby each node
locally determines a safe transmission rate that would avoid network
congestion. For example, IFRC [58], a recently proposed technique,
employs a novel congestion sharing mechanism and standard AIMD rate
adaptation in order to achieve distributed fair and efficient rates in a
wireless sensor network. However, techniques such as IFRC can
converge on rates that are slightly less than optimal, and require careful
parameter tuning in order to perform well in a given environment.
On the other hand, there is a significant thread in the literature that
has examined centralized rate allocation. This body of work seeks to
determine optimal transmission rates, given information about topology,
link loss rates, and the communication pattern. In many settings,
determining the optimal rate is computationally intractable [32, 38], and
many heuristics have been proposed for obtaining near-optimal rate
allocations. To our knowledge, however, no work has examined whether
centralized rate allocation is actually feasible in a real-world wireless
network.
In this chapter, we take a step back and ask: is a near-optimal
quasi-static centralized rate allocation even feasible for wireless sensor
networks? This approach is attractive because of its potential efficiency
and simplicity: near-optimal rate assignments would lead to higher
overall network efficiency, and centralized rate assignment would reduce the
complexity of sensor node software. However, intuition suggests that
quasi-static centralized rate allocation is unlikely to work well in
practice, since wireless loss rates vary dynamically, and heuristics for
near-optimal rate allocation are computationally expensive.
In its most basic form, the problem that we consider in this chapter
can be stated as follows. Given a network of N wireless sensor nodes
each using a CSMA medium access layer and each having a backlog of
data to send to a base station, and given the routing tree used by these
nodes to send traffic to the base station, we seek to centrally (at the base
station) allocate a fair and efficient rate for each sensor such that the
goodput achieved at the base station in a real-world network matches
the assigned rate for each sensor. Furthermore, our rate allocation is
quasi-static: that is, rate assignments are recomputed once every epoch,
where the duration of an epoch is on the order of tens of minutes (in this
chapter, we consider an epoch duration of 15 minutes).
This problem statement is motivated by long-running continuous-monitoring
wireless networks for high-data rate applications [52, 47]. While the way
the problem has been formulated may seem contrived, we have
deliberately stated the problem in this manner to examine whether centralized
rate allocation is practical in real-world settings. Clearly, a purely static
centralized rate allocation is infeasible since wireless propagation
characteristics vary widely with time; the next best approach is to consider
infrequent rate adjustments, as we do with quasi-static rate allocation.
Of course, our problem statement simplifies one aspect of reality. We
do not extensively consider routing changes due to node failures. Our
methods can be adapted to trigger rate allocation decisions when a node
failure is detected by the base station. Such an adaptation will work well
in environments where node failures do not occur at shorter time scales
than rate adaptation.
We make two important contributions in this work. First, we design
a fast rate assignment heuristic that, given the topology of the wireless
sensor network, and the loss rates on the routing tree links, assigns a
fair and efficient rate to each node. This heuristic essentially computes
a coarse-grained TDMA schedule among all neighbors of a node, in order
to determine the traffic levels in the vicinity of that node. The node
with the highest traffic level determines the fair rate allocation; recursive
applications of the heuristic allow nodes unconstrained by the bottleneck
node to achieve higher rates. Second, we show how a single parameter C
suffices in practice to adapt the rate assignments to account for the fact
that, in practice, wireless sensor radios use CSMA, that wireless loss rate
estimation techniques can be imperfect, and that wireless loss rates vary
even over time scales of tens of minutes. The rate adaptation parameter
C is empirically derived by observing the behavior of the network during
an epoch.
Contrary to intuition, we find that our quasi-static centralized rate
allocation (QCRA) performs well on a 40-node wireless testbed even under
harsh conditions. Extensive experiments on this testbed show that
sensor nodes achieve a goodput very close to their allocated rate, and that
this achieved goodput is nearly 50% higher than that achieved by IFRC.
Furthermore, even though our heuristic is somewhat simplistic, we find
that our achieved goodput often equals an empirically-determined
optimal rate. Finally, we show that our heuristic can be easily extended to
support weighted fair allocations, as well as sensor networks with
multiple base stations.
Much remains to be done in order to make QCRA truly feasible.
Efficient methods for obtaining network topology, fast and accurate link
loss estimation techniques, and low-overhead practical rate distribution
protocols will need to be developed. Regardless, we believe that our
results make a strong case for the feasibility of such methods, a case that
has not been made in the literature before.
3.2 Related Work
Fair and efficient rate allocation has been studied before for both
multihop wireless networks and wireless sensor networks in particular. Prior
work in this area has explored both decentralized and centralized schemes.
In our work, we explore practical centralized rate allocation; we are not
aware of any centralized rate allocation scheme for a CSMA MAC layer
that has been evaluated with real-world experiments.
Several distributed rate allocation approaches are related to QCRA. Ee et
al. [16] propose a distributed rate allocation scheme to achieve fair rate
allocation for wireless sensor networks. Unlike QCRA, however, their
rate allocation scheme is fair, but not necessarily efficient in the sense we
describe in Section 3.3.4. Woo et al. [87] propose a distributed scheme
that uses an AIMD rate adjustment strategy to achieve fairness and
efficiency. Similarly, IFRC [58] is a distributed rate allocation scheme that
explicitly shares congestion information and uses an AIMD scheme to
achieve fair and efficient rate allocation. Finally, Huang et al. [?]
describe a theoretical framework for calculating max-min fair rates in a
distributed fashion for ad-hoc networks. Their approach, like ours, also
uses cliques in the wireless contention graph. By contrast to all of these,
however, we explore the feasibility of centralized rate control in real
sensor networks.
ESRT [66] is a centralized rate allocation scheme for wireless sensor
networks. In ESRT, each node sends data at a fixed rate to the base
station which then informs the node to increase or decrease its rate based
on the perceived goodput. Though ESRT concentrates on reliable and
energy efficient operation, it can be used to achieve fairness, but not
efficiency. Furthermore, ESRT has only been evaluated analytically and
through simulation.
Centralized packet scheduling for multihop wireless networks has been
studied extensively in the literature. Jain et al. [32] show the
intractability of a joint routing and scheduling problem for multihop wireless
networks for any-to-any traffic. They propose a technique to obtain lower
and upper bounds on the rates that can be achieved by a given set of
flows in a network. Kodialam et al. [38] address the same problem and
propose approximation algorithms to achieve a near-optimal allocation.
Unlike these, we examine rate allocation for a given routing tree where
the nodes use a CSMA MAC layer.
Other work tangentially related to QCRA includes Fusion [29] and CODA [83].
These attempt to mitigate congestion in wireless sensor networks with
increasing offered load. Finally, Nandagopal et al. [49] propose a
distributed mechanism to achieve any pre-defined MAC layer fairness model.
3.3 Quasi-static Centralized Rate Allocation
In this section we describe our goals and give an overview of QCRA, fol-
lowed by a detailed description of its rate allocation and rate adaptation
scheme. We then discuss several practical extensions to the rate allo-
cation scheme, and complete the section with a discussion of QCRA's
limitations.
3.3.1 Overview
In its simplest form, the problem we examine is the following. Consider
a sensor network consisting of N nodes each sending data over multiple
hops to a single base station. For simplicity, we consider the case where
nodes are always backlogged (i.e., have data to send). We assume that
a routing protocol has established a routing tree; any one of the
commonly used tree-routing techniques could be used for this purpose. In
what follows, we assume that the routing tree is fixed in every epoch,
and we discuss later how to deal with route changes and node failures.
Furthermore, we assume that each node employs CSMA-based medium
access control. This assumption is consistent with current practice in
sensor networks.
Let $f_i$ be the flow originating from node i. We investigate the
feasibility of quasi-static centralized allocation of transmission rates $r_i$ to every
node i such that the end-to-end goodput received by all flows $f_i$ is fair
in a max-min sense. (There is a broader question of whether fairness is
the right goal for wireless sensor networks. It is currently unclear what
traffic policy a sensor network should enforce, so we selected max-min
fairness for our experiments. Notice that, since our approach is
centralized, a variety of other rate allocation strategies can be implemented
within our framework.) QCRA makes rate allocation decisions once every
epoch. We assume that we are given the network topology, as well as
loss rates on routing tree links; we discuss later how this information
can be obtained. QCRA does not require additional specialized queuing
or scheduling mechanisms at sensor nodes.
Centralized schemes to achieve fairness among flows in wireless
networks have been studied extensively in the literature. Optimal
centralized rate allocation is known to be computationally intractable, and
well-performing heuristics proposed in the literature have assumed a
time-synchronized MAC. However, feasibility for use in a real-world setting
requires a light-weight rate-allocation scheme that performs well on a
CSMA MAC. In this section, we describe QCRA's heuristics for rate
allocation and adaptation.
[Figure 3.1: Example of contention]
The simplest heuristic to calculate transmission rates in a sensor
network would be to equally divide the bandwidth at a node among all the
flows traversing the node and all its neighbors, and then allocate the
minimum of all the rates that are assigned to the flows at any node as
the transmission rate. For example, in Figure 3.1, all nodes send data
to node 1. To simplify illustration, we assume ideal links in this
discussion. Solid lines depict the routing tree, and dotted lines connect nodes
that can hear each other's traffic but do not have a child-parent
relationship. Data traffic at a given node in this network depends on the
number of flows traversing the node and all its neighbors. In our
example, the bandwidth around node 5 will be utilized by data traffic from all
the flows traversing node 5 and traffic from all the flows traversing
neighbors of node 5. Thus, all flows except flow $f_3$ contend for the bandwidth
at node 5. The above straw-man heuristic would take the sum total of the
flows from node 5 and its neighbors, which equals 18, and would assign
a rate B/18 (B being the channel capacity) to each flow $f_i$ except $f_3$ in the
network. However, this assignment, though fair, can be improved upon.
Transmissions from 6 → 3 and 4 → 2 would not interfere with each other
and can, in principle, occur simultaneously.
QCRA employs a more sophisticated rate allocation by taking into
account transmissions that can occur simultaneously. In QCRA, a node
and its neighbors are divided into independent sets of nodes that can
transmit simultaneously. From each such set, the node that contends
for the maximum amount of bandwidth is chosen, and the sum of the
bandwidth usage of the chosen nodes defines the total bandwidth
requirement at a node. For example, in Figure 3.1, at node 6, node 6 and
its neighbors, namely 3, 5 and 11, are divided into three sets (6),
(3,5), and (11). Now, the number of flows traversing nodes 6, 3, 5 and
11 is 2, 3, 4 and 1 respectively. But since 3 and 5 do not interfere,
their transmissions can be conceptually scheduled simultaneously, so
the total traffic around node 6 equals 2 + 4 + 1 = 7. QCRA uses this
estimate of total traffic at a node while dividing the bandwidth fairly
among flows at a node.
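The node 6 arithmetic can be reproduced with a short Python sketch. The function name `traffic_around` and the dictionary layout are illustrative assumptions rather than part of QCRA's implementation; the flow counts and the partition into sets are the ones quoted above for Figure 3.1.

```python
def traffic_around(independent_sets, flows):
    """Sum, over the independent sets, of the heaviest node in each set."""
    return sum(max(flows[n] for n in s) for s in independent_sets)

# Flow counts quoted in the text for nodes 6, 3, 5 and 11.
flows = {6: 2, 3: 3, 5: 4, 11: 1}
# Partition quoted in the text: 3 and 5 do not interfere, so they share a set.
sets = [(6,), (3, 5), (11,)]

total = traffic_around(sets, flows)   # 2 + 4 + 1 = 7
naive = sum(flows.values())           # straw-man count for the same nodes: 10
```

Under the straw-man heuristic the same four nodes would be charged 10 flows, so co-scheduling nodes 3 and 5 lowers the estimated contention at node 6 from 10 to 7.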
By clustering nodes whose transmissions can be potentially co-scheduled,
our heuristic implicitly assumes a coarse-grained time-division. In a
CSMA MAC, clearly, it will not be possible to schedule transmissions
in this manner. Thus, a rate allocation derived from this heuristic will
result in collisions in a CSMA MAC, which in turn will be reflected in an
increased loss rate on the on-tree links. Moreover, any real-world rate
allocation scheme also needs to be robust to variations in link quality
caused by small environmental changes. Losses on links can be repaired
using a limited number of retransmissions at the link layer, and
short-term variations in link quality can be absorbed by having a
relatively large upper bound on the number of retransmissions. However,
QCRA needs to account for the bandwidth consumed by these
retransmissions when making rate allocation decisions. Given the loss
rate on a link, QCRA can easily compute the expected number of
retransmissions on that link. To account for the variation in bandwidth
consumption due to variation in link quality, QCRA over-estimates the
expected number of retransmissions by a factor C. This parameter is
empirically and adaptively determined: QCRA determines an appropriate C
by measuring the achieved goodput during an epoch, and uses that value
to assign rates for the next epoch.
We now describe QCRA's rate allocation and rate adaptation mechanisms
in more detail.
3.3.2 Rate Allocation
An important input to QCRA is the capacity of the wireless channel.
We have found it important to empirically measure the channel capacity
by sending packets back-to-back from one node to another, rather than
relying on the nominal channel capacity determined by the radio. An
empirical measurement takes into account environmental factors, and
MAC implementation artifacts (like backoff). We denote the channel ca-
pacity by B. This measurement of channel capacity need be done only
once for a deployed network, and between only one pair of nodes.
A second input for QCRA is the network topology (more precisely, the list
of neighbors of each node), and the loss rates on the link from a node
to its parent. Before making a rate allocation decision for the first time,
QCRA computes loss rates by sending a small number of probe packets
along each tree link. In later epochs, QCRA uses loss rates observed by
actual data traffic during an epoch to perform rate allocation decisions
for the next epoch. Topology and loss-rate information can be
piggybacked on data packets in a real implementation of QCRA; we assume
that the overhead of doing this is very small relative to the amount of
data traffic transferred during an epoch.
To describe QCRA, we introduce the following notation. Consider a
network $G = (V, T, L, E, B)$, where
$V$ is the set of all nodes in the network.
$T$ is the tree constructed by the routing protocol for network $G$. For
each pair of nodes $(u, v) \in T$, node $v$ is the parent of node $u$.
$L$ is the set of link qualities from each node to its parent. We define
$l_u$ as the link loss rate from node $u$ to its parent node.
$E$ describes neighbor pairs in the network, i.e. $\forall (u, v) \in E$,
node $u$ is a neighbor of node $v$.
$B$ is the measured channel capacity for network $G$.
Define the goodput of the network as the lowest observed packet
reception rate at the base station from any node in the network. We first
calculate the traffic at each node as a function of goodput. Denote by
MAX_RT the maximum number of link-layer transmissions (in Section 3.3.1
we motivated the need for link-layer retransmissions) allowed at a node
in network $G$. Let ETX(u) be the expected number of packet transmissions
over the link from node $u$ to its parent node:
$$\mathrm{ETX}(u) = \sum_{i=1}^{\mathrm{MAX\_RT}} i \, (1 - l_u) \, l_u^{i-1} + \mathrm{MAX\_RT} \cdot l_u^{\mathrm{MAX\_RT}}.$$
We define the effective link quality as the packet reception rate where
each packet may be retransmitted up to MAX_RT times. Let EFLQ(u) be the
effective link quality of the link from node u to its parent node:
$\mathrm{EFLQ}(u) = 1 - l_u^{\mathrm{MAX\_RT}}$. We define the effective
path quality as the product of the effective link qualities of the links
along the path to the sink. Let EFPQ(u) be the effective path quality of
the path from node u to the sink:
$\mathrm{EFPQ}(u) = \prod_{v \in \mathrm{path}(u)} \mathrm{EFLQ}(v)$.
Given routing tree T, for each node u in the network, we calculate the
total number of nodes in the subtree rooted at node u, denoted by
NUM_CHILD(u). Now, suppose that the network goodput is g. Then, each node
should originate traffic at a rate given by
$\mathrm{ORG\_TRAFFIC}(u) = g / \mathrm{EFPQ}(u)$. Furthermore, the traffic
forwarded by a node would equal the total traffic from all the
descendants of u in the subtree:
$$\mathrm{FWD\_TRAFFIC}(u) = g \cdot \mathrm{NUM\_CHILD}(u) / \mathrm{EFPQ}(u).$$
The expected total traffic transmitted by a node, TOTAL_TRAFFIC(u), is
the sum of these two quantities times the estimated number of
transmissions ETX(u) for each packet.
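The per-node quantities defined above can be sketched in Python as follows. This is a minimal illustration, not the thesis implementation: the function names, the `parent` and `loss` dictionaries, and the conventions that path(u) starts at u and stops before the sink and that `num_child[u]` counts the descendants of u (excluding u itself) are all assumptions made here.

```python
def etx(loss, max_rt):
    """Expected transmissions per packet with at most max_rt attempts."""
    delivered = sum(i * (1 - loss) * loss ** (i - 1) for i in range(1, max_rt + 1))
    dropped = max_rt * loss ** max_rt       # every attempt failed
    return delivered + dropped

def eflq(loss, max_rt):
    """Effective link quality: probability that one of max_rt attempts succeeds."""
    return 1 - loss ** max_rt

def efpq(u, parent, loss, max_rt):
    """Effective path quality: product of EFLQ over the links from u to the sink."""
    quality = 1.0
    while u in parent:                      # the sink has no entry in `parent`
        quality *= eflq(loss[u], max_rt)
        u = parent[u]
    return quality

def total_traffic(u, g, parent, loss, num_child, max_rt):
    """(ORG_TRAFFIC + FWD_TRAFFIC) * ETX for node u at network goodput g."""
    path_quality = efpq(u, parent, loss, max_rt)
    originated = g / path_quality
    forwarded = g * num_child[u] / path_quality
    return (originated + forwarded) * etx(loss[u], max_rt)
```

For instance, with a loss rate of 0.5 and MAX_RT = 2, `etx` evaluates to 0.5 + 0.5 + 0.5 = 1.5 transmissions per packet and `eflq` to 0.75.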
We define an independent set as a set of nodes which can transmit
simultaneously. For each node u, QCRA uses a heuristic to find the set of
disjoint maximal independent sets (denoted by I(u)) among all its
neighbors (computing a maximum independent set is intractable). QCRA
starts with a random ordering of the neighbors of u. If a neighbor n can
be scheduled concurrently with all nodes in an independent set s
belonging to I(u), QCRA adds n to s. If no such s exists, it creates a new
singleton set consisting of n and adds it to I(u). (We discuss
alternative strategies in Section 3.3.5.)
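The greedy grouping just described might look like the following Python sketch. The function name, the `interferes` predicate and the toy interference relation are hypothetical; in particular, which pairs of nodes can be scheduled concurrently is supplied by the caller, since the text does not spell out that test.

```python
import random

def build_independent_sets(candidates, interferes, rng=None):
    """Greedily partition `candidates` into sets of mutually non-interfering nodes."""
    order = list(candidates)
    (rng or random).shuffle(order)            # random ordering, as in the text
    sets = []
    for n in order:
        for s in sets:
            # n joins the first set in which it conflicts with no member
            if not any(interferes(n, m) for m in s):
                s.append(n)
                break
        else:
            sets.append([n])                  # no compatible set: new singleton
    return sets

# Hypothetical interference relation, loosely modeled on node 6 in Figure 3.1.
links = {frozenset(p) for p in [(6, 3), (6, 5), (6, 11), (3, 11), (5, 11)]}

def interferes(a, b):
    return frozenset((a, b)) in links

groups = build_independent_sets([6, 3, 5, 11], interferes, random.Random(0))
```

With this toy relation only nodes 3 and 5 are compatible, so every ordering produces the sets (6), (3,5) and (11). On denser neighborhoods the result generally depends on the random order.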
Given I(u) for each node u, QCRA calculates the amount of traffic around
u assuming that all nodes in each independent set in I(u) can be
scheduled simultaneously. That is, for each set s in I(u), QCRA computes
the traffic for that set as:
$$\max_{v \in s} \mathrm{TOTAL\_TRAFFIC}(v).$$
The amount of traffic around u, or more precisely, the maximal number
of flows that could contend for the channel around u, is then defined as
the sum of these quantities for all s in I(u). Denote this sum by F(u).
In its final step, QCRA computes the network goodput g by solving the
equation B = F(u). Intuitively, this is the rate that can be sustained
through the bottleneck node. Then, it assigns to each node u a
transmission rate g/EFPQ(u). Thus, different nodes may be assigned
different transmission rates, with the aim of achieving fair goodput
across all nodes.
3.3.3 Rate Adaptation
The rate allocation heuristic is somewhat idealized, since it assumes that
transmissions from non-interfering nodes can be scheduled
simultaneously. To obtain a scheme that works in practice, we overestimate
the expected number of transmissions on a link by a factor C. That is,
rather than ETX(u), we use ETX(u) · C in the heuristic described above.
This overestimation accounts for packet losses arising from the use of a
CSMA MAC, as well as temporal variations in link quality within an epoch.
QCRA initially uses a value of 1 for C in the first epoch. In subsequent
epochs, it calculates C by observing the achieved goodput AGP(u) of
each node u at the end of the epoch:
$$C = \max_{u \in V} \left( \mathrm{EGP}(u) / \mathrm{AGP}(u) \right) \cdot C,$$
where EGP(u) denotes the expected goodput of node u, and the C on the
right-hand side is the value used in the epoch that just ended. It uses
this value of C in the next epoch. Intuitively, C captures the
variability in link quality within the network. Our design is somewhat
conservative, using a single parameter to capture link quality
variations across the entire network. As we shall see, even with this
conservative calculation, QCRA achieves goodputs that are near-optimal
in a sense we make precise in a later section. It is possible to compute
C values for individual links and achieve higher throughput at the
expense of fairness. We explore this briefly in our evaluation section.
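The epoch-to-epoch update of C can be written as a one-line Python function. The name `update_c` and the dictionary arguments are illustrative assumptions, and the sketch assumes every node achieved a non-zero goodput during the epoch.

```python
def update_c(c_prev, expected, achieved):
    """Scale C by the worst expected-to-achieved goodput ratio over all nodes."""
    worst_ratio = max(expected[u] / achieved[u] for u in expected)
    return worst_ratio * c_prev

# Toy epoch: node 2 achieved only two thirds of its expected goodput.
c_next = update_c(1.0, {1: 0.3, 2: 0.3}, {1: 0.3, 2: 0.2})   # about 1.5
```

Because the update is multiplicative, C stays unchanged once every node achieves exactly its expected goodput.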
3.3.4 Extensions
In this section, we discuss extensions to QCRA's rate allocation heuris-
tics.
Extension to weighted fairness: QCRA can easily accommodate
applications where different sources have different priorities or weights,
by logically duplicating a node in proportion to its weight. Specifically,
let weight(u) be the weight for node u; then we alter the computation of
the number of descendants of u as follows:
$\mathrm{NUM\_CHILD}(u) = \sum_{v \in \mathrm{subtree}(u)} \mathrm{weight}(v)$.
Most of the rest of QCRA remains unchanged, except the last step where
we assign the transmission rate:
$\mathrm{rate}(u) = (g / \mathrm{EFPQ}(u)) \cdot \mathrm{weight}(u)$.
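The two changes for weighted fairness can be sketched in Python as below. The function names and the `children` map are hypothetical, and the sketch assumes that subtree(u) ranges over the descendants of u and does not include u itself, which the text leaves open.

```python
def weighted_num_child(u, children, weight):
    """Sum of weights over the descendants of u (assumed to exclude u)."""
    total = 0
    for c in children.get(u, ()):
        total += weight[c] + weighted_num_child(c, children, weight)
    return total

def weighted_rate(u, g, efpq, weight):
    """Transmission rate of node u: (g / EFPQ(u)) * weight(u)."""
    return g / efpq[u] * weight[u]
```

With all weights equal to one, `weighted_num_child` reduces to a plain descendant count and `weighted_rate` to g/EFPQ(u).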
Extension to multiple sinks: QCRA works unmodified for the case
when multiple sinks exist. The only difference is that the rate
allocation heuristic must consider neighboring nodes on different trees.
However, when there are multiple trees in the network, it is inefficient
to assign each node in every tree the same goodput. We now discuss how
QCRA computes efficient rate allocations, where nodes whose flows are
unconstrained by the same bottleneck are assigned higher rates.
Efficient rate allocation: To assign efficient max-min fair rates, we
recursively apply our rate allocation heuristic in a manner that we now
describe. Recall from Section 3.3.2 that QCRA computes the network
goodput g based on the traffic around that node u which has the lowest
goodput. Intuitively, u is the bottleneck node in the topology. To assign
efficient rates, we use the following procedure:
1. Assign g to u, its descendants, all neighbors of u, and their
descendants. Mark all these nodes.
2. If there are any unmarked nodes in the network, recursively invoke
the rate allocation heuristic on the unmarked nodes, using the
assigned goodputs for the marked nodes. This procedure will identify
another bottleneck node u'. Repeat these two steps on u' and apply
this procedure until no unmarked nodes remain.
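The marking loop in the two steps above can be sketched in Python as follows. Only the bookkeeping is shown: `find_bottleneck` is a hypothetical callback standing in for the rate allocation heuristic of Section 3.3.2, which is expected to return an unmarked bottleneck node and its goodput given the goodputs already assigned. The iterative form and all names are assumptions of this sketch.

```python
def efficient_allocation(nodes, descendants, neighbors, find_bottleneck):
    """Assign goodputs bottleneck by bottleneck until every node is marked."""
    nodes = set(nodes)
    goodput = {}                                  # marked node -> assigned goodput
    while len(goodput) < len(nodes):
        u, g = find_bottleneck(goodput)           # hypothetical heuristic callback
        if u in goodput or u not in nodes:
            raise ValueError("bottleneck must be an unmarked node")
        group = {u} | descendants[u]
        for n in neighbors[u]:
            group |= {n} | descendants[n]
        for n in group & nodes:
            goodput.setdefault(n, g)              # earlier assignments are kept
    return goodput
```

Nodes marked in an earlier round keep their goodput, so later and less constrained bottlenecks only affect nodes that are still unmarked.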
3.3.5 Limitations and Discussions
QCRA is robust to link quality fluctuations within an epoch. However,
its rate allocation needs to be invoked whenever a node fails or when
the routing tree changes significantly. In cases where this happens
infrequently, QCRA might be practicable. In Section 3.4.5, we evaluate
the performance of QCRA over a relatively stable dynamic routing
protocol. Our evaluations show that with fewer than eleven nodes
changing routes between consecutive epochs, the performance of QCRA
is still comparable to that with static routing. However, QCRA is
likely to fail in highly dynamic settings where most links vary in quality
dramatically on the order of minutes. In such networks, it might be
possible to ensure stable link qualities through transmit power control,
an approach we have not tried.
QCRA's rate allocation heuristic ignores hidden terminal effects. For
example, consider a topology with three nodes u, v, and w. Node u is
the parent of node v, and a neighbor of node w, while node v and node
w are not neighbors. Now, when node v and w transmit simultaneously,
u may not be able to hear w. Yet, QCRA may assign v and w to the same
independent set, assuming that they could transmit simultaneously.
Despite this, as we shall see, QCRA performs surprisingly well.
QCRA also uses a simple heuristic to compute its independent sets.
When computing these sets, it considers the neighbors of node u in
random order. Arguably, better independent sets might result if the
neighbors were considered in order of decreasing TOTAL_TRAFFIC. In
practice, as we shall see, this simplification does not result in reduced
performance. Intuitively, this is because the actual transmission
scheduling by the CSMA MAC introduces so many inefficiencies in practice
that a more sophisticated independent set calculation does not result in
improved performance. In Section ??, we compare the performance of QCRA
with a heuristic that uses a much more sophisticated independent set
calculation. Our evaluation shows that the heuristic with the more
sophisticated independent set calculation fails to accurately estimate
the rate allocation.
3.4 Evaluation
In this section, we evaluate the performance of QCRA on a 40-node
wireless sensor testbed. Each node in our testbed is a Tmote with an
8 MHz Texas Instruments MSP430 microcontroller, 10 KB RAM and a
2.4 GHz IEEE 802.15.4 Chipcon wireless transceiver with a nominal bit
rate of 250 Kbps. These motes are deployed over 1125 square meters
of a large office floor (Figure 3.2). In our experiments, the packet size
is 48 octets, and the measured radio capacity in our testbed is about 38
packets per second. We added nodes 41 and 42 to replace two dead nodes,
4 and 35.
In our implementation, QCRA's rate allocation and adaptation heuristics
are implemented on a PC connected to our sensor network testbed. As an
aside, the computational complexity for rate calculation in QCRA is
$O(d \cdot N)$, where d is the density of the network. Even with our
non-optimized implementation, QCRA can calculate rates for our 40-node
experimental topologies within 3 seconds. On the sensor nodes
themselves, we have implemented software to generate data at a
pre-programmed rate, and software to estimate link quality and to
measure and report topology information.
Figure 3.2: Layout of the testbed
Figure 3.3: Example link quality changes on the testbed during daytime and
nighttime. Panels: (a) Daytime Link Dynamics; (b) Nighttime Link Dynamics.
Each panel plots the percentage of link quality change against link ID.
We have extensively tested QCRA on various routing trees with different
sizes during different times of different days. Here we present the results
mainly on a 40-node network, but our experiments were conducted at
different times (day and night).
Our testbed is a fairly harsh wireless environment. It is installed within
an office building, where several research labs use various wireless
devices. Figure 3.3 plots the changes of link qualities after every hour
for about five hours. We use packet reception rate as the metric for link
quality. The y-axis plots the percentage of the link quality change, and
the x-axis depicts the link ID. Figure 3.3(a) was measured during one day
in our testbed, and the gap between each measurement is about half an
hour. As we can see from the figure, the network is volatile during the
day; changes in reception rate by 10-15% are quite common, and some
links change in reception rate by more than 50%. At night, on the other
hand, the testbed is relatively stable (Figure 3.3(b)).
Figure 3.4: Evaluation of QCRA. Panels (a) to (h) plot per-node goodput
(pkts/sec) against node ID for epochs 1 to 6: (a) RouteA, Day One;
(b) RouteA, Day Two; (c) RouteB, Day One; (d) RouteB, Day Two;
(e) RouteA, Night One; (f) RouteA, Night Two; (g) RouteB, Night One;
(h) RouteB, Night Two. Panel (i) plots the standard deviation of goodput
against epoch ID for all eight experiments.
Figure 3.5: Routing Trees. (a) Routing Tree A; (b) Routing Tree B.
To understand the performance of QCRA, we conducted experiments
both during the day and at night. In each experiment, we fixed the
routing tree†, and ran QCRA over six epochs of 15 minutes each. For
each experiment, we plot the achieved goodput per flow. Such a plot
shows the accuracy of QCRA's rate assignment and shows the ability of
QCRA to dynamically adapt to overall network conditions across epochs.
† In preliminary experiments not reported here, we have found that QCRA works well
over dynamic topologies in which the routing protocol adapts to link quality dynamics.
A detailed exploration of QCRA's performance under dynamics is left to future work.
3.4.1 Performance of QCRA
In this section, we describe results from two very different topologies as
shown in Figure 3.5(a) and Figure 3.5(b). Notice that these topologies
have a fairly large hop-diameter (7 hops in both cases) and are fairly
unbalanced; in this sense, they are non-trivial.
Experiment   Epoch 1   Epoch 2   Epoch 3   Epoch 4   Epoch 5   Epoch 6
rtA-D1       1.0       1.0       1.1       1.1       1.1       1.4
rtA-D2       1.0       1.0       1.0       1.2       1.4       1.4
rtB-D1       1.0       1.5       1.6       1.7       1.8       1.9
rtB-D2       1.0       1.7       1.8       1.9       2.0       2.1
rtA-N1       1.0       1.4       1.5       1.6       1.8       1.9
rtA-N2       1.0       1.3       1.5       1.5       1.5       1.6
rtB-N1       1.0       1.5       1.8       2.0       2.3       2.4
rtB-N2       1.0       1.3       1.5       1.6       1.7       1.8
Table 3.1: Values of rate adaptation parameter C
Figure 3.4 shows two days' and two nights' worth of experiments for the
two topologies. Each sub-figure in Figure 3.4 gives per-flow goodput for
six epochs. The x-axis is the source node ID, and the y-axis is the
goodput of the flow from the corresponding source node. Table 3.1 lists
the maximum rate adaptation parameter C for each epoch of these
experiments. In the first epoch of each experiment, of course, the value
of C is always 1.
Figure 3.4(i) plots the standard deviation of the goodput achieved by all
flows for all eight experiments, each of which has six epochs. We use
standard deviation as a metric for fairness. Because C is 1 during
the first epoch, the standard deviation of achieved goodput is higher in
general for the first epochs of all experiments. For most other epochs,
the standard deviation ranges from 0.005 to 0.01, with one exception
which ranges up to 0.02. This indicates that most flows have goodput
within 0.015 packets/sec of the average, which itself ranges from 0.15 to
0.3. These results show that QCRA's rate allocation heuristic works
well; nodes are assigned transmission rates that result in fair achieved
goodput.
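The fairness metric used here is easy to compute from per-flow goodputs. The Python sketch below uses the population standard deviation from the standard library; whether the reported numbers use the population or the sample form is not stated in the text, so that choice, like the function name, is an assumption.

```python
from statistics import mean, pstdev

def fairness_summary(goodput):
    """Return (average goodput, standard deviation) over all flows."""
    values = list(goodput.values())
    return mean(values), pstdev(values)

# Toy per-flow goodputs in packets/sec (not measured data).
avg, spread = fairness_summary({1: 0.19, 2: 0.20, 3: 0.21})
```

A smaller standard deviation relative to the average means the flows were treated more evenly.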
As discussed above, except in Figure 3.4(a) and Figure 3.4(b), the
goodput of different flows in epoch 1 varies dramatically due to the lack
of rate adaptation. This, of course, is due to the fact that C is 1 in
the first epoch. It illustrates the importance of C: without a factor
that accounts for CSMA collisions and link quality variability, QCRA rate
allocation is often erratic. The exceptionally good results in
Figure 3.4(a) and Figure 3.4(b) correspond to experiments conducted
during a holiday weekend when our office building was virtually devoid of
activity. This is more evident in Table 3.1, where the C parameter for
these two days is lower than for the other experiments.
For the experiments shown in Figures 3.4(c), 3.4(d), 3.4(e), 3.4(f) and
3.4(h), the goodputs in epochs 3 through 6 are nearly identical, which
suggests that QCRA is able to find a stable goodput of the network within
a small number of epochs. There is one exception; the goodput shown in
Figure 3.4(g) varies more. After careful inspection, we found that one of
the links in the network has an exceptionally high loss rate, and QCRA
takes a longer time to adapt. Later in this section, we discuss a strategy
that computes per-link C values, allowing QCRA to converge faster and
to assign higher goodputs, at the cost of lower fairness overall.
Finally, another way to understand QCRA's performance is to examine
the achieved channel utilization. Out of 48 epochs in total, the total
traffic around the most congested node in 43 of those epochs was no less
than the measured 38 packets per second. This shows that QCRA's rate
allocation achieves high channel utilization. Of the remaining epochs,
the lowest total traffic around the most congested node is around 30
packets per second. During these epochs, QCRA conservatively traded
goodput for fairness because some links with high losses were observed.
3.4.2 Comparison with IFRC
In this subsection, we compare QCRA against IFRC [58], a practical
distributed rate allocation scheme which can adapt to link quality and
routing dynamics.‡ We perform this comparison on a 30-node network.
‡ We do not compare with rates achieved through precise transmission scheduling
using TDMA, since our aim is to find the highest achievable rate using CSMA.
Figure 3.6: Performance of QCRA and IFRC (goodput in pkts/sec against
node ID, for IFRC and for QCRA epochs 2 to 6)
Figure 3.7: Local rate on each node in IFRC (rate in pkts/sec against
time of day, 10:10 to 11:50)
We used a routing tree similar to routing tree A in Figure 3.5(a),
but with fewer nodes. Both experiments were conducted at night, to
minimize differences due to changes in wireless propagation
characteristics.
Figure 3.6 plots the goodput of all flows from both IFRC and QCRA.
IFRC achieves a minimum goodput across all nodes of 0.2653 packets/sec,
while QCRA achieves 0.3997 packets/sec (the first epoch is omitted since
no rate adaptation is performed during that epoch), an improvement of
roughly 50% over IFRC.
Using a more detailed view of IFRC's performance, we are also able to
quantify how far QCRA's performance is from the optimal. Consider the
rate allocation measured from our IFRC experiments as shown
in Figure 3.7. IFRC uses AIMD to find a stable operating rate; the rate
assigned to each node varies in a classic saw-tooth fashion. In steady
state, these saw-tooth variations are upper bounded by a rate of about
0.45 packets/sec. Since the saw-tooth peaks are where IFRC detected
congestion, it is safe to say that the maximum goodput of the network
is no larger than 0.45 packets/sec. Using this as an empirically derived
optimal network fair goodput, Figure 3.6 shows that QCRA can achieve
within thirteen percent of the optimal in the worst case. Indeed, many
epochs in our experiments achieved nearly optimal goodput.
3.4.3 Evaluation of Extensions
In this section, we evaluate QCRA extensions to multiple sinks and
weighted fairness.
Figure 3.8: Evaluation of multiple sinks extension. (a) Routing tree;
(b) Performance (goodput in pkts/sec against node ID, for Sink-A and
Sink-B).
Multiple sinks: We evaluate our extension to multiple sinks together
with our extension for efficient rate allocation. Figure 3.8(b) shows the
goodput achieved by all 38 source nodes in a 40-node network with the
routing structure shown in Figure 3.8(a) (two of the nodes are sinks). As
shown in the figure, most of the nodes associated with the same sink
achieved roughly the same goodput, and nodes associated with different
sinks achieved different goodput. Due to the layout of the testbed
(Figure 3.2), the sinks are at opposite ends of the testbed. So most nodes
in the routing tree rooted at sink B (which has far fewer nodes) do
not contend with the nodes associated with sink A, and therefore get
higher goodput due to our efficient rate allocation. Further, a subset
of nodes associated with sink A (nodes 2-6 and nodes 21-28) achieved
higher goodput without affecting the goodput of the other nodes. For our
topology and routing trees, these nodes are not constrained by the
bottleneck node and are therefore assigned higher but equal rates by our
rate allocation scheme.
Weighted Fairness: For evaluating weighted fairness, we used routing
tree A and randomly assigned half of the nodes weight one and the other
half weight two. Figure 3.9 shows the goodput of all flows for
two epochs. From the figure, we can see that the nodes with weight two
achieve approximately twice the goodput of nodes with weight one. For
clarity, in Table 3.2 we also list, for all six epochs, the
average goodput of each subset of nodes and their ratio.
Figure 3.9: Evaluation of weighted fairness extension (goodput in
packets/sec against node ID)
Epoch ID   Weight One Set   Weight Two Set   Ratio
1          0.15             0.3              2.0
2          0.15             0.31             2.1
3          0.073            0.14             1.9
4          0.07             0.13             1.9
5          0.068            0.13             1.9
6          0.065            0.12             1.8
Table 3.2: Average goodput of weighted fairness evaluation
Figure 3.10: Evaluation of per-link rate adaptation (goodput in
packets/sec against node ID, epochs 1 to 6)
Per-link C values: In some pathological cases, when the tree contains
high loss-rate links, QCRA might take several epochs to converge, as one
of our experiments showed above. One approach to faster convergence
might be to assign separate C values to each link. Thus, for the link from
u to its parent p, we can define:
$$C(u) = \frac{\mathrm{EGP}(u)/\mathrm{AGP}(u)}{\mathrm{EGP}(p)/\mathrm{AGP}(p)} \cdot C(u)$$
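A per-link version of the C update might be sketched in Python as follows. The function name and arguments are hypothetical. The formula does not say what to use when the parent p is the sink, which has no goodput of its own, so this sketch assumes a ratio of 1 in that case.

```python
def update_per_link_c(c_prev, parent, expected, achieved):
    """Scale each node's C by its goodput ratio relative to its parent's ratio."""
    ratio = {u: expected[u] / achieved[u] for u in expected}
    c_next = {}
    for u, c in c_prev.items():
        # Assumption: a parent without measurements (e.g. the sink) has ratio 1.
        parent_ratio = ratio.get(parent.get(u), 1.0)
        c_next[u] = ratio[u] / parent_ratio * c
    return c_next
```

Dividing by the parent's ratio attributes to the link from u only the shortfall that is not already explained by losses further up the path.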
Figure 3.10 depicts the goodput achieved in one experiment (six epochs)
conducted during the day using per-link rate adaptation for routing tree
B (Figure 3.5(b)). The high goodput at one node is due to a transient
hardware glitch on node 6. On this link, the loss rate reached as high as
92%, and later recovered by itself; the link loss rate to its parent varied
from 35.7% to 65.4%. Despite this, QCRA converged to a fair rate
allocation within one epoch. Experiments (Figures 3.4(c) and 3.4(d))
conducted with the same routing tree during roughly the same time with
a single C achieved minimum goodputs of 0.174 packets/sec and 0.2176
packets/sec respectively. The minimum goodput with per-link C values
is about 0.2419 packets/sec, significantly higher. However, this approach
exhibits slightly higher variability: the average standard deviation with
per-link C values is 0.0147 in this case, as against 0.01 with the global C.
3.4.4 Comparison with other sophisticated heuristics
As mentioned in Section 3.2, a large body of literature examines optimal
transmission rates. The difference between QCRA and other sophisticated
heuristics lies in the fact that QCRA is specifically tuned for working in
real networks. In this section, we consider one sophisticated lower-bound
heuristic proposed in [32]. The key idea of the lower-bound heuristic is
to use brute force to iteratively select many distinct independent sets.
Then, similar to a maximum flow problem, it solves a linear program that
schedules each independent set such that the minimum goodput is
maximized. By running many iterations, it is supposed to find enough
distinct independent sets that the schedule among these independent sets
approaches the optimal solution. We slightly modified the heuristic to
incorporate retransmissions. In our evaluation, we ran only 2000
iterations of the independent set search.
For comparison, we first ran QCRA on a 40-node testbed during the
night to eliminate the effect of the environment as much as possible; then
we took the network information, such as link quality, from the last epoch
and computed an estimated rate assignment using the lower-bound heuristic
proposed in [32]. We ran one epoch of the experiment with this rate
allocation and compared its performance to that of the last QCRA epoch.
Figure 3.11 shows two such instances.
Figure 3.11: Comparison of QCRA with Lower-Bound heuristic. (a) Instance 1;
(b) Instance 2. Each panel plots assigned and achieved goodput (pkts/sec)
against node ID for QCRA and for the lower-bound heuristic.
As the figures show, the more sophisticated heuristic assigns higher
goodput than can be achieved in the network, while QCRA estimates the
network goodput more accurately. In both cases, the minimum
achieved goodputs are roughly the same. The average ratio of achieved
goodput to assigned goodput is 1.05 and 1.20 for QCRA in the two
instances, while it is 0.78 and 0.75 for the lower-bound heuristic.
Overestimating the network goodput assigns much higher rates to nodes,
which in turn wastes node energy on useless work. As our experiments
show, sophisticated centralized rate allocation heuristics
fail to accurately estimate the rate allocations in real networks.
3.4.5 Performance of QCRA with dynamic routing
In this section, we describe the results from running QCRA with dynamic
routing over two days and two nights. We implemented our dynamic
routing on top of MultiHopLQI. Our implementation of dynamic routing
decreases the number of occurrences of loops, breaks loops when they
happen, and builds stable routes.
In detail, we add a hold-down timer for re-establishing a new route
when the old path becomes unavailable. A hold-down timer is a classical
way for simple routing protocols such as distance-vector routing to solve
the count-to-infinity problem. In order to overcome the loss of routing
beacons under high traffic, we add feedback from the forwarding engine to
take into account the reception of data packets. In order to increase the
stability of the routing, a tag is added to indicate whether or not a route
is stable. A node switches to another path only during a period
when it does not have a stable route. A route becomes stable only
when it is observed to have persistently good link and path
quality for a certain period of time. A route becomes unstable when it
observes unacceptable link quality or when the path to the sink becomes
unavailable.
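The stability tag can be pictured as a small state machine. The Python sketch below is purely illustrative: the class name, the quality threshold, the number of good observations required, and the use of a single quality value in place of separate link and path quality are all assumptions, and the hold-down timer is not modeled.

```python
class RouteState:
    """Tracks whether a node's current route is tagged as stable."""

    def __init__(self, min_quality=0.7, stable_after=3):
        self.min_quality = min_quality      # hypothetical acceptability threshold
        self.stable_after = stable_after    # hypothetical count of good observations
        self.good_streak = 0
        self.stable = False

    def observe(self, quality, path_available=True):
        """Record one observation of the current route."""
        if not path_available or quality < self.min_quality:
            self.good_streak = 0            # unacceptable quality or lost path
            self.stable = False
        else:
            self.good_streak += 1
            if self.good_streak >= self.stable_after:
                self.stable = True          # persistently good: tag as stable

    def may_switch_parent(self):
        """A node may change its path only while its route is not stable."""
        return not self.stable
```

A routing layer built this way would consult `may_switch_parent()` before acting on a better-looking beacon, which is what keeps established routes from flapping.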
Figure 3.12: Route changes during QCRA performance evaluation with
dynamic routing (number of nodes with route change against epoch ID, for
Day1, Day2, Night1 and Night2)
The evaluation of QCRA with dynamic routing used per-node rate
adaptation with a fixed maximum number of retransmissions. We use a fixed
maximum number of retransmissions to decrease the effect of route
changes, and hence of changes in parent link quality, while using the
per-link rate adaptation to estimate the expected path quality for each
node.
Figure 3.13: Performance of QCRA with Dynamic Routing. Panels (a) Day1,
(b) Day2, (c) Night1 and (d) Night2 each plot goodput (packets/sec)
against node ID for epochs 1 to 6.
We evaluated the performance of QCRA with this modified dynamic routing
based on MultiHopLQI on our 40-node testbed. Figure 3.12 shows
the number of route changes between consecutive epochs. As we
have mentioned, our testbed environment is quite harsh. Even with our
stability modifications to the dynamic routing protocol, route changes
are still quite frequent. In fact, in the worst case, more than half of
the nodes change their path to the base station. Figure 3.13 shows the
results of QCRA on two days and two nights. As shown in the figures, the
performance of QCRA is better during the nights due to less environmental
noise. Even in the daytime experiments, the ratio of maximum achieved
goodput to minimum achieved goodput is less than 2. We can expect QCRA to
perform much better in a more stable environment.
3.5 Summary
In this chapter, we present the design and evaluation of QCRA, a
quasi-static centralized rate assignment scheme for fair and maximum
rate allocation in WSNs. QCRA includes a novel lightweight rate
assignment heuristic to compute fair and efficient rates, and a rate
adaptation mechanism to address wireless link quality variation over
longer time scales. Using extensive experiments, we show that QCRA
performs well on networks of up to 40 nodes.
Chapter 4
HLR: Using Hierarchical
Location Names for Scalable
Routing and Rendezvous in
Wireless Sensor Networks
4.1 Introduction
Data-centric abstractions for routing and storage have received a fair
amount of attention in the research literature. Data-centric routing sys-
tems such as Diffusion [30] and TinyDB [44] have been used in many
sensor network deployments. A body of literature has proposed a com-
plementary class of data-centric storage systems [70, 62, 41] that sup-
ports the construction of distributed hash tables and indices for scalable
querying.
While many advances have been made in designing data-centric abstrac-
tions, not much attention has been paid to the underlying packet rout-
ing and rendezvous primitives. Existing implementations of data-centric
abstractions use two kinds of packet routing primitives: flooding and
geographic routing. Anecdotal evidence from current deployments suggests
that flooding adversely impacts performance even in networks
with tens of nodes. The alternative, geographic routing using protocols
like GPSR [35], requires assigning position information to nodes. Such
information is generally expected to be dynamically computed using an
ad-hoc localization system [5, 27, 39], or a system that assigns virtual
coordinates [59, 50] for routing purposes. These systems are currently
the subject of active research, and practical deployments are perhaps a
few years away.
Here we consider an alternative approach to providing routing
primitives for data-centric abstractions. Our approach is based on the
observation that without dynamically computed node positions, most near-
to medium-term sensor network deployments will configure node
locations*. Node location provides context for the data collected from
the sensor network. Such location information is often loosely
associated with geography or topography. Thus, in a habitat monitoring
network, a node might be located within the “chaparral” region or within
a “riparian” region. In an in-building network, the location of a node
may be specified by floor and wing (e.g. 13th floor, west wing).
Furthermore, location names often have a natural hierarchy. In a habitat
monitoring network, such a hierarchy might be defined by, for example, a
quadrant of the habitat, followed by a section, and within it a
particular cluster of nodes. In a building network, the hierarchy might
be defined by floors, wings and rooms. A hierarchical location naming
scheme is more user-friendly than a system in which nodes are manually
assigned positions. In fact, we know of at least two deployments [79, 1]
that use such a naming scheme to assign spatial context to sensor
readings.
In this chapter, we consider deployments where nodes are configured†
with hierarchical location identifiers (HLIs). An HLI is simply a
machine-readable form of a hierarchical location name. Thus, a sensor
node in a building might be assigned an HLI of the form 5:4:10 where 5
denotes the fifth floor, 4 denotes the east wing, and 10 denotes the
10th room on the east wing.
* We use position to denote the precise position of a node in some
geometric coordinate system, and location when other forms of expressing
where a node is situated are used.
† In Section 4.2, we discuss mechanisms for configuring these nodes that
do not require significant manual intervention.
The central thesis of this chapter is that these HLIs can be used to build
a scalable routing system (which we call HLR) for sensor networks. Ob-
serve that the HLI hierarchy can be modeled as an area hierarchy [37].
Imposing an area hierarchy on a network is a well-studied way of scaling
routing protocols in wired and wireless networks. In HLR, the location
naming hierarchy implicitly defines an area hierarchy; by contrast, in
wired networks, other factors such as organizational boundaries or ca-
bling costs might determine the design of area hierarchies. Thus, in our
example above, a node 5:4:10 is in the 5-th top-level area, the 4-th second
level area within the top-level area and so on. Nodes in an area hierar-
chy maintain detailed routing information about nodes within their area,
and less detail about nodes outside their area. In HLR, for example, the
node 5:4:10 would have a routing table entry for all nodes within the area
5:4, one entry for each of the sub-areas of 5, and one entry for each of the
top-level areas.
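One way to picture how such a table might be consulted is a longest-prefix match on the destination HLI. The sketch below is an assumption made for illustration: the text here describes what the table contains, not the lookup procedure, and the function name lookup and the next-hop labels are hypothetical.

```python
def parse_hli(text):
    """Turn an HLI string such as '5:4:10' into a tuple of integers."""
    return tuple(int(part) for part in text.split(":"))


def lookup(table, dest):
    """Return the next hop for dest using the longest matching HLI prefix.

    table maps HLI prefixes (tuples) to next hops. Node entries are full
    HLIs; aggregated entries are shorter prefixes naming an area.
    """
    for length in range(len(dest), 0, -1):
        entry = table.get(dest[:length])
        if entry is not None:
            return entry
    return None


# A hypothetical fragment of the routing table at node 5:4:10.
table = {
    parse_hli("5:4:3"): "neighbor-a",  # a node inside the node's own area 5:4
    parse_hli("5:2"): "neighbor-b",    # aggregated entry for sub-area 5:2
    parse_hli("3"): "neighbor-c",      # aggregated entry for top-level area 3
}
```

With this table, a packet for 5:2:9 matches the aggregated entry 5:2, and a packet for 3:1:1 matches the top-level entry 3.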
HLR constructs and maintains these routing tables using a variant of
the distance-vector routing protocol DSDV [55]. While the basic design
of HLR borrows heavily from the routing literature for wired networks,
it incorporates two novel features. The first is a technique for
automatically aggregating routing entries at area boundaries that allows
neighboring areas to maintain summarized views of an area. The second is
a mechanism for routing to partitioned areas; classical area-hierarchy
based algorithms assume that areas are connected.
Using the routing tables that HLR constructs, it is possible to provide a
variety of packet routing primitives: unicast to a specified node within
the network, broadcast or anycast to a specified area, rendezvous using
a random hash or a locality-preserving mapping. Particularly novel in
HLR is the design of the rendezvous primitives, since previous designs
of such primitives for sensor networks leveraged geographic position-
ing. These primitives can be used for data-centric routing systems like
Diffusion and TinyDB, as well as for data-centric storage systems like
GHT [62] and DIM [41].
We have implemented HLR in TinyOS, and have implemented simplified
versions of data-centric routing and storage systems that use HLR's
routing primitives. We use extensive simulations to compare the
performance of HLR-based data-centric routing and storage to systems
that use geographic routing. We find that the performance of the two
classes of systems is comparable; while aggregated route entries
increase the average path length in HLR, geographic routing based
rendezvous sometimes incurs significant overhead in walking the outer
perimeter. We also evaluate the behavior of HLR under dynamics, finding
that route changes caused by link failures can often be constrained to a
small area and are not propagated throughout the network. Finally, we
report experiences from running HLR on a small-sized network of Mica2
motes. Taken together, these results imply that HLR is a viable routing
layer for many kinds of sensor networks that can be immediately employed
in near-term sensor network deployments.
Our reliance on configured node addresses may seem awkward, given the
networking community's experience with manual configuration in the
Internet context. We make two observations in our defense. First, many
Internet components (the backbone routing system, the name system) are
still manually configured. Second, unlike the Internet, which comprises
different administrative organizations, sensor networks are likely to be
managed and deployed by one organization. Furthermore, until precise
self-localization technology is deployed, we expect that sensor network
deployments will need to be carefully planned, with human involvement in
identifying each node's position. Given this, it is a small step to
configure these positions on nodes (and techniques can be developed to
reduce the error in this configuration step).
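[Figure 4.1 diagram: a sensor network with top-level areas 1, 2, and 3; area 2 contains node 2.1.0 and nodes 2.2.1, 2.2.2, and 2.2.3; routing table entries labeled 1, 3, 2.1, 2.2.1, 2.2.2, and 2.2.3]
Figure 4.1: Example: a sensor network with HLI and routing table of node 2:2:1
built by HLR
[Figure 4.2 diagram: the same network with area 1 expanded to show 1.1, 1.2, 1.3.1, 1.3.2, 1.4, and 1.5.1]
Figure 4.2: The same sensor network in Figure 4.1 with more details in area 1
shown.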
4.2 Overview and Related Work
In this section, we discuss the feasibility of Hierarchical Location
Identifiers (HLIs) for sensor network deployments. Then, we briefly list
the routing primitives HLR provides and how we make use of HLIs to build
HLR. Finally, we compare HLR to other related work.
4.2.1 Feasibility Discussion
A fundamental premise behind our approach is that most sensor network
deployments will need to associate nodes with names in order to make
sense of generated data. Often, these names will have location
information embedded in them. An example of such a name is: (Residence
Hall 1).(Third Floor).(West Wing).(Sensor 5). Such names have two
natural properties that reflect the way humans think about sensory data
acquisition: they are hierarchical, and the location information
embedded in them is usually imprecise (i.e. not a position in some
coordinate system). We term the machine-readable numeric ID translated
from such a hierarchical string name a Hierarchical Location Identifier
(HLI).
Here, we observe that these hierarchical identifiers (HLIs) indicate
approximate topological proximity of the nodes and can therefore be
leveraged to build scalable routing primitives for wireless sensor
networks. Before we discuss how to make use of HLIs to support scalable
routing primitives, we first discuss mechanisms for assigning HLIs to
nodes.
Consider a sensor network deployment in a building or a habitat. We
expect that most such deployments will be planned: a domain expert will
need to determine where to place the sensors, how many to place, etc.
When planning this deployment, the network administrator needs some way
to associate the data received from a sensor node with its location. In
the absence of localization, the administrator will likely have to
manually create a “database” that maps node identifiers to
human-readable location identifiers. These location identifiers are
often hierarchical, and we argue that it is feasible to use this
database to configure HLIs for nodes.
This HLI configuration can be automated in several ways. For example,
the administrator can “zap” each individual sensor device with its HLI
before deployment. Alternatively, one can design a simple bootstrap
protocol (similar to DHCP) by which a node obtains its HLI. The design
of such a protocol is well understood but is beyond the scope of this
chapter. In this way, HLR simply leverages the fact that most sensor
network deployments will be planned, and does not add any human
involvement beyond what such deployments will require anyway.
4.2.2 Overview
Our work shows that, once nodes are equipped with HLIs, scalable
routing can be designed for wireless sensor networks. The key insight
behind routing using HLIs (HLR) is that one can use HLIs to
automatically and dynamically build an aggregated routing table that
scales well with network size. Behind this insight lies the assumption
that hierarchical location naming is also approximately topologically
congruent to node placement. That is, all nodes whose HLIs begin with 1
are situated within some well-defined geographic region. In our example,
all such nodes would be within Residence Hall 1. This property also
applies recursively, so that all nodes whose HLIs start with 1:3 are on
the 3rd floor of Residence Hall 1. In situations where this is the case,
HLR can build compact and accurate routing tables. For example, in the
sensor network shown in Figure 4.1, by running HLR, node 2:2:1's routing
table will contain one entry for the whole of area 1, one entry for area
3, and one entry for its sibling area 2:1, in addition to one entry for
each of the other nodes in its own area, such as 2:2:2 and 2:2:3. So,
assuming the hierarchy of the network is appropriately designed, the
size of the routing table can grow logarithmically with network size.
Scalability is not the only advantage of HLR. In addition to supporting
unicast, HLR can also be used to support area-based multicast or
anycast. Unicast can be used for tasking individual nodes. Area-based
multicast enables any node in the network to deliver a message to the
subset of nodes that share a common prefix in their HLIs. For example,
area-based multicast can be used in TinyDB to deliver a query to a set
of nodes around a monitored plant to start collecting data without
having to localize any node within the queried area. This primitive
cannot easily be supported by geographic routing that relies on
location information.
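The membership test behind area-based multicast is a plain prefix comparison on HLIs. The following minimal sketch uses hypothetical helper names (in_area, multicast_recipients) and represents HLIs as integer tuples; it only shows which nodes a message addressed to an area would reach, not how HLR forwards it.

```python
def in_area(hli, area):
    """True if the node's HLI starts with the area's prefix."""
    return hli[:len(area)] == area


def multicast_recipients(nodes, area):
    """All nodes that share the given area prefix."""
    return [node for node in nodes if in_area(node, area)]


# The nodes of area 2 as labeled in Figure 4.1.
nodes = [(2, 1, 0), (2, 2, 1), (2, 2, 2), (2, 2, 3)]
```

A message addressed to area 2:2 would reach 2:2:1, 2:2:2, and 2:2:3, while one addressed to area 2 would reach all four nodes.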
Multicast is not the only additional advantage of HLR. In this chapter,
we also show how easily we can leverage HLR to support rendezvous-based
primitives such as hash lookup and data-locality preserving hashing,
which are important building blocks for data-storage systems proposed
for sensor networks. Hash lookup can be used to implement functionality
equivalent to a single-key data storage system such as GHT [62], while
data-locality preserving hashing can be used to build a
locality-preserving data storage system that supports multi-dimensional
queries, such as DIM [41]. In addition, these rendezvous primitives are
building blocks for other in-network aggregation techniques. For
example, hash lookup can be used to implement aggregation of data from
within an area, by sending data to, say, the smallest identifier within
the area.
To our knowledge, these are all the primitives that have been proposed
for use for data dissemination and querying in sensor networks. That
HLR can support them without requiring ad-hoc localization systems is
its main selling point.
The challenges in the design of many of these primitives, and in the
design of HLR itself, lie in dealing with dynamics like route changes
and partitioned areas. We discuss in detail the design of basic HLR in
Section 4.3 and the design of routing and rendezvous-based primitives
in Section 4.4.
4.2.3 Related Work
In this section, we discuss related work that has inspired the design
of HLR and contrast our work with other sensor network routing
proposals.
Hierarchical routing has been a subject of research for decades [37,
81, 64]. Today's Internet, for example, is built on top of hierarchical
routing schemes such as BGP [64] and OSPF [48]. A hierarchical routing
scheme such as ours will provide the scalability needed by large-scale
sensor networks.
The ad hoc networks community has also been working on hierarchical
network organizations [57, 54] where the primary goal is to provide a
reliable communication infrastructure for node mobility management.
For a large class of sensor network applications, mobility is not an issue,
and HLR does not attempt to solve the dynamics resulting from node
mobility.
Clustering schemes [81, 28, 26] represent a related part of the literature.
In general, such schemes (particularly as proposed for sensor networks)
are somewhat orthogonal to HLR, since they are explicitly focused on
node energy management. HLR clusters nodes into areas for routing
information scaling. However, the design of rendezvous primitives in
HLR can also be applied to those hierarchical clustering schemes.
Geographic routing schemes [35] are complementary to HLR. They rely on
precise position information, while HLR relies on logical location
names. Automatically determining position information (localization) is
still the subject of much research. Of course, a sensor network
deployment could use configured precise position information, but this
might require significant manual labor, especially in environments where
GPS signals might not be readily available.
Recently, several virtual coordinate schemes have been proposed to
support stateless location-based routing [59, 50]. It is unclear
whether these schemes can be used for routing without the development
of another service that maps a node's identifier to a virtual
coordinate (since a virtual coordinate has almost no relation to the
physical coordinate). Nor can such schemes be used for data-centric
storage without incurring significant data migration overhead when
virtual coordinates change.
4.3 HLR Details
In this section, we discuss the details of HLR. We start by discussing an
overview of HLR performance, then describe its aggregation and robust-
ness mechanisms in some detail.
4.3.1 Overview
HLR assumes that HLIs have been configured into network nodes. As
discussed above, the fundamental premise is that deployments will, in
the absence of localization, need to maintain a mapping between some
node identifier (perhaps drawn from a flat name-space) and some textual
description of a location. Typically, this information will be
maintained in a database or file. HLR only requires that, in addition to
(or perhaps instead of) the textual description of location, a network
administrator assign a hierarchical location identifier (or HLI) to each
node. Thus, for example, the administrator needs to translate a location
identifier like “node 1 in floor 5 of building 1” into an identifier of
the form 1:5:1.
When a node is assigned an HLI such as 1:5:1, we say that it belongs to
the top-level area numbered 1, the second-level area 5 of the top-level
area 1, and so on. We say that the top-level area has a depth of 3. In
HLR, different top-level areas are allowed to have different HLI depths,
e.g. 1:5:1 and 2:1.
HLR is fairly minimal in its assumptions about what nodes need to be
configured with. It only requires that each node know its own HLI. Thus,
a network can be incrementally deployed without having to reconfigure
existing nodes.
The key insight behind HLR is that one can automatically and
dynamically construct aggregated routing tables from the configured HLI
at each node, using a modified version of a distance-vector algorithm
such as DSDV [55]. DSDV is a distance-vector routing algorithm which
associates a sequence number with each destination to avoid the “count
to infinity” problems associated with distance-vector protocols. In the
basic DSDV, each node advertises a route to itself, and associates that
route with a monotonically increasing sequence number. Neighboring nodes
periodically exchange distance vectors to each destination, together
with the sequence number of each destination. For a given destination, a
node might possess several routes, one heard from each of its neighbors.
Of these routes, a node only considers routes assigned the most recent
sequence number. There may be more than one such route corresponding to
different paths to the destination, but the key intuition is that all
routes with the same sequence number represent a consistent view of
paths to a destination. From these routes, each node picks the shortest,
and advertises that to its neighbors. This intuition also explains why
DSDV avoids the count-to-infinity problem associated with earlier
distance-vector algorithms; at any instant, the routes selected by each
node to a destination taken together form a tree rooted at the
destination node, which by definition is acyclic.
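The selection rule just described (newest sequence number first, then shortest path) can also be written as an incremental table update. This is a simplified sketch: the names Route, should_replace, and update are hypothetical, hop count stands in for the metric, and details of the full DSDV protocol such as broken-link advertisements are left out.

```python
from typing import NamedTuple, Optional


class Route(NamedTuple):
    next_hop: str
    seq: int    # sequence number originated by the destination
    hops: int   # path length through next_hop


def should_replace(current: Optional[Route], offered: Route) -> bool:
    """Prefer a newer sequence number; break ties by fewer hops."""
    if current is None:
        return True
    if offered.seq != current.seq:
        return offered.seq > current.seq
    return offered.hops < current.hops


def update(table: dict, dest: str, offered: Route) -> None:
    """Install the offered route for dest if it beats the current one."""
    if should_replace(table.get(dest), offered):
        table[dest] = offered
```

A route with an older sequence number is never installed over a newer one, however short it is, which is what keeps the selected routes consistent with one another.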
The main challenge in adapting DSDV to HLR is route aggregation. The
goal of HLR is to scale the routing table such that, for example, in a
network as shown in Figure 4.2, the node with HLI 1:3:1 should have:
• One route to each node in area 1:3, such as 1:3:2, 1:3:3, and so on.
• One route to each of the sibling areas of 1:3, such as 1:1, 1:2, 1:4,
1:5 and so on. A route to, say, area 1:5 is said to be an aggregated
route. Aggregation is the fundamental contributor to scaling the
Internet as well as HLR.
• One aggregated route to each top-level area other than its own, such
as 2, 3, and so on.
Depending upon how the HLIs are assigned, such a routing table can
scale logarithmically with network size, which implies that the overhead
of HLR is only logarithmic in that of DSDV.
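To make the logarithmic claim concrete, the following back-of-the-envelope calculation assumes an idealized, perfectly balanced hierarchy, which the text does not require: every area at every level has exactly b children and the hierarchy has d levels. The symbols b, d, N, and T are introduced here for illustration only.

```latex
\begin{align*}
N &= b^{d} \quad\Longrightarrow\quad d = \log_b N, \\
T_{\mathrm{HLR}} &= \underbrace{(b-1) + (b-1) + \dots + (b-1)}_{d \text{ levels}}
                 = d\,(b-1) = (b-1)\log_b N, \\
T_{\mathrm{flat}} &= N - 1 .
\end{align*}
```

For example, with b = 4 and d = 3 the network has 64 nodes. A node such as 1:3:1 would keep 3 routes to the other nodes in 1:3, 3 routes to the sibling areas of 1:3, and 3 routes to the other top-level areas, for 9 entries in total, compared with 63 entries in a flat table.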
In HLR, we accomplish route aggregation automatically using a simple
modification to DSDV. The intuition for doing this comes from the
following observation. Consider the network shown in Figure 4.2, and
suppose that nodes 1:3:1 and 1:5:1 are neighbors of each other. Then,
the former can create a route for the 1:5 aggregate when it hears a
route advertisement from 1:5:1. Thus, any packets destined towards any
node in 1:5 from 1:3:1 will be forwarded to 1:5:1. This aggregation
relies on an important property: all nodes within 1:5 (and more
generally, any area) are connected (i.e. there exists a path between any
two nodes in an area that does not exit the area). For now, we assume
that this connectivity assumption is satisfied. Later in this section,
we will discuss how HLR can be adapted to deal with situations when an
area is internally partitioned.
For context, most of what we have discussed above is well-known in the
routing literature; area hierarchies have been studied for a long time.
However, our contribution here is the design and implementation of a
distance vector protocol for wireless sensor networks that performs au-
tomatic route aggregation. In wired networks, link-state protocols like
OSPF perform these kinds of aggregation, but we do not know of ac-
tual designs or prototypes of distance vector protocols that have been
augmented to automatically aggregate routes.
How does HLR perform this aggregation? As we discussed above, instead
of maintaining routes to individual nodes, HLR conceptually maintains
routes to areas. At the boundary of an area (such as the one between
1:3:1 and 1:5:1 in our example above), nodes aggregate routes to areas.
A node can detect that one of its links intersects an area boundary by
comparing its own HLI with that in the route it hears. A node may hear
many routes to an area, potentially one from each “gateway” node (a node
which has at least one link that intersects the area boundary); if so,
it picks the one with the highest path quality‡ and propagates it to its
neighbors.
Unlike DSDV, which conceptually builds a tree rooted at the destination,
HLR builds, for each destination area, a forest with trees rooted at the
area's “gateway” nodes. In this way, it maintains DSDV's loop-freedom
and reduces the number of routing packets, since each node only needs
to join and propagate one tree for each hierarchical destination area.
For example, in the network shown in Figure 4.2, all nodes in area 2
only need to keep track of one path to area 1, but not necessarily the
same path. For instance, node 2:2:1 and the nodes in area 2:1 join the
tree rooted at node 1:3:1, while nodes 2:2:2 and 2:2:3 use the route to
node 1:3:2 as their path to area 1. Also, all nodes in area 2 only
propagate one path to area 1 to area 3. Nodes in sub-area 1:3 of area 1
need to keep track of paths to the sub-areas 1:1, 1:2, and so on.
What are the trade-offs in using HLR over vanilla DSDV for wireless
sensor networks? Clearly, maintaining routes to every node in a wireless
sensor network is neither feasible nor necessary. Yet, we argue that a
protocol like HLR can be very useful in wireless sensor networks a)
because its areas mirror logical location-based distinctions which often
form the basis of user queries or network tasking instructions (e.g. in
an in-building network, many queries are likely to be expressed in terms
of floors and wings), and b) because HLR efficiently maintains routes to
these. As we show later, HLR can be used to efficiently implement a
variety of routing primitives in a highly scalable fashion, so the
intuition here is that by expending a little energy to provide a general
routing substrate, we can make the rest of the system significantly more
efficient.
‡ The selection of the best route depends on the metric chosen for
routing. In our simulation, we used the number of hops as the metric.
When used this way, HLR has one disadvantage and one advantage. Like
any protocol based on area hierarchies, HLR does not provide optimal
paths. However, as we will show in Section 4.5, the performance of HLR
is often comparable to, or better than, other alternatives, since those
alternatives often have other pathologies (e.g. traversing the outer
perimeter in GHT). The advantage of HLR, and an important one from the
perspective of sensor networks, is that most node or link failures only
affect a small number of nodes (usually those within the failed node's
own lowest-level area). We validate this in our simulations.
We have implemented HLR on Berkeley motes. Details of our implemen-
tation and some results from a small deployment will be discussed in
section 4.6.
4.3.2 Automatic Route Aggregation
We now discuss, in some detail, the route selection, aggregation, and
route propagation rules in HLR. For simplicity, in this discussion we
assume that all areas are internally connected. In the next subsection,
we discuss how HLR relaxes this assumption.
In HLR, each node periodically exchanges routes with its neighbors.
Each route is associated with the HLI of a destination node. This is an
important point; HLR does not propagate routes to an area, and routes
always refer to a node within an area. HLR does, however, compute and
store routes to an area. When a node receives multiple routes to nodes
within the same area, it picks one of those and re-advertises it. For
example, consider a node 2:2:1 which receives five routes, one each to
1:5:1, 1:2:3, 1:3:1, 1:4:1 and 1:1:2. From the perspective of this node,
all of these routes represent paths to destinations in area 1. We call
area 1 the effective destination area from the perspective of node
2:2:1. Then, node 2:2:1 picks one of these routes, say the route to
1:3:1, and advertises that.
This is a subtle point; one would have expected HLR to be designed
such that 2:2:1 would advertise the aggregate 1 instead of the route
1:3:1.§ Doing so, however, without violating the semantics associated
with the sequence numbers turned out to be tricky. This behavior of HLR
underlies the intuition described above: HLR maintains a forest of trees
for a given area, and different nodes “join” different trees in this
forest by picking the best available route. However, this choice has an
interesting trade-off. If it had been possible to advertise the
aggregate, then even if any one of the five routes had changed, that
change would be hidden from nodes downstream of 2:2:1. Now, however, if
the selected route 1:3:1 fails, another route will have to be selected
and propagated¶, so this choice has weaker failure containment
properties. In practice, though, as our simulations show, the
performance of HLR is still quite good, and most failures affect only a
small number of nodes.
§ Note that while 2:2:1 has 5 routes to area 1, it only re-advertises
one of them, thus maintaining the desired scaling behavior.
Thus, each route is associated with the HLI of a node, a sequence
number, a path metric to the destination node, and a lifetime. The route
lifetime is used to purge stale routes, and the sequence number for loop
avoidance. In our simulations, we use the hop distance as the path
metric. While this is known to be a bad choice in selecting paths in
wireless networks [86], we augment this with link blacklisting (see
below) in our current implementation. Longer term, we envision using
other additive path metrics that capture notions of link and path
quality [86, 11] in HLR. HLR can be easily modified to include more
sophisticated path metrics.
¶ Unless all the routes to area 1 fail, this change will not trigger an
instant propagation; rather, it will be propagated in the next regular
advertisement.
We now more precisely describe the route selection and aggregation
rules. From our discussion above, this is the step in which the route
aggregation is implicitly performed, since HLR does not propagate
aggregates. Suppose that node A has n different routes that it has heard
from its neighbors. It first partitions the set of routes such that all
routes in a subset share an HLI prefix h defined as follows: if h has l
elements, then the first l − 1 elements of h and of A's HLI must be the
same. The intuition, of course, is that h defines a distinct area
outside A for which A need only maintain one route. Each subset also
defines one effective destination area. In our example above, the 5
routes that node 2:2:1 has define a subset. In this case, h is 1 and l
is 1.
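The partitioning step can be sketched as follows. The sketch makes one assumption explicit that the definition above leaves implicit: h is taken to be the shared prefix plus the first element at which the destination's HLI differs from A's own. The function names are hypothetical, and the destination 3:1 is an invented node used only to exercise a second top-level area.

```python
from collections import defaultdict


def effective_area(own, dest):
    """Prefix h under which the node `own` files a route to `dest`.

    h keeps the elements shared with `own` plus the first differing one,
    so its first l - 1 elements agree with the node's own HLI.
    """
    common = 0
    for a, b in zip(own, dest):
        if a != b:
            break
        common += 1
    return dest[:common + 1]


def partition_routes(own, dest_hlis):
    """Group destination HLIs by effective destination area."""
    subsets = defaultdict(list)
    for dest in dest_hlis:
        subsets[effective_area(own, dest)].append(dest)
    return dict(subsets)


own = (2, 2, 1)
heard = [(1, 5, 1), (1, 2, 3), (1, 3, 1), (1, 4, 1), (1, 1, 2),
         (2, 1, 0), (2, 2, 3), (3, 1)]
subsets = partition_routes(own, heard)
```

Here the five routes into area 1 fall into a single subset with h = 1, the route to 2:1:0 is filed under the sibling area 2:1, and 2:2:3 keeps its own node-level entry.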
Now, consider a single subset. The node A selects exactly one route
from this subset using the following rule. It further refines the subset
by associating all routes to the same HLI into one cluster. From each
cluster, it picks the lowest cost route with the most recent sequence
number. Then, from within these selected routes, it picks the lowest
cost route. These rules are basically designed to select the nearest
“gateway” for the area corresponding to that subset. Different nodes
select different gateways to a given area, and the chosen routes form a
forest (as we have described earlier).
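The two-stage rule for a single subset can be sketched as below. The names Route and select_for_subset are hypothetical, and ties in cost are broken arbitrarily here because the text does not specify a tie-break.

```python
from collections import defaultdict
from typing import NamedTuple


class Route(NamedTuple):
    dest: tuple     # HLI of the destination node
    next_hop: str
    seq: int        # sequence number of the destination
    cost: int       # path metric, e.g. hop count


def select_for_subset(routes):
    """Pick the single route a node keeps for one effective destination area."""
    # Stage 1: cluster routes by destination HLI.
    clusters = defaultdict(list)
    for route in routes:
        clusters[route.dest].append(route)

    # From each cluster keep the cheapest route with the newest sequence number.
    finalists = []
    for same_dest in clusters.values():
        newest = max(r.seq for r in same_dest)
        fresh = [r for r in same_dest if r.seq == newest]
        finalists.append(min(fresh, key=lambda r: r.cost))

    # Stage 2: among the per-destination winners, keep the cheapest overall.
    return min(finalists, key=lambda r: r.cost)


subset = [
    Route((1, 3, 1), "a", seq=8, cost=2),   # stale: an older sequence number
    Route((1, 3, 1), "b", seq=10, cost=4),
    Route((1, 5, 1), "c", seq=3, cost=3),
]
chosen = select_for_subset(subset)
```

In this example the cost-2 route is ignored because it carries a stale sequence number for 1:3:1, so the route to gateway 1:5:1 via c is chosen. Sequence numbers are compared only within a cluster, since each destination numbers its own advertisements.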
Having selected one route to each subset (or effective destination area),
node A advertises these routes to its neighbors. In this manner, HLR
scales well, since it maintains the property of hierarchical routing pro-
tocols: more detailed routing information about nearby nodes, and less
about nodes farther away.
4.3.3 Dealing with Route Changes
HLR deals with route dynamics (addition of a node, failure of a link,
etc.) in ways similar to other routing protocols. Each route is
associated with a lifetime, and must be refreshed at least once within
that lifetime; otherwise it is considered to have failed. HLR uses two
frequencies of route advertisement. For a route that has recently
changed, nodes re-advertise their routing tables with moderate frequency
to allow for faster convergence. For routes that have been relatively
stable, the route advertisement interval is set to be an order of
magnitude longer. The lifetime is set to four times this longer
interval. All the parameters are configurable in HLR.
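The relationship between the timers can be written down directly. The base value FAST_INTERVAL_S below is a hypothetical placeholder, since the text fixes only the ratios between the timers, and the function names are likewise illustrative.

```python
# Hypothetical base value; only the ratios come from the text.
FAST_INTERVAL_S = 10.0                    # interval after a recent change
SLOW_INTERVAL_S = 10 * FAST_INTERVAL_S    # an order of magnitude longer
ROUTE_LIFETIME_S = 4 * SLOW_INTERVAL_S    # four times the longer interval


def advertisement_interval(recently_changed):
    """Advertise faster while a route is still converging."""
    return FAST_INTERVAL_S if recently_changed else SLOW_INTERVAL_S


def is_expired(last_refresh_s, now_s):
    """A route that was not refreshed within its lifetime has failed."""
    return now_s - last_refresh_s > ROUTE_LIFETIME_S
```

With these ratios, a stable route survives the loss of up to three consecutive advertisements before it expires.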
Wireless links are known to be notoriously unstable, so dropped route
advertisements are the norm rather than the exception. Clearly, this can
impact route stability: lost advertisements might result in route
expirations. To avoid this, our implementation uses a simple link-layer
blacklisting scheme that filters out asymmetric links as well as highly
lossy links, and paths are selected on the rest of the topology. When a
link degrades and is marked unusable, the attached node performs the
appropriate actions.
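A blacklisting filter of this kind might look like the sketch below. The thresholds min_ratio and max_asymmetry, the delivery-ratio input, and the function name usable_links are all assumptions made for illustration; the text does not describe how the implementation measures or thresholds link quality.

```python
def usable_links(delivery, min_ratio=0.7, max_asymmetry=0.3):
    """Return the undirected links that survive blacklisting.

    delivery maps a directed pair (src, dst) to a measured packet
    delivery ratio in [0, 1]. Both thresholds are hypothetical.
    """
    usable = set()
    for (a, b), forward in delivery.items():
        reverse = delivery.get((b, a), 0.0)
        lossy = min(forward, reverse) < min_ratio
        asymmetric = abs(forward - reverse) > max_asymmetry
        if not lossy and not asymmetric:
            usable.add(frozenset((a, b)))
    return usable


delivery = {
    ("a", "b"): 0.90, ("b", "a"): 0.85,   # good in both directions
    ("a", "c"): 0.95, ("c", "a"): 0.50,   # asymmetric
    ("b", "c"): 0.40, ("c", "b"): 0.45,   # lossy in both directions
}
links = usable_links(delivery)
```

Only the a-b link survives here; route selection would then run over the remaining topology.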
4.3.4 Relaxing the Connectivity Assumption
In our discussions so far, we have relied on an important property, that
of the connectedness of an area. In practice, one would expect this
condition to be mostly, but not always, satisfied. For example, in a
building network, it might be reasonable to deploy sensors such that
sensors within a floor are connected (using our definition in Section
4.3.1). However, given the vagaries of wireless communication, it would
be unwise to rely on this property for the correctness of the system. In
this section, we show that we can add a little machinery to HLR's basic
mechanism in order to deal with partitioned areas (where the
connectedness assumption is violated). Note that in our discussions
below, we assume that while an area may be partitioned, the entire
network is connected; HLR finds an alternate path to the sub-areas.
Our basic approach is to identify the partitioned areas by assigning a
unique identifier (termed the cluster ID) to each connected component of
the partitioned area. Nodes external to the area then “join” two different
trees, one for each component: to them, different components look like
different areas. However, data packets destined to a given HLI in the
area are duplicated and sent to both partitions, since it is a priori unclear
which partition contains the node associated with the HLI.
We now describe several details of this scheme. The first detail is the
definition of a cluster ID; in HLR, nodes within an area settle on the
lexicographically smallest HLI of any node within the area. For example,
in the sensor network shown in Figure 4.2, the cluster ID of area 2 is
2:1:0. Notice that a sub-area of area 2 might have an entirely different
cluster ID: thus, in our example, 2:2 would choose 2:2:1 as its cluster ID.
Thus, if an area is partitioned into two, the two partitions will end up
choosing different cluster IDs. We discuss below how this affects route
selection. However, note that a basic property of HLR is that an area's
partition is not visible outside the enclosing area as long as the latter
itself is connected. In our example, assume area 2:2 is partitioned into
two parts: one with cluster ID 2:2:1, the other with cluster ID 2:2:2. As long
as area 2 is still connected, nodes 2:2:1 and 2:2:2 will see the same cluster
ID for area 2, which is 2:1:0. Thus, the fact that area 2:2 is partitioned
is transparent to nodes in area 1 and area 3.
How do all the nodes within an area determine their cluster ID? In HLR,
a node whose HLI is of the form a:b:c maintains one route to all top-level
siblings of area a, all children of a who are siblings of a:b, and all nodes
within a:b. Thus, for each level, just from its routing table, a node can
determine the cluster ID. If there exists a partition at a particular level,
then the connected components settle on different cluster IDs.
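The cluster ID rule can be illustrated with a short sketch. It is a simplification under stated assumptions: HLIs are modeled as tuples of integers, and the node is assumed to already know the HLIs of the area members it can currently reach (in HLR this knowledge comes from the routing table, possibly in aggregated form). The function names are hypothetical.

```python
def parse_hli(text):
    """Turn an HLI string such as '2:1:0' into a tuple (2, 1, 0)."""
    return tuple(int(part) for part in text.split(":"))


def format_hli(hli):
    """Turn a tuple (2, 1, 0) back into the string '2:1:0'."""
    return ":".join(str(part) for part in hli)


def cluster_id(own_hli, reachable_hlis, area_prefix):
    """Lexicographically smallest HLI among reachable members of an area.

    own_hli and reachable_hlis are tuples; area_prefix is the HLI prefix
    naming the area, e.g. (2,) for area 2 or (2, 2) for area 2:2.
    Returns None when no known node belongs to the area.
    """
    n = len(area_prefix)
    members = [h for h in [own_hli, *reachable_hlis] if h[:n] == area_prefix]
    # Python compares tuples lexicographically, which matches the rule.
    return min(members) if members else None
```

For instance, if node 2:2:2 can reach only node 2:2:3 inside area 2:2, the sketch yields 2:2:2 as the cluster ID of its partition, while the other partition, containing 2:2:1, settles on 2:2:1.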
When a node announces its route, it attaches its cluster ID to the route.
In this manner, nodes outside the area eventually see two different clus-
ter IDs for the same effective destination area. We then need to modify
HLR's route selection algorithm so that different partitions fall into dif-
ferent subsets (see subsection 4.3.2). Then, a node will pick one route
for each partition correctly.
There are three other details to take care of. First, while nodes in an area
converge on a cluster ID, the cluster ID visible externally might change,
causing a fair bit of route churn. To reduce the churn, a node holds
down a route that announces a change in cluster ID. Second, nodes
within one partition of an area must be able to distinguish between
routes to nodes within the same partition, and nodes from another
partition of the same area. The latter routes might “enter” the partition from
another area; HLR tags such external routes with the identifier of this
external area in order to detect this. Finally, we must augment the route
selection rules to prefer internal routes to external routes.
With these changes, HLR is able to route correctly without the assump-
tion of internal connected-ness.
4.3.5 Discussion
In this section, we have described how HLIs can be leveraged to build
scalable routing based on a variant of DSDV. Several questions arise
when considering HLR in the context of sensor networks.
First, how does HLR interact with energy management schemes? In general,
these schemes can be classified into two classes: topology
control [10, 89] and coordinated sleep/wakeup [56, 90]. Topology control
schemes try to maintain a connected network using a (continuously
varying) fraction of the nodes. For such schemes, HLR should work without
any change. For coordinated sleep/wakeup schemes, HLR will need
to be slightly modified such that, if a node's next hop is currently asleep,
it can buffer packets to that node until it awakes. With this modification,
we believe that coordinated sleep/wakeup does not conceptually
alter the correctness of HLR, nor does it impact its performance.
Second, can we find an efficient way to automatically
assign HLIs to nodes? Bootstrapping from virtual coordinates represents
a promising approach towards automatic HLI assignment. Should
such a localization system exist, it undoubtedly can be adapted to
automatically assign HLIs to nodes, and it can also be integrated into any
location-based routing scheme.
Third, how should the number of layers in the HLI hierarchy be decided for
different applications? This is a trade-off between scalability and
resolution, and may depend on the particular application.
Finally, many sensor network applications rely on nodes communicating
with a base station. How does HLR fit in this scenario? It is conceptually
possible to design a variant of HLR that supports this form of
communication; we have left the design of this for future work.
4.4 Routing and Rendezvous Primitives
From a sensor network perspective, HLR enables a variety of routing and
rendezvous primitives that can improve the scalability of systems like
Directed Diffusion [30] or TinyDB [44], or enable data-centric storage
systems like GHT [62] even in the absence of location information. In
this section, we show how HLR can be used to provide these primitives.
4.4.1 Unicast
HLR can provide “any-to-any” or unicast transmission primitives. More
precisely, any node can send a message addressed to the HLI of any other
node, and HLR attempts to deliver the message in a best-effort manner.
Such a primitive can be useful in many contexts: monitoring the status
of a node, or tasking a node to perform a specific action such as turning
on a camera.
Achieving unicast functionality in HLR is rather straightforward. HLR
forwards unicast packets based on the longest prefix match of the HLI.
However, HLR must allow a packet's address to match multiple routing table
entries. This functionality enables correct packet delivery in the
presence of network partitions (Section 4.3.4). As we have discussed earlier,
when more than one entry matches, a separate copy of the packet is
forwarded for each matching entry, i.e. one copy of each packet is
delivered to each partition of the destination area. To avoid multiple copies
being delivered to each partition, the destination area of every copy is
associated with a partition cluster ID, i.e. the destination of every copy of
the packet is defined by the pair (destination area, cluster ID). Since
the area HLI plus the cluster ID can uniquely identify a partition of the
area, it is guaranteed that each partition will receive exactly one copy of
the packet. All copies but one are dropped when they enter the
lowest-level area; the partition that contains the destination node will correctly
deliver the packet to the destination.
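The forwarding rule above can be sketched as a longest-prefix match that may return several entries. This is an illustrative model only: the RouteEntry layout, the use of tuples for HLIs, and the next-hop names are assumptions, not the HLR packet or table format.

```python
from typing import NamedTuple


class RouteEntry(NamedTuple):
    area: tuple        # HLI prefix of the destination area, e.g. (2,)
    cluster_id: tuple  # cluster ID announced with the route
    next_hop: str      # hypothetical neighbor identifier


def matching_entries(table, dest):
    """All entries whose area is the longest prefix of dest present in table."""
    matches = [e for e in table if dest[:len(e.area)] == e.area]
    if not matches:
        return []
    longest = max(len(e.area) for e in matches)
    return [e for e in matches if len(e.area) == longest]


def forward_unicast(table, dest):
    """One copy per matching entry, each tagged with (area, cluster ID)."""
    return [(e.next_hop, (e.area, e.cluster_id))
            for e in matching_entries(table, dest)]
```

When area 2 is partitioned, a node outside it holds two entries for area (2,) with different cluster IDs, so forward_unicast returns two copies; in the common unpartitioned case it returns exactly one.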
An alternative would have been to forward packets along one of the
entries, and either back-track (which would involve maintaining state in
the routing protocol) or have the node “tunnel” the packet to the
partition containing the destination. Both these approaches are complex,
and we chose to trade off some additional overhead in packet duplication,
assuming partitions happen infrequently.
4.4.2 Area Broadcast and Area Anycast
HLR also provides two other powerful routing primitives: broadcasting
to all nodes within an area, or anycasting to one node within the area.
Thus, a broadcast packet addressed to an HLI prefix 1:2 would be delivered
(best-effort, of course) to all nodes within that area. Similarly, an
anycast packet (a bit in the packet header distinguishes between anycast
and broadcast packets) addressed to an HLI prefix 1:2 would be delivered
to some node within that area.
The implementation of these primitives falls out quite easily from HLR's
basic design. An area anycast is forwarded like a unicast packet
until it reaches some node within the destination area. When an area
is partitioned, it suffices to forward the area anycast towards one of the
partitions. Finally, when a node receives an anycast packet whose HLI
prefix is a prefix of its own HLI, it assumes that the packet is destined
for itself.
An area broadcast is also forwarded much like a unicast packet until
it reaches some node within the destination area. When an area is
partitioned, a node outside the area will have multiple routes with the same
destination area ID but different cluster IDs. The node then forwards the
packet to each partitioned sub-area and rewrites the destination address
to that of the corresponding partitioned sub-area. When a packet reaches the
destination area or partitioned sub-area, the packet is flooded throughout
the area. Flooding within an area must be done with care. Consider
a broadcast to area 1:2. If any node outside this area receives the packet
from a node within the area, it drops the packet to prevent further
propagation of the flooding.
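The per-node checks for area anycast and scoped flooding can be sketched as below. HLIs are modeled as integer tuples; the duplicate-suppression set is an assumption added so the flood terminates (the text does not describe how HLR suppresses duplicates), and the function names are hypothetical.

```python
def in_area(node_hli, area_prefix):
    """True if area_prefix is a prefix of the node's HLI."""
    return node_hli[:len(area_prefix)] == area_prefix


def accept_anycast(node_hli, dest_prefix):
    """A node inside the destination area consumes the anycast packet."""
    return in_area(node_hli, dest_prefix)


def handle_flood(node_hli, dest_prefix, seen, packet_id):
    """Return True if the node should deliver and rebroadcast the packet.

    Nodes outside the destination area drop the packet, which stops the
    flood at the area boundary. 'seen' is a per-node set of packet IDs
    used for duplicate suppression (an assumption of this sketch).
    """
    if not in_area(node_hli, dest_prefix):
        return False
    if packet_id in seen:
        return False
    seen.add(packet_id)
    return True
```

For a broadcast to area 1:2, node 1:2:3 rebroadcasts the first copy it hears and ignores later ones, while node 1:3:1 drops every copy.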
We argue that these primitives will help scale data-centric routing
protocols. In particular, because the areas are aligned along
“application-specific” location boundaries (e.g. in an in-building network, there might
be areas corresponding to floors, and sub-areas corresponding to wings),
we expect most location-based queries will also be well-aligned along
area boundaries. Accordingly, we expect these primitives to be used
fairly frequently in a sensor network deployment.
Finally, we believe it is also possible to implement source-specific
multicast [14] using reverse-path forwarding on the routing table provided by
HLR. We have left the design of this primitive to future work.
4.4.3 Rendezvous Based on Random Hashing
HLR also provides rendezvous primitives that can be used to implement
data-centric storage schemes like distributed hash tables. For this, HLR
basically provides a way to consistently and randomly hash an arbitrary
key to a node in the network using a primitive called hash-lookup(key).
This primitive is similar in principle to the key lookup provided by
distributed hash table (DHT) systems like CAN [61] or Chord [76], but its
implementation is very different. Using this primitive, it is possible to
implement the DHT primitives such as put(key,packet) and get(key).
Furthermore, using the lookup functions provided by hash-lookup it is
also possible to implement other rendezvous mechanisms like the trig-
gers proposed in [75]. We do not discuss the details of this implemen-
tation here, but note that such triggers can be very useful for actuation
based on the occurrence of certain events within a sensor network.
Prior work [62] has proposed to implement these primitives using ge-
ographic routing. HLR can achieve similar functionality without using
geographic routing. HLR provides this functionality by treating a hashed
key as an HLI, and routing the packet containing that key to the node
whose HLI is closest to the key. Before we describe the details of the
implementation, we must note that HLR's hashing does not necessarily
maintain all the properties of DHTs. In a classical DHT, the key space
is likely to be much larger (128 or 160 bits) than the HLI space.
Furthermore, in a classical DHT, the nodes are arranged uniformly along
the key space (enabling load balancing), while in HLR the node
location in the key space is determined by the HLI assignment to nodes. To
some extent, this can be rectified by carefully assigning HLIs since this
assignment is under the control of the network administrator.
Function hash-lookup() sends a packet that has the key as the
destination HLI. In addition, the packet has a bit indicating that it needs to be
processed as a hash lookup. Assume for a moment that the network has
converged, the routing tables don't change, and the network is not
partitioned. Then, every node in the system has one routing entry for each
top-level area. The node that issues the hash-lookup() treats the key as
an HLI and routes the packet to the top-level area whose area identifier is
closest to but larger than the top-level area in the key (with wraparound).
For example, assume that there are three top-level areas in the system:
1, 5 and 7. Then, the key 4:3:2 would first be routed towards area 5, by
our rule, and a key 8:5:1 would be routed to area 1. When the packet
reaches area 5, the same procedure is now followed, but at the second
level of the area hierarchy, until a final node is reached.
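The level-by-level successor rule can be sketched as follows. The sketch makes several assumptions that are not in the text: the hierarchy is modeled as a nested dict with a global view (HLR instead resolves each level hop by hop from routing tables), an area whose identifier equals the key component is treated as a match, and the sub-area identifiers in the usage example are made up for illustration.

```python
import bisect


def successor(area_ids, key_component):
    """Smallest area ID >= key_component, wrapping to the smallest ID.

    Treating equality as a match is an assumption of this sketch.
    """
    ids = sorted(area_ids)
    i = bisect.bisect_left(ids, key_component)
    return ids[i % len(ids)]


def hash_lookup(hierarchy, key):
    """Resolve a key (tuple of ints) to the HLI of the target node.

    hierarchy maps an area ID to a dict of its children; a leaf node
    maps to an empty dict.
    """
    hli = []
    level = hierarchy
    for component in key:
        if not level:  # reached a leaf node
            break
        chosen = successor(level.keys(), component)
        hli.append(chosen)
        level = level[chosen]
    return tuple(hli)
```

With top-level areas 1, 5 and 7, successor([1, 5, 7], 4) selects area 5 and successor([1, 5, 7], 8) wraps around to area 1, matching the example above.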
In the presence of partitions, the cluster ID determines which partition
is “closer” to the key.
This hashing algorithm has an interesting property: a node has enough
local information to determine if it should be the target of a hash-lookup().
It can determine if its own top-level area is closest to the key, and so on
recursively. This property is useful in maintaining the correctness of
a hash-lookup(); if, because of routing transients, a node receives a
lookup not destined for itself, it can re-route the packet.
Implementing a distributed hash table using our primitive is simple.
The put() and get() primitives can be implemented the same way as
hash-lookup(). Local replication is then simply a matter of storing an
additional copy at the node in the leaf area whose ID is the second clos-
est to that of the corresponding area ID in the key. Triggers of the kind
suggested by [75] can be similarly implemented.
4.4.4 Data-Locality Preserving Hashing
In the previous subsection, we have introduced a rendezvous primitive
which is based on randomly hashing a specified key. A newly introduced
data-centric storage scheme, DIM [41], uses a data-locality preserving
hash. In this section, we show that HLR can be extended to support this
kind of hashing as well. The basic idea is to map the multi-dimensional
data space to HLIs so that each HLI is assigned a hyper-rectangle of the
data space such that at any level, the hyper-rectangles assigned to all
HLIs at that level disjointly cover the entire data space. Ultimately, every
node is assigned a disjoint hyper-rectangle in the multi-dimensional data
space, i.e. the node owns the hyper-rectangle. In this section, we discuss
how HLR can provide locality-preserving hashing, and how a simplified
version of DIM can be built on top of it.
Concretely, we say that HLR provides a data-space multicast primitive
send-dsm(H,p) which delivers packet p to all the nodes that own part
of the hyper-rectangle H. (Of course, unicasting to a single point in the
data-space is a degenerate case of this primitive, so we don't discuss it
further. We have left an exploration of an analogous anycast primitive to
future work.)
To understand how HLR implements the send-dsm() primitive, we need
to describe how the hyper-rectangles in data-space are mapped to nodes.
We use a mapping very similar to the one used in DIM [41], but instead
of relying on geographic divisions, we divide the HLI space, as illustrated
in Figure 4.3.
To show the basic idea, we take a 2-d space $[0, 1) \times [0, 1)$ as an
example and map it to the network shown in Figure 4.2. Our description here
can be easily generalized to multi-dimensional data spaces. At the top level,
areas 1, 2, and 3 are divided into two sets which partition the data space
aligned with the first dimension. The result is that area 1 is responsible
for sub-space $[0, 0.5) \times [0, 1)$ and areas 2 and 3 together are
responsible for sub-space $[0.5, 1) \times [0, 1)$. We repeat such divisions
within each set of areas, alternately aligned with each dimension of the data
space, until each set contains only one node. For example, areas 2 and 3
equally divide the sub-space $[0.5, 1) \times [0, 1)$ while the sub-space
$[0, 0.5) \times [0, 1)$ is further divided among the sub-areas of area 1,
and so on. Note that if the distribution of areas or the distribution of
nodes within areas is not uniform, the division of the data space can be
adjusted accordingly. For example, instead of choosing 0.5, we can use 0.1
or 0.9 for a dimensional division of the data space.
Figure 4.3: Example: mapping from 2-d data space to the network shown in
Figure 4.2
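The recursive division can be sketched as below. The sketch assumes an even split at the midpoint and a split of the identifier list at its middle index (so areas 1, 2, 3 become {1} and {2, 3}); both are illustrative choices, since the division point and grouping can be adjusted as noted above. The function name and the rectangle representation are hypothetical.

```python
def divide(ids, rect, dim=0):
    """Assign each identifier a hyper-rectangle by alternating splits.

    ids  : ordered list of area (or node) identifiers at one level
    rect : tuple of (low, high) pairs, one per dimension (half-open)
    dim  : dimension along which the next split is made
    Returns {id: (rect, next_dim)}, where next_dim is the dimension to
    use when the area's rectangle is divided among its sub-areas.
    """
    if len(ids) == 1:
        return {ids[0]: (rect, dim)}
    half = len(ids) // 2
    low, high = rect[dim]
    mid = (low + high) / 2          # assumed even split
    left, right = list(rect), list(rect)
    left[dim] = (low, mid)
    right[dim] = (mid, high)
    next_dim = (dim + 1) % len(rect)
    out = divide(ids[:half], tuple(left), next_dim)
    out.update(divide(ids[half:], tuple(right), next_dim))
    return out
```

Applied to areas [1, 2, 3] and the unit square, this gives area 1 the rectangle [0, 0.5) x [0, 1), area 2 [0.5, 1) x [0, 0.5), and area 3 [0.5, 1) x [0.5, 1); the returned next_dim lets the same function be reused for the sub-areas of each area.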
Given the mapping procedure above, it can be seen that the part of the
data space assigned to each node is a hyper-rectangle in the data space
and the hyper-rectangles of all nodes disjointly cover the entire data
space. Furthermore, given the hierarchy in HLIs, an inherent property
of our scheme is that the hyper-rectangles of HLIs which share the same
prefix are also close in data space. Such a mapping enables the
construction of a data-centric storage scheme that efficiently supports range
queries.
Using this mapping between nodes and the data space, how does HLR
support the send-dsm() primitive? Given a hyper-rectangle H, each node
can locally apply the above mapping procedure to determine which
top-level areas might contain the nodes which would fall in H. Using this,
the node at which send-dsm() is invoked will route the packets towards
those top-level areas, creating copies of the packets if necessary (this
is analogous to query splitting in DIM). This same procedure is applied
recursively within each area until a copy of the packet reaches each node
whose hyper-rectangle intersects H. At some point, when H entirely
covers the hyper-rectangle associated with an area, HLR simply floods the
packet within that area.
For example, assume a range query $Q$: $[0.6, 0.8) \times [0.3, 0.7)$ is
issued at node 1:1:1. Node 1:1:1 looks it up in its routing table, and
matches areas 2 and 3, whose hyper-rectangles intersect $Q$. Therefore, node
1:1:1 will split $Q$ into two sub-queries, $Q_1$: $[0.6, 0.8) \times [0.3, 0.5)$
and $Q_2$: $[0.6, 0.8) \times [0.5, 0.7)$, and send $Q_1$ to area 2 and $Q_2$
to area 3. When $Q_1$ reaches area 2, say at node 2:1:1, it will be further
split into two sub-queries, $Q_{11}$: $[0.6, 0.75) \times [0.3, 0.5)$ and
$Q_{12}$: $[0.75, 0.8) \times [0.3, 0.5)$. The procedure goes on until the
hyper-rectangle of the receiving node completely contains the sub-query. It is
now easy to see how DIM can be built on top of send-dsm(H,p). A DIM
data insertion would specify an H which is merely a point (a degenerate
case of a hyper-rectangle). For a DIM query, the H corresponds to the
query rectangle itself. Query replies can simply be unicast to the HLI of
the query issuer.
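The query-splitting step in this example can be sketched as a rectangle intersection against the rectangles owned by each area. The representation (tuples of half-open intervals) and the function names are assumptions of this sketch; the owner rectangles in the usage example are those of the 2-d mapping described earlier.

```python
def intersect(a, b):
    """Intersection of two hyper-rectangles, or None if they are disjoint.

    Each rectangle is a tuple of half-open (low, high) intervals.
    """
    out = []
    for (a_low, a_high), (b_low, b_high) in zip(a, b):
        low, high = max(a_low, b_low), min(a_high, b_high)
        if low >= high:
            return None
        out.append((low, high))
    return tuple(out)


def split_query(query, owners):
    """Map each area to the piece of the query that falls in its rectangle."""
    pieces = {}
    for area, rect in owners.items():
        piece = intersect(query, rect)
        if piece is not None:
            pieces[area] = piece
    return pieces
```

Splitting Q = [0.6, 0.8) x [0.3, 0.7) against areas 1, 2 and 3 yields a piece for area 2 and a piece for area 3 and nothing for area 1. Because the intervals are half-open, a point insertion (a degenerate rectangle) would need a containment test rather than this intersection test.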
Our description of DIM on HLR has ignored dynamics such as node failure,
node join, and link dynamics which may cause changes to HLR
routing tables. When the routing table changes, the mapping between
nodes and their hyper-rectangles might change. When this happens,
each node in a DIM built on HLR needs to check whether its hyper-rectangle
has changed and whether the tuples it has stored need to migrate to some
other nodes. As with random hashing, this check can be performed
entirely locally. When a node decides that some of its data belongs to a
hyper-rectangle that it no longer owns, it reinserts the data. (Note that
this may happen often, as with a flapping route. A DIM built on HLR
needs some hysteresis mechanisms built in that would prevent it from
re-inserting data at every routing change. We have left the design of this
for future work.) Finally,
DIM's local replication can also be mimicked in HLR; recall that DIM
stores an extra copy of the packet at a node that would have owned the
hyper-rectangle if the current owner fails. Once area partition occurs,
we need to rebuild DIM locally within the partitioned area by treating
each partition as a single sub-area. For example, when area 1:1 is
partitioned due to some node failure, the hyper-rectangle mapped to area 1:1
is re-split among all partitions of area 1:1. In general, local rebuilding
will cause data migration among sub-areas, but this overhead should be
small assuming partitions happen relatively infrequently.
4.4.5 Summary
In this section, we have described how several routing primitives that
are thought to be important for sensor networks can be supported using
HLR. We have implemented all of these primitives (as well as simplified
versions of distributed hash tables and DIM) in TinyOS. In the next
section, we evaluate these primitives using simulation.
4.5 Performance Evaluation Through Simulations
In this section, we investigate HLR using simulations, comparing it to
other methods of implementing the routing and rendezvous primitives
(e.g. using geographic routing). Now, it is easy to see that,
asymptotically and with high enough density, none of HLR's primitives are likely
to outperform a geographic routing based approach. In a dense network
on a 2-dimensional surface, the asymptotic path lengths are $O(\sqrt{N})$, and
geographic routing based approaches will approach this performance.
Indeed, HLR will perform worse in general because route aggregation
can increase path lengths. So, our real goal here is not to demonstrate
that HLR is better than other alternatives, but that it is no worse than
other alternatives. HLR's usefulness, then, is that it provides equivalent
functionality while making fewer assumptions about available
technology (e.g. precise localization).
We perform four sets of experiments. First, we evaluate the performance
of HLR unicast by comparing its average path length for all-pair
communication with that of GPSR. Second, we compare the efficacy of area
broadcast in HLR, by evaluating a workload of diffusion queries that are
geographically scoped. We compare HLR against a version of Diffusion
that uses a simple geocast mechanism [91]. Third, we implement a DHT
and a DIM on HLR and compare them to GHT and DIM on top of GPSR
for the purpose of evaluating rendezvous primitives. Finally, we evaluate the
performance of HLR under dynamics and measure the overhead induced
network-wide by node failure.
4.5.1 Methodology and Metrics
We use ns-2 for our simulations. We implemented HLR (including the
functionality that detects and deals with partitions) in ns-2, and all of
the routing and rendezvous-based primitives we described in Section 4.4.
Using these primitives, we implemented a simplified version of one-phase
pull Diffusion, GHT and DIM in ns-2. In total, the HLR code for the primitives
and the routing protocol is about 2800 lines.
An interesting methodological challenge we faced was to randomly
generate connected hierarchical topologies for evaluating HLR. Our topology
generator first computes random hierarchical areas where the depth of
each area is random, and the size of the sub-areas is roughly the same.
Then it lays out this topology on a 2-dimensional surface. (The reason we
only use a 2-D topology is simply that GPSR currently only works in 2-D.
HLR doesn't rely on this assumption.) To generate
a random hierarchical topology, we first calculate the size of the network
using the number of nodes in the network, radio range and density. (In
our simulations, radios have a range of 30m. We simulate two
different densities: 10 and 20 neighbors per node.) Then we split the
network into grids such that the number of grids is the smallest number
greater than the total number of nodes. Now starting from the top-level
areas, we randomly allocate contiguous free grids to each area, such that
the number of grids equals the total number of nodes within the area.
Then, in a breadth-first way, each sub-area is allocated contiguous grids
from the grids allocated to its parent area. This breadth-first approach
can lead to unsatisfiable states, at which point our generator
backtracks and repeats the allocation procedure. Finally, we randomly
pick a point within each grid as the coordinate of a node. We generate
topologies whose size ranges from 25 to 200 nodes with a step size of 25
nodes.
Unless otherwise specified, our metrics are: a) the messaging cost of
implementing a particular primitive (for unicast, this can be equivalently
expressed as the average path length), and b) the control overhead of
HLR routing.
For most of our experiments, we compute the scaling behavior of the
metrics discussed above. We computed our metrics for several
topologies ranging from 25 to 200 nodes. For each topology size, the reported
number is an average over 5 randomly chosen topologies. (Resource
constraints prevented us from averaging over more topologies. Recall that
our topology generation algorithm employs a back-tracking procedure to
assign areas to node locations. Using our implementation, it sometimes took
more than a day to generate an instance of the topology.)
4.5.2 Results
Unicast Routing Performance. Our first experiment simply measures
the cost of unicast communication in HLR, and compares it with the cost
of unicast using GPSR. For both these schemes, we conducted a
simulation where each node sends a message to all of the other nodes. We then
calculated the average path length incurred using either scheme. Figure
4.4 plots the average path length for HLR and GPSR. This figure shows
that the average path length in HLR is often three hops longer than that
in GPSR. These results are for a network with a density of 20, so GPSR,
in most cases, does not incur perimeter mode routing. In a network with
a density of 10 (figure not shown), the gap between GPSR and HLR is
decreased to about one hop.
Figure 4.4: Comparison of average path length in GPSR and HLR on networks
with density 20
While this might seem somewhat pessimal, our current understanding
of sensor networks suggests that they are not likely to be used for ar-
bitrary point-to-point routing. Rather, we expect other primitives like
rendezvous and area broadcast will be more likely used, since they more
naturally support querying and triggering. Thus, we now discuss the
performance of data-centric routing and data-centric storage systems
implemented on HLR.
Diffusion. We implemented a simplified version of one-phase pull
Diffusion in ns-2. (Equivalently, we can be said to have implemented the
tree-building procedure that TinyDB [45] uses.) This version uses two
underlying routing layers, HLR
and GPSR. We augmented GPSR to support geocast (broadcasting to all
nodes within a rectangle). In our implementation, the packet is unicast
using GPSR until it reaches a node within the specified region, and then
flooded within the region.
Our goal in this experiment was to try to understand the expected
performance of Diffusion on these two routing layers. Lacking traces of actual
workloads, we generated a synthetic query workload for Diffusion. We
generated interest messages of varying geographical scopes, assuming
that the scopes were all aligned with the HLR areas. This assumption
is not particularly disadvantageous for GPSR, but is also the most likely
kind of query in an HLR based system (queries with un-aligned scopes
can be implemented as multiple area broadcasts). We assume that the
size of the geographic scope is distributed exponentially: most queries
are to small areas.
Again, this seems like a plausible assumption for sensor network query
workloads.
Figure 4.5: Comparison of average query cost in Diffusion on networks with
density 20
Figure 4.5 plots the comparison of average query delivery cost between
Diffusion using GPSR and Diffusion over HLR. In this case, we assume
that the query asks for the min over a set of sensor readings at each node
within the target areas. The results are aggregated along the return
path. (Therefore, for each query in Diffusion, query delivery cost equals
reply delivery cost.) Notice, in this case, that the performance of HLR
is much closer to that of GPSR than was the case for all-pairs unicast.
Clearly, in this case, the longer path lengths resulting from aggregation
matter less, and the flooding costs dominate. This conclusion is true
even at a lower node density (10), the results of which we omit for brevity.
GHT. How well does a DHT implemented on HLR perform compared to
a GHT? To test this, we performed several random hash-lookup()s on
the DHT over HLR, and performed the equivalent put() operations in a
GHT. Figure 4.6 compares the messaging cost of these two schemes.
Figure 4.6: Comparison of average query cost in DHT over HLR and GHT on
networks with density 20 (x-axis: network size, 25 to 200 nodes; y-axis:
average number of messages; series: DHT+HLR, GHT)
In this particular case, we find that the performance of DHT over HLR is
much better than that of a GHT, quite unexpectedly given our results
from Figure 4.4. The reason is simply that nearly every put()
operation in a GHT incurs perimeter traversal, which is expensive
compared to greedy mode delivery. Further, some operations incur a
traversal of the outer perimeter, which skews the average. In HLR,
however, the average cost of a hash-lookup() is the same as the average
unicast cost. For a lower density of 10 neighbors per node, the plots
look almost identical (omitted for brevity). At these densities, DHT over
HLR encounters longer paths, but GHT encounters longer perimeters as
well.
DIM. To evaluate the efficiency of the send-dsm() primitive, we
implemented DIM on top of HLR and compared it to DIM on top of GPSR.
For this comparison, we used sensor data collected from a deployed
in-building testbed; each sensor periodically collects light, temperature and
humidity readings. In the dataset, there were 509765 readings. From
these readings, we generated a balanced insertion workload (10 insertions
per node) for every node in the network, and inserted the selected data
subset into the DIM. For our query workload,
we generated a set of 3-D range queries where the query box size is
exponentially distributed and its location is uniformly distributed.
Figure 4.7 plots the comparison of data insertion cost between two
versions of DIM on networks with density 20. In most cases, DIM on HLR
has smaller insertion cost than DIM on GPSR. In the latter, the
existence of empty zones [41] forces DIM to rely on GPSR's perimeter mode
to find the owner, resulting in a longer delivery path and a higher cost.
At a lower density (10 neighbors per node), this performance advantage
decreases. In HLR, paths become longer. However, DIM relies less on
perimeter mode than GHT (see above), hence DIM is less affected by a
decrease in density.
Figure 4.7: Comparison of average insertion cost in DIM on networks with
density 20 (x-axis: network size, 25 to 200 nodes; y-axis: average number of
messages sent/received per insertion; series: DIM+HLR, DIM+GPSR)
Figure 4.8: Comparison of average query cost in DIM on networks with density
20
Finally, Figure 4.8 compares the query delivery cost between two ver-
sions of DIM on networks with density 20. Here again, we see that DIM
on HLR outperforms DIM over GPSR. There are two contributors to this.
Figure 4.9: Comparison of average query cost in DIM on networks with density
10
One is, as before, that DIM on GPSR encounters many more perime-
ter traversals in discovering empty zones. The other is a more subtle
point that has to do with the way the data-locality preserving hashes
for the two schemes work. In DIM over GPSR, with 3 or higher dimen-
sional data, a query hyper-rectangle may actually be split across two
nodes that are far apart physically. However, in DIM over HLR, the query
hyper-rectangle owned by an area is always enclosed within the hyper-
rectangle belonging to the parent area. Thus, DIM over HLR preserves
data-locality more than DIM over GPSR, explaining the performance im-
provement. At a lower density as shown in Figure 4.9, the performance
difference is a little more, since some of the performance advantages
come from the data-locality properties of HLR.
Figure 4.10: Average number of routing table changes under single node failure
on networks with size 50, 100, 150, 200 and density 20 (x-axis: number of
table changes till reconverge; y-axis: fraction of nodes; series: network
sizes 50, 100, 150, 200).
Dynamics. Finally, we address the important question of HLR
performance under network dynamics. Specifically, we are interested in HLR
overhead caused by the failure of a single node. In our experiment, we
sequentially fail and recover each node in the network, waiting long enough
for the network to re-converge between node failures. Our two metrics
for HLR performance are: a) the average number of routing table changes
caused by a single node failure, and b) the number of routing messages
sent until the network converges after a single failure. For each network
size, we computed these metrics over five topology instances.
Figure 4.10 plots the distribution of routing table changes. This gure
shows how well HLR localizes the effect of failure, an important consid-
eration for wireless sensor networks. On average, more than 90% nodes
are unaffected by a node failure! Only when nodes at the boundary of
[Plot omitted. x-axis: number of control packets sent till reconverge
(0 to 40); y-axis: fraction of nodes (0 to 1); one curve per network
size: 50, 100, 150, 200.]
Figure 4.11: Average number of control packets to re-converge under single
node failure on networks with size 50, 100, 150, 200 and density 20.
top-level areas fail do we see that some nodes change their routing
tables several times before convergence. Even then, the magnitude of
these changes is relatively small, even for networks of 200 nodes.
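The per-node distribution plotted in Figure 4.10 can be tabulated from raw
counts along the following lines. This is a minimal sketch in Python, not
the code used for the thesis experiments: the function names and the sample
data are hypothetical, and it assumes each node simply reports how many
routing table changes it saw before the network re-converged.

```python
from collections import Counter


def fraction_by_change_count(changes_per_node):
    """Map each observed change count to the fraction of nodes that saw it."""
    total = len(changes_per_node)
    counts = Counter(changes_per_node)
    return {k: counts[k] / total for k in sorted(counts)}


def fraction_unaffected(changes_per_node):
    """Fraction of nodes that saw zero routing table changes."""
    return fraction_by_change_count(changes_per_node).get(0, 0.0)


# Hypothetical sample: 20 nodes, 18 of which saw no change after a failure.
sample = [0] * 18 + [1, 3]
print(fraction_by_change_count(sample))
print(fraction_unaffected(sample))
```

Averaging these fractions over all single-node failures and all topology
instances would give one curve per network size.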
Figure 4.11 plots the distribution of extra overhead caused by a single
node failure. Again, this value is noticeably small. In most cases, the
vast majority of the nodes are completely unaffected by a single failure
and see no routing traffic at all for a failure. In the most egregious
cases, some nodes see about 20 messages while the routing protocol
converges. From our perspective, this is highly encouraging; the impact
of dynamics is very local, which is one of the bigger selling points of
HLR.
[Diagram omitted. Modules shown: Routing Table Management; Anycast,
Multicast, Unicast; HLR-Core; Hash-lookup; Locality Preserving Hashing;
MAC; TinyDiffusion NeighborList Module. Edge labels: Use, Update.]
Figure 4.12: HLR Software Architecture
Figure 4.13: HLR Experiment Topology
4.6 Implementation
We have implemented the HLR routing protocols and most of the routing
and rendezvous primitives on Berkeley motes. Figure 4.12 shows the
software architecture of our implementation. The NeighborList module of
TinyDiffusion exports a filtered send and filtered receive interface
which filters out bad quality links, including asymmetric links and
fragile links. On top of this, as discussed in Section 4.3, we currently
use a simple hop count as our path metric.
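The combination of link filtering and a hop-count metric can be illustrated
as follows. This is a sketch in Python under stated assumptions, not the
TinyDiffusion NeighborList code: the per-direction quality values, the 0.8
threshold, and both function names are hypothetical. A link is treated as
usable only when both directions meet the threshold, which drops asymmetric
and fragile links, and hop counts are then computed by breadth-first search
over the surviving links.

```python
from collections import deque


def usable_links(quality, threshold=0.8):
    """Keep a directed link only if both directions meet the threshold."""
    good = set()
    for (a, b), q in quality.items():
        if q >= threshold and quality.get((b, a), 0.0) >= threshold:
            good.add((a, b))
    return good


def hop_counts(links, source):
    """Breadth-first search: hop count from source to every reachable node."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, []).append(b)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


# Hypothetical delivery ratios per direction; A<->C is asymmetric.
quality = {
    ("A", "B"): 0.90, ("B", "A"): 0.95,
    ("B", "C"): 0.90, ("C", "B"): 0.85,
    ("A", "C"): 0.90, ("C", "A"): 0.30,
}
print(hop_counts(usable_links(quality), "A"))
```

In this example the direct A-C link is rejected as asymmetric, so C is
reached from A in two hops through B.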
The HLR core module implements the core algorithm of HLR, which
constructs and maintains the routing table. The routing table management
module helps organize the routing table by effective destination area in
order to enable efficient route processing. The routing primitive module
implements unicast, area multicast and area anycast, while the
rendezvous primitives module implements the hashing lookup function and
the locality-preserving hashing function. As we have described in
Section 4.4, the hashing lookup function and locality-preserving hashing
make use of the routing primitives for data delivery.
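One plausible way to organize a routing table by destination area is a
longest-prefix lookup over hierarchical identifiers. The sketch below is in
Python and is an illustration only, not the mote implementation: it assumes
that an area identifier is a tuple of child indices from the root of the
hierarchy, and the table contents and next-hop names are hypothetical.

```python
def longest_prefix_route(table, dest):
    """Return the next hop for the most specific area that contains dest."""
    for length in range(len(dest), -1, -1):
        prefix = dest[:length]
        if prefix in table:
            return table[prefix]
    return None


# Hypothetical table: the empty tuple stands for the whole network.
table = {
    (): "n0",       # default route
    (1,): "n4",     # top-level area 1
    (1, 2): "n7",   # sub-area 2 of area 1
}
print(longest_prefix_route(table, (1, 2, 3)))
print(longest_prefix_route(table, (0, 5)))
```

Keying entries by area keeps one entry per destination area rather than
one per destination node.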
Figure 4.14: HLR Experiment Result with the topology shown in Figure 4.13
As a proof-of-concept, we ran HLR on a network of 10 Mica-2 motes.
The topology for our experiment is shown in Figure 4.13, where the mote
transmission power has been reduced in order to create a multi-hop
network. We let HLR run for four hours. Figure 4.14 gives the number of
routing table changes during that four-hour interval. As we can see, in
the worst case a mote saw seven routing table changes over four hours,
indicating that the routing tables might be expected to be quite stable
in realistic deployments. However, many aspects of HLR need to be
verified in the real world: partition recovery, better path metrics,
and dynamics in larger deployments.
4.7 Summary
In this chapter, we have described a pragmatic routing layer for sensor
networks. This layer is built upon the observation that many sensor
network nodes will be assigned hierarchical location identifiers. We
described the design of HLR, a routing protocol that constructs scalable
routing tables. Using HLR, it is possible to implement several routing
primitives for data-centric routing and storage. Our results indicate
that HLR performs well and contains the effects of network dynamics.
Conclusions and Future Work
In this thesis, we have discussed four techniques for improving
information delivery efficiency at various levels with various
efficiency requirements.
We first examined a utility-based sensor selection approach which
enables sensor network applications to express utilities associated with
retrieving data from sets of sensors. We studied the feasibility of
determining, subject to network lifetime constraints, the sequence of
sets whose data has the maximum total utility. The algorithm to select
sensors based on utility will naturally give the optimal routing
selection together with the fractional flows on each path. Therefore,
this approach achieves energy efficiency at all three layers.
We explored fractional and integral variants of the utility-based sensor
selection problem, as well as single-set variants thereof, on three
important classes of utility functions. We found that many variants are
NP-hard, and some hard to approximate. On the other hand, submodular
functions can be optimized efficiently, and an important subclass of
supermodular functions admits a fractional solution via solving an LP,
and an O(log n)-approximation when nodes are constrained to send the
same amount of data. Finally, we showed that geometric utilities can be
cast into a penalty framework for which we are able to prove preliminary
hardness results.
Then we considered the problem of energy-efficient broadcasting. We
considered realistic wireless networks with obstacles and established a
theoretical lower bound on the approximation ratio of the energy cost
for broadcasting. Under this more general network model with obstacles
and an energy cost model, we showed that no polynomial-time algorithm
can achieve an approximation ratio better than O(log N), unless P=NP.
We developed and presented a broadcasting algorithm, called GBA, and
proved that this algorithm guarantees an O(log N) approximation ratio.
Through extensive simulations, we showed that the GBA algorithm
performs quite well.
Next, we considered the problem of capacity efficiency at the transport
layer. We presented the design and evaluation of QCRA, a quasi-static
centralized rate assignment scheme for fair and efficient rate
allocation in WSN. QCRA includes a novel light-weight rate assignment
heuristic to compute fair and efficient rates, and a rate adaptation
mechanism to address longer time-scale wireless link quality variation.
Using extensive experiments, we showed that QCRA performs well on
networks of up to 40 nodes. QCRA is the first work to evaluate the
feasibility of centralized rate control in wireless sensor networks.
Finally, we studied the problem of designing a scalable routing
protocol for large-scale sensor networks where data-centric in-network
processing is necessary. We described a pragmatic routing layer for
sensor networks. This layer is built upon the observation that many
sensor network nodes will be assigned hierarchical location identifiers.
We described the design of HLR, a routing protocol that constructs
scalable routing tables. Using HLR, it is possible to implement several
routing primitives for data-centric routing and storage. Our results
indicate that HLR performs well and contains the effects of network
dynamics.
Future Work
The problem of efficient information delivery meeting various efficiency
requirements in sensor networks is complex. The work presented in this
thesis is mostly at the research stage. Much remains to be done to make
each idea presented in this thesis truly practical.
First, the utility-based sensor selection framework enables a uniform
way to manage network lifetime. The utility-based framework is a very
natural way of expressing the true goal of a sensor network application,
and being able to select (approximately) optimal schedules for practical
classes of utility functions is crucial in making the best use of a
deployment. One important direction for future work is to characterize
more generally the classes of utility functions that would be useful for
sensor network applications. Utility-based sensor selection not only
gives the sensor set to be tasked, but also specifies multiple routing
paths and the fraction of flows on each specific routing path. Finding
an efficient way to distribute the decision back into the network is
another direction for future work.
Second, in theory, GBA achieves the best possible approximation ratio.
One way of implementing GBA in wireless sensor networks is through
source routing, i.e., the broadcast packet carries the paths together
with the data. However, the size of the path would be nearly linear in
the size of the network. Therefore, it is important to find an efficient
way of carrying the broadcasting path. The feasibility of centralized
broadcasting is yet to be determined through the design and
implementation of a dissemination scheme for the broadcasting path,
which constitutes the major future work.
Next, a few steps remain before we can realize a truly practical
quasi-static centralized allocation as discussed in QCRA (Chapter 3):
accurate link quality measurements, low-overhead mechanisms to collect
topology information and distribute rate information, and fast rate
adjustments in the face of node failures. We intend to explore these as
a part of future work.
Finally, the performance of HLR in a large-scale sensor network remains
to be evaluated through large-scale experiments in the context of real
applications.
188
References
[1] The Extensible Sensing System.
[2] Anish Arora, Rajiv Ramnath, Emre Ertin, and Prasun Sinha et.
al. Exscal: Elements of an extreme scale wireless sensor network.
In 11th IEEE International Conference on Embedded and Real-Time
Computing Systems and Applications (RTCSA), 2005.
[3] S. Arora, P. Raghavan, and S. Rao. Approximation schemes for eu-
clidean k-medians and related problems. In Proc. 30th ACM Symp.
on Theory of Computing, 1998.
[4] Sangeeta Bhattacharya, Guoliang Xing, Chenyang Lu, Gruia-
Catalin Roman, Brandon Harris, and Octav Chipara. Dynamic
wake-up and topology maintenance protocols with spatiotemporal
guarantees. In International Conference on Information Processing
in Sensor Networks (IPSN), Los Angeles, CA, 2005.
[5] N. Bulusu, D. Estrin, L. Girod, and J. Heidemann. Scalable Co-
ordination for Wireless Sensor Networks: Self-conguring Localiza-
tion Systems. In Proceedings of the Sixth International Symposium
on Communication Theory and Applications (ISCTA '01), Ambleside,
Lake District, UK, July 2001.
[6] John Byers and Gabriel Nasser. Utility-based decision-making in
wireless sensor networks. Technical Report 2000-014, 1 2000.
[7] Mario Cagalj, Jean-Pierre Hubaux, and Christian Enz. Minimum-
energy broadcast in all-wireless networks: Np-completeness and
distribution issues. Technical report No. IC/2002/021, March 2002.
[8] A. Cerpa and D. Estrin. ASCENT: Adaptive self-conguring sensor
networks topologies. In Proceedings of the IEEE Infocom, New York,
USA, June 2002. IEEE.
189
[9] Jae-Hwan Chang and Leandros Tassiulas. Fast approximate algo-
rithms for maximum lifetime routing in wireless ad-hoc networks.
In NETWORKING '00: Proceedings of the IFIP-TC6 / European Com-
mission International Conference on Broadband Communications,
High Performance Networking, and Performance of Communication
Networks, pages 702–713. Springer-Verlag, 2000.
[10] B. Chen, K. Jamieson, H. Balakrishnan, and R. Morris. Span: An
Energy-efcient Coordination Algorithm for Topology Maintenance
in Ad Hoc Wireless Networks. In Proceedings of the IEEE Infocom,
pages 85–96. IEEE Computer Society Press, 2001.
[11] D. De Couto, D. Aguayo, J. Bicket, and R. Morris. A High-
Throughput Path Metric for Multi-Hop Wireless Routing. In Pro-
ceedings of the 9th Annual International on Mobile Computing and
Networking, San Deigo, CA, September 2003.
[12] T.V. Dam and K. Langendoen. An Adaptive Energy-Efcient MAC
Protocol for Wireless Sensor Networks. In Proceedings of the ACM
Conference on Embedded Networked Sensor Systems, Los Angeles,
CA, November 2003.
[13] S. De, C. Qiao, and H. und. Meshed multipath routing with selec-
tive forwarding: An efcient strategy in wireless sensor networks.
Elsevier Computer Communications Journal, (4), 2003.
[14] S. Deering, D. Estrin, D. Farinacci, V. Jacobson, C. Liu, and L. Wei.
The pim architecture for wide-area multicast routing. IEEE/ACM
Transactions on Networking, 4(2):153–162, 1996.
[15] Qunfeng Dong. Maximizing system lifetime in wireless sensor net-
works. In IPSN, 2005.
[16] Cheng Tien Ee and Ruzena Bajcsy. Congestion control and fair-
ness for many-to-one routing in sensor networks. In SenSys '04:
Proceedings of the 2nd international conference on Embedded net-
worked sensor systems, pages 148–161, New York, NY, USA, 2004.
ACM Press.
[17] N. Ehsan and M. Liu. Minimizing power consumption in sensor
networks with quality of service requirement. In to appear in An-
nual Allerton Confercence on Communications, Control and Comput-
ing (Allerton 2005), Allerton, IL, 2005.
[18] T. Feder and D. Greene. Optimal algorithms for approximate clus-
tering. In Proc. 20th ACM Symp. on Theory of Computing, 1988.
190
[19] Laura Feeny and Martin Nilsson. Investigating the energy consump-
tion of a wireless network interface in an ad hoc networking envi-
ronment. In Proceedings INFOCOM 2001, Anchorage, Alaska.
[20] U. Feige. A threshold of log n for approximating set cover. Journal of
the ACM, (4):634–652, 1998.
[21] U. Feige, G. Kortsarz, and D. Peleg. The dense k-subgraph problem.
In Proc. 25th ACM Symp. on Theory of Computing, 1993.
[22] U. Feige and M. Seltser. On the densest k-subgraph problem. Tech-
nical report, The Weizmann Institute, Rehovot, 1997.
[23] D. Ganesan, R. Govindan, S. Shenker, and D. Estrin. Highly-
resilient, energy-efcient multipath routing in wireless sensor net-
works, 2001.
[24] Ramesh Govindan, Eddie Kohlerand Deborah Estrin, Fang Bian,
Krishna Chintalapudi, Om Gnawali, Sumit Rangwala, Ramakrishna
Gummadi, and Thanos Stathopoulos. Tenet: An Architecture for
Tiered Embedded Networks. Technical report, November 10 2005.
[25] Ramesh Govindan, Eddie Kohler, Deborah Estrin, Fang Bian, Kr-
ishna Chintalapudi, Om Gnawali, Sumit Rangwala, Ramakrishna
Gummadi, and Thanos Stathopoulos. Tenet: An architecture for
tiered embedded networks. CENS Technical Report 56, 2005.
[26] Zygmunt J. Haas, Marc R. Pearlman, and Prince Samar. The Zone
Routing Protocol (ZRP) for Ad Hoc Networks . Technical report, In-
ternet draft, July 2002.
[27] T. He, C. Huang, B. M. Blum, J. A. Stankovic, and T. Abdelzaher.
Range-free localization schemes for large scale sensor networks. In
Proceedings of the 9th Annual International Conference on Mobile
Computing and Networking, San Diego, CA, September 2003.
[28] W. Heinzelman, A. Chandrakasan, and H. Balakrishnan. Energy-
Efcient Communication Protocol for Wireless Microsensor Net-
works. In Proceedings of the 33rd Hawaii International Conference
on System Sciences (HICSS '00), January 2000.
[29] Bret Hull, Kyle Jamieson, and Hari Balakrishnan. Mitigating con-
gestion in wireless sensor networks. In SenSys '04: Proceedings
of the 2nd international conference on Embedded networked sensor
systems, pages 134–147, New York, NY, USA, 2004. ACM Press.
191
[30] C. Intanagonwiwat, R. Govindan, and D. Estrin. Directed Diffusion:
A Scalable and Robust Communication Paradigm for Sensor Net-
works. In Proceedings of the Sixth Annual ACM/IEEE International
Conference on Mobile Computing and Networking (Mobicom 2000),
Boston, MA, August 2000.
[31] Volkan Isler and Ruzena Bajcsy. The sensor selection problem
for bounded uncertainty sensing models. In IPSN, pages 151–158,
2005.
[32] Kamal Jain, Jitendra Padhye, Venkata N. Padmanabhan, and Lili
Qiu. Impact of interference on multi-hop wireless network perfor-
mance. In MobiCom '03: Proceedings of the 9th annual international
conference on Mobile computing and networking, pages 66–80, New
York, NY, USA, 2003. ACM Press.
[33] C. E. Jones, K. M. Sivalingam, P. Agrawal, and J. C. Chen. A survey
of energy efcient network protocols for wireless networks. Wireless
Networks, 7:343–358, 2001.
[34] I. Kang and R. Poovendran. Maximizing static network lifetime
of wireless broadcast adhoc networks. In IEEE ICC, Anchorage,
Alaska, 2003.
[35] B. Karp and H. T. Kung. GPSR: Greedy Perimeter Stateless Routing
for Wireless Networks. In Proceedings of the Sixth Annual ACM/IEEE
International Conference on Mobile Computing and Networking (Mo-
bicom 2000), Boston, MA, August 2000.
[36] F. Kelly, A. Maulloo, and D. Tan. Rate control in communication
networks: shadow prices, proportional fairness and stability. In
Journal of the Operational Research Society, volume 49, 1998.
[37] L. Kleinrock and F. Kamoun. Hierarchical Routing for Large Net-
works: Performance Evaluation and Optimization. Computer Net-
works, 1:155–174, 1977.
[38] Murali Kodialam and Thyaga Nandagopal. Characterizing achiev-
able rates in multi-hop wireless networks: the joint routing and
scheduling problem. In MobiCom '03: Proceedings of the 9th annual
international conference on Mobile computing and networking, pages
42–54, New York, NY, USA, 2003. ACM Press.
[39] K. Langendoen and N. Reijers. Distributed localization in wireless
sensor networks: a quantitative comparison. Computer Networks:
192
The International Journal of Computer and Telecommunications Net-
working, Special issue: Wireless sensor networks, pages 499–518,
November 2003.
[40] Qun Li, Javed Aslam, and Daniela Rus. Online power-aware rout-
ing in wireless ad-hoc networks. In MobiCom '01: Proceedings of the
7th annual international conference on Mobile computing and net-
working, pages 97–107. ACM Press, 2001.
[41] X. Li, Y. J. Kim, R. Govindan, and W. Hong. Multi-dimensional
Range Queries in Sensor Networks. In Proceedings of the ACM Con-
ference on Embedded Networked Sensor Systems, Los Angeles, CA,
November 2003.
[42] Weifa Liang. Constructing minimum-energy broadcast trees in wire-
less ad hoc networks. In Mobihoc 2002, 2002.
[43] Mingyan Liu and Chih fan Hsin. Network coverage using low duty-
cycled sensors: Random and coordinated sleep algorithms. In IPSN,
2004.
[44] S. Madden, M. Franklin, J. Hellerstein, and W. Hong. The Design of
an Acquisitional Query Processor for Sensor Networks. In Proceed-
ings of ACM SIGCMOD, San Diego, CA, June 2003.
[45] S. Madden, M. J. Franklin, J. M. Hellerstein, and W. Hong. TAG:
a Tiny AGregation Service for Ad-Hoc Sensor Networks. In Proceed-
ings of 5th Annual Symposium on Operating Systems Design and
Implementation (OSDI), Boston, MA, December 2002.
[46] Geoff Mainland, David C. Parkes, and Matt Welsh. Decentralized,
adaptive resource allocation for sensor networks. In In Proceedings
of the 2nd USENIX/ACM Symposium on Networked Systems Design
and Implementation (NSDI), 2005.
[47] K. Mechitov, W. Y. Kim, G. Agha, and T. Nagayama. High-Frequency
Distributed Sensing for Structure Monitoring. In Proc. First Intl.
Workshop on Networked Sensing Systems (INSS 04), 2004.
[48] J. Moy. RFC 2328: OSPF Version 2, April 1998.
[49] Thyagarajan Nandagopal, Tae-Eun Kim, Xia Gao, and Vaduvur
Bharghavan. Achieving mac layer fairness in wireless packet net-
works. In MobiCom '00: Proceedings of the 6th annual international
conference on Mobile computing and networking, pages 87–98, New
York, NY, USA, 2000. ACM Press.
193
[50] J. Newsome and D. Song. GEM: Graph Embedding for Routing and
Data-Centric Storage in Sensor Networks without Geographic In-
formation. In Proceedings of the ACM Conference on Embedded Net-
worked Sensor Systems, Los Angeles, CA, November 2003.
[51] Jeongyeup Paek, Krishna Chintalapudi, John Cafferey, Ramesh
Govindan, and Sami Masri. A wireless sensor network for struc-
tural health monitoring: Performance and experience. In Proceed-
ings of the Second IEEE Workshop on Embedded Networked Sensors
(EmNetS-II),, Syndney, Australia, May 2005.
[52] Jeongyeup Paek, Krishna Chintalapudi, John Cafferey, Ramesh
Govindan, and Sami Masri. A wireless sensor network for struc-
tural health monitoring: Performance and experience. In Proceed-
ings of the Second IEEE Workshop on Embedded Networked Sensors
(EmNetS-II), May 2005.
[53] C. Papadimitriou. Worst-case and probabilistic analysis of a geo-
metric location problem. SIAM Journal on Computing, 10:542–557,
1981.
[54] G. Pei and M. Gerla. Mobility management for hierarchical wireless
networks. Mobile Networks and Applications, 6(4):331–337, August
2001.
[55] C. Perkins and P. Bhagwat. Highly Dynamic Destination-Sequenced
Distance-Vector Routing (DSDV) for Mobile Computers. In Proceed-
ings of the ACM SIGCOMM, London, UK, August 1994.
[56] Joseph Polastre, Jason Hill, and David Culler. Versatile Low Power
Media Access for Wireless Sensor Networks. In Proceedings of the
ACM Conference on Embedded Networked Sensor Systems, Balti-
more, MD, November 2004.
[57] R. Ramanathan and Martha Steenstrup. Hierarchically-organized,
multihop mobile wireless networks for quality-of-service support.
Mobile Networks and Applications, 3(1):101–119, June 1998.
[58] Sumit Rangwala, Ramakrishna Gummadi, and Ramesh Govindan.
Interference-aware fair rate control in wireless sensor networks. In
Proceedings of the ACM SIGCOMM 2006, 2006.
[59] A. Rao, S. Ratnasamy, C. Papadimitriou, S. Shenker, and I. Stoica.
Geographic Routing wihtout Location Information. In Proceedings of
the 9th Annual International on Mobile Computing and Networking,
San Deigo, CA, September 2003.
194
[60] T. S. Rappaport. Wireless Communication. Prentice-Hall, 1996.
[61] S. Ratnasamy, P. Francis, M. Handley, R. Karp, and S. Shenker. A
Scalable Content-Addressable Network. In Proceedings of the ACM
SIGCOMM, San Diego, CA, August 2001.
[62] S. Ratnasamy, B. Karp, L. Yin, F. Yu, D. Estrin, R. Govindan,
and S. Shenker. GHT: A Geographic Hash Table for Data-Centric
Storage. In Proceedings of the First ACM International Workshop on
Wireless Sensor Networks and Applications, Atlanta, GA, September
2002.
[63] Ran Raz and Shmuel Safra. A sub-constant error-probability low-
degree test and a sub-constant error-probability pcp characteriza-
tion of np. Proceedings of the twenty-ninth annual ACM symposium
on Theory of computing, pages 475–484, 1997.
[64] Y. Rekhter and T. Li. RFC 1771: A border gateway protocol 4 (BGP-
4), March 1995.
[65] R.Ramanathan and R. Hain. Topology control of multihop wireless
networks using transmit power adjustment. In In Proceedings Info-
com 2000, June 2000.
[66] Yogesh Sankarasubramaniam, Ozgur B. Akan, and Ian F. Aky-
ildiz. Event-to-sink reliable transport in wireless sensor networks.
IEEE/ACM Trans. Netw., 13(5):1003–1016, 2005.
[67] Curt Schurgers and Mani B. Srivastava. Energy efcient routing in
wireless sensor networks. In MILCOM, pages 357–361, Vienna, VA,
2001.
[68] Curt Schurgers, Vlasios Tsiatsis, Saurabh Ganeriwal, and Mani B.
Srivastava. Topology management for sensor networks: Exploiting
latency and density. In Symposium on Mobile Ad Hoc Networking
and Computing (MobiHoc), pages 135–145, Lausanne, CH, 2002.
[69] Scott Shenker. Fundamental design issues for the future internet.
September 1995.
[70] Scott Shenker, Sylvia Ratnasamy, Brad Karp, Ramesh Govindan,
and Deborah Estrin. Data-centric storage in sensornets. SIGCOMM
Comput. Commun. Rev., 33(1):137–142, 2003.
[71] S. Singh and C. S. Raghavendra. Pamas: Power aware multi-access
protocol with signaling for ad hoc networks. ACM Computer Com-
munications Review, July 1998.
195
[72] S. Singh, C.S. Raghavendra, and J. Stepanek. Power-aware broad-
casting in mobile ad hoc networks. In Proceedings of PIMRC'99 Con-
ference, September 1999.
[73] S. Singh, M. Woo, and C.S. Raghavendra. Power-aware routing in
mobile ad hoc networks. In Proceedings ACM/IEEE Mobicom'98,
October 1998.
[74] K. M. Sivalingam, M. B. Srivastava, and P. Agrawal. Low power link
and access protocols for wireless multimedia networks. In Proceed-
ings IEEE Vehicular Technology Conference VTC'97, May 1997.
[75] I. Stoica, D. Adkins, S. Zhuang, S. Shenker, and S. Surana. Internet
indirection infrastructure. In Proceedings of the ACM SIGCOMM,
Pittsburgh, PA, August 2002.
[76] I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and H Balakrish-
nan. Chord: A Scalable Peer-To-Peer Lookup Service for Internet
Applications. In Proceedings of the ACM SIGCOMM, San Diego, CA,
August 2001.
[77] I. Stojmenovic and X. Lin. Power-aware localized routing in wireless
networks. In Proceedings of IEEE IPDPS, pages 371–376, Cancun,
Mexico, May 2000.
[78] Ivan Stojmenovic, Mahtab Seddigh, and Jovisa Zunic. Dominat-
ing sets and neighbor elimination-based broadcasting algorithms in
wireless networks. IEEE Transactions on parallel and distributed
systems, 12(12), December 2001.
[79] Robert Szewczyk, Alan Mainwaring, Joseph Polastre, John Ander-
son, and David Culler. An analysis of a large scale habitat mon-
itoring application. In SenSys '04: Proceedings of the 2nd inter-
national conference on Embedded networked sensor systems, pages
214–226. ACM Press, 2004.
[80] Robert Szewczyk, Eric Osterweil, Joseph Polastre, Michael Hamil-
ton, Alan Mainwaring, and Deborah Estrin. Habitat monitoring with
sensor networks. Commun. ACM, 47(6):34–40, 2004.
[81] P. F. Tsuchiya. The Landmark Hierarchy: A New Hierarchy for Rout-
ing in Very Large Networks. In Proceedings of the ACM SIGCOMM,
Stanford, CA, August 1988.
[82] A. Vetta. Nash equilibria in competitive societies with applications
to facility location, trafc routing, and auctions. In FOCS, 2002.
196
[83] Chieh-Yih Wan, Shane B. Eisenman, and Andrew T. Campbell.
Coda: congestion detection and avoidance in sensor networks. In
SenSys '03: Proceedings of the 1st international conference on Em-
bedded networked sensor systems, pages 266–279, New York, NY,
USA, 2003. ACM Press.
[84] Peng-Jun Wan, Gruia Calinescu, Xiangyang Li, and Ophir Frieder.
Minimum-energy broadcast routing in static ad-hoc wireless net-
works. In Proceedings of the IEEE Infocom 2001 Conference, An-
chorage, Alaska USA, April 2001.
[85] Jeffrey E. Wieselthier, Gam D. Nguyen, and Anthony Ephremides.
On the construction of energy-efcient broadcast and multicast
trees in wireless networks. In INFOCOM (2), pages 585–594, 2000.
[86] A. Woo, T. Tong, and D. Culler. Taming the Underlying Challenges
of Reliable Multihop Routing in Sensor Networks. In Proceedings of
the ACM Conference on Embedded Networked Sensor Systems, Los
Angeles, CA, November 2003.
[87] Alec Woo and David E. Culler. A transmission control scheme for
media access in sensor networks. In MobiCom '01: Proceedings of
the 7th annual international conference on Mobile computing and net-
working, pages 221–235, New York, NY, USA, 2001. ACM Press.
[88] Guoliang Xing, Xiaorui Wang, Yuanfang Zhang, Chenyang Lu,
Robert Pless, and Christopher Gill. Integrated coverage and con-
nectivity conguration for energy conservation in sensor networks.
2005.
[89] Y. Xu, J. Heidemann, and D. Estrin. Geography-informed energy
conservation for ad hoc routing. In Proceedings of the ACM/IEEE In-
ternational Conference on Mobile Computing and Networking, pages
70–84, Rome, Italy, July 2001. ACM.
[90] W. Ye, J. Heidemann, and D. Estrin. An energy-efcient mac proto-
col for wireless sensor networks. In Proceedings of the IEEE Infocom,
June 2002.
[91] Y. Yu, R. Govindan, and D. Estrin. Geographical and Energy Aware
Routing: A Recursive Data Dissemination Protocol for Wireless Sen-
sor Networks. Technical Report UCLA/CSD-TR-01-0023, UCLA
Computer Science Department, May 2001.
197
Abstract
Efficient information delivery in sensor networks is one major research area in the community. In one dimension, efficiency for information delivery in sensor networks includes energy efficiency, capacity efficiency and scalability. In another dimension, the efficiency can be achieved at different network layers: application layer, transport layer or routing layer. These two dimensions describe a design space for efficient information delivery in wireless sensor networks.
Asset Metadata
Creator: Bian, Fang (author)
Core Title: Techniques for efficient information transfer in sensor networks
School: Viterbi School of Engineering
Degree: Doctor of Philosophy
Degree Program: Computer Science
Publication Date: 07/25/2007
Defense Date: 04/04/2007
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: OAI-PMH Harvest, sensor networks, wireless
Language: English
Advisor: Govindan, Ramesh (committee chair), [illegible] (committee member), N[illegible], C.S. (committee member)
Creator Email: bian@enl.usc.edu
Permanent Link (DOI): https://doi.org/10.25549/usctheses-m674
Unique identifier: UC1126821
Identifier: etd-Bian-20070725 (filename), usctheses-m40 (legacy collection record id), usctheses-c127-524267 (legacy record id), usctheses-m674 (legacy record id)
Legacy Identifier: etd-Bian-20070725.pdf
Dmrecord: 524267
Document Type: Dissertation
Rights: Bian, Fang
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Repository Name: Libraries, University of Southern California
Repository Location: Los Angeles, California
Repository Email: cisadmin@lib.usc.edu
Tags: sensor networks, wireless