ROBUST ROUTING AND ENERGY MANAGEMENT
IN WIRELESS SENSOR NETWORKS
by
Om Prakash Dev Gnawali
A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(COMPUTER SCIENCE)
December 2009
Copyright 2009 Om Prakash Dev Gnawali
Acknowledgements
Thanks to Prof. Ramesh Govindan for being an excellent advisor, the opportunity to get involved
in a wide range of projects, and guidance throughout those projects.
I would like to acknowledge my collaborators. Tenet was a joint work with Jeongyeup Paek,
Ben Greenstein, August Joki, Ki-Young Jang, Marcos Vieira, Prof. Eddie Kohler, and Prof.
Deborah Estrin. AEM was a joint work with Jongkeun Na. Work on the study of interaction
between ARQ, blacklisting, and metric-based routing was a joint work with Mark Yarvis and Prof.
John Heidemann. CTP was a joint work with Rodrigo Fonseca, Kyle Jamieson, and Prof. Philip
Levis.
Thanks to my mom, dad, family, and friends for their support, without which this thesis could
not have been written.
Table of Contents
Acknowledgements
List of Tables
List of Figures
Abstract
Chapter 1: Introduction
1.1 Scope and Assumptions
1.2 Problem Statement
1.3 Thesis Overview and Contributions
1.4 Thesis Organization
Chapter 2: Literature Review
2.1 Energy management
2.2 Approaches to Reliable Routing
2.3 Network protocols
Chapter 3: Background: An Architecture for Tiered Sensor Networks (Tenet)
3.1 Motivation
3.2 The Design Principles
3.3 The Networking Subsystem
Chapter 4: Application-informed Energy Management
4.1 Overview
4.2 Goals
4.3 Design
4.4 Implementation
4.5 Evaluation
4.6 Conclusion
Chapter 5: Approaches to Making Sensor Network Routing Reliable
5.1 Introduction
5.2 Detailed Approaches to Improve Path Reliability
5.3 Evaluation
5.4 Conclusion
Chapter 6: Collection Tree Protocol
6.1 Goals and Overview
6.2 Challenges
6.3 Design Elements
6.4 Link Estimation
6.5 Routing
6.6 Forwarding
6.7 Evaluation
6.8 Conclusion
Chapter 7: Conclusion
Bibliography
List of Tables
2.1 Summary of sensor network requirements and supporting energy management techniques.
2.2 Survey of networking research related to collection routing.
6.1 List of testbeds on which CTP was evaluated.
6.2 CTP’s performance across the testbeds
6.3 Results on how channel selection affects CTP’s performance on Tutornet.
6.4 Detailed Motelab results on how link layer settings affect CTP’s topology and performance.
List of Figures
1.1 Techniques used by AEM and CTP to handle dynamics across the protocol stack.
3.1 A tiered sensor network with masters and motes.
4.1 AEM’s frames and schedules
4.2 Radio states across nodes with LPL during an experiment.
4.3 AEM control and data frames during an experiment.
4.4 AEM and LPL duty-cycle with varying workloads.
4.5 Task response latency for different workloads with AEM and LPL.
4.6 Latency distribution with AEM and LPL.
4.7 Duty-cycle distribution with AEM and LPL.
4.8 Adaptation of AEM’s duty-cycle to workload
4.9 AEM’s performance with node failure.
4.10 Distribution of AEM’s control and data frame sizes.
4.11 LPL experiment results for varying sleep interval on Tutornet.
5.1 Reliability vs. distance profile used in the simulation to compare protocol combinations.
5.2 Evaluation of the impact of retransmission alone, and in combination with blacklisting and reliability metrics, on protocol performance
5.3 The effectiveness of blacklisting and routing metrics on protocol performance without any retransmission
5.4 Comparison of leading protocol combinations with retransmission, blacklisting, and ETX
5.5 The impact of density and blacklisting on protocol performance.
5.6 Map of the testbed on which performance of different protocol combinations was evaluated
5.7 Network connectivity on the testbed on which performance of different protocol combinations was evaluated
5.8 Comparison of testbed and simulation results for various protocol configurations
6.1 Lack of tight correlation between packet reception rate and physical measurements of the channel such as RSSI.
6.2 Link estimator designs seen in wireless networks
6.3 Comparison of routing topology formed using different link estimators.
6.4 An example that shows MultiHopLQI selecting a poor quality link for packet transmission.
6.5 Interface used by the four-bit estimator to handle information flow into and out of the link estimator.
6.6 The four-bit estimator combines the link quality estimate based on unicast and broadcast traffic
6.7 Impact of each bit of information on the performance of the four-bit link estimator
6.8 Comparison of node depth and cost for MultiHopLQI and the four-bit estimator with different transmit power
6.9 Comparison of delivery ratios for MultiHopLQI and the four-bit estimator
6.10 The CTP routing frame format
6.11 The CTP forwarding path.
6.12 The CTP data frame format.
6.13 Comparison of delivery ratio for CTP and MultiHopLQI
6.14 Comparison of cost for CTP and MultiHopLQI
6.15 The evolution of the control overhead for CTP and MultiHopLQI
6.16 CTP control overhead for selected nodes during an experiment in which new nodes were introduced.
6.17 Inconsistent routing state and resulting control overhead
6.18 Comparison of robustness of CTP and MultiHopLQI against node failures
6.19 Determining inter-packet delay for CC2420 radio
6.20 Survey of wireless environment of Tutornet using a spectrum analyzer
6.21 Energy consumption for CTP and MultiHopLQI for 100ms-300ms sleep intervals.
Abstract
Wireless sensor networks are deployed in a wide range of applications such as habitat,
environmental, and structural health monitoring. These networks have several sources of
dynamics across the wireless protocol stack, which make it challenging to build robust systems.
Sensor network systems must adapt to these dynamics while remaining efficient. There is an
emerging class of dynamically-taskable sensor networks that can support multiple concurrent
applications. In such networks, of which Tenet is an example, the protocols must adapt to the
changing application workload. The dynamics at the physical, link, and network layers directly
affect routing and forwarding protocol performance. The quality of the wireless channel or the
network topology can change rapidly even in a stationary network. Routing protocols must
quickly detect these dynamics and take steps to maintain robust and efficient routing paths.
We use Application-informed Energy Management (AEM) and Collection Tree Protocol (CTP) as
case studies to examine the types of dynamics present in sensor networks, and we present
techniques that make these protocols robust and energy-efficient. AEM can adapt to application
dynamics at the time scale of application injection, and to link layer dynamics at the time
scale of link quality changes, to provide robust and efficient duty-cycles in sensor networks.
CTP can adapt to changes in the physical and link layers at the time scale of link coherence and
packet transmissions, and to changes in network topology at the time scale of network events
such as node introduction, deletion, and topology changes. Our findings suggest that agility at
the time scale of the dynamics is critical to the design of robust and energy-efficient protocols.
Chapter 1
Introduction
Sensor networks are used to observe the physical world. They have been deployed to monitor
habitats [76], micro-climates [105], agricultural fields [61], permafrost [9], volcanoes [110],
bridges [57], and buildings [115]. The system infrastructure for these sensor networks is
optimized for particular deployments and is typically not usable across applications. As sensor
networks mature, we observe a small set of standard components, such as link layer and routing
protocols, being used across deployments. In the future, we expect sensor network components,
both hardware and software, to become more standardized and common across applications, and
the platforms to evolve into more reusable, dynamically taskable software systems that support
multiple concurrent applications and reliable end-to-end data delivery. Maté [65] and Tenet [36]
have some of these properties.
This emerging class of sensor networks faces dynamics across all layers of the protocol stack.
While robustness is necessary to perform well despite these dynamics, energy efficiency is another
core requirement that any sensor network protocol must satisfy to be viable in practice. Most
sensor nodes are powered by batteries, which necessitates energy-efficient design of sensor
network protocols. Otherwise, the limited energy available at a node can be depleted rapidly,
resulting in a short network lifespan. Sensor networks must be both robust and energy-efficient
before they can be widely deployed in the real world.
There is a large body of research on making sensor networks energy-efficient at different layers
of the wireless sensor network protocol stack. Achieving energy efficiency while simultaneously
ensuring robust and reliable operation and data delivery is challenging because robustness and
reliability often imply techniques that have high communication overhead (a large number of
retransmissions, stronger consistency, alternate routes, etc.), which precludes energy efficiency.
1.1 Scope and Assumptions
In this thesis, we consider the type of sensor networks that are deployed to monitor
environments, habitats, and structures. We assume that these deployments consist of one or more
powerful nodes providing storage, processing, and communication services, and less powerful
wireless sensor nodes deployed at scale to observe the physical world. We have found this
architectural assumption reflected in most deployments of sensor network applications.
We assume the wireless sensor network stack consists of the physical, link, network, transport,
and application layers. We use this model of the protocol stack in our exploration of techniques
that help protocols achieve robustness and energy-efficiency. A deviation from this model of
wireless protocol stack is rare in data gathering sensor network applications.
We assume that application workload changes dynamically in response to task insertion and
termination during runtime in the class of sensor networks that we consider in this thesis. We
perform a study of application-level dynamics and techniques to adapt sensor network protocols
to these dynamics in this thesis.
1.2 Problem Statement
There is a large body of work on making sensor networks robust, reliable, and energy efficient.
The dynamics present across the wireless sensor network protocol stack make achieving these goals
challenging. Physical, link, and network layer dynamics can occur in sensor networks due to
node addition and deletion, and temporal changes in wireless channel quality. In reusable sensor
networks that support the execution of multiple applications, potentially concurrently, dynamics
at the application layer are caused by the changing application profile over time.
There is evidence that when sensor network protocols do not directly take these dynamics
into account, at the appropriate time-scale, they sacrifice robustness and energy efficiency.
The performance of state-of-the-art duty-cycling and routing protocols illustrates this
phenomenon.
A radio sleep scheduling protocol whose settings do not adapt to changes in the traffic load in
the network tends to sacrifice energy efficiency across some range of workloads. Careful tuning
of low-power MAC protocol parameters to a specific traffic profile results in a highly efficient
network. However, if the traffic profile of an application is unknown (when network load depends
on the data values, for instance) or dynamic over a large range, optimizing the system (MAC or
other energy saving protocol) for one or a few traffic profiles leads to inefficient energy
expenditure over part of the traffic range required by the application. We have experimentally
verified these inefficiencies.
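The cost of a fixed setting can be illustrated with a deliberately simplified model of low-power listening. The sketch below assumes that a node checks the channel once per sleep interval and that each packet it sends is preceded by a preamble as long as the sleep interval. The function names, the 5 ms check time, and the traffic rates are illustrative assumptions, not measurements from this thesis.

```python
import math

def lpl_radio_on_fraction(sleep_interval_s, packets_per_s, check_time_s=0.005):
    """Approximate fraction of time the radio is on (simplified, assumed model)."""
    # Periodic channel checks: one check of check_time_s per sleep interval.
    listen = check_time_s / sleep_interval_s
    # Each sent packet pays for a preamble as long as the sleep interval.
    transmit = packets_per_s * sleep_interval_s
    return min(1.0, listen + transmit)

def best_sleep_interval(packets_per_s, check_time_s=0.005):
    """Sleep interval T minimizing check_time/T + rate*T, i.e. sqrt(check_time/rate)."""
    return math.sqrt(check_time_s / packets_per_s)

if __name__ == "__main__":
    light, heavy = 0.01, 1.0               # packets per second (hypothetical)
    t_light = best_sleep_interval(light)   # interval tuned for the light load
    t_heavy = best_sleep_interval(heavy)   # interval tuned for the heavy load
    print("tuned for light, run at light:", lpl_radio_on_fraction(t_light, light))
    print("tuned for light, run at heavy:", lpl_radio_on_fraction(t_light, heavy))
    print("tuned for heavy, run at heavy:", lpl_radio_on_fraction(t_heavy, heavy))
```

Under these assumptions, an interval tuned for the light workload keeps the radio on several times longer at the heavy workload than an interval tuned for that workload, which is the kind of inefficiency described above.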
To address these problems, there have been attempts to make energy-efficient MAC protocols
adapt to the traffic workload in a live network. However, due to layer boundaries, these MAC
protocols do not have complete information regarding the precise data packet timing across the
range of possible workloads and data packet timing patterns. Although it is feasible to extend
these MAC protocols to use data timing information, there is no prior work on statically
analyzing sensor network applications to infer parameters that describe data timing patterns and
seed these sleep scheduling adaptation algorithms at runtime. The need for such adaptation over
a much wider range of data timing patterns becomes especially critical in dynamically-taskable
sensor networks such as Tenet.
A routing protocol that does not adapt to the dynamics across the protocol stack at the
respective time-scales tends to be inefficient, fragile, and perform poorly. Per-hop
retransmissions, blacklisting, and reliability metric-based routing are three prevalent
techniques used in wireless routing, including wireless sensor network routing research, to
improve routing performance against these dynamics. There has been no systematic study of the
relative effectiveness of these techniques in improving routing performance. The implication of
combining these techniques for routing performance is also not clear.
The layers of a wireless sensor network protocol stack often exhibit dynamics at different
time-scales. The physical and link layers can experience changes in their state on the order of
a few tens of milliseconds, while network layer state changes due to node introduction, deletion,
or routing path changes can occur in the range of a few seconds to tens of seconds. Collection
protocols, for instance, estimate link quality using periodic beacons sent every tens of seconds,
while the link quality can change much more rapidly. Similarly, route updates are sent at a fixed
frequency regardless of the dynamics of the network. These collection protocols have high control
overhead even when there is no data to send. Despite the high control overhead, these protocols
are not agile and tend to suffer from poor performance. We have experimentally verified that
these problems exist in state-of-the-art collection protocols such as MintRoute [114] and
MultiHopLQI [96].
Thus, sensor network protocols often do not directly address and adapt to the dynamics present
at different layers of the protocol stack at appropriate time-scales. They are tuned to work best
within a narrow range of scenarios and perform poorly outside that range. We have found that
making these sensor networks robust and energy-efficient across a wide range of environments,
settings, and applications is an open problem.
1.3 Thesis Overview and Contributions
The underlying theme of our research is incorporating mechanisms into sensor network routing
and duty-cycling protocols to address the dynamics present in the network and make sensor
network systems robust and efficient. To this end, we have designed a radio duty-cycling protocol
called AEM [37], studied different mechanisms in use to make sensor network routing reliable [38],
and designed a routing protocol called CTP [32, 35].
[Figure 1.1: Techniques used by AEM and CTP to handle dynamics across the protocol stack. Columns: Physical, Link, Network, Transport, Application. Rows: AEM (elastic frames, workload-adaptive duty-cycle); CTP (agile link estimation, adaptive beaconing, datapath validation).]
Our design of AEM, a coordinated radio duty-cycling protocol, uses two mechanisms to adapt
to application-level dynamics. First, it statically analyzes the sensor network application to
infer its traffic profile. Second, it tailors the network-wide radio duty-cycle to the estimated
traffic profile. To accommodate packet retransmissions necessitated by link transients, AEM uses
elastic frames, which can flexibly extend the radio wake-up times.
We studied the effectiveness of per-hop retransmissions, blacklisting, and reliability metric-
based routing, and combinations of these techniques to improve routing performance and con-
cluded that ETX-based routing with retransmissions is an effective way to ensure high end-to-end
path reliability.
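A small calculation illustrates why a reliability metric and retransmissions complement each other. The sketch below assumes independent packet losses, perfect acknowledgements, and a per-hop ETX of 1/PRR, where PRR is the packet reception rate of the link; the function names and the two example paths are hypothetical and are not taken from the experiments in this thesis.

```python
def hop_delivery(prr, max_tx):
    """Probability that one hop delivers a packet within max_tx attempts."""
    return 1.0 - (1.0 - prr) ** max_tx

def path_delivery(prrs, max_tx):
    """End-to-end delivery probability: every hop must succeed."""
    p = 1.0
    for prr in prrs:
        p *= hop_delivery(prr, max_tx)
    return p

def path_etx(prrs):
    """Expected transmissions along the path, assuming lossless ACKs."""
    return sum(1.0 / prr for prr in prrs)

if __name__ == "__main__":
    short_lossy = [0.5, 0.5]        # two hops over poor links
    long_good = [0.9, 0.9, 0.9]     # three hops over good links
    for name, path in (("short/lossy", short_lossy), ("long/good", long_good)):
        print(name,
              "ETX:", path_etx(path),
              "delivery, 1 attempt:", path_delivery(path, 1),
              "delivery, 3 attempts:", path_delivery(path, 3))
```

In this example a hop-count metric would choose the two-hop path, while the ETX sum prefers the three-hop path over good links, and allowing a few attempts per hop raises its end-to-end delivery probability further.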
We designed CTP, a robust, reliable, and efficient ETX-based routing protocol for sensor
networks. CTP advances the design of wireless sensor network protocols using three key ideas.
First, it uses an agile and accurate link estimator that draws on information from the physical,
link, and network layers. Second, it uses adaptive beacons to minimize control overhead without
sacrificing agility to topology changes. Finally, it uses datapath validation of routing
inconsistencies to quickly detect and repair loops. Together, these mechanisms allow CTP to adapt
to a wide range of physical, link, and network layer dynamics.
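One way to picture adaptive beaconing is a timer whose interval grows while the topology is stable and shrinks back when something changes. The sketch below is an illustration under assumed rules (doubling after each beacon, reset on a topology event) and assumed interval bounds; the class name and parameters are hypothetical, and the actual CTP timer is described in Chapter 6.

```python
class AdaptiveBeaconTimer:
    """Beacon interval that backs off when stable and resets on change (assumed policy)."""

    def __init__(self, min_interval_s=0.125, max_interval_s=512.0):
        self.min_interval_s = min_interval_s
        self.max_interval_s = max_interval_s
        self.interval_s = min_interval_s

    def on_beacon_sent(self):
        """Back off: double the interval after each beacon, up to the cap."""
        self.interval_s = min(self.interval_s * 2, self.max_interval_s)
        return self.interval_s

    def on_topology_event(self):
        """Become agile again: return to the shortest interval."""
        self.interval_s = self.min_interval_s
        return self.interval_s

if __name__ == "__main__":
    timer = AdaptiveBeaconTimer()
    for _ in range(5):
        print("next beacon in", timer.on_beacon_sent(), "s")
    print("after a topology event:", timer.on_topology_event(), "s")
```

With such a policy a stable network sends very few beacons, while a node that observes a change can advertise again within a fraction of a second.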
Figure 1.1 summarizes the techniques used by AEM and CTP to handle the dynamics present
in the network.
Our study of the sources of dynamics in the wireless sensor network protocol stack, and of the
techniques to adapt protocols to these dynamics, uses two protocol case studies: AEM and CTP.
However, the findings from this thesis have broader implications beyond these protocols.
One of the insights from this thesis is that different techniques are required to detect the
dynamics at different protocol layers, and protocols must use appropriate techniques to adapt to
them depending on the source of dynamics and protocol goals. At the application layer, for
instance, it is possible to analyze and predict the application traffic profile over time. The
technique used in AEM to infer the traffic profile from application analysis is directly
applicable to other linear data-flow programming languages. In addition, the mechanism is
conceptually applicable to systems such as TinyDB [74] and other query-response sensor network
systems.
Our study of the effectiveness of per-hop retransmissions, blacklisting, and metric-based
routing identified that a combination of per-hop retransmissions and ETX metric-based routing is
highly effective in improving routing performance. We note that many wireless network protocols
proposed and in use today, including CTP, use ETX-based routing and retransmissions and make
only limited use of aggressive blacklisting. This trend in routing protocol design can be
considered a validation of our findings.
Our study of the dynamics encountered by routing protocols across the protocol stack, and
the design of mechanisms that make routing protocols robust, reliable, and efficient, distills
three general principles in wireless routing protocol design: agile link estimation, rapid
data-path validation, and adaptive beaconing. These principles are applicable to other routing
protocols (such as point-to-point routing) in multi-hop wireless mesh networks.
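The idea behind data-path validation can be sketched with a simple cost check. The sketch below assumes that every data packet carries the sender's current cost to the root and that a receiver flags an inconsistency when that cost is not strictly greater than its own; the function name, node names, and cost values are hypothetical and serve only as an illustration.

```python
def first_inconsistency(origin, next_hop, cost, root, max_hops=32):
    """Follow a packet from origin and return the (sender, receiver) pair
    at which the cost check fails, or None if the packet reaches the root."""
    node = origin
    for _ in range(max_hops):
        if node == root:
            return None
        parent = next_hop[node]
        # The packet header carries cost[node]; the receiver compares it
        # with its own cost. Costs must strictly decrease toward the root.
        if cost[node] <= cost[parent]:
            return (node, parent)
        node = parent
    return None

if __name__ == "__main__":
    cost = {"R": 0, "A": 1, "B": 2, "C": 3}
    consistent = {"C": "B", "B": "A", "A": "R"}
    # Stale state: A lost its route to R and now forwards to C, forming a loop.
    looped = {"C": "B", "B": "A", "A": "C"}
    print(first_inconsistency("C", consistent, cost, "R"))
    print(first_inconsistency("C", looped, cost, "R"))
```

In the looped topology the check fails on the first packet that A forwards to C, so the inconsistency is noticed at the time scale of data traffic rather than at the time scale of periodic routing updates.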
In this thesis, we make the following contributions:
• Design and implementation of AEM, a radio duty-cycling protocol that adapts to
application-level dynamics by analyzing the traffic profile of sensor network applications,
statically computing the radio sleep schedules in the network, and duty-cycling a
dynamically-taskable network.
• Study of the effectiveness of per-hop retransmissions, blacklisting, and minimum-cost
routing, and combinations of these techniques, in improving reliability in sensor network
routing.
• Design and implementation of CTP, a collection routing protocol that consists of an
accurate and agile link estimator, techniques to minimize routing control overhead, and a
mechanism to quickly detect and repair routing loops.
1.4 Thesis Organization
In Chapter 2, we review related work and describe the essentials of techniques used in sensor
networks to make them more robust, reliable, and energy efficient.
In Chapter 3, we describe the Tenet sensor network architecture. We present the rationale
behind the architecture, its design principles, and its key components to provide the context
for our discussion of AEM and CTP. The two protocol case studies, AEM and CTP, are designed
to integrate with the Tenet architecture.
In Chapter 4, we present AEM. We describe how AEM adapts to application-level dynamics,
integrates with the Tenet architecture, and makes Tenet applications energy-efficient with a
robust coordinated duty-cycling protocol.
In Chapter 5, we study the effectiveness of per-hop retransmissions, blacklisting, and metric-
based routing to improve routing protocol performance.
In Chapter 6, we present CTP. We describe the key mechanisms that enable CTP to quickly
adapt to the dynamics in the physical, link, and network layers while remaining efficient.
Specifically, we describe the three key features that make it an agile, robust, and efficient
routing protocol: link estimation that combines physical, link, and network layer information,
data-path validation of the topology, and adaptive beaconing.
In Chapter 7, we summarize our work and discuss future work.
Chapter 2
Literature Review
This thesis uses a radio duty-cycling protocol and a routing protocol as case studies to
understand the sources of dynamics in the network and to propose techniques that enable these
protocols to adapt to those dynamics while remaining energy efficient. Our survey of related work
therefore focuses on energy-efficient sleep scheduling and routing protocols. We also review
literature on the prevalent approaches to improving the reliability of wireless routing, because
our goal in this thesis is not only to understand the techniques that make routing reliable but
also to design and implement a highly robust and reliable routing protocol. We note that most
work in network-wide energy management, such as sleep scheduling, redundancy control, and energy
efficient routing, relies on the existence of in-node energy saving mechanisms, enabled by
modular hardware design [43, 87, 51] that allows different components to be turned off using
software, and by energy conscious operating systems [66].
2.1 Energy management
Energy management in wireless sensor networks has seen extensive research interest. Much of
this research focuses on reducing communication energy by duty-cycling the radios or by reducing
the volume of information transmitted. Since the literature in this area is vast, we focus only
on the most closely relevant pieces of work. Table 2.1 summarizes the requirements of
dynamically-taskable sensor networks and the supporting energy management systems and protocols.

Requirements                 Supporting Systems
Low duty-cycle               S-MAC [118], B-MAC [86], LPL [79], SCP-MAC [119], WiseMAC [31],
                             X-MAC [12], Koala [80], Dozer [14], FPS [44], AEM
Handle network transients    LPL, AEM
Application-informed         AEM
Multiple applications        FPS, AEM
Time synchronization         LPL, FPS, AEM
Reliable transport           Koala, AEM
Table 2.1: Summary of sensor network requirements and supporting energy management techniques.
Coordinated sleep scheduling uses synchronized radio sleep-wakeup across all nodes, or subsets
thereof. The main challenge in coordinated scheduling is efficiently maintaining time
synchronization among the nodes so that the schedules remain synchronized. S-MAC [118] forms
clusters of nodes with the same sleep and wakeup schedules. The fast path algorithm [69] can
augment S-MAC to set up and coordinate wakeup schedules for data delivery using explicit
signaling along a path, and uses timeouts to tear down the schedules. T-MAC [106], also a
synchronized wakeup and sleep scheduler, attempts to shorten the wakeup schedule by clustering
all the transmissions at the beginning of the schedule so that the radio can be turned off after
the transmissions are complete, even before the end of the wakeup period. SCP-MAC [119] polls
for a signal on the channel in a small, globally synchronized time window, thereby making long
preambles unnecessary while achieving even lower duty cycles. When a query is injected into a
sensor network running TinyDB [13], the nodes typically take the first few seconds of each epoch
to generate a sensor reading and forward it to the base station. TinyDB nodes turn off after a
set number of seconds of each epoch and wake up again at the beginning of the next epoch. Each
time a network is on and is used to collect sensor readings, AppSleep [88] tells the network the
expected time of the next batch of sensor readings. The nodes can go to a low power state until
that wakeup time. Dozer [14] and FPS [44], on the other hand, use staggered sleep schedules such
that a parent and its children in a tree routing topology negotiate sleep schedules. Koala [80]
coordinates its sleep schedules for bulk transfer. SMACS [95] assigns communication slots between
neighbors using a distributed protocol. Because the schedules are not globally coordinated,
random frequency assignments are made for each link-pair to ensure low contention during
communication. Our work on AEM falls into this class of systems, but differs from all of them in
that AEM predicts the traffic profile of applications and adapts its coordinated sleep schedules
to the applications that run on the sensor network. In contrast to the fast path algorithm, AEM
statically computes the beginning and end of the sleep schedules. Unlike these other systems,
because of the design features that make it energy-efficient and robust, AEM can support
concurrent application tasks and dynamic re-tasking, tailor its sleep scheduling to application
needs, and support system services such as task dissemination, dynamic routing, and time
synchronization.
One of the challenges in the design of a coordinated sleep scheduling protocol that does not
globally synchronize the schedules is enabling data delivery without excessive end-to-end delays.
Some coordinated sleep scheduling MAC protocols use schedule adaptation techniques to either
adjust the lengths of existing wakeup schedules or introduce new schedules on-the-fly to accom-
modate traffic dynamics. When nodes running S-MAC overhear data packet transmissions, they
can use adaptive listening to turn their radio on outside the normal wakeup and sleep sched-
ule in anticipation of those packets, thereby reducing data delivery latency. SCP-MAC can use
adaptive channel polling by causing the receiver to poll the channel a few times after its regular
polling schedule giving the sender an opportunity to send more packets before the next regularly
scheduled polling schedule. T-MAC can use future request to send to extend the wake up time
of the receiver to avoid the early sleeping problem in multi-hop forwarding. These sleep schedule
adaptation techniques are known to be critical to achieving acceptable performance in coordinated
duty-cycling protocols. In a similar spirit, AEM adapts to link-layer dynamics using elastic
schedules, which can increase in size to accommodate link-layer retransmissions. Unlike the
adaptation techniques surveyed above, AEM’s elastic schedules are not intended to reduce
data-delivery latency or to increase throughput; they are intended to accommodate retransmissions,
and are demonstrated to do so successfully. AEM uses a different mechanism, globally synchronized
schedules derived from static analysis of the application, to reduce packet delivery latency.
Uncoordinated sleep scheduling establishes sleep schedules without any explicit receive and
transmit slot assignment protocol. The receiver node in Piconet [8], for example, broadcasts
its address before putting its radio in the listen mode. The transmitter can send its message
immediately after it hears the broadcast message. Other examples of such receiver-initiated
schemes are LPP [80] and RI-MAC [100], in which the transmitter waits for a beacon or signal
from the receiver to start data transmission. B-MAC [86] takes the opposite approach called low
power listening: the transmitter transmits a preamble longer than the sleep period of the nodes.
In this transmitter-driven scheme, the receiver is guaranteed to wake up during the transmission of
the preamble, which it can acknowledge to initiate message reception. A WiseMAC [31] transmitter
keeps track of the channel polling schedule of the receivers so that the transmitter can transmit
the packet at the time when the receiver radio is turned on. If there are more packets to send
to the same receiver, the transmitter can request the receiver to keep its radio on by setting the
frame pending bit on the link header. There is one key difference between this mechanism that
extends the schedule and the quiet-time in AEM: WiseMAC schedule extension is triggered by
a request sent by the transmitter while such an extension in AEM schedule is triggered by any
packet transmission or reception on the link. In STEM [92], all the nodes transition to the receive
mode periodically; the transmitter must send packets continuously until the receiver wakes up
and can initiate communication. X-MAC [12] and BoX-MAC [79] use a packet train to simulate
long preambles to provide low power listening-like capability in 802.15.4 packet radio such as
Chipcon’s CC2420. We perform a quantitative comparison with a transmitter-initiated scheme,
LPL [86], and show that AEM can achieve a 6-fold lower duty-cycle under certain workloads.
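The low power listening guarantee described above (a preamble longer than the receiver sleep period always overlaps one channel check) can be illustrated with a short sketch. The function name, its parameters, and the millisecond values are assumptions chosen for illustration and are not taken from B-MAC.

```python
def receiver_hears_preamble(preamble_ms, sleep_period_ms, wake_offset_ms, send_time_ms):
    """Return True if a periodic channel check falls inside the preamble.

    The receiver checks the channel at wake_offset_ms, then every
    sleep_period_ms after that. The transmitter starts its preamble at
    send_time_ms and keeps it on the air for preamble_ms.
    """
    # Time since the most recent channel check when the preamble starts.
    elapsed = (send_time_ms - wake_offset_ms) % sleep_period_ms
    # Time until the next channel check (zero if a check coincides with the start).
    wait = (sleep_period_ms - elapsed) % sleep_period_ms
    return wait < preamble_ms


# A preamble longer than the sleep period is heard for every wake-up phase.
always_heard = all(
    receiver_hears_preamble(101, 100, offset, 0) for offset in range(100)
)
# A short preamble can be missed entirely.
short_preamble_heard = receiver_hears_preamble(20, 100, 30, 0)
```

The sketch ignores the duration of the channel check itself and clock drift, both of which a real MAC must account for.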
Application-informed energy management has also been explored in various contexts. In wire-
less and mobile networks, there are proposals to let the applications configure the power
management policies based on their communication requirement. Kravets and Krishnan propose allowing
applications to provide information about the expected communication pattern and negotiate
communication schedule with the base station in a wireless network [60]. STPM [5] provides an
API for the applications to express their communication pattern and latency requirements so that
the system can decide if switching to low power state is desirable and results in energy saving.
Re-designing OS abstractions that permit applications to achieve energy-efficient I/O is shown to
be highly effective in TinyOS [59]. AEM does not explicitly require applications to specify energy
requirements. It instead performs static analysis to infer application workload and to adapt the
network-wide radio duty-cycling to the workload.
Hierarchical power management uses a second, often low-power, radio or channel to perform
sleep-wakeup coordination. For example, the protocols can use the low power radio to negotiate
or infer communication schedules and turn on the high power data communication radio only
during communication schedules. PAMAS [94] uses a low power signaling channel to participate
in RTS/CTS signaling and turns on the data radio to receive only the packets directed to the
node thereby avoiding overhearing packets destined to the neighbors. Paging Channel [2], wake-
on-wireless [93], and STEM [92] explore the idea of waking up a node using a low power radio
and turning on the main radio only for the duration of the data communication. Pering et al.
describe a dual-radio pathway that uses a lower power radio to communicate schedules and parameters
for a higher power radio and avoids the energy-draining resource discovery protocols of more complex
radios in a platform with a three-radio hierarchy [84]. Interface cache [123], an example of using
an interface hierarchy for energy savings, allows a user to interact with a low power interface at human
speeds and sends a composed command to a high power node which turns on for long enough
to receive and process the command from the cache. AEM is not designed to take advantage of
hierarchical radios or interfaces itself, but it allows the underlying networking and MAC protocols
to use such mechanisms. Unlike these systems, AEM does not require specialized hardware to work.
Redundancy control is another area of research that attempts to save energy, hence maximize
the network lifetime, by turning off nodes that are unnecessary to maintain the desired
communication or sensing fidelity. SPAN [18] wakes up a node if it detects that two of its neighbors
cannot communicate with each other through the existing set of awake nodes, thus increasing the network
connectivity. GAF [116] considers nodes that are geographically close interchangeable for routing
and turns off nodes that are not necessary to maintain the routing fidelity. In ASCENT [16], a
node monitors the message loss rate and network density and participates in the network if the
message loss rate is high and/or network density is low thereby providing path diversity, which can
increase data delivery performance. Carbunar et al. propose an algorithm to determine if sensor
coverage of an area is redundant [15]. These redundant sensors do not increase the sensing fidelity
and can be turned off. AEM does not attempt to control communication or sensing redundancy.
Energy-aware routing is also an active area of research that attempts to find the most energy-
efficient paths in the network for data forwarding. These protocols typically use a routing metric
that reflects the energy cost of data delivery along a path to pick the energy efficient paths for data
delivery. The MT [114] metric, similar to the ETX metric [23], uses the number of transmissions
and retransmissions necessary to successfully deliver packets along a path as routing metric.
Younis et al. propose an energy metric but compute the routes centrally on the gateways and
assign one of these roles to the nodes in the network: sensing, relaying, sensing and relaying, and
inactive [120]. Chang and Tassiulas propose a routing metric that takes into account transmission
cost, initial energy, and residual energy on the nodes to balance the energy use in the network [17].
AEM does not have a routing component but it does not rule out the use of energy aware routing
metrics by the underlying networking subsystem.
Tangential but complementary to AEM is work on platforms such as Trio [27] and Prometheus [50].
These platforms are designed to carefully select the energy source depending on the computational
and communication goals at a given time and harvest energy from the environment to enable long
term deployment.
Finally, AEM communication schedules are qualitatively similar to periodic task execution
in real-time systems [63] but with one key difference: the end of communication schedule, or
communication deadline, is not fixed.
2.2 Approaches to Reliable Routing
Here we review work related to the three approaches to improve routing performance: per-hop
retransmission, blacklisting, and reliability metrics for path selection.
Per-hop Retransmission
Per-hop retransmission is probably the oldest known technique to increase delivery rate on a link
that is selected for data delivery [10]. MAC level retransmission (or Automatic Repeat Request,
ARQ) is used in the 802.11 MAC to improve delivery rate. ARQ and Forward Error Correcting
codes have been proposed to improve per-hop reliability [70]. The goal of per-hop retransmission
is to improve the quality of a given link, thus improving whatever path has been selected.
Blacklisting
Blacklisting eliminates unreliable, lossy, or asymmetric links from the set of links used for com-
munication. Lundgren et al. identified gray zones and suggested that links in this zone should be
ignored (blacklisted) while making routing decisions [73]. Ignoring fading links [19], only using
links with good signal strength [26], and using power at which a message is received to identify
good links and using only good links for routing are different ways in which researchers have
implemented blacklisting. Ultimately the goal of blacklisting is to avoid poor-quality links, thus
forcing selection of reasonable paths.
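As a minimal sketch of blacklisting, the snippet below removes links from the candidate set before any route selection takes place. Using a delivery-ratio threshold as the criterion is an assumption made here for illustration (the cited work uses criteria such as signal strength and fading), and the function name, node names, and ratios are hypothetical.

```python
def blacklist(links, threshold):
    """Keep only links whose delivery ratio is at least threshold in both directions.

    links maps a (node, neighbor) pair to (forward, reverse) delivery ratios.
    """
    return {
        pair: ratios
        for pair, ratios in links.items()
        if min(ratios) >= threshold
    }


links = {
    ("a", "b"): (0.95, 0.90),
    ("a", "c"): (0.90, 0.30),  # asymmetric link
    ("b", "c"): (0.40, 0.45),  # lossy link
}
usable = blacklist(links, threshold=0.5)
```

Taking the minimum of the two directions is what removes the asymmetric link; a forward-only test would keep it.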
Reliability Metrics in Routing
All routing protocols use some routing metric to select paths. If the routing metric is selected
to represent end-to-end reliability, the routing protocol can identify paths with high reliability.
De Couto et al. proposed an ETX (Expected number of transmissions) metric that considers
forward and backward reliabilities to identify high throughput paths in a network [23]. This work
focuses on maximizing throughput in 802.11b-based networks. Yarvis et al. proposed using the
minimum of forward and backward reliabilities as link metric and using that to find the most
reliable path but they note that this results in longer paths [117]. Awerbuch et al. proposed
minimizing the amount of time a packet uses the network (medium time metric) in a multi-rate
radio environment to maximize throughput [7]. Draves et al. proposed Per-hop Round Trip Time
and Per-hop Packet Pair Delay link-quality metrics but conclude that these metrics perform worse
than the ETX metric [25].
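To make the metrics above concrete, the following sketch computes ETX for a link as the reciprocal of the product of the forward and reverse delivery ratios, following De Couto et al. [23], and sums it along a path. It also shows the minimum-of-both-directions link metric. The function names and the example delivery ratios are illustrative assumptions.

```python
def link_etx(d_f, d_r):
    """ETX of a link from its forward (d_f) and reverse (d_r) delivery ratios."""
    if d_f <= 0 or d_r <= 0:
        return float("inf")  # unusable in at least one direction
    return 1.0 / (d_f * d_r)


def path_etx(links):
    """ETX of a path: the sum of the ETX values of its links."""
    return sum(link_etx(d_f, d_r) for d_f, d_r in links)


def min_reliability(d_f, d_r):
    """Link metric that takes the minimum of forward and reverse reliability."""
    return min(d_f, d_r)


one_lossy_hop = path_etx([(0.5, 0.5)])            # 1 / 0.25
two_clean_hops = path_etx([(1.0, 1.0), (1.0, 1.0)])
```

In this example the two-hop path has the lower ETX, which is the case hop count alone would get wrong.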
Interactions of these approaches
We know of only one work [114] that considers how retransmission, blacklisting, and reliability
metric help improve data delivery performance. Their study examines the effect of blacklisting
(with only two thresholds) on shortest path routing. Furthermore, their study does not examine
packet delivery in the absence of per-hop retransmissions. Generally speaking, our work on
the comparison of the three techniques to improve reliability more systematically explores the
parameter space by comparing different combinations of these three techniques and quantifying
the effect of each technique in the combination. For example, we explore the impact of blacklisting
(with five different thresholds) on ETX and the ML metric as well. We also develop a deeper
understanding of the reliability metric by studying it at different resolutions. Finally, we control
the number of retransmissions as a parameter to our routing protocol; this provides an additional
insight on how to achieve a desired delivery rate and delivery cost.
Research area    Protocols and Systems
Control plane    ETX [23], MT [114], MultiHopLQI [96], EAR [54], LOF [121], ENT [25]
Data plane       Flush [56], RMST [99], CODA [108], Fusion [45], IFRC [89], RCRT [81]
Table 2.2: Survey of networking research related to collection routing.
2.3 Network protocols
Robust and energy-efficient network protocol design is an area of extensive research in sensor
networks. In sensor networks, collection is the dominant form of traffic, in which all the nodes
in the network send packets to one or a small number of origin or roots. Collection protocols
typically consist of a link estimator, a routing protocol, and a forwarding engine or protocol. These
protocols must quickly adapt to the dynamics in the network to achieve energy-efficiency and
robustness in collection data delivery.
The design of CTP, our protocol case study, draws on the work on collection layers such
as MultiHopLQI [96] and MintRoute [114], and the tradeoff they introduce between cost and
responsiveness. Table 2.2 summarizes related research in this area.
The link estimators of many routing layers use PRR estimators based directly on packet loss
statistics. MintRoute [114] uses a windowed mean with EWMA (WMEWMA) estimator to estimate
link quality. In separate work, Woo and Culler also consider numerous other estimators [112]
such as EWMA, flip-flop EWMA (discussed below), and windowed moving average, and find them
inferior to WMEWMA. Like MintRoute, we use WMEWMA estimators, but unlike MintRoute
and other known prior work, we separate link estimates based on data and beacon traffic, yielding
a mechanism similar to the flip-flop EWMA estimator discussed below. A number of routing
layers [96, 104] and studies [97] propose link quality estimates from the PHY as PRR estimators,
and we describe the pitfalls of these approaches and how well they perform against CTP in prior
work [32].
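A minimal sketch of a WMEWMA-style estimator follows. It computes the packet reception ratio over a fixed window and then folds that window mean into an exponentially weighted moving average. The class name, the window size, and the weight alpha are assumptions for illustration; this is one common formulation and not the MintRoute or CTP implementation.

```python
class WMEWMA:
    """Windowed mean with EWMA over packet reception outcomes."""

    def __init__(self, window=5, alpha=0.6):
        self.window = window      # packets per window
        self.alpha = alpha        # weight given to the previous estimate
        self.estimate = None      # no estimate until the first window closes
        self._received = 0
        self._expected = 0

    def update(self, received):
        """Record one expected packet; received is True if it arrived."""
        self._expected += 1
        if received:
            self._received += 1
        if self._expected == self.window:
            mean = self._received / self._expected
            if self.estimate is None:
                self.estimate = mean
            else:
                self.estimate = self.alpha * self.estimate + (1 - self.alpha) * mean
            self._received = 0
            self._expected = 0
        return self.estimate


est = WMEWMA(window=4, alpha=0.5)
for outcome in [True, True, True, True]:
    first = est.update(outcome)
for outcome in [True, False, True, False]:
    second = est.update(outcome)
```

Keeping one such estimator for beacon traffic and another for data traffic would be one way to separate the two sources of estimates, as described above.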
The EAR link quality measurement framework [54] uses a combination of passive, active, and
cooperative techniques to measure link qualities in a mesh network. The CTP 4-bit link estima-
tor [32] adapts some of EAR’s techniques to CTP, namely passive and active link monitoring,
while also integrating information from the physical layer into the link estimator, a technique
not present in EAR. Further, while EAR divides its operation into “measurement” and “update”
cycles, CTP operates continuously, enabling it to respond quickly to changes in the RF
environment. Finally, CTP tackles a harder problem, since its link estimator must operate in
embedded networked sensors with space-constrained neighbor tables.
“Learn on the fly” [121] uses MAC latency until the successful transmission of a packet as its link
metric and, although it can be used with the ETX path metric, focuses on beaconless
geographic routing with static nodes. This work, like other proposals [32, 62, 40] and ours,
shares the same philosophy: using information from the link layer for accurate link estimation.
The CTP estimator, in addition to the information from the link layer, also uses information
from the physical and network layers using well-defined and narrow interfaces to the physical,
network, and data link layers. Our work borrows ideas from these proposals for link estimation
but CTP also proposes mechanisms to make collection efficient using networking techniques such
as inconsistency detection and control traffic timing.
Also in the domain of mesh wireless networks, Noble et al. [55] propose a Flip-Flop Filter
for link estimation. The flip-flop filter switches between an agile EWMA and a stable EWMA
depending on whether new data falls within three times the standard deviation of the sample data
(the 3σ rule). Woo and Culler [112] evaluate the EWMA flip-flop filter, and find that it does not
provide an advantage for the problem of link estimation in low-power wireless networks.
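The switching rule can be sketched as follows. This is a simplified illustration that follows the description above (report the stable EWMA while new samples stay within three standard deviations of the samples seen so far, otherwise report the agile EWMA). It is not the filter as published by Noble et al.; the class name and both weights are assumptions.

```python
import math


class FlipFlopFilter:
    """Switch between a stable and an agile EWMA using a 3-sigma test."""

    def __init__(self, agile_alpha=0.1, stable_alpha=0.9):
        # alpha weights the previous estimate, so a small alpha reacts quickly.
        self.agile_alpha = agile_alpha
        self.stable_alpha = stable_alpha
        self.agile = None
        self.stable = None
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations (Welford's method)

    def update(self, x):
        # Test x against the statistics of the samples seen before it.
        if self.n >= 2:
            sigma = math.sqrt(self.m2 / (self.n - 1))
            within = abs(x - self.mean) <= 3 * sigma
        else:
            within = True  # too few samples to estimate the deviation
        if self.agile is None:
            self.agile = self.stable = x
        else:
            self.agile = self.agile_alpha * self.agile + (1 - self.agile_alpha) * x
            self.stable = self.stable_alpha * self.stable + (1 - self.stable_alpha) * x
        # Update the running mean and deviation with the new sample.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return self.stable if within else self.agile
```

With steady input the filter reports the slow-moving estimate; a sample far outside the 3σ band makes it report the fast-moving one for that update.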
At a high level, adaptive beaconing and datapath validation combine elements of proactive
and reactive routing paradigms, proactively maintaining (at low cost) a rough approximation of
the best routing gradient, and making an effort to improve the paths data traffic traverses. Mesh
routing work has established that a good way of constructing routes in a wireless mesh network
is to minimize either the expected number of transmissions (ETX [23]) or a bandwidth-aware
function of the expected number of transmissions (ENT [25]) along a path [114], instead of simply
the number of hops a packet must travel along any path. CTP draws on mesh routing work, using
ETX as its routing metric.
Adaptive beaconing extends Trickle [67] to time its routing beacons. Using Trickle enables
quick discovery of new nodes and recovery from failures, while at the same time enabling long
beacon intervals when the network is stable. This approximates beaconless routing [121] in stable
and static networks without sacrificing agility or new node discovery.
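The interval behavior that adaptive beaconing inherits from Trickle can be sketched as below: the interval doubles each time it expires, up to a maximum, and returns to the minimum when the timer is reset. The class name, method names, and interval bounds are assumptions for illustration, and the sketch omits Trickle's suppression counter.

```python
import random


class TrickleTimer:
    """Beacon interval that grows while the network is stable."""

    def __init__(self, min_interval=1.0, max_interval=64.0, rng=None):
        self.min_interval = min_interval
        self.max_interval = max_interval
        self.interval = min_interval
        self._rng = rng if rng is not None else random.Random()

    def next_beacon_delay(self):
        """Pick a transmission time in the second half of the current interval."""
        return self._rng.uniform(self.interval / 2, self.interval)

    def interval_expired(self):
        """Double the interval, capped at the maximum."""
        self.interval = min(self.interval * 2, self.max_interval)

    def reset(self):
        """Return to the minimum interval, for example after a topology change."""
        self.interval = self.min_interval
```

Which events trigger a reset is a protocol design decision; the sketch leaves that to the caller.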
Other recent work in routing uses geographic location [58, 64], anchor beacons [33, 77], or
DHT-like mechanisms [1] to achieve highly scalable routing. Like the former geographic
routing proposals, CTP scales well with modestly increasing node density, but for a different
reason: CTP uses Trickle timers to adapt beaconing rate, while geographic routing requires nodes
to know their location. The latter anchor beacon-based and DHT-based approaches can introduce
stretch to routes, and are useful for collection on a larger scale than we consider in this thesis.
CTP also borrows from work on reliable sensornet transport protocols [56, 99, 90, 107] which
seek to maximize throughput by timing transmissions on a path such that a pipelining effect
occurs. These prior protocols motivated the CTP forwarding timer.
A large body of sensor network protocol work examines how to mitigate the congestion that
occurs when collection traffic concentrates in the vicinity of a sink. For example, CODA [108]
and Fusion [45] propose different ways of mitigating congestion by using channel occupancy sam-
pling [108] or a combination of backpressure and rate-limiting [45]. Ee et al. [28] and IFRC [89]
attempt to allocate a fair rate to every sender in the network. CTP facilitates the use of such
higher-layer protocols, but does not replace their functionality. CTP’s transmit timers prevent
self-interference by a single transmitter, but do not coordinate transmitters, leaving this to higher
layers.
Other congestion-control collection protocols take more unorthodox approaches. Siphon [109]
uses a small number of virtual sinks equipped with high-bandwidth wireless radios to deliver
data from congested regions of the network. Compared to Siphon, CTP does not rely on
high-bandwidth external radios, and is thus suitable for the most general case of sensornet-only
deployments. The Funneling MAC [4] is an example of a link-layer protocol that addresses congestion;
compared to CTP, it uses a combination of TDMA and CSMA to improve fidelity when traffic
becomes funneled near the collection sink. CTP, however, achieves high fidelity without the use of
more complex TDMA medium access control. Finally, we note Dozer [14], a proprietary collection
protocol running exclusively on Shockfish hardware, whose source code we could not obtain.
RAP [71] is a network architecture for real-time collection in wireless sensor networks. Unlike
other collection protocols, RAP attempts to deliver data to a sink within a time constraint, or
not at all: a different task as compared to collection. However, RAP uses similar mechanisms as
CTP, such as MAC priorities and queuing mechanisms.
Ee et al. [29] describe a decomposition for the network layer in sensor networks into forwarding
and routing modules, with a narrow interface between them. While our work agrees to a large
degree with that decomposition, we find that augmenting the interface between the two modules
to allow the data plane to trigger updates in the routing module provides a large benefit.
Chapter 3
Background: An Architecture for Tiered Sensor Networks
(Tenet)
Our study of network dynamics and the techniques to adapt to those dynamics to achieve robust-
ness and energy-efficiency is applicable to networks such as Tenet. These networks are primarily
designed to collect data from the network using a command-response communication pattern.
They also support multiple and concurrent tasks, and provide reliable data delivery and time
synchronization services. The protocol case studies in this thesis integrate with task dissemination,
routing, and time synchronization components of Tenet.
3.1 Motivation
Research in sensor networks has implicitly assumed an architectural principle that, in order to
conserve energy, it is necessary to perform application-specific or multi-node data fusion on
resource-constrained sensor nodes as close to the data sources as possible. Like active networks [103],
this allows arbitrary application logic to be placed in any network node. This principle has
governed the design of data-centric routing and storage schemes, sensor network databases, and
programming environments that promote active sensor networks. This principle can minimize
communication overhead but increases system complexity and reduces manageability. Systems
are hard to develop and debug since application writers must implement sophisticated
application-specific routing schemes, and algorithms for multi-node fusion, while contending with mote-tier
resource constraints.
Tenet’s design is motivated by a property common to many recent sensor network deploy-
ments [110, 105, 101, 39, 6]. These deployments have two tiers: a lower tier consisting of motes,
which enable flexible deployment of dense instrumentation, and an upper tier containing fewer,
relatively less constrained nodes with higher-bandwidth radios, which we call masters (Figure
3.1). Tiers are fundamental to scaling sensor network size and spatial extent, since the masters
collectively have greater network capacity, larger spatial reach than a flat (non-tiered) field of
motes, and can be engineered to have significant sources of energy. For these reasons, most future
large-scale sensor network deployments will likely be tiered.
3.2 The Design Principles
Tenet constrains the placement of application functionality in the network: Multi-node data fusion
functionality and multi-node application logic should be implemented only in the master tier. The
cost and complexity of implementing this functionality in a fully distributed fashion on motes
outweighs the performance benefits of doing so. This architectural principle simplifies application
development and results in a generic mote tier networking subsystem that can be reused for a
variety of applications, all without significant loss of overall system efficiency.
In addition to the architectural principle stated above, our Tenet system has the following key
properties:
Asymmetric Task Communication: The Tenet principle prohibits multi-node fusion in the
mote tier. Any and all communication from a master to a mote takes the form of a task. Any and
all communication from a mote is a response to a task; motes cannot initiate tasks themselves.
Here, a “task” is a request to perform some activity, perhaps based on local sensor values; tasks
and responses are semantically disjoint. Thus, motes never communicate with (send sensor data
explicitly directed to) another mote. Rather, masters communicate with (send tasks to and receive
data from) motes, and vice versa.
Addressability: In Tenet, any master can communicate with any other master as long as there
is (possibly multi-hop) physical-layer connectivity between them; any master can task any mote
as long as there is (possibly multi-hop) physical-layer connectivity between them; and any mote
should always be able to send a task response to the tasking master. The requirement to support
master-to-master communication allows, but does not require, the construction of distributed
applications on the masters. Addressability requires much less of motes, however; a mote must
be able to communicate with at least one master, not all masters, and mote-to-mote connectivity
is not required. This is by design, and greatly simplifies mote implementations.
Task Library and Execution: Motes provide a limited library of generic functionality, tasklets,
such as timers, sensors, thresholding, data compression, and other forms of simple signal process-
ing. A task library that simultaneously simplifies mote, master, and application programming
while providing good efficiency is a key piece of the Tenet architecture. A task is composed of
arbitrarily many tasklets linked together in a linear chain. The tasklets expose parameters to
control this service. For example, to construct a task that samples the light sensor every 500 ms
and sends the samples to its master with the tag LIGHTREADING, we write:
periodic(500ms)->sample(LIGHT, LIGHTREADING)-> Send()
The Tenet scheduler maintains a queue of tasks waiting to use the mote’s microcontroller. The
scheduler operates at the level of tasklets, and knows how to execute the task’s tasklets in order.
Since it operates at this level of granularity (as opposed to executing each complete task one at
a time), several concurrently executing tasks may get fair access to a mote’s resources.
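The effect of scheduling at tasklet granularity can be illustrated with a small sketch in which each queued task runs one tasklet and then returns to the back of the queue. The round-robin policy, the function name, and the task names are assumptions for illustration; the text above states only that the scheduler operates at the level of tasklets.

```python
from collections import deque


def run_tasks(tasks):
    """Run tasks one tasklet at a time and return the execution order.

    tasks maps a task name to its linear chain of tasklet names.
    """
    queue = deque((name, list(chain)) for name, chain in tasks.items())
    order = []
    while queue:
        name, chain = queue.popleft()
        order.append((name, chain.pop(0)))  # execute the next tasklet in the chain
        if chain:
            queue.append((name, chain))     # requeue the task behind the others
    return order


order = run_tasks({
    "light": ["sample", "send"],
    "temperature": ["sample", "send"],
})
```

The tasklets of the two tasks interleave, so neither task has to wait for the other to run to completion.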
Robustness and Manageability: Robust networking mechanisms, which permit application
operation even in the face of extensive failures and unexpected failure modes, are particularly
important for the challenging environments in which sensor networks are deployed. The tools in
the task library can provide useful insight into network problems (such as why a particular sensor
or group of sensors is not responding, or why node energy resources have been depleted far faster
than one would have expected) and allow automated response to such problems.
Figure 3.1: A tiered sensor network with masters and motes.
3.3 The Networking Subsystem
Tenet’s networking subsystem allows an application running on a master to disseminate the tasks
to the motes and collect responses to those tasks.
Tiered Task Dissemination: Tenet’s task dissemination subsystem reliably floods task de-
scriptions to all motes. When an application requests task dissemination, the master assigns a
sequence number, caches the task packet, and broadcasts the packet to the neighboring masters
and motes. To recover from losses, each node occasionally transmits a concise summary of all
the packets it has in its cache. These transmissions are governed by an exponentially backed off
timer [67], so that when the network quiesces, the overhead is minimal. If a node detects that a
neighbor has a higher sequence number than in its own cache, it immediately requests the missing
sequence number using a unicast request. If a node detects that a neighbor has some missing
packets, it immediately broadcasts a summary so the neighbor can rapidly repair the missing
sequence packets, and so other nodes can suppress their own rebroadcast.
Tiered Routing: In Tenet, all nodes, masters and motes, are assigned globally unique 16-bit
identifiers. Masters run IP, and use the lower 16 bits of their IP address as their globally unique
identifier while motes use the TinyOS node identifier. All the master and mote node identifiers are
manually assigned and assumed to be globally unique. Tenet’s routing system has four components:
master tier routing leverages existing 802.11 ad-hoc wireless routing technology; master-tier
overlay routing extends the 16-bit identifiers into IP addresses and uses the kernel routing table
for routing; mote-to-master routing is adapted from existing tree-routing implementations; and
master-to-mote routing uses data-driven route establishment.
Tenet’s addressability principle calls for any mote to be able to return a response to the
tasking master. The mote-to-master routing enables a mote to establish a path to the nearest
master, which can forward the response to the tasking master. In order to enable motes to discover
the nearest master, each border master periodically sends beacons into the mote cloud. The
beacons cascade down the network as path metric is updated by each mote. A mote selects, as
its “parent”, that neighbor which advertised the best path to a master. Over time, a mote’s
parent may change if the path quality to its nearest master degrades, or if the nearest master
fails, conditions detected by the periodic beacons. We use three standard tree routing protocols
(MultiHopLQI [96], MintRoute [114], and CTP described in detail in Chapter 6) to support this
nearest master selection. The forwarding logic is simply this: When a mote receives a packet from
any neighbor that is not its parent, it forwards the packet to the parent.
Tenet’s master-to-mote routing uses data-driven route establishment. When a mote gets a task
response data packet from a non-parent, it establishes a route entry to the source address (say S)
in the packet, with the next hop set to the sender with a timeout of 30 seconds. Subsequently,
when the parent sends a packet destined to S (say a transport acknowledgment from a master),
the node uses this routing entry to forward the packet to S, and resets the associated timer. Thus,
the routing entry is active as long as a mote has recently communicated with its master. Masters
also implement a similar algorithm that sets up these data-driven routes so packets on the master
tier are correctly routed towards S.
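Data-driven route establishment can be sketched as a small table keyed by source address. An entry is created when a response arrives from a non-parent, and its timer is reset whenever the entry is used for a packet travelling back toward the source. The 30-second timeout comes from the description above; the class and method names are hypothetical.

```python
class ReverseRouteTable:
    """Routes toward motes, learned from the responses they send upstream."""

    TIMEOUT_S = 30.0  # timeout stated in the text

    def __init__(self):
        self._routes = {}  # source address -> (next hop, expiry time)

    def on_response_from_child(self, source, sender, now):
        """Learn that packets for source should be handed to sender."""
        self._routes[source] = (sender, now + self.TIMEOUT_S)

    def next_hop(self, destination, now):
        """Return the next hop toward destination, or None if no live entry exists."""
        entry = self._routes.get(destination)
        if entry is None or entry[1] <= now:
            self._routes.pop(destination, None)  # drop a stale entry
            return None
        hop, _ = entry
        self._routes[destination] = (hop, now + self.TIMEOUT_S)  # reset the timer
        return hop
```

Because every use refreshes the entry, a route stays alive for as long as the mote and its master keep exchanging packets.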
Reliable Transport: Tenet needs a mechanism for transmitting task responses from a mote
to the master that originated the task, possibly with end-to-end reliability. Reliable stream
transport uses TCP-like connection establishment and negative acknowledgments to recover the
missing packets. Reliable packet transport uses positive acknowledgment to re-attempt packet
delivery.
Other support protocols: Many Tenet applications require the ability to timestamp sensor
readings. Tenet uses FTSP [78] to provide a globally synchronized timing service.
Chapter 4
Application-informed Energy Management
In this chapter, we consider the problem of energy management in dynamically-taskable networks,
using Tenet as a case study. Application-informed Energy Management (AEM) considers energy
management for Tenet under three new constraints: dynamic multi-hop routing and tasking,
multiple concurrent applications, and reliable end-to-end data delivery. For AEM to efficiently
and robustly satisfy these constraints, it uses techniques to directly adapt to the application level
and link layer dynamics. AEM statically analyzes and infers the traffic profile for the application
and accordingly tunes the duty-cycling protocol to provide the best trade-off in latency and data
delivery performance.
4.1 Overview
An important open problem in the context of systems such as Tenet is energy management. To
our knowledge, no existing energy management proposal (and there are many; see Section 2.1) has
been demonstrated for a programming system such as Tenet. The key challenge is preserving
the generality and wide applicability of such a system while achieving low duty-cycle operation;
most existing work in the area makes one or more assumptions (e.g., about the workload or
the application’s tolerance to latency, about support for broadcast traffic or lack of support for
end-to-end reliable transport, and so forth).
AEM is a radio duty-cycling protocol designed in the context of the Tenet system, and makes
two conceptual advances. The first is that Tenet’s tasking language permits static analysis of
tasks. Static analysis can be used to infer application workload in a manner transparent to the
application. The second conceptual advance is to coordinate network-wide radio duty-cycling
tailored to the application workload and the expected traffic pattern within the network, using
parameters derived from static analysis. This coordination is itself achieved via Tenet’s tasking
and timing systems; AEM dynamically sets up globally-synchronized periodic schedules, using the
tasking mechanism, for transmission of control (routing, time synchronization, task dissemination)
and data packets. AEM computes the radio sleep and wakeup times based on the globally-
synchronized network time provided by Tenet’s timing system. It scales the number of data
schedules in proportion to the number of concurrent tasks. Moreover, the duration of radio on-
times is variable and elastic, adapting to load transients and retransmissions. Finally, AEM is
designed to be robust to routing changes, time synchronization transients, and node failure.
We use a simple example to illustrate how AEM works. In Tenet, when a user wants to collect
a light sensor reading every two minutes from each node in the network, the user presents the
following task to the system:
periodic(2 mins)->sample(LIGHT)-> Send()
AEM analyzes this task and makes two inferences: (a) that a response will be generated every
two minutes and (b) that a response will fit entirely into a single packet with a small payload.
These two inferences allow AEM to determine the duration for which radios on the motes should
be turned on or off. After performing this analysis, AEM prepends the task with a description of
these parameters:
dataframe(2mins, ...)->periodic(2 mins)->sample(LIGHT)-> Send()
and disseminates the transformed task into the network. Upon receiving this task, motes
schedule packet transmissions and radio activity based on the specified duty-cycle parameters.
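The first inference in this example (how often a response is generated) can be sketched as a small static analysis over the task string. The parser, the function names, and the unit table below are hypothetical and written only for illustration; they are not Tenet's or AEM's implementation. The remaining dataframe parameters are elided in the example above, so the sketch does not attempt to produce them.

```python
import re

# Duration units seen in the examples, in milliseconds (illustrative table).
_UNIT_MS = {"ms": 1, "s": 1000, "min": 60000, "mins": 60000}


def parse_task(task):
    """Split a task string into (tasklet name, argument string) pairs."""
    tasklets = []
    for part in task.split("->"):
        match = re.fullmatch(r"\s*(\w+)\((.*)\)\s*", part)
        if match is None:
            raise ValueError(f"not a tasklet: {part!r}")
        tasklets.append((match.group(1), match.group(2).strip()))
    return tasklets


def response_period_ms(task):
    """Infer how often the task generates a response, or None if it is not periodic."""
    for name, args in parse_task(task):
        if name == "periodic":
            match = re.fullmatch(r"(\d+)\s*([a-z]+)", args)
            if match is None or match.group(2) not in _UNIT_MS:
                raise ValueError(f"unrecognized period: {args!r}")
            return int(match.group(1)) * _UNIT_MS[match.group(2)]
    return None


period = response_period_ms("periodic(2 mins)->sample(LIGHT)-> Send()")
```

Because a task is a linear chain of tasklets with explicit parameters, this kind of inspection needs no cooperation from the application.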
Other energy-management proposals such as Koala [80], Dozer [14], and SCP-MAC [119]
achieve lower duty-cycles than AEM. This is not surprising, since they do so at the cost of
generality, either assuming a specific workload or specific traffic profile, or lacking support for
flexible tasking and concurrent applications. AEM is meant to be complementary to such hand-
tuned sleep-scheduling techniques; these should be used where it is necessary to achieve ultra-low
duty cycles and it is possible to leverage application knowledge to do so. At the same time, it is
important to have a sleep scheduling component like AEM in a general-purpose sensing system to
support those application deployments that can live with the (rather substantial) energy savings
that AEM provides, but want to enjoy the benefits of a readily available, reusable (and robust)
sleep scheduling mechanism.
4.2 Goals
The design of AEM attempts to achieve the following list of goals:
• Our most important goal is low duty-cycle operation – the system must support this in
order to ensure network longevity. However, this is not our only goal; we are interested
in designs that achieve the lowest possible duty-cycles, while still meeting the other goals
listed below. This precludes the use of ultra-low duty-cycle approaches [119, 80], which
fail to meet one or more of the goals below.
• Our next goal is alignment with Tenet’s design. Tenet supports re-tasking, concurrent
execution of multiple tasks, flexible forms of tasking (periodic or event-triggered), dynamic
routing, end-to-end reliable transport, and time synchronization. We require AEM
to support these features as well, since that would extend Tenet’s applicability over a
wide dynamic range (capable of supporting high data-rate applications like structural
monitoring [82] and imaging [42], as well as low-rate applications).
• Our third goal is robustness: AEM should work regardless of changes to the topology,
arrival or departure of nodes, or transient failures in other system services such as time
synchronization or routing.
• Our fourth goal is low latency; AEM should deliver events or samples with a delay no
greater than the inter-sample or inter-event time. This requirement is fairly conservative,
and in practice AEM does significantly better. However, there is an obvious trade-off
between latency and duty-cycles, and we are interested in designs that favor lower duty-
cycles over latency.
• Our final requirement is transparency. We require that AEM make minimal or no modi-
fications to existing parts of Tenet. This preserves the modularity of the overall system,
enabling easy evolution of its components.
These goals emphasize AEM’s adherence to Tenet’s design. Whereas most other work has
attempted to design energy efficiency from the ground up, in this work we have attempted to
retrofit energy efficiency into an existing system that has properties such as manageability, re-
taskability, and support for end-to-end reliable transport.
4.3 Design
AEM achieves its goals using two conceptual advances. The first is based on the observation that
Tenet’s simple tasking permits static analysis of tasks at the master tier. This static analysis can be
used to infer application workload. The second conceptual advance is to tailor radio duty-cycling
to the application workload using Tenet’s tasking mechanism and some simple functionality built
into the mote tier. In this section, in addition to describing these two conceptual advances, we
briefly discuss other aspects of AEM’s design such as bootstrapping, recovery mechanisms that
allow AEM to continue to function during time synchronization failures, and alternative designs
of AEM’s radio duty-cycling schedules.
4.3.1 Static Analysis of Tasks
A key observation that motivates the design of AEM is that Tenet’s design permits inspection of
application activity at the master tier. Specifically, in Tenet, we can statically analyze a task just
before it is disseminated (and possibly modify it before dissemination) into the network. Since
Tenet’s tasking language is a linear data-flow language, it is possible to analyze a task and infer
the following two parameters, which can be used to control radio duty-cycles: (i) the start time
at which a task starts executing, which controls when the motes should turn on their radios, and
(ii) the period between two executions of a repeating task, which determines how often a mote
should turn its radio on.
To compute these parameters, AEM performs a simple data-flow analysis of the task descrip-
tion, and partitions each task description into the following sections:
synchronization -> periodicity -> data generation/processing -> packing -> send
The start time is computed by analyzing the synchronization section of the task. Some task
descriptions explicitly specify when task execution should start; for example, when data collection
needs to be synchronized across all nodes. When this section is missing, AEM takes the liberty
to modify the task description to improve system performance. As we describe later, AEM’s
duty-cycle design ensures that all nodes’ wakeup times are synchronized, so synchronizing task
execution with these times can result in lower latency.
The periodicity section describes how frequently a mote should execute the task and potentially
generate data, while the packing section describes how many data items should be included in the
payload. AEM computes the period for duty-cycling using these sections: if a task executes every
x seconds and packs n samples into a packet, the period at which packets are generated is x × n seconds.
As we describe below, if one or more of these sections are missing from the task description,
AEM makes reasonable choices for the missing section(s). Of course, it is possible to write tasks
that do not conform to this template – for example, a task with multiple synchronization and
periodicity sections. In this case, our static analysis fails and as a result AEM does not perform
duty-cycling. Oftentimes a task with multiple synchronization or periodicity sections can be re-
written as separate tasks, each with a single synchronization and periodicity. These transformed
tasks are then amenable to AEM’s static analysis. However, in the context of Tenet, we have
not come across practical sensing applications that could not be thus implemented using a set of
periodic tasks. To duty-cycle a network running a complex task that eludes such transformation
will require a more complex data-flow analysis and mechanism to disseminate a larger number of
parameters regarding the timing of schedules to the network.
We now illustrate how AEM’s static analysis works for a variety of common task specifications.
Periodic collection: Consider a task that periodically generates data using the light sensor:
periodic(1s)->sample(LIGHT)->send()
This task is missing the synchronization and packing components. Because the task does not
specify an absolute time at which the application should start running, AEM can synchronize the
start times at all nodes (setting the start time to a future instant, taking dissemination latency into
account). To do so, AEM prepends to the task the tasklet globaltimewait(), which blocks task
execution until the network time passed as its argument. If the task description does
not specify a packing tasklet, AEM assumes that each sample is sent in a separate packet and
derives the period from the argument for the periodic() tasklet only.
One-shot collection: The following task generates one sample reading:
Sample(LIGHT)->send()
As before, since this task does not specify a synchronization component, AEM can modify
the task to specify a synchronized start time. In the absence of a periodic() tasklet, AEM will
ensure that the period is set to 0.
The following task differs from the above only in that it already specifies a synchronization
section:
globaltimewait(NOW+2 mins)->sample(LIGHT)->send()
AEM does not modify the task description in this case, and computes other parameters as
described above.
Synchronized periodic collection: The following task requests the motes to sample light every
2 minutes and send 10 samples at a time.
globaltimewait(NOW+2 mins)->periodic(2 mins)->sample(LIGHT)->pack(10)->send()
Following the previous discussion, the start time is set to 2 minutes from the task injection
time, and the period is set to 20 minutes because the nodes will send a task response every 20
minutes (10 samples are packed into one packet).
Event-triggered collection: Tenet allows users to specify event-triggered data collection. For
example, the following task generates a packet if the light reading exceeds a threshold of 10:
periodic(1000ms)->sample(LIGHT)->threshold(LIGHT, 10)->send()
AEM cannot know in advance whether the reading will exceed the threshold, so it conservatively
assumes that each sample will exceed the threshold. All the parameters are thus derived in the
same way as for periodic collection. However, as we describe later, AEM incurs only
a small overhead for this conservative approach: if, when the radio is turned on, AEM observes
no activity, it quickly turns the radio off. In this way, even for event-triggered collection, near 1%
duty-cycles are possible.
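The inferences in the examples above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not AEM's implementation: the function name analyze, the duration parser, and the unit table are hypothetical, and the parser handles only the argument forms that appear in the examples above.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical unit table; Tenet's actual argument syntax may differ.
_UNITS_S = {"ms": 0.001, "s": 1.0, "min": 60.0, "mins": 60.0}


class Analysis(NamedTuple):
    start_specified: bool  # True if the task already has a synchronization section
    period_s: float        # seconds between task responses; 0 for one-shot tasks


def _parse_duration(arg: str) -> float:
    """Parse strings such as '1s', '1000ms', or '2 mins' into seconds."""
    m = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([a-z]+)\s*", arg.lower())
    if m is None or m.group(2) not in _UNITS_S:
        raise ValueError(f"unrecognized duration: {arg!r}")
    return float(m.group(1)) * _UNITS_S[m.group(2)]


def analyze(task: str) -> Optional[Analysis]:
    """Return the inferred parameters, or None if the task does not fit the template."""
    tasklets = []
    for text in task.split("->"):
        m = re.fullmatch(r"\s*(\w+)\s*\((.*)\)\s*", text)
        if m is None:
            return None
        tasklets.append((m.group(1).lower(), m.group(2)))
    names = [name for name, _ in tasklets]
    # Multiple synchronization or periodicity sections: the analysis gives up.
    if names.count("globaltimewait") > 1 or names.count("periodic") > 1:
        return None
    interval_s = 0.0          # no periodic() tasklet means a one-shot task
    samples_per_packet = 1    # no pack() tasklet means one sample per packet
    for name, arg in tasklets:
        if name == "periodic":
            interval_s = _parse_duration(arg)
        elif name == "pack":
            samples_per_packet = int(arg)
    # threshold() is ignored: every sample is conservatively assumed to be sent.
    return Analysis("globaltimewait" in names, interval_s * samples_per_packet)
```

For the synchronized periodic collection task, this sketch reports a period of 1200 seconds (20 minutes), and for the one-shot task it reports a period of 0.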
In general, the freedom to analyze and modify task descriptions presents several optimization
opportunities. For example, AEM can schedule task execution more precisely, so that data is
generated just before the radio turns on, enabling more efficient use of radio on time. Similarly,
AEM can schedule start times of multiple tasks so their radio on-times can overlap, reducing
the energy expended in turning the radios on or off. Finally, it is possible to control network
parameters such as the transport-layer timeout values using the results of static analysis. We
have left these to future work.
Figure 4.1: Radio Duty-cycling Frames and Schedules. (Radio state, on or off, over time; control frames form the control schedule and data frames form the data schedule.)
4.3.2 Duty-cycling the Radio
The second contribution of AEM is the design of a mechanism for network-wide radio duty-
cycling based on the parameters computed from task analysis, as described above. Although we
have described task analysis for a single task, the duty-cycling mechanism allows for concurrent
tasks, and accommodates traffic generated by system services such as routing, dissemination, and
time synchronization.
In AEM, the master nodes use the results of task analysis to compute and distribute duty-
cycling information to the motes. AEM’s duty-cycling mechanism falls into the class of schemes
that perform scheduled wakeup — radios are turned on and/or off at pre-determined times. This
is a natural choice, since we are able to infer application workload through static analysis.
Before discussing the design of AEM’s scheduled wakeup, we introduce two terms. A frame
is a time interval during which the radio at a specific node is on (i.e., capable of transmission or
reception). A schedule consists of a (usually periodic) sequence of frames.
AEM uses synchronized elastic frames. The start time of each frame is pre-determined, but
its end is not. Instead, a frame ends (and the radio is turned off) when no packets are detected in
the channel for a specified period of time. This approach is robust to topology changes, does not
require pre-computation of frame durations, and can absorb transient traffic fluctuations while
still achieving low duty-cycles. The idea of extending the radio on time beyond the normal sleep-
wakeup schedule to improve network performance is not new as evidenced by the existence of
similar techniques in energy-efficient MAC protocols. However, one advance that we make is to
demonstrate the importance of this technique in enabling robust and efficient operation of a
comprehensive networking stack with radio duty-cycling.
4.3.2.1 Frames and Schedules
An AEM master node computes the sleep wakeup schedules before disseminating them into the
network. Each schedule has three parameters: t, l, and p. The radio at every node is first turned on at
time t, and then again every p seconds. Thus, at each node, the radio is turned on at t, t+p, t+2p,
and so forth; these times mark the beginning of successive frames. While the radio is on, nodes
can receive and transmit packets. In particular, each node contends with other nodes (using a
CSMA MAC) to transmit packets. We require no modifications to the MAC layer.
An important feature in the design of AEM is the parameter l, called the quiet-time (see footnote 1). It
determines the amount of time for which the radio stays on after the last received or transmitted
packet. At the end of this time, the radio is turned off. Thus, in AEM, frame lengths are not
fixed, but are elastic, with a minimum length of l. Frames adapt to the activity in the channel,
permitting the system to handle load transients or increases in the number of retransmissions.
Load transients might occur, for example, when a routing change causes packets to be backed
up at a node; after these routing changes are resolved, the packets can be transmitted during a
frame.
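The elastic-frame rule can be illustrated with a short Python sketch. It is a simplified model under stated assumptions, not AEM's mote code: the function name frame_end is hypothetical, and each packet is reduced to a single timestamp at which it was received or transmitted.

```python
from typing import Iterable


def frame_end(start: float, quiet_time: float, activity: Iterable[float]) -> float:
    """Return the time at which the radio turns off for a frame beginning at `start`.

    `activity` holds the timestamps of packets received or transmitted
    (a modeling assumption; real packets have a nonzero duration).
    """
    off_at = start + quiet_time        # minimum frame length is the quiet-time
    for ts in sorted(activity):
        if ts < start:
            continue                   # activity before this frame is irrelevant
        if ts > off_at:
            break                      # radio is already off by then
        off_at = ts + quiet_time       # each packet restarts the quiet-time timer
    return off_at
```

With no activity the frame lasts exactly one quiet-time; a burst of packets spaced closer than the quiet-time keeps the radio on until the burst ends.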
Since AEM is required to conform to the Tenet design, it needs to support the transmission
of control traffic in addition to data traffic resulting from possibly multiple concurrent tasks. To
support control traffic and multiple concurrent tasks, AEM: a) distinguishes between two types
of schedules, control and data schedules (Figure 4.1); b) allows multiple schedules to be active at
any given time.
Footnote 1: AEM’s quiet-time is similar to LPL’s off-timer, but has a different function. LPL’s off-timer is designed to
allow transmission of back-to-back packets without duty-cycling the radio in between. In AEM, the quiet-time is
used to extend a frame to handle load transients.
Control traffic is sent during the control schedule and data traffic is transmitted during the
data schedule. This way, control traffic is isolated from data traffic since the loss of control traffic
can adversely affect system performance. The data schedule parameters are derived from task
analysis. t is derived from the start time parameter. The parameter p can simply be computed
from the periodicity parameter derived from task analysis. However, this has latency implications,
especially for reliable transport protocols that use end-to-end acknowledgments. For this reason,
to enable fast end-to-end retransmissions of data packets, AEM trades off some duty-cycle for
reduced latency by scheduling data frames more frequently than the periodicity value would
suggest.
For simplicity, the control schedule is a sequence of periodic control frames with fixed period-
icity, which is usually statically determined based on control protocol parameters. Many control
protocols (e.g., routing and time synchronization) are naturally periodic, but some others (e.g.,
Tenet’s task dissemination, which uses Trickle [67] style exponential timers) are not.
Once schedules are computed, they are disseminated to the mote network using Tenet’s task
dissemination mechanism. There exist specific tasklets that can be used to specify control and data
schedules. A Tenet master node can generate and disseminate a task description that contains
multiple tasklets specifying the current set of schedules. These can be disseminated even when
the network is already being duty-cycled using a different set of schedules; thus, in AEM, we
can re-schedule duty-cycling (to, for example, accommodate a new task injected into the system).
Using the task dissemination mechanism for controlling duty-cycling has another advantage: when
a task is deleted from the system, its corresponding data schedule can also be deleted using the
same mechanism.
AEM schedules are a natural fit for periodic traffic patterns. AEM schedules also support tasks
that send event-triggered responses (e.g., when a sensor reading exceeds a certain threshold), with
a slight loss of efficiency. Nodes turn their radios on in synchrony, but when there are no events,
they are turned off after quiet-time. As we show in our experiments, we are able to achieve low
duty-cycles even with this slight loss of efficiency. However, AEM’s periodic control schedules are
not a perfect match for some control protocols which use exponential timers to schedule control
packet transmissions. One example is Tenet’s task dissemination mechanism. It works well in
AEM, at the cost of slightly higher task dissemination times. However, CTP [35] control traffic
does not work well in AEM; CTP uses aggressive beaconing to quickly detect and repair loops.
We have left the integration of CTP with AEM to future work, but we note that CTP was not
explicitly designed to support scheduled wakeup schemes.
Finally, as we have discussed before, multiple concurrent schedules can be active in the system
at any given time. More precisely, at any time, there will be one active control schedule, and
zero or more data schedules. It is therefore possible that two frames belonging to two different
schedules may overlap; while one frame is active, the start time for a second frame may occur.
AEM handles this easily by keeping the radio on, and enabling transmissions for the second frame.
These transmissions contend with any remaining transmissions from the first frame, and the frame
duration is extended until no activity is detected.
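One way to picture overlapping frames is as interval merging: the radio is on for the union of all active frames. The following Python sketch is only an illustration; the function name merge_frames and the representation of a frame as a (start, end) pair are assumptions, and in AEM the end of a frame is not known in advance but determined by channel activity.

```python
from typing import Iterable, List, Tuple

Frame = Tuple[float, float]  # (start, end) of one frame, in seconds


def merge_frames(frames: Iterable[Frame]) -> List[Frame]:
    """Merge frames from all schedules into the intervals during which the radio is on."""
    merged: List[Frame] = []
    for start, end in sorted(frames):
        if merged and start <= merged[-1][1]:
            # The new frame starts while the radio is still on: keep it on.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```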
The schedules across the nodes are timed using a globally-synchronized network clock. In the
event the schedule phase is out-of-sync due to transient clock drifts, after the globally-synchronized
clock corrects the timing errors, AEM’s schedules come back in phase with no additional mecha-
nism. The start time of each frame is calculated as a function of the network time. This approach
prevents the frames from going out-of-phase despite varying clock rates and phase between the
nodes caused by the difference in hardware or time-varying CPU loads.
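Computing frame starts as a function of network time can be sketched as follows. This is a minimal illustration, assuming a periodic schedule with start time t and period p as defined earlier; the function name next_frame_start is hypothetical.

```python
import math


def next_frame_start(now: float, t: float, p: float) -> float:
    """Return the start of the first frame at or after network time `now`.

    Frames of a periodic schedule begin at t, t+p, t+2p, and so on.
    """
    if p <= 0:
        raise ValueError("this sketch covers periodic schedules only (p > 0)")
    if now <= t:
        return t
    k = math.ceil((now - t) / p)   # index of the next frame boundary
    return t + k * p
```

Because the result depends only on the current network time and the schedule parameters, a node whose clock has just been corrected computes the same next frame start as its neighbors, without any extra resynchronization step.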
4.3.2.2 Performance Implications
AEM’s design has two interesting performance implications. First, in this design, node transmis-
sions are synchronized to the beginning of a frame. This increases the likelihood of packet loss
relative to duty-cycling designs that do not use scheduled wakeup. Data transmissions are resilient
to these packet losses because of link layer retransmissions. However, broadcast control packets
are affected more significantly because of this synchronization. Losses in control packets can de-
lay routing convergence, or cause nodes to be de-synchronized. In AEM, we alleviate the loss of
control packets by spreading control transmissions across different successive control frames. For
example, if the routing protocol sends one beacon every 30s, we set up control frames that repeat
every 15s and allow half the nodes (e.g., those with even node IDs) to transmit during the first
frame and the other half during the second frame. All nodes have their radios on during both
frames, so they can receive all transmissions. This technique improves control traffic reliability at
the expense of additional radio on-time. More generally, the number of additional frames should
adapt to network density, and we have left this adaptation to future work.
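The spreading of control transmissions in the example above can be sketched as follows. This is an illustration only: the function names and the generalization from two groups to an arbitrary number of groups are assumptions, not part of the AEM implementation.

```python
def control_frame_period(beacon_period_s: float, groups: int = 2) -> float:
    """Period of control frames when beacons are spread over `groups` successive frames."""
    return beacon_period_s / groups


def may_transmit_control(node_id: int, frame_index: int, groups: int = 2) -> bool:
    """True if the node may send control packets in this control frame.

    All nodes keep their radios on in every control frame; only the right to
    transmit rotates. With two groups, even node IDs use even-numbered frames.
    """
    return frame_index % groups == node_id % groups
```

For a 30s beacon period and two groups, control frames repeat every 15s, and each node transmits in every other frame.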
The second performance implication is more subtle, and arises from our design of elastic frames.
Consider a chain topology A-B-C. Node C might not overhear the packets transmitted by A to
B and as a result it might put its radio to sleep after quiet-time. When B starts forwarding the
packets to C, its packets are dropped, reducing overall efficiency and delivery ratio. Fortunately,
this scenario happens only occasionally, and there is a simple solution. In our example, B has to
determine when it should stop transmitting to C. B infers that C’s radio might already be off if
these two conditions are true:
• The quiet-time has elapsed without B having received a packet or an acknowledgment
from C.
• B did not receive an acknowledgment from C even after a fixed number (5 in our imple-
mentation) of consecutive retransmissions.
When these conditions are met, B pauses forwarding packets to C. The second condition is
conservative, since C’s radio might be on but the link from B to C might be lossy. Even in this
case, it is better to pause forwarding until the next data frame since that might allow us to later
resume forwarding on a better link when the routing protocol recomputes the routes.
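The two conditions that B checks can be written as a small predicate. This is a sketch under stated assumptions: the function and parameter names are hypothetical, and only the retransmission limit of 5 comes from the text.

```python
def should_pause_forwarding(
    now: float,
    last_heard_from_receiver: float,
    quiet_time: float,
    consecutive_unacked: int,
    max_retransmissions: int = 5,
) -> bool:
    """Return True if the sender should assume the receiver's radio may be off."""
    # Condition 1: a full quiet-time has passed with no packet or ack from the receiver.
    quiet_elapsed = (now - last_heard_from_receiver) >= quiet_time
    # Condition 2: a fixed number of consecutive retransmissions went unacknowledged.
    retransmissions_exhausted = consecutive_unacked >= max_retransmissions
    return quiet_elapsed and retransmissions_exhausted
```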
4.3.3 Bootstrapping
When the network starts, all radios are turned on. A master node, perhaps under the control of the
system administrator, can disseminate a control schedule to begin radio duty-cycling. The users
utilize the Tenet tasking mechanism to initiate duty-cycling. Once this task is disseminated,
duty-cycling operation can start. While a network is in duty-cycled mode, tasks can be disseminated,
as can additional data schedules.
When a node joins the network, it keeps its radio on. However, at this point it does not know
anything about the schedules active in the network. It will eventually learn this from Tenet’s task
dissemination mechanism. When it does, and after it is time synchronized with the rest of the
network, the node can start duty-cycling its radio. Until then, it behaves just as it would if it
were not duty-cycled, with one important exception: it queues all control packet transmissions,
until it overhears another control packet transmission (and likewise for data packets), at which
point it attempts to clear the corresponding queue. This frame inference technique ensures that
it participates correctly in the duty-cycling schedules, even though it has no explicit knowledge of
the schedules. This conservative approach works even when a group of topologically contiguous
nodes joins the network, as long as some subset of the nodes are operating consistent schedules
or their transmissions are triggered by transmissions from a master node.
4.3.4 Handling Time Synchronization Failures
AEM relies on a network time synchronization protocol, like FTSP [78]. Such protocols suffer
from clock drifts, and AEM allows for this by using a small guard time (2ms, twice the largest
synchronization error we have seen in our network) before transmitting data packets at the beginning
of each frame.
However, network time synchronization can fail in more pathological ways. If a few FTSP
beacons are lost, a node can become de-synchronized from the rest of the network. We did not
see this occur during the experiments in this chapter, but it can occur in longer-term deployments,
so it is important to design AEM to be robust to such de-synchronizations. When a node is
desynchronized (i.e., FTSP signals a loss of synchronization), AEM puts the node in “recovery”
mode. In this mode, it turns on the radio and keeps it on until the node is synchronized again.
Because the radio is on, the node can receive all the packets sent by the neighbors, including time
synchronization beacons. This allows the time synchronization protocol to improve its estimate
and stabilize. However, during recovery, a node might still be in the forwarding path. It can still
receive packets, but cannot transmit unless it knows that other nodes’ radios are on. To send
packets, it uses the frame inference technique described above: it queues all control packet trans-
missions, until it overhears another control packet transmission (and likewise for data packets),
at which point it attempts to clear the corresponding queue. This technique generalizes easily
to the case when many nodes are de-synchronized: eventually, all nodes will be “clocked” by
transmissions from the master node.
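Frame inference, as used both by newly joined nodes and by nodes in recovery mode, can be sketched as a pair of queues that are drained only when matching traffic is overheard. The class and method names below are hypothetical, and the send callback stands in for the radio stack.

```python
from collections import deque
from typing import Any, Callable, Deque, Dict


class FrameInferenceQueues:
    """Hold outgoing packets until traffic of the same kind is overheard."""

    KINDS = ("control", "data")

    def __init__(self, send: Callable[[str, Any], None]) -> None:
        self._send = send  # assumed callback that hands a packet to the radio
        self._queues: Dict[str, Deque[Any]] = {kind: deque() for kind in self.KINDS}

    def enqueue(self, kind: str, packet: Any) -> None:
        """Queue a packet instead of transmitting it immediately."""
        self._queues[kind].append(packet)

    def on_overheard(self, kind: str) -> None:
        """Overhearing a packet implies the matching frame is active: clear that queue."""
        queue = self._queues[kind]
        while queue:
            self._send(kind, queue.popleft())
```

Overhearing a control packet releases only the control queue; data packets wait until a data packet is overheard.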
4.3.5 Alternative Schedule Designs
There are many possible designs for a scheduled wakeup scheme that satisfy our goals stated
earlier, and we designed and implemented two alternatives before settling on the design we present
in this chapter.
Staggered Frames: In this design, the transmission and reception schedules are staggered along
the routing tree in such a way that a parent’s frame overlaps with that of its children, and the
parent waits to receive packets from all its children before transmitting to its own parent. The
efficiency of this approach (similar to DMAC [72]), however, comes at a price: it adapts poorly to
network topology changes – sometimes a single parent change can result in re-computation and
resynchronization of schedules in half of the nodes in the network. Moreover, it requires additional
control overhead to adapt to such changes.
Fixed Frames: In this design, frame start and end times are synchronized across the network.
Because the frame setup does not explicitly use any topology information, this scheme is robust
to routing topology changes. However, this scheme requires an accurate pre-computation of frame
duration (the previous scheme suffers from this drawback as well).
4.4 Implementation
We have implemented AEM as a component of Tenet in TinyOS 2.x. It uses 2.7 KB of code
space and requires 128 bytes of RAM for task and duty-cycling state maintenance. In addition,
AEM also requires RAM for packet buffers, as we discuss below. We use a buffer size of 6 packets
in our experiments. AEM’s schedules can be configured and altered using the Tenet tasking
mechanism, as discussed in Section 4.3.
While designing AEM, one of our goals was to leverage as much of the existing Tenet software
as possible (Section 4.2). Our implementation is able to achieve this, with the exceptions noted
below. AEM transparently interposes an asynchronous packet buffer to which the MultiHopLQI
routing protocol, the FTSP time synchronization protocol, and the Tenet dissemination protocol
send their packets. The AEM module orchestrates the packet egress from this buffer depending
on the duty-cycling state of the radio. Data packets use a similar but separate queue. Upon
a successful packet enqueue operation, the Tenet stack progresses as if the packet transmission
operation had been completed in a non-duty-cycled network.
We made three modifications to existing software components. First, we increased the end-
to-end retransmission timeout for the Tenet packet transport protocol for AEM. Such a change is
required for any duty-cycling method which increases packet latency. Second, AEM required one
change in the MultiHopLQI forwarding engine to pause transmissions when it guesses that the
receiver’s radio might be turned off (Section 4.3.2). Finally, we added frame inference to the base
station to time its transmissions. The alternative would have been to run a complete instance of
AEM at the base station. Our implementation preserves the transparent bridging design of the
base station.
4.5 Evaluation
In this section we evaluate, through experiments conducted on a 40-node testbed, AEM’s ability to
achieve low radio duty-cycle and data delivery latency while preserving all the design requirements
of Tenet.
4.5.1 Experimental Methodology
We conducted our experiments on a tiered network testbed with several Stargate nodes and 40
TelosB motes. All nodes are located above the false ceiling across multiple rooms and hallways
on the fourth floor (50m by 20m area) of the USC Ronald Tutor Hall. The wireless environment
above the false ceiling is harsh, with some links experiencing above 30% packet loss rates. All
nodes run the Tenet stack modified to support AEM. In most experiments, we use a single Tenet
master node. We configured the mote radios to transmit at -8.906 dBm, which results in a tree
with 4-hop depth.
In our experiments, we are interested in measuring the steady state behavior of AEM. For this
reason, each experiment starts with a 10 minute initialization period during which routing paths
are established, time is synchronized, dissemination states are initialized, and control schedules
are set up (see footnote 2). Measurements start 10 minutes into the experiment, when an application injects a
task and the network initiates data schedules.
Our experimental workload is as follows. In most experiments, a fixed fraction f of the
nodes are tasked to generate a sensor reading once every 2 minutes. When f is small, our setup
approximates an event-triggered workload, and when f is large, a periodic workload. Unless
otherwise stated, each run of an experiment lasts for 40 minutes, and all experiments are averaged
over 3 or more runs.
In each experiment, the nodes are tasked to use Tenet’s end-to-end reliable transport mecha-
nism. Thus, in every experiment, the delivery ratio is 100%.
During each experiment, we measure:
Duty-cycle: We compute duty-cycle by dividing the total time the radio is on at each node by the
experiment duration. We are interested both in the average duty-cycle across all nodes, and (in
some cases) the distribution of duty-cycles.
Latency: We measure the elapsed time between packet generation at a mote and packet arrival at
the master. As with duty-cycle, we are interested both in average and distributional performance.
These metrics correspond to two of the goals described in Section 4.2. We also have designed
experiments to measure AEM’s adherence to other goals. To demonstrate AEM’s alignment with
Tenet’s design, we conduct an experiment with multiple concurrent tasks, and another with multiple
master nodes. To demonstrate its robustness, we conduct an experiment where half the nodes
in the network are made to fail, showing that AEM recovers. Finally, the transparency goal is
achieved by careful implementation (Section 4.4).
Footnote 2: Later in this section, we present one experiment that demonstrates that our AEM implementation correctly
adapts to large transients.
Figure 4.2: Radio states across nodes with LPL during an experiment. (Axes: radio state at nodes versus time in seconds.)
Figure 4.3: AEM control and data frames during an experiment. (Axes: radio state at nodes versus time in seconds; frames are labeled CONTROL and DATA.)
4.5.2 Experiments and Results
Aside from other approaches to duty-cycling listed in Section 4.3.2, there is one other plausible
duty-cycling approach that would have satisfied many of our goals. LPL (see footnote 3) is the only radio
duty-cycling software system available that is mature and robust enough to support Tenet requirements.
We integrated LPL into the Tenet stack. In this section, we also compare AEM against Tenet
with LPL. The goal of this comparison is to test whether an existing design would have sufficed.
To calibrate AEM’s performance, we compare its duty-cycle with that of an omniscient sched-
uler. An omniscient scheduler only keeps the radio on for the exact amount of time necessary to
send or receive all the packets transmitted during an experiment (this includes all control and
data packets as well as retransmissions). We compute the omniscient scheduler’s duty cycle by
counting all the transmissions and receptions at each node during an experiment, assigning a
nominal average transmission and reception time (10ms, derived experimentally), and dividing
the total time by the experiment duration. AEM itself deviates from this scheduler because its
guard time and its quiet time constitute overhead.
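The two duty-cycle computations described in this section, the measured one and the omniscient baseline, can be sketched as follows. The function names and the input representation are assumptions made for illustration; only the 10ms nominal per-packet time comes from the text.

```python
from typing import Iterable, Tuple


def duty_cycle(on_intervals: Iterable[Tuple[float, float]], duration_s: float) -> float:
    """Measured duty-cycle: total radio-on time at a node divided by experiment duration."""
    return sum(end - start for start, end in on_intervals) / duration_s


def omniscient_duty_cycle(
    num_tx: int, num_rx: int, duration_s: float, per_packet_s: float = 0.010
) -> float:
    """Baseline duty-cycle: the radio is on only for the packets actually sent or received.

    Every transmission and reception is charged the same nominal time.
    """
    return (num_tx + num_rx) * per_packet_s / duration_s
```

Multiplying either result by 100 gives the percentage plotted in the duty-cycle figures.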
Footnote 3: We use BoX-MAC-2 [79], a modern and stable implementation of LPL included in TinyOS 2.x, in our
experiments.
Figure 4.4: AEM and LPL duty-cycle with varying workloads. (Axes: duty-cycle in % versus fraction of responding nodes; series: LPL, AEM, Omniscient Scheduler.)
Figure 4.5: Task response latency for different workloads with AEM and LPL. (Axes: latency in seconds versus fraction of responding nodes; series: LPL, AEM.)
4.5.2.1 Single Task Performance
We first explore AEM’s performance when the user executes one task in the network. We vary
the fraction f of nodes that execute the task, presenting a varying workload to the system. For
each fraction f, nodes are selected uniformly from across the network.
To study the impact of workload on duty-cycle and latency, we do experiments with six
different workloads with 0% to 100% of the nodes responding to the injected task. Figure 4.3
illustrates the control and data schedules during one of our experiments at three nodes along
a path in the routing tree. AEM achieves low duty-cycle by turning on the radio only during
these schedules. Notice how frames are aligned, and how some frames are longer than others,
demonstrating elasticity. Finally, notice how the length of elastic frames increases as we go up
the tree: the duty-cycle adapts to increasing traffic automatically.
Figure 4.4 shows that AEM achieves a duty-cycle of about 1.6% when there is no task response
and about 2.7% when all 40 nodes respond to a task with sensor data. This performance is
remarkable: when all 40 nodes respond, the network is generating one packet on average every
3 seconds, yet the system is able to maintain a 2.7% duty-cycle. This performance is within a
factor of 2-3 of the omniscient scheduler. This difference is not surprising, since most of AEM’s
inefficiency comes from its 70ms quiet-time (the duration of 7 packet transmissions). This choice
is rather conservative, and we expect to be able to optimize it significantly.
Figure 4.6: Latency distribution with AEM and LPL. (Axes: CDF (nodes) vs. latency (s); series: LPL 20%, AEM 20%.)
Figure 4.7: Duty-cycle distribution with AEM and LPL. (Axes: CDF (nodes) vs. duty cycle (%); series: LPL 20%, AEM 20%.)
Figure 4.7 shows the distribution of duty-cycle across the nodes with AEM when 8 out of 40
nodes respond to the task. (The results are similar for other fractions of responding nodes). The
distribution is tight, with a range from 1.6% to 4.9% and a 90th percentile duty-cycle at 3.7%.
Finally, Figure 4.5 shows that the data delivery latency ranges from 5.8s to 13.9s. AEM,
with its elastic frames, tries hard to transmit a packet from sender to base station within one
data frame duration. However, if a packet needs to be retransmitted, the sender needs to wait
for a transport timeout (15s in our implementation) and the next data frame (the data frame
periodicity is 10s). Retransmitted packets explain why the latency is higher than the forwarding
latency, and also why the latency varies significantly between experiments. Figure 4.6 shows that
the latency distribution ranges from 0.4s to 16.5s across the nodes with 90th percentile latency
at 11s.
LPL comparison. Figure 4.4 shows that LPL duty-cycles are a factor of 6-9 higher than AEM
and a factor of 18-19 higher than the omniscient scheduler. This is attributable to LPL’s high
system maintenance overhead (15.2% vs 1.6% for AEM with no data packets to send) dominated
by broadcast traffic which is especially costly for LPL due to long preambles. This also explains
why LPL’s duty-cycle is relatively insensitive to workload. Figure 4.2 illustrates how this problem
can result in the radio being turned on for up to half a second at a time in our experiments.
Finally, Figure 4.7 shows that, with 8 out of 40 nodes responding to a task, the duty-cycle for
LPL can range from 11% to 21%, a much larger range than with AEM. Because of its higher
duty-cycle, LPL is able to achieve much lower latencies than AEM; Figure 4.6 shows the latency
distribution to be in the 0-5s range.
Figure 4.8: Duty-cycle increases with additional tasks and decreases as tasks terminate. (Axes: duty cycle (percent) vs. time (minutes).)
Thus, AEM meets our duty-cycle and latency design goals. LPL satisfies most of the other
goals, with the exception of low duty-cycle operation (our primary goal). In the following
experiments, we illustrate AEM’s adherence to our other goals.
4.5.2.2 Multiple Task Performance
AEM reacts to a new task insertion (potentially by different users) by setting up an additional
data schedule to accommodate the traffic generated in response to the task. To study how AEM
adapts to task insertion and deletion, we conducted the following experiment: the first task is
injected 5 mins after the system initialization, the second task at 15 mins, the second task is
terminated at 25 mins and the first task is terminated at 35 mins. The first task is a periodic task
that generates sensor readings every 2 mins but filters data locally so that only 20% of the nodes
respond with data. The second task is also a similar periodic task with the same data filter but
it generates sensor readings every minute.
Figure 4.8 shows the 1-min windowed average duty-cycle across the nodes over time. The
duty-cycle increases with the insertion of the first task, increases further with the insertion of
the second task, decreases when the second task is terminated, and decreases further when the
first task is deleted. Thus AEM’s performance adapts to the multi-task scenario as expected:
duty-cycles are higher when more tasks are active.
Figure 4.9: 50% of the nodes fail at 20 mins. (Axes: duty cycle (percent) vs. time (minutes).)
4.5.2.3 Tiered Networks
Tenet is designed for tiered networks, so AEM must support such networks as well. Our design
for AEM required no changes in order to support multiple masters. We conducted a single-task
experiment using the same 40 nodes, but with two masters. As expected, the duty-cycle decreases
(from 2.7% with one master to 2.4% with two) since the traffic is now spread out over two trees.
However, the decrease is not dramatic, since nodes in one tree can still overhear some nodes in
the other tree. We see no noticeable change in latency; this is because the dominant component
of latency in AEM is the transport timeout and the data periodicity, not the forwarding latency.
4.5.2.4 Robustness
AEM is, by design, robust to transient de-synchronization of network time, and to routing
dynamics. It is also robust to packet loss (since its schedule dissemination re-uses Tenet’s task
dissemination mechanism). In this section, we conduct an experiment to demonstrate AEM’s
robustness to node failure. During the course of a 40-node single-task experiment, we deactivated
50% of the network nodes. In this experiment, we used 0 dBm transmit power, so that the
deployment was dense enough that the rest of the network remained fully connected. Figure 4.9
plots the per-min average duty-cycles across the network as a function of time. As expected, the
duty-cycle increases when the task is first injected into the system at 10 mins. Between 25 and 30
mins into the experiment, we deactivated 20 nodes. As the figure shows, AEM continues to work
despite this disruption, and has an average duty-cycle that is about half of what it was before the
node failures, as one might expect.
Figure 4.10: Distribution of AEM’s control and data frame sizes. (Axes: CDF (frames) vs. length (ms); series: Control Frame, Data Frame.)
4.5.2.5 Other experiments
Finally, we briefly discuss several other experiments that give us insights about AEM’s perfor-
mance or explore the sensitivity of our results to parameter settings.
Frame length distribution. Figure 4.10 shows the distribution of the length of the frames
during an experiment in which all the motes responded to a single task. The minimum possible
frame length is 70ms (the value of the quiet-time parameter). It is interesting to note that many
data frames are significantly longer than the minimum frame length, illustrating that the system
adapts when necessary to absorb retransmissions. Control frames are generally smaller, since
control packets are not retransmitted and their load does not vary with time.
Performance at a higher density. Except where noted, our experiments used the same radio
transmit power setting. To validate that AEM’s duty-cycle is still low at
higher densities, we conducted an experiment where all the motes used 0 dBm transmit power,
and all were responding to a single task. At this higher density, we observed that the average
duty-cycle decreased from 2.7% to 2.26%, and the latency from 10.6s to about 9.2s. At the higher
transmit power, the shallower tree results in lower duty-cycles (since fewer nodes forward the
packet), and better quality paths cause fewer retransmissions, resulting in slightly lower latency.
Figure 4.11: LPL experiment results for varying sleep interval on Tutornet. (Axes: CDF (nodes) vs. duty cycle (%); series: 250ms, 500ms, 1000ms.)
LPL’s Parameter Sensitivity. LPL performance is sensitive to its sleep interval and channel
polling count threshold. In our experiments, we used a 500ms sleep interval. Figure 4.11 shows
that, for the workloads we use, a 500ms sleep interval gives low duty-cycles. With a 1000ms sleep
interval, the preamble length overhead outweighs the polling overhead, while with a 250ms sleep
interval, the opposite is the case.
LPL performs a sequence of Clear Channel Assessments to check if the channel is clear. If it
detects more CCA samples than a specified threshold, it assumes the presence of a transmitter
and keeps the radio on to receive a packet. In our experiments, we used the default value of 3. A
small value can cause many false positives in channel assessment, causing the node to turn on its
radio more frequently. On the other hand, a larger value might miss packets, resulting in higher
loss rates. With a threshold of 30, LPL’s average duty-cycle is reduced from about 19% to about
16%. More experiments might be needed to find the “optimal” threshold for a given environment,
but we believe that our general conclusion (that AEM provides lower duty-cycle operation) holds.
4.6 Conclusion
Radio duty-cycling protocols that are designed and optimized for a specific workload are not
extensible to systems such as Tenet. The design of an efficient and robust duty-cycling protocol
appropriate for Tenet required the use of techniques to explicitly adapt to application-layer and
link-layer dynamics. Our experience with AEM suggests that it is possible to run a general-purpose,
interactive, and dynamically taskable multi-user network such as Tenet at sub-3% duty-cycles
for widely varying workloads.
Chapter 5
Approaches to Making Sensor Network Routing Reliable
The presence of unreliable links and temporal link dynamics in wireless sensor networks requires the use of
different techniques to make data delivery to the sinks reliable. Three commonly used techniques
to improve data delivery reliability are (1) link-layer retransmission, (2) blacklisting bad links,
and (3) end-to-end routing metrics. In this chapter, using simulation and testbed experiments,
we study the effectiveness and tradeoffs of combinations of these approaches. This study identifies
ETX metric-based routing in conjunction with per-hop retransmissions as a highly effective
way to improve the performance of network routing and thus sets the stage for our work on
collection routing, which we describe in Chapter 6.
5.1 Introduction
Radio links in wireless sensor networks exhibit widely varying reliability over time, space, and
from node to node. The radios used in current research platforms have shown widely varying
performance over time and space and use very simple CSMA MAC protocols [122, 114, 117]. The
drive to minimize node cost and size motivates a minimal hardware and software structure, yet
the sensor network as a whole must provide a reliable environment for communications. Recent
research has explored several techniques to improve reliability: link-level retransmission (ARQ);
blacklisting, i.e., rejecting bad links; and routing using a metric that reflects path reliability. In
this chapter, we conduct the first careful exploration of the interactions between these techniques
as each strives to improve reliability in different ways.
Per-hop retransmission (often called ARQ at the MAC layer) is a widely used technique to
improve reliability of a given link [10]. Retransmissions are attempted one or more times up to
some limit before the packet is declared lost. Using link-level ARQ, losses can be quickly detected
and corrected, and even a few per-link retransmissions can greatly improve end-to-end reliability.
Blacklisting is a technique that prevents low-quality links from being considered for path
selection [26]. With blacklisting, all nodes collect statistics about delivery rates with their neighbors.
These delivery rates are used to estimate the quality of all wireless “links”. Links with a delivery rate
below a configured blacklisting threshold are ignored: inbound and outbound packets on those links
are dropped. By avoiding tenuous links, blacklisting can improve end-to-end reliability, although
ignoring links risks partitioning the network.
Recent research has proposed the use of link reliability as a metric for routing path selec-
tion [23]. Such a metric allows the routing protocol to consider cumulative link reliability over
paths, and find the most reliable end-to-end path. Several metrics have been proposed to rep-
resent reliability, and we review them in Section 5.2.3. Metric-based routing can incur higher
control message overhead if link reliability changes frequently.
These three mechanisms are not mutually exclusive; each approaches the problem of improving
end-to-end reliability in a different way. However, to our knowledge, there is no literature that
systematically compares these techniques across a range of parameters, both individually and in
combination. In this chapter, we conduct such a systematic study. This study is complicated
by the fact that the parameter space is rather large: each technique can be used with different
settings and thresholds. We use simulation to explore the space thoroughly, then validate selected
simulation results through testbed experiments.
Our goal is to understand how different techniques can combine to provide “reasonably” reliable
end-to-end delivery in the face of lossy links. We assume that applications can tolerate or
recover from occasional loss [107, 99], and that the primary source of loss is noise and
environmental effects, not congestion [90, 108]. These characteristics are typical of many current
sensor networks.
Our study compares per-hop retransmission, blacklisting, and metric-based
routing, examines their interactions, and identifies several key results. One is that per-hop
retransmission is a necessary addition to any other mechanism if reliable data delivery is a goal.
Additional interactions between the mechanisms are more subtle. First, in a multi-hop network, either
blacklisting or reliability metrics like ETX can provide consistent high-reliability paths when
added to ARQ. Second, at higher deployment densities, blacklisting has a lower routing overhead
than ETX. But at lower densities, blacklisting becomes less stable as the network partitions.
These results are consistent across both simulation and testbed experiments. Finally, we have
conflicting results about the effects of combining all three mechanisms. Testbed results suggest
that moderate blacklisting can reduce the cost of route discovery when added to metric-based
routing; however, this observation is not supported in simulation. We conclude that ETX with
retransmissions is the best choice in general, but that blacklisting may be worth considering at
higher densities, either with or without ETX.
5.2 Detailed Approaches to Improve Path Reliability
We consider three approaches to improving path reliability: per-hop retransmission, blacklisting,
and reliability-based metrics in routing. This section briefly reviews the specific algorithms we
use in our simulations and testbed experiments. Since blacklisting and reliability metrics depend
on estimates of link reliability, we begin by summarizing how link statistics can be collected.
Blacklisting and reliability metrics estimate link quality so that they can select high quality
links to be used for routing. Link delivery rate changes over time due to environment or transient
traffic characteristics. Link statistics need to be reasonably responsive to these changes. Woo and
Culler evaluated a range of options for link estimator and neighborhood table management [114].
They identify Window Mean with Exponentially Weighted Moving Average (WMEWMA) to be
a good estimator of link quality in a wireless sensor network. One can use active or passive
techniques to collect link statistics. Active techniques rely on periodic broadcasts containing
link statistics about each neighbor. Passive probing involves piggybacking link statistics on
outgoing data packets or inferring link delivery statistics using information from the MAC.
Combinations of active and passive probing are also feasible, as shown in our later work described in
Section 6.4. We use active probing and the WMEWMA estimator in our testbed experiments. The choice
of a single probing technique should not favor any given protocol since it affects all reliability
techniques equally.
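The idea behind a window-mean EWMA estimator can be sketched as follows. This is a simplified illustration and not the implementation used in the testbed: the window size, the smoothing weight `alpha`, and the class and method names are assumptions chosen for the example.

```python
class WMEWMA:
    """Window mean with EWMA link-quality estimator (simplified sketch)."""

    def __init__(self, window=10, alpha=0.6):
        self.window = window    # probes per window (assumed value)
        self.alpha = alpha      # weight given to the previous estimate (assumed value)
        self.estimate = None    # undefined until the first window closes
        self._received = 0
        self._expected = 0

    def observe(self, received):
        """Record one expected probe; `received` is True if it arrived."""
        self._expected += 1
        if received:
            self._received += 1
        if self._expected == self.window:
            self._close_window()

    def _close_window(self):
        # Mean delivery rate over the window, then exponential smoothing.
        mean = self._received / self._expected
        if self.estimate is None:
            self.estimate = mean
        else:
            self.estimate = self.alpha * self.estimate + (1 - self.alpha) * mean
        self._received = 0
        self._expected = 0


est = WMEWMA(window=10, alpha=0.6)
for _ in range(10):
    est.observe(True)      # first window: all 10 probes received
for i in range(10):
    est.observe(i < 5)     # second window: 5 of 10 probes received
print(est.estimate)
```

A larger `alpha` makes the estimate more stable but slower to react to a change in link quality, which is the responsiveness tradeoff mentioned above.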
5.2.1 Per-hop retransmission
Retransmission is a well known technique to improve the quality of unreliable links. Retransmis-
sion is often done at the link layer, or it can be done at higher layers, both to the same effect. We
vary the number of allowed data retransmissions from zero to three.
In contention-based MACs, there is often a higher possibility of collision during the contention
period. In our testbed experiments we use S-MAC [118] which includes an RTS/CTS protocol.
It is important to distinguish retransmissions of the contention signal from retransmissions of the
data. We always allow up to seven attempts at retransmitting RTS signals independent of the
number of allowed data retransmissions.
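To see why a few per-hop retransmissions matter, the following sketch computes hop and path delivery probability under two simplifying assumptions that are not taken from the experiments: losses on successive attempts are independent, and acknowledgments are never lost. The 8-hop path and the 0.7 per-attempt delivery rate are hypothetical.

```python
def hop_success(p_link, max_retx):
    """Probability that one hop delivers a packet within 1 + max_retx attempts."""
    return 1.0 - (1.0 - p_link) ** (max_retx + 1)


def path_success(link_probs, max_retx):
    """End-to-end delivery probability when every hop uses the same retry limit."""
    prob = 1.0
    for p in link_probs:
        prob *= hop_success(p, max_retx)
    return prob


# Hypothetical 8-hop path in which every link delivers 70% of single attempts.
path = [0.7] * 8
for retx in range(4):  # zero to three retransmissions, as varied in this study
    print(retx, round(path_success(path, retx), 3))
```

Under these assumptions the end-to-end probability grows quickly with the retry limit because each hop's residual loss shrinks geometrically.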
5.2.2 Blacklisting
Blacklisting removes unreliable links from the set of links the routing layer can use to form a path.
Only the links with reliability higher than a blacklisting threshold are made available for sending
and receiving messages. Blacklisting is usually applied above the link layer and before the message
gets to the network layer. Our blacklisting implementation drops incoming and outgoing packets
on each link that it determines to have reliability below the specified blacklisting threshold. Note
that this approach effectively eliminates asymmetric links from consideration. In this chapter, we
focus on using blacklisting to reduce the impact of quality variations across different links. Adding
hysteresis to blacklisting can reduce the impact of temporal variation in link quality, increasing the
stability of network performance. In our experiments, however, we use a single threshold, which classifies a
link as a bad link as soon as and for as long as its reliability falls below the threshold.
Setting a very high blacklisting threshold results in only highly reliable links participating in
route selection, which ultimately helps select a path with high end-to-end reliability. However, a
high threshold can also make nodes unreachable if removal of links with lower reliability creates
a network partition. Setting the threshold too low allows mediocre links to be selected on a path,
which could result in low end-to-end reliability. In our study, we explore the impact of threshold
selection on routing performance. However, since less-reliable links tend to form less-desirable
paths, we examined high and moderate thresholds in greater detail than low thresholds.
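The threshold tradeoff can be illustrated with a small sketch. It assumes that a link survives only if both directions meet the threshold (one way to model the elimination of asymmetric links noted above); the three-node topology and its delivery rates are hypothetical.

```python
from collections import deque


def blacklist(links, threshold):
    """Keep a directed link only if both directions meet the threshold.

    `links` maps (u, v) to the delivery rate of the directed link u -> v.
    """
    kept = {}
    for (u, v), rate in links.items():
        reverse = links.get((v, u), 0.0)
        if rate >= threshold and reverse >= threshold:
            kept[(u, v)] = rate
    return kept


def reachable(links, source):
    """Set of nodes reachable from `source` over the surviving links (BFS)."""
    adjacency = {}
    for (u, v) in links:
        adjacency.setdefault(u, []).append(v)
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Hypothetical topology: A-B is strong, B-C is moderate, A-C is asymmetric.
links = {("A", "B"): 0.95, ("B", "A"): 0.95,
         ("B", "C"): 0.75, ("C", "B"): 0.75,
         ("A", "C"): 0.45, ("C", "A"): 0.90}
for threshold in (0.7, 0.9):
    print(threshold, sorted(reachable(blacklist(links, threshold), "A")))
```

In this toy topology the moderate threshold keeps every node reachable through B, while the high threshold removes the B-C link and cuts C off, which is the partition risk described above.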
5.2.3 Reliability Metrics
Path reliability, when used as an end-to-end routing metric, can identify the most reliable end-
to-end path between two nodes. By default, many sensor [49] and ad hoc routing protocols use
latency or hop count as a metric. Because they do not differentiate paths based on reliability,
they tend to select paths with low reliability [24]. To determine path quality, we first quantify the
reliability of each link in terms of a metric: the success rate, expected number of retransmissions,
or signal strength. A routing protocol can then aggregate these link metrics over the links on a
path to compute the path metric, and select paths with the best end-to-end routing metric.
A given metric has an associated resolution that limits path differentiation. The resolution
is applied when links are measured. For instance, for a success rate metric, a resolution of 20%
categorizes all links into five classes with reliabilities 0–20%, 21–40%, 41–60%, 61–80%, and 81–100%.
A low resolution metric may reduce the quality of the resulting path by treating a link with
81% reliability as equivalent to a link with 99% reliability. However, at high resolutions (say 1%),
routing algorithms can over-optimize to accomplish limited improvement, switching from a 97%
link to a 98% and then a 99% link. Since link qualities are experimentally observed and approximate
to begin with, these changes incur the cost of propagating new routes while providing little or no
actual change in quality.
In this study, we use a variant of ETX [23], also proposed as MT [114] by Woo et al., as the
routing metric. ETX is defined as the expected number of transmissions (including retransmissions)
for a successful end-to-end data forwarding and hop-by-hop acknowledgment. The following
expression shows how to compute the ETX metric for a path p consisting of links a..z with forward
reliability of $\mathrm{forward}_a$ and backward reliability of $\mathrm{backward}_a$ for link a:

\[ \mathrm{etx}(a) = \frac{1}{\mathrm{forward}_a \times \mathrm{backward}_a} \]

\[ \mathrm{ETX}(p) = \mathrm{etx}(a) + \dots + \mathrm{etx}(z) \]
Our version of ETX rounds the ETX value for each link to its nearest integer, effectively
reducing the resolution of the ETX metric. For example, forward and reverse reliabilities in
the range [0.82,1] result in an ETX value of 1, which makes links different in reliability by as
much as 0.18 appear identical. With lower link reliability, ETX becomes more sensitive to small
differences in link reliability, enabling it to compare links at a higher resolution. Thus the resolution
of this reliability metric, while variable, is at most 0.18. This implementation was intended to
approximate previously reported routing systems as closely as possible.
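The rounded per-link ETX and the additive path metric can be sketched as follows. The function names are illustrative, and the tie-breaking rule (round half up) is an assumption, since the text only says that values are rounded to the nearest integer.

```python
import math


def link_etx(forward, backward):
    """Per-link ETX, rounded to the nearest integer as in the variant above."""
    if forward <= 0 or backward <= 0:
        return math.inf  # an unusable link has unbounded expected cost
    raw = 1.0 / (forward * backward)
    return math.floor(raw + 0.5)  # round half up (assumed tie-breaking rule)


def path_etx(links):
    """Path ETX is the sum of the per-link ETX values along the path."""
    return sum(link_etx(f, b) for f, b in links)


# Links whose reliabilities both lie in [0.82, 1] collapse to the same value.
print(link_etx(1.00, 1.00), link_etx(0.82, 0.82), link_etx(0.81, 0.81))

# Hypothetical three-hop path given as (forward, backward) reliabilities.
print(path_etx([(0.9, 0.9), (0.6, 0.7), (1.0, 0.95)]))
```

The first line shows the coarse resolution near the top of the range: a perfect link and a link at 0.82 in both directions receive the same cost, while a link at 0.81 in both directions is already counted as twice as expensive.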
We also consider end-to-end success rate (SR) as a second routing metric. We use the SR
metric to evaluate the effect of metric resolution on performance because it provides consistent
resolution across the range of values. To compute end-to-end success rate we use the product of
forward and backward reliabilities of all links in a path as our metric. This metric is similar to
the metric proposed in [117], but by taking the minimum of forward and backward reliability,
that metric tends to under-estimate link reliability when links are asymmetric. Note that a variation
that includes only forward reliability is a reasonable alternative when acknowledgments are not
enabled (but we do not evaluate this variation).
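The SR metric and the effect of its resolution can be sketched as follows. The choice to represent each resolution class by its upper bound is an assumption made for illustration, as are the function names and the example path.

```python
import math


def quantize(reliability, resolution):
    """Map a reliability in [0, 1] to the upper bound of its resolution class."""
    if reliability <= 0:
        return 0.0
    # round() guards against floating-point noise before taking the ceiling.
    classes = math.ceil(round(reliability / resolution, 9))
    return min(1.0, classes * resolution)


def path_success_rate(links, resolution):
    """Product of quantized forward and backward reliabilities over all links."""
    rate = 1.0
    for forward, backward in links:
        rate *= quantize(forward, resolution) * quantize(backward, resolution)
    return rate


# With a 20% resolution, 81% and 99% links fall into the same class.
print(quantize(0.81, 0.20) == quantize(0.99, 0.20))

# Hypothetical two-hop path evaluated at coarse and fine resolution.
path = [(0.81, 0.99), (0.95, 0.90)]
print(path_success_rate(path, 0.20), path_success_rate(path, 0.01))
```

At the coarse resolution every link on this path looks perfect, so the metric cannot prefer a better alternative; at 1% resolution the same path is clearly distinguishable from a more reliable one.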
5.3 Evaluation
5.3.1 Evaluation Metrics
To compare protocol alternatives we consider the following evaluation metrics:
Routing Overhead. Routing overhead provides an estimate of the energy cost of finding a path
for data forwarding. We compute it by counting packets sent during path discovery. This estimate
assumes an energy-conserving MAC protocol is in use so that idle listening does not dominate
energy consumption. (An alternative is to count packets received; we do not do that because it
is more sensitive to density.)
Active link quality estimation involves a periodic exchange of bi-directional link quality estimates
with each neighbor and can be an additional source of overhead. We do not measure this
cost in our experiments and therefore slightly overestimate the relative cost of ML with
retransmissions but no blacklisting. However, because data delivery cost for this configuration is much higher
than for alternative schemes, our overall results do not change.
Path Reliability. Path reliability measures the ratio of successfully delivered messages at the
sink to the number sent by sources. Although a high path reliability is desirable, a slightly lower
reliability may be tolerable if accompanied by much lower overhead. This metric is sometimes
called delivery ratio.
Path Length. We measure path length in hops from source to sink. Longer path lengths
correspond to higher delivery latency. This relationship is approximate, however, since we do not
explicitly model MAC-level retransmission costs on latency or energy.
Data Dissemination Overhead. This metric captures the cost to send data, and includes
retransmissions but excludes routing overhead. This metric is computed by normalizing the total
number of data transmissions by the number of successfully delivered messages to reflect the cost of
packets that are sent but lost. Assuming an energy-conserving MAC, data overhead approximates
the energy consumed to send data in the system.
We decided to evaluate routing and data dissemination overheads separately so that our result
can be used to estimate aggregate overheads for applications with different route update and data
rates. We also note that retransmissions affect path reliability and data dissemination overhead
while blacklisting and reliability metrics impact all evaluation metrics.
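The two delivery-related metrics defined above reduce to simple ratios, sketched below. The function names and the packet counts are hypothetical and serve only to show how the normalization charges lost packets to the delivered ones.

```python
def path_reliability(delivered, sent):
    """Ratio of messages delivered at the sink to messages sent by sources."""
    if sent <= 0:
        raise ValueError("at least one message must be sent")
    return delivered / sent


def data_dissemination_overhead(total_data_transmissions, delivered):
    """Data transmissions (including retransmissions) per delivered message."""
    if delivered == 0:
        return float("inf")  # nothing delivered: the cost per delivery is unbounded
    return total_data_transmissions / delivered


# Hypothetical run: 500 messages sent, 490 delivered, 8,820 link-level transmissions.
print(path_reliability(490, 500))
print(data_dissemination_overhead(8820, 490))
```

Because the denominator counts only delivered messages, transmissions spent on packets that are eventually lost still raise the reported overhead.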
5.3.2 Simulations
We conducted a simulation study of reliability techniques to systematically explore the param-
eter space of each mechanism and combinations of the mechanisms. This section reports this
exploration; Section 5.3.3 validates key results in a testbed.
5.3.2.1 Simulation Methodology
We considered the interaction between three techniques: retransmissions, blacklisting, and
metric-based routing. We evaluated all 96 combinations of these parameters: 0, 1, 2, or 3 retransmissions;
0% (see footnote 1), 40%, 60%, 70%, 90%, and 95% blacklisting thresholds; and minimum latency (ML), Success
Rate (SR) metric at 1% and 10% resolutions (SR01 and SR10), and expected transmissions
(ETX) as routing metrics. For brevity, we summarize the parameters using a three-tuple notation
(A, B%, C), where A is the number of retransmissions, B is the blacklisting threshold, and C is
the routing metric.
1 A threshold of 0 does not filter out any bad links and the behavior of the underlying routing protocol is unchanged.
Figure 5.1: Reliability vs. distance profile used in the simulation to compare protocol combinations. (Axes: link reliability vs. distance (m).)
We evaluate these techniques using the one-phase-pull (OPP) variant of Directed Diffusion [41].
Directed Diffusion is a data-centric mechanism for naming, aggregation, and dissemination of
information in a sensor network [49]. We chose Directed Diffusion for our experiments because
it is used in several sensor network deployments, is freely available, and allows us to observe a
specific real protocol. In OPP, the querying node, also called the sink, broadcasts a query, also
called an interest, into the network. Data generated by the source nodes are directed back to
the sink using reverse path routing for given query attributes, also called a gradient. Queries are
re-injected into the network every interest epoch. Intermediate nodes pick up the first interest
message they receive and ignore the rest of the interest messages that arrive in the same epoch.
By default, nodes with diffusion select a minimum-latency path like the ML metric. To simulate
other reliability metrics we extended OPP to encode the routing metric as an additional attribute
in the interest message. Nodes update the metric value when they forward the interest message,
rebroadcasting interests if the metric improves within a single epoch. While we use this specific
protocol, we expect our results to be applicable to other ad hoc routing protocols as well.
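The interest-handling rule described above can be sketched as follows. This is an illustrative model and not code from the diffusion release: the class and method names are hypothetical, and it assumes an additive metric where lower is better (as with ETX).

```python
class InterestState:
    """Per-node interest handling for one query (simplified sketch)."""

    def __init__(self, use_metric):
        self.use_metric = use_metric  # False models default minimum-latency OPP
        self.epoch = None
        self.best = None              # best path metric seen in this epoch
        self.upstream = None          # neighbor that data is sent back toward

    def on_interest(self, epoch, sender, advertised_metric, link_cost):
        """Return the metric to rebroadcast, or None to suppress the interest."""
        if epoch != self.epoch:
            # A new interest epoch resets the state for this query.
            self.epoch, self.best, self.upstream = epoch, None, None
        candidate = advertised_metric + link_cost
        first = self.best is None
        if first or (self.use_metric and candidate < self.best):
            self.best, self.upstream = candidate, sender
            return candidate
        return None


node = InterestState(use_metric=True)
print(node.on_interest(1, "a", 3, 1))  # first interest in the epoch is accepted
print(node.on_interest(1, "b", 1, 1))  # a better metric triggers a rebroadcast
print(node.on_interest(1, "c", 5, 1))  # a worse metric is suppressed
```

Each accepted improvement costs one extra interest transmission, which is why finer-grained metrics tend to increase routing overhead.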
We conduct our simulations using diffusion release 3.2.0 as a process-level simulator. Packets
are sent between nodes as UDP packets. Between each pair of nodes, all packets are subject to
probabilistic loss as a function of distance based on propagation profiles (Figure 5.1) from Zhao
et al. [122].
We consider a 125-node network with nodes placed in a 124 × 124 m² area using a placement
strategy similar to that in [41]. A sink is placed in the lower left sixteenth of the sensor field.
We use ten clustered sources; a first source is chosen in the upper right sixteenth of the area,
additional sources are taken as the nearest nodes to that source. This constrained source and
sink placement allows us to maintain consistent average path lengths from source to sink across
different randomly generated instances of the sensor network. To vary node density we changed
the number of nodes by placing 45, 65, and 125 nodes in the given area. Assuming a nominal radio
range of 30m and uniform node density, these placements result in topologies with an average of
8, 12, and 23 neighbors per node. For a given density, we generated 20 topologies with i.i.d.
random node placements and simulated all 96 parameter combinations on those topologies.
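The neighbor counts quoted above can be checked with a back-of-the-envelope calculation. The sketch assumes uniform node density and ignores edge effects near the boundary of the field; the function name is illustrative.

```python
import math


def expected_neighbors(num_nodes, side_m=124.0, radio_range_m=30.0):
    """Expected nodes inside one radio-range disc under uniform density."""
    density = num_nodes / (side_m ** 2)             # nodes per square meter
    return density * math.pi * radio_range_m ** 2   # nodes per coverage disc


for n in (45, 65, 125):
    print(n, round(expected_neighbors(n)))
```

Rounding the three results gives the 8, 12, and 23 neighbors per node used to label the densities in this chapter.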
Each source generates one data packet every two seconds. Our simulations do not aggregate
data within the network. We set an interest epoch of 100 seconds, so any routing choice affects
about 50 data packets.
Our simulation results report 95% confidence intervals for each metric, obtained from 20
simulation runs.
5.3.2.2 Simulation Results
We have extensively explored the parameter space of interaction between blacklisting, reliability
metrics and link-layer retransmission. This section is organized in a manner that brings out the
main results. Recall that the focus of our study is to find a set of mechanisms that enables highly
reliable delivery with low overhead. We base our initial discussions on simulation of a network
with an average density of 23 neighbors having a non-zero reception rate. Towards the end of the
section, we discuss the impact of density.
Figure 5.2: Evaluation of the impact of retransmission alone, and in combination with blacklisting and reliability metrics, on protocol performance. Panels: (a) reliability with 0-3 retransmissions for (*,0%,ML), (*,0%,ETX), (*,0%,SR10), and (*,0%,SR01); (b) reliability for (*,70%,ML) and (*,0%,ETX); (c) delivery cost (transmissions per delivered packet) for (*,70%,ML) and (*,0%,ETX). (x-axis: number of retransmissions.)
Our first result is that link-layer retransmissions are necessary for achieving high reliability,
given the packet loss rates observed in practical sensor network settings. Figure 5.2(a) shows
that, without retransmissions, none of our mechanisms exhibit path reliability exceeding 70%. In
addition, Figures 5.2(a) and 5.2(b) show that a small number of retransmissions is sufficient to
achieve high path reliability (above 90%) when used in combination with either blacklisting or a
reliability metric. That retransmissions can improve path delivery is somewhat obvious, but it is
62
0
2
4
6
8
10
12
14
16
18
20
95 90 70 60 40 0
Path Length (hops)
Blacklisting Threshold
(0,*,ML)
(0,*,ETX)
(0,*,SR10)
(0,*,SR01)
(a)Pathlengthfordifferentmetricswithdifferent
blacklisting thresholds
0
100
200
300
400
500
600
700
800
95 90 70 60 40 0
Interest cost
Blacklisting Threshold
(0,*,ML)
(0,*,ETX)
(0,*,SR10)
(0,*,SR01)
(b) Interest cost with different metrics
0.0
0.2
0.4
0.6
0.8
1.0
95 90 70 60 40 0
Reliability
Blacklisting Threshold
(0,*,ML)
(0,*,ETX)
(0,*,SR10)
(0,*,SR01)
(c) Reliability with different blacklisting thresh-
olds
0
10
20
30
40
50
60
70
80
90
100
110
120
95 90 70 60 40 0
Number transmit / packet delivered
Blacklisting Threshold
(0,*,ML)
(0,*,ETX)
(0,*,SR10)
(0,*,SR01)
(d) Delivery cost at different blacklisting thresh-
olds
Figure 5.3—The effectiveness of blacklisting and routing metrics on protocol performance without any retrans-
mission
worth emphasizing particularly because commonly used sensor network MAC (such as those in
TinyOS
2
) either omit ARQ or make it optional.
High reliability can come at the cost of high overhead, however, if path lengths become too long
or the number of retransmissions excessive. Next, we observe that ETX together with retransmissions
can achieve high reliability efficiently. Figures 5.2(b) and 5.2(c) show that ETX can achieve nearly
98% delivery with up to 3 retransmissions with about 18 transmissions per delivered packet. Since
typical path lengths incurred for ETX are 8–9 hops, this suggests about two transmissions per
hop.
2 Reflecting the importance of this technique, the link layer in the latest TinyOS 2.x version now supports packet retransmission as an option.
Figure 5.4: Comparison of leading protocol combinations with retransmission, blacklisting, and ETX. Panels: (a) data delivery cost (transmissions per delivered packet); (b) data delivery reliability. (Configurations: (0,0%,ML), (0,70%,ML), (3,70%,ML), (2,0%,ETX), (1,0%,SR01).)
The efficiency of metric-based routing can depend on the choice of metric. Figure 5.2(a)
shows that a higher resolution reliability metric (SR01) achieves a higher reliability (68%) than
lower resolution reliability metrics (ETX at 40% and SR10 at 51%) at zero retransmissions. It
would then seem that tuning the resolution of the reliability metric can improve reliability
significantly. However, doing so increases path length and overhead. With a high-resolution metric
(SR01), paths are twice as long as with the ML metric (Figure 5.3(a) at 0% blacklisting threshold);
high-resolution metrics are clearly unacceptable for latency-critical applications. Similarly, a
high-resolution metric triggers many more route updates; Figure 5.3(b) shows that, without
blacklisting, SR01 incurs 591 transmissions (4.7 transmissions per node) during interest propagation,
more than twice that of ETX or SR10. For this reason, ETX together with a small number of
retransmissions provides better path selection at low overhead. We note that this result, while
not startling, is new: ETX has been shown to have good performance, but, to our knowledge, its
performance in concert with link-layer retransmissions had not been studied before.
[Figure 5.5: reliability (y-axis, 0.0 to 1.0) versus blacklisting threshold (x-axis, 95 to 0) for (3,*,ML) at densities of 8, 12, and 23 neighbors per node, with (3,0%,ETX) plotted for reference.]
Figure 5.5: The interaction between density and blacklisting. (3,0%,ETX) shown for comparison.
More surprising is the observation that the ML metric, together with blacklisting and retransmissions,
is able to achieve comparable reliability at lower overhead than ETX with retransmissions
(Figure 5.4(a)). Essentially, blacklisting enables ML to find paths that have moderate reliability,
and retransmissions on these paths improve path reliability (for example, compare the differences
in reliability between (0,0%,ML), (0,70%,ML), and (3,70%,ML) in Figure 5.4(b)). ETX pays a
higher interest overhead to find similar, high-reliability paths. Figure 5.3(b) shows that
interest overhead for ETX is 33% higher than that for the ML metric; this arises from the fact
that metric-based routing must propagate route updates as higher-reliability paths supersede
low-latency, low-reliability paths. Blacklisting, on the other hand, immediately rejects these paths as
beneath threshold.
Variations in network density can affect these results. Unfortunately, while (3,70%,ML) is
comparable to (3,0%,ETX) at high densities, the reliability of blacklisting falls off at lower density
deployments. For example, as Figure 5.5 shows, at 12-neighbor density, (3,70%,ML) is about as
reliable as (3,0%,ETX) (83% vs. 92%) with 30% lower interest costs. But when we consider
less dense deployments in Figure 5.5, the reliability of (3,70%,ML) falls off because blacklisting
begins to partition the network, rejecting unreliable but necessary links. For these reasons, we
conclude that despite its higher interest cost, ETX together with retransmissions is the most
Figure 5.6—Testbed deployment map. Gray boxes are relay nodes. The black circle (top-right) is a source node.
The black box (bottom-left) is a sink node.
desirable alternative since it is more stable across a range of densities. However, in a high density
deployment of a sensor network, blacklisting may be preferable because of its lower interest cost.
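The contrast between the two approaches, and the partitioning risk at low density, can be illustrated with a small sketch. It is not the simulator used in this chapter: the topologies and link qualities are hypothetical, breadth-first search stands in for the minimum-latency (ML) metric, and per-link ETX is approximated as the inverse of a single delivery probability.

```python
import heapq
from collections import deque

# Hypothetical topology: (node, node) -> delivery probability of the link.
LINKS = {
    ("src", "a"): 0.95, ("a", "b"): 0.90, ("b", "sink"): 0.95,  # long, reliable
    ("src", "c"): 0.50, ("c", "sink"): 0.55,                     # short, lossy
}


def neighbors(links, threshold=0.0):
    """Adjacency map that keeps only links at or above the blacklisting threshold."""
    adj = {}
    for (u, v), p in links.items():
        if p >= threshold:
            adj.setdefault(u, []).append((v, p))
            adj.setdefault(v, []).append((u, p))
    return adj


def min_hop_path(adj, src, dst):
    """Breadth-first search, standing in for the minimum-latency (ML) metric."""
    parent, queue = {src: None}, deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v, _ in adj.get(u, []):
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return None  # blacklisting has partitioned the network


def min_etx_path(adj, src, dst):
    """Dijkstra over per-link ETX, approximated here as 1 / delivery probability."""
    best, heap = {src: 0.0}, [(0.0, src, [src])]
    while heap:
        cost, u, path = heapq.heappop(heap)
        if u == dst:
            return path
        if cost > best.get(u, float("inf")):
            continue
        for v, p in adj.get(u, []):
            new_cost = cost + 1.0 / p
            if new_cost < best.get(v, float("inf")):
                best[v] = new_cost
                heapq.heappush(heap, (new_cost, v, path + [v]))
    return None


ml_plain = min_hop_path(neighbors(LINKS), "src", "sink")            # short, lossy path
ml_blacklist = min_hop_path(neighbors(LINKS, 0.70), "src", "sink")  # lossy links rejected
etx_plain = min_etx_path(neighbors(LINKS), "src", "sink")           # reliable path, no threshold

# A sparser deployment: the only route uses a 60% link, so a 70% threshold cuts it.
SPARSE = {("src", "a"): 0.95, ("a", "sink"): 0.60}
ml_sparse = min_hop_path(neighbors(SPARSE, 0.70), "src", "sink")    # None: partitioned
```

In the dense example, ML with a 70% threshold and ETX without one settle on the same reliable path. In the sparse example the threshold removes the only route, whereas ETX would still use the 60% link.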
The disadvantage of ETX is its higher interest cost. We hypothesized that the addition of
moderate blacklisting to metric-based routing could serve to reduce this cost. Simulation results do
not support this hypothesis for ETX, but (as we show later) our testbed results do. In simulation,
as Figure 5.3(b) shows, interest overhead for (3,*,ETX) is fairly constant with moderate blacklisting
values (0–90%). A similar observation is true for path reliability (Figure 5.3(c)) and delivery cost
(Figure 5.3(d)). Testbed results reach a quite different conclusion, suggesting this topic as an area
for future work.
5.3.3 Testbed Experiments
To validate the simulation results, we conducted experiments on an 18-node testbed. Given
the logistical difficulty of exploring the entire space of 96 experimental configurations, we chose
five configurations as representative samples. Of these, the configuration (0,0%,ML) forms the
baseline, (1,60%,ETX) has all three mechanisms (retransmissions, blacklisting, and a reliability
metric), and (1,60%,ML), (0,60%,ETX), and (1,0%,ETX) consider combinations of two mechanisms
each at one specific parameter setting.
Figure 5.7—Network connectivity in the testbed. Dotted lines indicate links with less than 60% reliability in one
direction.
5.3.3.1 Methodology
In our 18-node Stargate [102] testbed, we configured one node to function as a source, one as a
sink, and 16 nodes as relays. Figure 5.6 is a map that shows how these nodes are deployed on
a floor of our office building. A Mica2 [22] node attached to each Stargate was used for radio
communication. We adjusted the radio transmit power on the mote such that each node has 5-15
neighbors. This setting provides a rich network connectivity (Figure 5.7) which makes available
numerous possible paths between the source and the sink. The motes run TinyOS 1.x, but with
S-MAC [118] as the MAC layer.
We used Emstar [34] on each node to set up and control the experiments. Emstar on each
Stargate node uses the Mica2 as a network interface. We use Emstar’s link statistics collection
module and its blacklisting module. Emstar’s link statistics collection uses WMEWMA to estimate
link quality to its neighbors. Its blacklisting module uses the link statistics estimate to identify
[Figure 5.8: four bar charts comparing testbed and simulation results for the configurations (0,0%,ML), (1,60%,ML), (0,60%,ETX), (1,0%,ETX), and (1,60%,ETX): (a) Interest cost per round; (b) Path length (hops); (c) Delivery rate; (d) Delivery cost (number of transmissions per packet delivered).]
Figure 5.8: Comparison of testbed and simulation results for various configurations using four metrics
links that have delivery rate below a configured threshold and disables those links. The Mica2
mote has a Chipcon CC1000 [47] radio.
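The estimator and blacklisting steps described above can be sketched roughly as follows. This is not Emstar's code or API: it is a generic window-mean-with-EWMA (WMEWMA) estimator paired with a threshold filter, and the class name, window size, and smoothing weight are assumptions chosen for illustration.

```python
class WmewmaEstimator:
    """Window mean with exponentially weighted moving average (WMEWMA)."""

    def __init__(self, window: int = 10, alpha: float = 0.6):
        self.window = window      # probes expected per averaging window (assumed)
        self.alpha = alpha        # weight given to history (assumed)
        self.received = 0
        self.expected = 0
        self.estimate = None      # no estimate until the first window closes

    def probe(self, received: bool) -> None:
        """Record one expected probe, which either arrived or was missed."""
        self.expected += 1
        self.received += int(received)
        if self.expected == self.window:
            mean = self.received / self.expected
            if self.estimate is None:
                self.estimate = mean
            else:
                self.estimate = self.alpha * self.estimate + (1 - self.alpha) * mean
            self.received = self.expected = 0


def usable_links(estimators: dict, threshold: float) -> set:
    """Neighbors whose estimated delivery rate is at or above the threshold."""
    return {name for name, est in estimators.items()
            if est.estimate is not None and est.estimate >= threshold}


# Two hypothetical neighbors: one hears every probe, the other every second probe.
links = {"n1": WmewmaEstimator(), "n2": WmewmaEstimator()}
for i in range(20):
    links["n1"].probe(True)
    links["n2"].probe(i % 2 == 0)
good = usable_links(links, threshold=0.60)   # n2 settles near 0.5 and is blacklisted
```

With a 60% threshold, as in the (1,60%,ETX) configuration, the half-lossy neighbor is excluded before routing ever considers it.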
Minimal software modification was necessary to the simulation software to run it on the Stargate
testbed. Diffusion was configured to use Emstar’s blacklisting service, and to obtain link
statistics from Emstar. We configured Emstar to send neighbor probes every 10 seconds. In our
experiments, each configuration ran for 37.5 minutes. During that time, the sink sent 15 rounds
of interest and the source sent data every three seconds.
Finally, in order to validate our simulation results on the testbed, we collected the temporal
link statistics and topology information from the testbed and input those into the simulator. The
next section compares the results obtained from simulation, with results from our testbed.
5.3.3.2 Results from Testbed Evaluation
Figures 5.8(a) through 5.8(d) compare the values of different metrics obtained using the testbed
and from a comparable simulation, for the five configurations. There is, by and large, remarkable
agreement between our testbed experiments and our simulation: for most metrics and for
most configurations, the difference between experiment and simulation falls within the bounds
of experimental error. This gives us confidence that our conclusions (Section 5.3.2) will hold in
practice. Now we focus on situations where there is some disagreement between experiment and
simulation.
In many cases, the testbed results are slightly different from those obtained using simulations
on the same topology. Lacking a detailed instrumentation at the MAC layer, we are not able to
isolate the cause for this discrepancy. We conjecture this difference can be explained by observing
that the simulator may not accurately capture interference from concurrent transmissions and
does not simulate the fine-grain link dynamics. Furthermore, the testbed results exhibit greater
variability than simulations on the same topology. This can be attributed to the smaller number of
nodes and runs on the testbed. We plan to verify our experimental results on a larger testbed.
Results from the testbed also have a higher variability relative to those discussed in Section 5.3.2,
particularly for configurations with blacklisting. We attribute this to our earlier simulations
not capturing the temporal variations in link quality observed in the testbed.
Finally, one configuration where experiment deviates from simulation is the impact of blacklisting
on interest overhead when used in conjunction with ETX. Figure 5.8(a) shows that (1,0%,ETX)
uses about 28 total messages in an 18-node network while the ML metric only uses about 18 total
messages every interest epoch. This contradicts our simulation results, which suggest that blacklisting
has a negligible effect on reducing interest overhead. We do not have an explanation for
this discrepancy.
5.4 Conclusion
In this chapter, we examined the interplay between three mechanisms for improving the reliability
of wireless routing paths with low overhead: blacklisting, reliability metrics, and retransmission. To
our knowledge, this study is the first systematic evaluation of this design space. Our simulations
reveal several interesting results: link-layer retransmissions are necessary for high path reliability;
a reliability metric like ETX, together with up to three link-layer retransmissions can provide
high path reliability at low overhead; more surprisingly, the ML metric together with blacklisting
and retransmissions can often provide comparable reliability with slightly lower overhead, but this
configuration is sensitive to the blacklisting threshold. Given these results, we conclude that a
reliability metric such as ETX, together with link-layer retransmissions, is a robust choice that
works well across the range of configurations we explored.
Chapter 6
Collection Tree Protocol
Collection is a best-effort datagram routing and forwarding service that is used to deliver data to
the sink in a sensor network. Implementing a robust and efficient collection protocol is notoriously
difficult, particularly due to the dynamics that such protocols must handle at the physical, link,
and network layers. In this chapter, we describe the design, implementation, and evaluation
of a protocol called the Collection Tree Protocol (CTP) to understand the mechanisms that
enable such routing protocols to adapt to the dynamics across the protocol stack and still achieve
robustness, reliability, and energy-efficiency.
6.1 Goals and Overview
Collection trees are a core building block for sensor network applications and protocols. In their
simplest use, collection trees provide an unreliable, datagram routing layer that deployments
use to gather data [110, 76, 105]. Additionally, tree collection protocols provide the topology
underlying most point-to-point routing protocols, such as BVR [33], PathDCS [30], and S4 [77]
as well as transport protocols such as IFRC [89], RCRT [81], Flush [56], and Koala [80].
This chapter describes key principles for designing collection protocols that can simultaneously
achieve four goals:
Reliability: a collection protocol should deliver at least 90% of end-to-end packets when
a route exists, even under challenging network conditions. 99.9% delivery should be achievable
without end-to-end mechanisms.
Robustness: it should be able to operate without tuning or configuration in a wide range of
network conditions, topologies, workloads, and environments.
Efficiency: it should achieve this reliability and robustness while sending few packets.
Hardware Independence: because sensor networks use a wide range of platforms, the
implementation should be robust, reliable, and efficient without assuming specific radio chip
features.
Achieving these goals depends on link estimation accuracy and agility. For example, recent
experimental studies have shown that, at the packet level, wireless links in some environments
have coherence times as small as 500 milliseconds [98]. Being efficient requires using these links
when possible, but avoiding them when they fail. The 4-bit link estimator, which we describe later
in this chapter, combines information from the physical, link, and network layers to accurately
estimate the link qualities, but it achieves this accuracy by changing its estimates as quickly as
every 5 packets.
Such dynamism is inherently challenging. Rapid topology changes lead to routing loops and
other problems that harm reliability and efficiency. Incorporating two mechanisms into a routing
layer can make it robust, efficient, and reliable in the presence of rapid topology changes.
The first is adapting the Trickle [67] algorithm, originally designed for propagating code updates,
to dynamically adapt the control traffic rate. This allows a protocol to react in tens of
milliseconds to topology changes, while sending a few control packets per hour when the topology
is stable.
The second is actively using its datapath to validate the routing topology as well as detect
loops. Each data packet contains the link-layer transmitter’s estimate of its distance. A node
detects a possible routing loop when it receives a packet to forward from a node with a smaller
or equal distance to the destination. Rather than drop such a packet, the routing layer tries to
repair the topology and forwards the packet normally. Using data packets maintains inconsistency
detection agility precisely when a consistent topology is needed, even when the control traffic rate
is very low due to Trickle.
We ground and evaluate these principles in a concrete protocol implementation, which we call
the Collection Tree Protocol (CTP). In addition to incorporating agile link estimation, adaptive
beaconing, and datapath validation, CTP includes many mechanisms and algorithms in its forwarding
path to improve its performance. These include re-transmit timers, a hybrid queue for
forwarded and local packets, per-client queueing, and a transmit cache for duplicate suppression.
To explore whether a routing layer with these principles can meet the above goals in a wide
spectrum of environments with minimal adjustments, we evaluate CTP on 12 different testbeds
ranging in size from 20–310 nodes and comprising 7 hardware platforms. While not deployments
in the field, the testbeds comprise diverse environmental conditions beyond our control, and provide
reproducibility and enough diversity to give us confidence that CTP achieves the above goals.
Anecdotal reports from several deployments support this belief. In two testbeds that have Telos
nodes, we evaluate CTP using three link layers: full power, low power listening [86], and low power
probing [80]. In one Telos-based testbed where there is exceptionally high 802.11b interference,
we evaluate CTP on an interference-prone and an interference-free channel.
Evaluating CTP’s use of agile link estimation, adaptive beaconing, and datapath validation,
we find:
• Across all testbeds, configurations, and CSMA layers, CTP’s end-to-end delivery ratio
ranges from 90.5% to 99.9%.
• CTP achieves a median duty cycle of 3% across the nodes in a network in an experiment
in which the network generates data at 30 packets/min and delivers them to the sink.
• Compared to MultiHopLQI, a collection protocol used in recent sensor network deployments
[110], CTP drops 90% fewer packets while requiring 29% fewer transmissions.
• Compared to MultiHopLQI’s fixed 30-second beacon interval, CTP’s adaptive beaconing
and datapath validation sends 73% fewer beacons while cutting loop recovery latency by
99.8%.
• Testbeds vary significantly in their density, connectivity, and link stability, and the
dominant cause of CTP packet loss varies across them correspondingly.
Our work on CTP makes three research contributions. First, it describes three key mechanisms,
agile link estimation, adaptive beaconing, and datapath feedback, which enable routing layers
to remain efficient, robust, and reliable in highly dynamic topologies on many different sensor
platforms. Second, it describes the design and implementation of CTP, a collection protocol that
uses these three mechanisms. Third, by evaluating CTP on 12 different testbeds, it provides a
comparative study of their behavior and properties. The variation across testbeds suggests that
protocols designed for and evaluated on only a single testbed are prone to failures when they
encounter different network conditions.
6.2 Challenges
Implementing robust and efficient wireless protocols is notoriously difficult, and protocols for
collection are no exception. At first glance, collection protocols may appear very simple. They
provide best-effort, unreliable packet delivery to one of the data sinks in the network. Having
a robust, highly reliable, and efficient collection protocol benefits almost every sensor network
application today, as well as the many transport, routing, overlay, and application protocols that
sit on top of collection trees.
Figure 6.1: Plot of packet reception rate (PRR) over a sliding window of 100 packets (upper) and RSSI for every
received packet (lower) during an experiment with an inter-packet interval of 10 ms. PRR can vary on timescales
orders of magnitude smaller than beacon rates. RSSI and other physical-layer measurements are absent for dropped
packets: using them can bias measurements.
But despite providing a simple service that is fundamental to so many systems, and being in
use for almost a decade, collection protocols today typically suffer from poor performance. Some
deployments observe delivery ratios of 2-68% [76, 61, 105, 110].
Furthermore, it is unclear why collection performs well in controlled situations yet poorly in
practice, even at low data rates. To better understand the causes of these failures, we ran a series
of experiments on 12 different testbeds and found two phenomena to be the dominant causes: link
dynamics and transient loops.
6.2.1 Link Dynamics
Protocols today use periodic beacons to maintain their topology and estimate link qualities. The
beaconing rate introduces a tradeoff between agility and efficiency: a faster rate leads to a more
agile network but higher cost, while a lower rate leads to a slower-to-adapt network and lower
cost. Early protocol designs, such as MintRoute, assumed that intermediate links had stable,
independent packet losses, and used this assumption to derive the necessary sampling window
for an accurate estimate [114]. But in some environments, particularly in the 2.4 GHz frequency
space, links can be highly dynamic. Experimental studies have found that many links are not
stationary, but bursty on the time scale of a few hundred milliseconds [98].
The upper plot in Figure 6.1, taken from the Intel Mirage testbed, shows an example of such
behavior: the link transitions between very high and very low reception multiple times in a 2-
second window. Protocols today, however, settle for beacon rates on the order of tens of seconds,
leading to typical rate mismatches of two to three orders of magnitude. This means that at low
beacon rates, periodic control packets might observe a reception ratio of 50%, while data packets observe
periods of 0% and 100%. The periods of 0% cause many wasted retransmissions and packet drops.
For a periodic beacon to be able to sample these link variations, the beacon interval would have to
be on the order of a few hundred milliseconds.
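A toy simulation makes this mismatch concrete. It is only a sketch under strong assumptions: the link model (strict alternation between all-good and all-bad 500 ms bursts), the beacon jitter, and all names are hypothetical and are not measurements from the testbeds.

```python
import random


def link_up(t_ms: int, burst_ms: int = 500) -> bool:
    """Hypothetical bursty link: alternates between all-good and all-bad bursts."""
    return (t_ms // burst_ms) % 2 == 0


def windowed_prr(times_ms, window: int):
    """Packet reception ratio over consecutive windows of `window` packets."""
    outcomes = [link_up(t) for t in times_ms]
    return [sum(outcomes[i:i + window]) / window
            for i in range(0, len(outcomes) - window + 1, window)]


# Data packets every 10 ms for 10 s, measured over 10-packet windows.
data_prr = windowed_prr(range(0, 10_000, 10), window=10)

# Beacons roughly every 30 s with random jitter (seeded for repeatability).
rng = random.Random(1)
beacon_times = [30_000 * i + rng.randrange(30_000) for i in range(2_000)]
beacon_prr = sum(link_up(t) for t in beacon_times) / len(beacon_times)

print(sorted(set(data_prr)))   # the data path sees only the two extremes
print(round(beacon_prr, 2))    # the beacons see a long-run average
```

Under this model the slow beacons converge on an average that no individual data packet ever experiences, which is the gap that a data-driven estimator is meant to close.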
The 4-bit estimator addresses link dynamics by actively using data packets to measure link
quality. This allows it to adapt very quickly to link changes. Such agility, however, poses a challenge
in routing protocol design: how should a routing protocol be designed when the underlying link
topology changes on the order of a few hundred milliseconds?
6.2.2 Transient Loops
Rapid link topology changes can have serious adverse effects on existing routing protocols, causing
losses in the data plane or long periods of disconnection while the topology adjusts. In most
variations of distributed distance vector algorithms, link topology changes result in transient loops
which cause packet drops. This is the case even in path-vector protocols like BGP, designed to
avoid loop formation [83]. The MultiHopLQI protocol, for example, discards packets when it
detects a loop until a new next hop is found. This can take a few minutes, causing a significant
outage. We experimentally examine this behavior of MultiHopLQI in Section 6.7.3.1.
Some protocols prevent loops from forming altogether. DSDV, for example, uses destination-
generated sequence numbers to synchronize routing topology changes and prevent loops [85]. The
tradeoff is that when a link goes down, the entire subtree whose root used that link is disconnected
until an alternate path is found. This can only happen when the global sequence number for the
collection root changes.
In both cases, the problem is that topology repairs happen at the timescale of control plane
maintenance, which operates at a timescale orders of magnitude longer than the data plane. Since
the data plane has no say in the routing decisions, it has to choose between dropping packets or
stopping traffic until the topology repairs. This, in turn, creates a tension on the control plane
between efficiency in stable topologies and delivery in dynamic ones.
6.3 Design Elements
CTP uses three main techniques to achieve robustness, reliability, and energy-efficiency. First, it
estimates link quality by combining information from the physical, link, and network layers, updating
the estimate as quickly as every 5 packet transmissions. However, rapidly changing link qualities
cause nodes to have stale topology information, which can lead to routing loops and packet
drops. CTP uses the remaining two mechanisms to be agile to link dynamics while also having a
low overhead when the topology is stable. The second mechanism is datapath validation: using
data packets to dynamically probe and validate the consistency of its routing topology. The third
mechanism is adaptive beaconing, which extends the Trickle code propagation algorithm so it can
be applied to routing control traffic. Trickle’s exponential timer allows nodes to send very few
control beacons when the topology is consistent, yet quickly adapt when the datapath discovers
a possible problem.
Agile Link Estimation
Predicting how well a particular link will deliver packets is fundamental when choosing reliable
and efficient routes to deliver packets. As mentioned in Section 6.2, there are significant challenges
in building a robust link estimator. The underlying links are highly dynamic, exhibiting a bursty
behavior over short time scales. Traditional packet counting techniques for link estimation face
a problem of rate mismatch in part due to these fast, correlated changes. On the other hand,
strategies that sample the quality over received packets suffer from an estimation bias. To address
these issues, the four-bit link estimator (4B) used in CTP combines information from the physical,
data link, and routing layers to provide accurate estimates despite these challenges. Quality
estimates in 4B come from two sources: low-rate beaconing to bootstrap the topology and form a
rough quality estimate, and unicast data retransmissions for fast, accurate updates, changing the
estimate as quickly as every 5 packets.
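The data-driven half of this scheme might look roughly like the sketch below. Only the window of 5 packets comes from the text; the smoothing weight, the handling of a window with no acknowledgments, and every name here are assumptions made for illustration rather than the 4B implementation.

```python
class DataDrivenEtx:
    """ETX estimate refreshed after every `window` unicast transmissions."""

    def __init__(self, initial_etx: float, window: int = 5, alpha: float = 0.9):
        self.etx = initial_etx    # rough estimate bootstrapped from beacons
        self.window = window      # the text updates as often as every 5 packets
        self.alpha = alpha        # weight given to history (assumed value)
        self.sent = 0
        self.acked = 0
        self.failures_in_a_row = 0

    def on_transmit(self, acked: bool) -> None:
        """Feed the outcome of one unicast transmission (ack received or not)."""
        self.sent += 1
        if acked:
            self.acked += 1
            self.failures_in_a_row = 0
        else:
            self.failures_in_a_row += 1
        if self.sent == self.window:
            if self.acked > 0:
                sample = self.sent / self.acked
            else:
                # Assumed fallback: with no acks, use the run of failures so far.
                sample = float(self.failures_in_a_row)
            self.etx = self.alpha * self.etx + (1 - self.alpha) * sample
            self.sent = self.acked = 0


# A link that starts perfect and then goes silent (alpha lowered for readability).
est = DataDrivenEtx(initial_etx=1.0, alpha=0.5)
for _ in range(5):
    est.on_transmit(acked=True)
after_good = est.etx
for _ in range(5):
    est.on_transmit(acked=False)
after_bad = est.etx
```

Because the estimate is refreshed by traffic that is being sent anyway, a link that goes bad is noticed after a handful of data packets instead of after several beacon periods.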
Datapath Validation
Every collection node maintains an estimate of the cost of its route to a collection point. We
assume expected transmissions (ETX) as the cost metric, but any similar gradient metric can
work just as well. A given node’s cost is the cost of its next hop plus the cost of its link to the
next hop: the cost of a route is the sum of the costs of its links. Collection points advertise a cost
of zero.
Each data packet contains the transmitter’s local cost estimate. When a node receives a packet
to forward, it compares the transmitter’s cost with its own. Since cost must always decrease, if
a transmitter’s advertised cost is not greater than the receiver’s, then the transmitter’s topology
information is stale and there may be a routing loop. Using the datapath to validate the topology
in this way allows a protocol to detect possible loops on the first data packet after they occur.
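The cost rule and the consistency check can be written down in a few lines. This is a minimal sketch of the comparison described above, with hypothetical names and ETX values; it omits packet formats, forwarding, and the repair step.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    parent_cost: float   # cost advertised by the current next hop
    link_etx: float      # ETX of the link to that next hop

    @property
    def cost(self) -> float:
        """Route cost: the next hop's cost plus the cost of the link to it."""
        return self.parent_cost + self.link_etx


def inconsistent(transmitter_cost: float, receiver_cost: float) -> bool:
    """True when the sender is not strictly farther from the root than the receiver."""
    return transmitter_cost <= receiver_cost


# The collection point advertises a cost of zero; a is its child, b is a's child.
a = Node("a", parent_cost=0.0, link_etx=1.2)
b = Node("b", parent_cost=a.cost, link_etx=1.5)

normal = inconsistent(b.cost, a.cost)   # b forwards through a: cost decreases
stale = inconsistent(1.0, a.cost)       # a sender claiming cost 1.0 picks a as next hop
```

In the second case the sender's cost is not greater than that of the node it chose as a next hop, so the receiver treats the sender's routing state as stale.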
Adaptive Beaconing
We assume that the collection layer updates stale routing information by sending control beacons.
As with data packets, beacons contain the transmitter’s local cost estimate. Unlike data packets,
however, control beacons are broadcasts. A single beacon updates many nearby nodes.
Collection protocols typically broadcast control beacons at a fixed interval [96, 114]. This
interval poses a basic tradeoff. A small interval reduces the length of time the topology information
is allowed to be stale and the duration a loop can persist, but uses more bandwidth and energy.
A large interval uses less bandwidth and energy but can let topological problems persist for a long
time.
Adaptive beaconing breaks this tradeoff, achieving both fast recovery and low cost. It does so
by extending the Trickle algorithm [67] to maintain its routing topology.
Trickle is designed to reliably and efficiently propagate code in a wireless network. Trickle’s
basic mechanism is transmitting the version number of a node’s code using a randomized timer.
Trickle adds two mechanisms on top of this randomized transmission: suppression and adapting
the timer interval. If a node hears another node advertise the same version number, it suppresses
its own transmission. When a timer interval expires, Trickle doubles it, up to a maximum value
(τ_h). When Trickle hears a newer version number, it shrinks the timer interval to a small value
(τ_l).
If all nodes have the same version number, their timer intervals increase exponentially, up to
τ_h. Furthermore, only a small subset of nodes transmit per interval, as a single transmission can
suppress many nearby nodes. When there is new code, however, the interval shrinks to τ_l, causing
nodes to quickly learn of and receive new code.
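The interval logic described above can be sketched as a small state machine. This is an illustration rather than a faithful Trickle implementation: `tau_l` and `tau_h` stand for the smallest and largest intervals, suppression is reduced to a single flag instead of a counter, there is no radio or scheduler, and the choice of a firing point in the second half of the interval follows the Trickle specification and is not stated in the text.

```python
import random


class TrickleTimer:
    """Interval logic of a Trickle-style timer (no radio or scheduler attached)."""

    def __init__(self, tau_l: float, tau_h: float, seed: int = 0):
        self.tau_l = tau_l            # smallest interval
        self.tau_h = tau_h            # largest interval
        self.interval = tau_l
        self.suppressed = False
        self._rng = random.Random(seed)

    def next_fire_offset(self) -> float:
        """Random transmission point in the second half of the current interval."""
        return self._rng.uniform(self.interval / 2, self.interval)

    def heard_same_version(self) -> None:
        """A neighbor advertised the same version: skip our own transmission."""
        self.suppressed = True

    def heard_newer_version(self) -> None:
        """New code exists: shrink the interval so the update spreads quickly."""
        self.interval = self.tau_l
        self.suppressed = False

    def interval_expired(self) -> bool:
        """Close the interval; return whether this node transmitted during it."""
        transmitted = not self.suppressed
        self.interval = min(self.interval * 2, self.tau_h)
        self.suppressed = False
        return transmitted


# Hypothetical bounds of 1 and 8 time units.
timer = TrickleTimer(tau_l=1.0, tau_h=8.0)
intervals = []
for _ in range(4):
    timer.interval_expired()
    intervals.append(timer.interval)   # doubles until it reaches tau_h
timer.heard_newer_version()
after_reset = timer.interval           # back to tau_l
```

The doubling keeps steady-state traffic low, while the reset lets a single event pull the whole neighborhood back to the fast rate.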
Unlike algorithms in ad-hoc routing protocols such as DSDV [85], adaptive beaconing does
not assume the tree maintains a global sequence number or version number that might allow a simple
application of Trickle. Instead, adaptive beaconing uses its routing cost gradient to control when to
reset the timer interval. The routing layer resets the interval to τ_l on three events:
1. It is asked to forward a data packet from a node whose ETX is not higher
than its own. The protocol interprets this as neighbors having a significantly out-of-date
estimate and possibly a routing loop. It beacons to update its neighbors.
2. Its routing cost decreases significantly. The protocol advertises this event because
it might provide lower-cost routes to nearby nodes. In this case, “significant” is an ETX
change of 1.5 (see footnote 1).
3. It receives a packet with the P bit set. The “Pull” bit advertises that a node wishes
to hear beacons from its neighbors, e.g., because it has just joined the network and needs
to seed its routing table. The pull bit provides a mechanism for nodes to actively request
topology information from neighbors. Section 6.5.4 provides greater detail on the P bit.
In a network with very stable links, both the first and second events are rare. As long as nodes
do not set the P bit, the beacon interval increases exponentially, up to τ_h. When the topology
changes significantly, however, affected nodes reset their intervals to τ_l, and transmit to quickly
reach consistency. While it could, our current design of adaptive beaconing does not use Trickle’s
suppression mechanism (see footnote 2) because we have not found a significant improvement in the performance
of adaptive beaconing with the suppression mechanism.
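The three reset events and the resulting interval update can be summarized in a short sketch. The 1.5 ETX threshold is taken from the text; treating it as inclusive, the function and parameter names, and the interval bounds in the example are assumptions for illustration and do not reproduce the CTP implementation.

```python
from typing import Optional

ETX_DROP_THRESHOLD = 1.5   # "significant" decrease in routing cost, from the text


def should_reset_interval(my_etx: float,
                          previous_etx: float,
                          sender_etx: Optional[float] = None,
                          pull_bit: bool = False) -> bool:
    """Decide whether the beacon interval should drop back to its minimum."""
    # Event 1: asked to forward data from a node whose ETX is not higher than ours.
    loop_suspected = sender_etx is not None and sender_etx <= my_etx
    # Event 2: our own routing cost decreased significantly (assumed inclusive).
    cost_dropped = previous_etx - my_etx >= ETX_DROP_THRESHOLD
    # Event 3: a received packet had the P (pull) bit set.
    return loop_suspected or cost_dropped or pull_bit


def next_interval(current: float, tau_l: float, tau_h: float, reset: bool) -> float:
    """Reset to tau_l on an event; otherwise double, capped at tau_h."""
    return tau_l if reset else min(2 * current, tau_h)


# Illustrative bounds (hypothetical values, in seconds).
TAU_L, TAU_H = 0.1, 3600.0
quiet = next_interval(1800.0, TAU_L, TAU_H, should_reset_interval(3.0, 3.0))
loop = next_interval(1800.0, TAU_L, TAU_H,
                     should_reset_interval(3.0, 3.0, sender_etx=2.5))
```

With no events the interval keeps growing until it is capped, so a stable network sends very few beacons; a single suspicious data packet is enough to return the node to its fastest rate.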
6.4 Link Estimation
Accurate link quality estimates are a prerequisite for efficient routing in wireless networks: poor
link estimates can cause a 200% or greater slowdown in network throughput [23]. Furthermore,
accurate and responsive link estimation is key to applying more sophisticated opportunistic for-
warding [11] or network coding techniques [53]. Despite its importance, link estimation remains
an open problem, in part because many factors conspire to make it challenging, such as the
prevalence of intermediate-quality links [122], the time-varying nature of a wireless channel [97],
multipath inter-symbol interference [3], link asymmetries [75], and hardware variations [124]. Furthermore,
the physical, link, and network layers each have valuable information that can improve
Footnote 1: This threshold corresponds to the new path being at least one hop shorter than the current path, assuming
high quality links (ETX value close to 1) on the path. It may be better to make this threshold a function of the
current path estimate, but we leave that to future work.
Footnote 2: In the terminology of the Trickle algorithm, CTP sets k = ∞.
estimates, such as channel quality, packet delivery ratios, route utility, and acknowledgments.
The complexity of this design space, combined with the rich information that certain chipsets or
protocols can provide, has led many protocols to use cross-layer design, where each layer freely
shares protocol-specific information in order to improve performance.
In our design of the four-bit estimator (4B), we propose a different approach. We distill the
feedback provided by the physical, link, and network layers for accurate link estimation to narrow
interfaces. The benefit of narrow interfaces has a long history in system design: they simplify
semantics, reduce dependencies, and are easier to use as well as implement. All together, our
proposed interfaces provide 4 bits of information: 1 from the physical layer, 1 from the link layer,
and 2 from the network layer. These bits of information are protocol independent, thereby keeping
layers decoupled and avoiding unforeseen dependencies that hinder network evolution.
To examine the efficacy of this 4-bit interface approach, we consider it in wireless sensor
networks. Unlike higher-power wireless meshes, RAM limitations mean wireless sensor networks
cannot store state for all possible neighbors. During the course of a deployment, a node can come
across a large number of, sometimes transient, links so even on higher-power platforms with more
RAM, it is often desirable to minimize the memory footprint of the link and routing tables to allow
other components, such as the packet buffer and applications, to utilize the available memory.
Therefore, link estimation accuracy is not the only concern: an estimator must also choose good
neighbors to estimate.
Each layer in the protocol stack can contribute towards these goals. From the physical layer we
can measure channel quality during a packet transmission. We can distill this down to the white
bit, which denotes whether the channel quality during a received packet was high. From the link
layer, we can measure whether packets are delivered and acknowledged. We can distill this down
to the ack bit, which denotes whether the node received a layer 2 acknowledgment in response
to a transmission. From the network layer, we can learn which links are the most valuable for
higher-layer performance. We can distill these concerns down to two bits: the pin bit, which tells
the estimator to not evict a link because it is in use, and the compare bit, which the estimator
can use to ask the network layer if a link looks promising.
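One way to picture how the four bits could drive a small, fixed-size link table is sketched below. The roles of the white, ack, pin, and compare bits follow the description above; the class and method names, the admission rule for a full table, and the random choice of which unpinned entry to evict are assumptions for illustration, not the 4B implementation.

```python
import random
from typing import Callable, Dict, List, Set


class LinkTable:
    """Fixed-size neighbor table driven by the white, ack, pin, and compare bits."""

    def __init__(self, capacity: int, seed: int = 0):
        self.capacity = capacity
        self.acks: Dict[str, List[int]] = {}   # neighbor -> [acked, sent] counters
        self.pinned: Set[str] = set()
        self._rng = random.Random(seed)

    def set_pin(self, neighbor: str, pinned: bool) -> None:
        """Pin bit: the network layer marks a link as in use, so it is not evicted."""
        if pinned:
            self.pinned.add(neighbor)
        else:
            self.pinned.discard(neighbor)

    def on_transmit(self, neighbor: str, ack_bit: bool) -> None:
        """Ack bit: the link layer reports whether a unicast was acknowledged."""
        if neighbor in self.acks:
            self.acks[neighbor][0] += int(ack_bit)
            self.acks[neighbor][1] += 1

    def on_receive(self, neighbor: str, white_bit: bool,
                   compare_bit: Callable[[str], bool]) -> bool:
        """Consider a sender for the table; return True if it is tracked afterwards."""
        if neighbor in self.acks:
            return True
        if len(self.acks) < self.capacity:
            self.acks[neighbor] = [0, 0]
            return True
        # Table full: require a high-quality reception (white bit) and a network
        # layer that considers the link promising (compare bit).
        if not (white_bit and compare_bit(neighbor)):
            return False
        evictable = sorted(n for n in self.acks if n not in self.pinned)
        if not evictable:
            return False
        del self.acks[self._rng.choice(evictable)]   # assumed eviction policy
        self.acks[neighbor] = [0, 0]
        return True


# Hypothetical two-entry table; "a" is the current next hop and therefore pinned.
table = LinkTable(capacity=2)
table.on_receive("a", True, lambda n: True)
table.on_receive("b", True, lambda n: True)
table.set_pin("a", True)
admitted_c = table.on_receive("c", white_bit=True, compare_bit=lambda n: True)
admitted_d = table.on_receive("d", white_bit=False, compare_bit=lambda n: True)
```

Here the estimator never needs to know what routing protocol sits above it or which radio sits below it: each layer contributes only a yes-or-no answer.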
6.4.1 Information from the Physical, Link, and Network Layers
In this section, we first describe the pitfalls of not using information from all the layers of the
protocol stack while estimating link qualities. Then we discuss the type of information available
in the physical, link, and network layer that can help in making link quality estimation accurate
and agile.
6.4.1.1 Layer Limitations
A link estimator should be accurate and efficient. It should provide good estimates of link qualities,
and be agile in detecting changes, all the while minimizing memory requirements and overhead
traffic. Each of the physical, link, and network layers can provide valuable information for the link
estimator, as demonstrated by previous work (cf. Figure 6.2). We argue that a link estimator
should use information from all three layers to best achieve these goals, not only because each layer
can provide information that is unique or much more inexpensively obtained, but also because
there are different link conditions that some layers can detect while others cannot. The physical
layer’s per-packet channel quality assessment cannot always detect channel temporal variations.
While the link layer can accurately measure ETX, it cannot inexpensively decide which links to
estimate. The network layer knows which links are most useful for routing, but estimating link
qualities at the network layer is inefficient and slow to adapt.
As an example of how we can use information to help the link estimator, we take a closer
look at two collection protocols, a version of CTP that only uses beacon probes (CTP-Beacons)
to estimate link quality and MultiHopLQI [96], which relies solely on the link quality indicator
(LQI) provided by the CC2420 [46] radio chip. We have performed collection experiments with
these protocols on an 85-node testbed at a low rate, with each node generating one packet every
(a) EAR, ETX (b) MintRoute (c) MultiHopLQI
(d) SP (e) Four Bit
Figure 6.2—A link estimator, represented by the triangle in the center of each figure, interacts with up to three
layers. Attached boxes represent unified implementation. Outgoing arrows represent information the estimator
requests on packets it receives. Incoming arrows represent information the layers actively provide.
(a) CTP-Beacons (cost = 3.14) (b) MultiHopLQI (cost = 2.28) (c) CTP-Beacons unconstrained (cost = 1.86)
Figure 6.3—Routing trees formed on an 85-node topology by CTP with a 10-node link table, MultiHopLQI, and
CTP-Beacons with an unrestricted link table. The average cost in transmissions per delivered message is in parentheses.
The root is the node in the bottom left corner, and darker nodes (hop count indicated inside each circle) mean
longer paths to the root.
10 seconds. Figure 6.3 shows a typical routing tree formed by CTP-Beacons (a), MultiHopLQI
(b), and a version of CTP-Beacons with no restriction on the size of the link estimator tables (c).
It also shows the average cost, in number of transmissions, for each delivered packet. Lower costs
mean shorter paths with good quality.
Panels, plotted against time (hrs): PRR from P to C (fraction), LQI from P to C, and cumulative number of unacked packets (×10^4).
Figure 6.4—Unaware of the reduced PRR, MultiHopLQI attempts to deliver packets on the same link between
the fourth and sixth hour, causing an increased number of retransmissions due to unacknowledged packets.
CTP-Beacons’ cost is higher than MultiHopLQI’s, even though the latter only uses physical layer
information. This is the symptom of two problems. First, because CTP-Beacons uses a bidirectional
probe-based link estimator, its link table size limits a node’s in-degree. Second, also because of
the limited link table size, the best outgoing link may not even be in the table to be
selected for routing. Figure 6.3(c) shows that when the link table is unrestricted, CTP-Beacons
can outperform MultiHopLQI.
In Section 6.4.4 we show how, by using information from the physical, link, and network layers, we
can completely mitigate these problems. The following subsections elaborate on the benefits and
limitations of each layer.
6.4.1.2 Physical Layer
The physical layer can provide immediate information on the quality of the decoding of a packet.
Such physical layer information provides a fast and inexpensive way to avoid borderline or marginal
links. It can increase the agility of an estimator, as well as provide a good first order filter for
inclusion in the link estimator table. So that we can attribute this measurement to the right link, we
assume that corruption of the source address in the packet header that goes undetected by the packet CRC is a
rare event and thus a minor concern in link estimation. In Figure 6.2, MultiHopLQI (c) and SP³
(d) use physical layer information for link estimation.
As this physical layer information pertains to a single packet and it can only be measured
for received packets, channel variations can cause it to be misleading. For example, many links
on low power wireless personal area networks are bi-modal [97], alternating between high (100%
packet reception ratio, PRR) and low (0% PRR) quality. On such links, the receiver using only
physical information will see many packets with high channel quality and might assume the link
is good, even if it is missing many packets.
Figure 6.4 shows a limitation of physical layer information that we observed during a 12-hour
low-rate collection experiment using MultiHopLQI on a 94-node testbed. As Figure 6.2(c) shows,
MultiHopLQI does not use link layer information. Although the protocol performed well overall,
there were bursts of packet loss. As Figure 6.4 shows, for a period of time, the PRR between the
nodes C and P dropped from an average of 0.9 to almost 0.6. This degradation in link quality
was not accompanied by a drop in the decoding quality indicator (LQI). All of the packets C
received had high quality: it just wasn’t receiving all the packets.
6.4.1.3 Link Layer
Link estimators such as ETX [23] and MintRoute [114] use periodic broadcast probes to measure
incoming packet reception rates. These estimators calculate bidirectional link quality — the
probability a packet will be delivered and its acknowledgment received — as the product of the
qualities of the two directions of a link. While simple, this approach is slow to adapt, and assumes
that periodic broadcasts and data traffic behave similarly.
³ When using information provided by the underlying radio.
By enabling link layer acknowledgments and counting every acknowledged or unacknowledged
packet, a link estimator can generate much more accurate estimates at a rate commensurate with
the data traffic. These estimates are also inherently bidirectional. In Figure 6.2(a) EAR and
ETX use feedback from the link layer for link estimation. Rather than inferring the ETX of a
link by multiplying two control packet reception rates, with link layer information on data traffic
an estimator can actually measure ETX. However, although estimates derived from data traffic are
accurate, relevant, and fast, sending
data packets requires routing information, which in turn requires link quality estimates. This
bootstrapping is best done at higher layers. Also, especially in dense networks, choosing the right
set of links to estimate can be as important as the estimates themselves, and making that choice can get expensive
if done solely at the link layer.
6.4.1.4 Network Layer
The physical layer can provide a rough measure of whether a link might be of high quality, enabling
a link estimator to avoid spending effort on marginal or bad links. Once the estimator has gauged
the quality of a link, the network layer can in turn decide which links are valuable for routing and
which are not. This is important when space in the link table is limited. For example, geographic
routing [52] benefits from neighbors that are evenly spread in all directions, while the S4 routing
protocol [77] benefits from links that minimize distance to beacons. One recent infamous wireless
sensor network deployment delivered only 2% of the data collected, in part due to disagreements
between network and link layers on what links to use. For this reason, the MintRoute protocol
(Figure 6.2(b)) integrates the link estimator into its routing layer. SP (Figure 6.2(d)) provides a
rich interface for the network layer to inspect and alter the link estimator’s neighbor table. The
network layer can perform neighbor discovery and link quality estimation, but without access
to information such as retransmissions, acknowledgments, or even packet decoding quality, this
estimation becomes slow to adapt and expensive.
Figure 6.5—The Four-bit estimator interface. The link estimator, represented by the triangle in the center, uses
four bits of information from the three layers. Outgoing arrows represent information the estimator requests on
packets it receives. Incoming arrows represent information the layers actively provide.
In the following section we describe how we can efficiently achieve cooperation between the
link estimator and all three layers, with clean and well-defined interfaces using only four bits of
information. We then demonstrate in Section 6.4.4 that indeed our interfaces allow significant
performance gains through effective information exchange between the layers.
6.4.2 Estimator Interfaces
Figure 6.5 shows the interfaces each layer provides to a link estimator. Together, the three layers
provide four bits of information: two bits for incoming packets and one bit each for transmitted
unicast packets and link table entries.
The physical layer provides a single bit of information. If set, this white bit denotes that each
symbol in the received packet has a very low probability of decoding error. A set white bit implies
that during the reception, the channel quality was high. The converse is not necessarily true: if the
white bit is clear, then the channel quality may or may not have been high during the packet’s
reception.
The link layer provides one bit of information per transmitted packet: the ack bit. The link
layer sets the ack bit on a transmit buffer when it receives a link layer acknowledgment for that
buffer. If the ack bit is clear, the packet may or may not have arrived successfully.
The network layer provides two bits of information, the pin bit and the compare bit. The
pin bit applies to link table entries. When the network layer sets the pin bit on an entry, the
link estimator cannot remove it from the table until the bit is cleared. The link estimator can
ask a network layer for a compare bit on a packet. The compare bit indicates whether the route
provided by the sender of the packet is better than the route provided by one or more of the
entries in the link table. We describe how the 4B estimator uses the compare bit in Section 6.4.3
below.
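
The four bits described above can be pictured as fields and callbacks on a handful of small types. The following is a minimal Python sketch, not the TinyOS interface itself: the names `ReceivedPacket`, `TxBuffer`, `LinkEntry`, and `evictable` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReceivedPacket:
    sender: int
    white_bit: bool = False  # physical layer: set if decoding quality was high


@dataclass
class TxBuffer:
    payload: bytes
    ack_bit: bool = False    # link layer: set when a layer 2 ack arrives


@dataclass
class LinkEntry:
    neighbor: int
    etx: float
    pin_bit: bool = False    # network layer: set while the link is in use


def evictable(table: List[LinkEntry]) -> List[LinkEntry]:
    """Entries the estimator is allowed to remove (pin bit clear)."""
    return [entry for entry in table if not entry.pin_bit]
```

The compare bit is not stored anywhere in this sketch because the text describes it as something the estimator asks the network layer for on a per-packet basis, so it is more naturally modeled as a callback.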
The four bits represent what we believe to be the minimal information necessary for a link
estimator to accurately estimate link qualities. Furthermore, we believe that the interfaces are
simple enough that they can be implemented for most systems. For example, radios whose physical
layers provide signal strength and noise can compute a signal-to-noise ratio for the white bit, using
a threshold derived from the signal-to-noise ratio/bit error rate curve. Physical layers that report
recovered bit errors or chip correlation can alternatively use this information. In the worst case,
if radio hardware provides no such information, the white bit can never be set.
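
As an illustration of the signal-to-noise approach, the sketch below derives a white bit from a signal strength and a noise floor reading. The function name and the 10 dB default threshold are assumptions made for this example; a real threshold would come from the radio's signal-to-noise ratio/bit error rate curve.

```python
from typing import Optional


def white_bit(rssi_dbm: Optional[float], noise_dbm: Optional[float],
              snr_threshold_db: float = 10.0) -> bool:
    """Return True only when the measured SNR clears the threshold.

    The threshold default is a placeholder, not a measured value.
    If the radio reports nothing, the white bit is never set.
    """
    if rssi_dbm is None or noise_dbm is None:
        return False
    return (rssi_dbm - noise_dbm) >= snr_threshold_db
```

A clear white bit here says nothing about the channel, which matches the one-sided semantics described in the text.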
The interfaces introduce one constraint on the link layer: they require a link layer that has
synchronous acknowledgments. While this might seem demanding, it is worthwhile to note that
most commonly-used link layers, such as 802.11 and 802.15.4, have them. Novel or application-
specific link layers must include acknowledgment to function in this model.
The compare bit requires that a network layer be able to tell whether the route from the
transmitter of a packet is better than the routes of current entries in the link table. The compare
bit does not require that the network layer be able to decide on all packets, merely some subset
of them. This implies that some subset of network layer packets, such as routing beacons, contain
route quality information.
Timeline rows: Ack Bit ETX, Beacon PRR, Beacon EWMA, Hybrid ETX. Legend: Received/Acked Packet, Lost/Unacked Packet.
Figure 6.6—The 4B estimator combines estimates of ETX separately for unicast and broadcast traffic with window
sizes of $k_u = 5$ and $k_b = 2$, respectively. The latter are first themselves averaged before being combined. We show
incoming packets as light boxes, marking dropped packets with an “×”. The estimator calculates link estimates
for each of the two estimators at the times indicated with vertical arrows.
6.4.3 The Four-Bit (4B) Link Quality Estimator
We describe a hybrid estimator that combines the information provided by the three layers and
link estimation beacons in order to provide accurate, responsive, and useful link estimates. The
estimator maintains a small table (e.g., 10) of candidate links for which it maintains ETX values.
It periodically broadcasts beacons that contain a subset of these links. Network layer protocols
can also broadcast packets through the estimator, causing it to act as a layer 2.5 protocol that
adds a header and footer between layers 2 and 3.
The estimator follows the basic table management algorithm outlined by Woo et al. [114], with
one exception: it does not assume a minimum transmission rate, since it can leverage outgoing
data traffic to detect broken links. Link estimate broadcasts contain sequence numbers, which
receivers use to calculate the beacon reception rate.
The estimator uses the white and compare bits to supplement the standard table replacement
policy. When it receives a network layer routing packet which has the white bit set from a node
that is not in the link estimation table, the estimator asks the network layer whether the compare
bit is set. If so, the estimator flushes a random unpinned entry from the table and replaces it
with the sender of the current packet.
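
The white/compare-bit replacement rule can be sketched as follows. This is a simplified Python illustration under stated assumptions: the names `Entry` and `consider_sender` are hypothetical, the compare bit is modeled as a callback into the network layer, and inserting directly when the table has free space is an assumption standing in for the standard table policy.

```python
import random
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Entry:
    neighbor: int
    pinned: bool = False


def consider_sender(table: List[Entry], capacity: int, sender: int,
                    white_bit: bool, compare_bit: Callable[[], bool],
                    rng=random) -> bool:
    """Return True if `sender` is in the table after this routing packet."""
    if any(e.neighbor == sender for e in table):
        return True
    if len(table) < capacity:          # assumption: free slot, just insert
        table.append(Entry(sender))
        return True
    if not white_bit:                  # marginal reception: do not bother
        return False
    if not compare_bit():              # network layer: route not better
        return False
    unpinned = [i for i, e in enumerate(table) if not e.pinned]
    if not unpinned:                   # every entry is in use
        return False
    table[rng.choice(unpinned)] = Entry(sender)  # flush a random unpinned entry
    return True
```

Checking the white bit before calling `compare_bit` keeps the network layer out of the path for packets that arrived over a poor channel.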
The estimator uses the ack bit to refine link estimates, combining broadcast and unicast ETX
estimates into a hybrid value, an approach necessitated by the large variance of traffic volume
across different links in the network [54]. We follow the link estimation method proposed by
Woo et al. [114], separately calculating the ETX value every $k_u$ or $k_b$ packets for unicast and
broadcast packets, respectively. If $a$ out of $k_u$ packets are acknowledged by the receivers, the
unicast ETX estimate is $k_u / a$. If $a = 0$, then the estimate is the number of failed deliveries since the
last successful delivery. The calculation for the broadcast estimate is analogous, but has an extra
step. We use a windowed exponentially weighted moving average (EWMA) over the calculated
reception probabilities, and invert the consecutive samples of this average into ETX values. These
two streams of ETX values coming from the two estimators are combined in a second EWMA, as
shown in Figure 6.6. The result is a hybrid data/beacon windowed-mean EWMA estimator. When
there is heavy data traffic, unicast estimates dominate. When the network is quiet, broadcast
estimates dominate.
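
The hybrid calculation can be sketched in a few lines of Python. This is an illustrative reading of the description above, not the TinyOS code: the class name, the single shared EWMA weight `alpha`, and the simplification that the caller reports each expected beacon as received or lost (instead of inferring losses from sequence numbers) are all assumptions.

```python
class HybridEtxEstimator:
    def __init__(self, ku: int = 5, kb: int = 2, alpha: float = 0.9):
        self.ku, self.kb, self.alpha = ku, kb, alpha
        self.etx = None          # hybrid ETX (second EWMA)
        self.beacon_prr = None   # EWMA over windowed beacon reception ratios
        self._sent = self._acked = self._fails_in_a_row = 0
        self._expected = self._received = 0

    def _ewma(self, old, sample):
        return sample if old is None else self.alpha * old + (1 - self.alpha) * sample

    def on_unicast(self, acked: bool) -> None:
        """Feed the ack bit of one transmitted unicast packet."""
        self._sent += 1
        if acked:
            self._acked += 1
            self._fails_in_a_row = 0
        else:
            self._fails_in_a_row += 1
        if self._sent == self.ku:
            if self._acked:
                sample = self.ku / self._acked
            else:  # a = 0: failures since the last successful delivery
                sample = float(self._fails_in_a_row)
            self.etx = self._ewma(self.etx, sample)
            self._sent = self._acked = 0

    def on_beacon(self, received: bool) -> None:
        """Feed one expected beacon (received or lost)."""
        self._expected += 1
        self._received += int(received)
        if self._expected == self.kb:
            prr = self._received / self.kb
            self.beacon_prr = self._ewma(self.beacon_prr, prr)
            if self.beacon_prr > 0:
                self.etx = self._ewma(self.etx, 1.0 / self.beacon_prr)
            self._expected = self._received = 0
```

Because each stream only contributes a sample when its own window fills, whichever kind of traffic is more frequent on a link ends up driving the hybrid value.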
Contrary to most pure broadcast-based estimators, the 4B estimator does not actively ex-
change and maintain bidirectional estimates using the beacons. Because the ack bit inherently
allows the measurement of bidirectional characteristics of links, our estimator can afford to only
use the incoming beacon estimates as bootstrapping values for the link qualities, which are refined
by the data-based estimates later. This is an important feature, as it decouples the in-degree of
the nodes in the topology from the size of the link table.
6.4.4 Evaluation
We implemented the 4B estimator in TinyOS 2.x. We call the collection protocol available prior to
the 4B estimator CTP-Beacons because it derived link estimates solely based on periodic beacons.
We call the collection protocol that uses the 4B estimator CTP, which is now the default collection
protocol in TinyOS 2.x. We study the performance of 4B by replacing the estimator used in CTP
with the estimator used in MultiHopLQI, and by removing some bits from the 4B estimator
depending on the experiment. Our estimator uses the four bit interfaces to the physical, link, and
network layers. In this section, we perform a detailed experimental comparison among the CTP
estimator (4B), CTP-Beacons, and MultiHopLQI estimators. As described above, MultiHopLQI
uses the Link Quality Indication (LQI), a feature of the CC2420 radio, and for that radio it is
currently the best performing collection implementation for TinyOS.
In our comparison we run all three protocols on the Mirage testbed, using 85 MicaZ [21] nodes
with one node set as the basestation. We also ran experiments on a second testbed, Tutornet,
using 94 TelosB [87] nodes. Transmit power is set at 0 dBm unless otherwise specified. In each
experiment, we stagger the boot time of all the nodes using a uniform distribution over a range
of thirty seconds. Each node sends a collection packet every 16 seconds to the sink, with some
jitter to avoid packet synchronization with other nodes. This creates many concurrent flows in
the network, converging at the sink, but not overloading the network as our focus is on low data-
rate sensor network systems. All experiments on Mirage lasted between 40 and 69 minutes. On
Tutornet we ran much longer experiments, ranging from 3 to 12 hours. The fact that the testbeds
are static, and that all of our results agree from one testbed to the other gives us confidence in
the results of the shorter runs.
The primary metric we use to evaluate performance is cost: the total number of transmissions
in the network for each unique delivered packet. Cost is important as it directly relates to
network lifetime. It takes into account the number of hops in a path, the number of per-link
retransmissions needed, and also the wasted network effort in packets that are dropped. To put
cost into perspective, we also look at the average depth of the topology trees. If all links are
perfect, average depth is a lower bound for cost. The difference between the two is indicative of
the quality of the links chosen, as it stems from either retransmissions or dropped packets. Finally,
we also look at delivery rate, the fraction of unique messages received at the root. Results in
this section are averaged over 16000-22000 packets sent to the sink during the experiment which
lasted 40-69 minutes; standard deviation is not shown on the figure for clarity in presentation
focused on qualitative trends.
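
The three metrics can be computed from simple counters. The sketch below is a hypothetical helper (the function and parameter names are not from the experiments) that returns cost, delivery ratio, and the gap between cost and average depth.

```python
from typing import Sequence, Tuple


def collection_metrics(total_transmissions: int, packets_sent: int,
                       unique_delivered: int,
                       node_depths: Sequence[int]) -> Tuple[float, float, float]:
    """Return (cost, delivery_ratio, cost_minus_depth).

    cost: transmissions in the whole network per unique delivered packet.
    delivery_ratio: unique packets received at the root / packets sent.
    cost_minus_depth: overhead attributable to retransmissions or drops.
    """
    if unique_delivered == 0 or packets_sent == 0 or not node_depths:
        raise ValueError("need at least one packet and one node depth")
    cost = total_transmissions / unique_delivered
    delivery_ratio = unique_delivered / packets_sent
    average_depth = sum(node_depths) / len(node_depths)
    return cost, delivery_ratio, cost - average_depth
```

With perfect links and no losses, every packet uses exactly one transmission per hop, so the third value approaches zero.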
Plot of average cost (xmits/packet) versus average tree depth (hops) for CTP T2, CTP + unidir, CTP + white bit, 4B, and MultiHopLQI, with the line Cost = Depth for reference. Annotated steps: Ack Bit: Unidir. Est.; White/Compare Bits.
Figure 6.7—Exploring the link estimation design space: adding the ack bit and/or the white and compare bits
to CTP-Beacons (T2 in this figure) decreases cost and the average depth of a node in the routing tree.
We first explore how the addition of each of the bits in Section 6.4.2 impacts cost and route
length. We compare the CTP-Beacons with 4B, and two intermediary implementations. The
uppermost-left point of Figure 6.7 shows the cost and depth of CTP-Beacons running on the
Mirage testbed. Adding unidirectional link estimation to CTP-Beacons with the ack bit reduces
average tree depth by 93%, and reduces cost by 31%. Unidirectional estimates decouple in-degree
from the link table size, hence the large decrease in depth.
Adding the white and compare bits to the resulting protocol decreases cost to 55% of the
original CTP-Beacons, possibly because of improved parent selection. Adding only the white and
compare bits to CTP-Beacons provides reductions of 15% in cost and 23% in average depth. The
figure also shows MultiHopLQI’s cost and depth in the same testbed for comparison. It is only
when we use information from the three layers that 4B does better than MultiHopLQI. 4B has
29% lower cost and 11% shorter paths than MultiHopLQI. In experiments on Tutornet, 4B’s cost
and average depth were respectively 44% and 9.7% lower than MultiHopLQI’s. Thus, we can be
confident that these trends on lower average cost and depth with 4B compared to MultiHopLQI
are consistent across the testbeds. The trees produced by 4B are very similar qualitatively and in
average depth to the trees produced by CTP-Beacons with unrestricted link tables (Figure 6.3).
Plot of average cost (xmits/packet) versus average tree depth (hops) for 4B and MultiHopLQI at 0 dBm, −10 dBm, and −20 dBm, with the line Cost = Depth for reference.
Figure 6.8—Average node depth and cost for MultiHopLQI and 4B for decreasing transmit powers on the Mirage
testbed. 4B reduces cost by 19-28%.
Figure 6.8 compares the cost and average node depth of 4B and MultiHopLQI in the Mirage
testbed as transmit power varies from −20 dBm to 0 dBm. In each protocol, we see that both
average node depth and cost increase with decreasing transmit power, as nodes need to route
packets over more hops to get to the sink. 4B’s improvement in cost over MultiHopLQI ranges
from 29% to 11%, and the improvement in average depth from 11% to 3.5%. 4B’s cost, for the 0
and −10 dBm cases, is at most 13% above the lower bound, while it is at most 43.4% above the
lower bound for MultiHopLQI. In a network with many hops, both protocols become less efficient.
The relative increase in cost (62% above average depth for 4B and 95% for MultiHopLQI) is
indicative of retransmissions and/or losses. Even with similar tree depths, however, 4B is able to
select better links in this situation.
Figure 6.9 looks at the per-node distribution of delivery ratios for the same experiments, and
gives some insight on why the costs in Figure 6.8 grow faster than the average depth. For 0
and −10 dBm, 4B showed an average delivery ratio above 99.9%, with a minimum of 99.3%. For 0
dBm, MultiHopLQI’s average delivery ratio over all nodes was 95.9%, with the worst node at
64%. As the transmit power decreases, the relative impact of RF noise on wireless performance
increases, creating localized asymmetries in the network. As in the example of Section 6.4.1.1,
Boxplots of delivery ratio (axis from 0.4 to 1.0) for MultiHopLQI and 4B at 0 dBm, −10 dBm, and −20 dBm.
Figure 6.9—Boxplots of per-node delivery distributions at decreasing transmit power, for both MultiHopLQI and
4B. Whiskers show the minimum and maximum values. Boxes show the 1st and 3rd quartiles. The line is the
median. 4B maintains much higher and consistent delivery rates across the network.
MultiHopLQI’s performance drops as some of this variation in link quality is not captured by the
physical layer link quality indicator. We plan to look further into the dynamic behavior of the
network, but the much smaller number of packet losses in 4B, even at −20 dBm, indicates that
most of the inefficiency seen in its cost is due to retransmissions, rather than loss. This suggests
that the estimator is agile enough to notice packet losses and trigger the switch to a new route.
6.5 Routing
This section describes how CTP discovers, selects, and advertises routes.
6.5.1 Implications of Agile Link Estimation
Because wireless links are inherently dynamic, accurate link estimation must also be agile.
The 4B estimator uses narrow, well-defined interfaces that allow a link estimator to
use information from the physical, link, and network layers and shows significant improvements on
cost and delivery ratio over the state of the art, while maintaining layered networking abstractions.
Using the information provided by 4B presents some challenges, however. It reflects the underlying
link dynamics, and its quality estimates can change in the time scale of data transmissions, as
(a) Link estimation frame (b) Routing frame (c) Complete routing frame
Field labels (rows are 16 bits wide): link estimation header and footer with NE, Seq. No., Rsrvd., and (Link ETX, Node ID) entries; routing frame with P, C, Reserved, Parent, and ETX.
Figure 6.10—The CTP routing frame format
quickly as 5 packet times. This agility makes transient inconsistencies in the topology the norm
rather than an exceptional condition.
CTP has two key innovations to address these inconsistencies. First, the routing algorithm
has a variable timer to exchange path quality information with neighbors. We view the topology
maintenance as a consistency problem, and use the Trickle algorithm to control the dissemination
of topology information. Exchanges are quick when changes in link quality are detected, and
slow down otherwise. Second, the forwarding is designed to work despite transient loops that
will inevitably form. The forwarding logic detects gradient inconsistencies and loops as they are
traversed by data packets, and triggers topology adaptation while packets are in transit. Next,
we describe in more detail, respectively, how the routing and forwarding components of CTP deal
with these challenges.
6.5.2 Route Computation and Selection
Figure 6.10 shows the CTP routing packet format nodes use to exchange topology information.
The routing frame has two fields, which advertise the node’s current parent and routing cost,
and two control bits: the pull bit (P) and the congested bit (C). We
discuss the meaning and use of the P bit below. The C bit is reserved for potential future use in
congestion control and is not relevant for this chapter.
Changing routes too quickly can harm efficiency, as generating accurate link estimates requires
time. To dampen the topology change rate, CTP employs hysteresis in path selection: it only
switches routes if it believes the other route is significantly better than its current one, where
“significantly” better is having an ETX at least 1.5 lower.
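
The hysteresis rule can be written as a short selection function. This is a minimal sketch: `choose_next_hop` and its dictionary of candidate path costs are hypothetical names, and only the 1.5 ETX threshold comes from the text.

```python
from typing import Dict, Optional

HYSTERESIS_ETX = 1.5  # threshold stated in the text


def choose_next_hop(current: Optional[str],
                    path_etx: Dict[str, float]) -> str:
    """Pick a next hop given each candidate's total path ETX.

    path_etx[n] is the link ETX to neighbor n plus n's advertised cost.
    The current next hop is kept unless a candidate is at least
    HYSTERESIS_ETX cheaper.
    """
    best = min(path_etx, key=path_etx.get)
    if current is None or current not in path_etx:
        return best
    if path_etx[best] <= path_etx[current] - HYSTERESIS_ETX:
        return best
    return current
```

For instance, with a current route of cost 5.0, a candidate at 4.6 is ignored while a candidate at 3.4 triggers a switch.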
While hysteresis has the danger of allowing CTP to use sub-optimal routes, noise in link
estimates causes better routes to dominate a node’s next hop selection. We use a simple example
to illustrate how estimation noise causes the topology to gravitate towards and prefer more efficient
routes despite hysteresis. Let a node A have two options, B and C, for its next hop, with identical
costs of 3. The link to B has a reception ratio of 0.5 (ETX of 2.0), while the link to C has a
reception ratio of 0.6 (ETX of 1.6).
If A chooses B as its next hop, its cost will be 5. The hysteresis described above will prevent
A from ever choosing C, as the resulting cost, 4.6, is not ≤ 3.5. However, the ETX values for the
links to B and C are not static: they are discrete samplings of a random process. Correspondingly,
even if the reception ratio on those links is completely stable, their link estimates will not be.
Assume, for simplicity’s sake, that they follow a Gaussian distribution; the same logic holds
for other distributions as long as their bounds are not smaller than the hysteresis threshold. Let
$E_X$ be a sample from distribution $X$. As the average of the AB distribution is 2.0, but the average
of the AC distribution is 1.6, the probability that $E_{AB} - E_{AC} > 1.5$ is much higher than the
probability that $E_{AC} - E_{AB} > 1.5$. That is, the probability that AC will be at least 1.5 lower
than AB is much higher than the probability that AB will be at least 1.5 lower than AC. Due to
random sampling, at some point AC will pass the hysteresis test and A will start using C as its
next hop. Once it switches, it will take much longer for AB to pass the hysteresis test. While A
will use B some of the time, C will dominate as the next hop.
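
The asymmetry in this argument can be checked numerically. The sketch below assumes, purely for illustration, that the two link estimates are independent Gaussians with the same standard deviation `sigma`; the value 0.8 is an arbitrary placeholder, and the function names are hypothetical.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def switch_probabilities(mean_ab: float = 2.0, mean_ac: float = 1.6,
                         sigma: float = 0.8, threshold: float = 1.5):
    """Return (P[E_AB - E_AC > threshold], P[E_AC - E_AB > threshold]).

    The difference D = E_AB - E_AC of two independent Gaussians is
    Gaussian with mean (mean_ab - mean_ac) and std sigma * sqrt(2).
    """
    mu = mean_ab - mean_ac
    sigma_d = sigma * math.sqrt(2.0)
    p_switch_to_c = 1.0 - normal_cdf((threshold - mu) / sigma_d)
    p_switch_to_b = normal_cdf((-threshold - mu) / sigma_d)
    return p_switch_to_c, p_switch_to_b
```

Under these assumptions the first probability always exceeds the second whenever the AB mean is larger than the AC mean, which is the bias toward C that the example relies on.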
6.5.3 Control Traffic Timing
When CTP’s topology is stable, it relies on data packets to maintain, probe, and improve link
estimates and routing state. Beacons, however, form a critical part of routing topology mainte-
nance. First, since beacons are broadcasts, they are the basic neighbor discovery mechanism and
provide the bootstrapping mechanism for neighbor tables. Second, there are times when nodes
must advertise information, such as route cost changes, to all of their neighbors.
Because CTP separates link estimation from its control beacons, its estimator does not require
or assume a fixed beaconing rate. This allows CTP to adjust its beaconing rate based on the
expected importance of the beacon information to its neighbors. Minimizing broadcasts has the
additional benefit of lowering overhead, as broadcasts are typically much more expensive to send with low-power link layers
than unicast packets. When the routing topology is working well and the routing cost estimates
are accurate, CTP can slow its beaconing rate. However, when the routing topology changes
significantly, or CTP detects a problem with the topology, it can quickly inform nearby nodes so
they can react accordingly.
CTP sends routing packets using a variant of the Trickle algorithm [67]. It maintains a
beaconing interval which varies between 64ms and one hour. Whenever the timer expires, CTP
doubles it, up to the maximum (1 hour). Whenever CTP detects an event which indicates the
topology needs active maintenance, it resets the timer to the minimum (64ms). These values are
independent of the underlying link layer. If a packet time is larger than 64ms, then the timer
simply expires several times until it reaches a packet time.
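
The interval adaptation can be sketched as a small state machine. This is a simplified illustration, with hypothetical names, of the doubling and reset behavior only; it omits the randomized transmission time within each interval that Trickle-style timers normally use.

```python
MIN_INTERVAL_MS = 64                 # minimum stated in the text
MAX_INTERVAL_MS = 60 * 60 * 1000     # one hour


class BeaconTimer:
    def __init__(self) -> None:
        self.interval_ms = MIN_INTERVAL_MS

    def on_expire(self) -> int:
        """Timer fired: double the interval, capped at the maximum."""
        self.interval_ms = min(self.interval_ms * 2, MAX_INTERVAL_MS)
        return self.interval_ms

    def reset(self) -> None:
        """Topology needs active maintenance: return to the minimum."""
        self.interval_ms = MIN_INTERVAL_MS
```

Starting from 64 ms, the interval reaches the one-hour cap after sixteen expirations, so a quiet network settles to very infrequent beacons within a couple of hours.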
6.5.4 Resetting the Beacon Interval
As Section 6.3 mentioned, three events cause CTP to reset its beaconing interval to
the minimum length.
The simplest one is the P bit. CTP resets its beacon interval whenever it receives a packet
with the P bit set. A node sets the P bit when it does not have a valid route. For example,
when a node boots, its routing table is empty, so it beacons with the P bit set. Setting the P bit
allows a node to “pull” advertisements from its neighbors, in order to quickly discover its local
neighborhood. It also allows a node to recover from large topology changes which cause all of its
routing table entries to be stale.
CTP also resets its beacon interval when its cost drops significantly. This behavior is not
necessary for correctness: it is an efficiency optimization. The intuition is that the node may now
be a much more desirable next hop for its neighbors. Resetting its beacon interval lets them know
quickly.
The final and most important event is when CTP detects that there might be a routing topology
inconsistency. CTP imposes an invariant on routes: the cost of each hop must monotonically
decrease. Let $p$ be a path consisting of $k$ links between node $n_0$ and the root, node $n_k$, such that
node $n_i$ forwards its packets to node $n_{i+1}$. For the routing state to be consistent, the following
constraint must be satisfied:
$$\forall i \in \{0, \ldots, k-1\}, \quad \mathrm{ETX}(n_i) > \mathrm{ETX}(n_{i+1}),$$
where $\mathrm{ETX}(x)$ is the path ETX from node $x$ to the root.
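
The invariant is easy to state as a check over the path ETX values along a route. The helper below is a hypothetical illustration that inspects a whole path at once; it is not how a node with only local information would detect the problem.

```python
from typing import Optional, Sequence


def first_inconsistency(path_etx: Sequence[float]) -> Optional[int]:
    """Return the index i of the first hop where ETX(n_i) > ETX(n_{i+1})
    fails, or None if the path is consistent.

    path_etx[i] is the path ETX of node n_i; the last element is the root.
    """
    for i in range(len(path_etx) - 1):
        if not path_etx[i] > path_etx[i + 1]:
            return i
    return None
```

A path such as `[4.2, 3.0, 1.5, 0.0]` is consistent, while `[4.2, 3.0, 3.4, 0.0]` is inconsistent at index 1, where a node forwards to a neighbor whose cost is not lower than its own.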
CTP forwards data packets in a possible loop normally: it does not drop them. However, it
introduces a slight pause in forwarding, the length of the minimum beacon interval. This ensures
that it sends the resulting beacon before the data packet, such that the inconsistent node has a
chance to resolve the problem. If there is a loop of length L, this means that the forwarded packet
Components: Client Queues, Pool, Send Queue, Transmit Cache (duplicate?), Transmit Timer, Link.
Figure 6.11—The CTP forwarding path.
takes L − 1 hops before reaching the node that triggered topology recovery. As that node has
updated its routing table, it will pick a different next hop.
If the first beacon was lost, then the process will repeat. If it chooses another inconsistent next
hop, it will trigger a second topology recovery. In highly dynamic networks, packets occasionally
traverse multiple loops, incrementally repairing the topology, until finally the stale node picks a
safe next hop and the packet escapes to the root. The cost of these rare events of a small number
of transient loops is typically much less than the aggregate cost of general forwarding: improving
routes through rare transient loops is worth the cost.
6.6 Forwarding
This section describes CTP’s data plane. Unlike the control plane, which is a set of consistency
algorithms, the concerns of the data plane are much more systems and implementation oriented.
In the previous section, we described the important role that the data plane plays in detecting
inconsistencies in the topology and resetting the beacon interval to fix them. In this section, we
describe four mechanisms in the data plane that deal with efficiency, robustness, and reliabil-
ity: per-client queueing, a hybrid send queue, a transmit timer, and a packet summary cache.
Figure 6.11 shows the CTP data path and how these four mechanisms interact.
Fields (rows are 16 bits wide): P, C, Reserved, THL; Route ETX; Origin; Seq. No., Collect ID; payload. Annotations: Control, Loop detection, Duplicate detection, Transport.
Figure 6.12—The CTP data frame format.
Figure 6.12 shows a CTP data frame, which has an eight byte header. The data frame shares
two fields with the routing frame, the control field and the route ETX field. The 8-bit time has
lived (THL) field is the opposite of a TTL: it starts at zero at an end point and each hop
increments it by one. A one-byte application dispatch identifier called Collect ID allows multiple
clients to share a single CTP layer.
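The header described above can be sketched as a packed structure. This is a minimal illustration, not the TinyOS implementation: the text fixes only the eight-byte total, the 8-bit THL, and the one-byte Collect ID, so the 16-bit widths for route ETX and origin, the 8-bit sequence number, the bit positions of P and C, and the big-endian byte order are all assumptions. The function names are hypothetical.

```python
import struct

# Assumed layout: control (1 byte holding the P and C bits plus reserved bits),
# THL (1), route ETX (2), origin (2), sequence number (1), Collect ID (1).
HEADER = struct.Struct(">BBHHBB")
P_BIT = 0x80  # assumed position of the P (pull) bit
C_BIT = 0x40  # assumed position of the C bit


def pack_header(pull, c_flag, thl, etx, origin, seqno, collect_id):
    """Build an 8-byte data frame header from its fields."""
    control = (P_BIT if pull else 0) | (C_BIT if c_flag else 0)
    return HEADER.pack(control, thl & 0xFF, etx, origin, seqno & 0xFF, collect_id)


def forward(header_bytes):
    """Return the header as a forwarding hop would resend it: THL plus one.

    A real forwarder would also refresh the route ETX field; that step is
    left out of this sketch.
    """
    control, thl, etx, origin, seqno, cid = HEADER.unpack(header_bytes)
    return HEADER.pack(control, (thl + 1) & 0xFF, etx, origin, seqno, cid)
```

Because THL is 8 bits wide, it wraps to zero after 256 hops, which is why the duplicate check in Section 6.6.4 has an exception for loops whose length is a multiple of 256.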
CTP uses a very aggressive retransmission policy. By default, it will retransmit a packet up
to 32 times. This policy stems from the fact that all packets have the same destination, and,
thus, the same next hop. The outcome of transmitting the next packet in the queue will be the
same as the current one. Instead of dropping, CTP combines a retransmit delay with proactive
topology repair to increase the chances of delivering the current packet. In applications where
receiving more recent packets is more important than receiving nearly all packets, the number of
retransmissions can be adjusted without affecting the routing algorithm.
6.6.1 Per-client Queueing
CTP maintains two levels of queues. The top level is a set of one-deep client queues. CTP allows
each client to have a single outstanding packet. If a client needs additional queueing, it must
implement it on top of this abstraction. These client queues do not actually store packets; they
are simple guards that keep track of whether a client has an outstanding packet. When a client
sends a packet, the client queue checks whether it is available. If so, the client queue marks itself
busy and passes the packet down to the hybrid send queue. A one-deep queue per client provides
isolation, as a single client cannot fill the send queue and starve others.
6.6.2 Hybrid Send Queue
CTP’s lower level queue contains both route through- and locally-generated traffic (as in [113]),
maintained by a FIFO policy. This hybrid send queue is of length C + F, where C is the number of
CTP clients and F is the size of the buffer pool for forwarded packets. Following this policy means
that, technically, the send queue never rejects a packet. If it is full, this means the forwarding
path is using all F of its buffers and all C clients have an outstanding packet.
When CTP receives a packet to forward, it first checks if the packet is a duplicate: Section 6.6.4
describes this process below. If the packet is a duplicate, it returns it to the link layer without
forwarding it. If the packet is not a duplicate, CTP checks if it has a free packet buffer in its
memory pool. If it has a free buffer, it puts the received packet on the send queue and passes the
free buffer to the link layer for the next packet reception.
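The two queue levels can be sketched together as follows. This is an illustrative model under stated assumptions, not the TinyOS code: the class and method names are hypothetical, and returning False when the buffer pool is empty is an assumption, since the text does not say what happens in that case.

```python
from collections import deque


class HybridSendQueue:
    """FIFO of capacity C + F shared by local clients and forwarded traffic."""

    def __init__(self, num_clients, pool_size):
        self.client_busy = [False] * num_clients  # one-deep guards, no packet storage
        self.free_buffers = pool_size             # F buffers for forwarded packets
        self.queue = deque()                      # never holds more than C + F entries

    def client_send(self, client_id, packet):
        """Accept a local packet unless this client already has one outstanding."""
        if self.client_busy[client_id]:
            return False
        self.client_busy[client_id] = True
        self.queue.append(("client", client_id, packet))
        return True

    def forward(self, packet, is_duplicate):
        """Enqueue a received packet if it is new and a pool buffer is free."""
        if is_duplicate or self.free_buffers == 0:
            return False
        self.free_buffers -= 1
        self.queue.append(("forward", None, packet))
        return True

    def send_done(self):
        """Dequeue the head after transmission and release its guard or buffer."""
        kind, client_id, packet = self.queue.popleft()
        if kind == "client":
            self.client_busy[client_id] = False
        else:
            self.free_buffers += 1
        return packet
```

Each entry is admitted only against a client guard or a pool buffer, so the queue length is bounded by C + F without the queue itself ever rejecting a packet.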
6.6.3 Transmit Timer
Multihop wireless protocols encounter self-interference, where a node’s transmissions collide with
prior packets it has sent which other nodes are forwarding. For a route of nodes A → B → C → ...,
self-interference can easily occur at B when A transmits a new packet at the same time C forwards
the previous one [68].
CTP prevents self-interference by rate-limiting its transmissions. In the idealized scenario
above where only the immediate children and parent are in the transmission range of a transmitter,
if A waits at least 2 packet times between transmissions, then it will avoid self-interference, as C
will have finished forwarding [113]. While real networks are more complex (the interference range
can be greater than the transmit range), 2 packet times represents the minimum timing for a flow
longer than 2 hops.
The transmission wait timer depends on the packet rate of the radio. If the expected packet
time is p, then CTP waits in the range of (1.5p,2.5p), such that the average wait time is 2p but
there is randomization to prevent edge conditions due to MAC backoff or synchronized transmis-
sions.
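The randomized wait can be sketched in a few lines. The function name and the use of a uniform distribution are assumptions; the text only gives the range and the mean.

```python
import random


def transmit_wait(packet_time, rng=random):
    """Pick a wait between 1.5p and 2.5p, so that the mean wait is 2p."""
    return rng.uniform(1.5 * packet_time, 2.5 * packet_time)
```

For a hypothetical packet time of 4 ms, this yields waits between 6 ms and 10 ms, averaging 8 ms.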
6.6.4 Transmit Cache
Link layer acknowledgments are not perfect: they are subject both to false positives and false
negatives. False negatives cause a node to retransmit a packet which is already in the next hop’s
forwarding queue. CTP needs to suppress these duplicates, as they can increase multiplicatively
on each hop. Over a small number of hops, this is not a significant issue, but in the face of the many
hops of transient routing loops, this leads to an exponential number of copies of a packet that can
overflow all queues in the loop.
Since CTP forwards looping packets in order to actively repair its topology, CTP needs to
distinguish link-layer duplicates from looping packets. It detects duplicates by examining three
values: the origin address, the origin sequence number, and the THL. Looping packets will match
in the address and sequence number, but will have a different THL (unless the loop was a multiple
of 256 hops long), while link-layer duplicates have matching THL values.
When CTP receives a packet to forward, it scans its send queue for duplicates. It also scans
the transmit cache. This cache contains the 3-tuples of the N most recently forwarded packets.
The cache is necessary for the case where duplicates are arriving more slowly than the rate at
which the node drains its queue: in this case, the duplicate will no longer be in the send queue.
For maximal efficiency, the transmit cache should be as large as possible. We have found
that, in practice and even under high load, having a cache size of four slots is enough to suppress
most (> 99%) duplicates on the testbeds that we used for experiments. A larger cache improves
duplicate detection slightly but not significantly enough to justify its cost on memory-constrained
platforms.
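A minimal sketch of the duplicate check, assuming the cache is a fixed-size FIFO of 3-tuples with oldest-first eviction (the text does not state the eviction policy). The class and method names are hypothetical.

```python
from collections import deque


class TransmitCache:
    """Remembers (origin, seqno, THL) for the N most recently forwarded packets."""

    def __init__(self, slots=4):
        self.entries = deque(maxlen=slots)  # oldest entry is evicted automatically

    def remember(self, origin, seqno, thl):
        """Record a packet that has just been forwarded."""
        self.entries.append((origin, seqno, thl))

    def is_duplicate(self, origin, seqno, thl, send_queue=()):
        """True if the 3-tuple is in the send queue or in the cache."""
        key = (origin, seqno, thl)
        return key in send_queue or key in self.entries
```

A looping packet has the same origin and sequence number but a different THL, so it misses the cache and is forwarded, while a link-layer retransmission matches on all three values and is suppressed.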
Testbed         Platform     Nodes  Physical size    Degree      PL    Cost  Cost/PL  Churn/node·hr
                                    (m² or m³)       Min   Max
Tutornet (16)   Tmote           91  50 × 25 × 10      10    60   3.12  5.91     1.90  31.37
Wymanpark       Tmote           47  80 × 10            4    30   3.23  4.62     1.43   8.47
Motelab         Tmote          131  40 × 20 × 15       9    63   3.05  5.53     1.81   4.24
Kansei (a)      TelosB         310  40 × 20          214   305   1.45     -        -   4.34
Mirage          Mica2dot        35  50 × 20            9    32   2.92  3.83     1.31   2.05
NetEye          Tmote          125  6 × 4            114   120   1.34  1.40     1.04   1.94
Mirage          MicaZ           86  50 × 20           20    65   1.70  1.85     1.09   1.92
Quanto          Epic-Quanto     49  35 × 30            8    47   2.93  3.35     1.14   1.11
Twist           Tmote          100  30 × 13 × 17      38    81   1.69  2.01     1.19   1.01
Twist           eyesIFXv2      102  30 × 13 × 17      22   100   2.58  2.64     1.02   0.69
Vinelab         Tmote           48  60 × 30            6    23   2.79  3.49     1.25   0.63
Tutornet (26)   Tmote           91  50 × 25 × 10      14    72   2.02  2.07     1.02   0.04
Blaze (b)       Blaze           20  30 × 30            9    19   1.30     -        -      -
(a) Packet cost logging failed on 10 nodes.
(b) Blaze instrumentation does not provide cost and churn information.
Table 6.1—Testbed configuration and topology properties, from most to least dynamic. Cost is transmissions
per delivery and PL is Path Length, the average number of hops a data packet takes. Cost/PL is the average
transmissions per link. There are two entries for Tutornet with TMotes: one is 802.15.4 channel 16 the other
channel 26.
6.7 Evaluation
This section evaluates how the mechanisms described above, namely adaptive control traffic rate,
datapath validation, and the data plane optimizations, combine to achieve the four goals from
Section 6.1: reliability, robustness, efficiency, and hardware independence.
We evaluate our implementation of CTP, using the 4-bit link estimator from [32], on 12 different
testbeds, encompassing 7 platforms, 6 link layers, multiple densities and frequencies. Despite
having anecdotal evidence of several successful real-world deployments of CTP, these results focus
on publicly available testbeds, because they represent at least theoretically reproducible results.
The hope is that the different testbed environments we examine sufficiently capture a reasonable
degree of variation in node density, radio technology, wireless environment, and deployment topology.
6.7.1 Testbeds
Table 6.1 summarizes the 12 testbeds we use. It lists the name, platform, number of nodes,
physical span, and topology properties of each network. Motelab is at Harvard University. Twist
is at TU Berlin. Wymanpark is at Johns Hopkins University. Tutornet is at USC. Neteye is at
Wayne State University. Kansei is at Ohio State University. Vinelab is at UVA. Quanto is at
UC Berkeley. Mirage is at Intel Research Berkeley. Finally, Blaze is at Rincon Research. Some
testbeds (e.g., Mirage) are on a single floor while others (e.g., Motelab) are on multiple floors.
Unless otherwise noted, the results of the detailed experiments are from the Tutornet testbed.
To roughly quantify the link-layer topology of each testbed, we ran an experiment where
each node broadcasts every 16s. The interval is randomized to avoid collisions. To compute the
link layer topology, we consider all links that delivered at least one packet. The minimum and
maximum degree columns in Table 6.1 are the in-degrees of the nodes with the smallest and largest
number of links, respectively. We consider this very liberal definition of a link because it is what
a routing layer or link estimator must deal with: a single packet can add a node as a candidate,
albeit perhaps not for long.
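The degree computation can be sketched as follows, assuming the experiment log reduces to a list of (sender, receiver) pairs, one per broadcast heard. That log format and the function name are assumptions made for illustration.

```python
from collections import defaultdict


def in_degrees(receptions, nodes):
    """Map each node to the number of distinct senders it heard at least once."""
    neighbors = defaultdict(set)
    for sender, receiver in receptions:
        if sender != receiver:
            neighbors[receiver].add(sender)  # a single packet is enough to count a link
    return {n: len(neighbors[n]) for n in nodes}
```

The minimum and maximum of the returned values correspond to the two degree columns of Table 6.1.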
As the differing delivery results on Tutornet in Table 6.2 indicate, the link stability and quality
results should not be considered definitive for all experiments. For example, most 802.15.4
channels share the same frequency bands as 802.11: 802.15.4 on an interfering channel has more
packet losses and higher link dynamics than on a non-interfering one. Tutornet on
channel 16 has the highest churn, while Tutornet on channel 26 has the lowest. We revisit the
implications of this effect in Section 6.7.3.9. With the exception of Quanto and the channel 16
Tutornet experiment, all of the values in Table 6.1 for 802.15.4 testbeds (Mirage, Tutornet, Vinelab,
Twist, Wymanpark, Kansei, Neteye, Motelab) use the non-interfering channel 26. Channel allocation
concerns prevented us from doing the same in Quanto: it was measured with channel 15.
To roughly quantify link stability and quality, we ran CTP with an always-on link layer for
3 hours and computed three values: PL, the average path length (hops a packet takes to the
collection root); the average cost (transmissions/delivery); and the node churn, or rate at which
nodes change parents. We also look at cost/PL, which indicates how many transmissions CTP
makes on average per hop. Wide networks have a large PL. Networks with many intermediate
links or sparse topologies have a high cost/PL ratio (sparsity means a node might not have a
good link to use). Networks with more variable links or very high density have a high churn
(density can increase churn because a node has more parents to try and choose from). As the
major challenge adaptive beaconing and datapath validation seek to address is link dynamics, we
order the testbeds from the highest (Tutornet on channel 16) to the lowest (Tutornet on channel
26) churn.
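The four topology metrics can be computed from raw counters as in the sketch below. The counter names and the function are hypothetical; only the definitions (hops per delivered packet, transmissions per delivery, their ratio, and parent changes per node per hour) come from the text.

```python
def topology_metrics(total_hops, transmissions, delivered, parent_changes, nodes, hours):
    """Compute PL, cost, cost/PL, and churn from experiment-wide counters."""
    pl = total_hops / delivered          # average hops per delivered packet
    cost = transmissions / delivered     # transmissions per delivery
    return {
        "PL": pl,
        "cost": cost,
        "cost_per_hop": cost / pl,       # average transmissions per link
        "churn": parent_changes / (nodes * hours),
    }
```

With hypothetical counts chosen to reproduce the Tutornet channel 16 ratios (312 hops and 591 transmissions for 100 deliveries), the function gives PL 3.12, cost 5.91, and cost/PL of about 1.894, matching the values in Table 6.1 and Table 6.3.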
We use all available nodes in every experiment. In some testbeds, this means the set of nodes
across experiments is almost but not completely identical, due to backchannel connectivity issues.
However, we do not prune problem nodes. In the case of Motelab, this approach greatly affects
the computed average performance, as some nodes are barely connected to the rest of the network.
6.7.2 Experimental Methodology
We modified three variables across all of the experiments: the inter-packet interval (IPI) with
which the application sends packets with CTP, the MAC layer used, and the node ID of the root
node. Generally, to obtain longer routes, we picked roots that were in one corner of a testbed.
We used 6 different MAC layers. On TelosB/TMote [87] nodes, we used the standard TinyOS
2.1.0 CSMA layer at full power (CSMA), the standard TinyOS 2.1.0 low-power-listening layer,
BoX-MAC (BoX) [79], and low-power probing (LPP) [80]. On mica2dot [20] nodes, we used the
standard TinyOS 2.1.0 BMAC layer (B-MAC) for CC1000 radio [86]. On Blaze nodes, we used the
standard TinyOS 2.1.0 BMAC layer (B-MAC) for CC1100. On eyesIFXv2 nodes, we use both the
TinyOS 2.1.0 CSMA layer (CSMA) as well as TinyOS 2.1.0 implementation of SpeckMAC [111].
Testbed              Frequency  MAC             IPI  Avg Delivery  5th% Delivery  Loss
Motelab              2.48GHz    CSMA            16s  94.7%          44.7%         Retransmit
Motelab              2.48GHz    BoX-50ms        5m   94.4%          26.9%         Retransmit
Motelab              2.48GHz    BoX-500ms       5m   96.6%          82.6%         Retransmit
Motelab              2.48GHz    BoX-1000ms      5m   95.1%          88.5%         Retransmit
Motelab              2.48GHz    LPP-500ms       5m   90.5%          47.8%         Retransmit

Tutornet (26)        2.48GHz    CSMA            16s  99.9%         100.0%         Queue
Tutornet (16)        2.43GHz    CSMA            16s  95.2%          92.9%         Queue
Tutornet (16)        2.43GHz    CSMA            22s  97.9%          95.4%         Queue
Tutornet (16)        2.43GHz    CSMA            30s  99.4%          98.1%         Queue

Wymanpark            2.48GHz    CSMA            16s  99.9%         100.0%         Retransmit
NetEye               2.48GHz    CSMA            16s  99.9%          96.4%         Retransmit
Kansei               2.48GHz    CSMA            16s  99.9%         100.0%         Retransmit
Vinelab              2.48GHz    CSMA            16s  99.9%          99.9%         Retransmit
Quanto               2.425GHz   CSMA            16s  99.9%         100.0%         Retransmit
Twist (Tmote)        2.48GHz    CSMA            16s  99.3%         100.0%         Retransmit
Twist (Tmote)        2.48GHz    BoX-2s          5m   98.3%          92.9%         Retransmit

Mirage (MicaZ)       2.48GHz    CSMA            16s  99.9%          99.8%         Queue
Mirage (Mica2dot)    916.4MHz   B-MAC           16s  98.9%          97.5%         Ack
Twist (eyesIFXv2)    868.3MHz   CSMA            16s  99.9%          99.9%         Retransmit
Twist (eyesIFXv2)    868.3MHz   SpeckMAC-183ms  30s  94.8%          44.7%         Queue
Blaze                315MHz     B-MAC-300ms     4m   99.9%           -            Queue
Table 6.2—Summary of experimental results across the testbeds. The first section compares how different low-
power link layers and settings affect delivery on Motelab. The second section compares how the 802.15.4 channel
affects delivery on Tutornet. The third section shows results from other TelosB/TMote testbeds, and the fourth
section shows results from testbeds with other platforms. In all settings, CTP achieves an average delivery ratio of
over 90%. In Motelab, a small number of nodes (the 5th percentile) have poor delivery due to poor connectivity.
In the cases where we use low power link layers, we report the interval. For example, “BoX-1s”
means BoX-MAC with a check interval of 1 second, while “LPP-500ms” means low-power probing
with a probing interval of 500ms.
Evaluating efficiency is difficult, as temporal dynamics prevent knowing what the optimal route
would be for each packet. Therefore, we evaluate efficiency as a comparative measure. We compare
CTP with the TinyOS 2.1 implementation of MultiHopLQI, a well-established, well-tested, and
highly used collection layer that is part of the TinyOS release. As MultiHopLQI has been used in
recent deployments, e.g., on a volcano in Ecuador [110], we consider it a reasonable comparison.
Other notable collection layers, such as Hyper [91] and Dozer [14] are either implemented in
TinyOS 1.x (Hyper) or are closed source and specific to a platform (Dozer, TinyNode). As
[Figure 6.13 plots: delivery ratio (0 to 1) versus time (0 to 40 hours), showing max, median, and min across nodes. (a) Delivery Ratio for CTP. (b) Delivery Ratio for MultiHopLQI.]
Figure 6.13—CTP has a consistently higher delivery ratio than MultiHopLQI. In these plots we show for each
time interval the minimum, median, and maximum delivery ratio across all nodes.
TinyOS 2.x and 1.x have different packet scheduling and MAC layers, we found that comparing
with 1.x protocols unfairly favors CTP.
6.7.3 Experiments and Results
6.7.3.1 Reliable, Robust, and Hardware-Independent
Before evaluating the contribution of each mechanism to the overall performance of CTP, we first
look at high-level results from experiments across multiple testbeds, as well as a long duration
experiment. Table 6.2 shows results from 21 experiments across the 12 testbeds. In these experiments,
we chose IPI values well below saturation, such that the delivery ratio is not limited by
available throughput. The Loss column describes the dominant cause of packet loss: retransmit
means CTP dropped a packet after 32 retransmissions, queue means it dropped a received packet
due to a full forwarding queue, and ack means it heard link layer acknowledgments for packets
that did not arrive at the next hop.
In all cases, CTP maintains an average delivery ratio above 90%: it meets the reliability goal.
The lowest average delivery ratio is for Motelab using low power probing (500ms), where it is
90.5%. The second lowest is Motelab using BoX-MAC (50ms), at 94.4%. In Motelab, retransmit
is the dominant cause of failure: retransmission drops represent CTP sending a packet 32 times
yet never delivering it. Examining the logs, this occurs because some Motelab nodes are only
intermittently and sparsely connected (its comparatively small minimum degree of 9 in Table 6.1
reflects this). Furthermore, CTP maintains this level of reliability across all configurations and
settings: it meets the robustness goal. CTP requires configuration of two constants across different
radio chips – the threshold on physical measurement (LQI, RSSI) that determines if a channel is
of high quality (white-bit) and rate-limiting inter-packet delay in the forwarding engine. Thus,
CTP also meets the hardware independence goal. We therefore focus comparative evaluations on
MultiHopLQI.
To show the consistency of delivery ratio over time, in Figure 6.13(a), we show the result from
one experiment when we ran CTP for over 37 hours. The delivery ratio remains consistently high
over the duration of the experiment. Figure 6.13(b) shows the result from a similar experiment
with MultiHopLQI. Although MultiHopLQI’s average delivery ratio was 85%, delivery is highly
variable over time, occasionally dipping to 58% for some nodes. In the remainder of this section
we evaluate through detailed experiments how the different techniques we use in CTP contribute
to its higher delivery ratio, while maintaining low control traffic rates and agility in response to
changes in the topology.
6.7.3.2 Efficiency
A protocol that requires a large number of transmissions is not well-suited for duty-cycled networks.
We measure data delivery efficiency using the cost metric which accounts for all the control and
data transmissions in the network normalized by the packets received at the sink. This metric
gives a rough measure of the energy spent delivering a single packet to the sink. Figure 6.14
compares the average delivery cost for CTP and MultiHopLQI computed over 70000 packets.
CTP cost is 24% lower than that of MultiHopLQI. The figure also shows that control packets
for CTP occupy a much smaller fraction of the cost than MultiHopLQI (2.2% vs. 8.4%). The
[Figure 6.14 bar chart: delivery cost per packet (1 to 5) by protocol (CTP, MultiHopLQI), split into control cost and data cost.]
Figure 6.14—CTP’s cost is 24% lower than MultiHopLQI and the portion of that is control is 73% lower.
[Figure 6.15 plot: total number of beacons per node (0 to 600) versus time (0 to 5 hours) for MultiHopLQI and CTP.]
Figure 6.15—CTP’s beaconing rate decreases and stabilizes over time. It is significantly smaller than MultiHo-
pLQI’s over the long run.
decrease in data transmissions is a result of good route selection and agile route repair. The
decrease in control transmissions is due to CTP’s adaptive beaconing.
6.7.3.3 Adaptive Control Traffic
Figure 6.15 shows CTP’s and MultiHopLQI’s control traffic from separate five-hour experiments
on Tutornet. CTP’s control traffic rate is high at network startup as CTP probes and discovers
the topology, but decreases and stabilizes over time. MultiHopLQI sends beacons at a fixed
interval of 30s. Using a Trickle timer allows CTP to send beacons as quickly as every 64ms and
quickly respond to topology problems within a few packet times. By adapting its control rate and
[Figure 6.16 plot: total number of beacons (0 to 100) versus time (0 to 120 minutes) for nodes 38, 39, 40, and 41.]
Figure 6.16—Number of beacons for selected nodes in the neighborhood of the new node. There is a big jump in
control traffic shortly after four new nodes are introduced and it levels off.
slowing down when the network is stable, however, CTP has a much lower control packet rate
than MultiHopLQI. At the same time, it can respond to topology problems in 64ms, rather than
30s, a 99.8% reduction in response time.
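The adaptive beaconing behavior can be sketched as a Trickle-style timer. This is a simplified model and not CTP's actual timer code: the class name, the choice of a send time in the second half of each interval, and the assumed maximum interval of eight minutes are illustrative assumptions. Only the 64ms minimum and the reset-on-inconsistency behavior come from the text.

```python
import random


class BeaconTimer:
    """Trickle-style interval: doubles while stable, resets on an inconsistency."""

    def __init__(self, i_min=0.064, i_max=480.0, rng=random):
        self.i_min, self.i_max, self.rng = i_min, i_max, rng
        self.interval = i_min  # seconds

    def next_beacon_delay(self):
        """Pick a send time in the second half of the interval, then grow it."""
        delay = self.rng.uniform(self.interval / 2, self.interval)
        self.interval = min(self.interval * 2, self.i_max)
        return delay

    def reset(self):
        """Called on a detected inconsistency, a parent change, or a pull bit."""
        self.interval = self.i_min
```

Under these assumptions the interval reaches its maximum after 13 doublings, so a stable network sends very few beacons, yet a reset brings the next beacon back to within 64ms. The 99.8% figure follows from 1 − 0.064/30 ≈ 0.998.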
Lower beacon rates are generally at odds with route agility. During one experiment, we
introduced four new nodes in the network 60 minutes after the start. Figure 6.16 shows that the
control overhead for selected nodes in the vicinity of the new nodes increases immediately after
the nodes were introduced as beacons are sent rapidly. The beacon rate decays shortly afterward.
The increase in beaconing rate (in response to the pull bit) was localized to the neighborhood
of the nodes introduced, and produced fast convergence. New nodes were able to send collection
packets to the sink within four seconds after booting.
6.7.3.4 Topology Inconsistencies
Next we look at how route inconsistencies are distributed over space and time, and their impact
on control overhead. Figure 6.17(a) shows inconsistencies detected by each node in an experiment
over a 6.5-hour period. Inconsistencies are temporally correlated across nodes, and typically
constrained to a subset of nodes. The lowest curve in Figure 6.17(b) shows the cumulative count
of route inconsistencies in the same experiment, and how the rate decreases over time. In the
beginning, most of the inconsistencies are due to discovery and initial rounds of path selection.
[Figure 6.17 plots: (a) Inconsistent routing states over time and across nodes; each point is a detected route inconsistency (node id 0 to 60 versus time 0 to 7 hours). (b) Control overhead from route inconsistencies: total number of events per node (0 to 250) versus time (0 to 7 hours) for Total Beacons, Churn Beacons, and Inconsistencies.]
Figure 6.17—Inconsistent routing state and resulting control overhead
Over time, link dynamics are the dominant cause of inconsistencies. We have also observed
a similar trend in number of parent changes: frequent changes in the beginning as the nodes
discover new links and neighbors and fewer changes once the network has selected high quality
routes.
When a node detects such an inconsistency, it resets its beacon timer. The top curve in
Figure 6.17(b) shows the total number of routing beacons sent (Total Beacons). The middle
curve, Churn Beacons, is the subset of these beacons sent by the Trickle timer when a parent
change resets the interval. The difference between these two curves provides an upper bound on
the number of beacons sent due to inconsistencies. It is an upper bound because of the beacons
that would have been sent normally, at the slowest beacon interval, and the occasional beacons
caused by packets with the pull bit set. In 6.5 hrs, the nodes sent 12299 total beacons while they
detected 3025 inconsistencies and triggered 4485 beacons due to parent change: CTP sent 2.6
beacons per inconsistency detected in order to repair and reestablish the path to the root.
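The 2.6 figure is the upper bound described above: total beacons minus churn beacons, divided by the number of detected inconsistencies. A short check using the counts from the text (the variable names are illustrative):

```python
total_beacons = 12299
churn_beacons = 4485
inconsistencies = 3025

# Upper bound on beacons attributable to inconsistencies, per inconsistency.
beacons_per_inconsistency = (total_beacons - churn_beacons) / inconsistencies
print(round(beacons_per_inconsistency, 1))  # prints 2.6
```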
[Figure 6.18 plots: delivery ratio (0 to 1) versus time (0 to 140 minutes), showing max, median, and min across nodes. (a) Nodes fail at 60 minutes and CTP does not observe any significant disruption. (b) Nodes fail at 80 minutes and MultiHopLQI’s median delivery drops to 80% for 10 minutes.]
Figure 6.18—Robustness of CTP and MultiHopLQI when the 10 most heavily forwarding nodes fail.
6.7.3.5 Robustness to Failure
To measure how effectively the routes can adapt to node failures, we ran CTP for two hours with
an application sending packets every 8s. After 60 minutes, we removed the ten nodes that were
forwarding the most packets in the network. CTP uses the 4-bit link estimator, which reflects
changes in the topology in a few packet times. This resets the trickle timers and causes rapid
route convergence around the failure.
Figure 6.18(a) plots the minimum, median, and maximum delivery ratio across nodes over time.
The figure shows only a tiny change in delivery ratio due to the disruption: the minimum delivery
ratio across the network drops to 98%. 15 nodes dropped 1 or 2 packets each right after the
disruption, and most nodes found new routes in under 1s. The 10-minute dip in the graph is an
artifact of the sliding window we used to calculate average delivery ratio. The median delivery
ratio remained at 100%.
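The sliding-window artifact can be illustrated with a small sketch. The window mechanics here are an assumption, since the text does not describe how the window was implemented; the function name and the per-step (sent, received) sample format are hypothetical.

```python
from collections import deque


def windowed_delivery(samples, window):
    """Delivery ratio at each step, computed over the last `window` steps.

    samples: list of (sent, received) counts, one pair per time step.
    """
    recent, out = deque(maxlen=window), []
    for sent, received in samples:
        recent.append((sent, received))
        total_sent = sum(s for s, _ in recent)
        total_received = sum(r for _, r in recent)
        out.append(total_received / total_sent if total_sent else 1.0)
    return out
```

A loss confined to a single step lowers the reported ratio for as many steps as the window is long, which is how a brief disruption can appear as a dip lasting the full window length.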
Figure 6.18(b) shows the result of a similar experiment with MultiHopLQI. After 80 minutes
we removed ten nodes that were forwarding the most packets. The resulting disruption caused
the delivery ratio of some nodes to drop as low as 60%, while the median delivery ratio dropped
to 80%.
6.7.3.6 Agility
The prior experiment shows that CTP can quickly route around node failures when there is a
constant stream of traffic. To delve deeper into how CTP adapts to sudden topology changes,
we ran a slightly different experiment. We ran CTP on Tutornet with each node generating
a data packet every 8s for six minutes, allowing CTP to settle on a good routing tree while it was
delivering 100% of the packets. Then we stopped generating data traffic on all the nodes for 14
minutes. At the 20th minute, we removed (erased the program running on the mote) node 26 from the
network and shortly thereafter made node 53 (node 26’s child in the routing tree) start sending
data packets. As expected, packet transmissions from node 53 to the non-existent node 26 fail.
We found that after twelve packet transmissions (325 ms), CTP switched to node 40 as its
new parent. Thus, although the beacon rate in the network had decreased to one beacon every 8
minutes, CTP was able to quickly (in 325ms) select a new parent when its existing parent was
removed from the network. CTP remains efficient even when the beacon interval decays to tens of
minutes, maintaining the ability to react to topology changes within a few packet transmission
times.
6.7.3.7 Transmit Timer
CTP pauses briefly between packet transmissions to avoid self-interference, as described in
Section 6.6.3. Here we show how we established this value for the CC2420 radio and quantify its
benefit to CTP’s reliability.
Figure 6.19 shows how the duration of a transmission wait timer affects a single node flow on
channel 26 in the Tutornet testbed. In this experiment, a single node sends packets as fast as it
can across 3 hops to the data sink. The transmission timers in Figure 6.19 range from [1,2] to
[10,20]ms. At values below [7,14]ms, delivery dips below 95%.
Although goodput increases slightly with smaller transmit timers, this benefit comes at a
significant cost: the delivery ratio drops as low as 72%, which does not satisfy the reliability
[Figure 6.19 plot: delivery ratio (0 to 1) and goodput (15 to 50 pkts/s) versus wait time [x,2x] ms, for x from 0 to 11.]
Figure 6.19—Effect of a per-hop rate-limiting transmit timer on goodput and delivery ratio on the CC2420 radio.
Wait time between packets is [x,2x] ms.
requirement. However, as the timer length increases past [8,16]ms, goodput drops significantly
as the timer introduces idleness. Therefore, CTP uses 7-14ms (1.5-3 average packet times) as its
wait timer for the CC2420 radio.
Similarly, it uses 1.5-3 average packet times as its transmit timer on other radios. We have
found that this setting works across the platforms and testbeds in these experiments but factors
such as load, density, and link qualities ultimately determine the maximum rate that a path can
accommodate. Some MACs might introduce large delays between the packets, in which case CTP
transmit timers can be made smaller.
Although CTP’s primary goal is not high throughput traffic, its use of transmit timers allows it
to avoid collisions while under high load. Transmit timers are insufficient for end-to-end reliability:
bottleneck links of low PRR can overflow transmit queues. Robust end-to-end reliability requires
higher-layer congestion and flow control [56, 80, 89, 110], but CTP’s transmit timers make its
reliability more robust to the high load these protocols can generate.
6.7.3.8 Transmit Cache
We evaluate the effect of the transmit cache by running two experiments on Tutornet. Both
experiments use the CSMA link layer, send packets every 8s, and use 802.15.4 channel 16. The
Chan.  Freq.     Delivery  PL    Cost  Cost/PL  Churn/node-hr
16     2.43GHz   95.2%     3.12  5.91  1.894    31.37
26     2.48GHz   99.9%     2.02  2.07  1.025     0.04
Table 6.3—Results on how channel selection effects CTP’s performance on Tutornet. Channel 16 overlaps with
Wi-Fi; channel 26 does not.
first experiment uses standard CTP; the second disables its transmit cache. Standard CTP has
an average cost of 3.18 packets/delivery. With the transmit cache disabled, this jumps to 3.47
packets/delivery, a 9% increase. The transmit cache improves CTP’s efficiency by 9%.
These cost values are over 50% higher than those reported in Table 6.1 because channel 16
suffers from 802.11 interference, while the results in Table 6.1 are on channel 26, which does not.
The next section examines how CTP responds to external interference in greater detail.
6.7.3.9 External Interference
The first two results in the second set of rows in Table 6.2 are obtained from experiments on
Tutornet with the same link layer, transmission rate, and root node, but differ significantly in
their delivery ratio. This difference is due to the 802.15.4 channel they used. The experiment
on channel 26 (2.48GHz) observed an average delivery ratio of 99.9%; the experiment on channel
16 (2.43GHz) observed an average delivery ratio of 95.2%. Table 6.3 summarizes the differences
between the two settings.
Using channel 16, the average path length increased from 2.02 to 3.12 hops and the cost
increased from 2.07 to 5.91 transmissions per successfully delivered packet. The increase in cost is
not only due to longer paths but also a larger number of transmissions per hop, which increased
from 1.025 to 1.894.
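Since cost is the product of path length and per-hop cost, the relative contribution of each factor can be checked directly from the Table 6.3 values:

```latex
\frac{\text{Cost}_{16}}{\text{Cost}_{26}} = \frac{5.91}{2.07} \approx 2.86,
\qquad
\frac{\text{PL}_{16}}{\text{PL}_{26}} \times
\frac{(\text{Cost}/\text{PL})_{16}}{(\text{Cost}/\text{PL})_{26}}
= \frac{3.12}{2.02} \times \frac{1.894}{1.025}
\approx 1.54 \times 1.85 \approx 2.85
```

The small mismatch between the two sides comes from rounding in the tabulated values. The per-hop factor (about 1.85) is somewhat larger than the path-length factor (about 1.54).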
Channel 16 overlaps with several 802.11b channels (2-6), while channel 26 is almost entirely
outside the 802.11b band. Figure 6.20 shows Wi-Fi activity by Wi-Fi channel on the Tutornet
testbed. RF interference from Wi-Fi causes link qualities to drop, increases tree depth because
Figure 6.20—802.11 activity captured using the Wi-Spy Spectrum Analyzer tool on Tutornet. Channel 1 and 11
are most heavily used by the building occupants.
Link Layer   Average    PL    Cost  Cost/PL  Duty Cycle
             Delivery                        Median   Mean
CSMA         94.7%      3.05  5.53  1.81     100.0%   100%
BoX-50ms     94.4%      3.28  6.48  1.98      24.8%   24.9%
BoX-500ms    97.1%      3.38  6.61  1.96       4.0%    4.6%
BoX-1s       95.1%      5.40  8.34  1.54       2.8%    3.8%
LPP-500ms    90.5%      3.76  8.55  2.27       6.6%    6.6%
Table 6.4—Detailed Motelab results on how link layer settings affect CTP’s topology and performance.
longer links are generally less reliable. It also causes a larger number of retransmissions, decreasing
effective capacity.
To test this hypothesis, we re-ran the channel 16 experiment with inter-packet intervals of 22s
and 30s. Table 6.2 shows the results. At 22s, CTP has an average delivery ratio of 97.9% and at
30s it has 99.4%. CTP achieves high delivery even with high external interference, but the external
interference on the channel does lower the supportable data rate.
6.7.3.10 Link Layers
The first section of Table 6.2 contains five experiments on the Motelab testbed using the standard
TinyOS CSMA layer, low-power listening with BoX-MAC, and low-power probing with LPP.
Table 6.4 has further details on these results.
Low-power listening observes longer paths (higher average PL) and higher end-to-end costs,
but the per-link cost (cost/PL) decreases. Longer sleep intervals cause CTP to choose longer
routes of more reliable links. This shift is especially pronounced once the interval is above 500ms.
The sleep interval the link layer uses affects the link qualities that CTP observes. If the signal-
to-noise ratio is completely stable, link qualities are independent of how often a node checks the
channel. There are temporal variations in the signal-to-noise ratio: this suggests that low-power
link layers should consider the effects of link burstiness [98].
Low-power probing generally under-performs low-power listening. For the same check interval,
it has a lower delivery ratio and a higher duty cycle. This result should not be interpreted as a
general comparison of the two, however. Each implementation can of course be improved, CTP
is only one traffic pattern, and we only compared them on a single testbed with a single traffic
rate. Nevertheless, CTP meets its reliability goal on both.
We also ran CTP with low-power link layers on both Twist testbeds. For the eyesIFXv2
platform, we used the SpeckMAC layer, with a check interval of 183ms. The lower delivery ratio
with SpeckMAC compared to CSMA is due to queue overflows on the bottleneck links because of
longer transmission times at the MAC layer.
6.7.3.11 Energy Profile
For a more detailed examination of CTP’s energy performance, we use the Blaze platform, devel-
oped by Rincon Research. Blaze has a Texas Instruments MSP430 microcontroller and a Chipcon
CC1100 [48] radio (see footnote 4). We used a 20-node network that Rincon Research has deployed in a 33 m × 33 m
space. One node in this testbed has a precision 1 Ω resistor in series with the battery, connected
to a high precision 24-bit ADC using a data acquisition unit. We later converted this voltage to
energy and extrapolated to per day energy consumption.
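The conversion described above can be sketched as follows. This is a minimal illustration, not the processing actually used for the measurements: it assumes the ADC samples are voltage drops across the series resistor taken at a fixed period, applies Ohm's law to obtain current, and integrates to charge. The function name, the sample period, and the rectangle-rule integration are assumptions. The result is expressed in mAh/day, the unit used in Figure 6.21.

```python
SECONDS_PER_HOUR = 3600.0
SECONDS_PER_DAY = 86400.0


def mah_per_day(shunt_volts, sample_period_s, shunt_ohms=1.0):
    """Extrapolate sampled shunt voltages to charge drawn per day, in mAh.

    shunt_volts     -- voltage drops across the series resistor, in volts
    sample_period_s -- time between samples, in seconds (assumed constant)
    shunt_ohms      -- resistor value; 1 ohm as in the testbed description
    """
    if not shunt_volts:
        raise ValueError("need at least one sample")
    # Ohm's law: current through the series resistor, converted to mA.
    currents_ma = [v / shunt_ohms * 1000.0 for v in shunt_volts]
    # Rectangle-rule integration of current over time gives charge in mAh.
    charge_mah = sum(i * sample_period_s for i in currents_ma) / SECONDS_PER_HOUR
    # Scale the measured window up to a full day.
    duration_s = len(shunt_volts) * sample_period_s
    return charge_mah * SECONDS_PER_DAY / duration_s


if __name__ == "__main__":
    # Hypothetical trace: 15 minutes of 1 Hz samples at a constant 0.3 mV drop.
    print(mah_per_day([0.0003] * 900, sample_period_s=1.0))
```

For example, a constant 0.3 mV drop across a 1 Ω resistor corresponds to an average current of 0.3 mA, or about 7.2 mAh over 24 hours.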
We ran a total of 20 experiments for energy profiling, changing the MAC sleep interval and
application data generation interval. For each experiment, we let the network warm up for about
15 minutes. We then used a dissemination protocol to request the nodes to start sending data at
a given message interval. We collected energy data for 15 minutes.
Footnote 4: Rincon Research maintains Blaze support code in the TinyOS 2.x “external contributions” repository.
[Figure 6.21 plot: energy consumption (mAh/day, axis 0 to 250) versus message generation interval (s, axis 0 to 250); series: CTP (100ms), CTP (200ms), CTP (300ms), MultiHopLQI (300ms).]
Figure 6.21: Energy consumption for CTP and MultiHopLQI for 100ms-300ms sleep intervals.
The platform consumes 341.6 mAh/day in idle mode without duty-cycling. The results in
Figure 6.21 show that the full CTP stack can run on as little as 7.2 mAh/day, compared to
60.8 mAh/day for MultiHopLQI. During these experiments CTP consistently delivered 99.9% of
the data packets to the sink.
This result suggests that CTP with a properly-designed low power hardware platform can
be used in long lasting deployments: even with a moderately-rapid (for a low power network)
message interval of 240s, two AA batteries (5000 mAh) can supply sufficient energy to run a node
for more than 400 days. This result is significant because the experiment takes into account the
cost for running a full network stack consisting of dissemination and CTP protocols.
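The lifetime estimate above is simple arithmetic, sketched below under idealized assumptions: the full 5000 mAh is usable, the daily draw is constant, and battery self-discharge and voltage cutoff are ignored. The function names are illustrative; the numeric inputs are the figures quoted in this section.

```python
CAPACITY_MAH = 5000.0  # two AA batteries, as stated in the text


def lifetime_days(capacity_mah, draw_mah_per_day):
    """Idealized lifetime: usable capacity divided by daily consumption."""
    if draw_mah_per_day <= 0:
        raise ValueError("daily draw must be positive")
    return capacity_mah / draw_mah_per_day


def max_draw_for(capacity_mah, target_days):
    """Largest daily draw (mAh/day) that still reaches the target lifetime."""
    if target_days <= 0:
        raise ValueError("target lifetime must be positive")
    return capacity_mah / target_days


if __name__ == "__main__":
    for label, draw in [("CTP, lowest measured", 7.2),
                        ("MultiHopLQI", 60.8),
                        ("idle, no duty-cycling", 341.6)]:
        print(f"{label:24s} {lifetime_days(CAPACITY_MAH, draw):7.1f} days")
    print("budget for 400 days:", max_draw_for(CAPACITY_MAH, 400.0), "mAh/day")
```

Under these assumptions, a 400-day lifetime corresponds to a budget of 12.5 mAh/day, while the idle draw of 341.6 mAh/day would exhaust the same batteries in roughly two weeks.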
CTP’s low energy profile is possible because it selects efficient paths, avoids unnecessary control
traffic, and actively monitors the topology using the data plane. Reduction in control traffic
is especially important in these networks because broadcast packets must be transmitted with
long preambles. CTP’s ability to taper off that overhead using an exponentially increasing beacon
interval allows it to achieve much lower energy consumption compared to the protocols that use
periodic beacons.
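To illustrate why an exponentially increasing beacon interval reduces control traffic, the sketch below counts beacons sent over a fixed horizon under a doubling schedule and under a fixed period. It is a deliberately simplified model and not CTP's implementation: it omits the randomization within each interval and the interval resets that an adaptive scheme applies when the topology changes. The interval bounds and the 30 s period are hypothetical values chosen only for the example.

```python
def doubling_beacon_times(horizon_s, min_interval_s, max_interval_s):
    """Beacon send times when the interval doubles after every beacon.

    The interval starts at min_interval_s and is capped at max_interval_s.
    No resets occur, which models a network whose topology stays stable.
    """
    times = []
    interval = min_interval_s
    t = interval
    while t <= horizon_s:
        times.append(t)
        interval = min(interval * 2, max_interval_s)
        t += interval
    return times


def periodic_beacon_count(horizon_s, period_s):
    """Number of beacons sent with a fixed period over the same horizon."""
    return int(horizon_s // period_s)


if __name__ == "__main__":
    HOUR = 3600
    adaptive = doubling_beacon_times(HOUR, min_interval_s=1, max_interval_s=512)
    print("doubling schedule:", len(adaptive), "beacons in one hour")
    print("fixed 30 s period:", periodic_beacon_count(HOUR, 30), "beacons in one hour")
```

In a stable network the doubling schedule quickly reaches its maximum interval, so most of its beacons are sent early; a fixed-period protocol keeps paying the same cost for the whole deployment.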
6.7.3.12 Testbed Observations
The most salient dynamic property that differentiates the testbeds is churn.
On Motelab and Kansei, the churn is higher than on other testbeds. Analysis of CTP logs shows
that some sections of Motelab are very sparse and have only a few links with very low PRR.
These low-quality links are bursty, such that nodes cycle through their list of alternative parents
and actively probe the network in search of better parents. This small number of nodes accounts
for most of the churn in the network: 7% of the nodes accounted for 76% of parent changes
on Motelab. This also explains why most packet losses in Motelab are due to retransmission
timeouts.
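A concentration figure such as "7% of the nodes accounted for 76% of parent changes" can be computed from per-node counts of parent changes. The sketch below shows one way to do so; the input format (a list with one count per node) and the function name are assumptions, since the log format is not described here, and the sample data is synthetic.

```python
def share_of_top_nodes(parent_changes, top_fraction):
    """Fraction of all parent changes contributed by the most active nodes.

    parent_changes -- one non-negative count per node (hypothetical format)
    top_fraction   -- fraction of nodes to consider, in (0, 1]
    """
    if not parent_changes:
        raise ValueError("need at least one node")
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    total = sum(parent_changes)
    if total == 0:
        return 0.0
    # At least one node is always included in the "top" group.
    k = max(1, round(len(parent_changes) * top_fraction))
    top = sorted(parent_changes, reverse=True)[:k]
    return sum(top) / total


if __name__ == "__main__":
    # Synthetic example: one very unstable node among ten.
    counts = [10] + [1] * 9
    print(f"top 10% of nodes: {share_of_top_nodes(counts, 0.1):.0%} of parent changes")
```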
On Tutornet and Kansei, churn is more uniform across nodes, but for different reasons. When
operating on an interfering channel, Tutornet sees bursty links due to bursts of 802.11 interference,
causing nodes to change parents often. On a non-interfering channel, CTP has very low churn
on Tutornet. These bursts of interference cause nodes to be unable to deliver packets for periods
of time, causing queue overflows to be the dominant cause of packet loss. In Kansei, the high
churn is due to the sheer degree of the network: nodes actively probe their hundreds of candidate
parents.
6.8 Conclusion
The three mechanisms (agile link estimation, adaptive beaconing, and datapath validation) allow
a collection protocol to remain efficient, robust, and reliable in the presence of a highly dynamic
link topology. Our implementation of these mechanisms, CTP, offers 90-99.9% packet delivery
in highly dynamic environments while sending up to 73% fewer control packets than existing
approaches. It is highly robust to topology changes and failures. It places minimal expectations
on the physical and link layers, allowing it to run on a wide range of platforms without any need for
fine-tuning parameters. Minimizing control traffic, combined with efficient route selection, allows
CTP to achieve duty cycles of <3% while supporting loads of 25 packets/minute.
Chapter 7
Conclusion
In this thesis, we extensively studied the dynamics present in the wireless sensor network protocol
stack. Using two case studies, we presented techniques that enable protocols to adapt to the
dynamics present in the physical, link, network, and application layers. AEM uses application
traffic profile analysis and elastic transmission windows to adapt to the application and link layer
dynamics and enables sensor networks to become robust and energy-efficient. CTP uses agile link
estimation, adaptive beaconing, and rapid datapath validation to adapt to the physical, link, and
network level dynamics to provide robust and energy-efficient routing.
Making a system aware of application profile and dynamically optimizing the system during
runtime is an important area of research. The application profile analysis technique presented in
this thesis is restricted to linear data-flow programs. The next step to fully realize the potential of
application-aware systems is to design an algorithm that can analyze and infer network use patterns
from an arbitrary C program. This critical missing area of research has the potential to bring a
big change in the way systems are built, across the protocol stack and across types of networks.
In the context of routing protocols, for example, an understanding of an application’s networking needs
over time might enable a system to switch between proactive and reactive protocols or push
and pull protocols during runtime. It might be possible to change the end-to-end retransmission
timeouts in a transport protocol depending on the latency requirement of the application running
in the network.
The adaptation techniques presented in this thesis are studied in the context of sensor network
radio duty-cycling and routing protocols. It is an open question for future research to explore how
they generalize to other sensor network protocols and other types of networks. Further research
is needed to understand if protocols such as network-wide time synchronization protocols can
work robustly and accurately using adaptive beaconing. It also remains to be seen if the adaptive
techniques used in CTP can make wireless mesh routing more robust and efficient.
Bibliography
[1] Utku Günay Acer, Shivkumar Kalyanaraman, and Alhussein A. Abouzeid. Weak state
routing for large scale dynamic networks. In MobiCom ’07: Proceedings of the 13th annual
ACM international conference on Mobile computing and networking, pages 290–301, New
York, NY, USA, 2007. ACM.
[2] Yuvraj Agarwal, Curt Schurgers, and Rajesh Gupta. Dynamic power management using
on demand paging for networked embedded systems. In ASP-DAC ’05: Proceedings of the
2005 Asia and South Pacific Design Automation Conference , pages 755–759, New York,
NY, USA, 2005. ACM.
[3] Daniel Aguayo, John Bicket, Sanjit Biswas, Glenn Judd, and Robert Morris. Link-level
measurements from an 802.11b mesh network. In SIGCOMM ’04: Proceedings of the 2004
conference on Applications, technologies, architectures, and protocols for computer commu-
nications, pages 121–132, New York, NY, USA, 2004. ACM.
[4] Gahng-Seop Ahn, Se Gi Hong, Emiliano Miluzzo, Andrew T. Campbell, and Francesca
Cuomo. Funneling-mac: a localized, sink-oriented mac for boosting fidelity in sensor net-
works. In SenSys ’06: Proceedings of the 4th international conference on Embedded net-
worked sensor systems, pages 293–306, New York, NY, USA, 2006. ACM.
[5] Manish Anand, Edmund B. Nightingale, and Jason Flinn. Self-tuning wireless network
power management. In MobiCom ’03: Proceedings of the 9th annual international conference
on Mobile computing and networking, pages 176–189, New York, NY, USA, 2003. ACM.
[6] A. Arora, R. Ramnath, E. Ertin, P. Sinha, S. Bapat, V. Naik, V. Kulathumani, Hongwei
Zhang, Hui Cao, M. Sridharan, S. Kumar, N. Seddon, C. Anderson, T. Herman, N. Trivedi,
M. Nesterenko, R. Shah, S. Kulkami, M. Aramugam, Limin Wang, M. Gouda, Youngri Choi,
D. Culler, P. Dutta, C. Sharp, G. Tolle, M. Grimmer, B. Ferriera, and K. Parker. Exscal:
elements of an extreme scale wireless sensor network. In ERTCSA ’05: Proceedings of the
11th IEEE International Conference on Embedded and Real-Time Computing Systems and
Applications, pages 102–108, Aug. 2005.
[7] Baruch Awerbuch, David Holmer, and Herbert Rubens. High throughput route selection in
multi-rate ad hoc wireless networks. In WONS ’04: Proceedings of the First Working Conference
on Wireless On-demand Network Systems, August 2004.
[8] F. Bennett, D. Clarke, J. Evans, A. Hopper, A. Jones, and D. Leask. Piconet: Embedded
Mobile Networking. IEEE Personal Communications, 4(5):8–15, 1997.
[9] Jan Beutel, Stephan Gruber, Andreas Hasler, Roman Lim, Andreas Meier, Christian Plessl,
Igor Talzi, Lothar Thiele, Christian Tschudin, Matthias Woehrle, and Mustafa Yuecel. Per-
madaq: A scientific instrument for precision sensing and data recovery in environmental
extremes. In IPSN/SPOTS ’09: Proceedings of the ACM/IEEE International Conference
on Information Processing in Sensor Networks (IPSN 2009), SPOTS track, pages 265–276,
San Francisco, CA, USA, April 2009. ACM/IEEE.
[10] Vaduvur Bharghavan, Alan Demers, Scott Shenker, and Lixia Zhang. MACAW: a media
access protocol for wireless lan’s. In SIGCOMM ’94: Proceedings of the conference on
Communications architectures, protocols and applications, pages 212–225, New York, NY,
USA, 1994. ACM.
[11] Sanjit Biswas and Robert Morris. Exor: opportunistic multi-hop routing for wireless net-
works. In SIGCOMM ’05: Proceedings of the 2005 conference on Applications, technologies,
architectures, and protocols for computer communications, pages 133–144, New York, NY,
USA, 2005. ACM.
[12] Michael Buettner, Gary V. Yee, Eric Anderson, and Richard Han. X-mac: a short preamble
mac protocol for duty-cycled wireless sensor networks. In SenSys ’06: Proceedings of the
4th international conference on Embedded networked sensor systems, pages 307–320, New
York, NY, USA, 2006. ACM.
[13] Phil Buonadonna, Joseph Hellerstein, Wei Hong, David Gay, and Samuel Madden. TASK:
Sensor Network in a Box. In EWSN ’05: Proceedings of the Second European Workshop
on Wireless Sensor Networks, 2005, pages 133–144, Istanbul, Turkey, January 2005.
[14] Nicolas Burri, Pascal von Rickenbach, and Roger Wattenhofer. Dozer: ultra-low power data
gathering in sensor networks. In IPSN ’07: Proceedings of the 6th international conference
on Information processing in sensor networks, pages 450–459. ACM, 2007.
[15] Bogdan Carbunar, Ananth Grama, Jan Vitek, and Octavian Carbunar. Redundancy and
coverage detection in sensor networks. ACM Transactions on Sensor Networks, 2(1):94–128,
2006.
[16] Alberto Cerpa and Deborah Estrin. ASCENT: Adaptive Self-Configuring sEnsor Networks
Topologies. IEEE Transactions on Mobile Computing, 3(3):272–285, 2004.
[17] Jae-Hwan Chang and Leandros Tassiulas. Energy Conserving Routing in Wireless Ad-hoc
Networks. In INFOCOM ’00: Proceedings of Nineteenth Annual Joint Conference of the
IEEE Computer and Communications Societies., pages 22–31, 2000.
[18] Benjie Chen, Kyle Jamieson, Hari Balakrishnan, and Robert Morris. Span: An Energy-
Efficient Coordination Algorithm for Topology Maintenance in Ad Hoc Wireless Networks.
In MobiCom ’01: Proceedings of the 7th ACM International Conference on Mobile Comput-
ing and Networking, pages 85–96, Rome, Italy, July 2001.
[19] Kwan-Wu Chin, John Judge, Aidan Williams, and Roger Kermode. Implementation ex-
perience with manet routing protocols. SIGCOMM Computer Communication Review,
32(5):49–59, 2002.
[20] Crossbow Technology. Mica2dot Datasheet. http://www.xbow.com/products/Product_pdf_files/Wireless_pdf/MICA2DOT_Datasheet.pdf, 2006.
[21] Crossbow Technology. MicaZ Datasheet. http://www.xbow.com/Products/Product_pdf_files/Wireless_pdf/MICAz_Kit_Datasheet.pdf, 2006.
[22] Crossbow Technology. Mica2 Datasheet. http://www.xbow.com/products/Product_pdf_files/Wireless_pdf/MICA2_Datasheet.pdf, 2009.
[23] Douglas S. J. De Couto, Daniel Aguayo, John Bicket, and Robert Morris. A high-throughput
path metric for multi-hop wireless routing. In MobiCom ’03: Proceedings of the 9th annual
international conference on Mobile computing and networking, pages 134–146, New York,
NY, USA, 2003. ACM.
[24] Douglas S. J. De Couto, Daniel Aguayo, Benjamin A. Chambers, and Robert Morris. Per-
formance of multihop wireless networks: Shortest path is not enough. In Proceedings of the
First Workshop on Hot Topics in Networks (HotNets-I), Princeton, New Jersey, October
2002. ACM SIGCOMM.
[25] Richard Draves, Jitendra Padhye, and Brian Zill. Comparison of routing metrics for static
multi-hop wireless networks. In SIGCOMM ’04: Proceedings of the 2004 conference on
Applications, technologies, architectures, and protocols for computer communications, pages
133–144, New York, NY, USA, 2004. ACM.
[26] R. Dube, C. D. Rais, Kuang-Yeh Wang, and S. K. Tripathi. Signal stability-based adaptive
routing (ssa) for ad hoc mobile networks. IEEE Personal Communications, 4(1):36–45,
1997.
[27] Prabal Dutta, Jonathan Hui, Jaein Jeong, Sukun Kim, Cory Sharp, Jay Taneja, Gilman
Tolle, Kamin Whitehouse, and David Culler. Trio: enabling sustainable and scalable outdoor
wireless sensor network deployments. In IPSN ’06: Proceedings of the 5th international
conference on Information processing in sensor networks, pages 407–415, New York, NY,
USA, 2006. ACM.
[28] Cheng Tien Ee and Ruzena Bajcsy. Congestion control and fairness for many-to-one routing
in sensor networks. In SenSys ’04: Proceedings of the 2nd international conference on
Embedded networked sensor systems, pages 148–161, New York, NY, USA, 2004. ACM.
[29] Cheng Tien Ee, Rodrigo Fonseca, Sukun Kim, Daekyeong Moon, Arsalan Tavakoli, David
Culler, Scott Shenker, and Ion Stoica. A modular network layer for sensornets. In OSDI ’06:
Proceedings of the 7th conference on USENIX Symposium on Operating Systems Design and
Implementation, pages 249–262. USENIX Association, 2006.
[30] Cheng Tien Ee, Sylvia Ratnasamy, and Scott Shenker. Practical data-centric storage. In
NSDI’06: Proceedings of the 3rd conference on Networked Systems Design & Implementa-
tion, pages 325–338, Berkeley, CA, USA, 2006. USENIX Association.
[31] Amre El-Hoiyi, J.-D. Decotignie, and J. Hernandez. Low power MAC protocols for
infrastructure wireless sensor networks. In Proceedings of the Fifth European Wireless Conference,
Barcelona, Spain, February 2004.
[32] Rodrigo Fonseca, Omprakash Gnawali, Kyle Jamieson, and Philip Levis. Four Bit Wire-
less Link Estimation. In Hotnets-VI: Proceedings of the Sixth Workshop on Hot Topics in
Networks, Atlanta, GA, November 2007.
[33] Rodrigo Fonseca, Sylvia Ratnasamy, Jerry Zhao, Cheng Tien Ee, David Culler, Scott
Shenker, and Ion Stoica. Beacon vector routing: scalable point-to-point routing in wireless
sensornets. In NSDI’05: Proceedings of the 2nd conference on Symposium on Networked
Systems Design & Implementation, pages 329–342, Berkeley, CA, USA, 2005. USENIX As-
sociation.
[34] Lewis Girod, Thanos Stathopoulos, Nithya Ramanathan, Jeremy Elson, Deborah Estrin,
Eric Osterweil, and Tom Schoellhammer. A system for simulation, emulation, and deploy-
ment of heterogeneous sensor networks. In SenSys ’04: Proceedings of the 2nd international
conference on Embedded networked sensor systems, pages 201–213, New York, NY, USA,
2004. ACM.
[35] Omprakash Gnawali, Rodrigo Fonseca, Kyle Jamieson, David Moss, and Philip Levis.
Collection tree protocol. In SenSys ’09: Proceedings of the 7th international conference on
Embedded networked sensor systems, Berkeley, CA, USA, 2009. ACM.
[36] Omprakash Gnawali, Ki-Young Jang, Jeongyeup Paek, Marcos Vieira, Ramesh Govindan,
Ben Greenstein, August Joki, Deborah Estrin, and Eddie Kohler. The tenet architecture
for tiered sensor networks. In SenSys ’06: Proceedings of the 4th international conference
on Embedded networked sensor systems, pages 153–166, New York, NY, USA, 2006. ACM.
[37] Omprakash Gnawali, Jongkeun Na, and Ramesh Govindan. Application-Informed Radio
Duty-Cycling in a Re-Taskable Multi-User Sensing System. In IPSN ’09: Proceedings of the
8th international conference on Information processing in sensor networks, pages 145–156,
San Francisco, CA, USA, 2009. ACM.
[38] Omprakash Gnawali, Mark Yarvis, John Heidemann, and Ramesh Govindan. Interaction of
Retransmission, Blacklisting, and Routing Metrics for Reliability in Sensor Network Rout-
ing. In SECON ’04: Proceedings of the First IEEE Conference on Sensor and Adhoc Com-
munication and Networks, pages 34–43, Santa Clara, CA, USA, October 2004. IEEE.
[39] Richard Guy, Ben Greenstein, John Hicks, Rahul Kapur, Nithya Ramanathan, Tom Schoellhammer,
Thanos Stathopoulos, Karen Weeks, Kevin Chang, Lew Girod, and Deborah Estrin.
Experiences with the Extensible Sensing System ESS. CENS Technical Report 61,
March 29 2006.
[40] Tian He, John A. Stankovic, Chenyang Lu, and Tarek Abdelzaher. Speed: A stateless
protocol for real-time communication in sensor networks. In ICDCS ’03: Proceedings of the
23rd International Conference on Distributed Computing Systems, pages 46–55, Washington,
DC, USA, May 2003. IEEE Computer Society.
[41] John Heidemann, Fabio Silva, and Deborah Estrin. Matching data dissemination algorithms
to application requirements. In SenSys ’03: Proceedings of the 1st international conference
on Embedded networked sensor systems, pages 218–229, New York, NY, USA, 2003. ACM.
[42] John Hicks, Jeongyeup Paek, Sharon Coe, Ramesh Govindan, and Deborah Estrin. An
Easily Deployable Wireless Imaging System. In ImageSense ’08: Proceedings of Workshop
on Applications, Systems, and Algorithms for Image Sensing, 2008.
[43] Jason L. Hill and David E. Culler. Mica: A Wireless Platform for Deeply Embedded
Networks. IEEE Micro, 22(6):12–24, 2002.
[44] Barbara Hohlt, Lance Doherty, and Eric Brewer. Flexible power scheduling for sensor
networks. In IPSN ’04: Proceedings of the 3rd international symposium on Information
processing in sensor networks, pages 205–214, New York, NY, USA, 2004. ACM.
[45] Bret Hull, Kyle Jamieson, and Hari Balakrishnan. Mitigating congestion in wireless sensor
networks. In SenSys ’04: Proceedings of the 2nd international conference on Embedded
networked sensor systems, pages 134–147, New York, NY, USA, 2004. ACM.
[46] Texas Instruments. CC2420 Data Sheet. http://focus.ti.com/lit/ds/symlink/cc2420.pdf, 2008.
[47] Texas Instruments. CC1000 Data Sheet. http://focus.ti.com/lit/ds/symlink/cc1000.pdf, 2009.
[48] Texas Instruments. CC1100 Data Sheet. http://focus.ti.com/lit/ds/symlink/cc1100.pdf, 2009.
[49] Chalermek Intanagonwiwat, Ramesh Govindan, Deborah Estrin, John Heidemann, and
Fabio Silva. Directed diffusion for wireless sensor networking. ACM/IEEE Transactions
on Networking, 11(1):2–16, February 2002.
[50] Xiaofan Jiang, Joseph Polastre, and David Culler. Perpetual environmentally powered sen-
sor networks. In IPSN ’05: Proceedings of the 4th international symposium on Information
processing in sensor networks, page 65, Piscataway, NJ, USA, 2005. IEEE Press.
[51] Philo Juang, Hidekazu Oki, Yong Wang, Margaret Martonosi, Li Shiuan Peh, and Daniel
Rubenstein. Energy-efficient computing for wildlife tracking: design tradeoffs and early
experiences with zebranet. SIGARCH Computer Architecture News, 30(5):96–107, 2002.
[52] Brad Karp and H. T. Kung. Gpsr: greedy perimeter stateless routing for wireless net-
works. In MobiCom ’00: Proceedings of the 6th annual international conference on Mobile
computing and networking, pages 243–254, New York, NY, USA, 2000. ACM.
[53] Sachin Katti, Hariharan Rahul, Wenjun Hu, Dina Katabi, Muriel Médard, and Jon
Crowcroft. Xors in the air: practical wireless network coding. In SIGCOMM ’06: Pro-
ceedings of the 2006 conference on Applications, technologies, architectures, and protocols
for computer communications, pages 243–254, New York, NY, USA, 2006. ACM.
[54] Kyu-Han Kim and Kang G. Shin. On accurate measurement of link quality in multi-hop
wireless mesh networks. In MobiCom ’06: Proceedings of the 12th annual international
conference on Mobile computing and networking, pages 38–49, New York, NY, USA, 2006.
ACM.
[55] Minkyong Kim and Brian Noble. Mobile network estimation. In MobiCom ’01: Proceedings
of the 7th annual international conference on Mobile computing and networking, pages 298–
309, New York, NY, USA, 2001. ACM.
[56] Sukun Kim, Rodrigo Fonseca, Prabal Dutta, Arsalan Tavakoli, David Culler, Philip Levis,
Scott Shenker, and Ion Stoica. Flush: a reliable bulk transport protocol for multihop wireless
networks. In SenSys ’07: Proceedings of the 5th international conference on Embedded
networked sensor systems, pages 351–365, New York, NY, USA, 2007. ACM.
[57] Sukun Kim, Shamim Pakzad, David Culler, James Demmel, Gregory Fenves, Steven Glaser,
and Martin Turon. Health monitoring of civil infrastructures using wireless sensor networks.
In IPSN ’07: Proceedings of the 6th international conference on Information processing in
sensor networks, pages 254–263, New York, NY, USA, 2007. ACM.
[58] Young-Jin Kim, Ramesh Govindan, Brad Karp, and Scott Shenker. Geographic routing
made practical. In NSDI’05: Proceedings of the 2nd conference on Symposium on Networked
Systems Design & Implementation, pages 217–230, Berkeley, CA, USA, 2005. USENIX
Association.
[59] Kevin Klues, Vlado Handziski, Chenyang Lu, Adam Wolisz, David Culler, David Gay, and
Philip Levis. Integrating concurrency control and energy management in device drivers.
In SOSP ’07: Proceedings of twenty-first ACM SIGOPS symposium on Operating systems
principles, pages 251–264, New York, NY, USA, 2007. ACM.
[60] Robin Kravets and P. Krishnan. Application-driven power management for mobile commu-
nication. Wireless Networks, 6(4):263–277, 2000.
[61] K.G. Langendoen, A. Baggio, and O.W. Visser. Murphy loves potatoes: Experiences from
a pilot sensor network deployment in precision agriculture. In WPDRTS ’06: Proceedings
of the 14th International Workshop on Parallel and Distributed Real-Time Systems, pages
1–8, April 2006.
[62] Seungjoon Lee, Bobby Bhattacharjee, and Suman Banerjee. Efficient geographic routing
in multihop wireless networks. In MobiHoc ’05: Proceedings of the 6th ACM international
symposium on Mobile ad hoc networking and computing, pages 230–241, New York, NY,
USA, 2005. ACM.
[63] J.P. Lehoczky. Fixed priority scheduling of periodic task sets with arbitrary deadlines. In
RTSS ’90: Proceedings of the 11th IEEE Real-Time Systems Symposium, pages 201–209,
Dec 1990.
[64] Ben Leong, Barbara Liskov, and Robert Morris. Geographic routing without planarization.
In NSDI’06: Proceedings of the 3rd conference on Networked Systems Design & Implemen-
tation, pages 25–25, Berkeley, CA, USA, 2006. USENIX Association.
[65] Philip Levis and David Culler. Maté: a tiny virtual machine for sensor networks. In
ASPLOS-X: Proceedings of the 10th international conference on Architectural support for
programming languages and operating systems, pages 85–95, New York, NY, USA, 2002.
ACM.
[66] Philip Levis, David Gay, Vlado Handziski, Jan-Hinrich Hauer, Ben Greenstein, Martin
Turon, Jonathan Hui, Kevin Klues, Cory Sharp, Robert Szewczyk, Joe Polastre, Philip
Buonadonna, Lama Nachman, Gilman Tolle, David Culler, and Adam Wolisz. T2: A
Second Generation OS For Embedded Sensor Networks. Technical Report TKN-05-007,
Telecommunication Networks Group, Technische Universität Berlin, November 2005.
[67] Philip Levis, Neil Patel, David Culler, and Scott Shenker. Trickle: a self-regulating algorithm
for code propagation and maintenance in wireless sensor networks. In NSDI’04: Proceedings
of the 1st conference on Symposium on Networked Systems Design and Implementation,
pages 2–2, Berkeley, CA, USA, 2004. USENIX Association.
[68] Jinyang Li, Charles Blake, Douglas S. J. De Couto, Hu Imm Lee, and Robert Morris.
Capacity of ad hoc wireless networks. In MobiCom ’01: Proceedings of the 7th annual international
conference on Mobile computing and networking, pages 61–69, New York, NY, USA, 2001.
ACM.
[69] Yuan Li, Wei Ye, and John Heidemann. Energy and latency control in low duty cycle
MAC protocols. In WCNC ’05: Proceedings of the IEEE Wireless Communications and
Networking Conference, New Orleans, LA, USA, March 2005.
[70] Hang Liu, Hairuo Ma, Magda El Zarki, and Sanjay Gupta. Error control schemes for
networks: An overview. Mobile Networks and Applications, 2(2):167–182, 1997.
[71] Chenyang Lu, B.M. Blum, T.F. Abdelzaher, J.A. Stankovic, and Tian He. Rap: a real-time
communication architecture for large-scale wireless sensor networks. In RTAS ’02: Proceed-
ings of the Eighth IEEE Real-Time and Embedded Technology and Applications Symposium,
pages 55–66, San Jose, CA, 2002.
[72] Gang Lu, Bhaskar Krishnamachari, and Cauligi S. Raghavendra. An adaptive energy-
efficient and low-latency mac for data gathering in wireless sensor networks. Parallel and
Distributed Processing Symposium, International, 13:224a, 2004.
[73] Henrik Lundgren, Erik Nordström, and Christian Tschudin. Coping with communication gray
zones in ieee 802.11b based ad hoc networks. In WOWMOM ’02: Proceedings of the 5th
ACM international workshop on Wireless mobile multimedia, pages 49–55, New York, NY,
USA, 2002. ACM.
[74] Samuel Madden, Michael J. Franklin, Joseph M. Hellerstein, and Wei Hong. Tag: a tiny
aggregation service for ad-hoc sensor networks. In OSDI ’02: Proceedings of the 5th sym-
posium on Operating systems design and implementation, pages 131–146, New York, NY,
USA, 2002. ACM.
[75] Ratul Mahajan, Maya Rodrig, David Wetherall, and John Zahorjan. Analyzing the mac-
level behavior of wireless networks in the wild. In SIGCOMM ’06: Proceedings of the
2006 conference on Applications, technologies, architectures, and protocols for computer
communications, pages 75–86, New York, NY, USA, 2006. ACM.
[76] Alan Mainwaring, David Culler, Joseph Polastre, Robert Szewczyk, and John Anderson.
Wireless sensor networks for habitat monitoring. In WSNA ’02: Proceedings of the 1st
ACM international workshop on Wireless sensor networks and applications, pages 88–97,
New York, NY, USA, 2002. ACM.
[77] Y Mao, F Wang, L Qiu, S Lam, and J Smith. S4: Small State and Small Stretch Routing
Protocol for Large Wireless Sensor Networks. In NSDI ’07: Proceedings of the 4th USENIX
Symposium on Networked Systems Design and Implementation, pages 101–114, 2007.
[78] Miklós Maróti, Branislav Kusy, Gyula Simon, and Ákos Lédeczi. The flooding time
synchronization protocol. In SenSys ’04: Proceedings of the 2nd international conference on
Embedded networked sensor systems, pages 39–49, New York, NY, USA, 2004. ACM.
[79] David Moss and Philip Levis. BoX-MACs: Exploiting Physical and Link Layer Boundaries
in Low-Power Networking. Stanford Information Networks Group Technical Report SING-
08-00, 2008.
[80] Razvan Musaloiu-E., Chieh-Jan Mike Liang, and Andreas Terzis. Koala: Ultra-low power
data retrieval in wireless sensor networks. In IPSN ’08: Proceedings of the 7th international
conference on Information processing in sensor networks, pages 421–432, Washington, DC,
USA, 2008. IEEE Computer Society.
[81] Jeongyeup Paek and Ramesh Govindan. Rcrt: rate-controlled reliable transport for wireless
sensor networks. In SenSys ’07: Proceedings of the 5th international conference on Embedded
networked sensor systems, pages 305–319, New York, NY, USA, 2007. ACM.
[82] Jeongyeup Paek, Omprakash Gnawali, Ki-Young Jang, Daniel Nishimura, Ramesh Govindan,
John Caffrey, Mazen Wahbeh, and Sami Masri. A Programmable Wireless Sensing System
for Structural Monitoring. In WCSCM ’06: Proceedings of the 4th World Conference on
Structural Control and Monitoring, San Diego, CA, July 2006.
[83] Dan Pei, Xiaoliang Zhao, Dan Massey, and Lixia Zhang. A study of bgp path vector
route looping behavior. In ICDCS ’04: Proceedings of the 24th International Conference
on Distributed Computing Systems, pages 720–729, Washington, DC, USA, 2004. IEEE
Computer Society.
[84] Trevor Pering, Vijay Raghunathan, and Roy Want. Exploiting radio hierarchies for power-
efficient wireless device discovery and connection setup. In VLSID ’05: Proceedings of the
18th International Conference on VLSI Design held jointly with 4th International Confer-
ence on Embedded Systems Design, pages 774–779, Washington, DC, USA, 2005. IEEE
Computer Society.
[85] Charles E. Perkins and Pravin Bhagwat. Highly dynamic destination-sequenced
distance-vector routing (dsdv) for mobile computers. In SIGCOMM ’94: Proceedings of the conference
on Communications architectures, protocols and applications, pages 234–244, New York, NY,
USA, 1994. ACM.
[86] Joseph Polastre, Jason Hill, and David Culler. Versatile low power media access for wire-
less sensor networks. In SenSys ’04: Proceedings of the 2nd international conference on
Embedded networked sensor systems, pages 95–107, New York, NY, USA, 2004. ACM.
[87] Joseph Polastre, Robert Szewczyk, and David Culler. Telos: enabling ultra-low power wire-
less research. In IPSN ’05: Proceedings of the 4th international symposium on Information
processing in sensor networks, page 48, Piscataway, NJ, USA, 2005. IEEE Press.
[88] Nithya Ramanathan, Mark Yarvis, Jasmeet Chhabra, Nandakishore Kushalnagar, Lakshman
Krishnamurthy, and Deborah Estrin. A Stream-Oriented Power Management Protocol
for Low Duty Cycle Sensor Network Applications. In EmNetS ’05: Proceedings of the IEEE
Workshop on Embedded Networked Sensors, pages 53–61, Sydney, Australia, May 2005.
IEEE Computer Society.
[89] Sumit Rangwala, Ramakrishna Gummadi, Ramesh Govindan, and Konstantinos Psounis.
Interference-aware fair rate control in wireless sensor networks. In SIGCOMM ’06: Pro-
ceedings of the 2006 conference on Applications, technologies, architectures, and protocols
for computer communications, pages 63–74, New York, NY, USA, 2006. ACM.
[90] Yogesh Sankarasubramaniam, Özgür B. Akan, and Ian F. Akyildiz. Esrt: event-to-sink
reliable transport in wireless sensor networks. In MobiHoc ’03: Proceedings of the 4th ACM
international symposium on Mobile ad hoc networking & computing, pages 177–188, New
York, NY, USA, 2003. ACM.
[91] Thomas Schoellhammer, Ben Greenstein, and Deborah Estrin. Hyper: A routing protocol
to support mobile users of sensor networks. Technical Report 2013, CENS, 2006.
[92] Curt Schurgers, Vlasios Tsiatsis, Saurabh Ganeriwal, and Mani Srivastava. Topology man-
agement for sensor networks: exploiting latency and density. In MobiHoc ’02: Proceedings
of the 3rd ACM international symposium on Mobile ad hoc networking & computing, pages
135–145, New York, NY, USA, 2002. ACM.
[93] Eugene Shih, Paramvir Bahl, and Michael J. Sinclair. Wake on wireless: an event driven
energy saving strategy for battery operated devices. In MobiCom ’02: Proceedings of the
8th annual international conference on Mobile computing and networking, pages 160–171,
New York, NY, USA, 2002. ACM.
[94] Suresh Singh and C. S. Raghavendra. PAMAS: power aware multi-access protocol with
signalling for ad hoc networks. SIGCOMM Computer Communication Review, 28(3):5–26,
1998.
[95] K. Sohrabi and G. J. Pottie. Performance of a novel self-organization protocol for wireless
ad-hoc sensor networks. In VTC ’99: Proceedings of the IEEE Vehicular Technology Conference,
volume 2, pages 1222–1226, 1999.
[96] TinyOS source. The MultiHopLQI protocol. http://www.tinyos.net/tinyos-2.x/tos/lib/net/lqi,
2009.
[97] Kannan Srinivasan, Prabal Dutta, Arsalan Tavakoli, and Philip Levis. Some implications of
low power wireless to IP networking. In HotNets-V: Proceedings of the Fifth Workshop
on Hot Topics in Networks, November 2006.
[98] Kannan Srinivasan, Maria A. Kazandjieva, Saatvik Agarwal, and Philip Levis. The beta-
factor: measuring wireless link burstiness. In SenSys ’08: Proceedings of the 6th ACM
conference on Embedded network sensor systems, pages 29–42, New York, NY, USA, 2008.
ACM.
[99] Fred Stann and John Heidemann. Rmst: reliable data transport in sensor networks. In SNPA
’03: Proceedings of the First IEEE International Workshop on Sensor Network Protocols
and Applications, pages 102–112, May 2003.
[100] Yanjun Sun, Omer Gurewitz, and David B. Johnson. Ri-mac: a receiver-initiated asyn-
chronous duty cycle mac protocol for dynamic traffic loads in wireless sensor networks. In
SenSys ’08: Proceedings of the 6th ACM conference on Embedded network sensor systems,
pages 1–14, New York, NY, USA, 2008. ACM.
[101] Robert Szewczyk, Joseph Polastre, Alan Mainwaring, and David Culler. Lessons From A
Sensor Network Expedition. In EWSN ’04: Proceedings of the First European Workshop on
Wireless Sensor Networks, pages 307–322, Istanbul, Turkey, January 2004.
[102] Crossbow Technology. Stargate platform.
http://platformx.sourceforge.net/Documents/manuals/6020-0049-02_A_Stargate.pdf, 2009.
[103] David L. Tennenhouse and David J. Wetherall. Towards an Active Network Architecture.
Computer Communication Review, 26(2):5–17, April 1996.
[104] Gilman Tolle and David Culler. Design of an application-cooperative management system
for wireless sensor networks. In EWSN ’05: Proceedings of the Second European Workshop
on Wireless Sensor Networks, 2005.
[105] Gilman Tolle, Joseph Polastre, Robert Szewczyk, David Culler, Neil Turner, Kevin Tu,
Stephen Burgess, Todd Dawson, Phil Buonadonna, David Gay, and Wei Hong. A macroscope
in the redwoods. In SenSys ’05: Proceedings of the 3rd international conference on Embedded
networked sensor systems, pages 51–63, New York, NY, USA, 2005. ACM.
[106] Tijs van Dam and Koen Langendoen. An adaptive energy-efficient mac protocol for wireless
sensor networks. In SenSys ’03: Proceedings of the 1st international conference on Embedded
networked sensor systems, pages 171–180, New York, NY, USA, 2003. ACM.
[107] Chieh-Yih Wan, Andrew T. Campbell, and Lakshman Krishnamurthy. Psfq: a reliable
transport protocol for wireless sensor networks. In WSNA ’02: Proceedings of the 1st ACM
international workshop on Wireless sensor networks and applications, pages 1–11, New York,
NY, USA, 2002. ACM.
[108] Chieh-Yih Wan, Shane B. Eisenman, and Andrew T. Campbell. Coda: congestion detection
and avoidance in sensor networks. In SenSys ’03: Proceedings of the 1st international
conference on Embedded networked sensor systems, pages 266–279, New York, NY, USA,
2003. ACM.
[109] Chieh-Yih Wan, Shane B. Eisenman, Andrew T. Campbell, and Jon Crowcroft. Siphon:
overload traffic management using multi-radio virtual sinks in sensor networks. In SenSys
’05: Proceedings of the 3rd international conference on Embedded networked sensor systems,
pages 116–129, New York, NY, USA, 2005. ACM.
[110] Geoff Werner-Allen, Konrad Lorincz, Jeff Johnson, Jonathan Lees, and Matt Welsh. Fidelity
and yield in a volcano monitoring sensor network. In OSDI ’06: Proceedings of the 7th
symposium on Operating systems design and implementation, pages 381–396, Berkeley, CA,
USA, 2006. USENIX Association.
[111] Kai-Juan Wong and D. K. Arvind. Speckmac: low-power decentralised mac protocols for
low data rate transmissions in specknets. In REALMAN ’06: Proceedings of the 2nd inter-
national workshop on Multi-hop ad hoc networks: from theory to reality, pages 71–78, New
York, NY, USA, 2006. ACM.
[112] Alec Woo and David Culler. Evaluation of Efficient Link Reliability Estimators for Low-
Power Wireless Networks. Technical Report CSD-03-1270, UC Berkeley, May 2004.
[113] Alec Woo and David E. Culler. A transmission control scheme for media access in sensor
networks. In MobiCom ’01: Proceedings of the 7th annual international conference on Mobile
computing and networking, pages 221–235, New York, NY, USA, 2001. ACM.
[114] Alec Woo, Terence Tong, and David Culler. Taming the underlying challenges of reliable
multihop routing in sensor networks. In SenSys ’03: Proceedings of the 1st international
conference on Embedded networked sensor systems, pages 14–27, New York, NY, USA, 2003.
ACM.
[115] Ning Xu, Sumit Rangwala, Krishna Kant Chintalapudi, Deepak Ganesan, Alan Broad,
Ramesh Govindan, and Deborah Estrin. A wireless sensor network for structural monitoring.
In SenSys ’04: Proceedings of the 2nd international conference on Embedded networked
sensor systems, pages 13–24, New York, NY, USA, 2004. ACM.
[116] Ya Xu, John Heidemann, and Deborah Estrin. Geography-informed energy conservation for
ad hoc routing. In MobiCom ’01: Proceedings of the 7th annual international conference on
Mobile computing and networking, pages 70–84, New York, NY, USA, 2001. ACM.
[117] Mark D. Yarvis, W. Steven Conner, Lakshman Krishnamurthy, Alan Mainwaring, Jasmeet
Chhabra, and Brent Elliott. Real-world experiences with an interactive ad hoc sensor net-
work. In IWAHN ’02: Proceedings of the International Workshop on Ad Hoc Networking,
pages 143–151, August 2002.
[118] Wei Ye, John Heidemann, and Deborah Estrin. Medium Access Control with Coordinated,
Adaptive Sleeping for Wireless Sensor Networks. ACM/IEEE Transactions on Networking,
12(3):493–506, June 2004.
[119] Wei Ye, Fabio Silva, and John Heidemann. Ultra-Low Duty Cycle MAC with Scheduled
Channel Polling. In Proceedings of the Fourth ACM SenSys Conference, Boulder, Colorado,
USA, November 2006. ACM.
[120] M. Younis, M. Youssef, and K. Arisha. Energy-aware routing in cluster-based sensor
networks. In MASCOTS ’02: Proceedings of the 10th IEEE International Symposium on
Modeling, Analysis and Simulation of Computer and Telecommunications System, pages 129–136,
October 2002.
[121] H. Zhang, A. Arora, and P. Sinha. Learn on the fly: Data-driven link estimation and routing
in sensor network backbones. In INFOCOM ’06: Proceedings of the 25th IEEE International
Conference on Computer Communications, pages 1–12, April 2006.
[122] Jerry Zhao and Ramesh Govindan. Understanding packet delivery performance in dense
wireless sensor networks. In SenSys ’03: Proceedings of the 1st international conference on
Embedded networked sensor systems, pages 1–13, New York, NY, USA, 2003. ACM.
[123] Lin Zhong and Niraj K. Jha. Energy efficiency of handheld computer interfaces: limits,
characterization and practice. In MobiSys ’05: Proceedings of the 3rd international confer-
ence on Mobile systems, applications, and services, pages 247–260, New York, NY, USA,
2005. ACM.
[124] Marco Zuniga and Bhaskar Krishnamachari. An Analysis of Unreliability and Asymmetry
in Low-Power Wireless Links. Transactions on Sensor Networks, 3(2), 2007.
Asset Metadata
Creator
Gnawali, Om Prakash Dev (author)
Core Title
Robust routing and energy management in wireless sensor networks
School
Viterbi School of Engineering
Degree
Doctor of Philosophy
Degree Program
Computer Science
Publication Date
08/27/2009
Defense Date
07/07/2009
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
application-informed energy management,collection,collection tree protocol,duty cycling,energy efficient,OAI-PMH Harvest,routing,sensor networks
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Govindan, Ramesh (committee chair), Heidemann, John (committee member), Krishnamachari, Bhaskar (committee member)
Creator Email
gnawali@usc.edu,om_p@enl.usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-m2577
Unique identifier
UC1173293
Identifier
etd-Gnawali-3216 (filename),usctheses-m40 (legacy collection record id),usctheses-c127-260009 (legacy record id),usctheses-m2577 (legacy record id)
Legacy Identifier
etd-Gnawali-3216.pdf
Dmrecord
260009
Document Type
Dissertation
Rights
Gnawali, Om Prakash Dev
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Repository Name
Libraries, University of Southern California
Repository Location
Los Angeles, California
Repository Email
cisadmin@lib.usc.edu