EXPLOITING DIVERSITY WITH ONLINE LEARNING IN THE INTERNET
OF THINGS
by
Pedro Henrique Gomes da Silva
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(ELECTRICAL ENGINEERING)
December 2019
Copyright 2019 Pedro Henrique Gomes da Silva
Acknowledgements
This work was only possible with the help of many people. I would like to express
my gratitude and appreciation for all the help and support received.
First I would like to thank my advisor Prof. Bhaskar Krishnamachari for the
great help and guidance provided throughout my studies. I am sure that I was lucky
to have Prof. Bhaskar as a tutor and that without his help my work would have
been much harder and uncertain. In addition to the great academic mentorship,
he also always guides his students on how to navigate the challenges of the
doctorate without losing focus and perseverance. I would also like to thank Prof.
Ramesh Govindan, Prof. John Silvester, Prof. Cauligi Raghavendra and Prof.
Murali Annavaram for serving on my qualifying exam committee. I also had a
great experience being a teaching assistant for over two years and would like to
thank Prof. Ethan Katz-Bassett, Prof. Mark Redekopp and Prof. Allan Weber for
helping me to improve my teaching skills.
I was also helped by the EE staff and would like to thank Diane Demetras, Tim
Boston and Shane Goodo for solving bureaucratic issues that popped up along the
way of my academic life at USC. I would also like to acknowledge the Annenberg
Fellowship Program and the Electrical Engineering department for financial support
during my studies.
I want to thank Dr. Thomas Watteyne for the wonderful time I spent in INRIA
Paris and all the feedback he provided during the development of the work on this
dissertation. Special thanks also to Dr. Tengfei Chang for help during OpenWSN
implementation and collaboration in some papers. During the period I spent in
INRIA I was able to interact with many talented researchers that I would like to
recognize here: Keoma, Jonathan, Ziran, Mališa, and Remy.
During my time doing my research in the Autonomous Networks Research
Group I received support from many colleagues. They made my academic ex-
perience more complete with true companionship during the easy and tough mo-
ments. I would like to thank Shanxing, Jason, Pradipta, Kwame, Suvil, Martin,
Nachikethas, Quynh, Pranav, Keyvan, Griey, Parisa, Yanting, and others that
I had the pleasure to work with in the lab. Special thanks also to my Brazilian
fellows Prof. Jó Ueyama and Bruno Oliveira.
I would also like to thank my current employer Ericsson, especially my manager
Björn Johannisson and the research team at Ericsson Research Brazil, who
encouraged and helped me to finalize this dissertation.
Last but not least, I owe my deep gratitude to my beloved family, my mom,
dad, and sister. Without your help and understanding, I could not have gone so
far. And I would like to thank my love Cintia who recently reappeared in my life
and brought new meaning to it; your presence meant a lot to me and helped me in
the final steps of this dissertation.
Table of Contents
Acknowledgements
List Of Figures
List Of Tables
Abstract
Chapter 1: Introduction
1.1 Background
1.1.1 Wireless Sensor Networks
1.1.2 The Internet of Things (IoT)
1.1.3 The Industrial IoT (IIoT)
1.2 State-of-the-art and challenges for the (Industrial) Internet of Things
1.3 Timeslotted channel hopping (TSCH) protocol
1.4 Summary of contributions and publications
Chapter 2: Measuring diversity in TSCH networks
2.1 Related work
2.2 Testbeds
2.2.1 Methodology for the IoT testbeds
2.3 Results from the datasets
2.3.1 Packet delivery ratio (PDR) for different channels
2.3.2 Distribution of the number of channels with a high PDR
2.3.3 Average number of neighbors per channel
2.3.4 PDR versus RSSI - the "waterfall plot"
2.4 Lessons learned
2.5 Conclusions
Chapter 3: Flooding-based reliable TSCH
3.1 Related work
3.2 Routing protocol for low-power and lossy networks (RPL)
3.3 FBR-TSCH
3.4 Simulation results
3.5 Experimental results
3.6 Lessons learned
3.7 Conclusions
Chapter 4: Multi-hop and blacklist-based optimized TSCH
4.1 Related work
4.2 MABO-TSCH
4.2.1 Channel offset assignment
4.2.2 Distributed blacklist negotiation
4.2.3 Multi-armed bandit link estimation
4.3 Simulation results
4.3.1 Different types of FHSS
4.3.1.1 Default TSCH
4.3.1.2 Centrally blacklisted TSCH
4.3.1.3 Optimal TSCH
4.3.1.4 First Good Arm MABO-TSCH
4.3.1.5 Best Arm MABO-TSCH
4.3.2 Simulation setup
4.3.3 Tuning the algorithms
4.3.3.1 Evaluating the data collection application
4.3.3.2 Evaluating an event-triggered application
4.4 Experimental results
4.5 Lessons learned
4.6 Conclusions
Chapter 5: Thompson sampling-based multi-channel RPL
5.1 Related work
5.2 RPL DAGrank calculation and preferred parent selection
5.3 TAMU-RPL
5.3.1 Thompson sampling-based ETX estimation
5.3.2 Multi-channel DAGrank calculation
5.3.3 Passive link quality update
5.4 Simulation results
5.4.1 Different RPL implementations
5.4.2 Evaluating end-to-end ETX per node
5.4.3 Evaluating end-to-end ETX over time
5.4.4 Evaluating loop formation
5.4.5 Evaluating the number of packets received at the sink
5.4.6 Evaluating the delay of packets received at the sink
5.4.7 Evaluating the energy consumption
5.5 Experimental results
5.5.1 Details of the implementation
5.5.1.1 Thompson-sampling ETX estimation
5.5.1.2 Passive link quality update
5.5.2 Evaluating TAMU-RPL in a controlled real environment
5.6 Lessons learned
5.7 Conclusions
Chapter 6: Conclusions and future work
6.1 Future work and research directions
Reference List
List Of Figures
1.1 Network layers (left), TCP/IP stack (middle) and reliable WSN stack (right).
1.2 An example of a simplified TSCH schedule with 4 channel offsets and 7 time slots.
1.3 Different proposals in this dissertation.
2.1 IEEE 802.15.4 and IEEE 802.11 channels in the 2.4 GHz band.
2.2 Boxplot for the PDR statistics of all 5 testbeds, including statistics from all the datasets.
2.3 Histogram of the number of channels with PDR > 50% for all links.
2.4 PDR as a function of average RSSI - the "waterfall plot".
2.5 "Water-fall" plot for all 16 channels in the Soda testbed.
2.6 "Water-fall" plot for all 16 channels in the Tutornet testbed.
3.1 Time slot structure (IEEE 802.15.4 TSCH and FBR-TSCH).
3.2 Objective Function versus p with different duty-cycles
3.3 Objective Function versus Duty-cycle with different p
3.4 Delay versus duty-cycle
3.5 Delay versus p
3.6 Reliability versus duty-cycle
3.7 Reliability versus p
3.8 Energy consumption versus duty-cycle
3.9 Energy consumption versus p
3.10 Objective function versus duty-cycle for FBR-TSCH and RPL.
3.11 Comparison between the performance of the proposals in [1, 2, 3, 4, 5, 6, 7].
3.12 A typical RCO frequency drift characteristic. Each curve corresponds to a different input voltage (in Volts). Source: [8].
4.1 MABO-TSCH algorithms.
4.2 Total number of received packets at the sink on the Tutornet testbed.
4.3 Average regret and percentage of optimal channels on the Tutornet testbed.
4.4 Total number of received packets on the Soda and Grenoble testbeds.
4.5 Reliability and average number of retransmissions per received packet on the Tutornet testbed.
4.6 Reliability and average number of retransmissions per successfully received packet on the Soda testbed.
4.7 Reliability and average number of retransmissions per successfully received packet on the Grenoble testbed.
4.8 Total number of received packets at the sink in the testbed-based experiment.
4.9 Total number of received packets at the sink over time.
4.10 Channels used by all the leaf nodes (the light bars are failed transmissions and dark bars successful).
5.1 TAMU-RPL algorithms.
5.2 Example of linear regression with two linear functions and threshold r equal to 90%.
5.3 End-to-end ETX for all the nodes in static networks in 4 different conditions on the Tutornet testbed.
5.4 End-to-end ETX sum for all the nodes on the Tutornet testbed.
5.5 End-to-end ETX sum for all the nodes on the Tutornet, Soda and Grenoble testbeds.
5.6 Number of loops formed in the simulation on the Tutornet testbed.
5.7 Number of packets received at the sink on the Tutornet and Soda testbeds.
5.8 Number of packets received at the sink on the Tutornet and Soda testbeds.
5.9 Delay (number of time slots) per packet received at the sink on the Tutornet and Soda testbeds.
5.10 Energy overhead per packet successfully received at the sink node.
5.11 Physical placement of 5 OpenMote nodes (left) and logic topology (right).
5.12 Number of packets received at the sink node for the Experiment 1. Initially all three relays are on; at 15 minutes Relay 1 is turned off; at 30 minutes Relay 2 is also turned off; and at 45 minutes Relay 1 is turned back on.
5.13 Preferred parent selection over time for the Experiment 1.
5.14 Number of packets received at the sink node for the Experiment 2. A Wi-Fi network that is placed close to Relay 1 starts traffic at 15 minutes and stops traffic at 45 minutes.
5.15 Preferred parent selection over time for the Experiment 2.
5.16 Number of packets received at the sink node for the Experiment 3. A Wi-Fi network is placed close to Relay 1; Wi-Fi traffic starts and LLN unicast traffic stops together at 15 minutes. The LLN unicast data traffic resumes at 30 minutes.
List Of Tables
2.1 Summary of testbeds' features
2.2 Average number of neighbors with stable link (PDR > 50%) for all nodes
Abstract
Over the past 15 years, research in wireless sensor networks has progressed dra-
matically and several wireless technologies can now be found in devices that are
present in our daily lives. It is estimated that more than 11 billion devices are
already connected to the Internet of Things, some of them through wireless sensor
networks comprising constrained devices that run on batteries and have limited
capabilities. The next breakthrough will be the introduction of these technologies
in industrial environments for the interconnection of machinery and automation of
processes. This new scenario is also known as the Industrial Internet of Things
(IIoT). The IIoT will be essential for the success of Industry 4.0, the
vision in which industrial processes leverage cyber-physical systems
to allow machines and humans to interact and create customizable and dynamic
"smart factories".
There have been significant advances in the area of IoT and many of the classic
problems in this field have already been solved. However, the new requirements
of the IIoT have brought up new problems that have yet to be addressed. Such
networks will need to meet more stringent requirements for throughput, reliability,
delay and power consumption. Despite advances in hardware and new standards,
there are still a large number of sensors already operating that must be reused in
the future IIoT. This means that the new protocols and standards for the IIoT
must still be able to use legacy hardware and will have to improve the network
performance by making use of more advanced algorithms in the network stack.
The most widely used IoT radio standard, IEEE 802.15.4, has been enhanced to
provide a scheduling protocol for more robust operations and better use of radio
resources. The improvements made for the IEEE 802.15.4 standard are part of the
Time-Slotted Channel Hopping (TSCH) protocol, which will be the main target of
the proposals in this dissertation.
We set out several different ways to exploit diversity in wireless networks,
which are aimed at improving the performance of the IoT protocols and meeting
the requirements of IIoT while maintaining the compatibility with legacy hardware
and protocols such as TSCH. The strategies improve the use of radio resources by
exploiting time, frequency and spatial diversity. This dissertation initially examines
the variations in the quality of the wireless links in several testbeds, concerning the
three domains previously mentioned, i.e., time, frequency and space. Following
this, we present three solutions: FBR-TSCH, MABO-TSCH and TAMU-RPL.
The Flooding-Based Reliable TSCH protocol (FBR-TSCH) is a protocol for
event-triggered applications that is optimized for the collection of non-deterministic
signals sent by a sensor to a sink node in a dynamic environment with high levels
of external interference. The Multi-hop And Blacklist-based Optimized TSCH
(MABO-TSCH) is a protocol for improving the frequency hopping algorithm. It
employs a distributed blacklist that is optimized for multi-hop networks in envi-
ronments with high levels of external interference and multi-path fading. Finally,
the Thompson sAmpling-based MUlti-channel RPL (TAMU-RPL) is a protocol
to optimize the reactiveness of routing algorithms. It uses a Thompson Sampling
heuristic to make a dynamic parent selection and quickly change the routing tree
in networks with high levels of external interference. Even though the solutions
proposed are focused on specific technologies and protocols, such as TSCH and
RPL, the algorithms introduced are generic enough to be used in other scenarios
where online learning is required to optimize the distributed decision-making and
improve the performance of networks by exploiting different kinds of diversity.
The results show that even with constrained hardware, the networks can cope
with unreliable links caused by high levels of external interference and multi-path
fading, and achieve a performance that meets the requirements of some of the
applications in the IIoT.
Chapter 1
Introduction
1.1 Background
1.1.1 Wireless Sensor Networks
Wireless sensor networks are networks composed of small devices that have sens-
ing, computing, memory, and wireless communication capabilities. These devices
can be used for different applications, but WSNs usually involve the monitoring of
(natural) processes in a given environment [9, 10]. In a traditional WSN, the sensor
nodes are spread out over a large area to monitor the environment, e.g., a forest,
a city or a factory, using sensors capable of measuring temperature, light, concen-
tration of gases, etc. [11]. The data obtained from the environment is transmitted,
through one or more hops, to a central node, called the sink node, where the WSN
is connected to another network, possibly the Internet. Several different aspects
related to wireless communication in WSNs have been investigated over the past
15 years, covering all layers of the protocol stack [12, 13, 14, 15].
WSNs have many peculiar features that require the networking protocols to be
redesigned to overcome their constraints and satisfy the requirements of the ap-
plications. Unlike regular (wired or wireless) networks, the sensor nodes in WSNs
usually run on batteries and, hence, energy consumption must be kept to a min-
imum to ensure longer operation. Moreover, owing to restrictions in processing
and memory resources, most of the tasks have to be offloaded so that they can be
processed by more powerful nodes, such as the sink node or nodes in other net-
works. However, wireless communication also requires energy consumption, which
makes it very important to strike a balance between data communication and the
local processing carried out at the sensor nodes. In addition, WSNs are subject
to a high degree of external interference when working in a shared spectrum, such
as the worldwide Industrial, Scientific and Medical (ISM) band at 2.4 GHz. Since
sensor nodes use simple radios and transmit at low power, the links found in WSNs
are usually unstable and the protocols have to cope with a lack of reliability at
the physical layer. On account of the limitations described above, WSNs are also
known as Low-Power Lossy Networks (LLNs).
The networking stack for WSNs differs from the regular TCP/IP
stack employed in fixed or wireless networks [16]. Many different stacks have been
recommended over the years and new protocols have been standardized. The IEEE
standards are most often adopted at the lower layers, mainly the physical and medium
access control (MAC) layers. IEEE 802.15.4 is the de facto standard for WSNs.
A large number of sensor nodes used for research and commercial products use
radio chips that are compliant with the IEEE 802.15.4 standard. IETF standards (RFCs)
are usually adopted at the higher layers, including the network, transport and application
layers. Different working groups within the IETF have adapted the TCP/IP stack for
constrained nodes in WSNs. In 2007, an IETF working group created the 6LoWPAN
(IPv6 over Low-Power Wireless Personal Area Networks - RFC 4944) protocol [17].
The 6LoWPAN protocol consists of encapsulation and header compression
mechanisms that allow IPv6 packets to be transmitted over IEEE 802.15.4-based
networks. RPL (Routing Protocol for Low-Power and Lossy Networks - RFC 6550)
is the default routing protocol that can work with the 6LoWPAN layer. RPL is
described in more detail in Section 3.2. There is also CoAP (Constrained Application
Protocol - RFC 7252), which is designed to work at the application layer and
provides an HTTP-like protocol for lightweight messages used in WSNs.
WSNs are designed with resource efficiency in mind, but many of the protocols
suffer from unreliability, large delays, and instability in the face of external
interference. More recently, research in WSNs has also addressed the question
of reliability and proposed enhancements that allow these networks to be used in
harsh environments with applications that have strict requirements with regard to
throughput, latency, reliability and energy consumption [18]. New standards have
also been set out that are aimed at making WSNs more robust. The Time Slotted
Channel Hopping (TSCH) protocol is an enhancement technique that was intro-
duced to the IEEE 802.15.4 standard in 2015. Finally, 6TiSCH (IPv6 over the
TSCH mode of IEEE 802.15.4e - RFC 8480) is an adaptation layer for TSCH that
standardizes ways of creating, modifying and optimizing the schedules that can
be used by TSCH-based nodes, in addition to ensuring network security. 6TiSCH
allows protocols such as 6LoWPAN, RPL, and CoAP to work on top of TSCH.
Figure 1.1 makes a comparison between the traditional TCP/IP stack and the
enhanced WSN stack for reliable networks. Further details on the protocols for
reliable WSN will be given in Section 1.1.3.
Figure 1.1: Network layers (left), TCP/IP stack (middle) and reliable WSN stack
(right).
1.1.2 The Internet of Things (IoT)
The definition of the Internet of Things (IoT) has changed over time owing to the
convergence of several different technologies. The term IoT was coined by Kevin
Ashton in 1999 [19], when Radio Frequency Identification (RFID) communication
was seen as an essential technology for bringing constrained devices (things) to the
Internet. Currently, the IoT encompasses technologies related to different areas of
information technology, e.g., real-time analytics, machine learning, wireless sensor
networks, and embedded systems, and it has had an impact on the economy, by
changing the way products are designed and how people interact with the environ-
ment [20].
The IoT interconnects home appliances, vehicles, machines, and other physical
objects, to form a vast communications network that is currently estimated to
include more than 11 billion devices [21]. It has accelerated the development of
cyber-physical systems such as smart buildings, smart cities and smart factories.
These systems are expected to be the biggest drivers of productivity in the coming
decades [22]. They involve different types of machine-type communications (MTC),
such as Ethernet, Wi-Fi, ZigBee, Bluetooth, cellular (4G and 5G), etc.
There have been several published surveys that cover all aspects of IoT tech-
nologies [23, 24, 25, 26]. The authors in [23] show how the IoT trend has evolved in
the past decade. They compare the evolving patterns of IoT, WSN and ubiquitous
computing and examine the main differences between these technologies.
In [24], the main enabling technologies are surveyed, including the areas of commu-
nication, hardware, and software. The authors in [25] set out a 5-layer scheme that
generically represents how an IoT architecture should be viewed from the
developer's perspective. The 5-layer architecture comprises the following layers:
(i) perception, (ii) networking, (iii) middleware, (iv) application, and (v) business.
Key issues affecting the development of IoT applications are also addressed in [25].
An overview of IETF standardization activities related to IoT is presented in [26].
Within the scope of this dissertation, we are only concerned with the networking
features of the IoT. Moreover, in this (narrow) area, the boundaries between the
terms WSN, LLN, and IoT are blurred and, as a result, research carried out in
these fields often overlaps. In view of this, in this dissertation, these three concepts
are used interchangeably.
Within the realm of IoT, we will discuss the networking features related to the
use of the IoT for industrial applications, where there are more stringent requirements.
This field is referred to in the literature as the "Industrial IoT" (IIoT) [27].
1.1.3 The Industrial IoT (IIoT)
The IoT is a key element of Industry 4.0, a vision of industrial processes
that leverage cyber-physical systems and allow machines and humans to
interact and create customizable and dynamic "smart factories". In Industry
4.0, machines should be able to sense the environment and collaborate with
other systems and machines as well as human beings. The IIoT involves networks
that cover a much smaller area but have to cope with a higher node density
(in terms of nodes per square meter) and deliver more reliable communication (99.999%
reliability) and lower latency (millisecond delays). In addition to this, there are
several hindrances imposed by the industrial environment, which is surrounded by
metal objects and machinery and is subject to a high degree of noise and external
interference.
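To make the 99.999% reliability figure concrete, the short sketch below computes how many transmission attempts a single link would need in order to reach that target. It is only an illustration and not part of the evaluation in this dissertation: the function name `attempts_needed` and the PDR values are hypothetical, and every attempt is assumed to succeed independently with the same probability, which real links only approximate.

```python
def attempts_needed(link_pdr: float, target: float = 0.99999) -> int:
    """Smallest number of attempts k such that 1 - (1 - link_pdr)**k >= target.

    Assumption (not from the dissertation): attempts are independent and
    each one succeeds with probability link_pdr.
    """
    if not 0.0 < link_pdr < 1.0:
        raise ValueError("link_pdr must be strictly between 0 and 1")
    failure_budget = 1.0 - target   # tolerated probability of losing the packet
    failure = 1.0                   # probability that all attempts so far failed
    attempts = 0
    while failure > failure_budget:
        failure *= 1.0 - link_pdr
        attempts += 1
    return attempts


# Hypothetical link qualities, chosen only for illustration.
for pdr in (0.5, 0.8, 0.99):
    print(f"PDR {pdr:.2f}: {attempts_needed(pdr)} attempts")
```

Under these assumptions, a link with a PDR of 50% needs 17 attempts and a link with a PDR of 80% needs 8, which shows why link quality has a direct impact on both latency and energy consumption.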
Applications for the IoT in industry vary from all types of process automation
to optimizing logistics, improving security surveillance and even enabling remote
assistance. A comprehensive survey of the applications of the IIoT and their main
challenges can be found in [28, 29].
The most important technological advance with regard to the use of IoT networks
based on IEEE 802.15.4 in industry was the seminal work on the Time
Synchronized Mesh Protocol (TSMP) [30]. This is a communication protocol developed
by Dust Networks that specifies a synchronization scheme and frequency and
time slot-based scheduling, similar to a Time-Division Multiplexing (TDM) system.
The deterministic scheduling allows the nodes to operate at very low power
and still be reliable in noisy environments. It also employs channel hopping to
mitigate interference and multi-path fading. The synchronization protocol runs in a
distributed manner, unlike most of the other existing protocols at that time. Moreover,
TSMP forms the basis of the most successful standardized protocols for IoT
in industry, e.g., WirelessHART, ISA100.11a, and TSCH [31].
WirelessHART is a wireless protocol that is based on the Highway Addressable
Remote Transducer (HART) standard. The HART standard is a multi-vendor standard
for industrial processes that covers the whole communication stack, from
the physical to the application layers, including security concerns and interoperability
with old industrial control systems. The WirelessHART protocol works on
top of the IEEE 802.15.4 standard with scheduled transmissions based on TDMA.
Each time slot in WirelessHART networks is 10 ms long (fixed) and is used solely by
one pair of nodes for contention-free medium access. The standard also defines the
management layers and how the applications should communicate with the legacy
control systems [32]. ISA100.11a is the wireless standard developed by the International
Society of Automation (ISA). It is largely based on TSMP and, hence, similar
to WirelessHART. The main difference between WirelessHART and ISA100.11a is
that the latter uses a modified IEEE 802.15.4 MAC layer and supports a variable
time slot size, while the former relies completely on the IEEE 802.15.4 standard
and is restricted to time slots that are 10 ms long. In addition, ISA100.11a adopts
IETF standards for the higher layers in the stack, including UDP and 6LoWPAN,
while WirelessHART relies on proprietary protocols to maintain compatibility with
the HART standard [33].
The last step towards a standardized IoT stack for industry was taken
in 2015 when the TSCH protocol became a part of the IEEE 802.15.4 standard.
TSCH is also based on TSMP but has greater flexibility than WirelessHART and
ISA100.11a, which allows network designers to optimize the network stack [34].
Like all the IIoT protocols mentioned previously, TSCH relies on time-frequency
scheduling that can be fine-tuned to optimize medium access and meet the requirements
specified at the application layer [35, 36]. More details on TSCH can be
found in Section 1.3.
1.2 State-of-the-art and challenges for the (Industrial)
Internet of Things
As mentioned above, IEEE 802.15.4 is the de facto standard used in the IoT,
being adopted by WirelessHART [37], ISA100.11a [38], and many other networks.
It defines a physical layer (PHY) that operates on the worldwide 2.4 GHz ISM
band and a medium access control (MAC) layer based on the CSMA/CA protocol.
Several studies have shown that IEEE 802.15.4 networks based on CSMA/CA are
not scalable and cannot be used in applications where high availability and quality
of service are prerequisites [39, 40, 41]. The main reasons for this are: (i) its
single-channel operation, which limits the total available throughput and does not
protect the nodes against narrow-band external interference, and (ii) the contention-
based access method, which makes the protocol more susceptible to the exponential
growth of contention time in networks with a large number of devices [42].
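The effect of contention on access time can be illustrated with a deliberately simple slotted random-access model. This is a toy sketch and not the CSMA/CA backoff procedure of IEEE 802.15.4: we assume that each of `num_nodes` devices transmits in a slot independently with a fixed probability `tx_prob`, and that a slot is useful only when exactly one device transmits. Both names and the model itself are assumptions made for illustration.

```python
def expected_slots_until_success(num_nodes: int, tx_prob: float) -> float:
    """Expected number of slots until exactly one node transmits alone.

    Toy model (an assumption, not the IEEE 802.15.4 CSMA/CA algorithm):
    every node transmits in each slot independently with probability tx_prob.
    """
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    if not 0.0 < tx_prob < 1.0:
        raise ValueError("tx_prob must be strictly between 0 and 1")
    # Probability that one node transmits while the other n - 1 stay silent.
    p_success = num_nodes * tx_prob * (1.0 - tx_prob) ** (num_nodes - 1)
    # The number of slots until the first success is geometrically distributed.
    return 1.0 / p_success


for n in (2, 5, 10, 20):
    print(n, expected_slots_until_success(n, 0.5))
```

With a fixed transmission probability, the factor `(1 - tx_prob) ** (num_nodes - 1)` shrinks geometrically, so the expected waiting time grows roughly exponentially with the number of devices. Real CSMA/CA adapts its backoff window, which softens this growth but does not remove the underlying contention.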
Several other protocols have been adopted to circumvent the problems faced by
the legacy IEEE 802.15.4 CSMA/CA-based MAC. These strategies employ multi-
ple channels that are available in the 2.4 GHz band and adaptive schedule-based
medium access control to avoid contention. In this way, they are able to (i) in-
crease the overall network throughput, with concurrent transmissions over orthogo-
nal channels [43, 44]; (ii) reduce the network congestion and medium contention [45];
(iii) increase the resistance to external interference [46]; and (iv) reduce energy con-
sumption, as the nodes follow tight schedules that allow them to \sleep" longer, and
hence avoid preamble sampling [47]. Surveys on protocols that focused on solving
issues related to single-channel legacy IEEE 802.15.4 CSMA/CA-based MAC can
be found in [48, 49].
As mentioned above, IIoT applications include a number of requirements, e.g.,
low latency, high reliability, and low energy consumption, that are usually not sup-
ported by legacy LLNs based on the IEEE 802.15.4 protocol. New algorithms have
to be created to enable the use of simple IEEE 802.15.4 devices in critical systems.
The general approach adopted in this dissertation is to exploit different types of
diversity available in the IEEE 802.15.4 technology as a means of optimizing the
network protocol stack in multiple layers. This is carried out by means of online
learning algorithms that are able to dynamically improve protocols at different
levels of the network stack, from frequency hopping in the medium access control to
the routing tree formation in the network layer.
The latest version of the IEEE 802.15.4 [50] standard offers a range of alternatives
to replace the legacy CSMA/CA-based access protocol. The most promising
alternative is the Timeslotted Channel Hopping (TSCH) protocol. Its operation is
based on network synchronization and the time-frequency schedules that are fol-
lowed by each node. With the aid of the TSCH, nodes are able to know when
to transmit and receive packets. They also benefit from using Frequency Hopping
Spread Spectrum (FHSS), which randomly spreads the transmission across all 16
channels¹ available in the 2.4 GHz band.
The use of time-frequency schedules and FHSS in the TSCH protocol provides
a number of benefits to the network:
• it can entirely eliminate contention for medium access by using dedicated time
slots;
• it reduces energy waste since the nodes remain sleeping when they are not
transmitting or receiving data;
• it helps the network to achieve a quality of service, since the schedule followed
by the nodes is designed to meet the requirements, such as maximum end-to-end
delay, minimum node lifetime or throughput, etc.;
• it also improves reliability by offering a wider range of frequency diversity.
¹ In this dissertation the terms frequency and channel are used interchangeably when
referring to the portion of the spectrum used for communication.
Even though the algorithms and solutions provided in this dissertation are
protocol-agnostic and may be applied to any multi-channel multi-hop wireless
network, TSCH has been chosen as the main target because it is extensively used in
industry and because open-source solutions that make it easier to assess
our proposals are widely available.
1.3 Timeslotted channel hopping (TSCH) protocol
TSCH slices time into slots, with a slot having sufficient duration (typically 10 ms)
to accommodate a data packet of maximum size and an acknowledgment (ACK)
packet, as well as all the required guard times. It employs Time Division Multiple
Access (TDMA) together with multiple channels for communication. More than one
transmission can be scheduled to occur at the same time in a TSCH network through
a well-designed scheduling algorithm. As long as simultaneous transmissions are
made through different frequencies, or between non-interfering nodes, a
collision-free operation can be guaranteed.
The schedule consists of a sequence of atomic resource units (time-frequency
allocations), called cells. A group of cells repeats over time. This sequence of
repeating cells is denoted as a slotframe. Each cell can be shared, when
contention-based access should be employed using CSMA/CA, or dedicated, when
contention-free access is guaranteed for the scheduled nodes. All the cells are uniquely identified
by a tuple (slot offset and channel offset). The slot offset sets the location of the cell
in time from the beginning of the current slotframe; the channel offset is a virtual
channel that is converted into an actual frequency at each time slot. The conversion
is performed by the Frequency Hopping Spread Spectrum (FHSS) algorithm, which
follows a pseudo-random pattern and can spread the packets across the 16 channels
allocated at the 2.4 GHz band (numbered 11 to 26). The aim of FHSS is to mitigate
the effects of multipath fading and external narrow-band interference [46, 51], and
thus improve the network reliability, as well as the throughput.
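The conversion from channel offset to physical channel can be sketched as follows. This is a minimal illustration that assumes the commonly used TSCH mapping, in which a running slot counter and the channel offset are added and taken modulo the length of the hopping sequence. The sequence `HOPPING_SEQUENCE` is a placeholder permutation chosen for illustration (not necessarily the one used by any real network), and the function name is hypothetical.

```python
# Channels allocated to IEEE 802.15.4 in the 2.4 GHz band (numbered 11 to 26).
CHANNELS = list(range(11, 27))

# Placeholder hopping sequence (assumption): a fixed permutation of the 16
# channels. A real network would use the sequence configured for that network.
HOPPING_SEQUENCE = [CHANNELS[(5 * i) % 16] for i in range(16)]


def physical_channel(slot_counter, channel_offset, sequence=HOPPING_SEQUENCE):
    """Map a (slot counter, channel offset) pair to a physical channel."""
    return sequence[(slot_counter + channel_offset) % len(sequence)]


if __name__ == "__main__":
    # The same cell (channel offset 2) lands on a different channel in each slot.
    for slot_counter in range(4):
        print(slot_counter, physical_channel(slot_counter, channel_offset=2))
```

Because the sequence is a permutation of the channels, cells with different channel offsets in the same time slot never map to the same frequency.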
Figure 1.2 shows an example of a simplified TSCH schedule. It has 4 channel
offsets that are converted into frequencies during the execution of the schedule. In
this example data collection is the targeted application. Data are scheduled to be
transmitted in dedicated cells towards the sink (node A). The first time slot has
two cells that are shared. They can be used, for instance, for downlink traffic or
broadcast packets. On the other hand, the last time slot is reserved for nodes so
that they can sleep and save energy. Each time slot in a TSCH schedule has an
Absolute Sequence Number (ASN), which is a 5-byte counter that is incremented in
each time slot. A large amount of research has been done on the optimization of the
TSCH schedule, covering both centralized solutions [35] and distributed ones [52, 53].
There are also specific proposals focused on improving green-field deployments [54],
as well as application-specific optimized schedules [55, 56]. A comprehensive survey
on the scheduling algorithms proposed for TSCH can be found in [57].
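As a rough worked example of the slot arithmetic, the sketch below derives the slot offset within the slotframe from the ASN and estimates how long a 5-byte ASN lasts before wrapping around. It assumes the 7-slot slotframe of Figure 1.2, the typical 10 ms slot duration, and that the slot offset equals the ASN modulo the slotframe length; all names are illustrative.

```python
SLOT_DURATION_S = 0.010      # typical slot duration (10 ms)
SLOTFRAME_LENGTH = 7         # number of time slots in the example of Figure 1.2
ASN_BYTES = 5                # the ASN is a 5-byte counter


def slot_offset(asn, slotframe_length=SLOTFRAME_LENGTH):
    """Position of the current time slot within the repeating slotframe."""
    return asn % slotframe_length


def asn_wrap_years(asn_bytes=ASN_BYTES, slot_duration_s=SLOT_DURATION_S):
    """Time, in years, before the ASN counter wraps around."""
    seconds = (2 ** (8 * asn_bytes)) * slot_duration_s
    return seconds / (365.25 * 24 * 3600)


if __name__ == "__main__":
    print(slot_offset(23))             # 23 mod 7 = 2
    print(round(asn_wrap_years()))     # roughly 348 years
```

Under these assumptions the counter does not wrap within the lifetime of any realistic deployment, which is why it can serve as a network-wide notion of time.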
Figure 1.2: An example of a simplified TSCH schedule with 4 channel offsets and
7 time slots.
1.4 Summary of contributions and publications
This dissertation explores ways of using diversity to improve the performance of
TSCH networks.
We begin by conducting a comprehensive analysis of the link quality on multiple
testbeds that have constrained nodes (called motes) with IEEE 802.15.4 radios.
This initial study is detailed in Chapter 2, where connectivity datasets from 5 different
testbeds are analyzed. Different statistics are presented, including the following:
(i) variations of Packet Delivery Ratio (PDR) over different channels, (ii) the
distribution of the number of channels with high PDR, (iii) the average number of
neighbors per channel, among others. This analysis provided the guidelines for our
later studies, which are described in Chapters 3, 4 and 5, where we define protocols
that exploit different types of diversity in LLNs. Some of the results from this
initial study were published in [58] and [59].
In Chapter 3 we introduce the Flooding-Based Reliable TSCH (FBR-TSCH)
protocol, which is a solution based on TSCH that relies solely on broadcast packets
and optimizes an application that involves forwarding non-deterministic signals
from the sensors to the sink node. The solution is very adaptive and appropriate for
scenarios with high levels of external interference. It was tested in both simulations
and a real scenario. This protocol was published in the 2016 EWSN Dependability
Competition [3]. Based on the experience gained from the
2016 EWSN Dependability Competition, a survey was published in [60]. The work
in [60] explores an alternative means of tackling the same problem and recommends
the use of constructive interference together with TSCH.
In Chapter 4 we introduce the Multi-hop And Blacklist-based Optimized TSCH
(MABO-TSCH) protocol. MABO-TSCH is a distributed blacklisting solution
that is optimized for multi-hop networks and is compliant with the TSCH standard.
A dynamic channel quality estimation algorithm is employed that is based on the
Multi-armed Bandit (MAB) problem [61]. The scheme is implemented and tested
empirically both in simulations and a real testbed. MABO-TSCH was published
in [62].
In Chapter 5 we introduce the Thompson sAmpling-based MUlti-channel RPL
(TAMU-RPL) protocol. It consists of a modification to RPL for agile adaptation
to network changes based on reinforcement learning. The main algorithm defines a
new link quality estimation based on Thompson sampling, which keeps track of the
ETX (Expected Transmission Count) [63] of multiple neighbors and considers
the link quality over different channels; TAMU-RPL also employs a hybrid ETX
estimation algorithm that includes both unicast and broadcast packets and uses
physical information such as RSSI to improve the ETX estimation. TAMU-RPL
was submitted for publication in [64].
Figure 1.3 shows the 4 different proposals in this dissertation and illustrates
how these works can be placed in an optimized IIoT network stack.
Figure 1.3: Different proposals in this dissertation.
Chapter 2
Measuring diversity in TSCH networks
Diversity schemes in wireless communications aim to increase the probability of
correctly receiving a packet (or signal).¹ This is achieved by transmitting
information over two or more different and independent channels [65]. The schemes are
often used to mitigate data loss due to multipath fading and interference from other
systems operating on the same channel.
There are 4 basic types of diversity in wireless communications: time diversity,
frequency diversity, space diversity and code diversity. Other more sophisticated
schemes, such as cooperative diversity, multi-user diversity and polarization diver-
sity, which are derived from the four basic schemes listed above, are becoming
popular in next-generation technologies, e.g., multi-user MIMO. In cooperative
diversity the signal is decoded based on copies received from different transmitters,
in multi-user diversity the best receiver is selected by the transmitter to maximize
reception probability, while in polarization diversity the signal is decoded using
different antennas with different polarization. Simple networking technologies such
as IEEE 802.15.4 are usually able to exploit 3 types of diversity schemes: time,
frequency and space [66].
¹ This chapter includes work from [58, 59].
TSCH networks are capable of exploiting diversity through a combination of
optimized schedules, optimized frequency hopping sequences and optimized routing
trees. These networks have the freedom to exploit all 16 channels specified by the
IEEE 802.15.4 standard in the 2.4 GHz band. This band, however, is shared with
other WLAN and WPAN technologies
such as Wi-Fi, Bluetooth, and RFID. It is also expected that WWAN technolo-
gies such as LTE and NR (5G) [67] will use unlicensed spectrum, including the
2.4 GHz band, to increase capacity in indoor environments, which may impact the
performance of IEEE 802.15.4 devices even more. Thus, the expected high levels of
co-channel interference make the use of frequency diversity even more important.
The schedule-based medium access allows transmissions in TSCH networks to
be allocated in a deterministic way, in frequency and time. One simple way of
exploiting time diversity is through the retransmission of packets at different time
slots. The default behavior of TSCH already supports this type of diversity. For
every unicast data frame transmission, an ACK frame is sent back within the same
time slot and nodes are expected to retransmit the frames whenever the ACK is
not received. Selecting when packet retransmission should take place is of crucial
importance to improve the chances of packet delivery since losses are usually caused
by interference bursts from external networks.
The technology that interferes most with IEEE 802.15.4 networks
is Wi-Fi (IEEE 802.11). Figure 2.1 shows how IEEE 802.15.4 and IEEE 802.11
channels overlap in the 2.4 GHz spectrum. IEEE 802.15.4 nodes undergo severe
interference from IEEE 802.11 nodes, but the opposite is not always true. This
is due to the imbalance in the transmission power employed by the different
technologies. While the maximum power of the commonly used IEEE 802.15.4
transceivers is limited to 0 dBm, the FCC rules and regulations allow Wi-Fi devices
to use up to 30 dBm.
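To put this imbalance in perspective, the short sketch below converts both limits from dBm to milliwatts using the standard definition P(mW) = 10^(P(dBm)/10). The function name is illustrative.

```python
def dbm_to_mw(power_dbm):
    """Convert a power level from dBm to milliwatts."""
    return 10 ** (power_dbm / 10)


if __name__ == "__main__":
    ieee802154_mw = dbm_to_mw(0)    # 1 mW
    wifi_mw = dbm_to_mw(30)         # 1000 mW (1 W)
    print(wifi_mw / ieee802154_mw)  # a 30 dB gap is a factor of 1000
```

In other words, a Wi-Fi device transmitting at the regulatory limit radiates a thousand times more power than a typical IEEE 802.15.4 transceiver.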
Figure 2.1: IEEE 802.15.4 and IEEE 802.11 channels in the 2.4 GHz band.
Since signal power decays rapidly with distance, it is expected that the IEEE 802.15.4-
based devices that are far from the Wi-Fi nodes will undergo much less interference.
In indoor environments, such as office buildings, paths with more hops may
provide improved connectivity if the relay nodes are far away from the devices that
generate Wi-Fi traffic. In this way, space diversity can be exploited by forwarding
packets through multiple paths, preferably avoiding areas where there are sources
of interference.
2.1 Related work
In this chapter, we perform a series of experiments to characterize testbeds and
explore ways of improving the protocols by employing algorithms that exploit
diversity. There are many studies that perform similar analyses for different
deployments using different types of measurement procedures.
Most prior work on this topic is based on the IEEE 802.11 standard, measuring
the performance of Wi-Fi cards with different parameters, e.g., varying CSMA/CA
contention window size, CCA threshold, transmission power, hardware, etc. [68].
IEEE 802.11 cards usually interface with the CPU via high-speed buses (e.g., PCIe),
hence the measurement process is very fast. The main difference between such prior
work and measurements in LLNs is the type of bus that interconnects the radio to
the MCU, and the MCU to the laptop. The most common bus for radio-MCU
communication is I2C, which imposes some limitations on the maximum throughput
that can be transmitted over-the-air and on how quickly the link measurements can
be performed [69]. The lower data rate of LLNs (hundreds of kbps) also limits how
quickly link statistics can be gathered, as most of the measurements require the
transmission of probe packets.
The work in [39] presented one of the first studies that explained the performance
of wireless links in LLNs. The authors analyzed the fundamental aspects that
influence packet delivery (percentage of packets that are successfully received) at
different layers of the stack. They used the Mica motes, one of the first sensor
nodes available for LLN experimentation, running at 433 MHz. The experiments
consisted of approximately 60 motes placed in 3 different environments: (i) an office
building, (ii) a parking lot, and (iii) an open state park. In the experiments, one
node transmitted a sequence of packets spread over time and all other sensor nodes
recorded the (successful or failed) reception. Many features could be extracted in
this seminal work, including the "signal strength vs. distance" and "signal strength
vs. packet loss" profiles of this type of network. Important conclusions, such as
the asymmetrical nature of LLN links, were also derived from the experiments.
However, this study had several limitations, e.g., the lack of external interference,
the use of one single channel and the long periods between packet transmissions.
Testbeds are an important tool for the evaluation of wireless networks. There
is a large number of publicly available experimentation facilities, some of them
specifically designed for IoT applications [70, 71]. Among all testbeds, FIT
IoT-Lab stands out as the largest deployment with 1700+ nodes. It is spread across
6 different sites in France and has a multitude of hardware, including fixed sensor
nodes and mobile nodes mounted on top of robots. The nodes also have different
radio interfaces, e.g., Bluetooth Low Energy (BLE), IEEE 802.15.4 at 2.4 GHz
as well as sub-GHz radios at 868 MHz. Nodes at each site of the
IoT-Lab are placed differently, which means that each deployment has its own wireless
propagation characteristics.
The work in [72] described FIT IoT-Lab's infrastructure and discussed the
importance of reproducible experiments for protocol testing. The authors
commented on how few recent LLN papers report reproducible results when the
experiments are based on testbeds or specific simulations. They concluded that
for the IoT-Lab deployment at Strasbourg, the structure of the building and the
level of Wi-Fi interference may strongly influence
the results of experiments. Lastly, they also described how the testbed results vary
significantly depending on the time of day at which the experiment runs, mainly
due to the presence of external interference and multi-path fading caused by people
moving around in the environment. The work in [73] described a study done
on another deployment of the IoT-Lab, at Grenoble. In the experiments, each
node transmitted 100-frame bursts and the (successful or failed) reception of such
bursts was recorded by each receiving node. This process was repeated for all the
16 channels at 2.4 GHz. The authors quantified how Wi-Fi interference impacts network
performance.
The work in [74] proposed a methodology to analyze the circumstances behind
packet losses. The proposed methodology, called MAP, consists of an algorithm
to determine the root causes of degradation in wireless links, accounting for issues
related to the surroundings, hardware as well as software issues. The methodology
is based on complex statistical analysis and on protocol-specific messages, which makes
it hard to extend to other deployments. The analysis used a real-world deployment
with 343 nodes running over 10 days. However, the authors analyzed the performance
on only a single channel, and the levels of external interference in the environment
were very low.
In addition to studies on testbeds, real-world deployments have also been used
for wireless experimentation. The authors in [75] deployed a 44-node network
in an industrial facility and ran a 26-day long experiment. In this scenario, due to
the external noise and metallic objects, a large amount of variation in the PDR
occurs over time and frequency. The authors concluded that the variation over time
was mainly caused by moving objects and people, and the variation over frequency
by multi-path fading. Many other experiments were conducted in real facilities and
even in non-urban scenarios. The authors in [76] present the results from a 21-node network
deployed in a peach orchard. Data were collected over more than 3 months and
a large number of statistics were derived, some of them not intuitive and different
from previous related work. Links were found to be mainly symmetrical,
contradicting the work in [39]. Also, links were found to be much more stable than in previous
work, mainly due to the use of frequency hopping in the link layer protocol (the
experiments were based on TSCH networks). The main problem with real-world
deployments is that they are hard (if not impossible) to reproduce, and the results
cannot generally be extrapolated to other environments. In this sense, real-world
experiments represent a more reliable way of testing algorithms and protocols in
practice.
One of the first efforts in creating a common set of datasets for wireless research
was CRAWDAD (A Community Resource for Archiving Wireless Data At
Dartmouth)² [77]. This archival community started in 2005 and it has a large number
of different types of wireless traces, including connectivity (PDR, RSSI, etc.), data
collection (sensor measurements), management (SNMP), etc. The technologies
include Wi-Fi networks, as well as Bluetooth, RFID, WiMAX, etc. Even though
IEEE 802.15.4 is listed, the number of traces available is small (only 8 at the time
this dissertation was written) and none of the available datasets include analysis of
multiple channels in the 2.4 GHz spectrum.³
In 2018, another effort emerged that aims to standardize
ways of comparing (benchmarking) experiments in IoT. IoTBench [78] designs a
benchmark for IoT consisting of problem sets, tools (testbeds) and methodologies
(software) for the performance evaluation of low-power wireless networks. One of
the initiatives from IoTBench is the EWSN's Dependability Competition, to which
work from Chapter 3 has been submitted. IoTBench recommends the FIT
IoT-Lab testbed, which is also used in our evaluation.
² https://crawdad.org/.
³ As a contribution to the community all datasets analyzed in this dissertation are now
available at CRAWDAD (https://crawdad.org/) and can also be found at
https://github.com/pedrohenriquegomes/phd-datasets.
It is clear from the most recent related work that there are still some gaps in
the analysis of the quality of links in wireless networks, especially when there is
interest in exploiting different types of diversity techniques in these networks:
(i) The experiments do not follow a concise and simple methodology to collect
network statistics that are independent of network protocols (e.g., routing) and
can be easily reproduced in different testbeds;
(ii) Most of the datasets do not have dense data in all three domains
where diversity can be exploited, i.e., time, frequency and space.
The goal of our work is to address the gaps identified in previous work and
provide insights derived from an analysis that accounts for how link quality varies in
testbeds over time, frequency and space.
2.2 Testbeds
Testbeds are important tools for the process of evaluating networking algorithms
and protocols. They usually consist of several off-the-shelf hardware components
and specifically designed software that allows stress tests to be conducted in a controlled
but realistic environment. Testbeds ensure a higher degree of reliability in the test
results, without the high costs of deployments in real environments, e.g., working
places or industries. The use of testbeds, however, may lead to biased results if
the evaluation is not properly set up or if the testbed itself is not very realistic.
It is important to analyze the features of the environment where the testbed is
installed [59].
In addition to their use to conduct controlled experiments, testbeds are also a
useful means of providing more realistic data for simulations. Coding an algorithm
or protocol for real hardware (that is used in the testbeds or real deployments) often
requires a greater effort than coding a protocol in a simulator. The use of traces
extracted from real experiments in testbeds as input for simulators may improve
the results from the simulation experiments. Testbeds are particularly important
for wireless communications because it is very hard - if not impossible - to properly
simulate physical layer phenomena, such as multi-path fading, interference, noise,
etc.
The 5 different testbeds that were used in this dissertation are described below; all
had nodes with IEEE 802.15.4 radios. Some of them were only used for purposes
of investigation and not considered in the protocol evaluation because of their lack
of realism; others were used for simulation experiments and/or real experiments.
1. IoT-Lab Lille is an indoor testbed with 300+ nodes, located at Inria Lille,
which is part of FIT IoT-Lab (www.iot-lab.info) [79]. The nodes are
deployed over an area of 225 m² on the ceilings and walls. The area
comprises a single large and almost empty room in an office building.
Further information about the deployment of IoT-Lab Lille can be found at
www.iot-lab.info/deployment/lille/. The experiments that used data
from Lille included a subset of 50 nodes, divided into three different clusters
in the room. All the nodes used were M3 open nodes⁴.
2. IoT-Lab Strasbourg is an indoor testbed with 400 nodes, located at Inria
Strasbourg, and also part of FIT IoT-Lab. The nodes are deployed inside
a single empty room in an office building. They are arranged in the form
of a 3-layered grid that resembles a 3D cube. Further information about
the deployment of IoT-Lab Strasbourg can be found at
www.iot-lab.info/deployment/strasbourg/. The experiments that used data from
Strasbourg included a subset of 49 nodes, all close to each other over an approximate
two-layered grid. All the nodes used were M3 open nodes.
3. IoT-Lab Grenoble is an indoor testbed with 900+ nodes, located at Inria
Grenoble, and also part of FIT IoT-Lab. The nodes are all located on the
same floor of an office building with four interconnected corridors. They are
all deployed between the dropped ceiling and the roof. Further information
about the deployment of IoT-Lab Grenoble can be found at
www.iot-lab.info/deployment/grenoble/. The experiments that used data from
Grenoble included a subset of 50 nodes that were all located along a long corridor
with two parallel lines of nodes. All the nodes used were M3 open nodes.
4. Berkeley's Soda is a 46-node network with TelosB motes that was deployed
in a laboratory in the Soda Hall building, University of California at Berkeley.
The nodes were in an indoor office space of approximately 50 × 50 meters with
Wi-Fi interference. This is a realistic environment with average levels of
external interference and multi-path fading. Further information about the
deployment of Berkeley Soda can be found at wsn.berkeley.edu/connectivity/.
The experiments that used data from Soda included all 46 nodes.
5. USC's Tutornet is an indoor testbed with 100+ TelosB motes, located at the
University of Southern California; it covers two adjacent floors in the Ronald
Tutor Hall building [58]. Each floor has an area of approximately 55 × 30
meters. There are about 8 Wi-Fi access points spread over each floor, operating
across all the 2.4 GHz band. The nodes are deployed between the dropped
ceiling and the roof. This is a real working environment, with high levels of
external interference and multi-path fading. Further information about the
deployment of Tutornet can be found at anrg.usc.edu/www/tutornet/. The
experiments that used data from Tutornet included a subset of 40 nodes, all
located on the 4th floor of the building.
⁴ M3 nodes are replicas of TelosB motes. TelosB is one of the most used devices for LLN
experimentation and prototyping. It consists of an MSP430 microcontroller and a CC2420 radio
chip working at 2.4 GHz, which is compatible with the IEEE 802.15.4 standard.
2.2.1 Methodology for the IoT testbeds
The testbeds described above were initially used to extract connectivity traces.
Our concern was with modeling the wireless links between nodes. For this reason,
we ran a series of experiments to obtain information about Packet Delivery Ratio
(PDR) and Received Signal Strength Indicator (RSSI) of individual packets, among
other factors. We divided the testbeds into three different groups: (i) all the
IoT-Lab testbeds, (ii) Berkeley's Soda and (iii) USC's Tutornet. We used a different
methodology to extract data from each group of testbeds.
IoT-Lab testbeds are deployed in different environments. A firmware based
on OpenWSN [80] was chosen to carry out the experiments. This firmware is
a part of the Mercator⁵ project, which is designed to provide dense datasets for
wireless networking experiments. The procedure for the statistics collection followed
a simple methodology. Each node broadcast 100-packet bursts, one after the
other. While one node was broadcasting, all the others remained in receive mode,
constantly listening for packets. The interval between packet transmissions was
equal to 10 ms (this is called the inter-packet interval). The transmission power
was set to 0 dBm. The nodes in receive mode recorded statistical data about
all the received packets, including timestamp, RSSI, packet size, etc. After all the
nodes had transmitted the 100-packet burst, they switched to the next frequency and
repeated the process for all of the 16 different frequencies. The whole experiment that
involves collecting data from all 16 frequencies (with all the nodes transmitting
100-packet bursts) is referred to in this dissertation as a transaction. In the case of the
IoT-Lab testbeds, each transaction takes approximately 14 minutes, a period which
takes account of the delays required for switching frequencies. Each transaction
generates a trace with network connectivity. The traces comprise 16 N × N PDR
matrices, where N is the number of nodes and each matrix refers to a different
frequency. Since the IoT-Lab testbeds are shared among a large number of users,
it was not possible to conduct continuous experiments for long intervals, so the
inter-transaction interval was set to 2 hours. The number of transactions that
were carried out varied for each testbed for logistical reasons: 60 transactions were
carried out for Grenoble and 84 for Lille and Strasbourg. The traces from these
testbeds were solely used to drive simulations. The statistics from the datasets
are analyzed in Section 2.3. As is clear from the results, Grenoble has more link
statistics and more dynamic features and, for this reason, was the only one among
the IoT-Lab testbeds that was used in evaluating the protocols.
⁵ The source code can be found at https://github.com/openwsn-berkeley/mercator.
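The trace format described above can be sketched as follows. This is a minimal, hypothetical reconstruction (not the Mercator code) that turns per-burst reception counts into one N × N PDR matrix per channel. The input layout, a dictionary keyed by (channel, transmitter, receiver), and all names are assumptions made for illustration.

```python
BURST_SIZE = 100                 # packets per burst
CHANNELS = range(11, 27)         # the 16 IEEE 802.15.4 channels at 2.4 GHz


def build_pdr_matrices(received, num_nodes, burst_size=BURST_SIZE):
    """Return {channel: N x N matrix}, where entry [tx][rx] is the PDR in [0, 1].

    `received` maps (channel, tx, rx) to the number of packets of the burst
    sent by node `tx` that node `rx` received; missing keys mean zero packets.
    """
    matrices = {}
    for channel in CHANNELS:
        matrix = [[0.0] * num_nodes for _ in range(num_nodes)]
        for tx in range(num_nodes):
            for rx in range(num_nodes):
                if tx != rx:
                    count = received.get((channel, tx, rx), 0)
                    matrix[tx][rx] = count / burst_size
        matrices[channel] = matrix
    return matrices


if __name__ == "__main__":
    # Toy transaction with 3 nodes: node 0 is heard well by node 1 on channel 11.
    counts = {(11, 0, 1): 97, (11, 1, 0): 80, (26, 0, 2): 12}
    trace = build_pdr_matrices(counts, num_nodes=3)
    print(len(trace), trace[11][0][1], trace[26][0][2])
```

Keeping the transmitter and receiver indices separate preserves link asymmetry, since the PDR from node A to node B need not equal the PDR from B to A.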
The Soda testbed is not an actual testbed. It was a short-term deployment
of sensors in one of the UC Berkeley laboratories. The process employs the same
methodology as that used in the Mercator project, but the firmware source code
is not publicly available, just the datasets. The inter-packet interval was equal
to 20 ms and the transmission power was set to 0 dBm. In the case of the Soda
testbed, each transaction takes approximately 30 minutes. In total, 17 transactions
were made at different times of day. The datasets from this testbed were used solely
for simulations.
The Tutornet testbed is the only testbed to which we had complete access and
control. All the TelosB motes are connected to a central computer by USB cables.
In this way, it is possible to communicate with the firmware that is running in the
motes. We developed a firmware based on Contiki-OS [81]⁶ that carries out the
same data collection process as in the Mercator project. The inter-packet interval was
set to 10 ms. Each transaction in the experiments takes about 12 minutes to be
executed. The inter-transaction interval was set to 15 minutes and the transmission
power was set to 0 dBm. The collection went on continuously for 24 hours and
resulted in 96 traces of network connectivity. The datasets were also used for
simulations, and since we had complete access to the physical testbed, we also
conducted real experiments on it.
Table 2.1 summarizes the features of all the 5 different datasets obtained from
the testbeds.
⁶ The source code can be found at
https://github.com/pedrohenriquegomes/tsch-scheduler-and-simulator.
Table 2.1: Summary of testbeds' features
#  Testbed     Inter-packet  Transaction  Inter-transaction  # of          Total
               interval      duration     interval           transactions  duration
1  Lille       10 ms         14 min       2 hours            84            6 days
2  Strasbourg  10 ms         14 min       2 hours            84            6 days
3  Grenoble    10 ms         14 min       2 hours            60            5 days
4  Soda        20 ms         30 min       random             17            unknown
5  Tutornet    10 ms         12 min       15 min             96            24 hours
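The transaction durations in Table 2.1 can be cross-checked with a back-of-the-envelope calculation: a lower bound on the duration is the number of nodes times the burst size times the inter-packet interval times the number of channels. The sketch below assumes 100-packet bursts for every testbed (the text states this explicitly only for the IoT-Lab testbeds) and uses the node subsets listed in Section 2.2; the names are illustrative.

```python
NUM_CHANNELS = 16
BURST_SIZE = 100  # packets per burst (assumed to be the same for all testbeds)

# testbed: (number of nodes used, inter-packet interval in seconds)
TESTBEDS = {
    "Lille": (50, 0.010),
    "Strasbourg": (49, 0.010),
    "Grenoble": (50, 0.010),
    "Soda": (46, 0.020),
    "Tutornet": (40, 0.010),
}


def min_transaction_minutes(num_nodes, interval_s):
    """Lower bound on the transaction duration, ignoring switching delays."""
    return num_nodes * BURST_SIZE * interval_s * NUM_CHANNELS / 60


if __name__ == "__main__":
    for name, (nodes, interval) in TESTBEDS.items():
        print(f"{name}: at least {min_transaction_minutes(nodes, interval):.1f} min")
```

Under these assumptions the lower bounds are about 13.3, 13.1, 13.3, 24.5 and 10.7 minutes, which is consistent with the reported durations once frequency switching and other overheads are added.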
2.3 Results from the datasets
We analyzed all of the 5 different datasets from the testbeds described in
Section 2.2.1. Our focus was on network connectivity, especially with regard to the
PDR and the RSSI.
2.3.1 Packet delivery ratio (PDR) for different channels
The PDR is the percentage of packets that are correctly received in a link. Figure 2.2
shows the PDR for the 16 channels in all the testbeds. We took account of all traces
gathered from all the datasets. The boxplots show the PDR statistics of all the
links in the network, which means that a link is included for each pair of nodes that
can receive at least one packet from the 100-packet bursts, and the corresponding
PDR is calculated.
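The link selection rule and the per-channel summary behind such boxplots can be sketched as follows. This is an illustrative reconstruction, not the analysis script used for the figures: it assumes a per-channel N × N PDR matrix with values in [0, 1] and uses `statistics.quantiles` from the Python standard library; the function names are hypothetical.

```python
from statistics import quantiles


def link_pdrs(matrix):
    """PDRs of all directed links on which at least one packet was received."""
    n = len(matrix)
    return [matrix[tx][rx] for tx in range(n) for rx in range(n)
            if tx != rx and matrix[tx][rx] > 0]


def boxplot_stats(pdrs):
    """First quartile, median and third quartile of a list of link PDRs."""
    q1, median, q3 = quantiles(pdrs, n=4, method="inclusive")
    return q1, median, q3


if __name__ == "__main__":
    # Toy 3-node PDR matrix for one channel; the 0.0 entries are not links.
    channel_matrix = [[0.0, 0.9, 0.0],
                      [0.8, 0.0, 0.5],
                      [0.0, 0.6, 0.0]]
    print(boxplot_stats(link_pdrs(channel_matrix)))
```

Excluding pairs with no received packets matters: including them would pull the quartiles towards zero in sparse networks and hide the quality of the links that actually exist.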
Figure 2.2: Boxplot for the PDR statistics of all 5 testbeds, including statistics
from all the datasets. Panels: (a) Lille, (b) Strasbourg, (c) Grenoble, (d) Soda,
(e) Tutornet.
It is clear from Figures 2.2a and 2.2b that the Lille and Strasbourg testbeds do
not show any variation in the quality of their links across different channels. The
nodes in these testbeds are confined to single rooms with no external sources of
interference and no dynamic features in the environment. It can be concluded from
the data that these two testbeds have fully-connected networks since the PDR is
not lower than 90% in any of the channels. Indeed, from an analysis of the statistics, it
could be noted that every node had N − 1 neighbors in all the experiments, where
N is the number of nodes in the network. These are non-realistic testbeds that
could not be used to test protocols.
The Grenoble testbed (shown in Figure 2.2c) has a higher degree of variability,
especially in the lower channels (11 to 16). This was expected, as the Grenoble
testbed is located in an office building with people moving around and some
sources of interference from Wi-Fi networks. From the raw data, we could
observe that each node had a large number of neighbors, although the network is
not fully connected. This testbed represents environments with very little
external interference and was used for some simulations in this dissertation.
The Soda and Tutornet testbeds (seen in Figures 2.2d and 2.2e) are the two
most realistic testbeds, since they have a much higher degree of PDR variability
across the channels. The Soda testbed has a median PDR above 80% in all the channels,
but lower PDRs in channels 13, 14, 20 and 23. From the raw data, we could infer
that each node had approximately N/2 neighbors in most of the channels, and that the
number of neighbors per node does not change much between different traces, which
suggests that all the transactions were made at times when there were similar levels
of external interference. Finally, the Tutornet testbed had much lower PDR values
in all the channels, including those located outside of the Wi-Fi bands (channels
25 and 26). This is a much more realistic testbed with high levels of external
interference. The experiments were carried out during both office hours and
non-business periods of the day, which explains the higher degree of variance of the PDR
levels. There is a wide variation in the number of neighbors in the Tutornet testbed,
both across different channels and across different network traces. An extensive
analysis of the number of neighbors of each testbed is provided in Section 2.3.3.
It is clear that frequency diversity is not useful for the Lille and Strasbourg
testbeds. However, it is important for the Grenoble and Soda testbeds; and is cer-
tainly essential for the Tutornet testbed. In the case of the Tutornet testbed, owing
to its highly dynamic environment, the choice of a suitable channel for each trans-
mission may improve the packet delivery ratio and ensure the minimum reliability
required by the applications.
2.3.2 Distribution of the number of channels with a high
PDR
Multi-path fading is one of the factors that affect the number of channels with high
quality for a given link. In some scenarios, multi-path fading can be even more
harmful than external interference. Multi-path fading is created by the multiple
copies of the signal that reach the receiver through different paths as a result of
reflections from surrounding obstacles. When a signal is transmitted, the receiver
node receives a line-of-sight component, together with replicas that traverse longer
paths by bouncing off nearby objects. Depending on the relative position
of the transmitter and receiver nodes, these replicas may seriously interfere with
the line-of-sight signal, or even improve its signal strength. Since the phase of the
replicas when they reach the receiver node depends on the frequency used, each
channel undergoes a distinct pattern of signal overlapping and hence is
affected by multi-path fading in a different way. Multi-path fading is common in
indoor deployments that are cluttered with reflective obstacles, such as metallic
racks. It can also occur when people are moving around.
We aim to analyze how many channels have a high PDR in each link. We
examined the traces for all the testbeds but first identified the existing links. A link
$l_{i,j}$, from node $i$ to node $j$, exists if there is a channel $c$ such that
$PDR^{c}_{l_{i,j}} > 0$. This means that if at least one packet goes through a link,
that link is included in our analysis. For each existing link, we counted
the number of channels where the PDR is higher than the threshold $PDR_{thr}$.
Figure 2.3 plots the histogram of the number of such channels for
$PDR_{thr} = 50\%$ (see footnote 7).
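As an illustration, the link-existence rule and the per-link channel count described above could be sketched as follows. This is a minimal sketch, not the tooling used for the dissertation: the dictionary layout `(i, j, channel) -> PDR` and the function names are assumptions made for the example, and PDR values are expressed as fractions in [0, 1].

```python
from collections import defaultdict

PDR_THR = 0.5  # PDR_thr = 50%, as in the text


def channels_above_threshold(pdr, thr=PDR_THR):
    """Count, for every existing link, the channels whose PDR exceeds thr.

    pdr maps (i, j, channel) -> PDR in [0, 1] (assumed layout).
    A link (i, j) exists if its PDR is > 0 in at least one channel.
    """
    per_link = defaultdict(dict)
    for (i, j, channel), value in pdr.items():
        per_link[(i, j)][channel] = value

    counts = {}
    for link, by_channel in per_link.items():
        if any(value > 0 for value in by_channel.values()):
            counts[link] = sum(1 for value in by_channel.values() if value > thr)
    return counts


def histogram(counts, n_channels=16):
    """hist[k] = number of existing links with exactly k channels above the threshold."""
    hist = [0] * (n_channels + 1)
    for k in counts.values():
        hist[k] += 1
    return hist


example = {
    (1, 2, 11): 0.97, (1, 2, 12): 0.40, (1, 2, 13): 0.51,
    (2, 3, 11): 0.00, (2, 3, 12): 0.00,  # nothing received: the link does not exist
    (3, 1, 11): 0.05, (3, 1, 12): 0.10,  # link exists but has no reliable channel
}
counts = channels_above_threshold(example)
hist = histogram(counts)
```

In this toy input, link (3, 1) is the kind of link that the histogram places in the zero bin: it exists, yet no channel reaches the threshold.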
It can be concluded from Figure 2.3 that practically all the links in the Lille and
Strasbourg testbeds have a PDR greater than 50% in all 16 channels. Many of the
links in the Grenoble testbed are in the same position. This means that it is not a
difficult task to pick the best channel for these links, since most of them have a wide
range of choices. On the other hand, the Soda and Tutornet testbeds have links
Footnote 7: We assume that $PDR_{thr} = 50\%$, as this value is considered to be the
minimum PDR required for link stability in products such as SmartMesh IP, which is an
industry leader in IEEE 802.15.4 TSCH-based networks [82].
[Figure 2.3: Histogram of the number of channels with PDR > 50% for all links.
Panels: (a) Lille, (b) Strasbourg, (c) Grenoble, (d) Soda, (e) Tutornet.]
where there is a greater variability with regards to the number of channels with
good quality. In the case of the Tutornet testbed, it is clear that a large number
of links do not have any reliable channel, which means that these links should be
avoided in the routing trees. Moreover, in the same testbed, a large number of links
only have a few channels with good quality, and thus selecting the right channel is
of great importance to ensure high reliability in these links.
In realistic environments, such as the Tutornet testbed, exploiting space diver-
sity is important since being able to select the neighbors with a larger number of
good channels may improve the delivery ratio if a frequency hopping scheme is
employed. In the case of single-channel allocation, the algorithm used to decide
the frequencies to be used by each node has to take into account the number of
channels with high PDR, which varies much more in deployments with external
sources of interference.
2.3.3 Average number of neighbors per channel
We calculated the number of neighbors with a stable link for each node, taking
a link to be stable when its PDR > 50%. We then compared the number of neighbors
per node across all 16 channels. Table 2.2 shows the average and the standard
deviation of the number of neighbors with a stable link.
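The per-channel statistic could be computed along the following lines. This is only a sketch under stated assumptions: the PDR dictionary layout and the names `neighbor_stats`, `nodes` and `channels` are hypothetical, and the population standard deviation is used because the text does not say which estimator was applied.

```python
from statistics import mean, pstdev


def neighbor_stats(pdr, nodes, channels, thr=0.5):
    """Return {channel: (mean, std)} of the number of stable neighbors per node.

    pdr maps (i, j, channel) -> PDR in [0, 1] (assumed layout); missing
    entries are treated as PDR = 0. Node j is a stable neighbor of node i
    on a channel when the PDR of link (i, j) exceeds thr on that channel.
    """
    stats = {}
    for channel in channels:
        counts = [
            sum(1 for j in nodes
                if j != i and pdr.get((i, j, channel), 0.0) > thr)
            for i in nodes
        ]
        stats[channel] = (mean(counts), pstdev(counts))
    return stats


nodes = [1, 2, 3]
pdr = {
    (1, 2, 11): 0.9, (2, 1, 11): 0.8, (1, 3, 11): 0.6, (3, 1, 11): 0.2,
    (1, 2, 12): 0.3, (2, 1, 12): 0.95,
}
stats = neighbor_stats(pdr, nodes, channels=[11, 12])
```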
On average, the Lille, Strasbourg and Grenoble testbeds have more than 30
neighbors per node, while the Soda testbed has an average of about 14 and the Tutornet
Table 2.2: Average number of neighbors with stable link (PDR > 50%) for all
nodes
Average number of neighbors (std. dev.)
Channel # Lille Strasbourg Grenoble Soda Tutornet
11 48.97 (1.16) 47.21 (2.47) 34.13 (8.5) 14.02 (4.83) 8.42 (2.93)
12 48.96 (1.16) 47.22 (2.4) 35.24 (8.25) 13.42 (4.88) 5.98 (2.63)
13 48.96 (1.16) 47.26 (2.43) 37.11 (8.02) 13.11 (4.83) 6.0 (2.4)
14 48.97 (1.16) 47.38 (2.05) 35.97 (8.23) 14.23 (4.99) 7.72 (3.03)
15 48.96 (1.16) 48.0 (0.05) 36.6 (7.85) 14.82 (5.12) 10.94 (4.04)
16 48.96 (1.16) 48.0 (0.06) 36.49 (8.0) 13.8 (4.91) 6.35 (3.26)
17 48.96 (1.16) 47.93 (0.31) 36.48 (7.99) 13.23 (5.11) 4.06 (2.86)
18 48.96 (1.16) 47.99 (0.11) 35.04 (8.32) 12.77 (5.14) 4.84 (3.04)
19 48.96 (1.16) 47.98 (0.29) 36.73 (7.8) 13.74 (5.17) 8.44 (3.33)
20 48.96 (1.16) 47.96 (0.43) 33.81 (8.5) 14.06 (5.36) 9.4 (2.97)
21 48.96 (1.16) 47.39 (2.01) 36.52 (7.88) 13.42 (5.09) 8.26 (2.94)
22 48.97 (1.16) 47.42 (1.88) 29.46 (7.85) 13.0 (4.78) 6.73 (2.61)
23 48.97 (1.16) 47.44 (1.89) 36.37 (8.21) 13.08 (4.68) 7.95 (2.86)
24 48.96 (1.16) 47.57 (1.45) 35.51 (8.53) 13.83 (4.75) 8.69 (2.81)
25 48.96 (1.16) 47.97 (0.31) 35.64 (8.52) 13.88 (5.05) 9.6 (3.15)
26 48.96 (1.16) 47.99 (0.18) 34.57 (8.42) 14.12 (5.04) 11.47 (3.46)
All 48.96 (1.16) 47.67 (1.53) 35.35 (8.38) 13.66 (5.01) 7.8 (3.64)
testbed fewer than 12 in every channel (7.8 on average). A large number of neighbors
per node, as in the first three testbeds, may impose limitations on the performance of
routing protocols such as RPL, which may not be able to store information about a large
number of neighbors in constrained devices (see footnote 8). This also means that any
node can be reached within one or two hops, which is not realistic for a multi-hop network.
The widest variation between channels occurs in the Tutornet testbed. In this
testbed, the number of neighbors that a routing algorithm should consider when
building the tree should vary according to the channel used for transmissions. When
Footnote 8: The neighbor table in OpenWSN is restricted to 30 nodes, and this number is
usually reduced when motes such as TelosB are used and there is a need for more RAM.
exploiting space diversity, the per-channel link quality is very important in
environments with high external interference, as in these situations the set of stable
neighbors varies according to the channel used. Even though a large number of
neighbors could suggest a greater degree of flexibility for the routing algorithm, it is
impractical to ensure a good link quality estimation for a large number of
neighbors owing to the overhead required for exchanging packets with all of them. If
properly designed, a link quality estimator that considers fewer neighbors may still
achieve good results with a smaller overhead.
2.3.4 PDR versus RSSI - the "waterfall plot"
We calculated the average RSSI from all the received packets in each of the
100-packet bursts. Then, we created a scatterplot of the PDR as a function of the
average RSSI. This plot reveals some interesting features of the different testbed
deployments. Owing to the shape of the plot, it is referred to as a "waterfall plot"
in [83]. Ideally, in an environment with very little interference, the PDR should
be 0% for RSSI below the receiver sensitivity level and 100% for RSSI above it.
Around the sensitivity value (see footnote 9), the PDR should increase linearly from
0% to 100%.
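The idealized curve described above can be written as a piecewise-linear function. The sketch below is illustrative only: the sensitivity of -94 dBm is the TelosB value quoted in the text, whereas the 4 dB width of the transition region is an arbitrary assumption, since the text does not specify how wide the linear region is.

```python
SENSITIVITY_DBM = -94.0  # TelosB sensitivity quoted in the text
TRANSITION_DB = 4.0      # assumed width of the linear region (hypothetical value)


def ideal_pdr(rssi_dbm, sensitivity=SENSITIVITY_DBM, width=TRANSITION_DB):
    """Idealized "waterfall": 0 below the transition, 1 above it, linear in between."""
    low = sensitivity - width / 2
    high = sensitivity + width / 2
    if rssi_dbm <= low:
        return 0.0
    if rssi_dbm >= high:
        return 1.0
    return (rssi_dbm - low) / width
```

Measured points that fall far below this curve at high RSSI are the ones that indicate external interference or multi-path fading rather than insufficient signal strength.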
Figure 2.4 shows the "waterfall plot" for the different testbeds. All the transactions
from the experiments, running in all the 16 channels, were taken into account.
In the Lille and Strasbourg testbeds, it can be seen that the minimum values of
the average RSSI measurements are between -80 and -60 dBm, which is well above
Footnote 9: The receiver sensitivity of TelosB motes is generally -94 dBm.
[Figure 2.4: PDR as a function of average RSSI - the "waterfall plot".
Panels: (a) Lille, (b) Strasbourg, (c) Grenoble, (d) Soda, (e) Tutornet.]
the radio sensitivity level. This is another reason for the very good link quality of
these testbeds, since all the nodes are very close to each other and the links have a
very short range. The Grenoble testbed has a "PDR vs RSSI" profile which is very
similar to the ideal situation, without much external interference. The links have
a long range, since several packets were received with power close to the
radio sensitivity. The two most challenging testbeds, Soda and Tutornet, show how
dispersed and unpredictable the "PDR vs RSSI" profile can become. This profile is
mainly affected by external interference and multi-path fading. It can be concluded
that even short links, with high RSSI values, may experience very low PDRs. The
particular behavior of the Soda testbed indicates that all the links had a long range,
since the maximum RSSI values obtained were close to -55 dBm. In contrast, the
Tutornet testbed had links with different ranges and wide PDR variability.
We will now focus on the two most realistic testbeds, Soda and Tutornet, and
show the "waterfall plot" individually for all 16 channels. Figure 2.5 shows the
plots for the Soda testbed, and Figure 2.6 those for the Tutornet testbed.
It can be seen that in the case of the Soda testbed (Figure 2.5), the shape of the
graphs for individual channels does not differ much. Channels with less interference,
such as 15, 20, 25 and 26, show less variability and sharper falls in the curve for
the lowest RSSI values. On the other hand, in the case of the Tutornet testbed
(Figure 2.6), each channel has a different "waterfall" shape. Less noisy channels
such as 20 and 26 show a sharper fall in the PDR, but the same did not happen
[Figure 2.5: "Waterfall" plot for all 16 channels in the Soda testbed.]
to channel 25, which was expected to suffer less external interference. This may
be related to other wireless devices working in the 2.4 GHz band that are used in the
same environment, such as Bluetooth and other proprietary technologies. It can be
concluded that if frequency diversity is to be exploited in environments with high
external interference levels, it is necessary to characterize the behavior of individual
channels, since the relationship between PDR and RSSI levels may vary widely
across different channels.
[Figure 2.6: "Waterfall" plot for all 16 channels in the Tutornet testbed.]
2.4 Lessons learned
From the very large number of experiments carried out in multiple testbeds, we
were able to obtain valuable information about the behavior of wireless links and
network connectivity in multiple channels.
Some lessons have been learned from the experiments:
(i) It is essential to analyze the realistic features of a testbed (or real deployment)
to decide whether or not diversity schemes are important for the protocols.
The software overhead and complexity of the protocols may discourage the
exploitation of diversity in interference-free environments. On the other hand,
diversity is of paramount importance for environments with high levels of
external interference, as can be seen from the experiments in realistic testbeds.
(ii) Statistical information such as (a) the distribution of PDR across channels,
(b) the number of channels with high PDR per link, (c) the average number
of neighbors per node, and (d) the "PDR versus RSSI" profile, is a useful
means of assessing the degree of realism of a testbed and/or real deployment.
These statistics can be analyzed offline to determine how accurate a protocol
evaluation will be in a given environment. Moreover, they are a useful means
of deciding whether or not the employment of diversity schemes is important
in a network.
(iii) The "PDR versus RSSI" profile, shown by the "waterfall" plot, provides
insightful information about the link qualities of a network. Although it is easy
to measure RSSI for each packet, combining the averaged RSSI values with
PDR is a challenging task because it requires feedback to detect the packet
losses and the use of a time-window mechanism. However, this information
is very useful because it can enable the nodes to learn the shape of the
"waterfall" curve in each channel with a few measurements, and then passively
estimate the link quality of multiple neighbors by overhearing packets and
keeping track of the RSSI values.
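The idea in lesson (iii) could be prototyped as a small lookup table that is filled from a few measured time windows and then queried with the RSSI of overheard packets. This is a hypothetical sketch rather than a mechanism described in the dissertation: the class name, the 2 dB bin width and the window interface are all assumptions chosen for the example.

```python
from collections import defaultdict


class WaterfallEstimator:
    """Per-channel table that maps an RSSI bin to the PDR observed in that bin."""

    def __init__(self, bin_width_db=2):
        self.bin_width_db = bin_width_db  # assumed bin size
        self.sent = defaultdict(int)
        self.received = defaultdict(int)

    def _key(self, channel, rssi_dbm):
        return (channel, int(rssi_dbm // self.bin_width_db))

    def record_window(self, channel, avg_rssi_dbm, n_sent, n_received):
        """Learning step: one time window with known sent/received counts."""
        key = self._key(channel, avg_rssi_dbm)
        self.sent[key] += n_sent
        self.received[key] += n_received

    def estimate_pdr(self, channel, overheard_rssi_dbm):
        """Passive step: look up the PDR learned for this RSSI bin, if any."""
        key = self._key(channel, overheard_rssi_dbm)
        if self.sent.get(key, 0) == 0:
            return None  # nothing learned yet for this bin
        return self.received[key] / self.sent[key]


estimator = WaterfallEstimator()
estimator.record_window(channel=26, avg_rssi_dbm=-91.2, n_sent=100, n_received=60)
estimator.record_window(channel=26, avg_rssi_dbm=-90.5, n_sent=100, n_received=80)
learned = estimator.estimate_pdr(26, -91.0)
unknown = estimator.estimate_pdr(26, -70.0)
```

Keeping one table per channel reflects the observation that the curve shape can differ widely between channels in environments with high interference.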
2.5 Conclusions
Diversity is an important tool for improving the performance of low-power wireless
networks since simple radios can be adopted in these networks and also because of
the widespread presence of external interference in the 2.4 GHz ISM band. When
exploiting diversity, it is important to understand how the link quality varies in
space, over time and across different channels. The use of testbeds makes it possible
to understand the best strategies that can be used to exploit diversity in wireless
networks, as well as to test and evaluate new protocols and algorithmic strategies.
We conducted experiments in 5 different testbeds to obtain link quality
statistics and understand the dynamics of low-power wireless networks in real
environments. Based on the experiments, it can be concluded that the degree of realism of
indoor testbeds varies according to the environment where the nodes are deployed. The
testbeds deployed in single rooms with low levels of external interference (i.e., the Lille
and Strasbourg testbeds) showed very stable links with good quality and no need to
exploit diversity. Two other testbeds that were deployed in workplaces (Grenoble
and Soda) showed wider variations in link quality that can be exploited through space
and frequency diversity. Finally, the Tutornet testbed showed the widest variation
in link quality among all 5 testbeds. This is the most realistic testbed in
terms of external interference and multi-path fading. It was clear that the need to
employ diversity schemes increases as the environment becomes more dynamic and
interference becomes more common.
Chapter 3
Flooding-based reliable TSCH
The 2.4 GHz ISM band is shared with many other wireless systems and devices (e.g.,
Wi-Fi, Bluetooth, RFID and microwave ovens) (see footnote 1). In industrial
deployments, the topology and connectivity are affected by the high levels of RF noise,
the reflective nature of machinery materials, and even vibration and other factors that
can cause nodes to malfunction. When employing LLNs in industrial applications, there are
three main requirements that must be met: (i) high reliability, (ii) low latency, and
(iii) low energy consumption.
Most industrial applications can be classified as either deterministic or
non-deterministic. Deterministic applications include periodic data collection and
control loops [84] employed in process control. Data collection is more often discussed
in the literature concerning wireless sensor networks, where throughput and energy
consumption are optimized. On the other hand, control loops have not yet been
thoroughly investigated. This last type of application requires less throughput, but
Footnote 1: This chapter includes work from [3, 60].
is more stringent in terms of reliability (in the order of 99.999%) and delay (in the
order of milliseconds) [85]. Non-deterministic applications include alarm systems
and the actuation of sensor nodes. The former consists of event-triggered messages
sent from sensor nodes towards the sink, while the latter consists of messages sent
from the sink to the sensors. Non-deterministic event-triggered messages sent from
the sensors to the sink are challenging because they may also require high reliability
and low latency. Unlike deterministic applications, the relay nodes are not able to
sleep most of the time in non-deterministic scenarios, since they do not know when
they will receive the packets that have to be forwarded towards the destination
(usually the sink node).
In this chapter, we present the Flooding-Based Reliable TSCH (FBR-TSCH)
as a solution for event-triggered applications, where the sensor nodes generate
non-deterministic signals that have to be forwarded to the sink. The network topology
is unknown to the network designer and changes dynamically during network
operation. The protocol may need to withstand high levels of interference across the
entire 2.4 GHz band, generated by both non-intentional and intentional
interferers. The solution was designed to be simple and yet efficient in
optimizing 3 different and conflicting criteria: (i) reliability, (ii) end-to-end latency
and (iii) energy consumption.
FBR-TSCH was evaluated through simulations and in a real scenario. The
purpose of the simulations was to fine-tune the few parameters required and
calculate the limits of the proposal. In the simulations, we compared FBR-TSCH with
the baseline Routing Protocol for Low-Power and Lossy Networks (RPL). In the real
scenario, FBR-TSCH was compared with 8 other state-of-the-art protocols.
The main contributions of this work are as follows:
(i) the design of a simplified TSCH-based MAC protocol for optimizing
non-deterministic applications in noisy environments;
(ii) simulation-based evaluations that compare the broadcast-based
FBR-TSCH with the unicast-based RPL protocol;
(iii) the implementation and empirical evaluation of the FBR-TSCH protocol on a
real testbed (see footnote 2).
3.1 Related work
Conventional routing approaches used for data collection employ gradient-based
protocols such as CTP [86] and RPL [87]. These are "best-effort" protocols that
rely on unicast messages and exploit high-quality links. Since traditional routing
protocols require the construction and maintenance of routing trees, they
may incur large latency (on the order of seconds) in dynamic environments [88].
Footnote 2: The source code of FBR-TSCH can be found at
https://github.com/pedrohenriquegomes/fbr-tsch.
Flooding-based approaches have been employed in scenarios where delays of only a
few milliseconds are required and the network dynamics are unpredictable. In
addition to spatial diversity, flooding protocols are also able to exploit frequency
diversity via multi-channel protocols and employ other techniques, such as
constructive interference, where multiple transmissions of the same packet from
synchronized nodes increase the chances of successful packet reception. The drawbacks
of flooding are its high energy consumption and the fact that it may cause
network congestion, since the packets are retransmitted by multiple nodes and, if not
properly controlled, the retransmissions can occupy the medium for a long time.
Constructive interference has been exploited in LLNs by numerous protocols [89,
90, 5, 1] as a way of improving packet reception in noisy and interference-prone
environments. Glossy [89] was the first proposal to implement this approach in
Tmotes. The authors in [89] demonstrate that, for IEEE 802.15.4 radios, a
time skew of at most 0.5 µs is necessary to ensure a high probability of successful
packet reception. Glossy is based on periods of flooding activity, interspersed
with periods of regular network operation. Within each flooding period, whenever
an initiator node transmits a packet, all the others relay the packet concurrently
up to N times. The packets carry a counter c that tracks the number of
retransmissions. Since the duration of the retransmission is the same for all the
nodes, c is also used to synchronize them with the initiator.
Glossy can be enhanced with frequency hopping techniques to also take advan-
tage of frequency diversity [90, 5]. The channel that will be used can be inferred
from the data sequence number or the packet relay counter c.
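To make the idea concrete, one possible way of deriving the channel from the two counters is sketched below. The mapping is purely hypothetical and is not the rule used in [90] or [5]; it only shows that every node holding the same sequence number and relay counter arrives at the same channel without any extra signaling.

```python
# The 16 IEEE 802.15.4 channels in the 2.4 GHz band (11 to 26).
CHANNELS = list(range(11, 27))


def hop_channel(seq_no, relay_counter, channels=CHANNELS):
    """Hypothetical mapping: rotate through the channel list using both counters."""
    return channels[(seq_no + relay_counter) % len(channels)]


# Channels visited by one packet (sequence number 7) over 16 successive relays.
visited = [hop_channel(7, c) for c in range(16)]
```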
Sparkle [2] is a proposal that employs a sequence of time slots, each running
Glossy-based flooding for a specific purpose. While one specific slot is used for
synchronization, a few others are used for data dissemination and control messages.
Sparkle also employs topology control to determine the optimal subset of nodes that
can be used as relays and the best power setting for reducing energy consumption.
Chaos protocol builds an all-to-all information dissemination protocol on top of
Glossy. It implements a sequence of network-wide computations and data aggre-
gation to optimize dissemination. Robust Chaos [6] employed the same framework
with the aid of channel hopping and blacklists to improve performance in environ-
ments with high levels of interference.
Finally, RedFixHop [1] improved the delay obtained by Glossy even further by
taking advantage of hardware-generated acknowledgment packets. In this
approach, data packets are generated by the nodes when the packet relay counter
c is even; for odd values of c, hardware-generated ACKs are employed, which
completely bypass the microcontroller and improve the delay and synchronization.
Constructive interference has been shown to make many protocols more reliable.
It has one serious drawback, however, which is the tight synchronization
requirement. Sub-microsecond synchronization with a high-precision time base may not
be feasible in a network with heterogeneous sensor nodes (e.g., different hardware)
or in environments with wide differences in temperature.
Low Power Listening (LPL) MAC protocols [91] have been employed in duty-
cycled networks to save energy while still providing reliable communication. In
these protocols, the nodes are coarse-grained synchronized with their neighbors
(i.e., there is no network-wide synchronization requirement). Either the receivers
or transmitters periodically sample the medium to detect network activity and
decide whether or not to stay awake to receive the packets; when no activity is
detected, the nodes are expected to sleep until the next sampling period.
One frequent problem of LPL protocols is that of false wakeups, which
occur when a node keeps its radio on due to noise energy or activity from
other networks. The authors in [7] addressed this problem by proposing the
ContikiMAC protocol with DCCA (Differentiating Clear Channel Assessment), which
dynamically adjusts the CCA threshold and reduces the chances of false wakeups.
The concept of cognitive radio has been widely explored as a means of mitigating
external interference. The authors in [4] put forward a practical solution that works
on top of LPL protocols and allows multi-channel operation. In this system, the
channel selection process is modeled as a multi-armed bandit problem and the
Thompson sampling algorithm is employed to pick the best channels.
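For readers unfamiliar with the technique, a generic Beta-Bernoulli Thompson sampling loop for channel selection is sketched below. It illustrates the general algorithm only and is not the system proposed in [4]: the class name, the uniform Beta(1, 1) prior and the toy delivery probabilities are assumptions made for the example.

```python
import random


class ThompsonChannelSelector:
    """Beta-Bernoulli Thompson sampling over a set of channels (arms)."""

    def __init__(self, channels, rng=None):
        self.channels = list(channels)
        self.successes = {c: 0 for c in self.channels}
        self.failures = {c: 0 for c in self.channels}
        self.rng = rng or random.Random()

    def pick(self):
        # Draw one sample from each channel's Beta posterior (uniform prior)
        # and transmit on the channel with the largest sample.
        samples = {
            c: self.rng.betavariate(self.successes[c] + 1, self.failures[c] + 1)
            for c in self.channels
        }
        return max(samples, key=samples.get)

    def update(self, channel, delivered):
        if delivered:
            self.successes[channel] += 1
        else:
            self.failures[channel] += 1


def simulate(true_pdr, rounds, seed=1):
    """Run the selector against fixed (toy) per-channel delivery probabilities."""
    rng = random.Random(seed)
    selector = ThompsonChannelSelector(true_pdr, rng)
    picks = {c: 0 for c in true_pdr}
    for _ in range(rounds):
        channel = selector.pick()
        picks[channel] += 1
        selector.update(channel, rng.random() < true_pdr[channel])
    return picks


picks = simulate({15: 0.2, 20: 0.9, 26: 0.4}, rounds=2000)
```

Because poorly performing channels keep a small probability of being sampled, the selector can still notice when a previously bad channel improves, although a real deployment would also need some form of forgetting for non-stationary interference.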
Our FBR-TSCH proposal [3] is based on the time slot structure specified in the
IEEE 802.15.4 TSCH standard, which only requires coarse-grained synchronization.
As in other works, our solution makes use of frequency hopping and periods of
flooding activity, followed by periods of energy saving. The main advantages of our
protocol are that it supports different hardware platforms, which is hard to achieve
with protocols based on constructive interference, and that it does not necessarily need
fine-tuned parameters or a learning phase, which are required by most of the
LPL protocols.
3.2 Routing protocol for low-power and lossy networks
(RPL)
The RPL is a distance-vector routing protocol designed for 6LoWPAN networks.
It creates Destination-Oriented Directed Acyclic Graphs (DODAGs) towards the
sink node [92]. A DODAG is a tree-like topology that supports both downstream
and upstream traffic. RPL separates the packet processing and forwarding from
the routing optimization via its Objective Function (OF). An OF describes how
nodes should convert one or more metrics and/or constraints into a DAGrank,
which represents the node's distance from the DODAG root. The DAGrank strictly
increases in the downward direction. It is used by the RPL to detect and avoid loops
and allows the nodes to distinguish between their candidate parents and children
or sibling nodes.
The DODAG formation starts at the sink, which generates periodic DODAG
Information Object packets (DIOs). The non-sink nodes listen to the DIOs and join
the DODAG. Every node that joins the routing tree starts to periodically broadcast
the DIOs so that the graph can be extended towards the leaf nodes. Each DIO
packet contains the node's most recent DAGrank and the nodes store the DAGrank
from all the neighbors. During the topology construction, a subset of stable nodes
is chosen from the list of neighbors and a preferred parent is selected based on the
OF. Different OFs can be designed to comply with specific optimization criteria
and satisfy the different requirements of the applications, such as minimum energy
consumption or minimum end-to-end latency. The design of efficient Objective
Functions is still a matter of research [88], especially in the context of industrial
networks.
RFC 6719 [93] defines the Minimum Rank with Hysteresis Objective Function
(MRHOF), which has been adopted as the default implementation of RPL in most
LLNs. MRHOF uses a hysteresis mechanism to prevent unnecessary parent switches
caused by small metric changes. It also supports different types of metrics,
such as hop count, latency, etc. The most commonly used metric, however, is
the ETX (Expected Transmission Count) [63]. In summary, MRHOF is a greedy
objective function that minimizes the end-to-end ETX from the sensor nodes to
the sink.
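The hysteresis rule can be illustrated with a short sketch of ETX-based parent selection. It is a simplification written for this example, not an implementation of RFC 6719: the link ETX is approximated as 1/PDR (independent losses, ACK losses ignored), and the switching threshold of 1.5 ETX is an assumed value.

```python
def etx(pdr):
    """Approximate link ETX as 1/PDR (assumes independent losses, ignores ACK loss)."""
    return float("inf") if pdr <= 0 else 1.0 / pdr


def select_parent(current, candidates, threshold=1.5):
    """Pick a preferred parent using a minimum-path-ETX rule with hysteresis.

    candidates maps neighbor -> (path ETX advertised by the neighbor,
                                 PDR of the link to that neighbor).
    The current parent is kept unless the best candidate improves the
    path cost by more than `threshold` (assumed value).
    """
    cost = {n: advertised + etx(pdr) for n, (advertised, pdr) in candidates.items()}
    best = min(cost, key=cost.get)
    if current not in cost:
        return best
    if cost[best] < cost[current] - threshold:
        return best
    return current


candidates = {"A": (2.0, 0.8), "B": (1.0, 0.9)}  # path costs: A = 3.25, B ~ 2.11
kept = select_parent("A", candidates)                       # gain ~1.14 < 1.5: keep A
switched = select_parent("A", candidates, threshold=0.5)    # gain ~1.14 > 0.5: switch
first = select_parent(None, candidates)                     # no parent yet: take best
```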
RPL uses a single path towards the sink, which in scenarios with high interfer-
ence may lead to greater delays (due to a larger number of retransmissions) and
unreliability. Another issue examined by several papers is the slow convergence of
RPL and route reconstruction in the face of lossy links, which makes it unsuitable
for critical networks. Since nodes only exchange data with the neighbor with the
lowest DAGrank, they do not have much knowledge of the link quality of other
neighbors and may take a long time to find a better route when interference levels
change. This issue is explored in greater detail in Chapter 5.
3.3 FBR-TSCH
The FBR-TSCH protocol has been optimized for dynamic environments with high
levels of external interference, as well as for critical applications.
The protocol was designed to meet the requirements set for the International
Conference on Embedded Wireless Systems and Networks (EWSN 2016) depend-
ability competition, which are as follows:
(i) the number of nodes in the network and their location is unknown and may
change over time;
(ii) the initialization process in the network should be fast and only take about 10
seconds;
(iii) external interference can be expected from non-intentional and intentional
sources and spans the whole 2.4 GHz band;
(iv) the network must support multi-hop operation;
(v) the sensor node generates non-deterministic signals (on-off) at a slow rate (at
intervals of a few seconds) and this information must be received by the sink.
Based on the above requirements, several design decisions were made to optimize
the operation of TSCH at the link layer and other protocols at higher layers. Since
the topology is dynamic and the network initialization has to be fast, it is not
feasible to create and maintain routing trees. For this reason, FBR-TSCH adopts
flooding for the network layer. As there is only a single application on the network,
the transport layer is omitted. At the application layer, the binary on-off signal
is contained in a single bit and a sequence number is used to control the network
flooding. Packets with a given sequence number are only retransmitted once by
each relay node.
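The relay-once rule can be sketched with a toy flooding simulation. The sketch assumes loss-free links and an explicit adjacency list, neither of which holds in the real protocol, and all names (`RelayNode`, `flood`, the example topology) are hypothetical; it is not taken from the FBR-TSCH source code.

```python
from collections import deque


class RelayNode:
    """Remembers which sequence numbers have already been handled."""

    def __init__(self):
        self.seen = set()

    def should_relay(self, seq_no):
        if seq_no in self.seen:
            return False  # duplicate: drop it
        self.seen.add(seq_no)
        return True


def flood(topology, nodes, source, sink, seq_no):
    """Flood one packet; return (transmissions per node, delivered to sink?)."""
    tx_count = {n: 0 for n in topology}
    delivered = False
    nodes[source].should_relay(seq_no)  # the source marks its own packet as seen
    queue = deque([source])
    while queue:
        sender = queue.popleft()
        tx_count[sender] += 1  # one broadcast reaches every neighbor
        for neighbor in topology[sender]:
            if neighbor == sink:
                delivered = True  # the sink consumes the packet, it does not relay
            elif nodes[neighbor].should_relay(seq_no):
                queue.append(neighbor)
    return tx_count, delivered


topology = {
    "S": ["A", "B"],
    "A": ["S", "B", "sink"],
    "B": ["S", "A", "sink"],
    "sink": ["A", "B"],
}
nodes = {name: RelayNode() for name in topology}
tx_first, ok_first = flood(topology, nodes, "S", "sink", seq_no=1)
tx_second, ok_second = flood(topology, nodes, "S", "sink", seq_no=2)
```

Each relay transmits exactly once per sequence number, which bounds the cost of one flood to one transmission per node regardless of how many copies a node overhears.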
A few adaptations were also required at the link layer. First, since
FBR-TSCH relies solely on the flooding mechanism, there is no need for link-layer ACK
packets. Hence, the length of the time slot was reduced and the portion reserved
for ACK packets was eliminated. The maximum packet size was also taken
into account to reduce the time slot duration from 10 ms to 4 ms. Figure 3.1 shows
a typical IEEE 802.15.4 TSCH time slot structure and the optimized time slot used
by FBR-TSCH.
[Figure 3.1: Time slot structure (IEEE 802.15.4 TSCH and FBR-TSCH).
Panels: (a) IEEE 802.15.4 TSCH time slot, (b) FBR-TSCH time slot.]
FBR-TSCH requires network-wide time synchronization, whereas the other
proposals that are based on LPL protocols require synchronization only between pairs
of nodes [7, 4]. The Glossy-based protocols, in turn, implicitly obtain
synchronization by employing constructive interference [1, 5]. According to the
IEEE 802.15.4 standard, each node must select at least one time source neighbor to
remain synchronized with the network. Each node periodically sends Beacons containing a
join priority field that represents the distance from the node to the sink (the join
priority of the sink is 0 and increases downwards). All the nodes are expected to
join the network and remain synchronized with the neighbor that has the lowest
join priority (i.e., the node closest to the sink).
The join priority is usually obtained from the routing protocol (e.g., RPL),
which is designed to form a loopless structure. RPL and other gradient-based
protocols rely on acknowledged packets to calculate the ETX and pick reliable
neighbors as their preferred parents. A problem that arose while designing the
FBR-TSCH was the fact that it only operates with broadcast messages and does
57
not have a network layer that builds a tree structure. This problem was overcome
by replacing the concept of join priority with the DAGrank calculated by the RPL
protocol. The calculation of the DAGrank accounts for the number of hops to
the sink and the quality of the link to each neighbor. Since the beaconing rate is
fixed for all the nodes and inter-node interference is mitigated by random access,
we estimate the link quality in terms of the number of received beacons during a
fixed interval. The link quality estimation performed with beacons is a downlink
estimation. Even though wireless links are generally asymmetrical [94], downlink
estimation was used as a rough estimation for the uplink direction.
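The beacon-counting estimate described above can be sketched as follows. This is only a minimal illustration and not the FBR-TSCH implementation: the class name `BeaconLinkEstimator`, the window handling, and the expected number of beacons per window are assumptions introduced here.

```python
from collections import defaultdict


class BeaconLinkEstimator:
    """Per-neighbor downlink quality as received/expected beacons (sketch)."""

    def __init__(self, expected_per_window):
        # Hypothetical parameter: beacons a neighbor would deliver over a
        # perfect link during one fixed observation window.
        self.expected = expected_per_window
        self.received = defaultdict(int)

    def on_beacon(self, neighbor_id):
        """Count one beacon heard from the given neighbor."""
        self.received[neighbor_id] += 1

    def close_window(self):
        """Return {neighbor: ratio in [0, 1]} and start a new window."""
        quality = {n: min(1.0, c / self.expected)
                   for n, c in self.received.items()}
        self.received.clear()
        return quality


est = BeaconLinkEstimator(expected_per_window=20)
for _ in range(18):
    est.on_beacon("A")
for _ in range(5):
    est.on_beacon("B")
quality = est.close_window()
print(quality)
```

Because the ratio is measured on received beacons, it describes the downlink only; using it for the uplink is the rough approximation mentioned in the text.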
Loops are detected by the RPL protocol when an inconsistency occurs between
the packet routing decision and the DAGrank announced by the transmitting node
(e.g., the packets routed upwards from a node with lower rank). In FBR-TSCH,
the loop detection is implemented by using a unique sequence number in all the
beacons that is only incremented by the sink and replicated by all the nodes. The
nodes do not synchronize to beacons that carry older sequence numbers.
Whenever a node (or a subset of nodes) is disconnected from the tree that has the
sink, it will not receive beacons with a new sequence number, and as a result, will
desynchronize and eventually re-join the network.
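The sequence-number rule can be illustrated with a short sketch. The class `BeaconSync`, its method names, and the `max_missed_intervals` threshold are hypothetical; the only behavior taken from the text is that a beacon is accepted when its sequence number is newer than the last one seen, and that a node without fresh beacons eventually desynchronizes.

```python
class BeaconSync:
    """Decides whether a node may synchronize to a received beacon (sketch)."""

    def __init__(self, max_missed_intervals=10):
        self.last_seq = -1                      # newest sequence number seen
        self.missed = 0                         # intervals without a fresh beacon
        self.max_missed = max_missed_intervals  # hypothetical threshold
        self.synchronized = False

    def on_beacon(self, seq):
        """Accept only beacons that carry a newer sequence number."""
        if seq <= self.last_seq:
            return False   # stale copy: ignore it to avoid timing loops
        self.last_seq = seq
        self.missed = 0
        self.synchronized = True
        return True

    def on_interval_without_fresh_beacon(self):
        """Called when an interval elapses with no acceptable beacon."""
        self.missed += 1
        if self.missed >= self.max_missed:
            self.synchronized = False   # the node must re-join the network


node = BeaconSync(max_missed_intervals=3)
accepted = [node.on_beacon(s) for s in (5, 5, 4, 6)]
for _ in range(3):
    node.on_interval_without_fresh_beacon()
print(accepted, node.synchronized)
```

A subtree cut off from the sink only ever sees sequence numbers it already knows, so every beacon inside it is rejected and its nodes time out.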
The schedule followed by FBR-TSCH is only based on shared time slots (where
any node can transmit a packet). A special type of time slot is designated where
only beacon packets are allowed to be transmitted. This was introduced to avoid
interference between the beacons and data packets. CSMA/CA is not employed
within all the shared time slots since optimal CCA levels are hard to estimate in
noisy environments. A simple random access technique is employed instead. Nodes
transmit with probability p at any data time slot whenever there is a packet to be
transmitted.
3.4 Simulation results
We initially validated FBR-TSCH and assessed its performance by simulations,
comparing it with RPL employing the Minimum Rank with Hysteresis Objective
Function (MRHOF).
A simulator was written in C to create a schedule that follows the
specifications of the FBR-TSCH proposal. One time slot is reserved for beacons,
followed by a variable number of shared time slots for data packets, and a variable
number of inactive time slots for energy saving. The duty cycle can be changed,
as well as the probability p, which a node uses to transmit packets whenever there
is one in its queue. The simulator takes a set of connectivity traces as input and
uses these traces to model the external interference. Whenever two nodes decide
to transmit in the same time slot, both packets are lost.
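The effect of the transmission probability p under this collision model can be explored with a small calculation. The sketch below is a simplification and not the simulator used in this work: it assumes that all contending nodes share a single collision domain and always have a packet queued, and the function name is hypothetical.

```python
def slot_success_probability(n_contenders, p):
    """Probability that exactly one of n backlogged nodes transmits in a slot.

    Assumes one collision domain: if two or more nodes transmit, all packets
    in that slot are lost, as in the collision model described above.
    """
    return n_contenders * p * (1 - p) ** (n_contenders - 1)


for p in (0.1, 0.2, 0.4, 1.0):
    print(p, round(slot_success_probability(5, p), 4))
```

With five contenders the value peaks near p = 1/5, which illustrates why a large p hurts once several nodes relay the same flood.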
When carrying out the simulations, we used the 40-node datasets obtained from
the Tutornet testbed (Section 2.2) since it is the scenario that has the highest levels
of external interference. The sensor node randomly generates a new signal with an
average interval of 10 seconds and the simulation lasts 35 minutes.
Three criteria were analyzed during the evaluation:
(i) delay - average time that packets take to travel from a sensor node to
the sink;
(ii) reliability - percentage of data packets that were correctly received by the
sink;
(iii) energy consumption - energy consumed by all the sensor nodes.
The first stage in our evaluation is to find the optimal parameters for our
scenario. Since there are three conflicting metrics in our evaluation
(reliability, energy, and delay), we created a multi-objective function to be
maximized, as follows:
\[ OF_i = r_i - \frac{e_i}{\max_j e_j} - \frac{d_i}{\max_j d_j} \tag{3.1} \]
where $r_i$ is the reliability, i.e., the percentage of packets that were
successfully received by the sink node, in test case $i$, and the two fractions
involving $e_i$ and $d_i$ represent the normalized energy and delay, respectively.
In the objective function in Equation 3.1, all three metrics have the same weight.
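As a rough illustration of how such an objective function ranks test cases, the sketch below scores three made-up parameter settings. It rests on two assumptions that should be checked against Equation 3.1: the normalized energy and delay are subtracted from the reliability, and the reliability is expressed as a fraction between 0 and 1. The input values are hypothetical and are not results from this work.

```python
def objective(results):
    """Score test cases given as (reliability, energy, delay) tuples.

    ASSUMPTION: normalized energy and delay enter with a minus sign and
    equal weight; reliability is a fraction in [0, 1].
    """
    e_max = max(e for _, e, _ in results)
    d_max = max(d for _, _, d in results)
    return [r - e / e_max - d / d_max for r, e, d in results]


# Hypothetical test cases: (reliability, energy, delay in time slots).
cases = [(0.95, 400.0, 120.0), (0.99, 800.0, 60.0), (0.60, 200.0, 240.0)]
scores = objective(cases)
best = max(range(len(cases)), key=scores.__getitem__)
print(scores, best)
```

Normalizing by the largest observed energy and delay keeps every term on a comparable scale, so no single metric dominates the score.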
In the simulations, we set the sensor node at node #40, which is located in a
position that requires at least 3 hops to reach the sink node (set as node #1).
We varied the duty-cycle of the network and the probability p. Figures 3.2
and 3.3 show the value of the objective function as a function of the duty-cycle and
as a function of probability p.
Figure 3.2: Objective function versus p with different duty-cycles
Figure 3.3: Objective function versus duty-cycle with different p
It can be concluded that, for the topology under consideration, the optimal
duty-cycle is 0.2 and the optimal probability p is 0.1. We also performed the
same analysis while changing the sensor node responsible for generating the
signal (we also included nodes #5, #10, #20, #25, #30 and #35). Across all these
different scenarios, the optimal duty-cycle was also equal to 0.2, except for
sensor nodes #10 and #5, which have a one-hop path to the sink and can afford
lower duty-cycles of 0.1 and 0.05, respectively. The optimal probability p was
also equal to 10% for sensor nodes #35, #30 and #25, and 20% for sensor nodes
#20 and #10. Sensor node #5 had an optimal probability p equal to 40%.
The RPL protocol was implemented with the default MRHOF. We avoided the effects
that the long convergence time and eventual loops can have on the analysis by
running the RPL experiments in two phases. In the first phase, no data traffic
was included and only the DIO and Keep-Alive packets were simulated to generate
the routing tree. This phase lasts 35 minutes. With the final tree formed in the
first phase, we ran another 35-minute simulation with data traffic and a fixed
routing tree. In RPL, there are no concurrent transmissions in the network, and
thus the probability of packet transmission (p) should be 100% to maximize the
results. Hence, only the duty-cycle is varied in the RPL simulations. All the
packets are unicast and the number of link-layer retransmissions is equal to 3.
FBR-TSCH, configured with the optimal parameters found previously, was compared
with RPL. Figures 3.4 and 3.5 show the results of the delay analysis, Figures 3.6
and 3.7 show the reliability, and, finally, Figures 3.8 and 3.9 show the energy
consumption.
Figure 3.4: Delay versus duty-cycle
Figure 3.5: Delay versus p
In Figure 3.4 it is clear that RPL obtains lower delay when compared with FBR-
TSCH, but compared with the results in Figure 3.5, it can be concluded that the
advantage of RPL is entirely due to its higher probability of transmission, which
is 100% for RPL and 20% for FBR-TSCH. When p is equal to 100%, FBR-TSCH
outperforms RPL, with a delay lower than 50 time slots.
From an analysis of Figure 3.6, it can be seen that there is a trade-off between
delay and reliability. Even though RPL obtained lower delay, the number of packets
that could be delivered to the sink was close to half of the number delivered by
FBR-TSCH. The very low reliability of RPL shows that single-path routing is not
an option for critical applications. In Figure 3.7, it can be seen that a larger
probability p incurs more losses to FBR-TSCH, owing to the collisions caused by
simultaneous transmissions. There is a critical value, at p = 10%, where there is a
sharp decline in the reliability of the network.
Figure 3.6: Reliability versus duty-cycle
Figure 3.7: Reliability versus p
Figure 3.8: Energy consumption versus duty-cycle
Figure 3.9: Energy consumption versus p
Finally, Figure 3.8 shows that both FBR-TSCH and RPL consume practically
the same amount of energy. This can be explained by the fact that most of the
power consumption is caused by the radio being in receive mode. We took the
consumption values specified in the Tmote datasheet. The receive current is
23 mA, while the transmit current is 21 mA. Furthermore, since the data packets
are very short (16 bytes), the time that a node spends in receive mode while
receiving a packet (T_data plus up to 2 × RxGuardTime) is very close to the time
spent by a node that does not receive anything in the current time slot
(2 × RxGuardTime). Using the values employed in our implementation, a node spends
960 µs in receive mode before detecting that there is no network activity, while
a node that correctly receives a packet may spend between 512 and 1,472 µs in
receive mode. On average, both situations require the same amount of energy. One
way to optimize power consumption is by reducing the guard times, but this may
cause network desynchronization in real deployments. We discuss how to improve
the guard times later, in Section 3.5. Figure 3.9 shows that FBR-TSCH has a very
small variation of energy consumption with different values of p.
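The receive-time figures quoted above can be reproduced with a few lines of arithmetic. The sketch assumes the 250 kbit/s data rate of the IEEE 802.15.4 2.4 GHz O-QPSK physical layer (32 µs per byte), which matches the 512 µs lower bound for a 16-byte packet, and it assumes that packet arrivals are uniformly spread over the listening window. The variable and function names are illustrative only.

```python
BYTE_TIME_US = 32        # one byte at 250 kbit/s (assumed PHY data rate)
PACKET_BYTES = 16        # data packet length stated in the text
LISTEN_WINDOW_US = 960   # receive time of a slot with no activity (from the text)
RX_CURRENT_MA = 23       # receive current stated in the text

t_data_us = PACKET_BYTES * BYTE_TIME_US        # airtime of one data packet
rx_min_us = t_data_us                          # packet starts as the radio turns on
rx_max_us = t_data_us + LISTEN_WINDOW_US       # packet starts at the end of the window
rx_mean_us = (rx_min_us + rx_max_us) / 2       # assumes uniformly distributed arrival


def charge_uc(time_us, current_ma=RX_CURRENT_MA):
    """Charge in microcoulombs drawn while the radio stays in receive mode."""
    return current_ma * 1e-3 * time_us         # amperes times microseconds


print(t_data_us, rx_max_us, rx_mean_us)
print(charge_uc(LISTEN_WINDOW_US), charge_uc(rx_mean_us))
```

Under these assumptions the mean receive time with a packet differs from the idle listening time by only a few percent, which is consistent with the near-identical energy figures of the two protocols.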
We calculated the same objective function with Equation 3.1 for the RPL sim-
ulations and compared the results with FBR-TSCH (Figure 3.10).
Figure 3.10: Objective function versus duty-cycle for FBR-TSCH and RPL.
It can be seen in Figure 3.10 that duty-cycles larger than 0.15 make FBR-TSCH
more efficient than RPL. Both protocols have the same energy consumption, and
the reliability of RPL is on average half that of FBR-TSCH. The metric that
varies widely with the duty-cycle is the delay. When duty-cycles are less than
0.2, the delay of FBR-TSCH greatly increases, which makes it inefficient when
compared with RPL. The main problem of using RPL with very low duty-cycles is
that its reliability is much lower than that of FBR-TSCH. Even though in our
analysis the same weights were given to all three criteria, reliability is very
often the most important in IIoT applications. Moreover, the delay shown in
Figure 3.4 only took account of the packets that actually reached the sink node.
If we consider that the packets that did not reach the sink incur a penalty in
the delay metric, say of 1000 time slots, RPL would not be able to achieve the
same performance as FBR-TSCH, since more than half of the packets transported by
RPL would experience these high delays.
3.5 Experimental results
We implemented FBR-TSCH based on OpenWSN [80].³ The entire stack above the link
layer was removed and completely reimplemented for our design. In addition,
the time slot structure was adapted to conform to the FBR-TSCH specifications.
The implementation was evaluated during the EWSN 2016 dependability competi-
tion [3]. The schedule used had 1 time slot reserved for beacons, followed by 10 time
slots for data communication and 2 time slots for energy saving. This represented
a duty-cycle of about 85%. The value of p for random access was set to 17%.
³ The source code of FBR-TSCH can be found at
https://github.com/pedrohenriquegomes/fbr-tsch.
The scenario consists of a 45-node network based on TelosB-like motes, spread
over an area of 150 m². Among the 45 nodes, 15 are used as sensor and relay nodes
and the other 30 are interfering nodes. The interfering nodes are programmed to
generate repeatable interference patterns [95] that span all 16 channels. One of
the 15 nodes is randomly selected to generate an on/off signal (from an LED
controlled by an external device) that must be forwarded to the sink and
replicated on an external I/O pin. The maximum number of hops from the sensor to
the sink was equal to 3, which was a better scenario for delay and reliability
than that of the simulations.
The evaluation takes 35 minutes. During the first 20 minutes, the levels of
interference are similar to Wi-Fi networks, and during the last 15 minutes, the
interference patterns reproduce extreme cases of intentional jamming. During the
35-minute interval, a total of 576 signal changes are made.
It should be noted that the scenario tested in simulations is different from the
one used in the competition. We used the simulations to validate and evaluate
the efficiency of our protocol and to provide insights into the parameters that
could be useful during the deployment. However, based on the preliminary results
from the competitors, we decided to prioritize the reliability and the delay of
our solution over the energy consumption, since the other proposals were much
more efficient in terms of energy than ours and we had to reduce our delay as
much as possible.
Figure 3.11 shows the results obtained by FBR-TSCH and 8 other proposals [1,
2, 4, 5, 6, 7].
(a) Reliability (b) Average end-to-end delay
(c) Energy consumption
Figure 3.11: Comparison between the performance of the proposals in [1, 2, 3, 4,
5, 6, 7].
FBR-TSCH had the second-best reliability with a rate of 99.13% and thus out-
performed most of the Glossy-based implementations. All the other proposals that
obtained reliability higher than 90% [1, 5, 6] were based on Glossy, which shows
the robustness of constructive interference.
FBR-TSCH obtained an average delay of 87.2 ms. Only three other proposals [1,
5, 2] outperformed FBR-TSCH, all of which were based on Glossy. The difference
between FBR-TSCH and the best solution, however, is lower than 75 ms. The two
proposals based on the LPL protocols obtained an average delay higher than 1
second, which shows that this class of protocols is not appropriate for real-time
critical applications that require ms-level end-to-end delays [85].
The results for reliability and delay obtained by FBR-TSCH show that TSCH
can be applied to industrial applications with closed-loop control systems, where
high levels of reliability (in the order of 99%) and low delay (in the order of tens of
milliseconds) are required.
On the other hand, FBR-TSCH had the highest energy consumption. As
pointed out earlier, the highest proportion of the energy consumption of FBR-
TSCH can be attributed to the time that nodes are required to wait in receive
mode. The inherent fine-grained synchronization of constructive interference and
the use of probe packets in LPL-based protocols were the factors that led the other
solutions to be more energy-efficient.
The large energy consumption of FBR-TSCH can be reduced by fine-tuning the
guard times. However, although this may not compromise other metrics, it might
lead to the desynchronization of the nodes and has to be carefully tested.
IEEE 802.15.4 TSCH specifies a guard time of 2.2 ms with a time slot length of
10 ms. We reduced the guard time to 1.92 ms for the competition.
From the analysis of the scenario of the competition, it can be concluded that
the guard time could be reduced even further. The slot frame that was employed
had 13 time slots, each with a duration of 4 ms. The beacons, which are used to
synchronize nodes with their clock parent, were also transmitted with a
probability p of 17%. Hence, each node transmits, on average, one beacon every
305 ms. Since the nodes only synchronize with beacons that have a higher counter
(due to the loop formation avoidance algorithm), nodes may have to wait for at
least 3 to 4 hops to receive a valid beacon, hence, on average, every 1.2
seconds. If it can be accepted that a maximum of 10 valid beacons are lost before
a node is desynchronized, we can afford a maximum interval between
synchronization events (i.e., the reception of a valid beacon) of about 12
seconds. Given a regular clock drift rate of ±25 ppm (parts per million) and an
interval of 12 seconds, the minimum guard time necessary is about
12,000 ms × 2 × 0.0025% = 0.6 ms. The theoretical minimum guard time is
approximately one-third of that used in the competition, which explains the large
energy consumption of our solution. We believe that fine-tuning the guard time
would bring our power consumption close to half of the value measured, which
would make our solution competitive when compared with the other proposals
(around 700 J). However, this solution increases the chances of losing
synchronization, which would greatly impair the reliability of FBR-TSCH.
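The chain of estimates in this paragraph can be retraced step by step. The sketch below uses only the numbers given in the text; the variable names are illustrative, four hops is taken as the upper end of the "3 to 4 hops" range, and the reading of the factor of 2 in the comment is an interpretation rather than a statement from the source.

```python
SLOT_MS = 4
SLOTS_PER_FRAME = 13     # 1 beacon slot + 10 data slots + 2 sleep slots
P_BEACON = 0.17          # probability of sending a beacon in the beacon slot
HOPS = 4                 # upper end of the "3 to 4 hops" mentioned in the text
MAX_LOST_BEACONS = 10
DRIFT_PPM = 25

duty_cycle = (SLOTS_PER_FRAME - 2) / SLOTS_PER_FRAME        # active slots / all slots
frame_ms = SLOT_MS * SLOTS_PER_FRAME                        # one beacon slot per frame
beacon_interval_ms = frame_ms / P_BEACON                    # per-node beacon period
valid_beacon_interval_ms = beacon_interval_ms * HOPS        # fresh beacon period
resync_bound_ms = MAX_LOST_BEACONS * valid_beacon_interval_ms

# The text rounds the bound to 12 s. The factor 2 is taken from the text; one
# reading is that two clocks may drift in opposite directions.
min_guard_ms = 12_000 * 2 * DRIFT_PPM * 1e-6

print(round(duty_cycle, 3), frame_ms, round(beacon_interval_ms))
print(round(valid_beacon_interval_ms), round(resync_bound_ms), min_guard_ms)
```

The result of about 0.6 ms is roughly one-third of the 1.92 ms guard time used in the competition, as stated above.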
FBR-TSCH does not require µs-level synchronization, and this is an advantage
when it is compared with Glossy-based implementations. As mentioned earlier,
constructive interference is only successful if the nodes transmit with a maximum
time shift of 0.5 µs. Since most nodes simply rely on the internal RC oscillator
as a clock source, changes in temperature may considerably alter the running
clock of the different nodes, and the required synchronization may be hard to
achieve. Figure 3.12 shows a typical RC oscillator drift for ATmega328P
microcontrollers [8]. We can see that, if the temperature varies from 20.0 °C to
30.0 °C, the frequency ranges from approximately 8.1 MHz to 8.13 MHz (with 6 V).
Even though the synchronization is usually driven by interrupts that are
triggered by a precise 32,768 Hz crystal, whenever a new piece of code is
executed, the clock source is switched to the 8 MHz oscillator and may become
unstable as a result of the difference in temperature. If the piece of code that
is executed before the packet is transmitted takes 1,000 instructions, the nodes
in a location at 20.0 °C will take 123.45 µs, while the nodes at 30.0 °C will
take 123.00 µs. This means that the maximum time drift of 0.5 µs that is required
for Glossy-based protocols cannot be guaranteed, since there is still a need to
account for the time drift of the 32,768 Hz crystal. Even though a 10.0 °C
difference is a very large figure for regular situations, it can happen in
industrial environments if the nodes are close to the machinery or other sources
of heat.
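The timing skew quoted above follows directly from the two oscillator frequencies. The sketch below repeats the calculation under the assumption, implied by the figures in the text, that each instruction takes one clock cycle; the names are illustrative.

```python
INSTRUCTIONS = 1_000
F_20C_HZ = 8.10e6    # approximate RC oscillator frequency at 20 °C (from the text)
F_30C_HZ = 8.13e6    # approximate RC oscillator frequency at 30 °C (from the text)
GLOSSY_BUDGET_US = 0.5


def exec_time_us(n_instructions, freq_hz):
    """Execution time in microseconds, assuming one cycle per instruction."""
    return n_instructions / freq_hz * 1e6


t20 = exec_time_us(INSTRUCTIONS, F_20C_HZ)
t30 = exec_time_us(INSTRUCTIONS, F_30C_HZ)
skew_us = t20 - t30

print(round(t20, 2), round(t30, 2), round(skew_us, 3))
print("share of the budget used:", round(skew_us / GLOSSY_BUDGET_US, 2))
```

The code path alone consumes most of the 0.5 µs budget, which leaves very little margin for the drift of the 32,768 Hz crystal.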
One last problem related to the use of constructive interference in real wireless
sensor networks is that, since all packets relayed at each hop have to be the
same for all retransmissions, it is not possible to use a per-node key for
security protocols. Network-wide keys can still be used for encryption, but this
is not a viable solution for real deployments.
Figure 3.12: A typical RCO frequency drift characteristic. Each curve corresponds
to a different input voltage (in volts). Source: [8].
3.6 Lessons learned
The development of FBR-TSCH mainly served to validate the feasibility of making
practical use of the TSCH protocol in critical applications and harsh environments.
To the best of our knowledge, this was the first time that a study has compared
TSCH with other protocols in an interference stress test.
Some lessons have been learned from the evaluation process:
(i) Single-path RPL with MRHOF does not meet the reliability requirements of
critical applications. Its reliability was approximately half that of FBR-TSCH;
(ii) The energy consumption of FBR-TSCH is primarily influenced by the
duty-cycle. Another possible way of reducing it is by fine-tuning the receiver
guard time, which plays an important role in the overall energy consumption of
networks without a schedule that permits nodes to sleep;
(iii) Very small duty-cycles sharply increase the delay in FBR-TSCH, and make it
less efficient than RPL. The turning point in the simulations was a duty-cycle
equal to 0.15;
(iv) FBR-TSCH outperformed most of the Glossy-based protocols in reliability
and delay, which shows that sub-millisecond synchronization is not necessary
for critical applications;
(v) LPL-based solutions do not perform well concerning reliability and delay but
may entail low energy consumption;
(vi) Constructive interference is a technique that enables protocols to have high
reliability and low delay. The disadvantage of this technique is the fine-grained
synchronization it requires, which may not be feasible in networks working in
industrial scenarios.
3.7 Conclusions
FBR-TSCH is a TSCH-based solution that is optimized for critical applications that
run in interference-prone and noisy environments. The solution relies on broadcast
packets. Its design solved some issues that arise when acknowledgment packets
are completely removed. FBR-TSCH also implements a solution for avoiding loop
formation, link quality estimation, and network-wide synchronization.
In addition to the simulations for evaluating FBR-TSCH and comparing it with
unicast RPL routing, we also tested the proposal in a real scenario with 45 nodes.
During the tests conducted in the EWSN 2016 dependability competition, FBR-
TSCH was able to outperform most of the state-of-the-art protocols concerning
reliability and delay. Concerning the energy consumption criterion, however,
FBR-TSCH was not as efficient as the other solutions; as has been shown here,
this could be improved if the guard times were properly fine-tuned.
Chapter 4
Multi-hop and blacklist-based optimized TSCH
TSCH adopts Frequency Hopping Spread Spectrum (FHSS) to mitigate the effects
of multipath fading and external interference.¹ It changes the communication
frequency that is used at every time slot. Each time slot has an associated
channel offset that is converted into a frequency by a pseudo-random hopping
function. The frequency that is used might be any one of the 16 available in the
2.4 GHz band defined in the IEEE 802.15.4 standard. Frequency hopping is an easy
way of making simple physical layers, e.g., IEEE 802.15.4 with O-QPSK modulation,
more resistant to external interference.
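The translation from channel offset to frequency can be sketched as follows. The mapping uses the common TSCH rule of indexing a shared hopping sequence with the absolute slot number plus the channel offset; the particular permutation generated here and the function names are assumptions made for illustration, not the sequence used by any specific implementation.

```python
import random

CHANNELS = list(range(11, 27))   # the 16 channels of the 2.4 GHz band


def make_hopping_sequence(seed=0):
    """Hypothetical pseudo-random permutation of the channels, shared by all nodes."""
    seq = CHANNELS[:]
    random.Random(seed).shuffle(seq)
    return seq


def channel_for_slot(asn, channel_offset, sequence):
    """Map (absolute slot number, channel offset) to a channel number."""
    return sequence[(asn + channel_offset) % len(sequence)]


def center_frequency_mhz(channel):
    """Center frequency of an IEEE 802.15.4 channel in the 2.4 GHz band."""
    return 2405 + 5 * (channel - 11)


seq = make_hopping_sequence()
for asn in range(4):
    ch = channel_for_slot(asn, channel_offset=2, sequence=seq)
    print(asn, ch, center_frequency_mhz(ch))
```

Two links that use different channel offsets in the same time slot index different positions of the permutation and therefore never share a frequency.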
¹ This chapter includes work from [62].
If a simple blind hopping function is used, all the frequencies are uniformly
selected and transmissions experience average levels of interference. In any
network, link quality is coherent in the short term [96], which means that links
with poor quality are likely to remain in a bad state for a certain period,
called the "coherence time". Experiments in industrial environments show that
coherence time can last from hundreds of milliseconds to seconds [97]. This means
it is preferable to avoid frequencies in which the most recent transmissions
failed. This gives rise to the practice of blacklisting frequencies and opting
for selective hopping to improve the performance of FHSS [46]. When blacklisting
is employed, the frequencies with bad quality are temporarily excluded from the
hopping list. A particular frequency should be excluded as long as the link
quality at that frequency is poor, which requires an efficient method to estimate
the link quality regularly.
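Selective hopping over a reduced channel list can be illustrated with a generic sketch. It is not the negotiation scheme proposed in this chapter: the function names and the fallback to the full list are assumptions, and both ends of a link are assumed to hold the same blacklist.

```python
def hopping_list(all_channels, blacklist):
    """Channels that remain available after removing the blacklisted ones."""
    allowed = [ch for ch in all_channels if ch not in blacklist]
    if not allowed:
        # Hypothetical fallback: never leave a link without any channel.
        return list(all_channels)
    return allowed


def next_channel(asn, channel_offset, allowed):
    """Hop over the reduced list instead of all 16 channels."""
    return allowed[(asn + channel_offset) % len(allowed)]


allowed = hopping_list(range(11, 27), blacklist={11, 12, 13})
visited = {next_channel(asn, 0, allowed) for asn in range(100)}
print(allowed)
print(sorted(visited))
```

If the two endpoints disagree on the blacklist, they index lists of different lengths and may end up on different channels, which is why the blacklist has to be negotiated reliably.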
Although blacklisting has already been used by other technologies such as
Bluetooth [98] and ISA100.11a [38], it is still difficult to achieve an optimal
implementation. Building a centralized blacklist is neither trivial nor
effective. The quality of each frequency on all the links has to be collected and
combined by a central agent. Besides, since link qualities vary in different
locations, a frequency that is unsuitable for a particular link might be in a
good state for others. Distributed blacklisting is more effective, but requires
coordination and may even increase interference in networks that allow
simultaneous transmissions.
In this chapter, we introduce the Multi-hop And Blacklist-based Optimized
TSCH protocol (MABO-TSCH). Our proposal employs a distributed blacklist
for improving the performance of multi-hop wireless networks that have to cope
with high levels of external interference and multi-path fading.
In MABO-TSCH, the hopping sequence is built locally with information exchanged
between each pair of communicating nodes. In addition, the hopping pattern that
must be used in each link is chosen so that, regardless of the neighbors'
blacklists, two interfering links never use the same frequency. In this way,
interference between neighboring links is avoided, and optimal TSCH schedules
(with simultaneous transmissions on different links) can be executed. The
challenging task of estimating channel quality is addressed through Multi-Armed
Bandit (MAB) optimization.
The main contributions of this work are as follows:
(i) proposal of a solution for distributed blacklisting that is optimized for multi-hop
networks and compliant with IEEE 802.15.4 TSCH standard;
(ii) a channel quality estimation algorithm based on MAB optimization;
(iii) the implementation and empirical evaluation of the MABO-TSCH protocol,
both in simulation and on a real testbed.²
4.1 Related work
With the introduction of IoT in industry and process automation, new technologies
were developed with a focus on increasing reliability. As described in detail in
Section 1.1.3, the main technologies for this type of application are
WirelessHART, ISA 100.11a, and TSCH [31]. Blacklisting is used in all three of
these protocols. Early research [99] has shown that blacklisting is beneficial
for the coexistence of WirelessHART and Wi-Fi networks, but the benefit is
smaller for the performance of Wi-Fi, since these networks have more bursty
traffic, while WirelessHART's traffic is sporadic and uses lower power. The
conclusion is that simple mechanisms that create static blacklists are not able
to keep up with the dynamics of Wi-Fi networks. Blacklists were also shown to
enable better coexistence of WirelessHART and other networks based on IEEE
802.15.4. One main difference between WirelessHART and the other two protocols
(i.e., ISA 100.11a and TSCH) is that the former only allows one single blacklist
to be used for all nodes in the network, while the latter two protocols allow
dynamic per-node blacklists [100]. Distributed blacklisting is more efficient and
provides flexibility for better exploiting frequency diversity together with
space diversity.
² The source code of MABO-TSCH can be found at
https://github.com/pedrohenriquegomes/mabo-tsch.
The authors in [101] proposed a dynamic approach for both the frequency hopping
and the blacklisting mechanisms of ISA 100.11a. The network consisted of
clusters, each with a cluster-head node that was responsible for monitoring the
link quality with all nodes below it, on different channels. The channel used for
communication was not changed at every time slot, but only when the current
channel suffered degradation, which had to be detected by the cluster-head. In
addition, the cluster-head removed the channels whose quality was below a certain
threshold, creating a distributed blacklist. This approach relies on a few
channels that are expected to have good quality, over which control packets are
exchanged, so its applicability to scenarios where all channels may suffer
interference is very limited.
The authors in [46] conducted one of the first works to demonstrate that
frequency hopping improves the reliability of IEEE 802.15.4-based networks. In
the experiment, blind frequency hopping reduced the average ETX by 56% and
the network churn (changes in the parent selection for the routing tree) by 38%.
Different blacklist sizes were tested and it was found that the best solution for
their scenario was to use the best 6 channels (and blacklist the other 10). Their
analysis was conducted with trace-based simulations.
A few recent works have designed TSCH enhancements to improve FHSS. The
authors in [102] put forward an Adaptive TSCH (ATSCH) protocol, which is a
distributed blacklisting solution. It introduces a new type of time slot for noise
floor (NF) estimation, and blacklist information that is sent in every Enhanced Beacon
(EB). The quality of channels is periodically estimated through RSSI measurements
during NF time slots; a blacklist is locally calculated by each node and disseminated.
The results show that a blacklist of size 6 can improve the average ETX by 8.1%
when compared with blind hopping. It also shows that blacklisting increases the
average PDR of the whole network and reduces its dispersion, i.e., the links become
stronger and more stable. The negotiation method piggybacks the blacklist into
EBs, which is not efficient, since EBs are expected to be exchanged every tens of
seconds, thus incurring a long delay for updating the blacklist. The solution is also
unable to guarantee that the neighbor nodes will use the same blacklist since EBs
are broadcast packets and may not be received by all the neighbors.
As an improvement to the ATSCH solution, the work in [103] introduced the
Enhanced TSCH (ETSCH) variant, which has two components: (i) a channel-quality
estimation that measures energy during periods of silence in every time
slot, and (ii) an enhanced beacon hopping sequence listing that only makes use of
the strongest channel for broadcasting EBs. The results show that ETSCH provides
a 24% higher PRR and 50% shorter length of burst packet losses than ATSCH, in
scenarios with a high level of interference. Even though this link estimation proce-
dure outperforms ATSCH, it is only executed at the sink and requires a maximum
acceptable clock drift. Using a smaller subset of stronger channels for broadcasting
EBs is an improvement for blacklist distribution, but still does not guarantee that
all the nodes will use the same blacklist since there are no ACK packets for beacon
transmissions. Finally, both ATSCH and ETSCH fail to take into account cases
where simultaneous transmissions are scheduled in a multi-hop network, in which
case the blacklist may cause internal interference.
In [104], an enhanced algorithm was proposed for the adaptation of the hopping
sequence according to dynamic blacklists. This new algorithm still uses a lookup
table, but the sequence of channels is changed over time and accounts for the
channels that have bad quality. The authors in [104] showed that the algorithm
is more efficient than regenerating the hopping sequence every time a channel is
added to or removed from the blacklist, and it still keeps the orthogonality of
channels between interfering links. The algorithm, however, should be executed by
all nodes, which implies that the blacklist is not distributed, but centralized.
Besides, the proposal does not detail a good way of adding channels to or
removing them from the blacklist, which requires an efficient link quality
estimator.
An accurate Link Quality Estimator (LQE) has to be implemented to determine
the quality of a channel and allow the blacklist to be created efficiently. The
authors in [105] provide a comprehensive survey of available LQEs. Hardware-based
LQEs only rely on the information available from the radio chip, such as RSSI,
LQI, and SNR. Even though this information is part of the IEEE 802.15.4 standard,
and hardware-based LQEs do not require additional computation, their degree of
accuracy is limited because they rely on parameters that need to be fine-tuned.
Software-based LQEs require additional computation but achieve a better degree
of accuracy and stability. Most software-based LQEs leverage data provided by
hardware, such as RSSI and LQI, and improve the estimates by employing different
processing methods. We propose a link quality estimation based on the Multi-Armed
Bandit problem and employ an approximate solution based on an ε-greedy strategy.
The Multi-Armed Bandit (MAB) problem, where an automated agent seeks to
maximize the total payoff obtained after a sequence of trials, is a classical
paradigm in stochastic optimization. In MAB problems, the agents have to choose a
strategy that provides the best trade-off between exploring the unknown
environment and exploiting current knowledge. We refer the reader to [106] for an
overview of the MAB technique and examples of practical applications. Although
MAB has been explored in theory for channel allocation problems such as
opportunistic spectrum access [107, 108, 109], so far there has been little
research on how to provide a practical application of MAB for adaptive channel
allocation in communication systems.
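To make the exploration/exploitation trade-off concrete, the sketch below runs a textbook ε-greedy agent over 16 channels treated as arms. It is a generic illustration rather than the MABO-TSCH estimator: the reward model (1 for a successful transmission, 0 for a failure), the initial round that tries every arm once, the 0.5 blacklist threshold, and the always-fail channels are all assumptions chosen to keep the example deterministic.

```python
import random


def epsilon_greedy(n_arms, pull, trials, epsilon=0.1, seed=0):
    """Return (estimates, counts) after `trials` pulls of an epsilon-greedy agent."""
    rng = random.Random(seed)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    for t in range(trials):
        if t < n_arms:
            arm = t                                   # try every arm once first
        elif rng.random() < epsilon:
            arm = rng.randrange(n_arms)               # explore a random arm
        else:
            arm = max(range(n_arms), key=estimates.__getitem__)  # exploit
        reward = pull(arm)
        counts[arm] += 1
        # Incremental update of the running mean reward of this arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts


# Hypothetical environment: channels 3 and 7 always fail, the others succeed.
BAD = {3, 7}
estimates, counts = epsilon_greedy(
    16, lambda ch: 0.0 if ch in BAD else 1.0, trials=500)
blacklist = {ch for ch, q in enumerate(estimates) if q < 0.5}
print(sorted(blacklist), counts)
```

With probability ε the agent keeps probing channels it currently rates poorly, which is what allows a channel to leave the blacklist once its quality recovers.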
It is clear from the state-of-the-art that there are still three open problems when
optimizing FHSS:
(i) how to design an optimized hopping sequence that prevents interference be-
tween nodes that have been scheduled for simultaneous transmissions;
(ii) how to create a distributed blacklist and ensure that neighbors agree with the
hopping sequence that will be used;
(iii) how to implement a feasible channel estimation mechanism that does not de-
pend on hardware resources and is adaptable to dynamic networks.
The goal of our MABO-TSCH proposal is to solve these three open problems.
4.2 MABO-TSCH
Multi-Hop And Blacklist-based Optimized TSCH protocol (MABO-TSCH) con-
sists of three key algorithms.
The first algorithm (Section 4.2.1) assigns channel offsets to time slots to
prevent interference. The adopted solution is based on a graph coloring heuristic
that associates multiple orthogonal channel offsets to each non-leaf node and
allows the use of different frequencies in each time slot.
The second algorithm (Section 4.2.2) ensures that there is a proper blacklist
negotiation between the nodes. Each pair of nodes (parent-child in the routing
tree) negotiates a local blacklist by piggybacking blacklist information into the
data or ACK frames.
The third algorithm (Section 4.2.3) measures and classifies the channels, and is
responsible for building and maintaining the blacklists. The channel classification
process is modeled as a MAB problem with an approximate solution based on
an ε-greedy strategy.
Figure 4.1 displays the relationship between the algorithms and the nodes in the
network and shows the case of a receiver-based channel offset assignment algorithm.
Figure 4.1: MABO-TSCH algorithms.
4.2.1 Channel offset assignment
The channel offset assignment must be executed when different pairs of neighbor
nodes communicate in the same time slot offset (i.e., at the same time) to avoid
intra-network interference. The channel offset allocation can be of three types:
link-based, receiver-based or transmitter-based. In the link-based assignment, each
active link is associated with a channel offset. In the receiver-based and
transmitter-based types, the channel offset is assigned to the receiver or transmitter
nodes, respectively, and all the time slots must be executed with the channel offset
associated with the participating nodes.
In data collection applications, unicast transmissions tend to be directed towards
the sink and most of the routing trees have a large number of leaf nodes. Hence,
link-based and receiver-based channel offset assignments are more appropriate for
multi-hop tree-based data collection applications, and for this reason MABO-TSCH
uses a receiver-based channel offset assignment.
The channel offset assignment is a graph coloring problem, which is known
to be NP-hard. However, heuristics such as greedy degree ordering, known as the
Welsh-Powell algorithm [110], yield near-optimal results in most practical cases.
In our proposed algorithm, we extend the Welsh-Powell heuristic and associate
multiple non-interfering channels to each node. All the nodes are sorted in
non-increasing order according to their degree in a graph that is constructed with nodes
as vertices and interfering links as edges. The coloring problem is solved for the
sorted array of nodes (from the highest to the lowest degree) and includes all the
16 available channel offsets as colors. After all the nodes have been colored, the
algorithm repeats to find multiple channel offsets for each node. The channel offset
assignment is completed when no more colors can be assigned to any node.
Algorithm 1 shows the algorithm for multiple channel offset assignments. It is
centrally executed and its results are used by an agent, e.g., a Path Computation
Element (PCE), that is responsible for computing and disseminating the TSCH
schedule. The network graph must be obtained from the network in advance with
the use of any simple data collection application that can gather PDR statistics
from all links. If the schedule is available at design time, the algorithm can disregard
the edges between the nodes that do not have time slots allocated at the same time,
which increases the number of channel offsets assigned to the nodes.
Even though PDR statistics change significantly over time, the channel assignment
algorithm does not require its statistics to be updated to operate
correctly. However, if this algorithm is not executed regularly, the blacklisting
algorithms may become degraded over time. If there are no concurrent transmissions
scheduled in the network, there will be less intra-network interference, and
the coloring algorithm will play a less important role. In the extreme case of an
event-triggered application, where there is only sporadic data traffic, all the channel
offsets can be assigned to all the nodes. In this case, Algorithm 1 is not necessary,
since there is no conflict among the transmissions within the network that has to
be resolved.
Algorithm 1 Channel offset assignment
Input:
G(V, E) - network graph with nodes as vertices and interfering links as edges
C - list of 16 channel offsets
Output:
Nodes colored with multiple channel offsets
1: Sort vertices v_1, v_2, ..., v_n in V in non-increasing degree order
2: colored ← true
3: while colored is true do
4:   colored ← false
5:   for all v_i in V do
6:     find c_i as the minimal color in C not assigned to any vertex v_j connected to v_i
7:     if c_i exists then
8:       colored ← true
9:       Add c_i to the list of channel offsets of v_i
10:     end if
11:   end for
12: end while
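The repeated coloring passes of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration, not the simulator's code: the function name and the adjacency-dictionary input are hypothetical, and the sketch assumes that a color already held by a node is also unavailable to that node (otherwise the loop would never terminate for an isolated vertex).

```python
def assign_channel_offsets(adjacency, num_offsets=16):
    """Repeat greedy coloring passes until no node can receive another offset.

    adjacency maps each node to the set of nodes it interferes with.
    """
    # Line 1 of Algorithm 1: non-increasing degree order.
    order = sorted(adjacency, key=lambda v: len(adjacency[v]), reverse=True)
    offsets = {v: [] for v in adjacency}
    colored = True
    while colored:
        colored = False
        for v in order:
            # Assumption: offsets already held by v itself are excluded too.
            used = set(offsets[v])
            for u in adjacency[v]:
                used.update(offsets[u])
            free = [c for c in range(num_offsets) if c not in used]
            if free:
                offsets[v].append(free[0])  # minimal free color
                colored = True
    return offsets
```

With a sink `s` and two mutually non-interfering receivers `r1` and `r2`, the receivers end up sharing the same offsets, while no offset is ever shared across an interfering link.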
4.2.2 Distributed blacklist negotiation
The blacklist negotiation procedure must ensure that both nodes in a pair of
communicating nodes will be able to communicate using the same blacklist. Were they
to use different blacklists, their radios would be turned on at different frequencies,
and they would not be able to hear each other. Moreover, the information
exchanged during the blacklist negotiation should not entail a large overhead in the
network.
The dissemination of blacklist information may be based on either broadcast [102,
103] or unicast messages. In the case of broadcast messages, there is no guarantee
that the information will be correctly exchanged between neighbors. Moreover,
broadcast messages, such as Enhanced Beacons (EBs), tend to have transmission
intervals that are much larger than the time scale of the channel dynamics. Unicast messages
ensure that the negotiation is successful, but may lead to some overhead. When
using unicast messages for node-to-node blacklist negotiation, the blacklist can be
embedded into the data or ACK frames. When embedded into the data frame, the
overhead incurred by blacklist information may affect the application performance,
since it uses part of the application payload. When embedded into the ACK frame,
no overhead is perceived by the application, since the time reserved for the ACK
transmission is fixed and can accommodate a few extra bytes.
After the exchange of a new blacklist, both the transmitter and receiver must
decide when to start using it. This is a key factor in the blacklist negotiation because
if there is a mismatch of information, a large number of packets might be lost. In the
worst-case scenario, if one of the involved nodes is a time synchronization source,
a part of the network may be disconnected.
Blacklist negotiation has been efficiently implemented in this work by adopting
unicast messages and embedding the blacklist information into the ACK or data
frames, depending on the type of application that is employed. In addition, we have
used bidirectional negotiation: this means that the parent nodes are responsible for
creating and disseminating blacklists to their children if the ACK-based negotiation
is used; the opposite is the case in data-based negotiation. Blacklist information is
only used in unicast communication; transmissions that use shared time slots (such
as EBs) do not employ blacklisting.
ACK-based negotiation is used when there are constant traffic flows from children
to parents, since the measurement is calculated by the parent node. Data-based
negotiation is used when an event-triggered application (e.g., issuing an alarm
event) is the main target. Since parents cannot predict when the packets will arrive,
the link quality estimation has to be carried out by the children. In this case,
a few bytes of data overhead are perceived by the application, since the blacklist
information has to be embedded in the data frames.
We mainly focus on data collection applications that employ ACK-based
negotiation. Algorithm 2 shows the pseudo-code executed at the parent.
Algorithm 3 shows the pseudo-code executed at the children. Algorithm 2 and
Algorithm 3 are evaluated through simulations and real experiments.
Similar algorithms are created for the data-based negotiation, which is more
appropriate for an event-triggered application. Algorithm 5 shows the pseudo-
code executed at the parent. Algorithm 4 shows the pseudo-code executed at the
children. Algorithms 5 and 4 are evaluated through simulations.
In all 4 algorithms, both nodes keep a table with two rows: r_using and r_negotiating.
Both rows include the Data Sequence Number (DSN) of the last packet and the
blacklist information. Row r_using has the most recent negotiated blacklist information
and must be used at the beginning of each time slot. Row r_negotiating has the
blacklist information that is currently being negotiated and has not yet been used.
Algorithm 2 Blacklist embedded in ACK frame (algorithm for the parent)
1: At the beginning of the time slot, consider the blacklist information in r_using
2: if data frame was successfully received then
3:   FS ← DSN of received data frame
4:   BL ← Most recent local blacklist information
5:   if r_negotiating has DSN equal to FS then
6:     if maximum number of retransmissions is reached then
7:       BL ← blacklist information from r_using
8:     end if
9:     Update r_negotiating with BL
10:   else
11:     Replace r_using by r_negotiating
12:     Update r_negotiating with FS and BL
13:   end if
14:   Send ACK frame with FS, embedding BL
15: end if
Algorithm 3 Blacklist embedded in ACK frame (algorithm for the child)
1: At the beginning of the time slot, consider the blacklist information in r_using
2: FS ← DSN of data frame to be sent
3: Send data frame with FS
4: if r_negotiating has a DSN different from FS then
5:   Replace r_using by r_negotiating
6:   Update r_negotiating with FS and blacklist information from r_using
7: end if
8: if ACK frame was successfully received then
9:   BL ← blacklist information from received ACK frame
10:   Update r_negotiating with BL
11: end if
Algorithm 4 Blacklist embedded in data frame (algorithm for the child)
1: At the beginning of the time slot, consider the blacklist information in r_using
2: FS ← DSN of data frame to be sent
3: if r_negotiating has a DSN different from FS then
4:   Replace r_using by r_negotiating
5:   Update r_negotiating with FS and blacklist information from r_using
6: end if
7: if maximum number of retransmissions is reached then
8:   MaxFlag ← 1
9:   BL ← blacklist information from r_negotiating
10: else
11:   BL ← Most recent local blacklist information
12: end if
13: Send data frame with FS, embedding BL and MaxFlag
14: if ACK frame was successfully received then
15:   MaxFlag ← 0
16:   Update r_negotiating with BL
17: end if
Algorithm 5 Blacklist embedded in data frame (algorithm for the parent)
1: At the beginning of the time slot, consider the blacklist information in r_using
2: if data frame was successfully received then
3:   FS ← DSN of received data frame
4:   BL ← blacklist information from received data frame
5:   MaxFlag ← flag from received data frame
6:   if r_negotiating has DSN equal to FS then
7:     Update r_negotiating with BL
8:   else
9:     if MaxFlag is 1 then
10:       Copy blacklist information from r_using to r_negotiating
11:     end if
12:     Replace r_using by r_negotiating
13:     Update r_negotiating with FS and BL
14:   end if
15:   Send ACK frame with FS
16: end if
At the beginning of every time slot, the nodes use the blacklist information in
r_using and, depending on the success/failure of the packet exchange, r_negotiating
may replace r_using and the nodes can start a new blacklist negotiation. It should
be noted that r_using is only replaced when r_negotiating is consistent on both sides.
The IEEE 802.15.4 standard specifies that the DSN should only be incremented
after an ACK frame has been received and every ACK frame has its DSN copied
from the data frame that it is acknowledging. Thus, the nodes can guarantee that
both ends have the same information after a packet with a different DSN has been
received on both sides.
The blacklist information that should be used when transmitting packets with
DSN_{n+2} is the one exchanged with packets with DSN_n. Every packet has a maximum
number of retransmissions at the link layer, which by default in our implementation
is equal to 3 (a maximum of 4 trials). The proposed algorithms are
designed for networks with link-layer ACK and a maximum number of retransmissions
greater than or equal to 1.
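The two-row table and the DSN rule for ACK-based negotiation can be traced with a small Python sketch. It follows the steps of Algorithms 2 and 3 but is only an illustration: the class names, the integer blacklist values and the `last_attempt` flag (standing in for "maximum number of retransmissions is reached") are hypothetical, and the driver exercises a lossless exchange only, so lost data frames are not modeled.

```python
class Endpoint:
    """Two-row negotiation table: each row is a (DSN, blacklist) pair."""

    def __init__(self, initial_blacklist):
        self.using = (None, initial_blacklist)
        self.negotiating = (None, initial_blacklist)


class Child(Endpoint):
    def on_send(self, fs):
        # Algorithm 3, lines 4-7: a new DSN promotes the negotiated row.
        if self.negotiating[0] != fs:
            self.using = self.negotiating
            self.negotiating = (fs, self.using[1])

    def on_ack(self, blacklist):
        # Algorithm 3, lines 8-11: store the blacklist carried by the ACK.
        self.negotiating = (self.negotiating[0], blacklist)


class Parent(Endpoint):
    def on_data(self, fs, local_blacklist, last_attempt=False):
        """Process a received data frame; return the blacklist for the ACK."""
        blacklist = local_blacklist
        if self.negotiating[0] == fs:
            if last_attempt:  # hypothetical flag, see lead-in
                blacklist = self.using[1]
            self.negotiating = (fs, blacklist)
        else:
            self.using = self.negotiating
            self.negotiating = (fs, blacklist)
        return blacklist


def lossless_exchange(parent_blacklists, initial_blacklist=0x0000):
    """Send one frame per DSN without losses; return the blacklist used per slot."""
    parent, child = Parent(initial_blacklist), Child(initial_blacklist)
    used = []
    for dsn, local_blacklist in enumerate(parent_blacklists):
        # Both ends hop according to r_using at the beginning of the slot.
        assert child.using[1] == parent.using[1]
        used.append(child.using[1])
        child.on_send(dsn)
        child.on_ack(parent.on_data(dsn, local_blacklist))
    return used
```

In this trace the blacklist produced while handling DSN 0 is first used for the frame with DSN 2, which matches the n and n+2 relation stated above.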
The blacklist information that is negotiated in all 4 algorithms may be different
depending on how this information is used for estimating the channel quality and
optimizing the hopping sequence. In Section 4.2.3, we discuss
which types of blacklist information we use in our scheme. It should be pointed
out that the protocol proposed so far can be employed with any type of blacklist
information.
4.2.3 Multi-armed bandit link estimation
A multi-armed bandit problem can be formulated as a set of $K$ probability distributions
$B = \{R_1, R_2, \ldots, R_K\}$, each associated with the rewards delivered by one of
the $K$ arms (or levers). The probability distributions have expected reward values
$\mu_1, \mu_2, \ldots, \mu_K$ and are a priori unknown to the player.
In an MAB problem, at each turn $t \in \{1, 2, 3, \ldots\}$, an arm with index $i(t)$ is
chosen and the player is given the reward $r(t) \sim R_{i(t)}$. Let
$\mu^* = \max_{i=1,2,\ldots,K} \mu_i$,
then the total regret for a sequence of trials with duration $T$ can be defined as:
$$R_T = T\mu^* - \sum_{t=1}^{T} r(t) \qquad (4.1)$$
Regret is the difference between the reward of an optimal strategy, which always
chooses the best arm, and the reward of the chosen strategy. A common formulation of
the MAB problem is the Bernoulli multi-armed bandit, where a reward of $x$ is obtained
with probability $p$ or otherwise a reward of 0. Related work shows that simple
approximate heuristics, such as the ε-greedy algorithm, achieve results close to or
better than sophisticated algorithms in most settings of MAB problems [106, 61].
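The regret of Equation (4.1) can be computed directly from a recorded reward sequence. The helper names below are hypothetical, and the expected rewards are assumed to be known, which is only the case in simulation.

```python
def total_regret(mu, rewards):
    """Total regret: T times the best expected reward, minus the rewards obtained."""
    mu_star = max(mu)  # expected reward of the best arm
    return len(rewards) * mu_star - sum(rewards)


def average_regret(mu, rewards):
    """Regret per trial; defined as 0.0 for an empty sequence."""
    return total_regret(mu, rewards) / len(rewards) if rewards else 0.0
```

Because the obtained rewards are random, the value computed for a short sequence can be negative; it is the long-run trend that indicates how well a strategy tracks the best arm.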
Our problem is the need to estimate the link quality of each of the 16 channels
to ensure that the best ones will be employed in the blacklisting mechanism. In our
implementation, the channel estimation problem is modeled as a multi-armed ban-
dit problem with Bernoulli distribution, in which the reward equals 1 for successes
and 0 for failures. Each node is an autonomous agent with 16 arms that correspond
to the 16 available channels.
We chose ε-greedy as the strategy for implementing our MAB-based channel
quality estimation because the algorithm is tractable enough to be embedded in
the sensor nodes.
During each trial, the bandit selects the arm (channel) that has the highest
mean reward with probability $1-\epsilon$, and selects a random arm with probability $\epsilon$.
We define $\hat{\mu}_i(t)$ as the empirical mean reward of arm $i$ after $t$ trials. The average
empirical reward for each channel ($\hat{\mu}_{ch}(t)$) is updated with the exponential moving
average so that the most recent reward values have more significance in calculating
the average reward.
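A minimal Python sketch of the ε-greedy selection with an exponential moving average is given below. The class name, the smoothing weight `alpha` and the initial estimate of 0.0 are assumptions made for illustration; the text does not specify them here.

```python
import random


class EpsilonGreedyChannels:
    """Epsilon-greedy bandit over channels with EMA reward estimates."""

    def __init__(self, num_channels=16, epsilon=0.05, alpha=0.1, rng=None):
        self.epsilon = epsilon
        self.alpha = alpha  # assumed EMA weight of the newest reward
        self.mu_hat = [0.0] * num_channels  # assumed initial estimates
        self.rng = rng or random.Random()

    def select(self):
        """Explore with probability epsilon, otherwise exploit the best estimate."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.mu_hat))
        return max(range(len(self.mu_hat)), key=self.mu_hat.__getitem__)

    def update(self, channel, success):
        """Bernoulli reward: 1 for a successful transmission, 0 for a failure."""
        reward = 1.0 if success else 0.0
        self.mu_hat[channel] += self.alpha * (reward - self.mu_hat[channel])
```

A larger `alpha` makes the estimate follow recent transmissions more closely, at the price of a noisier ranking of the channels.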
The MAB algorithm must be executed at one node of each child/parent pair,
and the blacklist information must be embedded either in the data frames or ACK
frames, as described in Section 4.2.2. If the blacklist information is embedded in
data frames, the MAB algorithm is executed at the child node, while if the blacklist
information is embedded in ACK frames, it must be executed at the parent node.
Based on the current $\hat{\mu}_{ch}(t)$ obtained by the ε-greedy algorithm, the node where
the MAB algorithm is being executed has to create the most accurate channel
quality estimation and build the blacklist information, which will then be sent to
the neighbor node.
We propose two different types of blacklist information: a simple 2-byte blacklist
bitmap, and an 8-byte rank list. In the case of the blacklist bitmap, the $k$ channels
with the highest average reward are not included in the blacklist (the corresponding
bits are equal to 0), while the $16-k$ channels that have the lowest values are included
in the blacklist. In the case of the rank list, all the 16 frequencies are sorted in
non-decreasing order according to their current $\hat{\mu}_{ch}(t)$, and the rank of the channel
must be equal to its position in the sorted array.
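The two encodings can be sketched in Python as follows. The sizes follow from the text (16 one-bit flags fit in 2 bytes, and 16 ranks of 4 bits each fit in 8 bytes), but the bit numbering, the byte order and the nibble packing are assumptions, as are the function names.

```python
def blacklist_bitmap(mu_hat, k):
    """2-byte bitmap: bit i is 1 when channel i is blacklisted."""
    by_reward = sorted(range(len(mu_hat)), key=lambda ch: mu_hat[ch], reverse=True)
    allowed = set(by_reward[:k])  # the k channels with the highest reward
    bitmap = 0
    for ch in range(len(mu_hat)):
        if ch not in allowed:
            bitmap |= 1 << ch
    return bitmap.to_bytes(2, "big")


def rank_list(mu_hat):
    """8-byte rank list: rank = position in non-decreasing order of reward."""
    order = sorted(range(len(mu_hat)), key=lambda ch: mu_hat[ch])
    rank = [0] * len(mu_hat)
    for position, ch in enumerate(order):
        rank[ch] = position
    packed = bytearray()
    for i in range(0, len(rank), 2):
        packed.append((rank[i] << 4) | rank[i + 1])  # two 4-bit ranks per byte
    return bytes(packed)
```

With this convention the channel with the highest estimated reward has rank 15, so "highest rank" and "best channel" coincide.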
The 8-byte rank list may represent at least 10% of the data payload (plus the extra
energy consumption) of common 6LoWPAN data frames if the data-based
negotiation is employed. However, event-triggered applications, such as alarm
systems, usually do not require large payloads and may easily support the overhead
of up to 8 bytes.
We propose two different algorithms for employing each type of blacklist information
and implementing the optimized frequency hopping. Algorithm 6, called
First Good Arm MABO-TSCH, uses the simple 2-byte blacklist bitmap. Algorithm 7,
called Best Arm MABO-TSCH, uses the 8-byte rank list.
In Algorithm 6, each of the available channel offsets is translated into an actual
frequency and the first frequency that is not blacklisted is used. In Algorithm 7,
all available channel offsets are translated into their corresponding frequencies and
the frequency with the highest rank is used.
Algorithm 6 First Good Arm MABO-TSCH frequency selection
Input:
BL - 2-byte bitmapped blacklist
CL - list of available channel offsets
Output:
Frequency to be used
1: for all c_i in CL do
2:   freq ← c_i converted into an actual frequency
3:   if the bit that corresponds to freq in BL is 0 then
4:     return freq
5:   end if
6: end for
7: return freq
Algorithm 7 Best Arm MABO-TSCH frequency selection
Input:
RL - 8-byte rank list
CL - list of available channel offsets
Output:
Actual frequency to be used
1: for all c_i in CL do
2:   freq ← c_i converted into actual frequency
3:   frequencies ← frequencies ∪ {freq}
4: end for
5: Sort frequencies in non-decreasing order of rank according to RL
6: return freq with highest rank in frequencies
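Algorithms 6 and 7 can be sketched in Python as below. For brevity the sketch works with frequency indices 0 to 15 rather than physical frequencies, takes the blacklist as an integer bitmap and the rank list as a list of 16 integers, and converts offsets with the ASN-based rule of Equation (4.2); these representations and the function names are assumptions.

```python
def to_frequency_index(offset, asn, num_channels=16):
    """Channel offset to frequency index, following the ASN-based hopping rule."""
    return (offset + asn) % num_channels


def first_good_arm(bitmap, offsets, asn):
    """Algorithm 6: first candidate frequency whose blacklist bit is 0."""
    freq = None
    for c in offsets:
        freq = to_frequency_index(c, asn)
        if not (bitmap >> freq) & 1:
            return freq
    # Every candidate is blacklisted: fall back to the last one (line 7).
    return freq


def best_arm(ranks, offsets, asn):
    """Algorithm 7: candidate frequency with the highest rank."""
    frequencies = {to_frequency_index(c, asn) for c in offsets}
    return max(frequencies, key=lambda f: ranks[f])
```

Taking the maximum over the candidate set is equivalent to sorting by rank and returning the last element, and avoids the sort on a constrained node.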
4.3 Simulation results
We first evaluated the performance of MABO-TSCH with simulations to assess its
effectiveness, as well as the impact of the different parameters, and to compare its
performance with an optimal solution.
We used a custom-made simulator written in the C language³. It receives a set
of connectivity traces as input and calculates the routing tree and the optimized
TSCH schedule. The simulator runs the appropriate FHSS algorithm based on the
calculated TSCH schedule and the set of network connectivity traces.
The simulator builds a fixed routing tree and a fixed TSCH-compatible schedule
that considers the algorithms proposed by the MultiChannel Collection (MCC)
protocol [69]. The only modification made in MCC is the replacement of its
graph-coloring heuristic by Algorithm 1 so that the time slots could be associated with
multiple channel offsets.
4.3.1 Different types of FHSS
Five different types of FHSS were simulated to evaluate the effectiveness of
MABO-TSCH. A TSCH variant is defined for each type of FHSS. The only difference
between all the 5 variants is the algorithm used for converting channel offsets into
frequencies, in other words, the frequency hopping sequence.
³ The source code of the simulator tool can be found at https://github.com/pedrohenriquegomes/tsch_scheduling_algorithms_and_simulator.
4.3.1.1 Default TSCH
In Default TSCH, the channel conversion is carried out with a simple function that
takes into account the channel offset and the current ASN, and does not employ
any blacklisting technique. It is expressed by Equation (4.2):
$$f = \Phi((ch + ASN) \bmod 16) \qquad (4.2)$$
where $f$ is the resulting frequency, $\Phi$ is a function implemented as a simple
look-up table that maps an index to a frequency, and $ch$ is the channel offset.
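Equation (4.2) amounts to a table look-up, as in the Python sketch below. The table lists the center frequencies of IEEE 802.15.4 channels 11 to 26 in the 2.4 GHz band in ascending order; this ordering is an assumption, since deployed stacks may use a permuted hopping table, and the names are hypothetical.

```python
# Center frequencies in MHz of IEEE 802.15.4 channels 11..26 (2.4 GHz band).
CHANNEL_MHZ = [2405 + 5 * i for i in range(16)]


def default_tsch_frequency(ch_offset, asn, table=CHANNEL_MHZ):
    """Map a channel offset and the current ASN to a frequency via the table."""
    return table[(ch_offset + asn) % len(table)]
```

For a fixed channel offset, 16 consecutive ASN values visit all 16 frequencies exactly once, which is what makes the default scheme a blind hopping sequence.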
4.3.1.2 Centrally blacklisted TSCH
A central agent creates a list with N channels that have more links with PDR
below a given threshold and then disseminates this information to all the nodes in
the network. In the simulation, this blacklist is formed employing the link quality
traces and the threshold is set to 90%. The results obtained from the simulation
represent a centralized solution where the sink has complete knowledge of all the
link qualities and can disseminate the best single blacklist to all the sensor nodes.
This is not a realistic solution since it is not possible to make a perfect real-time
link quality estimation at a centralized agent. All the nodes are supposed to convert
the available channel offsets in every time slot employing Equation (4.2), and use
the first frequency that is not in the blacklist.
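The construction of the common blacklist can be sketched as follows in Python. The input layout (a dictionary from link to a per-channel PDR list) and the tie-breaking by channel index are assumptions; the 90% threshold and the choice of the N channels with the most poor links come from the text.

```python
def central_blacklist(pdr, n, threshold=0.90):
    """Return the n channels that have the most links with PDR below threshold.

    pdr maps each link to a list of per-channel PDR values in [0, 1].
    """
    num_channels = len(next(iter(pdr.values())))
    bad_links = [
        sum(1 for per_channel in pdr.values() if per_channel[ch] < threshold)
        for ch in range(num_channels)
    ]
    # Stable sort: ties are broken by the lower channel index (assumption).
    worst_first = sorted(range(num_channels), key=lambda ch: bad_links[ch], reverse=True)
    return set(worst_first[:n])
```

Because a single list is shared by the whole network, a channel that is poor on only a few links may stay in use on those links, which is the limitation that the per-link blacklists address.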
4.3.1.3 Optimal TSCH
It is assumed that all the nodes have a knowledge of the channel quality for all the
links and can pick the channel with the highest PDR. This solution can be regarded
as a distributed blacklist where each pair of nodes selects the best channel for each
packet transmission, among all the channels that are allowed to be used. In this
ideal scenario, all the nodes are supposed to convert all the available channel offsets
in each time slot employing Equation (4.2) and pick the frequency with the highest
PDR. This algorithm is an optimal distributed solution and should achieve the best
possible performance.
4.3.1.4 First Good Arm MABO-TSCH
The First Good Arm MABO-TSCH models channel estimation as a multi-armed bandit
problem solved with an ε-greedy strategy, as described in Section 4.2.3. To select the
first good arm, the parent constructs a simple 2-byte blacklist bitmap for each of its
children. The blacklist is then shared with the children through the negotiation mechanism
outlined in Section 4.2.2. First Good Arm MABO-TSCH uses Algorithm 6 for
selecting the frequency that will be used in each time slot. The conversion from
channel offsets to frequencies is undertaken following Equation (4.2).
4.3.1.5 Best Arm MABO-TSCH
The Best Arm MABO-TSCH uses an 8-byte rank list to select the best channel, as
estimated by the ε-greedy strategy. The 8-byte rank list is also created for each child
and negotiated with the algorithms from Section 4.2.2. It relies on Algorithm 7 and
the channel offset conversion also follows Equation (4.2).
It should be noted that both First Good Arm MABO-TSCH and Best Arm
MABO-TSCH employ the whole framework proposed in this work, which consists
of the three algorithms described in Section 4.2. The objective of implementing
both schemes is to compare their trade-offs. Best Arm MABO-TSCH is the preferred
solution and should be used whenever possible.
4.3.2 Simulation setup
Datasets from the Tutornet, Soda and Grenoble testbeds, described in
Section 2.2, were used because of the larger statistical variability of PDR, especially
across different channels. The results are presented with a 95% confidence interval. The
schedule used in the simulation has a slotframe with 101 time slots, each time
slot taking 15 ms, the same size as that adopted in the real experiments. In each
slotframe, every node has one reserved time slot to send one packet upwards to the
sink node, and a sufficient number of reserved time slots to forward packets from
its children.
4.3.3 Tuning the algorithms
Before comparing the 5 different types of TSCH, we determine their best parameters.
Simulations are divided into two groups. We first evaluated the data collection
applications with blacklist information embedded in the ACK frames. We then eval-
uated event-triggered applications with blacklist information embedded in the data
frames.
4.3.3.1 Evaluating the data collection application
We set out by examining the data collection application, where all the sensor nodes
transmit one packet toward the sink in every slot frame. In this case, since all
the time slots are used for data transmission, the parent nodes can build and
maintain the blacklist using ACK packets, by following Algorithms 2 and 3. It
should be noted that even though we simulated a saturated network (with packets
transmitted at every time slot), this is not a requirement for these algorithms.
The only requirement for Algorithms 2 and 3 is that the parent nodes be able to
predict which time slot will be used by the children nodes to transmit data, since
the parents have to differentiate between an empty time slot and a time slot where
there was a packet loss.
All 5 types of TSCH are simulated to tune the parameters in the algorithms.
To start with, we selected Centrally blacklisted TSCH and varied the number of
channels in the common blacklist (N). The optimal value of N was equal to 11 for
the Tutornet testbed and 12 for the Soda and Grenoble testbeds. This is similar
to the results obtained in [46], where the optimal blacklist size was found to be equal
to 10.
We now find the best parameters for the ε-greedy strategy. We first selected
Best Arm MABO-TSCH and varied ε, initially considering only a fixed ε. The optimal
value of ε was found to be 0.05 for the Tutornet testbed, 0.025 for Soda and 0.02
for the Grenoble testbed.
Following this, we considered the First Good Arm MABO-TSCH solution and
varied both the ε and the k parameters. In this algorithm, the channels are sorted
according to their empirical rewards, and the k channels with the highest rewards
can be employed in the hopping sequence. We simulated First Good Arm MABO-TSCH
with a fixed ε ranging from 0.5 to 0.01 and k ranging from 1 to 16. The best
ε was found to be 0.03 and the best k was 6, for the Tutornet testbed. The best ε
was equal to 0.02 and the best k was equal to 5, for the Soda testbed. Finally, the
best ε was equal to 0.02 and the best k was equal to 6, for the Grenoble testbed.
We also investigated whether an ε-greedy strategy with a decreasing ε can
achieve better performance. We set ε to 0.05 for both types of MABO-TSCH
solutions (First Good Arm MABO-TSCH and Best Arm MABO-TSCH) and periodically
reduced it by 0.001. The minimum value of ε is 0.01; after reaching this
value, ε is reset to 0.05. In our simulations, it was found that a decreasing ε does
not significantly affect the number of received packets at the sink and the cumulative
regret was also very close. It can be concluded that, in a similar way to what
was verified by [61], a decreasing ε does not significantly improve the performance
of MAB-based algorithms.
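The decreasing schedule described above can be written as a small Python generator. The reduction period is not stated in the text, so the sketch yields one value per period and leaves the timing to the caller; the names are hypothetical, and integer thousandths are used to avoid floating-point drift.

```python
from itertools import islice


def decreasing_epsilon(start_milli=50, step_milli=1, min_milli=10):
    """Yield 0.050, 0.049, ..., 0.010 and then restart from 0.050."""
    milli = start_milli
    while True:
        yield milli / 1000
        # Once the minimum has been used, reset to the initial value.
        milli = start_milli if milli <= min_milli else milli - step_milli


def first_values(n):
    """Convenience helper: the first n values of the schedule."""
    return list(islice(decreasing_epsilon(), n))
```

One full cycle contains 41 values, so the exploration rate returns to 0.05 after 41 reduction periods.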
After this, we compared all 5 types of TSCH solutions: Default TSCH, Centrally
blacklisted TSCH, Optimal TSCH, First Good Arm MABO-TSCH and Best Arm
MABO-TSCH. For each solution, we used the best values of N, k and ε found previously.
To start with, we focused on the results from the Tutornet testbed. Figure 4.2
shows the total number of packets received at the sink for all 5 different types of
TSCH. Figure 4.3a shows the average regret per time slot. The average regret
is calculated as the total regret $R_T$ (Equation (4.1)) divided by the number of
time slots. Figure 4.3b shows the percentage of optimal channels used. This last
statistic was obtained by comparing the channel chosen by all the non-optimal
TSCH solutions in each time slot with the channel chosen by Optimal TSCH.
Figure 4.2: Total number of received packets at the sink on the Tutornet testbed.
(a) Average regret per time slot on the Tutornet testbed.
(b) Percentage of optimal channels used on the Tutornet testbed.
Figure 4.3: Average regret and percentage of optimal channels on the Tutornet
testbed.
Figure 4.2 shows that both types of MAB-based solutions (First Good Arm
MABO-TSCH and Best Arm MABO-TSCH ) outperform Default TSCH and Cen-
trally blacklisted TSCH. The average number of received packets is 43% higher when
Best Arm MABO-TSCH is compared with Default TSCH. The performance of Best
Arm MABO-TSCH is less than 10% lower than Optimal TSCH.
Figure 4.3a shows that Best Arm MABO-TSCH has the least regret and that
the average regret per time slot converges more quickly for this type of TSCH.
From the channels that are employed during the network operation, it can
be concluded that Best Arm MABO-TSCH chooses the best channel in approximately
75% of the transmissions (Figure 4.3b), while Centrally blacklisted TSCH
employs the best channel approximately 60% of the time. Even small improvements
in the decision-making process for selecting the best channel can
dramatically improve the performance of the network.
Following this, we analyzed the results from the Soda and Grenoble testbeds.
Figures 4.4a and 4.4b show the total number of packets received at the sink with
regard to the Soda and Grenoble testbeds, respectively. The graphs of regret per
time slot and percentage of optimal channels show similar performance to the results
from the Tutornet testbed and are not displayed here.
(a) Soda testbed (b) Grenoble testbed
Figure 4.4: Total number of received packets on the Soda and Grenoble testbeds.
Even though the results were different for all three testbeds in terms of absolute
values, all the MABO-TSCH solutions increased the number of received packets
and were much closer to the optimal results. While in the Tutornet testbed, the
number of received packets increased by up to 43% compared with Default TSCH,
in the Grenoble testbed, the rate was about 8%.
The Tutornet and Grenoble testbeds showed higher variability in the results
than the Soda testbed because there were not many traces available from the Soda
testbed (as discussed in Section 2.2). Hence, for Soda, multiple runs of the experiments
were carried out with the same sequence of PDR statistics, only changing the seed of
the random number generator used in the simulations.
It is clear from the experiments that the optimal parameters are environment-dependent,
and thus the tuning process may need to be periodically repeated to
keep abreast of changes in the network and be self-adaptive to each environment.
This is discussed in more detail later.
4.3.3.2 Evaluating an event-triggered application
We now examine the case of an event-triggered application, where the sensor nodes
have to forward a packet towards the sink at random moments. This application is
often found in alarm systems and the key factors that must be optimized are: (i)
reliability, which is measured by the percentage of packets that successfully reach
the sink, and (ii) energy consumption. It should be pointed out that reliability
can reach 100% in any network, as long as a sufficient number of link-layer
retransmissions take place in each hop. However, in the networks in which the
simulations were carried out, it was not possible to obtain 100% reliability in all
three testbeds, even with an optimal policy and 3 link-layer retransmissions.
Since children nodes are those that are aware of which time slots will be used
for data communication, they must be responsible for building/maintaining the
blacklist and make use of the data packets to embed this information. Algorithms 4
and 5 should be used in this case. In the experiment, each node randomly chooses
to transmit one packet at the beginning of every slotframe with a probability p
equal to 0.01.
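The traffic intensity implied by these settings follows from the slotframe size given in Section 4.3.2 (101 time slots of 15 ms). The Python sketch below works it out and draws a random event trace; the function names are hypothetical and the per-slotframe decision is modeled as an independent Bernoulli trial.

```python
import random


def slotframe_ms(slots=101, slot_ms=15):
    """Duration of one slotframe in milliseconds."""
    return slots * slot_ms


def mean_gap_ms(p=0.01, slots=101, slot_ms=15):
    """Mean time between packets of one node: one slotframe divided by p."""
    return slotframe_ms(slots, slot_ms) / p


def generate_events(num_slotframes, p=0.01, rng=None):
    """Indices of the slotframes in which a node decides to transmit."""
    rng = rng or random.Random()
    return [sf for sf in range(num_slotframes) if rng.random() < p]
```

One slotframe lasts 1515 ms, so with p = 0.01 each node generates a packet roughly every 151.5 s on average, which is sparse enough that parents cannot predict which time slots carry data.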
We also simulated all the 5 types of TSCH to find the optimal parameters for
the algorithms, in a similar way to what was carried out in Section 4.3.3.1. We used
the datasets from the same three testbeds: Tutornet, Soda and Grenoble. In the
case of the Tutornet testbed, the optimal value of N was equal to 12, the optimal
value of ε was found to be 0.125, and the optimal k was 4. Concerning the Soda
testbed, the optimal value of N was equal to 10, the optimal value of ε was found to
be 0.06, and the optimal k was 5. Finally, in the case of the Grenoble testbed, the
optimal value of N was equal to 10, the optimal value of ε was found to be 0.025,
and the optimal k was 4. The number of link-layer retransmissions was equal to 3
in all the simulations.
With regard to the Tutornet testbed, Figure 4.5 shows the end-to-end reliability
and the average number of retransmissions per packet.
Figure 4.5: Reliability and average number of retransmissions per received packet
on the Tutornet testbed.
As can be seen in Figure 4.5, both the MABO-TSCH-based solutions improved
the average reliability from about 50% for Default TSCH to almost 90%. They
outperformed the centralized solution by more than 10%.
Figure 4.5 shows that the improvement in reliability is due to the reduction in the
number of required retransmissions. While blind frequency hopping required an
average of at least 1.4 retransmissions per packet, blacklisting solutions (both centralized
and MABO-TSCH-based) reduced the average number of retransmissions to 0.8.
Reducing the number of retransmissions mainly affects the power consumption of
the sensor nodes and the overall delay for the packets.
Figure 4.6 shows the average reliability and the average number of retransmis-
sions per packet concerning the Soda testbed.
Figure 4.6: Reliability and average number of retransmissions per successfully received
packet on the Soda testbed.
The results obtained from the Soda testbed showed a higher degree of reliability,
which was close to 88% even for Default TSCH. The MABO-TSCH-based solutions
improved the reliability from about 88% to 98%, and also outperformed the centralized
solution. The average number of retransmissions was also reduced from more
than 1.0 to about 0.3.
Finally, concerning the Grenoble testbed, Figure 4.7 shows the average reliability
and the average number of retransmissions per packet.
Figure 4.7: Reliability and average number of retransmissions per successfully received packet on the Grenoble testbed.
The results obtained from the Grenoble testbed showed reliability and an average number of retransmissions that were very similar to the values found for the Soda testbed. The reliability was increased from less than 90% to more than 98%. The average number of retransmissions was also reduced from about 1.0 to less than 0.2.
The results involving three different datasets show that the algorithms adapt to different environments with distinct levels of interference and multipath fading. As noted previously, the environment changes the optimal parameters that need to be employed in the algorithms. However, the fine-tuning process should not incur a large amount of overhead on the network operation, since it depends on PDR statistics from regular traffic.
4.4 Experimental results
We implemented First Good Arm MABO-TSCH and Best Arm MABO-TSCH on OpenWSN 1.10 to evaluate our solutions experimentally.⁴ The default OpenWSN implementation was modified to disable the use of the RPL protocol. As in the simulation, we used a fixed routing tree and a static schedule based on MCC. We only tested the implementation on Tutornet, as this is the testbed that we have access to and it gave us the flexibility to carry out longer experiments, so that less time was necessary for debugging the hardware. The same set of 40 nodes from which the simulation traces were gathered was used. Node #1 was set as the sink and the other 39 nodes as sensors.
As in the simulations, the size of the time slots was set to 15 ms. The slot frame was composed of 101 time slots. Within each slot frame, there were 39 reserved time slots used for unicast communication (in which the blacklisting techniques are employed), 5 shared time slots for beaconing, and 56 time slots used for serial communication for logging and turning the radio off.
⁴ The source code of MABO-TSCH can be found at https://github.com/pedrohenriquegomes/mabo-tsch.
Three solutions were examined in the experiments: Default TSCH, First Good Arm MABO-TSCH and Best Arm MABO-TSCH. We executed 4-hour experiments with 5 repetitions for each setting. The repetitions were scattered throughout the day, in business and non-business hours. Experiments with different algorithms were repeated at similar times of the day to obtain similar patterns of external interference. An extra node was used to measure external interference levels, so it was possible to quantify how similar the interference was in different experiments.
ε was fixed at 0.05 for Best Arm MABO-TSCH. For First Good Arm MABO-TSCH, ε was fixed at 0.03 and k was set to 6.
Figure 4.8 shows the total number of packets received at the sink. An average improvement of about 23% can be seen for Best Arm MABO-TSCH when compared to Default TSCH. The larger confidence intervals of First Good Arm MABO-TSCH and Best Arm MABO-TSCH arise because MAB algorithms have to adapt to the dynamics of the environment and may undergo a larger variation at moments when there is a radical change in the environment (e.g., when there is greater external interference).
Figure 4.9 shows the total number of packets received for each of the 5 repetitions of the experiment, and the time at which each repetition was executed. Both MABO-TSCH-based algorithms outperformed Default TSCH in all the experiments. During non-business hours (12am-9am), the improvement was much higher than during business hours (9am-5pm), when external interference increases. Even though all the experiments were repeated at similar periods of the day, the external interference perceived by each experiment was different, since we cannot control the use of Wi-Fi in the building.
Figure 4.8: Total number of received packets at the sink in the testbed-based experiment.
Figure 4.9: Total number of received packets at the sink over time.
The noise measurements enabled us to calculate the correlation between the different experiments. We took a moving average with a length of 40 time slots to filter the measurements, and calculated the minimum correlation for each of the 16 channels. All the experiments had a cross-correlation of at least 0.25 for all the channels, except channels 15, 16, 17, 18, 19 and 22, where the correlation was close to 0 (which means that the interference pattern was different in these channels). Even though external interference patterns differed, we did not detect any anomalies that could have drastically interfered with the results. These differences in external interference may explain why the performance of First Good Arm MABO-TSCH, during the interval of 3pm-7pm, was slightly better than that of Best Arm MABO-TSCH.
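The filtering and correlation steps described above can be sketched as follows. This is a minimal illustration and not the analysis script used for the experiments: it assumes that each run stores one noise sample per time slot for every channel, and that the correlation is the zero-lag Pearson coefficient. The names moving_average, pearson and channel_correlations are hypothetical.

```python
from statistics import mean


def moving_average(samples, window=40):
    """Simple moving average over a sliding window of `window` samples."""
    return [mean(samples[i:i + window])
            for i in range(len(samples) - window + 1)]


def pearson(a, b):
    """Zero-lag Pearson correlation of two equal-length sequences."""
    mean_a, mean_b = mean(a), mean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant trace carries no correlation information
    return cov / (var_a * var_b) ** 0.5


def channel_correlations(run_a, run_b, window=40):
    """Per-channel correlation between two runs.

    run_a and run_b map a channel number to its list of noise samples
    (assumed layout: one sample per time slot).
    """
    return {ch: pearson(moving_average(run_a[ch], window),
                        moving_average(run_b[ch], window))
            for ch in run_a}
```

Taking the minimum of the returned values over all pairs of runs gives, for each channel, a figure comparable to the minimum correlation reported in the text.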
Figure 4.10 shows the channel usage in all 5 experiments for each type of TSCH. Default TSCH uses all channels equally, as expected, while First Good Arm MABO-TSCH and Best Arm MABO-TSCH more often employ channels with less interference, such as 20, 25 and 26. First Good Arm MABO-TSCH and Best Arm MABO-TSCH still select channels with high levels of interference because of the limited number of channel offsets available in the optimal schedule created by MCC.
Figure 4.10: Channels used by all the leaf nodes (the light bars are failed transmissions and the dark bars are successful ones).
Hence, there is a trade-off between a compact schedule (which leads to higher throughput) and the freedom to choose the best channel at every time slot. It is also clear that the optimization of the TSCH schedule influences the use of blacklists, since optimal schedules restrict the set of channels that can be used. To maximize throughput, reducing the size of the schedule may be preferred (even if it reduces the availability of channels). On the other hand, when improving reliability in event-triggered applications, the size of the schedule is less important, since being able to pick the best channels is the crucial factor for reducing the number of retransmissions per packet.
4.5 Lessons learned
Based on the results both in the simulation and real experiments, MABO-TSCH
showed that online learning is a viable means of tackling the blacklisting problem.
Some lessons have been learned from the evaluation process:
(i) MABO-TSCH is effective for both periodic and non-periodic data traffic;
(ii) If we want to optimize the overhead and utilize ACK packets to embed the blacklist information, periodic data traffic is required, since the parent has to know when packets are expected;
(iii) In the simulation, MABO-TSCH outperformed an ideal centralized solution and obtained results close to the optimal solution, which demonstrated the effectiveness of online learning;
(iv) In the simulations, MABO-TSCH was found to require long intervals to converge, which may explain the poorer results obtained in the real experiments, and suggests that it requires fine adjustment of other network parameters. One solution is to dynamically adapt the moving average weight used for calculating the average reward so that, after intervals with more losses, the measurements are more significant;
(v) MABO-TSCH has been shown to improve the performance of event-triggered applications. However, with this type of traffic, channel quality estimation is a major challenge, since the low rate of packet transmissions affects its accuracy. One way of overcoming this issue is to leverage packets such as keep-alive messages to improve the channel quality estimation.
4.6 Conclusions
MABO-TSCH is a solution for distributed blacklisting that is optimized for multi-hop networks and compliant with the IEEE 802.15.4 TSCH standard. The solution involves the use of three key algorithms. The first algorithm assigns multiple channel offsets for each time slot so that each link has a set of frequencies to choose from. All of these are orthogonal to other scheduled links, and can completely avoid interference between nodes in a multi-hop network. The second algorithm provides a pair-wise blacklist negotiation mechanism with little overhead. The third algorithm carries out a channel quality estimation based on the multi-armed bandit problem that can achieve near-optimal results without requiring any type of learning phase or hardware-related parameters.
In the main scenario studied, which involved a 40-node indoor network, MABO-TSCH outperformed the default blind frequency hopping with a 43% higher throughput in the simulation, and a 23% higher throughput in the real experiments. MABO-TSCH selects the best frequency approximately 75% of the time. It improves network performance even when only a limited number of channel offsets are available in each time slot.
Creating optimal TSCH schedules limits the set of frequencies that the blacklisting mechanism can use. There is therefore a trade-off between the optimality of the schedule (its length) and the usefulness of the blacklisting. It remains an open question how far the joint optimization of schedules and the blacklisting mechanism can further improve the performance of networks in highly dynamic environments.
Chapter 5
Thompson sampling-based multi-channel RPL
The main objective of a routing protocol in multi-hop wireless networks is to create an optimal path that connects the source and the destination nodes and minimizes the use of resources.¹ It plays a special role in LLNs, as this type of network is mainly composed of unreliable links that suffer from external interference and have wide fluctuations in quality over time. RPL is the protocol standardized by the IETF for the routing layer of LLNs. More details on the RPL protocol can be found in Section 3.2. Even though RPL was designed for dynamic networks, several studies have shown that it underperforms in dynamic environments and mobile networks [111]. The main reason is its slow responsiveness to variations in the link quality. RPL has to constantly calculate the ETX of links along the path to build the best forwarding trees.
Estimating the ETX is a challenging task because it requires the exchange of packets. If a node needs to quickly react to changes in the link quality, it has to keep an ETX estimation not only of its current parent but also of other neighbors, to select a new better path proactively. Hence, there is a trade-off between the exploitation of the currently selected path with a known minimum cost, and the exploration of alternative paths to ensure an accurate estimation of their costs and respond to network changes.
¹ This chapter includes work from [64].
Another issue is that RPL was not designed to work on multi-channel networks [111]. As shown in Section 2.3.4, each channel has a different quality and, when RPL is used in multi-channel networks, it ends up operating with aggregated statistical data from all the channels, which may mislead the routing decisions.
Another challenge in RPL implementations is how to leverage broadcast packets to improve the link quality estimation. Usually, RPL estimates the ETX using unicast packets, since the data is followed by ACK packets and it is easy to measure the packet losses. In some scenarios, where (unicast) data packets are not often transmitted, broadcast packets are one way of obtaining a better ETX estimation of many links. In addition, each broadcast packet can be received by all the nodes that are listening, which increases the number of nodes that can keep track of their neighbors' statistical link qualities and may cause the RPL algorithms to react more quickly to network changes or mobility.
In this chapter, we introduce the Thompson sAmpling-based MUlti-channel
RPL protocol (TAMU-RPL). In TAMU-RPL, the selection of the preferred parent
in the RPL protocol is modeled as a Multi-Armed Bandit (MAB) problem and the
use of the Thompson Sampling heuristic is investigated to improve the reactiveness
of RPL.
TAMU-RPL has a dynamic link quality estimation algorithm that keeps track
of the link quality of a larger number of neighbors. It regards unicast packet trans-
missions as the main source for ETX estimation, but also leverages the broadcast
nature of wireless communication and uses information from other packets that are
overheard. Lastly, an improved version of the DAGrank calculation algorithm is
designed, which carries out the ETX estimation for individual channels.
The main contributions of this work are as follows:
(i) an optimized RPL based on the MAB problem that improves the agility to react to link quality changes;
(ii) a modified RPL Objective Function that makes use of statistics from different channels for the DAGrank calculation of neighbors;
(iii) a hybrid ETX estimation algorithm that examines both unicast and broadcast packets and uses physical information such as RSSI and the "PDR versus RSSI" profile of channels to improve the ETX estimation;
(iv) the implementation and empirical evaluation of the TAMU-RPL protocol, both in simulation and on a real testbed.²
² The source code of TAMU-RPL can be found at https://github.com/pedrohenriquegomes/tamu-rpl.
5.1 Related work
There is a vast literature related to routing in LLNs. Two surveys on the different protocols and techniques can be found in [13, 112]. Routing protocols in LLNs can be classified as flat, hierarchical or location-based [112]. In flat protocols, every node plays the same role and the contention for medium access is distributed, as is the path creation. In hierarchical routing, some nodes play special roles (e.g., cluster-head) and the medium access is scheduled, which implies more efficiency and fairness, but higher complexity due to synchronization issues. Lastly, location-based routing requires the notion of the relative (or absolute) position of the nodes and is usually based on centralized path creation. The flat protocols evolved more towards standardization as these types of protocols are more decentralized and dynamic. Among flat protocols, we can distinguish the gradient-based algorithms, where path creation is based on a gradient value calculated at each node that tells how "distant" the node is from a given destination. In many LLNs, a common destination node is the sink, which works as a gateway to another network.
Our interest in this work is in ways of improving the calculation of the gradient value for this type of routing protocol, i.e., flat gradient-based routing. We also want to focus on protocols that consider the use of multiple channels, since these can improve the throughput, reduce the co-channel interference and counter jamming [69]. Surveys on multi-channel protocols can be found in [48, 113]. There are many different centralized algorithms proposed for joint and disjoint assignment of channels and routing paths [113]. Although such routing protocols can be highly optimized [114], they usually do not cope well with a dynamic scenario where nodes leave and join the network constantly and high levels of external interference may be expected. Our main aim in this work is to apply the multi-armed bandit framework to solve the routing problem and implement a solution for gradient-based routing protocols that runs in a distributed way. Since RPL has been standardized to work with TSCH networks, we aim to improve it. Our objective is to use multi-armed bandit optimization to make RPL more reactive and leverage the use of multiple channels in TSCH networks. We briefly review some previous work that solved the routing problem through multi-armed bandit optimization and then review the optimizations proposed for RPL that rely on online learning and other optimization techniques.
The multi-armed bandit problem has been widely studied and applied to practical problems in the realm of wireless communication, such as opportunistic spectrum access [107] and reconfigurable antennas [115]. In [116], the authors introduced efficient policies for the problem of selecting multiple random variables (each one associated with an arm) and noting the reward obtained by each random variable. In the context of routing, this framework corresponds to the execution of source routing, where a source node selects a pre-determined path at each time. The authors in [117] examined the same source routing scenario, but only the cumulative end-to-end cost was observed. Since paths share links, the arms that are played (the chosen paths) are not independent, and the policy takes advantage of that to reduce the time complexity of the algorithm. Both works [116, 117] considered the question of source routing and require each node to have global knowledge of all possible paths. In LLNs, source routing is restricted to downward traffic, since only the sink node has such global knowledge (usually not completely up-to-date) and has the capability for routing calculation.
The work in [118] was the first, to the best of our knowledge, to tackle the problem of optimizing hop-by-hop decisions. The authors in [118] showed that the regret lower bound for hop-by-hop policies is the same as that of source routing, which means that there is no advantage in using source routing, even if the reward is observed for each link. The hop-by-hop decision-making algorithm employed is based on KL-UCB. When a packet needs to be forwarded, an index is calculated for each link using the Kullback-Leibler divergence between two Bernoulli distributions. The minimum cumulative index from the sensor node to the sink must also be calculated, for instance by using the Bellman-Ford algorithm, as suggested in [118]. The packet is forwarded over the link that minimizes the summation of the cumulative indexes. Even though the forwarding decision is executed at each hop, all the nodes still need to partially know the topology so that they can calculate the shortest path to the sink. Besides, the heavy computation of all these proposals makes them impractical to implement in LLNs.
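For reference, the Kullback-Leibler divergence between two Bernoulli distributions with success probabilities $p$ and $q$, and the generic form of the KL-UCB index built on it, are given below. This is the textbook formulation for a reward-maximizing bandit; the exact index used in [118], which is defined on link costs, may differ in its details. Here $\hat{p}_a$ is the empirical success rate of arm $a$, $N_a(t)$ is the number of times the arm has been played up to round $t$, and $c$ is a small non-negative constant (all three symbols are introduced only for this illustration).

\[
\mathrm{KL}(p, q) = p \ln\frac{p}{q} + (1 - p) \ln\frac{1 - p}{1 - q}
\]

\[
U_a(t) = \max\left\{ q \in [\hat{p}_a, 1] \;:\; N_a(t)\, \mathrm{KL}(\hat{p}_a, q) \le \ln t + c \ln\ln t \right\}
\]

Since $\mathrm{KL}(\hat{p}_a, q)$ is increasing in $q$ on $[\hat{p}_a, 1]$, the index has no closed form and is typically found by bisection, which illustrates why this family of policies is computationally demanding for constrained nodes.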
The RPL protocol has been extensively analyzed and tested in several related
works [111, 119], many of them concerned with RPL's capabilities for convergecast
routing and adaptability to interference-prone and mobile scenarios [120].
In [121] the authors improved the RPL link quality estimation algorithm by using a hybrid link-monitoring framework. It adaptively selects one of three schemes for the measurement of link quality: (i) a regular passive estimation based on unicast data and ACK packets, (ii) an overhearing mechanism where each of the nodes listens to packets transmitted to other nodes and counts the number of retransmissions to estimate the packet loss, and (iii) active probing using bursts of 10 packets sent to a specific target for faster and more precise estimation. These three mechanisms are combined by a controller that is based on the current state of the RPL state machine and determines the best way to measure the link to neighbors. This solution completely relies on events such as new DIO messages being heard and does not proactively monitor the link to all the neighbors. Besides, it is designed for single-channel networks and does not carry out link estimation across different channels.
The combination of different methods to estimate the link quality more effectively has been investigated by several studies. In [122] an algorithm is proposed to switch between passive, active and cooperative link quality estimators in IEEE 802.11 networks. Passive estimation involves regular unicast data packets for ETX calculation, while active estimation requires probing for estimating idle links. The cooperative scheme can leverage the overhearing characteristic of wireless links. The cooperative scheme introduced in [122] necessarily requires coordination and may impose extra overhead on the routing protocol.
In [123] the authors argued that good routes can be found simply by ranking the neighbors according to their rate of broadcast packets received. They empirically determined that the PDR to neighbors is correlated with the rate of broadcast packets received, such as Enhanced Beacons and RPL DIOs. Hence, an accurate link quality estimation to each neighbor is not necessary, since a simple ranking that accounts for the broadcast packets received is sufficient. The authors examined an RPL implementation with Trickle disabled so that the control packets could be broadcast with a fixed period. They also disregarded the external interference that could affect both the unicast and broadcast packets.
Finally, the authors in [120] took the first steps towards using a reinforcement learning-based scheme to derive an efficient link quality estimation algorithm that leverages both synchronous and asynchronous measurements. The proposal, called RL-Probe, carries out asynchronous probing to all the neighbors as soon as certain patterns in the RSSI and ETX measurements are detected. These patterns indicate that the network was disrupted and/or significant events occurred in the network topology, which may need an agile reaction. A multi-armed bandit-based algorithm was created for the synchronous probing, where the neighbor nodes are split into three groups based on their path cost toward the sink node. Each group of neighbors to be probed is an arm in the MAB problem, and the "exploration vs. exploitation" problem is solved employing a simple ε-greedy algorithm. The experiments in [120] did not provide a comprehensive evaluation of the best ε value. Moreover, the patterns that are used to start asynchronous probing are statically set for all nodes, which may not be the optimal solution. As pointed out in [120], the solution is not tailored to multi-channel networks based on RPL, and this is still an open question.
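An ε-greedy policy of the kind mentioned above can be sketched in a few lines. The sketch is generic and is not taken from [120]: the three arms stand for the three neighbor groups, and avg_reward (one running average reward per arm) and the function name epsilon_greedy are assumptions made for illustration.

```python
import random


def epsilon_greedy(avg_reward, epsilon, rng=random):
    """Return the index of the arm to play in this round.

    With probability epsilon an arm is drawn uniformly at random
    (exploration); otherwise the arm with the highest average reward
    observed so far is played (exploitation).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(avg_reward))
    return max(range(len(avg_reward)), key=lambda arm: avg_reward[arm])


# Three arms, one per neighbor group; the second has the best average so far.
arm = epsilon_greedy([0.2, 0.7, 0.4], epsilon=0.1)
```

The single parameter ε fixes the share of rounds spent exploring, which is why its value has a direct effect on how quickly the estimator reacts and on how much probing overhead it generates.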
It is clear from the state-of-the-art that there are still three open problems:
(i) how to implement a dynamic online learning-based link quality estimator that is simple to run on a constrained device and can improve the RPL protocol to explore links to neighbors while exploiting the current best parent selection;
(ii) how to improve ETX estimation through traffic overhearing that does not require any kind of coordination and leverages information such as RSSI and previous PDR measurements;
(iii) how to use statistics from individual channels in multi-channel networks and improve the estimation of next-hop link quality with the knowledge of aggregated and channel-specific statistics.
The goal of our TAMU-RPL proposal is to solve these three open problems.
5.2 RPL DAGrank calculation and preferred parent selection
The default objective function of RPL is MRHOF, which is a greedy solution where the end-to-end ETX is minimized. All the nodes that are part of a DODAG must calculate a DAGrank, which is a scalar value that represents the "cost" to forward packets to the sink. Even though OF0 (RFC 6552) is not implemented in its original form, its standard equation for calculating the DAGrank is generally used in all the RPL implementations. The DAGrank of a given node, $R(N)$, is calculated as the DAGrank of its preferred parent, $R(P)$, plus a rank increase, as in Equation 5.1.
\[ R(N) = R(P) + R_{inc} \tag{5.1} \]
where $R_{inc}$ is the rank increase, as in Equation 5.2.
\[ R_{inc} = (\mathit{Rf} \times \mathit{Sp} + \mathit{Sr}) \times m_{inc} \tag{5.2} \]
where $m_{inc}$ is the minimum rank increase, which by default has a value of 256, $\mathit{Rf}$ is the rank factor, $\mathit{Sp}$ is the rank step and $\mathit{Sr}$ is the rank stretch.
$\mathit{Rf}$ is a factor that is usually multiplied by a link property, $\mathit{Sp}$ is a link property of a given neighbor and $\mathit{Sr}$ is an optional value that is used to allow the selection of a feasible additional parent.
The 6TiSCH minimal configuration specifies by default $\mathit{Rf}$ equal to 1, $\mathit{Sr}$ equal to 0 and $m_{inc}$ equal to 256. The parameter $\mathit{Sp}$ is computed as a normalized ETX, as in Equation 5.3.
\[ \mathit{Sp} = (3 \times \mathrm{ETX}) - 2 \tag{5.3} \]
Hence, the DAGrank for each neighbor is periodically calculated following Equation 5.4. The preferred parent is selected as the neighbor with the minimum $R(N)$. Hysteresis is applied, as recommended in RFC 6552, to avoid instability.
\[ R(N) = R(P) + ((3 \times \mathrm{ETX}) - 2) \times 256 \tag{5.4} \]
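As an illustration, Equations 5.1 to 5.4 can be written as the short Python sketch below, using the 6TiSCH defaults quoted above. The function and variable names are hypothetical, and the hysteresis recommended in RFC 6552 is left out for brevity.

```python
M_INC = 256  # minimum rank increase (6TiSCH default)


def rank_increase(etx, rf=1, sr=0, m_inc=M_INC):
    """Rank increase of Equation 5.2 with the rank step of Equation 5.3."""
    sp = 3 * etx - 2
    return (rf * sp + sr) * m_inc


def dag_rank(parent_rank, etx):
    """DAGrank obtained when routing through a given neighbor (Equation 5.4)."""
    return parent_rank + rank_increase(etx)


def preferred_parent(neighbors):
    """Pick the neighbor that yields the lowest DAGrank.

    neighbors maps a neighbor id to a (rank, etx) pair.
    """
    return min(neighbors, key=lambda nid: dag_rank(*neighbors[nid]))


# A perfect link (ETX = 1) adds exactly one minimum rank increase.
print(dag_rank(256, 1.0))   # 512.0
print(preferred_parent({"a": (256, 2.0), "b": (512, 1.0)}))   # b
```

In the usage example, neighbor "a" is closer to the sink but has a lossy link (rank 256 + 4 × 256 = 1280), so neighbor "b" (rank 512 + 256 = 768) is preferred.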
The calculation of ETX is the most important task for assigning an accurate DAGrank to every node and, hence, selecting the best path to forward the packets. The default ETX estimation specified in the 6TiSCH minimal configuration and implemented on OpenWSN simply measures the long-term ratio of the number of data packets transmitted to the neighbor (variable numTx) to the number of ACK packets received from the same neighbor (variable numTxAck).
Since the data packets are only exchanged with the currently preferred parent, the ETX estimation with respect to all other neighbors is not updated until the link to the current parent becomes too weak, which may take a long time and incur the loss of many packets.
5.3 TAMU-RPL
The Thompson sAmpling-based MUlti-channel RPL protocol (TAMU-RPL) con-
sists of three key algorithms.
The first algorithm (Section 5.3.1) uses a Thompson-sampling heuristic to keep a better estimate of ETX for a subset of neighbors. It explores the $K-1$ neighbors with lowest DAGrank while exploiting the current best parent. The algorithm estimates the probability distribution of ETX for all $K$ neighbors, using the number of unicast packets transmitted and acknowledgments received.
The second algorithm (Section 5.3.2) is added on top of the first. The nodes take note of the channel used for each unicast packet that is transmitted and thus can measure the ETX per channel. The next hop is now calculated employing the DAGrank of the neighbors (as in Equation 5.4), but the ETX in question is the one calculated for the channel being used in the current time slot.
The third algorithm (Section 5.3.3) offers a way to improve the ETX measurement by making use of broadcast packets. It uses the "PDR versus RSSI" profile of channels that is shown in Section 2.3.4 and improves the ETX measurements based on RSSI values from packets other than the regular unicast data.
Figure 5.1 shows the relationship between the algorithms and the nodes in the
network.
Figure 5.1: TAMU-RPL algorithms.
5.3.1 Thompson sampling-based ETX estimation
Thompson sampling is a heuristic that addresses the MAB problem by taking actions that maximize the expected reward based on a randomly drawn belief. It considers a set of contexts $\mathcal{X}$, a set of actions $\mathcal{A}$ and rewards in $\mathbb{R}$. In each round, an action $a$ is chosen and a reward $r$ is observed. The reward $r \in \mathbb{R}$ follows a distribution that depends on $a$ and on the current context $x$. The distribution of $r$ is parametrized by $\theta \in \Theta$. The prior distribution $P(\theta)$ represents the learner's prior belief in the parameter for the rewards distribution. After observing the triplets $D = \{(x; a; r)\}$, the learner obtains a posterior distribution $P(\theta \mid D) \propto P(D \mid \theta) P(\theta)$. The parametric likelihood $P(D \mid \theta)$ is the probability of observing $D$ from the rewards parametrized by $\theta$.
In TAMU-RPL, each action $a \in \mathcal{A}$ corresponds to the selection of a particular neighbor as the preferred parent. The underlying multi-armed bandit model has one arm linked to each of the $K$ neighbors with lowest DAGrank. Each node has to decide which arm (neighbor) it should play to optimize the reward. We only applied Thompson sampling to improve the ETX estimation in the DAGrank calculation. The equation used to calculate the DAGrank is the same as in Equation 5.4.
The number of acknowledged packets can be modeled as a binomial random variable with parameter $p$, where $p$ is the PDR of the link. The prior distribution is Beta. The reward is the number of successful transmissions $S_n$ in the last $N$ trials.
The algorithm starts with the prior distribution $\mathrm{Beta}(1, 1)$ (uniform over [0, 1]) for all the neighbors. The posterior distribution is as follows [124]:
\[ \mathrm{Beta}\left(1 + \sum_{i=1}^{n} x_i,\; 1 + \sum_{i=1}^{n} N_i - \sum_{i=1}^{n} x_i\right) = \mathrm{Beta}(1 + S_n,\; 1 + F_n) \tag{5.5} \]
where $x_i$ is the number of successful transmissions up to iteration $i$, and $N_i$ is the number of trials (successful and failed transmissions) up to iteration $i$. The posterior can be simply calculated by using the parameters $S_n$ and $F_n$, which are the number of successful and failed transmissions, respectively.
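As a worked example of Equation 5.5 (the numbers are illustrative and not taken from the experiments), suppose that a node has observed $S_n = 8$ acknowledged and $F_n = 2$ failed transmissions to a neighbor. The uniform prior is then updated as follows:

\[
\mathrm{Beta}(1, 1) \;\longrightarrow\; \mathrm{Beta}(1 + 8,\; 1 + 2) = \mathrm{Beta}(9, 3),
\qquad
\mathbb{E}[p] = \frac{9}{9 + 3} = 0.75
\]

A PDR of 0.75 corresponds to an ETX of about $1/0.75 \approx 1.33$. With only 10 observations the posterior is still wide, so individual samples drawn from it can deviate considerably from this mean, which is what drives the exploration of neighbors that have rarely been used.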
The Beta distribution can be easily computed, since $S_n$ and $F_n$ are integers. The $k$-th order statistic of $n$ uniformly distributed variables is $\mathrm{Beta}(k, n + 1 - k)$. Hence, $\mathrm{Beta}(\alpha, \beta)$ can be obtained by choosing the $\alpha$-th smallest of $\alpha + \beta - 1$ uniform variable samples [125]. Other efficient ways of computing the Beta distribution are also feasible [126] if arithmetic operations such as logarithms, exponentials, etc. are available.
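The order-statistic construction above can be sketched as follows, assuming integer parameters, as is the case for $1 + S_n$ and $1 + F_n$. The function name sample_beta_int is hypothetical, and the sketch favors clarity over speed: it sorts all $\alpha + \beta - 1$ uniform samples instead of selecting the $\alpha$-th smallest directly.

```python
import random


def sample_beta_int(alpha, beta, rng=random):
    """Draw one sample from Beta(alpha, beta) for integers alpha, beta >= 1.

    The alpha-th smallest of (alpha + beta - 1) independent uniform samples
    on [0, 1) follows a Beta(alpha, beta) distribution.
    """
    if alpha < 1 or beta < 1:
        raise ValueError("alpha and beta must be positive integers")
    uniforms = sorted(rng.random() for _ in range(alpha + beta - 1))
    return uniforms[alpha - 1]


# Posterior after 8 acknowledged and 2 failed transmissions: Beta(9, 3).
pdr_sample = sample_beta_int(1 + 8, 1 + 2)
```

The cost of one draw grows with the total number of recorded transmissions, which is one reason why the number of sampled neighbors and the sampling period have to be limited on constrained nodes.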
The algorithm for ETX estimation in TAMU-RPL is shown in Algorithm 8.
Algorithm 8 Thompson sampling with binomial observations and Beta prior
for each t = 1, 2, ... do
    L = list of K neighbors with lowest DAGrank
    for each neighbor i in L do
        Independently sample $\theta_{i,t} \sim \mathrm{Beta}(1 + S_{i,t-1},\; 1 + F_{i,t-1})$
        Set $\widehat{\mathrm{ETX}}_i = 1 / \theta_{i,t}$
    end for
    Select the preferred parent as the neighbor with minimum cost using Equation 5.4, where ETX is replaced by $\widehat{\mathrm{ETX}}_i$
    Update number of successful transmissions ($S_{N_t,t}$)
    Update number of failed transmissions ($F_{N_t,t}$)
end for
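A compact Python sketch of Algorithm 8 is given below. It is an illustration rather than the OpenWSN implementation: the layout of the neighbor table, the function names and the use of random.betavariate from the Python standard library (in place of the order-statistic method discussed earlier) are assumptions.

```python
import random

M_INC = 256  # minimum rank increase (6TiSCH default)


def select_parent(neighbors, k, rng=random):
    """Return the id of the neighbor chosen as preferred parent.

    neighbors maps a neighbor id to a dict with the neighbor's DAGrank
    ("rank") and the counts of acknowledged ("S") and failed ("F")
    unicast transmissions.
    """
    # Only the K neighbors with the lowest DAGrank are sampled.
    candidates = sorted(neighbors, key=lambda nid: neighbors[nid]["rank"])[:k]
    best_id, best_cost = None, float("inf")
    for nid in candidates:
        stats = neighbors[nid]
        theta = rng.betavariate(1 + stats["S"], 1 + stats["F"])
        etx_hat = 1.0 / max(theta, 1e-6)  # guard against a sample of zero
        cost = stats["rank"] + (3 * etx_hat - 2) * M_INC  # Equation 5.4
        if cost < best_cost:
            best_id, best_cost = nid, cost
    return best_id


def record_outcome(neighbors, nid, acked):
    """Update the counters of the neighbor that was actually used."""
    neighbors[nid]["S" if acked else "F"] += 1
```

A neighbor with few recorded transmissions has a wide posterior, so it occasionally draws a high PDR sample and is tried as parent; once enough packets have been exchanged with it, its samples concentrate around the measured PDR and the selection becomes stable.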
TAMU-RPL does not require any parameter to be set, except K, which is the number of neighbors that will be included in the sampling. This parameter should be used to limit the processing required in the nodes, since the Beta calculation may require the generation of a large number of uniform random variables.
Finally, it should be noted that even though the estimated ETX ($\widehat{\mathrm{ETX}}_n$) is used to calculate the cost for each neighbor, the DAGrank that is broadcast by each node should use the measured ETX, as defined below.
\[ \mathrm{ETX}_n = S_n / (S_n + F_n) \tag{5.6} \]
where $\mathrm{ETX}_n$ is the measured ETX to neighbor $n$, and $S_n$ and $F_n$ are the number of unicast data packets received successfully and failed, respectively.
Hence, the DAGrank announced by the nodes should use Equation 5.4 with ETX as in Equation 5.6. If the nodes were to announce the estimated value for their DAGrank, this would increase the probability of loop formation during the initial learning phase when TAMU-RPL is still exploring the environment. As time goes by, it is expected that both $\widehat{\mathrm{ETX}}_n$ and $\mathrm{ETX}_n$ will converge to the same value for all neighbors.
Since generating Beta samples may take quite a long time on microprocessors, the execution of Algorithm 8 is resource-consuming and, thus, it should only run periodically. Short periods allow nodes to react to network changes more rapidly, but incur larger processing overhead and may increase the number of loop formations in the network. Our implementation allows us to configure the periodic interval for the execution of Algorithm 8.
5.3.2 Multi-channel DAGrank calculation
As demonstrated in Section 2.3.1, the link qualities vary considerably in different channels. Both the estimated and measured ETX used for the algorithm in Section 5.3.1 make use of the aggregated statistics over all 16 channels. We have improved the performance of TAMU-RPL by keeping track of the number of successful and failed transmissions per channel ($S_n^c$ and $F_n^c$, respectively) in Equation 5.6. Hence, each node stores the aggregated measured ETX, as in Equation 5.6, as well as the measured ETX per channel, as defined below.
\[ \mathrm{ETX}_n^c = S_n^c / (S_n^c + F_n^c) \tag{5.7} \]
In TSCH networks, the nodes re-synchronize after every data packet transmitted, using the time offset of ACK packets. In light of this, the nodes that do not transmit packets may very often need to exchange dummy packets with the clock synchronization source (which is usually the same as the RPL preferred parent) to prevent de-synchronization. These dummy unicast packets are referred to as Keep-Alive (KA) packets. They are sent periodically and can be used for improving the accuracy of the synchronization, as well as to update the link quality statistics. These unicast packets are only sent after a node joins the DODAG and are not forwarded towards the sink.
In the multi-channel version of TAMU-RPL, the destination of KA packets is
the preferred parent and should be calculated with the aid of Algorithm 8. When
transmitting data packets, however, nodes try to opportunistically select neighbors
that have a better link at the current channel. Thus, when a node is transmitting a
data packet, it calculates Equation 5.4 for each neighbor using the current
channel statistics ($S_n^c$ and $F_n^c$). If the rank of a given neighbor is smaller than the
rank of the currently preferred parent by a certain threshold ($thr_{rank}$), the data
packet is forwarded to the other neighbor instead of to the preferred parent. In this
way, whenever the preferred parent has a bad link quality in the current channel,
an alternative neighbor is used as a relay for data packets.
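The forwarding decision described above can be sketched as follows. This is a minimal illustration, not the OpenWSN implementation: the function name, the dictionary of per-channel ranks (standing in for the output of Equation 5.4 on the current channel) and the interpretation of $thr_{rank}$ as a relative margin are all assumptions made for the example.

```python
def choose_next_hop(ranks_on_channel, preferred_parent, thr_rank=0.125):
    """Pick the neighbor that receives a data packet on the current channel.

    ranks_on_channel: dict mapping neighbor id -> rank computed from the
        statistics of the current channel (hypothetical input that stands
        in for Equation 5.4).
    preferred_parent: id of the currently preferred parent.
    thr_rank: threshold; assumed here to be relative to the parent's rank.
    """
    parent_rank = ranks_on_channel[preferred_parent]
    # Neighbor with the smallest rank on this channel.
    best = min(ranks_on_channel, key=ranks_on_channel.get)
    # Deviate from the preferred parent only if the gain exceeds the threshold.
    if ranks_on_channel[best] < parent_rank * (1.0 - thr_rank):
        return best
    return preferred_parent
```

With a threshold of 12.5%, a neighbor with rank 600 replaces a preferred parent with rank 800 for this packet, while a neighbor with rank 750 does not. The threshold keeps the node from switching relays on small, noisy differences.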
This stepwise improvement has in practice (Section 5.5) proved to be important
to avoid using channels with bad quality, especially when the preferred parent is
located near a Wi-Fi access point that causes interference in a particular portion
of the 2.4 GHz band.
5.3.3 Passive link quality update
The last algorithm that makes up TAMU-RPL leverages the broadcast feature of
wireless communications and seeks to keep the (multi-channel) link quality estimation
updated even when no unicast packet is exchanged with the neighbors. This is
achieved using RSSI information from broadcast and unicast packets. The passive
link quality update algorithm is useful since (i) all the nodes in TSCH periodically
transmit (broadcast) beacons from which the RSSI information can be extracted
and used by all the neighbors, and (ii) the nodes may stop exchanging unicast packets
either because the routing table points to a different node or because the application
does not generate sufficient data; in this case, the overheard packets may be
the only source of information to maintain the link quality measurements to certain
neighbors.
The idea behind the passive link quality update is simple: for every (unicast or
broadcast) packet that a node successfully receives, the RSSI value is stored and
used to update the current estimated PDR to the transmitter node. However, even
though the idea is simple, there are a few challenges that have to be overcome.
First, the RSSI from the received packets indicates the link quality in the reverse
direction from what we want to estimate. We assume that links are symmetrical
and, hence, use the RSSI from received packets to estimate the PDR of the outgoing
link. Even though link quality has been found to be asymmetrical in many
indoor deployments [94], other works [127] have also shown that symmetry is a
valid assumption, especially for links with high or low PDR.
As was shown in Section 2.3.4, the channels can be characterized by the "PDR
versus RSSI" curve. Because of this, the current estimated PDR can be derived
from RSSI measurements by applying regression to the known "PDR versus RSSI"
curve. In our passive link quality update algorithm, all the RSSI values are taken into
account, both from unicast and broadcast packets, from overheard packets as well
as from packets directed towards the node estimating the link quality. During
the regression procedure there are two challenges that have to be addressed: (i)
the RSSI measurements vary largely due to external interference and the internal
noise that affects the accuracy of RSSI values, and (ii) the "PDR versus RSSI" curve
is not linear, which makes it computationally expensive to carry out regression on
microcontrollers. We were able to overcome these two challenges by applying a
Kalman filter to the RSSI measurements and simplifying the "PDR versus RSSI"
curve to two linear functions.
Kalman filters are used to make a more accurate estimate of unknown variables
based on a series of measurements observed over some time with statistical
noise. They have been used for different applications, including wireless link estimation
[128, 129]. The implementation of our Kalman filter was based on the
work of [128], but required a different filter for each of the 16 channels. Besides,
we did not assume that the nodes have a pre-calibrated "PDR versus RSSI" curve
but used PDR-RSSI pairs from network measurements to form our "PDR versus
RSSI" curve. In this way, the "PDR versus RSSI" curve changes over time and
captures the link quality dynamics.
The RSSI at a receiver node at time t can be modeled as:
$$z_t = x_t + v_t \tag{5.8}$$
$$x_t = x_{t-1} + w_{t-1} \tag{5.9}$$
where $z_t$ is the RSSI measurement and $x_t$ is the RSSI estimation at time $t$. The
noise in the measurement and in the estimation processes is modeled as Gaussian
random variables, $v_t \sim N(0, R)$ and $w_t \sim N(0, Q)$, respectively. $R$ is the measurement
variance, while $Q$ is the estimation variance. The update equations [130] are
as follows:
Time update equations (prediction):
$$\hat{x}_t^- = \hat{x}_{t-1} \tag{5.10}$$
$$P_t^- = P_{t-1} + Q \tag{5.11}$$
Measurement update equations (correction):
$$K_t = P_t^- (P_t^- + R)^{-1} \tag{5.12}$$
$$\hat{x}_t = \hat{x}_t^- + K_t (z_t - \hat{x}_t^-) \tag{5.13}$$
$$P_t = (1 - K_t) P_t^- \tag{5.14}$$
where $\hat{x}_t^-$ and $\hat{x}_t$ are the a priori and a posteriori RSSI estimates, respectively.
$P_t^-$ and $P_t$ are the a priori and a posteriori estimation error variances, respectively,
and $K_t$ is the Kalman gain.
The filter needs three parameters to work: $Q$, $R$ and $P_0$. Both $Q$ and $R$ can be
estimated offline or continuously estimated during the network operation. $Q$ can
be obtained from a set of RSSI measurements, and $R$ can be estimated from a set
of noise floor measurements. $P_0$ is hard to estimate since the initial error covariance is
unknown. It has been shown that the filter is asymptotically optimal regardless of
$P_0$ [131]. A common practice is to use $Q$ as the initial value of $P_0$.
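The scalar filter of Equations 5.10 to 5.14 can be sketched in a few lines. This is an illustrative sketch rather than the code that runs on the motes: the class and function names and the initial estimate `x0` are assumptions, and $P_0$ defaults to $Q$ following the common practice mentioned above.

```python
class ScalarKalman:
    """One-dimensional Kalman filter for RSSI (Equations 5.10 to 5.14)."""

    def __init__(self, q, r, x0, p0=None):
        self.q = q                         # estimation (process) variance Q
        self.r = r                         # measurement variance R
        self.x = x0                        # a posteriori estimate (assumed start)
        self.p = q if p0 is None else p0   # P_0, defaults to Q

    def update(self, z):
        # Time update (prediction): Eqs. 5.10 and 5.11.
        x_prior = self.x
        p_prior = self.p + self.q
        # Measurement update (correction): Eqs. 5.12 to 5.14.
        k = p_prior / (p_prior + self.r)
        self.x = x_prior + k * (z - x_prior)
        self.p = (1.0 - k) * p_prior
        return self.x


def make_channel_filters(q, r, x0, channels=16):
    """One independent filter per channel, as the text requires."""
    return [ScalarKalman(q, r, x0) for _ in range(channels)]
```

For instance, with $Q = R = 1$ and an initial estimate of -80 dBm, a first measurement of -70 dBm yields a Kalman gain of 2/3 and moves the estimate to about -73.3 dBm.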
Upon receiving any packet (unicast or broadcast), the nodes update the current
estimated RSSI ($\hat{x}_t$) based on Equations 5.10 to 5.14. After calculating the current
estimated RSSI, we map it to an estimated PDR. This is carried out
through a regression. We approximated the "PDR versus RSSI" curve by two
linear functions. The first linear function contains all the data points with PDR
below a threshold $r$, while the second contains all the data points above the same PDR
threshold. In our simulations and testbed evaluation, the threshold $r$ was equal to
90%. Figure 5.2 shows an example of a "PDR versus RSSI" plot with the suggested
approximation by two linear functions.
Figure 5.2: Example of linear regression with two linear functions and threshold r
equal to 90%.
Hence, the estimated RSSI and estimated PDR values are approximated by two
linear equations in the form of $\hat{y} = \alpha + \beta\hat{x}$, where $\hat{y}$ is the estimated PDR and $\hat{x}$ is
the estimated RSSI. The values $\alpha$ and $\beta$ can be estimated using the simple linear
regression equations as follows:
$$\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x} \tag{5.15}$$
$$\hat{\beta} = \frac{\sum_{i=0}^{p} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=0}^{p} (x_i - \bar{x})^2} \tag{5.16}$$
where $\bar{y}$ and $\bar{x}$ are the average PDR and RSSI over all the samples.
Two different values for $\hat{\alpha}$ and $\hat{\beta}$ are derived, one considering all data points with
PDR < 90% and the other considering all data points with PDR ≥ 90%. The number $p$ of
samples may vary and is limited by the available memory. We assume $p = 10$, which
requires up to 320 bytes to store the samples for all 16 channels.
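The two-segment fit can be illustrated with the short sketch below. It is a plain implementation of simple linear regression applied separately to the points below and above the PDR threshold; the function names and the handling of the degenerate case (all RSSI values equal) are assumptions made for the example.

```python
def fit_line(points):
    """Simple linear regression y = alpha + beta * x (Eqs. 5.15 and 5.16).

    points: list of (rssi, pdr) pairs. Returns (alpha, beta).
    """
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    if sxx == 0.0:
        # Assumption: with no spread in RSSI, fall back to a flat line.
        return y_bar, 0.0
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    return alpha, beta


def fit_two_segments(points, r=90.0):
    """Fit one line to the points with PDR < r and one to those with PDR >= r."""
    low = [(x, y) for x, y in points if y < r]
    high = [(x, y) for x, y in points if y >= r]
    return (fit_line(low) if low else None,
            fit_line(high) if high else None)
```

A segment with no data points returns `None`, which mirrors the fact that one of the two lines may not exist for a given channel.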
The data points (PDR-RSSI pairs) needed for the linear regression calculation
come only from unicast packets exchanged between the node calculating the linear regression
and its neighbors, since these are the only packets that give us more precise
PDR statistics because of the acknowledgment packets that follow them. Hence,
every node keeps track of the last $p$ unicast packets exchanged with each neighbor.
These packets can be data or KA packets. The PDR associated with each of the $p$
data points is calculated by following a moving average, as follows:
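$$PDR_{cur} = \begin{cases} \lambda\, PDR_{prev} + (1 - \lambda) \cdot 100, & \text{if ACK was received} \\ \lambda\, PDR_{prev}, & \text{otherwise} \end{cases} \tag{5.17}$$
where $PDR_{cur}$ is the current PDR estimation, $PDR_{prev}$ is the previous PDR
measurement, and $\lambda$ is the weight of the moving average.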
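The moving average of Equation 5.17 amounts to a one-line update per transmission attempt. In the sketch below, the function name, the parameter name `weight` and its value of 0.9 are assumptions chosen for illustration; the text does not state the value used.

```python
def update_pdr(pdr_prev, ack_received, weight=0.9):
    """Moving-average PDR update in percent (Equation 5.17).

    weight: smoothing weight in (0, 1); 0.9 is an assumed example value.
    """
    if ack_received:
        # A success pulls the estimate towards 100%.
        return weight * pdr_prev + (1.0 - weight) * 100.0
    # A loss pulls the estimate towards 0%.
    return weight * pdr_prev
```

Starting from a PDR of 80%, an acknowledged packet raises the estimate to 82% and a lost packet lowers it to 72% with this example weight.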
In the case of packets that have an ACK successfully received, the RSSI from
the ACK is associated with the PDR, as calculated in Equation 5.17. Since there is
no RSSI information for packets that did not have an ACK received, only successful
transmissions are included in the data points. However, for every packet lost (no
ACK received), the PDR estimation in Equation 5.17 is updated to keep track of
the most up-to-date link quality.
5.4 Simulation results
We first evaluated the performance of TAMU-RPL using simulations to assess
its effectiveness and compare its performance with an optimal solution. We used
a custom-made simulator written in the C language³. The simulator receives a set of
connectivity traces and, based on the PDR statistics, calculates the routing paths
with different types of RPL implementations. The simulator uses a TSCH slot
frame with 101 time slots (all of them of shared type), where control (DIO, KA,
etc.) or data packets are exchanged.
The DIO packets are broadcast by each node to announce its DAGrank to the
neighbors. Transmission starts with a fixed interval (by default 2 seconds) and the interval
is doubled until a network change resets the interval to the minimum value. The
default interval of KA packets is 10 seconds. The data packets are transmitted by
default with an interval of 30 seconds.
³ The source code of the simulator tool can be found at https://github.com/pedrohenriquegomes/tsch_scheduling_algorithms_and_simulator.
In Algorithm 8 we adopted $K = 20$, which is a number large enough to include
all neighbors of each node. We adopted a large $K$ because in simulations there
was no time restriction, unlike real experiments where processing power is
constrained. Finally, parameter $thr_{rank}$ was set to 12.5%.
As in Section 4.3, we also used the datasets obtained from the Tutornet, Soda
and Grenoble testbeds (Section 2.2), since these are the most realistic.
5.4.1 Different RPL implementations
We implemented two variants of RPL and two types of TAMU-RPL protocol so
that the results could be compared.
We first implemented RPL with MRHOF; this uses Equation 5.4 for the DAGrank
calculation by taking ETX as the ratio between the data packets transmitted
and ACK packets received. In RPL with MRHOF, the nodes have to determine an
initial ETX for the links to neighbors with which, until then, no unicast packet has
been exchanged. By default, this ETX value is equal to 4, which makes the algorithm
conservative. In the experiments, we evaluated how this parameter affects
the performance of RPL with MRHOF.
We also implemented Dijkstra with MRHOF, where the shortest-path tree is
calculated with the Dijkstra algorithm. The calculation is based on the complete
knowledge of the PDR matrices. The DAGrank calculation of RPL with MRHOF
is kept, which means that the weights of each edge in the shortest-path tree are
equal to RankIncrease, as in Equation 5.2.
Finally, we implemented two variants of TAMU-RPL. Single-channel TAMU-RPL
uses the aggregated statistics over all 16 channels. Hence, the algorithms
shown in Section 5.3.2 and Section 5.3.3 are not used. Multi-channel TAMU-RPL
uses all the algorithms shown in Section 5.3, which means that it considers
statistics for each channel when estimating ETX, and uses information from unicast
and broadcast packets.
5.4.2 Evaluating end-to-end ETX per node
We first investigated how the ETX of links used by TAMU-RPL changes over time
in networks where all the links have fixed PDR. The end-to-end ETX is defined as
the sum of the ETX of links in the routing tree from a sensor node to the sink.
Smaller ETX values indicate fewer retransmissions and the use of links with better
quality.
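The end-to-end ETX metric can be computed by walking the routing tree from a node up to the sink and adding the ETX of every link on the way. The sketch below assumes a hypothetical representation of the tree as a `parent` dictionary and of the link costs as a dictionary keyed by (child, parent) pairs.

```python
def end_to_end_etx(node, parent, link_etx, sink):
    """Sum the ETX of the links on the path from `node` to `sink`.

    parent: dict mapping each non-sink node to its preferred parent.
    link_etx: dict mapping (child, parent) to the ETX of that link.
    """
    total = 0.0
    hops = 0
    while node != sink:
        nxt = parent[node]
        total += link_etx[(node, nxt)]
        node = nxt
        hops += 1
        # A path longer than the number of nodes can only be a routing loop.
        if hops > len(parent):
            raise ValueError("routing loop detected")
    return total
```

For a two-hop path with link ETX values of 1.5 and 1.25, the end-to-end ETX is 2.75. Summing this value over all nodes gives the network-wide figure used in the evaluation over time.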
At first, only Single-channel TAMU-RPL, RPL with MRHOF and Dijkstra with
MRHOF were compared. We randomly chose 4 connectivity traces (as described
in Section 2.2.1) from the Tutornet testbed for the first set of evaluations. Each
simulation lasts 15 minutes, which is equal to the trace sampling interval. Figure 5.3
shows the final end-to-end ETX (in non-decreasing order) of all 40 nodes, with a
95% confidence interval.
Figure 5.3: End-to-end ETX for all the nodes in static networks in 4 different
conditions on the Tutornet testbed.
The results show that in all cases Single-channel TAMU-RPL obtains end-to-end
ETX values closer to those of the Dijkstra with MRHOF routing, and its confidence
interval is much narrower than that of RPL with MRHOF. This means that: (i) TAMU-RPL
can discover better routes even in a short simulation time, and (ii) it consistently
does so, unlike RPL with MRHOF, where the final route
depends very much on the first neighbor from which each node receives DIOs, which
most likely will be selected as the preferred parent.
5.4.3 Evaluating end-to-end ETX over time
We used more connectivity traces from the Tutornet testbed and carried out 8-hour
simulations to determine how the end-to-end ETX of the network varies over time.
Every 15 minutes a new set of PDR matrices is loaded for each link in this new
simulation. We added up the end-to-end ETX of all the 40 nodes in the network
and plotted the variations of this value over time in Figure 5.4.
The comparison was made between Single-channel TAMU-RPL, RPL with MRHOF
and Dijkstra with MRHOF. We took advantage of this set of simulations and also
assessed the effect that the initial ETX value for new links had on RPL with
MRHOF. In RPL with MRHOF, whenever a new neighbor is discovered, a default
value for the link to that neighbor has to be set before the unicast data packets are
exchanged. By default, this value is equal to 4.0 in the OpenWSN implementation.
We assessed to what extent the performance of RPL with MRHOF changes if this
value is reduced from 4.0 to 1.0.
Figure 5.4 shows the end-to-end ETX sum for all the nodes in the Tutornet
testbed. The simulation of RPL with MRHOF was carried out by assuming that
the initial ETX of the new links was equal to 1.0, 2.0, 3.0 and 4.0, and the results
are plotted in 4 different graphs to aid visualization.
Figure 5.4 has a stepwise behavior. This is because every 15 minutes a new set
of network connectivity matrices is loaded into the simulator. The Single-channel
TAMU-RPL obtained a sum of end-to-end ETX that is smaller than RPL with
MRHOF and much closer to the Dijkstra with MRHOF. RPL with MRHOF has
slightly better results when the default ETX is equal to 1.0, but this incurs more
variability of the ETX, which makes RPL with MRHOF perform much worse at
certain times. Since we want to make a fair comparison, we use the
default ETX value of RPL with MRHOF equal to 1.0 for the rest of the simulations.
(a) RPL with MRHOF default ETX = 1.0 (b) RPL with MRHOF default ETX = 2.0
(c) RPL with MRHOF default ETX = 3.0 (d) RPL with MRHOF default ETX = 4.0
Figure 5.4: End-to-end ETX sum for all the nodes on the Tutornet testbed.
Figure 5.5 shows the end-to-end ETX sum for all the nodes in all three testbeds
(i.e., Tutornet, Soda and Grenoble).
(a) Tutornet testbed (b) Soda testbed
(c) Grenoble testbed
Figure 5.5: End-to-end ETX sum for all the nodes on the Tutornet, Soda and
Grenoble testbeds.
It is clear that in the Soda testbed, Single-channel TAMU-RPL outperformed
the other RPL implementations, but in the Grenoble testbed, all three implementations
obtained similar results. It can be concluded that TAMU-RPL can obtain
better results in scenarios where there is greater variability of link qualities. From
now on we confine the rest of the analysis to the Tutornet and Soda testbeds,
as these are the ones where TAMU-RPL showed a more significant improvement.
5.4.4 Evaluating loop formation
As mentioned in Section 5.3, the DAGrank announced by the nodes in both Multi-
channel TAMU-RPL and Single-channel TAMU-RPL uses the measured ETX val-
ues. Since the node that is selected as preferred parent changes frequently before
a good estimate of ETX to the neighbors is achieved (or whenever the network
statistics change rapidly), the number of loops formed in TAMU-RPL is likely to
be higher than in RPL with MRHOF. RPL has mechanisms for loop detection and
in regular network operations, this problem would be resolved, with some overhead
and energy waste.
The loop detection mechanism that runs during network operations was not
implemented in the simulations. Instead, we were able to check (and avoid) the
formation of loops whenever a new routing tree was formed. We analyzed the
number of loops formed in the network while running a 24-hour simulation on both
Tutornet and Grenoble testbeds.
Figure 5.6 shows the number of loops formed (average and standard deviation)
for Single-channel TAMU-RPL and RPL with MRHOF on the Tutornet testbed.
We also examined two cases for RPL with MRHOF, when the default ETX is equal
to 4.0, and when it is equal to 1.0. Figure 5.6 shows the number of loops detected
for each node. If a loop is detected, the change of preferred parent is avoided and
the currently preferred parent is kept.
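The check performed in the simulator before a parent change can be sketched as a walk up the tree from the candidate parent. This is an illustrative sketch, not the simulator's code: the function names and the `parent` dictionary are assumptions made for the example.

```python
def would_form_loop(parent, node, candidate):
    """Return True if making `candidate` the parent of `node` closes a cycle.

    parent: dict mapping each node to its current preferred parent
        (the sink has no entry).
    """
    seen = set()
    current = candidate
    while current is not None and current not in seen:
        if current == node:
            return True        # the candidate already routes through `node`
        seen.add(current)
        current = parent.get(current)
    return False


def try_change_parent(parent, node, candidate):
    """Apply the change only if it is loop-free; otherwise keep the parent."""
    if would_form_loop(parent, node, candidate):
        return False
    parent[node] = candidate
    return True
```

A rejected change corresponds to one counted loop in the evaluation: the node keeps its currently preferred parent and the routing tree stays acyclic.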
Figure 5.6: Number of loops formed in the simulation on the Tutornet testbed.
It can be seen that only a few nodes formed loops in the routing tree if RPL with
MRHOF with a default ETX equal to 4.0 was used. In this case, the nodes spend
less time exploring the possible neighbors and stick to the first nodes from which
they heard a DIO. On the other hand, more loops are formed when more changes
are made in the routing tree. Accordingly, more loops are formed when Single-channel
TAMU-RPL or RPL with MRHOF with default ETX equal to 1.0 is employed.
It can be seen that Single-channel TAMU-RPL forms loops, but the number of
occurrences is lower than for RPL with MRHOF with default ETX equal to 1.0.
We also ran the experiment on the Soda testbed, where very few loops were
formed. The RPL with MRHOF did not show any loop formation and Single-
channel TAMU-RPL had a very low average loop formation per node of approxi-
mately 0.72.
5.4.5 Evaluating the number of packets received at the sink
The nodes transmit unicast data packets in addition to the DIOs and KAs in the
next two analyses. The data packets are forwarded towards the sink node by employing
the current routing tree that is built by the different RPL implementations.
The four different RPL implementations described in Section 5.4.1 were compared.
New data packets were generated by each non-sink node periodically every 30 sec-
onds. Whenever a packet needed to be relayed, the transmission was scheduled
to simulate some processing delay at each relay node with a delay randomly dis-
tributed between 5 and 10 time slots from the instant the packet was received. We
examined the Tutornet and Soda testbeds, the former with a 24-hour simulation
and the latter with a 4-hour simulation. The number of link-layer retransmissions
was set to 3. The total number of data packets transmitted in the network was
112,320, for the Tutornet testbed, while in the Soda testbed the total number of
data packets transmitted was 21,420.
Figure 5.7 shows the number of data packets received at the sink concerning
both testbeds.
Concerning the Tutornet testbed (Figure 5.7a), it can be seen that the number
of packets received when either variant of TAMU-RPL was employed is more than
double the number of packets received with RPL with MRHOF. Moreover, TAMU-RPL
was able to achieve results much closer to the results obtained by Dijkstra
with MRHOF.
(a) Tutornet testbed (b) Soda testbed
Figure 5.7: Number of packets received at the sink on the Tutornet and Soda
testbeds.
In the simulations involving the Soda testbed (Figure 5.7b), there was a similar
pattern of improvement, but the relative increase in the number of packets received
was smaller. In this testbed, the link qualities are better than in Tutornet, which
meant that there were fewer packet losses.
Since the number of packets received at the sink is a very important metric and
the improvements of Multi-channel TAMU-RPL were not very significant in relation
to Single-channel TAMU-RPL, we decided to investigate the
causes of this result further. For the purpose of this analysis, we considered TAMU-RPL
with two different implementations: (i) TAMU-RPL with Thompson-based ETX
estimation, which implements only the algorithm in Section 5.3.1 and is therefore
the same as the Single-channel TAMU-RPL introduced previously; and (ii) TAMU-RPL
with multi-channel DAGrank calculation, which implements only the algorithm
in Section 5.3.2. Since traffic was periodic, the algorithm in Section 5.3.3 did not
show significant improvement when implemented alone. However, in Section 5.5.2
it is shown that passive link quality estimation is important when there are periods
without unicast traffic in the network.
Figure 5.8 shows the number of data packets received at the sink considering
the two different implementations described above.
(a) Tutornet testbed (b) Soda testbed
Figure 5.8: Number of packets received at the sink on the Tutornet and Soda
testbeds.
It can be seen that implementing only the algorithm in Section 5.3.2 may significantly
improve the performance in some scenarios. In the Tutornet testbed, the number
of packets received at the sink was increased by more than 50% with TAMU-RPL
with multi-channel DAGrank calculation compared to RPL with MRHOF.
5.4.6 Evaluating the delay of packets received at the sink
The packets are timestamped when they are generated and when they reach the sink
node and we analyzed how long they take to traverse the routing tree. Whenever a
packet is relayed or retransmitted due to a packet loss, a delay is randomly chosen
with a uniform distribution between 5 and 10 time slots.
Figure 5.9 shows the number of time slots needed for the data packets to be
received at the sink in both testbeds.
(a) Tutornet testbed (b) Soda testbed
Figure 5.9: Delay (number of time slots) per packet received at the sink on the
Tutornet and Soda testbeds.
It can be seen that in both testbeds the delay of the packets received at the
sink was reduced in the case of both variants of TAMU-RPL. This means that the
packets required fewer retransmissions along the routing tree. It is also clear that
the Soda testbed had smaller delays because its routing tree has fewer hops than
the Tutornet testbed. Even though the relative improvement is not large, it can be
concluded that TAMU-RPL can achieve results closer to the Dijkstra with MRHOF
solution for the number of received packets, as well as the delay that these packets
undergo in the network.
5.4.7 Evaluating the energy consumption
We evaluated the energy consumption of the different solutions. The only difference
between TAMU-RPL and the other two solutions (RPL with MRHOF and Dijkstra
with MRHOF) is the number of control packets transmitted. The RPL protocol
broadcasts DIO messages that include the current node's DAGrank, among other
information. The Trickle algorithm is used to reduce the number of packets trans-
mitted. The initial interval is 2 seconds and this value is doubled for every DIO
that is transmitted until it reaches a maximum value (in our case 60 seconds) or a
networking event occurs, e.g. preferred parent is changed. Since in TAMU-RPL the
preferred parent is changed much more frequently, the number of DIOs transmitted
is expected to be increased.
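The effect of parent changes on the number of DIOs can be illustrated with a simplified timer that follows only the rule stated above: start at 2 seconds, double after every DIO up to 60 seconds, and restart at the minimum on a network event. This sketch is not the complete Trickle algorithm; the function name and the list of reset times are assumptions made for the example.

```python
def dio_send_times(duration_s, reset_times=(), i_min=2.0, i_max=60.0):
    """Return the times (in seconds) at which DIOs are sent.

    reset_times: times of network events (e.g. a preferred parent change)
        that reset the interval to i_min.
    """
    resets = sorted(reset_times)
    times = []
    t = 0.0
    interval = i_min
    r = 0
    while True:
        next_t = t + interval
        # An event before the next DIO restarts the timer at the minimum.
        if r < len(resets) and resets[r] < next_t:
            t = resets[r]
            r += 1
            interval = i_min
            continue
        if next_t > duration_s:
            break
        times.append(next_t)
        t = next_t
        interval = min(interval * 2.0, i_max)
    return times
```

Over 130 seconds without events, this rule sends 6 DIOs (at 2, 6, 14, 30, 62 and 122 seconds). A single parent change at 20 seconds raises the count to 8, which shows why frequent parent changes translate into additional transmission energy.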
Figure 5.10 shows the energy overhead due to the transmission of DIO packets.
We assume that nodes are regularly in receive mode and the main overhead caused
by the frequent changes in the preferred parents is the transmission energy of DIO
packets. We considered the energy consumption of a TelosB mote operating at
3V. Simply calculating the energy overhead would be unfair since TAMU-RPL
consumes more energy, but also delivers more packets to the sink. Therefore, we
calculated the energy overhead per data packet successfully received at the sink
node.
(a) Tutornet testbed (b) Soda testbed
Figure 5.10: Energy overhead per packet successfully received at the sink node.
It can be seen that in both testbeds (Soda and Tutornet) the energy overhead
of TAMU-RPL is higher, as expected. In the Soda testbed, the energy overhead is almost
double, which means that for each packet delivered at the sink,
TAMU-RPL requires about twice as many control packets, i.e., DIO packets. The energy
overhead in the Tutornet testbed is proportionally lower: in that case, the energy
overhead is only 15% higher in TAMU-RPL. This overhead could be reduced drastically
with changes to the Trickle algorithm and its parameters. Since the changes in the
routing topology are very frequent in TAMU-RPL, the interval of DIO broadcasts
could be increased faster, e.g., by multiplying the interval by 3 or 4 for every
DIO transmitted.
5.5 Experimental results
Our solutions were evaluated experimentally by implementing Single-channel TAMU-RPL
and Multi-channel TAMU-RPL on OpenWSN 1.10⁴. Both the TAMU-RPL
variants were compared with RPL with MRHOF.
5.5.1 Details of the implementation
TAMU-RPL has two algorithms that require more processing and, hence, had to
be optimized to run on microcontrollers. The first algorithm is the Thompson-sampling
ETX estimation described in Section 5.3.1, where Beta random variable
generation is necessary (see Algorithm 8 for more details). The second algorithm
is the passive link quality update described in Section 5.3.3, where a Kalman filter
has to be applied to RSSI samples and a fitting function has to be derived to map
RSSI into PDR.
5.5.1.1 Thompson-sampling ETX estimation
A Beta($\alpha$, $\beta$) random variable can be obtained by choosing the $\alpha$-th smallest of
$\alpha + \beta - 1$ uniform variable samples [125]. OpenWSN offers a 16-bit uniform random
number generator that was used as the basis for Beta generation. With regard
to the sorting algorithm, we compared the use of the qsort function offered by the
IAR C/C++ compiler for MSP430 microcontrollers with a hand-crafted Shellsort⁵
algorithm that was optimized for a small footprint. We assumed that $\alpha + \beta - 1 \leq 512$,
to limit the array to be sorted to 512 elements, and used a logic analyzer
to measure the processing time. The use of the qsort function increased the code
footprint by 1,472 bytes, while the handcrafted function required 692 bytes. We
repeated the sorting algorithm with 50 different random arrays using a TelosB mote.
The qsort showed an average processing time of 231.76 ms and standard deviation
of 3.27 ms, while the handcrafted implementation showed an average processing
time of 201.92 ms and standard deviation of 2.16 ms. It can be concluded that
the long processing time makes it impractical to deploy the algorithm in a TSCH
network with a time slot that has a duration of tens of milliseconds.
⁴ The source code of TAMU-RPL can be found at https://github.com/pedrohenriquegomes/tamu-rpl.
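The order-statistic construction mentioned above (for integer $\alpha$ and $\beta$, the $\alpha$-th smallest of $\alpha + \beta - 1$ uniform samples follows a Beta($\alpha$, $\beta$) distribution) can be sketched as follows. The sketch uses Python's standard generator and built-in sort in place of the 16-bit OpenWSN generator and the Shellsort routine; the function name is an assumption.

```python
import random


def beta_order_statistic(a, b, rng=random):
    """Sample Beta(a, b) for positive integers a and b.

    Draws a + b - 1 independent Uniform(0, 1) samples, sorts them and
    returns the a-th smallest one.
    """
    n = a + b - 1
    samples = sorted(rng.random() for _ in range(n))
    return samples[a - 1]
```

The cost is dominated by sorting up to $\alpha + \beta - 1$ values per sample, which is consistent with the long processing times measured on the TelosB and motivates the search for a generator that avoids sorting.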
Alternatively, we implemented the widely-used Cheng's BB algorithm [126] for
the Beta random number generator. Cheng's BB algorithm is very efficient but
requires floating-point computation and functions that are not efficiently computed
on microcontrollers, such as logarithms, exponentials, and square roots.
Even though the IAR C/C++ compiler offers support for floating-point arithmetic and all
the necessary functions, the footprint required was larger than the flash memory
available in TelosB, given the fact that OpenWSN version 1.10 requires about 40KB
for its basic operating system functions. As a means of continuing with the tests on
real hardware while still leveraging the OpenWSN operating system, we decided
to replace the TelosB hardware with the state-of-the-art OpenMote motes⁶. With
OpenMote, the Beta number generator showed an average processing time of 2.43 ms
with the handcrafted sorting-based implementation and 0.271 ms with Cheng's BB
algorithm. Even though the footprint of Cheng's algorithm is about 2 KB larger
than our handcrafted implementation, we decided to use Cheng's BB algorithm so
that we could test Algorithm 8 on a microcontroller, executing it within
a 10 ms time slot.
⁵ https://en.wikipedia.org/wiki/Shellsort.
Since in Algorithm 8 a Beta random number is generated for each stable neighbor,
we limited the number of neighbors per node to 15. This meant that Algorithm 8
could run with an average processing time of about 4 ms, which is short
enough to fit into a TSCH time slot. Within the slot frame that contains 101 time
slots, we reserved one time slot where each node runs the Thompson-sampling ETX
estimation.
5.5.1.2 Passive link quality update
The passive link quality update consists of two different algorithms. The first calculates
the estimated RSSI from measurements. Equations 5.8 to 5.14 are used for this
purpose. In the implementation, we updated the estimated RSSI per channel for all
the packets (including ACKs) received or overheard by all the nodes. Parameters
$Q$ (the estimation variance) and $R$ (the measurement variance), in Equations 5.11
and 5.12, respectively, were periodically updated. The last 10 RSSI measurements
and 20 noise measurements were stored and every 10 time slots (i.e., approximately
every second) $Q$ and $R$ were updated. $Q$ was computed as the variance of $x_t - x_{t-1}$
and $R$ was calculated as the variance of the noise measurements. Parameter $P_0$ was
considered to be equal to the first value of $Q$ that was computed.
⁶ TelosB is based on a 16-bit MSP430 microcontroller running at 8 MHz with 48KB of flash
and 1KB of RAM, while OpenMote is based on a 32-bit ARM Cortex-M3 microcontroller running
at 32 MHz with 512KB of flash and 32KB of RAM.
The second algorithm is responsible for mapping the estimated RSSI to an estimated
PDR value. All the nodes store the current estimated RSSI of the last
10 unicast packets (i.e., data, ACK or KA packets) that were successfully received
and associate that value with the current estimated PDR calculated with Equation 5.17.
This table with the last 10 pairs of "PDR versus RSSI" is stored per
channel. Every 10 time slots (i.e., approximately every second) the parameters $\hat{\alpha}$
and $\hat{\beta}$ (Equations 5.15 and 5.16) are calculated for two linear regression functions;
the first linear function uses the pairs with PDR < 90% and the second linear
function uses the pairs with PDR ≥ 90%, if they exist. Every time a new
broadcast packet is received or a unicast packet is overheard from a given neighbor,
the RSSI measurement is mapped to an estimated PDR. An estimated ETX
is computed as the inverse of the estimated PDR. If the estimated ETX is different
from the measured ETX (Equation 5.7), it means that the nodes have not
exchanged unicast packets with that neighbor and the measurement used by the
Thompson-sampling algorithm is out-of-date. In this case the values of $S_n^c$ and
$F_n^c$ are updated to reflect the current estimated ETX.
The time required for executing Equations 5.8 to 5.14 and for updating the parameters
$\hat{\alpha}$ and $\hat{\beta}$ (Equations 5.15 and 5.16) for the two linear regression functions
is under 0.3 ms on OpenMote; it can be considered negligible, so the update can be
executed for every packet received.
5.5.2 Evaluating TAMU-RPL in a controlled real environment
The initial ETX value for RPL with MRHOF was set to 1.0. As in the simulation,
the slot frames were set up with 101 time slots of 10 ms duration each. Out of the
101 time slots, 5 were used for serial communication between the computer and the
sensor nodes, 1 was used for running TAMU-RPL calculations and the rest (95 time
slots) were of shared type. The nodes exchange DIO packets, which are broadcast,
and KA and data packets, which are unicast. DIO packets are transmitted with an
interval of 30 seconds, KA with an interval of 10 seconds, and data packets with
an interval of 1 second.
We started the evaluation process considering a small setup with 5 OpenMote
nodes. The physical placement consists of one sink node close to a source node.
There are also 3 relay nodes placed at different distances from the sink and source
nodes, at approximately 2 meters, 5 meters, and 8 meters. The environment is
a working office with average levels of external interference from Wi-Fi networks.
Figure 5.11 illustrates the physical placement and the logical topology.
Figure 5.11: Physical placement of 5 OpenMote nodes (left) and logical topology
(right).
We filter out packets in OpenWSN so that, even though the sink and the source
nodes are close to each other, they do not process packets from each other. Both
nodes (sink and source) only hear packets from the relay nodes. This setup creates
three possible paths from the source to the destination, each one with two hops
that have the same link quality. The main purpose of such a scenario is to verify
how the source node selects one of the relay nodes as its best parent towards the
sink node. All experiments lasted 1 hour and were repeated 10 times at different
times of the day. Only the sensor node transmits data packets; hence, there is a
total of 3,600 packets transmitted in each experiment. All nodes report statistics
to a laptop connected through a USB cable once every slot frame.
In Experiment 1, we analyzed the reactiveness of TAMU-RPL. We start
unicast data transmission from the source to the sink with all three relays on. After
15 minutes Relay 1 is turned off and only Relay 2 and Relay 3 remain on. After
15 minutes more, Relay 2 is also turned off and only Relay 3 remains on. After
15 minutes more, Relay 1 is turned back on and Relay 1 and Relay 3 remain
on. Figure 5.12 shows the number of packets received at the sink node during
Experiment 1.
Figure 5.12: Number of packets received at the sink node for Experiment 1.
Initially all three relays are on; at 15 minutes Relay 1 is turned off; at 30 minutes
Relay 2 is also turned off; and at 45 minutes Relay 1 is turned back on.
Single-channel TAMU-RPL can deliver approximately 90% of the packets
transmitted in the experiment. It outperforms RPL with MRHOF by approximately
20%. This improvement is much smaller than the one obtained in the simulations,
probably due to the higher levels of external interference in the Tutornet testbed
when compared to the environment used in the experimental evaluation. Due to
the limited levels of external interference, the difference in packets received should
be mainly caused by packet losses while relay nodes are turned off.
We analyzed which preferred parent the sensor node selected over time
in Experiment 1. From the log files we noticed that, as expected, Single-channel
TAMU-RPL keeps changing the selected preferred parent over time, while RPL
with MRHOF usually sticks to the first parent that it selects. In general RPL
with MRHOF is much slower to react to nodes being turned off and drops many
packets before switching to an available neighbor. Besides, in some cases, the
preferred parent selected by RPL with MRHOF does not have the best link, which
incurs more packet loss due to low PDR in the link. Figure 5.13 shows the preferred
parent selection in two situations: in Figure 5.13a we can see a common case where
both protocols select the best parent, but RPL with MRHOF reacts more slowly to
the changes in the topology, and in Figure 5.13b we can see a case where RPL with
MRHOF did not select the best parent at time 15 minutes and kept using
Relay 3. In none of the experiments did we observe Single-channel TAMU-RPL
making the wrong decision and getting stuck with a parent that did not have the
best link.
We analyzed the log files from all 10 experiments and calculated the average
time that Single-channel TAMU-RPL and RPL with MRHOF take to switch to
the parent with the best link. We disregarded the cases where RPL with MRHOF
did not switch to the parent with the best link. On average, Single-channel
TAMU-RPL took 26.4 seconds with a standard deviation of 19.7 seconds, while RPL
with MRHOF took 144.9 seconds with a standard deviation of 96.2 seconds. We can
conclude that RPL with MRHOF is not able to quickly detect when nodes in the
network are turned on and off, and this incurs a large number of packet losses.
(a) Example of situation where RPL with MRHOF
makes right decisions, but delayed
(b) Example of situation where RPL with MRHOF
makes wrong decisions (chooses Relay 3 instead of
Relay 2 at time 15 minutes)
Figure 5.13: Preferred parent selection over time for the Experiment 1.
In Experiment 2, we analyzed how TAMU-RPL can detect external interference
in a specific channel and avoid neighbors located close to the sources of
interference.
We changed the FHSS used and only utilized three channels in the hopping
sequence. The IEEE 802.15.4 channels used were channels 12, 17 and 22, which
overlap with the three most commonly used Wi-Fi channels, i.e., channels 1, 6 and
11. Close to Relay 1, we placed a Wi-Fi access point working on channel 1,
which interferes mostly with IEEE 802.15.4 channel 12. The experiment started with
the Wi-Fi network off; after 15 minutes it is turned on and two laptops start to
exchange traffic using the iPerf3 software⁷. The Wi-Fi traffic lasts 30 minutes and is
turned off at 45 minutes. Since the Wi-Fi network interferes mostly with one particular
IEEE 802.15.4 channel (channel 12), we expect the source node to avoid
relaying packets to Relay 1 in time slots that use this particular channel. We also
set up a variable that keeps track of the channel utilized and makes sure that all
three IEEE 802.15.4 channels are used uniformly in a round-robin fashion.
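The overlap between these channels follows from the nominal center frequencies in the 2.4 GHz band: IEEE 802.15.4 channel k (11 to 26) is centered at 2405 + 5(k - 11) MHz and Wi-Fi channel n (1 to 13) at 2407 + 5n MHz. The Python sketch below checks which Wi-Fi channels overlap a given IEEE 802.15.4 channel. The helper names are hypothetical, and the nominal bandwidths of 2 MHz (IEEE 802.15.4) and 20 MHz (Wi-Fi) are simplifying assumptions that ignore spectral masks and adjacent-channel leakage.

```python
def ieee802154_center_mhz(channel):
    """Center frequency of a 2.4 GHz IEEE 802.15.4 channel (11..26)."""
    if not 11 <= channel <= 26:
        raise ValueError("2.4 GHz IEEE 802.15.4 channels are 11..26")
    return 2405 + 5 * (channel - 11)


def wifi_center_mhz(channel):
    """Center frequency of a 2.4 GHz Wi-Fi channel (1..13)."""
    if not 1 <= channel <= 13:
        raise ValueError("this sketch covers Wi-Fi channels 1..13")
    return 2407 + 5 * channel


def overlapping_wifi_channels(lln_channel, wifi_channels=(1, 6, 11),
                              lln_bw_mhz=2.0, wifi_bw_mhz=20.0):
    """Wi-Fi channels whose nominal band overlaps the given 802.15.4 channel."""
    lln_center = ieee802154_center_mhz(lln_channel)
    half_span = (lln_bw_mhz + wifi_bw_mhz) / 2.0  # bands touch at this distance
    return [w for w in wifi_channels
            if abs(wifi_center_mhz(w) - lln_center) < half_span]


# Channels used in the hopping sequence of Experiment 2
overlaps = {ch: overlapping_wifi_channels(ch) for ch in (12, 17, 22)}
```

Under these assumptions, each of the three channels in the hopping sequence overlaps exactly one of the Wi-Fi channels 1, 6 and 11, whereas a channel such as 15 falls in a gap between them.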
Figure 5.14 shows the number of packets received at the sink node during
Experiment 2.
The difference in performance increases to about 33% when we compare
Single-channel TAMU-RPL with RPL with MRHOF. This difference is mainly caused
by the fact that RPL with MRHOF often is not able to detect the degradation
of the link quality between the source and Relay 1 and does not switch the preferred
parent. Single-channel TAMU-RPL, on the other hand, switches the
preferred parent to Relay 2 and can avoid the path that suffers more interference.
Figure 5.15 shows an example of the preferred parent selection made by the
source node over time.
⁷ https://iperf.fr/.
Figure 5.14: Number of packets received at the sink node for Experiment 2. A
Wi-Fi network that is placed close to Relay 1 starts traffic at 15 minutes and stops
traffic at 45 minutes.
Figure 5.15: Preferred parent selection over time for the Experiment 2.
The difference in performance between Single-channel TAMU-RPL
and Multi-channel TAMU-RPL is very small but not negligible. Even though
Relay 2 is far from the source of interference, it also suffers from link degradation.
When we analyzed the log files, we noticed that the Multi-channel TAMU-RPL
protocol used channel 12 approximately 5% less often, which explains the slight
improvement in the number of packets received at the sink.
Finally, in Experiment 3, we analyzed whether TAMU-RPL can keep track of
the link qualities even in the absence of unicast data traffic being exchanged with
neighbors. The experiment setup is similar to that of Experiment 2. The difference is
that when the Wi-Fi traffic is started, the unicast data traffic from the source to the
sink stops. The unicast data traffic is resumed 30 minutes after the beginning of the
experiment and the Wi-Fi traffic continues until the end of the experiment. Since
there is no unicast data traffic when the external interference is large, the only way
of keeping track of the link quality is by using information from broadcast packets
(DIOs and KAs). We expect that TAMU-RPL can keep the link quality up-to-date
and that, when the unicast data traffic resumes, the best parent selection is changed
from Relay 1 to another neighbor.
Figure 5.16 shows the number of received packets at the sink node during the
Experiment 3.
Figure 5.16: Number of packets received at the sink node for Experiment 3. A
Wi-Fi network is placed close to Relay 1; Wi-Fi traffic starts and LLN unicast
traffic stops together at 15 minutes. The LLN unicast data traffic resumes at 30
minutes.
The improvement of Single-channel TAMU-RPL when compared to RPL with
MRHOF is approximately 25%. On the other hand,
the improvement of Multi-channel TAMU-RPL when compared to Single-channel
TAMU-RPL is negligible (approximately 2%). Inspecting the log files, we
noticed that the routing trees formed by Multi-channel
TAMU-RPL and Single-channel TAMU-RPL were the same in most of the
experiments. Even though Multi-channel TAMU-RPL analyzes the broadcast packets and
estimates the PDR of links even in the absence of unicast traffic, we conjecture that
the RSSI levels of broadcast packets did not change significantly in the environment
and/or the number of broadcast packets received was not sufficient to change the
PDR estimation that was calculated with unicast packets. The small improvement
of Multi-channel TAMU-RPL shows that more sophisticated link quality estimation
based on a mix of unicast and broadcast packets may not be so efficient in
environments with mild levels of external interference. However, simulations, especially
those using traces from testbeds with high levels of interference, show that
Multi-channel TAMU-RPL yields significant improvements in certain environments.
5.6 Lessons learned
Based on the results of both the simulations and the real experiments, TAMU-RPL
showed that more sophisticated online learning is viable but requires more
processing resources and may not be easily implemented in resource-constrained hardware.
Some lessons have been learned from the evaluation process:
(i) TAMU-RPL is also able to leverage the broadcast nature of transmissions and
keep the link quality estimation even in the absence of unicast data packets
sent towards the sink node;
(ii) In the simulation, TAMU-RPL outperformed the default RPL implementation
and was able to achieve performance close to the one obtained by Dijkstra-
based routing;
(iii) In real experiments TAMU-RPL outperformed the default RPL implementa-
tion and increased the number of packets received at the sink by up to 33%;
(iv) In real experiments, however, the evaluation shows that the improvements
of Multi-channel TAMU-RPL may be dependent on the levels of external
interference in the environment.
5.7 Conclusions
TAMU-RPL is a solution for improving the reactiveness of RPL to quick changes
in the link quality from nodes to their neighbors. It is aimed at improving the
performance of networks that suffer from large external interference levels. The
solution involves the use of three key algorithms. The first performs ETX
estimation based on Thompson sampling. It keeps a more accurate link quality
estimation to a subset of neighbors and allows the nodes to quickly change the
preferred parent when the link to it is degraded. Thompson sampling is the heuristic
chosen to solve the exploration vs. exploitation problem that needs to be tackled.
The second algorithm implements a multi-channel DAGrank calculation and makes
the nodes aware of the link quality to neighbors on individual channels. Whenever
a node predicts that the link to the currently preferred parent faces degradation on
the channel being used, it forwards the packet to a neighbor with better link quality.
The third algorithm leverages the broadcast nature of wireless transmissions and
uses physical layer information extracted from overheard packets to improve the
ETX estimation when unicast packets are not exchanged for long periods.
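The first two ideas can be illustrated with a small Python sketch of Thompson sampling over per-neighbor, per-channel success and failure counters. It is a simplified illustration under stated assumptions, not the TAMU-RPL implementation: it uses a Beta(S + 1, F + 1) posterior for each link, selects the neighbor with the largest sampled delivery probability, and ignores the DAGrank of the neighbors. The class and method names are hypothetical.

```python
import random
from collections import defaultdict


class ThompsonParentSelector:
    """Per-(neighbor, channel) Beta-Bernoulli Thompson sampling (illustrative)."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)
        self.success = defaultdict(int)  # acknowledged unicast transmissions
        self.failure = defaultdict(int)  # transmissions without an ACK

    def record(self, neighbor, channel, acked):
        """Update the counters after one unicast transmission attempt."""
        key = (neighbor, channel)
        if acked:
            self.success[key] += 1
        else:
            self.failure[key] += 1

    def sample_pdr(self, neighbor, channel):
        """Draw a delivery probability from the Beta posterior of one link."""
        key = (neighbor, channel)
        return self.rng.betavariate(self.success[key] + 1,
                                    self.failure[key] + 1)

    def choose(self, neighbors, channel):
        """Pick the neighbor with the largest sampled PDR on this channel."""
        return max(neighbors, key=lambda n: self.sample_pdr(n, channel))


# Usage: Relay 1 is heavily interfered on channel 12, Relay 2 is not.
selector = ThompsonParentSelector(seed=1)
for _ in range(200):
    selector.record("relay1", 12, acked=False)
    selector.record("relay2", 12, acked=True)
parent_on_12 = selector.choose(["relay1", "relay2"], channel=12)
```

Because the draw is random, links with few observations are still selected occasionally, which is how the sketch balances exploration and exploitation; links with many failures on a given channel are avoided on that channel only.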
We tested TAMU-RPL both in simulations using connectivity traces
from a 40-node indoor network and in real experiments. Due to hardware
limitations, the real implementation was tested in a 5-node network in a real
office environment. TAMU-RPL outperformed the default RPL implementation and
achieved results closer to the shortest-path Dijkstra algorithm in the simulations. It
was also able to double the number of packets received at the sink. In real
experiments, the performance improvement was up to 33% when compared to the default
RPL implementation.
It was possible to verify that the complex online decision making involved
in TAMU-RPL imposes a series of limitations in terms of the hardware required and
the complexity of the software. The legacy TelosB motes were not capable of
executing the algorithms and, even with state-of-the-art OpenMote hardware, some
limitations of the algorithms used for creating an accurate link quality estimator
restricted the performance of Multi-channel TAMU-RPL. Future investigation is
necessary to fine-tune the parameters of the algorithms and improve the
accuracy of the link quality estimator when considering multiple channels and
broadcast packets.
Chapter 6
Conclusions and future work
In this dissertation, several diversity techniques have been explored as ways to im-
prove protocols for low-power lossy networks. Even though the algorithms adopted
are general enough to be applied to any wireless technology employing multi-hop
and multi-channel communication, our target protocol was IEEE 802.15.4 TSCH,
which is the de facto standard for Industrial IoT.
We started our work by examining different network connectivity datasets
extracted from various testbeds (Chapter 2). These datasets have been made publicly
available and can be used for IoT experimentation and designing protocols. It was
concluded from this study that (more details in Section 2.4):
i ) different testbeds undergo different levels of external interference and
variations of PDR across frequency, time and space. An analysis of the testbeds'
features and link characteristics is important to ensure the accuracy of the
protocol assessments;
ii ) statistics such as the distribution of PDR across multiple channels, the number
of channels with a high PDR per link, the average number of neighbors per node,
and the "PDR versus RSSI" profile, are useful ways of categorizing the degree
of realism of the testbeds;
iii ) the "PDR versus RSSI" profile of a testbed provides valuable information
about the link qualities and how realistic the testbed is and can be useful for
building better link quality estimators.
The datasets obtained in this initial study were also used for simulating the
protocols developed in the dissertation.
After the exploratory study in Chapter 2, we set out three protocols that tackle
problems at different layers of the protocol stack. They are discussed in
Chapters 3, 4 and 5.
The first protocol (FBR-TSCH) is a means of dealing with event-triggered
applications, where sensors generate non-deterministic signals that have to be forwarded
to the sink. This makes it possible to cope with highly dynamic networks and
environments with high levels of external interference. In this proposal, there is no
online learning algorithm implemented and the nodes rely on rule-based decisions.
The sensors exploit both space and frequency diversity to forward the packets
reliably. FBR-TSCH has been compared with the default RPL protocol with MRHOF
in simulations and with other state-of-the-art solutions in a real deployment. The
implementation did not require a large amount of processing power and was based
on legacy TelosB nodes. It was concluded from this work that (more details in
Section 3.6):
i ) FBR-TSCH outperforms single-path RPL with MRHOF and many other
state-of-the-art solutions, and the main drawback of FBR-TSCH is its high energy
consumption, which is mainly caused by the duty-cycle and the guard times
within the time slots;
ii ) flooding-based solutions, like FBR-TSCH, may outperform sophisticated
Glossy-based protocols in terms of reliability and delay, which shows that
sub-millisecond synchronization is not necessary for critical applications;
iii ) in general, constructive interference is a technique that can enable protocols
to have high reliability and low delay, but these types of protocols have some
disadvantages, such as the synchronization requirement that cannot be met
in real industrial scenarios.
The second protocol (MABO-TSCH) creates a distributed blacklist to improve
the performance of multi-hop wireless networks that need to cope with external
interference and multi-path fading. In MABO-TSCH, the hopping sequence is
locally built with information exchanged between each pair of communicating nodes,
and the hopping pattern that must be used in each link is optimally chosen so that,
regardless of the neighbors' blacklists, two interfering links never use the same
frequency. In this solution, a greedy algorithm for online learning that can optimize
the blacklist was employed. The algorithm was simple enough to be implemented
and tested in legacy TelosB nodes. From this approach we concluded that (more
details in Section 4.5):
i ) for precise link quality estimation and correct channel blacklisting, the
nodes must differentiate between the case where no packet is sent and the case
where a packet is lost; hence different algorithms need to be used depending on
whether the application is deterministic or non-deterministic;
ii ) the proposed solution MABO-TSCH outperforms an ideal centralized solution
and gets results close to the optimal solution, demonstrating the effectiveness
of online learning;
iii ) depending on the online learning technique, a long time for convergence is
necessary, and the parameters used, such as ε for greedy selection, may change
over time and need to be re-tuned.
Finally, the last proposal (TAMU-RPL) uses Thompson sampling to proactively
estimate the link quality to multiple neighbors and adapt the routing
tree to network dynamics. TAMU-RPL has a dynamic link quality estimation
algorithm that keeps track of the link quality to a larger number of neighbors and can
make RPL react faster. We considered both unicast packets and information
from broadcast packets that are overheard. The link quality estimation is done
per channel, which enables further improvement of the routing path whenever
frequency hopping is used and a node detects that a particular channel is performing
poorly. In this solution, a more sophisticated online learning algorithm that
required more processing power was employed. The evaluations were performed
in simulations and using state-of-the-art OpenMote nodes. From this proposal we
concluded that (more details in Section 5.6):
i ) when TAMU-RPL performs link quality estimation per channel, it can
opportunistically avoid neighbors with bad quality in specific channels, which allows
TAMU-RPL to select a better parent when Wi-Fi nodes are close to a relay sensor;
ii ) the proposed solution TAMU-RPL outperforms the default RPL
implementation with MRHOF and was able to achieve performance close to the one
obtained by Dijkstra-based routing;
iii ) limitations in processing power and memory did not allow us to implement
TAMU-RPL in legacy TelosB motes. It was clear that state-of-the-art hardware
is required to support more sophisticated online learning algorithms.
6.1 Future work and research directions
In this dissertation we explored diversity in wireless communication of low-power
devices, focusing on the use of online learning techniques to adapt networking
protocols to dynamic conditions. Both solutions that employed online learning
(MABO-TSCH and TAMU-RPL) solved the optimization problem using
reinforcement learning; in the first case employing a greedy MAB algorithm, in the second
case employing Thompson sampling. The use of Machine Learning - a category of
algorithms that includes reinforcement learning - has been recently explored to solve
networking problems that involve learning from a large amount of data, as well as
to optimize existing protocols employing continuous learning algorithms that find
the right balance between exploration and exploitation in dynamic environments.
Machine learning-based optimization is becoming even more important with
the new generation of 5G wireless networks [132] that will combine different
networking protocols to solve multiple use cases, e.g., machine-type communication,
mobile broadband, vehicle-to-vehicle communication, etc. As networks become
more complex and involve a larger number of devices, it becomes impractical to
have a centralized intelligence that adjusts the parameters of different protocols.
The intelligence has to move to the devices, mainly through algorithms that learn
from previous experience and self-adapt to new conditions. In this sense, the work
developed in this dissertation can be applied to other types of networking protocols
and can be extended to optimize other aspects of the communication.
Below we discuss, for each part of this dissertation, the open questions that
remain and the future work that can be developed.
New radio interfaces have emerged and are considered for IoT applications.
Even though IEEE 802.15.4 is still the main standard for low-power short-range
communication, new modulations and frequency bands are explored for Low-Power
Wide Area Networks (LPWANs). Short-range technologies, e.g., IEEE 802.15.4,
ZigBee, Bluetooth, etc., are being integrated with long-range technologies, e.g.,
IEEE 802.15.4g, SigFox, LoRa, LTE Cat-M1, NB-IoT, etc. We have not yet seen
an analysis similar to that presented in Section 2.3 which considers new radio
technologies and different frequency bands, e.g., 868 MHz in Europe and 900 MHz in
the US. One future work is to extend the datasets provided in this dissertation to
include other (new) types of sensor nodes, such as SigFox, LoRa, or NB-IoT.
Recently the IoT-Lab testbeds started to support new types of motes¹. One could use
the same Mercator project and run experiments in the IEEE 802.15.4g, SigFox and
LoRa nodes on IoT-Lab testbeds that support them. The resulting datasets would
include network statistics on 2.4 GHz and 868 MHz. A comparison between real
datasets in both frequencies and an analysis of possible different strategies to
exploit diversity on them could be another contribution to be developed in the future.
Some work has already demonstrated that frequency diversity is also important in
new IEEE 802.15.4g networks with OFDM modulation [133]. More investigation on
the exploitation of other types of diversity in these new radios, with new frequency
bands and new modulations, is still open for research.
¹ More information on the custom types of nodes supported is available at
https://www.iot-lab.info/hardware/custom/.
It was clear from the process utilized to gather connectivity statistics that it
takes a long time to extract dense datasets from LLNs and that it is very
difficult to obtain fine-grained statistics. Usually, the 100-packet bursts used
to measure the PDR were separated by tens of seconds if all 16 channels were
evaluated. Different statistical techniques could be utilized to derive more realistic
link quality models to be used in simulations that are still based on datasets but do
not require long experiments and have higher granularity. Generative Adversarial
Networks (GANs) have been successfully used to automatically generate images and
other synthetic media that are indistinguishable from real ones. The use of such
a technique for generating link statistics for simulations may be a promising way
of increasing the realism and the number of connectivity datasets. Such AI-driven
approaches could quickly generate datasets and extrapolate to new environments
with minimum effort.
The work presented in Chapter 3 focused on the application proposed for the
first edition of the EWSN Dependability Competition. In the subsequent editions of
the competition, there was a large dominance of proposals that exploited
constructive interference (CI) and/or the capture effect. CI is a very effective tool for
increasing the chances of packet reception in networks with high levels of interference.
However, CI-based protocols are not practical for all scenarios. They require μs-level
synchronization, which is hard to obtain in environments with different
hardware and large variations of temperature, as explained in Section 3.5. Even if the
capture effect is used for rejecting interfering signals, very tight conditions on the
synchronization mismatch and power imbalance are required, which are hard
to achieve in industrial scenarios. One last important drawback of CI-based
protocols is that the same packet has to be broadcast by all nodes, which makes per-link
security impossible.
We see that an improvement to FBR-TSCH is the combination of TSCH-based
time slots with sequences of CI-based floods. The TSCH protocol can easily adapt to
different conditions and can be used for control messages and non-critical
applications. Whenever an alarm is detected in the network and critical messages need to
be forwarded to the sink, short periods of broadcast transmissions based on CI can
be initiated by all nodes. The use of TSCH as a special data plane with CI-based
protocols for critical transmissions allows the network to leverage security features
from the 6TiSCH standard, and at the same time obtain the best trade-off between
reliability, power consumption, and delay provided by CI-based transmissions. Since
the 2018 edition of the EWSN Dependability Competition, the target application
has changed to multipoint-to-multipoint transmissions. In this scenario,
shortest-path routing schemes may outperform flooding-based approaches, especially in
energy consumption, which means that FBR-TSCH could have an advantage in the
criteria in which it lost to CI-based protocols. One future work could be to adapt
the FBR-TSCH implementation for this new scenario.
The work presented in Chapter 4 was, to the best of our knowledge, the first
proposal that created a distributed blacklisting solution based on an online learning
algorithm. In this work, we employed an ε-greedy strategy and were able to obtain
great improvement with a single fixed value of ε. However, it is clear that dynamically
adjusting ε would further improve the results. We see that an improved version of
the online learning that could self-tune all required parameters is an open question
that can be explored in future work.
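To make the idea of a dynamically adjusted ε concrete, the following Python sketch shows a generic ε-greedy channel selector whose exploration rate decays over time. It is not the MABO-TSCH algorithm of Chapter 4: the class name, the decay schedule ε_t = ε_0 / (1 + decay · t), and the optimistic PDR of 1.0 for untried channels are assumptions chosen for the example. Setting `decay=0` recovers a fixed ε.

```python
import random


class EpsilonGreedyChannels:
    """Generic epsilon-greedy channel selection with a decaying epsilon."""

    def __init__(self, channels, epsilon0=0.1, decay=0.0, seed=None):
        self.channels = list(channels)
        self.epsilon0 = epsilon0
        self.decay = decay            # 0.0 keeps epsilon fixed at epsilon0
        self.rng = random.Random(seed)
        self.tx = {c: 0 for c in self.channels}   # transmissions per channel
        self.ack = {c: 0 for c in self.channels}  # acknowledged transmissions
        self.rounds = 0

    def epsilon(self):
        """Current exploration rate: epsilon0 / (1 + decay * rounds)."""
        return self.epsilon0 / (1.0 + self.decay * self.rounds)

    def pdr(self, channel):
        """Empirical PDR; untried channels are treated optimistically as 1.0."""
        if self.tx[channel] == 0:
            return 1.0
        return self.ack[channel] / self.tx[channel]

    def pick(self):
        """Explore a random channel with probability epsilon, else exploit."""
        self.rounds += 1
        if self.rng.random() < self.epsilon():
            return self.rng.choice(self.channels)
        return max(self.channels, key=self.pdr)

    def update(self, channel, acked):
        """Record the outcome of one transmission on the given channel."""
        self.tx[channel] += 1
        self.ack[channel] += int(acked)
```

A decaying schedule explores more while little is known about the channels and less once the estimates have settled; in a non-stationary environment the decay would have to be bounded or reset, which is part of the self-tuning question raised above.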
Our work built on top of the state-of-the-art collection protocol MCC [69], which
creates optimal schedules with simultaneous transmissions. While creating optimal
schedules optimizes the overall network throughput, since it keeps the sink node
busy all the time receiving packets, it restricts the channel offset assignment since
simultaneous transmissions that are close to each other have to use orthogonal
channels. Optimal schedules reduce the number of channel offsets that can be used
in the frequency hopping algorithm. When external interference increases, nodes
may run out of good channels and may need to use channels in bad conditions,
which will lead to packet losses and will stall the collection pipeline. A future study
could investigate a better trade-off between using optimal
schedules that reduce the performance of blacklisting algorithms, and near-optimal
schedules that allocate more channel offsets to reserved time slots and increase the
possibility of designing the best distributed blacklist for each pair of nodes. In our
implementation, the routing and scheduling were performed statically using the MCC
protocol. This required that network statistics, mainly network connectivity, be
collected at a central node from which the non-interfering schedule is built. This
solution works optimally as long as nodes do not join or exit the network. One
challenge that could be tackled as future work is the integration of MABO-TSCH
with a distributed scheduling algorithm such as DeTAS [134].
The work presented in Chapter 5 was, to the best of our knowledge, the first
implementation of online learning to optimize the performance of RPL considering
a multi-channel protocol and leveraging both unicast and broadcast packets to
enhance the link quality estimation algorithm. Even though in simulated experiments
the performance improvement was up to 100%, in real experiments it was reduced
to approximately 33%, which shows that it is even harder to keep track of the
dynamics of real links that suffer from external interference. This fact, together with
the limitations faced with legacy motes, shows that it is challenging to find a
good balance between sophisticated online learning algorithms such as TAMU-RPL
and limitations due to hardware and practical scenarios.
From the experimental results presented both in Section 4.4 and in Section 5.5
we can conclude that online learning imposes many constraints that make the
results in practice far from the theoretical or simulated ones. One issue commonly
found throughout the development of this dissertation is that in many cases the
information required to perform a distributed optimization cannot be obtained by
the node that runs the algorithm. This is critical for the link quality estimation,
which is a problem faced in both the MABO-TSCH and TAMU-RPL proposals. Either
a feedback channel needs to be created to convey the information obtained at a
different node, as in MABO-TSCH when one wants to optimize event-triggered
applications and the receiver node only knows that a packet was lost if the transmitter
informs it, or some assumptions that simplify the problem need to be made,
such as link symmetry, which is assumed if the RSSI measurement is done on the ACK
response instead of the data transmission. All these issues limit the performance of
distributed algorithms. One possibility for future work when optimizing network
protocols at different layers is to perform a joint optimization in a centralized manner.
Even though centralized decision-making is not optimal due to the delay in collecting
data and disseminating the optimization results, we believe that when
optimizations at different levels are combined, the results may be promising. One
direction for future work is to jointly optimize the blacklist for
FHSS and multi-channel RPL in a centralized manner, ideally
also considering the schedule of the network. A comparison of distributed versus
centralized approaches would draw a clear picture of which approach is better for
online learning in resource-constrained networks.
Reference List
[1] J. Klaue, A. Corona, M. Kubisch, J. Garcia-Jimenez, and A. Escobar,
"RedFixHop," in Proceedings of the 2016 International Conference on Embedded
Wireless Systems and Networks, EWSN '16, (USA), pp. 289-290, 2016.
[2] D. Yuan and M. Hollick, "Sparkle: Energy efficient, reliable, ultra-low latency
communication in wireless control networks," in Proceedings of the 2016
International Conference on Embedded Wireless Systems and Networks, EWSN
'16, (USA), pp. 295-296, 2016.
[3] P. H. Gomes, T. Watteyne, P. Gosh, and B. Krishnamachari, "Reliability
through timeslotted channel hopping and flooding-based routing," in
Proceedings of the 2016 International Conference on Embedded Wireless Systems
and Networks, EWSN '16, (USA), pp. 297-298, 2016.
[4] A. Maskooki, V. Toldov, L. Clavier, V. Loscrì, and N. Mitton, "Channel
exploration/exploitation based on a thompson sampling approach in a radio
cognitive environment," in Proceedings of the 2016 International Conference
on Embedded Wireless Systems and Networks, EWSN '16, (USA), pp. 285-286,
2016.
[5] P. Sommer and Y.-A. Pignolet, "Dependable network flooding using glossy
with channel-hopping," in Proceedings of the 2016 International Conference
on Embedded Wireless Systems and Networks, EWSN '16, (USA), pp. 303-303,
2016.
[6] B. Al Nahas and O. Landsiedel, "Towards low-latency, low-power wireless
networking under interference," in Proceedings of the 2016 International
Conference on Embedded Wireless Systems and Networks, EWSN '16, (USA),
pp. 287-288, 2016.
[7] A. King, J. Hadley, and U. Roedig, "ContikiMAC with differentiating clear
channel assessment," in Proceedings of the 2016 International Conference on
Embedded Wireless Systems and Networks, EWSN '16, (USA), pp. 301-302,
2016.
183
[8] \Atmegaxx8pa-15 RC oscillator frequency drift compensation."
http://ww1.microchip.com/downloads/en/DeviceDoc/article_ac9_
atmegaxx8pa-15-rc-oscillator.pdf. Accessed: 2019-02-01.
[9] T. Arampatzis, J. Lygeros, and S. Manesis, \A survey of applications of
wireless sensors and wireless sensor networks," in Intelligent Control, 2005.
Proceedings of the 2005 IEEE International Symposium on, Mediterrean Con-
ference on Control and Automation, pp. 719{724, IEEE, 2005.
[10] T. Ojha, S. Misra, and N. S. Raghuwanshi, \Wireless sensor networks for agri-
culture: The state-of-the-art in practice and future challenges," Computers
and Electronics in Agriculture, vol. 118, pp. 66{84, 2015.
[11] A. Mainwaring, D. Culler, J. Polastre, R. Szewczyk, and J. Anderson, \Wire-
less sensor networks for habitat monitoring," in Proceedings of the 1st ACM
international workshop on Wireless sensor networks and applications, pp. 88{
97, ACM, 2002.
[12] I. F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci, \Wireless
sensor networks: a survey," Computer networks, vol. 38, no. 4, pp. 393{422,
2002.
[13] J. N. Al-Karaki and A. E. Kamal, \Routing techniques in wireless sensor
networks: A survey," IEEE wireless communications, vol. 11, no. 6, pp. 6{28,
2004.
[14] B. Krishnamachari, D. Estrin, and S. Wicker, \The impact of data aggrega-
tion in wireless sensor networks," in Distributed Computing Systems Work-
shops, 2002. Proceedings. 22nd International Conference on, pp. 575{578,
IEEE, 2002.
[15] S. Bandyopadhyay and E. J. Coyle, \An energy ecient hierarchical clustering
algorithm for wireless sensor networks," in INFOCOM 2003. Twenty-Second
Annual Joint Conference of the IEEE Computer and Communications. IEEE
Societies, vol. 3, pp. 1713{1723, IEEE, 2003.
[16] J. Zheng and A. Jamalipour, Wireless sensor networks: a networking per-
spective. John Wiley & Sons, 2009.
[17] G. Montenegro, N. Kushalnagar, J. Hui, and D. Culler, \Transmission of
IPv6 packets over IEEE 802.15.4 networks." Internet Requests for Comments,
September 2007.
[18] M. A. Mahmood, W. K. Seah, and I. Welch, \Reliability in wireless sen-
sor networks: A survey and challenges ahead," Computer Networks, vol. 79,
pp. 166{187, 2015.
184
[19] K. Ashton, \That `Internet of Things' thing," RFID journal, vol. 22, no. 7,
pp. 97{114, 2009.
[20] A. Al-Fuqaha, M. Guizani, M. Mohammadi, M. Aledhari, and M. Ayyash,
\Internet of things: A survey on enabling technologies, protocols, and applica-
tions," IEEE Communications Surveys & Tutorials, vol. 17, no. 4, pp. 2347{
2376, 2015.
[21] Forbes, \2018 Roundup Of Internet Of Things Forecasts And Market Esti-
mates," 2018.
[22] Accenture, \Driving Unconventional Growth through the Industrial Internet
of Things," 2014.
[23] J. Gubbi, R. Buyya, S. Marusic, and M. Palaniswami, \Internet of things
(IoT): A vision, architectural elements, and future directions," Future gener-
ation computer systems, vol. 29, no. 7, pp. 1645{1660, 2013.
[24] L. Atzori, A. Iera, and G. Morabito, \The internet of things: A survey,"
Computer networks, vol. 54, no. 15, pp. 2787{2805, 2010.
[25] R. Khan, S. Ullah, R. Zaheer, and S. Khan, \Future internet: the internet
of things architecture, possible applications and key challenges," in Frontiers
of Information Technology (FIT), 2012 10th International Conference on,
pp. 257{260, IEEE, 2012.
[26] Z. Sheng, S. Yang, Y. Yu, A. Vasilakos, J. Mccann, and K. Leung, \A survey
on the IETF protocol suite for the internet of things: Standards, challenges,
and opportunities," IEEE Wireless Communications, vol. 20, no. 6, pp. 91{
98, 2013.
[27] L. Mainetti, L. Patrono, and A. Vilei, \Evolution of wireless sensor networks
towards the internet of things: A survey," in Software, Telecommunications
and Computer Networks (SoftCOM), 2011 19th International Conference on,
pp. 1{6, IEEE, 2011.
[28] L. D. Xu, W. He, and S. Li, \Internet of things in industries: A survey,"
IEEE Transactions on industrial informatics, vol. 10, no. 4, pp. 2233{2243,
2014.
[29] Y. Lu, \Industry 4.0: A survey on technologies, applications and open re-
search issues," Journal of Industrial Information Integration, vol. 6, pp. 1{10,
2017.
[30] K. Pister and L. Doherty, \TSMP: Time synchronized mesh protocol," in
International Symposium Distributed Sensor Networks, pp. 391{398, 2008.
185
[31] H. Hayashi, T. Hasegawa, and K. Demachi, \Wireless technology for process
automation," in ICCAS-SICE, 2009, pp. 4591{4594, IEEE, 2009.
[32] A. N. Kim, F. Hekland, S. Petersen, and P. Doyle, \When HART goes
wireless: Understanding and implementing the WirelessHART standard," in
Emerging Technologies and Factory Automation, 2008. ETFA 2008. IEEE
International Conference on, pp. 899{907, IEEE, 2008.
[33] S. Petersen and S. Carlsen, \WirelessHART versus ISA100.11a: The format
war hits the factory
oor," IEEE Industrial Electronics Magazine, vol. 5,
no. 4, pp. 23{34, 2011.
[34] T. Watteyne, J. Weiss, L. Doherty, and J. Simon, \Industrial IEEE 802.15.4e
networks: Performance and trade-os," in 2015 IEEE International Con-
ference on Communications (ICC), pp. 604{609, Institute of Electrical &
Electronics Engineers (IEEE), jun 2015.
[35] M. R. Palattella, N. Accettura, L. A. Grieco, G. Boggia, M. Dohler, and
T. Engel, \On optimal scheduling in duty-cycled industrial IoT applications
using IEEE802.15.4e TSCH," IEEE Sensors J., vol. 13, pp. 3655{3666, oct
2013.
[36] A. Yang, A. Sundararajan, C. B. Schindler, and K. S. Pister, \Analysis of
low latency TSCH networks for physical event detection," in Wireless Com-
munications and Networking Conference Workshops (WCNCW), 2018 IEEE,
pp. 167{172, IEEE, 2018.
[37] \WirelessHART Specication 1.1," 2008.
[38] \100.11a-2009: Wireless systems for industrial automation: Process control
and related applications," 2009.
[39] J. Zhao and R. Govindan, \Understanding packet delivery performance in
dense wireless sensor networks," in Proceedings of the 1st international con-
ference on Embedded networked sensor systems, pp. 1{13, ACM, 2003.
[40] K. Srinivasan, P. Dutta, A. Tavakoli, and P. Levis, \An empirical study of
low-power wireless," ACM Transactions on Sensor Networks (TOSN), vol. 6,
no. 2, p. 16, 2010.
[41] G. Anastasi, M. Conti, and M. D. Francesco, \A comprehensive analysis of
the MAC unreliability problem in IEEE 802.15.4 Wireless Sensor Networks,"
IEEE Transactions on Industrial Informatics, vol. 7, pp. 52{65, feb 2011.
[42] M. Kohvakka, M. Kuorilehto, M. Hannikainen, and T. D. Hamalainen, \Per-
formance analysis of IEEE 802.15.4 and ZigBee for large-scale wireless sensor
186
network applications," in Proceedings of the 3rd ACM international workshop
on Performance evaluation of wireless ad hoc, sensor and ubiquitous networks
- PE-WASUN ’06, Association for Computing Machinery (ACM), 2006.
[43] B. Raman, K. Chebrolu, S. Bijwe, and V. Gabale, \PIP: A Connection-
Oriented, Multi-Hop, Multi-Channel TDMA-based MAC for High Through-
put Bulk Transfer," in Proceedings of the 8th ACM Conference on Embedded
Networked Sensor Systems - SenSys ’10, (New York, New York, USA), Asso-
ciation for Computing Machinery (ACM), 2010.
[44] M. Doddavenkatappa and M. Choon, \P
3
: A practical packet pipeline using
synchronous transmissions for wireless sensor networks," in IPSN-14 Pro-
ceedings of the 13th International Symposium on Information Processing in
Sensor Networks, Institute of Electrical & Electronics Engineers (IEEE), apr
2014.
[45] G. Zhou, C. Huang, T. Yan, T. He, J. A. Stankovic, and T. F. Abdelzaher,
\MMSN: Multi-frequency media access control for wireless sensor networks,"
in Proceedings IEEE INFOCOM 2006. 25TH IEEE International Conference
on Computer Communications, Institute of Electrical & Electronics Engineers
(IEEE), 2006.
[46] T. Watteyne, A. Mehta, and K. Pister, \Reliability through frequency diver-
sity," in Proceedings of the 6th ACM symposium on Performance evaluation
of wireless ad hoc, sensor, and ubiquitous networks - PE-WASUN ’09, (New
York, New York, USA), Association for Computing Machinery (ACM), 2009.
[47] Y. Kim, H. Shin, and H. Cha, \Y-MAC: An energy-ecient multi-channel
MAC protocol for dense wireless sensor networks," in 2008 International Con-
ference on Information Processing in Sensor Networks (ipsn 2008), Institute
of Electrical & Electronics Engineers (IEEE), apr 2008.
[48] O. D. Incel, \A survey on multi-channel communication in wireless sensor
networks," Computer Networks, vol. 55, pp. 3081{3099, sep 2011.
[49] R. Soua and P. Minet, \Multichannel assignment protocols in wireless sen-
sor networks: A comprehensive survey," Pervasive and Mobile Computing,
vol. 16, pp. 2{21, jan 2015.
[50] \802.15.4-2015: IEEE Standard for Local and Metropolitan Area Networks{
Part 15.4: Low-Rate Wireless Personal Area Networks (LR-WPANs) Amend-
ment 1: MAC sublayer," October 2015.
[51] T. Watteyne, S. Lanzisera, A. Mehta, and K. S. Pister, \Mitigating multipath
fading through channel hopping in wireless sensor networks," in Communica-
tions (ICC), 2010 IEEE International Conference on, pp. 1{5, IEEE, 2010.
187
[52] N. Accettura, M. R. Palattella, G. Boggia, L. A. Grieco, and M. Dohler,
\DeTAS: A decentralized trac aware scheduling technique enabling iot-
compliant multi-hop low-power and lossy networks," in Second IEEE WoW-
MoM Workshop on the Internet of Things: Smart Objects and Services, IoT-
SoS, vol. 26, 2013.
[53] S. Duquennoy, B. Al Nahas, O. Landsiedel, and T. Watteyne, \Orchestra:
Robust mesh networks through autonomously scheduled tsch," in Proceedings
of the 13th ACM conference on embedded networked sensor systems, pp. 337{
350, ACM, 2015.
[54] X. Vilajosana, K. Pister, and T. Watteyne, \Minimal IPv6 over the TSCH
mode of IEEE 802.15.4e (6TiSCH) conguration," 2017.
[55] T. Chang, T. Watteyne, Q. Wang, and X. Vilajosana, \LLSF: Low latency
scheduling function for 6TiSCH networks," in Distributed Computing in Sen-
sor Systems (DCOSS), 2016 International Conference on, pp. 93{95, IEEE,
2016.
[56] R. Teles Hermeto, A. Gallais, and F. Theoleyre, \Scheduling for IEEE
802.15.4-TSCH and slow channel hopping MAC in low power industrial wire-
less networks," Computer Communications, vol. 114, no. C, pp. 84{105, 2017.
[57] R. T. Hermeto, A. Gallais, and F. Theoleyre, \Scheduling for IEEE 802.15.4-
TSCH and slow channel hopping MAC in low power industrial wireless net-
works: A survey," Computer Communications, vol. 114, pp. 84{105, 2017.
[58] P. H. Gomes, Y. Chen, T. Watteyne, and B. Krishnamachari, \Insights into
frequency diversity from measurements on an indoor low power wireless net-
work testbed," in 2016 IEEE Globecom Workshops (GC Wkshps), pp. 1{6,
dec 2016.
[59] K. Brun-Laguna, P. Minet, T. Watteyne, and P. H. Gomes, \Moving beyond
testbeds? lessons (we) learned about connectivity," IEEE Pervasive Comput-
ing, vol. 17, no. 4, pp. 15{27, 2018.
[60] T. Chang, T. Watteyne, X. Vilajosana, and P. H. Gomes, \Constructive
interference in 802.15.4: A tutorial," IEEE Communications Surveys and
Tutorials, 2018.
[61] J. Vermorel and M. Mohri, \Multi-armed bandit algorithms and empirical
evaluation," in Machine Learning: ECML 2005, pp. 437{448, Springer Sci-
ence + Business Media, 2005.
188
[62] P. H. Gomes, T. Watteyne, and B. Krishnamachari, \MABO-TSCH: Mul-
tihop and blacklist-based optimized time synchronized channel hopping,"
Transactions on Emerging Telecommunications Technologies, vol. 29, no. 7,
2018.
[63] D. S. J. D. Couto, D. Aguayo, J. Bicket, and R. Morris, \A high-throughput
path metric for multi-hop wireless routing," Wireless Netw, vol. 11, pp. 419{
434, jul 2005.
[64] P. H. Gomes and B. Krishnamachari, \TAMU-RPL: Thompson sampling-
based multi-channel RPL," Submitted to the Transactions on Emerging
Telecommunications Technologies, 2019.
[65] A. F. Molisch, Wireless Communications, vol. 34. John Wiley & Sons, 2012.
[66] A. Mpitziopoulos, D. Gavalas, C. Konstantopoulos, and G. Pantziou, \A
survey on jamming attacks and countermeasures in WSNs," IEEE Commu-
nications Surveys & Tutorials, vol. 11, no. 4, 2009.
[67] A. Al-Dulaimi, S. Al-Rubaye, Q. Ni, and E. Sousa, \5G communications race:
Pursuit of more capacity triggers lte in unlicensed band," IEEE vehicular
technology magazine, vol. 10, no. 1, pp. 43{51, 2015.
[68] C. Tala, L. Ahumada, D. Dujovne, S.-U. Rehman, T. Turletti, and W. Dab-
bous, \Guidelines for the accurate design of empirical studies in wireless net-
works," in International Conference on Testbeds and Research Infrastructures,
pp. 208{222, Springer, 2011.
[69] Y. Chen, P. H. Gomes, and B. Krishnamachari, \Multi-channel data collection
for throughput maximization in wireless sensor networks," in 2014 IEEE 11th
International Conference on Mobile Ad Hoc and Sensor Systems, Institute of
Electrical & Electronics Engineers (IEEE), oct 2014.
[70] A. Gluhak, S. Krco, M. Nati, D. Psterer, N. Mitton, and T. Razand-
ralambo, \A survey on facilities for experimental internet of things research,"
IEEE Communications Magazine, vol. 49, no. 11, pp. 58{67, 2011.
[71] A.-S. Tonneau, N. Mitton, and J. Vandaele, \How to choose an experimen-
tation platform for wireless sensor networks? a survey on static and mobile
wireless sensor network experimentation facilities," Ad Hoc Networks, vol. 30,
pp. 115{127, 2015.
[72] G. Z. Papadopoulos, A. Gallais, G. Schreiner, and T. No el, \Importance of
repeatable setups for reproducible experimental results in IoT," in Proceedings
of the 13th ACM Symposium on Performance Evaluation of Wireless Ad Hoc,
Sensor, & Ubiquitous Networks, pp. 51{59, ACM, 2016.
189
[73] T. Watteyne, C. Adjih, and X. Vilajosana, \Lessons learned from large-scale
dense IEEE 802. 15.4 connectivity traces," in Automation Science and Engi-
neering (CASE), 2015 IEEE International Conference on, pp. 145{150, IEEE,
2015.
[74] W. Dong, Y. Liu, Y. He, T. Zhu, and C. Chen, \Measurement and analysis on
the packet delivery performance in a large-scale sensor network," IEEE/ACM
Transactions on Networking (TON), vol. 22, no. 6, pp. 1952{1963, 2014.
[75] L. Doherty, W. Lindsay, and J. Simon, \Channel-specic wireless sensor net-
work path data," in Computer Communications and Networks, 2007. ICCCN
2007. Proceedings of 16th International Conference on, pp. 89{94, IEEE,
2007.
[76] K. Brun-Laguna, A. L. Diedrichs, D. Dujovne, R. L eone, X. Vilajosana, and
T. Watteyne, \(Not so) intuitive results from a smart agriculture low-power
wireless mesh deployment," in Proceedings of the Eleventh ACM Workshop
on Challenged Networks, pp. 25{30, ACM, 2016.
[77] J. Yeo, D. Kotz, and T. Henderson, \CRAWDAD: a community resource for
archiving wireless data at dartmouth," ACM SIGCOMM Computer Commu-
nication Review, vol. 36, no. 2, pp. 21{22, 2006.
[78] C. A. Boano, S. Duquennoy, A. F orster, O. Gnawali, R. Jacob, H.-S. Kim,
O. Landsiedel, R. Marevici, L. Mottola, G. P. Picco, and M. Z. Xavier Vila-
josana, Thomas Watteyne, \IoTBench: Towards a benchmark for low-power
wireless networking," in 1st Workshop on Benchmarking Cyber-Physical Net-
works and Systems (CPSBench 2018), 2018.
[79] C. Adjih, E. Baccelli, E. Fleury, G. Harter, N. Mitton, T. Noel, R. Pissard-
Gibollet, F. Saint-Marcel, G. Schreiner, J. Vandaele, et al., \FIT IoT-LAB: A
large scale open experimental IoT testbed," in Internet of Things (WF-IoT),
2015 IEEE 2nd World Forum on, pp. 459{464, IEEE, 2015.
[80] T. Watteyne, X. Vilajosana, B. Kerkez, F. Chraim, K. Weekly, Q. Wang,
S. Glaser, and K. Pister, \OpenWSN: a standards-based low-power wireless
development environment," Transactions on Emerging Telecommunications
Technologies, vol. 23, no. 5, pp. 480{493, 2012.
[81] A. Dunkels, B. Gronvall, and T. Voigt, \Contiki - a lightweight and
exible
operating system for tiny networked sensors," in Local Computer Networks,
2004. 29th Annual IEEE International Conference on, pp. 455{462, IEEE,
2004.
190
[82] \SmartMesh IP Application Notes." http://www.analog.com/media/
en/technical-documentation/application-notes/SmartMesh_IP_
Application_Notes.pdf. Accessed: 2019-02-01.
[83] K. Brun-Laguna, Deterministic Networking for the Industrial IoT. PhD the-
sis, Universit e Pierre et Marie Curie, 2018.
[84] C. B. Schindler, T. Watteyne, X. Vilajosana, and K. S. PisterJ, \Implemen-
tation and characterization of a multi-hop 6TiSCH network for experimental
feedback control of an inverted pendulum," in Modeling and Optimization in
Mobile, Ad Hoc, and Wireless Networks (WiOpt), 2017 15th International
Symposium on, pp. 1{8, IEEE, 2017.
[85] V. Gungor and G. Hancke, \Industrial wireless sensor networks: Challenges,
design principles, and technical approaches," IEEE Trans. Ind. Electron.,
vol. 56, pp. 4258{4265, oct 2009.
[86] O. Gnawali, R. Fonseca, K. Jamieson, D. Moss, and P. Levis, \Collection tree
protocol," in Proceedings of the 7th ACM Conference on Embedded Networked
Sensor Systems - SenSys ’09, Association for Computing Machinery (ACM),
2009.
[87] N. Accettura, L. A. Grieco, G. Boggia, and P. Camarda, \Performance anal-
ysis of the RPL routing protocol," in 2011 IEEE International Conference
on Mechatronics, Institute of Electrical & Electronics Engineers (IEEE), apr
2011.
[88] O. Gaddour and A. Koub^ aa, \RPL in a nutshell: A survey," Computer Net-
works, vol. 56, pp. 3163{3178, sep 2012.
[89] F. Ferrari, M. Zimmerling, L. Thiele, and O. Saukh, \Ecient network
ood-
ing and time synchronization with glossy," in Information Processing in Sen-
sor Networks (IPSN), 2011 10th International Conference on, pp. 73{84,
IEEE, 2011.
[90] M. Doddavenkatappa, M. C. Chan, and B. Leong, \Splash: Fast data dis-
semination with constructive interference in wireless sensor networks," in Pro-
ceedings of the 10th USENIX Conference on Networked Systems Design and
Implementation, nsdi'13, (Berkeley, CA, USA), pp. 269{282, USENIX Asso-
ciation, 2013.
[91] D. Moss, J. Hui, and K. Klues, \TinyOS TEP105 - Low Power Listening,"
tech. rep., UC Berkeley, 2007.
191
[92] T. Winter, P. Thubert, A. Brandt, J. Hui, R. Kelsey, P. Levis, K. Pister,
R. Struik, J. Vasseur, and R. Alexander, \RPL: IPv6 routing protocol for
low-power and lossy networks." Internet Requests for Comments, March 2012.
[93] O. Gnawali and P. Levis, \The minimum rank with hysteresis objective func-
tion." Internet Requests for Comments, September 2012.
[94] M. Z. Zamalloa and B. Krishnamachari, \An analysis of unreliability and
asymmetry in low-power wireless links," ACM Transactions on Sensor Net-
works (TOSN), vol. 3, no. 2, p. 7, 2007.
[95] C. A. Boano, T. Voigt, C. Noda, K. R omer, and M. Z u~ niga, \Jamlab: Aug-
menting sensornet testbeds with realistic and controlled interference genera-
tion," in Information Processing in Sensor Networks (IPSN), 2011 10th In-
ternational Conference on, pp. 175{186, IEEE, 2011.
[96] K. Srinivasan, M. A. Kazandjieva, S. Agarwal, and P. Levis, \The -factor
: Measuring Wireless Link Burstiness," in Proceedings of the 6th ACM con-
ference on Embedded network sensor systems - SenSys ’08, Association for
Computing Machinery (ACM), 2008.
[97] D. Sexton, M. Mahony, M. Lapinski, and J. Werb, \Radio channel quality in
industrial wireless sensor networks," in 2005 Sensors for Industry Conference,
pp. 88{94, 2005.
[98] \Bluetooth Core Specication Version 4.0," 2010.
[99] C. De Dominicis, P. Ferrari, A. Flammini, E. Sisinni, M. Bertocco, G. Giorgi,
C. Narduzzi, and F. Tramarin, \Investigating WirelessHART coexistence is-
sues through a specically designed simulator," in Instrumentation and Mea-
surement Technology Conference, 2009. I2MTC'09. IEEE, pp. 1085{1090,
IEEE, 2009.
[100] T. Lennvall, S. Svensson, and F. Hekland, \A comparison of WirelessHART
and ZigBee for industrial applications," in 2008 IEEE International Workshop
on Factory Communication Systems, pp. 85{88, IEEE, 2008.
[101] Y. Wan, Q. Wang, S. Duan, and X. Zhang, \RAFH: reliable aware frequency
hopping method for industrial wireless sensor networks," in Wireless Com-
munications, Networking and Mobile Computing, 2009. WiCom'09. 5th In-
ternational Conference on, pp. 1{4, IEEE, 2009.
[102] P. Du and G. Roussos, \Adaptive time slotted channel hopping for wireless
sensor networks," in 2012 4th Computer Science and Electronic Engineering
Conference (CEEC), Institute of Electrical & Electronics Engineers (IEEE),
sep 2012.
192
[103] R. Tavakoli, M. Nabi, T. Basten, and K. Goossens, \Enhanced time-slotted
channel hopping in WSNs using non-intrusive channel-quality estimation,"
in 2015 IEEE 12th International Conference on Mobile Ad Hoc and Sensor
Systems, pp. 217{225, Institute of Electrical & Electronics Engineers (IEEE),
oct 2015.
[104] C.-F. Shih, A. E. Xhafa, and J. Zhou, \Practical frequency hopping sequence
design for interference avoidance in 802.15.4e TSCH networks," in 2015 IEEE
International Conference on Communications (ICC), pp. 6494{6499, Insti-
tute of Electrical & Electronics Engineers (IEEE), jun 2015.
[105] N. Baccour, A. Koub^ aa, L. Mottola, M. A. Z u~ niga, H. Youssef, C. A. Boano,
and M. Alves, \Radio link quality estimation in wireless sensor networks: A
survey," ACM Transactions on Sensor Networks (TOSN), vol. 8, no. 4, p. 34,
2012.
[106] V. Kuleshov and D. Precup, \Algorithms for multi-armed bandit problems,"
CoRR, vol. abs/1402.6028, 2014.
[107] Y. Gai and B. Krishnamachari, \Distributed Stochastic Online Learning Poli-
cies for Opportunistic Spectrum Access," IEEE Transactions on Signal Pro-
cessing, vol. 62, no. 23, pp. 6184{6193, 2014.
[108] A. Anandkumar, N. Michael, and A. Tang, \Opportunistic Spectrum Access
with Multiple Users: Learning under Competition," in IEEE International
Conference on Computer Communications (INFOCOM), IEEE, 2010.
[109] K. Liu and Q. Zhao, \Distributed Learning in Multi-Armed Bandit With
Multiple Players," IEEE Transactions on Signal Processing, vol. 58, no. 11,
pp. 5667{5681, 2010.
[110] D. J. Welsh and M. B. Powell, \An Upper Bound for the Chromatic Number
of a Graph and its Application to Timetabling Problems," The Computer
Journal, vol. 10, no. 1, pp. 85{86, 1967.
[111] H. S. Kim, J. Ko, D. E. Culler, and J. Paek, \Challenging the IPv6 routing
protocol for low-power and lossy networks (RPL): A survey," IEEE Commu-
nications Surveys Tutorials, vol. 19, no. 4, pp. 2502{2525, 2017.
[112] S. K. Singh, M. P. Singh, and D. K. Singh, \Routing protocols in wireless
sensor networks { a survey," International Journal of Computer Science &
Engineering Survey (IJCSES), vol. 1, no. 2, pp. 63{83, 2010.
[113] W. Rehan, S. Fischer, M. Rehan, and M. H. Rehmani, \A comprehensive sur-
vey on multichannel routing in wireless sensor networks," Journal of Network
and Computer Applications, vol. 95, pp. 1{25, 2017.
193
[114] Y. Wu and M. Cardei, \Multi-channel and cognitive radio approaches for
wireless sensor networks," Computer Communications, vol. 94, pp. 30{45,
2016.
[115] N. Gulati and K. R. Dandekar, \Learning state selection for recongurable
antennas: A multi-armed bandit approach," IEEE Trans. Antennas Propa-
gat., vol. 62, no. 3, pp. 1027{1038, 2014.
[116] Y. Gai, B. Krishnamachari, and R. Jain, \Combinatorial network optimiza-
tion with unknown variables: Multi-armed bandits with linear rewards and
individual observations," IEEE/ACM Transactions on Networking, vol. 20,
pp. 1466{1478, oct 2012.
[117] K. Liu and Q. Zhao, \Adaptive shortest-path routing under unknown and
stochastically varying link states," in Modeling and Optimization in Mobile,
Ad Hoc and Wireless Networks (WiOpt), 2012 10th International Symposium
on, pp. 232{237, IEEE, 2012.
[118] Z. Zou, A. Proutiere, and M. Johansson, \Online shortest path routing: The
value of information," in 2014 American Control Conference, pp. 2142{2147,
Institute of Electrical and Electronics Engineers (IEEE), jun 2014.
[119] T. Clausen, U. Herberg, and M. Philipp, \A critical evaluation of the IPv6
routing protocol for low power and lossy networks (RPL)," in 2011 IEEE 7th
International Conference on Wireless and Mobile Computing, Networking and
Communications (WiMob), pp. 365{372, Oct 2011.
[120] E. Ancillotti, C. Vallati, R. Bruno, and E. Mingozzi, \A reinforcement
learning-based link quality estimation strategy for RPL and its impact on
topology management," Computer Communications, vol. 112, pp. 1{13, 2017.
[121] E. Ancillotti, R. Bruno, and M. Conti, \Reliable data delivery with the ietf
routing protocol for low-power and lossy networks," IEEE Transactions on
Industrial Informatics, vol. 10, pp. 1864{1877, Aug 2014.
[122] K.-H. Kim and K. G. Shin, \On accurate and asymmetry-aware measurement
of link quality in wireless mesh networks," IEEE/ACM Trans. Netw., vol. 17,
pp. 1172{1185, Aug 2009.
[123] R. T. Hermeto, A. Gallais, K. V. Laerhoven, and F. Theoleyre, \Passive link
quality estimation for accurate and stable parent selection in dense 6TiSCH
networks," in Proceedings of the 2018 International Conference on Embedded
Wireless Systems and Networks, EWSN '18, 2018.
[124] A. Gelman, J. B. Carlin, H. S. Stern, and D. B. Rubin, Bayesian data analysis,
vol. 2. Chapman & Hall/CRC Boca Raton, FL, USA, 2014.
194
[125] H. A. David and H. N. Nagaraja, Order statistics. Wiley Online Library,
1981.
[126] R. C. Cheng, \Generating beta variates with nonintegral shape parameters,"
Communications of the ACM, vol. 21, no. 4, pp. 317{322, 1978.
[127] A. Cerpa, J. L. Wong, L. Kuang, M. Potkonjak, and D. Estrin, \Statistical
model of lossy links in wireless sensor networks," in Information Processing
in Sensor Networks, 2005. IPSN 2005. Fourth International Symposium on,
pp. 81{88, IEEE, 2005.
[128] M. Senel, K. Chintalapudi, D. Lal, A. Keshavarzian, and E. J. Coyle, \A
Kalman lter based link quality estimation scheme for wireless sensor net-
works," in IEEE GLOBECOM 2007-2007 IEEE Global Telecommunications
Conference, pp. 875{880, Institute of Electrical & Electronics Engineers
(IEEE), nov 2007.
[129] F. Qin, Q. Zhang, W. Zhang, Y. Yang, J. Ding, and X. Dai, \Link qual-
ity estimation in industrial temporal fading channel with augmented kalman
lter," IEEE Transactions on Industrial Informatics, 2018.
[130] R. E. Kalman, \A new approach to linear ltering and prediction problems,"
Journal of basic Engineering, vol. 82, no. 1, pp. 35{45, 1960.
[131] H. W. Sorenson, Kalman ltering: theory and application. IEEE, 1985.
[132] R. Li, Z. Zhao, X. Zhou, G. Ding, Y. Chen, Z. Wang, and H. Zhang, \Intelli-
gent 5G: When cellular networks meet articial intelligence," IEEE Wireless
communications, vol. 24, no. 5, pp. 175{183, 2017.
[133] J. Mu~ noz, P. Muhlethaler, X. Vilajosana, and T. Watteyne, \Why channel
hopping makes sense, even with ieee802.15.4 OFDM at 2.4 GHz," in 2018
Global Internet of Things Summit (GIoTS), pp. 1{7, IEEE, 2018.
[134] N. Accettura, E. Vogli, M. R. Palattella, L. A. Grieco, G. Boggia, and
M. Dohler, \Decentralized trac aware scheduling in 6TiSCH networks: De-
sign and experimental evaluation," IEEE Internet of Things Journal, vol. 2,
no. 6, pp. 455{470, 2015.
195
Asset Metadata
Creator: da Silva, Pedro Henrique Gomes (author)
Core Title: Exploiting diversity with online learning in the Internet of things
School: Viterbi School of Engineering
Degree: Doctor of Philosophy
Degree Program: Electrical Engineering
Publication Date: 09/23/2019
Defense Date: 07/31/2019
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: frequency hopping, Internet of Things, multi-path routing, OAI-PMH Harvest, online learning
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Krishnamachari, Bhaskar (committee chair), Govindan, Ramesh (committee member), Silvester, John Andrew (committee member)
Creator Email: pdasilva@usc.edu, pedrohenriquegomes@gmail.com
Permanent Link (DOI): https://doi.org/10.25549/usctheses-c89-222444
Unique identifier: UC11674701
Identifier: etd-daSilvaPed-7842.pdf (filename), usctheses-c89-222444 (legacy record id)
Legacy Identifier: etd-daSilvaPed-7842.pdf
Dmrecord: 222444
Document Type: Dissertation
Rights: da Silva, Pedro Henrique Gomes
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA