Realistic modeling of wireless communication graphs for the design of efficient sensor network routing protocols
REALISTIC MODELING OF WIRELESS COMMUNICATION GRAPHS FOR THE DESIGN OF EFFICIENT SENSOR NETWORK ROUTING PROTOCOLS

by
Marco Antonio Zúñiga Zamalloa

A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the Requirements for the Degree
DOCTOR OF PHILOSOPHY
(ELECTRICAL ENGINEERING)

December 2006

Copyright 2007 Marco Antonio Zúñiga Zamalloa

Dedication

To my brother Carlo, my sisters Dany and Caroline, my nephew Sedrik, and especially papá y mamá, Federico y Hortensia. And to the wife and kids that I do not have yet, but some day I will.

Acknowledgements

I see my years as a graduate student at the University of Southern California as the most rewarding experience of my life. I have had the great opportunity to interact not only with outstanding researchers but especially with superb people.

Before joining the program I heard that the relationship with your advisor is perhaps the single most important one that you have as a PhD student, and I am grateful to have had the opportunity to work with Professor Bhaskar Krishnamachari. Bhaskar, thank you for showing me what a great mentor and a great leader is. I would also like to thank Professor Ahmed Helmy and Professor Ali Zahid for their support and encouragement throughout my studies. I am also grateful to have met Professor Chen Avin, whose insightful comments and guidance contributed significantly to the results presented in chapter 5.

Thank you also to the members of my Qualifying and Defense committees, Professor Urbashi Mitra, Professor Konstantinos Psounis, and Professor Ramesh Govindan, whose comments and suggestions significantly improved the quality of the work presented in this thesis.

My fellow students played a central role during my studies. I would especially like to thank Shyam, Kiran, Sundeep and Avinash for their valuable comments on my work and for their unconditional friendship.
Members of the ANRG and NOMADS groups have also been a major source of support, like Dong-Jin, Amitabha, Ganesh, Jabed, Nara, Gang, Hua, Pai-Han, Joon, Yi, Divyesh, Fan, Yang and Karim.

Table Of Contents

Dedication
Acknowledgements
List Of Tables
List Of Figures
Abstract
Chapter 1: Introduction
  1.1 Overview
  1.2 Research Contributions
    1.2.1 Transitional Region Analysis
    1.2.2 Impact of Lossy Links in Geographic Routing
    1.2.3 Performance of Random Walks on Heterogeneous Networks
  1.3 Scope
Chapter 2: Related Work
  2.1 Link Layer Modeling
  2.2 Geographical Routing
  2.3 Random Walk Based-Queries
Chapter 3: Analysis of the Transitional Region
  3.1 Overview
  3.2 Impact of Channel Multi-Path
    3.2.1 Channel and Radio Receiver Models
    3.2.2 Impact on Link Reliability (Extent of Transitional Region)
    3.2.3 Expectation and Variance of Packet Reception Rate
    3.2.4 Comparison With Available Link Models
  3.3 Impact of Hardware Variance
    3.3.1 Model
    3.3.2 Impact on Asymmetric Links
    3.3.3 Impact on Extent of Transitional Region
  3.4 Experiments with Motes
    3.4.1 Channel and Radio Parameters
    3.4.2 Chain Topologies
  3.5 Summary
Chapter 4: Impact of Lossy Links on Geographic Routing
  4.1 Overview
  4.2 Scope, Assumptions and Metrics
  4.3 Analytical Model
    4.3.1 Problem Description
    4.3.2 Analysis for ARQ case
    4.3.3 Analysis for the No-ARQ case
  4.4 Geographic Forwarding Strategies for Lossy Networks
    4.4.1 Distance-based Forwarding
    4.4.2 Reception-based Forwarding
    4.4.3 PRR×d
  4.5 Comparison of Different Strategies
    4.5.1 PRR×d
    4.5.2 Absolute Reception-Based
    4.5.3 Distance-Based
    4.5.4 Comparison
  4.6 Experiments with Motes
  4.7 Summary
Chapter 5: Performance of Random Walks on Heterogeneous Networks
  5.1 Overview
  5.2 Enhancing Random Walks for Heterogeneity
  5.3 Analytical Results
    5.3.1 Parameters of Line Topology
      5.3.1.1 Analysis of region 1
      5.3.1.2 Analysis of region 2
      5.3.1.3 Analysis of region 3
    5.3.2 Local Minimum for Maximum Hitting Time
    5.3.3 Local Minimum for Expected Hitting Time
  5.4 Numerical and Experimental Results
    5.4.1 Line Topologies
    5.4.2 Regular Grids and Random Geometric Graphs
      5.4.2.1 Grids
      5.4.2.2 Random Geometric Graphs
    5.4.3 Low-Power Wireless Graphs
  5.5 Experiments with Motes
  5.6 Summary
Chapter 6: Conclusions
  6.1 Analysis of the Transitional Region
  6.2 Impact of Lossy Links on Geographic Routing
  6.3 Performance of Random Walks on Heterogeneous Networks
Bibliography
Appendix A: Models from Communication Theory
  A.1 Log-Normal Path Loss Model
  A.2 Encoding and Modulation
  A.3 Noise Floor
Appendix B: Random Walks
  B.1 Resistance Method
  B.2 Time to Absorption in Markov Chains
Appendix C: The Radio Irregularity Model
Appendix D: The Transitional Region in MicaZ Motes

List Of Tables

3.1 Mathematical Notation
3.2 Empirical Channel Parameters
3.3 Empirical Radio Parameters
3.4 Comparison of Total Variance and Channel Variance
3.5 Analytical Extent of Transitional Region
3.6 Theoretical Models for the Link Layer
4.1 Mathematical Notation
4.2 Empirical Results for Different Forwarding Strategies
5.1 Mathematical Notation
5.2 Minimum Maximum and Minimum Average Hitting Times per Region
5.3 Random Geometric Graphs
5.4 Grid Deployments in Realistic Environments
5.5 Random Deployments in Realistic Environment

List Of Figures

3.1 (a) A receiver response where ψ_l and ψ_h determine different regions of link quality, (b) Interaction of γ_l and γ_h with the channel to determine the transitional region.
3.2 Analytical representation of equations 3.6 and 3.7.
3.3 Impact of σ and η on extent of transitional region. (a) Γ for different values of η and σ, (b) solid curves represent average power decay, and dotted lines the [-2σ, 2σ] interval of the variance.
3.4 Impact of perfect receiver threshold on extent of transitional region, (a) Analytical representation, (b) An instance of PRR vs distance.
3.5 cdfs for packet reception rate for receivers in different regions in a specific environment (η = 3, σ = 3).
3.6 Linear approximation of receiver threshold and Gaussian SNR; the mean of the Gaussian depends on the transmitter-receiver distance.
3.7 Comparison of E[Ψ] and Var[Ψ] with their linear approximations, E[Ψ_L] and Var[Ψ_L].
3.8 Comparison of cdfs between the Gaussian model (black curves) and our analytical model (dotted curves) for receivers in different regions.
3.9 Impact of hardware variance on asymmetric links.
3.10 Impact of channel multi-path and hardware variance on extent of transitional region, (a) Impact of channel multi-path (real channel + identical non-variant hardware), (b) Impact of hardware variance (ideal channel + hardware variance), (c) Combined impact of channel multi-path and hardware variance.
3.11 Correlation between output power and noise floor.
3.12 Simulation results for the relation between in-degree and out-degree, (a) positive correlation, (b) negative correlation.
3.13 Empirical correlation between in-degree and out-degree for different power levels, (a) indoor environment, (b) outdoor environment.
3.14 Kolmogorov-Smirnov test between the empirical and model-generated packet reception rate for medium and high output powers in a grass area.
3.15 Comparison of empirical measurements and instances of analytical model, (a) Empirical indoor P_t = -7 dB, (b) Empirical outdoor P_t = -7 dB, (c) Empirical outdoor P_t = 5 dB, (d), (e) and (f) are the analytical counterparts.
4.1 Impact of channel multi-path on E[ξ_d^ARQ], (a) impact of path loss exponent η, (b) impact of channel variance σ.
4.2 Energy efficiency metric for the ARQ case. The transitional region often has links with good performance as per this metric.
4.3 Impact of different parameters on q_d, (a) τ, (b) η, (c) σ, (d) P_t.
4.4 Performance of PRR blacklisting.
4.5 Performance of distance blacklisting.
5.1 Examples of line topologies. The big circles denote cluster-heads and the dashed lines their coverage k; all nodes within k hops from the cluster-head are its neighbors.
5.2 Resistance calculation for regions 2 and 3.
5.3 C_query, C_event and C_total vs the number of clusters for a line topology with 121 nodes and different values of k (6, 8 and 10).
5.4 Maximum hitting time for a line topology with 121 nodes and values of k ranging from 5 to 10.
5.5 C_query (expected hitting time) for a line topology with 121 nodes and values of k ranging from 5 to 10.
5.6 (a) C_event, (b) C_query and (c) C_total for a grid topology with 169 nodes and values of k ranging from 2 to 6.
5.7 Empirical study of random walks on degree-heterogeneous graphs, (a) presents the degree of the nodes, (b) presents the query cost for graphs with different number of clusters.
D.1 Comparison of empirical measurements for channel, radio and link between mica2 and micaZ motes, P_t = -10 dBm for both types of motes, (a) channel mica2, (b) radio mica2, (c) link mica2, (d), (e), (f) are their micaZ counterparts.

Abstract

Recent advances in low-power processor technology, radios, sensors and actuators will allow us to monitor and instrument the physical world with unprecedented granularity and precision. These new types of systems, often referred to as wireless sensor networks, present unique challenges compared to current networked and embedded systems. One of the most important challenges is the design of efficient routing protocols considering the particular characteristics of the underlying communication graph.

Sensor network deployments are characterized by a number of non-idealities such as spatio-temporal variation in wireless link quality, link asymmetry, hardware variance, node heterogeneity and randomized placement. Our thesis is that a realistic modeling of the underlying communication graph incorporating these effects is necessary to design highly efficient routing mechanisms for sensor networks.

We substantiate this thesis by developing a realistic communication graph model and incorporating it into the design of a geographic routing mechanism and a random walk-based querying mechanism. First, we provide an in-depth analysis of the so-called transitional region in multi-hop wireless networks, and propose a more realistic link layer model. Then, based on this model we present analytical results on the impact of channel multi-path on geographical routing. Finally, we use our model to study random walk-based queries and show that
their performance can be enhanced by exploiting the node degree heterogeneity present in real wireless sensor networks.

Chapter 1
Introduction

1.1 Overview

Recent technological advances in MEMS (Micro-Electro-Mechanical Systems) and chip design have resulted in the emergence of a novel networking paradigm where networks of several wireless sensors can interact with the physical world. As in any other network, one of the main tasks of these new wireless sensor networks (WSN) is the communication of information among nodes. Given that the underlying communication graph determines to a great extent the quality and quantity of communication in the network, it is important to have an accurate representation of the graph in order to design efficient routing protocols.

Unfortunately, most of the research to date has assumed homogeneous networks using the ideal binary model, in which a node receives a packet only if it is within the circular transmission range of the sender. While this ideal set-up is sufficient to capture limiting bounds on some important properties of WSN such as connectivity and capacity [31, 30], it does not capture other important properties of real deployments. For example, it does not capture the effect of channel multi-path on link reliability, the impact of hardware variance on link asymmetry, nor the cluster-head characteristic of heterogeneous deployments [29].

Recent empirical studies have shown that the differences mentioned above can considerably degrade the performance of protocols designed under the ideal model. Ganesan et al. [25] have shown that the behavior of even the simplest flooding mechanism can be significantly affected by asymmetric and occasional long-distance links. It was observed that in real deployments the flooding tree exhibits a significant presence of cluster-heads. Zhou et al. [60] report that unreliable links can have a negative impact on routing protocols, especially location-based routing protocols such as geographic routing. Other works [20, 57, 49] have proposed mechanisms to leverage the particular properties of wireless links.
These studies show that traditional minimum hop-count metrics perform poorly in terms of throughput and energy efficiency, and that routing metrics based on the required number of transmissions perform better.

The previous works motivate the following thesis: A realistic modeling of the underlying communication graph of wireless sensor networks is necessary to design efficient routing protocols.

1.2 Research Contributions

The proposed thesis is supported through three studies. First, we provide an in-depth understanding of the root causes of unreliability and asymmetry of wireless links and propose a realistic link layer model. Based on this model, the second study presents optimal local forwarding metrics to minimize energy consumption in geographical routing. Finally, we present an algorithm that improves the performance of random walk-based queries by exploiting node degree heterogeneity.

1.2.1 Transitional Region Analysis

Several empirical studies [25, 59, 57] have revealed that real links deviate to a large extent from the ideal binary model. These studies have identified the existence of three distinct reception regions in the wireless link: connected, transitional, and disconnected. In the connected region links are often of good quality, and the disconnected region presents no practical links for communication. The transitional region, on the other hand, is generally characterized by high variance in reception rates and asymmetric connectivity. Because of its inherent unreliability and extent, the transitional region has a major impact on the performance of upper-layer protocols.

In this study, we present an in-depth analysis of the transitional region for static and low-dynamic environments (no significant time-variance was considered). Our analysis shows how channel multi-path affects the extent of the transitional region, and quantifies the impact of hardware variance on link asymmetry. One of the major contributions of this work is the derivation of a more realistic link layer model.
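
As a rough illustration of the three reception regions described above, the following is a minimal sketch and not the model derived in chapter 3. It assumes a log-normal path loss channel and a non-coherent FSK receiver, for which the textbook bit error rate is 0.5·exp(-Eb/(2·N0)). All names and numeric values (output power, reference path loss, noise floor, frame size, the 0.9 and 0.1 reception thresholds, and the ±2σ rule used to label a region) are hypothetical choices made only for this example.

```python
import math


def prr_from_snr_db(snr_db, frame_bytes=100, bn_over_r=1.5625):
    """Packet reception rate for a non-coherent FSK receiver (assumed radio).

    bn_over_r is the noise bandwidth divided by the data rate; the value
    used here is an illustrative assumption.
    """
    snr = 10 ** (snr_db / 10.0)                   # dB -> linear
    ber = 0.5 * math.exp(-0.5 * snr * bn_over_r)  # NCFSK bit error rate
    return (1.0 - ber) ** (8 * frame_bytes)       # every bit must be correct


def mean_snr_db(distance_m, pt_dbm=0.0, pl_d0_db=55.0, eta=3.0,
                noise_floor_dbm=-105.0, d0_m=1.0):
    """Mean SNR under log-normal path loss, without the random shadowing term."""
    path_loss = pl_d0_db + 10.0 * eta * math.log10(distance_m / d0_m)
    return pt_dbm - path_loss - noise_floor_dbm


def region(distance_m, sigma_db=3.0, prr_high=0.9, prr_low=0.1):
    """Label a distance as connected, transitional or disconnected.

    Assumed rule: a link is connected if it is still good when shadowing is
    2 sigma below the mean, disconnected if it is still bad when shadowing
    is 2 sigma above the mean, and transitional otherwise.
    """
    mu = mean_snr_db(distance_m)
    if prr_from_snr_db(mu - 2.0 * sigma_db) >= prr_high:
        return "connected"
    if prr_from_snr_db(mu + 2.0 * sigma_db) <= prr_low:
        return "disconnected"
    return "transitional"


if __name__ == "__main__":
    for d in (5.0, 20.0, 100.0):
        print(d, region(d))
```

With these illustrative parameters, short links fall in the connected region, long links in the disconnected region, and a band of intermediate distances is left in between where the reception rate depends strongly on the random component of the channel.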

1.2.2 Impact of Lossy Links in Geographic Routing

In geographic routing, each node forwards a packet to the neighbor whose location is closest to the destination. However, the existence of unreliable links exposes a key weakness in greedy forwarding. Neighbors that are closest to the destination (and therefore likely to be farthest from the forwarding node) may have poor links with the current node; these "weak links" can result in increased energy wastage due to dropped packets. On the other hand, if the forwarding mechanism attempts to maximize per-hop reliability by forwarding only to close neighbors with good links, it may cover only a small geographic distance at each hop, which would also result in greater energy expenditure due to the need for more transmissions.

Based on the model derived in this thesis, we present a study of the distance-hop energy trade-off described above. Our analysis, simulations and experiments all show that the product of the packet reception rate (PRR) and the distance traversed toward the destination is the optimal local forwarding metric for systems using automatic repeat request (ARQ).

1.2.3 Performance of Random Walks on Heterogeneous Networks

Random walks are an important approach for querying in unstructured systems. In random walks, nodes are visited sequentially in a random order, with successive nodes being neighbors in the graph. However, most of the literature is focused on simple classes of deterministic graphs (such as 2D tori), and a major property of real-life networks, degree heterogeneity, is left out of the discussion.

Heterogeneity in the degree distribution is a characteristic observed in real large-scale WSN for different reasons, ranging from hardware variance to recent proposals [29] suggesting that heterogeneous networks consisting of cluster-heads and regular nodes are a convenient direction to scale WSN.

In this study, we propose the use of a simple algorithm to exploit the heterogeneity of the communication graph to enhance the performance of random-walk-based queries. We present analytical results on linear topologies and numerical results on 2D topologies based on the link layer model proposed in this thesis. Our results show that a small percentage of nodes being cluster-heads can lead to significant improvements in performance.

1.3 Scope

It is important to consider that as technology evolves, future-generation radios may reduce multi-path effects (technology evolution has likewise led to virtually error-free wired links on the Internet, from initial middle-of-the-road links), and applications using these radios may not be considerably affected by multi-path and hardware variance effects. Nevertheless, the large-scale scenarios targeted by wireless sensor networks pose major constraints on the cost of the radios, leading to devices that are resource-constrained in terms of radio, energy and processing capabilities. Hence, given that even resource-rich scenarios such as cellular networks still face significant shortcomings, the extra constraints posed by WSN (chief among them cost and energy) would further complicate the achievement of ideal communication channels.

Also, even though this work is focused on WSN, similar link properties have been observed in other multi-hop wireless networks such as Mobile Ad Hoc Networks (MANETs), and hence some of our results may be applicable to such networks as well.

The remainder of the thesis is organized in the following way. Chapter 2 presents the related work on the areas covered by our studies. In chapter 3 we propose a more realistic link layer model for WSN. The proposed model is used in chapters 4 and 5 to enhance the performance of geographic routing and random walk-based queries, respectively. Our conclusions are presented in chapter 6.
The interested reader can find some introductory material on the analytical tools used in this thesis in Appendices A and B.

Chapter 2
Related Work

Recent empirical results [25, 57, 29] have shown the striking discrepancy between the performance of WSN routing protocols in real scenarios and their behavior under ideal settings. These studies have led the WSN community to an increased understanding of the need for realistic link layer models.

While the related work on the specific area of routing protocol design on real WSN communication graphs is not extensive, the areas of channel modeling and routing protocols have each been studied extensively on their own. On one hand, during the last several years a number of routing protocols for wireless sensor networks have been proposed ([34, 33, 14], just to name a few); however, most of these protocols were studied under ideal assumptions. On the other hand, the area of communication theory has developed over the last decades several interesting models to capture the random behavior of the wireless channel. However, most of these efforts were targeted at different types of networks, mainly satellite and cellular.

In this thesis we use some of the rich tools developed in communication theory to propose a more realistic link layer model for WSN. Then, we use this model to study the performance of geographic routing and random walk-based queries in real scenarios. In the next sections, we present the related work on the areas covered by this thesis.

2.1 Link Layer Modeling

Years of research in wireless communications, particularly cellular networks, provide a rich set of models and tools for analyzing the physical layer [46]. However, most of the research has been focused on improving physical layer parameters, mainly the bit error rate (BER). From a networking perspective, an ideal abstraction to design protocols is an underlying communication graph where edges are tagged with the probability of successfully receiving a packet (packet reception rate).
Initially, the wireless sensor network community adopted the ideal binary model as the communication premise to build these graphs.

Unfortunately, the ideal model does not capture important phenomena of the wireless link. For instance, unreliable and asymmetric links are entirely ignored, and other important associated characteristics such as the node degree distribution can be completely misleading. Recent empirical studies have reported that the performance of routing protocols designed under ideal settings can differ significantly when deployed in real scenarios.

Kotz et al. [37] show that packet losses lead to different connectivity graphs, and to coverage ranges that are neither circular nor convex and are often noncontiguous. In one of the earliest works [25], Ganesan et al. present empirical results on the behavior of simple flooding in a dense sensor network. They found that the flooding tree exhibits a high clustering behavior, in contrast to the more uniformly distributed tree obtained with the ideal binary model. Couto et al. [20] and Woo et al. [57] report that when the real channel characteristics are taken into account, the minimum hop-count metric has poor performance. They show that in real scenarios cost-based routing using a minimum expected transmission metric shows good performance. Zhou et al. [60] report that radio irregularity has a significant impact on routing protocols, but a relatively small impact on MAC protocols.

The previous studies made clear to the community that the design of WSN routing protocols requires more realistic communication graphs. In order to help overcome this problem some tools and models have been proposed recently. In [57], the authors derive a packet loss model based on aggregate statistical measures such as the mean and standard deviation of the packet reception rate. The model assumes a Gaussian distribution of the packet reception rate for a given transmitter-receiver distance, which, as will be shown in chapter 3, is not accurate.
Using the SCALE tool [10], Cerpa et al. [11] use several statistical techniques to provide a spectrum of models of increasing complexity and increasing accuracy. A more recent model, the Radio Irregularity Model (RIM), was proposed in [60]. Based on experimental data, RIM provides a radio model that takes into account both the non-isotropic properties of the propagation media and the heterogeneous properties of devices to build a richer link model.

While the described models are important steps toward a realistic link quality model, they do not provide significant mathematical insight into how the channel and radio parameters affect link unreliability and asymmetry. Also, some of these models [57, 11] do not provide a systematic way to generalize (i.e., extend their validity and accuracy) beyond the specific radio and environment conditions of the experiments from which the models are derived.

In chapter 3 we use some tools from communication theory, the log-normal path loss model and the BER expressions of various modulation and encoding schemes, to present an in-depth analysis of unreliable and asymmetric links and provide simple and more realistic analytical models for the link layer.

2.2 Geographical Routing

Geographic routing [23, 34] is a key paradigm that is quite commonly adopted for information delivery in wireless ad-hoc and sensor networks where the location information of nodes is available. The main component of geographic routing is usually a greedy forwarding mechanism whereby each node forwards a packet to the neighbor that is closest to the destination.

Geographic routing is attractive for these new types of wireless networks because of its low overhead and the minimal state required at each node.
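
The greedy forwarding rule described above can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the protocol evaluated in chapter 4: the function name `choose_next_hop`, the neighbor table layout, and the coordinates are hypothetical. It also shows, for contrast, the variant from Section 1.2.2 that weights the distance advanced toward the destination by the packet reception rate (PRR) of the link.

```python
import math


def dist(a, b):
    """Euclidean distance between two (x, y) positions."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def choose_next_hop(current, destination, neighbors, use_prr=False):
    """Pick the neighbor with the best local forwarding score.

    neighbors maps a neighbor name to (position, prr); this table layout is
    an assumption of the example. With use_prr=False the score is the
    distance advanced toward the destination (plain greedy forwarding);
    with use_prr=True the score is PRR times that advance.
    Returns None when no neighbor makes progress (a dead-end).
    """
    here = dist(current, destination)
    best_name, best_score = None, 0.0
    for name, (position, prr) in neighbors.items():
        advance = here - dist(position, destination)
        if advance <= 0:
            continue  # never forward away from the destination
        score = prr * advance if use_prr else advance
        if score > best_score:
            best_name, best_score = name, score
    return best_name


if __name__ == "__main__":
    table = {
        "far": ((30.0, 0.0), 0.20),    # large advance, weak link
        "near": ((10.0, 0.0), 0.95),   # small advance, strong link
        "behind": ((-5.0, 0.0), 1.00), # moves away from the destination
    }
    print(choose_next_hop((0.0, 0.0), (100.0, 0.0), table))
    print(choose_next_hop((0.0, 0.0), (100.0, 0.0), table, use_prr=True))
```

In this toy table the distance-greedy rule picks the far neighbor despite its weak link, while the PRR-weighted rule prefers the nearer neighbor because 0.95 × 10 exceeds 0.20 × 30.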
Several works have presented mechanisms to overcome some of the problems of geographic routing such as dead-ends [6, 34, 38], and it has been proven to be an efficient, low-overhead method of data delivery if it is reasonable to assume (i) sufficient network density, (ii) accurate localization and (iii) high link reliability independent of distance within the physical radio range. However, while assuming highly dense sensor deployment and reasonably accurate localization may be acceptable in some classes of applications, assumption (iii), concerning highly reliable and deterministic links, is unlikely to be valid in any realistic deployment.

Given the attractiveness of geographic routing, some recent works have explored its performance in real scenarios. Kim et al. [36] use simulations and test-bed measurements to show that the differences between the ideal model and real links cause persistent failures in geographic routing, even on static topologies. These differences can cause three kinds of pathologies in the planarization process required to avoid dead-ends: a link in the planar subgraph is removed when it should not be; nodes at the two ends of an asymmetric link disagree on whether or not the link belongs in the planar graph; or a pair of crossed links remain in the supposedly planar subgraph. The authors propose the Cross-Link Detection Protocol (CLDP) as the solution for these problems. Along the same line of work, Zhou et al. [60] used empirical data to show that geographic forwarding performs worse in the presence of radio irregularity than on-demand protocols such as AODV and DSR. The authors propose a technique called Symmetric Geographic Forwarding to alleviate the negative impact of link asymmetry. In this technique, beacon messages contain not only the node's ID and position, but also the IDs of all its neighbors. When a node receives a beacon message it registers the sender as its neighbor.
If the receiver finds its own ID in the neighbor list it marks the communication link as symmetric; otherwise, it marks the link as asymmetric. Whenever a node needs to forward a packet, it selects only those neighboring nodes with which it is connected through symmetric links.

Similarly to the studies described above, our work aims to overcome challenges posed by unreliable wireless links, but in a different domain. In chapter 4 we show that the distance-greedy mechanism of geographic routing can significantly degrade its performance, and we propose a new greedy metric to maximize the energy efficiency of geographic routing in real scenarios. This metric considers not only distance but also the quality of the link in order to minimize the number of transmissions. Our work is related to the studies done in [20] and [57], where a minimum expected transmission metric is proposed to optimize routing. However, while the minimum expected transmission metric is a global path metric, our work provides an optimal local metric suitable for scalable routing protocols such as geographic routing.

Our work has sparked interest in the community in optimal geographic forwarding strategies on real low-power wireless links, and some works have followed up on our initial study. In [39], the authors propose a new metric called normalized advance (NADV), which also addresses the distance-hop trade-off and provides some flexibility in terms of the metric to be optimized, such as energy or delay. Li et al. [41] add power control to our proposed PRR×d metric. The authors of [58] study PRR×d, among other metrics, in 802.11b networks and suggest that link quality should be tested using data traffic.

Finally, it is important to describe the research done in the wireless communication area. Traditionally, wireless communication research has focused only on pure physical layer techniques to overcome the effects of multi-path, for example Rake receivers [7, 42] and multiple input multiple output (MIMO) systems [24, 18].
However, some recent studies have explored the interaction between cooperative diversity techniques, in the physical layer, and routing, in the network layer. In [16], the authors consider in a unified fashion the effects of cooperative communication via transmission diversity and multi-hopping as well as optimal power allocation schemes in fading channels. Khandani et al. [35] propose a mechanism based on omni-directional antennas to optimize the energy efficiency on the transmission of a single message from a source to destination through sets of nodes acting as cooperating relays. They present a solution to optimally allocate power for a set of source nodes to a set of destination nodes. Our work, presented in chapter 4, differs from the previous works in that it uses only techniques at the network layer with inexpensive radios that do not require any extra functionality at the physical layer.

2.3 Random Walk Based-Queries

Random walks on graphs have been studied mathematically, and there is a substantial yet growing body of theoretical literature on the subject [3, 43, 9]. Recently, they are finding increasing use in a wide range of protocols in the context of several networked distributed systems. For instance, they have been used in unstructured P2P networks [19, 1, 15], for hybrid application overlays [56], for group membership services in mobile ad hoc networks [21, 5], for distributed model checking [53], and for index quality determination for the world-wide web [32].

Specifically in the context of unstructured wireless sensor networks, different variants of random-walk-based protocols have been proposed and analyzed by several research groups. Servetto and Barrenechea [51] proposed and analyzed the use of constrained random walks on a grid for performing load-balanced routing between two known nodes. Avin and Brito [4] have argued that even simple random walks can be used for efficient and robust querying because they are inherently load-balanced and their partial cover times show good scaling behavior.
The ACQUIRE protocol [48] provides a tunable look-ahead parameter to combine random walks with controlled floods, and its authors show that such random-walk-based hybrids can outperform flooding and even expanding-ring-based approaches in the presence of replicated data. The rumor routing algorithm [8] is a hybrid push-pull mechanism that advocates the use of multiple random walks from the events as well as the sinks, so that their intersection points can be used to provide a rendezvous point. Shakkottai [52] has analyzed different variants of random-walk-based query mechanisms and concludes that source and sink-driven sticky-searches (similar to rumor routing) provide a rapid increase of query success probability with the number of steps. Most recently, Alanyali et al. [2] have proposed the use of random walks in energy-constrained networks to perform efficient distributed computation of a class of decomposable functions.

However, most of the research to date has been based on simple classes of deterministic graphs (such as grids). Given the increasing attractiveness of random walks, it is important to evaluate their performance in real scenarios. An important characteristic that has not been considered in the studies described above is degree heterogeneity. Degree heterogeneity is a highly likely characteristic of WSNs that has already been identified in some empirical works [25, 11], and the model derived in chapter 3 captures this characteristic to some extent.

The impact of degree heterogeneity on random walks has been explored by Gkantsidis et al. [28], who report that random walks achieve better results than flooding for searching in P2P networks when the overlay topology is clustered. In chapter 5, we use results relating random walks and electrical networks to show that the insertion of cluster-heads (degree heterogeneity) in line topologies can considerably enhance the performance of random walk-based queries in WSNs.
Related to our analytical result is the work of Ghosh et al. [26], who present an optimization approach to reduce the effective resistance between two vertices, which in turn can reduce the commute time.

Chapter 3

Analysis of the Transitional Region

Experimental studies have demonstrated that the behavior of real links in low-power wireless networks (such as wireless sensor networks) deviates to a large extent from the ideal binary model used in several simulation studies [59, 57]. In particular, there is a large transitional region in wireless link quality that is characterized by significant levels of unreliability and asymmetry. In this chapter, we provide a comprehensive analysis of the root causes of unreliability and asymmetry. In particular, we derive expressions for the distribution, expectation, and variance of the packet reception rate as a function of distance, and for the location and extent of the transitional region. These expressions incorporate important channel and radio parameters such as the path loss exponent and variance of the channel, and the modulation, encoding, and hardware variance of the radios.

3.1 Overview

Wireless sensor network protocols are often evaluated through simulations that make simplifying assumptions about the link layer, such as the ideal binary model. In this model, packets are received only within the circular radio range of the transmitter. However, the real characteristics of low-power wireless links differ greatly from those of the ideal model; chief among these differences are the unreliable and asymmetric nature of real links. The significant differences between the ideal model and the real behavior can lead to erroneous performance evaluation of upper-layer protocols (network layer and above).

Several studies ([25, 59, 57]) have classified low-power wireless links into three distinct reception regions: connected, transitional, and disconnected.
In the connected region, links are often of good quality, stable and symmetric. On the other hand, the transitional region is characterized by the presence of unreliable and asymmetric links; and the disconnected region presents no practical links for transmission. Unfortunately, the transitional region is often quite significant in size, and in dense deployments such as those envisioned for sensor networks, a large number of the links in the network (even higher than 50% [59]) can be unreliable.

Recent studies have shown that unreliable and asymmetric links can have a major impact on the performance of upper-layer protocols. In [25], it is shown that the behavior of even the simplest flooding mechanism can be significantly affected due to asymmetric and occasional long-distance links. In [37], it is argued that the routing structures formed taking into account unreliable links can be significantly different from the structures formed based on the simple binary model. Similarly, the authors of [60] report that such unreliable links can have a negative impact on routing protocols, particularly geographic forwarding schemes.

Other works ([57, 20]) have proposed mechanisms to take advantage of nodes in the transitional region. For instance, the authors of [20] found that protocols using the traditional minimum hop-count metric perform poorly in terms of throughput, and that a new metric called ETX (expected number of transmissions), which uses nodes in the transitional region, has a better performance.

The significant impact of real link characteristics on the performance of upper-layer protocols has created an increased understanding of the need for realistic link layer models for wireless sensor networks. In order to address this need, some recent works ([57, 60, 11]) have proposed new link models based on empirical data. However, a shortcoming of these models is that they do not provide enough mathematical insight into how channel multi-path and hardware variance affect link unreliability and asymmetry.
Also, some of these works ([11, 57]) are valid only for the specific channel and radio parameters used in the deployment.

In this chapter, we use analytical tools from communication theory, simulations and experiments to present an in-depth analysis of unreliable and asymmetric links in low-power multi-hop wireless networks. The main contributions of this work are twofold. First, it allows us to quantify the impact of the wireless environment and radio characteristics on link reliability and asymmetry. Second, we propose a systematic way to generalize models for the link layer that can be used to facilitate the design of efficient routing protocols.

We also derive expressions for the packet reception rate as a function of distance, and for the size of the transitional region. These expressions incorporate several radio parameters such as modulation, encoding, output power, frame size, receiver noise floor and hardware variance; as well as important channel parameters, namely, the path loss exponent and the log-normal variance.

The chapter is organized as follows. Section 3.2 studies the impact of multi-path on link reliability. First, we present a model for the packet reception rate as a function of distance in subsection 3.2.1. Based on this model, in subsection 3.2.2 we study the impact of channel and radio parameters on link reliability by analyzing their effect on the extent of the transitional region. Then, in subsection 3.2.3 we present approximate expressions for the expectation and variance of the packet reception rate as a function of distance. The section ends with a comparison of available link models with the one proposed in this thesis (subsection 3.2.4).

We study the impact of hardware variance in section 3.3. Hardware variance has already been identified as the cause of link asymmetry [11]; in addition, we show that it can play a significant role in the extent of the transitional region. In subsection 3.3.1, we present a model for hardware variance.
Based on this model, the impact of hardware variance on link asymmetry and reliability is quantified in subsections 3.3.2 and 3.3.3, respectively. Finally, in section 3.4 we present empirical measurements based on a test-bed of mica2 motes which validate some analytical insights of sections 3.2 and 3.3. A summary is presented in section 3.5.

Before proceeding we present the scope of our work. Our study is focused on static and low-dynamic environments, and it considers neither interference effects nor the non-isotropic property of radio coverage. However, our work can be complemented with other research efforts to incorporate these properties. For instance, in [55] the authors focus on the study of interference in wireless sensor networks, Cerpa et al. [12] study some temporal properties, and [60] provides an interesting model for the non-isotropic characteristic of radio coverage; the models presented in these works can be used to complement ours. Appendix C presents some guidelines on how to combine the non-isotropic RIM model [60] with our work.

3.2 Impact of Channel Multi-Path

The extent of the transitional region is the result of placing specific devices, for example mica2 motes, in a specific environment, like the aisle of a building. If the characteristics of one of these elements (radio or channel) are altered, then the extent of the transitional region is also altered. To analyze how the channel and the radio determine this extent, we first define models for both elements and subsequently study their interaction.

3.2.1 Channel and Radio Receiver Models

From the network-layer perspective, a desired abstraction for link quality is the packet reception rate as a function of distance.
This abstraction can be derived by composing the channel model, which provides the received signal strength (RSS) as a function of distance, with the radio-receiver model, which provides the packet reception rate (PRR) as a function of the signal to noise ratio (SNR).

In the remainder of the chapter, the SNR function is denoted by $\Upsilon$ and the PRR function by $\Psi$. Also, the lowercase Greek letters $\gamma = \Upsilon(\cdot)$ and $\psi = \Psi(\cdot)$ represent values taken by $\Upsilon$ and $\Psi$ for specific points in their respective domains. Table 3.1 presents a summary of the notation used in this chapter.

Table 3.1: Mathematical Notation

Description                                          Symbol
Packet Reception Rate Parameters
- packet reception rate (PRR)                        $\Psi$
- a specific PRR value in the range of $\Psi$        $\psi$
- high PRR                                           $\psi_h$
- low PRR                                            $\psi_\ell$
Signal to Noise Ratio Parameters
- signal to noise ratio (SNR)                        $\Upsilon$
- a specific SNR value in the range of $\Upsilon$    $\gamma$
- SNR value corresponding to $\psi_h$                $\gamma_h$
- SNR value corresponding to $\psi_\ell$             $\gamma_\ell$
- mean of SNR (Gaussian) for distance d              $\mu(d)$
- bit error rate as a function of SNR                $\beta$
- bit error rate as a function of SNR in dB          $B$
Channel Parameters
- path loss exponent                                 $\eta$
- standard deviation                                 $\sigma$
- output power                                       $P_t$
- received power                                     $P_r$
- noise floor                                        $P_n$
- Gaussian random variable                           $\mathcal{N}$
Transitional Region Parameters
- transitional region coefficient                    $\Gamma$
- beginning of transitional region                   $d_b$
- end of transitional region                         $d_e$

Channel: One of the most common radio propagation models is the log-normal path loss model [46]. This model can be used for large and small coverage systems [50]. Furthermore, empirical studies have shown that the log-normal model provides more accurate multi-path channel models than Nakagami and Rayleigh for indoor environments [44].
According to this model the received power ($P_r$) in dB is given by:

$$P_r(d) = P_t - PL(d_0) - 10\,\eta \log_{10}\left(\frac{d}{d_0}\right) + \mathcal{N}(0, \sigma) \tag{3.1}$$

Where $P_t$ is the output power, $\eta$ is the path loss exponent that captures the rate at which the signal decays with respect to distance, $\mathcal{N}(0, \sigma)$ is a Gaussian random variable with mean 0 and variance $\sigma^2$ ($\sigma$ is the standard deviation due to multi-path effects), and $PL(d_0)$ is the power decay for the reference distance $d_0$.

Equation 3.1 does not consider non-isotropic transmission, which is an important characteristic of low-power wireless links. In Appendix C we present some guidelines on how to incorporate these non-isotropic effects in our model by using the expressions derived in the RIM model [60]. Appendix C also presents some information on how to include the path-loss effect caused by obstacles.

Radio Receiver: The receiver response is given by the packet reception rate as a function of the SNR. The packet reception rate can be derived from bit-error-rate expressions that are widely available in the wireless communication literature.

For a modulation M, the packet reception rate ($\Psi$) is defined in terms of the bit-error-rate ($\beta_M$) as (footnote 1):

$$\Psi(\gamma) = (1 - \beta_M(\gamma))^{f} \tag{3.2}$$

Where $f$ is the number of bits transmitted, and step 3 in Table 3.6 presents expressions of $\beta_M$ for some common narrowband modulation schemes. $\beta_M$ is a function of the SNR, which can be obtained from equation 3.1 and is given by:

$$\Upsilon(d) = P_r(d) - P_n = \mathcal{N}(\mu(d), \sigma) \tag{3.3}$$

Footnote 1: For ease of explanation, the encoding is assumed to be NRZ. Table 3.6 presents expressions for other encoding techniques.

Where $\mathcal{N}(\mu(d), \sigma)$ is a Gaussian random variable with mean $\mu(d)$ and variance $\sigma^2$, and $P_n$ is the noise floor. $\mu(d)$ can be derived by inserting equation 3.1 in equation 3.3, which leads to:

$$\mu(d) = P_t - PL(d_0) - 10\,\eta \log_{10}\left(\frac{d}{d_0}\right) - P_n \tag{3.4}$$

Given that the SNR in equation 3.3 is in dB, let us redefine the packet reception rate in equation 3.2 as a function of the SNR in dB.
Denoting $\omega(x) = 10^{x/10}$ and the bit-error-rate for SNR in dB as $B_M(\gamma_{dB}) = \beta_M(\omega(\gamma_{dB}))$, the packet reception rate $\Psi$ can be redefined as:

$$\Psi(\gamma_{dB}) = (1 - B_M(\gamma_{dB}))^{f} \tag{3.5}$$

While the previous equation is general and valid for any modulation M, the figures in this section assume Non-Coherent FSK (NCFSK) modulation. The figures are for illustrative purposes and any modulation would serve that purpose. NCFSK was chosen because the empirical evaluation presented in section 3.4 uses NCFSK radios (the CC1000-equipped mica2 motes).

3.2.2 Impact on Link Reliability (Extent of Transitional Region)

In this subsection our aim is to quantify the impact of channel multi-path on the extent of the transitional region. Given that the channel model gives the SNR as a function of distance and the receiver response gives the PRR as a function of the SNR, we can derive the behavior of the PRR versus distance by linking both expressions through the SNR metric. First, we derive the SNR values that determine which links are good or unreliable in the receiver response, and then we use these SNR values to obtain the beginning and end of the transitional region in the channel model.

Even though there are no strict definitions for the different regions in the literature, one valid definition is the following:

Definition 1: In the connected region links have a high probability ($> p_h$) of having high packet reception rates ($> \psi_h$).

Definition 2: In the disconnected region links have a high probability ($> p_\ell$) of having low packet reception rates ($< \psi_\ell$).

Where $p_h$ and $p_\ell$ can be chosen as any numbers close to 1, and $\psi_h$ and $\psi_\ell$ as numbers close to 1 and 0, respectively.

Footnote 2: BER functions are injective; hence, while there might not be a closed-form expression for their inverse function, the SNR in the domain can always be obtained numerically.

Letting $B_M^{-1}(\cdot)$ be the inverse (footnote 2) of $B_M(\gamma_{dB})$, and $\Psi^{-1}(\psi) = B_M^{-1}(1 - \psi^{1/f})$ be the inverse of $\Psi$, the PRR values $\psi_h$ and $\psi_\ell$, from the definitions above, can be mapped to
their corresponding SNR values in dB: $\gamma_h = \Psi^{-1}(\psi_h)$ and $\gamma_\ell = \Psi^{-1}(\psi_\ell)$. These SNR values determine the beginning and end of the transitional region.

Figure 3.1: (a) A receiver response (PRR versus SNR in dB) where $\psi_\ell$ and $\psi_h$ determine different regions of link quality (good links, unreliable links, no links or bad links), (b) Interaction of $\gamma_\ell$ and $\gamma_h$ with the channel (SNR in dB versus distance in m) to determine the transitional region.

Figure 3.1 (a) shows how $\psi_h$ and $\psi_\ell$ determine three different regions for link quality in the radio-receiver response (equation 3.5), and Figure 3.1 (b) shows how $\gamma_h$ and $\gamma_\ell$ interact with the channel (equation 3.3) to determine the extent of the connected, transitional and disconnected regions.

According to Definition 1 the beginning of the transitional region ($d_b$) satisfies the following condition:

$$
\begin{aligned}
& p(\Psi > \psi_h) = p_h; \\
\because\ \Psi \text{ is injective} \Rightarrow\ & p(\Upsilon > \gamma_h) = p_h; \\
\because\ \Upsilon \text{ is Gaussian} \Rightarrow\ & Q\left(\frac{\gamma_h - \mu(d_b)}{\sigma}\right) = p_h
\end{aligned}
\tag{3.6}
$$

And according to Definition 2 the end of the transitional region ($d_e$) satisfies:

$$
\begin{aligned}
& p(\Psi < \psi_\ell) = p_\ell; \\
\because\ \Psi \text{ is injective} \Rightarrow\ & p(\Upsilon < \gamma_\ell) = p_\ell; \\
\Rightarrow\ & p(\Upsilon \ge \gamma_\ell) = (1 - p_\ell); \\
\because\ \Upsilon \text{ is Gaussian} \Rightarrow\ & Q\left(\frac{\gamma_\ell - \mu(d_e)}{\sigma}\right) = (1 - p_\ell)
\end{aligned}
\tag{3.7}
$$

Where $Q(\cdot)$ is the tail integral of a unit Gaussian probability density function (pdf) and $\mu(\cdot)$ is given by equation 3.4.

Figure 3.2: Analytical representation of equations 3.6 and 3.7 (SNR in dB versus distance in m, showing the average SNR decay, $d_b$, $d_e$, $p_h$, $p_\ell$, $\gamma_h$, $\gamma_\ell$, and the connected and transitional regions).

Figure 3.2 depicts an analytical representation of the previous equations. This figure shows how the interaction between the channel and the receiver response determines the extent of the transitional region.
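As an illustration of how $\gamma_h$ and $\gamma_\ell$ can be obtained numerically from equation 3.5, the following Python sketch inverts the PRR function by bisection. It is not the implementation used in this work: it assumes the textbook NCFSK bit-error rate $\beta(\gamma) = \frac{1}{2}e^{-\gamma/2}$ with NRZ encoding, treats the SNR as the per-bit SNR, and uses a hypothetical frame of 800 bits. All function names and the search interval are illustrative choices.

```python
import math

FRAME_BITS = 800  # assumption: 100-byte frame with NRZ encoding


def ber_ncfsk(snr_linear):
    """Textbook NCFSK bit-error rate (an assumption, not taken from Table 3.6)."""
    return 0.5 * math.exp(-snr_linear / 2.0)


def prr_from_snr_db(snr_db, frame_bits=FRAME_BITS):
    """Equation 3.5: packet reception rate as a function of the SNR in dB."""
    snr_linear = 10.0 ** (snr_db / 10.0)  # omega(x) = 10^(x/10)
    return (1.0 - ber_ncfsk(snr_linear)) ** frame_bits


def snr_db_for_prr(target_prr, frame_bits=FRAME_BITS, lo=-20.0, hi=40.0, tol=1e-9):
    """Numerical inverse of equation 3.5 by bisection.

    Bisection applies because the PRR increases monotonically with the SNR.
    The interval [lo, hi] is an assumption and must bracket the target.
    """
    if not (prr_from_snr_db(lo, frame_bits) < target_prr
            <= prr_from_snr_db(hi, frame_bits)):
        raise ValueError("target PRR is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if prr_from_snr_db(mid, frame_bits) < target_prr:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


psi_h, psi_l = 0.9, 0.1
gamma_h = snr_db_for_prr(psi_h)
gamma_l = snr_db_for_prr(psi_l)
print(f"gamma_h = {gamma_h:.2f} dB, gamma_l = {gamma_l:.2f} dB")
```

The resulting thresholds depend on the assumed bit-error-rate expression, encoding and frame size, so they should be read as an example of the procedure rather than as values for a particular radio.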
Finally, $d_b$ and $d_e$ can be derived from equations 3.6 and 3.7:

$$
d_b = 10^{\frac{\gamma_h - \sigma Q^{-1}(p_h) - P_t + P_n + PL(d_0)}{-10\eta}}
\qquad
d_e = 10^{\frac{\gamma_\ell - \sigma Q^{-1}(1 - p_\ell) - P_t + P_n + PL(d_0)}{-10\eta}}
\tag{3.8}
$$

While equation 3.8 provides absolute values for the extent of the different regions, it may not be useful to compare the link quality of different scenarios. With that aim, we define the transitional region coefficient $\Gamma$, which is the ratio of the extent of the transitional region with respect to the extent of the connected region:

$$
\Gamma = \frac{d_e - d_b}{d_b} = 10^{\frac{(\gamma_h - \gamma_\ell) + \sigma\left(Q^{-1}(1 - p_\ell) - Q^{-1}(p_h)\right)}{10\eta}} - 1
\tag{3.9}
$$

Figure 3.3: Impact of $\sigma$ and $\eta$ on extent of transitional region. (a) $\Gamma$ for different values of $\eta$ and $\sigma$, (b) solid curves represent average power decay, and dotted lines the $[-2\sigma, 2\sigma]$ interval of the variance.

The lower the coefficient $\Gamma$, the smaller the transitional region compared to the connected one. For example, for the ideal binary model, where $\gamma_h = \gamma_\ell$ and $\sigma = 0$, the coefficient $\Gamma = 0$. Notice that $\Gamma$ is independent of the noise floor $P_n$ and output power $P_t$; a higher output power would increase the connected region, but it would increase the transitional region as well, keeping a constant ratio.

Equation 3.9 predicts the impact of the channel on the transitional region. Given that $p_h$ and $p_\ell$ are high probabilities, $(Q^{-1}(1 - p_\ell) - Q^{-1}(p_h))$ is positive, and hence, while a small $\sigma$ decreases the relative extent of the transitional region, a small $\eta$ increases it.
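Equations 3.8 and 3.9 can be evaluated directly, as in the following Python sketch. The SNR thresholds (10.23 dB and 8.20 dB) and $p_h = p_\ell = 0.9$ are the values quoted in this section for Figure 3.3 (a); the output power, noise floor and reference path loss are hypothetical numbers chosen only for illustration. The `d0` argument is an addition of this sketch, and equation 3.8 as written corresponds to `d0 = 1`.

```python
from statistics import NormalDist


def q_inv(p):
    """Inverse of the Gaussian tail integral Q(.)."""
    return NormalDist().inv_cdf(1.0 - p)


def region_bounds(gamma_h, gamma_l, sigma, eta, p_t, p_n, pl_d0,
                  p_h=0.9, p_l=0.9, d0=1.0):
    """Equation 3.8: beginning (d_b) and end (d_e) of the transitional region.

    Powers are in dBm, SNR and path loss in dB, distances in the unit of d0.
    """
    offset = -p_t + p_n + pl_d0
    d_b = d0 * 10.0 ** ((gamma_h - sigma * q_inv(p_h) + offset) / (-10.0 * eta))
    d_e = d0 * 10.0 ** ((gamma_l - sigma * q_inv(1.0 - p_l) + offset) / (-10.0 * eta))
    return d_b, d_e


def transitional_coefficient(gamma_h, gamma_l, sigma, eta, p_h=0.9, p_l=0.9):
    """Equation 3.9: Gamma = (d_e - d_b) / d_b."""
    spread = (gamma_h - gamma_l) + sigma * (q_inv(1.0 - p_l) - q_inv(p_h))
    return 10.0 ** (spread / (10.0 * eta)) - 1.0


# Hypothetical radio and channel values, for illustration only.
d_b, d_e = region_bounds(gamma_h=10.23, gamma_l=8.20, sigma=3.0, eta=3.0,
                         p_t=0.0, p_n=-105.0, pl_d0=55.0)
coeff = transitional_coefficient(10.23, 8.20, sigma=3.0, eta=3.0)
print(f"d_b = {d_b:.1f} m, d_e = {d_e:.1f} m, Gamma = {coeff:.2f}")
```

Changing `p_t` or `p_n` in `region_bounds` moves both distances but leaves their ratio unchanged, which is the independence property noted for $\Gamma$ above.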
Therefore, scenarios with high $\eta$ and low $\sigma$ reduce the relative size of the transitional region. Figure 3.3 (a) presents $\Gamma$ for different values of $\eta$ and $\sigma$, where $p_\ell = p_h = 0.9$, $\gamma_h = 10.23$ dB and $\gamma_\ell = 8.20$ dB (footnote 3).

Figure 3.4: Impact of perfect receiver threshold on extent of transitional region, (a) Analytical representation, (b) An instance of PRR vs distance.

Figure 3.3 (b) depicts analytically the impact of $\eta$ and $\sigma$ on the extent of the transitional region. The SNR bounds on the radio receiver ($\gamma_h$ and $\gamma_\ell$) are fixed and independent of the environment. When $\sigma$ increases from 1 to 2 the signal values (y-axis) have a higher probability of entering the transitional region at closer distances from the transmitter and leaving it at farther distances, which results in a larger transitional region. When $\eta$ is increased (left arrow), the faster decay of the signal strength decreases the width of the transitional region.

Equation 3.9 also predicts the impact of the receiver. The sharper the receiver threshold, the smaller $(\gamma_h - \gamma_\ell)$ and the smaller the $\Gamma$ coefficient. However, even with a perfect threshold receiver ($\gamma_h = \gamma_\ell$), as the one used in the ideal model, the transitional region would still exist due to channel multi-path ($\sigma$).

Footnote 3: $\gamma_h$ and $\gamma_\ell$ were obtained for a NCFSK radio with Manchester encoding and a frame size of 100 bytes. Different modulations, encodings and packet sizes do not have a significant impact on $\Gamma$, and the results are not presented due to space constraints. Some of these results are available in [61].

Figure 3.5: cdfs for packet reception rate for receivers in different regions (disconnected, transitional, connected) in a specific environment ($\eta = 3$, $\sigma = 3$).
Figure 3.4 (a) depicts analytically the behavior of a perfect threshold receiver in a real channel, and Figure 3.4 (b) shows an instance of the link behavior. Notice that in this hypothetical scenario the transitional region would consist only of 0/1 links.

The model also provides the cumulative distribution function (cdf) of the packet reception rate as a function of distance. According to equation 3.5:

$$
F(\psi) = p(\Psi < \psi) = p(\Upsilon < \Psi^{-1}(\psi)) = 1 - Q\left(\frac{\Psi^{-1}(\psi) - \mu(d)}{\sigma}\right)
\tag{3.10}
$$

Where $\mu(d)$ is the average SNR decay (equation 3.4). Figure 3.5 shows an example of the cumulative distribution $F(\psi)$ for $\eta = 3$ and $\sigma = 3$. Three different transmitter-receiver distances are shown: end of connected region, middle of transitional and beginning of disconnected region. We can notice that, independent of the region where the receiver is, the link has a higher probability of being either good or bad (above 0.9 or below 0.1 PRR) than of being unreliable (between 0.9 and 0.1). For instance, in the middle of the transitional region the link has a 30% probability of being unreliable; and the probability of observing unreliable links at the end of the connected region or at the beginning of the disconnected region is small (< 5%). Empirical measurements in [11, 61] agree with the analytical cdf in equation 3.10.

It is important to remark that the obtained cdfs are valid only for the scope of this work (static and low-dynamic environments); highly dynamic environments add a new dimension of time to the cdfs.

3.2.3 Expectation and Variance of Packet Reception Rate

Even though a longer distance does not necessarily imply a lower packet reception rate, the expected value of the packet reception rate does decrease monotonically with distance in a given propagation direction (footnote 4). In this subsection, we present approximate expressions for the expectation and variance of the packet reception rate $\Psi$.
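The cdf of equation 3.10 in the previous subsection can be sketched in a few lines of Python. The sketch assumes the textbook NCFSK bit-error rate $\frac{1}{2}e^{-\gamma/2}$, NRZ encoding and a hypothetical 800-bit frame, for which $\Psi^{-1}$ has a closed form; the mean SNR $\mu(d)$ is passed in directly instead of being computed from channel parameters. The function names and the three example mean SNR values are illustrative.

```python
import math
from statistics import NormalDist

FRAME_BITS = 800  # assumption: 100-byte frame with NRZ encoding


def snr_db_for_prr(psi, frame_bits=FRAME_BITS):
    """Closed-form inverse of the PRR for the assumed NCFSK bit-error rate."""
    ber = 1.0 - psi ** (1.0 / frame_bits)        # from PRR = (1 - BER)^f
    if not 0.0 < ber < 0.5:
        raise ValueError("psi is outside the invertible range")
    snr_linear = -2.0 * math.log(2.0 * ber)      # from BER = 0.5 * exp(-snr / 2)
    return 10.0 * math.log10(snr_linear)


def prr_cdf(psi, mean_snr_db, sigma):
    """Equation 3.10: F(psi) = 1 - Q((Psi^-1(psi) - mu(d)) / sigma)."""
    return NormalDist(mean_snr_db, sigma).cdf(snr_db_for_prr(psi))


def prob_unreliable(mean_snr_db, sigma, low=0.1, high=0.9):
    """Probability that the PRR of a link falls between `low` and `high`."""
    return prr_cdf(high, mean_snr_db, sigma) - prr_cdf(low, mean_snr_db, sigma)


sigma = 3.0
mid_snr = 0.5 * (snr_db_for_prr(0.1) + snr_db_for_prr(0.9))
for label, mu in [("high mean SNR", mid_snr + 10.0),
                  ("mean SNR at the receiver threshold", mid_snr),
                  ("low mean SNR", mid_snr - 10.0)]:
    print(f"{label}: P(0.1 < PRR < 0.9) = {prob_unreliable(mu, sigma):.3f}")
```

Because the receiver threshold is narrow compared to $\sigma$, most of the probability mass lies above 0.9 or below 0.1 PRR in all three cases, in line with the discussion of Figure 3.5.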
These expressions are important because they confirm mathematically that the transitional region has a higher variability in PRR than the connected region.

First we present the general expressions for the expectation and variance in equations 3.11 and 3.12. These expressions depend on the PRR versus SNR function (receiver response given in equation 3.5) and the probability density function (pdf) of the SNR for a given distance $d$ (Gaussian in dB, that is, log-normally distributed in linear units). Given the mathematical complexity of dealing with the receiver response and the pdf, we derive approximate expressions for the expectation and variance of the packet reception rate $\Psi$.

Footnote 4: The radio model used in this work is isotropic, but this is not true of practical antennas. By linearity of expectation, since $E[\Psi_a(d)]$ is monotonic with distance for a given propagation direction $a$, it can be shown that the expected PRR averaged over all angles is also monotonic with distance; however, it should be kept in mind that expected PRR values at different angles may show non-distance-monotonic behavior with respect to each other.

In general, the first two moments of $\Psi$ are defined by:

$$E[\Psi] = \int_{-\infty}^{\infty} \Psi(\gamma_{dB})\, f(\gamma_{dB}, d)\, d\gamma_{dB} \tag{3.11}$$

$$E[\Psi^2] = \int_{-\infty}^{\infty} \Psi^2(\gamma_{dB})\, f(\gamma_{dB}, d)\, d\gamma_{dB} \tag{3.12}$$

Where $f(\gamma_{dB}, d)$ represents the pdf of the SNR (a Gaussian random variable with parameters $\mu(d)$ and $\sigma$). The sharp thresholds of $\Psi$ and $\Psi^2$ permit linear approximations:

$$
\Psi(\gamma) \approx \Psi_L(\gamma) =
\begin{cases}
0, & \gamma \le \gamma_{0e} \\
m_e \gamma + b_e, & \gamma_{0e} < \gamma < \gamma_{1e} \\
1, & \gamma \ge \gamma_{1e}
\end{cases}
\tag{3.13}
$$

$$
\Psi^2(\gamma) \approx \Psi^2_L(\gamma) =
\begin{cases}
0, & \gamma \le \gamma_{0v} \\
m_v \gamma + b_v, & \gamma_{0v} < \gamma < \gamma_{1v} \\
1, & \gamma \ge \gamma_{1v}
\end{cases}
\tag{3.14}
$$

Where $m_e$, $m_v$ and $b_e$, $b_v$ are the slopes and y-intercepts of the linear approximations $\Psi_L$ and $\Psi^2_L$, and $\gamma$ is in dB. Figure 3.6 shows the approximation procedure for $\Psi_L$; the procedure for $\Psi^2_L$ is similar. The mechanism to obtain the slopes, y-intercepts and limit points of equations 3.13 and 3.14 is presented later.
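Before turning to the approximations, equations 3.11 and 3.12 can also be evaluated by direct numerical integration, which gives a reference against which any approximation can be compared. The Python sketch below uses a trapezoidal rule; the NCFSK bit-error rate $\frac{1}{2}e^{-\gamma/2}$, the 800-bit frame, the truncation of the integration range to $\mu \pm 8\sigma$ and the number of steps are all assumptions of this sketch.

```python
import math
from statistics import NormalDist

FRAME_BITS = 800  # assumption: 100-byte frame with NRZ encoding


def prr_from_snr_db(snr_db, frame_bits=FRAME_BITS):
    """Equation 3.5 with the assumed NCFSK bit-error rate."""
    snr_linear = 10.0 ** (snr_db / 10.0)
    ber = 0.5 * math.exp(-snr_linear / 2.0)
    return (1.0 - ber) ** frame_bits


def prr_moments(mean_snr_db, sigma, steps=4000, span=8.0):
    """Trapezoidal evaluation of equations 3.11 and 3.12.

    The infinite range is truncated to mean +/- span * sigma, which is an
    approximation introduced by this sketch. Returns (E[Psi], Var[Psi]).
    """
    snr = NormalDist(mean_snr_db, sigma)
    lo = mean_snr_db - span * sigma
    h = 2.0 * span * sigma / steps
    first = second = 0.0
    for i in range(steps + 1):
        g = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0   # trapezoid end points
        p = prr_from_snr_db(g)
        density = snr.pdf(g)
        first += weight * p * density * h
        second += weight * p * p * density * h
    return first, second - first ** 2


for mu in (30.0, 11.15, -5.0):   # hypothetical mean SNR values in dB
    mean, var = prr_moments(mu, sigma=3.0)
    print(f"mu = {mu:6.2f} dB: E[PRR] = {mean:.3f}, Var[PRR] = {var:.3f}")
```

The variance is largest when the mean SNR sits near the receiver threshold and close to zero far above or below it, which is the behavior the approximate expressions are meant to capture.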
Figure 3.6: Linear approximation of receiver threshold and Gaussian SNR (PRR versus SNR in dB); the mean of the Gaussian depends on the transmitter-receiver distance.

The linear models lead to the following approximations of equations 3.11 and 3.12:

$$
E[\Psi] \approx \int_{\gamma_{0e}}^{\infty} \Psi_L(\gamma_{dB})\, f(\gamma_{dB}, d)\, d\gamma_{dB}
= \int_{\gamma_{0e}}^{\gamma_{1e}} (m_e \gamma + b_e)\, f(\gamma_{dB}, d)\, d\gamma_{dB} + Q\left(\frac{\gamma_{1e} - \mu(d)}{\sigma}\right)
\tag{3.15}
$$

$$
E[\Psi^2] \approx \int_{\gamma_{0v}}^{\infty} \Psi^2_L(\gamma_{dB})\, f(\gamma_{dB}, d)\, d\gamma_{dB}
= \int_{\gamma_{0v}}^{\gamma_{1v}} (m_v \gamma + b_v)\, f(\gamma_{dB}, d)\, d\gamma_{dB} + Q\left(\frac{\gamma_{1v} - \mu(d)}{\sigma}\right)
\tag{3.16}
$$

In the above approximations $f(\gamma_{dB}, d)$ is evaluated separately on intervals $[\gamma_{0e}, \gamma_{1e}]$ and $[\gamma_{0v}, \gamma_{1v}]$ for $E[\Psi]$ and $E[\Psi^2]$, respectively. Both intervals represent the linear approximations of the sharp thresholds of $\Psi$ and $\Psi^2$, and these thresholds are narrow compared to the $[\mu - 4\sigma, \mu + 4\sigma]$ domain of $f(\gamma_{dB}, d)$ (footnote 5); hence, linear approximations can be used as well for $f(\gamma_{dB}, d)$ in $[\gamma_{0e}, \gamma_{1e}]$ and $[\gamma_{0v}, \gamma_{1v}]$. Now, let us denote $f_{\Psi}(\gamma_{dB}, d)$ and $f_{\Psi^2}(\gamma_{dB}, d)$ as the linear approximations of $f(\gamma_{dB}, d)$ for intervals $[\gamma_{0e}, \gamma_{1e}]$ and $[\gamma_{0v}, \gamma_{1v}]$:

$$f_{\Psi}(\gamma_{dB}, d) = m_{ge} \gamma + b_{ge} \tag{3.17}$$

$$f_{\Psi^2}(\gamma_{dB}, d) = m_{gv} \gamma + b_{gv} \tag{3.18}$$

Where:

$$
m_{ge} = \frac{f(\gamma_{1e}, d) - f(\gamma_{0e}, d)}{\gamma_{1e} - \gamma_{0e}}
\qquad
b_{ge} = \frac{f(\gamma_{0e}, d)\,\gamma_{1e} - f(\gamma_{1e}, d)\,\gamma_{0e}}{\gamma_{1e} - \gamma_{0e}}
$$

$$
m_{gv} = \frac{f(\gamma_{1v}, d) - f(\gamma_{0v}, d)}{\gamma_{1v} - \gamma_{0v}}
\qquad
b_{gv} = \frac{f(\gamma_{0v}, d)\,\gamma_{1v} - f(\gamma_{1v}, d)\,\gamma_{0v}}{\gamma_{1v} - \gamma_{0v}}
$$

Footnote 5: While the domain of a Gaussian random variable is $[-\infty, +\infty]$, the interval $[\mu - 4\sigma, \mu + 4\sigma]$ contains most of the probability space (0.999), and it is wide compared to the sharp threshold of the receiver for common values of $\sigma$ ([54]).

Figure 3.6 shows the approximation procedure for $f_{\Psi}(\gamma_{dB}, d)$ (Gaussian SNR curve for $E[\Psi]$); the procedure for $f_{\Psi^2}(\gamma_{dB}, d)$ is similar.
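Combining the two linearizations (equations 3.13 and 3.17) turns the integral in equation 3.15 into the integral of a product of two straight lines, which has a simple polynomial antiderivative. The Python sketch below works this out for $E[\Psi]$. It assumes the textbook NCFSK bit-error rate $\frac{1}{2}e^{-\gamma/2}$ with a hypothetical 800-bit frame, and it fits $\Psi_L$ through the points where the PRR equals 0.1 and 0.9, which is one possible choice of fitting points; all names are illustrative.

```python
import math
from statistics import NormalDist

FRAME_BITS = 800  # assumption: 100-byte frame with NRZ encoding


def snr_db_for_prr(psi, frame_bits=FRAME_BITS):
    """Closed-form inverse of the PRR for the assumed NCFSK bit-error rate."""
    ber = 1.0 - psi ** (1.0 / frame_bits)
    return 10.0 * math.log10(-2.0 * math.log(2.0 * ber))


def linear_expected_prr(mean_snr_db, sigma):
    """Approximate E[Psi] from equation 3.15 with both curves linearized."""
    # Line Psi_L through (g_a, 0.1) and (g_b, 0.9), as in equation 3.13.
    g_a, g_b = snr_db_for_prr(0.1), snr_db_for_prr(0.9)
    m_e = (0.9 - 0.1) / (g_b - g_a)
    b_e = (0.1 * g_b - 0.9 * g_a) / (g_b - g_a)
    g0, g1 = -b_e / m_e, (1.0 - b_e) / m_e      # where Psi_L reaches 0 and 1

    # Line through the Gaussian pdf at g0 and g1, as in equation 3.17.
    snr = NormalDist(mean_snr_db, sigma)
    f0, f1 = snr.pdf(g0), snr.pdf(g1)
    m_g = (f1 - f0) / (g1 - g0)
    b_g = (f0 * g1 - f1 * g0) / (g1 - g0)

    def antiderivative(g):
        # Integral of (m_e g + b_e)(m_g g + b_g) with respect to g.
        return (m_e * m_g * g ** 3 / 3.0
                + (b_e * m_g + b_g * m_e) * g ** 2 / 2.0
                + b_e * b_g * g)

    tail = 1.0 - snr.cdf(g1)                     # Q((g1 - mu) / sigma)
    return antiderivative(g1) - antiderivative(g0) + tail


mid_snr = 0.5 * (snr_db_for_prr(0.1) + snr_db_for_prr(0.9))
for mu in (mid_snr + 20.0, mid_snr, mid_snr - 20.0):
    print(f"mu = {mu:6.2f} dB: E[PRR] ~ {linear_expected_prr(mu, 3.0):.3f}")
```

The same steps apply to $E[\Psi^2]$ after replacing the fitting points and limits with those of $\Psi^2_L$.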
Finally, based on equations 3.15 and 3.16, the first and second moment approximations of the packet reception rate are given by:

$$
\begin{aligned}
E[\Psi] &\approx \int_{\gamma_{0e}}^{\gamma_{1e}} (m_e \gamma + b_e)\, f_{\Psi}(\gamma_{dB}, d)\, d\gamma_{dB} + Q\left(\frac{\gamma_{1e} - \mu(d)}{\sigma}\right) \\
&= \int_{\gamma_{0e}}^{\gamma_{1e}} (m_e \gamma + b_e)(m_{ge} \gamma + b_{ge})\, d\gamma_{dB} + Q\left(\frac{\gamma_{1e} - \mu(d)}{\sigma}\right) \\
&= \left. \left( m_e m_{ge} \frac{\gamma^3}{3} + (b_e m_{ge} + b_{ge} m_e) \frac{\gamma^2}{2} + b_e b_{ge} \gamma \right) \right|_{\gamma_{0e}}^{\gamma_{1e}} + Q\left(\frac{\gamma_{1e} - \mu(d)}{\sigma}\right)
\end{aligned}
\tag{3.19}
$$

$$
\begin{aligned}
E[\Psi^2] &\approx \int_{\gamma_{0v}}^{\gamma_{1v}} (m_v \gamma + b_v)\, f_{\Psi^2}(\gamma_{dB}, d)\, d\gamma_{dB} + Q\left(\frac{\gamma_{1v} - \mu(d)}{\sigma}\right) \\
&= \int_{\gamma_{0v}}^{\gamma_{1v}} (m_v \gamma + b_v)(m_{gv} \gamma + b_{gv})\, d\gamma_{dB} + Q\left(\frac{\gamma_{1v} - \mu(d)}{\sigma}\right) \\
&= \left. \left( m_v m_{gv} \frac{\gamma^3}{3} + (b_v m_{gv} + b_{gv} m_v) \frac{\gamma^2}{2} + b_v b_{gv} \gamma \right) \right|_{\gamma_{0v}}^{\gamma_{1v}} + Q\left(\frac{\gamma_{1v} - \mu(d)}{\sigma}\right)
\end{aligned}
\tag{3.20}
$$

In general, the parameters of $\Psi_L$ and $\Psi^2_L$ (slopes, y-intercepts and limit points of equations 3.13 and 3.14) can be obtained by curve-fitting $\Psi$ and $\Psi^2$ through least squares regression techniques; nevertheless, our studies suggest that choosing a line that passes through points A and B with PRRs of 0.1 and 0.9 provides an accurate approximation (footnote 6). Hence, A and B defined as $(\Psi^{-1}(0.1), 0.1)$ and $(\Psi^{-1}(0.9), 0.9)$ can be used to obtain the different parameters of $\Psi_L$:

$$
m_e = \frac{0.9 - 0.1}{\gamma_B - \gamma_A}
\qquad
b_e = \frac{0.1\,\gamma_B - 0.9\,\gamma_A}{\gamma_B - \gamma_A}
\qquad
\gamma_{0e} = \frac{-b_e}{m_e}
\qquad
\gamma_{1e} = \frac{1 - b_e}{m_e}
$$

Where $\gamma_A = \Psi^{-1}(0.1)$ and $\gamma_B = \Psi^{-1}(0.9)$, both in dB. For $\Psi^2_L$, points A and B are $(\Psi^{-1}(\sqrt{0.1}), 0.1)$ and $(\Psi^{-1}(\sqrt{0.9}), 0.9)$.

Footnote 6: Actually, no significant differences were found if points A and B are chosen in intervals [0.01, 0.2] and [0.8, 0.99], respectively.

Figure 3.7: Comparison of $E[\Psi]$ and $Var[\Psi]$ with their linear approximations, $E[\Psi_L]$ and $Var[\Psi_L]$ (PRR versus distance in m).

Figure 3.7 shows an example of numerically calculated curves for the expectation and variance (from equations 3.11 and 3.12), and their approximations through equations 3.19 and 3.20 for $\eta = 3$ and $\sigma = 3$. In general, the error depends on the parameters of $f(\gamma_{dB}, d)$ (pdf of SNR). The smaller $\sigma$, the larger the error because the width of the
receiver threshold starts to be comparable with the width of the bell of the Gaussian curve, which leads to a less accurate linearization. However, for common values of $\sigma$ [54] the bell is significantly wider than the receiver threshold and the approximation errors are not significant. Also, while the expectation decreases monotonically with distance, the variance has a bell shape whose maximum lies in the transitional region; this behavior agrees with empirical observations in [57].

3.2.4 Comparison With Available Link Models

Some popular wireless network simulators [40, 27] and recent studies [57] have used a Gaussian random variable to represent the packet reception rate. The PRR function based on the Gaussian model ($\Psi_G$) has the following form:

$$
\Psi_G =
\begin{cases}
1, & X > 1 \\
X, & 0 \le X \le 1 \\
0, & X < 0
\end{cases}
\tag{3.21}
$$

Where $X$ is a Gaussian random variable with parameters $\mu = E[\Psi]$ and $\sigma^2 = Var(\Psi)$. The Gaussian model leads to the following cdf $F_G$:

$$
F_G(\psi) =
\begin{cases}
1 - Q\left(\frac{-E[\Psi]}{\sqrt{V[\Psi]}}\right), & \psi = 0 \\
1 - Q\left(\frac{\psi - E[\Psi]}{\sqrt{V[\Psi]}}\right), & 0 < \psi < 1 \\
1, & \psi = 1
\end{cases}
\tag{3.22}
$$

Figure 3.8: Comparison of cdfs between the Gaussian model (black curves) and our analytical model (dotted curves) for receivers in different regions.

Figure 3.8 shows a comparison between the cdfs of the Gaussian model (equation 3.22) and our analytical model (equation 3.10) for receivers in the connected, transitional and disconnected regions. Contrary to the analytical cdf, where links have a higher probability of being either good or bad (above 0.9 or below 0.1 PRR), the Gaussian model leads to links that have a high probability of being between 0.9 and 0.1: 60% for the node in the transitional region and 40% for the node in the connected region, which may lead to misleading results in protocol testing. The results shown are for $\eta = 3$, $\sigma = 3$ and a non-coherent FSK radio, but similar trends are obtained for different parameters.
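The cdf of the Gaussian model (equation 3.22) is simple enough to evaluate directly. The Python sketch below computes the probability mass that this model places between 0.1 and 0.9 PRR for a given mean and variance; the numeric mean and variance in the example are hypothetical values standing in for a receiver in the transitional region, not figures taken from this work.

```python
from statistics import NormalDist


def gaussian_model_cdf(psi, mean_prr, var_prr):
    """Equation 3.22: cdf of the Gaussian PRR model clipped to [0, 1]."""
    if psi >= 1.0:
        return 1.0
    if psi < 0.0:
        return 0.0          # outside the range of a PRR value
    return NormalDist(mean_prr, var_prr ** 0.5).cdf(psi)


def gaussian_prob_unreliable(mean_prr, var_prr, low=0.1, high=0.9):
    """Probability that the Gaussian model yields a PRR between low and high."""
    return (gaussian_model_cdf(high, mean_prr, var_prr)
            - gaussian_model_cdf(low, mean_prr, var_prr))


# Hypothetical moments for a receiver in the transitional region.
share = gaussian_prob_unreliable(mean_prr=0.5, var_prr=0.2)
print(f"P(0.1 < PRR < 0.9) under the Gaussian model = {share:.2f}")
```

With moments of this kind the Gaussian model places well over half of the links between 0.1 and 0.9 PRR, whereas the analytical cdf of equation 3.10 concentrates the mass near 0 and 1.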
3.3 Impact of Hardware Variance

In the previous section it was assumed that all radios have the same output power $P_t$ and noise floor $P_n$; however, hardware variance induces some fluctuation around the output power set by the user and around the average noise floor. This variance problem is partially solved during the manufacturing process, where radios with a low output power and/or a high noise floor (low sensitivity) are usually discarded. However, no upper bound is used in the filtering process and hardware variance remains a problem. As stated in [45]: This filtering process is justifiable, since radios that are more powerful or more sensitive are generally desirable.

Hardware variance has already been identified as the cause of asymmetric links [25]. In this section, we not only quantify the effect of hardware variance on link asymmetry, but we also show that hardware variance can have a significant impact on the extent of the transitional region. It is important to notice that while the output power variance can be calibrated to the same value for all radios, the noise floor variance cannot be eliminated through calibration since it depends on the thermal noise generated by the underlying solid state structure.

3.3.1 Model

Hardware variance causes Gaussian distributions (in dB) in the output power and noise floor [45]. In order to capture these effects let us redefine equation 3.3 by denoting $SNR_{AB}$ as the signal-to-noise ratio measured at B for the output power of A; then $SNR_{AB}$ ($\Upsilon_{AB}$) is given by:

$$
\Upsilon_{AB} = P_{tA} - PL(d) - P_{nB} = \mathcal{N}(P_t, \sigma_{tx}) - PL(d) - \mathcal{N}(P_n, \sigma_{rx})
\tag{3.23}
$$

Where $\sigma^2_{tx}$ and $\sigma^2_{rx}$ are the variances of the output power and the noise floor respectively, and $PL(d) = PL(d_0) + 10\,\eta \log_{10}\left(\frac{d}{d_0}\right) + \mathcal{N}(0, \sigma)$ is the channel path loss (which is identical in both directions: $A \to B$ and $B \to A$).

Empirical measurements (Section 3.4) show that there is some correlation between the output power and noise floor within the same radio.
Our model captures this correlation by representing the output power and noise floor as a multivariate Gaussian distribution, as shown below:

\[
\begin{pmatrix} T \\ R \end{pmatrix}
\sim
N\left(
\begin{pmatrix} P_t \\ P_n \end{pmatrix},
\begin{pmatrix} S_T & S_{TR} \\ S_{RT} & S_R \end{pmatrix}
\right)
\tag{3.24}
\]

where $P_t$ is the nominal output power, $P_n$ is the average noise floor, $S$ is the covariance matrix between the output power and noise floor, and $T$ and $R$ are the actual output power and noise floor of a specific radio, respectively.

3.3.2 Impact on Asymmetric Links

When the output power level of all the nodes is set to the same value, radios with identical non-variant hardware ($\sigma_{tx} = 0$, $\sigma_{rx} = 0$) lead to the same SNR in both directions ($\Upsilon_{AB} = \Upsilon_{BA}$ according to equation 3.23), which in turn leads to the same packet reception rate (symmetric links). For radios with hardware variance, $\Upsilon_{AB}$ can be different from $\Upsilon_{BA}$. Figure 3.9 shows the effect of $\Upsilon_{AB} - \Upsilon_{BA}$ on link asymmetry. Due to the sharp threshold of the receiver, a small value of $\Upsilon_{AB} - \Upsilon_{BA}$ (about 3.2 dB) may lead to significantly different packet reception rates in the two directions (1.0 and 0.4). $\Upsilon_{AB} - \Upsilon_{BA}$ is a random variable, and the larger the variance of this difference, the higher the probability of link asymmetry. In order to quantify the impact of hardware variance on link asymmetry, we analyze the variance of $(\Upsilon_{AB} - \Upsilon_{BA})$.

Figure 3.9: Impact of hardware variance on asymmetric links. [Plot: PRR versus SNR (dB) for $\Upsilon_{AB}$ and $\Upsilon_{BA}$; a 3.2 dB gap marks the links vulnerable to asymmetry.]

Letting $(T_A, R_A)$ and $(T_B, R_B)$ be the respective output power and noise floor of radios A and B, then:

\[
\Upsilon_{AB} - \Upsilon_{BA} = (T_A - PL(d) - R_B) - (T_B - PL(d) - R_A) = (T_A + R_A) - (T_B + R_B)
\tag{3.25}
\]

$(T_A + R_A)$ and $(T_B + R_B)$ are Gaussian random variables representing the sum of the output power and noise floor of different radios (A and B), and can be assumed to be independent⁷.
$(T_A + R_A)$ and $(T_B + R_B)$ are generated from the same multivariate Gaussian distribution and can be represented by $(T + R)$; hence, $Var(\Upsilon_{AB} - \Upsilon_{BA}) = 4 \times Var(T + R)$⁸, and:

\[
\begin{aligned}
Var(T + R) &= E[(T + R)^2] - E^2[T + R] \\
&= E[T^2] - E^2[T] + E[R^2] - E^2[R] + 2\,(E[TR] - E[T]E[R]) \\
&= Var(T) + Var(R) + 2\,Cov(T, R) \\
&= S_T + S_R + 2 S_{TR}
\end{aligned}
\tag{3.26}
\]

which leads to:

\[
Var(\Upsilon_{AB} - \Upsilon_{BA}) = 4\,(S_T + S_R + 2 S_{TR})
\tag{3.27}
\]

where $S_T$, $S_R$ and $S_{TR}$ are elements of the covariance matrix in equation 3.24.

Equation 3.27 shows that a positive correlation (positive $S_{TR}$) between the output power and noise floor of a radio leads to a high variance of $\Upsilon_{AB} - \Upsilon_{BA}$ (higher probability of link asymmetry), while a negative correlation (negative $S_{TR}$) reduces the variance (lower probability of link asymmetry). Notice that a negative correlation implies that nodes with output powers higher than $P_t$ (better transmitter) will usually have a noise floor lower than $P_n$ (better receiver), and vice versa.

⁷ The manufacturing process can create some correlation among different radios if different batches are produced from special high (low) quality materials, but we assume that all radios belong to the same process.

⁸ This is derived from the facts that for a random variable $X$, $Var(X) = Var(-X)$; and for i.i.d. random variables $X_i$, $Var(\sum_i X_i) = \sum_i Var(X_i)$.

Hence, a negative correlation between the output power and noise floor leads to the lowest probability of link asymmetry, followed by zero correlation and then positive correlation.

3.3.3 Impact on Extent of Transitional Region

In equation 3.3, the randomness of the SNR was due uniquely to multi-path effects, but the variance of the output power and noise floor introduces two other sources of randomness. The combined effect of output power variance, channel multi-path and noise floor variance led to a new expression for the SNR (equation 3.23). Based on this equation the SNR $\Upsilon$ is given by:

\[
\Upsilon = N(P_t, \sigma_{tx}) - PL(d) - N(P_n, \sigma_{rx}) = N(P_t - P_n, \sigma_{hw}) - PL(d)
\tag{3.28}
\]

where $\sigma_{hw}^2 = \sigma_{tx}^2 + \sigma_{rx}^2$.
Finally, $\Upsilon$ is given by:

\[
\Upsilon = N(P_t - P_n, \sigma_{hw}) - \overline{PL}(d) + N(0, \sigma_{ch}) = N(P_t - \overline{PL}(d) - P_n, \sigma_t)
\tag{3.29}
\]

where $\overline{PL}(d) = PL(d_0) + 10\,\eta \log_{10}(d/d_0)$, and the total variance of the system ($\sigma_t^2$) is given by:

\[
\sigma_t^2 = \sigma_{ch}^2 + \sigma_{tx}^2 + \sigma_{rx}^2 = \sigma_{ch}^2 + \sigma_{hw}^2
\tag{3.30}
\]

The hardware variance generates a pseudo-path loss variance ($\sigma_{hw}$). Equation 3.9 showed that the larger the variance, the larger the extent of the transitional region; hence, radios with hardware variance will always increase the extent of the transitional region. To obtain accurate results for the extent of the transitional region, $\sigma$ should be replaced by $\sigma_t$ in all corresponding equations in Section 3.2.

The impact of hardware variance on the extent of the transitional region can be observed in Figure 3.10, which presents simulated link qualities for $\eta = 3$, $\sigma_{ch} = 3$ and $\sigma_{hw} = 3$. Figure 3.10 (a) shows the transitional region when invariant hardware is placed in a real channel (effect of $\sigma_{ch}^2$). Figure 3.10 (b) presents a hypothetical scenario where variant hardware is placed in an ideal channel (no multi-path effects); even in the absence of multi-path effects, a transitional region is observed due to the pseudo-variance $\sigma_{hw}^2$. Finally, Figure 3.10 (c) presents the combined effects of $\sigma_{ch}$ and $\sigma_{hw}$, showing a larger transitional region than in Figures 3.10 (a) and (b).

Figure 3.10: Impact of channel multi-path and hardware variance on extent of transitional region, (a) Impact of channel multi-path (real channel + identical non-variant hardware), (b) Impact of hardware variance (ideal channel + hardware variance), (c) Combined impact of channel multi-path and hardware variance. [Plots: PRR versus distance (m), 0 to 20 m, for panels (a), (b) and (c).]
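Equations 3.29 and 3.30 can be sketched in a few lines. The block below is only an illustration under stated assumptions: the function names are hypothetical, and the numeric inputs in the example ($\eta = 3$, $\sigma_{ch} = 3$, $\sigma_{hw} = 3$, $PL(d_0) = 55$ dB at $d_0 = 1$ m, $P_n = -105$ dBm) are borrowed from values quoted elsewhere in this chapter purely as sample inputs.

```python
import math
import random


def total_sigma(sigma_ch, sigma_tx, sigma_rx):
    """Total standard deviation of the SNR in dB (equation 3.30)."""
    return math.sqrt(sigma_ch ** 2 + sigma_tx ** 2 + sigma_rx ** 2)


def mean_snr_db(d, p_t, p_n, pl_d0, eta, d0=1.0):
    """Mean SNR in dB at distance d: P_t - mean path loss - P_n (equation 3.29)."""
    mean_path_loss = pl_d0 + 10.0 * eta * math.log10(d / d0)
    return p_t - mean_path_loss - p_n


def sample_snr_db(d, p_t, p_n, pl_d0, eta, sigma_t, rng, d0=1.0):
    """One random SNR draw: N(mean SNR, sigma_t), as in equation 3.29."""
    return rng.gauss(mean_snr_db(d, p_t, p_n, pl_d0, eta, d0), sigma_t)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, only to make the sketch repeatable
    # sigma_hw = 3 is split evenly between tx and rx here (an assumption).
    sigma_t = total_sigma(3.0, 3.0 / math.sqrt(2), 3.0 / math.sqrt(2))
    print("sigma_t =", round(sigma_t, 2))
    print(sample_snr_db(10.0, -7.0, -105.0, 55.0, 3.0, sigma_t, rng))
```

Because the hardware terms add in variance, even an ideal channel ($\sigma_{ch} = 0$) leaves a non-zero $\sigma_t$, which is why panel (b) of Figure 3.10 still shows a transitional region.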
3.4 Experiments with Motes

We now present empirical results obtained in static and low-dynamic environments to validate our analytical results on the impact of $\eta$, $\sigma_{ch}$ and $\sigma_{hw}$ on the extent of the transitional region. We will also observe that the correlation between output power and noise floor in mica2 motes is negative, which is the least damaging in terms of link asymmetry among the different correlations (positive, zero, negative).

We considered two environments: an indoor environment (aisle of a building) and an outdoor environment (football field). All the measurements were made using mica2 motes. These devices use non-coherent FSK modulation at 915 MHz with Manchester encoding and provide data rates of 38.4 Kbaud.

3.4.1 Channel and Radio Parameters

Channel: Two motes were used to measure the path loss exponent ($\eta$), the variance ($\sigma_{ch}^2$) and the initial decay $PL(d_0)$ of the channel. Table 3.2 presents the values for $\eta$ and $\sigma_{ch}$. The reference distance ($d_0$) of the log-normal model was set to 1 m and its corresponding power decay was found to be 55 dB.

Table 3.2: Empirical Channel Parameters

    environment   η (95% conf. bounds)   σ_ch (95% conf. bounds)
    outdoor       4.7 (4.3 - 5.1)        3.2 (2.6 - 3.8)
    indoor        3.3 (2.1 - 4.5)        5.5 (4.6 - 6.8)

Radio: One mote was selected as a common receiver and sender to capture the variance of the output power $P_t$ and noise floor $P_n$. The measurements were done in an isolated empty room, where each mote had the same source power and was placed at the same physical position with respect to the reference mote. Figure 3.11 presents the empirical measurements, which show a negative correlation between output power and noise floor.

Figure 3.11: Correlation between output power and noise floor. [Scatter plot: output power deviation versus noise floor deviation; quadrants labeled good tx, bad tx, good rx, bad rx.]

Table 3.3: Empirical Radio Parameters

    parameter            value (95% conf. bounds)
    output power σ_tx    2.3 (1.7 - 3.5)
    noise floor σ_rx     1.9 (1.4 - 3.1)
From our experiments, the resultant covariance matrix is given by:

\[
S = \begin{pmatrix} 6.0 & -3.3 \\ -3.3 & 3.7 \end{pmatrix}
\]

The standard deviations of the output power ($\sigma_{tx}$) and noise floor ($\sigma_{rx}$) are presented in Table 3.3; these values lead to $\sigma_{hw} = 3.0$. Different power levels were tested and all levels showed a similar variance.

The negative correlation of mica2 motes is due to several factors. Chip implementation is moving toward single-chip designs, and hence the performance of the transmitter and receiver is determined by the common underlying solid-state structure. Board implementation and antenna gains further enhance this correlation, since a common path goes from the antenna to the chip. Hence, a radio with a good solid-state structure (low thermal noise) and a high path-antenna gain will lead to higher output powers (good transmitter) and a lower noise floor (good receiver). Also, while our measurements were done in controlled scenarios, characteristics of real deployments, such as the remaining output power of batteries, may further enhance the correlation.

It is important to observe that this negative correlation leads to some nodes being both good transmitters and good receivers, which may create some cluster behavior, as observed in some empirical studies [25, 11].

The negative correlation has a direct impact on the relation between the out-degree and in-degree of nodes⁹. In Section 3.3.2 we stated that the negative correlation between the output power and noise floor leads to the lowest level of link asymmetry, which implies that the in-degree and out-degree of nodes will be more similar than for positive and zero correlations. Figure 3.12 shows simulation results for the relation between in-degree and out-degree for positive and negative correlations between the output power and noise floor. A close relation between in-degree and out-degree can be observed for the negative correlation. Figure 3.13 shows the empirical in-degree/out-degree relation of all nodes for all the tested power levels¹⁰, and we can observe that the empirical trend agrees with the simulation results. This close relation between in-degree and out-degree is highly desirable given the strong dependence that several medium access and network layer protocols have on symmetric links.

⁹ In-degree is the number of neighbors that can communicate with a specific node; out-degree is the number of neighbors that a specific node can communicate with.

¹⁰ Links were considered valid if they had a PRR above 10%. The same trend is observed for any blacklisting threshold.

Figure 3.12: Simulation results for the relation between in-degree and out-degree, (a) positive correlation, (b) negative correlation. [Scatter plots: in-degree versus out-degree, 0 to 50.]

Figure 3.13: Empirical correlation between in-degree and out-degree for different power levels, (a) indoor environment, (b) outdoor environment. [Scatter plots: in-degree versus out-degree, 0 to 20.]

Noise Floor: Given that our work does not consider interference, the noise floor can be obtained from the well-known thermal noise equation [46], which leads to a value of -115 dBm for the parameters of the radio chip [17]. However, our measurements showed that the average noise floor is approximately -105 dBm. The 10 dB difference is mainly due to losses from the output pin of the chip to the antenna, which are not considered in the thermal noise equation. These losses depend on board implementation and are beyond the scope of this work. Hence, for the model, the average noise floor $P_n$ will be set to -105 dBm.

Finally, it is important to consider that bit-error-rate expressions are usually given in terms of $E_b/N_0$ (known as SNR per bit); however, most commercially available radios provide only RSSI measurements, which can be converted to SNR per packet ($\Upsilon$).
$\Upsilon$ has a simple relation with $E_b/N_0$: $\Upsilon = \frac{E_b}{N_0}\frac{R}{B_N}$; for mica2 motes $R = 19.2$ kbps (data rate) and $B_N = 30$ kHz (noise bandwidth). Hence, all RSSI measurements can be converted to $E_b/N_0$ values.

3.4.2 Chain Topologies

For each environment, a chain topology of 21 motes was deployed with nodes spaced 1 meter apart. The frame size was 50 bytes with a preamble of 28 bytes. A simple TDMA protocol was implemented to avoid collisions. Every mote transmitted 100 packets at a rate of 5 packets/sec. Upon reception of a packet, the sequence number and the received signal strength ($P_r$) were stored; simultaneously, the noise floor was sampled. The average SNR and packet reception rate were measured for all the links in the network.

According to equation 3.30, the expected total variance is the sum of the channel and radio variances (sum of variances of Tables 3.2 and 3.3). Table 3.4 shows the expected $\sigma_t$, the measured $\sigma_t$ and the $\sigma_{ch}$. The expected $\sigma_t$ is derived from the $\sigma_{ch}$ and $\sigma_{hw}$ of Tables 3.2 and 3.3, and the measured $\sigma_t$ is obtained from the chain topology experiment. We observe that the expected and measured $\sigma_t$ are similar. It can also be observed that $\sigma_{ch}$ is smaller than $\sigma_t$ (especially for the outdoor environment), confirming that hardware variance contributes to the total variance, and consequently to the extent of the transitional region.

Table 3.4: Comparison of Total Variance and Channel Variance

    environment   Expected σ_t   Measured σ_t   Measured σ_ch
    indoor        6.3            6.1            5.5
    outdoor       4.8            5.1            3.2

Figure 3.14: Kolmogorov-Smirnov test between the empirical and model-generated packet reception rate for medium and high output powers in a grass area. [Plot: cdf versus PRR; empirical and model curves for $P_t$ = -7 dBm (D = 0.17) and $P_t$ = 5 dBm (D = 0.05).]

In order to validate our model we present two comparisons. The first one is a formal method based on the Kolmogorov-Smirnov (K-S) test, and the second one is a comparison of the packet reception rates versus distance between empirical and simulated data.
Figure 3.14 presents the cumulative distribution of the packet reception rate for the chain topology described above. Two power levels, -7 dBm and 5 dBm, are presented for the outdoor environment. For the number of links in the chain topology (420) and a significance level of 10%, the K-S table has a threshold value of 0.06. For practical purposes, let us define links with PRR above 0.9 as reliable and neglect links with PRR below 0.1¹¹; the distance D of the K-S test is then considered between 0.1 and 0.9. We observe that low-density networks such as the medium power case would not pass the test (0.17 > 0.06), but high-density networks such as the high power case pass the test (0.05 < 0.06), i.e. for the high power case both distributions (empirical and simulated) can be considered similar. It is also important to notice that the empirical data shows that most of the probability mass is either above 0.9 or below 0.1 for any output power, as was shown in Section 3.2.4. The next comparisons illustrate why low-density networks are predicted less accurately than high-density networks.

¹¹ A very reasonable assumption, considering that links below 0.1 would incur significant losses or a high number of retransmissions.

Figure 3.15: Comparison of empirical measurements and instances of analytical model, (a) Empirical indoor $P_t$ = -7 dBm, (b) Empirical outdoor $P_t$ = -7 dBm, (c) Empirical outdoor $P_t$ = 5 dBm, (d), (e) and (f) are the analytical counterparts. [Plots: PRR versus distance (m), 0 to 20 m, for panels (a) to (f).]

Figure 3.15 shows the empirical packet reception rates versus distance compared to their analytical counterparts. It can be observed that the model provides a reasonable approximation of the real behavior.
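The K-S distance used in the comparison above can be computed directly from two PRR samples. The following is a minimal sketch, not the procedure actually used for Figure 3.14: the function names are hypothetical, and restricting the evaluation points to the window [0.1, 0.9] is one possible reading of "the distance D is considered between 0.1 and 0.9".

```python
import bisect


def empirical_cdf(sorted_sample, x):
    """Fraction of observations that are <= x."""
    return bisect.bisect_right(sorted_sample, x) / len(sorted_sample)


def ks_distance(sample_a, sample_b, lo=float("-inf"), hi=float("inf")):
    """Largest gap between the two empirical cdfs, evaluated only at
    observed values that fall inside the window [lo, hi]."""
    a, b = sorted(sample_a), sorted(sample_b)
    points = [x for x in set(a) | set(b) if lo <= x <= hi]
    if not points:
        return 0.0
    return max(abs(empirical_cdf(a, x) - empirical_cdf(b, x)) for x in points)


if __name__ == "__main__":
    # Hypothetical PRR samples, for illustration only.
    measured = [0.05, 0.50, 0.95]
    simulated = [0.05, 0.95, 0.95]
    d = ks_distance(measured, simulated, lo=0.1, hi=0.9)
    print("D =", d, "passes" if d < 0.06 else "fails")
```

The 0.06 in the usage example is the threshold quoted in the text for 420 links; for other sample sizes the critical value would have to be looked up or recomputed.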
Table 3.5 shows the expected beginning and end of the transitional region according to equation 3.8 for the scenarios in Figure 3.15. We can observe that this equation provides reasonable predictions for the empirical observations. Several power levels were tested, from -20 dBm to 5 dBm in steps of 1 dB, and all power levels showed a behavior similar to the proposed model.

Table 3.5: Analytical Extent of Transitional Region

    scenario                beginning (m)   end (m)
    indoor  P_t = -7 dBm    4.8             23.4
    outdoor P_t = -7 dBm    3.4             8.1
    outdoor P_t = 5 dBm     6.1             14.6

A comparison of Figures 3.15 (b) and (e) provides a better understanding of why the K-S test fails to assess both distributions as similar. In the simulated case (Figure 3.15 (e)), the model tends to classify more links as good links than the empirical data does. For instance, the simulated data shows that all links under a distance of 2 meters have a PRR of 1.0, while the empirical data shows some unreliable links at a distance of 2 meters. Also, for distances of 5 and 6 meters some links are good (> 0.9) in the simulated data, while in the empirical data most of them are below 0.9. Since the transitional region is narrow, the disagreement on a few links leads to approximately 15% of the probability mass shifting from bad links to good links, which causes the failure of the K-S test. Nevertheless, it is also worth considering that the major disagreement is only at the extremes (good and bad links); the slope for the unreliable links is similar, showing that the empirical and modeled data closely agree on the number of unreliable links. It is important to mention that our model is not meant to be an exact replica of the environment but an approximation to it.

3.5 Summary

This chapter presented an analysis of the unreliability and asymmetry of low-power wireless links. The analysis allows us to provide analytical expressions for the boundaries of the transitional region and a systematic approach to obtain mathematical link layer models.
The analysis also allowed us to provide some important insights about the impact of channel multi-path and hardware variance on the transitional region. First, the relative size of the transitional region ($\Gamma$ coefficient) is higher for lower path loss exponents and higher variances. Second, hardware variance induces a pseudo-log-normal variance, which increases the size of the transitional region. Third, a negative correlation between the output power and noise floor leads to nodes that are both good transmitters and good receivers, which helps to explain the clustering behavior observed in previous works [25, 11]. And fourth, even with a perfect-threshold radio, the transitional region still exists as long as there are multi-path effects.

Even though the simulations and empirical validation in this chapter were based on radios using NC-FSK modulation and Manchester encoding, the model can be easily extended to other radio characteristics. Table 3.6 presents the steps required for other common modulation techniques and encoding schemes. It is important to highlight that while different modulations, encodings and packet sizes lead to different sizes of the regions, they do not significantly affect the transitional region $\Gamma$ coefficient; some of the results are available in [61].

The work presented in this chapter contributes to a better understanding of the behavior of low-power wireless links but is not exhaustive. It can be complemented with other studies to capture other important phenomena present in real scenarios; for instance, contention models from [55], temporal properties from [12] and correlations due to direction of propagation from [60].

The next two chapters use the expressions and model derived in this chapter to evaluate and enhance the performance of geographic routing and random walk-based querying mechanisms.

Table 3.6: Theoretical Models for the Link Layer

    STEP 0: Radio
        Obtain output power and noise floor for all nodes.
        Can use Cholesky decomposition to generate multivariate r.v. (equation 3.24).
        For mica2: $-20\ \mathrm{dBm} < P_t < 5\ \mathrm{dBm}$, $P_n = -105\ \mathrm{dBm}$.

    STEP 1: Channel
        Obtain channel parameters $PL(d_0)$, $n$, $\sigma$.
        Can be obtained through own empirical measurements, or from some published results [54].

    STEP 2: SNR
        Obtain SNR in dB ($\gamma_{dB}$) as a function of distance $d$:
        $\gamma_{dB}(d) = T - PL(d_0) - 10\,n \log_{10}(d/d_0) - N(0, \sigma) - R$

    STEP 3: Modulation
        Select modulation and insert $\gamma(d)$ from the previous step, but not in dB (i.e. $\gamma = 10^{\gamma_{dB}/10}$).
        Convert from $E_b/N_0$ to RSSI by inserting the appropriate bit data rate $R$ and noise bandwidth $B_N$.
        According to the modulation, select the appropriate BER ($P_e$):
        ASK noncoherent:   $\frac{1}{2}\left[\exp\left(-\frac{\gamma(d)}{2}\frac{B_N}{R}\right) + Q\left(\sqrt{\gamma(d)\frac{B_N}{R}}\right)\right]$
        ASK coherent:      $Q\left(\sqrt{\frac{\gamma(d)}{2}\frac{B_N}{R}}\right)$
        FSK noncoherent:   $\frac{1}{2}\exp\left(-\frac{\gamma(d)}{2}\frac{B_N}{R}\right)$
        FSK coherent:      $Q\left(\sqrt{\gamma(d)\frac{B_N}{R}}\right)$
        PSK binary:        $Q\left(\sqrt{2\gamma(d)\frac{B_N}{R}}\right)$
        PSK differential:  $\frac{1}{2}\exp\left(-\gamma(d)\frac{B_N}{R}\right)$

    STEP 4: Encoding
        Select the packet reception rate according to the encoding scheme, then insert the frame length $f$, the preamble length $\ell$ and the $P_e$ obtained in the previous step:
        NRZ:         $(1-P_e)^{8\ell}\,(1-P_e)^{8(f-\ell)}$
        4B5B:        $(1-P_e)^{8\ell}\,(1-P_e)^{8(f-\ell)\,1.25}$
        Manchester:  $(1-P_e)^{8\ell}\,(1-P_e)^{8(f-\ell)\,2.0}$
        SECDED:      $(1-P_e)^{8\ell}\,\left((1-P_e)^{8} + 8P_e(1-P_e)^{7}\right)^{(f-\ell)\,3.0}$

Chapter 4

Impact of Lossy Links on Geographic Routing

Geographic routing [34] is a popular mechanism that uses location information to deliver packets in multi-hop wireless networks. With this mechanism, nodes need to know only the location information of their neighbors and the location of the final destination. The low state and overhead of geographic routing make it an attractive mechanism for packet delivery in WSN.

Geographic routing commonly employs a maximum-distance greedy technique to forward packets. At each hop, packets are delivered to the neighbor geographically closest to the destination. This forwarding technique performs well in ideal conditions such as the binary model, where a link exists as long as the receiver is within the transmission range.
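To contrast the binary model with the link layer model of Chapter 3, the steps of Table 3.6 can be sketched for the NC-FSK and Manchester case. This is a minimal illustration under stated assumptions, not a reference implementation: all function names are hypothetical, the 2x2 Cholesky factor is written out by hand, and the numeric inputs in the example (covariance entries, $PL(d_0) = 55$ dB, $B_N/R = 30/19.2$, a 50-byte frame with a 28-byte preamble) are sample values quoted in Chapter 3.

```python
import math
import random


def cholesky_2x2(s_t, s_tr, s_r):
    """Lower-triangular factor of [[s_t, s_tr], [s_tr, s_r]]."""
    l11 = math.sqrt(s_t)
    l21 = s_tr / l11
    l22 = math.sqrt(s_r - l21 * l21)
    return l11, l21, l22


def sample_radio(p_t, p_n, cov, rng):
    """Step 0: draw (T, R) from the bivariate Gaussian of equation 3.24."""
    l11, l21, l22 = cholesky_2x2(*cov)
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return p_t + l11 * z1, p_n + l21 * z1 + l22 * z2


def snr_db(t, r, d, pl_d0, n, sigma, rng, d0=1.0):
    """Step 2: SNR in dB at distance d for output power t and noise floor r."""
    return t - pl_d0 - 10.0 * n * math.log10(d / d0) - rng.gauss(0.0, sigma) - r


def ber_ncfsk(snr_db_value, bn_over_r):
    """Step 3: bit error rate for non-coherent FSK."""
    gamma = 10.0 ** (snr_db_value / 10.0)
    return 0.5 * math.exp(-0.5 * gamma * bn_over_r)


def prr_manchester(p_e, frame_bytes, preamble_bytes):
    """Step 4: packet reception rate with Manchester encoding."""
    payload_bytes = frame_bytes - preamble_bytes
    return (1.0 - p_e) ** (8 * preamble_bytes) * (1.0 - p_e) ** (8 * payload_bytes * 2)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, only to make the sketch repeatable
    cov = (6.0, -3.3, 3.7)  # S_T, S_TR, S_R
    for d in (2.0, 8.0, 16.0):
        t, _ = sample_radio(-7.0, -105.0, cov, rng)   # transmitter
        _, r = sample_radio(-7.0, -105.0, cov, rng)   # receiver
        snr = snr_db(t, r, d, 55.0, 3.0, 3.0, rng)
        prr = prr_manchester(ber_ncfsk(snr, 30.0 / 19.2), 50, 28)
        print(d, round(snr, 1), round(prr, 3))
```

Unlike the binary model, the PRR produced this way varies from link to link at the same distance, because the output power, noise floor and channel terms are all random.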
However, in the previous chapter we showed that as the transmitter-receiver distance increases, the probability of encountering an unreliable link also increases. Hence, in real scenarios the distance-greedy mechanism of geographic routing is likely to select lossy links, which would degrade its performance. In this chapter we use the model presented in Chapter 3 to identify and illustrate this weak-link problem, and present optimal local forwarding techniques to maximize the energy efficiency of geographic routing.

4.1 Overview

Geographic routing is a key paradigm that is quite commonly adopted for information delivery in wireless ad-hoc and sensor networks where the location information of the nodes is available (either a priori or through a self-configuring localization mechanism). Geographic routing protocols are efficient in wireless networks for several reasons. For one, nodes need to know only the location information of their direct neighbors in order to forward packets, and hence the state stored is minimal. Further, such protocols conserve energy and bandwidth, since discovery floods and state propagation are not required beyond a single hop.

The main component of geographic routing is usually a greedy forwarding mechanism whereby each node forwards a packet to the neighbor that is closest to the destination. This can be an efficient, low-overhead method of data delivery if it is reasonable to assume (i) sufficient network density, (ii) accurate localization and (iii) high link reliability independent of distance within the physical radio range. However, while assuming highly dense sensor deployment and reasonably accurate localization may be acceptable in some classes of applications, it is clear that assumption (iii) concerning highly reliable links is unlikely to be valid in any realistic deployment. The existence of such unreliable links exposes a key weakness in greedy forwarding that we refer to as the weakest link problem.
At each step in greedy forwarding, the neighbors that are closest to the destination (also likely to be farthest from the forwarding node) may have poor links with the current node. These "weak links" would result in a high rate of packet drops, leading to a drastic reduction of the delivery rate or increased energy wastage if retransmissions are employed.

This observation brings to the fore the concept of neighbor classification based on link reliability. Some neighbors may be more favorable to choose than others, not only based on distance, but also based on loss characteristics. This suggests that a blacklisting/neighbor selection scheme may be needed to avoid "weak links". But what is the most energy-efficient forwarding strategy, and how does such a strategy draw the line between "weak" and "good" links?

We articulate the following energy trade-off between distance per hop and the overall hop count, which we simply refer to as the distance-hop energy trade-off for geographic forwarding. If the geographic forwarding scheme attempts to minimize the number of hops by maximizing the geographic distance covered at each hop (as in greedy forwarding), it is likely to incur significant energy expenditure due to retransmissions on the unreliable long weak links. On the other hand, if the forwarding mechanism attempts to maximize per-hop reliability by forwarding only to close neighbors with good links, it may cover only a small geographic distance at each hop, which would also result in greater energy expenditure due to the need for more transmission hops for each packet to reach the destination. We will show in this chapter that the optimal forwarding choice is generally to neighbors in the transitional region.

In this chapter, our goal is to study the energy and reliability trade-offs pertaining to geographic forwarding analytically under the realistic packet loss model presented in Chapter 3.
We emphasize, however, that the framework, fundamental results and conclusions of this chapter are quite robust and not limited by the specific characteristics of this model. The rest of the chapter is organized as follows. In Section 4.2, we present the scope, assumptions and metrics of our analysis. Then, we provide a mathematical analysis of the optimum distance in the presence of unreliable links in Section 4.3. A set of tunable geographic forwarding strategies is presented in Section 4.4, and in Section 4.5 we evaluate the performance of these strategies. The effectiveness of the metrics presented in our analysis is validated through experiments with motes in Section 4.6. Finally, we present a summary of the chapter in Section 4.7.

4.2 Scope, Assumptions and Metrics

The analysis of geographic routing is based on the link layer model presented in Chapter 3, and it uses parameters that resemble the radio in mica2 motes: non-coherent frequency shift keying for the modulation technique and Manchester for the encoding (Table 3.6). We will also use the expressions for the packet reception rate $\Psi$ (equation 3.5), the distance denoting the end of the transitional region $d_e$ (equation 3.8), the cumulative distribution function $F(\Psi)$ (equation 3.10) and the expectation $E[\Psi]$ (equation 3.19).

Scope: Our work presents techniques to reduce the energy consumption of geographic routing during communication events (transmission and reception of packets). Nevertheless, we should offer some caveats regarding the scope of our work. Our models do not consider other means of energy savings such as sleep/awake cycles or transmission power control¹, nor other sources of energy consumption such as processing or sensing. We also do not address MAC-layer behavior such as contention/interference. The work in this chapter is aimed at low-traffic scenarios where interference does not play a significant role. Scenarios with medium or high traffic would require a different analysis.

¹ Li et al. present an interesting extension of our work in [41], which includes power control.

Assumptions: Our analysis is based on the following assumptions:

- Nodes know the location and the link quality (PRR) of their neighbors.
- Nodes know the location of the sink (final destination).
- Nodes are deployed in a chain.

Metrics: From the end-user perspective, an efficient sensor network should provide as much data as possible utilizing as little energy as possible. Hence, in order to evaluate the energy efficiency of different strategies we use the following metrics:

- Delivery Rate ($r$): percentage of packets sent by the source that reach the sink.
- Total Number of Transmissions ($t$): total number of packets sent by the network to attain delivery rate $r$.
- Energy Efficiency ($\xi$): number of packets delivered to the sink for each unit of energy spent by the network in communication events.

$\xi$ can be derived from the delivery rate $r$ and the total number of transmissions $t$. Let $p_{src}$ be the number of packets sent by the source, and $e_{tx}$ and $e_{rx}$ the amount of energy required by a node to transmit and receive a packet². Therefore, the total amount of energy consumed by the network for each transmitted packet is constant and is given by:

\[
e_{total} = e_{tx} + e_{rx}
\tag{4.1}
\]

² The listening cost of passive neighbors (early rejection) in random deployments can be easily inserted, and the interested reader is referred to our initial work [49].

Hence, the total energy consumed due to communication events is $e_{total} \times t$, and $\xi$ is given by:

\[
\xi = \frac{p_{src}\, r}{e_{total}\, t}
\tag{4.2}
\]

Table 4.1 presents the notation used in this chapter.

4.3 Analytical Model

Given a realistic link layer model, akin to the one described in Chapter 3, our goal is to explore the distance-hop trade-off in order to maximize the energy efficiency of the network during communication events.

4.3.1 Problem Description

This sub-section describes the notation and set-up used in the analysis.
We assume that nodes are placed every $\tau$ meters in a chain topology³. A node is considered to have neighbors up to a distance of $2\lfloor d_e \rfloor$, where $d_e$ is the end of the transitional region (equation 3.8); the set of distances to the neighbors is given by $\Phi = \{\tau, 2\tau, 3\tau, \ldots, 2\lfloor d_e \rfloor\}$⁴, and the distance between source and sink is denoted by $d_{src-sink}$.

³ A non-constant distance between nodes can also be chosen. However, a constant distance $\tau$ allows a fair comparison of the different regions (connected, transitional, disconnected).

⁴ The selection of $2\lfloor d_e \rfloor$ as a "nominal range" does not affect the results of this work. Even though other distances can be considered, $2\lfloor d_e \rfloor$ was selected because from the expectation and variance of the PRR (equations 3.19 and 3.20) it can be derived that nodes beyond this distance have a small probability of having active links.

Table 4.1: Mathematical Notation

    Description                                                       Symbol
    Packet Reception Rate Parameters
    - packet reception rate (PRR) [Random Process]                    $\Psi$
    - packet reception rate for a distance d [Random Variable]        $\Psi_d$
    - cumulative distribution function of $\Psi_d$                    $F(\Psi_d)$
    - expected packet reception rate                                  $E[\Psi]$
    - a specific PRR value in the range of $\Psi$                     $\psi$
    - blacklisting threshold                                          $\psi_{th}$
    Signal to Noise Ratio Parameters
    - signal to noise ratio (SNR)                                     $\Upsilon$
    - a specific SNR value in the range of $\Upsilon$                 $\gamma$
    - SNR value corresponding to $\psi_{th}$                          $\gamma_{th}$
    Channel Parameters
    - path loss exponent                                              $\eta$
    - standard deviation                                              $\sigma$
    - output power                                                    $P_t$
    Transitional Region Parameters
    - end of transitional region                                      $d_e$
    Energy Efficiency Parameters
    - end-to-end delivery rate                                        $r$
    - end-to-end number of transmissions                              $t$
    - energy efficiency                                               $\xi$
    - energy spent by network for one transmission                    $e_{total}$
    - optimal forwarding distance                                     $d_{opt}$
    - distance between source and sink                                $d_{src-sink}$
    - number of packets transmitted by source                         $p_{src}$
    - number of hops                                                  $h$
    - set of distances to neighbors                                   $\Phi$
    - probability that distance d has the highest energy efficiency   $q_d$

Let $\xi_d$ be the random variable that denotes the energy efficiency obtained if a distance $d$ is traversed at each hop. Then the optimal forwarding distance $d_{opt}$ is the one that maximizes the expected value of $\xi_d$:

\[
d_{opt} = \arg\max_{d \in \Phi} E[\xi_d]
\tag{4.3}
\]

In the next subsections we derive optimal forwarding metrics for the ARQ and No-ARQ cases.

4.3.2 Analysis for ARQ case

We assume no a priori constraint on the maximum number of retransmissions (i.e. $\infty$ retransmissions can be performed); therefore, $r$ is equal to 1, and according to equation 4.2 the energy efficiency is given by:

\[
\xi_{ARQ} = \frac{p_{src}}{e_{total}\, t}
\tag{4.4}
\]

Letting $\Psi_d$ be the random variable representing the PRR for a transmitter-receiver distance $d$, the expected number of transmissions at each hop is $p_{src}/\Psi_d$. The number of hops $h$ is equal to $d_{src-sink}/d$; therefore, the total number of transmissions $t$ is given by:

\[
t = \frac{d_{src-sink}}{d}\,\frac{p_{src}}{\Psi_d}
\tag{4.5}
\]

Substituting $t$ in equation 4.4, we obtain the energy efficiency metric for a transmitter-receiver distance $d$:

\[
\xi_{d_{ARQ}} = \frac{d\,\Psi_d}{e_{total}\, d_{src-sink}}
\tag{4.6}
\]

$d$ is a constant for a given $\Psi_d$; therefore, the expected value of $\xi_{d_{ARQ}}$ is given by:

\[
E[\xi_{d_{ARQ}}] = \frac{d\,E[\Psi_d]}{e_{total}\, d_{src-sink}}
\tag{4.7}
\]

$e_{total}$ and $d_{src-sink}$ are constants, and an approximate expression for $E[\Psi_d]$ was derived in equation 3.19. Hence, in order to maximize the energy efficiency of systems with ARQ we need to maximize $d\,E[\Psi_d]$ (the PRR $\times$ distance product).

Next, we evaluate $E[\xi_{d_{ARQ}}]$ for all $d \in \Phi$ (equation 4.3). Unfortunately, the computation of $E[\Psi_d]$ involves the Q function (tail integral of the Gaussian distribution), for which no closed-form expressions are known. Hence, we evaluate equation 4.3 numerically.

Figure 4.1: Impact of channel multi-path on $E[\xi_{d_{ARQ}}]$, (a) impact of path loss exponent $\eta$, (b) impact of channel variance $\sigma$. [Plots: $d \times E[\Psi_d]$ versus normalized distance; (a) curves for $\eta = 3$ and $\eta = 4$, (b) curves for $\sigma = 3$ and $\sigma = 5$.]

Figures 4.1 (a) and (b) depict the impact of channel multi-path on $E[\xi_{d_{ARQ}}]$.
In these figures the following parameters were used as the basis of comparison: $\tau = 1$ m, $\eta = 3$, $\sigma = 3$, $P_t = -10$ dBm and $f = 100$. The x-axis represents the transmitter-receiver distance $d$ normalized with respect to the end of the transitional region, which is approximately 20 meters for the parameters given above. The beginning and end of the transitional region are depicted by vertical lines.

Figure 4.1 (a) presents the impact of the path loss exponent $\eta$. We observe that for a higher $\eta$ the optimal forwarding distance shifts left. This is because for a higher path loss exponent the received signal strength decays faster, which in turn reduces the packet reception rate. Figure 4.1 (b) presents the impact of the channel variance $\sigma$. In this case the optimal forwarding distance shifts right, because a higher $\sigma$ increases the probability of finding good links farther away from the sender but also decreases the probability of finding good links close to the sender. In both figures it is important to highlight that the optimal forwarding distance lies in the transitional region, showing the distance-hop energy trade-off.

Figure 4.2: Energy efficiency metric for the ARQ case. The transitional region often has links with good performance as per this metric. (Samples of $X_d$ for ARQ plotted against distance in meters, with the connected and transitional regions marked.)

In real environments the packet reception rate takes an instance of the r.v. $\Psi_d$; hence, the optimal local forwarding metric for a node is the one that maximizes the product of the PRR of the link and the distance to the neighbor. Figure 4.2 shows simulations for the PRR×d metric in a line topology, where for each neighbor the PRR obtained was multiplied by its distance. It can be observed that nodes in the transitional region usually have the highest value for this metric.
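The local rule described above (pick the neighbor that maximizes the PRR×distance product) can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, the `(distance, prr)` tuple layout and the sample measurements are assumptions made for the example, not values taken from this work.

```python
from typing import Optional, Sequence, Tuple

# Each neighbor is a (distance_m, prr) pair; this layout is an assumption
# made for the sketch, not a structure defined in the text.
Neighbor = Tuple[float, float]


def best_prr_distance_neighbor(neighbors: Sequence[Neighbor]) -> Optional[Neighbor]:
    """Return the neighbor with the largest PRR x distance product."""
    candidates = [(d, prr) for d, prr in neighbors if d > 0 and prr > 0]
    if not candidates:
        return None
    return max(candidates, key=lambda n: n[0] * n[1])


# Hypothetical measurements for a chain with 1 m spacing.
neighbors = [(1.0, 1.0), (5.0, 0.98), (12.0, 0.90), (16.0, 0.55), (19.0, 0.05)]
choice = best_prr_distance_neighbor(neighbors)
```

In this made-up sample the products are 1.0, 4.9, 10.8, 8.8 and 0.95, so the 12 m neighbor is selected: neither the most reliable link nor the longest one, which is the distance-hop trade-off discussed above.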
4.3.3 Analysis for the No-ARQ case

In systems with ARQ, at each step a node transmits the same amount of data as the source ($r = 1$); this characteristic allowed us to do the analysis independently of $d_{src-sink}$. On the other hand, in systems without ARQ the amount of data decreases at each hop. Hence, in order to maintain an acceptable delivery rate, the longer $d_{src-sink}$ is, the higher the PRR of the chosen links should be. The analysis in this section explains this behavior.

Letting $i \in [1, 2, \ldots, \lceil h \rceil]$ be the hop counter, we denote $\Psi_d^i$ as the r.v. representing the packet reception rate for the distance $d$ traversed at hop $i$. The $\Psi_d^i$ are i.i.d. $\forall i \in [1, 2, \ldots, \lceil h \rceil]$. This notation allows us to define the delivery rate $r$ for systems without ARQ traversing a distance $d$ at each hop:

$$r = p_{src} \prod_{i=1}^{\lceil h \rceil} \Psi_d^i \tag{4.8}$$

The number of packet transmissions required at each hop $i$ ($t_i$) is given by:

$$t_i = p_{src} \prod_{j=1}^{i} \Psi_d^{(j-1)} \tag{4.9}$$

where $\Psi_d^0 = 1$, to account for the number of transmissions required at the source (equal to $p_{src}$). The total number of transmissions $t$ is the sum of $t_i$, $\forall i \in [1, 2, \ldots, \lceil h \rceil]$. Therefore, $t$ is given by:

$$t = p_{src} \sum_{i=1}^{\lceil h \rceil} \left( \prod_{j=1}^{i} \Psi_d^{(j-1)} \right) \tag{4.10}$$

Then $\xi_d^{woARQ}$ is given by:

$$\xi_d^{woARQ} = \frac{\prod_{i=1}^{\lceil h \rceil} \Psi_d^i}{\sum_{i=1}^{\lceil h \rceil} \left( \prod_{j=1}^{i} \Psi_d^{(j-1)} \right)} \tag{4.11}$$

In real scenarios each link will take an instance of the random variable. Letting $\psi$ be an instance of the PRR for a given link, the local calculation of the delivery rate would be $r = p_{src}\,\psi^{\lceil h \rceil}$ and the number of transmissions would be the sum given by:

$$t = p_{src} \sum_{i=1}^{\lceil h \rceil} \psi^{(i-1)} = p_{src}\,\frac{\psi^h - 1}{\psi - 1} \tag{4.12}$$

which leads to a metric of:

$$\xi_d^{woARQ} = \frac{\psi^h (\psi - 1)}{e_{total}\,(\psi^h - 1)} = \frac{\psi^h (1 - \psi)}{e_{total}\,(1 - \psi^h)} \tag{4.13}$$

Given that the PRR of a link is in the interval (0,1), $\frac{1-\psi}{1-\psi^h} < 1$, and for a large number of hops $\psi^h$ in the numerator decreases exponentially while $1 - \psi^h$ in the denominator increases. Therefore, equation 4.13 shows that in systems without ARQ, especially for a large number of hops, nodes should choose links with high PRRs.
Otherwise, for long distances the delivery rate, and hence the energy efficiency, will tend to zero.

4.4 Geographic Forwarding Strategies for Lossy Networks

In this section, we present some forwarding strategies that will be compared with the PRR×d metric. The aim of these strategies is to avoid the weakest link problem, and they are classified into two categories: distance-based and reception-based. In distance-based policies nodes need to know only the distance to their neighbors, while in reception-based policies nodes also need to know the PRR of the links to their neighbors. All the strategies use greedy-like forwarding: first a set of neighbors is blacklisted based on a certain criterion, and then the packet is forwarded to the node closest to the destination among the remaining neighbors.

4.4.1 Distance-based Forwarding

Original Greedy: Original greedy is similar to the current forwarding policy used in common geographic routing protocols. It is a special case of the blacklisting policies described next, in which no nodes are blacklisted.

Distance-based Blacklisting: In this case, each node blacklists neighbors that are above a certain distance from itself. In this chapter the "nominal" radio range is defined as $2d_e$. For example, if the radio range is considered to be 40 m and the blacklisting threshold is 20%, then the farthest 20% of the radio range (8 m) is blacklisted and the packet is forwarded through the neighbor closest to the destination among those neighbors within 32 m.

4.4.2 Reception-based Forwarding

Absolute Reception-based Blacklisting: In absolute reception-based blacklisting, each node blacklists neighbors that have a reception rate below a certain threshold. For example, if the blacklisting threshold is 20%, then only neighbors closer to the destination with a reception rate above 20% are considered for forwarding the packet.

Best Reception Neighbor: Each node forwards to the neighbor that has the highest PRR and is closer to the destination.
This strategy is ideal for systems without ARQ.

4.4.3 PRR×d

This is the metric shown in our analysis, and it can be seen as a mixture of the distance-based ($d$) and reception-based (PRR) strategies. For each neighbor that is closer to the destination, the product of the PRR and the distance is computed, and the packet is forwarded to the neighbor with the highest product.

4.5 Comparison of Different Strategies

The analytical model derived in section 4.3 provides the optimal forwarding distance. Nevertheless, in order to accurately evaluate the distance-hop trade-off we need to quantify the amount of energy saved by choosing the best candidate according to the optimal metric with respect to other methods. In this section, we compare analytically the energy efficiency of the different strategies presented in the previous section for systems with ARQ in a chain topology.

In order to compare the different strategies we require their expected energy efficiency ($E[\xi]$). In general, a strategy $S$ has an expected energy efficiency $E[\xi_S]$ given by:

$$E[\xi_S] = \sum_{d \in \varphi} E[\xi_S \mid d_f = d]\; p(d_f = d) = \sum_{d \in \varphi} E[\xi_S \mid d_f = d]\; q_d \tag{4.14}$$

where $\varphi$ is the set of distances to neighbors, $d_f$ is the distance traveled at each hop, and $q_d$ is the probability that $\xi_d > \xi_\ell$, $\forall \ell \in \varphi$, $\ell \neq d$. In the remainder of this section we denote the random variable $\xi \mid d_f = d$ as $\xi_d$. The next subsections provide $E[\xi]$ for different strategies.

4.5.1 PRR×d

For the PRR×d metric $q_d$ is given by:

$$q_d = \int_0^{\infty} P\big((x < \xi_d < x + dx) \wedge (\xi_j < x,\ \forall j \in \varphi,\ j \neq d)\big)\,dx \tag{4.15}$$

The energy efficiency of different distances can be considered independent:⁵

$$q_d = \int_0^{\infty} P(x < \xi_d < x + dx)\; P(\xi_j < x,\ \forall j \in \varphi,\ j \neq d)\,dx \tag{4.16}$$

Finally, $q_d$ is given by:

$$q_d = \int_0^{\infty} f_{\xi_d}(x) \prod_{\forall j \in \varphi,\ j \neq d} F_{\xi_j}(x)\,dx \tag{4.17}$$

⁵ The link quality (PRR) is a function of the SNR, which is the sum of many contributions coming from different locations with random phases [46].
where $f_{\xi_d}(x)$ and $F_{\xi_d}(x)$ are the pdf and cdf of the metric $\xi_d$. Given that these expressions depend on the Q function, we provide numerical solutions for $q_d$ in Figure 4.3. This figure shows the impact of different parameters on $q_d$.

Figure 4.3: Impact of different parameters on $q_d$, (a) $\tau$ (curves for $\tau$ = 1 m, 2 m, 3 m), (b) $\eta$ (curves for $\eta$ = 3, 5), (c) $\sigma$ (curves for $\sigma$ = 3, 5), (d) $P_t$ (curves for $P_t$ = -10 dBm, 0 dBm). All panels plot $q_d$ against the normalized distance, with the transitional region marked.

Figures 4.3 (a) and (b) show that when $\tau$ and $\eta$ increase the probability $q_d$ shifts left, closer to the connected region. On the other hand, when $\sigma$ and $P_t$ increase $q_d$ shifts right, closer to the end of the transitional region. This behavior is explained by the change in the number of neighbors (node density with respect to the coverage range).

The higher the number of neighbors, the higher the probability of discovering neighbors with good links (high PRR) that are closer to the destination (long $d$), which increases $q_d$. Keeping all other parameters constant, a larger $\tau$ or a higher $\eta$ (faster signal decay) reduces the density. On the other hand, a higher $P_t$ increases the coverage range, and a higher $\sigma$ increases the probability of finding good links farther away from the sender. Hence, the higher the density (number of neighbors), the higher $q_d$.

The expected energy efficiency of the PRR×d metric for a distance $d$ is given by equation 4.7.
Hence, according to equation 4.14, the expected energy efficiency for systems with ARQ using the PRR×d metric is given by:

$$E[\xi_{PRR \times d}] = \sum_{d \in \varphi} \frac{d\,E[\Psi_d]}{e_{total}\,d_{src-sink}}\; q_d \tag{4.18}$$

4.5.2 Absolute Reception-Based

Let us define $\psi_{th}$ as the blacklisting threshold of absolute reception, where valid links have PRR values on the interval $[\psi_{th}, 1)$. In order to choose $d$ as the forwarding distance, links with distances longer than $d$ should have a PRR $< \psi_{th}$, and the link at distance $d$ should have a PRR $\geq \psi_{th}$. Hence, $q_d$ for absolute reception-based (ARB) blacklisting is given by:

$$q_d^{ARB} = p(\Psi_d \geq \psi_{th}) \prod_{d_w \in \varphi,\ d_w > d} p(\Psi_{d_w} < \psi_{th}) \tag{4.19}$$

Given that a link is considered valid if $\Psi_d \geq \psi_{th}$, the expected number of transmissions at each hop is $p_{src} / E[\Psi_d \mid \Psi_d > \psi_{th}]$. Hence, the expected value of the energy efficiency conditioned on the fact that $\Psi_d > \psi_{th}$ is given by:

$$E[\xi_d^{ARB}] = \frac{d}{e_{total}\,d_{src-sink}}\; E[\Psi_d \mid \Psi_d > \psi_{th}] \tag{4.20}$$

Denoting $\gamma = \Psi^{-1}(\psi)$ and $\gamma_{th} = \Psi^{-1}(\psi_{th})$, the probability density function of the packet reception rate conditioned on $\Psi_d > \psi_{th}$ is $f(\psi \mid \Psi_d > \psi_{th})$, which can be mapped to SNR values as $f(\gamma \mid \Upsilon_d > \gamma_{th})$. Then:

$$E[\Psi_d \mid \Psi_d > \psi_{th}] = \int_{\psi_{th}}^{1} \psi\, f(\psi \mid \Psi > \psi_{th})\,d\psi = \int_{-\gamma_{th}}^{+\infty} \Psi(\gamma)\, f(\gamma \mid \Upsilon > \gamma_{th})\,d\gamma \tag{4.21}$$

Combining the previous two equations we obtain the expected energy efficiency for absolute reception-based (ARB) blacklisting:

$$E[\xi_{ARB}] = \sum_{d \in \varphi} \frac{d\,E[\Psi_d \mid \Psi_d > \psi_{th}]}{e_{total}\,d_{src-sink}}\; q_d^{ARB} \tag{4.22}$$

4.5.3 Distance-Based

When the blacklisting is based on distance, the energy efficiency of the forwarding distance $d$ ($\xi_d$) is the same as in equation 4.7. Denoting $d_{th}$ as the distance blacklisting threshold, distance-based blacklisting will select a distance $d$ if the neighbor at distance $d$ has a PRR $> 0$ and the neighbors with distances longer than $d$ have a PRR $= 0$.
The probability $q_d$ of distance-based (DB) blacklisting is given by:

$$q_d^{DB} = p(\Psi_d > 0) \prod_{d_w \in \varphi,\ d < d_w < d_{th}} p(\Psi_{d_w} = 0) \tag{4.23}$$

Finally, the expected energy efficiency is given by:

$$E[\xi_{DB}] = \sum_{d \in \varphi,\ d \leq d_{th}} \frac{d\,E[\Psi_d]}{e_{total}\,d_{src-sink}}\; q_d^{DB} \tag{4.24}$$

Figure 4.4: Performance of PRR blacklisting. Panels (a) to (d) plot the extra energy cost against the PRR threshold (%), with Best Reception marked, for (a) $\tau$ = 1 m, 2 m, 3 m, (b) $\eta$ = 3, 5, (c) $\sigma$ = 3, 5, (d) $P_t$ = -20 dBm, -10 dBm.

4.5.4 Comparison

Figures 4.4 and 4.5 show the comparison of energy efficiency in a chain topology for different channel, radio and deployment parameters, as in section 4.3. The figures show the relative performance of the different strategies with respect to the PRR×d metric, i.e., the y-axis shows how much extra energy is required to attain the same delivery rate as PRR×d. As in section 4.3, the base model of comparison has parameters $\tau = 1$, $\eta = 3$, $\sigma = 3$, $P_t = -10$ dBm and $f = 100$. Original greedy is a specific case of distance-based blacklisting, when no distance is blacklisted; and best reception is a specific case of absolute reception-based blacklisting, when a high blacklisting threshold is selected.
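The relationship between the strategies of section 4.4 can be made concrete with a short Python sketch. It is an illustration under stated assumptions rather than the code used for the figures: the function names, the `(distance, prr)` tuple layout and the sample neighbor table are hypothetical, and in a chain the neighbor "closest to the destination" is taken to be the one farthest from the sender.

```python
from typing import Callable, List, Optional, Tuple

# (distance in meters from the sender toward the destination, link PRR).
# The tuple layout and the sample values below are assumptions for this sketch.
Neighbor = Tuple[float, float]


def forward_greedy(neighbors: List[Neighbor],
                   keep: Callable[[Neighbor], bool]) -> Optional[Neighbor]:
    """Blacklist neighbors that fail `keep`, then pick the one closest to the destination."""
    remaining = [n for n in neighbors if keep(n)]
    return max(remaining, key=lambda n: n[0], default=None)


def distance_based(neighbors: List[Neighbor], radio_range: float,
                   threshold: float) -> Optional[Neighbor]:
    """Blacklist the farthest `threshold` fraction of the radio range."""
    limit = radio_range * (1.0 - threshold)
    return forward_greedy(neighbors, lambda n: n[0] <= limit)


def absolute_reception_based(neighbors: List[Neighbor],
                             prr_threshold: float) -> Optional[Neighbor]:
    """Blacklist neighbors whose PRR is not above the threshold."""
    return forward_greedy(neighbors, lambda n: n[1] > prr_threshold)


def original_greedy(neighbors: List[Neighbor], radio_range: float) -> Optional[Neighbor]:
    """Distance-based blacklisting with nothing blacklisted."""
    return distance_based(neighbors, radio_range, threshold=0.0)


def best_reception(neighbors: List[Neighbor]) -> Optional[Neighbor]:
    """Highest PRR; ties go to the neighbor closer to the destination."""
    return max(neighbors, key=lambda n: (n[1], n[0]), default=None)


neighbors = [(8.0, 1.0), (20.0, 0.95), (30.0, 0.60), (36.0, 0.15), (40.0, 0.02)]
og = original_greedy(neighbors, radio_range=40.0)
db = distance_based(neighbors, radio_range=40.0, threshold=0.20)
arb = absolute_reception_based(neighbors, prr_threshold=0.20)
arb_high = absolute_reception_based(neighbors, prr_threshold=0.99)
br = best_reception(neighbors)
```

With this sample table, original greedy picks the weak 40 m link, the 20% distance and reception thresholds both pick the 30 m link, and a very high reception threshold gives the same choice as best reception, in line with the special cases noted above.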
Figure 4.5: Performance of distance blacklisting. Panels (a) to (d) plot the extra energy cost against the normalized distance threshold, with Original Greedy marked, for (a) $\tau$ = 1 m, 2 m, 3 m, (b) $\eta$ = 3, 5, (c) $\sigma$ = 3, 5, (d) $P_t$ = -20 dBm, -10 dBm.

Figure 4.5 confirms the significant energy expenditure of original greedy, but there are other important insights from these comparisons. First, $\tau$, $\eta$, $\sigma$ and $P_t$ have an important impact on the relative performance of the different metrics due to their influence on the number of neighbors (node density per coverage range) and the expected energy efficiency. An increase in $\tau$ or $\eta$, or a decrease in $P_t$, leads to a lower node density, which implies that the strategies will start to choose the same nodes, given the lack of options, and their energy efficiency will become more similar. An increase in $\sigma$ improves the performance of absolute reception-based blacklisting and degrades that of distance-based blacklisting. This is because a higher $\sigma$ increases the probability of encountering both good links at farther distances and bad links at shorter distances. Second, blacklisting links with PRR below 10% significantly improves the performance of reception-based strategies. This is due to the observation made in Chapter 3 (Figure 3.5) with respect to the cdf of the PRR, where it was noted that most of the links are either "good" or "bad"; hence, by blacklisting links below 1% we eliminate most of the bad links. Third, reception-based strategies perform better than distance-based ones.
This is because reception-based strategies take advantage of good quality links in the transitional region (farther away from the transmitter). Distance-based strategies, on the other hand, blacklist potentially good links; furthermore, a shorter distance does not necessarily imply a better link, and distance-based strategies are still vulnerable to selecting bad-quality links at medium distances. Fourth, it is important to consider that while some thresholds of the distance-based and absolute reception-based strategies show performance close to that of PRR×d, these values change according to the channel, radio and deployment parameters, requiring a pre-analysis of the scenario. PRR×d, on the other hand, is a local metric that does not require any a priori configuration. Finally, the results show that Best Reception is also a good metric, and it can be a good candidate for systems without ARQ, given that these systems need to select good quality links.

4.6 Experiments with Motes

In order to validate our methodology and conclusions, we undertook an experimental study on motes. Twenty-one (21) mica2 motes were deployed in a chain topology spaced every 60 cm (∼2 feet). The source (node 0) and sink (node 20) were placed at opposite extremes of the chain. The power level was set to -20 dBm and the frame size was 50 bytes. Three different forwarding strategies were tested:

OG: neighbor closer to the sink whose PRR > 0.
BR: neighbor with highest PRR. In case two or more neighbors have the same PRR, the one closer to the sink is chosen.
PRR×d: neighbors are classified according to the PRR×d metric.

Table 4.2: Empirical Results for Different Forwarding Strategies

| Metric | Strategy | sce1 | sce2 | sce3 | sce4 | sce5 | sce6 |
|---|---|---|---|---|---|---|---|
| $r$ (%) | OG | 0 | 2 | 32 | 0 | 0 | 94 |
| | BR | 100 | 100 | 100 | 100 | 100 | 100 |
| | PRR×d | 100 | 100 | 100 | 82 | 100 | 100 |
| $t$ | OG | 70 | 110 | 312 | 78 | 121 | 858 |
| | BR | 652 | 407 | 730 | 754 | 701 | 903 |
| | PRR×d | 563 | 425 | 632 | 547 | 560 | 883 |
| Relative $\xi$ (%) | OG | inf | inf | +54 | inf | inf | +3 |
| | BR | +16 | -4 | +16 | +14 | +25 | +2 |
| | PRR×d | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
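The relative energy rows of Table 4.2 can be related to the $r$ and $t$ rows with a small calculation. The sketch below assumes that energy efficiency is proportional to $r/t$ (delivered fraction per transmission), so the extra energy of a strategy is the ratio of its $t/r$ to that of PRR×d, minus one. The function name is hypothetical and the exact formula behind the table is not stated in the text; this reading agrees with the entries checked here, but other entries may differ by rounding or convention.

```python
import math


def relative_energy_cost(r: float, t: float, r_ref: float, t_ref: float) -> float:
    """Extra energy per delivered packet, in percent, relative to a reference strategy.

    Assumes energy efficiency is proportional to r / t (an assumption of this sketch).
    """
    if r == 0:
        return math.inf  # nothing delivered: all transmissions are wasted
    return ((t / r) / (t_ref / r_ref) - 1.0) * 100.0


# (r in %, t) pairs copied from Table 4.2; PRR x d is the reference.
br_sce1 = relative_energy_cost(100, 652, 100, 563)
br_sce2 = relative_energy_cost(100, 407, 100, 425)
og_sce3 = relative_energy_cost(32, 312, 100, 632)
og_sce6 = relative_energy_cost(94, 858, 100, 883)
og_sce1 = relative_energy_cost(0, 70, 100, 563)
```

Rounded to the nearest percent, these evaluate to +16, -4, +54 and +3, matching the corresponding table entries, and the zero-delivery case maps to "inf".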
First, the motes exchange test packets to measure the PRR of the links and populate their routing tables accordingly. Afterwards, the source sends 50 packets to the sink for each of the 3 different strategies (150 total). A maximum of 5 transmissions (1 transmission + 4 retransmissions) is allowed at each hop; if the packet is not received after the fifth attempt, it is dropped.

Six different scenarios are studied: a football field, an indoor-building environment and four different outdoor-urban areas. The channel characteristics of some scenarios are significantly different; hence, instead of providing a cumulative result, we present the results for each one of them.

Table 4.2 shows the delivery rate ($r$), the number of transmissions ($t$), and the energy efficiency ($\xi$) for the different scenarios. BR has a delivery rate of 100% in all scenarios, and PRR×d has 100% in all scenarios except scenario 4 (82%). Greedy performs poorly, with a zero or close to zero delivery rate in most scenarios.

With regard to the number of transmissions, BR requires more transmissions than PRR×d in all scenarios except scenario 2, where BR performs better. Given that $r$ is similar for BR and PRR×d, the difference in energy efficiency is determined by the number of transmissions. BR consumes between 2% and 25% more energy than PRR×d; only in scenario 2 does PRR×d perform worse, by 4%. On the other hand, in the two scenarios where Greedy has a non-negligible delivery rate (scenarios 3 and 6), it consumed between 3% and 54% more energy than PRR×d. It is interesting to observe that the energy "wasted" by greedy forwarding depends on where the first weak link is encountered. In some scenarios, the first weak link is encountered at the beginning of the chain and hence the energy wasted is not significant, while in other scenarios the weak link is encountered in the middle or at the end of the chain, which causes a greater energy waste.

Although this experimental study is limited in size, it provides two important conclusions.
First, it serves to confirm and validate our earlier findings from the analytical and simulation studies regarding the PRR×d metric. Second, it shows that the best reception metric is also a good metric for real deployments. Based on the insights of section 4.5, we believe that higher densities will lead to bigger energy savings for the PRR×d metric with respect to other strategies.

4.7 Summary

We have presented a detailed study of geographic routing in the context of the model presented in chapter 3. We have provided a mathematical analysis of the optimal forwarding metric for both ARQ and No-ARQ scenarios. We have also validated some of our approaches using real experiments on motes.

Key results from our study indicate that the common greedy forwarding approach would result in a very poor packet delivery rate. Efficient geographic forwarding strategies do take advantage of links in the high-variance transitional region for energy efficiency. An important forwarding metric that arose from our analysis, simulations and experiments is PRR×d, particularly in high-density networks where ARQ is employed. Our results also show that reception-based forwarding strategies are generally more efficient than distance-based strategies.

Finally, it is important to highlight that the PRR×d metric is recommended for static or low-dynamic environments, such as environmental monitoring. In highly dynamic environments the link quality can change drastically over time, and hence stable estimates of PRR might not be possible. In these scenarios, best reception is probably the best forwarding strategy.

Chapter 5
Performance of Random Walks on Heterogeneous Networks

In this chapter we analyze the performance of random walk-based queries in heterogeneous sensor networks. A random walk on a graph is a discrete-time stochastic process that starts from any vertex and at each step selects an adjacent vertex uniformly at random [43].
Due to their simplicity, random walks have attracted considerable attention as querying mechanisms for WSN [48, 51, 4, 8, 52]. However, most of these studies have focused on simple deterministic graphs, such as regular grids, that do not consider degree heterogeneity. As observed in the model presented in chapter 3, degree heterogeneity is an important property of real-life networks. In this chapter we show that random walk-based queries can considerably enhance their performance by exploiting degree heterogeneity. We demonstrate that by using a simple algorithm and including a few high-degree nodes (fewer than 10% of the nodes), query cost can be reduced by between 30% and 70%.

5.1 Overview

An important approach for querying in unstructured systems is the use of random walks. This approach is gaining popularity in the networking community since random walks are intuitively simple: nodes are visited sequentially in a random order, with successive nodes being neighbors in the graph [51, 4, 48, 8]. There is also a significant body of theoretical literature on random walks as querying mechanisms [3, 43, 9]. However, while this literature provides much insight into the scaling behavior of random walks on simple classes of deterministic graphs (such as 2D tori), a major property of real-life networks, heterogeneity, has been left out of the discussion.

Heterogeneity in the degree distribution is a highly likely characteristic of real large-scale wireless systems, such as sensor networks, for a number of reasons. First, as shown in the model proposed in this thesis, in real scenarios channel multi-path and hardware variance lead to significant changes in the degree distribution compared to ideal scenarios. Second, random deployments are inherently non-regular graphs. Third, empirical studies [25, 62] have revealed that hardware variance in the sensitivity and output power of radios leads to nodes with a significantly higher degree than the average (cluster-heads).
Fourth, some works [29] have recently proposed that for wireless sensor networks to scale, heterogeneous networks consisting of highly capable and less capable devices are required.

In this work we propose the use of a push-pull mechanism that exploits the heterogeneity of the underlying communication graph to enhance the performance of random-walk-based queries. We take advantage of a well-known property of the simple random walk: its stationary distribution is $\pi(v) = d(v)/2m$, where $d(v)$ denotes the degree of node $v$ and $m$ the number of edges in the graph. This means that nodes with higher degree are visited more frequently by the random walk.

The idea we use is intuitively simple: events are pushed towards high-degree nodes (cluster-heads) and pulled from the cluster-heads by performing a random-walk-based query. In this scenario some important questions arise: What is the impact of heterogeneity on performance? How much heterogeneity is needed? We use two theoretical tools to explore these questions. Our analytical results are based on the direct connection between a random walk on a graph and the resistance of the electrical network obtained from the graph by viewing each edge as a unit resistor [22, 13]. We bound the influence of cluster-heads on the resistance, which in turn bounds the query cost. Our second tool is the use of absorption states in the transition probability matrix of a graph $G$ to obtain the expected number of steps in a query.

The main contribution of this work is to show that heterogeneity allows random-walk-based queries to enhance their performance. In particular, we obtain that for a line topology where cluster-heads have a coverage $k$ (they cover $k$ nodes to the right and $k$ nodes to the left) and are uniformly distributed (evenly spaced), a fraction of $\frac{4}{5k}$ of the nodes being cluster-heads can offer a reduction in query cost of $O(1 - \frac{1}{k^2})$ by using a simple distributed algorithm.
In intuitive terms this translates to requiring less than 10% of the nodes to be cluster-heads to obtain an improvement of two orders of magnitude in query cost. Through numerical analysis, we present results showing that in 2D networks a small percentage of nodes being cluster-heads (∼10%) can also lead to significant improvements in performance (between 30% and 70%, depending on the coverage of the high-degree nodes).

5.2 Enhancing Random Walks for Heterogeneity

In this section we present the event-push query-pull algorithm used in our work. We consider a network where two types of nodes are available: i) nodes with limited communication capability (low degree) and ii) nodes with higher communication capability (high degree, cluster-heads). Our focus is on infrastructure-less networks (no location or GPS capabilities); nodes are able to communicate only with their neighbors, and they are aware of their neighbors' degree (whether neighbors are cluster-heads or not).

Our query mechanism is built from two parts: first, an event $e$ is generated randomly at any node in the network, and second, a random-walk-based query is issued in order to find the event. The event $e$ can either remain at the node where it was generated or move to a cluster-head (if one exists). In a network where all nodes have the same degree, the event remains at the node where it appears. When cluster-heads are present, upon detection of an event, the event follows Algorithm 1:

Algorithm 1 Event forwarding in Heterogeneous Network
Require: event e at node v_i
while node v_i is not a cluster-head do
    if there is a neighbor v_j with degree(v_j) > degree(v_i) then
        forward e to the neighbor with the highest degree; break ties uniformly at random among candidates
    else
        forward e to a random neighbor
    end if
end while

In order to find the event, a query is issued through a predefined sink node $s$.
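Algorithm 1 and the random-walk query can be sketched in Python as follows. This is an illustrative sketch, not the implementation used in this work: the function names, the adjacency-set representation, the `max_steps` guards and the example parameters (n = 40, k = 4, d = 5) are assumptions, and the line topology is built with the edge rule that section 5.3.1 gives for L(n, k, d).

```python
import random
from typing import Dict, Set, Tuple

Graph = Dict[int, Set[int]]


def line_topology(n: int, k: int, d: int) -> Graph:
    """Line of n + 1 nodes; every d-th node is a cluster-head covering k nodes per side."""
    graph: Graph = {i: set() for i in range(n + 1)}
    for i in range(n):  # edges between immediate neighbors
        graph[i].add(i + 1)
        graph[i + 1].add(i)
    for head in range(0, n + 1, d):  # extra cluster-head edges
        for j in range(max(0, head - k), min(n, head + k) + 1):
            if j != head:
                graph[head].add(j)
                graph[j].add(head)
    return graph


def push_event(graph: Graph, heads: Set[int], start: int,
               rng: random.Random, max_steps: int = 10_000) -> Tuple[int, int]:
    """Algorithm 1: forward the event until it reaches a cluster-head.

    Returns (node holding the event, number of forwarding steps).
    """
    node, cost = start, 0
    while node not in heads and cost < max_steps:
        neighbors = sorted(graph[node])
        best = max(len(graph[v]) for v in neighbors)
        if best > len(graph[node]):
            # forward to a highest-degree neighbor, ties broken at random
            candidates = [v for v in neighbors if len(graph[v]) == best]
        else:
            candidates = neighbors  # no better neighbor: random forwarding
        node = rng.choice(candidates)
        cost += 1
    return node, cost


def random_walk_query(graph: Graph, sink: int, target: int,
                      rng: random.Random, max_steps: int = 1_000_000) -> int:
    """Simple random walk from the sink; returns the steps taken to hit the target."""
    node, steps = sink, 0
    while node != target and steps < max_steps:
        node = rng.choice(sorted(graph[node]))
        steps += 1
    return steps


rng = random.Random(7)
graph = line_topology(n=40, k=4, d=5)
heads = set(range(0, 41, 5))
event_node, c_event = push_event(graph, heads, start=22, rng=rng)
c_query = random_walk_query(graph, sink=0, target=event_node, rng=rng)
c_total = c_event + c_query
```

In this example every low-degree node has a cluster-head among its neighbors, so the push phase costs at most one step and the total cost is dominated by the query.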
The query follows a simple random walk, where the next node is chosen uniformly from the neighbors, until it finds the event $e$.

In the scenario described above there will be two costs: i) the cost of moving the event to a cluster-head, $C_{event}$, and ii) the cost of the query (i.e., the random walk) to find the event, $C_{query}$. The total cost is the sum of both: $C_{total} = C_{event} + C_{query}$. Denoting $L$ as the number of low-degree nodes and $H$ as the number of high-degree nodes, how does $C_{total}$ vary with the ratio $\frac{H}{H+L}$? Is there a value of $\frac{H}{H+L}$ that will reduce $C_{total}$ significantly?

5.3 Analytical Results

In this section we focus on line topologies, and we are interested in analyzing the impact of cluster-heads on the cost of event discovery ($C_{total}$). The main result is as follows:

Theorem 1. Consider a line topology with $(n+1)$ nodes, where cluster-heads have a degree $2k$ and are uniformly distributed. The first local minimum for the maximum hitting time and for the query cost is obtained when the fraction of high-degree nodes is $\frac{4}{5k}$, and for this fraction a reduction in query cost of $\Theta(1 - \frac{1}{k^2})$ is obtained.

As mentioned earlier, this result is obtained using bounds on the resistance of an electrical circuit related to the network graph. The remainder of the section is dedicated to the proof of this theorem. Table 5.1 presents the notation used for the proof, which is explained in detail in the next two subsections.

5.3.1 Parameters of Line Topology

We consider the undirected graph $L(n,k,d) = G(V,E)$ with the following parameters. An element of $V$ or $E$ is represented by a lowercase letter, a subset or array by a bold lowercase letter, and the complement of a set or element is denoted with an upper bar; for example, $e$ represents an element, $\mathbf{e}$ an array (subset), and $\bar{e}$ and $\bar{\mathbf{e}}$ represent the complements of $e$ and $\mathbf{e}$, respectively.
For a source $s \in V$ and the subset of cluster-head nodes $\mathbf{e} \subseteq V$, the average hitting time from $s$ to $\mathbf{e}$ is:

$$h_{s\mathbf{e}} = \sum_{e \in \mathbf{e}} \frac{h_{se}}{|\mathbf{e}|} \tag{5.1}$$

The commute time $C_{uv}$ is the expected time taken by a random walk starting at $u$ to reach $v$ and come back to $u$, i.e., $C_{uv} = h_{uv} + h_{vu}$. $C_{uv}$ is given by:

$$C_{uv} = 2mR_{uv} \tag{5.2}$$

where $m$ is the number of edges in the graph and $R_{uv}$ is the effective resistance between nodes $u$ and $v$. In case of symmetry $h_{uv} = h_{vu}$, which implies that the commute time is two times the hitting time. We will use this property in the analysis of the hitting time for line topologies.

Table 5.1: Mathematical Notation

| Symbol | Description |
|---|---|
| $(n+1)$ | number of nodes |
| $k$ | cluster coverage |
| $d$ | inter cluster-head distance |
| $\alpha$ | overlap of cluster-head coverage (region 2) |
| $(s+1)$ | number of clusters |

In the line topology $L(n,k,d) = G(V,E)$, the set of nodes is $V = v_0, v_1, \ldots, v_n$. Since the index goes from 0 to $n$, the number of nodes is $|V| = n+1$. We will also denote $n$ as the number of edges when no cluster-head has been added (when nodes communicate only with their immediate right and left neighbors).

In addition to the initial $n$ edges, each cluster-head has edges to all nodes that are at most $k$ nodes away from it on the line. $d$ is the inter cluster-head distance; hence, a node $v_i$ is a cluster-head iff $i \bmod d = 0$. The set of edges $E$ for $L(n,k,d)$ is defined as follows:

$$E = \{\, v_i v_j \mid (i \bmod d = 0 \text{ and } |i-j| \leq k) \text{ or } |i-j| = 1 \,\}$$

We will consider the case where $n \gg k > 1$. Since we are interested in the limit $n \to \infty$, we will consider only the cases where $n = sd$, where $s$ is an integer. Then, the total number of clusters is given by $(s+1)$.

For a given $n$ and $k$ we are interested in studying the impact of $d$ on $C_{total}$ via the resistance, and the analysis is divided into different regions according to the inter cluster-head distance $d$:

$$\text{regions} = \begin{cases} \text{region 1}, & 2k \leq d \leq n/2 \\ \text{region 2}, & k < d < 2k \\ \text{region 3}, & 1 \leq d \leq k \end{cases} \tag{5.3}$$

Figure 5.1: Examples of line topologies.
The big circles denote cluster-heads and the dashed lines their coverage $k$; all nodes within $k$ hops from the cluster-head are its neighbors. (Three panels: Region 1, Region 2 and Region 3, with the effective resistance $r(\cdot)$ marked.)

Figure 5.1 presents examples of line topologies for the 3 different regions. The big circles denote cluster-heads and the dashed lines their coverage $k$. Notice that $k$ is constant for all three topologies.

Later in this section we will show that $C_{total}$ depends mainly on $C_{query}$, especially for $d < 2(k+1)$, where $C_{event}$ will be shown to be less than 1. For this reason, the resistance analysis will be focused on $C_{query}$. For the different regions we will first obtain expressions for the number of edges $m$ and the effective resistance $R$ between the nodes at the extremes of the line. Then, in subsections 5.3.2 and 5.3.3 we use these expressions, together with symmetry, to obtain the maximum hitting time and the average query cost, $C_{query}$.

5.3.1.1 Analysis of region 1

Recall that the number of nodes in $L$ is $(n+1)$, and that when no clusters are present the initial number of edges is $n$. Besides these $n$ edges, each cluster-head contributes $2(k-1)$ new edges. Then the total number of edges is given by:

$$m_1 = n + 2(k-1)s = n + 2(k-1)\frac{n}{d} = \frac{n}{d}\,\big(d + 2(k-1)\big) \tag{5.4}$$

The coverage on each side of a cluster-head can be represented by an effective resistance $r(k)$ (Figure 5.1). $r(k)$ can be derived from $k$ using techniques to reduce resistors in parallel and series:

$$k = 2 \Rightarrow r(2) = 2/3, \qquad k = 3 \Rightarrow r(3) = 5/8, \qquad k = 4 \Rightarrow r(4) = 13/21$$

It can be derived that the numerator and denominator follow the Fibonacci series $fib(\cdot)$; hence:

$$r(k) = \frac{fib(2k-1)}{fib(2k)} \tag{5.5}$$

Figure 5.2: Resistance calculation for regions 2 and 3. (Equivalent circuits with the lower and upper bounds for region 2 and the reduced circuit for region 3.)

Since every cluster-head covers its right and left $k$-neighbors and the extreme vertices cover only one side, the effective resistance $R_1$ between the extremes of $L$ is given by:
$$R_1 = \big(2r(k) + d - 2k\big)\,s = \big(2r(k) + d - 2k\big)\,\frac{n}{d} \tag{5.6}$$

Finally, using equation 5.2 and symmetry, the hitting time for nodes at the extremes of the line is given by:

$$h_1 = \left(\frac{n}{d}\right)^2 \big(d + 2(k-1)\big)\,\big(d + 2(r(k) - k)\big) \tag{5.7}$$

5.3.1.2 Analysis of region 2

The number of edges is the same as in equation 5.4, since every cluster-head that is added brings $2(k-1)$ new edges. However, the effective resistance between 2 cluster-heads is different. In this subsection we provide upper and lower bounds for this resistance.

Letting $\alpha k$ be the overlap between the coverage of two neighboring cluster-heads, where $0 < \alpha < 1$, then $d = k(2 - \alpha)$. The resistance circuit is equivalent to the one shown in Figure 5.2, where $r(x) = r(k(1-\alpha))$ (the effective resistance for the non-overlapping part). Given that $r(\cdot)$ converges to the inverse of the golden ratio, $1/2 < r(x) < 1$. Further, considering Rayleigh's monotonicity law, we can provide upper and lower bounds for the resistance as shown in Figure 5.2:

$$\frac{2}{\alpha k + 2} < r_2 < \frac{2}{\alpha k + 1}$$
$$\frac{2}{\alpha k + 2}\,s < R_2 < \frac{2}{\alpha k + 1}\,s$$
$$\frac{2}{\alpha k + 2}\,\frac{n}{d} < R_2 < \frac{2}{\alpha k + 1}\,\frac{n}{d} \tag{5.8}$$

Finally, the hitting time is bounded by:

$$2\left(\frac{n}{d}\right)^2 \left(\frac{4k - k\alpha - 2}{\alpha k + 2}\right) < h_2 < 2\left(\frac{n}{d}\right)^2 \left(\frac{4k - k\alpha - 2}{\alpha k + 1}\right) \tag{5.9}$$

5.3.1.3 Analysis of region 3

For this region, we will analyze only the point where $d = k$. Contrary to the case of regions 1 and 2, neighboring cluster-heads have an edge connecting each other; hence each cluster-head brings $(2(k-1) - 1)$ new edges. The total number of edges is given by:
And the total number of edges is given by:

\[ m_3 = n + (2k-3)s = n + (2k-3)\frac{n}{d} = 3\,\frac{n}{k}\,(k-1) \tag{5.10} \]

The resistance between two cluster-heads can be transformed to an equivalent circuit as shown in Figure 5.2, which leads to an equivalent resistance of $\frac{2}{k+1}$ between two neighboring cluster-heads. Hence, the effective resistance between the extremes of the line is given by:

\[ R_3 = \frac{2}{k+1}\, s = \frac{2}{k+1}\,\frac{n}{k} \tag{5.11} \]

Finally, the hitting time for d = k is given by:

\[ h_3(d=k) = 6\left(\frac{n}{k}\right)^2 \frac{k-1}{k+1} \tag{5.12} \]

Table 5.2: Minimum Maximum and Minimum Average Hitting Times per Region

Region                     Maximum Hitting Time (h)                              Average Hitting Time ($C_{query}$)
                           [subsection 5.3.2]                                    [subsection 5.3.3]
region 1 (lower bound)     $(\frac{n}{k})^2 (k - \frac{1}{2})$                   $\frac{(2k-1)\,n\,(n+k)}{6k^2}$
region 2 (upper bound)     $2(\frac{4n}{5k})^2 (\frac{13k-8}{3k+4})$             $\frac{4n(13k-8)(8n+5k)}{3(3k+4)(5k)^2}$
region 3                   $6(\frac{n}{k})^2 (\frac{k-1}{k+1})$                  $\frac{(k-1)(2n+k)\,n}{(k+1)k^2}$

5.3.2 Local Minimum for Maximum Hitting Time

In this section we analyze the hitting time between the sink ($v_0$) and the last cluster-head on the line ($v_n$). Table 5.2 shows the minimum value of $h_1$, $h_2$ and $h_3$. In region 3 we analyzed only one point (d = k); hence, for region 3 the value in Table 5.2 is the one obtained in equation 5.12. For region 1, the minimum value is obtained for d = 2k, and we further obtain a lower bound on $h_1$ by setting r(k) = 0.5.

In the case of region 2, the hitting time depends on the overlap $\alpha$. Even though $\alpha k$ should take only integer values, we assume $\alpha$ to be real in order to differentiate $h_2$ (equation 5.9) and obtain the value of $\alpha$ that minimizes $h_2$. The values of $\alpha$ for the upper and lower bounds ($\alpha_U$ and $\alpha_L$) are given by:

\[ \alpha_U = \frac{12nk - 7n + 4k - 16k^2}{2(2n - 4k + 1)k} - \frac{\sqrt{80n^2k^2 - 88n^2k + 96nk^2 - 128nk^3 + 17n^2 + 48nk - 16n}}{2(2n - 4k + 1)k} \tag{5.13} \]

\[ \alpha_L = \frac{-8k^2 + 6nk - 4n - 2\sqrt{-8k^3n + 5n^2k^2 - 4n^2k + 8nk^2}}{2k(n - 2k)} \tag{5.14} \]

For large n and k, $\alpha_U = \alpha_L = 0.7639$, hence $\alpha_{opt} \approx \frac{3}{4}$. Table 5.2 shows the upper bound of $h_2$ for $\alpha_{opt}$.
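The limiting value 0.7639 can be cross-checked numerically. The Python sketch below is not the procedure used to obtain equations 5.13 and 5.14; it assumes the large-k limit of the upper bound in equation 5.9, where substituting $d = k(2-\alpha)$ and dropping the common factor $2n^2/k^2$ leaves a bound proportional to $g(\alpha) = (4-\alpha)/(\alpha(2-\alpha)^2)$. The function names and the choice of a golden-section search are illustrative assumptions.

```python
import math


def g(alpha):
    """Large-k shape of the h_2 upper bound (eq. 5.9) with d = k(2 - alpha).

    Assumption: the constant factor 2 n^2 / k^2 is dropped, since it does
    not change where the minimum is attained.
    """
    return (4.0 - alpha) / (alpha * (2.0 - alpha) ** 2)


def golden_section_min(f, lo, hi, tol=1e-10):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return (a + b) / 2.0


alpha_opt = golden_section_min(g, 0.01, 0.99)
print(round(alpha_opt, 4))               # numerical minimizer on (0, 1)
print(round(3.0 - math.sqrt(5.0), 4))    # root of alpha^2 - 6*alpha + 4 = 0 in (0, 1)
```

Setting the derivative of $g$ to zero gives $\alpha^2 - 6\alpha + 4 = 0$, whose root in (0, 1) is $3 - \sqrt{5} \approx 0.7639$, in agreement with the limiting value quoted above; with $\alpha = 3/4$ the inter-cluster-head distance is $d = k(2 - 3/4) = \frac{5}{4}k$.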
From Table 5.2 we observe that $h_1 = \Theta(n^2/k)$, and that $h_2$ and $h_3$ are $\Theta(n^2/k^2)$; hence, for large k the minimum is attained by either $h_2$ or $h_3$. Since n and k are given, we can directly compare the upper bound of $h_2$ with $h_3$ as given in Table 5.2, which leads to:

\[ \lim_{k \to \infty} \frac{h_3}{h_2} = 1.0817 \tag{5.15} \]

Hence, the first local minimum is in region 2 and the inter-cluster-head distance that attains this minimum is $d = \frac{5}{4}k$, which corresponds to a ratio r between high-degree nodes and the total number of nodes of:

\[ r = \left(\frac{4n}{5k}\right)\left(\frac{1}{n+1}\right) \approx \frac{4}{5k} \tag{5.16} \]

It is important to notice that the global minimum might be in region 3; however, the reduction obtained by moving from the first local minimum to the global minimum is not significant, as will be shown numerically in the next section.

5.3.3 Local Minimum for Expected Hitting Time

For regions 1, 2 and region 3 (when d = k), cluster-head i is visited only after cluster-head i−1 has been visited. This property allows us to discard all nodes beyond a given cluster-head i when we are interested in obtaining the hitting time to i. Hence, for cluster-head i we can consider a new line topology, which is a shorter version of the original one, where the sink is at one extreme and cluster-head i is at the end of the new line. This behavior allows us to use directly the expressions derived in the previous subsections, with the only change being the initial number of edges n. Recall that the set of distances between the sink and the cluster-heads is given by $\{0, d, 2d, 3d, \ldots, (s-1)d, sd\}$.
$C_{query}$ is the average hitting time ($E[h_i]$) to all clusters, and for region 1 it is given by:

\[ E[h_1] = \sum_{i=0}^{s} \frac{h_1(n = id)}{s+1} = \sum_{i=1}^{s} \frac{\left(\frac{n}{d}\right)^2 \left(d + 2(k-1)\right)\left(d + 2(r(k) - k)\right)}{s+1} = \frac{\left(d + 2(k-1)\right)\left(2r(k) + d - 2k\right)}{s+1} \sum_{i=1}^{s} i^2 = \left(d + 2(k-1)\right)\left(2r(k) + d - 2k\right)\frac{s(2s+1)}{6} \tag{5.17} \]

For region 2, using the upper bound of equation 5.9 and recalling that $d = k(2-\alpha)$ and $s = n/d$, the expected hitting time is given by:

\[ E[h_2] = \sum_{i=0}^{s} \frac{h_2(n = id)}{s+1} = \sum_{i=1}^{s} \frac{2\left(\frac{n}{d}\right)^2 \left(\frac{4k - k\alpha - 2}{\alpha k + 1}\right)}{s+1} = \frac{2(4k - k\alpha - 2)}{(\alpha k + 1)(s+1)} \sum_{i=1}^{s} i^2 = \frac{n\left(2n + k(2-\alpha)\right)(4k - k\alpha - 2)}{3(\alpha k + 1)\left(k(2-\alpha)\right)^2} \tag{5.18} \]

The derivative of this equation with respect to $\alpha$ leads to an expression that is considerably more complicated than equation 5.13 and is not presented. However, the results are the same as those obtained for the maximum hitting time, i.e. $\alpha_U = \alpha_L = 0.7639$ and $\alpha_{opt} \approx \frac{3}{4}$. The closed-form expressions and values can be easily obtained by using mathematical software such as Matlab or Mathematica.

For region 3, when d = k, the expected hitting time is given by:

\[ E[h_3(d=k)] = \sum_{i=0}^{s} \frac{6\left(\frac{n}{k}\right)^2 \frac{k-1}{k+1}}{s+1} = \frac{6(k-1)}{(s+1)(k+1)} \sum_{i=0}^{s} i^2 = \frac{k-1}{k+1}\, s(2s+1) \tag{5.19} \]

Table 5.2 also presents the minimum value of $C_{query}$ for the 3 regions: d = 2k and r(k) = 0.5 for region 1 (lower bound), $d = \frac{5}{4}k$ for region 2 (upper bound) and d = k for region 3 (only point analyzed). The comparisons lead to the same result as the ones obtained for the maximum hitting time: the order of $h_1$ is greater than the order of $h_2$ and $h_3$, and as n and k go to infinity the ratio of $h_3$ over $h_2$ goes to 1.0817. Now we have all the elements to prove Theorem 1.

Proof of Theorem 1: The optimization of the maximum and average hitting times (equations 5.9 and 5.18) leads to $d = \frac{5}{4}k$. Using Table 5.2 we proved that the first local minimum occurs in region 2 for both the maximum and average hitting times, and that their costs are $\Theta(n^2/k^2)$.
It is known that random walks on regular line topologies have a cost of $\Theta(n^2)$ [3]; hence, a line topology $L(n,k,d)$ with $d = \frac{5}{4}k$ leads to a cost reduction of $\Theta(1 - \frac{1}{k^2})$ for both the maximum and average hitting times. $\Box$

5.4 Numerical and Experimental Results

In this section we use Markov numerical methods and simulations to study the impact of heterogeneity on the performance of random walk-based queries. First, we present numerical results on a line topology that validate the analytical contributions of Section 5.3. Then, we present numerical results on regular grids and random geometric graphs. Finally, we present simulation results on realistic graphs for wireless sensor networks.

In Appendix B we provide an introduction to obtaining hitting times in Markov chains. Let us consider that the event is in node $e \in V$ and the sink is in node $s \in V$; then, the array $Q_e$ representing the expected hitting time from each node to e (except e) is given by:

\[ Q_e = (I - M_e)^{-1}\,\mathbf{1} \tag{5.20} \]

where I is the identity matrix, $\mathbf{1}$ is a column vector of ones and $M_e$ is the matrix resulting from deleting the row and column corresponding to e in M, where M is the transition probability matrix defining the Markov chain in G. Hence, in our case $h_{se} = Q_e(s)$, where $Q_e(s)$ represents the s-th element of the array.

5.4.1 Line Topologies

Consider a line topology $L(n,k,d)$ with transition probability matrix M, where the sink is the leftmost vertex $v_0$. Cluster-heads are positioned at $v_{id}$, where i = 0, 1, ..., s.
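As an illustration of equation 5.20 on such a line topology, the following Python sketch builds the graph and solves the linear system $(I - M_e)Q_e = \mathbf{1}$ exactly with rational arithmetic. It is a minimal sketch, not the code used to produce the results in this chapter: the function names `line_topology` and `hitting_times` are hypothetical, n is assumed to be a multiple of d, and the walk is assumed to move to a uniformly chosen neighbor at each step.

```python
from fractions import Fraction


def line_topology(n, k, d):
    """Adjacency sets for nodes v_0..v_n on a line.

    Every node v_{i*d} is a cluster-head linked to all nodes within k hops
    (assumed model of L(n, k, d); n is assumed to be a multiple of d).
    """
    adj = {v: set() for v in range(n + 1)}
    for v in range(n):
        adj[v].add(v + 1)
        adj[v + 1].add(v)
    for head in range(0, n + 1, d):
        for u in range(max(0, head - k), min(n, head + k) + 1):
            if u != head:
                adj[head].add(u)
                adj[u].add(head)
    return adj


def hitting_times(adj, target):
    """Expected steps from every node to `target`: solves (I - M_e) Q = 1."""
    nodes = [v for v in adj if v != target]
    idx = {v: i for i, v in enumerate(nodes)}
    size = len(nodes)
    # Augmented matrix [I - M_e | 1], where M_e drops the target row/column.
    rows = [[Fraction(0)] * (size + 1) for _ in range(size)]
    for v in nodes:
        i = idx[v]
        rows[i][i] = Fraction(1)
        step = Fraction(1, len(adj[v]))  # uniform choice among neighbors
        for u in adj[v]:
            if u != target:
                rows[i][idx[u]] -= step
        rows[i][size] = Fraction(1)
    # Gauss-Jordan elimination with exact fractions.
    for col in range(size):
        pivot = next(r for r in range(col, size) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        scale = rows[col][col]
        rows[col] = [x / scale for x in rows[col]]
        for r in range(size):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return {v: rows[idx[v]][size] for v in nodes}


plain = hitting_times(line_topology(10, 1, 10), 10)[0]
clustered = hitting_times(line_topology(12, 2, 6), 12)[0]
print(plain)      # plain line with n = 10
print(clustered)  # region 1 example: n = 12, k = 2, d = 6
```

For the plain line (k = 1 adds no new edges) the result is $n^2 = 100$, the known end-to-end hitting time of a simple random walk on a path. For n = 12, k = 2, d = 6 the result is 320/3, which matches equation 5.7 with r(2) = 2/3: $(12/6)^2 \cdot (6+2) \cdot (6 + 2(2/3 - 2)) = 320/3$.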
The hitting time from $v_0$ to a specific cluster-head is $Q_{id}(v_0)$, where $Q_{id}$ is given by:

\[ Q_{id} = (I - M_{id})^{-1}\,\mathbf{1} \tag{5.21} \]

Based on equation 5.1, $C_{query}$ is the average hitting time over all cluster-heads and is given by:

\[ C_{query} = \frac{d}{2n}\left(Q_0(v_0) + Q_{sd}(v_0)\right) + \frac{d}{n}\sum_{i=1}^{s-1} Q_{id}(v_0) = \frac{d}{2n}\,Q_{sd}(v_0) + \frac{d}{n}\sum_{i=1}^{s-1} Q_{id}(v_0) \tag{5.22} \]

In the previous equation all cluster-heads have the same weight except for the extreme ones ($v_0$ and $v_{sd}$). This is because, at the extremes, the expected number of stored events is half of that stored at the intermediate cluster-heads. However, for a large number of cluster-heads the weights can be considered similar and:

\[ C_{query} \approx \sum_{i=1}^{s} \frac{Q_{id}(v_0)}{s+1} \tag{5.23} \]

$C_{event}$ will be derived according to the regions defined in Section 5.3 and Algorithm 1. There are s+1 cluster-heads for which there is no need to move the event, hence their cost is zero. When $d < 2(k+1)$, all nodes are directly connected to a cluster-head and $C_{event} = \frac{n-s}{n+1}$. However, when $d \geq 2(k+1)$ (most of region 1), there are orphan nodes between any pair of consecutive cluster-heads. Due to symmetry, the cost is the same for any subset of nodes between any two neighboring cluster-heads, and for simplicity we consider cluster-heads $v_0$ and $v_d$. For these clusters, the nodes between (k+1) and (d−k−1) are orphans. Let us define a = (k+1), b = (d−k−1) and $M_{a:b}$ as the sub-matrix of M which includes only the rows and columns between a and b. For $M_{a:b}$, $Q_{a:b} = (I - M_{a:b})^{-1}\,\mathbf{1}$ is the array containing the expected number of steps for each orphan node to reach the closest node directly connected to a cluster-head. Hence, the cost of moving orphan nodes between a and b to a cluster-head is given by:

\[ C_{orphan} = \sum_{i \in Q} \left(Q_{a:b}(i) + 1\right) \tag{5.24} \]

where the constant 1 represents the cost of moving the event from a node connected to a cluster-head to the cluster-head.
Finally, recalling that (n+1) is the total number of nodes in $L$, $C_{event}$ is given by:

\[ C_{event} = \begin{cases} \dfrac{2k(s+1) + s\,C_{orphan}}{n+1}, & 2(k+1) \leq d \leq n/2 \\[2mm] \dfrac{n-s}{n+1}, & 2k \leq d < 2(k+1) \\[2mm] \dfrac{n-s}{n+1}, & \text{region 2} \\[2mm] \dfrac{n-s}{n+1}, & \text{region 3} \end{cases} \tag{5.25} \]

[Figure 5.3: $C_{query}$, $C_{event}$ and $C_{total}$ vs the number of clusters for a line topology with 121 nodes and different values of k (6, 8 and 10). Panels: (a) Cost of Event, (b) Cost of Query, (c) Total Cost; x-axis: number of clusters; y-axis: cost (steps); curves: Markov and simulations, with regions 1, 2 and 3 marked.]

Figure 5.3 shows $C_{query}$, $C_{event}$ and $C_{total}$ vs the number of clusters for a line topology with 121 nodes and different values of k. The solid lines represent the cost obtained using Markov numerical analysis (equations 5.23 and 5.25), and the dotted lines represent simulation results. It can be observed that the Markov method provides an accurate representation of the cost. It must also be noted that the query cost accounts for most of the total cost, which validates the focus of the analytical section on $C_{query}$.

Figures 5.4 and 5.5 compare the maximum and expected hitting times between the Markov analysis (full lines) and the expressions obtained through the resistance method (dotted lines) for a line topology with 121 nodes and values of k ranging from 5 to 10. The dotted lines show that the bounds get tighter for higher values of k in both figures.

[Figure 5.4: Maximum hitting time for a line topology with 121 nodes and values of k ranging from 5 to 10. x-axis: number of clusters; y-axis: cost (steps); curves: Markov, resistance bounds (region 2) and analytical first local minima.]
The figures also show a line with circle markers depicting the number of clusters required to reach the first local minimum according to our analysis (Theorem 1). It is important to notice that the analytical values for the number of clusters are not necessarily integers, and hence they may not exactly match the numerical ones, especially for low values of k. However, for large values of k and n, taking the floor or ceiling of the analytical value will not incur significant differences. It is also important to notice that the first cluster-heads added lead to a significant cost reduction (region 1 and part of region 2); adding even more high-degree nodes beyond this point provides diminishing returns or, in some cases, even degrades the performance.

[Figure 5.5: $C_{query}$ (expected hitting time) for a line topology with 121 nodes and values of k ranging from 5 to 10. x-axis: number of clusters; y-axis: cost (steps); curves: Markov, resistance bounds (region 2) and analytical first local minima.]

5.4.2 Regular Grids and Random Geometric Graphs

Grids and Random Geometric Graphs (RGG) are common models used to study various properties and protocols of wireless systems. In the previous section, we showed that in line topologies the addition of cluster-heads can greatly reduce the cost of random-walk-based queries. Do cluster-heads have the same significant effect on 2-dimensional topologies?

5.4.2.1 Grids

We assume that the number of cluster-heads is a perfect square and that they are uniformly distributed on the grid, i.e. the grid is divided into the same number of cells as cluster-heads, and each cluster-head is positioned at the node at the center of its cell. According to Algorithm 1, events appearing in a cluster-head have a cost of 0, events appearing in nodes directly connected to a cluster-head have a cost of 1, and events appearing in orphan nodes perform a simple random walk until they hit a node that is directly connected to a cluster-head.
Denoting M as the transition probability matrix, $w \subseteq V$ as the subset of vertices containing all cluster-heads and all the nodes directly connected to them, and $\bar{w} \subseteq V$ as its complement, we define $M_w$ as the sub-matrix where all the rows and columns of the vertices in w have been removed. Hence, the hitting time for orphan nodes is given by:

\[ h_{\bar{w}w} = \sum_{i=0}^{|\bar{w}|} \left(Q_w(i) + 1\right) \tag{5.26} \]

And the $C_{event}$ cost for the grid topology is given by:

\[ C_{event} = \frac{\left(|w| - (s+1)\right) + h_{\bar{w}w}}{n} \tag{5.27} \]

In our analysis the sink is located at the bottom-left corner of the grid ($v_0$). The query cost $C_{query}$ is the average hitting time from the sink to the set of cluster-heads, and it is obtained by combining equations 5.1 and 5.20.

[Figure 5.6: (a) $C_{event}$, (b) $C_{query}$ and (c) $C_{total}$ for a grid topology with 169 nodes and values of k ranging from 2 to 6. x-axis: number of clusters; y-axis: cost (steps); panel (c) also marks the cases where all nodes are low-degree and where all nodes are high-degree.]

Figure 5.6 presents results for $C_{event}$, $C_{query}$ and $C_{total}$ ((a), (b) and (c), respectively) for 169 nodes deployed on a 2D grid. The y-axis represents the cost and the x-axis the number of clusters. There are two important observations: i) contrary to the line topology, the case where all nodes are high-degree performs significantly worse; ii) similar to the line topology, the first cluster-heads account for most of the savings (greater than 30%), and the higher k, the higher the savings. Also note that, as k increases, the case where all nodes are high-degree approaches a complete graph.

Another important difference with respect to the line topology is that the event cost plays a significant role in the total cost, while the query cost does not have the same significant impact.
This may be due to several reasons. One of them is that, for the same number of nodes, the diameter of a line topology ($\Theta(n)$) is significantly larger than that of a grid topology ($\Theta(\sqrt{n})$). From the perspective of the resistance method, in grids we cannot eliminate the edges beyond a given cluster; all edges must be considered, and hence, according to equation 5.2, this makes the query costs for different clusters more similar.

5.4.2.2 Random Geometric Graphs

The procedure for obtaining event and query costs in random geometric graphs is the same as for grids (equation 5.27, and a combination of equations 5.1 and 5.20). However, the interesting feature of random geometric graphs is that, even when only low-degree nodes are deployed, there are some inherent cluster-heads due to favorable geographical positions. We further enhance these inherent cluster-heads by increasing their transmission range. According to Algorithm 1, in these scenarios the event moves in a greedy way towards the local cluster-head.

Table 5.3 presents results for 169 nodes deployed randomly on a 1x1 square area. The results are the average over 50 runs. The initial radius is 0.12, which for this density gives a connectivity probability of ≈ 0.5. The table has two columns named "clustering" and "no clustering".

Table 5.3: Random Geometric Graphs

transmission range    clustering    no clustering    savings (%)
0.12                  679.7         833.0             18.4
0.18                  414.2         296.2            -39.8
0.24                  171.4         225.0             23.8
0.30                   88.0         202.7             56.6
0.36                   52.9         193.5             72.7

Due to the random deployment, some nodes end up being local clusters (their degree is higher than or equal to that of their neighbors). For the "clustering" column, we enhance these local clusters by increasing their transmission range to the value given in the "transmission range" column, while the nodes that are not local cluster-heads keep a range of 0.12. For the "no clustering" column, all nodes have the transmission range given in the "transmission range" column, but events stay in the nodes where they appear.
We observe that in random geometric graphs clustering also has a significant impact on the performance of random-walk-based queries (except for r = 0.18, where "no clustering" is better). It is important to mention that for the initial r = 0.12, approximately 11% of the nodes end up being local clusters.

5.4.3 Low-Power Wireless Graphs

Using the link layer model presented in Chapter 3, we evaluate through simulations the effectiveness of cluster-heads in realistic graphs, which are characterized by the presence of unreliable and asymmetric links. In order to guarantee the survival of the random walk, we implemented a 3-way handshake protocol. A node holding the random walk issues a request to the next neighbor to receive the random walk; upon reception of the packet, the neighbor acknowledges the reception of the random walk; finally, upon reception of the acknowledgment, the original node sends a release packet, which ends the transfer of the random walk.

Table 5.4: Grid Deployments in Realistic Environments

output power (dBm)    clustering    no clustering    savings (%)
-14                   263.1         428.7            38.6
-13                   211.7         370.4            42.8
-12                   174.6         333.4            47.6
-11                   152.6         311.4            51.0
-10                   132.5         278.1            52.3

Table 5.5: Random Deployments in Realistic Environment

output power (dBm)    clustering    no clustering    savings (%)
-14                   367.2         557.7            34.2
-13                   250.4         432.9            42.2
-12                   243.8         380.8            36.0
-11                   207.9         332.9            37.6
-10                   169.9         294.4            42.3

Tables 5.4 and 5.5 present the results for grid and random deployments. The presentation is similar to Table 5.3: the "clustering" column represents networks where only the inherent local cluster-heads are enhanced by increasing their output power, while in the "no clustering" column all nodes increase their output power but events remain in the nodes where they appear. We can observe that clustering plays a significant role in reducing the cost of random walk-based queries (between 30% and 50%).
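The savings column in Tables 5.4 and 5.5 is consistent with the relative reduction (no clustering − clustering) / no clustering. The short Python sketch below recomputes it from the tabulated costs; this definition is inferred from the numbers rather than stated in the text, and the helper name `savings_percent` is hypothetical.

```python
def savings_percent(clustering, no_clustering):
    """Relative cost reduction due to clustering, in percent.

    Assumed definition: (no_clustering - clustering) / no_clustering.
    """
    return 100.0 * (no_clustering - clustering) / no_clustering


# Rows: (output power in dBm, clustering cost, no-clustering cost, reported savings %)
TABLE_5_4 = [
    (-14, 263.1, 428.7, 38.6),
    (-13, 211.7, 370.4, 42.8),
    (-12, 174.6, 333.4, 47.6),
    (-11, 152.6, 311.4, 51.0),
    (-10, 132.5, 278.1, 52.3),
]
TABLE_5_5 = [
    (-14, 367.2, 557.7, 34.2),
    (-13, 250.4, 432.9, 42.2),
    (-12, 243.8, 380.8, 36.0),
    (-11, 207.9, 332.9, 37.6),
    (-10, 169.9, 294.4, 42.3),
]

for name, rows in (("Table 5.4", TABLE_5_4), ("Table 5.5", TABLE_5_5)):
    for power, cost_cl, cost_no, reported in rows:
        computed = savings_percent(cost_cl, cost_no)
        print(f"{name} {power} dBm: computed {computed:.2f}%, reported {reported}%")
```

Under this assumed definition, every recomputed value agrees with the reported one to within 0.1 percentage points.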
On grid deployments approximately 12% of the nodes are inherent cluster-heads, while on random deployments about 8% of the nodes are cluster-heads.

5.5 Experiments with Motes

In this section we present some empirical results. Thirty-one (31) micaZ motes were deployed in a chain topology, where nodes were spaced every 1 meter. The high-degree nodes had a higher output power than the low-degree nodes, and the low-degree nodes increased their output power only to reply to the high-degree nodes. Each mote sent 100 test packets to measure the PRR of the links. Communication graphs were obtained for a chain topology where all the nodes had the same low output power, for a chain where all nodes had the same high output power, and for a mixture of both. In the graphs where both types of nodes are present, the high-degree nodes were evenly spaced on the chain. We simulated 300 random walk-based queries on each of these graphs. The event followed Algorithm 1, and the query started at the left-most part of the chain (position 0) and followed a random walk until it hit the event. The results are presented in Figure 5.7.

[Figure 5.7: Empirical study of random walks on degree-heterogeneous graphs. (a) presents the degree of the nodes (degree vs node position in m, for c = 0, 1, 5 and 31 clusters); (b) presents the query cost (relative cost vs number of clusters) for graphs with different numbers of clusters.]

Figure 5.7 (a) presents the nodes' degree. Four curves are presented: for the low-power graph, for the graphs with 1 and 5 clusters, and for the high-power graph. Small horizontal arrows depict the position of the high-degree nodes. We observe that the average degree of the high-degree nodes is around 15 (k = 7). Figure 5.7 (b) presents the average cost of finding the event. The x-axis represents the number of high-degree nodes (cluster-heads) and the y-axis the cost.
0(S) represents the case where the events in the low-power graph (0 cluster-heads) are static and remain in the nodes where they appear. 31(S) represents static events for the high-power graph (all nodes are cluster-heads). 0 and 31 represent the low-power and high-power graphs where events follow Algorithm 1.

It is interesting to observe that by solely using Algorithm 1, without the presence of cluster-heads, we can obtain significant gains in both the low-power graph (0(S) → 0: ~25%) and the high-power graph (31(S) → 31: ~45%). This observation was also made in the previous section by simulating the random walk-based query on a realistic graph. It is also important to highlight that the first clusters account for most of the savings; adding more clusters either leads to diminishing returns or consumes more energy. This behavior was predicted by our analysis. It is also interesting to observe that even though the degree of the low-power nodes is higher than 2, the ratio $\frac{4}{5k}$ leads to approximately 3.4 cluster-heads to hit the first local minimum, which is a decent prediction.

Finally, it is important to state that the empirical studies presented in this section are limited and are shown only as a proof of concept. Large-scale networks need to be tested in order to accurately assess the impact of the contributions of this chapter on degree-heterogeneous networks.

5.6 Summary

This chapter focused on the impact of node degree heterogeneity on random walk-based queries. As observed in the model presented in Chapter 3, degree heterogeneity is a common property of WSN communication graphs, and we proposed a simple push-pull algorithm to exploit this property and reduce the query cost of random walks.

Using connections between random walks and electrical resistance, and Markov chain analysis of the time to absorption, we provide some important conclusions.
First, our work provides interesting theoretical results for line topologies, showing that when cluster-heads have a coverage k (cover k nodes to the right and left) and are uniformly distributed, a fraction of $\frac{4}{5k}$ nodes being cluster-heads can offer a reduction in query cost of $O(1 - \frac{1}{k^2})$. Second, in 2D random and grid deployments we presented numerical and simulation results showing that with a small percentage of high-degree nodes in the network (< 10%), significant cost savings can be obtained, between 30% and 70% depending on the coverage of the high-degree nodes. Finally, we presented a limited set of empirically derived results on micaZ motes that confirm the considerable impact of degree heterogeneity on real WSN communication graphs.

Chapter 6

Conclusions

In this thesis, we have argued that the design of efficient WSN routing protocols requires a communication graph that incorporates the non-idealities present in common deployments. We have substantiated this thesis by presenting three studies. In our first study, we have analyzed the behavior of the transitional region, and we have also proposed a more realistic link layer model. Based on this link-layer model, in our second and third studies we have proposed mechanisms to improve the performance of geographic routing protocols and random walk-based queries, respectively.

6.1 Analysis of the Transitional Region

In the first study (Chapter 3), we have presented an in-depth analysis of unreliable and asymmetric links in low-power multi-hop wireless networks. The main contributions of this work are twofold. First, it quantifies the impact of the wireless environment and radio characteristics on link reliability and asymmetry. Second, it proposes a systematic way to generate models for the link layer that can be used for the efficient design of routing protocols for WSN.

We have also derived expressions for the packet reception rate as a function of distance, and for the size of the transitional region.
These expressions incorporate several radio parameters, such as modulation, encoding, output power, frame size, receiver noise floor and hardware variance, and channel parameters, such as the path loss exponent and the log-normal variance. The expressions we have derived provide some important insights into the impact of channel multi-path and hardware variance on the link behavior of wireless sensor networks.

First, the relative size of the transitional region is higher for lower path loss exponents and higher variances. Second, hardware variance induces a pseudo-log-normal variance, which increases the size of the transitional region. Third, a negative correlation between the output power and noise floor leads to nodes that are good transmitters and receivers, which helps to explain the clustering behavior observed in previous empirical studies [25, 11]. And fourth, even with a perfect-threshold radio, the transitional region still exists as long as multi-path effects exist.

Our work contributes to a better understanding of the behavior of low-power wireless links but is not exhaustive. It can be complemented with other studies to capture other important phenomena present in real scenarios; for instance, contention models from [55], temporal properties from [12] and correlations due to direction of propagation from [60] (Appendix C).

Finally, from preliminary results (Appendix D) we have found that even spread spectrum radios show transitional region effects; we therefore believe there is value in extending this work to other settings.

6.2 Impact of Lossy Links on Geographic Routing

In our second study (Chapter 4), we have studied the performance of geographic routing under the model derived in Chapter 3. In real scenarios, the distance-greedy forwarding mechanism of geographic routing leads to significant wastage of energy resources due to dropped packets.
We have shown that the optimal forwarding choice is generally a neighbor in the transitional region, which indicates that efficient geographic forwarding strategies do take advantage of the high variance in packet reception rate of this region.

The most important contribution of this work has been the derivation of locally optimal forwarding metrics to balance the distance-hop energy trade-off for both ARQ and No-ARQ scenarios. Specifically, for ARQ systems, our analysis, simulations and empirical observations have shown that the product of the packet reception rate and the distance towards the destination (PRR×d) is the optimal local metric. Our analysis has also shown that reception-based forwarding strategies are generally more efficient than distance-based strategies.

While the work in this thesis has been focused on chain topologies, we also have results [49] showing that the PRR×d metric outperforms other metrics in 2-D random deployments. It is also important to highlight that the PRR×d metric is recommended for static or low-dynamic environments, such as environmental monitoring. In highly dynamic environments the link quality can change drastically with time, and hence stable estimates of PRR might not be possible. In these scenarios, best reception is probably the best forwarding strategy.

6.3 Performance of Random Walks on Heterogeneous Networks

In our last study (Chapter 5), we have analyzed the impact of node degree heterogeneity on random walk-based queries. As shown by some empirical studies [25, 11], and further supported by our analysis of hardware variance, degree heterogeneity is a common property of WSN communication graphs (even in homogeneous networks deployed on a regular grid [25]).

The main contribution of this study has been to show that by using a simple distributed push-pull algorithm and having a small percentage of high-degree nodes in the network (< 10%), significant cost savings can be obtained, between 30% and 70% depending on the coverage of the high-degree nodes.
Our work has also provided interesting theoretical results for line topologies, showing that when cluster-heads have a coverage k (cover k nodes to the right and left) and are uniformly distributed, a fraction of $\frac{4}{5k}$ nodes being cluster-heads can offer a reduction in query cost of $\Theta(1 - \frac{1}{k^2})$.

It is also important to mention that one of the drawbacks of random walks is the significant delay that they incur. In our work, we have achieved a simultaneous reduction in energy cost and delay by minimizing the required number of steps of the random walk. However, an accurate quantification of the delay reduction should include specific characteristics of the protocol used at the MAC layer.

Bibliography

[1] L. A. Adamic, R. M. Lukose, A. R. Puniyani, and B. A. Huberman. Search in power-law networks. Physical Review E, 64:046135, 2001.
[2] M. Alanyali, V. Saligrama, and O. Savas. A random-walk model for distributed computation in energy-limited networks. In 1st Workshop on Information Theory and its Applications, 2006.
[3] D. Aldous and J. A. Fill. Reversible Markov chains and random walks.
[4] C. Avin and C. Brito. Efficient and robust query processing in dynamic environments using random walk techniques. In IPSN '04: Proceedings of the third international symposium on Information processing in sensor networks, pages 277–286, New York, NY, USA, 2004. ACM Press.
[5] Z. Bar-Yossef, R. Friedman, and G. Kliot. Rawms: random walk based lightweight membership service for wireless ad hoc network. In MobiHoc '06: Proceedings of the seventh ACM international symposium on Mobile ad hoc networking and computing, pages 238–249, New York, NY, USA, 2006. ACM Press.
[6] P. Bose, P. Morin, I. Stojmenovic, and J. Urrutia. Routing with guaranteed delivery in ad hoc wireless networks. volume 7, pages 609–616, Hingham, MA, USA, 2001. Kluwer Academic Publishers.
[7] G. Bottomley, T. Ottoson, and Y.-P. E. Wang. A generalized rake receiver for interference suppression.
IEEE Journal on Selected Areas in Communications, 18(8):1536–1545, 2000.
[8] D. Braginsky and D. Estrin. Rumor routing algorthim for sensor networks. In WSNA '02: Proceedings of the 1st ACM international workshop on Wireless sensor networks and applications, pages 22–31, New York, NY, USA, 2002. ACM Press.
[9] R. Burioni and D. Cassi. Random walks on graphs: ideas, techniques and results. Journal of Physics A: Mathematical and General, 38:R45, 2005.
[10] A. Cerpa, N. Busek, and D. Estrin. Scale: A tool for simple connectivity assessment in lossy environments. CENS Technical Report 0021, 2003.
[11] A. Cerpa, J. L. Wong, L. Kuang, M. Potkonjak, and D. Estrin. Statistical model of lossy links in wireless sensor networks. In IPSN '05: Proceedings of the 4th international symposium on Information processing in sensor networks, page 11, Piscataway, NJ, USA, 2005. IEEE Press.
[12] A. Cerpa, J. L. Wong, M. Potkonjak, and D. Estrin. Temporal properties of low power wireless links: modeling and implications on multi-hop routing. In MobiHoc '05: Proceedings of the 6th ACM international symposium on Mobile ad hoc networking and computing, pages 414–425, New York, NY, USA, 2005. ACM Press.
[13] A. K. Chandra, P. Raghavan, W. L. Ruzzo, and R. Smolensky. The electrical resistance of a graph captures its commute and cover times. In STOC '89: Proceedings of the twenty-first annual ACM symposium on Theory of computing, pages 574–586, New York, NY, USA, 1989. ACM Press.
[14] N. B. Chang and M. Liu. Optimal controlled flooding search in a large wireless network. In WIOPT '05: Proceedings of the Third International Symposium on Modeling and Optimization in Mobile, Ad Hoc, and Wireless Networks, pages 229–237, Washington, DC, USA, 2005. IEEE Computer Society.
[15] Y. Chawathe, S. Ratnasamy, L. Breslau, N. Lanham, and S. Shenker. Making gnutella-like p2p systems scalable.
In SIGCOMM '03: Proceedings of the 2003 conference on Applications, technologies, architectures, and protocols for computer communications, pages 407–418, New York, NY, USA, 2003. ACM Press.
[16] S.-H. Chen, U. Mitra, and B. Krishnamachari. Cooperative communication and routing over fading channels in wireless sensor network. In WirelessCom '05: IEEE International Conference on Wireless Networks, Communications, and Mobile Computing, 2005.
[17] Chipcon. cc1000 low power radio transceiver.
[18] C.-N. Chuah, D. Tse, J. M. Kahn, and R. A. Valenzuela. Capacity scaling in mimo wireless systems under correlated fading. volume 48, pages 637–650, 2002.
[19] E. Cohen and S. Shenker. Replication strategies in unstructured peer-to-peer networks. In SIGCOMM '02: Proceedings of the 2002 conference on Applications, technologies, architectures, and protocols for computer communications, pages 177–190, New York, NY, USA, 2002. ACM Press.
[20] D. S. J. De Couto, D. Aguayo, J. Bicket, and R. Morris. A high-throughput path metric for multi-hop wireless routing. 2003.
[21] S. Dolev, E. Schiller, and J. Welch. Random walk for self-stabilizing group communication in ad hoc networks, 2002.
[22] P. G. Doyle and J. L. Snell. Random walks and electric networks. 1984.
[23] G. G. Finn. Routing and addressing problems in large metropolitan-scale internetworks.
[24] G. J. Foschini. Layered space-time architecture for wireless communication in a fading environment when using multi-element antennas. 2006.
[25] D. Ganesan, B. Krishnamachari, A. Woo, D. Culler, D. Estrin, and S. Wicker. Complex behavior at scale: An experimental study of low-power wireless sensor networks. 2002.
[26] A. Ghosh, S. Boyd, and A. Saberi. Minimizing effective resistance of a graph. 2005.
[27] L. Girod, J. Elson, A. Cerpa, T. Stathopoulos, N. Ramanathan, and D. Estrin. Emstar: a software environment for developing and deploying wireless sensor networks. 2004.
[28] C. Gkantsidis, M. Mihail, and A. Saberi.
Random walks in peer-to-peer networks: algorithms and evaluation. Perform. Eval., 63(3):241–263, 2006.
[29] R. Govindan, E. Kohler, D. Estrin, F. Bian, K. Chintalapudi, O. Gnawali, S. Rangwala, R. Gummadi, and T. Stathopoulos. Tenet: An architecture for tiered embedded networks. 2005.
[30] P. Gupta and P. R. Kumar. Critical power for asymptotic connectivity in wireless networks. pages 547–566, 1998.
[31] P. Gupta and P. R. Kumar. The capacity of wireless networks. Information Theory, IEEE Transactions on, 46(2):388–404, 2000.
[32] M. R. Henzinger, A. Heydon, M. Mitzenmacher, and M. Najork. Measuring index quality using random walks on the web. In WWW '99: Proceeding of the eighth international conference on World Wide Web, pages 1291–1303, New York, NY, USA, 1999. Elsevier North-Holland, Inc.
[33] C. Intanagonwiwat, R. Govindan, and D. Estrin. Directed diffusion: A scalable and robust communication paradigm for sensor networks. In Proceedings of the 6th annual international conference on Mobile, pages 56–67. ACM Press, 2000.
[34] B. Karp and H. T. Kung. GPSR: greedy perimeter stateless routing for wireless networks. In MobiCom '00: Proceedings of the 6th annual international conference on Mobile computing and networking, pages 243–254. ACM Press, 2000.
[35] A. Khandani, J. Abounadi, E. Modiano, and L. Zhang. Cooperative routing in wireless networks. In Allerton Conference on Communications, Control and Computing, 2003.
[36] Y.-J. Kim, R. Govindan, B. Karp, and S. Shenker. Geographic routing made practical. In Proceedings of the USENIX Symposium on Networked Systems Design and Implementation, 2005.
[37] D. Kotz, C. Newport, and C. Elliott. The mistaken axioms of wireless-network research. July 2003.
[38] F. Kuhn, R. Wattenhofer, and A. Zollinger. Worst-case optimal and average-case efficient geometric ad-hoc routing. In MobiHoc '03: Proceedings of the 4th ACM international symposium on Mobile ad hoc networking & computing, pages 267–278, New York, NY, USA, 2003.
ACM Press.
[39] S. Lee, B. Bhattacharjee, and S. Banerjee. Efficient geographic routing in multihop wireless networks. In MobiHoc '05: Proceedings of the 6th ACM international symposium on Mobile ad hoc networking and computing, pages 230–241, New York, NY, USA, 2005. ACM Press.
[40] P. Levis, N. Lee, M. Welsh, and D. Culler. TOSSIM: accurate and scalable simulation of entire TinyOS applications. In SenSys '03: Proceedings of the 1st international conference on Embedded networked sensor systems, pages 126–137, New York, NY, USA, 2003. ACM Press.
[41] C. Li, W. Hsu, B. Krishnamachari, and A. Helmy. A local metric for geographic routing with power control in wireless networks. In IEEE Conference on Sensor and Ad Hoc Communications and Networks, 2005.
[42] H. Liu and K. Li. A decorrelating RAKE receiver for CDMA communications over frequency-selective fading channels. volume 47, pages 1036–1045, 1999.
[43] L. Lovász. Random walks on graphs: A survey. 1996.
[44] H. Nikookar and H. Hashemi. Statistical modeling of signal amplitude fading of indoor radio propagation channels. In IEEE International Conference on Universal Personal Communications, 1993.
[45] R. Poor. Sources of asymmetry in wireless mesh sensor networks. In private communication, 2004.
[46] T. Rappaport. Wireless Communications: Principles and Practice. Prentice Hall PTR, Upper Saddle River, NJ, USA, 2001.
[47] S. I. Resnick. Adventures in stochastic processes. 1992.
[48] N. Sadagopan, B. Krishnamachari, and A. Helmy. Active query forwarding in sensor networks. volume 3, pages 91–113, 2005.
[49] K. Seada, M. Zuniga, A. Helmy, and B. Krishnamachari. Energy-efficient forwarding strategies for geographic routing in lossy wireless sensor networks. In SenSys '04: Proceedings of the 2nd international conference on Embedded networked sensor systems, pages 108–121, New York, NY, USA, 2004. ACM Press.
[50] S. Y. Seidel and T. S. Rappaport.
914 MHz path loss prediction model for indoor wireless communication in multi-floored buildings. 1992.
[51] S. D. Servetto and G. Barrenechea. Constrained random walks on random graphs: routing algorithms for large scale wireless sensor networks. In WSNA '02: Proceedings of the 1st ACM international workshop on Wireless sensor networks and applications, pages 12–21, New York, NY, USA, 2002. ACM Press.
[52] S. Shakkottai. Asymptotics of query strategies over a sensor network. In INFOCOM, 2004.
[53] H. Sivaraj and G. Gopalakrishnan. Random walk based heuristic algorithms for distributed memory model checking. volume 89, 2003.
[54] K. Sohrabi, B. Manriquez, and G. J. Pottie. Near ground wideband channel measurement in 800-1000 MHz. volume 1, pages 571–574 vol.1, 1999.
[55] D. Son, B. Krishnamachari, and J. Heidemann. Experimental study of concurrent transmission in wireless sensor networks. 2006.
[56] R. Tian, Y. Xiong, Q. Zhang, B. Li, B. Y. Zhao, and X. Li. Hybrid overlay structure based on random walks. In IPTPS, pages 152–162, 2005.
[57] A. Woo, T. Tong, and D. Culler. Taming the underlying challenges of reliable multihop routing in sensor networks. 2003.
[58] H. Zhang, A. Arora, and P. Sinha. Learn on the fly: Quiescent routing in sensor network backbones. In OSU-Technical Report OSU-CISRC-7/05-TR48, 2005.
[59] J. Zhao and R. Govindan. Understanding packet delivery performance in dense wireless sensor networks. In SenSys '03: Proceedings of the 1st international conference on Embedded networked sensor systems, pages 1–13, New York, NY, USA, 2003. ACM Press.
[60] G. Zhou, T. He, S. Krishnamurthy, and J. A. Stankovic. Models and solutions for radio irregularity in wireless sensor networks. volume 2, pages 221–262, New York, NY, USA, 2006. ACM Press.
[61] M. Zuniga and B. Krishnamachari. Analyzing the transitional region in low power wireless links. In SECON '04: First IEEE International Conference on Sensor and Ad hoc Communications and Networks, 2004.
[62] M. Zuniga and B.
Krishnamachari. An analysis of unreliability and asymmetry in low-power wireless links. under submission, 2006.

Appendix A
Models from Communication Theory

This section gives an overview of the log-normal path loss model, encoding and modulation.

A.1 Log-Normal Path Loss Model

When an electromagnetic signal propagates, it may be diffracted, reflected and scattered. These effects have two important consequences on the signal strength. First, the mean signal strength decays with distance according to a power law governed by the path loss exponent. Second, for a given distance $d$, the signal strength is random and log-normally distributed about the mean distance-dependent value.

Due to the unique characteristics of each environment, most radio propagation models use a combination of analytical and empirical methods. One of the most common radio propagation models is the log-normal path loss model [46]. This model can be used for large and small coverage systems [50]. Furthermore, empirical studies have shown that the log-normal path loss model provides more accurate multi-path channel models than Nakagami and Rayleigh for indoor environments [44].

According to this model the received power ($P_r$) in dB is given by:

$$P_r(d) = P_t - PL(d_0) - 10\,\eta \log_{10}\!\left(\frac{d}{d_0}\right) + N(0, \sigma) \tag{A.1}$$

where $P_t$ is the output power, $\eta$ is the path loss exponent, which captures the rate at which the signal decays with respect to distance, $N(0, \sigma)$ is a zero-mean Gaussian random variable whose standard deviation $\sigma$ is due to multi-path effects, and $PL(d_0)$ is the power decay for the reference distance $d_0$.

A.2 Encoding and Modulation

Usually, the radio does not send binary data directly ('0s' and '1s'), but encodes them into bauds. This process is called encoding, and one baud can carry less than, more than, or exactly 1 bit. After encoding, the radio uses a modulation mechanism to send these bauds over the wireless channel. The radio has several options: it can modulate the amplitude, frequency or phase of the carrier. Both the encoding and the modulation used play an important role in the link behavior of WSN.
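As an illustration of the log-normal path loss model of Section A.1, the following minimal sketch draws samples of equation A.1. The function name `received_power_db` and all numeric parameter values are hypothetical choices made for this example; they are not measurements reported in this appendix.

```python
import math
import random


def received_power_db(d, pt_db, pl_d0_db, eta, sigma, d0=1.0, rng=random):
    """Draw one sample of equation A.1 for a receiver at distance d (same unit as d0)."""
    # Deterministic part: output power minus the mean path loss at distance d.
    mean_db = pt_db - pl_d0_db - 10.0 * eta * math.log10(d / d0)
    # N(0, sigma): zero-mean Gaussian shadowing term, expressed in dB.
    return mean_db + rng.gauss(0.0, sigma)


# Illustrative values only (assumptions, not data from the thesis).
rng = random.Random(7)
samples = [
    received_power_db(10.0, pt_db=0.0, pl_d0_db=55.0, eta=3.0, sigma=4.0, rng=rng)
    for _ in range(2000)
]
# The sample mean should sit close to the mean value 0 - 55 - 30 = -85 dB.
sample_mean = sum(samples) / len(samples)
```

Averaging many samples recovers the deterministic part of the model, while individual samples scatter around it with standard deviation $\sigma$, which is the behavior the two consequences listed in Section A.1 describe.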
We now provide an example of how to obtain the packet reception rate ($\Psi$) for radios using Non-Coherent FSK (NCFSK) modulation and Non Return to Zero (NRZ) encoding (where 1 bit = 1 baud). In the presence of additive white Gaussian noise (AWGN) the probability of bit error $P_e$ of the receiver is given by:

$$P_e = \frac{1}{2} \exp\!\left(-\frac{\gamma}{2}\right) \tag{A.2}$$

where $\gamma$ is the signal to noise ratio (SNR). A frame is received correctly if all its bits are received correctly; hence, for a frame¹ of length $f$ (in bytes) the probability of successfully receiving a packet is:

$$\Psi = (1 - P_e)^{8f} \tag{A.3}$$

Finally, by inserting equation A.2 in equation A.3, the PRR $\Psi$ is defined as:

$$\Psi = \left(1 - \frac{1}{2} \exp\!\left(-\frac{\gamma}{2}\right)\right)^{8f} \tag{A.4}$$

Table 3.6 provides expressions for various encoding and modulation techniques.

A.3 Noise Floor

Another important element that determines the link behavior is the noise floor. The temperature of the environment influences the thermal noise generated by the radio components (noise figure)². When the receiver and the antenna have the same ambient temperature the noise floor is given by [46]:

$$P_n = (F + 1)\,k\,T_0\,B \tag{A.5}$$

where $F$ is the noise figure, $k$ is Boltzmann's constant, $T_0$ is the ambient temperature and $B$ is the equivalent bandwidth.

¹ A frame consists of: preamble, network payload (packet) and CRC.
² Interfering signals can further influence the noise floor, but they are out of the scope of this thesis.

Appendix B
Random Walks

In this section we present the mathematical tools used to derive hitting times.

B.1 Resistance Method

For a graph $G(V, E)$ where $|V|$ is the number of nodes and $|E|$ is the number of edges, the following notation will be used. An element of $V$ or $E$ is represented by a lowercase letter, a subset or array by a bold lowercase letter, and the complement of a set or element by an overbar; for example, $e$ represents an element, $\mathbf{e}$ an array (subset), and $\bar{e}$ and $\bar{\mathbf{e}}$ represent the complements of $e$ and $\mathbf{e}$, respectively.
The hitting time $h_{uv}$ is the expected time taken by a simple random walk starting at $u$ to reach $v$ for the first time [43]. For a source $s \in V$ and the subset of cluster-head nodes $\mathbf{e} \subseteq V$, the average hitting time from $s$ to $\mathbf{e}$ is:

$$h_{s\mathbf{e}} = \frac{\sum_{e \in \mathbf{e}} h_{se}}{|\mathbf{e}|} \tag{B.1}$$

The commute time $C_{uv}$ is defined as the expected time taken by a random walk starting at $u$ to reach $v$ and come back to $u$. Note that by definition $C_{uv} = h_{uv} + h_{vu}$, but in general $h_{uv} \neq h_{vu}$. In a seminal work, Doyle and Snell [22] explored the connection between a random walk on a graph $G$ and the resistance of an electrical network obtained from $G$ by viewing each edge as a unit resistor. In [13], Chandra et al. extended this work and proved the following equality that relates the commute time $C_{uv}$ and the effective resistance $R_{uv}$ of the electrical network of $G$:

$$C_{uv} = 2\,m\,R_{uv} \tag{B.2}$$

where $m$ is the number of edges in the graph. Notice that in case of symmetry¹ $h_{uv} = h_{vu}$, which implies that the commute time is two times the hitting time. We use this property in the analysis of the hitting time for line topologies.

¹ The graph $G'$ where we name $u$ as $v$ and vice versa is isomorphic to $G$.

B.2 Time to Absorption in Markov Chains

In this subsection, we briefly describe a key result in the analysis of Markov chains with absorbing states. Quite often it is of interest to find the expected time it takes for a chain starting at a transient state $i$ to reach the absorption state. This time is called the time to absorption from state $i$. Given a graph $G(V, E)$, a random walk on the graph can be defined by a Markov chain $M$, where $M$ represents the transition probability matrix. The notation used for elements and subsets of $V$ and $E$ is the same as the one used in the previous subsection.

Let us consider that the event is in node $e \in V$ and the sink is in node $s \in V$. As presented in [47], absorption states can be used to obtain hitting times.
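The two tools of this appendix can be cross-checked on a small line topology. The sketch below, written under assumptions of our own, treats the target node as absorbing and solves the first-step equations $h_u = 1 + \sum_v M_{uv} h_v$ for every other node, then compares the end-to-end result with the value implied by equation B.2. The helper names (`line_graph`, `transition_matrix`, `hitting_times_to`) and the choice of six nodes are hypothetical, and exact rational arithmetic is used only to keep the comparison free of rounding error.

```python
from fractions import Fraction


def line_graph(n):
    """Adjacency lists of a line (path) topology with nodes 0 .. n-1."""
    return [[v for v in (u - 1, u + 1) if 0 <= v < n] for u in range(n)]


def transition_matrix(adj):
    """Simple random walk: from u, move to each neighbour with probability 1/deg(u)."""
    n = len(adj)
    M = [[Fraction(0)] * n for _ in range(n)]
    for u, neighbours in enumerate(adj):
        for v in neighbours:
            M[u][v] = Fraction(1, len(neighbours))
    return M


def hitting_times_to(adj, target):
    """Expected number of steps to first reach `target` from every other node."""
    M = transition_matrix(adj)
    keep = [u for u in range(len(adj)) if u != target]
    k = len(keep)
    # Augmented system [I - M_e | 1], with the target's row and column removed.
    A = [
        [(Fraction(1) if i == j else Fraction(0)) - M[keep[i]][keep[j]] for j in range(k)]
        + [Fraction(1)]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with exact fractions.
    for col in range(k):
        pivot = next(r for r in range(col, k) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        inv = 1 / A[col][col]
        A[col] = [x * inv for x in A[col]]
        for r in range(k):
            if r != col and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[col])]
    return {keep[i]: A[i][k] for i in range(k)}


n = 6                       # hypothetical size of the line topology
adj = line_graph(n)
m = n - 1                   # number of edges
r_ends = n - 1              # n-1 unit resistors in series between the two ends
commute_time = 2 * m * r_ends             # equation B.2
hitting = hitting_times_to(adj, target=n - 1)
end_to_end = hitting[0]                   # hitting time from one end to the other
```

Because the two ends of a line are symmetric, the end-to-end hitting time should equal half the commute time, which is what the absorbing-state computation returns for this small example.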
Let $M_e$ be the matrix resulting from deleting the row and column corresponding to $e$ in $M$, and let $Q_e$ be:

$$Q_e = (I - M_e)^{-1}\,\mathbf{1} \tag{B.3}$$

where $I$ is the identity matrix and $\mathbf{1}$ is a column vector of ones. $Q_e$ is an array representing the expected hitting time from each node to $e$ (except $e$ itself). Hence, in our case $h_{se} = Q_e(s)$, where $Q_e(s)$ represents the $s$'th element of the array.

Appendix C
The Radio Irregularity Model

This appendix provides some steps to include non-isotropic properties of the transmission coverage, and the impact of obstacles.

In the RIM model [60], the authors present the degree of irregularity (DOI) coefficient as a means to capture the variation per unit degree change in the direction of radio propagation. In that work the received power is given by:

$$\begin{aligned} P_r(d) &= P_t - \text{DOIAdjustedPathLoss} + N(0, \sigma) \\ &= P_t - \text{PathLoss} \times K_i + N(0, \sigma) \end{aligned} \tag{C.1}$$

where $K_i$ is a coefficient that represents the difference in path loss in different directions; the method to obtain it is presented in the RIM model. Hence, denoting $PL(d) = PL(d_0) + 10\,\eta \log_{10}\!\left(\frac{d}{d_0}\right)$, equation 3.1 can be modified to include non-isotropic effects:

$$P_r(d) = P_t - PL(d) \times K_i + N(0, \sigma) \tag{C.2}$$

The effect of obstacles can be included by inserting a new variable in the previous equation. Let us denote by $O_{vw}$ the path loss in dB due to an obstacle between nodes $v$ and $w$, for example a wall. Then, letting $v$ be the transmitter, the received power at $w$ is given by:

$$P_{rw}(d) = P_{tv} - \left(PL(d) + O_{vw}\right) \times K_i + N(0, \sigma) \tag{C.3}$$

Hence, if the layout of the environment is provided, the previous equation can be used to include additional path loss for each pair of nodes according to the obstacles between them.

Appendix D
The Transitional Region in MicaZ Motes

Some preliminary empirical evaluations were done with micaZ devices. These motes have a 2.4 GHz IEEE 802.15.4/ZigBee™ RF transceiver, which uses a DSSS modem with 2 Mchips/s and a 250 kbps effective data rate. A chain topology with the same methodology as Chapter 3.4.2 was deployed in the same indoor environment as the mica2 motes.
Figure D.1 presents empirical measurements for the channel, radio and links for mica2 and micaZ. The nominal output power for both types of motes was -10 dBm. We observe that the transitional region still has a significant extent. However, for the same output power micaZ radios seem to have larger connected and transitional regions.

No major differences were found in the shadowing standard deviation for both deployments (around 6.1 for both). However, the path loss exponent for micaZ measurements is 1.94, which is smaller than the corresponding value for mica2 in Table 3.2 ($\eta$ = 3.3). According to equation 3.8 a smaller $\eta$ increases the size of both regions, which provides some intuition as to why the extent of the regions is larger for micaZ motes for the same output power.

[Figure D.1, six panels: (a), (d) RSSI (dBm) versus distance (m); (b), (e) PRR versus RSSI (dBm); (c), (f) PRR versus distance (m).]

Figure D.1: Comparison of empirical measurements for channel, radio and link between mica2 and micaZ motes, $P_t$ = -10 dBm for both types of motes: (a) channel mica2, (b) radio mica2, (c) link mica2; (d), (e), (f) are their micaZ counterparts.

The spread spectrum techniques seem to partially combat multi-path by decreasing $\eta$ and consequently providing a larger coverage for the same output power. However, as stated in equation 3.9, a lower $\eta$ implies a larger transitional region, which increases the number of unreliable and asymmetric links. An in-depth study of the impact of low-cost spread spectrum radios in the transitional region is part of our future work.
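To make the role of $\eta$ concrete, the following rough sketch chains the mean of equation A.1, the noise floor of equation A.5 and the NCFSK/NRZ packet reception rate of equation A.4, and evaluates the result at one distance for the two path loss exponents quoted above. Everything numeric other than the two exponents and the -10 dBm output power is an assumption chosen for illustration (reference path loss, noise figure, temperature, bandwidth, frame length, distance), and the function names are hypothetical. The NCFSK expression is used for both exponents only to isolate the effect of $\eta$; it is not a model of the micaZ radio, which uses a different modulation.

```python
import math

BOLTZMANN = 1.380649e-23  # Boltzmann's constant in J/K


def noise_floor_dbm(noise_figure_db, temperature_k, bandwidth_hz):
    """Equation A.5, P_n = (F + 1) k T_0 B, converted from watts to dBm."""
    f_linear = 10 ** (noise_figure_db / 10)
    p_watts = (f_linear + 1) * BOLTZMANN * temperature_k * bandwidth_hz
    return 10 * math.log10(p_watts / 1e-3)


def prr_ncfsk_nrz(snr_db, frame_bytes):
    """Equation A.4: packet reception rate for NCFSK modulation with NRZ encoding."""
    gamma = 10 ** (snr_db / 10)              # SNR as a linear ratio
    p_bit_error = 0.5 * math.exp(-gamma / 2)  # equation A.2
    return (1 - p_bit_error) ** (8 * frame_bytes)


def mean_rx_power_dbm(d, pt_dbm, pl_d0_db, eta, d0=1.0):
    """Mean of equation A.1, i.e. with the shadowing term N(0, sigma) set to zero."""
    return pt_dbm - pl_d0_db - 10 * eta * math.log10(d / d0)


# Assumed, illustrative settings (not taken from this appendix).
noise = noise_floor_dbm(noise_figure_db=13.0, temperature_k=300.0, bandwidth_hz=30e3)
distance_m = 20.0
prr_by_eta = {}
for eta in (3.3, 1.94):  # path loss exponents quoted in Appendix D
    snr_db = mean_rx_power_dbm(distance_m, pt_dbm=-10.0, pl_d0_db=55.0, eta=eta) - noise
    prr_by_eta[eta] = prr_ncfsk_nrz(snr_db, frame_bytes=100)
```

Under these assumed settings the mean link at 20 m is essentially disconnected for the larger exponent and essentially perfect for the smaller one, which is consistent with the intuition that a smaller $\eta$ extends coverage at a fixed output power. The sketch ignores shadowing, so it says nothing about the width of the transitional region itself.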
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
Robust routing and energy management in wireless sensor networks
Transport layer rate control protocols for wireless sensor networks: from theory to practice
Models and algorithms for energy efficient wireless sensor networks
Gradient-based active query routing in wireless sensor networks
Efficient and accurate in-network processing for monitoring applications in wireless sensor networks
A protocol framework for attacker traceback in wireless multi-hop networks
Distributed wavelet compression algorithms for wireless sensor networks
On location support and one-hop data collection in wireless sensor networks
Techniques for efficient information transfer in sensor networks
Towards interference-aware protocol design in low-power wireless networks
Robust and efficient geographic routing for wireless networks
Multichannel data collection for throughput maximization in wireless sensor networks
Language abstractions and program analysis techniques to build reliable, efficient, and robust networked systems
Rate adaptation in networks of wireless sensors
Dynamic routing and rate control in stochastic network optimization: from theory to practice
Reliable and power efficient protocols for space communication and wireless ad-hoc networks
Reconfiguration in sensor networks
Design of cost-efficient multi-sensor collaboration in wireless sensor networks
Cooperation in wireless networks with selfish users
Distributed edge and contour line detection for environmental monitoring with wireless sensor networks
Asset Metadata
Creator: Zúñiga Zamalloa, Marco Antonio (author), Zuniga, Marco (author)
Core Title: Realistic modeling of wireless communication graphs for the design of efficient sensor network routing protocols
School: Viterbi School of Engineering
Degree: Doctor of Philosophy
Degree Program: Electrical Engineering
Publication Date: 11/30/2006
Defense Date: 10/05/2006
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: geographical routing, link modeling, OAI-PMH Harvest, random walks, routing protocols, wireless sensor networks
Language: English
Advisor: Krishnamachari, Bhaskar (committee chair), Govindan, Ramesh (committee member), Mitra, Urbashi (committee member)
Creator Email: mzunigaz@gmail.com
Permanent Link (DOI): https://doi.org/10.25549/usctheses-m200
Unique identifier: UC1446187
Identifier: etd-Zuniga-20061130 (filename), usctheses-m40 (legacy collection record id), usctheses-c127-149231 (legacy record id), usctheses-m200 (legacy record id)
Legacy Identifier: etd-Zuniga-20061130.pdf
Dmrecord: 149231
Document Type: Dissertation
Rights: Zuniga, Marco; Zúñiga Zamalloa, Marco Antonio
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Repository Name: Libraries, University of Southern California
Repository Location: Los Angeles, California
Repository Email: cisadmin@lib.usc.edu