On location support and one-hop data collection in wireless sensor networks
(USC Thesis Other)
ON LOCATION SUPPORT AND ONE-HOP DATA COLLECTION IN WIRELESS SENSOR NETWORKS

by

Kiran Kumar Yedavalli

A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the Requirements for the Degree
DOCTOR OF PHILOSOPHY
(Electrical Engineering)

May 2007

Copyright 2007 Kiran Kumar Yedavalli

Epigraph

phi·los·o·phy n. pl. phi·los·o·phies
1. Love and pursuit of wisdom by intellectual means and moral self-discipline.
2. Investigation of the nature, causes, or principles of reality, knowledge, or values, based on logical reasoning rather than empirical methods.
3. A system of thought based on or involving such inquiry.
4. The critical analysis of fundamental assumptions or beliefs.
5. The disciplines presented in university curriculums of science and the liberal arts, except medicine, law, and theology.
6. The discipline comprising logic, ethics, aesthetics, metaphysics, and epistemology.
7. A set of ideas or beliefs relating to a particular field or activity; an underlying theory.
8. A system of values by which one lives.

the·sis n. pl. the·ses (-sēz)
1. A proposition that is maintained by argument.
2. A dissertation advancing an original point of view as a result of research, especially as a requirement for an academic degree.

Dedication

To my Parents, Akka, and my wife Divya, for their love and support.

Acknowledgements

It has been my distinct honor and pleasure to work with Prof. Bhaskar Krishnamachari during my PhD studies at the University of Southern California. I am deeply thankful to him for being my advisor and for extending his much needed support and encouragement all through my stay at USC. He has been a great inspiration during difficult times and a true guide, not just in academic matters, but in matters of life in general. Through his compassion and enthusiasm he has been a true role model to be followed.
I have learned much from him about being a good human being, which I believe is going to stay with me all through my life.

It is a special honor to me that Prof. Ramesh Govindan and Prof. Antonio Ortega have agreed to be on my thesis dissertation committee and I am very thankful to them for this.

I am eternally indebted to my parents and my sister for their love and support without which I wouldn't have been what I am today. I am very fortunate to be able to share my happiness with my loving wife Divya who has been very patient and supportive of all my endeavors.

I am greatly thankful to my friends for being there for me when I needed them and for making my stay at USC a bouquet of many happy moments and rich experiences. Through my friends and colleagues in the Autonomous Networks Research Group (ANRG) I have had the opportunity to work with some remarkable and truly outstanding individuals and learn much from them. I am grateful to my friends outside of the group - Satya, Harish, Suraj, and others - for being there for me during moments of the greatest need.

I would like to thank the Electrical Engineering department for making my stay at USC very comfortable through their support and services.

Table of Contents

Epigraph
Dedication
Acknowledgements
List Of Tables
List Of Figures
Abstract
1 Introduction
1.1 Wireless Sensor Networks
1.2 Location Support
1.3 Medium Access for One-Hop Data Collection
1.4 Research Contributions
1.5 Thesis Organization
Part I Location Support for Mobile Nodes
2 Background on RF Localization
2.1 Introduction
2.2 Finger-Printing Techniques
2.3 Non-Finger-Printing Techniques
2.4 Chapter Summary
3 Accurate RF Localization
3.1 Introduction
3.2 Localization Technique I: Ecolocation
3.2.1 Ideal Scenario
3.2.2 Real World Scenario
3.2.3 Location Determination
3.2.4 Examples
3.3 Localization Technique II: Sequence-Based Localization
3.3.1 Location Sequences
3.3.2 Localization Procedure
3.3.3 Maximum Number of Location Sequences
3.3.4 Location Sequence Table Construction
3.3.5 Unknown Node Location Sequence
3.3.6 Feasible and Infeasible Sequences
3.3.7 Distance Metrics
3.3.8 Location Determination
3.3.9 Localization Scenarios
3.3.10 Examples
3.4 SBL Vs. Ecolocation
3.5 Evaluation
3.5.1 Location Error Characteristics
3.5.2 Simulation Model
3.5.3 Simulation Parameters
3.5.4 Simulation Procedure
3.5.5 Simulation Results: Sequence Corruption
3.5.6 Simulation Results: Performance Study
3.5.7 Simulation Results: Comparative Study
3.6 Real World Experiments
3.6.1 Outdoor Experiment: Parking lot
3.6.2 Indoor Experiment: Office building
3.6.3 Discussion
3.7 Chapter Summary
4 Fast & Fair Localization
4.1 Introduction
4.2 Motivation
4.3 Assumptions and Terminology
4.4 Definitions
4.5 Problem Formulation
4.6 Scheduling Algorithm
4.7 Metrics
4.8 Evaluation
4.8.1 Analysis
4.8.1.1 Grid Deployment
4.8.1.2 Random Deployment
4.8.2 Simulations
4.9 Discussion
4.10 Chapter Summary
Part II Medium Access for One-Hop Data Collection
5 Background on Medium Access Techniques for One-Hop Data Collection
5.1 Introduction
5.2 Application Space
5.3 Medium Access Techniques
5.3.1 Problem Description
5.3.1.1 Metrics
5.3.1.2 Energy Model
5.3.2 Slotted Aloha Medium Access
5.3.3 Carrier Sense Medium Access
5.3.3.1 IEEE 802.15.4
5.3.4 Tree/Stack Algorithms
5.3.5 Enhancements
5.3.5.1 Channel Feedback
5.3.5.2 Location Information
5.4 Chapter Summary
6 Analysis of Slotted Aloha Medium Access for One-Shot Data Collection
6.1 Introduction
6.2 Problem Description
6.3 Slotted Aloha with Binary Exponential Back-Off
6.4 Performance Evaluation
6.5 Chapter Summary
7 Location-Aware Medium Access
7.1 Introduction
7.2 Problem Description
7.3 Location-Aware MAC Protocol
7.3.1 Implementation Aspects
7.4 Evaluation
7.4.1 Simulation Model
7.4.2 Performance of Location-Aware MAC Protocol
7.4.3 Comparative Study
7.5 Discussion
7.6 Chapter Summary
8 Enhancement of IEEE 802.15.4 MAC Protocol
8.1 Introduction
8.2 IEEE 802.15.4
8.2.1 Model
8.3 p-Persistent CSMA MAC
8.3.1 Overview
8.3.2 Model
8.3.3 Optimality
8.3.4 Optimality Criteria
8.4 Characterization of IEEE 802.15.4
8.5 Enhanced IEEE 802.15.4
8.5.1 Our Approach
8.5.1.1 Enhanced p-Persistent CSMA MAC
8.5.1.2 Enhanced IEEE 802.15.4
8.5.1.3 Evaluation
8.5.1.4 Discussion
8.6 Chapter Summary
9 Thesis Summary
10 Future Directions
Reference List
Appendix A Even-Random Distribution

List Of Tables

3.1 Constraints on the unknown node for the example in Figure 3.1(a).
3.2 Constraints for the example of Table 3.1 when the ranks of third and fourth ranked reference nodes are interchanged due to multi-path effects.
3.3 Progression of number of location sequences with number of reference nodes ($n$) in the localization space. The last two columns compare the simulation and analytical results for the maximum number of location sequences. Simulation results are gathered from 1000 random trials (with 100 different random seeds) in each of which $n$ reference nodes were placed uniformly at random in a square localization space.
3.4 Typical values and ranges for different simulation parameters.
3.5 Comparison of worst-case computational complexities of SBL, LSE, Proximity and 3-Centroid.
4.1 Simulation parameters and their values.
4.2 Comparison of analytical lower and upper bounds of $M$ with simulation results for grid deployment for different values of $N$, the network size. Note that the number of reference nodes in the in-square of a cell is $n = 2m^2 + 6m + 5$, $m = (R/d - 1)$, where $R$ is the radio range and $d$ is the inter reference node distance (Proposition 4 in Section 4.8.1.1).
4.3 Comparison of analytical lower and upper bounds of $M$ with simulation results for random deployment for different values of $N$, the network size. The simulation results are averaged over 10 different random reference node network topologies.
8.1 Notation
8.2 Performance comparison of Original and Enhanced IEEE 802.15.4 MAC for CD in terms of throughput ($\Phi_{CD}(N)$) in Kbps for low density networks.

List Of Figures

2.1 Classification of localization techniques.
3.1 The distance rank order of reference nodes ($A, B, C, D$) is different for different regions ($X_1$, $X_2$) in the localization space.
3.2 Real world experimental results: Reference nodes far from the unknown node may measure higher RSS values than closer reference nodes. Note that the y-axis is reverse ordered.
3.3 Ecolocation location estimate (E) for the unknown node (P) at (1, 3) for a grid layout of 9 reference nodes (A). The number adjacent to a reference node is its corresponding rank. The location error is expressed in meters where the size of the square localization area is 12 × 12 sq. meters. (a) Sequence: 123456789 (no erroneous constraints) [Estimate: (1.25, 3.3); Error: 0.34 meters] (b) Sequence: 123568497 (13.9% erroneous constraints) [Estimate: (1.25, 1.95); Error: 1.07 meters] (c) Sequence: 125379486 (22.2% erroneous constraints) [Estimate: (1.95, 1.25); Error: 1.98 meters] (d) Sequence: 243976581 (44.4% erroneous constraints) [Estimate: (1.95, 1.25); Error: 1.98 meters].
3.4 (a) The perpendicular bisector of the line joining two reference nodes divides the localization space into three distinct regions. (b) Illustration of arrangement of 6 bisector lines for 4 reference nodes placed uniformly randomly in a square localization space.
3.5 (a) Examples of location sequences for a four reference node topology. (b) All feasible location sequences for the topology of (a).
3.6 Addition of fourth reference node D adds 3 new bisector lines to the localization space. (a) The first of the 3 new bisector lines, line 1, the perpendicular bisector of CD, creates 3 new vertices (equal to the number of pre-existing lines in the localization space), 4 new faces and 7 new edges at most. (b) The second line, line 2, the perpendicular bisector of BD, has to pass through the intersection point of the bisectors of CD and BC because {BD, CD, BC} form a triangle and the perpendicular bisectors of the three sides of a triangle intersect at a single point. Therefore line 2 creates 2 new vertices, 4 new faces and 6 new edges at most. (c) Similarly, line 3, the perpendicular bisector of AD, has to pass through the intersection points of perpendicular bisectors of AB, BD and AC, CD as {AD, AB, BD} and {AD, AC, CD} are two triangles with a common side AD. Therefore, line 3 creates 1 new vertex, 4 new faces and 5 new edges at most.
3.7 RF channel non-idealities could corrupt a location sequence from the feasible space either to another sequence in the feasible space or to a sequence in the infeasible space.
3.8 Robustness examples: Location estimate (E) for the unknown node (P) at (1, 3) for a grid layout of 9 reference nodes. The number adjacent to a reference node is its corresponding rank. The location error is expressed in meters where the side length of the square localization area is 12 meters. (a) ($T = 1$, $\tau = 1$), Estimate (E): (1.33, 1.33), Location Error: 0.46 meters (b) ($T = 0.722$, $\tau = 0.783$), Estimate (E): (2.0, 2.0), Location Error: 1.4 meters (c) ($T = 0.556$, $\tau = 0.667$), Estimate (E): (2.0, 2.0), Location Error: 1.4 meters (d) ($T = 0.111$, $\tau = 0.278$), Estimate (E): (2.0, 1.33), Location Error: 1.94 meters.
3.9 Overlap of Ecolocation scanning grid and regions created by arrangement of bisector lines in SBL.
3.10 Simulation results averaged over 1000 random trials (with 100 different random seeds) in each of which $n$ reference nodes were placed uniformly at random in a 2D square localization area of $S \times S$ sq. meters. (a) The average maximum, average and average minimum face areas as a function of the number of reference nodes. (b) The average maximum, average and average minimum edge lengths as a function of the number of reference nodes. K1, K2 and K3, K4 are scaling constants.
3.11 Average location error as measured using Spearman's correlation and Kendall's Tau as a function of the number of reference nodes.
3.12 Sequence corruption: Cumulative distribution function of Kendall's Tau $T$ between the RSS location sequence and true location sequence for varying (a) standard deviation ($\sigma$) (b) path loss exponent ($\eta$) (c) number of reference nodes ($n$).
3.13 Performance: (a) Average location error as a function of RF channel parameters - standard deviation ($\sigma$) and path loss exponent ($\eta$). (b) Average location error as a function of node deployment parameters - number of reference nodes ($n$) and reference node density ($\beta$). (c) Average location error as a function of the location of the unknown node.
3.14 (a) Average location error as a function of the sequence corruption ($T$) and as a function of the distance ($\tau$) between the corrupted sequence and its nearest feasible sequence in the location sequence table. (b) Correlation between $\tau$ and $T$.
3.15 Comparison: Average location error due to SBL, LSE, Proximity and 3-Centroid as a function of standard deviation of RSS log-normal distribution $\sigma$ for different values of path loss exponent $\eta$. (a) $\eta = 2, n = 10$ (b) $\eta = 4, n = 10$ (c) $\eta = 6, n = 10$ and for different values of number of reference nodes $n$. (a) $n = 4, \eta = 4$ (b) $n = 7, \eta = 4$ (c) $n = 10, \eta = 4$.
3.16 Outdoor experiment: 11 MICA 2 motes, placed randomly in a 144 sq. meters area, were used as reference nodes as well as unknown nodes. Consequently, each unknown node had 10 reference nodes. (a) Path loss exponent calculation, $\eta = 2.9$. (b) Comparison between true locations and SBL location estimates. (c) Location error due to SBL, LSE, Proximity and 3-Centroid (the nodes are ordered in increasing error of SBL). (d) Corruption measure $T$ and error indicator $\tau$.
3.17 Indoor experiment: 12 MICA 2 motes, placed randomly in a 120 sq. meters area, were used as reference nodes. The location of the unknown node was estimated for 5 different locations using the 12 reference nodes. (a) Path loss exponent calculation, $\eta = 2.2$. (b) Comparison between true path and SBL estimated path. (c) Location error due to SBL, LSE, Proximity and 3-Centroid (the nodes are ordered in increasing error of SBL). (d) Corruption measure $T$ and error indicator $\tau$.
4.1 Illustration of a cell, its in-square and out-square.
4.2 Example illustrating terminology.
4.3 Expression for localization delay.
4.4 Three different cases (1), (2) and (3) depending on the relative position of localization request arrival time with respect to the time slots of reference nodes in the cell.
4.5 Simulation Results: (a) Average localization delay $D_{avg}(k)$, (b) Localization fairness $F(k)$, (c) Average localizable speed $V_{avg}(k)$, (d) Minimum localizable speed $V_{min}(k)$, and (e) Maximum localizable speed $V_{max}(k)$; as a function of number of reference nodes required for localization $k$, for five reference node density values and for grid and random deployment of reference nodes. (f) $V_{avg}(k)$, $V_{max}(k)$, and $V_{min}(k)$, (g) Average localization delay $D_{avg}(k)$; as a function of reference node density $\beta$ for number of reference nodes required for localization $k = 8$, for grid and random deployments of reference nodes. (h) Localization fairness $F(k)$ as a function of reference node density $\beta$ for four different levels of location estimate accuracy ($k$) for grid and random deployments of reference nodes.
5.1 Application space spectrum for one-hop data collection in wireless sensor networks. The color transition from red to blue indicates the spectrum transition from infinite packets in the queues to single-packet in the queues of contending nodes.
5.2 Flow chart for IEEE 802.15.4 operation at a node.
6.1 Markov chain of states for a contending node using slotted Aloha with binary exponential back-off protocol.
6.2 Comparison of analysis and simulations for slotted Aloha with binary exponential back-off.
6.3 Performance of slotted Aloha with Binary Exponential Back-off as a function of number of nodes $N$ for the one-hop one-shot data gathering problem.
7.1 Example of the location-aware MAC protocol for $m = 4$. (a) The square space splitting (b) The corresponding tree.
7.2 (a) 16-split strategy, ($m = 16$), $D(n) = 31$. (b) 64-split strategy, ($m = 64$), $D(n) = 99$.
7.3 Location Token Generator for symmetrical square m-split strategy ($m$ is a power of 4).
7.4 State diagram at the sink and at the sensor node for the location-aware MAC protocol in which the location tokens are generated at the sink (pRx: packets received, L: split level, P(L): partition at split level L).
7.5 (a) Expected delay and expected energy consumption per node due to 4-split, 16-split and 64-split strategies for grid-random placement of nodes. (b) Expected delay and expected energy consumption per node due to 16-split strategy for three different location distributions.
7.6 Expected delay and expected energy consumption for HT-split, optimal p-persistent slotted CSMA and IEEE 802.15.4 standard MAC.
7.7 Comparison of location-aware (4-split, 16-split) and location-unaware (HT-split and optimal p-persistent slotted CSMA) medium access protocols as a function of $n$ for the one-hop one-shot data collection problem.
7.8 (a) 4-Angle-Split Strategy. (b) Comparison of expected delay and energy consumption, for angular splitting and square splitting strategies for uniform-random deployment of nodes.
7.9 The expected number of time slots for the first successful transmission.
8.1 IEEE 802.15.4 standard is modeled as a p-persistent CSMA with probability of transmission reducing in three steps ($p_1 = 1/4.5$, $p_2 = 1/8.5$, $p_3 = 1/16.5$) with each new collision.
8.2 An epoch illustrating the time interval between consecutive successful transmissions.
8.3 The optimal probability of transmission.
8.4 Ratio of expected idle time to expected epoch delay.
8.5 Expected delay and energy consumption in an epoch with $n$ nodes as a function of transmission probability, $p$, for different values of packet length $L$.
8.6 Ratio of expected delays and energy consumptions for consecutive epochs.
8.7 Comparison of transmission probabilities for IEEE 802.15.4 and optimal p-persistent CSMA for CD and OSD.
8.8 Flow chart for Enhanced IEEE 802.15.4 operation at a node.
8.9 Performance of Channel Feedback Enhanced IEEE 802.15.4.

Abstract

We consider two fundamental building blocks for many applications in wireless sensor networks - location support and efficient medium access for one-hop data collection.
In the first part of the thesis we identify two important problems of location support - accurate localization and fast & fair localization - and propose novel solutions. We address the problem of accurate localization by proposing two novel, light-weight RF localization techniques called Ecolocation and Sequence-Based Localization. We define constructs called location constraints and location sequences based on distance ranks of reference nodes from the location of the unknown node and use them for localization. We compare and contrast the two localization techniques and show their robustness to RF channel non-idealities through examples. Through extensive systematic simulations and a representative set of real mote experiments, we show that our light-weight RF localization techniques provide comparable or better accuracy than other state-of-the-art radio signal strength-based localization techniques over a range of wireless channel and node deployment conditions.

In addition to being accurate, the location support service should also be fast and fair. The response times of the reference nodes to localization requests from the unknown node should be minimized and multiple unknown nodes, at different locations, should not have widely varying response times. We identify this as a fast/fair localization problem and formulate it as a min-max optimization problem, show that it is related to the well-known, NP-hard, maximum broadcast frame length problem, and investigate a heuristic scheduling based solution. We study the attributes that determine the response times of the reference nodes, called the localization delay, and derive closed-form expressions for it. We then investigate the heuristic solution's performance in terms of localization delay, fairness and average and minimum localizable speeds.

In the second part of the thesis, we address the problem of medium access for one-hop data collection, which occurs frequently in many wireless sensor network applications.
We consider a wide spectrum of one-hop data collection applications with continuous data collection at one end and one-shot data collection at the other. While in the continuous data collection problem the contending wireless nodes always have a packet to transmit, in the one-shot data collection problem each contending node has a single packet to transmit. Medium access mechanisms for continuous data collection have been studied extensively in the past by numerous researchers, but such mechanisms for one-shot data collection have received much less attention. In this thesis we address the medium access problem for this spectrum of application scenarios through three di®erent pieces of work. We model and analyze the performance of slotted Aloha medium access techniques for the one-shot data collection problem. Owing to the transient nature of the network in this problem we use non-ergodic Markov chain analysis and derive °ow equations that accurately capture the temporal dynamics of the network. Using these equations we evaluate the medium access techniques' performance in terms of delay and energy consumption. We then present a novel location-aware medium access protocol for the one-shot data col- lection problem that uses the location information of contending nodes to reduce collisions and improve the overall performance. We evaluate the protocol in terms of delay and energy con- sumption and compare it with location-unaware medium access protocols using simulations. Results show that our protocol can take advantage of the location distribution of nodes to pro- vide signi¯cantly lower delay and energy consumption compared to location-unaware protocols. Finally, we model, analyze, and evaluate the performance of the IEEE 802.15.4 MAC proto- col for both ends of the one-hop data collection application spectrum. 
We ¯nd that the IEEE 802.15.4 MAC protocol performs poorly for one-hop data collection in dense sensor networks, xv showingasteepdeteriorationinboththroughputandenergyconsumptionwithincreasingnum- ber of transmitters. We propose a channel feedback-based enhancement to the protocol that is signi¯cantly more scalable, showing a relatively °at, slow-changing total system throughput and energy consumption as the network size increases. A key feature of the enhancement is that the back-o® windows are updated after successful transmissions instead of collisions. The window updates are based on an optimality criterion we derive from mathematical modeling of p-persistent CSMA. xvi Chapter 1 Introduction In this thesis we study two key problems in wireless sensor networks - location support and e±cient medium access for one-hop data collection. 1.1 Wireless Sensor Networks Wireless sensor networks are an emerging paradigm that promise to change the way humans interact with their environments [59]. The sensor nodes used in a wireless sensor network are inexpensive, intelligent devices that run on batteries and can \sense" the environment in their vicinityforphysicalmetricssuchastemperature,humidity,light-intensity,etc. Thesensornodes can be programmed to communicate among themselves to form networks that could provide many mission critical and quality-of-life enhancing applications. Given these advantages, the emerging trend indicates the deployment of these wireless sensor nodes as an integral part of essentialinfrastructuresuchashomes,o±cebuildings,roads,etc. Sincewirelesssensornetworks areenvisionedtoworkunassistedanduninterruptedformanyyears,energysavingsareofcritical importance for their operation. 1 1.2 Location Support A location support service enables its users, mobile or static, to determine their location coor- dinates, by themselves, with respect to a reference provided by the service. 
Location support services not only enable many user-experience enhancing services but also support many mission critical applications. For example, location support services in venues such as museums, zoos, and airports can be used to provide experience-enhancing navigational services to guests. On the other hand, the same location support services can prove to be life saving in emergency situations such as fires, or in providing medical assistance to residents of old-age homes, or they could be a critical part of many businesses, working as productivity enhancers on factory floors and in warehouses. In recent years, wireless sensor networks have emerged as key enablers of location support services.

The sensor nodes of location support enabling wireless sensor networks are programmed with their location information in terms of Cartesian coordinates with reference to a pre-defined coordinate system. Mobile devices that request location support services obtain the location coordinates of these nodes via the wireless medium and use them to determine their own location. Thus, mobile devices, which are unknown nodes (see footnote 1), take the help of sensor nodes, which act as reference nodes, to determine their locations using efficient localization algorithms.

In any solution to an engineering problem the cost of the solution should be an essential consideration. During our interaction with wireless sensor network engineers at Bosch Research in Palo Alto, California [3], it was concluded that the cost of deploying sensor nodes in large numbers prohibits the incorporation of special hardware for location support purposes in them. Thus wireless radios, which are used by the sensor nodes for the essential task of communication, have emerged as the key enablers for cost effective location support. We therefore focus only on radio frequency-based location support systems for wireless sensor networks. Also, the location support system should use the minimum possible energy of the reference nodes in the sensor network and at the same time it should be inexpensive to implement, deploy and operate. We identify two key problems in location support systems for wireless sensor networks, (i) accurate localization and (ii) fast and fair localization, and provide efficient algorithms to solve them that comply with the above conditions for an effective location support system.

Footnote 1: In this thesis, we use the terms "mobile devices" and "unknown nodes" interchangeably.

1.3 Medium Access for One-Hop Data Collection

The problem of one-hop data collection occurs frequently in many wireless sensor network applications such as location support, neighbor discovery, data-querying, etc. In this problem, a single data sink seeks interesting data from data sources in its radio range. For example, in location support, the unknown node (data sink) seeks the location coordinates of reference nodes (data sources) in its radio range. Similarly, in neighbor discovery a node in a network (data sink) seeks to discover its neighboring nodes (data sources). The problem of neighbor discovery occurs as a building block of many wireless sensor network functions such as multi-hop routing of data. One of the most important application areas for wireless sensor networks is sensor data collection. A mobile or static querier (data sink) seeks to pull interesting and relevant data from the sensor nodes (data sources) in the wireless sensor network that are in its one-hop radio range using specific data queries. Thus the problem of one-hop data collection opens itself to a broad spectrum of applications, ranging from continuous data collection at one end to one-shot data collection at the other. While in the continuous data collection problem the contending wireless nodes always have a packet to transmit, in the one-shot data collection problem each contending node has a single packet to transmit.
We review the applicability and performance of commonly known wireless medium access mechanisms, such as slotted Aloha multi-access schemes, carrier sense multiple access (CSMA) mechanisms and the IEEE 802.15.4 MAC protocol, to this new spectrum of one-hop data collection applications specific to wireless sensor networks through mathematical modeling and performance analysis. We also propose a novel medium access protocol that addresses the one-shot data collection end of the spectrum.

1.4 Research Contributions

In this section, we provide a high-level description of our research contributions in the areas of radio-frequency-based location support and medium access mechanisms for one-hop data collection.

1. Location Support for Mobile Nodes:

(a) Accurate RF Localization: Accurate localization is essential to provide fine-grained localization in feature-rich areas, such as office buildings and factories. This can be achieved by employing many different physical signal types such as radio signals, infra-red, ultra-sound, etc. Depending on the environment, each type of signal behaves differently. For example, infra-red signals are sensitive to the ambient light intensity and ultra-sound signal behavior depends on the level of humidity in the environment. Significantly, radio signals are the most sensitive to the wireless channel. They are subjected to wireless channel non-idealities such as multi-path, shadowing, refraction, diffraction and thermal noise. Owing to this behavior of radio signals, our choice of using only radio signals for localization poses a significant challenge to achieving the desired level of accuracy in indoor environments. Researchers have proposed many different solutions to address this challenge. However, in Chapter 2, we will show that the specific challenges posed by the condition of a cost and energy efficient location support system require a fresh approach.
In this thesis, we propose a novel geometric-constraints based approach to address the challenge of using radio signals for accurate localization. We present two light-weight localization techniques that provide accurate localization using only radio signals and at the same time are very inexpensive to implement, deploy and operate, using the minimum possible energy resources.

(b) Fast & Fair Localization: The response time of the reference nodes to the unknown node's localization request places limits on how fast it can move while obtaining accurate location estimates. The unknown node should be able to communicate with all the required number of reference nodes for localization before it moves to its next location. Conversely, the speed of movement of the unknown node determines the frequency of localization requests, and the response time of the reference nodes should be able to match this frequency. In either case, the response time of the reference nodes to localization requests should be minimized in order to ensure fast localization. In addition to being fast, the response time of the reference nodes should not change drastically with the location of the unknown node in the localization area, i.e., the response-time-limited speed of movement of the unknown node should not be significantly lower or higher at some locations compared to others. Alternatively, if multiple unknown nodes at different locations request location support simultaneously, the response time for all requests should be similar. In order to achieve this, the variation in the reference node response time over all locations of the unknown node should be minimized. To the best of our knowledge, no one has looked at this problem till now. In this thesis, we formally define the fast/fair problem and propose solutions.

2. Medium Access for One-Hop Data Collection:

(a) Modeling and Analysis of Slotted Aloha Multi-Access Techniques for One-Shot Data Collection: The application of slotted Aloha multi-access techniques to the continuous data collection end of the one-hop data collection application spectrum has been a very well studied area for over a quarter of a century. However, the one-shot data collection problem has gained importance only due to the emergence of wireless sensor networks. To the best of our knowledge ours is the first attempt at modeling and analyzing the slotted Aloha multi-access technique for the one-shot data collection problem. This problem is characterized by the presence of a single packet in the transmission queue of each contending node. Once that packet is transmitted the node ceases to be in contention for the wireless channel. This leads to a non-steady state, transient behavior of the wireless network. In this thesis we use a non-ergodic Markov chain model to analyze the network dynamics for this problem and derive flow equations that accurately capture the temporal behavior of the network. Using these equations we evaluate the performance of the protocol for the one-shot data collection problem.

(b) Location-Aware Medium Access for One-Shot Data Collection: We add to our contributions to efficient medium access protocols for the one-shot data collection end of the one-hop data collection application spectrum by presenting a novel location-aware medium access protocol for the same. In this protocol the contending nodes make use of their location information to reduce collisions among themselves and improve the overall performance. A performance comparison with location-unaware medium access protocols shows that the location-aware protocol can take advantage of the location distribution of nodes to provide significant performance gains.

(c) Enhancement of IEEE 802.15.4 MAC Protocol: The IEEE 802.15.4 protocol is an IEEE standard for low-rate, low-power wireless embedded networks.
The standard specifies both the physical and medium access layers of the protocol. In this thesis we model, analyze, and evaluate the performance of the IEEE 802.15.4 MAC protocol for both ends of the one-hop data collection application spectrum. Simulation results show that the IEEE 802.15.4 MAC protocol performs poorly for one-hop data collection in dense sensor networks, showing a steep deterioration in both throughput and energy consumption with an increasing number of transmitters. We propose a channel feedback-based enhancement to the protocol that is significantly more scalable, showing a relatively flat, slow-changing total system throughput and energy consumption as the network size increases. A key feature of the enhancement is that the back-off windows are updated after successful transmissions instead of collisions. The window updates are based on an optimality criterion we derive from mathematical modeling of p-persistent CSMA.

Next, we provide a brief overview of the thesis organization.

1.5 Thesis Organization

The rest of the thesis is organized as follows: In the first part of the thesis, we study the problem of location support. In Chapter 2, we discuss the background work on RF localization and in Chapter 3, we address the problem of accurate RF localization. We address the problem of fast and fair localization in Chapter 4.

In the second part of the thesis, we study the problem of efficient medium access for one-hop data collection. In Chapter 5, we review the background work on medium access techniques for one-hop data collection. In Chapter 6, we model and analyze slotted Aloha medium access techniques for the one-shot data collection problem and in Chapter 7, we present a novel location-aware medium access technique for the one-shot data collection problem. In Chapter 8, we model, analyze, and evaluate the performance of the IEEE 802.15.4 MAC protocol for both ends of the one-hop data collection application spectrum and propose enhancements that improve the protocol's performance.
We present our conclusions from this work in Chapter 9 and discuss directions for future work in Chapter 10.

Part I

Location Support for Mobile Nodes

Chapter 2

Background on RF Localization

2.1 Introduction

Providing accurate indoor localization has been an active area of research for quite some time. Many solutions using different technologies have been proposed in the past. Figure 2.1 gives a classification from [44] of the various localization technologies based on signal types, signal metrics, processing methods and location estimate ends.

Classification of Localization Techniques
Signal Type: 1. RF (radio frequency) 2. IR (infra-red) 3. Ultrasound 4. Magnetic 5. Optical
Location Estimation End: 1. Network based (location aware) 2. Handset based (location support)
Signal Metric: 1. RSS (received signal strength) 2. TDOA (time difference of arrival) 3. TOF (time of flight) 4. AOA (angle of arrival) 5. Phase difference
Processing Method: 1. Finger-printing 2. Non-finger-printing

Figure 2.1: Classification of localization techniques.

In our work we focus on RF signals and RSS signal metric based localization techniques, in the setting of a location support system. Other metrics of RF signals such as TDOA, TOF and AOA require special or extra hardware to be incorporated on sensor motes ([79], [41], [75], [80], [57]).

In this chapter, we discuss the background work on RF, RSS-based localization techniques from the literature. Here, we discuss the background work from the point of view of location estimate accuracy and complexity of implementation. All the previously proposed localization techniques can be broadly classified into finger-printing techniques and non-finger-printing techniques. First, we present a detailed review of finger-printing techniques and then review the non-finger-printing techniques.

2.2 Finger-Printing Techniques

Finger-printing techniques are characterized by two phases of localization.
The first phase, which is the pre-configuration phase or the off-line phase, involves "profiling" the localization space. At each location point in the localization space, RSS values from carefully placed access-points are gathered. Access-points perform the role of reference nodes in this case. This set of RSS values is the "finger-print" for that location. A finger-print-map is then created that maps each RSS based finger-print to the corresponding location point. Since it is physically not possible to measure finger-prints for all locations in the localization space, interpolation and extrapolation mechanisms are used to determine the map for the entire area. Many different devices have been proposed to be employed to determine as accurate a map as possible.

In the second phase, which is the localization phase or the on-line phase, the unknown node records RSS values from access points in its range and uses the finger-print-map from the first phase to map its set of RSS values to its corresponding location. Since RSS values are subjected to random variations in the wireless channel, the RSS finger-print of the unknown node might not directly correspond to any location in the map. Therefore, different techniques such as nearest-neighbor classification, neural-network based learning, probabilistic estimation, statistical pattern classification, etc., have been proposed to obtain the best location estimate given the unknown node's finger-print.

One of the earliest reported works on radio finger-printing based localization techniques using wireless LANs is RADAR by Bahl et al. in [15]. In this work, in the first phase, a database is created by collecting multiple RSS data samples from up to 3 access-points at multiple location points, and for multiple orientations at each location point. In the second phase, the location is estimated using the database created in the first phase, using the nearest neighbor rule.
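The on-line lookup of the second phase can be illustrated with a minimal sketch. It assumes that "nearest" means smallest Euclidean distance between RSS vectors (in dB); the function name `nearest_fingerprint`, the three-entry map and all RSS values are hypothetical and are not taken from [15].

```python
import math


def nearest_fingerprint(fingerprint_map, observed):
    """Return the mapped location whose stored RSS vector is closest to `observed`.

    Closeness is Euclidean distance in signal space (an assumption of this sketch).
    """
    def signal_distance(stored):
        return math.sqrt(sum((s - o) ** 2 for s, o in zip(stored, observed)))

    return min(fingerprint_map, key=lambda loc: signal_distance(fingerprint_map[loc]))


# Hypothetical off-line map: location (x, y) in meters -> mean RSS in dBm
# from three access-points.
fingerprint_map = {
    (0.0, 0.0): (-40.0, -70.0, -75.0),
    (5.0, 0.0): (-55.0, -60.0, -72.0),
    (5.0, 5.0): (-65.0, -58.0, -50.0),
}

# Hypothetical on-line measurement taken by the unknown node.
estimate = nearest_fingerprint(fingerprint_map, (-57.0, -61.0, -70.0))
print(estimate)
```

The sketch returns one of the surveyed points only; the interpolation and multi-sample averaging discussed in this section are left out for brevity.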
The authors report a median location error of 2 to 3 meters for office environments using 3 base stations, 40 location points, and at least 3 samples and many orientations at each location point. The location error is the Euclidean distance between the true location of the unknown node and its location estimate.

In a closely related work in [19], the authors propose to use training methods, such as neural networks and multi-layer perceptron algorithms, on the data in the database created in the first phase, to obtain an accurate location estimate for the unknown node. After many iterations of training, they report a best location estimate error of 1.9 meters.

The authors of [18] state that "the functional dependence between the signal strength from a number of radio points and the physical position is not deterministic, but a statistical law connecting signal strength and position". In view of this they use the method of support vector machines to determine the location estimate and report an average error of 3.4 meters.

In [44], the authors use Kalman filtering algorithms to increase location estimate accuracy in the second phase. They use an Ekahau Positioning Engine (a commercially available localization system) with 5 access-points and report a mean error of 2.5 meters.

While the previous three references have focused on improving the accuracy in the second phase, researchers have proposed many different methods to improve the finger-print-map in the first phase of localization. In [60], the authors use four different types of devices (access-points, sniffers, stationary emitters, and a location estimation engine) that are connected through a wired network to create the finger-print-map. Through careful placement of sniffers and stationary emitters, and collected RSS data at 100 location points, the authors achieve a mean location error of 3.3 meters.
The authors in [40] build on the work in the previous reference by using sniffers to monitor the unknown node behavior over a period of time and report a similar error for 40 data points only. They show that the error can be improved by using more sniffers.

In [45], the authors take a hybrid approach in which multiple branches of information sources, such as wireless LANs and Bluetooth devices, and multiple localization techniques, such as triangulation, k-nearest-neighbors and smallest M-vertex polygon, are combined to estimate the location of the unknown node. They selectively weigh and fuse the information from each of the multiple information sources and localization techniques to determine the location estimate and report a mean location error of 2.2 meters. They also report a mean error of 3.3 meters when information fusion is not used.

An important aspect of all the above finger-printing localization techniques is that all of them have been designed for and implemented on wireless LAN networks. This is the main advantage of these localization techniques, in that a few freely available wireless access-points can be employed to provide localization services. However, the main drawback of these techniques is the costly, time consuming first phase of obtaining the finger-print-map for the localization space. Moreover, since this map heavily depends on the features of the localization space, such as office partitions, furniture, etc., it has to be recalculated every time there is a change in the localization space.

Next, we review RF, RSS-based non-finger-printing localization techniques.

2.3 Non-Finger-Printing Techniques

Non-finger-printing localization techniques are characterized by a single phase of operation, in which the unknown node measures RSS values of RF signals from the reference nodes in its radio range and uses them along with the location coordinates of the reference nodes to estimate its location.
One of the earliest non-finger-printing based localization techniques is the centroid method proposed by Bulusu et al. in [26]. In this technique, the location estimate of the unknown node is the centroid of all reference nodes in its radio range. Even though this is a very simple technique, it provides very coarse grained location estimates.

In [20], the authors convert the RSS measurements from reference nodes to distance estimates using an assumed power law relationship between them, and use triangulation to localize the unknown node. They assume certain values for the different constants of the power law. They address the problem of Rayleigh fading of the RSS samples when the unknown node is mobile and propose to collect many RSS samples in a window of time and use the sample average for localization. They present location error as a function of the window size. Localization techniques that convert RSS values to distance range values are called range-based techniques.

Another range-based localization technique, using the maximum likelihood estimator (MLE), has been proposed in [77] and [100]. At the heart of the MLE technique is the radio propagation model, which is assumed to be the log-normal shadowing model [83]:

$$P_R(d) = P_T - PL(d_0) - 10\,\eta \log_{10}\left(\frac{d}{d_0}\right) + X_\sigma \qquad (2.1)$$

where $P_R$ is the received signal power, $P_T$ is the transmit power and $PL(d_0)$ is the path loss for a reference distance of $d_0$. $\eta$ is the path loss exponent, and the random variation in RSS is expressed as a Gaussian random variable of zero mean and variance $\sigma^2$, $X_\sigma \sim \mathcal{N}(0, \sigma^2)$. All powers are in dBm and all distances are in meters. The MLE estimator is equivalent to the least-squares estimator (LSE) for Gaussian random errors in RSS values [55], which is the case in the above propagation model. The LSE (or MLE) method uses the radio propagation model for localization as follows:

1. Measure the distance between each of the reference nodes and the unknown node using

$$d_{mi} = 10^{\frac{P_T - PL(d_0) - P_{Ri}}{10\,\eta}} \qquad (2.2)$$

where $d_{mi}$ is the measured distance and $P_{Ri}$ is the mean received signal power between a given reference node $i$ and the unknown node. (Equation 2.2 is Equation 2.1 solved for $d$ with the shadowing term dropped and a reference distance $d_0$ of 1 meter.) Accurate distance measurement requires accurate estimation of the path loss exponent ($\eta$) of the environment. This can be achieved by extensive a priori pre-configuration surveys of the localization space or by a periodic exchange of messages by the reference nodes to measure RSS values and estimate the value of $\eta$ for the environment.

2. For each grid point location in the localization space, determine the sum of the squares of the differences between the measured distances and the true Euclidean distances of all the reference nodes from the grid point:

$$\Sigma_{(x,y)} = \sum_{i=0}^{n-1} \left(d_i^{(x,y)} - d_{mi}\right)^2 \qquad (2.3)$$

where $d_i^{(x,y)}$ is the Euclidean distance between the grid location $(x, y)$ and the reference node $i$.

3. Choose the grid point location with the least value of the above sum, $\Sigma_{(x,y)}$, as the location of the unknown node.

Using the MLE method, the authors in [100] have reported a mean error of 0.5 meters for 10 reference nodes and a single unknown node, using commodity 802.11 cards. This accuracy was obtained at the cost of extensive pre-configuration studies in the localization space to determine its path loss exponent $\eta$.

Chakrabarty et al. in [30] and Ray et al. in [84] use identity-codes to determine the location of sensor nodes in grid and non-grid sensor fields respectively. In this approach, each grid point or region in the localization space is identified by a unique set of reference node IDs whose signals can reach the point or region, and this unique set is an identity-code for that point or region. The two main drawbacks of this approach are that (i) in order to uniquely identify all unknown node locations in the localization space, the reference nodes need to be placed carefully according to rules determined by an optimization algorithm and that (ii) for acceptable location accuracies, the number of reference nodes required is prohibitively expensive, and for sparse networks of reference nodes the accuracy is coarse-grained, in the order of the radio range. For example, the number of reference nodes required to uniquely identify the location of an unknown node using identity-codes is $O(p^m)$, where $m$ is the number of dimensions of the localization space and $p$ is the number of grid points per dimension [30]. For this technique, for an experiment using 802.11b devices, the authors report a maximum error of 13 meters.

In [48], the authors propose an RF-based localization technique in which the unknown node location is determined by the intersection of all triangles, formed by reference nodes, that are likely to bound it. The unknown node determines its existence inside a triangle by comparing its measured RSS values to those of its neighbors to detect a trend in RSS values in any particular direction. This technique depends on the assumption that signal strength decreases monotonically with distance, which does not hold in real world scenarios. In this work, the authors do not report any experimental results.

In a recent work, Maroti et al. in [69] have used the relative phase-offset between two receivers of RF signals to determine the distances between the nodes, which in turn could be used for localizing the unknown node. The authors use the standard MICA 2 mote [6] radios to obtain a best error of 3 centimeters. This technique is accompanied by extensive configuration, such as careful placement of reference nodes and their antennas, during run-time. Also, multiple unknown nodes are required because localization is possible only through their collaboration.
Another important drawback of this work is that the implementation presented is for the case when all the nodes (reference and unknown) are in line-of-sight of each other. This is rarely the case in indoor environments. Also, it may be difficult to determine the phase-offsets when barriers such as walls are present between the reference nodes and the unknown node.

A very useful comparison of the performance of many finger-printing and non-finger-printing localization techniques is presented in [36]. In this, the authors conduct experiments in two office buildings over an extended period of time using commodity 802.11 cards. They present a comparative study of 6 finger-printing and 5 non-finger-printing techniques and report that no particular RF, RSS-based localization technique has a significant advantage in terms of location estimate accuracy. Using 5 access-points in a 75 meters × 48 meters area they report a median error of 3.3 meters and a 97th percentile error of 10 meters. One major reason for this performance could be the fixed number and density of access-points used in the experiments. We conjecture that the location accuracy can be improved by increasing the access-points' number and density.

2.4 Chapter Summary

From the discussion in the chapter, a trade-off can be discerned between accuracy of location estimates and the complexity of implementation. For instance, least squares estimation techniques ([77]) require accurate RF channel parameters such as the radio path loss exponent; finger-printing based techniques (such as [15]) require extensive pre-configuration studies that depend on the features of the localization space; other techniques require a complex configuration procedure ([69]). On the other extreme, really simple techniques such as computing the centroid of nearby reference nodes ([26]) provide low accuracy.
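To make the least-squares end of this trade-off concrete, the grid-search LSE procedure of Section 2.3 (Equations 2.2 and 2.3) can be sketched as below. This is only an illustration under stated assumptions: the reference distance is taken as 1 meter, the function names `rss_to_distance` and `lse_localize` are hypothetical, and the channel parameters, node positions and noise-free RSS values are invented for the example rather than taken from [77] or [100].

```python
import math


def rss_to_distance(p_r, p_t, pl_d0, eta, d0=1.0):
    """Invert the log-normal model without the noise term (Eq. 2.2, d0 assumed 1 m)."""
    return d0 * 10 ** ((p_t - pl_d0 - p_r) / (10.0 * eta))


def lse_localize(ref_nodes, mean_rss, p_t, pl_d0, eta, grid_points):
    """Return the grid point minimizing the squared distance residuals (Eq. 2.3)."""
    measured = [rss_to_distance(p_r, p_t, pl_d0, eta) for p_r in mean_rss]

    def cost(point):
        return sum((math.dist(point, ref) - d_m) ** 2
                   for ref, d_m in zip(ref_nodes, measured))

    return min(grid_points, key=cost)


# Hypothetical channel parameters (dBm, dB, dimensionless) and node layout (meters).
P_T, PL_D0, ETA = 0.0, 40.0, 3.0
ref_nodes = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
true_location = (3.0, 4.0)

# Noise-free mean RSS generated from Eq. 2.1 with the shadowing term set to zero.
mean_rss = [P_T - PL_D0 - 10.0 * ETA * math.log10(math.dist(true_location, ref))
            for ref in ref_nodes]

grid_points = [(float(x), float(y)) for x in range(11) for y in range(11)]
estimate = lse_localize(ref_nodes, mean_rss, P_T, PL_D0, ETA, grid_points)
print(estimate)
```

Calling `lse_localize` with a value of `eta` different from the one used to generate `mean_rss` is one simple way to explore how strongly the estimate depends on knowing the path loss exponent.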
Therefore, this leaves the design space of easy implementation accompanied with good accuracy in wireless sensor networks open for research on new localization techniques.

Also, localization based on wireless LANs (WLANs) is at a disadvantage in the density of reference nodes, as compared to wireless sensor networks. Since wireless sensor networks are made up of inexpensive devices, as compared to wireless LAN access-points, they are more amenable to higher density deployments. Even though much experimental analysis has been done to study the accuracy of localization techniques using WLANs, there is no experimental analysis yet of localization techniques using wireless sensor networks. Wireless sensor network based localization techniques could provide higher resolution compared to wireless LAN systems for the same cost.

Chapter 3

Accurate RF Localization

3.1 Introduction

In this chapter, we present two novel RF localization techniques that are lightweight, work with any RF hardware and provide accurate localization without requiring accurate channel parameters or any pre-configuration.

In both our techniques, called Ecolocation and Sequence-Based Localization (SBL), the unknown node examines the ordered sequence of nearby reference nodes to determine its location. The main idea in our techniques is that the distance-based rank order of reference nodes constitutes a unique signature for different regions in the localization space. We name the distance-based rank order the location sequence of the region. The location sequence can be written as a set of rank constraints, which we call location constraints, on the region represented by the location sequence.

In Ecolocation, we obtain the ordered sequence of reference nodes by ranking them on one-way RSS measurements between them and the unknown node. This measured sequence is then written as a set of constraints on the location of the unknown node.
This constraint set is then compared with the ideal distance-based constraint set for each location point to determine how many order-constraints are satisfied. The location which maximizes the number of satisfied order-constraints is then determined to be the best estimate of the unknown node's location.

At the heart of our second technique, SBL, is the division of a two-dimensional (2D) localization space into distinct regions by the perpendicular bisectors of lines joining pairs of reference nodes. We show that each distinct region formed in this manner can be uniquely identified by a location sequence that represents the distance ranks of reference nodes to that region. We present an algorithm to construct the location sequence table that maps all these feasible location sequences to the corresponding regions, using the locations of reference nodes. This table is used to localize an unknown node as follows: The unknown node first determines its own location sequence based on the measured strength of signals between itself and the reference nodes. It then searches through the location sequence table to determine the "nearest" feasible sequence to its own measured sequence. The centroid of the corresponding region is taken to be its location.

Ideally, the ranks of the reference nodes based on RSS readings should follow their ranks based on true Euclidean distance. Of course, this is not true in the real world because of the presence of multi-path fading and shadowing in the RF channel. Reference nodes farther from the unknown node might measure higher RSS values than those which are closer, and this introduces errors in the constraints for Ecolocation and corrupts location sequences for SBL.

However, for Ecolocation, we show that the inherent insensitivity to absolute RSS amplitudes and the inherent redundancy present in the set of constraints make it very robust in practice.
The name Ecolocation is derived from the phrase "Error COntrolling LOCAlizaTION" because of the close analogy to controlling errors by redundancy in traditional error control coding techniques in digital communication systems.

In the case of SBL, for $n$ reference nodes in the localization space, the possible number of combinations of distance rank sequences is $O(n^n)$. However, we prove that the actual number of feasible location sequences is much lower due to planar geometric constraints, only $O(n^4)$. The lower dimensionality of the number of location sequences enables the correction of errors in the measured sequence. Thus, the high density of location sequences coupled with the robustness of rank orders in the sequences to random errors help provide good location estimate accuracy.

The rest of the chapter is organized as follows: We define location constraints, introduce Ecolocation and describe its operation in Section 3.2. We define location sequences and describe the procedure of localization using them in Section 3.3. In the same section, we derive the maximum number of feasible location sequences in the localization space, illustrate the construction of the location sequence table, discuss the effect of RF channel non-idealities on unknown node location sequences and describe metrics to measure "distance" between sequences. In Section 3.4 we compare Ecolocation with SBL. In Section 3.5, we present an exhaustive systematic performance study of our localization techniques in addition to conducting a comparative study with other state-of-the-art localization techniques. We present the evaluation of our technique in real mote experiments in Section 3.6 and summarize the chapter in Section 3.7.

3.2 Localization Technique I: Ecolocation

In this section we describe Ecolocation and illustrate it for the ideal and real world scenarios through examples. The localization process is initiated by the unknown node by broadcasting a localization request.
The reference nodes in the network respond to this request with response packets containing their location coordinates. The unknown node measures the signal strength (RSS) of the received packets and uses the obtained location coordinates to determine its own location as follows:

1. Determine the ordered sequence of reference nodes by ranking them on the collected RSS measurements.
2. For each possible location grid-point in the location space determine the relative ordering of reference nodes and compare it with the RSS ordering previously obtained, to determine how many of the ordering constraints are satisfied.
3. Pick the location that maximizes the number of satisfied constraints. If there is more than one such location, take their centroid.

Radio frequency (RF) based localization techniques are inherently dependent on the RF channel, whose multi-path fading and shadowing effects have a fundamental bearing on the accuracy of the location estimate. Nevertheless, it helps to study the localization technique in isolation from these effects. We introduce Ecolocation for the ideal scenario of zero multi-path fading and shadowing effects and later explain why it provides robust and accurate location estimates even in the presence of these effects.

3.2.1 Ideal Scenario

In the absence of multi-path fading and shadowing, RSS measurements between the reference nodes and the unknown node accurately represent the distances between them. If the reference nodes are ranked as a sequence in decreasing order of these RSS values then this order represents the increasing order of their separation from the unknown node. For a reference node ranked at position $i$ in the ordered sequence,

$$R_i > R_j \Rightarrow d_i < d_j, \quad \forall\, i < j$$

where $R_i$ and $d_i$ are the RSS measurement and distance of the $i$-th ranked reference node from the unknown node, respectively. The above relationship between two reference nodes is a constraint on the location of the unknown node and is dependent on it.
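The three steps listed above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the thesis implementation: the function names `order_matrix` and `ecolocation`, the use of NumPy, and the scan over a square grid from 0 to `side` are all choices made for this example.

```python
import numpy as np

def order_matrix(values):
    """Pairwise order matrix: entry (p, q) is the sign of values[q] - values[p]."""
    v = np.asarray(values, dtype=float)
    return np.sign(v[None, :] - v[:, None]).astype(int)

def ecolocation(ref_xy, rss, side, step):
    """Grid-scan estimate; ref_xy is (n, 2), rss is (n,), the area is side x side.

    Returns the centroid of the best grid points and the number of
    pairwise order constraints matched at those points.
    """
    ref_xy = np.asarray(ref_xy, dtype=float)
    measured = order_matrix(rss)              # +1 where R_p < R_q
    upper = np.triu_indices(len(rss), k=1)    # visit each unordered pair once
    best, winners = -1, []
    for x in np.arange(0.0, side + step / 2, step):
        for y in np.arange(0.0, side + step / 2, step):
            dist = np.hypot(ref_xy[:, 0] - x, ref_xy[:, 1] - y)
            # Lower RSS should mean larger distance, so the distances are
            # negated to make the two order matrices directly comparable.
            ideal = order_matrix(-dist)
            matched = int(np.sum(ideal[upper] == measured[upper]))
            if matched > best:
                best, winners = matched, [(x, y)]
            elif matched == best:
                winners.append((x, y))
    return tuple(np.mean(winners, axis=0)), best
```

With four hypothetical reference nodes at the corners of a 10 x 10 area and error-free RSS values generated from any monotonically decreasing function of distance, the best grid points match all six pairwise constraints, and the returned centroid lies in the same region of the area as the true position.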
An $i$-th ranked reference node forms $(i-1)$ constraints with the reference nodes ranked ahead of it, and for a total of $n$ reference nodes there are $\frac{n(n-1)}{2}$ constraints on the unknown node. For fixed reference node locations, the sequence order and the constraints are completely determined by the unknown node location. Figure 3.1 illustrates this idea for a simple case of four reference nodes and one unknown node.

Figure 3.1: The distance rank order of reference nodes $(A, B, C, D)$ is different for different regions ($X_1$, $X_2$) in the localization space. Panels (a) and (b).

Table 3.1 shows the constraints on the unknown node for the example in Figure 3.1(a).

A:1        B:2             C:3             D:4
$R_A$      $R_B < R_A$     $R_C < R_A$     $R_D < R_A$
                           $R_C < R_B$     $R_D < R_B$
                                           $R_D < R_C$

Table 3.1: Constraints on the unknown node for the example in Figure 3.1(a).

Each location grid-point in the location space has its own set of constraints based on its Euclidean distances to the reference nodes. The unknown node location estimate can be obtained by comparing the constraints obtained from RSS measurements to the constraint sets of each location grid-point and picking the location which satisfies the maximum number of constraints. If there is more than one such location then their centroid is the location estimate.

Figure 3.2: Real world experimental results (RSSI in dBm as a function of distance in meters): Reference nodes far from the unknown node may measure higher RSS values than closer reference nodes. Note that y-axis is reverse ordered.

3.2.2 Real World Scenario

In contrast to the ideal scenario, the real world is characterized by the presence of multi-path fading and shadowing in the RF channel.
Ideally, reference nodes that are far from the unknown node should measure lower RSS values than reference nodes that are nearer, but due to multi-path effects this is not true in the real world.

Figure 3.2 shows the experimental RSS measurements at five MICA 2 receivers placed at different distances from a MICA 2 transmitter. It shows that the receiver at 5.69 meters measured a higher RSS value than the receiver at 5.37 meters. Evidently, RSS measurements do not represent distances accurately in the real world.

Therefore, if the reference nodes are ranked on their respective RSS measurements, the constraints on the unknown node location formed by these ranks will be erroneous. For example, if the ranks of the third and fourth ranked reference nodes are interchanged due to multi-path effects in the RF channel, as in the experiment of Figure 3.2, for the example in Figure 3.1(a), then the new constraints are as shown in Table 3.2. As can be seen, one of the six constraints (about 16.7%) is erroneous in this case.

A:1        B:2             C:4             D:3
$R_A$      $R_B < R_A$     $R_C < R_A$     $R_D < R_A$
                           $R_C < R_B$     $R_D < R_B$
                                           $R_D > R_C$

Table 3.2: Constraints for the example of Table 3.1 when the ranks of the third and fourth ranked reference nodes are interchanged because of multi-path effects.

The percentage of erroneous constraints depends on the RF channel condition, the topology of the reference nodes and the number of reference nodes. The unknown node location estimate accuracy in turn depends on the percentage of erroneous constraints. Next, we discuss the implementation aspects of Ecolocation.

3.2.3 Location Determination

For ease of implementation, the constraint set is represented by a constraint matrix $M_{n \times n}$, where

$$M_{n \times n}(i,j) = \begin{cases} 1 & \text{if } R_i < R_j \\ 0 & \text{if } R_i = R_j \\ -1 & \text{if } R_i > R_j \end{cases}$$

It is easy to see that $M_{n \times n}$ is a skew-symmetric matrix and each element of the matrix represents a constraint in the constraint set. The pseudo code for the Ecolocation algorithm is presented below.

Algorithm 1. Ecolocation

Input: The number of reference nodes within the range of the unknown node ($n$), their locations $(ax_i, ay_i)$ $(i = 1 \ldots n)$, the RSS values of RF signals from the unknown node at each one of them, $R_i$ $(i = 1 \ldots n)$, the localization area size ($S \times S$ sq. length units), and the area scanning resolution ($r$).

Output: The location estimate of the unknown node.

The reference nodes are sorted into an ordered sequence based on the $R_i$'s and a constraint matrix $M_{n \times n}$ is derived from this sequence. Every counter $constrMatch_{ij}$ starts at 0.

▷ Calculate the number of matched constraints at each grid point $(i,j)$ and identify the maximum number of constraints matched over all the grid points.
 0  $maxConstrMatch \leftarrow 0$;
 1  for each grid point $(i,j)$ in the localization area
 2      for each reference node $k$ $(k = 1 \ldots n)$
 3          $d^{ij}_k \leftarrow ((ax_k - i)^2 + (ay_k - j)^2)^{1/2}$;
 4      generate constraint matrix $C^{ij}_{n \times n}$ based on $d^{ij}$.
 5      for each element $(p,q)$ $(q > p)$ in $C^{ij}_{n \times n}$
 6          if $C^{ij}_{n \times n}(p,q) = M_{n \times n}(p,q)$
 7              $constrMatch_{ij} \leftarrow constrMatch_{ij} + 1$;
 8          else
 9              $constrMatch_{ij} \leftarrow constrMatch_{ij} - 1$;
10      if $constrMatch_{ij} > maxConstrMatch$
11          $maxConstrMatch \leftarrow constrMatch_{ij}$;
▷ Search for grid points where the maximum number of constraints are matched and return the centroid of those grid points as the location estimate.
12  $(eco_x, eco_y) \leftarrow (0,0)$;
13  $count \leftarrow 0$;
14  for each grid point $(i,j)$
15      if $constrMatch_{ij} = maxConstrMatch$
16          $(eco_x, eco_y) \leftarrow (eco_x + i, eco_y + j)$;
17          $count \leftarrow count + 1$;
18  return $\left(\frac{eco_x}{count}, \frac{eco_y}{count}\right)$ ▷ Location Estimate.

Theorem 1. Ecolocation takes at most $O\left(\frac{S^2 n^2}{r^2}\right)$ time and $O\left(\frac{S^2}{r^2} + n^2\right)$ space to determine the location of the unknown node.

Proof. We should say first of all that this implementation of Ecolocation is meant only to be functional; it is not at all optimized for space or time complexity. Still, the following analysis provides an upper bound on the computational costs for implementing this technique. The initial sorting of reference nodes based on the $R_i$'s costs $\Theta(n \log n)$ time and $O(n)$ space respectively.
The corresponding constraint matrix generation costs $O(n^2)$ time and $O(n^2)$ space respectively. Calculating the number of constraints matched at each grid point and identifying the maximum number of constraints matched over all grid points (lines 1-11) costs $O\left(\frac{S^2 n^2}{r^2}\right)$ time and $O\left(\frac{S^2}{r^2} + n^2\right)$ space respectively. Searching for grid points where the maximum number of constraints are matched (lines 12-17) costs $O\left(\frac{S^2}{r^2}\right)$ time and $O(1)$ extra space. Finally, calculating the centroid of those grid points (line 18) costs $O(1)$ time and space. In total, the time and space complexities of Ecolocation are at most $O\left(\frac{S^2 n^2}{r^2}\right)$ and $O\left(\frac{S^2}{r^2} + n^2\right)$ respectively.

3.2.4 Examples

Figure 3.3 shows a sample layout of nine reference nodes placed in a grid and a single unknown node. Figure 3.3(a) plots the location estimate for the ideal case when there are no erroneous constraints on the unknown node. Figures 3.3(b), 3.3(c) and 3.3(d) show the location estimates for varying percentages of erroneous constraints. The location estimate error increases with increasing percentage of erroneous constraints.

Figure 3.3: Ecolocation location estimate (E) for the unknown node (P) at $(1, 3)$ for a grid layout of 9 reference nodes (A). The number adjacent to a reference node is its corresponding rank. The location error is expressed in meters where the size of the square localization area is $12 \times 12$ sq. meters. (a) Sequence: 123456789 (no erroneous constraints) [Estimate: $(1.25, 3.3)$; Error: 0.34 meters] (b) Sequence: 123568497 (13.9% erroneous constraints) [Estimate: $(1.25, 1.95)$; Error: 1.07 meters] (c) Sequence: 125379486 (22.2% erroneous constraints) [Estimate: $(1.95, 1.25)$; Error: 1.98 meters] (d) Sequence: 243976581 (44.4% erroneous constraints) [Estimate: $(1.95, 1.25)$; Error: 1.98 meters].

These examples suggest that Ecolocation is robust to multi-path effects of the RF channel up to some level. The inherent redundancy in the constraint set ensures that the non-erroneous constraints help in estimating the unknown node location accurately. Also, the constraint construction inherently holds true for random variations in RSS measurements up to a tolerance level of $|R_i - R_j|$.

Through the above examples, we have shown a proof of concept that localization using geometric constraints is robust to RF channel non-idealities. Next we conduct a deeper investigation into the geometric meanings of location constraints and location sequences and propose a localization technique based on location sequences.

3.3 Localization Technique II: Sequence-Based Localization

In this section, we first investigate location sequences in more depth and then describe the procedure to use them for localization.

Figure 3.4: (a) The perpendicular bisector of the line joining two reference nodes divides the localization space into three distinct regions. (b) Illustration of arrangement of 6 bisector lines for 4 reference nodes placed uniformly randomly in a square localization space.

3.3.1 Location Sequences

Assume that a 2D localization space consists of $n$ reference nodes. Consider any two reference nodes and draw a perpendicular bisector to the line joining their locations.
This perpendicular bisector divides the localization space into three different regions that are distinguished by their proximity to either of the reference nodes, as illustrated in Figure 3.4(a). Similarly, if perpendicular bisectors are drawn for all $\frac{n(n-1)}{2}$ pairs of reference nodes, they divide the localization space into many regions of three different types: vertices, edges and faces, as shown in Figure 3.4(b). This subdivision of a 2D space into vertices, edges and faces by a set of lines is an arrangement induced by that set [32].

Now, for each region created by the arrangement induced by the set of perpendicular bisectors, determine the ordered sequence of reference nodes' ranks based on their distances from that region. We define this ordered sequence of distance ranks as the location sequence. Consider the following theorem.

Theorem 2. The location sequence of a given region is unique to that region.

Proof. The proof is by contradiction. Assume that two different regions in the arrangement have the same location sequence. This implies that the distance ranks of reference nodes are the same for both the regions. This further implies that there is no bisector line that separates the two regions. The implication applies to all possible combinations of regions such as two faces, two edges, two vertices, a face and an edge, an edge and a vertex and a face and a vertex, in their own different ways. Otherwise, if there was a bisector line of two arbitrary reference nodes that separated the two regions then it would rank those reference nodes differently for the two regions. But this is a contradiction, as by definition, two different regions in the arrangement are separated by at least a single bisector line. Therefore, each region created by the arrangement has a unique location sequence.

Further, we make the following observations:

• All locations inside a region have the same location sequence.
• If each region in the arrangement is represented by its centroid, there is a one-to-one mapping between a location sequence and the centroid of the region it represents. For a vertex, the centroid is the vertex itself; for an edge, the centroid is its midpoint and for a face, the centroid is the centroid of the polygon that bounds it.
• The total number of unique location sequences is equal to the sum of the number of vertices, the number of edges and the number of faces created by the arrangement in the localization space.

The order in which the ranks of reference nodes are written in a location sequence is determined by a pre-defined order of reference node IDs. We illustrate the above ideas through examples. Figure 3.5(a) shows the location sequences of four different regions. In the example the pre-defined order of reference node IDs is ABCD. Region 1 is a face and its location sequence is 1234, since the distance rank of A from it is 1 (A is the closest) and the respective distance ranks of B, C and D are 2, 3 and 4 (D is the farthest). Similarly, for Region 3 the location sequence is 4321, as A is the farthest (distance rank 4), D is the closest (distance rank 1), and C is closer than B, which in turn is closer than A.

Figure 3.5: (a) Examples of location sequences for a four reference node topology. (b) All feasible location sequences for the topology of (a).

For Region 4, which is a vertex, the distance ranks of A, B and of C, D are equal in pairs, as it lies on the intersection of the perpendicular bisectors of those pairs of reference nodes. Also, the pair C, D is closer to it than the pair A, B. Therefore, its location sequence is 3311.
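The ranking convention used in these examples can be sketched in a few lines of Python. The sketch is illustrative only: the function name `location_sequence`, the tolerance `tol`, and the four-node topology `REFS` are assumptions made here (the coordinates of the topology in Figure 3.5 are not given in the text). Tied nodes share the smallest rank of their group, which is the convention implied by sequences such as 1134 and 3311.

```python
import math

def location_sequence(point, refs, tol=1e-9):
    """Distance ranks of the reference nodes as seen from `point`.

    Tied nodes share the smallest rank of their group, so two equally
    near nodes followed by two farther ones give ranks 1, 1, 3, 4.
    """
    px, py = point
    dist = [math.hypot(rx - px, ry - py) for rx, ry in refs]
    # Rank of node k = 1 + number of nodes strictly closer than node k.
    return [1 + sum(1 for d in dist if d < dk - tol) for dk in dist]

# Hypothetical topology (not the one drawn in Figure 3.5); node order A, B, C, D.
REFS = [(0.0, 0.0), (4.0, 0.0), (1.0, 5.0), (6.0, 4.0)]

face = location_sequence((1.0, 1.0), REFS)     # strictly inside a face
edge = location_sequence((2.0, 1.0), REFS)     # on the bisector of A and B
vertex = location_sequence((2.0, -3.0), REFS)  # on the bisectors of A,B and of C,D
```

The tolerance matters for edges and vertices: distances that are equal on paper can differ in the last floating-point bit, and without `tol` such a point would be assigned the sequence of a neighboring face instead.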
Region 2, which is an edge, is handled in the same way: the distance ranks of A and B are the same and its location sequence is 1134. Figure 3.5(b) shows all feasible location sequences for the topology of reference nodes of Figure 3.5(a).

3.3.2 Localization Procedure

The procedure for localization of unknown nodes using location sequences is as follows:

1. Determine all feasible location sequences in the localization space and list them in a location sequence table.
2. Determine the location sequence of the unknown node location using received signal strength (RSS) measurements of localization response packets obtained from the reference nodes. The RSS based location sequence will be a corrupted version of the original location sequence.
3. Search in the location sequence table for the "nearest" location sequence to the unknown node location sequence. The centroid mapped to by that sequence is the location estimate of the unknown node.

The above procedure raises the following questions: How many feasible location sequences are there in a 2D localization space? How can we get them? How do random errors in RSS measurements affect the unknown node location sequence? What is the meaning of "nearest" location sequence and how do we measure distances between location sequences? In the rest of this section we answer the above questions. We begin by determining the maximum number of feasible location sequences in the localization space.

3.3.3 Maximum Number of Location Sequences

For $n$ reference nodes in the localization space, the number of possible combination sequences of distance ranks is $O(n^n)$. However, we show that the actual number of feasible location sequences is much lower, in the order of $O(n^4)$ at worst.

As stated previously, the number of feasible location sequences is equal to the sum of the number of vertices, edges and faces created by the arrangement induced by the perpendicular bisectors of reference nodes.
Therefore, its upper bound can be obtained by determining the maximum number of such vertices, edges and faces, given the locations of the reference nodes. In [32], the authors show that the maximum number of vertices, edges and faces for an arrangement induced by $n$ lines is $\frac{n(n-1)}{2}$, $n^2$ and $\frac{n^2}{2} + \frac{n}{2} + 1$ respectively. Using these results, for $\frac{n(n-1)}{2}$ perpendicular bisectors of $n$ reference nodes,

1. The number of vertices is at most $\frac{n^4}{8} - \frac{n^3}{4} - \frac{n^2}{8} + \frac{n}{4}$.
2. The number of edges is at most $\frac{n^4}{4} - \frac{n^3}{2} + \frac{n^2}{4}$.
3. The number of faces is at most $\frac{n^4}{8} - \frac{n^3}{4} + \frac{3n^2}{8} - \frac{n}{4} + 1$.

Owing to the properties of perpendicular bisectors, it is possible to derive tighter upper bounds on the number of vertices, edges and faces.

Theorem 3. Let $L$ be the set of bisector lines for $n$ reference nodes, $|L| = \frac{n(n-1)}{2}$. Let $A(L)$ be the arrangement induced by $L$. Then,

1. The number of vertices of $A(L)$ is at most $\frac{n^4}{8} - \frac{7n^3}{12} + \frac{7n^2}{8} - \frac{5n}{12}$.
2. The number of edges of $A(L)$ is at most $\frac{n^4}{4} - n^3 + \frac{7n^2}{4} - n$.
3. The number of faces of $A(L)$ is at most $\frac{n^4}{8} - \frac{5n^3}{12} + \frac{7n^2}{8} - \frac{7n}{12} + 1$.

Proof. We make use of the property that the perpendicular bisectors of the sides of a triangle intersect at a single point. Assume that $(i-1)$ reference nodes have already been added, implying that the localization space already has $\frac{(i-1)(i-2)}{2}$ bisector lines. When the $i$-th reference node is added, $(i-1)$ new bisector lines are added to the localization space.

Vertices: The first of the $(i-1)$ bisector lines intersects the already present lines in at most $\frac{(i-1)(i-2)}{2}$ new vertices. The second new line is the perpendicular bisector of a side of the triangle in which the first new line is also a perpendicular bisector. Therefore, the second new line has to pass through at least one of the vertices created by the first new line, thus creating at most $\frac{(i-1)(i-2)}{2} - 1$ new vertices. Similarly the third new line creates at most $\frac{(i-1)(i-2)}{2} - 2$ new vertices. This is illustrated in Figure 3.6 for $n = 4$.
Finally the $(i-1)$-th new line creates at most $\frac{(i-1)(i-2)}{2} - (i-2)$ new vertices. Therefore, the total number of new vertices added by the $i$-th reference node is at most

$$\frac{(i-1)(i-2)}{2} + \left[\frac{(i-1)(i-2)}{2} - 1\right] + \left[\frac{(i-1)(i-2)}{2} - 2\right] + \cdots + \left[\frac{(i-1)(i-2)}{2} - (i-2)\right] \tag{3.1}$$

$$= (i-1)\frac{(i-1)(i-2)}{2} - \bigl(1 + 2 + \cdots + (i-2)\bigr) = (i-1)\frac{(i-1)(i-2)}{2} - \frac{(i-2)(i-1)}{2} \tag{3.2}$$

$$= \frac{(i-1)(i-2)^2}{2} \tag{3.3}$$

Figure 3.6: Addition of fourth reference node D adds 3 new bisector lines to the localization space. (a) The first of the 3 new bisector lines, line 1, the perpendicular bisector of CD, creates 3 new vertices (equal to the number of pre-existing lines in the localization space), 4 new faces and 7 new edges at most. (b) The second line, line 2, the perpendicular bisector of BD, has to pass through the intersection point of the bisectors of CD and BC because $\{BD, CD, BC\}$ form a triangle and the perpendicular bisectors of the three sides of a triangle intersect at a single point. Therefore line 2 creates 2 new vertices, 4 new faces and 6 new edges at most. (c) Similarly, line 3, the perpendicular bisector of AD, has to pass through the intersection points of perpendicular bisectors of AB, BD and AC, CD, as $\{AD, AB, BD\}$ and $\{AD, AC, CD\}$ are two triangles with a common side AD. Therefore, line 3 creates 1 new vertex, 4 new faces and 5 new edges at most.

The maximum number of vertices for $n = 3$ is 1. Therefore, for $n$ reference nodes, the maximum number of vertices is

$$1 + \sum_{i=4}^{n} \frac{(i-1)(i-2)^2}{2} = 1 + \sum_{i=4}^{n} \left[\frac{i^3}{2} - \frac{5i^2}{2} + 4i - 2\right] = \sum_{i=1}^{n} \left[\frac{i^3}{2} - \frac{5i^2}{2} + 4i - 2\right] \tag{3.4}$$

$$= \frac{n^4}{8} - \frac{7n^3}{12} + \frac{7n^2}{8} - \frac{5n}{12} \tag{3.5}$$

The last equality in (3.4) holds because the summand is 0 for $i = 1$ and $i = 2$ and is 1 for $i = 3$.

Edges: As explained previously, the first new line intersects the already present lines in at most $\frac{(i-1)(i-2)}{2}$ vertices and creates at most $\frac{(i-1)(i-2)}{2} + 1$ new edges on the new line and at most $\frac{(i-1)(i-2)}{2}$ new edges on the old lines, which add up to $\frac{(i-1)(i-2)}{2} \cdot 2 + 1$ new edges at most.
Since the second new line passes through at least one of the vertices created by the first new line, it creates at most $\frac{(i-1)(i-2)}{2} + 1$ new edges on the second new line and at most $\frac{(i-1)(i-2)}{2} - 1$ new edges on the old lines including the first new line. This adds up to at most $\frac{(i-1)(i-2)}{2} \cdot 2$ new edges in the localization space. This trend is again illustrated in Figure 3.6 for four reference nodes in the localization space. Finally, the $(i-1)$-th new line adds $\frac{(i-1)(i-2)}{2} \cdot 2 - (i-3)$ new edges to the localization space. Therefore, the total number of new edges added by the $i$-th reference node is at most

$$\left[\frac{(i-1)(i-2)}{2} \cdot 2 + 1\right] + \left[\frac{(i-1)(i-2)}{2} \cdot 2\right] + \left[\frac{(i-1)(i-2)}{2} \cdot 2 - 1\right] + \cdots + \left[\frac{(i-1)(i-2)}{2} \cdot 2 - (i-3)\right] \tag{3.6}$$

$$= 2 \cdot (i-1)\frac{(i-1)(i-2)}{2} + 1 - \bigl(1 + 2 + \cdots + (i-3)\bigr) \tag{3.7}$$

$$= 1 + (i-1)^2(i-2) - \frac{(i-3)(i-2)}{2} \tag{3.8}$$

$$= i^3 - \frac{9i^2}{2} + \frac{15i}{2} - 4 \tag{3.9}$$

The maximum number of edges for $n = 3$ is 6. Therefore, for $n$ reference nodes, the maximum number of edges is

$$6 + \sum_{i=4}^{n} \left[i^3 - \frac{9i^2}{2} + \frac{15i}{2} - 4\right] = \sum_{i=1}^{n} \left[i^3 - \frac{9i^2}{2} + \frac{15i}{2} - 4\right] = \frac{n^4}{4} - n^3 + \frac{7n^2}{4} - n \tag{3.10}$$

Faces: The number of new faces created by a new line is equal to the number of edges on the new line. Therefore, the number of new faces created by the first new line among the $(i-1)$ new lines is at most $\frac{(i-1)(i-2)}{2} + 1$. Since the second new line has to pass through one of the intersection points of the first line, it would also create $\frac{(i-1)(i-2)}{2} + 1$ new faces and this trend continues for all the $(i-1)$ new lines as illustrated in Figure 3.6. Therefore, the total number of new faces added by the $i$-th reference node is at most

$$(i-1)\left(\frac{(i-1)(i-2)}{2} + 1\right) \tag{3.11}$$

The localization space has one face when $n = 1$. Therefore, for $n$ reference nodes the maximum number of faces in the localization space is given by:

$$1 + \sum_{i=2}^{n} (i-1)\left(\frac{(i-1)(i-2)}{2} + 1\right) = \frac{n^4}{8} - \frac{5n^3}{12} + \frac{7n^2}{8} - \frac{7n}{12} + 1 \tag{3.12}$$

Corollary 1. The maximum number of unique location sequences due to $n$ reference nodes is $\frac{n^4}{2} - 2n^3 + \frac{7n^2}{2} - 2n + 1$.

Proof. The maximum number of unique location sequences is the sum of the maximum number of vertices, edges and faces due to $n$ reference nodes, derived in Theorem 3.

$$\left(\frac{n^4}{8} - \frac{7n^3}{12} + \frac{7n^2}{8} - \frac{5n}{12}\right) + \left(\frac{n^4}{4} - n^3 + \frac{7n^2}{4} - n\right) + \left(\frac{n^4}{8} - \frac{5n^3}{12} + \frac{7n^2}{8} - \frac{7n}{12} + 1\right) \tag{3.13}$$

$$= \frac{n^4}{2} - 2n^3 + \frac{7n^2}{2} - 2n + 1 \tag{3.14}$$

Next, we illustrate how to obtain all these feasible location sequences in the localization space and store them in the location sequence table.

3.3.4 Location Sequence Table Construction

Below, we present the pseudo-code for an algorithm that constructs the location sequence table given the locations of the reference nodes and the boundaries of the localization space.

Algorithm 2. ConstructLocationSequenceTable (see footnote 1)

Input:
1. Location coordinates of reference nodes, $\{(ax_i, ay_i) \mid i = 0 \ldots n-1\}$.
2. Boundaries of the localization space $B$.

Output: Location Sequence Table.

 0  $L = \{l_i \mid i = 0 \ldots \frac{n(n-1)}{2} - 1\} \leftarrow$ BisectorLines($\{(ax_i, ay_i) \mid i = 0 \ldots n-1\}$, $B$)
 1  $(FL, EL, VL) \leftarrow$ ConstructArrangement($L$)
▷ Get vertex sequences.
 2  for $i \leftarrow 0$ to $(|VL| - 1)$
 3      Centroid[$i$] $\leftarrow VL[i]$
 4      Sequence[$i$] $\leftarrow$ GetSequence(Centroid[$i$])
 5  end for
▷ Get edge sequences.
 6  for $i \leftarrow |VL|$ to $(|VL| + |EL| - 1)$
 7      Centroid[$i$] $\leftarrow$ GetEdgeCentroid($EL[i - |VL|]$)
 8      Sequence[$i$] $\leftarrow$ GetSequence(Centroid[$i$])
 9  end for
▷ Get face sequences.
10  for $i \leftarrow (|VL| + |EL|)$ to $(|VL| + |EL| + |FL| - 1)$
11      Centroid[$i$] $\leftarrow$ GetFaceCentroid($FL[i - |VL| - |EL|]$)
12      Sequence[$i$] $\leftarrow$ GetSequence(Centroid[$i$])
13  end for
▷ Return the location sequence table.
14  return {Sequence, Centroid}

Footnote 1: C++ code files that construct the arrangement of lines and the location sequence table are available for download at http://anrg.usc.edu/downloads.html

• BisectorLines takes in the locations of the reference nodes and the boundaries of the localization space as input and returns the set $L$ of all pair-wise perpendicular bisector lines within the boundaries of the localization space. Each line is represented by the intersection points on the left and right boundaries of the localization space.
• ConstructArrangement constructs the arrangement given a set of lines as input and returns a doubly connected edge list that consists of a vertex list ($VL$), an edge list ($EL$) and a face list ($FL$). Please refer to [32], Chapter 8, Section 3 for a detailed description of this algorithm.
• Vertex List, $VL$: Contains pointers to all vertices of the arrangement induced by the set $L$.
• Edge List, $EL$: Contains pointers to all edges of the arrangement induced by the set $L$.
• Face List, $FL$: Contains pointers to all faces of the arrangement induced by the set $L$.
• GetEdgeCentroid takes in an edge pointer as the input and returns the centroid of the edge. The centroid of an edge $(c_x, c_y)$ is its mid point given by:

$$(c_x, c_y) \leftarrow \left(\frac{o_x + d_x}{2}, \frac{o_y + d_y}{2}\right) \tag{3.15}$$

where $(o_x, o_y)$ and $(d_x, d_y)$ are the origin and destination vertices of the edge.
• GetFaceCentroid takes in a face pointer as the input and returns the centroid of the face. The centroid of a face $(c_x, c_y)$, given its vertices $\{(x_i, y_i) \mid 0 \le i \le p-1\}$, is calculated as follows:

$$c_x \leftarrow \frac{1}{6A} \sum_{i=0}^{p-1} (x_i + x_{i+1})(x_i y_{i+1} - x_{i+1} y_i) \tag{3.16}$$

$$c_y \leftarrow \frac{1}{6A} \sum_{i=0}^{p-1} (y_i + y_{i+1})(x_i y_{i+1} - x_{i+1} y_i) \tag{3.17}$$

where $p$ is the number of vertices that bound a given face and $A$ is its area given by

$$A \leftarrow \frac{1}{2} \sum_{i=0}^{p-1} (x_i y_{i+1} - x_{i+1} y_i), \quad (x_p, y_p) = (x_0, y_0) \tag{3.18}$$

• GetSequence takes in the coordinates of a point in the localization space and returns the location sequence for that point with respect to the locations of the reference nodes.

Theorem 4. Algorithm 2 takes $O(n^5 \log n)$ worst-case time and $O(n^5)$ worst-case space to construct the location sequence table.

Proof. The function BisectorLines in line 0 takes $O(n^2)$ time and space. The algorithm ConstructArrangement that constructs the arrangement of lines takes $O(n^4)$ time, which is optimal, as proven in Theorems 8.5 and 8.6 of [32].
Since this algorithm returns the vertex list $VL$, the edge list $EL$ and the face list $FL$, it requires $O(n^4)$ space to store all the three lists. The functions GetEdgeCentroid and GetFaceCentroid in lines 7 and 11 respectively take $O(1)$ time and space each. The function GetSequence involves sorting $n$ reference nodes based on their distances from the centroid of the region in consideration. This takes $O(n \log n)$ time and $O(n)$ space. Since the number of faces, edges and vertices is $O(n^4)$, the worst case time requirement for lines 2-13 in the above algorithm is $O(n^5 \log n)$ and the worst case space requirement is $O(n^5)$. Therefore, in total, Algorithm 2 takes $O(n^5 \log n)$ worst-case time and $O(n^5)$ worst-case space to construct the location sequence table.

Table 3.3 compares simulation results for the number of location sequences obtained using the above algorithm with analytical values from Corollary 1. The simulation results are gathered over 1000 random trials (with 100 different random seeds) in each of which $n$ reference nodes were placed uniformly at random in the localization space. From the last two columns of the table it can be seen that the simulated maxima match the analytical maxima exactly for $n \le 6$ and to within about 12% for $7 \le n \le 10$. Note that for a higher number of reference nodes the probability of occurrence of the arrangement that would produce the maximum number of location sequences is less than 1 in 1000, i.e., 0.001. Also, for increasing number of reference nodes, the average number of location sequences is increasingly smaller than the maximum number. Next, we discuss the effect of RF channel random errors on the unknown node location sequence.
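The analytical maxima quoted from Corollary 1 can be reproduced directly from the three bounds of Theorem 3. The following sketch is an illustration added here, not part of the thesis code; the function names are hypothetical, and exact rational arithmetic (`fractions.Fraction`) is used only to avoid rounding in the fractional coefficients.

```python
from fractions import Fraction

def max_vertices(n):
    """Theorem 3, bound on the number of vertices."""
    n = Fraction(n)
    return n**4 / 8 - 7 * n**3 / 12 + 7 * n**2 / 8 - 5 * n / 12

def max_edges(n):
    """Theorem 3, bound on the number of edges."""
    n = Fraction(n)
    return n**4 / 4 - n**3 + 7 * n**2 / 4 - n

def max_faces(n):
    """Theorem 3, bound on the number of faces."""
    n = Fraction(n)
    return n**4 / 8 - 5 * n**3 / 12 + 7 * n**2 / 8 - 7 * n / 12 + 1

def max_location_sequences(n):
    """Corollary 1: sum of the three Theorem 3 bounds."""
    total = max_vertices(n) + max_edges(n) + max_faces(n)
    assert total.denominator == 1  # the sum is an integer for every integer n
    return int(total)

analytical = {n: max_location_sequences(n) for n in range(3, 11)}
```

For $n = 3, \ldots, 10$ this evaluates to 13, 49, 141, 331, 673, 1233, 2089 and 3331. As a sanity check, the three bounds also satisfy $V - E + F = 1$ for every $n$, which is the Euler relation for an arrangement of lines in the plane.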
Number of      Number of          Average          Minimum          Maximum          Maximum
reference      bisector lines     number of        number of        number of        number of
nodes (n)      n(n-1)/2           location         location         location         location
                                  sequences        sequences        sequences        sequences
                                  (simulations)    (simulations)    (simulations)    (analytical)
3              3                  12.3             7                13               13
4              6                  44.0             23               49               49
5              10                 117.3            51               141              141
6              15                 274.8            217              331              331
7              21                 548.4            441              653              673
8              28                 988.6            840              1147             1233
9              36                 1663.9           1447             1881             2089
10             45                 2630.2           2321             2933             3331

Table 3.3: Progression of number of location sequences with number of reference nodes ($n$) in the localization space. The last two columns compare the simulation and analytical results for the maximum number of location sequences. Simulation results are gathered from 1000 random trials (with 100 different random seeds) in each of which $n$ reference nodes were placed uniformly at random in a square localization space.

3.3.5 Unknown Node Location Sequence

The unknown node determines its location sequence using RSS measurements of RF localization packets exchanged between itself and the reference nodes. The RSS measurements are subjected to random errors due to RF channel non-idealities such as multi-path and shadowing. In the absence of such non-idealities, the RSS measurements accurately represent the distances between the unknown node and the reference nodes. If reference nodes are ranked in a decreasing order of these RSS values then this order represents the increasing order of their separation from the unknown node.

This is not true in reality. Reference nodes that are farther from the unknown node might measure higher RSS values than reference nodes that are closer. If the reference nodes are ranked on their respective RSS measurements, the location sequence formed by these ranks will be a corrupted version of the original sequence. Corruption in unknown node location sequence results in erroneous estimation of its location. In the ideal case, when there is no corruption, the unknown node location would be the centroid of the region represented by its location sequence.
However, corruption in its location sequence could cause its location to be erroneously estimated as the centroid of some other region. For example, if the ranks of reference nodes C and D are interchanged because of corruption due to RF channel non-idealities for Region 1 of Figure 3.5(a), the new location sequence would be 1243 instead of 1234, and 1243 represents a region that is adjacent to the original region as shown in Figure 3.5(b).

3.3.6 Feasible and Infeasible Sequences

As discussed previously, combinatorially, $n$ reference nodes produce $O(n^n)$ location sequences. But as shown in Section 3.3.3, a localization space with $n$ reference nodes has only $O(n^4)$ distinct regions and consequently only $O(n^4)$ feasible location sequences in the worst case. For given reference node locations, the location sequence table includes all feasible location sequences. All other sequences are infeasible. The non-idealities of the RF channel could corrupt a feasible location sequence either to another feasible sequence or to an infeasible sequence, as illustrated in Figure 3.7. If the corrupted sequence is infeasible, then it would be possible to detect the corruption in the sequence, whereas, if the corrupted sequence is feasible, corruption detection is not possible.

Here, we would like to emphasize the importance of the low density of feasible location sequences compared to the full sequence space. The low density of feasible location sequences implies that many infeasible sequences are mapped to a single feasible sequence and this in turn could provide robustness to location estimation against RF channel non-idealities.

Figure 3.7: RF channel non-idealities could corrupt a location sequence from the feasible space (size $O(n^4)$) either to another sequence in the feasible space or to a sequence in the infeasible space; the full sequence space has size $O(n^n)$.
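The gap between feasible and infeasible sequences can be made concrete with a small numerical experiment. The sketch below is an illustration added here, not part of the thesis, and uses a hypothetical four-node topology and hypothetical function names. It samples a square area on a fine grid, records the strict nearest-to-farthest order of the reference nodes at each sample, and compares the set of observed orders with all $4! = 24$ permutations. Grid sampling only reaches faces (edges and vertices have zero area), so by the face bound of Theorem 3 at most 18 distinct orders can appear for $n = 4$.

```python
import itertools
import math

def strict_order(point, refs):
    """Reference node indices sorted from nearest to farthest."""
    px, py = point
    return tuple(sorted(
        range(len(refs)),
        key=lambda k: math.hypot(refs[k][0] - px, refs[k][1] - py)))

def sampled_orders(refs, side, step):
    """Distinct nearest-to-farthest orders seen on a square sampling grid."""
    ticks = [i * step for i in range(round(side / step) + 1)]
    return {strict_order((x, y), refs) for x in ticks for y in ticks}

# Hypothetical topology in a 10 x 10 area; node indices 0..3 stand for A..D.
REFS = [(2.0, 3.0), (9.0, 1.0), (7.0, 8.0), (1.0, 9.0)]

seen = sampled_orders(REFS, side=10.0, step=0.05)
all_orders = set(itertools.permutations(range(len(REFS))))
never_seen = all_orders - seen  # orders that no sampled location produces
```

Each order here is written as a permutation of node indices rather than as a rank vector; for tie-free sequences the two representations carry the same information. Whatever the topology, at least six of the 24 permutations never occur, and a measured sequence that falls in `never_seen` is detectably corrupted.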
Next, we present metrics to measure the distance between two location sequences.

3.3.7 Distance Metrics

The distance between two location sequences is essentially the difference in the rank orders of different reference nodes. Fortunately, statistics [78] offers two metrics that capture this difference in rank orders: Spearman's Rank Order Correlation Coefficient and Kendall's Tau.

Given two location sequences $U = \{u_i\}$ and $V = \{v_i\}$, $1 \le i \le n$, where the $u_i$'s and $v_i$'s are the ranks of the reference nodes, the two metrics are defined as follows.

1. Spearman's Rank Order Correlation Coefficient [78]: It is defined as the linear correlation coefficient of the ranks and is given by

\[ \rho = 1 - \frac{6 \sum_{i=1}^{n} (u_i - v_i)^2}{n(n^2 - 1)} \tag{3.19} \]

2. Kendall's Tau [78]: In contrast to Spearman's coefficient, in which the correlation of exact ranks is calculated, this metric calculates the correlation between the relative ordering of ranks of the two sequences. It compares all the $\frac{n(n-1)}{2}$ possible pairs of ranks $(u_i, v_i)$ and $(u_j, v_j)$ to determine the number of matching and non-matching pairs. A pair is matching or concordant if $u_i > u_j \Rightarrow v_i > v_j$ or $u_i < u_j \Rightarrow v_i < v_j$, and non-matching or discordant if $u_i > u_j \Rightarrow v_i < v_j$ or $u_i < u_j \Rightarrow v_i > v_j$. The correlation between the two sequences is calculated as follows:

\[ \tau = \frac{n_c - n_d}{\sqrt{n_c + n_d + n_{tu}} \, \sqrt{n_c + n_d + n_{tv}}} \tag{3.20} \]

where $n_c$ is the number of concordant pairs, $n_d$ is the number of discordant pairs, $n_{tu}$ is the number of ties in the u's and $n_{tv}$ is the number of ties in the v's.

The range of both $\rho$ and $\tau$ is $[-1, 1]$. Next, we describe the procedure to determine locations of unknown nodes using their location sequences.

3.3.8 Location Determination

The location of the unknown node is determined as follows:

1. Calculate the distances between the unknown node location sequence and all location sequences in the location sequence table using the above distance metrics.

2. Choose the centroid represented by the location sequence that is closest to the unknown node location sequence as its location estimate.

Mathematically,

\[ \text{LocationEstimate} = \text{Centroid}\Big( \arg\max_{1 \le i \le O(n^4)} \tau_i \Big) \tag{3.21} \]

where $\tau_i$ is the Kendall's Tau or Spearman's correlation between the unknown node location sequence and the $i$th location sequence in the location sequence table. Because both metrics are correlation coefficients that equal 1 for identical sequences, the closest sequence is the one with the largest $\tau_i$.

Due to RF channel non-idealities, the unknown node location sequence could be a feasible sequence different from its uncorrupted version, or an infeasible sequence. In either case, the above procedure maps it to the centroid of the nearest feasible location sequence in the location sequence table, which may represent a different region in the arrangement than the original uncorrupted version.

We measure the amount of corruption in the unknown node location sequence by calculating its distance from the uncorrupted version, using the above metrics, and denote it by T. We denote the distance between the unknown node location sequence and the nearest feasible sequence in the location sequence table by τ.

Calculating Spearman's coefficient and Kendall's Tau between two sequences are $O(n)$ and $O(n^2)$ operations respectively. Since the location sequence table is of size $O(n^4)$, searching through it takes $O(n^5)$ and $O(n^6)$ operations respectively for the two metrics. Later in the chapter, in Section 3.5, we compare the performance of the two distance metrics in terms of the error in the unknown node location estimate.

3.3.9 Localization Scenarios

Here, we illustrate two localization procedures for two different scenarios that are determined by the localization space size.

1. Entire localization space is within the radio range of the unknown node: In this case, the location sequence table remains constant for all locations of the unknown node in the localization space. Therefore, the localization procedure is as follows:

   (a) Pre-construct and store the location sequence table using the locations of the reference nodes.
   (b) When the unknown node initiates the localization process by broadcasting a localization packet, provide the stored location sequence table along with the RSS measurements from the reference nodes.

   (c) The unknown node determines its location sequence using the RSS measurements and determines its location by searching through the provided location sequence table for the nearest feasible location sequence.

   Here, the time cost incurred by the unknown node to estimate its location is equal to the sum of the time to determine its location sequence, an $O(n \log n)$ operation, and the time to search through the location sequence table, an $O(n^6)$ operation. The amount of memory space required is of the order of $O(n^5)$ bytes.

2. Localization space is much larger than the radio range of the unknown node: In this case, the location sequence table changes with the location of the unknown node, as a different set of reference nodes is encountered at each location. Therefore, the localization procedure is as follows:

   (a) The unknown node collects the locations and RSS measurements of the reference nodes in its radio range.

   (b) It constructs the location sequence table, using Algorithm 2, from the locations of the reference nodes and calculates its location sequence using the RSS measurements.

   (c) It determines its location by searching for the nearest sequence in the location sequence table.

   In this case, the time cost incurred by the unknown node to estimate its location is equal to the sum of the time to calculate its location sequence, an $O(n \log n)$ operation, the time to construct the location sequence table, an $O(n^5 \log n)$ operation, and the time to search through it, an $O(n^6)$ operation. The memory requirement is $O(n^5)$ in this case also.

A wireless device that is typically used as an unknown node is of the form factor of an IPAQ [2] (which can communicate with the reference node devices, usually of the form factor of Berkeley MICA 2 motes [6]) and typically has a 300 MHz processor and 128 MB of RAM.
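The two metrics of Section 3.3.7 and the naive table search of Section 3.3.8 can be sketched as below. This is an illustration under stated assumptions, not the implementation used for the results in this chapter: the function names and the dictionary layout of the table are hypothetical, a tie is counted only when a pair is tied in one sequence and not the other, and the nearest sequence is taken to be the one with the largest coefficient, since both metrics equal 1 for identical sequences.

```python
from itertools import combinations
from math import sqrt


def spearman_rho(u, v):
    """Spearman's rank order correlation (Equation 3.19), no ties assumed."""
    n = len(u)
    squared_diff = sum((ui - vi) ** 2 for ui, vi in zip(u, v))
    return 1.0 - 6.0 * squared_diff / (n * (n * n - 1))


def kendall_tau(u, v):
    """Kendall's Tau (Equation 3.20) over all n(n-1)/2 rank pairs."""
    n_c = n_d = n_tu = n_tv = 0
    for i, j in combinations(range(len(u)), 2):
        du, dv = u[i] - u[j], v[i] - v[j]
        if du * dv > 0:
            n_c += 1   # same relative order in both sequences
        elif du * dv < 0:
            n_d += 1   # opposite relative order
        elif du == 0 and dv != 0:
            n_tu += 1  # assumption: pair tied only in u
        elif dv == 0 and du != 0:
            n_tv += 1  # assumption: pair tied only in v
    return (n_c - n_d) / (sqrt(n_c + n_d + n_tu) * sqrt(n_c + n_d + n_tv))


def nearest_sequence(measured, table, metric=kendall_tau):
    """Return (sequence, centroid, score) of the best-matching table entry.

    `table` maps feasible sequences (tuples of ranks) to region centroids.
    """
    best = max(table, key=lambda seq: metric(measured, seq))
    return best, table[best], metric(measured, best)


# Sequences printed in the four panels of Figure 3.8.
true_sequence = (1, 2, 3, 4, 5, 6, 7, 8, 9)
corrupted = [
    (1, 2, 3, 5, 6, 8, 4, 9, 7),
    (1, 2, 5, 3, 7, 9, 4, 8, 6),
    (2, 4, 3, 9, 7, 6, 5, 8, 1),
]
t_values = [round(kendall_tau(true_sequence, c), 3) for c in corrupted]
print(t_values)
```

For the three corrupted sequences the sketch gives 0.722, 0.556 and 0.111, which are the T values quoted in the caption of Figure 3.8. The search itself is a linear scan, which is where the $O(n^4)$ factor in the search cost comes from.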
In real application scenarios, a typical value for the number of reference nodes (n) is less than 15, beyond which there is only a very marginal gain in the location accuracy of the unknown node. Therefore, for a typical value of n = 10 reference nodes, the time and space requirements for the unknown node to construct the location sequence table are approximately 0.3 milliseconds and 32 KB respectively, and the time required to search through it is approximately 0.4 milliseconds. Thus, including the associated overhead, the total localization time taken by sequence-based localization is in milliseconds in typical application scenarios, which is very efficient. Next, we illustrate the robustness of our localization technique to RF channel non-idealities through some examples.

3.3.10 Examples

Figure 3.8 shows the sample layout of nine reference nodes placed in a grid and the single unknown node (P) considered in Figure 3.3. Figure 3.8(a) plots the location estimate (E) for the ideal case when there are no erroneous ranks, i.e., the location sequence is uncorrupted or T = 1. In these examples we use Kendall's Tau to measure the distance between sequences. Figures 3.8(b), 3.8(c) and 3.8(d) show the location estimates for increasing corruption in unknown node location sequences.

[Figure 3.8, panels (a) to (d). Axes: X-Axis (meters), Y-Axis (meters). Panel titles: Sequence: 123456789; Sequence: 123568497; Sequence: 125379486; Sequence: 243976581.]

Figure 3.8: Robustness examples: Location estimate (E) for the unknown node (P) at (1, 3) for a grid layout of 9 reference nodes. The number adjacent to a reference node is its corresponding rank. The location error is expressed in meters, where the side length of the square localization area is 12 meters. (a) (T = 1, τ = 1), Estimate (E): (1.33, 1.33), Location Error: 0.46 meters (b) (T = 0.722, τ = 0.783), Estimate (E): (2.0, 2.0), Location Error: 1.4 meters (c) (T = 0.556, τ = 0.667), Estimate (E): (2.0, 2.0), Location Error: 1.4 meters (d) (T = 0.111, τ = 0.278), Estimate (E): (2.0, 1.33), Location Error: 1.94 meters

Similar to the example of Figure 3.3, the location estimate error increases with increasing corruption, or decreasing correlation T, between the RSS location sequence and the true location sequence of P.

These examples suggest that sequence-based localization is robust to multi-path and shadowing effects of the RF channel up to some level. Intuitively, the three main reasons to which this robustness can be attributed are:

1. The low density, $O(n^4)$, of the location sequence space relative to the entire sequence space of $O(n^n)$.

2. The inherent redundancy of comparing $\frac{n(n-1)}{2}$ rank pairs in calculating the distance between two sequences using Kendall's Tau.

3. The rank order in the location sequence of the unknown node due to two reference nodes with RSS readings $R_i$ and $R_j$ is robust to random errors in them up to a tolerance level of $|R_i - R_j|$.

Having presented two localization techniques, one based on location constraints called Ecolocation and the other based on location sequences called SBL, we investigate their differences and similarities in the next section.

[Figure 3.9 diagram: reference nodes A, B, C, D.]

Figure 3.9: Overlap of Ecolocation scanning grid and regions created by arrangement of bisector lines in SBL.

3.4 SBL Vs. Ecolocation

In Ecolocation, the location estimate of the unknown node is the centroid of all grid points that have the maximum number of matched constraints, whereas in SBL the centroid of the region represented by a location sequence is the location estimate. Figure 3.9 shows the overlap of Ecolocation scanning grid points and the regions created by bisector lines in SBL.
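The overlap between a scanning grid and the bisector regions can be sketched by scanning grid points, grouping them by the location sequence seen at each point, and taking the centroid of each group. This is a plainly different technique from the table construction of Algorithm 2: it is a grid approximation in the spirit of Ecolocation's scan, it only finds regions that contain at least one grid point (so edges, vertices and very small faces are missed), and the function names and node layout are hypothetical.

```python
from collections import defaultdict


def sequence_at(point, reference_nodes):
    """Ranks of the reference nodes by increasing distance from `point`."""
    px, py = point
    squared = [(px - x) ** 2 + (py - y) ** 2 for x, y in reference_nodes]
    order = sorted(range(len(squared)), key=lambda k: squared[k])
    ranks = [0] * len(squared)
    for rank, node in enumerate(order, start=1):
        ranks[node] = rank
    return tuple(ranks)


def scan_table(reference_nodes, side, steps):
    """Approximate table: location sequence -> centroid of its grid points.

    Scans the centers of a steps x steps grid over a side x side square.
    """
    members = defaultdict(list)
    for a in range(steps):
        for b in range(steps):
            point = ((a + 0.5) * side / steps, (b + 0.5) * side / steps)
            members[sequence_at(point, reference_nodes)].append(point)
    return {
        seq: (sum(p[0] for p in pts) / len(pts), sum(p[1] for p in pts) / len(pts))
        for seq, pts in members.items()
    }


# Hypothetical layout: three reference nodes in a 10 m x 10 m square.
nodes = [(2.0, 2.0), (8.0, 3.0), (5.0, 9.0)]
table = scan_table(nodes, side=10.0, steps=50)
for seq, centroid in sorted(table.items()):
    print(seq, centroid)
```

Raising `steps` plays the role of raising the scanning resolution: the grid centroids approach the face centroids, at a cost that grows with the number of grid points rather than with n alone.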
It can be observed that if the scanning resolution of Ecolocation is high enough, then the centroid of the highest-constraint-matching grid points is the same as the centroid of the region represented by the corresponding location sequence. This results from the fact that the constraint set in Ecolocation is equivalent to the location sequence in SBL.

The main difference between Ecolocation and SBL is in their complexity. While Ecolocation takes $O(n^2 S^2 / r^2)$ time and $O(n^2)$ space, SBL takes $O(n^6)$ time and $O(n^5)$ space. Clearly, while Ecolocation depends on the number of reference nodes in the radio range n, the localization space size $S^2$ and the scanning resolution r, SBL depends only on the number of reference nodes n. This suggests that for large localization spaces and high location resolutions SBL is the preferred method, and for small localization spaces and low location resolutions Ecolocation is preferred.

Since Ecolocation is equivalent to SBL for high scanning resolutions, for the rest of the chapter we focus on SBL.

3.5 Evaluation

In this section, we present a complete performance evaluation of sequence-based localization (SBL). First, we discuss its inherent location error characteristics, and then, using simulations, we study its performance as a function of RF channel and node deployment parameters. We also present a comparative study with three other state-of-the-art localization techniques.

3.5.1 Location Error Characteristics

Each location sequence maps to the centroid of the region it represents. Representing all locations in a region by its centroid comes at the cost of error in the location estimate of the location sequence. If the region is a face, then the location error is of the order of the square root of the area of the face, and if the region is an edge, then it is of the order of the length of the edge.
Figure 3.10 plots the average, average maximum and average minimum face areas and edge lengths gathered over 1000 random trials, in each of which n reference nodes were placed uniformly at random in a square localization space of size S × S sq. meters.

[Figure 3.10, panels (a) and (b). (a) Axes: Number of Reference Nodes (n), Face Area (% of S^2); curves: Avg. Max. Area, K1/n^2, Avg. Area, K2/n^4, Avg. Min. Area. (b) Axes: Number of Reference Nodes (n), Edge Length (% of S); curves: Avg. Max. Length, K3/(n+1.5), Avg. Length, K4/n^2.5, Avg. Min. Length.]

Figure 3.10: Simulation results averaged over 1000 random trials (with 100 different random seeds), in each of which n reference nodes were placed uniformly at random in a 2D square localization area of S × S sq. meters. (a) The average maximum, average and average minimum face areas as a function of the number of reference nodes. (b) The average maximum, average and average minimum edge lengths as a function of the number of reference nodes. K1, K2 and K3, K4 are scaling constants.

The main error characteristics obtained from curve fitting can be summarized as follows:

• The average face area varies proportionally to $1/n^4$. Since the location estimate error of locations in a face region is proportional to the square root of its area, the average location estimate error for locations in a face region decreases as $1/n^2$.

• The average maximum face area varies proportionally to $1/n^2$. Therefore, the maximum location estimate error in a face region decreases as $1/n$, which is slower than the reduction in the average location estimate error.

• The average edge length varies proportionally to $1/n^{2.5}$. Since the location estimate error for locations on an edge is proportional to its length, the average location estimate error for locations on an edge decreases as $1/n^{2.5}$, which is faster than that for locations in a face region.

• The maximum edge length varies proportionally to $1/(n + 1.5)$.
Therefore, the maximum location estimate error for locations on an edge decreases approximately as $1/n$, which is slower than the reduction of the average location estimate error.

Apart from the above location errors, the performance of sequence-based localization is affected by random errors in RSS measurements due to multi-path and shadowing effects of the RF channel. In the rest of this section, we present results from simulation studies that capture the effect of these random errors on the performance of SBL.

3.5.2 Simulation Model

The most widely used simulation model to generate RSS samples as a function of distance in RF channels is the log-normal shadowing model [83]:

\[ P_R(d) = P_T - PL(d_0) - 10 \eta \log_{10}\frac{d}{d_0} + X_\sigma \tag{3.22} \]

where $P_R$ is the received signal power, $P_T$ is the transmit power and $PL(d_0)$ is the path loss for a reference distance of $d_0$. $\eta$ is the path loss exponent, and the random variation in RSS is expressed as a Gaussian random variable of zero mean and variance $\sigma^2$, $X_\sigma \sim N(0, \sigma^2)$. All powers are in dBm and all distances are in meters. In this model we do not provision separately for any obstructions such as walls. If obstructions are to be considered, an extra constant needs to be subtracted from the right hand side of the above equation to account for the attenuation in them (the constant depends on the type and number of obstructions).

3.5.3 Simulation Parameters

The accuracy of radio frequency based localization techniques depends on a number of parameters. Chief among these is the accuracy of RSS measurements. In an ideal world, in which RSS values are not affected by multi-path fading effects, they represent true distances between nodes, which can lead to very accurate localization of unknown nodes. The ideal world is represented by σ = 0 in Equation 3.22. However, in the real world RSS values are corrupted by multi-path fading effects. This has a profound influence on the accuracy of RF localization techniques.
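A minimal sketch of drawing RSS samples from Equation 3.22 follows. The function name `rss_dbm` and its argument names are assumptions made for illustration; the default transmit power and reference path loss are the typical values the chapter lists in Table 3.4 (4 dBm and 55 dB at 1 m). Setting `sigma` to 0 gives the ideal, noise-free channel.

```python
import math
import random


def rss_dbm(distance_m, eta, sigma, p_t=4.0, pl_d0=55.0, d0=1.0, rng=random):
    """Received power (dBm) at `distance_m` under the log-normal shadowing model.

    eta   : path loss exponent
    sigma : standard deviation of the shadowing term (dB); 0 disables it
    rng   : any object with a gauss(mu, sigma) method, e.g. random.Random(seed)
    """
    shadowing = rng.gauss(0.0, sigma) if sigma > 0 else 0.0
    return p_t - pl_d0 - 10.0 * eta * math.log10(distance_m / d0) + shadowing


# Ideal channel: 10 m away with eta = 4 gives 4 - 55 - 40 = -91 dBm.
ideal_value = rss_dbm(10.0, eta=4.0, sigma=0.0)

# Noisy channel: a few samples with sigma = 7 dB and a fixed seed.
seeded = random.Random(0)
noisy_values = [rss_dbm(10.0, eta=4.0, sigma=7.0, rng=seeded) for _ in range(5)]
print(ideal_value, noisy_values)
```

An obstruction term, if needed, would be one more constant subtracted inside the return expression, as the model description notes.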
According to the above propagation model, RSS values are defined by the η and σ values for the given environment. Since every RF environment can be characterized by η and σ values ([47], [68]), it is necessary to study the accuracy of RF localization techniques as a function of these two parameters.

In addition, the density and number of reference nodes available to the unknown node have a significant influence on the accuracy of the location estimate ([100], [48], etc.). Thus the location estimate of any RF-based localization technique depends on a fundamental set of parameters, which can be broadly categorized as follows:

1. RF Channel Characteristics: ([47], [83])

   (a) Path loss exponent (η): Measures the power attenuation of RF signals relative to distance.

   (b) Standard deviation (σ): Measures the standard deviation in RSS measurements due to log-normal shadowing.

   The values of η and σ change with the frequency of operation and the obstructions and disturbance in the environment.

2. Node Deployment Parameters:

   (a) Number of reference nodes (n).

   (b) Reference node density (β).

Table 3.4 lists the typical values and ranges for the different parameters used in our simulations.

Parameter   Typical Value                  Typical Range
P_T         4 dBm (max.)                   NA
PL(d_0)     55 dB (d_0 = 1 m) [68]         NA
η           4 (indoors), 4 (outdoors)      1 to 7 [47]
σ           7 (indoors), 4 (outdoors)      2 to 14 [47]
n           10                             3 to 10
β           0.1 (one node in 10 sq. m)     {0.01, 0.04, 0.1, 1}

Table 3.4: Typical values and ranges for different simulation parameters

3.5.4 Simulation Procedure

We assume that all reference nodes are in radio range of each other and also that of the unknown node. A linear congruential pseudo-random number generator with 48-bit arithmetic was used, and results were averaged over 100 random trials. In each trial, n reference nodes were placed uniformly at random in a square localization space of size S × S sq. meters, and the unknown node was placed at 100 different locations on a grid of S/10 separation. In total, the results presented are averaged over 10000 different scenarios.
In our simulations we use S = 100 meters.

The performance of sequence-based localization is measured in terms of location error for a wide range of RF channel conditions and node deployment parameters. Location error is defined as the Euclidean distance between the location estimate and the actual location of the unknown node. The location error is averaged over 100 random trials as described previously.

Figure 3.11 compares the two distance metrics described in the previous section in terms of average location error, as a function of the number of reference nodes (n), or in other words the length of the location sequence. There is a growing difference, however small, between the two metrics with increasing length of the sequence, with Kendall's Tau performing increasingly better than Spearman's correlation in terms of the location estimate error.

[Figure 3.11. Axes: Number of Reference Nodes (n), Avg. Location Error (meters). Title: η = 4, σ = 7, S = 100 meters. Curves: Kendall, Spearman.]

Figure 3.11: Average location error as measured using Spearman's correlation and Kendall's Tau as a function of the number of reference nodes.

[Figure 3.12, panels (a) to (c). Axes: T, CDF of T. Panel titles: CDF of T for n = 10, η = 4 (σ = 4, 7, 10); CDF of T for n = 10, σ = 7 (η = 2, 4, 6); CDF of T for η = 4, σ = 7 (n = 4, 7, 10).]

Figure 3.12: Sequence corruption: Cumulative distribution function of Kendall's Tau T between the RSS location sequence and the true location sequence for varying (a) standard deviation (σ), (b) path loss exponent (η), (c) number of reference nodes (n).

3.5.5 Simulation Results: Sequence Corruption

Figure 3.12 plots the corruption in location sequences, represented by T, due to RF channel and node deployment parameters.
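How samples of the corruption measure T might be generated can be sketched as follows. This is a simplified stand-in for the simulation procedure of Section 3.5.4, not a reproduction of it: the unknown node is placed at random instead of on a grid, Python's default generator replaces the 48-bit linear congruential one, and all function names are hypothetical. The constant terms of Equation 3.22 are dropped because they do not affect the ranking.

```python
import math
import random
from itertools import combinations


def ranks_descending(values):
    """Rank 1 for the largest value; assumes distinct values."""
    order = sorted(range(len(values)), key=lambda k: values[k], reverse=True)
    ranks = [0] * len(values)
    for rank, k in enumerate(order, start=1):
        ranks[k] = rank
    return ranks


def kendall_tau(u, v):
    """Kendall's Tau for two tie-free rank sequences."""
    pairs = list(combinations(range(len(u)), 2))
    score = sum(1 if (u[i] - u[j]) * (v[i] - v[j]) > 0 else -1 for i, j in pairs)
    return score / len(pairs)


def corruption_samples(n, eta, sigma, side=100.0, trials=200, seed=0):
    """Sample T: tau between the true and the RSS-based location sequence."""
    rng = random.Random(seed)
    samples = []
    for _ in range(trials):
        refs = [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]
        px, py = rng.uniform(0, side), rng.uniform(0, side)
        dists = [math.hypot(px - x, py - y) for x, y in refs]
        # Distance-dependent part of the shadowing model plus Gaussian noise.
        rss = [-10.0 * eta * math.log10(d) + rng.gauss(0.0, sigma) for d in dists]
        true_seq = ranks_descending([-d for d in dists])  # closest node gets rank 1
        samples.append(kendall_tau(true_seq, ranks_descending(rss)))
    return samples


clean = corruption_samples(n=10, eta=4.0, sigma=0.0, trials=20)
noisy = corruption_samples(n=10, eta=4.0, sigma=10.0)
print(sum(clean) / len(clean), sum(noisy) / len(noisy))
```

Sorting the returned samples gives an empirical CDF of T of the kind plotted in Figure 3.12; with `sigma=0` every sample is 1 because the RSS ranking then matches the distance ranking exactly.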
According to these results, the corruption in location sequences

• increases with increasing randomness in the RF channel, represented by the standard deviation in RSS, σ (Figure 3.12(a));

• decreases with increasing path loss exponent, η (Figure 3.12(b));

• is independent of the number of reference nodes in the localization space, n (Figure 3.12(c)).

3.5.6 Simulation Results: Performance Study

[Figure 3.13, panels (a) to (c). (a) Axes: σ, η, Avg. location error (meters); title: n = 10, S = 100 meters. (b) Axes: n, β (log scale), Avg. location error (meters); title: η = 4, σ = 7, S = 100 meters. (c) Axes: X-AXIS (meters), Y-AXIS (meters), Avg. Location Error (meters); title: n = 10, η = 4, σ = 7.]

Figure 3.13: Performance: (a) Average location error as a function of RF channel parameters: standard deviation (σ) and path loss exponent (η). (b) Average location error as a function of node deployment parameters: number of reference nodes (n) and reference node density (β). (c) Average location error as a function of the location of the unknown node.

[Figure 3.14, panels (a) and (b). (a) Axes: Kendall's Tau, Avg. location error (meters); title: n = 10, η = 4, σ = 7, S = 100 meters; curves: T, τ. (b) Axes: T, τ; title: n = 10, η = 4, σ = 7.]

Figure 3.14: (a) Average location error as a function of the sequence corruption (T) and as a function of the distance (τ) between the corrupted sequence and its nearest feasible sequence in the location sequence table. (b) Correlation between τ and T.

Figure 3.13 plots the average location error due to SBL as a function of RF channel and node deployment parameters. The main results are:

• Location error due to SBL is higher for RF channels with higher standard deviation (σ) values (Figure 3.13(a)). This is due to higher levels of corruption in location sequences at higher values of σ.
• Location error due to SBL is lower for RF channels with higher path loss exponent (η) values (Figure 3.13(a)). This is due to lower levels of corruption in location sequences at higher η values.

• Location error due to SBL reduces with increasing number of reference nodes (n), suggesting that longer sequences are more robust to RF channel non-idealities than shorter sequences (Figure 3.13(b)).

• Location error due to SBL reduces with increasing reference node density β, according to Figure 3.13(b).

• Location error due to SBL depends on the location of the unknown node. Figure 3.13(c) plots the average location error for all possible unknown node locations in the localization space. It shows that unknown node locations that are closer to the center of the localization space have lower location error than unknown node locations closer to the boundaries of the localization space. This can be verified from the observation (e.g., Figure 3.4(b)) that for any arrangement of bisector lines, the faces and edges towards the center of the localization space have smaller areas and lengths respectively than those at its boundaries. Consequently, for unknown node locations towards the center of the localization space, the location to which the nearest feasible sequence of the corrupted sequence maps will be closer to the true location of the unknown node than for locations towards the boundaries. This results in lower location errors for unknown node locations towards the center of the localization space than for locations towards its boundaries.

• Figure 3.14(a) plots the average location error as a function of the Kendall's Tau values T and τ, and Figure 3.14(b) plots τ as a function of T. The figures suggest that:

  - The location error is correlated to T, the corruption due to the RF channel.

  - The location error is correlated to τ, the distance between the corrupted sequence and the nearest feasible sequence.

  - A correlation exists between τ and T.
  This suggests that τ, which is a measurable quantity, as opposed to T, could be used as a quantitative indicator of the location error due to sequence-based localization. Also, owing to its correlation to T, it could also be used as an approximate indicator of the state of the RF channel.

3.5.7 Simulation Results: Comparative Study

We compare SBL with three other localization techniques: least squares estimator, proximity localization and 3-centroid. The least squares estimator method has been discussed in detail in Section 2.3 of Chapter 2. In proximity localization, the location of the closest reference node by RSS value is chosen as the location of the unknown node. This is an extreme special case of SBL in which the sequence is of length 1. In the Centroid technique, the centroid of all the reference nodes in the radio range of the unknown node is chosen as its location ([26]). Since, in our case, all reference nodes are in the radio range of the unknown node, the location error would be independent of the RF channel characteristics. In order to measure the effect of these characteristics on the centroid technique, we choose the centroid of the closest three reference nodes by RSS values as the location of the unknown node, called 3-centroid.

Figure 3.15 plots the average location error due to SBL, LSE, Proximity and 3-Centroid as a function of the standard deviation in the RSS log-normal distribution σ, for different values of path loss exponent η and for different values of number of reference nodes n.

[Figure 3.15, panels (a) to (f). Axes: σ, Avg. Location Error (meters). Panel titles: n = 10, η = 2; n = 10, η = 4; n = 10, η = 6; n = 4, η = 4; n = 7, η = 4; n = 10, η = 4 (all with S = 100 meters). Curves: SBL, LSE, Proximity, 3-Centroid.]

Figure 3.15: Comparison: Average location error due to SBL, LSE, Proximity and 3-Centroid as a function of the standard deviation of the RSS log-normal distribution σ for different values of path loss exponent η: (a) η = 2, n = 10 (b) η = 4, n = 10 (c) η = 6, n = 10, and for different values of number of reference nodes n: (d) n = 4, η = 4 (e) n = 7, η = 4 (f) n = 10, η = 4.

The main results of the comparison are:

• SBL performs better than Proximity and 3-Centroid over a range of RF channel and node deployment parameters.

• SBL performs better than LSE for higher values of σ, whereas LSE performs better than SBL for lower values of σ. There is a crossover value of σ between the error due to SBL and LSE, and this value of σ is higher for environments that have more attenuation, i.e., higher values of path loss exponent η. There is no significant change in the value of the crossover σ with changing number of reference nodes n.

• For lower values of σ, the location error due to SBL decreases faster than the location error due to LSE for increasing values of n. This can be seen in Figures 3.15(d), (e) and (f), in which the difference between the location error due to SBL and LSE reduces with increasing values of n.

• LSE is outperformed by all other localization techniques after some value of σ, and this value is the lowest for SBL.

It should be noted that in the above simulations LSE operates at a considerable advantage over the other techniques, as the exact value of the path loss exponent η is known. This advantage vanishes in real world scenarios, where the value of η is very difficult to estimate accurately owing to its dependence on area features such as walls, furniture, etc.
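The two simplest baselines in this comparison, proximity localization and 3-centroid, can be sketched in a few lines. The function names and the sample layout are hypothetical; only the selection rules (strongest RSS, and centroid of the three strongest) come from the descriptions in Section 3.5.7.

```python
def proximity_estimate(ref_positions, rss):
    """Location of the reference node with the highest RSS."""
    best = max(range(len(rss)), key=lambda k: rss[k])
    return ref_positions[best]


def k_centroid_estimate(ref_positions, rss, k=3):
    """Centroid of the k reference nodes with the highest RSS (k=3: 3-centroid)."""
    top = sorted(range(len(rss)), key=lambda idx: rss[idx], reverse=True)[:k]
    xs = [ref_positions[i][0] for i in top]
    ys = [ref_positions[i][1] for i in top]
    return (sum(xs) / len(top), sum(ys) / len(top))


# Hypothetical reference node positions (meters) and RSS readings (dBm).
refs = [(0.0, 0.0), (6.0, 0.0), (0.0, 6.0), (9.0, 9.0)]
readings = [-50.0, -60.0, -55.0, -80.0]

proximity = proximity_estimate(refs, readings)
three_centroid = k_centroid_estimate(refs, readings)
print(proximity, three_centroid)
```

Setting `k` to the number of reference nodes recovers the plain Centroid technique, whose estimate no longer depends on the RSS values at all, which is why the chapter restricts it to the three strongest nodes.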
Thus, LSE may not perform as well in real world scenarios.

Table 3.5 compares the time and space complexities of SBL with those of the other three localization techniques. We believe that the efficiency of SBL can be increased significantly by using more efficient location sequence table search algorithms, as opposed to a naive search.

         SBL       LSE        Proximity    3-Centroid
Time     O(n^6)    O(nr^2)    O(n log n)   O(n log n)
Space    O(n^5)    O(r^2)     O(n)         O(n)

Table 3.5: Comparison of worst-case computational complexities of SBL, LSE, Proximity and 3-Centroid.

3.6 Real World Experiments

The performance of sequence-based localization in real systems is studied through two experiments, representing different RF channel and node deployment parameters, conducted using Berkeley MICA 2 motes [6]. The first experiment was conducted in a parking lot, which represents a relatively obstruction free RF channel, and the second experiment was conducted in an office building with many rooms and furniture, which represents a typical indoor environment. For comparison, the locations of the unknown nodes were also estimated using the three localization techniques described in the previous section: least squares estimator (LSE), proximity localization and 3-centroid.

3.6.1 Outdoor Experiment: Parking lot

The RF channel in an outdoor parking lot represents a class of relatively obstruction free channels. Eleven MICA 2 motes were placed randomly on the ground as shown in Figure 3.16. All motes were in line of sight of each other, and all of them were programmed to broadcast a single packet without interfering with each other (see footnote 2). The motes recorded the RSS values of the received packets and stored them in their EEPROMs, which were later used off-line for location estimation.

The locations of all the motes were estimated and compared with their true locations. Since all motes were in radio range of each other, each mote had ten reference nodes.
For the LSE method, to estimate the distances between the motes, the RSS model described by Equation 3.22 in Section 3.5.2 was used, as there were no obstructions between motes in this experiment. The performance of the LSE technique depends on the value of the path loss exponent η for the area in which the experiment was conducted. For this experiment, we used the true distances and the corresponding RSS values between the reference nodes and the unknown node to estimate the value of η. Figure 3.16(a) plots RSS values as a function of distance and the least-squares (best fit) line obtained from linear regression analysis. If $(x_k, y_k)$, $1 \le k \le m$, are the data points, the slope (which is $-\eta$) and intercept of the least-squares best fit line are given by ([1]):

\[ \text{slope} = -\eta = \frac{\left(\sum_{k=1}^{m} x_k\right)\left(\sum_{k=1}^{m} y_k\right) - m\left(\sum_{k=1}^{m} x_k y_k\right)}{\left(\sum_{k=1}^{m} x_k\right)^2 - m\left(\sum_{k=1}^{m} x_k^2\right)} \tag{3.23} \]

\[ \text{intercept} = \frac{\left(\sum_{k=1}^{m} x_k\right)\left(\sum_{k=1}^{m} x_k y_k\right) - \left(\sum_{k=1}^{m} x_k^2\right)\left(\sum_{k=1}^{m} y_k\right)}{\left(\sum_{k=1}^{m} x_k\right)^2 - m\left(\sum_{k=1}^{m} x_k^2\right)} \tag{3.24} \]

Footnote 2: We had actually measured the RSS of 100 packets in one minute and observed that their standard deviation was less than 0.5 dBm. Therefore, we decided to use only a single packet for localization. In real application scenarios this would help in conserving energy at the mote and reducing the delay in localization without affecting its accuracy.

[Figure 3.16, panels (a) to (d). (a) Axes: Distance (dB-meters), RSS (dBm); title: Outdoor experiment η value. (b) Axes: X (meters), Y (meters); legend: True Location, Estimated Location; nodes 1 to 11. (c) Axes: Node ID, Location error (meters); legend: SBL, LSE, Proximity, 3-Centroid. (d) Axes: Node ID, Kendall's Tau; legend: T, τ. Node ID order in (c) and (d): 8, 2, 7, 10, 3, 11, 6, 4, 5, 9, 1.]

Figure 3.16: Outdoor experiment: 11 MICA 2 motes, placed randomly in a 144 sq. meters area, were used as reference nodes as well as unknown nodes. Consequently, each unknown node had 10 reference nodes. (a) Path loss exponent calculation, η = 2.9.
(b) Comparison between true locations and SBL location estimates. (c) Location error due to SBL, LSE, Proximity and 3-Centroid (the nodes are ordered in increasing error of SBL). (d) Corruption measure T and error indicator τ.

We applied the above expressions to the RSS vs. distance data and calculated the value of η to be 2.9. We used this value of η to evaluate the LSE technique.

Figure 3.16(b) compares the true mote locations with the SBL location estimates for all the motes. The figure also shows the arrangement induced by the perpendicular bisectors between all pairs of reference nodes. Figure 3.16(c) plots the error at each mote location in meters due to all four techniques. Evidently, SBL performs better than Proximity and 3-Centroid in ten out of eleven cases, and it performs better than LSE in all eleven cases.

Figure 3.16(d) plots the sequence corruption (T) at each mote location and the distance (τ) between the corrupted sequence and the nearest feasible sequence in the location sequence table for all 11 nodes. The correlation between T and τ can be clearly seen from the figure. Comparing Figure 3.16(c) and Figure 3.16(d), broad correlations between T and location error and between τ and location error can be observed for SBL. For example, the location error is highest for node IDs 1 and 9, in that order, and τ is the lowest for the same node IDs in the same order. Also, the location error is almost equal for nodes 8, 2, 7 and 10. This trend is also reflected in the values of τ for those nodes.

3.6.2 Indoor Experiment: Office building

Office buildings with features such as rooms, corridors, furniture and other obstructions represent a distinct class of RF channels. Twelve MICA 2 motes (reference nodes) were placed on the ground randomly in a corner of the Electrical Engineering building at USC, spanning different rooms and corridors. Figure 3.17 shows a schematic of the experimental setup.
In this experiment, an unknown node was placed at five different locations and these locations were estimated using all the twelve motes as reference nodes. As in the outdoor experiment, the unknown node was programmed to broadcast a single packet from each location and the reference nodes recorded the RSS values of this packet in their respective EEPROMs, which were later used off-line for location estimation.

[Figure 3.17: Indoor experiment: 12 MICA 2 motes, placed randomly in a 120 sq. meters area, were used as reference nodes. The location of the unknown node was estimated for 5 different locations using the 12 reference nodes. The schematic shows two office rooms, a conference room with furniture, and a door. (a) Path loss exponent calculation (RSS in dBm versus distance in dB-meters), $\eta = 2.2$. (b) Comparison between true path and SBL estimated path (X and Y in meters). (c) Location error in meters due to SBL, LSE, Proximity and 3-Centroid (the nodes are ordered in increasing error of SBL). (d) Corruption measure $T$ and error indicator $\tau$ (Kendall's Tau) per unknown node location.]

Unlike in the outdoor experiment, not all motes were in line of sight of each other even though they were in each other's radio range. A subset of the motes had obstructions in between them in the form of walls. For this experiment, we calculated the value of $\eta$ to be 2.2 by applying the same linear regression analysis used for the outdoor experiment to the indoor RSS versus distance data. Figure 3.17(a) shows the data and the least-squares line.

Figure 3.17(b) compares the SBL location estimates of the five unknown node locations with their true locations. It can be seen that the path of the location estimates closely follows the true path of the unknown node.

Figure 3.17(c) plots the location estimate error due to the SBL, LSE, Proximity and 3-Centroid techniques for each unknown node location. It can be observed that SBL performs better than LSE and 3-Centroid in four out of the five cases and better than Proximity in two out of five cases. A possible reason why Proximity performs well is the relatively dense distribution of the reference nodes.

Figure 3.17(d) plots the sequence corruption ($T$) at each mote location and the distance ($\tau$) between the corrupted sequence and the nearest feasible sequence in the location sequence table for all the 5 unknown node locations. Comparing this figure with Figure 3.16(d) shows that sequences are more corrupted in the indoor experiment than in the outdoor experiment, which was expected. As in the outdoor experiment, there is a clear correlation between $T$ and $\tau$ for the indoor experiment. However, the correlations between $T$ and location error and between $\tau$ and location error are not as clear as those in the outdoor experiment.

3.6.3 Discussion

Experimental results show that localization techniques are more accurate for relatively clutter-free RF channel environments (outdoors with line of sight) than for RF channels with many obstructions (indoor environment). Also, the performance of LSE in real world scenarios is worse than in simulations, as was conjectured in Section 3.5.7. This is mainly because the radio propagation model of Equation 3.22 is an approximate model and the location estimate accuracy for the LSE technique depends heavily on the accuracy of the $\eta$ estimate. The RSS measurements in the experiments depend on antenna orientations, antenna height and transmitter/receiver non-determinism. For simulations, these issues can be captured within the log-normal random term in Equation 3.22.
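The regression step behind the path loss exponent estimates in Sections 3.6.1 and 3.6.2 (Equations 3.23 and 3.24) can be sketched as follows. This is a minimal illustration, not the code used for the experiments: the function name `fit_path_loss` is hypothetical, the x values are assumed to be distances expressed in dB-meters (taken here as $10 \log_{10} d$, matching the axis label of Figures 3.16(a) and 3.17(a)), and the sample data are synthetic and noise-free rather than the measured RSS values.

```python
import math


def fit_path_loss(distances_m, rss_dbm):
    """Least-squares line through (x_k, y_k) per Equations 3.23 and 3.24.

    x_k is the distance in dB-meters (assumed to be 10*log10(d_k)) and
    y_k is the RSS in dBm. Returns (eta, intercept), where eta = -slope.
    """
    xs = [10.0 * math.log10(d) for d in distances_m]
    ys = list(rss_dbm)
    m = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)

    denom = sum_x ** 2 - m * sum_xx            # shared denominator
    if denom == 0:
        raise ValueError("need at least two distinct distances")

    slope = (sum_x * sum_y - m * sum_xy) / denom          # Eq. 3.23
    intercept = (sum_x * sum_xy - sum_xx * sum_y) / denom  # Eq. 3.24
    return -slope, intercept


# Synthetic, noise-free data generated with eta = 2.9 and a -45 dBm intercept
# (illustrative values only, not measurements from the experiments).
distances = [2.0, 3.0, 5.0, 8.0, 12.0]
rss = [-45.0 - 2.9 * 10.0 * math.log10(d) for d in distances]
eta, intercept = fit_path_loss(distances, rss)
```

With real measurements the recovered $\eta$ is only as good as the log-distance model itself, which is the sensitivity the discussion above attributes to LSE.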
3.7 Chapter Summary

In this chapter we presented a simple and novel localization technique based on location sequences called Sequence-Based Localization (SBL). In Sequence-Based Localization, location sequences are used to uniquely identify distinct regions in the localization space. The location of the unknown node is estimated by first determining its location sequence using RSS measurements of RF signals between the unknown node and the reference nodes, and then searching through a pre-determined list of all feasible location sequences in the localization space, called the location sequence table, to find the region represented by the "nearest" one. In this chapter, we derived expressions for the maximum number of location sequences and presented an algorithm to construct the location sequence table. We described distance metrics that measure the distance between location sequences and used them to determine the corruption in location sequences due to RF channel non-idealities. We identified an approximate indicator of the extent of location estimation error using the same distance metrics. Through examples we demonstrated the robustness of sequence-based localization to RF channel non-idealities. Through exhaustive simulations and systematic real mote experiments we evaluated the performance of our localization system and presented a comparison with other state-of-the-art localization techniques for different RF channel and node deployment parameters. Results showed that SBL performs well, and better than other state-of-the-art localization techniques, in both indoor and outdoor environments.

Chapter 4
Fast & Fair Localization

4.1 Introduction

In this chapter, we focus on the problems of fast and fair localization of unknown nodes or mobile devices. Related work in the literature has studied the speed of the movement of the mobile device versus the accuracy of tracking [91] and has proposed techniques to balance the speed of tracking and energy consumption [94]. But to the best of our knowledge ours is the first attempt to study the problem of fast and fair localization. We separate the problems of fast and fair localization from the problem of accurate localization so that the solutions for the former are compatible with any solution for the latter.

Fast localization requires minimizing the response time of reference nodes to localization requests from mobile devices, called the localization delay. This requires avoiding collisions and retransmissions of the reference node responses, which can be achieved by splitting time into slots and scheduling reference nodes to transmit one in each time slot. Fair localization requires minimizing the variation in response time over all locations of the mobile device in the localization area.

The problems of fast and fair localization can be formulated as a single problem of minimizing the maximum localization delay in the localization area. We show that this problem is closely related to the NP-complete minimum length broadcast frame problem ([82]), in which the total number of time slots required to schedule all the reference nodes in the localization area is minimized. We investigate a polynomial time heuristic algorithm for this problem and study its performance in terms of localization delay and fairness. A comparative study is also presented for two different reference node deployment distributions (grid and random), for different reference node density values and different levels of location estimate accuracy.

The rest of the chapter is organized as follows: In Section 4.2 we motivate the problem of fast localization. We state our assumptions and introduce terminology, define the different terms used, formulate the problem of fast/fair localization, present the heuristic time slot scheduling algorithm and define metrics to study its performance in Sections 4.3, 4.4, 4.5, 4.6 and 4.7, respectively. In Section 4.8, analytical and simulation based evaluations of this algorithm are presented.
In Section 4.9, we discuss realistic application constraints for fast/fair localization and summarize the chapter in Section 4.10.

4.2 Motivation

As per the conditions for effective location support systems described in Chapter 1, energy efficient operation of location support services is of utmost importance as the wireless nodes are severely energy constrained. For this reason, most WSN deployments incorporate sleep schedules for nodes in which the nodes are awake only for a fraction of a duty cycle. The nodes have to accomplish their assigned tasks within this small period of time. Since the WSN has to perform different tasks at the same time, the time allocated for localization will be a fraction of the wake up time of the wireless node. Therefore, the response time of the reference nodes to localization requests from unknown nodes is limited by the duty cycle of the WSN. Each reference node can transmit a single localization packet in each duty cycle.

In order to minimize the response time for localization requests, the time allocated for localization during node wake up should not be wasted through collision of packets or back-offs. This can be achieved only through scheduling of nodes, in which nodes are assigned specific duty cycles to transmit their localization packets. If, for example, the nodes in a WSN wake up on a duty cycle of 100 msecs, the unknown node will receive localization packets from the reference nodes with an interval of 100 msecs. If the localization technique requires packets from 10 different reference nodes, the response time for a localization request is 10 × 100 msecs = 1 sec.

Having motivated the problem of fast localization, before we formally define the problem of fast/fair localization, we list our assumptions and terminology.

4.3 Assumptions and Terminology

In this section we state our assumptions and introduce terminology.

1. All reference nodes and mobile devices transmit in the same frequency band and at the same power, implying that the radio range ($R$ meters) is the same for all of them.

2. A set $A$ of $N$ reference nodes is deployed in a two-dimensional, square shaped localization area of side $S$ ($\gg R$) meters. Their locations depend on the deployment distribution (such as grid, random, etc.). The reference node density is $\beta = N / S^2$.

3. The radio range of reference nodes and mobile devices is the same in all directions¹ and the disc shaped area spanned by the radio range is called a cell. The in-square of a cell is the largest square that is contained within the cell and the out-square of a cell is the smallest square that contains the cell, as illustrated in Figure 4.1.

[Figure 4.1: Illustration of a cell (radius $R$), its in-square and out-square.]

4. In order to obtain a finite set of mobile device locations for evaluation purposes, the locations, $(x, y)$, of the mobile device are considered to be grid points separated by one meter. In order to avoid edge effects, grid points in a band of $R$ ($\gg 1$) meters away from the boundaries of the localization area are excluded. Therefore, $(x, y) \in [R, S-R] \times [R, S-R]$.

5. Reference nodes transmit their location coordinates to the mobile devices in localization packets. The mobile devices use the localization packets to obtain the reference node coordinates and to measure their signal strength.

6. The location estimate accuracy for mobile devices increases with the number of reference nodes (Chapter 3).

7. The number of reference nodes in the cell of a mobile device located at $(x, y)$, denoted by $\theta(x, y)$, is determined by the reference node density ($\beta$), the radio range ($R$) and the mobile device's location $(x, y)$. On average, the number of reference nodes in a cell is equal to $\pi R^2 \beta$. The number of reference nodes required by a localization technique to guarantee a desired level of location estimate accuracy is denoted by $k$ ($\le \theta(x, y)$)².

8. The reference node network is globally synchronized and time is split into slots of equal length. The duration of each time slot ($T$) consists of the transmission time of a localization packet and its propagation delay, the transmission time of any acknowledgment packet and its propagation delay, and a guard period to account for time synchronization. For evaluation purposes, $T = 100$ ms.

9. Let $f$ be any time slot allocation function and let $M$ (a function of $f$) be the number of time slots required by $f$ to allocate time slots to all reference nodes in the localization area. The $M$ time slots constitute a time frame and each reference node periodically transmits localization packets based on the position of the time slot allocated to it in the time frame. $f$ is a function that maps the set of reference nodes $A$ to the time slots in a time frame of length $M$, and $F$ denotes the set of all such allocation functions $f$. Note that $f$ is a function of the density and distribution of the reference nodes' deployment.

10. The time slots assigned to reference nodes in the cell of a mobile device located at $(x, y)$ are denoted by $t_{i,(x,y)}$, $i = 1, \ldots, \theta(x, y)$.

11. In order to avoid collisions of localization packets, no two reference nodes within $2R$ distance of each other are allocated the same time slot. Henceforth, this condition is referred to as the 2R-Rule.

¹ This is an idealized radio model. In Section 4.9 we discuss the effects of realistic radio models on fast/fair localization.

² It should be noted that in a real scenario, owing to the shadowing and multi-path effects of the wireless channel and features of the localization area, the same number of reference nodes may not provide the same level of accuracy for all locations of the mobile device. We assume that the value of $k$ is such that it can guarantee the desired level of accuracy even in the worst case.

4.4 Definitions

1. Localization Request Arrival Time is defined as the time instance when the mobile device requests the localization service. At this time, the mobile device starts collecting the localization packets transmitted by the reference nodes in its cell³. Localization requests are assumed to arrive at the starting edges of time slots. Thus, the arrival time, $t$, of a localization request at a mobile device located at $(x, y)$ is a uniform integer random variable in the interval $[1, M]$.

2. Localization Delay is defined as the time taken by a certain number of reference nodes to transmit their localization packets, one each, to the mobile device. For a given time slot allocation function $f$, it depends on the location of the mobile device, the localization request arrival time and the number of reference nodes required for accurate localization. Therefore, it is denoted by $D(x, y, t, k)$. It is measured in time slots.

3. Localizable speed of the mobile device is defined as the speed at which localization of the desired accuracy is possible. It is determined by the localization delay and the time slot duration. The localizable speed $V(x, y, t, k)$ is calculated in meters/second as follows:

$$V(x, y, t, k) = \frac{1}{D(x, y, t, k)\, T} \tag{4.1}$$

Since the localization delay changes with the location of the mobile device, the localizable speed of the mobile device also changes with its location in the localization area.

4. Localization Fairness is defined as the variation in the localizable speed over all possible locations of the mobile device in the localization area. It is measured as the percentage of locations at which the localizable speed is greater than 95% of its average.

³ Here we assume that reference nodes transmit their localization packets indefinitely after initialization. In Section 4.9 we discuss other possible cases.

The above terminology is illustrated through an example in Figure 4.2. It shows $N = 36$ reference nodes deployed in a grid in a square localization area of side $S = 5R$ meters, and their time slots allocated using some time slot allocation function.
Notice that the allocation function follows the 2R-Rule and the length of the time frame, $M$, is equal to 6 for this allocation function. For a mobile device located at A, the number of reference nodes in its cell is $\theta(x, y) = 5$ and their respective time slots, $t_{i,(x,y)}$, are $\{1, 2, 3, 4, 5\}$. Similarly, the number of reference nodes in the cell of a mobile device located at B is 4 and their respective time slots are $\{1, 3, 4, 6\}$. Let the number of reference nodes required by a particular localization technique for the desired level of accuracy be $k = 3$ reference nodes. If a localization request arrives at time $t = 1$ for the mobile device at A, then the localization delay is 3 time slots, whereas if $t = 5$, the localization delay is 4 time slots. If the time slot duration is $T = 100$ ms, then the localizable speeds are $\frac{1}{3 \times 0.1} = 3.33$ m/s and $\frac{1}{4 \times 0.1} = 2.5$ m/s, respectively, for the above two localization request arrival times.

[Figure 4.2: Example illustrating terminology. A 6 × 6 grid of reference nodes labeled with their time slots (1 to 6), with the cells of radius $R$ of mobile devices at locations A and B.]

4.5 Problem Formulation

The aim of fast/fair localization is to minimize localization delay and provide fairness at the same time. This implies that the time slot scheduling algorithm should simultaneously minimize the localization delay and its standard deviation over all possible locations of the mobile device. This can be achieved by minimizing the maximum localization delay over all locations of the mobile device [21].

In order to normalize the effect of localization request arrival time on localization delay and fairness, the expected value of the localization delay with respect to $t$, denoted by $E_t(D(x, y, t, k))$, is considered:

$$E_t(D(x, y, t, k)) = \frac{1}{M} \sum_{t=1}^{M} D(x, y, t, k) \tag{4.2}$$

Now, the solution to the fast/fair localization problem is an allocation function $f^* \in F$ that minimizes the maximum expected localization delay over all locations $(x, y)$ of the mobile device, that is:

$$f^* = \arg\min_{f \in F} \left\{ \max_{(x,y)} \left\{ E_t(D(x, y, t, k)) \right\} \right\} \tag{4.3}$$

Consider the following two propositions.

Proposition 1.

$$\max_{(x,y)} \left\{ E_t(D(x, y, t, k)) \right\} \le \max_{(x,y,t)} \left\{ D(x, y, t, k) \right\} \tag{4.4}$$

$\forall$ allocation functions $f \in F$.

Proof. Proof by contradiction. Assume that

$$\max_{(x,y)} \left\{ E_t(D(x, y, t, k)) \right\} > \max_{(x,y,t)} \left\{ D(x, y, t, k) \right\} \tag{4.5}$$

for some allocation function $f \in F$. Let $(x_0, y_0) \in [R, S-R] \times [R, S-R]$ be the location at which $E_t(D(x, y, t, k))$ is the maximum. Clearly,

$$\max_{(x,y,t)} \left\{ D(x, y, t, k) \right\} \ge \max_{t \in [1,M]} \left\{ D(x_0, y_0, t, k) \right\} \tag{4.6}$$

From (4.5) and (4.6),

$$E_t(D(x_0, y_0, t, k)) > \max_{t \in [1,M]} \left\{ D(x_0, y_0, t, k) \right\} \tag{4.7}$$

This is a contradiction because, for all locations $(x, y)$,

$$D(x, y, t, k) \le \max_{t \in [1,M]} \left\{ D(x, y, t, k) \right\} \tag{4.8}$$

and, since by Equation (4.2) the expectation is an average over $t$ and an average cannot exceed the largest value being averaged, for all locations $(x, y)$

$$E_t(D(x, y, t, k)) \le \max_{t \in [1,M]} \left\{ D(x, y, t, k) \right\} \tag{4.9}$$

Proposition 2.

$$\max_{(x,y,t)} \left\{ D(x, y, t, k) \right\} \le M \tag{4.10}$$

$\forall$ allocation functions $f \in F$. Recall that $M$ is a function of $f$.

Proof. From the definition of localization delay, its maximum value is at most equal to the length of the time frame $M$ for all locations of the mobile device and for all localization request arrival times.

The above two propositions lead to the following corollary:

Corollary 2.

$$\min_{f \in F} \left\{ \max_{(x,y)} \left\{ E_t(D(x, y, t, k)) \right\} \right\} \le \min_{f \in F} \{ M \} \tag{4.11}$$

Proof. Since inequalities (4.4) and (4.10) hold true for all allocation functions $f \in F$, they are also true for the allocation function $f^*$ that minimizes the maximum of the expected localization delay over all locations $(x, y)$ of the mobile device. Inequality (4.11) follows from the transitive property of inequalities.

In addition, we make the following conjecture, which is open to be proved.

$$D(x, y, t, k) = \begin{cases} \tau(k, \min\{t_{i,(x,y)}\}) - t + 1, & t \le \min\{t_{i,(x,y)}\} \\ \tau(k, t) - t + 1, & \min\{t_{i,(x,y)}\} < t \le \max\{t_{i,(x,y)}\} \text{ and } k \le \gamma \\ M + \tau(k - \gamma, \min\{t_{i,(x,y)}\}) - t + 1, & \min\{t_{i,(x,y)}\} < t \le \max\{t_{i,(x,y)}\} \text{ and } k > \gamma \\ M + \tau(k, \min\{t_{i,(x,y)}\}) - t + 1, & t > \max\{t_{i,(x,y)}\} \end{cases} \tag{4.13}$$

Figure 4.3: Expression for localization delay.

Conjecture 1. If $M^*$ is the frame length due to $f^*$, the allocation function that minimizes the maximum expected localization delay, then $M^*$ is only a small ($\approx 1$) factor away from the optimal $M$.
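The step from Propositions 1 and 2 to Corollary 2 can be spelled out as a short chain of inequalities. The notation $M(f)$ for the frame length produced by an allocation function $f$, and $\hat{f}$ for an allocation function with the shortest frame, is introduced only for this sketch and is not part of the original text.

```latex
\begin{align*}
\max_{(x,y)} E_t\big(D(x,y,t,k)\big)
  &\le \max_{(x,y,t)} D(x,y,t,k) \le M(f)
  && \text{for every } f \in F \text{ (Propositions 1 and 2)} \\
\min_{f \in F} \max_{(x,y)} E_t\big(D(x,y,t,k)\big)
  &\le \left.\max_{(x,y)} E_t\big(D(x,y,t,k)\big)\right|_{f=\hat{f}}
   \le M(\hat{f}) = \min_{f \in F} M(f)
  && \text{with } \hat{f} = \arg\min_{f \in F} M(f)
\end{align*}
```

The first line holds for each allocation function separately; the second line applies it to a frame-length-minimizing allocation and uses the fact that a minimum over $F$ is no larger than the value at any particular member of $F$.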
Based on Corollary 2, the approach we take in this chapter for fast/fair localization is to seek an allocation function that minimizes the upper bound on the maximum expected localization delay and study its performance in terms of localization delay and fairness. Now, the solution to the fast/fair localization problem is the allocation function $f^{**}$ that minimizes the length of the time frame, that is:

$$f^{**} = \arg\min_{f \in F} (M) \tag{4.12}$$

The above formulation of the fast/fair localization problem is a flavor of the graph coloring problem called the minimum length broadcast frame problem. Ramaswami et al. in [82] have shown that this problem is NP-complete; therefore there is no known polynomial time allocation function $f^{**}$ that can schedule all reference nodes in the localization area to transmit in an optimal number of time slots. Nevertheless, many polynomial time heuristic algorithms have been proposed as solutions to this problem. Next, we present and analyze such a heuristic algorithm.

4.6 Scheduling Algorithm

Below, we present pseudo-code for a greedy heuristic algorithm that minimizes the length of the time frame using the location information already programmed into the reference nodes.

Algorithm 3. A Greedy Heuristic Time Slot Scheduling Algorithm

Input: Location coordinates $\{(p_{x_i}, p_{y_i})\}$ of reference nodes $\{q_i\}$, $i = 1, \ldots, N$ in the localization area, reference ($X = (0, 0)$) and radio range ($R$).
Output: Network time slot schedule.

 0  {d_i} = DIST({q_i}, X)
 1  {Q_i} = MINSORT({q_i}, {d_i})
 2  T ← 1
 3  for i : 1 → N
 4      if Q_i is not assigned a slot
 5          Assign slot T to Q_i
 6          Add Q_i to set S_T
 7          for j : (i+1) → N
 8              if (Q_j is not assigned a slot) and (DIST(S_T, Q_j) > 2R)
 9                  Assign slot T to Q_j
10                  Add Q_j to set S_T
11              end if
12          end for
13          T ← T + 1
14      end if
15  end for

• DIST(A, B) determines the Euclidean distance between elements of array A and point B and returns the array of distances.
• MINSORT(A, B) minimum sorts the array A based on the values of array B and returns the sorted array.
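The pseudo-code of Algorithm 3 can be sketched in Python as follows. This is an illustrative transcription under stated assumptions, not the implementation used for the simulations: the names `greedy_schedule` and `satisfies_2r_rule` are hypothetical, the condition `DIST(S_T, Q_j) > 2R` is read as "every node already in slot T is more than 2R away from Q_j", and ties in the distance ordering are broken by input order.

```python
import math


def greedy_schedule(nodes, radio_range, ref=(0.0, 0.0)):
    """Greedy time slot assignment following Algorithm 3.

    nodes: list of (x, y) reference node coordinates.
    Returns a list of 1-based slot numbers, one per node, in input order.
    """
    # Lines 0-1: order nodes by Euclidean distance from the reference point.
    order = sorted(range(len(nodes)), key=lambda i: math.dist(nodes[i], ref))
    slot_of = {}
    slot = 1                                   # line 2
    for pos, i in enumerate(order):            # line 3
        if i in slot_of:                       # line 4: already scheduled
            continue
        slot_of[i] = slot                      # line 5
        members = [nodes[i]]                   # line 6: the set S_T
        for j in order[pos + 1:]:              # line 7
            far_enough = all(
                math.dist(nodes[j], q) > 2 * radio_range for q in members
            )
            if j not in slot_of and far_enough:  # line 8
                slot_of[j] = slot              # line 9
                members.append(nodes[j])       # line 10
        slot += 1                              # line 13
    return [slot_of[i] for i in range(len(nodes))]


def satisfies_2r_rule(nodes, slots, radio_range):
    """True if no two nodes within 2R of each other share a time slot."""
    for a in range(len(nodes)):
        for b in range(a + 1, len(nodes)):
            close = math.dist(nodes[a], nodes[b]) <= 2 * radio_range
            if close and slots[a] == slots[b]:
                return False
    return True


# Example: a 6 x 6 grid with inter-node spacing R, i.e. N = 36 and S = 5R.
R = 1.0
grid = [(float(x), float(y)) for x in range(6) for y in range(6)]
slots = greedy_schedule(grid, R)
frame_length = max(slots)  # M for this schedule
```

The inner `all(...)` check makes the sketch cubic in the worst case; it is written for readability rather than to match the complexity figures quoted for the algorithm.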
In a centralized implementation, the above algorithm can be executed on a central server that knows the locations of all reference nodes in the network; the time slots can be assigned to the reference nodes later. In a distributed implementation, every reference node executes this algorithm and all of them agree on the same time slot schedule. In this case, every reference node is assumed to know the locations of all other reference nodes in the network.

Complexity Analysis: Every reference node determines the order of all reference nodes (line 1 of the pseudo-code) in the network based on their distances from a reference point (line 0). This takes $O(N \log N)$ time and $O(N)$ space. The ordering of reference nodes with respect to a reference point in line 1 ensures that reference nodes that are assigned the same time slot are as close to each other as possible without violating the 2R-Rule. This ensures scheduling of reference nodes in a minimum number of time slots. A time slot is assigned to each reference node in the network in lines 3-15. This takes $O(N^2)$ time and $O(N^2)$ space. In total, the algorithm takes $O(N^2)$ time and $O(N^2)$ space to assign time slots to all the $N$ reference nodes in the network.

Next, we define performance metrics for fast/fair localization.

4.7 Metrics

For a given time slot allocation function, the localization delay $D(x, y, t, k)$ is determined by the location of the mobile device $(x, y)$, the number of reference nodes in its cell $\theta(x, y)$, their time slots $t_{i,(x,y)}$, the length of the time frame $M$, the number of reference nodes required for localization $k$, and the localization request arrival time $t$. This leads to the following proposition.

Proposition 3. Localization delay is given by Equation 4.13 (see Figure 4.3). In this equation, $\tau(p, q)$ is the time slot of the $p$-th reference node starting from time $q$ and $\gamma$ is the number of reference nodes whose transmission time slots are later than the localization request arrival time $t$.

Proof. All reference nodes in the cell of the mobile device located at $(x, y)$ are sorted from the earliest ($\min\{t_{i,(x,y)}\}$) to the latest ($\max\{t_{i,(x,y)}\}$) based on their time slots. Figure 4.4 shows the time slots of the $\theta(x, y)$ reference nodes in the cell as a subset of the $M$ time slots that constitute the time frame. Note that $\min\{t_{i,(x,y)}\} \ge 1$ and $\max\{t_{i,(x,y)}\} \le M$.

[Figure 4.4: Three different cases (1), (2) and (3) depending on the relative position of the localization request arrival time $t$ with respect to the time slots of reference nodes in the cell, shown on the frame $1, \ldots, \min\{t_{i,(x,y)}\}, \ldots, \max\{t_{i,(x,y)}\}, \ldots, M$.]

Consider the following three exhaustive cases, based on the position of the localization request arrival time $t$ relative to the set of time slots of reference nodes in the cell of a mobile device located at $(x, y)$.

1. The localization request arrival time of the mobile device is equal to or lower than the minimum of the time slots of the reference nodes in its cell ($t \le \min\{t_{i,(x,y)}\}$): The mobile device has to wait till it receives localization packets from all the $k$ reference nodes. The time slot of the $k$-th reference node later than $t$ is the same as the time slot of the $k$-th reference node later than $\min\{t_{i,(x,y)}\}$, which is $\tau(k, \min\{t_{i,(x,y)}\})$. Thus, the localization delay is given by:

$$D(x, y, t, k) = \tau(k, \min\{t_{i,(x,y)}\}) - t + 1 \tag{4.14}$$

2. The localization request arrival time of the mobile device is in between the minimum and maximum of the time slots of the reference nodes in its cell ($\min\{t_{i,(x,y)}\} < t \le \max\{t_{i,(x,y)}\}$): Let $\gamma$ be the number of reference nodes whose time slots are later than the localization request arrival time $t$. If $k \le \gamma$, then the mobile device has to wait till the time slot of the $k$-th reference node starting from time $t$, which is given by $\tau(k, t)$. Therefore, the localization delay is:

$$D(x, y, t, k) = \tau(k, t) - t + 1 \tag{4.15}$$

If, on the other hand, $k > \gamma$, i.e., if the number of reference nodes with time slots later than the localization request arrival time $t$ is not sufficient, then the mobile device, after receiving localization packets from the $\gamma$ reference nodes, has to wait till the end of the frame, i.e., time slot $M$, and then has to wait till it receives localization packets from the remaining $k - \gamma$ reference nodes starting from the first time slot. The time slot of the $(k - \gamma)$-th reference node starting from the first time slot is the same as that of the $(k - \gamma)$-th reference node starting from time slot $\min\{t_{i,(x,y)}\}$, which is $\tau(k - \gamma, \min\{t_{i,(x,y)}\})$. Therefore, the localization delay is given by:

$$D(x, y, t, k) = M - t + 1 + \tau(k - \gamma, \min\{t_{i,(x,y)}\}) \tag{4.16}$$

3. The localization request arrival time of the mobile device is later than the maximum of the time slots of the reference nodes in its cell ($t > \max\{t_{i,(x,y)}\}$): In this case, the mobile device has to wait till the end of the frame, i.e., time slot $M$, and again, starting from the first time slot, till the time slot of the $k$-th reference node, which is $\tau(k, \min\{t_{i,(x,y)}\})$. The localization delay in this case is:

$$D(x, y, t, k) = M - t + 1 + \tau(k, \min\{t_{i,(x,y)}\}) \tag{4.17}$$

We consider the following five performance metrics:

1. Average localization delay ($D_{avg}(k)$): It is the average of the expected localization delay, for each location $(x, y)$ of the mobile device, over all possible locations of the mobile device. It is a function of the desired level of accuracy manifested as $k$ and is measured in time slots.

$$D_{avg}(k) = \frac{1}{(S - 2R)^2} \sum_{x=R}^{S-R} \sum_{y=R}^{S-R} E_t(D(x, y, t, k)) \tag{4.18}$$

2. Average localizable speed ($V_{avg}(k)$): It is determined by the average localization delay and the length of a time slot and is a function of $k$.

$$V_{avg}(k) = \frac{1}{(S - 2R)^2} \sum_{x=R}^{S-R} \sum_{y=R}^{S-R} E_t(V(x, y, t, k)) \tag{4.19}$$

where the expected localizable speed at location $(x, y)$ of the mobile device, $E_t(V(x, y, t, k))$, is given by:

$$E_t(V(x, y, t, k)) = \frac{1}{M} \sum_{t=1}^{M} V(x, y, t, k) \tag{4.20}$$

The units of $V_{avg}(k)$ are meters/second. This metric measures the speed of movement of the mobile device at which it can obtain the desired level of location estimate accuracy, on average, in the localization area. If the mobile device moves at a speed equal to $V_{avg}(k)$, there is no guarantee that it will obtain the desired level of accuracy at all locations in the localization area.

3. Localization Fairness ($F(k)$): It measures the percentage of locations of the mobile device in the localization area that can guarantee the desired location estimate accuracy at a speed of $0.95\, V_{avg}(k)$. The higher the percentage of these locations, the higher the localization fairness.

4. Minimum localizable speed ($V_{min}(k)$):

$$V_{min}(k) = \min_{(x,y)} \left\{ E_t(V(x, y, t, k)) \right\} \text{ m/s} \tag{4.21}$$

This metric measures the localizable speed of the mobile device at which the reference node network can provide a localization-area-wide guarantee for the desired level of accuracy. In other words, if the mobile device moves at a speed that is equal to or lower than $V_{min}(k)$, it is guaranteed to obtain the desired level of location estimate accuracy for all locations in the localization area.

5. Maximum localizable speed ($V_{max}(k)$):

$$V_{max}(k) = \max_{(x,y)} \left\{ E_t(V(x, y, t, k)) \right\} \text{ m/s} \tag{4.22}$$

This metric measures the maximum possible localizable speed of the mobile device for all locations of the mobile device in the localization area.

| Parameter | Value |
| --- | --- |
| Radio range, $R$ | 40 meters |
| Localization area side, $S$ | 200 meters |
| Reference node network size, $N$ | {121, 169, 256, 324, 441} |
| Corresponding reference node densities, $\beta$ = one reference node in $S^2/N$ sq. meters | one reference node in {330.6, 236.7, 156.3, 123.5, 90.7} sq. meters |
| Number of reference nodes required for localization, $k$ | {3, 6, 8, 10} |

Table 4.1: Simulation parameters and their values.
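The localization delay of Proposition 3 and the per-location quantities behind the metrics above can be checked with a small sketch. Instead of evaluating the four cases of Equation 4.13, it simply walks the time frame cyclically from the arrival slot and counts slots until $k$ reference nodes in the cell have transmitted, which gives the same delays for the Figure 4.2 example. The function names are hypothetical, and a reference node transmitting in the arrival slot itself is assumed to be heard, consistent with that example.

```python
def localization_delay(cell_slots, frame_length, t, k):
    """Delay in time slots for a request arriving at the start of slot t.

    cell_slots: time slots of the reference nodes in the mobile device's cell.
    frame_length: M, the number of slots in the time frame.
    k: number of reference nodes whose packets are required.
    """
    slots = set(cell_slots)
    if not 1 <= k <= len(slots):
        raise ValueError("k must be between 1 and the number of cell nodes")
    if not 1 <= t <= frame_length:
        raise ValueError("t must lie in [1, M]")
    heard = 0
    elapsed = 0
    current = t
    while heard < k:
        elapsed += 1
        if current in slots:       # a cell node transmits in this slot
            heard += 1
        current = current % frame_length + 1   # wrap around after slot M
    return elapsed


def expected_delay(cell_slots, frame_length, k):
    """Equation 4.2: average delay over arrival times t = 1, ..., M."""
    total = sum(
        localization_delay(cell_slots, frame_length, t, k)
        for t in range(1, frame_length + 1)
    )
    return total / frame_length


def localizable_speed(delay_slots, slot_duration_s):
    """Equation 4.1: speed in meters/second for a given delay."""
    return 1.0 / (delay_slots * slot_duration_s)


# Mobile device at location A in Figure 4.2: slots {1,...,5}, M = 6, k = 3.
cell_a = [1, 2, 3, 4, 5]
delay_t1 = localization_delay(cell_a, 6, t=1, k=3)
delay_t5 = localization_delay(cell_a, 6, t=5, k=3)
mean_delay_a = expected_delay(cell_a, 6, k=3)
```

Averaging `expected_delay` (or the corresponding speeds) over all grid locations of the mobile device would give the area-wide metrics; that step is omitted here because it depends on the node deployment and schedule.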
4.8 Evaluation In this section, ¯rst, we analyze the geometries of grid and random deployments of reference nodes and for each of them derive the upper and lower bounds on the time frame length M required by any scheduling algorithm (i.e., any allocation function f). Next, we study the performance of the heuristic algorithm described in Section 4.6 in terms of the metrics de¯ned in the previous section using simulations. 4.8.1 Analysis The de¯nition of a cell ensures that all reference nodes in the cell of a mobile device are at most 2R distance away from each other. According to the 2R¡Rule described in Section 4.3, the number of time slots required to schedule reference nodes in the network should be at least equal to the number of reference nodes in a cell. But this number of time slots is not su±cient. With this understanding, we ¯rst analyze the grid deployment of reference nodes and follow it up with analysis for random deployment. 82 4.8.1.1 Grid Deployment Proposition 4. For a 2D grid, the number of reference nodes in the in-square of a cell is n = 2m 2 +6m+5, m = ( R d ¡1), where d is the inter-node distance and R is the radio range. The number of reference nodes in the out-square of a cell is (2n¡1), the number of reference nodes on the perimeter of the in-square is (2 p 2n¡1¡2) and the number of reference nodes on the perimeter of out-square is (4 p 2n¡1¡4). We do not provide a formal proof for the above proposition as it can be veri¯ed using simple geometric arguments. Notice that, at low reference node densities, the number of reference nodes in a cell is equal to the number of reference nodes in its in-square, where as, for high reference node densities, the number of reference nodes in the cell is greater than the number in its in-square. Proposition 5. For a 2D grid, (d¼R 2 ¯e + p 2n¡1¡ 3) < M < (2n¡ 1), where ¯ is the reference node density. Proof. 
If all reference nodes in the out-square of a cell are assigned different time slots, the schedule satisfies the 2R-Rule. Clearly, this number of slots is sufficient. Therefore, according to Proposition 4, at most $2n - 1$ time slots are required to schedule all reference nodes in the localization area. However, this number of slots is not necessary, because there exist pairs of reference nodes within this square that are more than $2R$ apart, and these pairs can be assigned the same time slot. In fact, all the slots assigned to reference nodes on the perimeter of the in-square of the cell can be reused by reference nodes on the perimeter of the out-square. The remaining reference nodes on the perimeter of the out-square do need extra slots. But the geometry of the cell ensures that these remaining reference nodes are more than $2R$ away from each other in pairs, and thus only half of the extra slots are required.

The number of reference nodes on the perimeter of the cell out-square that do not pair up with reference nodes on the perimeter of the cell in-square is $(4\sqrt{2n-1} - 4) - (2\sqrt{2n-1} - 2) - 4 = 2\sqrt{2n-1} - 6$, where the extra 4 is subtracted because that many reference nodes are common to the cell in-square and the cell out-square. As stated previously, the number of extra time slots required is only half the number of reference nodes that do not pair up, which is $\frac{2\sqrt{2n-1} - 6}{2} = \sqrt{2n-1} - 3$ time slots. In total, since the average number of reference nodes in a cell is $\pi R^2 \beta$, we need at least $\lceil \pi R^2 \beta \rceil + \sqrt{2n-1} - 3$ time slots.

Reference Node      Cell In-Square      Analytical M    Simulation    Analytical M
Network Size, N     Size, n             Lower Bound     M             Upper Bound
121                 13   (m = 1)        18              19            25
169                 18.5 (m = 1.5)      25              29            36
256                 25   (m = 2)        37              41            49
324                 32.5 (m = 2.5)      47              54            64
441                 41   (m = 3)        62              70            81

Table 4.2: Comparison of analytical lower and upper bounds of $M$ with simulation results for grid deployment, for different values of the network size $N$. Note that the number of reference nodes in the in-square of a cell is $n = 2m^2 + 6m + 5$, $m = \frac{R}{d} - 1$, where $R$ is the radio range and $d$ is the inter-reference-node distance (Proposition 4 in Section 4.8.1.1).

4.8.1.2 Random Deployment

Proposition 6. For uniform random deployment of reference nodes with density $\beta$, $\lceil \pi R^2 \beta \rceil < M < \lceil 16 R^2 \beta \rceil$.

Proof. As stated previously, the minimum number of time slots required is at least the number of reference nodes in a cell, which is $\lceil \pi R^2 \beta \rceil$. If all reference nodes in the out-square of a cell of radius $2R$ are assigned different time slots and the schedule is repeated throughout the network, it is sufficient because, with probability one, no two reference nodes in the network within $2R$ distance of each other are assigned the same time slot. The number of time slots required to achieve this is $\lceil 16 R^2 \beta \rceil$.

Reference Node      Analytical M    Simulation    Analytical M
Network Size, N     Lower Bound     Average M     Upper Bound
121                 16              29.3          75
169                 22              38.3          109
256                 33              54.2          164
324                 42              66.0          208
441                 56              89.6          283

Table 4.3: Comparison of analytical lower and upper bounds of $M$ with simulation results for random deployment, for different values of the network size $N$. The simulation results are averages over 10 different random reference node network topologies.

4.8.2 Simulations

The performance of the heuristic scheduling algorithm is measured using simulations in terms of the metrics described in Section 4.7, for grid and uniform random deployments of reference nodes and for five different reference node density values. Table 4.1 lists the simulation parameters and their values.

The reference node locations are generated according to the deployment distribution, and the scheduling algorithm uses these locations to assign time slots to the reference nodes as described in Section 4.6. Table 4.2 and Table 4.3 compare the length of the time frame $M$ with its analytical lower and upper bounds for five different reference node network sizes, for grid and uniform random deployments, respectively.
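As a quick numerical check, the bounds of Propositions 5 and 6 can be evaluated directly. The following Python sketch is illustrative only: the function names, and the values R = 40 m and a density of 256 nodes over 40000 sq. m used in the usage note, are assumptions made here and are not taken from Table 4.1.

```python
import math


def grid_bounds(m, mean_cell_nodes):
    """Strict lower and upper bounds on M for a 2D grid (Proposition 5).

    m               -- R/d - 1, as defined in Proposition 4
    mean_cell_nodes -- average number of reference nodes in a cell, pi * R^2 * beta
    """
    n = 2 * m ** 2 + 6 * m + 5      # reference nodes in the in-square
    side = 2 * m + 3                # equals sqrt(2n - 1), since 2n - 1 = (2m + 3)^2
    lower = math.ceil(mean_cell_nodes) + side - 3
    upper = 2 * n - 1               # reference nodes in the out-square
    return lower, upper


def random_bounds(radio_range, density):
    """Strict lower and upper bounds on M for uniform random deployment (Proposition 6)."""
    lower = math.ceil(math.pi * radio_range ** 2 * density)
    upper = math.ceil(16 * radio_range ** 2 * density)
    return lower, upper
```

With the assumed values, `random_bounds(40, 256 / 40000)` gives (33, 164) and `grid_bounds(2, math.pi * 40 ** 2 * 256 / 40000)` gives (37, 49), which agree with the N = 256 rows of Table 4.3 and Table 4.2.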
For random deployment of reference nodes, the results are averaged over 10 different random reference node network topologies. Clearly, the simulation results are within the analytical bounds. Owing to the regular geometry, the analytical bounds on $M$ are tighter for grid deployment than for random deployment.

Figure 4.5 plots the simulation results in terms of the five metrics described in Section 4.7. The main simulation results can be summarized as follows:

1. The average localization delay $D_{avg}(k)$ increases with the number of reference nodes required for localization ($k$), implying that the mobile device has to wait longer to obtain higher localization accuracy. Also, $D_{avg}(k)$ is lower for grid deployment than for random deployment of reference nodes, and its variation with respect to reference node density is minimal. (Figures 4.5(a) and 4.5(g))

2. The average localizable speed $V_{avg}(k)$ of the mobile device decreases with the number of reference nodes required for localization ($k$), i.e., on average, the mobile device has to move slower to obtain higher location estimate accuracy, confirming the observation from average localization delay. Also, the average localizable speed of the mobile device is higher for grid deployment than for random deployment of reference nodes. The effect of reference node density on the average localizable speed is minimal. (Figures 4.5(c) and 4.5(f))

3. Localization fairness $F(k)$ is constant with respect to the number of reference nodes required for localization ($k$), i.e., localization fairness is independent of the desired level of location estimate accuracy. Also, localization fairness increases with reference node density, implying that the number of locations in the localization area that can guarantee the desired level of location estimate accuracy for 95% of the average localizable speed increases with reference node density. These locations are also higher in number for grid deployment of reference nodes than for random deployment. (Figures 4.5(b) and 4.5(h))

4. The minimum localizable speed $V_{min}(k)$ decreases with the number of reference nodes required for localization ($k$) and increases with reference node density. This implies that the speed of the mobile device should be lower for a localization-area-wide guarantee of higher location estimate accuracy, and that this speed increases with increasing reference node density. Also, this speed is higher for grid deployment of reference nodes than for random deployment. (Figures 4.5(d) and 4.5(f))

5. The absolute maximum localizable speed of the mobile device over all its locations in the localization area decreases with the desired level of location estimate accuracy. It is higher for random deployment of reference nodes than for grid deployment (even though the difference between the values for the two distributions is very small), and its dependence on reference node density is minimal. (Figures 4.5(e) and 4.5(f))

4.9 Discussion

In this section, we present realistic application constraints faced by wireless sensor networks and discuss their effect on fast/fair localization.

So far, we have assumed an idealized radio model in which the radio range for reference nodes and mobile devices is the same in all directions. In this idealized radio model, reference nodes that are farther than $R$ meters from the mobile device cannot communicate with it. In reality, however, localization packets sent by such reference nodes have a finite probability of reaching the mobile device [68], and this could lead to collisions of localization packets at the mobile device. In order to avoid such collisions, the heuristic algorithm could be changed so that the same time slot is allocated only to reference nodes that are separated by more than some distance larger than the present $2R$.

We have assumed that the reference nodes and the mobile device operate in a single frequency band.
Instead, if multiple frequencies can be used, the time slot scheduling algorithm can incorporate frequency diversity in addition to time diversity to reduce the response time of reference nodes to localization requests and thus reduce the localization delay further. For this, the mobile device should be able to switch between different frequency bands quickly.

In this chapter, we have separated fast/fair localization from the techniques used for accurate localization. In contrast, if the time slot scheduling algorithm takes into account the number of reference nodes required for a desired level of accuracy by a specific localization technique, there could be a potential reduction in the localization delay, with some collision tolerance at the mobile device.

[Figure 4.5, panels (a) to (h): plots not recoverable from the text extraction. Each of panels (a) to (e) shows one curve per deployment (grid, random) and per reference node density label (330.6, 236.7, 156.3, 123.5 and 90.7 sq. m), for k = 3, 6, 8, 10.]

Figure 4.5: Simulation results: (a) average localization delay $D_{avg}(k)$, (b) localization fairness $F(k)$, (c) average localizable speed $V_{avg}(k)$, (d) minimum localizable speed $V_{min}(k)$, and (e) maximum localizable speed $V_{max}(k)$, as a function of the number of reference nodes required for localization $k$, for five reference node density values and for grid and random deployment of reference nodes. (f) $V_{avg}(k)$, $V_{max}(k)$, and $V_{min}(k)$, and (g) average localization delay $D_{avg}(k)$, as a function of reference node density $\beta$ for number of reference nodes required for localization $k = 8$, for grid and random deployments of reference nodes. (h) Localization fairness $F(k)$ as a function of reference node density $\beta$ for four different levels of location estimate accuracy ($k$), for grid and random deployments of reference nodes.

We would like to highlight the fact that the structure of the fast/fair localization problem is not limited to localization. This problem occurs in other applications of wireless sensor networks, such as node discovery and data querying. The common underlying structure of fast/fair localization, node discovery and data querying is the minimization of response time, as measured by the unknown node for localization, by the node discoverer for node discovery and by the data querier for data querying.
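The reuse-distance adjustment suggested earlier in this discussion can be illustrated with a small sketch. The Python code below is not the heuristic of Section 4.6; it is a generic greedy assignment, written here under the assumption that two reference nodes may share a time slot only if they are farther apart than a configurable reuse distance (2R under the idealized radio model, larger to leave a guard margin). The function names and the grid parameters are hypothetical.

```python
import math


def greedy_schedule(nodes, reuse_distance):
    """Give each node the smallest slot not used by an earlier node within reuse_distance."""
    slots = []
    for i, (xi, yi) in enumerate(nodes):
        taken = set()
        for j in range(i):
            xj, yj = nodes[j]
            if math.hypot(xi - xj, yi - yj) <= reuse_distance:
                taken.add(slots[j])
        slot = 0
        while slot in taken:
            slot += 1
        slots.append(slot)
    return slots


def is_valid(nodes, slots, reuse_distance):
    """Check that no two nodes within reuse_distance of each other share a slot."""
    for i in range(len(nodes)):
        for j in range(i):
            close = math.hypot(nodes[i][0] - nodes[j][0],
                               nodes[i][1] - nodes[j][1]) <= reuse_distance
            if close and slots[i] == slots[j]:
                return False
    return True


# Hypothetical 11 x 11 grid with 20 m spacing and radio range R = 40 m.
R = 40.0
grid = [(20.0 * col, 20.0 * row) for row in range(11) for col in range(11)]
ideal = greedy_schedule(grid, 2 * R)        # reuse distance of the 2R-Rule
guarded = greedy_schedule(grid, 2.5 * R)    # larger reuse distance as a guard margin
print(max(ideal) + 1, max(guarded) + 1)     # frame lengths M for the two settings
```

A schedule built for the larger reuse distance also satisfies the 2R-Rule, since every pair of nodes within 2R is also within the larger distance; the price of the guard margin is a potentially longer time frame.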
4.10 Chapter Summary

In this chapter, we introduced the problem of fast/fair localization of a mobile device in infrastructure wireless sensor networks and showed that it is related to the minimum broadcast frame length problem. We investigated a greedy heuristic time scheduling algorithm for this problem using a defined set of five metrics: average localization delay, average localizable speed, localization fairness, minimum localizable speed and maximum localizable speed. We derived lower and upper bounds on the number of time slots required by any scheduling algorithm to schedule all anchors in the localization area, for grid and random anchor deployment distributions, using simple geometric arguments. Next, using simulations, we studied the dynamics of the above five metrics with respect to anchor deployment distributions, anchor densities and location estimate accuracies.

Results show that the average localizable speed of the mobile device decreases with increasing level of location estimate accuracy and that its dependence on anchor density is minimal. The localization fairness, i.e., the percentage of locations in the localization area that can guarantee a desired level of location estimate accuracy at a mobile device speed of 95% of the average localizable speed, increases with anchor density and is independent of the accuracy level desired. The average localizable speed of the mobile device and the localization fairness are better for grid deployment of anchors than for random deployment. Also, the localizable speed of the mobile device at which a localization-area-wide guarantee of a desired level of accuracy can be provided increases with anchor density, and it is higher for grid deployment of anchors.

Part II
Medium Access for One-Hop Data Collection

Chapter 5
Background on Medium Access Techniques for One-Hop Data Collection

5.1 Introduction

Medium access for wireless packet radio networks is one of the fundamental areas of networking and has been an active area of research for well over a quarter of a century.
However, new technologies introduce new wireless medium access control (MAC) conditions and challenges. The emergence of wireless sensor networks (WSN) over the last half decade has introduced many new challenges and opportunities in MAC techniques, owing to the diverse range of applications for which WSN are anticipated to be deployed.

In this thesis we consider the problem of medium access control for one-hop data collection in WSN. In this problem a data sink seeks information from data sources (sensor nodes) in its radio range by issuing requests or queries. The sources respond to the sink according to the type of query. This problem occurs frequently in many applications of WSN, such as location support, neighbor discovery, data query and response, continuous data download, etc. Thus there exists a broad application space for the one-hop data collection problem, organized by data query type. In the next section we describe this application space.

5.2 Application Space

We identify a broad spectrum of medium access problems for the data collection application in wireless sensor networks based on the characteristics of the data to be collected. At one extreme of this spectrum is continuous data (CD) collection and at the other extreme is one-shot data (OSD) collection. In continuous data collection, the sources always have a packet in their transmission queues. In one-shot data collection the sink is interested in one-shot data queries such as "Which nodes have observed the event?" or "Which nodes have recorded temperatures above 50°F?". The response to such one-shot queries is a single packet from each sensor node that contains the location of the node or a similar identification. Once the packet has been successfully transmitted from a node, that node no longer contends for the channel.

Neighbor discovery is a WSN application that falls at the one-shot data collection end of the application spectrum. It is an essential part of many routing protocols.
In this application the sink is the node that discovers its neighboring nodes, and a single packet is sufficient for each neighboring node to transmit its ID to the sink. Another WSN application in which the one-shot data collection problem occurs is localization. The unknown node (node with unknown location) sends out a localization request, and the reference nodes (nodes with known location) in the radio range of the unknown node each respond with a single packet that contains their respective location coordinates.

The application space between these two extremes is made up of applications which have non-continuous data, as shown in Figure 5.1. The color transition in the figure indicates the spectrum transition from infinite packets in the queues to a single packet in the queues of contending nodes. The region between these two extremes is characterized by medium access conditions which are identified by queues with probabilistic occupancy and queues holding a finite number (> 1) of packets.

[Figure 5.1 labels: Continuous Data (CD): "infinite" packets in the queue; One-Shot Data (OSD): single packet in the queue; queues with probabilistic occupancy; queues with finite packets.]

Figure 5.1: Application space spectrum for one-hop data collection in wireless sensor networks. The color transition from red to blue indicates the spectrum transition from infinite packets in the queues to a single packet in the queues of contending nodes.

We mainly focus on the two ends of the above spectrum of application space, namely, continuous data collection and one-shot data collection, in the context of the one-hop data collection problem. As mentioned previously, the key distinguishing aspect of these two applications is the number of data packets in the transmission queues of the data sources. The continuous data (CD) collection application is defined by the presence of "infinite" packets in the transmission queues of contending source nodes, also called back-logged queues.
The one-shot data (OSD) collection application is defined by the presence of a single packet in the transmission queues of contending source nodes. This difference fundamentally alters the way these two applications are modeled, analyzed, and understood.

Traditionally, researchers have modeled, analyzed, and studied the performance of various medium access techniques for the one-hop data collection problem for continuous data and for queues with probabilistic occupancy. However, owing to the application space specific to WSN, the one-shot data collection application occurs much more frequently in WSN. To the best of our knowledge, ours is the first effort at modeling, analyzing, and understanding the performance of various medium access techniques for the one-shot data collection application.

In the next section we review the various medium access techniques proposed in the literature that are applicable to the one-hop data collection problem in WSN.

5.3 Medium Access Techniques

Medium access techniques for the one-hop data collection problem can be broadly classified into two classes, namely, randomized and scheduled. In this work, we mainly focus on randomized medium access techniques and review the literature on them. A review of scheduled medium access techniques (e.g., TDMA) is beyond the scope of this thesis.

Many solutions have been proposed over the years for the problem of medium access in wireless packet radio networks (e.g. [21], [34], [17], [16], [96], [85], [42], [43], [22], [49], [7], [56], [61], [13], [92], [37], [86], [89], [90], [97], [99], [9], [70], [35], [51], [87], [53], [52], [8], [88], [58], [95], [64], [67], [93], [23], [39], [62], [63], [25], [38], [27], [31], [14], [12], [98], [50], [46], [24], [28], [73], [29], [72], [66], [71], [81], [76]). These medium access control (MAC) techniques can be broadly classified into three main categories:

1. Slotted Aloha Medium Access
2. Carrier Sensing Medium Access
3. Tree/Stack Resolution Algorithms

Before we review each of the above categories of MAC techniques, we describe the MAC problem in the context of one-hop data collection, discuss the required assumptions, and define performance metrics for evaluation purposes.

5.3.1 Problem Description

The modeling and analysis of wireless MAC techniques or protocols is usually characterized by the assumption that each contending node always has packets to transmit, or that the number of packets in the queue is probabilistic, based on a stationary arrival rate. In this case the network system reaches a steady state in which the system parameters approach a constant average value over time (for example, see [13], [23], [25], [27]). Significantly, these assumptions do not hold for one-shot data collection, because the number of nodes contending for the wireless channel decreases continuously and deterministically with each successful packet transmission. This implies that the system cannot reach a steady state; such a system is therefore said to be transient.

An important aspect of MAC protocols is the time synchronization of the data sources and data sinks. Time synchronization of nodes in wireless sensor networks is a very active field of research and much progress has been made in this area. Owing to this success, even the IEEE standard 802.15.4 for low-power, low-rate wireless networks stipulates time synchronization between all nodes in the network ([24]). In this thesis we focus on randomized medium access techniques in which all the data sources and the data sink are time synchronized. Thus, we assume that time is split into slots of equal length and that each node in the network can identify the starting and ending times of these time slots.

Recently, many sleep scheduling based medium access mechanisms have been proposed (e.g. [97], [99]) for saving energy in wireless sensor networks. However, we do not consider sleep scheduling in this thesis.
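The transient nature of one-shot data collection can be seen in a toy simulation. The Python sketch below assumes a deliberately simplified access rule, in which every remaining node transmits in each time slot with the same fixed probability and leaves the contention once it succeeds; this rule is an assumption made here for illustration and is not a back-off scheme analyzed in this thesis. The function name and parameters are hypothetical.

```python
import random


def simulate_one_shot(num_nodes, tx_prob, seed=0):
    """Return the number of still-contending nodes after each time slot.

    Every remaining node transmits in a slot with probability tx_prob.
    A slot is successful only if exactly one node transmits; that node
    then stops contending.
    """
    if not 0.0 < tx_prob < 1.0:
        raise ValueError("tx_prob must lie strictly between 0 and 1")
    rng = random.Random(seed)
    remaining = num_nodes
    trace = [remaining]
    while remaining > 0:
        transmitters = sum(1 for _ in range(remaining) if rng.random() < tx_prob)
        if transmitters == 1:
            remaining -= 1          # one packet delivered, one contender fewer
        trace.append(remaining)
    return trace


trace = simulate_one_shot(num_nodes=20, tx_prob=0.1, seed=1)
delay_in_slots = len(trace) - 1     # slots needed to collect one packet from every node
```

The returned trace never increases and ends at zero, so the contention level has no stationary value, in contrast to the back-logged case in which the number of contenders stays fixed.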
5.3.1.1 Metrics

We now define metrics that capture the performance dynamics of MAC protocols for both the CD and OSD scenarios. Let the number of contending nodes in the wireless sensor network be $N$.

- Throughput: This metric is measured in terms of the amount of data successfully transmitted to the sink per unit time. We use this metric for the CD scenario, in which each node always has a packet to transmit. We denote throughput by $\Phi_{CD}(N)$ for the CD scenario.

- Delay: This metric is measured in terms of the amount of time taken by the sensor nodes to successfully transmit one packet each to the sink. We use this performance metric for the OSD scenario, in which each node has a single packet to transmit. We denote delay by $\Delta_{OSD}(N)$ for the OSD scenario.

- Energy Consumption: This metric is measured in terms of the amount of energy consumed by each sensor node in the WSN for data transfer to the sink. For the CD scenario the energy consumption is measured per unit time, and for the OSD scenario the energy consumption is measured over the entire time taken to transfer packets from all contending nodes to the sink. We calculate the energy consumption per node as an amortized quantity, by calculating the energy consumption of all the nodes and dividing it by the number of nodes. We denote the amortized energy consumption metric by $\Sigma_{CD}(N)$ for the CD scenario and by $\Sigma_{OSD}(N)$ for the OSD scenario.

Next, we describe the energy model that we use to model the energy consumption in a sensor node.

5.3.1.2 Energy Model

For the two data collection scenarios under consideration, CD and OSD, the sensor nodes are either transmitting a packet to the sink or receiving packets transmitted to the sink by other sensor nodes. For the OSD scenario, once a sensor node transmits its packet successfully it shuts down to save energy. This is not the case in the CD scenario, as nodes always have packets to transmit.
We denote the energy consumed by a sensor node per unit time during packet transmission by $\xi_T$ and the energy consumption per unit time during reception by $\xi_R$.

Next, we briefly review each of the three classes of medium access techniques described in the previous section.

5.3.2 Slotted Aloha Medium Access

The slotted Aloha medium access protocol ([21], [9]) was one of the first random access mechanisms suggested for wireless medium access. In this protocol each contending node transmits a packet to the sink at the beginning of a time slot as soon as the packet is available. In order to avoid collisions, a back-off collision avoidance policy is used at each node. In this thesis we consider the commonly used binary exponential back-off mechanism. We give a detailed description of this mechanism in the next chapter.

Much work has been done on the modeling, analysis and performance evaluation of the slotted Aloha protocol for different applications (a few examples include [70], [35], [51], [87], [53], [52], [8], [88], [10], [11]). All of this previous work considers either continuous data or non-continuous data collection. To the best of our knowledge, no prior work has analyzed the performance of the slotted Aloha protocol for the one-shot data collection application.

In Chapter 6 we present a model for the slotted Aloha protocol with binary exponential back-off for the one-shot data collection scenario, analyze it using flow equations, and then evaluate the performance of the protocol.

5.3.3 Carrier Sense Medium Access

Carrier sensing medium access (CSMA) techniques have been proposed for wireless packet radio networks for over a quarter of a century (for example, [58], [95], [64]). In these techniques each contending node senses the channel and transmits its packet if it finds the channel to be free of any transmissions; if the channel is busy with another transmission, the node defers its transmission.
Since in this thesis we are interested in scenarios in which all the contending nodes are time synchronized to a common global clock, the CSMA techniques are applicable to cases in which the transmission of a data packet spans multiple time slots. The slotted Aloha medium access techniques discussed in the previous subsection are applicable to cases in which the packet transmission time is equal to a single time slot.

If more than one node senses the channel to be free in a given time slot and these nodes transmit their packets simultaneously, a collision results. One mechanism to avoid collisions is for the nodes to back off randomly when they sense the channel to be busy. Thus different nodes choose different random times to sense the channel and transmit their packets. A commonly used and studied collision avoidance (CA) mechanism with CSMA techniques (CSMA/CA) is binary exponential back-off. We elaborate on this mechanism in the context of CSMA techniques in Chapter 8.

Many IEEE standard MAC protocols have been proposed based on CSMA techniques. The most popular wireless MAC standard, IEEE 802.11, or Wi-Fi, is a CSMA/CA technique. A tremendous amount of work has been done in modeling, analyzing and evaluating the performance of this standard in a diverse range of applications. A sample of these works is captured in [23], [39], [62], [63], [25], [38], [27], [31], [14], [12], [98], [50], [46]. The IEEE 802.11 MAC protocol uses binary exponential back-off as the collision avoidance mechanism. It is beyond the scope of this thesis to present a review of the work on the IEEE 802.11 MAC protocol, even though a thorough literature review has been done in this area for this thesis.

The IEEE 802.11 MAC protocol has been designed mainly for high speed wireless communication between compliant devices or for high speed internet access. Thus the main feature of the IEEE 802.11 MAC protocol is a high data rate, ranging from tens to hundreds of Mbps.
This aspect is significantly different for wireless sensor networks, which operate at several hundred kbps at best. Starting with this difference, the operating conditions of the IEEE 802.11 protocol differ significantly from those of wireless sensor networks in such important aspects as energy conservation goals, device form factors, application requirements, etc. In view of these significant divergences, the IEEE has proposed and standardized a new protocol for low-power, low-rate wireless networks, called the IEEE 802.15.4 standard. We provide an overview of this standard MAC protocol next.

5.3.3.1 IEEE 802.15.4

The IEEE 802.15.4 standard ([28]) allows different network topologies, such as one-hop star and multi-hop. In this thesis we consider the one-hop star topology with multiple data sources and a single sink. In the star topology, a global synchronization of nodes is assumed and time is divided by beacons transmitted by a network coordinator. The beacon interval consists of a superframe and an optional energy saving time in which the nodes switch off their radios and go to sleep. The superframe is divided into 16 time slots of duration $\delta = 320$ μs each. The superframe consists of a contention access period (CAP) and a period of guaranteed time slots (GTS). The GTS is dedicated to low latency applications. In this chapter we consider only the CAP mode (without the energy saving mode, GTS, and beacons), where medium access is through slotted CSMA/CA.

In slotted CSMA/CA, a node can transmit its packet only after it senses the channel to be free for a contention window (CW) of 2 time slots. The main purpose of the CW is to avoid collisions between acknowledgement packets (ACKs) from the sink and data packets from the sources, as the protocol does not specifically provision time slots for ACKs [81]. A node chooses a time slot uniformly at random from an initial window of $[0, 2^{BE} - 1]$, where $BE$ is the back-off exponent with an initial value of 3.
The node transmits its packet if the channel is sensed to be free in that time slot and in the next one; if the channel is sensed to be busy, the node backs off to a bigger window with $BE = 4$. On a second busy channel sensing or a collision, the node backs off to a window with $BE$ = aMaxBE = 5, and $BE$ remains at this value thereafter. If a node is unable to transmit its packet within 5 back-offs, the transmission is assumed to be a failure and the packet is dropped. We relax this condition in this thesis and allow a node to retransmit its packet until it is successful. Figure 5.2 shows the flow chart for a node using the IEEE 802.15.4 MAC. The IEEE 802.15.4 standard specifies a data rate of 250 kbps and a maximum MAC protocol data unit (MPDU) of 127 bytes. Given this data rate, the transmission time for a typical packet of 50 bytes is 5 time slots and for the MPDU it is 13 time slots.

[Figure 5.2 flow chart boxes: Start: BE = 3; choose a time slot; wait until the time equals the chosen slot; sense the channel in this time slot (is it free?); sense the channel in the next time slot (is it free?); transmit (success?); BE = min(BE + 1, aMaxBE) and choose a new time slot.]

Figure 5.2: Flow chart for IEEE 802.15.4 operation at a node.

In [66], the performance of the IEEE 802.15.4 MAC is evaluated in terms of throughput and energy efficiency using ns-2 simulations for a maximum of 49 nodes. In [73], the performance of the standard MAC is evaluated for medical applications where the IEEE 802.15.4 devices interface with traditional MAC technologies such as Ethernet. In [76], the performance of the IEEE 802.15.4 MAC protocol is analyzed in the context of medical body area networks (BAN), where the energy efficiency of body implanted sensors is the focus, given that their required lifetime is on the order of 10-15 years in these applications. In [71], a queuing analysis is presented for the sleep mode with possibly finite buffers.
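The slot arithmetic and the back-off window rule described earlier in this subsection for IEEE 802.15.4 can be written out as a short sketch. The Python code below uses only the values quoted in the text (320 μs time slots, a 250 kbps data rate, and a back-off exponent that starts at 3 and is capped at aMaxBE = 5); the function names are hypothetical, and the code is an illustration rather than an implementation of the standard.

```python
import random

SLOT_MICROSECONDS = 320     # time slot duration quoted in the text
DATA_RATE_BPS = 250_000     # data rate quoted in the text
INITIAL_BE = 3              # initial back-off exponent
A_MAX_BE = 5                # maximum back-off exponent (aMaxBE)


def slots_for_packet(num_bytes):
    """Number of whole time slots needed to transmit a packet of num_bytes."""
    bits_per_slot = DATA_RATE_BPS * SLOT_MICROSECONDS // 1_000_000   # 80 bits
    return -(-(num_bytes * 8) // bits_per_slot)                       # ceiling division


def next_backoff_exponent(be):
    """Back-off exponent to use after a busy channel sensing or a collision."""
    return min(be + 1, A_MAX_BE)


def draw_backoff_slot(be, rng=random):
    """Pick a time slot uniformly at random from the window [0, 2**be - 1]."""
    return rng.randint(0, 2 ** be - 1)
```

Here `slots_for_packet(50)` evaluates to 5 and `slots_for_packet(127)` to 13, the transmission times stated in the text, and repeated calls to `next_backoff_exponent` starting from 3 give 4, 5, 5, and so on.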
In [72], the performance of the standard MAC is evaluated in the presence of both uplink and downlink traffic in the one-hop star topology network.

In this work, we evaluate the performance of the IEEE 802.15.4 MAC protocol for both ends of the application spectrum of the one-hop data collection problem, in both high and low density scenarios. We also propose enhancements to the protocol based on channel feedback.

Other recent work on CSMA based MAC protocols specific to wireless sensor networks includes [67] and [93]. However, this work differs significantly from ours, as its authors focus on developing techniques to minimize the delay until the first packet transmission from the sensor nodes to the sink in the one-hop network. In our work we are interested in the delay and the corresponding energy consumption for all relevant data to be transmitted to the data sink.

Next, we discuss the third category of medium access techniques, the tree/stack algorithms.

5.3.4 Tree/Stack Algorithms

The main idea in this category of medium access techniques is to probabilistically and hierarchically (in a tree form) isolate the contending nodes and separate their collision domains, to reduce collisions and thus increase the throughput of the wireless network. A stack is usually used to implement these tree based isolation algorithms efficiently. We explain these algorithms through the example of the HT-splitting MAC protocol. In this protocol [21], the collision domains of the contending nodes are isolated probabilistically. The protocol starts with all contending nodes in the radio range of the sink tossing a coin, and the subset of nodes with a heads (H) transmitting their packets. If there is a collision, all nodes with an H in the first level toss a coin again, and the subset of nodes with an H in both the present and the previous levels transmit their packets. This is continued until a single node has H from all the previous tosses and the present toss.
Once this node finishes transmitting its packet, the node with a tails (T) in the present toss and an H in all the previous tosses transmits its packet. This process of descending and ascending the "tree" of coin tosses continues until all nodes transmit their packets.

In Chapter 7 we propose a tree-based location-aware medium access technique for one-shot data collection applications, in which the collision domains of contending sensor nodes are separated based on their locations rather than probabilistically as in the HT-splitting MAC protocol.

5.3.5 Enhancements

Next, we discuss different enhancement mechanisms proposed over the years for the categories of medium access techniques discussed above. The two main mechanisms that have been used in the literature to enhance the performance of medium access protocols are the use of relevant feedback from the wireless channel and the use of the location information of nodes.

5.3.5.1 Channel Feedback

The idea of using feedback from the channel to control the transmission probabilities of contending nodes has been used for a long time. Rivest in [86] has proposed a ternary feedback model in which each node has to monitor three channel conditions: absence of transmissions, successful transmissions, and collisions. Rivest has shown that estimating the true value of the number of nodes $n$ and setting the transmission probability to $\frac{1}{n}$ maximizes the throughput in slotted-Aloha type protocols (in which the packet length is equal to a single time slot). If the packet length spans multiple time slots, this result does not hold, as we show in Proposition 9 in Section 8.3. In [25] a control mechanism has been presented that uses the energy consumed by a tagged node in the network in the above three channel conditions between two successful packet transmissions. This mechanism is not applicable in the case of OSD because each node has a single successful packet transmission.
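The $\frac{1}{n}$ transmission probability mentioned above for slotted-Aloha type protocols can be recovered with a standard calculation. The sketch below assumes, for this illustration only, that each of the $n \geq 2$ contending nodes transmits independently with the same probability $p$ in a time slot, so that a slot is successful exactly when one node transmits; the symbol $P_s$ for the per-slot success probability is introduced here and does not appear in the original text.

```latex
P_s(p) = n\,p\,(1-p)^{n-1}, \qquad
\frac{dP_s}{dp} = n\,(1-p)^{n-2}\,(1 - np) = 0
\;\Longrightarrow\; p^{*} = \frac{1}{n}, \qquad
P_s(p^{*}) = \Bigl(1 - \frac{1}{n}\Bigr)^{n-1} \xrightarrow[n \to \infty]{} \frac{1}{e}.
```

In the one-shot setting the number of contenders $n$ shrinks after every success, so under this simplified model the maximizing probability would have to be updated as nodes leave the contention.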
Similar strategies based on the estimation of the three channel conditions have been proposed ([22], [56], [70]), all of which are more suitable for steady-state conditions (as in CD) in which the number of contending nodes remains constant.

5.3.5.2 Location Information

The idea of using location information of contending nodes to enhance the performance of medium access techniques has been prevalent for some time. Corbett et al. [33] propose a hybrid TDMA-Contention based protocol for multi-hop sensor networks that uses the locations of nodes for spatial reuse and time slot allocation to avoid collisions and interference. The space is divided into hexagonal cells, similar to cellular networks, and nodes within each cell use contention-based medium access. In contrast, in our work on the location-aware medium access technique, mentioned previously and presented in detail in Chapter 7, we use the locations of nodes to solve the problem of medium access within a cell. Liu et al. [65] use the location information of nodes within one hop to provide energy efficiency and fault tolerance, even though the medium access is through contention-based random-access schemes. In our location-aware medium access work, the medium access itself is based on the locations of nodes. Nadeem et al. [74] use the location information in tandem with the capture effect to increase throughput in IEEE 802.11 DCF networks. There, the location information is used to increase the spatial reuse efficiency and to better manage interference, leading to additional concurrent transmissions and thereby increasing the overall throughput of the protocol. Again, this work differs from ours in that we solve the one-hop medium access problem using the location information of nodes, in contrast to the multi-hop one.

5.4 Chapter Summary

In this chapter, we have reviewed the background work on medium access techniques for the one-hop data collection problem for wireless sensor networks.
We have described the application space of operation and defined the metrics of interest that capture the performance of protocols in this application space. We have then presented an energy model we will use to evaluate the energy performance of protocols. Finally, we have discussed three different categories of medium access techniques and the common performance enhancement techniques used in the literature.

Chapter 6

Analysis of Slotted Aloha Medium Access for One-Shot Data Collection

6.1 Introduction

In this chapter, we analyze the performance of slotted Aloha medium access for the one-hop one-shot data collection problem. As discussed in Chapter 5, the slotted Aloha protocol has been analyzed for many different application scenarios; but to the best of our knowledge ours is the first attempt at analyzing the protocol for the one-hop one-shot data collection problem.

More specifically, we model and analyze the performance of slotted Aloha with a binary exponential back-off collision avoidance scheme. We present a Markov chain model that captures the average temporal dynamics of the network and derive flow equations that depict the behavior of the network as a function of time. Through simulations, we show that the flow equations match the network dynamics very accurately. Finally, we analyze the performance of the protocol using these flow equations.

The rest of the chapter is organized as follows. In the next section, we review the assumptions made and define the performance metrics of interest for the one-hop one-shot data collection problem. In Section 6.3, we model and analyze slotted Aloha with binary exponential back-off. We evaluate its performance for the one-hop one-shot data collection problem in Section 6.4 and summarize the chapter in Section 6.5.

6.2 Problem Description

In this section, we review the assumptions made and define the performance metrics of interest. The number of contending nodes in the radio range of the sink is $N$.
We assume that time is split into slots of equal length and that all contending sensor nodes (the sink is not a contending node) in the network are time synchronized to transmit at the beginning of each time slot. The time required to transmit a packet is equal to a single time slot duration. When more than one node transmits its packet in the same time slot, a collision results; otherwise, if a single node transmits, the transmission is successful. The sink informs the sensor nodes of either case through acknowledgements. We assume that the acknowledgement packets have a negligible effect on the protocol performance.

The following two performance metrics are of interest for the one-hop one-shot data collection problem:

• $\Delta_{OSD}(N)$: The number of time slots required for the sink to successfully receive packets from all the $N$ contending nodes.

• $\Sigma_{OSD}(N)$: The energy consumption by the contending nodes in the sensor network for the sink to successfully receive packets from all the $N$ contending nodes. The energy model we consider is as described in Chapter 5.

In the next section, we model and analyze slotted Aloha with binary exponential back-off for the one-hop one-shot data collection problem.

6.3 Slotted Aloha with Binary Exponential Back-Off

In this protocol, each node starts out with a minimum congestion window of size $W_0$ and each time it backs off, it doubles the size of its congestion window up to a maximum value of $W_{MAX}$, leading to a binary exponential increase in its window size. At each stage of the increase, the node chooses a time slot to transmit uniformly at random within the current window size. The congestion window size of the node remains constant once the maximum window size is reached. If the number of stages of increase before the maximum window size is reached is $M$, then the window size at stage $i$ ($0 \le i \le M-1$) is $W_i = 2^i W_0$, and $W_{MAX} = W_{M-1}$.
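As a small worked illustration of the window schedule above, the following Python sketch lists the window sizes $W_i = 2^i W_0$ for a given $W_0$ and $M$. The function name `backoff_windows` is hypothetical and not part of the original text; the values $W_0 = 32$ and $M = 5$ are the ones the chapter uses for Figure 6.2.

```python
def backoff_windows(w0, stages):
    """Window size at each back-off stage i: W_i = 2**i * W_0, for 0 <= i <= stages - 1."""
    return [w0 * 2 ** i for i in range(stages)]


windows = backoff_windows(32, 5)
print(windows)        # [32, 64, 128, 256, 512]; the last entry is W_MAX
print(sum(windows))   # 992, the number of transient states (i, j) in the chain
```

Since stage $i$ contributes $W_i$ counter values, the sum of the window sizes is also the number of transient states that the flow equations of this section have to track.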
Figure 6.1: Markov chain of states for a contending node using the slotted Aloha with binary exponential back-off protocol.

Figure 6.1 shows the Markov chain of states a system using this protocol goes through before all packets are transmitted to the sink through state $S$. The state $(i,j)$ implies that the node has entered stage $i$ after backing off from stage $(i-1)$, $j$ back-off time counter slots ago. We assume that the back-off time counter length for each node in the network is equal to the time slot length. Let $n_{i,j}(t)$ ($i \in [0, M-1]$, $j \in [0, W_i-1]$) be the number of nodes in stage $i$ at time $t$ that have entered this stage $j$ time slots ago. The probability that a node in state $(i,j)$ attempts to transmit its packet, given that it is in this state, is

$$p_{i,j} = \frac{1}{W_i - j} \tag{6.1}$$

All nodes in the network start at time slot 0 at state $(0,0)$, such that $n_{0,0}(0) = N$. However, $\forall\, t > 0$, $n_{0,0}(t) = 0$, as all nodes move to a different state in the next time slot and do not return to state $(0,0)$. Whenever a node attempts to transmit in a time slot, either it backs off due to collisions or packet errors, or it successfully delivers its packet to the sink. Therefore, the number of nodes that enter state $(i,j)$, $j \ne 0$, at $t+1$ is equal to the average number of nodes in state $(i,j-1)$ that do not attempt to transmit at $t$. All nodes in state $(i,j)$, $j \ne 0$, at time $t$ leave that state either through successful transmission or back-off or by just moving to state $(i,j+1)$ at time $t+1$. Thus,

$$n_{i,j}(t+1) = n_{i,j-1}(t)\,(1 - p_{i,j-1}) \tag{6.2}$$

The average number of nodes that enter state $(i,0)$, $0 < i < M-1$, at time $t+1$ is equal to the sum of the average number of nodes that back off from all states $(i-1,j)$ in the previous back-off stage $i-1$ at time $t$. All nodes in state $(i,0)$ at time $t$ leave to other states by time $t+1$.
Therefore, for $i \ne 0, M-1$,

$$n_{i,0}(t+1) = \sum_{q=0}^{W_{i-1}-1} n_{i-1,q}(t)\, p_{i-1,q}\, [1 - \pi_{i-1,q}(t)] \tag{6.3}$$

where

$$\pi_{i,j}(t) = (1 - p_{i,j})^{n_{i,j}(t)-1} \prod_{k=0}^{M-1} \; \prod_{\substack{l=0 \\ (k \ne i,\; l \ne j)}}^{W_k-1} (1 - p_{k,l})^{n_{k,l}(t)} \tag{6.4}$$

The number of nodes that enter state $(M-1,0)$ at time $t+1$ is equal to the average number of nodes that back off from all states of stages $M-2$ and $M-1$ at time $t$. Therefore,

$$n_{M-1,0}(t+1) = \sum_{q=0}^{W_{M-1}-1} n_{M-1,q}(t)\, p_{M-1,q}\, [1 - \pi_{M-1,q}(t)] + \sum_{q=0}^{W_{M-2}-1} n_{M-2,q}(t)\, p_{M-2,q}\, [1 - \pi_{M-2,q}(t)] \tag{6.5}$$

Consequently, the average number of nodes in state $S$ at time slot $t+1$ is equal to its value at time slot $t$ plus the sum of the average number of successful deliveries from all states $(i,j)$ in the Markov chain of Figure 6.1 to the sink:

$$n_s(t+1) = n_s(t) + \sum_{i=0}^{M-1} \sum_{j=0}^{W_i-1} n_{i,j}(t)\, p_{i,j}\, \pi_{i,j}(t) \tag{6.6}$$

The expected delay $\Delta_{OSD}(N)$ is such that $n_s(\Delta_{OSD}(N)) = N$. The expected energy consumption per node for the successful reception of $N$ packets by the sink is:

$$\Sigma_{OSD}(N) = \frac{1}{N} \sum_{t=0}^{\Delta_{OSD}(N)} \sum_{i=0}^{M-1} \sum_{j=0}^{W_i-1} \left( n_{i,j}(t)\, p_{i,j}\, \xi_T + n_{i,j}(t)\,(1 - p_{i,j})\, \xi_R \right) \tag{6.7}$$

$$\phantom{\Sigma_{OSD}(N)} = \frac{1}{N} \sum_{t=0}^{\Delta_{OSD}(N)} \sum_{i=0}^{M-1} \sum_{j=0}^{W_i-1} n_{i,j}(t) \left( p_{i,j}\, \xi_T + (1 - p_{i,j})\, \xi_R \right) \tag{6.8}$$

where $n_{i,j}(t)\,p_{i,j}$ and $n_{i,j}(t)\,(1 - p_{i,j})$ are the expected numbers of transmissions and receptions, respectively, in time slot $t$ from state $(i,j)$. Since for each sensor node the energy consumption is almost equal for both transmission and reception, $\xi_R \approx \xi_T$. With energy measured in units of $\xi_T$, this reduces the expected energy consumption to

$$\Sigma_{OSD}(N) = \frac{1}{N} \sum_{t=0}^{\Delta_{OSD}(N)} \sum_{i=0}^{M-1} \sum_{j=0}^{W_i-1} n_{i,j}(t) \tag{6.9}$$

One of the most important aspects of counting the number of nodes at each state by the above average expressions is the handling of fractional values. If $n_{i,j}(t)$ is a real number less than one, then that number can be approximately assumed to be the probability of existence of a node at state $(i,j)$. All real numbers greater than one are used without change. It should be noted that the number of states with fractional values of $n_{i,j}(t)$ increases as the total number of states increases.
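The flow equations above can be iterated numerically. The following Python sketch is one possible way to do so and is not the implementation used for the thesis results. The names `no_tx_factor`, `flow_step` and `run_flow` are hypothetical. Two points are assumptions of this sketch rather than statements from the text: a fractional $n_{k,l}(t) < 1$ is read as the probability that a single node exists in that state, so its contribution to the product in Equation 6.4 becomes $1 - n_{k,l}(t)\,p_{k,l}$, and nodes backing off from the last stage re-enter state $(M-1, 0)$, which for $M = 1$ means re-entering $(0,0)$.

```python
import math


def no_tx_factor(n, p):
    """Probability that none of the n nodes in one state transmits.

    Assumption: a fractional n < 1 is treated as the probability that a
    single node exists in the state, giving the factor 1 - n * p.
    """
    return (1.0 - p) ** n if n >= 1.0 else 1.0 - n * p


def flow_step(n, n_s, windows):
    """Advance n[i][j] and n_s by one time slot (Equations 6.2 to 6.6)."""
    last = len(windows) - 1
    prob = [[1.0 / (w - j) for j in range(w)] for w in windows]    # Eq. 6.1
    idle = [[no_tx_factor(n[i][j], prob[i][j]) for j in range(w)]
            for i, w in enumerate(windows)]
    zeros = sum(f == 0.0 for row in idle for f in row)
    log_all = sum(math.log(f) for row in idle for f in row if f > 0.0)

    new = [[0.0] * w for w in windows]
    for i, w in enumerate(windows):
        for j in range(w):
            nij, p, f = n[i][j], prob[i][j], idle[i][j]
            if nij == 0.0:
                continue
            # Eq. 6.4: probability that no other node transmits in this slot.
            if zeros - (f == 0.0) > 0:
                pi = 0.0
            else:
                others = math.exp(log_all - (math.log(f) if f > 0.0 else 0.0))
                own = (1.0 - p) ** (nij - 1.0) if nij >= 1.0 else 1.0
                pi = min(1.0, others * own)
            attempts = nij * p
            n_s += attempts * pi                                 # Eq. 6.6
            new[min(i + 1, last)][0] += attempts * (1.0 - pi)    # Eqs. 6.3, 6.5
            if j + 1 < w:
                new[i][j + 1] += nij * (1.0 - p)                 # Eq. 6.2
    return new, n_s


def run_flow(num_nodes, w0, stages, slots):
    """Return the history n_s(0..slots) and the final transient state."""
    windows = [w0 * 2 ** i for i in range(stages)]
    n = [[0.0] * w for w in windows]
    n[0][0] = float(num_nodes)          # all nodes start in state (0, 0)
    n_s, history = 0.0, [0.0]
    for _ in range(slots):
        n, n_s = flow_step(n, n_s, windows)
        history.append(n_s)
    return history, n


history, state = run_flow(10, 4, 3, 300)
print(history[-1])   # average number of delivered packets after 300 slots
```

Because the averaged flow approaches $N$ gradually, a practical reading of $n_s(\Delta_{OSD}(N)) = N$ needs a tolerance, for example the first $t$ with $n_s(t) \ge N - 0.5$; that threshold is a choice of this sketch. Since the flow conserves nodes, $\sum_{i,j} n_{i,j}(t) = N - n_s(t)$, so the sum in Equation 6.9 can be accumulated from the same history.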
6.4 Performance Evaluation

In this section, we first verify the accuracy of the flow equations of the previous section using simulations and then evaluate the performance of the slotted Aloha protocol with binary exponential back-off for the one-hop one-shot data collection problem. For simulations, the value of $N$ is varied from 2 to 100 to capture the network dynamics from low to high densities. The results are averaged over 100 random trials, with ten different random seeds. Figure 6.2 plots the analysis and simulation results for the number of nodes in each of the $M$ back-off stages as a function of time, for $M = 5$ and $W_0 = 32$. The number of nodes $n_i(t)$ in stage $i$ at time $t$ is calculated as:

$$n_i(t) = \sum_{j=0}^{W_i-1} n_{i,j}(t), \quad 0 \le i \le M-1 \tag{6.10}$$

Figure 6.2: Comparison of analysis and simulations for slotted Aloha with binary exponential back-off ($N = 100$, $M = 5$, $W_0 = 32$). The plot shows the number of nodes $n_0(t)$ through $n_4(t)$ in each back-off stage against time slots $t$, for both analysis and simulation.

The above figure shows that the analysis matches the simulations very well, with one virtually superimposed over the other. It is important to note that the analysis presented in the previous section is first-order in nature and thus is approximate. Exact analysis of the system dynamics would involve tracking the probability distribution of nodes in each state of Figure 6.1 as a function of time. However, the above figure shows that the approximate analysis of the previous section works well and is valuable in this setting.

It can be seen from this figure that the final back-off stage contributes the most to the delay. For example, the first four stages contribute close to half of the delay and the final stage (stage 4) contributes the other half of the delay.
The reason for this is that the low probability of transmission in the final stage, accompanied by the reducing number of contending nodes in the network, increases the number of idle time slots. This implies that the delay can be reduced either by reducing the number of back-off stages or by reducing the initial back-off window or both. This intuition is confirmed by Figure 6.3, which plots the expected delay and energy consumption as a function of the number of contending nodes for different values of $M$ and $W_0$.

Figure 6.3: Performance of slotted Aloha with binary exponential back-off as a function of the number of nodes $N$ for the one-hop one-shot data gathering problem. (a) Expected delay $\Delta_{OSD}(N)$ for $W_0 = 32$ and $M = 1$ to $5$. (b) Expected delay $\Delta_{OSD}(N)$ for $M = 3$ and $W_0 = 16, 32, 64$. (c) Expected energy consumption $\Sigma_{OSD}(N)$ for $W_0 = 32$ and $M = 1$ to $5$. (d) Expected energy consumption $\Sigma_{OSD}(N)$ for $M = 3$ and $W_0 = 16, 32, 64$.

The main observations from the figure are as follows.

• For a given initial window size (Figures 6.3(a) & (c)), there exist lower and upper threshold numbers of nodes between which the performance due to a particular value of $M$ is better than other values. The delay is reduced up to that value of $M$ because of the reduction in the number of idle time slots in the final back-off stage by implicitly increasing the probability of transmission. Lower delay implies that the rate at which nodes transition into the absorbing state $S$ is higher, and this implies a smaller number of nodes in other states per time slot.
Thus, a lower number of nodes in the transition states $(i,j)$ implies lower energy consumption according to Equation 6.9. For $M = 1$, where the back-off window is constant, the delay and energy of slotted Aloha are lower than those for $M > 1$ at low densities and higher at high densities. Also, from the figure it is evident that for node numbers greater than 100 the delay and energy consumption for $M = 2$ increase beyond those for $M = 3$. This is because at high densities a lower number of back-off stages implies a higher number of collisions, thus increasing both delay and energy. This suggests that for high densities multiple back-off stages are preferred and for low densities a single back-off stage performs better.

• Figures 6.3(b) & (d) confirm the intuition that at high node densities larger initial windows ($W_0$) perform better and at low node densities smaller initial windows are better. These figures also suggest the existence of lower and upper thresholds of initial window sizes between which the performance due to a particular $W_0$ value is better than other values.

6.5 Chapter Summary

In this chapter, we have investigated the performance of the slotted Aloha protocol with binary exponential back-off applied to the one-hop one-shot data collection problem that occurs frequently in many data-gathering applications of wireless sensor networks. We derived flow equations based on transient state transitions and verified their accuracy through simulations. Using these equations, we then evaluated the performance of the protocol. Results suggest the existence of a delay-energy trade-off based on the number of back-off stages and the initial window size.

Chapter 7

Location-Aware Medium Access

7.1 Introduction

Location awareness of sensor nodes is increasingly common in many wireless sensor network applications. For example, protocols such as GPSR [54] have used it to provide efficient routing.
In this thesis, we propose a novel medium access protocol that makes use of the location awareness of sensor nodes to provide efficient wireless medium access.

The main idea in our protocol is the separation of collision domains of nodes using spatial partitioning. A tree-based space partitioning procedure is used to adaptively partition the space until each node can transmit its packet successfully, without collisions. The key point here is that spatial partitioning allows us to leverage the location distribution of sensor nodes to provide efficient medium access.

In this chapter, our focus is on the one-hop one-shot data collection problem discussed in Chapter 5. The two main performance metrics of interest for this problem are the delay in obtaining packets from all contending sensor nodes in the radio range of the sink and the energy consumption incurred by the sensor network in this operation. We evaluate the performance of our location-aware medium access protocol in terms of these two performance metrics and compare it with three location-unaware medium access protocols: HT-split, optimal p-persistent slotted CSMA, and the IEEE 802.15.4 standard MAC. We show through simulations that our protocol can take advantage of the location distribution of sensor nodes to provide significantly lower delay and energy consumption compared to location-unaware medium access protocols.

The rest of the chapter is organized as follows. In Section 7.2, we describe the assumptions and metrics associated with the one-hop one-shot data collection problem. In Section 7.3, we present our location-aware medium access protocol in detail and discuss its implementation aspects. In Section 7.4, we present the protocol performance evaluation results, and we discuss the protocol's scope in Section 7.5. Finally, we conclude and discuss the future directions of our work in Section 7.6.
7.2 Problem Description

In this section, we describe the assumptions made and the performance metrics associated with the one-hop one-shot data collection problem.

The one-hop sensor network has $n$ contending sensor nodes, not including the sink (which does not contend for the channel), each with a single data packet to be transmitted. The locations of all the nodes, including that of the sink, are known. Time is divided into slots and each node transmits its packet only at the beginning of a time slot. The packet length is such that its transmission time is equal to one time slot. If more than one node transmits in the same time slot, it results in a collision. Otherwise, if a single node transmits in a time slot, it results in successful transmission of the packet. On successful transmission, the node is no longer in contention for the medium. The sink uses explicit acknowledgement (ACK) and negative acknowledgement (NACK) packets to indicate successful packet reception and collision, respectively, to the sensor nodes. The sink broadcasts the ACK/NACK packets as soon as the data packet(s) reception is completed. We assume that the ACK/NACK packet is accommodated by the time slot length such that it has negligible influence on the protocol performance.

In order to study the intrinsic performance advantages of the location-aware MAC protocol, we initially isolate the random errors due to noise and wireless channel non-idealities such as multi-path fading and shadowing. Later on in the chapter we discuss the implications of channel errors on the protocol performance.

We assume an energy model in which the sensor node is either in the receive state or the transmit state until its packet is successfully transmitted, after which it moves into the shutdown state. We assume that the energy consumed by a node in the receive state is equal to that consumed in the transmit state, and that the energy consumed in the shutdown state is negligible.
Also, for simplicity we assume that the energy consumed by a node in the receive/transmit state per time slot is equal to one energy unit.

We consider the following two performance metrics for the one-hop one-shot data collection problem:

1. Expected delay ($\Delta_{OSD}(N)$): The average number of time slots required for the sink to successfully receive packets from all contending sensor nodes in the one-hop network.

2. Expected energy consumption ($\Sigma_{OSD}(N)$): The amortized energy consumption per sensor node for the sink to successfully receive packets from all contending nodes. This metric is calculated by first evaluating the energy consumed by all nodes until all packets are successfully transmitted and then dividing that value by the number of nodes that participated in this operation.

7.3 Location-Aware MAC Protocol

Now, we describe our location-aware MAC protocol and illustrate its working through examples.

Figure 7.1: Example of the location-aware MAC protocol for $m = 4$. (a) The square space splitting. (b) The corresponding tree.

The main idea in our protocol is a tree-based splitting of space that adaptively reduces the collision domain of sensor nodes until each node is able to transmit its packet successfully. The protocol starts out by splitting the space in the radio range of the sink into $m$ equal partitions. Each partition is a separate collision domain. At each step, only nodes belonging to the current partition are allowed to transmit their packets. When a partition has more than one node, their transmissions lead to a collision. In the event of a collision the current partition is further split into $m$ equal partitions. This continues until the current partition has at most a single node in it.
The protocol moves on to the next partition after all nodes in the current partition have successfully transmitted their packets. This process of space splitting builds a tree with $m$ branches at each split, where each branch is a separate collision domain. The leaves of the tree are collision domains with at most a single sensor node in them, and therefore successful transmissions can take place only from the leaves of the tree.

We illustrate our location-aware protocol through an example shown in Figure 7.1. Here the space is a square whose half-diagonal is equal to the radio range of the sink (the sink is located at the center of the square). In this example the space is split into $m = 4$ equal square partitions at each level (see footnote 1). Figure 7.1(a) shows the square space splitting and Figure 7.1(b) shows the corresponding tree. The space contains 14 sensor nodes numbered 1 through 14. The numbers in the tree show the nodes involved in a collision at each branch.

Footnote 1: In this chapter we focus on symmetrical square space partitions that are multiples of 4. The value of $m$ in this case has to be a power of 4. Results for values of $m$ that split the space into other shapes (such as into rectangles when $m$ is 2, 8, 32, etc.) are not presented due to lack of space.

At time slot 1, the space is split into 4 equal squares and nodes 1, 2, and 3 transmit their packets, as all of them belong to partition 1 at level 1 (see footnote 2). Since this results in a collision, partition 1 of level 1 is further split into 4 equal partitions. Now, each partition has at most a single node. Therefore, node 1 successfully transmits its packet at time slot 2, node 2 at time slot 3 and node 3 at time slot 4. Time slot 5, allotted to partition 4 at level 2 of partition 1 at level 1, goes idle because it does not have any nodes in it. Similarly, nodes 4, 5, and 6 collide at time slot 6 and transmit successfully in time slots 7, 8, and 9, respectively. Time slot 10 goes idle as there are no nodes in partition 4 at level 2 of partition 2 at level 1.
Nodes 7, 8, and 9 collide at time slot 11 and, after time slot 12 goes idle, they successfully transmit their packets at time slots 13, 14, and 15, respectively. At time slot 16, nodes 10, 11, 12, 13 and 14, belonging to partition 4 at level 1, transmit their packets and collide, leading to further splitting of that partition into 4 partitions at level 2. Since partitions 1, 2, and 3 at level 2 have nodes 10, 11, and 12, respectively, a single node each, all of them transmit their packets successfully at time slots 17, 18, and 19, respectively. At time slot 20, nodes 13 and 14 transmit their packets and collide. This results in further splitting of partition 4 at level 2 of partition 4 at level 1. Due to the absence of nodes in partitions 1, 2, and 3 at level 3, time slots 21, 22, and 23 go idle. In time slot 24, nodes 13 and 14 transmit again, collide, and the partition is further split into 4 partitions at level 4. Due to the absence of nodes in partition 1 at level 4, time slot 25 goes idle. Finally, nodes 13 and 14 successfully transmit their packets in time slots 26 and 27, respectively.

Footnote 2: We follow the convention of counting partitions from left to right and bottom to top.

Thus, in the above example, the delay for the sink to receive packets from all the 14 sensor nodes is $D(n) = 27$ time slots. Also, since the space is split into 4 equal square partitions at each level, we call it a 4-split strategy. Similarly, Figure 7.2 illustrates the space splitting for the 16-split and 64-split strategies.

Figure 7.2: (a) 16-split strategy ($m = 16$), $D(n) = 31$. (b) 64-split strategy ($m = 64$), $D(n) = 99$.

7.3.1 Implementation Aspects

A key aspect in the implementation of our location-aware MAC protocol is the determination of nodes that belong to the current partition. This can be achieved by issuing location tokens that contain the boundaries of the current partition to the nodes in each time slot.
The location tokens are generated using the Location Token Generator (LTG), shown in Figure 7.3 for the $m$-split strategy, where $m$ is a power of 4. The LTG uses the current splitting level, the partition numbers of all the levels, the sink location and its radio range to determine the boundaries of the current partition. The equations show that the boundaries are calculated relative to the lower left corner of the square space.

LTG$(L, \{P(l) : 1 \le l \le L\}, (s_x, s_y), S)$:

$$x_1 = \left(s_x - \frac{S}{2}\right) + \sum_{l=1}^{L} \left[(P(l) - 1) \bmod \sqrt{m}\right] \cdot \frac{S}{(\sqrt{m})^{l}}; \qquad x_2 = x_1 + \frac{S}{(\sqrt{m})^{L}};$$

$$y_1 = \left(s_y - \frac{S}{2}\right) + \sum_{l=1}^{L} \left\lfloor \frac{P(l) - 1}{\sqrt{m}} \right\rfloor \cdot \frac{S}{(\sqrt{m})^{l}}; \qquad y_2 = y_1 + \frac{S}{(\sqrt{m})^{L}};$$

Return $(x_1, x_2, y_1, y_2)$;

• $L$: current level in the space splitting tree.
• $P(l)$: partition number at level $l$ for the current partition.
• $(s_x, s_y)$: location coordinates of the sink.
• $S$: side length of the square whose half-diagonal is equal to the radio range of the sink.
• $x_1$ is the left vertical boundary, $x_2$ is the right vertical boundary, $y_1$ is the lower horizontal boundary, and $y_2$ is the upper horizontal boundary.

Figure 7.3: Location Token Generator for the symmetrical square $m$-split strategy ($m$ is a power of 4).

The implementation of the protocol depends on where the location tokens are generated: at the sink or at the sensor nodes. In the former case, the sink has to run the LTG and transmit the location token to the sensor nodes. This can be achieved by piggy-backing the location tokens on the ACK/NACK packets. In the latter case, the sensor nodes themselves run the LTG and generate the location token at the beginning of each time slot. The advantage of the latter over the former is the small size of the ACK/NACK packets. This advantage is obtained at the cost of shifting the computational load of the LTG from the sink to the sensor nodes. Nevertheless, in either case, each sensor node decides if the location token belongs to it by verifying if its location falls within the boundaries specified by the location token (see footnote 3).
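The Location Token Generator of Figure 7.3 and the membership check just described can be sketched in Python as follows. This is an illustrative sketch, not code from the thesis: the names `location_token` and `token_is_mine` are hypothetical, and treating each partition as half-open (lower boundary included, upper boundary excluded) is an assumption, since the text does not say how nodes lying exactly on a boundary are handled.

```python
import math


def location_token(partitions, sink, side, m=4):
    """Boundaries (x1, x2, y1, y2) of the current partition.

    partitions[l - 1] is P(l), the 1-based partition number at level l,
    counted left to right and bottom to top.
    """
    root = math.isqrt(m)
    if root * root != m:
        raise ValueError("m must be a perfect square (a power of 4 in the text)")
    sx, sy = sink
    x1, y1 = sx - side / 2, sy - side / 2      # lower-left corner of the square
    for level, p in enumerate(partitions, start=1):
        cell = side / root ** level            # partition side length at this level
        x1 += ((p - 1) % root) * cell          # column index, left to right
        y1 += ((p - 1) // root) * cell         # row index, bottom to top
    cell = side / root ** len(partitions)
    return x1, x1 + cell, y1, y1 + cell


def token_is_mine(token, location):
    """True if a node at `location` lies inside the token's boundaries."""
    x1, x2, y1, y2 = token
    x, y = location
    return x1 <= x < x2 and y1 <= y < y2       # half-open: an assumption


token = location_token([4, 1], sink=(0.0, 0.0), side=4.0)
print(token)                              # (0.0, 1.0, 0.0, 1.0)
print(token_is_mine(token, (0.5, 0.5)))   # True
```

In the usage lines, partition 4 at level 1 is the upper-right quadrant of a 4 x 4 square centered on the sink, and partition 1 at level 2 is its lower-left quarter, which gives the unit square printed above.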
If the location token belongs to a node, it transmits its packet; otherwise, it ignores the token. Figure 7.4 shows the sink and sensor node state diagrams for the location-aware MAC protocol for the implementation in which the sensor nodes determine the location tokens by themselves.

Footnote 3: For the case of one-shot data querying, the sensor node has to decide simultaneously if it satisfies the query. If the node does not satisfy the query, the node does not transmit any packet.

Figure 7.4: State diagram at the sink and at the sensor node for the location-aware MAC protocol in which the location tokens are generated at the sensor nodes (pRx: packets received, L: split level, P(L): partition at split level L). (a) Sink. (b) Sensor node.

7.4 Evaluation

In this section we present results of the performance evaluation of our location-aware MAC protocol using simulations. We also present results of a comparative study with location-unaware MAC protocols. First we present the simulation model.

7.4.1 Simulation Model

We consider a square space of $S \times S$ sq. length units with $S = 15$ length units, populated by $n$ sensor nodes, with the sink located at the center of the square. We vary $n$ from 10 to 200 to capture both low and high node density scenarios. We consider three different sensor node distributions: grid-random, even-random, and uniform-random. In grid-random distribution, the space is divided into a grid of $S^2$ equal sized squares and nodes are placed such that each grid square is occupied by at most a single node.
In even-random distribution, the space is divided into $n$ equal sized partitions and each partition has at most one sensor node. The procedure to divide the space into $n$ equal partitions is described in Appendix A. In uniform-random deployment, each node is placed uniformly at random within the square space. We consider three symmetric square space splitting strategies: 4-split, 16-split, and 64-split. The simulation results are averaged over 1000 random trials with 100 different random seeds. In each random trial the locations of the sensor nodes are different.

7.4.2 Performance of Location-Aware MAC Protocol

Now we discuss the performance of our location-aware MAC protocol in terms of the delay and energy consumption metrics discussed in Section 7.2.

Figure 7.5(a) shows the results for grid-random deployment of nodes for the 4-split, 16-split, and 64-split strategies. The delay is in number of time slots and the energy consumption is in terms of the energy units described in Section 7.2. According to the figure, the 4-split and 16-split strategies perform much better than the 64-split strategy for both low and high node densities (see footnote 4). The figure also suggests that while for low densities 4-split performs better, for high densities 16-split provides lower delay and energy consumption. This is expected because, for high node density, the 4-split strategy is too conservative, leading to many partition levels and thus higher delay and energy consumption, whereas for low node density the 16-split strategy is too aggressive, leading to many idle time slots and again to higher delay and energy consumption.

Figure 7.5(b) plots the delay and energy consumption as a function of $n$ for the 16-split strategy for the three random location distributions.
As the figure shows, the performance of our location-aware protocol improves with increasing order in the location distribution of nodes, from uniform random to even random to grid random, suggesting that our protocol is inherently designed to take advantage of the nodes' location distribution. It should be noted in the figure that the delay and energy consumption remain constant with increasing number of sensor nodes for higher node densities. The reason for this is that, for grid-random deployment, with increasing number of sensor nodes, the node density becomes more uniform across all split partitions. The corresponding partition levels remain constant at high node densities irrespective of the actual number of nodes once a certain node density is crossed.

Footnote 4: For the rest of the evaluation we consider the performance of only 4-split and 16-split, as the delay and energy consumption due to 64-split is an order of magnitude higher.

Figure 7.5: (a) Expected delay and expected energy consumption per node due to the 4-split, 16-split and 64-split strategies for grid-random placement of nodes. (b) Expected delay and expected energy consumption per node due to the 16-split strategy for three different location distributions (uniform, even, grid).

Next, we present results of a comparative study between our location-aware protocol and three other location-unaware protocols.

7.4.3 Comparative Study

We consider three location-unaware protocols:

1.
HT-Split: We have chosen to compare the performance of location-aware MAC protocol tothatoftheHT-splitprotocoltoshowthatourlocation-awareMACprotocol,inaddition to taking advantage of collision domain separation like the HT-split protocol, also takes advantage of the nodes' location distribution, to provide lower delay and energy e±ciency. 2. Optimalp-persistentSlottedCSMA:Inthep-persistentslottedCSMAprotocol([21], [25]), each contending node senses the channel at the beginning of each time slot and if the channel is free it transmits its packet with probability p. If the channel is not free, the node attempts to transmit its packet in the next available free time slot with probability 124 10 20 30 40 50 60 70 80 90 100 0 5000 10000 15000 Number of Nodes (N) Delay Optimal csma HT−Split IEEE 802.15.4 10 20 30 40 50 60 70 80 90 100 0 2000 4000 6000 8000 10000 12000 Number of Nodes (N) Energy Optimal csma HT−Split IEEE 802.15.4 Figure 7.6: Expected delay and expected energy consumption for HT-split, optimal p-persistent slotted CSMA and IEEE 802.15.4 standard MAC. p. When the packet length is equal to that of a single time slot, this protocol is identical to p-persistent slotted Aloha. Even if the packet length is equal to multiple time slots, the packet-length-normalized delay and energy consumption will be identical to that of p-persistent slotted Aloha [25]. The p-persistent CSMA protocol is optimized for delay and energy consumption if the transmission probability is equal to the inverse of the current number of contending nodes i.e., p = 1 k where k is the current number of contending nodes [25]. We use this optimal p-persistent slotted CSMA protocol as a benchmark. 3. IEEE 802.15.4: In order to compare our location-aware MAC protocols' performance with a state-of-the-art MAC protocol for sensor networks, we chose the recently standard- ized IEEE 802.15.4 protocol for low-rate, low-power personal area networks described in Chapter 5. 
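The optimal p-persistent benchmark in item 2 can be illustrated with a short simulation. The sketch below is not the simulator used for the results in this chapter. It assumes single-slot packets (so the protocol reduces to p-persistent slotted Aloha), ideal slot synchronization, an error-free channel, and a hypothetical oracle that gives every node the current number k of contending nodes so that it can use p = 1/k. All function names are illustrative.

```python
import random


def success_probability(k: int) -> float:
    """Probability that exactly one of k nodes transmits when each uses p = 1/k."""
    if k == 1:
        return 1.0
    p = 1.0 / k
    return k * p * (1.0 - p) ** (k - 1)


def expected_slots(n: int) -> float:
    """Expected slots to collect one packet from each of n nodes.

    Each stage with k contenders is geometric with the success probability
    above, so the stage means 1/success_probability(k) simply add up.
    """
    return sum(1.0 / success_probability(k) for k in range(1, n + 1))


def simulate_slots(n: int, rng: random.Random) -> int:
    """Simulate one one-shot collection round and return the slots used."""
    remaining = n
    slots = 0
    while remaining > 0:
        p = 1.0 / remaining  # assumed oracle knowledge of the contender count
        transmitters = sum(1 for _ in range(remaining) if rng.random() < p)
        slots += 1
        if transmitters == 1:  # exactly one transmitter means a success
            remaining -= 1
    return slots


if __name__ == "__main__":
    rng = random.Random(1)
    trials = 2000
    n = 50
    mean = sum(simulate_slots(n, rng) for _ in range(trials)) / trials
    print(f"n={n}: simulated {mean:.1f} slots, analytic {expected_slots(n):.1f} slots")
```

Under these assumptions the simulated average should approach the analytic sum as the number of trials grows. Idle and collision slots are counted alike here, so the sketch reports delay only and says nothing about energy.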
Figure 7.6 shows the expected delay and energy consumption for the above three location-unaware protocols as a function of n. Clearly, for all node densities the IEEE 802.15.4 standard MAC protocol performs the worst. The reason for this is that, for a high number of nodes, due to multiple back-offs, most nodes quickly reach the highest back-off stage, which has the lowest probability of transmission. The advantage of the low probability of transmission is offset by the high number of nodes, leading to a higher probability of collision. Therefore, for the rest of the evaluation we concentrate only on the optimal p-persistent CSMA and HT-split location-unaware protocols. Thus, we compare the 4-split and 16-split strategies of our location-aware MAC protocol to the HT-split and optimal p-persistent CSMA location-unaware protocols.

[Figure 7.7: Comparison of location-aware (4-split, 16-split) and location-unaware (HT-split and optimal p-persistent slotted CSMA) medium access protocols as a function of n for the one-hop one-shot data collection problem. Panels: (a) grid-random deployment, (b) even-random deployment, (c) uniform-random deployment; each plots Delay and Energy against Number of Nodes (N).]
Figure 7.7 compares the performance of the four protocols in terms of expected delay and energy consumption as a function of the number of nodes (n) for the one-hop one-shot data collection problem. The delay is in number of time slots and the energy consumption is in terms of the energy units described in Section 7.2. The main observations can be summarized as follows:

• The 4-split location-aware MAC protocol performs better than (for grid and even random deployments) or equal to (for uniform random deployment) the optimal p-persistent CSMA protocol, both in terms of delay and energy consumption. This is significant because optimal p-persistent CSMA requires knowledge of the current number of contending nodes at each step, whereas even in the worst case (uniform random deployment) the 4-split location-aware MAC obtains the same optimal performance without any knowledge of the number of nodes. This performance gain can be observed for both low and high node densities.

• The 16-split location-aware MAC performs even better than the 4-split MAC for grid-random deployment at high node densities. For grid-random deployment the performance gains at high node densities are on the order of 60% for both delay and energy consumption for the location-aware MAC compared to the optimal location-unaware MAC.

• The performance gains for the location-aware MAC protocols are higher for higher node densities. This is because the advantage due to location distribution becomes more significant for higher node density.

7.5 Discussion

In this section, we discuss various issues inherent to our location-aware MAC protocol and elaborate on its scope.

1. Partition Shape: We have illustrated and evaluated the performance of the location-aware MAC protocol for the case in which the space is symmetrically partitioned into m equal squares at each level.
However, intuition suggests that the shape of the space splitting does not affect the performance of the location-aware MAC protocol as long as its tree structure remains the same. For the same number of nodes and the same node location distribution, if the space is considered to be circular and is split into m equal sectors at each level (as shown in Figure 7.8(a) for m = 4), then, on average, the delay and energy consumption of nodes would remain the same as for square splitting. This intuition is verified by the simulation results shown in Figure 7.8(b).

[Figure 7.8: (a) 4-Angle-Split strategy. (b) Comparison of expected delay and energy consumption for angular splitting and square splitting strategies for uniform-random deployment of nodes, plotted against Number of Nodes (N).]

2. Sink Location: The location of the sink is a crucial part of the implementation of our location-aware MAC protocol. However, for certain one-hop one-shot data collection applications such as localization [101], the location of the sink is not available. In fact, the application has to determine the location of the sink. This problem can be solved by first assuming an approximate location for the sink and then using the location-aware MAC protocol to obtain data packets from the sensor nodes. We propose to use transmission power control for this purpose. The main idea here is that the sink assumes the location of the nearest sensor node and uses this location to obtain packets from other contending nodes in its radio range.
The sink can obtain the location of the nearest sensor node by using power control, in which its transmission power is incremented in small steps, starting from the lowest power, until it is able to reach a sensor node and receive a packet from it. If more than one node exists in the lowest connected radio range of the sink, the nodes can contend for the channel using a random medium access scheme such as p-persistent slotted CSMA. The key observation here is that, for typical node densities, the number of nodes in the lowest connected radio range of the sink is very low compared to that in typical operational radio ranges. For example, Tmote-sky [5] devices have a radio range of above 100 m for the highest transmission power of 0 dBm. For the lowest transmission power of −25 dBm the radio range is less than 6 m. Thus, for a uniform node density, the number of nodes in the lowest-power connected radio range is at least two orders of magnitude lower than the number in the highest-power radio range.

For the case in which more than one node exists in the lowest connected radio range of the sink, the delay in determining its nearest sensor node is the delay for the first packet to reach the sink. This delay depends on the density of node distribution, the topology of the network and the reliability of the wireless channel. If the number of nodes in the lowest-power connected radio range is a and if the nodes use p-persistent slotted CSMA [25] with T back-off slots (since each node chooses uniformly at random to transmit, the probability of transmission at each time slot is $p = 1/T$), then the expected number of time slots for the first successful transmission is given by:

$$ap(1-p)^{a-1} + 2\left(1 - ap(1-p)^{a-1}\right)ap(1-p)^{a-1} + 3\left(1 - ap(1-p)^{a-1}\right)^{2}ap(1-p)^{a-1} + \cdots = \frac{1}{ap(1-p)^{a-1}} \tag{7.1}$$

Figure 7.9 shows the behavior of the above equation for different values of a and T.
As the figure shows, the value of T can be chosen such that the delay due to determination of the sink's location is very low, usually much lower than 10 time slots. A thorough investigation into the influence of error in sink location on the protocol performance is left open for future work.

[Figure 7.9: The expected number of time slots for the first successful transmission, plotted against the number of nodes for T = 5, T = 10 and T = 15.]

3. Channel Errors: In our location-aware MAC protocol, packet losses due to channel errors should be treated differently from those due to collisions, in order to maintain the performance gains over location-unaware MAC protocols. Packet loss due to channel error should not lead to further partitioning of the space. Space partitioning should happen only when more than one node transmits in the same slot, resulting in a collision. In order to achieve this we propose to use two different NACKs, NACK_COL and NACK_ERR, which identify a collision and an erroneous packet respectively. The sink sends the appropriate NACK packet based on the event that occurs. On receipt of NACK_ERR the location token generator (LTG) reuses the previous token without generating a new one, whereas receipt of NACK_COL prompts the LTG to partition the space and generate a new token. The loss of ACK/NACK packets due to packet errors is a relatively rare occurrence in one-hop networks given their small form factor. Their loss due to collisions is not a possibility, as the sink is the only transmitter of those packets in the one-hop scenario.

4. Capture Effect: In the presence of the capture effect, when two or more nodes transmit in the same time slot and the signal-to-noise ratio (SNR) of one packet exceeds that of the other packets by more than a threshold, that packet is received successfully by the sink. Therefore, in the presence of the capture effect our location-aware MAC protocol could cause starvation at some nodes.
Further research to efficiently solve this problem is left open for future work.

5. Hybrid: In the location-aware MAC protocol we have illustrated and evaluated, the space is split into m equal parts at each level. However, the value of m can be changed adaptively at each level, depending on the sensor node deployment density. For example, for a given sensor node density, the number of nodes in partitions of higher split levels is lower than that in lower split levels. This fact can be exploited to adaptively reduce the value of m for higher split levels, thus reducing the number of idle time slots and consequently the delay. Alternatively, since a lower number of nodes contend for the channel at higher split levels, random medium access techniques such as CSMA/CA could be used in conjunction with the location-aware MAC protocol to potentially reduce the delay and energy consumption.

7.6 Chapter Summary

In this chapter, we have presented a novel location-aware medium access protocol for the one-hop one-shot data collection problem in wireless sensor networks. The defining feature of this problem is that each sensor node in the one-hop radio range of the sink has a single packet to transmit. We illustrated the working of our protocol using examples and discussed its implementation aspects. The main idea in our location-aware protocol is a tree-based hierarchical partitioning of space to progressively reduce the collision domains of nodes until there are no collisions. Further, we presented results from a thorough performance evaluation of the location-aware protocol in comparison to three location-unaware MAC protocols (HT-split, optimal p-persistent slotted CSMA, and the IEEE 802.15.4 standard protocol) using simulations. We evaluated the protocol for three different location distributions of nodes: uniform-random, even-random, and grid-random.
Results showed that our location-aware MAC protocol in the worst case (uniform random distribution) provides as good a performance as an optimal location-unaware MAC protocol and in the best case (grid random distribution) provides 60% lower delay and energy consumption.

Chapter 8

Enhancement of IEEE 802.15.4 MAC Protocol

8.1 Introduction

IEEE 802.15.4 is an important standard for low-rate low-power wireless personal area networks that is in increasing commercial use for a diverse range of embedded wireless sensing and control applications. The standard provides specifications for both the physical layer and the medium access control (MAC) protocol. We characterize the performance of the IEEE 802.15.4 MAC for one-hop data collection in a star topology where there are multiple transmitters and a single receiver. Our primary focus is on settings where the number of transmitters is large. Because 802.15.4-enabled devices are meant to be low-cost and operate at relatively low rates, such dense deployments are of interest in many sensing applications involving these devices.

We model the IEEE 802.15.4 MAC as a p-persistent CSMA with changing transmission probability p. We derive the optimal transmission probabilities to maximize the throughput and minimize energy consumption in p-persistent CSMA. We show that, particularly for a large number of transmitters, the ratio of the expected idle time between successful receptions to the expected time between successful receptions is a constant for a given packet size when the transmission probabilities are optimal. Further, we find that when the transmission probability is lower (higher) than the optimal, the ratio is higher (lower) than this constant. This yields a distributed channel feedback-based control mechanism that changes the transmission probabilities of nodes dynamically towards the optimal. We develop an enhanced version of the IEEE 802.15.4 MAC protocol using this feedback scheme.
In our modeling and evaluation, we consider two extremes of the one-hop data collection spectrum in dense sensor networks: one-shot and continuous data collection. In one-shot data collection, each node sends only a single packet (this could be the response to a one-shot query) and once that packet is transmitted the node is no longer in contention for the channel. In continuous data collection, we assume that each node is backlogged, i.e., always has a packet to transmit. In both cases, we find that the IEEE 802.15.4 protocol performs poorly in dense settings, showing a steep reduction in throughput and increase in energy with network size. In contrast, the enhanced protocol that we propose is significantly more scalable, showing a relatively flat, slow-changing total system throughput and energy as the number of transmitters is increased. In this chapter we mainly focus on dense sensor networks in which at least 50 nodes contend for the channel in either scenario. We assume that the packet lengths are deterministic and constant.

The rest of this chapter is organized as follows. In Section 8.2, we present an overview of IEEE 802.15.4 and model it as a p-persistent CSMA with changing p. In Section 8.3, we present the modeling and optimization of p-persistent CSMA, and we characterize the performance of the IEEE 802.15.4 MAC in Section 8.4. In Section 8.5, we present a channel feedback-based medium access control technique and adapt it to present the enhanced IEEE 802.15.4 MAC. In the same section we discuss directions of our future work. We conclude in Section 8.6.

8.2 IEEE 802.15.4

In this section we present a p-persistent CSMA MAC model for the IEEE 802.15.4 MAC protocol in which the probability of transmission p changes with each collision.

8.2.1 Model

Now, we model the IEEE 802.15.4 MAC as a p-persistent CSMA MAC with changing p. Before we present the MAC model, we describe the assumptions made and the energy model used.

• Assumptions: Let the number of sensor nodes in the radio range of the sink be N.
All sensor nodes are synchronized to a global time, which is divided into slots of equal length, and each node transmits at the beginning of a time slot. Let the packet length be L time slots. A sensor node is informed of its packet's successful transmission through acknowledgement packets (ACKs) from the sink. Failure to receive an ACK from the sink implies a collision. The ACK is sent by the sink as soon as the packet reception is completed. Table 8.1 summarizes the notation used.

• Energy Model: According to the IEEE 802.15.4 standard, a node can exist in any one of the following four states: Shutdown, Idle, Transmit, Receive. For continuous data (CD), we assume that the nodes are either in the Transmit or the Receive state and are not concerned with the Shutdown or Idle states. For one-shot data (OSD), again each node is either in the Transmit or the Receive state until its packet is transmitted, after which the node moves to the Shutdown state permanently. Let the power consumed in the Transmit state be $\xi_T$ and the power consumed in the Receive state be $\xi_R$. According to [4], for the CC2420 IEEE 802.15.4-compliant radio, $\xi_R = 35$ mW and $\xi_T = 31$ mW for the highest transmission power. The power consumed in the Shutdown state is negligible.

• MAC Model: In [81] the authors model the IEEE 802.15.4 MAC in the contention access period (CAP) as a non-persistent CSMA with back-off. They approximate the three original uniform-random back-off windows by geometrically distributed back-off windows with parameters $p_1$, $p_2$ and $p_3$ such that $p_i = \frac{2}{BO_i + 1}$ $(1 \le i \le 3)$, where $BO_i$ is the original uniform-random back-off window size. With $BO_1 = 8$, $BO_2 = 16$ and $BO_3 = 32$, the respective values of $p_1$, $p_2$ and $p_3$ are $\frac{1}{4.5}$, $\frac{1}{8.5}$ and $\frac{1}{16.5}$.

In this chapter we further simplify this model to a p-persistent CSMA in which the probability of transmission changes from $p_1$ to $p_2$ to $p_3$ with each collision and remains constant at $p_3$ after two collisions.
The key difference between our model and the non-persistent CSMA model is that in our case the transmission probability changes with a packet collision instead of a busy carrier sense. Thus in our model, a node starts out with an initial transmission probability of $p_1$. The node senses the channel at the beginning of each time slot and, if the channel is found to be free for two consecutive time slots, it transmits its packet with probability $p_1$. If the channel is busy, the node tries to transmit the packet with the same probability the next time it finds two consecutive free time slots. If more than one node transmits in the same time slot, a collision results, and a node involved in a collision for the first time changes its transmission probability to $p_2$. On a second collision its transmission probability is changed to $p_3$, and it remains constant beyond the second collision.

We evaluate the accuracy of our model using simulations. The results are averaged over 1000 random trials with 100 different random seeds. For the CD scenario, we simulated the protocol for 10000 time slots for a packet length of 50 bytes (or 5 time slots).

Figure 8.1 plots the simulation results comparing the IEEE 802.15.4 MAC and our p-persistent CSMA model and shows that our model is reasonably accurate. Next, we determine the optimal performance of a generic p-persistent CSMA MAC with a similar time slot structure and characterize the performance of the IEEE 802.15.4 MAC in comparison to that.
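The simplified MAC model of this section can be sketched in a few lines. This is an illustration of the stated model only: the back-off windows 8, 16 and 32 and the mapping $p_i = 2/(BO_i + 1)$ come from the text, while the class and function names are hypothetical, and the two-slot clear channel check is represented by a caller-supplied flag rather than simulated.

```python
import random
from dataclasses import dataclass, field

BACKOFF_WINDOWS = (8, 16, 32)  # BO_1, BO_2, BO_3 as quoted in the text


def stage_probabilities(windows=BACKOFF_WINDOWS):
    """Geometric approximation p_i = 2 / (BO_i + 1) for each back-off stage."""
    return tuple(2.0 / (bo + 1) for bo in windows)


@dataclass
class ModelNode:
    """Hypothetical node following the p-persistent model with changing p."""

    probs: tuple = field(default_factory=stage_probabilities)
    collisions: int = 0

    @property
    def p(self) -> float:
        # p_1 before any collision, p_2 after one, p_3 after two or more.
        stage = min(self.collisions, len(self.probs) - 1)
        return self.probs[stage]

    def wants_to_transmit(self, channel_free_two_slots: bool,
                          rng: random.Random) -> bool:
        # Transmit with probability p only after two consecutive free slots.
        return channel_free_two_slots and rng.random() < self.p

    def on_collision(self) -> None:
        # A missing ACK is interpreted as a collision in this model.
        self.collisions += 1


if __name__ == "__main__":
    node = ModelNode()
    for _ in range(4):
        print(f"collisions={node.collisions}, p={node.p:.4f}")
        node.on_collision()
```

What happens to the collision counter after a successful transmission in the continuous data scenario is not specified in the model description above, so the sketch deliberately leaves that out.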
Table 8.1: Notation

Symbol                   Description
$N$                      Number of nodes in the network
$p$                      Transmission probability
$L$                      Length of packet in time slots
$\delta$                 Time slot length (320 μs)
$\xi_R$                  Power consumption in Receive state
$\xi_T$                  Power consumption in Transmit state
$n$                      Number of nodes in an epoch
$T_n$                    Delay in an epoch with n nodes
$E_n$                    Energy consumption in an epoch with n nodes
$\Phi_{CD}(N)$           Throughput in bps in CD
$\Sigma_{CD}(N)$         Energy consumption per node per successful packet transmission in CD
$\Delta_{OSD}(N)$        Delay in seconds to obtain packets from N nodes in OSD
$\Sigma_{OSD}(N)$        Energy consumption in Joules to obtain packets from N nodes in OSD
$p^{T}_{opt}(n, L)$      Transmission probability that minimizes epoch delay
$p^{E}_{opt}(n, L)$      Transmission probability that minimizes epoch energy consumption

8.3 p-Persistent CSMA MAC

In this section we model and analyze a generic p-persistent CSMA MAC and determine the transmission probabilities that optimize its performance.

8.3.1 Overview

In slotted p-persistent CSMA ([21]), each node senses the channel at the beginning of each time slot and, if the channel is found to be free of any transmissions, it transmits its packet with probability p. If the channel is not free, the node attempts to transmit its packet in the next free time slot. If more than one node transmits in the same time slot, a collision results.

Traditionally, system dynamics due to the p-persistent CSMA protocol have been modeled using renewal theory (for example, [58], [64], [25], [27]). The key assumption that makes the use of renewal or regenerative models feasible is that the system attains stationarity and that the models capture the system behavior in that state. While this assumption is still true for the CD scenario, it is not true for the one-shot data scenario. Nevertheless, we observe the system at every successful packet transmission, as in [64] and [27], for both scenarios and derive expressions for throughput, delay and energy consumption.

[Figure 8.1: The IEEE 802.15.4 standard is modeled as a p-persistent CSMA with probability of transmission reducing in three steps ($p_1 = \frac{1}{4.5}$, $p_2 = \frac{1}{8.5}$, $p_3 = \frac{1}{16.5}$) with each new collision. Panels for L = 5, comparing p-persistent CSMA with IEEE 802.15.4 against Number of Nodes (N): (a) Continuous Data, showing $\Phi_{CD}(N)$ in Kbps and $\Sigma_{CD}(N)$ in mJoules; (b) One-shot Data, showing $\Delta_{OSD}(N)$ in seconds and $\Sigma_{OSD}(N)$ in mJoules.]

8.3.2 Model

We observe the system at every successful packet transmission. The time interval between two consecutive successful transmissions is defined as an epoch. An epoch is made up of idle time, in which the channel is free of any transmissions, collision time, in which more than one node is transmitting, and a single successful transmission time, which marks the end of the epoch, as illustrated in Figure 8.2. It is important to note that, for CD, the number of nodes remains constant in all epochs. However, for OSD the number of nodes decreases by one with each passing epoch.

[Figure 8.2: An epoch illustrating the time interval between consecutive successful transmissions, composed of idle slots, collision slots and a final successful slot.]

Let $T_n$ be the epoch delay (the time interval between two consecutive successful packet transmissions) in seconds and $E_n$ be the energy consumption (the total energy consumed by all contending nodes) in Joules, for the epoch with n contending nodes. Then

$$\Phi_{CD}(N) = \frac{1}{E[T_N]} \cdot (80L) \ \text{bps} \tag{8.1}$$
$$\Sigma_{CD}(N) = \frac{E[E_N]}{N} \ \text{Joules} \tag{8.2}$$

$$\Delta_{OSD}(N) = \sum_{n=1}^{N} E[T_n] \ \text{seconds} \tag{8.3}$$

$$\Sigma_{OSD}(N) = \frac{1}{N} \sum_{n=1}^{N} E[E_n] \ \text{Joules} \tag{8.4}$$

where 80L in Equation 8.1 is the packet length in bits. Clearly, the above metrics are optimized when $E[T_n]$ and $E[E_n]$ are minimized. First we determine expressions for $E[T_n]$ and $E[E_n]$.

Proposition 7. For a constant packet length L, the expected epoch delay for n contending nodes is given by

$$E[T_n] = \frac{L - (L-1)(1-p)^{n}}{np(1-p)^{n-1}} \cdot \delta \tag{8.5}$$

Proof. As illustrated in Figure 8.2, the delay in an epoch is due to idle time, collision time and successful transmission time. Therefore, the expected delay in epoch n is given by

$$E[T_n] = E[T_{Idle,n}] + E[T_{Collision,n}] + E[T_{Success}] \tag{8.6}$$

where $E[T_{Idle,n}]$ is the expected number of idle time slots, $E[T_{Collision,n}]$ is the expected number of collision time slots and $E[T_{Success}]$ is the expected number of time slots of successful transmission. Since the packet length L is a constant, $E[T_{Success}]$ is equal to $L\delta$ and independent of n. If $E[N_{coll,n}]$ is the expected number of collisions in an epoch with n nodes, then

$$E[T_{Idle,n}] = (E[N_{coll,n}] + 1) \cdot E[T_{IdlePeriod,n}] \tag{8.7}$$

$$E[T_{Collision,n}] = E[N_{coll,n}] \cdot E[T_{CollisionPeriod,n}] \tag{8.8}$$

where $E[T_{IdlePeriod,n}]$ is the expected number of idle time slots between two consecutive packet transmissions (collision or successful) and $E[T_{CollisionPeriod,n}]$ is the expected number of collision time slots at each collision. Owing to the constant probability of transmission p within an epoch, the IdlePeriods between any two consecutive packet transmissions are i.i.d. random variables with the same mean value. Also, since the decision to transmit in a time slot after a free channel sense is independent of the number of previous free channel senses, the number of collisions is independent of the length of the IdlePeriods. This holds true for CollisionPeriods also, thus justifying the above two equations.
$E[N_{coll,n}]$ and $E[T_{IdlePeriod,n}]$ are given by [27]:

$$E[N_{coll,n}] = \frac{1 - (1-p)^{n}}{np(1-p)^{n-1}} - 1 \tag{8.9}$$

$$E[T_{IdlePeriod,n}] = \frac{(1-p)^{n}}{1 - (1-p)^{n}} \cdot \delta \tag{8.10}$$

We use the above two equations to derive the expected delay in the epoch n. Since the packet length is constant, $E[T_{CollisionPeriod,n}] = L\delta$. Therefore,

$$E[T_{Idle,n}] = \frac{1-p}{np} \cdot \delta \tag{8.11}$$

$$E[T_{Collision,n}] = \frac{L\delta\left(1 - (1-p)^{n} - np(1-p)^{n-1}\right)}{np(1-p)^{n-1}} \tag{8.12}$$

Substituting the above equations in Equation 8.6, we get Equation 8.5.

Proposition 8. For a constant packet length of L, the expected epoch energy consumption for n contending nodes is given by

$$E[E_n] = \xi_R \delta \cdot \frac{L - (L-1)(1-p)^{n-1}}{p(1-p)^{n-2}} + \xi_T \delta \cdot \frac{L}{(1-p)^{n-1}} \tag{8.13}$$

Proof. Similar to Equation 8.6, the energy consumption in the epoch n is equal to the sum of the energy consumption in idle time, the energy consumption in collision time and the energy consumption in a successful transmission.

$$E[E_n] = E[E_{Idle,n}] + E[E_{Collision,n}] + E[E_{Success}] \tag{8.14}$$

Using equations from Proposition 7, $E[E_{Idle,n}]$ can be calculated as

$$E[E_{Idle,n}] = (E[N_{coll,n}] + 1) \cdot n\xi_R \cdot E[T_{IdlePeriod,n}] \tag{8.15}$$

$$\phantom{E[E_{Idle,n}]} = n\xi_R \delta \cdot \frac{1-p}{np} = \xi_R \delta \cdot \frac{1-p}{p} \tag{8.16}$$

Surprisingly, for a constant p, the idle time energy consumption is independent of the number of contending nodes in an epoch, and depends only on p. Similarly, the collision time energy consumption is given by

$$E[E_{Collision,n}] = E[N_{coll,n}] \cdot E[E_{CollisionPeriod,n}] \tag{8.17}$$

The expected energy consumption in a CollisionPeriod, $E[E_{CollisionPeriod,n}]$, is equal to the sum of the expected energy consumption by nodes involved in packet transmissions and the expected energy consumption by nodes in idle state during the CollisionPeriod.
Therefore,

$$E[E_{CollisionPeriod,n}] = L\xi_T\delta \sum_{i=2}^{n} i \, P\{Trans. = i \mid Collision\} + L\xi_R\delta \sum_{i=2}^{n} (n-i) \, P\{Trans. = i \mid Collision\} \tag{8.18}$$

where $P\{Trans. = i \mid Collision\}$ is the probability that i ($\ge 2$) nodes transmit their packets given that a collision has occurred, and it is given by

$$P\{Trans. = i \mid Collision\} = P\{Trans. = i \mid Trans. \ge 2\} = \frac{\binom{n}{i} p^{i} (1-p)^{n-i}}{1 - (1-p)^{n} - np(1-p)^{n-1}} \tag{8.19}$$

Substituting the above equation in Equation 8.18, we get

$$E[E_{CollisionPeriod,n}] = \frac{L(\xi_T - \xi_R)\delta \cdot np\left(1 - (1-p)^{n-1}\right)}{1 - (1-p)^{n} - np(1-p)^{n-1}} + nL\xi_R\delta \tag{8.20}$$

Substituting the above equation in Equation 8.17, we get

$$E[E_{Collision,n}] = \frac{L(\xi_T - \xi_R)\delta \cdot \left(1 - (1-p)^{n-1}\right)}{(1-p)^{n-1}} + \frac{L\xi_R\delta \cdot \left(1 - (1-p)^{n} - np(1-p)^{n-1}\right)}{p(1-p)^{n-1}} \tag{8.21}$$

And finally, the expected energy consumption during a successful transmission is given by

$$E[E_{Success}] = \xi_T \cdot L\delta + \xi_R \cdot L\delta \cdot (n-1) \tag{8.22}$$

Substituting the above equations in Equation 8.14 we get Equation 8.13.

8.3.3 Optimality

Let $p^{T}_{opt}(n, L)$ and $p^{E}_{opt}(n, L)$ respectively be the transmission probabilities at which $E[T_n]$ and $E[E_n]$ are minimized.

Proposition 9. For n > 1, the transmission probability that minimizes the expected epoch delay $E[T_n]$ is given by

$$p^{T}_{opt}(n, L) = \frac{1}{n}, \quad L = 1 \tag{8.23}$$

$$p^{T}_{opt}(n, L) \approx \frac{\sqrt{n^{2} + 2n(n-1)(L-1)} - n}{n(n-1)(L-1)}, \quad L > 1 \tag{8.24}$$

Proof. The value of p that minimizes $E[T_n]$ is obtained by equating its first derivative with respect to p to zero.

$$\frac{dE[T_n]}{dp} = 0 \tag{8.25}$$

For L = 1,

$$E[T_n] = \frac{\delta}{np(1-p)^{n-1}} \tag{8.26}$$

Taking the derivative and equating it to zero results in $p = \frac{1}{n}$. Similarly, for L > 1, equating the derivative of $E[T_n]$ from Equation 8.5 to zero yields the following equation.

$$(1-p)^{n} = \frac{L}{L-1} \cdot (1 - np) \tag{8.27}$$

For np < 1, $(1-p)^{n}$ can be approximated by the first three terms of its binomial expansion, $1 - np + \frac{n(n-1)}{2}p^{2}$. Using this approximation and further simplification, Equation 8.27 reduces to a quadratic equation in p whose unique positive root is Equation 8.24. It can be verified that $\frac{d^{2}E[T_n]}{dp^{2}} > 0$ for $p = p^{T}_{opt}(n, L)$, thus minimizing $E[T_n]$.

Proposition 10.
For n > 1 and $\gamma = \frac{\xi_T}{\xi_R}$, the transmission probability that minimizes the expected epoch energy consumption $E[E_n]$ is given by

$$p^{E}_{opt}(n, L) \approx \frac{\sqrt{n^{2} + 2n(n-1)(L-1) + 4L(n-1)(\gamma-1)} - n}{n(n-1)(L-1) + 2L(n-1)(\gamma-1)} \tag{8.28}$$

Proof. Similar to the previous proposition, equating $\frac{dE[E_n]}{dp}$ to zero yields

$$(1-p)^{n} = \frac{L}{L-1} \cdot \left(1 - np - p^{2}(n-1)(\gamma-1)\right) \tag{8.29}$$

The same approximation as in the previous proposition and further simplification of the above equation results in Equation 8.28. It can be verified, as in the previous proposition, that the second derivative of $E[E_n]$ with respect to p is positive for $p = p^{E}_{opt}(n, L)$, thus minimizing $E[E_n]$.

Numerical calculations show that the approximations are very close to the actual values. For n = 1, the optimum transmission probability is equal to 1; i.e., when there is a single sensor node left, delay and energy are minimized when it transmits its packet with probability 1.

Figure 8.3 plots $p^{T}_{opt}(n, L)$ and $p^{E}_{opt}(n, L)$ as a function of the number of contending nodes from n = 100 to n = 2 for different values of $\gamma$. As the figure shows, for optimal performance the probability of transmission should increase with decreasing number of nodes in an epoch in order to avoid excessive idle time slots. We can also see that the transmission probabilities are higher for lower values of $\gamma$. This is because if the node spends more energy in the receive state than in the transmit state, energy is saved if it transmits more than it receives.

[Figure 8.3: The optimal probability of transmission for L = 5, plotted against the number of nodes in an epoch (n): $p^{T}_{opt}(n, L)$, and $p^{E}_{opt}(n, L)$ for $\gamma$ = 2, 4, 0.5 and 0.25.]

Corollary 3. If $\xi_T = \xi_R$, then $p^{T}_{opt}(n, L) = p^{E}_{opt}(n, L)$, i.e., the delay and energy consumption are jointly optimized with a single probability of transmission for $\xi_T = \xi_R$.

Proof. For $\gamma = 1$, Equations 8.24 and 8.28 are equal, which proves the corollary.
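Propositions 7 to 10 and Corollary 3 can be checked numerically. The sketch below codes Equations 8.5, 8.13, 8.23, 8.24 and 8.28 directly and compares the closed-form approximations against a brute-force grid search. The function names, the grid bounds and resolution, and the restriction of `p_opt_energy` to L > 1 are assumptions made for this illustration; the default parameter values ($\delta$ = 320 μs, $\xi_R$ = 35 mW, $\xi_T$ = 31 mW) are the ones quoted in the text.

```python
import math

DELTA = 320e-6  # time slot length in seconds
XI_R = 0.035    # receive power in watts
XI_T = 0.031    # transmit power in watts


def epoch_delay(n, L, p, delta=DELTA):
    """Equation 8.5: expected epoch delay in seconds."""
    return (L - (L - 1) * (1 - p) ** n) / (n * p * (1 - p) ** (n - 1)) * delta


def epoch_energy(n, L, p, xi_r=XI_R, xi_t=XI_T, delta=DELTA):
    """Equation 8.13: expected epoch energy in Joules (intended for n >= 2)."""
    receive = xi_r * delta * (L - (L - 1) * (1 - p) ** (n - 1)) / (p * (1 - p) ** (n - 2))
    transmit = xi_t * delta * L / (1 - p) ** (n - 1)
    return receive + transmit


def p_opt_delay(n, L):
    """Equations 8.23 (L = 1) and 8.24 (L > 1)."""
    if n < 2:
        raise ValueError("the closed form is stated for n > 1")
    if L == 1:
        return 1.0 / n
    c = n * (n - 1) * (L - 1)
    return (math.sqrt(n * n + 2 * c) - n) / c


def p_opt_energy(n, L, gamma):
    """Equation 8.28, with gamma = xi_T / xi_R (assumed here to need L > 1)."""
    if n < 2 or L <= 1:
        raise ValueError("this sketch only covers n > 1 and L > 1")
    c = n * (n - 1) * (L - 1)
    g = 2 * L * (n - 1) * (gamma - 1)
    return (math.sqrt(n * n + 2 * c + 2 * g) - n) / (c + g)


def grid_argmin(f, lo=1e-4, hi=0.2, steps=20000):
    """Brute-force minimizer of f over an evenly spaced grid on [lo, hi]."""
    best_p, best_v = lo, f(lo)
    for i in range(1, steps + 1):
        p = lo + (hi - lo) * i / steps
        v = f(p)
        if v < best_v:
            best_p, best_v = p, v
    return best_p, best_v


if __name__ == "__main__":
    n, L = 50, 5
    gamma = XI_T / XI_R
    p_num, _ = grid_argmin(lambda p: epoch_delay(n, L, p))
    print("delay : closed form", p_opt_delay(n, L), "grid search", p_num)
    p_num, _ = grid_argmin(lambda p: epoch_energy(n, L, p))
    print("energy: closed form", p_opt_energy(n, L, gamma), "grid search", p_num)
```

Because both cost curves are flat near their minima, a small offset between the approximate and the grid-search probability translates into a much smaller difference in delay or energy, which is one way to read the remark that the approximations are very close to the actual values.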
8.3.4 Optimality Criteria

Now, we discuss some interesting optimality criteria for the epoch delay and energy consumption.

• Proposition 11. Let $\Gamma(L) = \frac{(L-1)^{2}}{L - \sqrt{2L-1}}$. If L > 1 and n is large such that $\frac{n-1}{n} \approx 1$, then for the optimal transmission probability the average epoch delay, expressed in time slots, is a constant equal to $\Gamma(L)$.

$$p = p^{T}_{opt}(n, L) \Rightarrow E[T_n] \approx \Gamma(L) \tag{8.30}$$

Proof. For the optimal transmission probability, substituting Equation 8.27 into Equation 8.5 we get

$$E[T_n] = \frac{(L-1)\left(1 - p^{T}_{opt}(n, L)\right)}{1 - n\,p^{T}_{opt}(n, L)} \tag{8.31}$$

For n large such that $\frac{n-1}{n} \approx 1$, from Equation 8.24

$$p^{T}_{opt}(n, L) \approx 0 \tag{8.32}$$

$$n\,p^{T}_{opt}(n, L) \approx \frac{\sqrt{2L-1} - 1}{L-1} \tag{8.33}$$

Substituting the above equations into Equation 8.31 proves the proposition.

Corollary 4. For the optimal transmission probability and for a large number of nodes such that $\frac{n-1}{n} \approx 1$, the throughput of the p-persistent CSMA MAC protocol is a constant independent of n and depends only on the length of the packet.

$$\text{Optimal Throughput} \approx \frac{L - \sqrt{2L-1}}{(L-1)^{2}} \ \text{packets/time slot} \tag{8.34}$$

Proof. Throughput is calculated as the inverse of the epoch delay. Equation 8.34 is a direct result of Proposition 11.

• Proposition 12. Let $\Gamma_R(L) = \frac{L - \sqrt{2L-1}}{(L-1)(\sqrt{2L-1} - 1)}$. If L > 1 and n is large such that $\frac{n-1}{n} \approx 1$, then for the optimal transmission probability the ratio of the average idle time in an epoch to the average epoch delay is a constant equal to $\Gamma_R(L)$. Also, if the transmission probability is greater than optimal then the ratio is lower than $\Gamma_R(L)$, and vice versa.

$$p = p^{T}_{opt}(n, L) \Rightarrow \frac{E[T_{Idle,n}]}{E[T_n]} \approx \Gamma_R(L) \tag{8.35}$$

$$p \gtrless p^{T}_{opt}(n, L) \Rightarrow \frac{E[T_{Idle,n}]}{E[T_n]} \lessgtr \Gamma_R(L) \tag{8.36}$$

Proof. Using Equations 8.9, 8.5 and 8.27 for optimal p,

$$\frac{E[T_{Idle,n}]}{E[T_n]} = \frac{1}{L-1}\left(\frac{1}{n\,p^{T}_{opt}(n, L)} - 1\right) \tag{8.37}$$

For $\frac{n-1}{n} \approx 1$, using Equation 8.24

$$n\,p^{T}_{opt}(n, L) \approx \frac{\sqrt{2L-1} - 1}{L-1} \tag{8.38}$$

Substituting the above equation into the previous equation, the first part of the proposition is proved. Similarly, for $p \gtrless p^{T}_{opt}(n, L)$

$$np \gtrless \frac{\sqrt{2L-1} - 1}{L-1} \tag{8.39}$$

$$\Rightarrow \frac{E[T_{Idle,n}]}{E[T_n]} \lessgtr \Gamma_R(L) \tag{8.40}$$

Hence the proposition is proved.

Figure 8.4 illustrates Proposition 12 for L = 5.
The ratio only approximates $\Gamma_R(L)$, primarily because of the approximation in Equation 8.24. As the figure shows, for low values of $n$ the ratio deviates from $\Gamma_R(L)$.

• Figure 8.5 plots the expected delay and energy consumption for an epoch with $n = 50$ nodes as a function of the transmission probability $p$ for different values of the packet length $L$. The figure can be explained through the following question: In p-persistent CSMA, if the length of the packet is increased from $L$ to $L + l$ ($l > 0$), should the value of transmission probability $p$ be increased or decreased to maintain the delay and energy consumption constant?

[Figure 8.4: Ratio of expected idle time to expected epoch delay. $E[T_{Idle,n}]/E[T_n]$ versus number of nodes in an epoch (n) for L = 5; curves: $p^T_{opt}(n, L)$, $p^T_{opt}(n, L) - 0.003$ and $p^T_{opt}(n, L) + 0.003$.]

[Figure 8.5: Expected delay and energy consumption in an epoch with n nodes as a function of transmission probability, p, for different values of packet length L. Panels: $E[T_n]$ (msecs) versus p and $E[E_n]$ (mJoules) versus p, for n = 50, $\xi_R$ = 35 mW, $\xi_T$ = 31 mW and L = 1 to 5.]

Figure 8.5 shows us that the answer to the above question depends on the value of $p$. If $p < p^T_{opt}(n, L)$, then for the same delay $p$ should be increased, and if $p > p^T_{opt}(n, L)$ then $p$ should be decreased. The same answer holds true for energy if $p^T_{opt}(n, L)$ is replaced by $p^E_{opt}(n, L)$. The figure also shows that the optimal transmission probability values $p^T_{opt}(n, L)$ and $p^E_{opt}(n, L)$ decrease with increasing $L$.

• Figure 8.6 plots ratios of consecutive epoch delays and energy consumptions as functions of $n$. In this figure, a ratio greater than 1 implies that the delay or energy value increases with decreasing $n$, and vice versa. The greater the difference from 1, the higher the rate of increase or decrease.
The following observations can be made from the figure:

- For $p = p^T_{opt}(n, L)$, $E[T_n]$ is almost constant over all $n$. For $p > p^T_{opt}(n, L)$, $E[T_n]$ shoots up for higher values of $n$ due to a higher number of collisions. For $p < p^T_{opt}(n, L)$, $E[T_n]$ shoots up for lower values of $n$ due to a higher number of idle time slots.

- For $p = p^E_{opt}(n, L)$, $E[E_n]$ increases monotonically with increasing $n$. For $p > p^E_{opt}(n, L)$, $E[E_n]$ shoots up for higher values of $n$ due to a higher number of collisions. For $p < p^E_{opt}(n, L)$, $E[E_n]$ is higher than the optimal energy consumption values for lower values of $n$ due to a higher number of idle time slots.

[Figure 8.6: Ratio of expected delays and energy consumptions for consecutive epochs. Panels: $E[T_{n-1}]/E[T_n]$ versus n for $p = p^T_{opt}(n, L)$, p = 0.35 and p = 0.002; $E[E_{n-1}]/E[E_n]$ versus n for $p = p^E_{opt}(n, L)$, p = 0.35 and p = 0.002; L = 5, $\xi_R$ = 35 mW, $\xi_T$ = 31 mW.]

For CD the implication of this criterion is that the delay between two successful packet transmissions is independent of the number of nodes in the network as long as the nodes are transmitting at optimal transmission probabilities. For OSD, the implication is given by the following proposition.

Proposition 13. For OSD, if $n$ is large such that $\frac{n-1}{n} \approx 1$, then the transmission probability is optimal if and only if the expected delays of two consecutive epochs are equal.

$$p = p^T_{opt}(n, L) \Leftrightarrow E[T_{n-1}] = E[T_n] \tag{8.41}$$

Proof. (i) To prove that

$$p = p^T_{opt}(n, L) \Rightarrow E[T_n] = E[T_{n-1}] \tag{8.42}$$

From Equation 8.27, $p$ is optimal when

$$(1-p)^n = \frac{L}{L-1} \cdot (1 - np) \tag{8.43}$$

For the optimal value of $p$,

$$\frac{E[T_n]}{E[T_{n-1}]} = \frac{\left(1 - p^T_{opt}(n, L)\right)\left(1 - (n-1)\, p^T_{opt}(n-1, L)\right)}{\left(1 - p^T_{opt}(n-1, L)\right)\left(1 - n\, p^T_{opt}(n, L)\right)} \approx 1 \quad \text{for } \frac{n-1}{n} \approx 1 \tag{8.44}$$

We have used Equation 8.24 in the above simplification.
(ii) To prove that

$$E[T_n] = E[T_{n-1}] \Rightarrow p = p^T_{opt}(n, L) \tag{8.45}$$

Using Equation 8.5, $E[T_n] = E[T_{n-1}]$ implies

$$\frac{L - (L-1)(1-p)^n}{np(1-p)^{n-1}} = \frac{L - (L-1)(1-p)^{n-1}}{(n-1)p(1-p)^{n-2}} \tag{8.46}$$

Algebraic manipulations reduce the above equation to

$$(1-p)^n = \frac{L}{L-1} \cdot (1 - np) \tag{8.47}$$

which is the same as Equation 8.27, which implies $p = p^T_{opt}(n, L)$. Hence proved.

8.4 Characterization of IEEE 802.15.4

Having determined the performance of optimal p-persistent CSMA, we characterize the performance of the IEEE 802.15.4 MAC in this section.

Figure 8.7 plots the average transmission probabilities for the IEEE 802.15.4 MAC (obtained using the p-persistent CSMA model) in comparison to the transmission probabilities for optimal p-persistent CSMA for both CD and OSD. The transmission probabilities shown for IEEE 802.15.4 are obtained using the default values specified in the standard, including the two required sensing slots, which are not required for the generic p-persistent CSMA MAC.

For OSD, the transmission probability for IEEE 802.15.4 quickly stabilizes at $\frac{1}{16.5} = 0.0606$, and for CD it stays close to that value. This behavior is in contrast to the trend shown by the optimal probabilities. This implies that the back-off mechanism of the IEEE 802.15.4 protocol can be modified for enhanced performance as follows:

• The change of back-off window sizes should happen at successful transmissions instead of at collisions or busy channel senses. Further, for OSD, successful packet transmissions are a better indicator of future congestion than collisions or busy channel senses.

• For CD, the average transmission probability for the IEEE 802.15.4 MAC remains almost constant irrespective of the number of contending nodes, while for optimal p-persistent CSMA it reduces with N. For optimal performance the window sizes should be reflective of the number of contending nodes.

• For OSD, the back-off window size should actually decrease with every successful transmission as the optimal transmission probability increases.
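The contrast drawn in the bullets above can be made concrete with a small Python sketch. It compares the constant value 1/16.5 quoted in the text with the delay-optimal probability, which is computed here from Equation 8.28 with $\gamma = 1$ (relying on Corollary 3). The names `P_IEEE_802154` and `p_opt_delay` are illustrative only, and L = 5 is chosen to match the figures.

```python
import math

# Value at which the IEEE 802.15.4 probability stabilizes for OSD (from the text).
P_IEEE_802154 = 1 / 16.5


def p_opt_delay(n: int, L: float) -> float:
    """Delay-optimal p from Equation 8.28 with gamma = 1 (Corollary 3), L > 1."""
    if n == 1:
        return 1.0
    a = n * (n - 1) * (L - 1)
    return (math.sqrt(n * n + 2 * a) - n) / a


if __name__ == "__main__":
    L = 5
    for n in (100, 50, 20, 10, 5, 2):
        p = p_opt_delay(n, L)
        side = "above" if P_IEEE_802154 > p else "below"
        print(f"n={n:3d}  optimal p={p:.4f}  IEEE 802.15.4 value is {side} optimal")
```

Under these assumptions the fixed value is too aggressive when many nodes contend and too conservative when only a few remain, which is the motivation for adapting the window to the number of contending nodes.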
[Figure 8.7: Comparison of transmission probabilities for IEEE 802.15.4 and optimal p-persistent CSMA for CD and OSD. Panels: CD, transmission probability versus number of nodes (N) for L = 5; OSD with N = 100, transmission probability versus number of nodes in an epoch (n).]

In the next section, we present a channel feedback enhanced IEEE 802.15.4 MAC that incorporates the above features.

8.5 Enhanced IEEE 802.15.4

The key idea in enhancing the performance of the IEEE 802.15.4 MAC is to use the optimality criteria for p-persistent CSMA derived in Section 8.3. In particular, we consider the criterion described in Proposition 12, which requires measurement of the idle time as well as the delay between two consecutive successful transmissions. These measurements can be construed as feedback from the channel. Prior work on channel feedback-based medium access enhancements has been reviewed in Chapter 5.

A good control mechanism should depend on the network and traffic conditions as well as the application requirements. Our objective is to present a feedback control mechanism that is suitable for both CD and OSD scenarios. One major challenge presented in OSD is to estimate the true system state using channel conditions in the face of a constantly changing state of the system (decreasing number of contending nodes). Nevertheless, the analysis presented in Section 8.3 presents us with unique opportunities to efficiently control the transmission probabilities in real time.

8.5.1 Our Approach

Our approach for channel feedback-based control of transmission probabilities is mainly based on Proposition 12. According to the proposition, if the transmission probability is optimal, then the ratio of idle time to the delay between two consecutive successful packet transmissions is $\Gamma_R(L)$.
If the transmission probability is higher than the optimal value then the ratio is lower than $\Gamma_R(L)$ and vice versa. First we describe how this optimality criterion can be used for an enhanced p-persistent CSMA and then adapt it to design an enhanced IEEE 802.15.4 MAC protocol.

8.5.1.1 Enhanced p-Persistent CSMA MAC

Each contending node can start by choosing the same transmission probability uniformly at random in a small interval of, say, (0, 0.05). Each node in the network measures the current epoch's idle time and delay and uses these measurements to determine the transmission probability for the next epoch. If the ratio of idle time to the delay is lower than $\Gamma_R(L)$, the transmission probability was greater than the optimal value. Therefore the transmission probability of the next epoch should be lower than the current epoch's to bring the delay closer to optimal. Similarly, if the ratio is higher than $\Gamma_R(L)$, the next epoch's transmission probability should be increased for optimal delay. Thus, the transmission probability update rule is given by

$$p_{next} = p_{current} \cdot \frac{\alpha}{\Gamma_R(L)} \tag{8.48}$$

where $\alpha = \frac{T_{Idle,current}}{T_{current}}$. In this update rule the increase or decrease in the transmission probability is directly proportional to the value of the ratio $\alpha$.

8.5.1.2 Enhanced IEEE 802.15.4

The IEEE 802.15.4 MAC protocol uses different window sizes to control the transmission of packets. In order to use the above optimality criterion, the transmission probability update rule should be converted into a window size update rule. For this we make use of the approximation we used in Section 8.2 to model the IEEE 802.15.4 MAC as a p-persistent CSMA MAC with changing $p$. In this, if a uniform-random back-off window has a size of $W$ time slots then it can be closely modeled as a geometric-random choice of time slot with parameter $p$ as long as $p = \frac{2}{W+1}$. Thus a transmission probability can be converted into a window size by using the inverse relationship, i.e., $W = \frac{2-p}{p}$.
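To make the feedback rule concrete, here is a minimal Python sketch of Equation 8.48 together with the probability and window conversions described above. It is an illustration under stated assumptions rather than the thesis implementation: the function names are hypothetical, and the clamp to `[p_min, p_max]` is an added safeguard (so that a zero idle-time measurement cannot drive the probability to zero) that is not part of the rule in the text.

```python
import math


def gamma_r(L: float) -> float:
    """Target idle-time ratio Gamma_R(L) from Proposition 12 (requires L > 1)."""
    s = math.sqrt(2 * L - 1)
    return (L - s) / ((L - 1) * (s - 1))


def update_probability(p_current, t_idle, t_epoch, L, p_min=1e-4, p_max=1.0):
    """Equation 8.48: scale p by alpha / Gamma_R(L), with alpha = t_idle / t_epoch.

    The clamp to [p_min, p_max] is an assumption added for robustness.
    """
    alpha = t_idle / t_epoch
    p_next = p_current * alpha / gamma_r(L)
    return min(p_max, max(p_min, p_next))


def window_to_prob(W: float) -> float:
    """Section 8.2 approximation: uniform window of W slots maps to p = 2/(W+1)."""
    return 2.0 / (W + 1)


def prob_to_window(p: float) -> float:
    """Inverse relationship: W = (2 - p) / p."""
    return (2.0 - p) / p


if __name__ == "__main__":
    L = 5
    print("Gamma_R(5) =", gamma_r(L))
    # Hypothetical measurements (in time slots) for one epoch.
    for t_idle in (1, 2, 4):
        print(t_idle, update_probability(0.02, t_idle, 8, L))
```

For L = 5 the target ratio is 0.25, so a measured ratio of 2/8 leaves the probability unchanged, a smaller ratio lowers it, and a larger ratio raises it.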
Based on this and the transmission probability update rule given above, the window update rule for the Enhanced IEEE 802.15.4 MAC is:

$$W_{next} = \frac{(W_{current} + 1)\,\Gamma_R(L) - \alpha}{\alpha} \tag{8.49}$$

A key aspect of this update rule is that all nodes in the network should update their windows at every successful packet transmission. Figure 8.8 shows the flow chart for the Enhanced IEEE 802.15.4 MAC operation at a node. It should be noted that all aspects of the original IEEE 802.15.4 MAC have been preserved except for when the window is changed and how it is changed.

[Figure 8.8: Flow chart for Enhanced IEEE 802.15.4 operation at a node. Box labels recoverable from the extraction: Start; p = rand(0, 0.05), W = (2 - p)/p; Choose a time slot; Wait; Time = chosen time slot?; From Channel: Is there a successful transmission?; Sense channel in this time slot. Is it free?; Sense channel in next time slot. Is it free?; Transmit. Success?; Don't change window, choose a new time slot; Change window size, $W_{next} = ((W_{current} + 1)\,\Gamma_R(L) - \alpha)/\alpha$, choose a new time slot.]

[Figure 8.9: Performance of Channel Feedback Enhanced IEEE 802.15.4. (a) Continuous Data: $\Phi_{CD}(N)$ (Kbps) and $\Sigma_{CD}(N)$ (mJoules) versus number of nodes (N), L = 5. (b) One-Shot Data: $\Delta_{OSD}(N)$ (secs) and $\Sigma_{OSD}(N)$ (mJoules) versus number of nodes (N), L = 5. Curves: IEEE 802.15.4, Optimal p-persistent CSMA, Enhanced p-persistent CSMA, Enhanced IEEE 802.15.4.]

8.5.1.3 Evaluation

Figure 8.9 shows the performance gains for the Enhanced IEEE 802.15.4 MAC in comparison to the original. The figure also shows the performance of the optimal p-persistent CSMA and enhanced p-persistent CSMA.
It should be noted that the performance of the enhanced IEEE 802.15.4 MAC matches that of the enhanced p-persistent CSMA MAC for CW = 0, i.e., if the nodes do not sense the channel for two consecutive free slots but transmit their packet once their chosen time slot occurs. Thus, for the enhancement we use, the performance of the enhanced p-persistent CSMA is an upper bound on the performance of the enhanced IEEE 802.15.4 MAC.

An important observation from the figure is that the system throughput reduces drastically with increasing number of contending nodes for the original IEEE 802.15.4 MAC. But for the enhanced version, the system throughput is almost constant with the number of nodes, implying that it is more scalable than the original. This holds for energy also. These significant gains in performance are observed for both CD and OSD scenarios.

8.5.1.4 Discussion

In an actual implementation the measurement of idle time and the delay between two consecutive successful packet transmissions can be achieved easily at each node by observing ACKs from the sink. If all nodes in the network are in the radio range of each other then all nodes see the same idle time between two consecutive successful packet transmissions. If, on the other hand, all nodes are in the radio range of the sink but not in the radio range of each other, then each node sees an idle time that is based on the number of nodes in its neighborhood. Thus, the above update rule tries to optimize the transmission probability for the number of nodes in the neighborhood of each node and not for the entire network. However, the sink can measure the idle time for the entire network and piggyback this value in the ACKs to the sensor nodes. The sensor nodes measure the epoch delay as the interval between the ACKs. Thus, in this case, the channel feedback is via the sink.

The performance difference in terms of degradation or improvement, if any, between the local feedback and global feedback based mechanisms needs to be investigated.
This is a direction for future work.

An important aspect of the Enhanced IEEE 802.15.4 MAC protocol is that all nodes should change their window sizes and choose a new time slot (or start a new counter) at every successful packet transmission. Otherwise, only a few nodes optimize their window sizes and this could lead to unfairness in the CD scenario.

Another important aspect to consider is the effect of channel errors. The current standard MAC treats packet losses caused by channel errors as collisions and backs off accordingly, thus misconstruing channel errors as congestion. The enhanced MAC protocol, in contrast, does not change any protocol parameters on such packet losses, as successful packet transmissions are taken as the only indicators of channel congestion. Nevertheless, a thorough investigation of the effect of channel errors should be addressed in future work.

In this chapter, we have focused on dense sensor networks. The following table shows the throughput performance comparison of the original and enhanced IEEE 802.15.4 MAC protocols for a lower number of nodes. Clearly, according to the results, the current MAC performs better than the enhanced MAC for a low number of nodes. But with increasing number of nodes, the enhanced MAC increasingly performs better.

N           10       20       30       40
Original    110      85       58.75    38.75
Enhanced    13.75    63.75    63.75    61

Table 8.2: Performance comparison of Original and Enhanced IEEE 802.15.4 MAC for CD in terms of throughput ($\Phi_{CD}(N)$) in Kbps for low density networks.

In the enhanced MAC protocol we have used a single optimality criterion from Section 8.3. We would like to investigate the use of the other criteria also.

Recent research has focused on the impact of the capture effect on wireless MAC protocols. The influence of the capture effect on the enhanced IEEE 802.15.4 MAC for the two data collection scenarios is another direction for future work.
8.6 Chapter Summary

We have shown that the current IEEE 802.15.4 MAC performs poorly for data collection in dense sensor networks. We presented a channel feedback enhanced MAC protocol that performs significantly better than the current version. For this we modeled the IEEE 802.15.4 MAC as a p-persistent CSMA with changing p, optimized a generic p-persistent CSMA MAC and used the resultant optimality criteria to propose a channel feedback-based enhancement for the original IEEE 802.15.4 MAC. Results showed that our Enhanced IEEE 802.15.4 MAC scales significantly better for both continuous data and one-shot data collection scenarios in dense networks (number of nodes greater than 50). For low density networks the performance of the current MAC is better for up to 25 nodes, after which the performance of the enhanced MAC is better.

Chapter 9

Thesis Summary

We have considered two key problems in wireless sensor networks - location support and efficient medium access for one-hop data collection.

In the first part of the thesis, we first determined operating specifications for effective location support services in consultation with engineers from Bosch Research. Then, based on these specifications, we provided effective solutions to the two main components of location support services - accurate localization and fast/fair localization.

We addressed the problem of accurate localization by proposing two novel localization techniques - one based on location constraints called Ecolocation and another based on location sequences called Sequence Based Localization (SBL). In Ecolocation the unknown node's location is identified by a set of location constraints. This location constraint set is derived from the ranks of reference nodes based on their RSS measurements of RF signals from the unknown node. The location of the unknown node is estimated by searching through grid points in the localization space and choosing the grid point that satisfies the maximum number of matched constraints as its location.
If there is more than one such grid point, their centroid gives the location of the unknown node.

In Sequence-Based Localization, location sequences are used to uniquely identify distinct regions in the localization space. The location of the unknown node is estimated by first determining its location sequence using RSS measurements of RF signals between the unknown node and the reference nodes, and then searching through a pre-determined list of all feasible location sequences in the localization space, called the location sequence table, to find the region represented by the "nearest" one. In this thesis, we derived expressions for the maximum number of location sequences and presented an algorithm to construct the location sequence table. We described distance metrics that measure the distance between location sequences and used them to determine the corruption in location sequences due to RF channel non-idealities. We identified an approximate indicator of the extent of location estimation error using the same distance metrics.

Through examples, we demonstrated the robustness of Ecolocation and SBL to RF channel non-idealities. We compared Ecolocation and SBL and argued that the former is equivalent to the latter for high scanning resolutions. The comparison suggested that Ecolocation is more suitable for small localization spaces and low location resolutions and that SBL is more suitable for large localization spaces and high location resolutions. Through exhaustive simulations and systematic real mote experiments, we evaluated the performance of our localization systems and presented a comparison with other state-of-the-art localization techniques for different RF channel and node deployment parameters. Results showed that sequence-based localization performs well, better than other localization techniques in both indoor and outdoor environments.
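The Ecolocation search summarized earlier in this chapter (rank reference nodes by RSS, derive location constraints, scan grid points, and take the centroid of the best-scoring points) can be sketched in a few lines of Python. This is an illustrative sketch under assumptions that are not spelled out in this summary: constraints are taken to be pairwise ("the reference node with the stronger RSS is closer"), pairs with equal RSS are skipped, and the localization space is a square of side `size` scanned with step `grid_step`. The function name and parameters are hypothetical.

```python
from itertools import combinations


def ecolocation_estimate(refs, rss, grid_step=0.5, size=10.0):
    """Estimate an unknown node's location from the RSS ranking of reference nodes.

    refs -- list of (x, y) reference node positions
    rss  -- list of RSS values (e.g. in dBm), one per reference node
    Returns ((x, y), score), where score is the number of satisfied constraints.
    """
    # Assumed constraint form: (nearer, farther) for each pair with distinct RSS.
    constraints = []
    for i, j in combinations(range(len(refs)), 2):
        if rss[i] > rss[j]:
            constraints.append((i, j))
        elif rss[j] > rss[i]:
            constraints.append((j, i))

    best_score, best_points = -1, []
    steps = int(round(size / grid_step)) + 1
    for ix in range(steps):
        for iy in range(steps):
            x, y = ix * grid_step, iy * grid_step
            # Squared distances are enough for comparing which node is nearer.
            d2 = [(x - rx) ** 2 + (y - ry) ** 2 for rx, ry in refs]
            score = sum(1 for near, far in constraints if d2[near] < d2[far])
            if score > best_score:
                best_score, best_points = score, [(x, y)]
            elif score == best_score:
                best_points.append((x, y))

    # Centroid of all grid points that satisfy the maximum number of constraints.
    cx = sum(p[0] for p in best_points) / len(best_points)
    cy = sum(p[1] for p in best_points) / len(best_points)
    return (cx, cy), best_score


if __name__ == "__main__":
    refs = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
    # Made-up RSS values whose ordering matches a node near (2, 3).
    rss = [-40.0, -62.0, -55.0, -70.0]
    print(ecolocation_estimate(refs, rss))
```

With only four reference nodes the feasible region is large, so the centroid is a coarse estimate; more reference nodes yield more constraints and a smaller region.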
Next, we introduced the problem of fast/fair localization of mobile devices in wireless sensor networks and showed that it is related to the minimum broadcast frame length problem. We investigated a greedy heuristic time scheduling algorithm for this problem using a defined set of five metrics - average localization delay, average localizable speed, localization fairness, minimum localizable speed and maximum localizable speed. We derived lower and upper bounds for the number of time slots required to schedule all reference nodes in the localization area by any scheduling algorithm for grid and random reference node deployment distributions using simple geometric arguments.

Using simulations, we studied the dynamics of the above five metrics with respect to reference node deployment distributions, reference node densities and location estimate accuracies. Results show that the average localizable speed of the mobile device decreases with increasing level of location estimate accuracy and its dependence on reference node density is minimal. The localization fairness, i.e., the percentage of locations in the localization area that can guarantee a desired level of location estimate accuracy at a mobile device speed of 95% of the average localizable speed, increases with reference node density and is independent of the accuracy level desired. The average localizable speed of the mobile device and localization fairness are better for grid deployment of reference nodes than for random deployment. Also, the localizable speed of the mobile device at which a localization area wide guarantee of a desired level of accuracy can be provided increases with reference node density and it is higher for grid deployment of reference nodes.

In the second part of the thesis, we presented our research on medium access techniques for the one-hop data collection application in wireless sensor networks.
We identified a spectrum of application space for the one-hop data collection problem with continuous data collection at one end and one-shot data collection at the other end. While in the continuous data collection problem the contending nodes always have a packet to transmit, in the one-shot data collection problem each contending node has a single packet to transmit.

We started out the second part of the thesis by presenting a review of the existing medium access techniques and their applicability to the spectrum of one-hop data collection applications. We argued that the traditional medium access techniques have been studied extensively for the continuous data end of the application and that ours is the first attempt at studying them for the one-shot data collection end. Thus, through this thesis, we made research contributions to the understanding of medium access techniques for one-hop data collection in the following three directions - (a) modeling and analysis of the slotted Aloha multi-access technique with binary exponential back-off for the one-shot data collection problem, (b) development of a novel location-aware medium access technique for the one-shot data collection problem, and (c) modeling, analysis, and performance evaluation of the IEEE 802.15.4 MAC protocol for both the continuous and one-shot data collection problems.

The one-shot data collection problem is characterized by the presence of a single packet in the transmission queue of each contending node. On transmission of the packet the node does not contend for the channel anymore. This leads to a non-steady state, transient behavior of the wireless network. In this thesis we modeled the slotted Aloha medium access protocol with the binary exponential back-off collision avoidance scheme using a non-ergodic Markov chain to analyze the transient nature of the one-shot data collection problem.
Using this approximate model we derived flow equations to capture the network dynamics of the wireless sensor network and verified their accuracy using simulations. Using these equations, we evaluated the performance of the protocol in terms of the delay in obtaining packets from all contending nodes and the corresponding energy consumption in the wireless sensor network. Results suggest that for a given initial window size for the binary exponential back-off scheme, reducing the number of back-off stages reduces both the delay and energy consumption. According to the results, while for high node densities multiple back-off stages are preferred, for low node densities a single back-off stage performs better.

Next, we presented a novel location-aware medium access protocol for the one-shot data collection problem in wireless sensor networks. We illustrated the working of our protocol using examples and discussed its implementation aspects. The main idea in our location-aware protocol is a tree-based hierarchical partitioning of space to progressively reduce the collision domains of nodes until there are no collisions. Further, we presented results from a thorough performance evaluation of the location-aware protocol in comparison to three location-unaware MAC protocols - HT-split, optimal p-persistent slotted CSMA, and the IEEE 802.15.4 standard MAC protocol - using simulations. We evaluated the protocol for three different location distributions of nodes - uniform-random, even-random, grid-random. Results showed that our location-aware MAC protocol in the worst case (uniform random distribution) provides as good a performance as an optimal location-unaware MAC protocol and in the best case (grid random distribution) provides 60% lower delay and energy consumption.

Finally, we modeled, analyzed, and evaluated the performance of the IEEE 802.15.4 MAC protocol for both ends of the one-hop data collection application spectrum.
We have shown that the current IEEE 802.15.4 MAC performs poorly for data collection in dense sensor networks, i.e., sensor networks with more than 50 contending nodes. We presented a channel feedback enhanced MAC protocol that performs significantly better than the current version. For this we modeled the IEEE 802.15.4 MAC as a p-persistent CSMA with changing p, optimized a generic p-persistent CSMA MAC, and used the resultant optimality criteria to propose a channel feedback-based enhancement for the original IEEE 802.15.4 MAC. Results showed that our Enhanced IEEE 802.15.4 MAC scales significantly better for both continuous data and one-shot data collection scenarios in dense networks. We have also shown that for low density networks the performance of the current MAC is better for up to 25 nodes, beyond which the performance of the enhanced MAC is better.

Chapter 10

Future Directions

In this chapter we discuss possible directions for future work in the area of location support and efficient medium access for one-hop data collection in wireless sensor networks.

1. Location Support:

(a) Accurate Localization: In the first part of this thesis we have presented localization techniques for location support in wireless sensor networks mainly for the two-dimensional scenario. However, many real-world problems are characterized by three-dimensional operating conditions, and many researchers have used localization techniques developed for two dimensions in three-dimensional scenarios. It is not clear if this introduces unseen errors in the estimated location of the unknown node. We believe that a comprehensive study should be undertaken to verify if two-dimensional localization techniques are good enough in three-dimensional scenarios.

(b) Fast & Fair Localization: In this thesis we have formulated the problem of fast & fair localization as a graph-coloring problem and presented a heuristic based solution for the same.
Based on the level of information the sensor nodes have, this solution can be either centralized or distributed. However, we believe that this problem, which occurs frequently in many wireless sensor network applications, can be formulated as a linear program and a close to optimal solution can be derived by solving this linear program. Further, a Lagrange-duality based formulation for the problem could reveal a sub-gradient based distributed solution for the fast & fair localization problem.

2. Medium Access for One-Hop Data Collection:

(a) Location-Aware Medium Access for One-shot Data Collection: In this thesis we have presented a simulation based evaluation for the tree-based hierarchical, space splitting location-aware MAC protocol for one-shot data collection problems. However, this protocol lends itself to a thorough mathematical analysis and derivation of closed form expressions for the delay and energy consumption metrics of interest. Also, this protocol is open to many possible improvements and enhancements as discussed in Section 7.5 of Chapter 7, and further research should be done to explore such possible performance enhancers for the protocol. Also, a thorough real systems-based evaluation of the protocol should be done to address implementation concerns such as the effect of errors in the locations of nodes on the performance of the protocol. Another interesting direction for future work is the application of this protocol to 3-dimensional sensor node deployments.

(b) Enhancement of the IEEE 802.15.4 MAC Protocol: In Chapter 8 we have presented a channel-feedback based enhancement for the IEEE 802.15.4 MAC protocol. For this enhancement we used global feedback from the sink. However, further investigations should be taken up to study the performance difference, if any, between local and global channel feedback for the enhanced protocol.
In addition, a real-world practical implementation of the proposed enhancement is required to study the effect of real world phenomena such as channel errors and packet losses on the performance of the enhanced protocol. Also, it is important to study the influence of the capture effect on the enhancement of the protocol.

Reference List

[1] http://mathworld.wolfram.com/leastsquaresfitting.html.
[2] http://welcome.hp.com/country/us/en/prodserv/handheld.html.
[3] http://www.boschresearch.com.
[4] http://www.chipcon.com/files/CC2420 Data Sheet 1 3.pdf.
[5] http://www.moteiv.com/products/docs/tmote-sky-datasheet.pdf.
[6] http://www.xbow.com/Products/productsdetails.aspx?sid=72.
[7] An improved stability bound for binary exponential backoff. Theory Comput. Syst., 30:229-244, 2001.
[8] Rosenkrantz W. A. and Towsley D. On the instability of the slotted ALOHA multiaccess algorithm. IEEE Trans. Automatic Cont., 28(10):994-996, 1984.
[9] N. Abramson. The aloha system: Another alternative for computer communications. In Proceedings of the Fall 1970 AFIPS Computer Conference, pages 281-285, November 1970.
[10] S. Adireddy and L. Tong. Optimal Transmission Probabilities for Slotted ALOHA in Fading Channels. Proc. CISS02.
[11] V. Anantharam. The stability region of the finite-user slotted ALOHA protocol. Information Theory, IEEE Transactions on, 37(3):535-540, 1991.
[12] ANSI/IEEE Standard 802.11. IEEE. Wireless LAN medium access control (MAC) and physical layer specifications, 1999 edition, 1999.
[13] N.-O. Song B.-J. Kwak and L. E. Miller. Performance Analysis of Exponential Backoff. IEEE/ACM Transactions on Networking, 13:343-355, April 2005.
[14] J. G. Kim B. P. Crow, I. Widjaja and P. T. Sakai. IEEE 802.11 wireless local area networks. IEEE Commun. Mag., Sept 1997.
[15] Victor Bahl and V. N. Padmanabhan. Radar: An in-building rf-based user location and tracking system. In IEEE INFOCOM, Tel Aviv, Israel, 2000.
[16] L. Bao and J. Garcia-Luna-Aceves.
A new approach to channel access scheduling for ad hoc networks. In The seventh annual international conference on Mobile computing and networking, pages 210-221, 2001.
[17] L. Bao and J. Garcia-Luna-Aceves. Hybrid channel access scheduling in ad hoc networks. In IEEE Tenth International Conference on Network Protocols (ICNP), November 2002.
[18] Roberto Battiti, Mauro Brunato, and Alessandro Villani. Statistical learning theory for location fingerprinting in wireless lans. Technical Report DIT-02-0086, University of Trento (Italy), 2002.
[19] Roberto Battiti, Thang Lee Nhat, and Alessandro Villani. Location-aware computing: a neural network model for determining location in wireless lans. Technical Report DIT-02-0083, University of Trento (Italy), 2002.
[20] P. Bergamo and G. Mazzini. Localization in sensor networks with fading and mobility. In The 13th IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC'02), volume 2, 2002.
[21] Dimitri Bertsekas and Robert Gallager. Data Networks (Second Edition). Prentice Hall, 1991.
[22] B. Hajek and T. Van Loon. Decentralized Dynamic Control of a Multi-Access Broadcast Channel. IEEE Transactions on Automatic Control, 27:559-569, 1982.
[23] Giuseppe Bianchi. Performance Analysis of the IEEE 802.11 Distributed Coordination Function. IEEE Journal on Selected Areas in Communications, 18(3):535-547, March 2000.
[24] B. Bougard, F. Catthoor, D. C. Daly, A. Chandrakasan, and W. Dehaene. Energy Efficiency of the IEEE 802.15.4 Standard in Dense Wireless Microsensor Networks: Modeling and Improvement Perspectives. In Proceedings of the Design, Automation and Test in Europe Conference and Exhibition (DATE). IEEE, March 2005.
[25] Raffaele Bruno, Marco Conti, and Enrico Gregori. Optimization of Efficiency and Energy Consumption in p-Persistent CSMA-Based Wireless LANs. IEEE Transactions on Mobile Computing, 1(1):10-31, January-March 2002.
[26] Nirupama Bulusu, John Heidemann, and Deborah Estrin.
Gps-less low-cost outdoor lo- calization for very small devices. IEEE Personal Communications Magazine, October 2000. [27] Frederico Cali, Marco Conti, and Enrico Gregori. Dynamic Tuning of the IEEE 802.11 Protocol to Achieve a Theoretical Throughput Limit. IEEE/ACM Transactions on Net- working, 8(6):785{799, December 2000. [28] EdCallaway, PaulGorday, LanceHester, JoseA.Gutierrez, MarcoNaeve, BobHeile, and Venkat Bahl. Home Networking with IEEE 802.15.4: A Developing Standard for Low- Rate Wireless Personal Area Networks. IEEE Communications Magazine, pages 70{77, August 2002. [29] C.D.Iskander. Performance analysis of IEEE 802.15.4 noncoherent receivers at 2.4 GHz under pulse jamming. In Radio and Wireless Sysmposium, pages 327{330, January 2006. [30] Krishnendu Chakrabarty, S. Sitharama Iyengar, Hairong Qi, and Eungchun Cho. Grid coverage for surveillance and target location in distributed sensor networks. IEEE Trans- actions on Computers, 51(12):1448{1453, December 2002. [31] H.S.ChhayaandS.Gupta. Performancemodelingofasynchronousdatatransfermethods in the IEEE 802.11 MAC protocol. ACM/Balzer Wireless Netw., 3:217{234, 1997. [32] Mark de Berg, Marc van Krevald, Mark Overmars, and Otfried Schwarzkopf. Computa- tional Geometry - Algorithms and Applications, Second Edition. Springer, second edition, 2000. 168 [33] Derek J. Corbett and David Everitt. A Partitioned Power and Location Aware MAC Protocol for Large Ad Hoc Networks. In European Wirless 2005, Nicosia, Cyprus, April 2005. [34] D.J.Aldous. Ultimate instability of exponential back-o® protocol for acknowledgment- based transmission control of random access communication channels. IEEE Trans. In- form. Theory, 33(2):219 { 223, 1987. [35] A. El-Hoiydi. Aloha with preamble sampling for sporadic tra±c in ad hoc wireless sensor networks. In IEEE International Conference on Communications, April 2002. [36] Eiman Elnahrawy, Xiaoyan Li, and Richard P. Martin. 
The limits of localization using signal strength: A comparative study. In First IEEE International Conference on Sensor and Ad hoc Communications and Networks (SECON), Santa Clara, CA, October 2004. IEEE. [37] RexMinetal. AFrameworkforEnergy-ScalableCommunicationinHigh-DensityWireless Networks. In ACM ISPLED, pages 36 { 41, Monterey, August 2002. [38] M. Conti F. Cali and E. Gregori. IEEE 802.11 wireless LAN: Capacity analysis and protocol enhancement. pages 142 { 149, San Francisco, CA, April 1998. [39] L. Fratta G. Bianchi and M. Oliveri. Performance evaluation and enhancement of the CSMA/CAMAC protocol for 802.11 wireless LANs. In PIMRC, pages 392 { 396, Taipei, Taiwan, October 1996. [40] Sachin Ganu, A.S.Krishnakumar, and P.Krishnan. Infrastructure-based location estima- tion in wlan networks. In IEEE Wireless Communications and Networking Conference (WCNC), 2004. [41] LewisGirodandDeborahEstrin. Robustrangeestimationusingacousticandmultimodal sensing. In IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 1312{1320, Maui, HI, USA, October 2001. [42] L.A.GoldbergandP.D.MacKenzie. Analysisofpracticalbacko®protocolsforcontention resolution with multiple servers. J. Comput. Syst. Sci., 58(1):232 { 258, 1999. [43] Flajolet P. Greenberg A. G. and R. E. Ladner. Estimating the multiplicities of con°icts to speed their resolution in multiple access channels. J. ACM, 34(2):289 { 326, 1987. [44] I. Guvenc, C. T. Abdallah, R. Jordan, and O. Dedeoglu. Enhancements to rss based indoor tracking systems using kalman ¯lters. In GSPx & International Signal Processing Conference, Dallas, Texas, March 2003. [45] Youngjune Gwon, Ravi Jain, and Toshiro Kawahara. Robust indoor location estimation of stationary and mobile users. In IEEE INFOCOM, Hong Kong, March 2004. [46] M.SchlagerH.Woesner, J.P.EbertandA.Wolisz. Power-savingmechanismsinemerging standards for wireless lans: The mac level perspective. IEEE Personal Communications, pages 40 { 48, June 1998. 
[47] H. Hashemi. The indoor radio propagation channel. In Proceedings of the IEEE, vol- ume 81, pages 943{968. IEEE, July 1993. [48] T. He, B.M. Blum C. Huang, J.A. Stankovic, and T. Abdelzaher. Range{free localization schemes for large scale sensor networks. In Mobicom, San Diego, CA, September 2003. 169 [49] T. Leighton J. Hstad and B. Rogo®. Analysis of backo® protocols for multiple access channels. SIAM J. Comput., 25(4):740 { 744, 1996. [50] J.P.EbertJ.Weinmiller,H.WoesnerandA.Wolisz. Analyzingandtuningthedistributed coordination function in the IEEE 802.11 DFWMAC draft standard. In MASCOT, San Jose, CA, February 1996. [51] D. G. Jeong and W. S. Jeon. Performance of an exponential backo® scheme for slotted- ALOHA protocol in local wireless environment. IEEE Trans. Veh. Technol., 44(3):470 { 479, August 1995. [52] D.YoshimuraK.Sakakibara,T.SetoandJ.Yamakita. OnthestabilityofslottedALOHA systems with exponential backo® and retransmission cuto® in slow-frequency-hopping channels. September 2001. [53] H. Muta K. Sakakibara and Y. Yuba. The e®ect of limiting the number of retransmission trials on the stability of slotted ALOHA systems. IEEE Trans. Veh. Technol., 49(4):1449 { 1453, July 2000. [54] B. Karp and H.T.Kung. Greedy Perimeter Stateless Routing for Wireless Networks. In International Conference on Mobile Computing and Networking, 2000. [55] S.M. Kay. Fundamentals of statistical signal processing: estimation theory. Prentice-Hall, Inc. Upper Saddle River, NJ, USA, 1993. [56] F. Kelly. Stochastic Models of Computer Communications Systems. Journal of Royal Statistical Society, Series B, 47:379{395, 1985. [57] Yoon H. Kim and Antonio Ortega. Quantizer design and distributed encoding algorithm for source localization in sensor networks. In The Fourth International Conference on Information Processing in Sensor Networks (IPSN 2005), Los Angeles, CA, April 2005. [58] Leonard Kleinrock and Fouad A. Tobagi. 
Packet switching in radio channels: Part i { carrier sense multiple access modes and their throughput-delay characteristics. IEEE Transactions on Communications, 23:1400{1416, December 1975. [59] Bhaskar Krishnamachari. Networking Wireless Sensors. Cambridge University Press, 2005. [60] P. Krishnan, A.S. Krishnakumar, Wen-Hua Ju, Colin Mallows, and Sachin Ganu. A system for lease: Location estimation assisted by stationary emitters for indoor rf wireless networks. In IEEE INFOCOM, Hong Kong, March 2004. [61] P. R. S. Kumar and L. Merakos. Distributed control of broadcast channels with acknowl- edgementfeedback: Stabilityandperformance. InIEEEControlandDecisionConference, Las Vegas, Nevada, December 1984. [62] M.ContiL.BononiandL.Donatiello. Designandperformanceevaluationofadistributed contention control (DCC) mechanism for IEEE 802.11 wireless local area networks. pages 59 { 67, Dallas, Texas, October 1998. [63] M. Conti L. Bononi and E. Gregori. Design and performance evaluation of an asymptot- ically optimal backo® algorithm for IEEE 802.11 wireless LANs. Maui, Hawaii, January 2000. [64] S. S. Lam. Carrier Sense Multiple Access Protocol for local networks. The International Journal of Distributed Informatique, 4:21{32, 1980. 170 [65] Bao Hua Liu, Nirupama Bulusu, Huan Pham, and Sanjay Jha. A Self-Organizing, Lo- cation Aware Media Access Control Protocol for DS-CDMA Sensor Networks. In IEEE International Conference on Mobile Ad Hoc and Sensor Networks,pages528{530,October 2004. [66] Gang Lu, Bhaskar Krishnamachari, and Cauligi S. Raghavendra. Performance Evaluation of the IEEE 802.15.4 MAC for Low-rate Low-Power Wireless Networks. In IEEE IPCCC 2004, pages 701{706, April 2004. [67] Kyle Maieson, Hari Balakrishnan, and Y. C. Tay. Sift: A MAC Protocol for Event-Driven Wireless Sensor Networks. Technical Report 894, MIT Laboratory for Computer Science, May 2003. [68] Marco Zuniga and Bhaskar Krishnamachari. 
Analyzing the Transitional Region in Low Power Wireless Links. In First IEEE International Conference on Sensor and Ad hoc Communications and Networks (SECON), Santa Clara, CA, October 2004. [69] Miklos Marot, Peter Volgyesi, Sebestyen Dora, Branislav Kusy, Andras Nadas, Akos Ledeczi, Gyorgy Balogh, and Karoly Molnar. Radio interferometric geolocation. In Sen- Sys '05: Proceedings of the 3rd international conference on Embedded networked sensor systems, pages 1{12, New York, NY, USA, 2005. ACM Press. [70] M.Gerla and L.Kleinrock. Closed Loop Stability Control for S-Aloha Satellite Communi- cations. pages 210{219, September 1977. [71] J. Misic, V. B. Misic, and S. Sha¯. Performance of IEEE 802.15.4 beacon enabled PAN withuplinktransmissionsinnon-saturationmode-accessdelayfor¯nitebu®ers. In IEEE BroadNets, October 2004. [72] Jelena Misic, Shairmina Sha¯, and Vojislav B. Misic. Performance of a Beacon Enabled IEEE 802.15.4 Cluster with Downlink and Uplink Tra±c. IEEE Transactions on Parallel and Distributed Systems, 17(4):361{376, 2006. [73] D.CypherN.GolmieandO.Rebala. PerformanceAnalysisofLowRateWirelessTechnolo- gies for Medical Applications. In Compuer Communications (Elsevier), pages 28:1266{ 1275, June 2005. [74] TamerNadeem,LushengJi,AshokAgarwala,andJonathanAgre. LocationEnhancement to IEEE 802.11 DCF. In IEEE INFOCOM, Miami, Florida, USA, March 2005. [75] A.NasipuriandK.Li. Adirectionalitybasedlocationdiscoveryschemeforwirelesssensor networks. In WSNA, Atlanta, Georgia, September 2002. [76] N.F.Timmons and W.G.Scanlon. Analysis of the Performance of IEEE 802.15.4 for Med- ical Sensor Body Area Networking. In IEEE SECON, pages 16{24, 2004. [77] Neal Patwari and Alfred O. Hero III. Using proximity and quantized rss for sensor local- ization in wireless networks. In WSNA, San Diego, CA, September 2003. [78] William H. Press, Brian P. Flannery, Saul A. Teukolsky, and William T. Vetterling. Nu- merical Recipes in C: The Art of Scienti¯c Computing. 
Cambridge University Press, second edition, 1992. [79] Nissanka B. Priyantha, Anit Chakraborty, and Hari Balakrishnan. The cricket location- support system. In ACM MOBICOM, Boston, MA, August 2000. 171 [80] Nissanka Bodhi Priyantha, Hari Balakrishnan, Erik Demaine, and Seth Teller. Mobile- Assisted Localization in Wireless Sensor Networks. In IEEE INFOCOM, Miami, FL, March 2005. [81] Iyappan Ramachandran, Arindam K. Das, and Sumit Roy. Analysis of the Contention Access Period of IEEE 802.15.4 MAC. Accepted for publication in ACM Transactions on Sensor Networks., September 2005. [82] RajivRamaswamyandKeshabK.Parhi. DistributedSchedulingofBroadcastsinaRadio Network. In INFOCOM `89, volume 2, pages 407{504, Ottawa, Ont., Canada, April 1989. [83] Theodore S. Rappaport. Wireless Communications, Principles & Practice. Prentice Hall, 1999. [84] Saikat Ray, David Starobinski, Ari Trachtenberg, and Rachanee Ungrangsi. Robust loca- tion detection with sensor networks. IEEE JSAC Special Issue on Fundamental Perfor- mance Limits of Wireless Sensor Networks, 22(6):1016{1025, August 2004. [85] R.G.Gallager. Aperspectiveonmultiaccesschannels. IEEETrans.Inform.Theory,31:124 { 142, 1985. [86] R.L.Rivest. Network Control by Bayesian Broadcast. IEEE Transactions on Information Theory, 33(3), 1987. [87] L. G. Roberts. Aloha packet system with and without slots and capture. Computer Communication Review, 5(2):28 { 42, April 1975. [88] Tsybakov B. S. and Mikhailov V. A. Ergodicity of a slotted Aloha system. Probl. Inf Transm. (translated from Russian original in Probl. Peredachi If: IS, 4 (Oct.-Dec. 1979), 72-87), 15(4), April 1980. [89] S Shenker. Some conjectures on the behavior of acknowledgment based transmission control of random access communication channels. In ACM Sigmetrics 87 Conference on Measurement and Modelling of Computer Systems, Ban®, Alberta, Canada, 1987. [90] S.SinghandC.Raghavendra. PAMAS:Powerawaremulti-accessprotocolwithsignalling for ad hoc networks. 
ACM SIGCOMM Computer Communication Review, 28(3):5 { 26, July 1998. [91] AdamSmith,HariBalakrishnan,MichelGaraczko,andNissankaBodhiPriyantha. Track- ing Moving Devices with the Cricket Location System. In 2nd International Conference on Mobile Systems, Applications and Services (Mobisys 2004), Boston, MA, June 2004. [92] W.SzpankowskiandV.Rego. Instabilityconditionsarisinginanalysisofsomemultiaccess protocols. Technical Report CSD-TR-577, Purdue Univ., Lafayette, Ind., February 1986. [93] Y. C. Tay, Kyle Jamieson, and Hari Balakrishnan. Collision Minimizing CSMA and its Applications to Wireless Sensor Networks. IEEE Journal on Selected Areas in Communi- cations, August 2004. [94] Sameer Tilak, Vinay Kolar, Nael B. Abu-Ghazaleh, and Kyoung-Don Kang. Dynamic Localization Protocols for Mobile Sensor Networks. In IEEE International Workshop on Strategies for Energy E±ciency in Ad-hoc and Sensor Networks (IEEE IWSEEASN`05), Phoenix, AZ, USA, April 2005. 172 [95] Fouad A. Tobagi and Leonard Kleinrock. Packet switching in radio channels: Part ii { the hidden terminal problem in carrier sense multiple-access and the busy-tone solution. IEEE Transactions on Communications, 23(12), 1975. [96] S.ShenkerV.Bharghavan,A.DemersandL.Zhang. MACAW:amediaaccessprotocolfor wireless LAN's. In Conf. on Communications Architectures, Protocols and Applications, pages 212{225, London, 1994. [97] Alec Woo and David Culler. A Tranmission Control Scheme for Media Access in Sensor Networks. InACM/IEEEInternationalConferenceonMobileComputingandNetworking, pages 221{235, Rome, Italy, July 2001. [98] C. Hsu Y. Tseng and T. Hsieh. Power-saving protocols for IEEE 802.11-based multi-hop ad hoc networks. pages 200 { 209, June 2002. [99] Wei Ye and John Heidemann. Medium Access Control in Wireless Sensor Networks. Technical Report ISI-TR-580, USC/ISI, October 2003. [100] Kiran Yedavalli. Location Determination Using IEEE 802.llb. 
Master's thesis, The Uni- versity of Colorado at Boulder., December 2002. [101] Kiran Yedavalli, Bhaskar Krishnamachari, Sharmila Ravula, and Bhaskar Srinivasan. Ecolocation: A Sequence Based Technique for RF Localization in Wireless Sensor Net- works. In The Fourth International Conference on Information Processing in Sensor Net- works (IPSN 2005), Los Angeles, CA, April 2005. 173 Appendix A Even-Random Distribution Procedure to obtain even-random distribution of nodes: Below, we illustrate the steps to divide a square region into n equal area partitions: 1. Divide the square into x=b p n+0:5c vertical partitions. 2. Eachverticalpartitionwillbedividedintoaminimumofy min =b n x chorizontalpartitions. 3. Let r = n¡x¢y min . Determine the number of horizontal partitions y i in each vertical partition i (1·i·x) using, y i = 8 > > < > > : y min +1; 1·i·r y min ; r <i·x 4. The width of vertical partition i is W i =S¢ y i n , where S is the side length of the square. 5. Finally, divide vertical partition i into y i equal parts. 174
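The five steps of Appendix A can be sketched in Python as follows. This is a minimal illustration, not the author's implementation: the function names `even_random_partitions` and `place_nodes` are hypothetical, step 1 is read as rounding $\sqrt{n}$ to the nearest integer, and the final node placement (one node drawn uniformly at random inside each cell) is an assumption about what "even-random" means, since the appendix only specifies the partitioning.

```python
import math
import random


def even_random_partitions(n, side=1.0):
    """Split a side x side square into n equal-area cells.

    Returns a list of (x0, y0, width, height) tuples, one per cell.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    x = math.floor(math.sqrt(n) + 0.5)  # step 1: number of vertical partitions
    y_min = n // x                      # step 2: minimum horizontal partitions
    r = n - x * y_min                   # step 3: columns that get one extra cell
    cells = []
    left = 0.0
    for i in range(1, x + 1):
        y_i = y_min + 1 if i <= r else y_min
        width = side * y_i / n          # step 4: column width W_i = S * y_i / n
        height = side / y_i             # step 5: y_i equal parts per column
        for j in range(y_i):
            cells.append((left, j * height, width, height))
        left += width
    return cells


def place_nodes(n, side=1.0, rng=None):
    """Assumed placement rule: one uniformly random point inside each cell."""
    rng = rng or random.Random()
    return [
        (x0 + rng.random() * w, y0 + rng.random() * h)
        for (x0, y0, w, h) in even_random_partitions(n, side)
    ]
```

Because the column widths are proportional to $y_i$, every cell has area $S^2/n$ regardless of how many cells share its column, so the cells tile the square exactly.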
Abstract
We consider two fundamental building blocks for many applications in wireless sensor networks: location support and efficient medium access for one-hop data collection.
Asset Metadata
Creator
Yedavalli, Kiran Kumar
(author)
Core Title
On location support and one-hop data collection in wireless sensor networks
School
Viterbi School of Engineering
Degree
Doctor of Philosophy
Degree Program
Electrical Engineering
Publication Date
04/11/2007
Defense Date
03/22/2007
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
localization, medium access protocols, OAI-PMH Harvest, wireless sensor networks
Language
English
Advisor
Krishnamachari, Bhaskar (committee chair), Govindan, Ramesh (committee member), Ortega, Antonio (committee member)
Creator Email
kyedavalli@gmail.com
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-m372
Unique identifier
UC1322118
Identifier
etd-Yedavalli-20070411 (filename), usctheses-m40 (legacy collection record id), usctheses-c127-399972 (legacy record id), usctheses-m372 (legacy record id)
Legacy Identifier
etd-Yedavalli-20070411.pdf
Dmrecord
399972
Document Type
Dissertation
Rights
Yedavalli, Kiran Kumar
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Repository Name
Libraries, University of Southern California
Repository Location
Los Angeles, California
Repository Email
cisadmin@lib.usc.edu