Close
About
FAQ
Home
Collections
Login
USC Login
Register
0
Selected
Invert selection
Deselect all
Deselect all
Click here to refresh results
Click here to refresh results
USC
/
Digital Library
/
University of Southern California Dissertations and Theses
/
Adaptive sampling with a robotic sensor network
(USC Thesis Other)
Adaptive sampling with a robotic sensor network
PDF
Download
Share
Open document
Flip pages
Contact Us
Contact Us
Copy asset link
Request this asset
Transcript (if available)
Content
ADAPTIVE SAMPLING WITH A ROBOTIC SENSOR NETWORK by Bin Zhang A Dissertation Presented to the FACULTY OF THE GRADUATE SCHOOL UNIVERSITY OF SOUTHERN CALIFORNIA In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (COMPUTER SCIENCE) May 2008 Copyright 2008 Bin Zhang Acknowledgements Foremost, I would like to thank my supervisor, Professor Gaurav S. Sukhatme, for his guidance during my research and study at University of Southern California. His en- thusiasm in research and interest in a variety of areas had motivated all his students. In addition, he was always accessible and willing to help students with their research. Without his guidance and help, this thesis would not exist in the first place. I would also like to thank my committee members, Professor David Caron, Professor Mark Hansen and Professor David Kempe for their guidance. I also owe my thanks to Professor Ari Requicha for his guidance during my first several years in USC. IwouldliketothankallthemembersoftheNAMOSteamfortheirsupport,comments and help with the experiments. In particular, I would like to thank Carl Oberg, Amit Dhariwal, Arvind Pereira, Jnaneshwar Das, Beth Stauffer, Lindsay Darjany, Xuemei Bai, as well as former members Eric Shieh and Stefanie Moorthi. I would like to thank all people of Robotic Embedded System Lab at USC for mis- cellaneous help, especially for discussions and comments on my research. The thanks go to: Jonathan Kelly, Sameera Poduri, Karthik Dantu, Marin Kobilarov, Jonathan Binney, ii Kale Harbick, DeWitt ’Tal’ Latimer IV, as well as former members Dr. Srikanth Sari- palli, Dr. Boyoon Jung, Dr. Gabe Sibley, Dr. Maxim Batalin, Dr. David Naffin, and Dr. Stefan Hrabar. Finally, I would like to thank my family. I thank my parents Xinming Zhang and Xianrong Zeng for their faith and confidence in me and allowing me to pursuit what I wanted. Most importantly, I thank my wife, Yi Lu, for her support, encouragement, patience and love, on which the past a few years of my life has been built. iii Table of Contents Acknowledgements ii List Of Figures vi Abstract ix Chapter 1: Introduction 1 1.1 Scalar Field Reconstruction and Adaptive Sampling . . . . . . . . . . . . 1 1.2 Robotic Sensor Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.3 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 1.4 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 1.5 Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 Chapter 2: Optimal Experimental Design 10 2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 2.2 Optimal Design with Local Linear Regression . . . . . . . . . . . . . . . . 11 2.2.1 Local Linear Regression . . . . . . . . . . . . . . . . . . . . . . . . 11 2.2.2 Optimal Design for Linear Local Regression . . . . . . . . . . . . . 14 2.3 Optimal Design with Thin Plate Splines . . . . . . . . . . . . . . . . . . . 17 2.3.1 Thin Plate Splines . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 2.3.2 Optimal Design with Thin Plate Spline . . . . . . . . . . . . . . . 20 2.4 Optimal Design with Kriging . . . . . . . . . . . . . . . . . . . . . . . . . 25 2.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 Chapter 3: Adaptive Sampling with a Single Mobile Node 32 3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 3.2 Objective Function for Path Planning . . . . . . . . . . . . . . . . . . . . 34 3.3 Energy Consumption Model . . . . . . . . . . . . . . . . . . . . . . . . . . 38 3.3.1 Model with Direction . . . . . . . . . . . . . . . . . . . . . . . . . 39 3.3.2 Model Without Direction . . . . . . . . . . . . . . . . . . . . . . . 41 3.4 Path Planning Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 3.4.1 Exhaustive Search with Pruning Heuristics . . . . . . . . . . . . . 43 3.4.2 Nearly Exhaustive Search . . . . . . . . . . . . . . . . . . . . . . . 44 3.4.3 The Blum et al. Algorithm . . . . . . . . . . . . . . . . . . . . . . 45 3.4.4 The K-Path Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . 47 iv 3.4.5 Performance Comparison . . . . . . . . . . . . . . . . . . . . . . . 50 3.5 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 3.5.1 Simulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 3.5.2 Experiments using the NAMOS data set . . . . . . . . . . . . . . . 56 3.5.3 Experiments on the data set from the NIMS node . . . . . . . . . 58 3.5.4 Experiments in Echo Park . . . . . . . . . . . . . . . . . . . . . . . 60 3.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 Chapter 4: Adaptive Sampling with a Large Number of Mobile Nodes 67 4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 4.2 Deployment Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 4.2.1 Deployment Within a Small Neighborhood . . . . . . . . . . . . . 72 4.2.2 Distributed Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . 73 4.3 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75 4.4 Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82 4.5 Physical Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85 4.6 Simulation on Adaptive Sampling . . . . . . . . . . . . . . . . . . . . . . . 87 4.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89 Chapter 5: Adaptive Sampling with a Small Number of Mobile Nodes 91 5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91 5.2 Path Planning with Graph Partition . . . . . . . . . . . . . . . . . . . . . 94 5.3 Performance Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96 5.4 Simulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98 5.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103 Chapter 6: Conclusion 104 References 106 Appendix Publications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113 v List Of Figures 1.1 NAMOS, a robotic sensor network. (a) The components of a buoy. (b) The buoys in a lake. (c) The first robotic boat. (d) The second robotic boat. (e) The robotic boat in operation with buoys in a lake. (f) The second boat in operation with buoys in ocean. . . . . . . . . . . . . . . . . 6 2.1 Comparison between optimal design and random design with local linear Regress. No energy consumption model is considered here. . . . . . . . . . 16 2.2 Comparison between optimal design and random design with thin plate splines. No energy consumption model is considered here. . . . . . . . . . 24 2.3 (a) The design that minimizes the estimation variance of universal kriging within the area 0 ≤ x 1 ≤ 1 and 0 ≤ x 2 ≤ 1. (b) The estimation variance corresponding to the design. . . . . . . . . . . . . . . . . . . . . . . . . . 29 3.1 The robotic boat of the NAMOS project used in experiments regarded in this thesis. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 3.2 The trajectories considered for the energy consumption model. See Ta- ble 3.1 for energy consumption for each trajectory. . . . . . . . . . . . . . 40 3.3 The state transition model of NIMS node. . . . . . . . . . . . . . . . . . . 42 3.4 Performance comparison with 15 (a), 23 (b) and 56 (c) locations. . . . . . 51 3.5 Performance comparison with an energy consumption model without di- rection. (a) and (c) show the performance of Nearly Exhaustive Search on two data sets of the surface temperature of Lake Fulmor; (b) and (d) show the performance of the Chaudhuri et al. algorithm on the same data sets. 53 3.6 Performance comparison with an energy consumption model with direc- tion. (a) and (c) show the performance of Nearly Exhaustive Search on twodatasetsasFigure3.5;(b)and(d)showtheperformanceoftheChaud- huri et al. algorithm on the same data sets. . . . . . . . . . . . . . . . . . 54 vi 3.7 (a) The simulated scalar field. (b) the performance comparison between three algorithms. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 3.8 (a) The data set of the surface temperature of Lake Fulmor; (b) The in- terpolated temperature field by using the whole data set. . . . . . . . . . 57 3.9 Comparison between the adaptive sampling with the Chaudhuri et al. al- gorithm, adaptive sampling with the Nearly Exhaustive Search and raster scan on the data set of the surface temperature of Lake Fulmor. Note that the Chaudhuri and Nearly Exhaustive Search stands for the Adaptive Sampling algorithms with corresponding path planning subroutines. . . . 58 3.10 Performance comparison between 3 algorithms on the data set of fluores- cence on a vertical transact of Lake Fulmor. . . . . . . . . . . . . . . . . . 59 3.11 AtrialoftheexperimentinEchoPark. (a)showstheplannedpathandthe actual track and (b) shows estimation error vs distance the boat traveled, which is approximately proportional to the energy consumed by the boat 62 3.12 Anothertrailoftheexperiment. (a)showstheplannedpathandtheactual track and (b) shows estimation error vs distance the boat traveled. . . . . 63 3.13 (a) The path generated by the Chaudhuri et al. algorithm. (b) The path generated by using Christofides’ heuristic. . . . . . . . . . . . . . . . . . . 65 4.1 The dashed lines are the boundaries of the partitions at step i, and the solid lines are the boundaries at step i+1. The polygon with bold solid edges is subarea S i+1 j . It is obvious that S i+1 j = P M i l=1 ΔS i+1 j,l . . . . . . . . 76 4.2 The dashed lines are the boundaries of subareas at step i, and solid lines aretheboundariesofsubareasatstepi+1. Thepolygonwithbolddashed edges is the subspace S i l . It is obvious that S i l = P M i+1 j=1 ΔS i+1 j,l . . . . 78 4.3 (a) the desired function g 0 (x,y); (b)The node density after 120 steps.. . . 83 4.4 The node deployment density better approximates the desired profile over time. Shown here is the diminishing difference between the actual density and the desired density for different values of noise. The process is noise tolerant since there is little measurable difference between the steady-state density errors in four curves. . . . . . . . . . . . . . . . . . . . . . . . . . 84 4.5 Steady-state density error increases slowly as sensor noise increases. . . . 85 4.6 Robomote testbed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85 vii 4.7 Data from two trials with the Robomotes. The bars show the average sensor density in the steady state, the curve shows the desired density. . . 87 4.8 The scalar field to be estimated. . . . . . . . . . . . . . . . . . . . . . . . 88 4.9 One example of node distributions in simulations. (a) The initial distri- bution of the sensor nodes. (b) The final distribution of the sensor nodes after 50 steps. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89 4.10 Results of the simulations on the system with a large number of mobile nodes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90 5.1 (a) The scalar field m(x,y) = 1/(1 + exp(25 p x 2 +y 2 − 17.5)). (b) The ratio IMSE ea /IMSE opt on the scalar field. . . . . . . . . . . . . . . . . 98 5.2 The paths planned for a group of 4 mobile nodes. (a) Partitioned with equal area strategy; (b) Partitioned with equal gain strategy. . . . . . . . 100 5.3 Comparison of the estimation errors of three strategies: sequential alloca- tion, partition with strategy of equal area and partition with strategy of equal gain.. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 5.4 The running time of the three strategies. . . . . . . . . . . . . . . . . . . . 101 5.5 Performance comparison between the two partition strategies with 32 mo- bile nodes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102 viii Abstract Robotic sensor networks provide new tools for in-situ sensing in challenging settings such as environmental monitoring. Motivated by applications in marine biology we study the field reconstruction problem using a robotic sensor network. We focus on the adaptive samplingproblem. Thenetworkmakessequentialcontroldecisionsaboutwheretosample the environment to optimally reconstruct the field being probed. Reconstruction errors are thus reduced by adjusting the sample distribution. In a robotic sensor network, the static network nodes provide long term continuous sensor data (samples) at fixed locations. Ontheotherhand,themobilenodes(robots)areabletoadjustthedistribution ofthesamplesbutthenumberofsamplestheycantakeislimited. Wepresentalgorithms that exploit the advantages of both static and mobile nodes. Samples from static nodes areusedtobootstrapacoarseestimateofthefield,andtherobotstakeadditionalsamples to successively refine the estimate. Three cases are studied in this dissertation: a static network augmented with 1. a single robot, 2. a large number of robots, and 3. a small number of robots. For the first case, approximation algorithms developed for the orienteering problem are used for generating near-optimal data-gathering tours once the gainofeachlocationiscomputedbyusingoptimalexperimentaldesignfornon-parametric regression. For the second case, an auction-based algorithm is used for coordination ix between mobile nodes. For the final case, spatial and task partitioning methods are used to coordinate robots with each partition being treated as in the single robot case. The methods developed in this dissertation have been extensively tested in simulations. Experimental field trials (several km in two lakes and a harbor) on a physical network of two robotic boats and ten static buoys validate the algorithms. x Chapter 1 Introduction 1.1 Scalar Field Reconstruction and Adaptive Sampling A sensor network is a system consisting of nodes capable of sensing, computation, and communication. Sensornetworkshavemanyapplications, especiallyinecosystemobserv- ing and environmental monitoring. We start with two examples. The first concerns chemical and biological processes in a lake. Compared to tradi- tional scientific instruments, a sensor network can provide both temporal and spatial information on the lake facilitating the study of microbial dynamics. In this case, accu- rately measuring quantities such as temperature, chlorophyll, salinity, and concentration of various nutrients is very useful to scientists seeking a better understanding of aquatic ecosystems. A second example involves monitoring water quality. A sensor network deployed in a harbor or a beach can be used for monitoring water quality and provide early warning of hazards. Once a dangerous condition manifests itself, the sensor network can be used to analyze the source of the problem and help identify remediation measures. 1 Broadly speaking, both examples discussed above involve estimating scalar fields, each characterized by a single scalar quantity which varies in both space and time. A general scalar field is a real-valued function defined on n-dimensional Euclidean space. In applications of sensor networks, n is typically 2 or 3 and the goal is to use a limited number of sensor readings (samples) to reconstruct the underlying scalar field. That is, to estimate the value of the field at any given location and any point in time. The algorithms developed in this dissertation have been tested in the context of scale field estimation for aquatic monitoring applications where the sensors used are local, e.g. thermistors, fluorometers etc. Other sensors such as laser range finders, sonars or cameras, canprovideinformationonthesurroundingphysicalenvironmentinawidearea and hence a limited number of sensors may cover the whole sensing area. In contrast, the sensors used in aquatic monitoring application regarded here, can only provide local information. That is, each sensor reading is only valid in the vicinity of the sensor and hence lots of sensors are needed to adequately cover the area of operation. When only a limited number of sensors are available, to estimate values of the field at locations where the samples were not taken, we have to assume correlation between the readings at different locations and estimate the data. Intuitively, the farther two locations are from each other in space, the weaker the correlation between the values of the scalar field. Therefore, when there is no other information available, to improve the estimation accuracy locally, we need to increase the local sensor density. In many applications, a physical model describing the underlying scalar field is not available. In those cases, non-parametric regression is the natural choice. The errors associated with the reconstruction depend not only on the number of samples, but also 2 on the spatial distribution of the samples. When the number of samples is fixed, there exists an optimal sample distribution, which depends on the underlying scalar field. For example, iflocallinearregressionisusedforreconstruction, whenthesampledistribution is proportional to a polynomial of the trace of the Hessian matrix of the scalar field, the error of reconstruction is minimized [43]. Therefore, when the number of samples is limited, the reconstruction error can be reduced by adjusting the distribution of samples. Werefertotheprocessofimprovingthereconstructionaccuracybycontrollingthesample distribution as Adaptive Sampling. 1.2 Robotic Sensor Network A sensor network normally consists of only static nodes and there are two common ways to control the sample distribution. In the first case, a large number of nodes are deployed uniformly. Intheprocessofobservingormonitoring,asubsetofnodeswakesuptogather sensor readings while other nodes remain in sleep mode [65]. This approach works well even when the underlying scalar field changes over time provided the nodes are deployed densely enough. However, this approach requires a large number of sensor nodes to achieve low estimation error, and there might be some nodes that never wake up, which iswasteful. Anotherwaytocontrolthedistributionofthesamplesistodeploythesensor nodes non-uniformly. More sensor nodes are deployed in “critical areas”. However, this approach suffers when critical areas change, since it is not easy to redeploy the static sensor nodes. 3 This problem with static nodes can be partially overcome by augmenting the static sensor network with mobile nodes (robots). A sensor network consisting of both static and mobile nodes is referred to as a robotic sensor network or a sensor actuator network. On the one hand, communication, sensing and computing take much less energy. As a result, the battery life of static nodes is much longer than mobile nodes and static nodes can be used to take sensor readings at higher rates for longer periods. Assuming the phenomenon does not vary rapidly in time, by averaging recent readings, the variance of the static sensor readings can be reduced and hence static nodes can provide more accurate readings than the mobile nodes. On the other hand, mobile nodes are able to move around and take sensor readings at “optimally selected” locations that promise reduction in estimation error. Therefore, static sensor nodes can be used to estimate the underlyingfieldcoarselyandthenthemobilenodescanbesentouttotakeextrareadings to refine the field estimate. An example of a robotic sensor network is the testbed developed and used in this thesis: Networked Aquatic Microbial Observing System (NAMOS). NAMOS was devel- oped for freshwater and marine monitoring [61, 22, 55], and consists of 10 static buoys, i.e. static sensor nodes, and two robotic boats, i.e., mobile nodes. Each buoy has an Intel Xscale processor (PXA255), a wireless card, a fluorometer and an array of 6 ther- mistors, as shown in Figure 1.2. The fluorometer is used to measure fluorescence, which is an indication of the chlorophyll concentration in water. An optional sensor module, Hydrolab Sonde DS5, can be attached to a buoy to provide more sensing information, suchasdissolvedoxygenandturbidity. Thewirelesscardsonbuoysareconfiguredtorun in ad-hoc mode and a multi-hop protocol based on Emstar [57] is used to create routing 4 tree, which can reliably route the packets through the network. Two robotic boats have been developed for NAMOS. The first boat is a modified RC airboat as shown in Fig- ure 1.2. This boat is equipped with the same Intel Xscale processor and wireless card as the buoys for computation and communication. For navigation, a Garmin 16A GPS and aPNI2-axisdigitalcompassareusedtomeasurethelocationandorientationoftheboat. The sensing module consists of a fluorometer and a thermistor for surface measurement. The second boat is a modified commercial RC boat with dual propellers (Figure 1.2) and equipped with a more powerful Intel Core 2 Duo processor and a 802.11G wireless card. The navigation module includes the same Garmin GPS but with a more accurate 3DMG-X1 compass. The sensor module on board is a Hydrolab Mini Sonde, which is mounted on a winch to move vertically in the water and is able to provide information, such as fluorescence, PH and turbidity, along a vertical water column. 1.3 Problem Statement Inthisdissertation, weformulateandsolvethefollowingproblem. How should the mobile nodes move so that the data they collect can be optimally combined with data collected by the static nodes such that the field reconstruction error is minimized, subject to the constraint that the total energy available to the mobile nodes is bounded? Specifically, if eachstaticnodemakesameasurementinitsvicinity,andthetotalenergyavailabletothe mobile nodes is known, what paths should the mobile nodes take to minimize the error associated with the reconstruction of the entire field? Here we assume that the energy consumedbycommunications andsensingisnegligiblecomparedtotheenergyconsumed 5 (a) (b) (c) (d) (e) (f) Figure 1.1: NAMOS, a robotic sensor network. (a) The components of a buoy. (b) The buoys in a lake. (c) The first robotic boat. (d) The second robotic boat. (e) The robotic boat in operation with buoys in a lake. (f) The second boat in operation with buoys in ocean. 6 in moving the mobile nodes. We also assume that accurate localization of both the static nodes and the mobile nodes is available. Finally, we focus on reconstructing phenomena which do not change (or change slowly) compared to the time a mobile node takes to complete a data collecting tour. 1.4 Contributions In general, the optimality of the path for the mobile node depends on the approach used for estimating the field. Model-based estimation (and hence optimal sampling design based on linear or non-linear models) is well studied [52]. In the environmental observing context a model is normally unknown (it might even be the goal of the project to learn a model from the data collected by the sensor network). Therefore, non-parametric estimation is appropriate. In this dissertation, we propose adaptive sampling algorithms based on optimal design methodologies for three non-parametric regression techniques, local linear regression [51, 24], which is based on the analysis on the integrated mean square error of the field reconstruction; thin plate spline [47], which is based on the knot selection; and universal kriging [48]. Our adaptive sampling algorithms are formed by combining optimal design with a path planning algorithm and/or a coordination mechanism. Three cases are studied in this dissertation. 1. In the first case, the robotic sensor network consists of a large number of static sensor nodes but only one mobile node. The adaptive sampling algorithm first uses the optimal experimental design to define a gain associated with each location and 7 then uses a path planning algorithm to find a good path for the mobile node to follow when taking new samples. In this case, path planning is actually an instance of the Orienteering Problem, which is a well-known NP-complete problem. As a result, we use approximation algorithms to find a good path instead of an optimal path. 2. In the second case, we assume that the number of the mobile nodes is so large that each mobile node does not have to take sensor readings in multiple locations. As a result, with the optimal density function corresponding to the optimal experimen- tal design, the adaptive sampling algorithm is reduced to a deployment algorithm. We propose a distributed deployment algorithm. Analysis and simulations show that the deployment algorithm is able to deploy the mobile nodes according to a given density function. Simulation results show considerable performance improve- ment by combining the optimal design for local linear regression with the proposed deployment algorithm. 3. In the third case, we study an intermediate condition involving a moderate (> 1) number of mobile nodes. As a result, both path planning and coordination between mobile nodes have to be considered. A partition-based approach is proposed. The sensing field is first partitioned into sub-areas, and within each sub-area, the path planning algorithm for a single mobile node is used to find a good path for each mobile node. Two partitioning strategies are studied: equal area and equal gain. Simulations show that the partition-based approach is able to achieve reasonable 8 performance in a short running time when compared to the sequential planning approach. 1.5 Outline The dissertation is organized as follows. Optimal experimental design is the basis of all the adaptive sampling algorithms proposed in this dissertation and it is discussed in the next chapter. Systems with a single mobile node are introduced in Chapter 3, and algorithms combining optimal experimental design with path planning algorithms for a single mobile node are presented. In Chapter 4, we study the case where there are a large number of mobile nodes. A distributed algorithm is proposed to coordinate the mobile nodes to achieve optimal performance. In Chapter 5, systems with a small number of mobile nodes are discussed and a partition-based approach is proposed. We conclude the dissertation in Chapter 6. 9 Chapter 2 Optimal Experimental Design 2.1 Introduction Instatistics,optimalexperimentaldesignconcernstheselectionofobservationlocationsso thattheerrorsassociatedwithestimatingthescalarfieldisminimized. Whenparametric models exist, the classical optimal design approach can be used. If a linear model can be assumed, for example, the general idea is to maximize the information matrix in some sense. Several functions can be used as the criterion to determine which matrix is maximal. One popular choice, D-optimality, selects observation locations to maximize the determinant of the information matrix. V-algorithm can be used to determine the optimal design [52]. If a non-linear model is assumed, a sequential design approach is often used. In this case, sampling locations are chosen one by one [52]. Every time a sample taken, the information matrix is updated and the new sample is determined by maximizing the current information matrix. Unfortunately, in the cases of scalar field estimation, normally we do not have a model for the underlying phenomenon. In some cases, it is even the task of the sensor 10 network to provide enough data so that a model can be constructed. Therefore, in this dissertation, we focus on the problem with no predetermined model and hence non- parametric regressions are used for estimation. Following similar ideas with optimal design with a parametric model, we can derive optimal design strategies for the non- parametric regression. Non-parametric regression is a form of regression analysis in which the predictor does not take a predetermined form. Instead, it assumes that the regression curve/surface belongs to some infinite dimensional collection of functions. The regression model is con- structed from the information derived from the data and then the regression is carried to estimate the curve/surface. Non-parametric regression techniques include local poly- nomial regression, smoothing splines, kriging and wavelets. In this dissertation, we are going to develop the optimal design based on the former three techniques. 2.2 Optimal Design with Local Linear Regression 2.2.1 Local Linear Regression The kernel estimator is one of the most popular non-parametric methods, because it is easy to understand, analyze and implement on a computer. In this dissertation, we focus on local linear regression, an extension of the simple kernel estimators. The asymptotic properties of local linear regression have been investigated in Statistic literature [43, 44, 24, 25, 51]. The main take-away message from these studies is that the mean square error (MSE) of the estimator is related to the second derivatives of the phenomenon investigated. 11 Ifm(x)isthefunctiontobeestimated(reconstructed),weassumethefollowingmodel y i =m(x i )+σ(x i )ǫ i , (2.1) wherex i areddimensionalcoordinates,y i aresensorreadings,whicharescalarvalues. m(x) is the underlying scalar field to be estimated. σ 2 (x i )=Var(Y|X=x i ) is finite. ǫ i are mutually independent and identically distributed random variables with zero mean and unit variance and are assumed to be independent of x i . local linear regression is a process that estimates the scalar field at each location x by using weighted linear regression. It is assumed that the closer two locations in the scalarfield, the higher the correlationbetween the valuesatthe two locations. Therefore, the closer a data point to the location x, the higher the weight it should be assigned. Normally, the weight is defined by a non-negative function, which is known as kernel. A kernelessentiallydefinesaneighborhoodaroundx. Thesampleswithintheneighborhood would have effect on the estimate of the value at x while the one outside have little or no effect at all. Then the local linear regression estimator for each point x is the linear estimator that minimizes n X i=1 {y i −α−β T ·(x i −x)} 2 ·K H (x i −x), (2.2) whereK H (u)=|H| −1/2 K(H −1/2 u). K(u)isaddimensionalkernelwith R K(u)du= 1. H is a d×d symmetric positive definite matrix. It is used to scale the kernel K(u) and hence modify the size and shape of the neighborhood. Therefore, H 1/2 is called the 12 bandwidth matrix. Solving the previous optimization problem, we have the following estimator: ˆ m(x,H)=(1,0,··· ,0)·(X T x W x X x ) −1 X T x W x Y, (2.3) where X x = 1 (X 1 −x) T ··· ··· 1 (X n −x) T , (2.4) Y =(Y 1 ,··· ,Y n ) T , (2.5) W x =diag{ K H (X 1 −x) ··· K H (X n −x) }. (2.6) If the kernel satisfies following conditions μ 2 (K)I= Z uu T K(u)du (2.7) Z u l 1 1 ···u l d d K(u)du=0 (2.8) for all non-negative integers l 1 ,··· ,l d of which the sum is odd, it has been proved that the estimation error associated with local linear regression at location x is given by the following equation [51]: MSE{ˆ m(x;H)} = R(K)σ 2 (x) n|H| 1/2 f(x) + 1 4 μ 2 2 (K)tr 2 {HH m (x)} +o p {n −1 |H| −1/2 +tr 2 (H)}. (2.9) 13 where n is the number of samples, R(k) = R K(u) 2 du, f(x) is the density function with R f(x)dx = 1, and H m (x) is the Hessian matrix of m(x). There are two main terms in Equation (2.9). The first term is the variance due to the noise from the sensors and the second term is the bias induced by using the linear model to approximate a general function. Normally, the number of samples is assumed to be constant since it is limited by the budget, such as energy available. As a result, the variance can only be reduced by increasing the bandwidth or the local sample density. On the other hand, the bias can only be reduced by reducing the bandwidth. Therefore, there must exist an optimal bandwidth minimizing the local mean square error and optimal density minimizing the integrated mean square error, which is discussed in the next subsection. 2.2.2 Optimal Design for Linear Local Regression In statistics, the performance of an estimator is defined by IMSE. In Equation (2.9), when n|H| 1/2 is big enough and H is small enough, the term o p {n −1 |H| −1/2 +tr 2 (H)} is negligible and the IMSE can be approximated as following J = Z µ R(K)σ 2 (x) n|H| 1/2 f(x) + 1 4 μ 2 2 (K)tr 2 {HH m (x)} ¶ dx (2.10) We further assume that H =h 2 I and the Equation (2.10) is rewritten as J = Z µ R(K)σ 2 (x) nh d f(x) + 1 4 μ 2 2 (K)h 4 tr 2 {H m (x)} ¶ dx. (2.11) If the Hessian matrix of m(x) is assumed to be known, we can determine the optimal bandwidth and the optimal density function by minimizing the IMSE with the constraint 14 R f(x)dx=1. ByapplyingtheLagrange-Eulerdifferentialequation, wehavetheoptimal bandwidth and density function as follows. f ∗ (x)∝tr 2d d+8 {H m (x)}σ 8 d+8 (x), (2.12) h ∗ = µ dR(K)σ 2 (x) nf ∗ (x)μ 2 2 (K)tr 2 {H m (x)} ¶ 1 d+4 , (2.13) and the the IMSE is given by following equation IMSE=C Z µ tr d {H m (x)}σ 4 (x) n 2 f ∗ 2 (x) ¶ 2 d+4 dx (2.14) where C is a constant and f ∗ (x) is called the optimal design. When the cost of moving from one sampling location to another is small compared to the cost of taking sensor readings, we can use f ∗ (x) to generate the set of locations where more samples should be taken. Assume that there are a small number of initial sensor readings available, the Hessian matrix can be estimated by using local polynomial regression, which is similar to locallinearregression. Insteadofapplyingalinearmodeltothesamples,localpolynomial regression uses a higher order polynomial model. Figure 2.1 compares the performance of the optimal design and random design in simulations. In the case of optimal design, we assume that initially there are 20% of the total n readings evenly distributed across the sensing field. From those 20% readings, the Hessian matrix was estimated by using 2nd order local polynomial regression and the optimal density f ∗ (x) was computed by using Equation (2.12). The locations of the other80%samplesaregeneratedaccordingtotheoptimaldensityandadditionalreadings 15 100 200 300 400 500 600 700 800 900 1000 1100 1 1.5 2 2.5 3 3.5 4 4.5 The number of samples IMSE Random Design Optimal Design Figure 2.1: Comparison between optimal design and random design with local linear Regress. No energy consumption model is considered here. are taken. Finally the scalar field is estimated by using local linear regression with the optimal bandwidth computed from Equation (2.13). In the case of random sampling, a set of n sampling locations is generated with uniform distribution. The underlying scalar field is also estimated by local linear regression with the optimal bandwidth and an estimate of sample density. The simulations are repeated 25 times for n = 200, 400, 600, 800, 1000 and the averages are plotted in Figure 2.1 with standard deviation. As canbeseenfromthefigure, optimaldesignsoutperformtherandomdesigninmostcases. When the number of samples is greater than 600, the estimation errors associated with optimal designs are 20% less than random design. Since for 2-dimensional estimators the optimal decrease rate of the estimation error is n −2/3 , 20% improvement is significant. 16 2.3 Optimal Design with Thin Plate Splines 2.3.1 Thin Plate Splines Smoothing splines are another family of non-parametric regression techniques. The con- cept of smoothing splines can be traced back as early as in 1923, when Whittaker [64] applied the idea to the problem of graduation of data. The idea of smoothing splines was revisited in 1960’s and since then it has become one of the most important methodolo- gies for non-parametric regression. In the case of two or higher dimensional regression, there are two spline approaches. One is a tensor spline, which is based on the additive model [59, 29]. The other is the thin plate spline [47]. Generally speaking, the tensor spline is more useful in cases where each dimension has its own interpretable meaning while the thin plate spline is more appropriate for the spatial regression. In the cases where the response can be approximated by the sum of certain functions of each element of the predictor vector or the tensor products of those functions, tensor spline is very useful. Tensor splines cannot only predict the response given the predictor variable, but also display how each element affects the response. In most cases of spatial regression, the elements of the predictor variable, i.e., coordinate, affect the response together and an additive model is not appropriate. In this dissertation, we focus on thin plate spline since the scalar fields to be estimated are mainly natural phenomena varying spatially. Given a data set {(x 1 ,y 1 ),...,(x n ,y n )}, the d-dimensional thin plate spline is defined as the function ˆ m(x) that minimizes L(ˆ m)= 1 n n X i=1 W i ·(y i − ˆ m(x i )) 2 +λ·J(ˆ m) (2.15) 17 and J(ˆ m) is defined as J(ˆ m)= Z X l! α 1 !·...·α d ! ( ∂ l ˆ f ∂x α 1 1 ·...·∂x α d d ) 2 dx (2.16) where ˆ m(x) is any real function that has l-th order partial derivatives and λ is the smoothing parameter, which controls the smoothness of the function. It has been proved that there exists a function g ∗ ∈ G so that L(g ∗ ) ≤ L(m) for any real function m with l-th order partial derivatives. The set G is defined as follows. G={g :g(x)= t X j=1 φ j β j + n X i=1 ψ i δ i and n X i=1 φ j (x i )δ i =0} (2.17) where φ i is a polynomial basis, which is the (i−1)-th order polynomial, and the ψ i is a radial basis, which is defined as ψ i (x)= kx−x i k 2l−d logkx−x i k d is even; kx−x i k 2l−d d is odd (2.18) Therefore, the previous variational problem has been reduced to an optimization problem over a finite-dimensional space. The closed form solutions for the parameter β and δ are given as follows. δ β γ = K T K +λK K T T T T T K T T T O T O O −1 K T T T O ·Y (2.19) where K i,j =ψ j (x i ) and T i,j =φ j (x i ). 18 Intheprevioussolution,thenumberofknotsisthesameasthenumberofdatapoints and the smoothness is controlled by the λ. The greater λ, the higher the smoothness. Another way to adjust the smoothness is to control the number and deployment of the knots. Actually, in the area where the spatial variation of the function is not high, a few knots would be enough for the approximation and hence the number of knots could be less than the number of data points. In other words, the number and locations of knots can be used to control the smoothness of the spline. More knots would result in more accurate interpolation while less knots would result in higher smoothness. Assuming {k 1 ,k 2 ,...,k n k } are the set of knots, we define the new radial basis as follows. ψ i (x)= kx−k i k 2l−d logkx−k i k d is even; kx−k i k 2l−d d is odd (2.20) where i=1,...,n k . We also define ˜ T i,j =φ j (k i ) (2.21) and ˜ K i,j =ψ j (k i ). (2.22) Since we can use the number and locations of knots to control the smoothness of thin plate spline, here we drop the smoothness penalty L(m). In this case, the original variation problem becomes the problem to find the function to minimize L=( ˜ K T ·δ+T T ·β−Y) T ·W ·( ˜ K T ·δ+T T ·β−Y) (2.23) 19 with the constraint ˜ T T ·δ =0. (2.24) Applying the Euler-Lagrange approach to the this problem, we have the closed form solution for the parameters β and δ as shown in Equation (2.25). δ β γ = ˜ K T ˜ K ˜ K T T ˜ T T T ˜ K ˜ T T ˜ T O ˜ T O O −1 ˜ K T ˜ T T O ·Y (2.25) 2.3.2 Optimal Design with Thin Plate Spline By placing knots adaptively, we can find the structure of the underlying scalar field. In area where spatial variation is high, more knots are to be placed; And in areas where the scalar field does not change much spatially, fewer knots are to be placed. After the numberandlocationofknotsarefixed, theproblemofthinplatesplinesbecomesalinear regression. Therefore, the idea of optimal experimental design with thin plate splines is firsttousetheknotplacementalgorithmtodeterminetheknotsandthenusetheclassical optimal design algorithm to find the optimal locations of extra samples. Knot placement has been studied extensively in the community of statistics. This problemcanbeformulatedasageneraloptimizationproblem. However,asthenumberof knotsincreases, theassociatedcomputationincreasesrapidly. Onewaytoavoidthesteep computationalcostistouseastepwiseadditionand/ordeletionasanapproximation[27]. This algorithm starts with a minimum number of knots. Then the algorithm searches the data points to find the place to put one more knot. The place is chosen in such a way 20 that the decrease of the estimation error is positive and maximized. Repeat this process until the estimation error increase as the number of knots increases. Finally, all knots are reexamined to find out if the knot can be deleted. We refer to the estimator error after removingaknot as theestimation errorassociatedwith it inthe processofdeletion. If the deletion of one knot decreases the estimation error, it can be removed from the knot set. The knot associated with the smallest estimation error is deleted. The process of deletion is repeated until the deletion of any knot would cause the estimation error increase. In practice, no ground truth is available and hence the estimation error cannot be computed. Fortunately, the estimation error can be estimated by using cross validation (CV), which is defined in Equation (2.26). CV = 1 n n X i=1 (y i − ˆ y −i ) 2 (2.26) where ˆ y −i is the estimate by using all the data points except the i-th data point (x i ,y i ). Tocomputecrossvalidation, weessentiallyfitthinplatesplinestoallsubsetsofsizen−1 of the data and have to solve Equation (2.25) for n times, which involves the inversion of a big matrix. As a result, cross validation is computational expensive. General cross validation(GCV)isthenproposedtoapproximatecrossvalidationandittakesmuchless computation since we only need to fit thin plate splines to the whole data set once. GCV is defined as Equation (2.27). GCV = 1 n n X i=1 (y i − ˆ y i ) 2 /(1− tr{A} n ) 2 . (2.27) 21 where ˆ y i is the estimate on y at the location x i and A is the linear transformation from y to ˆ y. A is defined as ˆ y =Ay, where y =( y 1 ... y n ) T , and ˆ y =( ˆ y 1 ... ˆ y n ) T . Intheclassicaloptimaldesignforthelinearmodel,thegoalistominimizethevariance of the final estimate, which is equivalent to maximizing some function of the information matrix. Since matrices are partially ordered, some criterion is necessary for comparing two matrices. One popular criterion is Φ-optimality, which maximizes the determinant of the information matrix. In the simple case where there are only a few bases, there exist closed form solutions. In the complex cases, an algorithm named as V-algorithm can be used to find the optimal design. Please refer [52] for details of the V-algorithm. However, to apply the V-algorithm, the information matrix should take the form M = n X i=1 B 1 (x i ) B 2 (x i ) ... B n B (x i ) T · µ B 1 (x i ) B 2 (x i ) ... B n B (x i ) ¶ (2.28) where ( B 1 (x) B 2 (x) ... B n B (x) ) is the set of basis functions. The information ma- trix associated thin plate splines as shown in Equation (2.25) does not take this form, because the bases in the Equation (2.25) are not orthogonal to each other. We can use the Q-R decomposition to orthogonalize the system. Let’s suppose ˜ T =Q·R is the QR decomposition of the polynomial bases, and Q= µ F 1 F 2 ¶ , (2.29) 22 where F 1 is the matrix consisting of the column vectors corresponding to ˜ T and F 2 consists of the rest orthogonal column vectors. We redefine the smoother with new parameters ω =[ ω T 1 ω T 2 ] T as follows ˆ y = µ T K·F 2 ¶ ω 1 ω 2 . (2.30) The new parameter ω is related to the previous parameter β and δ as shown in following equation β =ω 1 ; δ =F 2 ·ω 2 . Now the parameter can be estimated with following equation ω = F T 2 ˜ K T ˜ KF 2 T T ˜ KF 2 F T 2 ˜ K T T T T T −1 · F T 2 · ˜ K T T T ·Y. (2.31) The information matrix becomes M = F T 2 ˜ K T ˜ KF 2 T T ˜ KF 2 F T 2 ˜ K T T T T T = F T 2 ˜ K T T T · µ F 2 ˜ K T ¶ . (2.32) Notes that B = ( F 2 ˜ K T ) T is the new basis and now the information matrix has the same format as in Equation (2.28). When the number and location of knots are 23 0 50 100 150 200 1 1.5 2 2.5 3 3.5 The Number of New Samples Sum of Mean Square Error random design optimal design Figure 2.2: Comparison between optimal design and random design with thin plate splines. No energy consumption model is considered here. determined, a thin plate spline estimator can be viewed as a linear regression and hence theV-algorithmcanbeappliedtodeterminetheoptimallocationsforadditionalsamples. Figure 2.2 shows the results of simulations comparing the optimal design and the random design for thin plate splines. The scalar field to be estimated is given by y =exp(− (x−x 0 ) T (x−x 0 ) 0.08 )+σ·ǫ, where x 0 = (0.45,0.45) T , ǫ is a random variable with (0,1) normal distribution and σ is the noise level, which is chosen to be 0.1. The sensing field is the square defined by 0≤ x 1 ≤ 1 and 0≤ x 2 ≤ 1. In the simulations, two designs are studied: optimal design and random design. In the former case, we assume that 121 readings are initially taken and used to determine the number and locations of the knots. When the set of knots is fixed, V-algorithm is used to determine the locations of the new samples. In the latter 24 case, the locations for new samples are randomly generated. For each design, simulations are carried for 100 trials with 40, 60, 80, 100, 120, 140, 160, 180 and 200 new samples. The estimation errors are the sum of error on all points of a 101×101 grid of the sensing field. It is obvious in the Figure 2.2 that the performance of the optimal design is able to achieve lower estimation error given the same number of data points. On average, with the same number of samples, optimal design reduces the estimation error by about 15%. 2.4 Optimal Design with Kriging Kriging is a family of linear estimators, which are used to estimate a spatially varying phenomenon with limited number of samples. Kriging is first developed by Krige [37] and used for gold mining valuation problem in 1950’s. However, it did not catch much attention until the work of Matheron in 1960’s [40]. Later on, kriging is so popular that a new subject named geostatistics was formed based on it. Luenberger also independently developedavectorialapproachforoptimizationwhenhewasworkingontheoptimization problem and kriging is one application of this approach [39]. There are three variations of kriging: simple kriging, ordinary kriging and universal kriging. All kriging techniques assume that the sampling is a partial realization of a random function y(x), where x is the location. Simple kriging assumes the function y(x) is second order stationary, which is defined as follows. E{y(x)}=e; (2.33) E(y(x−m)(y(x+h)−m)=Cov(h), (2.34) 25 where E(·) denotes expectation, e is a constant and h is a vector distance. Second order stationary means that the correlation between two random variables at different locations only depends on the distance between the locations but not on the locations themselves. Simple kriging also assumes that the constant e is known. Ordinary kriging is a generalization of simple kriging. It still assumes that y(x) is second order stationary, but drops the assumption that the constant e is known. For many cases, to assume that the underlying phenomenon has a expectation that is constant across the whole sensing field is still not reasonable. So universal kriging is proposed as a generalization to ordinary kriging. Since simple kriging and ordinary kriging are only special cases of universal kriging, we are going to focus on universal kriging in this dissertation. Instead of constant expectation, universal kriging assumes that the expectation of the random variable can be modeled as linear combination of analytical functions p i (x), e.g., polynomials. e(x)= n X i=0 a i p i (x) (2.35) where a i are parameters to be determined. The assumption shown in Equation (2.34) is kept and is rewritten as in Equation (2.36). E(y(x)−e(x))(y(x+h)−e(x+h))=Cov(h) (2.36) If{(x 1 ,y 1 ),...,(x n ,y n )} is the set of samples andx 0 is the location at which we would like to estimate the value y(x 0 ), universal kriging is defined as unbiased linear estimator as shown in Equation (2.37). ˆ y(x 0 )=Λ T ·Y (2.37) 26 where Λ=(λ 1 ,λ 2 ,...,λ n ) T is the vector of parameters andY =(y 1 ,y 2 ,...,y n ) T . It can be proved that as an unbiased estimator, universal kriging must satisfy following condition. Λ T ·P =P 0 , (2.38) where P 0 and P are defined as follows. P 0 =(p 0 (x 0 ),p 1 (x 0 ),...,p k (x 0 )) T ; P = p 0 (x 1 ) p 1 (x 1 ) ... p k (x 1 ) p 0 (x 2 ) p 1 (x 2 ) ... p k (x 2 ) ... ... ... ... p 0 (x n ) p 1 (x n ) ... p k (x n ) . It can also be proved that the estimation variance at location x 0 can be computed by following equation. σ 2 (x 0 )=2Λ T Γ 0 −Λ T ΓΛ, (2.39) where Γ 0 and Γ are defined as follows. Γ 0 =(γ(x 1 −x 0 ),...,γ(x n −x 0 )) T ; Γ= γ(0) γ(x 2 −x 1 ) ... γ(x n −x 1 ) γ(x 1 −x 2 ) γ(0) ... γ(x n −x 2 ) ... ... ... ... γ(x 1 −x n ) γ(x 2 −x n ) ... γ(0) . 27 Here γ(h)=1/2cov(h) is semivariogram, which is to be estimated by using the data set. Byminimizingthetheestimationvariancewiththeunbiasedconstraintofequation 2.38, we can find a closed form solution to the parameters λ i . Λ μ = Γ P P T 0 −1 Γ 0 P 0 (2.40) where μ = (μ 1 ,...,μ k ) T is the vector of Lagrange multipliers. Replace the Λ with the values from equation above, we can compute the estimated value at location x 0 . The variance of estimation can also be computed as follows. σ 2 (x 0 )=( Λ T μ T )· Γ 0 P 0 =( Γ T 0 P T 0 ) Γ P P T 0 −1 Γ 0 P 0 (2.41) Universal kriging is closely related to thin plate splines [47]. Actually, an universal kriging estimator can be viewed as a thin plate spline whose radial bases are specially defined. On the other hand, a spline can be identified as an universal kriging estimator and the covariance matrix can be derived from the radial bases of the thin plate spline. The optimal design with universal kriging is defined as the design that minimizes the estimation variance. Similar to classical optimal design, the estimation variance can be minimized by maximizing the following matrix. M = Γ P P T 0 28 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 x1 x2 (a) 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 x1 x2 Estimation Variance (b) Figure 2.3: (a) The design that minimizes the estimation variance of universal kriging within the area 0 ≤ x 1 ≤ 1 and 0 ≤ x 2 ≤ 1. (b) The estimation variance corresponding to the design. Since the matrixM does not take the form as shown in Equation (2.28), the V-algorithm cannot be applied. However, a greedy algorithm can be used. This algorithm adds the sampling locations one by one and each time it chooses the best possible sampling location. Figure 2.3 shows the design for 25 samples and the corresponding estimation variance. The semivariogram is assumed as γ(h)=1−exp(−3h)+0.1, and is independent of the location. It is not surprising that the design derived from the algorithmisapproximatelyaspacefillingdesign[3,45]. Becauseoftheassumptiononthe semivariogram, the farther apart two locations are from each other, the less correlation between corresponding values. As a result, the distance from x 0 , where the value to be predicted, to the closest sample determines the estimation variance. To minimize the 29 covariance, the distance to the closest sample must be minimized for any x 0 and hence the samples should be evenly distributed across the sensing field. 2.5 Conclusion In this chapter, we have discussed optimal designs with non-parametric regression. For each of the three non-parametric regression techniques, local linear regression, thin plate spline and universal kriging, and a design strategy is proposed. The main advantage of local linear regression is that it can be implemented in a distributed way as can the corresponding optimal design. As a result, optimal design for local linear regression is appropriate for the case where large number of mobile nodes involved and distributed algorithms are preferred. In Chapter 4, the optimal design for local linear regression is combined with a distributed deployment algorithm to form a distributed adaptive sampling algorithm. Thin plate splines can provide a compact representation for the scalar field with a few knots and polynomial functions. Therefore, thin plate splines are preferred in the cases where the estimate has to be updated frequently, and hence large amounts of data have to be generated or transmitted while the storage or bandwidth is limited. Kriging is based on the estimate on the correlation between the values at different locations. Generally speaking, the change of the correlation would not change as often as the scalar field itself. So, if the correct correlation has been estimated, the optimal design for universal kriging might result in a more steady design than the other two techniques. 30 Forallthesedesignstrategies,weassumethatthecosttotakesamplesareonlyrelated to the number of samples. This assumption is reasonable in the applications where the samples are taken manually or the cost to take samples are very high, such as taking rock samples. However, when a robotic sensor network is used to measure values such as temperature, PH or dissolved oxygen, the cost to take sensor readings is negligible but the cost to move the sensor to the desired locations might be high. Therefore, the cost associated with sampling is related not only to the number of samples, but also to the locations of the samples. The strategies to take samples in those cases are presented in the following chapters. 31 Chapter 3 Adaptive Sampling with a Single Mobile Node 3.1 Introduction In the previous chapter, we have discussed optimal design strategies for three different non-parametric regression methods. Those strategies provide a way to evaluate the ben- efit associated with each potential sampling location. In the case where a robotic sensor network is used for scalar field estimation, the mobile nodes are normally used to take new samples. Since the mobile nodes are powered with batteries, which can only provide limited energy for the nodes, the energy consumed for moving a mobile node from one location to another must be taken into account when determining the best path for sam- pling. In this case, adaptive sampling becomes a path planning problem that maximizes the benefits with the constraint on the cost. In this chapter, we discuss several algorithms to find a nearly optimal path for a single mobile node. The cases with multiple mobile nodes are discussed in the following chapters. As in Chapter 2, we still assume that the static nodes are deployed uniformly across the sensing field and provide the initial estimate of the scalar field. The proposed 32 algorithm has been tested with simulations on real data as well as on a testbed in the field. A penalty associated with the path is considered in various incarnations of the robot mapping/exploration problems, which are well studied in robotics literature [50, 53, 42, 66]. Roy [50] proposed an Entropy based approach for path planning so that better localization could be achieved. Sim [53] proposed an A-optimality criterion for SLAM andabreadthfirstsearchalgorithmtofindtheapproximateglobaloptimalpath. Mei[42] proposed an orientation based target selection strategy, where the energy cost for turning isincorporatedintheoptimalcriterion. Whiletherearesimilaritiesbetweentheadaptive samplingproblemstudiedhereandmapping/explorationproblems,itisimportanttonote that there are some differences. In robot mapping/exploration the criterion of optimality is defined as the decrease in the uncertainty of the map, and it is normally assumed that the estimation bias is zero. In adaptive sampling for a scalar field, interpolation is necessary to estimate the value where no direct sensing information is available and hence bias is inevitable. Therefore, a new criterion is needed to define the optimality of the path. Similar path planning problems have also been studied in the theoretical computer science and operation research communities as the Orienteering Problem. the Orienteer- ing Problem is a variant of the Traveling Salesman Problem and defined as maximizing the number of nodes visited while keeping the length of the path no more than a given budget. TheOrienteeringProblemisNP-completeand severalapproximation algorithms have been developed. Arkin et al. [1] proposed the first approximation algorithm for the Orienteering Problem, which provided an approximation factor of 2 for the case where 33 all the points to be visited are in a plane. It is also assumed that there exists an edge between any two points and the cost for traveling along an edge is the Euclidean dis- tance between its two end points. Chen et al. [16] proposed another algorithm for the Orienteering Problem in the plane, which achieved an approximation factor of 1/(1−ǫ), where ǫ is any small positive value. A constant approximation factor algorithm for the general Orienteering Problem has been proposed by Blum et al. [8] and the approxima- tion factor is 4. Bansal et al. [5] improved the algorithm and reduced the approximation factor to 3. Chekuri et al. improved the approximation further to 2 +ǫ [15]. In our case, the problem can not be modeled as an Orienteering Problem in the plane since the obstacles in the sensing field may prevent the mobile robot from moving straight between two locations, and hence the cost for traveling between any two locations can be greater than the Euclidean distance between them. As a result only the algorithms of [8, 5, 15] can be used. We implement the Blum et al. algorithm as a subroutine of the adaptive sampling algorithm since it is faster than the other two algorithms, although the Blum et al. algorithm is also computational expensive. Details are discussed in Section 3.4. 3.2 Objective Function for Path Planning When samples are to be taken by a mobile robot, optimal design along is not enough becausenowtheconstraintisnotonlythenumberofsamplestobetakenbutthedistance that the mobile robot can travel. The distance in turn depends on the energy available to the mobile robot. In this case, the costs associated with the samples are not the same. Some samples might take more energy while others might take much less depending on 34 how far these sampling locations are from the start location of the robot and which path the robot takes to reach these locations. Before the path planning algorithm can be used to find the best path, we need first to define the criterion for the best path. Here, we define the path as a sequence of states (x 1 ,x 2 ,...,x np ) and at each state the mobile node is going to take new sensor readings. Here we assume that the cost for taking a reading is negligible compared to the cost for moving the mobile node. Since it won’t increase the cost to take a sensor reading along thepath,themobilenodealwaystakesnewreadingsalongthepath. Thegoalofadaptive sampling is to find the best set of sampling locations so that the estimation errors can be minimized. Similar to the information gain defined in the robot exploration literature, we define the gain for a location and for a path as follows. Assuming there are initially n readings from static sensor nodes (x 1 ,y 1 ),...,(x n ,y n ), the gain associated with a new sample x is defined as G(x)=Err(x 1 ,x 1 ,...,x n )−Err(x 1 ,x 1 ,...,x n ,x). (3.1) and the gain associated with a path P =(x n+1 ,x n+2 ,...,x n+n P ) is defined as G(P)=Err(x 1 ,x 1 ,...,x n )−Err(x 1 ,x 1 ,...,x n ,x n+1 ,...,x n+n P ), (3.2) whereErr(·)istheestimationerrorwithacertaindesignandn p isthenumberoflocations on path P. With different regression technology, the estimation errors are different. 35 For local linear regression, IMSE is used as the reconstruction error, which can be estimated as follows. Note that IMSE contains both estimation variance and bias. IMSE(X 1 ,...,X n0+n )∝ Z ( tr d {H m (x)}v 2 (x) n 2ˆ f 2 (x) ) 2 d+4 dx, (3.3) where ˆ f(x)=(n+n 0 ) −1 P n+n 0 i=1 K H (X i −x) is the estimate on the sample density. If a thin plate spline is used for estimation, we assume the the structure of the spline has been captured by a knot selection algorithm with the initial data, and hence the estimation error can be estimated by using the the variance of the estimation as shown in Equation (3.4). Err(x 1 ,...,x n )= Z η(x) T F T 2 ˜ K T ˜ KF 2 ˜ T T ˜ KF 2 F T 2 ˜ K T ˜ T ˜ T T ˜ T −1 η(x)σ 2 dx, (3.4) where η(x) = (ψ 1 (x),...,ψ n k (x),φ 1 (x),...,φ i (x)) T , σ 2 is the variance of the sensor readings and n k is the number of knots. ˜ K, ˜ T and F 2 are defined as in Equation 2.22 2.21 and 2.29 respectively and they all depend on where the samples were taken. Similarly, the estimation error of Kriging can also be approximated by using the estimation variance as shown in Equation (3.5). Err(x 1 ,...,x n )= Z ( Γ T 0 F T 0 ) Γ F F T 0 −1 Γ 0 F 0 dx, (3.5) where Γ, F, Γ 0 and F 0 are defined in Section 2.4. 36 Since we assume that the scalar field does not change until the mobile node finish taking all the extra samples, the gain associated with each location does not change during the time the path is planned. Note that the gain of the entire path is less than the sum of the gains of all points on the path, because the gain at each location is not independent of the gain at other locations. Equation (2.14) shows that MSE at location x is proportional to f − 4 d+4 (x), wheref(x)islocalsampledensity. Thatis,thereconstructionerroratxdecreasesaslocal sample density increases and the decreasing rate also decreases. When one extra sample taken at locationx, the local density atx increases and density at any locationx ′ within x’s neighborhood also increases. As a result, the decreasing rate of the reconstruction error atx ′ decreases, which is actually the gain associated withx ′ . When another sample taken at location x ′ after one sample already taken at x, the reconstruction error would not decrease as much as when no extra sample taken at x. Therefore, if x and x ′ form a path, the gain of the path is less than the sum of the gains associated with individual locations. This also applies to paths with more than two locations. However, if the gain of each path is computed separately, the problem becomes com- putationally intractable. Actually, the dependency of the gains between locations x and x ′ happens only when the two locations are within each others neighborhood, which is determined by the bandwidth. When two locations are not in each others neighborhood, the sample in one location has no effect on the local sample density of the other location and hence would not change the gain of the other location. Even within the neighbor- hood, the dependency between two locations decreases as the distance between x and x ′ 37 Figure 3.1: The robotic boat of the NAMOS project used in experiments regarded in this thesis. because the value of kernel function decreases as the distance increases. Therefore, we approximate the gain of a path by using the sum of the gains of all locations on the path. Because the decreasing rate of MSE decreases as the local density increases, the gain achieved by taking a second sample at a location is much less than the gain associated with the first. Therefore, the gain function is defined so that the mobile robot will not achieveanygainifitvisitsthesamelocationforthesecondtime. Notethat,ifthestateof the mobile robot contains direction, it is possible that two states share the same location but have different directions. In this case, according to the assumption made here, the robot will not get any gain when it visits the same location for the second time even if it is in a different state (i.e. different direction). 3.3 Energy Consumption Model An energy consumption model is used to determine if two states are adjacent to each other and how much energy would be consumed for the robot to transits from one state 38 to another. Two energy consumption models are considered in this chapter. In the first model, the robot has two or more states for each location and in each state, the robot has a different direction. This model is used for the robotic boat working in a small area, where the distance between two adjacent locations is so small that the energy consumed for making turns is significant compared to the cost for traveling from one location to another. In the other model, no direction information is taken into consideration and each state represents one location for the mobile robot. This model is designed for the case where the mobile robot is able to move in any direction or the energy consumed for makingturnsisnegligiblecomparedtocostofmovingthemobilerobotfromonelocation to another. 3.3.1 Model with Direction The model with turn cost included is inspired by the robotic boat developed in the Networked Aquatic Microbial Observing System (NAMOS) project, which consists of 10 static sensor nodes and two robotic boats. Figure 3.1 shows one of the mobile nodes used in the tests reported in this chapter. Unlike wheeled mobile robots, it is very difficult for a boat to make turns in place if it has only one propeller. In this case, the boat has a minimum turning radius, which depends on the speed of the boat. With the above limitations in mind, we propose the energy consumption model for the robot as depicted in Figure 3.2. The position of the boat is represented as (x, y, direction). At each location (x, y), the robot can be in one of the two directions: North- South (NS) or West-East (WE). We assume that the boat is able to move forwards and backwards with the same amount of energy consumed. In reality, this assumption is not 39 Figure 3.2: The trajectories considered for the energy consumption model. See Table 3.1 for energy consumption for each trajectory. always true. The actual energy consumption depends on many factors, e.g. wind and current, and may change over time. As a result, the model proposed here is only an approximation to the real boats and asymmetric models would be discussed later. From each state, there are 9 states the boat can transit to. For example, if current state of the boat is (0, 0, NS), the boat is able to transit to state (1, 0, NS), (1, 1, WE), (0, 1, NS), (-1, 1, WE), (-1, 0, NS), (-1, -1, WE), (0, -1, NS), (1, -1, WE) and (0, 0, WE). We assume that the energy required for the state transition is proportional to the distance the robot travels. The optimal curve connecting two states is the shortest curve with the radius of curvature of any point not less than the minimum turning radius. In thispaper, weuseacirculararctoapproximatetheoptimalcurve. Theenergyconsumed when the robot transits from (0, 0, NS) to other states is listed in Table 3.1, where we assume the energy cost from (0, 0, NS) to (0, 1, NS) is C 0 . 40 Table 3.1: The energy consumption table Destination State Energy consumed (1, 0, NS) 1.6C 0 (1, 1, WE) 1.6C 0 (0, 1, NS) C 0 (-1, 1, WE) 1.6C 0 (-1, 0, NS) 1.6C 0 (-1, -1, WE) 1.6C 0 (0, -1, NS) C 0 (1, -1, WE) 1.6C 0 (0, 0, WE) (1+3/4π)C 0 3.3.2 Model Without Direction The second model is inspired by the Networked Info-Mechanical System (NIMS), which is a sensor-actuator system with a mobile node driven by cables. A NIMS mobile node is able to move in any direction and the energy consumed for movement in any direction is the same. The motion of the mobile node is very accurate compared to the robotic boat since it is driven by cables and the motion system is calibrated for each deployment. The localization error of the node is on the order of centimeters. The particular NIMS system usedtotestouralgorithmwasNIMSRDwithonemobilenodeworkingunderwater. The mobile node is cylinder-shaped and is suspended from above. As a result, the horizontal movement of the node disturbs the water much more than the vertical movement. To minimize the disturbance caused by the mobile node, the motion of the node was con- strained in such a way that the node only moves vertically when it is in water. When it is on the surface it may move horizontally. The state of the NIMS node is defined as (x, y), where x is the horizontal coordinate and y is the vertical coordinate. The state transition is depicted in Figure 3.3. 41 Figure 3.3: The state transition model of NIMS node. In the graph, each vertex represents one state of the node and each edge represents a statetransition. Iftwoverticesarenotadjacenttoeachother,thereisnodirecttransition between the two corresponding states. The energy consumed by each state transition is proportionaltothedistancethenodetravels, asshowninthefollowingequation, wherek isaconstant. Notethatthenodecanonlymoveverticallyorhorizontallyandatransition is only allowed between two states sharing either the same x or the same y coordinates. E = k|x 2 −x 1 | if x 1 6=x 2 and y 1 =y 2 =0; k|y 2 −y 1 | if x 1 =x 2 ; ∞ otherwise. (3.6) 3.4 Path Planning Algorithms Once the gain function is defined and the energy consumption model is determined, we can construct a graph, in which vertices are the states of the mobile node and edges are possible state transitions. Each node has a weight equal to the gain associated with the corresponding state and each edge has a weight equal to the energy consumed for the state transition. Now, the problem can be specified in two ways. In the first case, 42 the problem can be defined as finding the path that maximizes the gain collected while the energy spent, i.e., the length of the path, is not more than a given budget. In the second case, the problem is defined as finding the path that minimizes the energy consumed while collecting gains no less than a given quota. The former is known as the Orienteering Problem [4, 8] or Selective Traveling Salesman Problem, and the latter is known as Quota TSP or K-TSP [4, 2]. 3.4.1 Exhaustive Search with Pruning Heuristics When the problem is small enough, we can use an Exhaustive Search to solve the Orien- teering Problem. In this chapter, an Exhaustive Search is used as the approach for small problems as well as a benchmark to compare other algorithms to be introduced later in this section. We implemented Exhaustive Search with heuristics. The first heuristic is that the optimal path should be no worse than the solution returned from a greedy algorithm. For each path that is not fully expanded, a greedy algorithm can provide a lower bound for the gain the path is able to collect. Another heuristicis to find the upper bound of the gain a path can collect. Suppose E a is the energy currently available and C min is the minimum cost among all edges, we first calculate the upper bound on the number of states the robot can visit n max =E a /C min . Then from the states that the robot can reach with energy E a individually, we pick the n max states with greatest gains and sum the gain of those states. The sum provides an 43 upper bound for the gain the path can achieve. We keep track of the maximum of the lower bound of the gain of all possible paths by using L max . When a path is expanded, we compute its upper bound and lower bound. If its upper bound is less than L max , this path is pruned; Otherwise, the path is retained and L max is updated if the lower bound of this path is greater than L max . This algorithm generates the path collecting the maximum gain with the energy consumption within the bound. 3.4.2 Nearly Exhaustive Search When the scale of the problem increases, Exhaustive Search becomes impractical. In this case, we use another heuristic to improve the efficiency of the Exhaustive Search. Given a path P from s to t, the energy consumed by the path is denoted by E P , and the gain collected by P is denoted by G P . Suppose the initial energy available is E 0 and that two paths P and Q intersect with each other at state v. In this case, state v divides each path into two segments. We use P 1 and P 2 to denote the segment of P before and after v, and Q 1 and Q 2 to denote the corresponding segments of Q. If the following condition is satisfied, it is very likely that there exists a solution outperforming path Q. G P 1 >G Q 1 and E P 1 ≤E Q 1 (3.7) If P collects more gain than Q, it is obvious that Q is suboptimal. On the other hand, if Q is able to collect more gain than P, Q 2 must collect more gain than P 2 . Since P 1 and Q 2 join at state v, we can construct a new path R by connecting P 1 and Q 2 . The new path R would collect more gain than path Q except in the case where Q 2 goes through 44 the states already visited by P 1 . As a result, the gain collected by the new path is not the sum of G P 1 and G Q 1 and hence R might not be better than Q. In our algorithm, we ignore this case and prune Q if the condition specified by Equation (3.7) is satisfied. The details of the algorithm are given in Algorithm 1, where s 0 ,e 0 is the initial state of the mobile node, r 0 , X 0 , Y 0 are the initial sensor readings and corresponding coordinates, and P is the path found by the algorithm. To distinguish this algorithm from the previous algorithm, we name the previous algorithm Exhaustive Search since it generatetheoptimalpathandthisalgorithmNearlyExhaustiveSearchsinceitgeneratea suboptimalpath. Simulationsshowthatthisalgorithmhasgoodperformanceinpractice. 3.4.3 The Blum et al. Algorithm The Orienteering Problem and the Quota TSP have been studied in the theoretical com- puterscienceandoptimizationcommunities. Severalapproximationalgorithmshavebeen developedforthoseproblems. ThebestapproximationfactorforthegeneralOrienteering Problemis2+ǫ,whichisachievedbythealgorithmdevelopedbyChekurietal.[15]. How- ever, this algorithm is computationally expensive and too slow to solve the path planning algorithm in our case. Therefore, we implemented the algorithm proposed by Blum et al.[8], whichisasubroutineofthealgorithmof[15]andabletoachieveanapproximation factor of 4. Blum et al. first proposed an approximation algorithm for the Min-Excess problem. The excess of path P is e P = d P v,t −d v,t , where d P s,t is the distance between v and t along the path P while d s,t is the distance between v and t in the graph. The Min- Excess problem is to find a path collecting a given amount of gain with minimum excess. The algorithm is based on the observation that the optimal solution to the Min-Excess 45 Algorithm 1: The Nearly Exhaustive Search Algorithm for Path Planning input : s 0 , e 0 , X 0 , Y 0 , r 0 output: P Initialize all G(s, e) = 0; Q = MakeQueue(); enqueue(Q, s 0 , e 0 ); while Q is not empty do (s, e) = dequeue(Q); if e<C 0 then continue; end for each vertex s ′ adjacent to s do gain = ComputeGain(s ′ , X 0 , Y 0 , r 0 ); e ′ = e - cost(s, s ′ ); prev((e ′ ,s ′ )) = (s, e); if s ′ ∈ancestors(s) then G(s ′ , e ′ ) = G(s, e); else G(s ′ , e ′ ) = G(s, e) + gain; end bPrune = false; for all (s ′ , e ′′ ) do if G(s ′ ,e ′′ )>G(s ′ ,e ′ ) AND e ′′ >e ′ then bPrune = true; break; end end if prune then delete (s ′ , e ′ ); else enqueue(s ′ , e ′ , Q); end end end (s,e) = argmax (s,e) G(s,e); while (s,e) is not (s 0 ,e 0 ) do push s into P; (s, e) = prev(s, e); end return P; 46 problem can be partitioned into a sequence of segments and each segment is the optimal path of an interval. Therefore, the Blum et al. algorithm first computes the approximate optimal segment corresponding to each possible interval and each possible gain by using analgorithmforK-PathdevelopedbyChaudhurietal.[13]. Thenadynamicprograming is used to glue those segments together. The intervals are defined by four states v, t, c and f, where v and t are the start and end state of the segment respectively, andc and f are the closest and farthest states from start states. It has been proved that the path re- turned by the Blum et al. algorithm has excess at mostα EP =2.5+ǫ times the excess of the optimal path. The solution to the Orienteering Problem can be found by computing the good solutions to the Min-Excess problems for all the possible gains and end states and picking the path whose cost is less than the given budget while the gain collected is the maximum. The approximation factor is ⌈1+α EP ⌉ = 4, which means that the path returned by the Blum et al. algorithm has the cost no more than 4 times of the cost of an optimal path. Since the Blum et al. algorithm has to enumerate all the possible segments for any v, t, c, f and any possible gain g, it has to call the K-Path algorithm O(n 4 l) times and take O(n 4 l) space, where n is the number of states and l is the total gain available in the graph. As a result, it takes a lot of resources to run the Blum et al. algorithm. In practice this algorithm can only be used to solve small problems. 3.4.4 The K-Path Algorithm Our problem can also be viewed as an instance of the Quota TSP problem: fix the gain to be collected and minimize the cost of the path. Viewed in this way, the path planning problem can be solved by using the Chaudhuri et al. K-Path algorithm. K-Path problem 47 is to find a path with minimum cost while visiting at least k vertices. In the Chaudhuri et al. algorithm, a primal-dual schema is used to solve a Lagrangian relaxation of the K-Path problem and a tree including start state s and end state t collecting at least a gain of k is returned. Once a tree with gain of k is returned, it has been proved that the cost of the tree is not more than the cost of the optimal path from s to t collecting the same amount of gain. However, this algorithm may not return the tree for certain k. For those k, an interpolation algorithm can be used to construct a tree by using two trees generated by the Chaudhuri et al. algorithm, one with gain greater than k and the other with gain less than k. For example, by using the interpolation techniques by Arora and Karakostas [2], one can obtain an s-t path for any value of k, while increasing the cost of the tree to 1+δ times of an optimal path, where δ is any small positive number. By doubling all the edges except those on the shortest path from s to t in the tree, we construct a path from s to t with cost no more than 2+2δ times the cost of the optimal path. When applying the Chaudhuri et al. K-Path algorithm to the energy consumption model with directions, one issue has to be addressed. In the case where the direction of the mobile robot is taken into consideration, there are multiple states corresponding to the same location and hence they share the same gain. Once one state is visited, the gain associated with the location is taken. Even if the robot visits a different state but in the same location, no gain can be collected. This is different from the definition of the K-Path problem, where each state has its own gain. Therefore, we build an auxiliary graph from the original graph. First, we introduce an auxiliary state for each location. The auxiliary state has the gain associated with the location and the other states sharing 48 the same location have 0 gain. The new state is connected to the states sharing the same locations with auxiliary edges and the cost of the edge is defined to be the same as the energy cost for making a 90 degree rotation in place. According to the model in the previous section, it takes the boat (1+3π/4)C 0 energy to make such turn. As a result, the cost of an auxiliary edge is C a = (1+3π/4)C 0 . C 0 is the cost for the mobile robot to move one step forward and it is also the minimum edge cost. Now, we can apply the Chaudhuri et al. algorithm to the auxiliary graph and denote the path returned by T ′ . The tree corresponding to the original graph is constructed from T ′ by removing the auxiliary states and edges from path T ′ and we denote it as T. It is possible that for some auxiliary state a, both of the two auxiliary edges would be included in the tree T ′ . As a result, removing those auxiliary edges could break the tree into two disconnected parts. This can be avoided by replacing one of the auxiliary edges with the edge between the two real states without increasing the cost of the tree, since that edge has the same cost as the auxiliary edge. By doubling the edges not on the path from s to t in the tree T, a path can be constructed for the original graph. A path for the auxiliary graph can also be constructed in a similar way. We denote the former by P and the latter by P ′ . If C P ′ and C P are the costs of corresponding trees in the auxiliary graph and the original graph and C a is the cost of the auxiliary edge, we have C P ′ = C P + 2mC a , where m is the number of auxiliary states visited by tree T ′ . If C P∗ is the cost of the optimal solution for the original graph, we can construct a path P ∗ a corresponding to P ∗ in the auxiliary graph by adding one auxiliary state and one auxiliary edge for each real state. The cost of P ∗ a is C P∗ +2m P∗ C a . IfC P ′ ∗ is the cost of the optimal solution for the auxiliarygraph,C P ′ ∗ ≤C P∗ +2m P∗ C a sinceP ∗ a isoneofthepathsintheauxiliarygraph. 49 If C P is the path returned by the Chaudhuri et al. algorithm andα is the approximation factor, we have C P <C P +2m p C a ≤αC P ′ ∗ ≤α(C P∗ +2m P∗ C a ). Since m p∗ is the number of states visited by path P∗, it is bounded above by C p∗ /C 0 . Therefore, C p <α · C p∗ +2 C p∗ C 0 (1+3π/4)C 0 ¸ =α(3+3π/2)C P∗ . For the gains associated with the paths generated by the Chaudhuri et al. algorithm without interpolation, the approximation factor α is 2 and hence C p <(6+3π)C P∗ =15.42C P∗ . Therefore, for the energy consumption model with directions, we are able to achieve an approximation factor of 15.42 for those gains. For other gains, the approximation factor varies depending on the interpolation techniques used. If the interpolation algorithm proposed by Arora and Karakostas is used, the approximation factor is 15.42+7.71δ. 3.4.5 Performance Comparison We have implemented the four algorithms described above and compared their perfor- mance. Wetestedthesealgorithmsonproblemsofdifferentsize. Thetestsarerunonthe data set collected in Lake Fulmor, which is described in the next section. We discretized the sensing field with a resolution of 20, 15 and 10 meters, and obtained graphs of 15, 50 0 50 100 150 200 250 300 0 50 100 150 energy gain Nearly Exhaustive Search Blum Chaudhuri Exhaustive Search (a) 0 50 100 150 200 250 300 350 0 50 100 150 200 250 energy gain Chaudhuri Nearly Exhaustive Search Exhaustive Search (b) 0 200 400 600 800 0 100 200 300 400 500 600 energy gain Nearly Exhaustive Search Chaudhuri Exhaustive Search (c) Figure 3.4: Performance comparison with 15 (a), 23 (b) and 56 (c) locations. 51 23 and 56 locations for each case respectively. We first use Equation (3.1) to compute the gain for each location, and then run each of the algorithms and compare the gain collected by the path returned. Exhaustive Search is used as a benchmark for the other algorithms since it returns the optimal path. Figure 3.4 shows the results. All the algo- rithms show performance very close to Exhaustive Search for all the cases. However, the running times of the algorithms are quite different. The Blum et al. algorithm is able to compute the solution for the case of 15 states, but is not able to finish the case of 23 states before it runs out of memory 1 ; Exhaustive Search is able to finish both the case of 15 and 23 states but can only finish the computation for the third case for energy less than 250, which corresponds to approximately 25 steps. The Chaudhuri et al. K-Path algorithm and Nearly Exhaustive Search are both able to finish all cases. When applying the Chaudhuri et al. algorithm directly to our problem, we did not use any interpolation since in practice it is not necessary to find the path that collects the gain exactly equal to the given value. As a result, the approximation factor is 2. Since the Blum et al. algorithm is not able to handle a reasonably big problem, from now on, we consider only the other three algorithms. Figure 3.5 and Figure 3.6 show more extensive comparisons between the Chaudhuri et al. K-Path algorithm and Nearly Exhaustive Search. The tests are carried on two data sets, both from Lake Fulmor. For each data set we run the algorithms for several times and each time the boat starts from a different location. As before, the Exhaustive Search algorithm is used as a benchmark for the two algorithms. In each figure the ratio G P /G P∗ is shown, where P is the path returned by either of the two algorithms and P∗ 1 The tests were run on a PC with a 2.8 GHz Pentium 4 CPU and 1GB Memory. 52 0 100 200 300 0.8 0.85 0.9 0.95 1 1.05 Energy Ratio (a) 0 100 200 300 0.8 0.85 0.9 0.95 1 1.05 energy ratio (b) 0 100 200 300 0.8 0.85 0.9 0.95 1 1.05 Energy Ratio (c) 0 100 200 300 0.8 0.85 0.9 0.95 1 1.05 Energy Ratio (d) Figure 3.5: Performance comparison with an energy consumption model without direc- tion. (a) and (c) show the performance of Nearly Exhaustive Search on two data sets of the surface temperature of Lake Fulmor; (b) and (d) show the performance of the Chaudhuri et al. algorithm on the same data sets. is the path returned by the Exhaustive Search algorithm with the same initial energy level. From the figures, we can see that although the Nearly Exhaustive Search has a poor approximation factor, it shows good performance in practice. It is also obvious that the Chaudhuri et al. algorithm has better performance in the case where the energy consumption model without direction is used. This is reasonable, since for the directed 53 0 100 200 300 0.5 0.6 0.7 0.8 0.9 1 1.1 Energy Ratio (a) 0 100 200 300 0.5 0.6 0.7 0.8 0.9 1 1.1 Energy Ratio (b) 0 100 200 300 0.5 0.6 0.7 0.8 0.9 1 1.1 Energy Ratio (c) 0 100 200 300 0.5 0.6 0.7 0.8 0.9 1 1.1 Energy Ratio (d) Figure 3.6: Performance comparison with an energy consumption model with direction. (a) and (c) show the performance of Nearly Exhaustive Search on two data sets as Fig- ure 3.5; (b)and (d) show the performance of the Chaudhuri et al. algorithm on the same data sets. modeltheapproximationfactoris15.42whilefortheundirectedmodeltheapproximation factor is 2. 3.5 Experiments OuralgorithmhasbeentestedontheNAMOSsystemwiththeenergyconsumptionmodel presented in Section 3.3.1 as well as the NIMS system with the model in Section 3.3.2. 54 We first test the algorithms on a simulated scalar field; Then, the algorithms are tested on both data sets taken by the robotic boat and the NIMS node in the Lake Fulmor. We also run Nearly Exhaustive Search on the robotic boat in Echo Lake. 3.5.1 Simulations 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.5 1 x y (a) 0 2 4 6 8 10 12 14 0.8 1 1.2 1.4 1.6 1.8 2 Energy Level RSS Chaudhuri Nearly Exhaustive Search Raster Scan (b) Figure3.7: (a)Thesimulatedscalarfield. (b)theperformancecomparisonbetweenthree algorithms. We First tested adaptive sampling algorithms on a simulated scalar field m(x,y)=1/(1+exp(30 p (x−1) 2 +y 2 )−15) in a 1×1 square, which is shown in Figure 3.7(a). There are 36 static nodes to provide the initial sensor readings. We compare the three algorithms in Figure 3.7(b). The first algorithm uses optimal experimental design to generate a gain associated with each location and the Chaudhuri et al. K-Path algorithm to find a pathcollecting a given gain with minimum energy cost. The second algorithm also uses optimal experimental design 55 togenerategainsbutusesNearlyExhaustiveSearchasthepathplanningalgorithm. The third algorithm is used as a benchmark, where the path is generated as a raster scan. For each method, we added the data points on the path to the initial readings from the static sensors and use the new set of readings to estimate the scalar field. Figure 3.7(b) shows that both adaptive sampling algorithms outperform raster scan when the energy level is low. When the energy available is less than 8, the reconstruction errors associated with adaptive sampling algorithms with Nearly Exhaustive Search and the Chaudhuri et al. algorithm are about 20% to 30% less than raster scan on average. Since IMSE ∝ n − 2 3 for the best case in 2-dimensional problems, a 20% improvement is significant. When the initial energy increases to above 8, the difference between the difference is not obvious and the raster scan actually outperform the adaptive sampling with the Chaudhuri et al. algorithm. This is because when the energy level is more than 8, the mobile node has enough energy to go to almost every location and hence the raster scan become very close to an optimal path. On the other hand, the paths generated from the Chaudhuri et al. algorithm are constructed from a tree and some edges of the tree have to be included twice in the path, which wastes energy. As a result, when the energy level is high, the paths generated by the Chaudhuri et al. algorithm are not as good as raster scan. 3.5.2 Experiments using the NAMOS data set Figure 3.8(a) shows the points in the entire raster scan data set taken in Lake Fulmor by the NAMOS boat. The solid curve is the boundary of the lake and the black dots are the data points. Temperature readings are used in the tests and Figure 3.8(b) shows the temperature field interpolated with the whole data set. We mine this data set (the 56 0 20 40 60 80 100 120 140 160 0 20 40 60 80 100 120 140 x (meters) y (meters) (a) x (meters) y (meters) 20 40 60 80 100 120 140 0 20 40 60 80 100 120 18.2 18.4 18.6 18.8 19 19.2 19.4 (b) Figure 3.8: (a) The data set of the surface temperature of Lake Fulmor; (b) The inter- polated temperature field by using the whole data set. ground truth) as follows to test the algorithm. We assume that the boat is able to follow the path generated by the adaptive sampling algorithm accurately. When one sampling locationxisgenerated, wesearchintherasterscandatasetforthepointclosesttoxand take the corresponding temperature reading as the sensor reading at location x. There are 15 static nodes to provide the initial samples. The locations of those static nodes are shown as small circles in Figure 3.8(a). Asintheprevioussubsection,wecomparetheperformanceofthreealgorithms: Raster scanandadaptivesamplingalgorithmswithNearlyExhaustiveSearchandtheChaudhuri et al. algorithm. The estimates on the temperature field are compared to the whole data set, which is assumed to be the ground truth. The performance is measured by the RSS (Sum of the Squared Residue) on the whole data set. From the figure, we can see that both adaptive sampling algorithms achieve better performance than the raster scan, especially the one using Nearly Exhaustive Search as path planning algorithm. When the 57 0 100 200 300 400 500 600 6 8 10 12 14 16 Energy RSS Chaudhuri Nearly Exhaustive Search Raster Scan Figure 3.9: Comparison between the adaptive sampling with the Chaudhuri et al. al- gorithm, adaptive sampling with the Nearly Exhaustive Search and raster scan on the data set of the surface temperature of Lake Fulmor. Note that the Chaudhuri and Nearly Exhaustive Search stands for the Adaptive Sampling algorithms with corresponding path planning subroutines. energy level is between 150 to 400, the IMSE of Nearly Exhaustive Search is about 20% less than the raster scan. 3.5.3 Experiments on the data set from the NIMS node Another data set on which the algorithms have been tested was taken by NIMS also in Lake Fulmor. Unlike the data set used in Section 3.5.2, which is a 2D temperature field on the surface of the water, the NIMS data set is taken from a transect vertical to the surface, which is 45 meters wide and 5 meters deep. The fluorometer readings are used in our tests. The data set consists of 635 data points and 10 readings were taken for each data point. The data points are arranged into 23 columns and each column consists of 17 to 30 points. Therefore, the data points are not in an isotropic distribution and the 58 density of data points is higher along depth than horizontally. As a result, in our test, we compensate for the effect by making the bandwidth matrix anisotropic. To implement adaptive sampling algorithms, we assume that there are static nodes equipped with the same sensor, i.e., fluorometer, on the transect. The locations of the staticnodesaredeterminedbyusingacoffee-housedesign[45],whichgeneratesuniformly distributeddatapointsacrossthesensingfield. Sincethestaticnodestakesensorreadings at one location and the variance of the readings can be reduced by averaging the readings within a short period, it is reasonable to assume that the sensor readings from the static sensors are less noisy than the readings from the mobile node. The reading from the static node sitting at location x is simulated by using the average of the 10 readings at x while the readings from among the mobile node are simulated by randomly selecting one from the 10 readings. The Hessian matrix and the gain are estimated by using the sensor readings from the simulated static nodes. 0 50 100 150 4.5 5 5.5 6 6.5 7 7.5 8 8.5 x 10 5 energy RSS Chaudhuri Nearly Exhaustive Search Raster Scan Figure3.10: Performancecomparisonbetween3algorithmsonthedatasetoffluorescence on a vertical transact of Lake Fulmor. 59 Similar to Section 3.5.2, we implemented three algorithms on the data set. The Adaptive Sampling with the Chaudhuri et al. algorithm, the Adaptive Sampling with theNearlyExhaustiveSearchandtheRasterscan. BecausetheNIMSnodeisnotallowed to move horizontally in the water, and it takes much less energy to scan vertically than horizontally, the raster scan was designed in such a way that the mobile node would scan the whole depth if it determines to take readings along depth. Figure 3.10 shows the performance comparison between the three approaches. Once the path is generated, the simulationsarecarriedout20timesandtheaverageandstandarddeviationoftheRSSof the three algorithms vs. the energy available to the mobile node are shown in the figure. Ascanbeseen,thetwoAdaptiveSamplingalgorithmsperformedconsistentlybetterthan the Raster Scan. When the energy available is less than 70 units, the adaptive sampling algorithms are able to achieve around 20% less RSS. However, with further increase of theenergy, theimprovementbecomemarginal. Thisisduetothesizeofthesensingfield. In our tests, the transect is discretized into 172 states. Without the 40 initial samples, it takes only 138 units of energy for the mobile node to cover the whole transect. When the 40 initial samples are taken into consideration, the energy needed is even less. As a result, when the energy is greater than 80, all the algorithms generate similar sample density and hence achieve similar RSS. 3.5.4 Experiments in Echo Park The adaptive sampling algorithm with Nearly Exhaustive Search has also been tested on the boat in the lake in Echo Park in Los Angeles and the target scalar field is the surface temperaturefield. Onceagain, weusetherasterscandataasthegroundtruth. Theboat 60 was first used to quickly perform a raster scan in a 40m×70m area. We assume that the temperature field did not change significantly during the period when the experiments were carried out. Figure 3.11 shows the results of two experiments. The solid lines in Figures 3.11(a) and 3.12(a) show the boundary of the test area and the circles are the locations where initial temperature readings are available. The boat starts from the right bottom corner facing north in the first experiment and from top right corner facing west in the second experiment. The triangles in Fig- ure 3.11(a) and Figure 3.12(a) are the planned locations where the boat should go and take temperature readings and the dots show the actual track taken by the boat. Note that the actual track did not follow the planned path strictly. This is due to the GPS resolution and the effect of the wind. Currently, there is no battery monitor on the boat andhencewecannotmeasuretheenergyconsumedbytheboatdirectly. However, during the experiments, the speed of the propeller on the boat was approximately constant and hence the thrust generated by the propeller is approximately constant. Therefore we can approximate the energy consumption by measuring the distance the boat traveled. With thedenseGPSreadingsalongthetrack,itiseasytocomputethedistance. Figure3.11(b) and Figure 3.12(b) show the IMSE of the estimation vs. the distance the boat traveled in both experiments, respectively. 3.6 Conclusion In this chapter, we have proposed an adaptive sampling algorithm for mobile sensor networks consisting of a set of static nodes and a mobile robot tasked to reconstruct 61 0 5 10 15 20 25 30 35 40 45 50 0 10 20 30 40 50 60 70 x (meters) y (meters) boundary actual track planned path (a) 0 20 40 60 80 100 120 140 160 0.068 0.069 0.07 0.071 0.072 0.073 0.074 0.075 Distance traveled (meters) IMSE (b) Figure 3.11: A trial of the experiment in Echo Park. (a) shows the planned path and the actual track and (b) shows estimation error vs distance the boat traveled, which is approximately proportional to the energy consumed by the boat 62 0 5 10 15 20 25 30 35 40 45 −10 0 10 20 30 40 50 60 70 80 x (meters) y (meters) boundary actual track planned path (a) 0 50 100 150 200 250 0.021 0.0215 0.022 0.0225 0.023 0.0235 0.024 0.0245 distance the boat traveled (meters) IMSE (b) Figure 3.12: Another trail of the experiment. (a) shows the planned path and the actual track and (b) shows estimation error vs distance the boat traveled. 63 a scalar field. Our algorithm is based on non-parametric regression techniques, e.g., local linear regression, thin plate splines and universal kriging. Sensor readings from static nodes are sent to the mobile robot and used to estimate the structure of the scalar field, which are related to the estimation error. A path planner can then be used to generate a path for the mobile robot such that the resulting IMSE of the field reconstructionisapproximatelyminimizedsubjecttotheconstraintthatthemobilenode has a finite amount of energy which it can expend. We discussed four path planning algorithms, ExhaustiveSearch, NearlyExhaustiveSearch,theBlumetal. Algorithmand the Chaudhuri et al K-Path Algorithm. Exhaustive Search returns the optimal path, but the running time is exponential to the scale of the problem. The Blum et al. Algorithm providesthebestapproximationfactorwithpolynomialrunningtime. However,theorder of the polynomial function is high and it is not feasible for practical problems. The other two algorithms, Nearly Exhaustive Search and the Chaudhuri et al. K-Path algorithm, work well with problems of reasonablesize, though both onlyproduce goodpaths instead of optimal paths. The former normally generate a path resulting in less estimation error in practice while the latter has a better theoretical guarantee on the performance. In practice, it might be better to run both algorithms and pick the better one of the paths generated by those algorithms. In this case, we can achieve better performance in most situations while still having good performance guarantees in the worst situations. Data from extensive traverses in the field as well as simulations, validate the performance of our algorithms. Note that for the Chaudhuri et al. algorithm, we construct the path from the tree in a most straight forward way: doubling the edge not on the shortest path from start 64 0 20 40 60 80 100 120 140 160 0 20 40 60 80 100 120 140 x (meters) y (meters) (a) 0 20 40 60 80 100 120 140 160 0 20 40 60 80 100 120 140 x (meters) y (meters) (b) Figure 3.13: (a) The path generated by the Chaudhuri et al. algorithm. (b) The path generated by using Christofides’ heuristic. state to the end state. A better path can be constructed from the tree with a more complexapproach. Forexample,wecanuseChristofides’heuristic[17]tocreateshortcuts between some states and hence get a shorter path going through the same set of states. Figure3.13(a)showsthepathgeneratedbytheChaudhurietal. algorithmforthesurface temperaturedatasetinLakeFulmorandFigure3.13(b)showsthepathconstructedfrom the same tree with Christofides’ heuristic. The latter is obviously much better than the former path. We assumed that the boat is able to move forward and backward with the same amount of energy consumed. However, this assumption is not always true. Actually, it is more likely that the energy consumed for moving forward is less than backward. In this case, the state transition graph is a directed graph. Exhaustive Search and Nearly Exhaustive Search are able to deal with directed graphs. However, the Chaudhuri et al. algorithm does not work for directed graph since when a tree constructed, there is no guarantee that there exists a path from the start state s to end state t. Since it relies on 65 the Chaudhuri et al. algorithm to generate optimal segments, the Blum et al. algorithm does not work for directed graph either. As a result, another algorithm that work for directed graph is necessary. The best approximation factor currently known that can be achieved within polynomial time for the Orienteering Problem on directed graph is log 2 (OPT), where OPT is the amount of gain that can be collected [15]. One interesting extension of adaptive sampling is to use this algorithm with the directed state transition graphandfindouthowwelltheadaptivesamplingalgorithmwouldperformonadirected graph. Another interesting extension is how to carry out the sampling task in the case of multiplemobilenodes. Inthecaseofonesinglemobilenode, themainchallengeistofind the best path for the mobile node so that the estimation error can be minimized. When the number of mobile nodes increases, the problem of how to coordinate the multiple mobile nodes rises, which is to be studied in the next two chapters. 66 Chapter 4 Adaptive Sampling with a Large Number of Mobile Nodes 4.1 Introduction In the previous chapter, we presented adaptive sampling algorithms for a robotic sensor network with a single mobile node. In this chapter, we study the adaptive sampling problem for a large number of mobile nodes. We assume that the number of mobile nodes is so large that each mobile node needs only take sensor readings at one location. Such system is able to deal with time varying phenomenon as long as the sensor nodes are synchronized and take sensor readings at the same time. With this assumption, the density of samples or sensor readings is the same as the node density. So with the optimalsampledensitygeneratedbyoptimalexperimentaldesigndiscussedinChapter2, the adaptive sampling problem is reduced to the problem of deploying the mobile nodes accordingly. Therefore, we devote most part of this chapter to answering following question. How can an ensemble of mobile sensor nodes deploy itself so that the deployment density (i.e. nodes per unit area in the 2D case) varies spatially in proportion to a function induced 67 by a scalar field ? That is, if f ∗ (x) denotes a function that can be estimated with sensor readings, how can sensor nodes deploy themselves so that node density f(x) is proportional to f ∗ (x). We assume that 1. The function f ∗ (x) is not known before deployment. In the case of adaptive sam- pling, the optimal density of samples depends on the features of the underlying scalar field and may change with time. However, it can be computed locally by sensor nodes. For example, if Local Linear Regression is used to reconstruct the scalar field, each node can compute the trace of Hessian matrix at its current loca- tion by using the sensor readings from the nodes in its neighborhood and compute the optimal density. In this case, a distributed algorithm for adaptive sampling can be developed. 2. The communication range is short compared to the whole sensing field. This is a common assumption for sensor networks since a short communication range means less energy for communication. 3. All mobile nodes are localized in a common global reference frame and able to navigatefromitscurrentlocationtoacommandedgoallocation. Wedonotassume that the navigation should be accurate, but accurate localization is necessary for a good estimate on the scalar field. Many localization algorithms have been proposed for sensor networks and any one with reasonable accuracy can be used. 4. Eachnodecandetectthelocaldensityofnodesinasmallneighborhood. Inpractice, this can be done by counting the synchronization messages from the nodes nearby. 68 Withtheseconstraintsandassumptionsinmind, wedevelopadistributedalgorithmthat deploys sensor nodes so that the node density is proportional to the function f ∗ (x). Our approach rests on the following observation. The ideal node distribution is easily computable in the special case where the area to be covered is very small (i.e. linear dimensions on the order of the communication range of each node) such that within it there exists one node (call it the cluster head) which can communicate directly with all other nodes in the area. In this case, each node within the cluster can compute the value of the function f ∗ (x) at its current location and send it to the cluster head. With stan- dard function approximation techniques (e.g., using sampling [28, 34]), the cluster head canthengeneratedesirednodelocationswithintheneighborhoodandassigneachmobile node a new location. Then the mobile nodes would move accordingly thereby achieving the desired density profile. For the cases where the sensing field is much bigger than the communication range, our strategy is to randomly partition the sensing field into sufficiently small neighborhoods. Within each neighborhood the redistribution process as describedaboveisenacted. Weprovethataftersufficientlymanyiterations, nodedensity approachesthedesiredprofileintheoverallregion. Wealsodemonstrateempiricallythat our strategy is noise tolerant. We substantiate our resultsusing simulations, and perform some simple experiments in one dimension using small robotic nodes called Robomotes. Finally, we present simulation results on the adaptive sampling, which uses the opti- mal design with Local Linear Regression to generate function f ∗ (x) and the distributed deployment algorithm to deploy sensor nodes accordingly. Sensor deployment has been studied extensively in sensor networks literature and manydeploymentalgorithmshavebeendevelopedwithdifferentcriteria. Coverageisone 69 of the most important criterion measuring the quality of service of a sensor network [41]. One approach to achieve desired coverage is to randomly deploy the extra sensor nodes in the sensing field until the coverage of the sensor network reaches a given threshold [18]. Another frequently used approach is the potential field technique [33, 69]. Wang et al. [63] introduced an approach to detect coverage holes with a Voronoi diagram and proposed an algorithm based on potential field to adjust the locations of some nodes to repair these holes. Dhillon et al. [23] proposed a centralized algorithm on a grid world, where a master node is used to simulate the movement of the mobile nodes in a potential field and determine their final locations and then each sensor node moves to the final locations directly. The third approach is to place the sensor nodes one by one on the boundary of the area already covered [32]. This approach can only find a suboptimal deployment, but it is able to ensure that each node retains line-of-sight with another node. Asimilardeploymentstrategyisproposedin[6,7], whereamobilerobotisusedto deploy the sensor nodes. In this approach, the sensor nodes are used to assist the mobile robot in navigation and no localization is necessary. Another criterion for sensor node deployment is connectivity of the network. One approach is to start the deployment with the network fully connected and then maximize the coverage while maintaining the connectivity of the network [49]. Another approach is tofirstdeploythesensornodesandthenrepairtheconnectivitybydeployextranodes[19]. The third criterion is the estimation error. Zhang and Wicker [67] proposed a de- ployment algorithm to minimize the estimation distortion, where the sensing field is partitioned into small cells and a powerful fusion center is used to balance the numbers of nodes within cells. Bulusu [11, 10] proposed an algorithm to determine the locations 70 of a special type of nodes, beacons, so that the localization errors of normal nodes are reduced. This algorithm is a greedy algorithm and the new beacon is always placed at the location with maximum localization errors. The algorithm in this chapter is different from above approaches in the following ways: First, we aim at deploying sensor nodes according to a function derived from the measurement of a scalar field. Second, our algorithm is distributed and it can be applied to a system consisting of a large number of nodes. Perhaps the most similar prior work is reported in [9], where a family of distributed methods are proposed to deploy mobile nodes according to event distribution. In those approaches, every node determines its final location according to the initial locations of all mobile nodes and the event distribution. Assuming the initial locations of all mobile nodes and the event distribution are known, no communication is necessary for those algorithms. However, if the information is not available to all mobile nodes initially or it changes over time, it has to be flooded across the network, which may cause more resources consumed. Our algorithm does not need such global information and hence no flooding is necessary. 4.2 Deployment Algorithm In this section, we first present an algorithm for deploying sensor nodes in a small neigh- borhoodsuchthatthenodedensityisproportionaltoafunctionf ∗ (x). Next,wedescribe a distributed algorithm for sensor deployment in a large area, which uses the first algo- rithm as a subroutine. 71 4.2.1 Deployment Within a Small Neighborhood This work is inspired by the particle filter techniques [28, 34], which has been used extensively for localization in robotics [21]. When applied to robot localization, the density of samples/particles is used to represent the probability distribution function (PDF) of the location of a robot. In our algorithm, each sensor node corresponds to one particle in the particle filter and the function f ∗ (x) corresponds to the probability distribution function. By using an algorithm similar to there-sampling algorithm used in particle filters, we generate new locations for the sensors so that the density of the nodes is f(x)∝f ∗ (x). Recall that we assume a system where each node is able to move and localize itself andeachnodeisabletodetectnodedensityinitsvicinity(e.g., bycountingthemessages sent from its neighbors). Within a small neighborhood, one node is selected as cluster head such that all other nodes are able to communicate with the cluster head directly. At the beginning of the algorithm, all nodes send their current locations, estimates on current local node densities and sensor readings to the cluster head. The cluster head first computes the estimate on the function f ∗ (x) from the readings and locations of all nodes. Then it puts all the locations, including its own, into a set P. Each location x in P is assigned a weight w(x) = f ∗ (x) ˆ f(x) , where ˆ f(x) is estimated local node density around x. Next, the cluster head generates a set of new locations by re-sampling from this weighted set P. The probability that a location x is selected by the re-sampling process is proportional to the weight assigned to it. By doing so, the density of new locations approximates the desired density. It is obvious that some locations would be 72 selectedmultipletimes. ThisproblemissolvedbyapplyingGaussiannoisetotheselected locations. The cluster head assigns each node in the cluster to a new location. There are two options to assign locations. One is to minimize the energy consumed by the whole cluster. In this case, the assignment algorithm should minimize the sum of the distances nodesneedtravel. Theotheroptionistobalancetheenergyconsumptionamongdifferent nodes. In this case, each node is assigned a priority to choose its next location. The less the energy left in the battery, the higher priority the node gets. Each node selects the location closest to its current location from the available locations. Finally, all nodes move to the assigned locations. The details of the algorithm are shown in Algorithm 2. The inputs R and ˆ F are sent to the cluster head from all nodes in the cluster. n is the total number of nodes in the cluster,R is the set of sensor readings and ˆ F is the set of estimated local node density. In thisalgorithm,subroutineEstimate()isusedtoestimatef ∗ (x)fromthesensorreadings. Another subroutine AssignLocations() takes the new locations and assigns them to all nodes in the cluster. 4.2.2 Distributed Algorithm On the basis of the centralized algorithm that operates within a cluster, we designed a distributed algorithm for sensor deployment. For the distributed algorithm, we assume that: 1. The centralized algorithm is able to deploy the sensor nodes (within a cluster) so that sensor nodes density is proportional to to f ∗ (x); 73 Algorithm 2: Location reassignment within a cluster input : n, R, ˆ F output: None F ∗ = Estimate(R); for k = 1 to n do Q[k] = P k i=1 F ∗ [i] ˆ F[i] ; T[k] = Rand(); end T =Sort(T); i = 1; j = 1; while i≤N do if T[i]<Q[j] then Index[i] = j; i = i + 1; else j = j+1; end end AssignLocations(Index); 2. f ∗ (x)≥0 for any x∈S and R S f ∗ (x)dx6=0, where S is the sensing field; 3. The total number of nodes n is very large. The distributed algorithm first randomly partitions the sensing field into subareas (small neighborhoods). The nodes within each subarea form a cluster and one node is selected as the cluster head. The cluster head executes Algorithm 2 to redeploy sensor nodes so that the density is proportional to f ∗ (x) within each cluster. The sensing field is partitioned again and the process is repeated until termination conditions are satisfied. The details of the algorithm are shown in Algorithm 3, where R and ˆ F have the same meaning as in Section 4.2.1 and where C i j is a constant for any x∈S i j . When implementing the algorithm, we use the LEACH protocol [30] to randomly partition the sensing field into subareas and the system into clusters. In this approach, 74 Algorithm 3: Deployment algorithm i = 0; while termination conditions are not satisfied do Randomly partition the sensing field into small subareas S i j ; Select one cluster head from nodes within each subarea; Each node l within one subarea sends R[l] and ˆ F[l] to cluster head; Cluster heads generate new locations with the distribution g i j (x)=C i j f(x); Assign each node a new location; all nodes move to new locations; i = i + 1; end each node elects itself as a cluster head with a pre-defined probability. If one node elects itself as a cluster head, it broadcasts its location to its neighbors. Otherwise, the node listens to the messages from cluster heads, and selects the one closest to it as its cluster head. The node joins the corresponding cluster by sending a registration message to the cluster head. The region covered by a cluster is a subarea. 4.3 Analysis It can be proved that the algorithm in the previous section results in a node density that approaches C 0 ·f ∗ (x) as the number of time steps increases. Formally, we have Theorem 4.3.1. Theorem 4.3.1 If S is the area of operation, f ∗ (x) is the desired function and f(x) is the actual node density, as i→∞, C i j →C 0 for all j, where C 0 = R S f(x)dx R S f ∗ (x)dx . 75 Figure 4.1: The dashed lines are the boundaries of the partitions at step i, and the solid lines are the boundaries at step i+1. The polygon with bold solid edges is subarea S i+1 j . It is obvious that S i+1 j = P M i l=1 ΔS i+1 j,l . To prove this theorem, we need the following lemmas. Lemma 4.3.2 If C i =C j for any i and j, then for any subarea, C i =C 0 . Proof Suppose C i = C for any i, where C is a constant. That is, f(x) = C·f ∗ (x) for x∈S. So, we have Z S f(x)dx= Z S C·f ∗ (x)dx=C· Z S f ∗ (x)dx. Therefore C = R S f(x)dx R S f ∗ (x)dx =C 0 . 76 Lemma 4.3.3 Let S i l be a subarea at time step i, S i+1 j be a subarea at time step i+1, and ΔS i+1 j,l = S i+1 j ∩ S i l as shown in Figure 4.1, where j ∈ {1,2,...,M i+1 } and l ∈ {1,2,...,M i }, we have C i+1 i = M i X l=1 C i j · ΔF i+1 j,l F i+1 j , where F i+1 j = R S i+1 j f ∗ (x)dx and ΔF i+1 j,l = R ΔS i+1 j,l f ∗ (x)dx. Proof According to Algorithm 3, after all nodes within S i+1 j are redeployed, f i+1 i (x)=C i+1 j ·f ∗ (x),x∈S i+1 j . Since S i+1 j = P M i l=1 ΔS i+1 j,l , and the number of nodes in subarea S i+1 j does not change after the redeployment, then Z S i+1 j f i+1 j (x)dx=N i+1 j = M i X l=1 Z ΔS i+1 j,l f i l (x)dx. So, we have C i+1 j = R S i+1 j f i+1 j (x)dx R S i+1 j f ∗ (x)dx = P M i l=1 R ΔS i+1 j f i l (x)dx R S i+1 j f ∗ (x)dx = P M i l=1 C i l · R ΔS i+1 j,l f ∗ (x)dx R S i+1 j f ∗ (x)dx = P M i l=1 C i j ·ΔF i+1 j,l F i+1 j . 77 Figure 4.2: The dashed lines are the boundaries of subareas at step i, and solid lines are the boundaries of subareas at step i+1. The polygon with bold dashed edges is the subspace S i l . It is obvious that S i l = P M i+1 j=1 ΔS i+1 j,l . Lemma4.3.3saysthat the proportionalityconstants associatedwith the distributions in subareas at the current step is the weighted average of the constants in previous step. Next,weprovethatthedifferencebetweenanytwosubareaswilldiminishaftersufficiently many time steps. Lemma 4.3.4 Letσ i = P M i l=1 F i l F ·(C i l −C 0 ) 2 , whereF = R S f ∗ (x)dxandF i l = R S i l f ∗ (x)dx, then When step i approaches infinite, σ i approaches 0. Proof Since S i l = P M i+1 j=1 ΔS i+1 j,l , as shown in Fig 4.2, F i l = Z S i l f ∗ (x)dx= M i+1 X j=1 Z ΔS i+1 j,l f ∗ (x)dx= M i+1 X j=1 ΔF i+1 j,l . For similar reason, F i+1 j = M i X l=1 ΔF i+1 j,l . 78 From the definition of σ i , we can compute σ i and σ i+1 as follows. σ i = M i X l=1 F i l F (C i l −C 0 ) 2 = M i X l=1 P M i+1 j=1 ΔF i+1 j,l F (C i l −C 0 ) 2 = 1 F M i X l=1 M i+1 X j=1 [ΔF i+1 j,l (C i l −C 0 ) 2 ]; σ i+1 = M i+1 X j=1 F i+1 j F (C i+1 j −C 0 ) 2 . So, if we define Δσ i =σ i −σ i+1 , Δσ i = 1 F M i+1 X j=1 [ M i X l=1 (C i l −C 0 ) 2 ΔF i+1 j,l −(C i+1 j −C 0 ) 2 F i+1 j ]. (4.1) Let α j = P M i l=1 (C i l −C 0 ) 2 ·ΔF i+1 j,l , and β j =(C i+1 j −C 0 ) 2 ·F i+1 j , we have Δσ i = 1 F M i+1 X j=1 (α j −β j ). On the other hand, replace l with l 1 , α j F i+1 j = M i X l 1 =1 (C i l 1 −C 0 ) 2 ΔF i+1 j,l 1 F i+1 j = M i X l 1 =1 [(C i l 1 −C 0 ) 2 ΔF i+1 j,l 1 ( M i X l 2 =1 ΔF i+1 j,l 2 )] = M i X l 1 =1 M i X l 2 =1 [(C i l 1 −C 0 ) 2 ΔF i+1 j,l 1 ΔF i+1 j,l 2 ]. 79 From lemma 4.3.3, β j F i+1 j =(C i+1 j −C 0 ) 2 (F i+1 j ) 2 =[(C i+1 j −C 0 )·F i+1 j ] 2 =[ M i X l ′ =1 ΔF i+1 j,l ′ (C i l ′ −C 0 )] 2 . Then, (α j −β j )·F i+1 j = M i X l 1 =1 M i X l 2 =1 ΔF i+1 i,l 1 ΔF i+1 i,l 2 [(C i l 1 −C 0 )−(C i l 2 −C 0 )] 2 = M i X l 1 =1 M i X l 2 =1 ΔF i+1 i,l 1 ΔF i+1 i,l 2 (C i l 1 −C i l 2 ) 2 . Therefore Δσ i = 1 F M i+1 X j=1 (α j −β j ) = 1 F M i+1 X j=1 [ 1 F i+1 j M i X l 1 =1 M i X l 2 =1 ΔF i+1 j,l 1 ΔF i+1 i,l 2 (C i l 1 −C i l 2 ) 2 ] = 1 F M i X l 1 =1 M i X l 2 =1 η i+1 l 1 ,l 2 (C i l 1 −C i l 2 ) 2 , where η i+1 l 1 ,l 2 = P M i+1 j=1 ΔF i+1 j,l 1 ·ΔF i+1 j,l 2 F i+1 j . Because f(x)≥ 0, we have F i+1 j ≥ 0 and ΔF i+1 j,l ≥ 0. Additionally, becauseΔF i+1 j,l areindependentwithstepi, theprobabilitythatΔF i+1 j,l =0 for all j is very small, especially when there are large number of nodes in the system. So, we have Δσ i ≥0. 80 Now, we prove that σ i approaches 0 as i approaches infinity. Suppose σ i 6= 0 for any i. Since σ i ≥ 0, there must exist δ > 0 so that for any i, σ i ≥ δ. Suppose at step i, (C i m −C 0 ) 2 is the maximum among (C i l −C 0 ) 2 for all l∈{1,2,...,M i }, we have σ i = M i X l=1 F i l F (C i l −C 0 ) 2 ≥ F i m F (C i m −C 0 ) 2 ≥δ. On the other hand, there must exists l ′ so that (C i m −C i l ′ ) 2 ≥(C i m −C 0 ) 2 . Then, Δσ i ≥ 1 F M i X l 2 =1 η i+1 m,l 2 (C i m −C i l 2 ) 2 ≥ 1 F η i+1 m,l ′ (C i m −C i l ′) 2 ≥ 1 F η i+1 m,l ′ (C i m −C 0 ) 2 ≥ 1 F η i+1 m,l ′ F F i m δ ≥ η i+1 m,l ′ (C i m −C 0 ) 2 F i m δ ≥η 0 δ, whereη 0 =min i η i+1 m,l ′ (C i m −C 0 ) 2 F i m andη 0 >0. So,ifi> σ 0 −δ η 0 ·δ ,σ i <δ,whichisacontradiction with the assumption that σ i is always greater than or equal to δ hence the assumption is not true. Therefore, when i approaches infinity, σ i approaches 0. 81 4.4 Simulation Results We carried out several simulations to verify the validity of the distributed node deploy- ment algorithm. In the simulations, we would like to deploy sensor nodes so that node density is proportional to g 0 (x,y)=0.1+e − (x−4) 2 +(y−6) 2 2 . However, the estimate on the desired density is corrupted by noise as follows g(x,y)=g 0 (x,y)+ǫ, where ǫ is a zero-mean random variable with Gaussian distribution N(0,σ). The simulations were performed in a 2D environment of 10x10 square. At the begin- ning of the simulations, all nodes are within a start area where x∈ [0,1] and y ∈ [0,10]. 500 nodes are deployed. The communication range of each node is 3 units. The probabil- ity that a node elects itself as a cluster head is chosen to be 0.05. Hence approximately 25 clusters are formed at each step. Figure 4.3 shows the node densities after 120 steps in one simulation trial. One step here means one iteration of the redistribution process. Figure 4.3(a) is the scalar field and Figure 4.3(b) shows the node density. In this trial, the variance of the noise is 0.1, 10% of the signal. 82 0 2 4 6 8 10 0 2 4 6 8 10 0 0.2 0.4 0.6 0.8 1 1.2 1.4 (a) 0 2 4 6 8 10 0 2 4 6 8 10 0 5 10 15 20 25 (b) Figure 4.3: (a) the desired function g 0 (x,y); (b)The node density after 120 steps. We define desired node density, actual node density and density error as follows. Desired node density at location(x,y) is defined as f ∗ (x,y)= n· R (x,y)∈s g 0 (x,y)dxdy A· R (x,y)∈S g 0 (x,y)dxdy where s is a δ×δ square centered at (x,y), A = δ 2 is the area of s and S is the 10×10 square. In simulations, δ =2.0. Actual node density at location (x,y) is defined as ˆ f(x,y)= n s A , where s and A have the same meaning as above and n s is the number of nodes within the square s. The density error is defined as err = X x∈{0.5,1.5,...,9.5} X y∈{0.5,1.5,...,9.5} | ˆ f(x,y)−f ∗ (x,y)|. 83 0 20 40 60 80 100 120 0 100 200 300 400 500 600 700 steps Density errors σ = 0.0 σ = 0.1 σ = 0.2 σ = 0.3 Figure 4.4: The node deployment density better approximates the desired profile over time. Shownhereisthediminishingdifferencebetweentheactualdensityandthedesired density for different values of noise. The process is noise tolerant since there is little measurable difference between the steady-state density errors in four curves. Figure 4.4 shows the convergence of the algorithm, where the density errors are shown as a function of steps. The simulations are done in 4 groups and each group consists of 10 trials. The trails in one group share the same noise covariance (σ), and each curve represents the average density errors in one group. From Figure 4.4 we can see that the error converges after about 60 steps. Note that in the simulations all nodes start from a small area and the convergence time includes the time for the nodes to expand to the whole sensing field. The convergence time and final steady-state error is largely insensitive to noise. Figure 4.5 summarizes the steady-state density errors when the noise covariance in- creases from 0 to 0.4. Steady-state errors are computed by averaging the density errors after the 60th time step. The basic observation is that our algorithm produces steady- state errors that increases slowly with the increase of the noise. 84 0 0.1 0.2 0.3 0.4 80 100 120 140 160 180 200 220 Noise covariance Average density errors Figure 4.5: Steady-state density error increases slowly as sensor noise increases. 4.5 Physical Experiments The sensor deployment algorithm has been tested on the Robomote robotic testbed [20] (shown in Figure 4.6). The experiments are carried out on a 200cm×100cm table and Mezzanine [31] is used for node localization. Figure 4.6: Robomote testbed 85 There are two implementation issues with the experiments on the Robomote testbed. Given the small number (6) of Robomotes available for experiments, the centralized algo- rithmmaynotdeploythenodesinthegivendistributionwithineachsmallneighborhood. That is, the density of one time deployment may vary significantly from the desired pro- file. However, the average density over several steps would approximate the desired one. Accordingly the average density over the last 20 steps are used to display the experimen- tal results. Another problem with a small number of nodes is an increased probability that no cluster is formed. For instance, in the experiments, the probability that one node electsitselfasclusterheadissettobe0.25,sotheprobabilitythatnoclusterheadelected is (1−0.25) 6 =0.1780. Since in the experiments the communication range is set to be 40 cm, the probability that one node does not join any cluster is even higher. Thus, some nodes might sit at one location for several steps. Because the overall node density is low (lessthan1per10cm), sittingatonelocationforevenonestephasconsiderableeffecton the average density. Therefore, each node is programed to randomly move one step if it does not join any cluster. The other issue is that the size of a Robomote is 7cm×4.5cm and hence the distance between two Robomotes can not be very small. However, the new locations generated by the cluster head may be very close to each other. As a result, beforeassigninglocations,theclusterheadschecktomakesurethatthedistancebetween any two locations is greater than a threshold. Experiments with the Robomotes are performed in one dimension. Robomotes are required to deploy themselves along the x-axis with the desired density proportional to f ∗ (x) = exp(− (x−100) 2 2·30 2 ). Several experiments were conducted, and following are two selected examples. At the beginning of the experiments, all nodes start from the right 86 boundary of the sensing field. Then, most nodes move away from their start locations towards the center of the sensing field and finally accumulate around the location x = 100cm. Figure 4.7 compares the final densities (bars) and the desired distribution (solid line, obtained by normalizing f ∗ (x)). The two distributions are in reasonable agreement. 50 60 70 80 90 100 110 120 130 140 150 160 170 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 position (cm) average node density (a) 50 60 70 80 90 100 110 120 130 140 150 160 170 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 position (cm) average node density (b) Figure 4.7: Data from two trials with the Robomotes. The bars show the average sensor density in the steady state, the curve shows the desired density. 4.6 Simulation on Adaptive Sampling In this simulation, the distributed deployment algorithm is combined with the optimal designforLocalLinearRegression. Thelatterisusedtogeneratetheoptimalnodedensity (f ∗ (x)) and the former is used to adjust node distribution so that the node density is proportional to f ∗ (x). It is assumed that the robotic sensor network consists of 500 nodes, of which 121 are static nodes and others are mobile nodes. The static nodes are evenly distributed in a 100×100 square and the mobile nodes are deployed randomly in the same area initially, as shown in Figure 4.9(a), where the static nodes are shown as 87 0 20 40 60 80 100 0 20 40 60 80 100 10 10.2 10.4 10.6 10.8 11 x (meters) y (meters) Figure 4.8: The scalar field to be estimated. asterisks and the mobile nodes are shown as small circles. The scalar field were simulated with the following equation, which is also shown in Figure 4.8. m(x,y)= 1 1−exp(−0.2y+0.0004x 2 −10) . A Gaussian noise with σ = 0.1 is applied to all the sensor readings. Figure 4.10 shows the estimation errors averaged on 20 trials v.s. the progress of the algorithm. For each trial, with the initial readings from both static and mobile sensor nodes, optimal design is used to generate the desired node distribution. The distributed deployment algorithm is then used to redeploy the mobile nodes according to the ideal density. However, the deploymentalgorithmisonlycarriedfor5steps. Thenthesensornodestakenewreadings and Local Linear Regression is used to estimated the scalar field. The sum of squared errorsiscomputedacrossthewholesensingfield. Now,theoptimaldesigniscarriedagain to compute the new optimal distribution. The process iterates for 10 times and hence 88 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100 x (meters) y (meters) (a) 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100 x (meters) y (meters) (b) Figure 4.9: One example of node distributions in simulations. (a) The initial distribution of the sensor nodes. (b) The final distribution of the sensor nodes after 50 steps. the deployment algorithm is carried for 50 steps. In Figure 4.10, the average estimation errors with the standard deviations are shown for every 5 steps, where the errors at step 0 correspond to the random distribution. At the beginning of the simulations, the estimation errors decrease rapidly and then become steady after around 20 steps, when the estimation errors have decreased about 30% from the beginning. Figure 4.9(b) shows one example of the final distribution of the sensor nodes. In this distribution, most of the mobile nodes gather in the areas where the trace of the Hessian matrix is high. 4.7 Conclusion In this chapter, we present a distributed deployment algorithm for a robotic sensor net- work to deploy sensor nodes in a desired density. Our strategy is to randomly partition space into sufficiently small neighborhoods. Within each neighborhood the redistribu- tion process directed by a cluster head or leader node is enacted. We prove that after 89 −20 0 20 40 60 0.35 0.4 0.45 0.5 0.55 0.6 0.65 0.7 the number of steps IMSE Figure 4.10: Results of the simulations on the system with a large number of mobile nodes. sufficiently many iterations, node density approaches the desired profile in the overall region. Though the algorithm partitions the system into clusters and each cluster needs one head, the cluster heads are selected randomly. So the algorithm does not suffer from single point failure. By combining the deployment algorithm with the optimal design for Local Linear Regression presented in Chapter 2, a distributed adaptive sampling algorithm is formed. Simulations show that our algorithm is able to considerably improve the estimation ac- curacy. Although the algorithm is tested in the 2D cases, it can be extended to 3D cases since both the deployment algorithm and Local Linear Regression can easily be adopted to higher dimensional cases. 90 Chapter 5 Adaptive Sampling with a Small Number of Mobile Nodes 5.1 Introduction In the previous two chapters, we have discussed adaptive sampling algorithms for robotic sensor networks with a single mobile node and with a large number of mobile nodes. In the former case, we focus on path planning for each mobile node - no consideration on coordination between mobile nodes is needed. In the latter case, we assume that the number of mobile nodes is so large that each mobile node only needs to take sensor readings at one location at each step. Therefore, the algorithm proposed is focused only on the coordination between mobile nodes and no path planning is considered. However, in a real system, the number of mobile nodes might be neither exactly one norverylarge. Forexample,intheNAMOSproject,thenumberofmobilenodes(robotic boats)is2. Asaresult,bothpathplanningandcoordinationbetweenmobilenodesshould be taken into consideration. In this chapter, we study the problem of adaptive sampling for a small number of mobile nodes. Given some initial readings from the static sensor nodes uniformly deployed in the sensing field and k mobile nodes, what paths should the 91 mobile nodes take so that the error of the estimation is minimized? As in the previous chapters, we discretize the sensing field and associate a gain with each location. This is the multiple vehicle routing or Multiple-path Orienteering problem. The Multiple-path Orienteering problem is a known NP-complete problem and there is no polynomial time algorithm currently known to find the optimal path. On the other hand, the structure of the scalar field that we utilize for estimating, e.g. Hessian matrix, may change over time and hence the computation time is critical to our solutions. As a result, in this chapter, we are looking for a fast approach that generates good paths, instead of an algorithm returningtheexactoptimalsolutions. Weproposeadivideandconquerapproachtosolve the adaptive sampling problem with a small number of mobile nodes. In our approach, the sensing field is partitioned into sub-areas and each sub-area is assigned to one mobile nodes. Within each sub-area, the algorithms discussed in Chapter 3 are used to generate good paths. One advantage of the partition approach is that once the sensing field is partitioned, the original problem is reduced to several smaller and independent problems and path planning can be carried in parallel. The Multiple-path Orienteering problem has been studied in [8, 14, 38, 56]. Blum et al. [8] presented a sequential planning algorithm, where the paths are planned in a sequential order. This algorithm is able to achieve an approximation factor of 3.53. Louis [38] tackled this problem using Genetic Algorithms. Krause [36] proposed a near- optimal algorithm to solve the problem of static sensor node deployment assuming that the underlying phenomenon is a Gaussian Process and a dense pre-deployment of sensors is given to find the correlation between readings at any two locations. Based on the same assumption, Singh [56] introduced an approximation algorithm to solve a problem 92 similartooursandextendedittothecasewithmultiplemobilerobotsbyusingsequential planning similar to [8]. Multi-robot exploration and mapping is another problem closely related to what we are studying here. In multi-robot exploration and mapping, it is normally assumed that therobotsaredeployedinanunknownenvironmentandnoglobalinformationisavailable. In this case, many techniques of coordinated exploration and mapping are based on the idea of frontier points or cells [26, 68, 54, 12]. To assign frontier points to individual robots, a market-based approach is used and the target locations are assigned through auctions [68, 54, 12]. There are different ways to evaluate the target locations for each robot. In the case where the energy consumption is important, the assignment with best tradoff between energy and utility can be chosen [12]; Fox [26] took the uncertainty of localization into consideration and proposed a strategy that trades off between the frontier and the hypothesis. Normally, the coordination strategies assign each robot one target location to visit. In the case multiple target locations are to be assigned, a sequential allocation is used [68, 54]. Stroupe [60] proposed a value-based coordination algorithm, which trades off between dynamic target observing, exploration, sampling and communications. However, the energy consumption is not considered. In this chapter, we consider at a slightly different problem. First, we use the static sensors to provide a coarseestimateofthescalarfieldandhenceroughglobalinformationisavailable. Second, multipletargetlocationsaretobeassignedtoeachrobotandeachrobotthenvisitsthese locations sequentially. Finally, because of the existence of the static sensor nodes, we assume that communication can be achieved by using a multi-hop protocol and hence we do not consider communication as a constraint. 93 5.2 Path Planning with Graph Partition Suppose the robotic sensor network consists of k mobile nodes and all of them have the same capabilities and initial energy level. The basic idea is to partition the sensing field intok sub-areas and assign each mobile node a sub-area. After the partition is computed and assigned to mobile nodes, the original problem is reduced to k smaller problems: to find the optimal path for each mobile node within each sub-area. In this case, the algorithms proposed in Chapter 3 can be used. The approach is algorithmically shown below. Algorithm 4: Adaptive Sampling With Multiple Mobile Nodes Construct the state graph G=(V, E); Collect readings from static sensors r0; for each vertex v∈V do Compute gain for v; end Partition G into sub-graphs G 1 ,...,G k ; Assign mobile node N i to G i for i = 1, ..., m; for i = 1 to k do π i = PlanPath(G i , E); Collect new reading r i ; end Reconstruct the scalar field from r 0 ∪r 1 ∪...∪r m ; As in Chapter 3, we discretize the sensing field and convert it into a graph by using the energy consumption model. Next, we compute an information gain for each location. Finally, the sensing field is partitioned and path planning algorithms are applied to each subarea. Two partitioning strategies are considered, equal area and equal gain. Equal area is a straightforward strategy, where the areas of all partitions are the same. Equal gain is a more intuitive strategy, where the sums of the gain in sub-areas are the same. 94 This strategy is inspired by the idea that each mobile node should achieve similar per- formance with the same amount of initial energy since all mobile nodes have the same capabilities. One advantage of the equal area strategy is that the partitioning does not rely on the Hessian matrix and the inaccuracy of the estimate of the Hessian matrix has no effect on path planning. Alongwiththetwostrategies,wealsoapplyanotherconstraintthattheboundariesof thesub-areasshouldbeasstraightaspossible. Generallyspeaking,ittakesamobilenode more effort to follow a complex boundary and hence result in more energy consumption. One way to measure the complexity of a boundary is the length. The length of the boundary can in turn be measured with the number of cut edges. Therefore, the problem of partitioning the sensing field (represented as a graph) into sub-areas can be formed as a graph partition problem: dividing the graph into sub-graphs so that the sum of the weights of vertices in sub-graphs are the same and the number of cut edges is minimized. If the graph is to be partitioned into equal-gain sub-areas, the weight of each vertex is the corresponding gain. If equal-area sub-areas are preferred, the weight of each vertex is simply 1. GraphpartitionisawellknownNP-completeproblemandcurrentlythereisnoknown polynomial time algorithm to find the optimal partition. However, many approximation algorithms have been proposed [62, 58, 35]. The approach used in this chapter exploits a multilevel paradigm [62]. This approach consists of two stages. In the first stage, the graph is contracted until the size is less than a given threshold. Then, an interactive processofexpansionandrefinementisperformedtobalancetheweightsofthepartitions. This algorithm runs fast with reasonable-size graphs. 95 Once the partitioning is done, we need to assign each mobile node a sub-area to carry out the path planning. If we can assume that at the beginning of the experiment, a mobile node can be deployed at any given location, the assignment can be done trivially by assigning any mobile node to any sub-area. However, sometimes, this assumption is not valid. In some cases, we might not have the tool to deploy the mobile nodes at any given location. Additionally, if the adaptive sampling algorithm is implemented as an incremental approach, the mobile nodes are required to carry out the sampling task repeatedly and hence it is not realistic to keep redeploying the mobile nodes. In this case, we need to assign the sub-areas to mobile nodes in such a way that the sum of the distances between current locations of the mobile nodes and the assigned sub-areas is minimized. This is known as the assignment problem, and the Hungarian algorithm [46] can be used to find the optimal assignment. 5.3 Performance Analysis Whenthesensingfieldispartitionedintoksub-areas,thetotalestimationerroristhesum of the estimation errors within each sub-areaIMSE total = P k i=1 IMSE i , where IMSE i is the estimation error within the i-th sub-area. If we assume that we have an algorithm to generate the path for a single mobile node to take new samples so that the sample density is the same as the optimal density, then the IMSE within each sub-area can be estimated by Equation (2.14). With the notations introduced in Section 2.2.1, we rewrite Equation (2.14) as follows. IMSE i =CI 8 d+4 i /n 4 d+4 i , 96 where n i is the number of samples within the sub-area i, and I i and C are defined as follows. I i = Z S i tr 2d d+8 {H m (x)}dx; C =R(K)σ 2 μ 2 2 (K)(d − d d+4 + 1 4 d 4 d+4 ). When equal area is used as the partition strategy, n i = n/k and the IMSE of the whole sensing field can be expressed as follows. IMSE ea = k X i=1 CI 8 d+4 i (n/k) 4 d+4 . (5.1) If IMSE opt is the estimation error with optimal global density, the ratio of IMSE ea to IMSE opt is r = k X i=1 CI 8 d+4 i (n/k) 4 d+4 / CI 8 d+4 n 4 d+4 =k 4 d+4 k X i=1 (I i /I) 8 d+4 . (5.2) It can be proved that 1≤r≤k 4 d+4 , (5.3) where the left equality happens when I i = I/k for all i, and the right equality happens when I i =1 and I j =0 for all j 6=i. Because IMSE opt is the error corresponding to the optimal density, the estimation error associated with the partition strategy of equal gain is bounded below by it, i.e., IMSE eg ≥IMSE opt . As a result, IMSE ea IMSE eg = IMSE ea IMSE opt IMSE opt IMSE eg ≤k 4 d+4 k X i=1 (I i /I) 8 d+4 ≤k 4 d+4 . (5.4) 97 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.5 1 (a) 0 5 10 15 20 25 30 35 40 1 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2 The number of mobile nodes IMSE ea /IMSE opt (b) Figure 5.1: (a) The scalar field m(x,y)=1/(1+exp(25 p x 2 +y 2 −17.5)). (b) The ratio IMSE ea /IMSE opt on the scalar field. Generally, Equation (5.4) means that the performance between the equal gain and equal areastrategydependsonboththeunderlyingscalarfieldtobeestimatedandthenumber of mobile nodes available. For the same scalar field to be estimated, the performance difference between equal gain and equal area increases as the number of mobile nodes increases. Whenthenumberofmobilenodesissmall,theperformancedifferencebetween thetwopartitionstrategiesisalsosmall. Figure5.1showstheratioIMSE ea /IMSE opt on agivenscalarfield. NotethatthisratioistheupperboundoftheratioIMSE ea /IMSE eg . One interesting observation is that the ratio in Equation (5.4) does not depend on the number of samples. 5.4 Simulations We carried out a series of simulations to verify the partition-based adaptive sampling algorithm, with both equal gain and equal area strategies. In the simulations, the path 98 planning algorithm used is Nearly Exhaustive Search, which works pretty well until the size of the graph is too big. We also implemented the sequential planning algorithm and compared the results with the partition-based approaches. Several sets of experiments have been carried on different scalar fields and all of them show similar trends. Following are the simulation results on the scalar field m(x,y) = 1/(1+exp(20x 2 −20y+12)). In the simulations, there are 121 static sensor nodes uniformly deployed in the 1x1 square and the noise level is 5%. Figure 5.2 shows the paths planned for a group of 4 mobile nodes together with the sub-areas generated. The 4 subareas are represented by using different markers, squares, triangles, and circles. As expected, the higher gain locations are gathered in the top left corner. Hence in the case where equal gain is used as the partitioning strategy, the top left sub-area is smaller than the other three sub-areas. However, both the strategies put more sampling locations near the top left corner. Figure 5.3 compares the estimation errors between three approaches: the sequential allocation, partitioning with equal area and partitioning with equal gain. The mean and standard deviation are computed based on 100 trials. From the figure we can see that theestimationerrorsofallthreeapproachesaresimilar, whilethepartitioningwithequal area is slightly worse than the other two. Given the analysis in Section 5.3, the similarity in the performances with 4 mobile nodes is not surprising. Actually, in the cases where an ideal single path planning algorithm is given,IMSE ea /IMSE eg is no more than 1.58. In the simulations, the Hessian matrix is not known and has to be estimated and the estimation error associated with equal gain is worse than in the ideal case and hence performance difference is not obvious. 99 0 0.2 0.4 0.6 0.8 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 x y (a) 0 0.2 0.4 0.6 0.8 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 x y (b) Figure 5.2: The paths planned for a group of 4 mobile nodes. (a) Partitioned with equal area strategy; (b) Partitioned with equal gain strategy. 100 0 0.5 1 1.5 2 2.5 3 3.5 0.7 0.8 0.9 1 1.1 1.2 1.3 1.4 1.5 1.6 1.7 energy errors eg ea seq Figure 5.3: Comparison of the estimation errors of three strategies: sequential allocation, partition with strategy of equal area and partition with strategy of equal gain. 0 0.5 1 1.5 2 2.5 3 3.5 0 200 400 600 800 1000 1200 1400 1600 1800 2000 Energy time (seconds) Equal Gain Equal Area Sequential Allocations Figure 5.4: The running time of the three strategies. 101 −0.2 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 0 1 2 3 4 5 6 7 8 Energy Level RSS Equal Area Equal Gain Figure 5.5: Performance comparison between the two partition strategies with 32 mobile nodes. Figure 5.4 compares the running time of the three approaches. The running times of the two partition-based approaches are very close to each other while the running time of the sequential allocation is much higher than the other two. Figure 5.5 shows the case with 32 mobile nodes. Note that in this test, we utilize an incremental approach. The adaptive sampling is carried out in 5 stages. In each stage, the mobile nodes use the sensor readings from both static sensors and previous stages to reconstruct the scalar field and estimate the Hessian matrix. Each mobile node starts from its previous state in the previous stage. The estimation error corresponding toenergy =0 is the estimation error by using only the readings from static sensor nodes. From the figure, we can see that, on the one hand, the performance difference between equal gain and equal area increases when the number of mobile nodes increases, which verifies Equation (5.4); On the other hand, the difference increases with the progress of 102 the incremental approach, i.e., with the increase of the number of samples. The reason is that as the number of new samples increases, the estimate of the Hessian matrix of the scalar field becomes more accurate. As a result, the partitioning can be done more accurately and hence the performance increases. 5.5 Conclusion In this chapter, we discussed the adaptive sampling problem with a small number of mobile nodes. We present an approach to coordinate multiple mobile nodes, i.e., par- titioning the sensing field into sub-areas and apply the path planning algorithm for a single mobile node in each sub-area. Two partitioning strategies are explored: equal gain and equal area. We studied the reconstruction errors of the two strategies and also compared them to sequential planning. Simulations show that both the strategies have similar reconstruction error as well as sequential planning. Simulations also show that the partitioning approaches take less running time than sequential planning. An analysis of the partitioning strategies shows that when the number of mobile nodes is small, the difference between equal gain and equal area is small, especially in the case where the Hessian matrix is to be estimated. 103 Chapter 6 Conclusion In this dissertation, we studied the adaptive sampling problem for scalar field estimation byusingaroboticsensornetwork. Theproblemofadaptivesamplingisdefinedasfollows: Given some initial sensor readings, where should mobile node(s) take additional readings so that the estimation error is minimized? Ifthecosttomovethemobilenodesisnegligiblecomparedtothecosttotakesamples, the adaptive sampling problem is an optimal experimental design problem, which has been well studied. In many applications, a parametric model of the underlying scalar field is not available. Therefore, we have focused on design strategies for non-parametric regression, such as local linear regression, thin plate splines and universal kriging. The design strategies depend on the specific regression techniques. For local linear regression, with theasymptotic analysisofmean squarederror, aclosedform solutiontothe optimal density of samples is presented. For thin plate splines, a two-stage strategy is proposed: The number and locations of knots are first determined by minimizing general cross validation, then a model-based optimal design algorithm is used to generate the optimal distribution of the samples, which minimizes the estimation variance. Optimal design for 104 universal kriging also minimizes the estimation variance but the initial sensor readings are used to estimate the semivariogram. For robotic sensor networks, the cost for moving a node is much higher than taking a sensorreading. Thecostisnormallytheenergyconsumptionsinceroboticsensornetworks are powered by batteries. In this case, the set of sampling locations generated by optimal design is no longer the ideal choice. We proposed adaptive sampling algorithms for a single mobile node in Chapter 3. Given the energy available, the algorithms are able to generate a nearly optimal path for the mobile node in the sense of minimizing the estimation errorwith limitedenergy. Weassume that the underlying scalarfield doesnot change until the mobile nodes finish collecting samples. Inthecaseofalargenumberofmobilenodes, weproposedadistributedalgorithmfor adaptive sampling. It is assumed that the number of nodes is so large that each mobile node only needs to take sensor readings in one location for estimation. As a result, the distribution of sensor readings is the same as the distribution of nodes. Since optimal designs can be used to generate the ideal sample distribution, and hence the ideal node distribution, the adaptive sampling problem now becomes the problem of deploying the mobile nodes so that node density is proportional to a given function. We proposed a distributed algorithm to solve the problem in Chapter 4. For the case where a small number of mobile nodes are used, each mobile node has to take sensor readings in multiple locations and collaborate with other mobile nodes. We first partition the sensing field into sub-areas. Within each sub-area the path planning algorithm for a single mobile node is used to produce a path. Simulations show that the approach is able to produce good paths within a short period of time. 105 The choice of which scheme to use is mainly determined by the resources available, e.g., the number of mobile nodes. If the resources are not constrained, the scale and the properties of the phenomenon would play an important role in determining the scheme to use. If the scale is large or the phenomenon changes fast, the multiple mobile nodes configuration is preferred, since both path planning and sample collecting can be done faster. When the phenomenon changes slowly or the scale is small, a single mobile node is enough to accomplish the task and hence the scheme for single mobile node can be used. In conclusion, by combining optimal design techniques, path planning algorithms and inter-robot coordination mechanisms, adaptive sampling algorithms can be designed for a robotic sensor network so that the resources of the system, such as energy and sensing ability, can be used efficiently in real world settings. 106 References [1] E. M. Arkin, J. S. B. Mitchell, and G. Narasimhan. Resource-constrained geomet- ric network optimization. In Proceedings of the fourteenth annual symposium on Computational geometry, pages 307–316, 1998. [2] S.AroraandG.Karakostas. A2+ǫapproximationalgorithmforthek-mstproblem. InProceedings of SIAM Symposium on Discrete Algorithms (SODA),pages754–759, 2000. [3] A. Atkinson and A. Donev. Optimum Experimental Designs. Oxford Science Publi- cations, 1992. [4] B. Awerbuch, Y. Azar, A. Blum, and S. Vempala. New approximation guarantees for minimum-weight k-trees and prize-collecting salesmen. SIAM Journal on Com- puting, 28(1):254–262, 1998. [5] N. Bansal, A. Blum, S. Chawla, and A. Meyerson. Approximation algorithms for deadline-tsp and vehicle routing with time-windows. In Proceedings of 36th Annual ACM Symposium on Theory Computer, pages 166–174, 2004. [6] M. A. Batalin and G. S. Sukhatme. Coverage, exploration and deployment by a mobile robot and communication network. Telecommunication Systems Journal, Special Issue on Wireless Sensor Networks, 26(2):181–196, 2004. [7] M. A. Batalin and G. S. Sukhatme. The design and analysis of an efficient local algorithm for coverage and exploration based on sensor network deployment. IEEE Transactions on Robotics, 23(4):661–675, August 2007. [8] A. Blum, S. Chawla, D. R. Karger, T. Lane, A. Meyerson, and M. Minkoff. Approx- imation algorithms for orienteering and discounted-reward tsp. In Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science, pages 46–55, 2003. [9] Z.BulterandD.Rus. Controllingmobilesensorsformonitoringeventswithcoverage constraints. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages 1568–1573, New Orleans, LA, May 2004. [10] N.Bulusu,D.Estrin,L.Girod,andJ.Heidemann. Scalablecoordinationforwireless sensor networks: Self-configuring localization systems. In Proceedings of the Sixth IEEEInternationalSymposiumonCommunicationTheoryandApplications(ISCTA ’01), July 2001. 107 [11] N.Bulusu,J.Heidemann,andD.Estrin. Adaptivebeaconplacement. InProceedings of International Conference on Distributed Computing Systems,pages489–498,April 2001. [12] W. Burgard, M. Moors, C. Stachniss, and F. E. Schneider. Coordinated multi-robot exploration. IEEE Transactions on Robotics, 21(3):376–386, 2005. [13] K. Chaudhuri, B. Godfrey, S. Rao, and K. Talwar. Paths, trees and minimum latency tours. In Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science, pages 36–45, 2003. [14] S. Chawla. Graph Algorithm for Planning and Partitioning. PhD thesis, School of Computer Science, CMU, 2005. [15] C. Chekuri, N. Korula, and M. Pal. Improved algorithms for orienteering and re- lated problems. In Proceedings of ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 661–670, 2008. [16] K. Chen and S. Har-Peled. The orienteering problem in the plane revisited. In Proceedings of the twenty-second annual symposium on Computational geometry, pages 247–254, 2006. [17] N. Christofides. Worst-case analysis of a new heuristic for the travelling salesman problem. In Proceedings of Symposium on New Directions and Recent Results in Algorithms and Complexity, pages 441–450. Academic Press, 1976. [18] T. Clouqueur, V. Phipatanasuphorn, P. Ramanathan, and K. K. Saluja. Sensor de- ployment strategy for target detection. In Proceedings of the 1st ACM International Workshop on Wireless Sensor Networks and Applications, pages 42–48. ACM Press, 2002. [19] P. I. Corke, S. E. Hrabar, R. Peterson, D. Rus, S. Saripalli, and G. S. Sukhatme. Autonomous deployment and repair of a sensor network using an unmanned aerial vehicle. In Proceedings of the IEEE International Conference on Robotics and Au- tomation (ICRA), pages 3602–3609, New Orleans, LA, April 2004. [20] K.Dantu,M.H.Rahimi,H.Shah,S.Babel,A.Dhariwal,andG.S.Sukhatme. Robo- mote: Enabling mobility in sensor networks. In Proceedings of IEEE/ACM Fourth International Conference on Information Processing in Sensor Networks (IPSN- SPOTS), pages 404–409. IEEE, April 2005. [21] F. Dellaert, D. Fox, W. Burgard, and S. Thrun. Monte carlo localization for mo- bile robots. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages 1322–1328, Detroit, Michigan, May 1999. [22] A. Dhariwal, B. Zhang, C. Oberg, B. Stauffer, A. Requicha, D. Caron, and G. S. Sukhatme. Networked aquatic microbial observing system. In the IEEE Interna- tional Conference of Robotics and Automation (ICRA),Orlando, Florida, May2006. Poster. 108 [23] S. S. Dhillon, K. Chakrabarty, and S. S. Iyengar. Sensor placement for grid coverage under imprecise detections. In Proceedings of International Conference on Informa- tion Fusion, pages 1581–1587, 2002. [24] J.Fan. Locallinearregressionsmoothersandtheirminimaxefficiencies. The Annals of Statistics, 21(1):196–216, 1993. [25] J. J. Faraway. Sequential design for the nonparametric regression of curves and surfaces. In Proceedings of the 22nd Symposium on the Interface between Computing Science and Statistics, pages 104–110. Springer, 1990. [26] D. Fox, J. Ko, K. Konolige, B. Limketkai, D. Schulz, and B. Stewart. Distributed multirobot exploration and mapping. Proceedings of the IEEE, 94(7):1325–1339, July 2006. [27] J. H. Friedman and B. W. Silveman. Flexible parsimonious smoothing and additive modeling. Technometrics, 3(1):3–21, 1989. [28] N. Gordon, D. Salmond, and A. Smith. Novel approach to nonlinear/non-gaussian bayesian state estimation. In IEE Proceedings F of Radar and Signal Processing, pages 107–113, 1993. [29] C.Gu. Smoothingandregression. In Multivariate Spline Regression, pages329–356. Wiley, 2000. [30] W. R. Heinzelman, A. Chandrakasan, and H. Balakrishnan. Energy-efficient com- munication protocol for wireless microsensor networks. In Proceedings of the 33rd Annual Hawaii International Conference on System Sciences, volume 2, 2000. [31] A. Howard. Mezzanine user manual; version 0.00. Technical Report IRIS-02-416, Institute for Robotics and Intelligent Systems, USC, 2002. [32] A. Howard, M. J. Matari´ c, and G. S. Sukhatme. An incremental self-deployment al- gorithm for mobile sensor networks. Autonomous Robots Special Issue on Intelligent Embedded Systems, 13(2):113–126, 2002. [33] A. Howard, M. J. Matari´ c, and G. S. Sukhatme. Mobile sensor network deployment using potential fields: A distributed, scalable solution to the area coverage problem. In Proceedings of the International Symposium on Distributed Autonomous Robotic Systems, pages 299–308, June 2002. [34] M. Isard and A. Blake. Condensation - conditional density propagation for visual tracking. International Journal of Computer Vision, 29(1):5–28, 1998. [35] G.KarypisandV.Kumar. Afastandhighqualitymultilevelschemeforpartitioning irregular graphs. SIAM Journal on Scientific Computing, 20(1):359–392, 1998. [36] A. Krause, C. Guestrin, A. Gupta, and J. Kleinberg. Near-optimal sensor place- ments: Maximizing information while minimizing communication cost. In Proc- cedings of the Fifth International Conference on Information Processing in Sensor Networks (IPSN’06), pages 2–10, Apr 2006. 109 [37] D. G. Krige. A statistical approach to some basic mine valuation problem on the witwatersrand. Journal of the Chemical, Metallurgical, and Mining Society of South Africa, 52:119–139, 1951. [38] S. J. Louis, X. Yin, and Z. Y. Yuan. Multiple vehicle routing with time windows usinggeneticalgorithms. InProceedings of the 1999 Congress on Evolutionary Com- putation, volume 3, December 1999. [39] D.G.Luenberger. Optimization by Vector Space Methods. JohnWiley&Sons,1969. [40] G. Matheron. Traite de geotatistique appliquee, tome i. Memoires du Bureau de Recherches Geologiques et Minieres, 24, 1962. [41] S. Meguerdichian, F. Koushanfar, M. Potkonjak, and M. B. Srivastava. Coverage problemsinwirelessad-hocsensornetworks. InProceedingsoftheAnnualJointCon- ference of the IEEE Computer and Communications Societies (INFOCOM), pages 1380–1387, 2001. [42] Y. Mei, Y.-H. Lu, C. S. G. Lee, and Y. C. Hu. Energy efficient mobile robot ex- ploration. In Proceedings of the IEEE International Conference of Robotics and Automation (ICRA), pages 505–551, Orlando, Florida, May 2006. [43] H.-G. G. M¨ uller. Optimal design for nonparametric kernel regression. Statistics and Probability Letters, 2(5):3–14, March 1984. [44] H.-G. G. M¨ uller and U. Stadtmuller. Variable bandwidth kernel estimators of re- gression curves. The Annals of Statistics, 15(1):182–201, March 1987. [45] W. G. M¨ uller. Collecting Spatial Data. Physica-Verlag, 2001. [46] J. Munkres. Algorithms for the assignment and transportation problems. Journal of the Society of Industrial and Applied Mathematics, 1(5):32–38, March 1957. [47] D. W. Nychka. Spatial-process estimates as smoothers. In Smoothing and Regres- sion, pages 329–356. Wiley, 2000. [48] R. A. Olea. Geostatistics for engineers and earth scientists. Kluwer Academic Pub- lishers, 1999. [49] S. Poduri and G. S. Sukhatme. Constrained coverage for mobile sensor networks. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages 165–172, New Orleans, LA, May 2004. [50] N. Roy, W. Burgard, D. Fox, and S. Thrun. Coastal navigation – robot motion with uncertainty. In Proceedings of the AAAI Fall Symposium: Planning with POMDPs, volume 1, pages 35–40, 1998. [51] D. Ruppert and M. P. Wand. Multivariate locally weighted least squares regression. The Annals of Statistics, 22(3):1346–1370, 1994. [52] S. D. Silvey. Optimal Design. Chapman and Hall, 1980. 110 [53] R. Sim and N. Roy. Global a-optimal robot exploration in SLAM. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages 661–666, Barcelona, Spain, April 2005. [54] R. G. Simmons, D. Apfelbaum, W. Burgard, D. Fox, M. Moors, S. Thrun, and H. Younes. Coordination for multi-robot exploration and mapping. In Proceedings of the AAAI, pages 852–858, 2000. [55] A. Singh, M. A. Batalin, M. Stealey, B. Zhang, A. Dhariwal, B. Stauffer, S. Moor- thi, C. Oberg, A. Pereira, V. Chen, Y. Lam, D. A. Caron, M. Hansen, W. Kaiser, and G. Sukhatme. Human assisted robotic team campaigns for aquatic monitoring. Journal of Field Robotics, 24(11-12):969–989, November 2007. [56] A. Singh, A. Krause, C. Guestrin, W. Kaiser, and M. Batalin. Efficient planning of informative paths for multiple robots. In Proceedings of the International Joint Conference on Artificial Intelligence, pages 2204–2211, 2007. [57] K. Sohrabi, W. Merrill, J. Elson, L. Girod, F. Newberg, and W. Kaiser. Methods for scalable self-assembly of ad hoc wireless sensor networks. IEEE Transactions on Mobile Computing, 3(4):317–331, October 2004. [58] A. J. Soper, C. Walshaw, and M. Cross. A combined evolutionary search and multi- level optimisation approach to graph-partitioning. Journal of Global Optimization, 29(2):225–241, 2004. [59] C. J. Stone. The use of polynomial splines and their tensor products in multivariate function estimation. The Annals of Statistics, 22(1):118–171, 1994. [60] A. W. Stroupe, R. Ravichandran, and T. Balch. Value-based action selection for exploration and dynamic target observation with robot teams. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages 4190– 4197, New Orleans, LA, May 2004. [61] G. S. Sukhatme, A. Dahriwal, B. Zhang, C. Oberg, B. Stauffer, and D. Caron. The design and development of a wireless robotic networked aquatic microbial observing system. Environmental Engineering Science, 24(2):205–215, 2007. [62] C.WalshawandM.Cross. Meshpartitioning: Amultilevelbalancingandrefinement algorithm. SIAM Journal on Scientific Computing, 22(1):63–80, 2000. [63] G. Wang, G. Cao, and T. L. Porta. Movement-assisted sensor deployment. In Pro- ceedings of the Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM), volume 4, pages 2467–2479, March 2004. [64] E. T. Whittaker. On a new method of graduation. In the Proceedings of Edinburgh Mathematics Society, volume 41, pages 63 – 75, 1923. [65] R. Willett, A. Martin, and R. Nowak. Backcasting: Adaptive sampling for sen- sor networks. In Proceedings of the third international symposium on Information processing in sensor networks, pages 124–133, 2004. 111 [66] B.Yamauchi. Afrontier-basedapproachforautonomousexploration. InProceedings of the IEEE International Conference of Robotics and Automation (ICRA), pages 146–151, Albuquerque, NM, July 1997. [67] X. Zhang and S. B. Wicker. How to distribute sensors in a random field? In Proceedingsof the third international symposiumon Information processingin sensor networks, pages 243–250. ACM Press, 2004. [68] R. Zlot, A. Stentz, M. B. Dias, and S. Thayer. Multi-robotexploration controlled by amarketeconomy. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), volume 3, pages 3016–3023, Washington, DC, May 2002. [69] Y. Zou and K. Chakrabarty. Sensor deployment and target localization based on virtual forces. In Proceedings of the Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM), volume 2, pages 1293–1303, March 2003. 112 Appendix Publications 1. BinZhangandGauravS.Sukhatme. Adaptivesamplingforestimatingascalarfield usingamobilerobotandasensornetwork. InProceedingsoftheIEEEInternational Conference on Robotics and Automation (ICRA), pages 3673–3680, Roma, Italy, April 2007. 2. Amarjeet Singh, Maxim A. Batalin, Michael Stealey, Bin Zhang, Amit Dhariwal, Beth Stauffer, Stephanie Moorthi, Carl Oberg, Arvind Pereira, Victor Chen, Ye- ung Lam, David A. Caron, Mark Hansen, William Kaiser, and Gaurav Sukhatme. Human assisted robotic team campaigns for aquatic monitoring. Journal of Field Robotics, 24(11-12):969-989, November 2007. 3. Gaurav S. Sukhatme, Amit Dhariwal, Bin Zhang, Carl Oberg, Beth Stauffer, and David Caron. The design and development of a wireless robotic networked aquatic microbial observing system. Environmental Engineering Science, 24(2):205-215, 2007. 4. Amit Dhariwal, Bin Zhang, Carl Oberg, Beth Stauffer, Aristides Requicha, David Caron,andGauravS.Sukhatme. Networkedaquaticmicrobialobservingsystem. In the IEEE International Conference of Robotics and Automation (ICRA), Orlando, Florida, May 2006. Poster. 5. Bin Zhang and Gaurav S. Sukhatme. Controlling sensor density using mobility. In Proceedings of the Second IEEE Workshop on Embedded Networked Sensors, pages 141–149, Sydney, Australia, May 2005. 6. Bin Zhang, Gaurav S. Sukhatme, and Aristides A. Requicha. Adaptive sampling for marine microorganism monitoring. In Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 1115–1122, Sendai, Japan, September 2004. 113
Abstract (if available)
Abstract
Robotic sensor networks provide new tools for in-situ sensing in challenging settings such as environmental monitoring. Motivated by applications in marine biology we study the field reconstruction problem using a robotic sensor network. We focus on the adaptive sampling problem. The network makes sequential control decisions about where to sample the environment to optimally reconstruct the field being probed. Reconstruction errors are thus reduced by adjusting the sample distribution. In a robotic sensor network, the static network nodes provide long term continuous sensor data (samples) at fixed locations. On the other hand, the mobile nodes (robots) are able to adjust the distribution of the samples but the number of samples they can take is limited. We present algorithms that exploit the advantages of both static and mobile nodes. Samples from static nodes are used to bootstrap a coarse estimate of the field, and the robots take additional samples to successively refine the estimate. Three cases are studied in this dissertation: a static network augmented with 1. a single robot, 2. a large number of robots, and 3. a small number of robots. For the first case, approximation algorithms developed for the orienteering problem are used for generating near-optimal data-gathering tours once the gain of each location is computed by using optimal experimental design for non-parametric regression. For the second case, an auction-based algorithm is used for coordination between mobile nodes. For the final case, spatial and task partitioning methods are used to coordinate robots with each partition being treated as in the single robot case. The methods developed in this dissertation have been extensively tested in simulations. Experimental field trials (several km in two lakes and a harbor) on a physical network of two robotic boats and ten static buoys validate the algorithms.
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
PDF
Multi-robot strategies for adaptive sampling with autonomous underwater vehicles
PDF
Reconfiguration in sensor networks
PDF
A robotic system for benthic sampling along a transect
PDF
Informative path planning for environmental monitoring
PDF
Mobility-based topology control of robot networks
PDF
Data-driven robotic sampling for marine ecosystem monitoring
PDF
Interaction and topology in distributed multi-agent coordination
PDF
The interplay between networks and robotics: networked robots and robotic routers
PDF
Aging analysis in large-scale wireless sensor networks
PDF
Remote exploration with robotic networks: queue-aware autonomy and collaborative localization
PDF
Active sensing in robotic deployments
PDF
Algorithmic aspects of throughput-delay performance for fast data collection in wireless sensor networks
PDF
Robust real-time vision modules for a personal service robot in a home visual sensor network
PDF
Data scarcity in robotics: leveraging structural priors and representation learning
PDF
Compilation of data-driven macroprograms for a class of networked sensing applications
PDF
Rate adaptation in networks of wireless sensors
PDF
Learning from planners to enable new robot capabilities
PDF
Distributed wavelet compression algorithms for wireless sensor networks
PDF
Sensing with sound: acoustic tomography and underwater sensor networks
PDF
Communication and estimation in noisy networks
Asset Metadata
Creator
Zhang, Bin
(author)
Core Title
Adaptive sampling with a robotic sensor network
School
Viterbi School of Engineering
Degree
Doctor of Philosophy
Degree Program
Computer Science (Robotics
Publication Date
04/18/2008
Defense Date
03/06/2008
Publisher
University of Southern California
(original),
University of Southern California. Libraries
(digital)
Tag
adaptive sampling,OAI-PMH Harvest,path planning,robotic sensor network,sensor actuator network
Language
English
Advisor
Sukhatme, Gaurav S. (
committee chair
), Caron, David A. (
committee member
), Hansen, Mark H. (
committee member
), Kempe, David (
committee member
)
Creator Email
binz@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-m1147
Unique identifier
UC165427
Identifier
etd-Zhang-20080418 (filename),usctheses-m40 (legacy collection record id),usctheses-c127-58539 (legacy record id),usctheses-m1147 (legacy record id)
Legacy Identifier
etd-Zhang-20080418.pdf
Dmrecord
58539
Document Type
Dissertation
Rights
Zhang, Bin
Type
texts
Source
University of Southern California
(contributing entity),
University of Southern California Dissertations and Theses
(collection)
Repository Name
Libraries, University of Southern California
Repository Location
Los Angeles, California
Repository Email
cisadmin@lib.usc.edu
Tags
adaptive sampling
path planning
robotic sensor network
sensor actuator network