USC Computer Science Technical Reports, no. 887 (2007)
Jabed Faruque, Ahmed Helmy. "PBS: A virtual grid architecture for information gradient-based active querying in sensor networks." Computer Science Technical Reports (Los Angeles, California, USA: University of Southern California. Department of Computer Science) no. 887 (2007).
PBS: A Virtual Grid Architecture for Information Gradient-based Active Querying in Sensor Networks

Jabed Faruque
Department of Electrical Engineering
University of Southern California
Los Angeles, CA 90089
faruque@usc.edu

Ahmed Helmy
Department of Computer and Information Science and Engineering
University of Florida
Gainesville, FL 32611
helmy@ufl.edu

Abstract: Every physical event diffuses its effect geographically, which results in a perceivable information gradient within the proximity of the phenomenon. In this paper, we propose a novel framework that exploits this diffusion property to form a virtual grid-based querying architecture, Probe-before-Spray (PBS), for wireless sensor networks. PBS divides the sensor field on demand based on the query type and parameters in addition to the gradient spread. It also combines gradient routing and in-network processing for efficient and scalable querying in sensor networks. Based on PBS, we design new algorithms to process basic aggregate queries (count, sum, average, max and min) and combined queries. We analyze the worst-case overhead of processing these queries using PBS. Using extensive simulations, we also demonstrate that PBS reduces search overhead significantly (by over 30%) when processing such queries while attaining accuracy over 99%.

I. INTRODUCTION

The fine-grained environmental monitoring capability of wireless sensor networks associates the physical world with computing platforms. With the advances in sensor technology, it is possible to detect and/or measure a wide variety of physical phenomena such as temperature, light, sound, radiation, humidity, chemical contamination, nitrate level in water, etc.
In real life, within the proximity of the phenomenon, every physical event leaves some fingerprints, i.e., an information gradient in terms of the event's effect; e.g., fire increases temperature, a chemical spill increases contamination, a nuclear leak increases radiation, and so on. Moreover, most physical phenomena follow known diffusion laws [16][17] with distance. That is, $f(d) \propto \frac{1}{d^{\alpha}}$, where $d$ is the distance from the point having the maximum effect of the event, $f(d)$ is the magnitude of the event's effect, and $\alpha$ is the exponent of the diffusion function, which depends on the type of effect and the medium; e.g., for light $\alpha = 2$, and for heat $\alpha = 1$. Using the diffusion property of an event's effect, we can determine the magnitude of the event remotely if the distance from the event is known. Conversely, for a given magnitude of an event source, it is also possible to estimate the spread (e.g., area) of the effect through known diffusion laws, empirical data or the local collaboration of sensor nodes. This property of the information gradient can be utilized for query processing, especially for on-demand query processing about the event(s). In this document, we refer to the area around an event source from which the event's effect can be perceived as "the geometry of event's effect". One of the challenges of using this diffusion property is that the gradient is not perfect in reality and suffers distortion due to various environmental effects. We carefully consider this fact in our design and analysis.

In recent years, sensor networks have been viewed as distributed databases that collect measurements of the physical world [1]. Users specify the named data they want to collect or the event of interest through application-specific or declarative queries, and the infrastructure efficiently collects and processes the data within the sensor network.
Typical queries can be on-demand simple one-shot, aggregate or combined queries, or long-lived continuous queries.

Existing declarative query processing systems, including TinyDB [9] and Cougar [12], resolve queries via a routing tree, which needs to be established by initial flooding throughout the network. Application-specific query mechanisms, like Directed Diffusion [2], also disseminate the interest hop-by-hop throughout the network, similar to flooding; however, they optimize interest forwarding based on query parameters, e.g., location. Random walk based mechanisms, like ACQUIRE [3], process simple one-shot and combined queries by leveraging replicated data. Nevertheless, all such approaches ignore the physical properties of the events of interest. Information gradient-based query mechanisms [4][5][8] exploit the diffusion patterns of events for directionality towards source(s) or node(s) that satisfy given query parameters. However, in a multiple-event scenario, sources can be sparsely located and create non-overlapping information gradient regions. Existing information gradient-based approaches are then unable to explore the gradients due to all sources, so the query processing may produce only partial results.

In this study, we propose a novel framework that exploits the diffusion property to form a virtual grid-based architecture, Probe-before-Spray (PBS), to process information gradient-based queries. Leveraging geographical information and the geometry of event's effect, the querier (i.e., sink) establishes a virtual grid structure in a sensor field and initiates the query in each grid cell. The grid structure of PBS uses the geometry of event's effect to introduce a search scope and reduce search overhead. Here, the cell size can vary with query type. Also, PBS uses probing to identify the occurrence or the existence of event(s) and saves search overhead, especially for a region without event(s).
Further, PBS overcomes the limitation of existing information gradient-based query processing approaches and explores the information gradients due to sparsely located sources. Compared to existing grid-based architectures, the grid cells of PBS are resizable and can vary in size. Also, the querier establishes the grid on demand in the network. Further, based on the proposed PBS architecture, we have designed algorithms to compute basic aggregate queries (count, sum, average, max and min) and combined queries, which combine multiple sub-queries by a conjunction operator.

In this study, we focus our attention on the set of events where the event's effect diffuses after its occurrence. Here, we assume that the sensor nodes are able to detect the changes due to event(s). Several recent works ([18], [19]) justify this assumption. Also, initially we assume that the surrounding region of an event source is obstacle free, so that the information gradient of the event's effect can diffuse. However, this assumption can be relaxed using local collaboration to detect such obstacles. Finally, throughout this study, we consider the approximate geometry of event's effect based on empirical results and known diffusion laws. This geometry may change with environmental conditions. Precise computation of the geometry of event's effect is beyond the scope of this work. However, for performance evaluation, we consider a distortion of diffusion that captures this effect to some extent.

The main contributions of this paper include:
• Proposing a novel architecture, Probe-before-Spray (PBS), that forms resizable virtual grid cells to process information gradient-based queries. This reduces search overhead significantly by exploiting the geometry of event's effect and the geographical information. In addition, PBS combines information gradient-based routing and in-network processing.
• Developing new query processing algorithms based on PBS to process aggregate queries (count, sum, average, max and min) and combined queries.
• Analyzing the worst-case performance of the query processing algorithms as well as the PBS architecture using simple analytical models.
• Evaluating the performance of the query processing algorithms based on PBS using a realistic simulation model. The algorithms are found to be robust (success rate over 99%) and to reduce energy overhead significantly (by more than 30%) compared with usual flooding-based approaches.

II. RELATED WORK

From the sensor network perspective, query processing is an effort to co-design both the query processing and networking subsystems to enable efficient and scalable self-organized data retrieval and in-network processing in a reliable, energy-efficient and timely manner.

Among the several in-network query systems, Directed Diffusion [2] is the pioneering work. Instead of using a query language like SQL, this approach focuses on both query dissemination mechanisms and flexible in-network processing. All protocols based on this approach describe a query by interest messages. A sink node originates the interest and disseminates it in the network by flooding. The interest forwarding decision depends on query parameters, for example location attributes. However, this approach also disseminates the query within regions having no event.

Declarative query processing systems, like TinyDB [9] and Cougar [12], use flooding to disseminate queries in the network and collect the replies via a routing tree, where the root node usually corresponds to the user's physical location. Here, queries are parsed and optimized at the user's PC and then injected into the tree-based sensor network for processing. As in Directed Diffusion, in-network processing can be done at leaf nodes or intermediate nodes to reduce the amount of data flowing to the root.
Leveraging geographical information and the diffusion spread, i.e., the geometry of event's effect, our proposed query-processing architecture, PBS, forms a virtual grid on demand in the sensor field. Depending on the query type and parameters as well as the diffusion pattern of the event of interest, PBS determines the grid cell size. Also, it selects one node of each cell as a virtual querier that probes the cell to check the existence of event source(s) and initiates query forwarding in the cell. Thus, PBS eliminates search overhead in cell(s) where the required sources are absent and performs in-network query processing without flooding the whole network. It is important to mention that PBS is appropriate for the set of events whose effect can be perceived within the surrounding region of an event source.

Several query systems define policies to avoid flooding for query dissemination and forward the query only to nodes that produce relevant results for a particular query. For example, [10] uses a semantic routing tree (SRT) to limit the query dissemination to nodes whose readings are within a particular range. Here, each node needs to collect information about its children or subtrees. The SRT concept is analogous to an index in a conventional database system and is suitable for less dynamic environments. In contrast, PBS perceives the presence of event(s) through probing and avoids searching where the probing fails. Another example is [6], which discovers querying paths for target tracking. This approach uses an objective function to choose a node that optimizes the usefulness of sensor data and the corresponding communication costs along the paths.

The model-based data acquisition scheme proposed in [11] has some similarity with our approach in its use of the diffusion model concept. Their proposed architecture incorporates model-based approximate query answering to optimize the data gathering.
However, we use known diffusion models just to estimate the geometry of event's effect.

The use of virtual grids in PBS has a certain similarity with the TTDD [13] approach for scalable and efficient data delivery to multiple mobile sinks. In that approach, each data source establishes a grid as needed, and sensor nodes at the crossing points of the grid receive data from the source. In contrast to PBS, the grid cell size in TTDD is fixed and independent of the geometry of event's effect. Also, TTDD is not suitable for in-network query processing.

In [7], three information-driven algorithms, DAM, EBAM and EMLAM, have been proposed for constructing and maintaining sensor aggregates that collectively monitor target activity in the environment. All three algorithms are used for leader election without addressing query processing and the associated routing issues. Through these algorithms, the target count can be obtained from the total number of elected leaders.

In this work, in addition to proposing the querying architecture, Probe-before-Spray (PBS), we also develop new algorithms for basic aggregate queries (count, sum, average, max and min) and combined queries. Also, we analyze the performance of the algorithms using simple analytical models and extensive simulations.

III. OVERVIEW OF PBS ARCHITECTURE

The proposed virtual grid-based active querying architecture, PBS, relies upon two foundations: (1) the geometry of event's effect, and (2) an underlying geographic routing scheme.

An event source having a specific magnitude (say, X) diffuses its effect in the sensor field. Depending on the sensitivity of the embedded sensors, this diffusion spreads up to a certain area. At the periphery, the recorded magnitude of the event's effect is much lower than X. However, this small magnitude of the event's effect can be regarded as an indication that the required source(s) may exist within that area, which is called "the geometry of event's effect" in this paper.
Leveraging this key idea, a sensor node, S, having minimum sensitivity $m$ (where $m < X$) can establish a virtual circular contour, C, within which it can detect the presence of a source having magnitude X. In addition to known diffusion laws, C can also be determined by empirical data or local collaboration.

Virtual grid formation is illustrated in Figure 1. Let the distance between the source and the node S be $d$. Then, according to the geometry of event's effect, the radius of contour C is $d$. For a conservative estimate of $d$, and to avoid the gaps or overlapping areas that circular regions would cause, we consider the inner (inscribed) square of C, say $C_{is}$. Thus, the node S is able to detect the presence of a source having magnitude X within $C_{is}$. Here, the length of each side of $C_{is}$ is $d\sqrt{2}$.

Fig. 1. Virtual grid of PBS with virtual queriers, $Q_v$. If S is located at $(x, y)$, then $(x_1, y_1)$ and $(x_3, y_3)$ are $(x + \frac{d}{\sqrt{2}}, y + \frac{d}{\sqrt{2}})$ and $(x - \frac{d}{\sqrt{2}}, y - \frac{d}{\sqrt{2}})$ respectively.

In a two-dimensional sensor field, using $C_{is}$ as the area of a grid cell, PBS divides the region specified by the query parameters into grid cells as shown in Figure 1. Depending on the query type, the cell sizes can be equal (e.g., count, sum, average) or variable (e.g., max, min, combined). For each cell, the node closest to the center of the cell is considered the virtual querier, $Q_v$. These virtual queriers initiate the query in the corresponding grid cells on behalf of the querier.

To initiate the query in a cell, the corresponding virtual querier, $Q_v$, performs the following two tasks.
1) Information probing: $Q_v$ uses a probing phase to identify the existence of an information gradient in a cell. To improve the quality of probing, $Q_v$ also collects data about the required information gradient from its one-hop neighbors.
2) Query spray, i.e., dissemination: If information probing finds the required information gradient, $Q_v$ disseminates the query either by scoped flooding or by an information gradient-based query dissemination mechanism, like RUGGED [8]. The routing protocol RUGGED uses braided multiple-path exploration and controls the instantiation of paths using a probabilistic function. In the information gradient region, a node forwards the query greedily towards the region having the required level (according to the query parameters) of information gradient, where nodes use scoped flooding to find all nodes that satisfy the given query. Here, both the query parameter(s) and the boundary of the grid cell limit the scope of flooding. On the other hand, when the probing is unable to identify the required information gradient in the cell, $Q_v$ simply forwards the query to the $Q_v$ of the next cell.

In addition to query dissemination within grid cells, the proposed querying mechanism uses a geographical routing protocol, GPSR [14], to route the query among the $Q_v$s and to return the reply. GPSR was developed previously to enable packet/query delivery to a node at a specified location. This routing mechanism has two modes: (1) greedy-mode forwarding, and (2) perimeter-mode traversal. In greedy mode, when a node receives a packet destined to a node at location $(x, y)$, it forwards the packet to the neighbor closest to $(x, y)$. In the absence of any such neighbor, or due to the existence of a void in the network, the node forwards the packet using perimeter-mode traversal, which uses the right-hand rule to get around the void.

According to the above description, the PBS architecture is based on three assumptions. First, all nodes know the geometry of event's effect. This depends on the type of application and sensing modality.
As previously mentioned, in this study, we consider the approximate geometry of event's effect based on known diffusion laws (e.g., light, temperature, etc.), empirical results or local collaboration. The approximate estimation of $d$ is detailed in Section IV. Second, all nodes know the approximate geographical perimeter of the network, which may be configured at the time of deployment or obtained using a simple discovery protocol. Finally, node locations can be determined using existing localization protocols.

Although the basic idea of PBS is simple, the main challenge is to design energy-efficient query processing algorithms for various query types. Here, we develop algorithms for aggregate queries (count, sum, average, max and min) and combined queries using the PBS architecture. The following sections detail the approximate estimation of 'd' and the query processing algorithms.

IV. APPROXIMATE ESTIMATION OF 'd'

Before describing the approach to estimate the approximate value of 'd', we first present some empirical results to support the fact that an event's effect follows a diffusion law in a real environment.

For the empirical experiments, we measure light diffusion in both an empty room (for minimum surface reflection) and an office room (for moderate surface reflection) in the presence of ambient light. We use a high-precision digital light meter (EXTECH, model 401025) and omni-directional light sources having different magnitudes. Here, we measure the change in light intensity due to the omni-directional light source. In both scenarios, we observe a similar pattern of light diffusion with diffusion parameter $\alpha = 2$, as shown in Figure 2.

Fig. 2. Light diffusion patterns in two different environments (light intensity in foot-candles versus distance, $d$, from the source in feet). Here, squares and triangles represent the measured data. Curve fitting is used to determine the diffusion equations: $f_{es}(d) = \frac{43.584}{d^2}$ (empty room, strong light source), $f_{os}(d) = \frac{37.166}{d^2}$ (office room, strong light source), $f_{ew}(d) = \frac{21.841}{d^2}$ (empty room, weak light source), and $f_{ow}(d) = \frac{20.688}{d^2}$ (office room, weak light source).

Although the same light sources are used in both rooms, the dark surface of the office room absorbs some portion of the light, so the measured light intensity in the office room is slightly lower than that in the empty room. We also observe that the signals of multiple non-coherent sources (e.g., light sources) have an additive effect at each point of overlapping diffusion regions. Note that we use this empirical data set to emulate event sources in the simulations that evaluate the performance of the PBS architecture and the proposed algorithms.

Now, to estimate the approximate value of 'd', let the minimum change detection sensitivity of a sensor node be $m$ for the event of interest. Also, assume that the event's effect follows a diffusion law with diffusion parameter $\alpha$. Now, consider a query to find node(s) having magnitude X. If a sensor node can measure an effect of magnitude $m$ at distance $d$ from the source of interest, then the distance $d$ can be expressed as

$$d = \sqrt[\alpha]{\frac{X}{m}}. \qquad (1)$$

Here, the value of $\alpha$ may change with environmental conditions and the elasticity of the medium. Thus, the above equation computes only the approximate geometry of event's effect.

V. QUERY PROCESSING ALGORITHM

In this section, we describe the details of the new query processing algorithms that use the PBS architecture.

A. Aggregate query - Count, Sum and Average

The aggregate query Count counts the total number of sensor nodes in the network that satisfy the given query parameters; for example, find the number of nodes in a sensor field having a temperature sensor reading of 200 °F or more due to fire or equivalent event(s). In addition to the geographical scope, the query parameters also specify the magnitude(s) of the source(s) or event's effect (e.g., temperature, light, etc.) of interest.
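Since the cell size used by the algorithms in this section follows from Equation (1), a small numerical sketch may help. The Python below is illustrative only and not part of the original report; the function names `effect_radius` and `cell_side` are hypothetical, and it assumes that X is expressed as the magnitude at unit distance, so that $f(d) = X/d^{\alpha}$.

```python
import math


def effect_radius(source_magnitude, min_sensitivity, alpha):
    """Approximate d from Equation (1): d = (X / m) ** (1 / alpha).

    Assumes (for this sketch only) that X is the magnitude at unit distance.
    """
    if source_magnitude <= 0 or min_sensitivity <= 0 or alpha <= 0:
        raise ValueError("X, m and alpha must all be positive")
    return (source_magnitude / min_sensitivity) ** (1.0 / alpha)


def cell_side(d):
    """Side length of the inner square of a circle of radius d, i.e. d * sqrt(2)."""
    return d * math.sqrt(2)


if __name__ == "__main__":
    # Hypothetical query: magnitude X = 100, sensor sensitivity m = 1, light (alpha = 2).
    d = effect_radius(100.0, 1.0, 2.0)
    print(f"d = {d:.2f}, cell side = {cell_side(d):.2f}")
```

A larger queried magnitude X gives a larger $d$ and hence fewer, larger grid cells for the same sensor field.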
Existing querying approaches start by disseminating the query by flooding within the specified geographic scope and then use in-network aggregation for counting. When a node and/or its descendants satisfy the query, the node reports the accumulated count to its parent node. Here, the initial flooding causes significant energy dissipation. The Count algorithm developed using PBS leverages the geometry of event's effect that corresponds to the query parameter to reduce the search overhead where the required event source(s) are not present. Sum and Average queries are similar to the Count query. In these cases, in addition to counting the nodes, they also aggregate the sensor readings that satisfy the given query parameter(s). In this paper, we describe only the Count algorithm.

Consider a query about an event of type E having magnitude X or more. Estimate the approximate geometry of an event having magnitude X using Equation (1). Here, we consider an obstacle-free environment for the diffusion of the event's effect, which will be relaxed later. Assume that the event's effect diffuses up to $d_x$ and that beyond that distance the magnitude of the event's effect drops below the minimum sensitivity of a sensor node, i.e., $m$. The steps of the Count algorithm are as follows:
1) Establish a virtual grid of cells where the area of each cell is $d_x\sqrt{2} \times d_x\sqrt{2}$, except for the edge cells.
2) The virtual querier, $Q_v$, uses information probing as described in Section III to find the existence of an information gradient within its corresponding cell.
a) If the probing finds an information gradient in the cell, $Q_v$ disseminates the query in the cell as described in Section III to find source(s) having magnitude X or more.
b) Otherwise, $Q_v$ skips the query dissemination in that cell.
PBS continues this step for all remaining cells in the sensor field.

In step (2a), we consider only sources having magnitude X or more.
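The Count steps above can be sketched as a small centralized simulation. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Node` class, the `pbs_count` function and the `radio_range` parameter are hypothetical, probing is modeled as $Q_v$ plus its one-hop neighbors checking for any reading at or above the sensitivity $m$, query spray is modeled as scoped flooding over the cell, and routing between virtual queriers is not counted.

```python
import math
from dataclasses import dataclass


@dataclass
class Node:
    x: float
    y: float
    reading: float  # sensed magnitude of the event's effect at this node


def pbs_count(nodes, width, height, d_x, threshold, sensitivity, radio_range):
    """Return (count, transmissions) for a PBS-style Count query (sketch)."""
    side = d_x * math.sqrt(2)                 # step 1: cell side is d_x * sqrt(2)
    cols = max(1, math.ceil(width / side))
    rows = max(1, math.ceil(height / side))

    cells = {}                                # (row, col) -> nodes inside that cell
    for n in nodes:
        col = min(int(n.x // side), cols - 1)
        row = min(int(n.y // side), rows - 1)
        cells.setdefault((row, col), []).append(n)

    count = 0
    transmissions = 0
    for (row, col), members in cells.items():
        cx, cy = (col + 0.5) * side, (row + 0.5) * side
        # The virtual querier is the node closest to the (nominal) cell center.
        qv = min(members, key=lambda n: math.hypot(n.x - cx, n.y - cy))
        # Step 2: probing. Q_v broadcasts once and each one-hop neighbor replies.
        neighbors = [n for n in nodes
                     if n is not qv
                     and math.hypot(n.x - qv.x, n.y - qv.y) <= radio_range]
        transmissions += 1 + len(neighbors)
        if any(n.reading >= sensitivity for n in [qv, *neighbors]):
            # Step 2a: scoped flooding, modeled as one transmission per cell node.
            transmissions += len(members)
            count += sum(1 for n in members if n.reading >= threshold)
        # Step 2b: otherwise the cell is skipped.
    return count, transmissions
```

In this sketch a cell without any perceivable gradient costs only the probing transmissions, which is the saving that the algorithm aims for.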
Sources having magnitude less than X but located close to $Q_v$ may nevertheless produce a sufficient information gradient. In that case, information probing may identify an information gradient and $Q_v$ triggers query dissemination in that cell. In such a scenario, query dissemination using the scoped-flooding approach causes some extra overhead, while the information gradient-based query dissemination approach stops query forwarding after a few steps, since it is unable to improve the information level to X.

Fig. 3. E1, E2 and E3 are three events of the same type. The magnitudes of E1 and E2 are X or more, while E3 is much smaller. Here, small dots represent sensor nodes.

In the presence of obstacles within a cell, the information gradient pattern for the diffusion may differ among nearby nodes. Through local collaboration, nodes can divide the corresponding grid cell(s) to obtain a proper diffusion pattern within each portion of the cell.

B. Aggregate query - Max and Min

In a sensor field, the aggregate query Max finds a node that records the maximum magnitude of the event's effect. Existing approaches collect data from all nodes, and the maximum is identified at the root node, i.e., the sink. To reduce the amount of data flow, intermediate nodes suppress non-promising responses. However, flooding-based query dissemination and collecting replies through a tree (based on child-parent relationships) cause significant transmission overhead. Using the PBS architecture, we develop a new Max algorithm that reduces energy overhead significantly in most cases.

Consider a query to find the maximum magnitude, say $M$, of an event of type E. Assume that $M_x$ is the maximum sensing limit of a sensor for an event of type E. The steps to determine $M$ are as follows:
1) Determine the initial value of $M$ using scoped flooding within the virtual grid cell that corresponds to $M_x$.
Assume that $d_x\sqrt{2} \times d_x\sqrt{2}$ is the area of the cell according to Equation (1). Say $M_1$ is the maximum information gradient within the cell, with $M_1 \le M_x$. Thus, the initial value of $M$ is $M_1$. Now, assume that the area of the virtual grid cell that corresponds to the current value of $M$ (i.e., $M_1$) is $d_1\sqrt{2} \times d_1\sqrt{2}$ according to Equation (1), where $d_1 \le d_x$. This initialization step continues until some information gradient is perceived due to a source.
2) This step is similar to step (2) of the Count algorithm, except that the current cell area is determined by the current value of $M$. Here, $M$ is non-decreasing, as is the area of the cell corresponding to $M$. Depending on the result of information probing, $Q_v$ has the following two choices:
a) If $Q_v$ perceives an information gradient and, according to Equation (1), the information gradient is higher than the current value of $M$, $Q_v$ disseminates the query. Using an information gradient-based protocol, like RUGGED [8], $Q_v$ finds the maximum value, say $M_c$, within the current cell. Thus, $M$ can be updated as $M = \max(M, M_c)$. This updates $M$ to $M_c$ if $M < M_c$.
b) Otherwise, $Q_v$ skips the query dissemination within the current cell.
Continue this step to cover the whole sensor field.

Fig. 4. E1, E2 and E3 are three events of the same type, where E3 > E2 > E1. The cell labels a, b, c, ..., o indicate the order of visit, and the initial cell side is $d_x\sqrt{2}$. Scoped flooding is used in cell a and then in cell b. An information gradient is perceived in cell b, which determines the size of cell c. Cells e and g have more powerful event sources, so the cell size is increased. For cells h, i and j, the centers have already been visited, so $Q_v$ is moved diagonally to the first unvisited node in those cells. Here, the result of the Max query is the magnitude of E3.

In this algorithm, the query dissemination between $Q_v$s is not simple, due to the variability of the cell areas.
Using cells of different areas, the algorithm scans the sensor field horizontally from left to right, then from right to left, and so on. To avoid any gap between the cells of two consecutive horizontal scans, the starting position of a new horizontal scan is determined by the smallest cell of the most recently completed horizontal scan, as shown in Figure 4. This causes some overlapping cells, and the center node of a cell, the potential virtual querier, may already have been visited during the previous horizontal scan. In such a scenario, the query is forwarded diagonally away from the center within the cell until an unvisited node is found, which becomes the virtual querier, $Q_v$, of the cell.

Finally, if no source exists in a sensor field, the overall algorithm becomes equivalent to multiple scoped floodings in different parts of the sensor field that together cover all nodes.

Using steps similar to those of the above algorithm, it is also possible to design an algorithm to find an event source having the minimum magnitude.

C. Combined Query

A combined query consists of several sub-queries that are combined by a conjunction operator. In a multi-modal sensor field, the sub-queries concern different types of events having different magnitudes. Also, the corresponding diffusion patterns may follow different diffusion laws.

Consider a combined query consisting of $n$ sub-queries about $n$ different types of events, say $E_1, E_2, \ldots, E_n$, having magnitudes $X_1, X_2, \ldots, X_n$. Assume that the areas of the virtual cells corresponding to $X_1, X_2, \ldots, X_n$ are $A_1, A_2, \ldots, A_n$ respectively, where $A_i = 2d_{x_i}^2$ for $i = 1, 2, \ldots, n$, according to Equation (1). Thus, the set of possible cell areas is $A = \{A_1, A_2, \ldots, A_n\}$. Using the PBS architecture, the steps of the combined query processing algorithm are as follows:
1) Set the current cell area to $\min(A)$ and initiate information probing. This cell area allows $Q_v$ to perceive the presence of an information gradient due to any of the remaining events of interest.
2) Depending on the result of probing, $Q_v$ chooses one of the following steps:
a) If probing finds an information gradient, $Q_v$ disseminates the query within the cell to find node(s) that solve some of the unsolved sub-queries.
b) Otherwise, $Q_v$ skips the query dissemination in that cell.
3) Rebuild the set $A$ of possible cell areas based on the remaining sub-queries. If $A = \emptyset$, the query is successful and the reply is sent to the querier. On the other hand, if $A \neq \emptyset$ and the sensor field is fully visited, the query is unsuccessful. Finally, if $A \neq \emptyset$ and the sensor field is not fully visited, continue from step (1).
Here the cell areas are also variable. Thus, the query dissemination between $Q_v$s is similar to that of the Max algorithm described in Section V-B.

VI. ANALYSIS OF THE ALGORITHM

In this section, we present a simple analysis to highlight the energy efficiency of the PBS architecture when running the query processing algorithms described in Section V.

A. Assumptions

Consider a rectangular 2-D sensor field with $N$ uniformly distributed nodes. Also, let the average neighborhood size be $n_b$. Thus, the energy overhead of information probing per virtual querier, $Q_v$, equals $C_p = n_b + 1$. Here, the neighbors reply to the broadcast of $Q_v$. Collecting information about the required event from the neighbors in addition to $Q_v$ reduces the effect of environmental noise as well as the distortion of diffusion.

B. Count Query

Consider a Count query about an event of type E having magnitude X or more. Use Equation (1) to find the area of a cell, and assume that there are $x$ cells within the sensor field. For simplicity of analysis, assume that the areas of all cells are equal, including the edge cells. Thus, each cell has $n_c = \frac{N}{x}$ nodes, and there are $\sqrt{n_c}$ nodes on each side of a cell.

Let $m$ be the number of sources in the sensor field having magnitude X or more. For simplicity, assume that the sensor nodes with reading X or more for an event are located in the same cell.
Now, if $m = 1$, the probability that a cell does not have any source is $\left(1 - \frac{1}{x}\right)$. Thus, for $m$ sources, the probability of finding at least one source in a cell equals

$$P_e = 1 - \left(1 - \frac{1}{x}\right)^m \approx 1 - e^{-m/x}.$$

Here, $\left(1 - \frac{1}{x}\right)^m$ is the probability that the cell does not have any source, and the exponential form is its approximation for large $x$.

To assess the worst-case energy overhead, assume that each $Q_v$ uses scoped flooding to find the required node(s) in a cell where the sensor reading is X or more. Thus, the query spray (i.e., dissemination) overhead per cell equals

$$C_s = n_c = \frac{N}{x},$$

since the nodes are uniformly distributed. Therefore, the energy overhead per cell equals

$$T_c = P_e(C_p + C_s) + (1 - P_e)C_p.$$

Here, the first term computes the overhead in the presence of source(s), while the second term gives the overhead if no source is present in the cell.

The $Q_v$s use a geographic routing protocol (a multi-hop routing protocol) to route the query between them. Since all cells are square and identical, the distance between two consecutive $Q_v$s is approximately equal to the length of a cell's side. Thus, the number of transmissions required to route the query among the $Q_v$s equals

$$T_{gr} = (x - 1)\sqrt{n_c} = (x - 1)\sqrt{\frac{N}{x}}.$$

Thus, the total energy overhead to process the Count query equals

$$T = xT_c + T_{gr} = x(n_b + 1) + N\left(1 - e^{-m/x}\right) + (x - 1)\sqrt{\frac{N}{x}}. \qquad (2)$$

This equation captures the impact on the query processing overhead of both the number of sources in the sensor field and the number of cells, which depends on the given query parameter.

Fig. 5. Count query processing overhead for various numbers of sources and cells in the sensor field. (a) Overhead (transmissions) for various numbers of sources and grid cells. (b) Optimal number of cells (shown by boxes) and corresponding overhead (shown by dots) versus the number of sources.
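Equation (2) is simple enough to evaluate numerically. The following Python is a minimal sketch, not taken from the original report; the names `count_overhead` and `optimal_cells` are hypothetical, and the `exact` flag (an addition for illustration) switches between $(1 - 1/x)^m$ and its exponential approximation.

```python
import math


def count_overhead(N, n_b, m, x, exact=False):
    """Total Count-query overhead T from Equation (2), in transmissions.

    N: number of nodes, n_b: average neighborhood size,
    m: number of sources with magnitude X or more, x: number of grid cells.
    """
    if exact:
        p_e = 1.0 - (1.0 - 1.0 / x) ** m   # probability a cell holds a source
    else:
        p_e = 1.0 - math.exp(-m / x)       # approximation used in Equation (2)
    probing = x * (n_b + 1)                # x * C_p
    spray = N * p_e                        # x * P_e * C_s, with C_s = N / x
    routing = (x - 1) * math.sqrt(N / x)   # T_gr between virtual queriers
    return probing + spray + routing


def optimal_cells(N, n_b, m, x_max=100):
    """Number of cells in 1..x_max that minimizes the overhead (brute force)."""
    return min(range(1, x_max + 1), key=lambda x: count_overhead(N, n_b, m, x))


if __name__ == "__main__":
    # Hypothetical sweep with N = 1000 nodes and n_b = 6 neighbors.
    for sources in (1, 5, 10, 20):
        best = optimal_cells(1000, 6, sources)
        print(sources, best, round(count_overhead(1000, 6, sources, best), 1))
```

The brute-force search is adequate here because the number of cells is a small integer; no closed-form optimum is assumed.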
Consider a sensor field of $N = 1000$ sensor nodes where the average neighborhood size is $n_b = 6$. For this sensor field, Figure 5(a) shows the query-processing overhead as the number of cells varies from 1 to 100 and the number of sources varies from 1 to 20. Figure 5(a) shows that, for a fixed number of sources, the query-processing overhead initially decreases as the number of cells increases (i.e., when querying for smaller values), since the query spray overhead is higher in larger cells. Further, as the number of cells in the sensor field increases, the information-probing overhead increases, but at a slower rate. The minimum overhead and the corresponding optimal number of cells are shown in Figure 5(b) for different numbers of sources in the sensor field.

C. Max Query
In the absence of events in a sensor field, the Max query algorithm performs multiple scoped floods, as discussed in Section V-B. Taking $M_x$ as the maximum sensing limit and using Equation (1) to find the size of a grid cell, assume that there are $x$ cells within the sensor field. The overhead of the Max query-processing algorithm then equals

$$T_{noevent} = N + (x-1)\sqrt{\frac{N}{x}}.$$

Here, the first term is the combined flooding overhead and the second term is the query routing overhead between $Q_v$s. This overhead is larger than that of the usual flooding-based approach. The overhead of the algorithm increases further if an information gradient is found during initialization and the current maximum, $M$, subsequently increases at every step of query processing, since this causes probing overhead in smaller cells in addition to the query dissemination overhead. However, due to the horizontal scans of the algorithm, this scenario is very unlikely to occur.

D. Combined Query
Consider a Combined query that has $n$ sub-queries about $n$ different types of events $E_1, E_2, \ldots, E_n$ having magnitudes $X_1, X_2, \ldots, X_n$ respectively. For simplicity of analysis, assume that the cell areas corresponding to $X_1, X_2, \ldots, X_n$ are the same and that there are $x$ cells within the sensor field.
According to Equation (1), if $\alpha_i \neq \alpha_j$ for $i \neq j$, then $X_i \neq X_j$ for events $E_i$ and $E_j$. Let $m_1, m_2, \ldots, m_n$ be the numbers of events of type $E_1, E_2, \ldots, E_n$ having the required magnitudes. Assuming the events are independent and uniformly distributed in the sensor field, the probability of finding all events in a cell equals

$$p = p_1 p_2 \cdots p_n \approx \prod_{i=1}^{n} \left(1 - e^{-m_i/x}\right).$$

Here, $p_i = 1 - \left(1 - \frac{1}{x}\right)^{m_i} \approx 1 - e^{-m_i/x}$ is the probability that event $E_i$ is available in a cell. Now, $p$ changes (i.e., increases) after each event is found through probing and spray in cells. Thus, the average overhead of information probing can be expressed as

$$T_{pavg} \leq (n_b + 1)\frac{1}{p} \approx \frac{n_b + 1}{\prod_{i=1}^{n} \left(1 - e^{-m_i/x}\right)}.$$

Here, $(n_b + 1)$ is the overhead of each probe and $\frac{1}{p}$ is the expected number of cells that must be probed. In the worst case, all cells must be probed, so the worst-case overhead of information probing equals

$$T_{pw} = (n_b + 1)x.$$

For query spray (i.e., dissemination), the actual overhead depends on the location of the events in the sensor field, i.e., on the cells in which they fall. Suppose query spray is used for all $\frac{1}{p}$ cells. Then, using scoped flooding for query spray, the average-case overhead of query spray can be expressed as

$$T_{savg} \leq \frac{1}{p}\cdot\frac{N}{x},$$

since the sensor field has $N$ nodes and $x$ cells. Therefore, the total average-case overhead of the PBS architecture to process a combined query can be expressed as

$$T_{avg} \leq \frac{n_b + 1}{\prod_{i=1}^{n} \left(1 - e^{-m_i/x}\right)} + \frac{1}{p}\cdot\frac{N}{x} + (x-1)\sqrt{\frac{N}{x}}.$$

Here, the third term is the overhead of geographic routing between $Q_v$s, as in Section VI-B.

In the worst-case scenario, the $n$ events are located in different cells. Thus, using scoped flooding for query spray, the worst-case overhead of query spray equals

$$T_{sw} = n\frac{N}{x}.$$

Therefore, in the worst case, the PBS architecture is more energy efficient than the flooding-based approach for combined query processing if

$$(n_b + 1)x + n\frac{N}{x} + (x-1)\sqrt{\frac{N}{x}} \leq N.$$

Here, we assume that the overhead of the flooding-based approach is $N$.

VII.
SIMULATIONS AND PERFORMANCE
We evaluate the performance of the PBS architecture for the proposed query processing algorithms through extensive simulations and consider the following performance metrics:
1) Overhead, in terms of energy dissipation, is the average number of transmissions required to process a query.
2) Success ratio is the ratio of the value obtained by the query algorithm to the actual value. This metric is used for Count and Max queries.
3) Absolute success probability is the fraction of all queries for which the obtained value equals the actual value.

A. Simulation Model
In our simulations, we use a 100 ft × 100 ft uniform random grid with $10^4$ sensor nodes placed at a distance of 1 ft from each other. Except for the border nodes, each node is able to communicate with eight neighbors. For the simulations of Count and Max queries, we use an empirical data set (Section IV) to emulate event source(s), where the exponent of the diffusion function, $\alpha$, equals 2.0. For the combined query, we simulate five different types of events with $\alpha$ equal to 2.0, 1.9, 1.8, 1.7 and 1.6. For the distortion of information diffusion, we use the Degree of Irregularity (DOI) and a Weibull distribution with shape parameter 1.13 and scale parameter 0.28, similar to [15].

Both actual event(s) and small noisy events are uniformly distributed in the sensor field. Here, small events are unable to solve queries. We also consider lossy wireless links, and ARQ is used only for information probing. For query spray, both scoped flooding and information gradient-based routing are used. Information gradient-based routing as specified in [8] uses a probabilistic diffusion function with exponent $\beta$ for probabilistic forwarding, i.e., $p_j = f(j) = \frac{1}{j^{\beta}}$, where $j$ is the hop count in the gradient region.

The performance of query processing using PBS depends on information probing and query spray. For query spray, scoped flooding is more robust but also causes more energy overhead than information gradient-based routing.
Thus, the robustness and energy efficiency achieved using scoped flooding mainly reflect the effectiveness of probing. In addition to the following results, a more detailed analysis and further results can be found in [20].

B. Count Query
In our simulations, the success ratio of the Count query is over 99% using scoped flooding, as shown in Fig. 6(a).

[Fig. 6. Count query using scoped flooding (SF) and information gradient-based routing (IR) with $\beta = 0.7$. Here, DOI = 0.05 and Pr(link loss) = 0.1. (a) Success ratio (average success ratio versus query value, for 1, 2, 3, 4 and 8 events). (b) Absolute success probability (versus query value, for 1, 2, 3, 4 and 8 events).]

[Fig. 7. (Normalized) overhead of the Count query using scoped flooding (SF) and information gradient-based routing (IR) with $\beta = 0.7$. Here, DOI = 0.05 and Pr(link loss) = 0.1. Axes: average transmission overhead per node versus query value, for 0, 1, 2, 3, 4 and 8 events.]

For small query values, probing may occasionally fail due to noise (i.e., higher DOI). The probing quality can be improved further by collecting information from neighbors more than one hop away from the virtual querier. Using information gradient-based routing, the success ratio drops for large query values and in the presence of more events.
The cell area is large for large query values, so gradient-based routing may be unable to find all nodes that satisfy the query in the presence of noise and lossy wireless links. However, using a smaller value of $\beta$, as shown in [8], the success ratio can be improved further. Similarly, Fig. 6(b) shows that the absolute success probability is high when scoped flooding is used for spray.

When no events are present, the normalized overhead of PBS is only 20%, as shown in Fig. 7. However, as the number of events increases, the overhead increases, because more nodes can satisfy the query and more transmissions are required to find them. Scoped flooding, besides flooding the whole bounded region, cannot stop query forwarding when the probing result is a false positive, which causes additional overhead.

C. Max Query
Fig. 8(a) shows that the success ratio of the Max query is over 99%, i.e., the obtained maximum is very close to the actual maximum in our simulations, even in the presence of lossy wireless links and distortion.

[Fig. 8. Max query using scoped flooding (SF) and information gradient-based routing (IR) with $\beta = 0.7$. Here, Pr(link loss) = 0.1. (a) Success ratio (average success ratio versus number of events, for DOI = 0.00 to 0.05). (b) Overhead per node (average transmission overhead per node versus number of events in the sensor field, for DOI = 0.00 to 0.05).]

We notice that the overhead of query processing decreases as the number of events increases, as shown in Fig. 8(b). This is because fewer scoped floods are required to obtain the initial Max value.
Also, at the early stages of query processing, the Max value becomes high, so probing helps to avoid query spray and further reduces the overhead.

D. Combined Query
We consider three sets of combined queries, where the cell areas corresponding to the sub-queries are (1) equal (i.e., 1 : 1 : 1 : ...), (2) linearly increasing (i.e., 1 : 2 : 3 : ...) and (3) exponentially increasing (i.e., 1 : 2 : 4 : ...). The success probabilities in all cases are over 99%, as shown in Fig. 9(a), even in the presence of diffusion distortion and lossy wireless links. Also, the query-processing overhead using gradient-based routing is below 50%, as shown in Fig. 9(b). However, when the areas corresponding to the sub-queries increase exponentially, the overhead of using scoped flooding for query spray (i.e., dissemination) increases sharply due to the large cell area.

[Fig. 9. Combined query using scoped flooding (SF) and information gradient-based routing (IR) with $\beta = 0.7$. Here, DOI = 0.05 and Pr(link loss) = 0.1. (a) Absolute success probability (versus number of sub-queries per query). (b) Overhead per node (average transmission overhead per node versus number of sub-queries per query). Curves: 1 or 2 events per type, with equal (Eq.), linearly increasing (L.inc.) and exponentially increasing (Exp.inc.) cell areas.]

VIII. CONCLUSION
In this paper, we have presented a novel architecture, Probe-before-Spray (PBS), for information gradient-based active query processing.
PBS reduces search overhead by exploiting geographical information and the diffusion spread to form resizable virtual cells within a query-specified region. Based on PBS, we develop query-processing algorithms for aggregate queries and combined queries. We analyze the performance of PBS using simple analytical models. Also, through simulations, we found that the Count, Max and Combined query algorithms based on PBS reduce search overhead by over 40%, 30% and 50% respectively compared with the usual flooding-based approach, while attaining accuracy over 99%.

In addition, the proposed architecture can be easily augmented with both Directed Diffusion [2] and TinyDB [9] or Cougar [12] to reduce flooding overhead for energy-efficient in-network query processing. Further, by treating each virtual querier as a cluster head, PBS can also be used for hierarchical sensor networks.

REFERENCES
[1] A. Woo, S. Madden and R. Govindan, “Networking Support for Query Processing in Sensor Networks”, Communications of the ACM, Vol. 47, No. 6, June 2004.
[2] C. Intanagonwiwat, R. Govindan and D. Estrin, “Directed Diffusion: A Scalable and Robust Communication Paradigm for Sensor Networks”, MobiCom 2000.
[3] N. Sadagopan, B. Krishnamachari and A. Helmy, “Active Query Forwarding in Sensor Networks (ACQUIRE)”, Journal of Ad Hoc Networks, Vol. 3, Issue 1, pp. 91-113, January 2005.
[4] M. Chu, H. Haussecker and F. Zhao, “Scalable Information-Driven Sensor Querying and Routing for ad hoc Heterogeneous Sensor Networks”, Int’l J. High Performance Computing Applications, 16(3):90-110, Fall 2002.
[5] J. Liu, F. Zhao and D. Petrovic, “Information-Directed Routing in Ad Hoc Sensor Networks”, WSNA 2003.
[6] F. Zhao, J. Liu, L. Guibas and J. Reich, “Collaborative Signal and Information Processing: An Information Directed Approach”, Proceedings of the IEEE, 91(8), 2003.
[7] Q. Fang, F. Zhao and L.
Guibas, “Lightweight Sensing and Communication Protocols for Target Enumeration and Aggregation”, MobiHoc 2003.
[8] J. Faruque and A. Helmy, “RUGGED: RoUting on finGerprint Gradients in sEnsor Networks”, IEEE ICPS, 2004.
[9] S. Madden, M. Franklin, J. Hellerstein and W. Hong, “TAG: a Tiny AGgregation Service for Ad-Hoc Sensor Networks”, OSDI, December 2002.
[10] S. Madden, M. Franklin, J. Hellerstein and W. Hong, “The Design of an Acquisitional Query Processor for Sensor Networks”, SIGMOD 2003.
[11] A. Deshpande et al., “Model-Driven Data Acquisition in Sensor Networks”, VLDB 2004.
[12] Y. Yao and J. Gehrke, “Query Processing for Sensor Networks”, CIDR, January 2003.
[13] H. Luo, F. Ye, J. Cheng, S. Lu and L. Zhang, “TTDD: Two-Tier Data Dissemination in Large-Scale Wireless Sensor Networks”, Wireless Networks 11, pp. 161-175, 2005.
[14] B. Karp and H. T. Kung, “GPSR: Greedy Perimeter Stateless Routing for Wireless Networks”, MobiCom 2000.
[15] G. Zhou et al., “Impact of Radio Irregularity on Wireless Sensor Networks”, MobiSys 2004.
[16] D. R. Askeland, The Science and Engineering of Materials, PWS Publishing Co., 1994.
[17] J. F. Shackelford, Intro to Materials Science For Engineers, 5th Ed., Prentice Hall, 2000.
[18] L. Reznik et al., “Embedding Intelligent Sensor Signal Change Detection into Sensor Network Protocols”, IEEE SECON 2005.
[19] L. Gu et al., “Lightweight Detection and Classification for Wireless Sensor Networks in Realistic Environments”, ACM SenSys 2005.
[20] J. Faruque and A. Helmy, “PBS: A Virtual Grid Architecture for Information Gradient-based Active Querying in Sensor Networks”, http://nile.usc.edu/pbs.
Asset Metadata
Creator
Faruque, Jabed (author), Helmy, Ahmed (author)
Core Title
USC Computer Science Technical Reports, no. 887 (2007)
Alternative Title
PBS: A virtual grid architecture for information gradient-based active querying in sensor networks (title)
Publisher
Department of Computer Science,USC Viterbi School of Engineering, University of Southern California, 3650 McClintock Avenue, Los Angeles, California, 90089, USA
(publisher)
Tag
OAI-PMH Harvest
Format
11 pages
(extent),
technical reports
(aat)
Language
English
Unique identifier
UC16269757
Identifier
07-887 PBS A Virtual Grid Architecture for Information Gradient-based Active Querying in Sensor Networks (filename)
Legacy Identifier
usc-cstr-07-887
Rights
Department of Computer Science (University of Southern California) and the author(s).
Internet Media Type
application/pdf
Copyright
In copyright - Non-commercial use permitted (https://rightsstatements.org/vocab/InC-NC/1.0/)
Source
20180426-rozan-cstechreports-shoaf
(batch),
Computer Science Technical Report Archive
(collection),
University of Southern California. Department of Computer Science. Technical Reports
(series)
Access Conditions
The author(s) retain rights to their work according to U.S. copyright law. Electronic access is being provided by the USC Libraries, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright.
Repository Name
USC Viterbi School of Engineering Department of Computer Science
Repository Location
Department of Computer Science. USC Viterbi School of Engineering. Los Angeles, CA, 90089
Repository Email
csdept@usc.edu
Inherited Values
Title
Computer Science Technical Report Archive
Description
Archive of computer science technical reports published by the USC Department of Computer Science from 1991 - 2017.
Coverage Temporal
1991/2017