USC Computer Science Technical Reports, no. 899 (2008)
Framework for Snapshot Location-based Query Processing on Moving Objects in Road Networks ∗

Haojun Wang
Department of Computer Science
University of Southern California
Los Angeles, CA, 90089
haojunwa@usc.edu

Roger Zimmermann
Computer Science Department
National University of Singapore
Singapore 117590
rogerz@comp.nus.edu.sg

ABSTRACT
Location-based services are increasingly popular, and supporting their queries efficiently is a key challenge. Here, we present a novel design to process large numbers of location-based snapshot queries on MOVing objects in road Networks (MOVNet, for short). MOVNet's dual-index design utilizes an on-disk R-tree to store the network connectivities and an in-memory grid structure to maintain moving object position updates. A method to speedily compute the overlapping grid cells in the network relates these two indices, and given an arbitrary edge in the space we analyze the minimum and maximum number of grid cells that are possibly affected. Based on the above features we propose algorithms to support mobile network distance range and k nearest neighbor queries. We demonstrate via theoretical analysis and experimental results that MOVNet yields excellent performance with various networks while scaling to a very large number of moving objects.

1. INTRODUCTION
With the widespread use of GPS devices, more and more people are enjoying location-based services. Various applications, such as road-side assistance, highway patrol, and location-aware games, are popular in many urban areas. This has intensified research interest in overcoming the inherent challenges of designing scalable and efficient infrastructures that support very large numbers of users concurrently. The mobility made possible by the usage of car-based or handheld GPS devices in metro cities results in two fundamental system requirements: distance computations within a (road) network and processing of moving Points of Interest (POIs).
An increasing number of applications require query processing of moving POIs based on an underlying network. For example, when a pedestrian calls for emergency assistance, the call-center may want to locate all police cars within a five-mile distance and dispatch them to the call-originating location. Note that the mentioned examples require snapshot queries, rather than continuous monitoring (which is another class of applications).

∗ This research has been funded in part by NSF grants EEC-9529152 (IMSC ERC), CMS-0219463 (ITR), IIS-0534761, NUS AcRF grant WBS R-252-050-280-101/133 and equipment gifts from the Intel Corporation, Hewlett-Packard, Sun Microsystems and Raptor Networks Technology.

Figure 1: The index structures and query processing modules of MOVNet. (Components shown: grid index, R-tree, computing affected cells, max bound of affected cells, progressive probe, MNDR, MKNN, range queries, kNN filtering, query results.)

Spatial data processing is a very active research field. Some of the early work introduced spatial processing of stationary objects based on Euclidean distance metrics. More recent work incorporates POI mobility or network-distance processing, but often not both. Two of the main challenges when supporting POI mobility on an underlying road network are to (a) efficiently manage object location updates and (b) provide fast network-distance computations. To address these issues we have designed a novel system to process location-based queries on MOVing objects in road Networks (MOVNet).
The goal is to efficiently execute snapshot range and k nearest neighbor queries over moving POIs within a stationary road network. Although MOVNet is not aimed at continuous query processing in its current form, we believe that a large number of location-based services only require snapshot query processing capabilities. For instance, when a user calls a service center to find a nearby taxi, the query is instantaneous and as soon as a taxi is dispatched to pick up the customer, the transaction is complete (i.e., the ordering phase). Figure 1 illustrates MOVNet's system infrastructure and components. To handle large networks, MOVNet utilizes an on-disk R*-tree [1] structure to store the necessary connectivity information. Efficient processing of moving object position updates is achieved with an in-memory grid index. An appealing feature of MOVNet is the bi-directional mapping between the two structures that enables the retrieval of a minimal set of data for query processing. Based on the concept of affected cells, which form the set of grid cells overlapping with a given edge, we present algorithms to execute range as well as kNN queries. Analytical bounds on the minimum and maximum number of affected cells with an arbitrary network edge enable the pruning of the search space during mobile range query processing. In the mobile kNN query algorithm, we utilize the concept of a progressive probe into the grid index to estimate the subspace containing the result set. The performance of our design has been verified rigorously through theoretical analysis and simulations. Our comparison with two state-of-the-art baseline algorithms demonstrates the superior performance of MOVNet.

The remainder of this paper is organized as follows. Section 2 describes the related work. Section 3 discusses our assumptions and the dual-index design. In Section 4 we propose our mobile network distance range query and k nearest neighbor query algorithms. We present the theoretical analysis of our design in Section 5.
We rigorously verify the performance of MOVNet and demonstrate that the results match our analysis in Section 6. Finally we conclude with Section 7.

2. RELATED WORK
Processing spatial queries in networks has been studied intensely. Papadias et al. [15] first presented a framework that integrates network and Euclidean information when processing network-based queries. The VN³ method [11] improves upon this idea with a Voronoi-based approach to pre-compute the distances within and across subspaces. The goal was to avoid on-line distance computations in processing k Nearest Neighbor (kNN) queries. Huang et al. [8] addressed the same problem by proposing the islands approach. It estimates the overhead of pre-computation and the trade-off between query and update performance for kNN queries. To cope with Continuous kNN (C-kNN) queries on stationary POIs in a network, Kolahdouzan et al. [10] proposed the Intersection Examination and Upper Bound Algorithm (IE/UBA) to compute the kNN objects of all nodes on a path and the split points between adjacent nodes whose nearest neighbors are different. Recently, Cho et al. [4] solved the same problem by introducing UNICONS, which incorporates pre-computed kNN lists into Dijkstra's algorithm such that it outperforms IE/UBA in dense networks.

The above group of algorithms makes the assumption that the POIs are static. When POIs are dynamic, the key challenge lies in the large number of location updates that must be managed with an appropriate indexing structure. Movement predictions (i.e., the trajectory of moving objects) have been used with R-tree-based structures (e.g., the TPR*-Tree [16]). However, these tree-based indices suffer from excessive node reconstruction costs when performing location updates. Therefore, grid-based structures have raised considerable interest due to their simplicity and efficiency in indexing moving objects. Much of the recent work leverages either an in-memory grid index [5, 13, 19] or an on-disk grid index [7, 18].
Following this trend, our design of MOVNet utilizes an in-memory grid index to manage the location updates of moving POIs.

A number of grid-index based methods have been proposed to process location-based services on moving POIs with Euclidean distances. For instance, Chon et al. [5] first presented an algorithm based on the trajectory of moving POIs overlapping with grid cells to solve snapshot range and kNN queries. SINA [12] and SEA-CNN [18] were introduced as centralized solutions with the idea of shared execution to process continuous range and kNN queries on moving POIs. Yu et al. [19] proposed an algorithm (referred to as YPK-CNN) for monitoring C-kNN queries on moving objects by defining a search region based on the maximum distance between the query point and the current locations of previous kNNs. As an enhancement, Mouratidis et al. [13] presented a solution (CPM) that defines a conceptual partitioning of the space by organizing grid cells into rectangles. Location updates are handled only when objects fall within the vicinity of queries, hence improving system throughput. However, the above techniques are limited to Euclidean distance computations.

For environments where POIs are dynamic and distances are based on network paths, only a few techniques exist. Jensen et al. [9] described an abstract distributed infrastructure for handling location updates of moving POIs in a network in conjunction with a kNN query algorithm. As a centralized alternative, S-GRID [7] was introduced as a means to process kNN queries. A pre-computed structure is maintained with regard to the spatial network data so as to improve the efficiency of query processing. Recently, Mouratidis et al. [14] addressed the issue of processing C-kNN queries in road networks by proposing two algorithms (namely, IMA/GMA) that handle arbitrary object and query movement patterns in road networks. This work utilizes an in-memory data structure to store the network connectivity, therefore it is unsuitable for large-sized networks (e.g., metro cities).
In contrast, MOVNet uses an on-disk R-tree structure that has a proven performance record for large-sized 2D data usage.

3. SYSTEM DESIGN
In this section, we describe our data modeling of the road network, the data structures of indices, and a cell overlapping algorithm that relates the R-tree and the grid index in MOVNet.

3.1 Network Modeling and Assumptions
We define a road network (or network for short) as a directional weighted graph G consisting of a set of edges (i.e., road segments) E, and a set of vertices (i.e., intersections, dead ends) V, where E ⊆ V × V. For any network G(E, V), each edge e is represented as e(v1, v2), i.e., it is connected to two vertices v1, v2, where v1 and v2 are the starting and ending vertex, respectively. Let v1 ≠ v2. Each edge e is associated with a length, given by a function length(e): E → R+, where R+ is the set of positive real numbers.

Figure 2: An example of a road network (a) and its corresponding, linearized modeling graph (b).

The road network is transformed into a modeling graph during query processing. Specifically, graph vertices represent the following three cases: (i) the intersections of the network, (ii) the dead end of a road segment, and (iii) the points where the curvature of a road segment exceeds a certain threshold so that the road segment is split into two pieces to preserve the curvature property. Although polylines can also be used to represent the edges, we use a set of line segments to represent an edge due to the nature of our data set. As a result, the modeling graph is a piecewise approximation of the network. For example, Figure 2(a) shows a small road network, and Figure 2(b) illustrates the corresponding modeling graph.

There are different objects (e.g., cars, taxis, and pedestrians) moving along the road segments in a network. These objects are known as the set of moving objects M. A moving object m ∈ M is a POI located in the network. The location of m at time t is defined as loc_t(m) = (x_m, y_m), where x_m and y_m are the x and y coordinates of m at time t, respectively.
A query point q ∈ M is a moving object issuing a location-based spatial query at different times. Currently our design is focusing on snapshot range queries (e.g., "find all taxis within a three mile range from my current location"). Note that these queries are processed with network distances. For simplicity we use the term distance to refer to the network distance in the following sections.

MOVNet assumes that periodic sampling of the moving object positions conveys their locations as a function of time. This method is commonly used (see [18]) as it provides a good approximation of the moving object positions. Our primary goal is to reduce the evaluation cost during query processing. A spatial query submitted by a user at time t1 is computed based on loc_t0(M). The system has the latest snapshot of moving objects at t0, where t0 ≤ t1, t1 − t0 < Δt, and Δt is a fixed time interval; the result is valid until t0 + Δt. We define the distance function of two moving objects m1 and m2 at time t as dist_t(m1, m2): loc_t(m1) × loc_t(m2) → R+. dist_t(m1, m2) denotes the length of the shortest path from m1 to m2 in the metric of the network distance at time t. For notational simplicity, we denote dist(m1, m2) as the distance function of m1 and m2 at the current time. Similarly, the distance function of an edge e(v1, v2) and a moving object m at time t is defined as dist_t(e, m): loc(v1) × loc_t(m) → R+. dist_t(e, m) denotes the length of the shortest path from v1 to m in the metric of the network distance at time t.

The distance between two moving objects depends on the length of edges and the connectivity of vertices as well as the current locations of the objects. We elaborate on our dual-index structure designed to facilitate distance computations in the following section.

3.2 Dual-Index Structure Design
To record the connectivity and coordinates of vertices in stationary networks, MOVNet utilizes an on-disk R*-tree, a data structure which has been intensively studied for handling very large 2-D spatial data.
Once the edges are retrieved from disk, a corresponding modeling graph is constructed in memory using the following structure. We use a vertex array to store the coordinates of vertices in the graph. For each vertex, the array maintains a list recording its outgoing edges. To quickly locate a vertex in the array, MOVNet manages a hash table to map the coordinate of a vertex into its index in the vertex array.

A memory-based grid index is used to manage the locations of moving objects [19]. Without loss of generality, we assume that the service space is a square. We can partition the space into a regular grid of cells with a size of l × l. We use c(column, row) to denote a specific cell in the grid index (assuming the cells are ordered from the lower left corner of the space). At time t, a moving object m has loc_t(m) = (x_m, y_m), therefore it overlaps with cell c(⌊x_m / l⌋, ⌊y_m / l⌋). Each cell maintains an object list containing the identifiers of enclosed objects. The objects' coordinates are stored in an object array, and the object identifier is the index into this array. Figure 3 shows a part of the network of Figure 2(b) that is managed by a grid index of 8 × 8 cells. An example object on e(v2, v4) is enclosed by c(5, 5). Accordingly, the object list of c(5, 5) records the object identifier and hence we can retrieve the coordinate of the object from the object array.

Given a set of grid cells, retrieving the underlying network can be transformed into range queries on the R-tree. It is highly desirable to have an algorithm so that for an arbitrary edge, we are able to find the set of overlapping cells very quickly.
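The grid index described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `GridIndex` and its method names are hypothetical and do not come from MOVNet, and the sketch assumes non-negative coordinates with cells ordered from the lower left corner.

```python
import math
from collections import defaultdict


class GridIndex:
    """Hypothetical in-memory grid index: per-cell object lists plus an object array."""

    def __init__(self, cell_size):
        self.cell_size = cell_size        # side length l of a grid cell
        self.object_array = []            # object identifier -> (x, y)
        self.cells = defaultdict(set)     # (column, row) -> identifiers of enclosed objects

    def cell_of(self, x, y):
        """Return the cell c(floor(x / l), floor(y / l)) that encloses point (x, y)."""
        return (math.floor(x / self.cell_size), math.floor(y / self.cell_size))

    def insert(self, x, y):
        """Register a new object; its identifier is its index in the object array."""
        object_id = len(self.object_array)
        self.object_array.append((x, y))
        self.cells[self.cell_of(x, y)].add(object_id)
        return object_id

    def update(self, object_id, x, y):
        """Apply a location update, moving the identifier between cells if needed."""
        old_cell = self.cell_of(*self.object_array[object_id])
        new_cell = self.cell_of(x, y)
        if old_cell != new_cell:
            self.cells[old_cell].discard(object_id)
            self.cells[new_cell].add(object_id)
        self.object_array[object_id] = (x, y)

    def count(self, cell):
        """Number of objects enclosed by a cell (only the list size is read)."""
        return len(self.cells.get(cell, ()))

    def objects_in(self, cell):
        """Identifiers and coordinates of the objects enclosed by a cell."""
        return [(oid, self.object_array[oid]) for oid in sorted(self.cells.get(cell, ()))]


grid = GridIndex(cell_size=1.0)
taxi = grid.insert(5.4, 5.7)      # falls into c(5, 5), like the example object in Figure 3
grid.update(taxi, 6.2, 5.1)       # a later position update moves it to c(6, 5)
```

A location update touches at most two object lists, which is the property that makes a grid attractive for frequent position updates.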
Although computing the overlapping cells is similar to line rasterization (e.g., Bresenham's Algorithm [2]), it is worth pointing out that these existing algorithms only obtain an approximation of the overlapping cells (or pixels, in that case). In contrast, our goal is to compute the complete set of overlapping cells. Therefore, we devise an incremental algorithm.

Figure 3: An example network indexed by the grid index and its data storage.

First, let us assume that the service space is managed by a grid-based index. We define the set of cells {c1, c2, ..., cn}, which are consecutively overlapped from v1 to v2 by an edge e(v1, v2), as the set of affected cells of e. For instance, in Figure 3, the affected cells of e(v1, v2) are {c(1, 6), c(2, 6), c(3, 6), c(4, 6)}.

Given an edge e(v1, v2), the coordinates of vertices v1 and v2 are (x_v1, y_v1) and (x_v2, y_v2), respectively. The set of affected cells of e can be computed with Algorithm 1.

Algorithm 1 Compute-affected-cells (e, l)
 1: /* e is the edge */
 2: /* l is the side length of a cell */
 3: m = (y_v2 − y_v1) / (x_v2 − x_v1), b = y_v1 − m · x_v1
 4: startX = ⌊x_v1 / l⌋, startY = ⌊y_v1 / l⌋
 5: endX = ⌊x_v2 / l⌋, endY = ⌊y_v2 / l⌋
 6: cellList = ∅
 7: while startX ≠ endX do
 8:     if endX > startX then
 9:         nextX = startX + 1
10:     else
11:         nextX = startX − 1
12:     end if
13:     nextY = ⌊(m × nextX × l + b) / l⌋
14:     for i = startY to nextY do
15:         cellList = cellList ∪ c(startX, i)
16:     end for
17:     startX = nextX, startY = nextY
18: end while
19: for i = startY to endY do
20:     cellList = cellList ∪ c(endX, i)
21: end for
22: return cellList

We use straight line segments to represent edges in the network. Therefore, any edge e(v1, v2) can be described by a first degree polynomial function in the form of y = m · x + b with x ∈ [x_v1, x_v2]. Algorithm 1 first captures the gradient m and the y-intercept b of an edge (Line 3).
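For experimentation, the set of affected cells can also be computed with the short Python sketch below. It is not a transcription of Algorithm 1: instead of stepping along the x-axis with the slope-intercept form, it collects every grid-line crossing along the segment and reads off one cell per interval, which also covers vertical edges, where the gradient m is undefined. The function name `affected_cells` and the coordinates in the usage line are assumptions made for illustration.

```python
import math


def affected_cells(x1, y1, x2, y2, cell_size):
    """Return the cells crossed by the segment (x1, y1) -> (x2, y2), ordered from v1 to v2."""
    dx, dy = x2 - x1, y2 - y1
    # Segment parameters t at which a vertical or horizontal grid line is crossed.
    ts = {0.0, 1.0}
    for start, delta in ((x1, dx), (y1, dy)):
        if delta == 0:
            continue                      # no crossings along this axis
        lo, hi = sorted((start, start + delta))
        k = math.floor(lo / cell_size) + 1
        while k * cell_size < hi:
            ts.add((k * cell_size - start) / delta)
            k += 1
    ordered = sorted(ts)
    cells = []
    # Between two consecutive crossings the segment stays inside one cell,
    # so the midpoint of each interval identifies that cell.
    for t0, t1 in zip(ordered, ordered[1:]):
        tm = (t0 + t1) / 2.0
        cell = (math.floor((x1 + tm * dx) / cell_size),
                math.floor((y1 + tm * dy) / cell_size))
        if not cells or cells[-1] != cell:
            cells.append(cell)
    return cells


# A horizontal edge with hypothetical endpoints inside c(1, 6) and c(4, 6).
print(affected_cells(1.5, 6.5, 4.5, 6.5, 1.0))
```

Cells that the segment touches only at a single corner point are not guaranteed to be reported by this variant; whether such cells should count as affected is a design decision that the sketch leaves open.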
After computing m and b, Algorithm 1 computes the cells overlapping with the starting and ending vertex of the edge, respectively (Lines 4 - 5). The algorithm follows a step-forward approach where in each step, it moves one cell on the x-axis from the cell overlapping with the starting vertex and calculates the affected cells along the y-axis (Lines 7 - 18). Finally, it terminates once it reaches the cell overlapping with the ending vertex (Lines 19 - 21).

The complexity of Algorithm 1 is linear in the length of the edge. Our experimental results show that the CPU time used for computing overlapping cells consumes less than 5% of the query processing time with various settings. This indicates that our method is well suited for online computing. More importantly, by introducing Algorithm 1, MOVNet creates a means to bi-directionally map underlying networks and moving object positions. We present our query design in the following sections showing the flexibility and scalability of this dual-index approach.

4. QUERY DESIGN
In this section, we first describe our design of a mobile range query algorithm. Next, we present the minimum and maximum bounds on the number of grid cells that can overlap with an arbitrary edge. Then, the maximum bound is used to prune the search space during query processing. Finally, we propose a mobile kNN query algorithm by introducing the concept of a progressive probe and leveraging our range query algorithm.

4.1 Range Query Algorithm
Given a query point q, a value d, a network G and a set of moving objects M, a location-based network distance range query retrieves all POIs of M that are within the distance d from q at time t. By using the definitions of Section 3, the query can be represented as rangeQuery_t(q, d): loc_t(q) × loc_t(M) → {m_i, i = 1, ..., n}, ∀ m_i, dist_t(q, m_i) ≤ d.

We propose a Mobile Network Distance Range query algorithm (MNDR) to facilitate the query processing.
First, we know from the Euclidean distance restriction property [15] that the distance dist(q, m) for an object m in a network is always larger than or equal to the Euclidean distance between q and m. We observe that only the network data in MOVNet is stored on disk. Therefore, we first perform a Euclidean range query with q as the center and d as the radius to retrieve the network from the R-tree and to create the corresponding modeling graph. After that, we are able to perform the later steps efficiently in memory. Second, the starting vertex of an edge e(v1, v2) has the property that if dist(q, v1) > d, the affected cells of the edge are not required to be examined during this first pass because any moving object on e has a distance greater than d from q. Hence for each vertex in the modeling graph, MNDR leverages Dijkstra's algorithm [6] to compute the distance from q. In addition, our algorithm avoids unnecessary processing on any edge with a distance from the query point greater than d. Finally, for each edge whose starting vertex has a distance ≤ d, MNDR generates the list of affected cells by using Algorithm 1 and retrieves the corresponding moving objects from the grid index.

Algorithm 2 details MNDR. To illustrate the algorithm with an example, let us assume that the system is processing a network as shown in Figure 3, where the side length of cells is 1.0 unit. A query object q with dist(q, v2) = 1.0 submits a range query with a range d = 3.5. MOVNet first invokes a Euclidean distance range query with q as the center and d as the radius (Line 5 of Algorithm 2). Consequently, edges overlapping with the shadowed area will be retrieved from the R-tree index and a corresponding modeling graph is built as shown in Figure 4(a) (Line 6). Note that q is inserted as the starting vertex into the modeling graph (Line 8). Next, Dijkstra's algorithm is invoked (Line 9). We add a constraint d in the distance computation so that any edge e(v1, v2) with dist(q, e) > d will not be processed, which avoids excessive computation on edges that are out of range.
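The distance-constrained run of Dijkstra's algorithm can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `bounded_dijkstra`, the adjacency-list layout, and the edge lengths in the example graph are all assumptions. The edge lengths were chosen only so that the resulting distances agree with the values quoted for the running example (d = 3.5, dist(q, v2) = 1.0).

```python
import heapq


def bounded_dijkstra(graph, source, max_dist):
    """Network distances from source to every vertex reachable within max_dist.

    graph maps a vertex to a list of (neighbor, edge_length) pairs (outgoing edges).
    """
    best = {source: 0.0}
    heap = [(0.0, source)]
    settled = {}
    while heap:
        d_u, u = heapq.heappop(heap)
        if u in settled:
            continue                      # stale heap entry
        settled[u] = d_u
        for v, length in graph.get(u, ()):
            d_v = d_u + length
            if d_v > max_dist:
                continue                  # constraint d: out-of-range vertices are never expanded
            if d_v < best.get(v, float("inf")):
                best[v] = d_v
                heapq.heappush(heap, (d_v, v))
    return settled


# Hypothetical modeling graph with q already inserted as the starting vertex.
graph = {
    "q": [("v2", 1.0), ("v4", 2.5)],
    "v2": [("v3", 2.0), ("v1", 2.6)],
    "v3": [("v4", 1.0)],
    "v4": [("v5", 0.9), ("v6", 4.0)],
}
S = bounded_dijkstra(graph, "q", 3.5)     # vertices within range d = 3.5 of q
```

In this example v1 (distance 3.6) and v6 (distance 6.5) exceed the range and therefore never enter S, so their outgoing edges are not examined.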
When Dijkstra's algorithm finishes, the distance of each vertex from q is shown in Figure 4(b). In addition, S = {(v2, 1), (v4, 2.5), (v3, 3), (v5, 3.4)}. Based on this information, MNDR computes cellSet by using our cell overlapping algorithm in Lines 10 - 14, shown as the dark-grey cells in Figure 4(b). After that, the moving objects in cellSet are retrieved from the grid index to constitute the result set.

Algorithm 2 Mobile Network Distance Range Query (q, d)
 1: /* q is the query object */
 2: /* d is the distance */
 3: result = ∅
 4: /* Finding the set of edges E and vertices V overlapped by the circle with center point q and radius d */
 5: (E, V) = Euclidean-range(q, d)
 6: G = Create-modeling-graph(E, V)
 7: e = Object-map-matching(q, E)
 8: q = Add-vertex-into-graph(G, q, e)
 9: S = Compute-distance(G, q, d)
10: for each vertex v in S do
11:     for each edge e outgoing from v do
12:         cellSet = cellSet ∪ cellOverlapping(e, d − dist(q, v))
13:     end for
14: end for
15: result = Retrieve-objects(cellSet, G)
16: for each object m in result do
17:     e(v1, v2) = Object-map-matching(m, E)
18:     dist(q, m) = min(dist(q, v1) + dist(v1, m), dist(q, v2) + dist(v2, m))
19:     if dist(q, m) > d then
20:         result = result − m
21:     end if
22: end for
23: return result

However, several post-processing steps are required to ensure that the distance of each moving object is within range d. First, some cells might overlap with several edges. For instance, c(6, 6) overlaps with e(v2, v3) and e(v3, v4). Hence for each object in the result set, MNDR determines which edge the object is located on (Line 17) through a map matching process. Second, some objects may be reachable via more than one path from the query point. MOVNet will only consider the shortest path and examine the path against the range d (Line 18). For example, moving objects on edge e(v3, v4) have two paths from q (q → v2 → v3, and q → v4). MNDR will compute the distance of each object via each path, and only use the shortest one. Finally, once the distance from q to the object is determined, MNDR confirms that the distance ≤ d.
For instance, for any object m retrieved from c(5, 0), dist(q, m) > 3.5, thus the algorithm removes these objects in Lines 19 - 21.

Figure 4: A Mobile Network Distance Range (MNDR) query example. (a) The modeling graph with q inserted as the starting vertex. (b) The distance of each vertex from q: d[V1] = 3.6, d[V2] = 1, d[V3] = 3, d[V4] = 2.5, d[V5] = 3.4, d[V6] = 6.5.

When we compute cellSet in Algorithm 2 (Lines 10 - 14), some cells can be further pruned before the system retrieves the moving objects from the corresponding grid index. For instance, in the example illustrated above, e(v4, v6) overlaps with six cells. Some of the cells can be pruned because their distances from q are greater than d. This optimization can be achieved by using the geometric properties described in the following section.

4.2 The Minimum and Maximum Number of Cells Overlapping with an Edge

Figure 5: Computing the length of edges with regard to the number of grid cells. (Labels shown: the per-cell extents e_x0, ..., e_x3 and e_y0, ..., e_y4 of an edge along the x and y axes, and the cell side length l.)

We present an important geometric property that relates an arbitrary edge with the grid cells it overlaps. Since the edge is represented as a straight line segment, the relationship between the length of an edge e(v1, v2) and the number of its affected cells can be described as follows.

LEMMA 1. Assume that the service space is managed by a grid-based index with a cell size of l × l. For an edge e(v1, v2) with a set of affected cells {c1, c2, ..., cn}, the maximum length of e is $\sqrt{2} \times l \times n$. The minimum length of e is

$$\mathrm{length}_{\min}(e) = \begin{cases} 0 & 1 \le n \le 2 \\ \sqrt{\left\lfloor \frac{n-3}{2} \right\rfloor^{2} + \left\lfloor \frac{n-2}{2} \right\rfloor^{2}} \cdot l & n \ge 3 \end{cases}$$

PROOF. Without loss of generality let us consider an edge e in the service space that overlaps with grid cells as shown in Figure 5. Assume that the number of affected cells for e is n.
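Therefore, for $0 \le e_{x_i} \le l$, $0 \le e_{y_i} \le l$, we have

$$\mathrm{length}(e) = \sqrt{\Big(\sum_{i=0}^{n-1} e_{x_i}\Big)^{2} + \Big(\sum_{i=0}^{n-1} e_{y_i}\Big)^{2}} \qquad (1)$$

We observe that, when $e_{x_i} = e_{y_i} = l$ for $0 \le i \le n-1$, we obtain the maximum length of e by substituting $e_{x_i}$ and $e_{y_i}$ in (1):

$$\mathrm{length}_{\max}(e) = \sqrt{n^{2} \cdot l^{2} + n^{2} \cdot l^{2}} = \sqrt{2} \cdot l \cdot n$$

To compute the minimum length of e, we observe from Figure 5 that $e_{y_1} + e_{y_2} = e_{x_2} + e_{x_3} = l$, and so on, which can be summarized as

$$\begin{aligned} e_{x(2j)} + e_{x(2j+1)} &= l, \quad 1 \le j \le \left\lfloor \tfrac{n-3}{2} \right\rfloor \\ e_{y(2k-1)} + e_{y(2k)} &= l, \quad 1 \le k \le \left\lfloor \tfrac{n-2}{2} \right\rfloor \end{aligned} \qquad (2)$$

For simplicity, we use $E_{x_j}$ to refer to $e_{x(2j)} + e_{x(2j+1)}$ and $E_{y_k}$ to refer to $e_{y(2k-1)} + e_{y(2k)}$ from here on.

When n = 1, the minimum length of e is $\sqrt{e_{x_0}^{2} + e_{y_0}^{2}} = 0$, where $e_{x_0} = e_{y_0} = 0$. Similarly, when n = 2, the minimum length of e is 0, where $e_{x_0} = e_{y_0} = e_{x_1} = e_{y_1} = 0$.

When n is ≥ 3 and odd, we have

$$\begin{aligned} \Big(\sum_{i=0}^{n-1} e_{x_i}\Big)^{2} &= \Big(e_{x_0} + e_{x_1} + \sum_{j=1}^{\lfloor (n-3)/2 \rfloor} E_{x_j} + e_{x(n-1)}\Big)^{2} \\ \Big(\sum_{i=0}^{n-1} e_{y_i}\Big)^{2} &= \Big(e_{y_0} + \sum_{k=1}^{\lfloor (n-2)/2 \rfloor} E_{y_k} + e_{y(n-2)} + e_{y(n-1)}\Big)^{2} \end{aligned}$$

Using the properties in Eqn (2), the above equations can be transformed into

$$\begin{aligned} \Big(\sum_{i=0}^{n-1} e_{x_i}\Big)^{2} &= \Big(e_{x_0} + e_{x_1} + \left\lfloor \tfrac{n-3}{2} \right\rfloor \cdot l + e_{x(n-1)}\Big)^{2} \\ \Big(\sum_{i=0}^{n-1} e_{y_i}\Big)^{2} &= \Big(e_{y_0} + \left\lfloor \tfrac{n-2}{2} \right\rfloor \cdot l + e_{y(n-2)} + e_{y(n-1)}\Big)^{2} \end{aligned}$$

Substituting the corresponding parts of Eqn (1) with the above equations, we can conclude that, if $e_{x_0} = e_{x_1} = e_{x(n-1)} = e_{y_0} = e_{y(n-2)} = e_{y(n-1)} = 0$,

$$\mathrm{length}_{\min}(e) = \sqrt{\left\lfloor \tfrac{n-3}{2} \right\rfloor^{2} + \left\lfloor \tfrac{n-2}{2} \right\rfloor^{2}} \cdot l$$

Similarly, when n is ≥ 3 and even, we have

$$\begin{aligned} \Big(\sum_{i=0}^{n-1} e_{x_i}\Big)^{2} &= \Big(e_{x_0} + e_{x_1} + \sum_{j=1}^{\lfloor (n-3)/2 \rfloor} E_{x_j} + e_{x(n-2)} + e_{x(n-1)}\Big)^{2} \\ \Big(\sum_{i=0}^{n-1} e_{y_i}\Big)^{2} &= \Big(e_{y_0} + \sum_{k=1}^{\lfloor (n-2)/2 \rfloor} E_{y_k} + e_{y(n-1)}\Big)^{2} \end{aligned}$$

Using the same properties as shown in Eqn (2), we can conclude that $\mathrm{length}_{\min}(e) = \sqrt{\lfloor \frac{n-3}{2} \rfloor^{2} + \lfloor \frac{n-2}{2} \rfloor^{2}} \cdot l$. Therefore, we have proved that when n ≥ 3, in both even and odd cases, $\mathrm{length}_{\min}(e) = \sqrt{\lfloor \frac{n-3}{2} \rfloor^{2} + \lfloor \frac{n-2}{2} \rfloor^{2}} \cdot l$.

Lemma 1 states the minimum and maximum bounds of the length of an edge given a fixed number of cells. We further deduce from Lemma 1 the maximum and minimum number of affected cells with regard to an arbitrary edge.

COROLLARY 1. Assume that the service space is managed by a grid-based index with a cell size of l × l.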
For an edge e(v1, v2), the maximum and minimum number of affected cells are $\left\lceil \frac{\sqrt{2} \cdot \mathrm{length}(e)}{l} \right\rceil + 3$ and $\left\lceil \frac{\mathrm{length}(e)}{\sqrt{2} \cdot l} \right\rceil$, respectively.

PROOF. We know from Lemma 1 that, given an edge e(v1, v2), $\mathrm{length}(e) \le \sqrt{2} \cdot l \cdot n$; hence we can directly deduce that $n \ge \frac{\mathrm{length}(e)}{\sqrt{2} \cdot l}$. Similarly, since $\mathrm{length}(e) \ge \sqrt{\lfloor \frac{n-3}{2} \rfloor^{2} + \lfloor \frac{n-2}{2} \rfloor^{2}} \cdot l$, it follows that $\mathrm{length}(e) \ge \sqrt{2 \cdot \lfloor \frac{n-3}{2} \rfloor^{2}} \cdot l$, which leads us to conclude that $n \le \frac{\sqrt{2} \cdot \mathrm{length}(e)}{l} + 3$.

We utilize the property of the maximum number of affected cells in Corollary 1 to prune the search space. Let us assume that MNDR generates the list of cells overlapping with an edge e(v1, v2) and there are n1 affected cells. By using Corollary 1 we deduce that a range d − dist(q, v1) is only able to overlap with at most n2 cells, where n2 < n1. Therefore MNDR will only record the first n2 cells on e(v1, v2) into cellSet. As an example consult Figure 4(b). We know that dist(q, v4) = 2.5, therefore we only need to record cells on e(v4, v6) within a range of 3.5 − 2.5 = 1.0 from v4. Using the maximum bound of the number of affected cells, MNDR records the first 5 cells on e(v4, v6) starting from v4, even though there are 6 cells overlapping with e(v4, v6).

In summary, Corollary 1 provides a precise range on how edges overlap with grid cells. Our simulation results indicate that this property offers substantial performance improvements when computing the affected cells over long edges (i.e., freeway segments).

4.3 k Nearest Neighbor Query Algorithm
Given a query point q, a value k, a network G and a set of moving objects M, a network-distance-based k nearest neighbor query retrieves the k objects of M that are closest to q according to the network distance at time t. Formally, a mobile kNN query is represented as kNNQuery_t(q, k): loc_t(q) × loc_t(M) → {m_i, i = 1, ..., k}, where ∀ m_j ∈ M − {m_i}, dist_t(q, m_j) ≥ dist_t(q, m_i).

To cope with this type of query, we propose a Mobile k Nearest Neighbor query algorithm (MKNN) leveraging our MNDR algorithm to efficiently compute the kNN POIs from the query point in the network.
We observe that the grid index in MOVNet enables fine-grained space partitioning. Additionally, the grid index maintains an object list in each grid cell, which can be quickly accessed to retrieve the number of enclosed objects. Therefore, we begin by searching the surrounding area of the query point in the grid index and continuously enlarging the area until we are able to find a subspace that contains kNN POIs in terms of the Euclidean distance. We term this procedure a progressive probe. Note that in the progressive probe, we only retrieve the size of the object list from each cell, while the distance of each object from the query point is not computed because we aim to obtain an approximate area enclosing kNN objects within network distance. Our experimental study shows that in 30% to 48% of the test cases the actual kNN objects are bounded by the area found by our progressive probe. More importantly, the complexity of retrieving the object list size from each cell is O(1), which is very efficient especially since our grid index is an in-memory structure.

Figure 6: A Mobile Network Distance k-NN (MKNN) query example. (a) The levels L0, L1 and L2 around q. (b) The distance of each vertex from q: d[V1] = 3.6, d[V2] = 1, d[V3] = 3, d[V4] = 2.5, d[V5] = 3.4, d[V6] = 6.5.

We define that cells in the grid index are grouped into levels centered at c(⌊x_q / l⌋, ⌊y_q / l⌋), where q is a moving object submitting a mobile kNN query and l is the side length of a grid cell. The first level L0 is the single cell c(⌊x_q / l⌋, ⌊y_q / l⌋) and cells at the next level are the surrounding cells of L0, and so on. Formally, cells in levels L_i (i ∈ {1, 2, ···}) can be represented as L_i = c(x1, y1) ∪ c(x2, y2) ∪ c(x3, y3) ∪ c(x4, y4), where
    ⌊x_q / l⌋ − i ≤ x1 ≤ ⌊x_q / l⌋ + i,  y1 = ⌊y_q / l⌋ + i,
    x2 = ⌊x_q / l⌋ − i,  ⌊y_q / l⌋ − i + 1 ≤ y2 ≤ ⌊y_q / l⌋ + i − 1,
    x3 = ⌊x_q / l⌋ + i,  ⌊y_q / l⌋ − i + 1 ≤ y3 ≤ ⌊y_q / l⌋ + i − 1, and
    ⌊x_q / l⌋ − i ≤ x4 ≤ ⌊x_q / l⌋ + i,  y4 = ⌊y_q / l⌋ − i.
By using the definition above, the progressive probe first retrieves the number of objects in L0 via the grid index.
If there are fewer than k objects in L0, it continues to scan the number of objects at the next level of cells, and so on. Figure 6(a) illustrates an example of these steps. Assume the system is maintaining a network as shown in Figure 3 and a query object q in c(5, 5) submits a nearest neighbor query with k = 10. The progressive probe first locates q in c(5, 5), which becomes L0. After that, the number of POIs in c(5, 5) is retrieved from the grid index. If there are fewer than 10 POIs in L0, the progressive probe sequentially searches the next levels L_i, where i ∈ {1, 2, ···}, illustrated in the shadowed areas in Figure 6(a). Assuming that at least 10 POIs have been found after the scan in L2, the probe stops and results in an estimated space for kNN objects in the network. Because the R-tree is on secondary storage in MOVNet and the number of disk I/Os should be minimized, MKNN utilizes this estimated area to launch a range query extracting the edges from the R-tree, instead of following a network expansion approach to retrieve a few edges at a time.

We also introduce the following data structures: candidateObjs and unvisitedVertices. These are minimum priority queues on the value of the distances from the query point. The set of candidate objects is retrieved from the grid index as possible objects in the final result set. The set of unvisited vertices is to be expanded when there are fewer than k objects found during query processing. Additionally, we manage resultObjs as a maximum priority queue in terms of the distance from the query point with a size of k.

Algorithm 3 elaborates on the MKNN algorithm. MKNN first executes the progressive probe in the grid index so that an approximate query result space is created. After that, MKNN uses this subspace as an initial range to invoke the MNDR module so that the corresponding edges are retrieved from the R-tree and the distance of each vertex from q is computed (Lines 8 - 12).
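The progressive probe that produces this initial subspace can be sketched as follows. The function names `level_cells` and `progressive_probe`, the `max_level` cut-off, and the per-cell counts in the usage lines are assumptions for illustration. The sketch returns the level index at which the probe stops; how that level is converted into the radius used by Algorithm 3 is not spelled out in the text and is left open here.

```python
def level_cells(center, i):
    """Cells of level L_i: the square ring of cells i steps away from the center cell."""
    cx, cy = center
    if i == 0:
        return [(cx, cy)]
    ring = []
    for x in range(cx - i, cx + i + 1):
        ring.append((x, cy + i))          # top row, corners included
        ring.append((x, cy - i))          # bottom row, corners included
    for y in range(cy - i + 1, cy + i):
        ring.append((cx - i, y))          # left column without corners
        ring.append((cx + i, y))          # right column without corners
    return ring


def progressive_probe(counts, center, k, max_level):
    """Return the first level i such that L_0 .. L_i together hold at least k objects.

    counts maps (column, row) to the size of that cell's object list.
    Returns None if fewer than k objects are found up to max_level.
    """
    found = 0
    for i in range(max_level + 1):
        # Only list sizes are read; no object distance is computed during the probe.
        found += sum(counts.get(cell, 0) for cell in level_cells(center, i))
        if found >= k:
            return i
    return None


# Hypothetical object counts around a query point located in c(5, 5).
counts = {(5, 5): 3, (6, 5): 2, (4, 6): 1, (7, 7): 4, (3, 5): 2}
level = progressive_probe(counts, (5, 5), k=10, max_level=7)
```

With these counts, L0 and L1 together hold six objects, so the probe has to include L2 before at least ten POIs are enclosed, mirroring the example of Figure 6(a).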
Given the example of Figure 6(a), Figure 6(b) demonstrates the correlated modeling graph and the distance to each vertex. Next, a vertex is de-queued from unvisitedVertices (Line 15). For each outgoing edge from the vertex, the set of affected cells is computed and objects are retrieved from the corresponding grid cells and placed into candidateObjs (Lines 21 - 24). After that, we examine two possible cases: First, if there are fewer than k objects in resultObjs, MKNN de-queues objects from candidateObjs into resultObjs (Lines 25 - 27). Second, if the distance of the kth result object is greater than the distance of the first element of candidateObjs, the kth result object will be de-queued and inserted into candidateObjs. Next, candidateObjs de-queues an object and inserts it into resultObjs (Lines 28 - 30). The algorithm terminates when resultObjs contains k POIs and the distance of the kth result object is less than the distance of the minimum vertex in unvisitedVertices (Lines 17 - 20). Otherwise, if the last vertex v in the modeling graph (i.e., the vertex with the longest distance to q) is visited and the distance of the kth result object is greater than dist(q, v), MKNN will use dist(q, v) as the radius to launch a range query in the R-tree as a new iteration of MKNN (Line 32). Although this step causes I/O operations as well as the overhead of creating a modeling graph again, MKNN maintains the set of visited vertices in each iteration to avoid visiting these vertices in future iterations (Line 13). As our simulation results have verified, under various settings, MKNN requires no more than two iterations during query processing in more than 97% of the test cases. Therefore, this method significantly reduces the I/O cost and ensures high system throughput.

5.
COMPLEXITY ANALYSIS
We present our theoretical analysis of MOVNet in the following sections. We assume that the network and moving objects are uniformly distributed in one unit square space (i.e., for an object m, $0 \le x_m < 1$ and $0 \le y_m < 1$). This is an optimal simplification, which is similar to previous studies [13, 19]. A grid index with cells of size $l \times l$ manages the moving object location updates. The total number of edges and moving objects in the network are E and M, respectively.

Algorithm 3 Mobile Network Distance kNN Query (q, k)
 1: /* q is the query object */
 2: /* k is the number of NN objects */
 3: /* l is the side length of cell */
 4: foundkObjs = false
 5: visitedVertices = ∅
 6: radius = Progressive-probe(q, k, l)
 7: while foundkObjs = false do
 8:   (E, V) = Euclidean-range(qx, qy, radius)
 9:   G = Create-modeling-graph(E, V)
10:   e = Object-map-matching(q, E)
11:   q = Add-vertex-into-graph(G, qx, qy, e)
12:   S = Compute-distance(G, q)
13:   unvisitedVertices = S - visitedVertices
14:   while unvisitedVertices != NULL do
15:     minVertex = De-queue(unvisitedVertices)
16:     cellSet = ∅
17:     if resultObjs.size = k AND minVertex.dist ≥ kth resultObjs.dist then
18:       foundkObjs = true
19:       break
20:     end if
21:     for each edge e outgoing from minVertex do
22:       cellSet = cellSet ∪ cellOverlapping(e, d − dist(q, v))
23:     end for
24:     candidateObjs = candidateObjs ∪ Retrieve-objects(cellSet, G)
25:     while resultObjs.size < k do
26:       De-queue(candidateObjs) to resultObjs
27:     end while
28:     while Peek(candidateObjs).dist ≤ kth resultObjs.dist do
29:       Switch(De-queue(candidateObjs), De-queue(resultObjs))
30:     end while
31:   end while
32:   radius = minVertex.dist
33: end while
34: return resultObjs

5.1 Analysis of MNDR
For an MNDR query with a range d, let us assume the query covers an area of $4d^2$. Although the Euclidean distance query in MNDR is in actuality performed within an area of $\pi d^2$, our assumption does not change the quality of our analysis.
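The substitution of the bounding square for the query circle can be checked in one line. Assuming, as in this section, that the E edges are spread uniformly over the unit square, so that a region of area A holds about A·E edges, the two edge counts differ only by a constant factor:

```latex
\pi d^{2} E \;\le\; 4 d^{2} E \;=\; \frac{4}{\pi}\,\bigl(\pi d^{2} E\bigr) \;\approx\; 1.27\,\pi d^{2} E ,
\qquad\text{so both counts are } O(d^{2}E).
```

The constant 4/π is independent of d, E, and M, which is why the asymptotic terms derived from the square also hold for the circle.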
During the processing of MNDR, there are $O(d^2 E)$ edges retrieved from the on-disk R-tree in the Euclidean distance range query. The next step that creates a modeling graph is of complexity $O(d^2 E)$ since every edge will be recorded in the graph. Finding the edge where the query point is located can be achieved during the modeling graph construction. Additionally, inserting the query point into the modeling graph as the starting vertex requires only $O(1)$ operations. The running time of Dijkstra's algorithm to compute the distance of each vertex from the query point is $O(d^2 E \cdot \lg(d^2 E))$. Next, MNDR calculates the cell set overlapping with the edges based on the distance information. Note that each edge is examined at most once during the course of this step. Therefore, $O(d^2 E)$ iterations are needed to calculate the overlapping cells. Moreover, since the length of the edge is bounded by d, the total complexity of this step is $O(d^3 E)$. Finally, MNDR retrieves the objects from the grid index and computes the result set. For a range query with a side length of 2d, the number of overlapping grid cells is $(2d + l)^2 / l^2$ [17]. For each cell, we can assume that there are $l^2 M$ objects. Hence the number of moving objects retrieved in the final step is $O((2d + l)^2 M)$. To sum up, the cost of MNDR can be represented as $O(d^2 E \lg(d^2 E) + (2d + l)^2 M)$.

We observe that the cost of MNDR is linear in the number of POIs. Similarly, the system throughput is proportional to the side length of cells (or inversely proportional to the number of cells). Additionally, both factors are lower-bounded by the cost of graph construction, Dijkstra's algorithm, and the overlapping cell computation. Finally, the CPU cost is a quadratic function of d, which means a larger range results in a serious increase in CPU cost.

5.2 Analysis of MKNN
For simplicity, let us assume that the progressive probe results in a subspace containing k nearest neighbor objects.
In the case that MKNN needs to expand to a larger space with more iterations, it can be modeled with our cost model times a constant, which does not change the characteristic of our analysis.

Since we assume that POIs are uniformly distributed, the subspace containing kNN objects has a size of $k/M$. Therefore, MKNN needs to scan $\frac{k}{M \cdot l^2}$ cells to find the kth object and return a subspace. The subsequent steps in MKNN that perform a Euclidean distance range query, construct the modeling graph, compute the overlapping cells, and retrieve objects are the same as the ones in MNDR. Hence the cost in these operations can be summarized as $O\left(\frac{kE}{M} \cdot \left(2 + \lg\frac{kE}{M} + \frac{kE}{M}\right) + \left(\frac{kE}{M} + l\right)^2 \cdot M\right)$. The final step that filters objects from candidateObjs into resultObjs is bounded by the size of resultObjs (i.e., k). In summary, the cost of MKNN can be simplified as $O\left(\frac{k}{M l^2} + \frac{kE}{M} \cdot \frac{kE}{M} + \left(\frac{kE}{M} + l\right)^2 \cdot M\right)$.

The equation above shows that the CPU cost of MKNN is proportional to k. The explanation is that with an increasing k, MKNN needs to search a larger space to find the query result. Additionally, the CPU cost is inversely proportional to the number of objects. This is because with more POIs, the search space for finding the kth object becomes smaller, and vice versa. Finally, the system throughput as a function of the cell size is bounded by two factors: the cost from the progressive probe and the cost of retrieving objects from the grid index. A smaller cell size results in more overhead from the progressive probe. In contrast, increasing the size of cells implies that more objects are retrieved from the grid index. We present our simulation results in the following section as an experimental verification of our theoretical analysis.

6. EXPERIMENTAL EVALUATION
To evaluate the performance of MOVNet, we performed extensive simulations on a real dense road network. The results indicate that MOVNet achieves good throughput with a wide variety of data settings.
In Section 6.1 we start by describing the data sets used in our simulation and our simulator implementation. Experimental results and the corresponding discussion are presented in Section 6.2.

6.1 Simulator Implementation
We obtained a real data set from TIGER/Line (see footnote 1). The Los Angeles County (LA) data set has 304,162 road segments distributed over an area of 4,752 square miles. The average length of road segments is 0.1066 miles. For simplicity, we assume that each road segment is bi-directional. The network data is indexed with an R*-Tree. Additionally, we used a network simulator [3] to generate the positions of 100,000 moving objects in the road network.

Existing work, such as IMA/GMA, only focuses on C-kNN query processing. This differs from the functionality of MOVNet, which focuses on snapshot range query and kNN query processing. Therefore, we leveraged the concept of network expansion [15] to design baseline algorithms for performance comparisons in our simulations. The baseline algorithm for mobile range queries executes as follows. First we retrieve the edge where the query point is located. Next, the closest vertex to the query point is expanded and outgoing edges from this vertex are retrieved. The expansion stops once all vertices whose distances from the query point are less than d have been expanded. After that, for each expanded edge, the overlapping cells are computed and POIs in these cells are retrieved to constitute the result set. Based on the same idea, the baseline algorithm for mobile kNN queries has the following steps. First we locate the road segment on which the query point is moving and compute the affected cells. Next, the corresponding moving objects are retrieved from the affected cells. If there are fewer than k objects in the result set, or the distance from the query point to the closest vertex is less than the distance of the kth object from the query point, the closest vertex is expanded and the outgoing edges from this vertex are retrieved.
Afterward, the set of affected cells on the outgoing edges is computed and the corresponding objects are retrieved from the grid index. The vertex expansion process stops when there are k objects in the result set and the distance from the query point to the kth object is no greater than the distance from the query point to the closest un-expanded vertex.

We implemented a simulator in Java. The simulation was executed on a workstation with 1 GB memory and a 3.0 GHz Xeon processor. We arranged the road segments of the LA county data set into an R*-tree index file, in which we set the page size to 4 KB. Each road segment is stored in an MBR bounded by its starting and ending coordinates. To achieve a fair comparison, our baseline algorithms also use this R*-tree index structure to speed up the edge retrieval during query processing. For each test case, our simulator creates a service space with the area equal to the LA county size. It then opens the R*-tree index file and uses a buffer with a size of 10 pages for caching the disk pages read by MOVNet. Next, an in-memory grid index is created with the positions of the moving objects. To simplify the map-matching process, we assume that object locations always fall along the road segments. In the next step, the query generator randomly picks a moving object and launches a query from its location. Table 1 summarizes the parameters used. In each experimental setting we varied a single parameter and kept the remaining ones at their default values. The experiments measured the CPU time (in milliseconds) and the number of disk page accesses as the performance of the query processing. For each experimental configuration, the simulator executed 1,000 iterations and reported the average result.

6.2 Simulation Results
We are first interested in verifying the update costs from POIs in MOVNet.
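As a point of reference for the update path measured here, a position update in an in-memory grid index can be sketched as follows. This is a minimal illustration rather than the simulator's Java code: the class name `GridIndex`, its methods, and the choice to track only cell membership (not exact coordinates) are assumptions made for brevity.

```python
from collections import defaultdict
from math import floor


class GridIndex:
    """In-memory grid that keeps one object list (here a set of ids) per cell."""

    def __init__(self, cell_side):
        self.l = cell_side
        self.cells = defaultdict(set)  # (cx, cy) -> ids of objects in that cell
        self.where = {}                # object id -> cell it currently occupies

    def _cell(self, x, y):
        # Map a position to its cell coordinates.
        return (floor(x / self.l), floor(y / self.l))

    def update(self, obj_id, x, y):
        """Record a new position; touches at most two object lists."""
        new = self._cell(x, y)
        old = self.where.get(obj_id)
        if old == new:
            return  # still in the same cell, membership is unchanged
        if old is not None:
            self.cells[old].discard(obj_id)
        self.cells[new].add(obj_id)
        self.where[obj_id] = new

    def count(self, cell):
        """Size of a cell's object list, as read by a counting probe."""
        return len(self.cells.get(cell, ()))
```

Each update performs a constant number of hash operations and no disk access, so the total update cost in such a structure grows linearly with the number of update messages.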
Since we use an in-memory grid index to handle these updates, there is no disk access on the R*-tree index file, which is on secondary storage. Therefore, we measured the CPU time of the update processing. Note that the update and query processing should be finished in one update cycle to ensure the correctness of the query results. We assume that at the beginning of each update period, 20% of the POIs submit their new positions. Figure 7 shows that when there are 20,000 update messages in one period (i.e., 100,000 POIs), MOVNet is able to record these changes in about 200 milliseconds. Additionally, the update cost is proportional to the number of update messages. Therefore, it is possible to estimate the CPU time that is required to process updates.

Footnote 1: http://www.census.gov/geo/www/tiger/

Table 1: Simulation parameters
Parameter                   Default    Value Range
Number of POIs              50K        10K - 100K
POI distribution            Uniform    Uniform
Number of NNs (k)           50         2 - 128
Radius (miles)              5          2 - 10
Number of cells per axis    1K         200 - 1,400

[Figure 7: The CPU time of update cost as a function of POIs. Axes: Number of Updates (4,000 to 20,000) vs. CPU Time (millisec).]

We start by verifying the performance of MNDR. Figure 8(a) illustrates the effect of the number of cells with the LA county data set. The results show that MNDR requires less than half of the CPU time compared with the baseline algorithm. Correspondingly, Figure 8(b) studies the page accesses of both algorithms. As we can see, the baseline algorithm consumes more than 3,000 page accesses with various cell sizes. In comparison, MNDR requires fewer than 100 page accesses during query processing. An important observation is that a small number of cells causes the CPU time of MNDR to degrade. On the other hand, the disk access of MNDR is stable with different cell sizes. This can be explained by the fact that a disk access only occurs when we retrieve the road segments from the R*-tree file. Since we use a fixed range in this test, the number of disk accesses is not affected by changing the cell size.
However, a larger cell size will result in a larger number of POIs being retrieved from the grid index during query processing. Therefore, the CPU time expended in this portion is larger than with smaller cell sizes. Overall, we conclude that MOVNet scales very well with varying cell numbers. Note that with MNDR, the setting of 1,000 cells per axis achieves a stable and optimal performance, hence we set the default number of cells per axis to be 1,000 in our other tests.

[Figure 8: The performance of MNDR as a function of the number of cells. (a) CPU Time (millisec) vs. number of cells per axis (10^2). (b) Disk Access (pages) vs. number of cells per axis (10^2). Series: Baseline, MNDR.]

Next, Figure 9(a) illustrates the effect of the number of POIs on the execution time of MNDR. As we can see, MNDR outperforms the baseline algorithm with various numbers of POIs. In the case of 20K POIs, the CPU time of MNDR is about 30% of that of the baseline algorithm. Additionally, the output shows that the CPU time increases linearly with the number of POIs, which follows our complexity analysis expectation. The very small gradient of the MNDR output suggests that MOVNet is very scalable to support a very large number of POIs. More importantly, with 100K POIs, the processing time for LA county is about 0.1 seconds. This demonstrates how efficiently MOVNet executes. Figure 9(b) plots the disk accesses of both algorithms. As with the CPU time, MNDR consistently requires far fewer disk accesses than the baseline algorithm.

[Figure 9: The performance of MNDR as a function of POIs. (a) CPU Time (millisec) vs. number of objects (10^3). (b) Disk Access (pages) vs. number of objects (10^3). Series: Baseline, MNDR.]

Figure 10(a) plots the CPU time (with logarithmic scale) versus the query range with the LA county set. The CPU time quadratically increases with a larger range.
When the range is 4 miles, MNDR costs 0.076 seconds. Processing a range of 8 miles requires 0.2 seconds by using MNDR compared with 0.65 seconds when using the baseline algorithm. Additionally, MNDR always consumes about 40% of the CPU time compared with the baseline algorithm during query processing. Figure 10(b) plots the corresponding page accesses. The output corresponds to the CPU output, as well as our complexity analysis results. Assuming the road network is uniformly distributed, the number of edges grows quadratically with the increase of the range. Since these edges must be retrieved from the R-tree file during query processing, the performance of MNDR deteriorates correspondingly.

[Figure 10: The performance of MNDR as a function of range. (a) CPU Time (millisec, logarithmic scale) vs. Range (miles). (b) Disk Access (pages, logarithmic scale) vs. Range (miles). Series: Baseline, MNDR.]

Next, we are interested in the performance improvement when using Corollary 1 in MNDR. Figure 11(a) plots the CPU time when using Corollary 1 to prune the search space in MNDR compared to not using it when handling the LA county data. The performance improvement of using Corollary 1 is about 10% when the range is 6.0 miles and less than 5% when the range is 2.0 miles. We believe this is largely due to the fact that the TIGER/Line data set consists of many very short road segments (0.1066 miles on average). There are only a few cells that overlap with each edge, which implies that there is little chance to prune some cells during query processing. However, the system improvement by using Corollary 1 is substantial when it is applied to large road segments. To illustrate this fact, we extracted the freeway segments in LA county (the average length of road segments is 2.7127 miles) and performed the simulation on just this network. Figure 11(b) shows the results, with query ranges from 2.0 up to 10.0 miles. The results indicate that the improvement of system throughput by applying Corollary 1 is very noticeable. Especially when the range is 6 miles, the system performance achieves a gain of over 30%.
Hence we conclude that for a network with long road segments, it is very appealing to use Corollary 1 to prune the search space.

[Figure 11: The CPU time improvement of using Corollary 1. (a) LA county. (b) LA county, freeways only. Axes: Range (miles) vs. CPU Time (millisec). Series: W/O Search Space Pruning, W/ Search Space Pruning.]

Now we study the performance of MKNN. Figures 12(a) and (b) illustrate the CPU time and disk accesses of MKNN as a function of the number of cells, respectively. An observation is that the throughput of MKNN is relatively stable when the number of cells per axis exceeds 400. This is because when we use a fairly large cell size (e.g., 200 cells per axis), the grid index only provides a very coarse-grained space partition. Hence the progressive probe results in a space with lots of unnecessary road segments, which must be retrieved in later steps. However, the performance of MKNN becomes stable when we choose small cell sizes. Hence we set the default number of cells per axis to be 1,000 in our other MKNN tests. More importantly, MKNN consistently requires less CPU time as well as fewer disk accesses than the baseline algorithm. In general, MKNN costs less than 50 milliseconds to process a kNN query with k = 50.

[Figure 12: The performance of MKNN as a function of the number of cells. (a) CPU Time (millisec) vs. number of cells per axis (10^2). (b) Disk Access (pages) vs. number of cells per axis (10^2). Series: Baseline, MKNN.]

Figure 13(a) plots the performance of MKNN with regard to k. The CPU time grows proportionally with k. More importantly, MKNN outperforms the baseline algorithm. The growth of the CPU time in MKNN is much slower than that of the baseline algorithm as a function of k. MKNN costs less than 80% of the CPU time of the baseline algorithm where k = 128.
Figure 13(b) shows the disk accesses of both algorithms. The gradient of the MKNN output is very small, which suggests that with the increase of k, the progressive probe in MKNN significantly avoids excessive I/O operations on the R-tree. Finally, when k = 128, the CPU time in LA county is less than 0.5 seconds. This clearly shows that MOVNet can support a very large value of k.

Figure 14(a) illustrates the CPU time of MKNN as a function of the number of POIs. The result shows that the CPU time is inversely proportional to the number of POIs, which is what we expect from the theoretical analysis. With a larger number of POIs, the performance of MKNN becomes better. This characteristic ensures that MOVNet is very applicable for use in metro areas. When there are 100K POIs in the service area, processing a kNN query with k = 50 requires only 26 milliseconds. Another important observation is that MKNN has better system throughput than the baseline algorithm with varying numbers of POIs. The improvement ranges from a factor of 4.23 up to 5.80. Figure 14(b) shows the disk access counts with regard to both MKNN and the baseline algorithm, which correlates with our CPU time measurement result.

[Figure 13: The performance of MKNN as a function of k. (a) CPU Time (millisec) vs. k. (b) Disk Access (pages) vs. k. Series: Baseline, MKNN.]

[Figure 14: The performance of MKNN as a function of POIs. (a) CPU Time (millisec) vs. number of objects (10^3). (b) Disk Access (pages) vs. number of objects (10^3). Series: Baseline, MKNN.]

Based on our simulation results and analysis, we conclude that the performance of MOVNet scales very well with various settings. It consumes less CPU time in all test cases. The performance difference between MOVNet and the baseline algorithms is much more distinguishable with regard to disk page access.

7.
CONCLUSIONS
Location-based services have generated growing interest in the research community. This paper presents an infrastructure aimed at processing location-based services with moving objects in road networks. We propose a cell overlapping algorithm that quickly relates the underlying network and moving objects in memory. Based on the infrastructure of MOVNet, we present two novel algorithms for processing snapshot range queries and kNN queries, respectively. The experimental evaluation suggests that MOVNet is highly efficient in processing these queries with a real dense road network.

We plan to extend our work in several directions. First, our study currently assumes a static network. However, incorporating some dynamic network updates, such as real-time traffic information, will be critical for many location-based services, especially those for metro usages. We would like to extend our work to support a dynamic underlying network. Additionally, continuous queries are the most sophisticated query type in location-based services. Although they consume much more computation and memory resources than snapshot queries, they offer an extended view of POI movements and become appealing for monitoring purposes. This functionality is very useful in a number of places, such as 911 call centers. We are planning to extend the functionality of MOVNet to support continuous queries.

8. REFERENCES
[1] N. Beckmann, H.-P. Kriegel, R. Schneider, and B. Seeger. The R*-Tree: An Efficient and Robust Access Method for Points and Rectangles. In SIGMOD Conference, 1990.
[2] J. Bresenham. Algorithm for computer control of a digital plotter. IBM Systems Journal, 4(1):25–30, 1965.
[3] T. Brinkhoff. A framework for generating network-based moving objects. GeoInformatica, 6(2):153–180, 2002.
[4] H.-J. Cho and C.-W. Chung. An efficient and scalable approach to cNN Queries in a Road Network. In VLDB, 2005.
[5] H. D. Chon, D. Agrawal, and A. E. Abbadi. Range and kNN query processing for moving objects in grid model. MONET, 8(4), 2003.
[6] E. Dijkstra. A note on two problems in connection with graphs. Numerische Mathematik, 1, 1959.
[7] X. Huang, C. S. Jensen, H. Lu, and S. Saltenis. S-GRID: A Versatile Approach to Efficient Query Processing in Spatial Networks. In SSTD, 2007.
[8] X. Huang, C. S. Jensen, and S. Saltenis. The Islands Approach to Nearest Neighbor Querying in Spatial Networks. In SSTD, 2005.
[9] C. S. Jensen, J. Kolář, T. B. Pedersen, and I. Timko. Nearest Neighbor Queries in Road Networks. In ACM GIS, 2003.
[10] M. R. Kolahdouzan and C. Shahabi. Continuous K-Nearest Neighbor Queries in Spatial Network Databases. In STDBM, 2004.
[11] M. R. Kolahdouzan and C. Shahabi. Voronoi-Based K Nearest Neighbor Search for Spatial Network Databases. In VLDB, 2004.
[12] M. F. Mokbel, X. Xiong, and W. G. Aref. SINA: Scalable Incremental Processing of Continuous Queries in Spatio-temporal Databases. In SIGMOD Conference, 2004.
[13] K. Mouratidis, M. Hadjieleftheriou, and D. Papadias. Conceptual partitioning: An efficient method for continuous nearest neighbor monitoring. In SIGMOD Conference, 2005.
[14] K. Mouratidis, M. L. Yiu, D. Papadias, and N. Mamoulis. Continuous nearest neighbor monitoring in road networks. In VLDB, 2006.
[15] D. Papadias, J. Zhang, N. Mamoulis, and Y. Tao. Query processing in spatial network databases. In VLDB, 2003.
[16] Y. Tao, D. Papadias, and J. Sun. The TPR*-Tree: An Optimized Spatio-Temporal Access Method for Predictive Queries. In VLDB, 2003.
[17] H. Wang, R. Zimmermann, and W.-S. Ku. ASPEN: An Adaptive Spatial Peer-to-Peer Network. In ACM GIS, 2005.
[18] X. Xiong, M. F. Mokbel, and W. G. Aref. SEA-CNN: Scalable Processing of Continuous K-Nearest Neighbor Queries in Spatio-temporal Databases. In ICDE, 2005.
[19] X. Yu, K. Q. Pu, and N. Koudas. Monitoring k-nearest neighbor queries over moving objects. In ICDE, 2005.
Description
Haojun Wang, Roger Zimmermann. "Framework for snapshot location-based query processing on moving objects in road networks." Computer Science Technical Reports (Los Angeles, California, USA: University of Southern California. Department of Computer Science) no. 899 (2008).
Asset Metadata
Creator
Wang, Haojun
(author),
Zimmermann, Roger
(author)
Core Title
USC Computer Science Technical Reports, no. 899 (2008)
Alternative Title
Framework for snapshot location-based query processing on moving objects in road networks (
title
)
Publisher
Department of Computer Science,USC Viterbi School of Engineering, University of Southern California, 3650 McClintock Avenue, Los Angeles, California, 90089, USA
(publisher)
Tag
OAI-PMH Harvest
Format
11 pages
(extent),
technical reports
(aat)
Language
English
Unique identifier
UC16271167
Identifier
08-899 Framework for Snapshot Location-based Query Processing on Moving Objects in Road Networks (filename)
Legacy Identifier
usc-cstr-08-899
Rights
Department of Computer Science (University of Southern California) and the author(s).
Internet Media Type
application/pdf
Copyright
In copyright - Non-commercial use permitted (https://rightsstatements.org/vocab/InC-NC/1.0/)
Source
20180426-rozan-cstechreports-shoaf
(batch),
Computer Science Technical Report Archive
(collection),
University of Southern California. Department of Computer Science. Technical Reports
(series)
Access Conditions
The author(s) retain rights to their work according to U.S. copyright law. Electronic access is being provided by the USC Libraries, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright.
Repository Name
USC Viterbi School of Engineering Department of Computer Science
Repository Location
Department of Computer Science. USC Viterbi School of Engineering. Los Angeles, CA, 90089
Repository Email
csdept@usc.edu
Inherited Values
Title
Computer Science Technical Report Archive
Description
Archive of computer science technical reports published by the USC Department of Computer Science from 1991 - 2017.
Coverage Temporal
1991/2017