An Efficient Index Structure for Large-scale Geo-tagged Video Databases

Ying Lu, Cyrus Shahabi, Seon Ho Kim
Integrated Media Systems Center, University of Southern California, Los Angeles, CA
{ylu720,shahabi,seonkim}@usc.edu

ABSTRACT
We are witnessing significant growth in the number of smartphone users, in phone hardware and sensor technology, and in broadband bandwidth. Consequently, an unprecedented number of user-generated videos (UGVs) are being collected by the public. Even though regular videos are difficult to index and search, UGVs can potentially be indexed and searched more effectively by utilizing the smartphone sensors (e.g., GPS locations, compass directions) to geo-tag these videos at acquisition time at a very fine spatial granularity. Ideally, each video frame can be tagged by the spatial extent of its coverage area, termed Field-Of-View (FOV).

In this paper, we focus on the challenges of spatial indexing and querying of FOVs in a large repository. Since an FOV is shaped like a slice of pie and contains both location and orientation information, conventional spatial indexes, such as R-tree, cannot index them efficiently. Moreover, since the distribution of UGVs is non-uniform (e.g., more FOVs in popular locations), Grid-based indexes cannot cope with skewed UGVs efficiently. Finally, since UGVs are usually captured in a casual way with various shooting and moving directions, moving trajectories, zoom levels and camera lenses, no a priori assumptions can be made to condense them in an index structure. Hence we propose a class of new R-tree-based index structures that effectively harness FOVs' camera locations, orientations and view-distances, in tandem, for both filtering and optimization, without storing the actual spatial extents of FOVs in the R-tree nodes. In addition, we present novel search strategies and algorithms for efficient range and directional queries on FOVs utilizing our indexes. Our experiments with a real-world dataset and a large synthetic video dataset (over 30 years worth of videos) demonstrate the scalability and efficiency of our proposed indexes and search algorithms and their superiority over the competitors.

Categories and Subject Descriptors
H.2.8 [Database Management]: Database Application – Spatial databases and GIS; H.3.4 [Information Storage and Retrieval]: Systems and Software – Performance evaluation

General Terms
Algorithms, Experimentation, Performance

Keywords
Geo-tag, index, scalability, user-generated video

1. INTRODUCTION
Driven by advances in video technologies and mobile devices (e.g., smartphones, Google Glass), a large number of user-generated mobile videos are being produced and consumed. According to a study by Cisco [1], overall mobile data traffic reached 1.5 exabytes per month at the end of 2013, 53% of which was mobile video data. It is forecasted that mobile video traffic will grow at an annual rate of 61% between 2013 and 2018, reaching 15.9 exabytes per month by 2018.
It is obvious that mobile videos will play a critical role in daily life; however, it is still very challenging to organize and search such a huge amount of unstructured user-generated mobile video, resulting in a significant underutilization of big video data which carry plentiful information about human life. Hence, how to efficiently index such a huge-scale video repository becomes an urgent and challenging problem.

To overcome this challenge, we leverage two critical technological trends. First, various sensors are increasingly becoming available in mobile devices to capture the geographical properties of mobile videos. For example, GPS sensors and digital compasses embedded in smartphones can capture the camera locations and viewing directions at the frame level while capturing videos. Second, the association of video content with geospatial properties (e.g., the viewable scene or Field Of View model [2]) has been shown to be very useful for various applications, such as the online mobile media management system MediaQ [7]. In the presence of such fine-granularity geo-metadata (i.e., FOVs), we propose a new efficient index structure for a large-scale geo-tagged video database.

Figure 1: MediaQ [7] interface. (a) Range query; (b) directional query.

Unlike traditional spatial objects (e.g., points, rectangles), FOVs are spatial objects with orientations (i.e., the viewing direction of the camera). For example, Fig. 1 shows the FOVs of videos on Google Maps in the MediaQ [7] system. A blue pie shape represents the FOV of the currently playing frame of the corresponding video. In the presence of FOV metadata, there can be two typical spatial queries on geo-tagged videos [12] in a spatial database: range and directional queries. A range query finds all the FOVs that overlap with a user-specified query area, e.g., a circle. Fig. 1(a) illustrates a circular range query in MediaQ, searching for videos in an area at the University of Southern California, where the markers on the map show the resulting video locations of the range query. A directional query finds all the FOVs whose orientations overlap with the user-specified direction within a range. Fig. 1(b) shows the results of a directional query when the input direction is the North.

Note that the "direction" discussed in this paper is an inherent attribute of a spatial object (i.e., an FOV). This differs from how direction has been treated in the past in the spatial database field, where direction is only a component of a query. For example, the goal of an object-based directional query in [11] is to find objects that satisfy the specified direction of the query object (e.g., "find restaurants to the north of an apartment"), while the goal of a directional query in this study is to find all the objects pointing towards the given direction. To distinguish these two characteristics, we use the term "orientation" when referring to the direction attribute of FOV objects and "direction" when we refer to the query component.

Unfortunately, indexing FOVs poses great challenges to existing spatial indexes, due to the orientation property of FOVs and the way these datasets are collected in the real world. Let us elaborate on each case below. First, the FOVs of geo-tagged videos are spatial objects with both locations and orientations. Existing indexes cannot efficiently support this type of data.
For example, one straightforward approach to indexing FOVs with a typical spatial index such as R-tree [5] is to enclose the area of each FOV in a Minimum Bounding Rectangle (MBR). However, since R-tree does not consider the orientation information of objects, it suffers from unnecessarily large MBRs and consequently large "dead spaces" (i.e., empty areas that are covered by an index node but do not overlap with any objects in the node). Also, it can perform neither orientation filtering nor orientation optimization, resulting in many false positives. The state of the art in indexing FOVs is a three-level grid-based index (Grid) [12]. Grid stores the location and direction information at different grid levels, which is not very efficient for video querying, since video queries (e.g., range queries) involve both the location and orientation information of FOVs at the same time. Second, in real life the FOVs of user-generated videos are recorded in a casual way, with various shooting directions, moving directions, moving speeds, zoom levels and camera lenses. Grid suffers from efficiency problems when indexing FOVs with different zoom levels and camera lens properties. In addition, FOVs are not uniformly distributed in the real world. Certain areas produce a significantly larger number of FOVs due to the high frequency and long duration of videos uploaded for those locations. The Grid-based index performs poorly for non-uniformly distributed FOVs, since the occupancy of grid files rises quickly for skewed distributions.

To overcome the drawbacks of the existing approaches and index FOVs efficiently, we propose a class of new index structures using orientation information, termed OR-trees, building on the premises of R-tree. The first, straightforward variation uses R-tree to index only the camera locations of FOVs as points and then augments the index nodes to store their orientations. This variation of R-tree is expected to generate smaller MBRs and reduce their dead spaces while supporting orientation filtering. To enhance it further, we devise a second variation by adding an optimization technique that utilizes orientation information during node split and merge operations. Finally, in our third and last variation, we take the FOVs' viewable distances into consideration during both the filtering and optimization processes.

Our extensive experiments using a real-world dataset and big synthetically generated datasets (more than 30 years worth of videos) demonstrate how each variation of OR-tree performs compared to its competitors, i.e., R-tree and Grid. Contrary to our original intuition, the first variation, which simply augments the nodes with orientations, did not produce better results than R-tree or Grid. However, when we enhanced the merge and split operations with orientation and view-distance information in the second and third variations, the index performance in supporting range and directional queries improved greatly, almost by a factor of two compared to R-tree and Grid. This implies that naively augmenting R-tree with extra orientation information does not necessarily enhance indexing performance; rather, the results demonstrate that our optimization techniques for augmenting R-tree with extra orientation and view-distance information are critical to the enhancement of index performance, which is the main contribution of this paper. Another major contribution of this paper is the new search algorithms to efficiently process range and directional queries with OR-trees.
For example, we devise a new method to identify an index node of an OR-tree as a "total hit" (i.e., all the child nodes are in the result set) without accessing its child nodes, which results in a significant reduction in processing cost. Finally, we develop an analytical model to compute a bound on the maximum possible improvement of OR-trees over R-tree.

The remainder of this paper is organized as follows. In Sec. 2, we present the FOV spatial model for geo-tagged videos and formally define the video queries. Sec. 3 reviews the related work. In Sec. 4, we present two baseline indexes. We propose our index structures in Sec. 5 and describe search algorithms for the proposed indexes in Sec. 6. We analyze the maximum improvement over R-trees in Sec. 7. Sec. 8 reports our experimental results. Finally, Sec. 9 concludes the paper and discusses some of our future directions.

2. PRELIMINARIES
2.1 Video Spatial Model
In this paper, videos are represented as sequences of video frames, and each video frame is modeled as a Field Of View (FOV) [2] as shown in Fig. 2. An FOV f is denoted as (p, R, Θ), in which p is the camera location, R is the visible distance, and Θ is the orientation of the FOV in the form of a tuple <θb, θe>, where θb and θe are the beginning and ending view directions (rays) in clockwise order, respectively. We store Θ as a tuple of two angles, Θ<θb, θe>, where θb (resp. θe) is the angle from the north N to the beginning (resp. ending) view direction in the clockwise direction. During video recording with sensor-rich camera devices, we can automatically obtain the camera view direction θ with respect to the north and the visible angle α [2]. We can then derive θb = (θ − α/2 + 360) mod 360 and θe = (θ + α/2 + 360) mod 360.

In this paper, we represent the coverage of each geo-video as a series of FOV objects. Let V be a video dataset. For a video vi ∈ V, let Fvi be the set of FOVs of vi. Hence, the video database V can be represented as an FOV database F = {Fvi | ∀vi ∈ V}.

Figure 2: FOV model (camera location p, heading θ, visible angle α, visible distance R, orientation Θ = <θb, θe>).

2.2 Queries on Geo-Videos
As we represent a video database V as an FOV spatial database F, the problem of geo-video search is transformed into spatial queries on the FOV database F. Next, we formally define two typical spatial queries on geo-videos [12]: range queries and directional queries. Given a query circle Qr(q, r) with center point q and radius r, a range query finds the FOVs that overlap with Qr. It is formally defined as:

RangeQ(Qr, F) := {f ∈ F | f ∩ Qr ≠ ∅}

Given a query direction interval Qd(θb, θe) and a query circle Qr(q, r), a directional query finds the FOVs whose orientations overlap with Qd within the range Qr, and is formally defined as:

DirectionalQ(Qd, Qr, F) := {f ∈ F | f.Θ ∩ Qd ≠ ∅ and f ∩ Qr ≠ ∅}
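To make the model concrete, the following sketch (ours, not from the paper; Python is used purely for illustration) encodes an FOV record, the derivation of θb and θe from the compass heading θ and visible angle α given above, and the orientation-overlap predicate used by DirectionalQ. All field and function names are our own.

```python
from dataclasses import dataclass

@dataclass
class FOV:
    px: float       # camera x (east) in a local planar frame, meters
    py: float       # camera y (north), meters
    R: float        # visible distance, meters
    theta_b: float  # beginning view direction, degrees clockwise from north
    theta_e: float  # ending view direction, degrees clockwise from north

def make_fov(px, py, R, theta, alpha):
    """Derive <theta_b, theta_e> from heading theta and visible angle alpha."""
    return FOV(px, py, R,
               (theta - alpha / 2 + 360) % 360,
               (theta + alpha / 2 + 360) % 360)

def cw(a, b):
    """Clockwise angle from direction a to direction b, in degrees."""
    return (b - a) % 360

def orientation_overlaps(f, qd_b, qd_e):
    """DirectionalQ's orientation predicate: do the clockwise intervals
    [f.theta_b, f.theta_e] and [qd_b, qd_e] intersect?"""
    return (cw(qd_b, f.theta_b) <= cw(qd_b, qd_e) or
            cw(f.theta_b, qd_b) <= cw(f.theta_b, f.theta_e))
```

Two circular arcs intersect exactly when the start of one lies inside the other, which is what the two clockwise-distance tests check.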
3. RELATED WORK
Geo-referenced videos have been of wide interest to the research communities of multimedia and computer vision. For example, Zhu et al. [21] generated panoramas from videos based on the geo-metadata of the videos. Shen et al. [16] studied automatic tag annotation for geo-videos. In the spatial database field, some studies [2,15] focused on geo-video modeling and representation. Other studies [4,8,12,13,19,20] mainly focused on geo-video indexing and queries. Navarrete et al. [13] and Toyama et al. [19] utilized R-tree [5] and grid files, respectively, to index the camera locations of videos. In addition, others [4,8,12] focused on the indexing and query processing of geo-videos represented as FOV objects. Ay et al. [4] indexed FOV objects with R-tree, and Ma et al. [12] proposed a grid-based index for FOV objects. However, neither of these is efficient; their drawbacks are analyzed in Sec. 4.2. Kim et al. [8] presented an R-tree-based index called GeoTree for FOV objects. The difference between GeoTree and R-tree is that GeoTree stores Minimum Bounding Tilted Rectangles (MBTRs) in the leaf nodes. An MBTR is a long tilted rectangle parallel to the moving direction of the FOV stream enclosed in the MBTR. GeoTree is only suitable for indexing mobile videos whose moving directions do not change frequently. However, in real life, the moving directions of mobile videos change frequently, e.g., with Google Glass or a mobile phone. Further, GeoTree considers the moving direction of FOV objects instead of their orientations. Furthermore, it does not store the orientation information in the index nodes for filtering. Therefore, the existing work is neither efficient nor effective for indexing and querying geo-videos with both location and orientation information.

Since FOV objects are spatial objects with orientations, studies on directions are related to our work. The exploration of directional relationships between spatial objects has been widely researched [14,18], including absolute directions (e.g., north) and relative directions (e.g., left). Some work mainly studied direction-aware spatial queries [9–11]. Other studies focused on the moving directions of moving objects [8,17]. In our paper, the directions of FOV objects are inherent attributes of the objects rather than of the queries, and hence differ from the directions discussed in the above studies.

4. BASELINE METHODS
4.1 R-tree
One baseline for indexing FOV objects is R-tree [5], one of the basic and most widely used index structures for spatial objects. To index FOVs with R-tree, we index each FOV object based on the MBR of its visible scene. Consider the example in Fig. 3, in which f1, ..., f8 are FOV objects. The locations, orientations and visible distances of the FOVs are also given in Fig. 3. The R-tree for those FOVs is illustrated in Fig. 4. Since R-tree is based on the optimization of minimizing the area of the MBRs of FOV objects, the MBRs of the leaf nodes of the R-tree are the dashed rectangles in Fig. 3 (assuming a fanout of 2).

Range and directional queries based on R-tree. For the range query Qr in Fig. 3, we need to access all the index nodes (R1–R7) of the R-tree, since all of their MBRs overlap Qr. However, of these, only two FOV objects, f1 and f2, are results. For the directional query with the query direction interval Qd(0°–90°) and the query range Qr, we also need to access all of the R-tree nodes, since R-tree cannot support orientation filtering.

Figure 3: Sample dataset of FOV objects (query circle Qr and direction interval Qd shown on the map; leaf-node MBRs R1–R4 dashed), with the following FOV parameters:

FOV   f.p          f.Θ          f.R
f1    (4.3, 1.1)   330°–7°      8.5
f2    (5.2, 0.4)   340°–20°     8.4
f3    (7.0, 1.6)   20°–50°      8.0
f4    (8.0, 1.1)   24°–58°      8.5
f5    (4.6, 1.8)   260°–315°    3.0
f6    (5.8, 1.0)   263°–320°    2.7
f7    (11.0, 8.0)  170°–215°    6.0
f8    (12.6, 8.0)  170°–215°    6.0

Figure 4: R-tree for the dataset in Fig. 3 (leaf entries f1–f8 under nodes R1–R7; each node stores its MBR).

Figure 5: Dead spaces of an object and an index node of R-tree; the dashed area denotes the dead space. (a) FOV object f1 and MBR(f1); (b) index node R1 containing f1 and f5.
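Since this baseline indexes each FOV by the MBR of its visible scene, the following hedged sketch shows one way such an MBR can be computed for a pie-slice FOV, reusing the FOV record and local planar frame of the Sec. 2 sketch; the paper does not prescribe this particular construction.

```python
import math

def fov_mbr(f):
    """MBR of an FOV's pie-slice coverage. The extreme points are the
    camera location, the two arc endpoints, and any axis-aligned extreme
    of the view circle whose bearing falls inside the view angle."""
    def ray_end(theta):
        rad = math.radians(theta)
        # bearings are clockwise from north: x = sin, y = cos
        return (f.px + f.R * math.sin(rad), f.py + f.R * math.cos(rad))

    pts = [(f.px, f.py), ray_end(f.theta_b), ray_end(f.theta_e)]
    span = (f.theta_e - f.theta_b) % 360
    for bearing, pt in [(0,   (f.px, f.py + f.R)), (90,  (f.px + f.R, f.py)),
                        (180, (f.px, f.py - f.R)), (270, (f.px - f.R, f.py))]:
        if (bearing - f.theta_b) % 360 <= span:
            pts.append(pt)
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    return min(xs), min(ys), max(xs), max(ys)
```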
Hence, R-tree has the following drawbacks for indexing FOVs:

Dead space. Fig. 5 illustrates the "dead spaces" (or empty areas, i.e., the area that is covered by the MBR of an R-tree node but does not overlap with any objects in the subtree of the node [5]) of FOV f1 and R-tree node R1 in Fig. 3. Dead spaces cause false positives for range queries, and thus increase both index node accesses and CPU computation cost. Taking the range query in Fig. 3 as an example, due to the dead spaces of index nodes R3 and R4, the query needs to access R3 and R4, which are unnecessary accesses since neither the FOVs in R3 nor those in R4 are results.

Large MBRs. The area of the MBR of an R-tree node can be large due to the large visible scenes of the FOV objects enclosed in the node. With R-tree, large MBRs increase the number of accessed nodes for a given range query, since the decision whether to visit a node depends on whether its MBR overlaps the query area [5].

No orientation filtering. With a regular R-tree, there is no orientation information in the index nodes.

No orientation optimization. R-tree is constructed based on the optimization of minimizing the covering area of FOV objects, without considering their orientations.

4.2 Grid-based Index
Another competitive approach that considers the directions of FOV objects is the Grid-based Index [12]. It is a three-level grid-based index structure that indexes FOVs based on their viewable scenes, camera locations and view directions. Specifically, at the first level, it indexes FOVs in a coarse grid, where each grid cell, denoted as CL1(m, n), is a δ × δ square area, and the parameter δ determines the cell size. Each cell maintains the FOVs that overlap with the cell. At the second level, each first-level cell is divided into s × s subcells, where each subcell is denoted as CL2(f, g), and the parameter s decides the cell size at the second level. Each second-level cell maintains the FOVs whose camera locations are inside the cell. At the third level, it divides 360° into x° intervals in the clockwise direction starting from the North (0°). Each direction interval maintains a list of the FOV objects whose orientations overlap with the direction interval. For example, Fig. 6 shows the Grid-based Index of the FOV objects in Fig. 3.

Figure 6: The Grid-based Index of the FOV dataset in Fig. 3. (a) First and second levels of the grid; (b) index tables for the CL1 cells, the CL2 subcells and the CL3 direction intervals.

Range queries with the Grid-based Index. The range query algorithm follows the filter-refinement paradigm. 1) Filtering: among the first-level CL1 cells, it accesses the CL1 cells whose overlap area with the query region is larger than a certain threshold ϕ (e.g., 30%) to obtain candidate FOVs; for the CL1 cells whose overlap area is less than ϕ, it covers the overlap region with the second-level CL2 subcells and retrieves the neighboring CL2 subcells to obtain the candidate FOVs that overlap with the CL2 subcells. 2) Refinement: it checks the candidate FOV objects one by one to determine whether they are final results.
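As a minimal illustration of this three-level layout (our own sketch, not the implementation of [12]; the level-1 overlap test is simplified to the bounding box of the view circle), the structure might be populated as follows, reusing the FOV record from the Sec. 2 sketch:

```python
from collections import defaultdict

class GridIndex:
    """Sketch of the three-level structure of [12]: level 1 keys each FOV
    by every coarse cell its viewable scene may overlap, level 2 by the
    subcell containing its camera location, and level 3 by every
    view-direction interval its orientation overlaps."""

    def __init__(self, delta=250.0, s=4, interval=45):
        self.delta, self.s, self.interval = delta, s, interval
        self.l1 = defaultdict(list)   # (m, n)        -> FOV ids
        self.l2 = defaultdict(list)   # (m, n, f, g)  -> FOV ids
        self.l3 = defaultdict(list)   # interval id   -> FOV ids

    def insert(self, fid, fov):
        # Level 1: coarse cells overlapped by the view circle's bounding box
        for m in range(int((fov.px - fov.R) // self.delta),
                       int((fov.px + fov.R) // self.delta) + 1):
            for n in range(int((fov.py - fov.R) // self.delta),
                           int((fov.py + fov.R) // self.delta) + 1):
                self.l1[(m, n)].append(fid)
        # Level 2: subcell containing the camera location
        sub = self.delta / self.s
        self.l2[(int(fov.px // self.delta), int(fov.py // self.delta),
                 int((fov.px % self.delta) // sub),
                 int((fov.py % self.delta) // sub))].append(fid)
        # Level 3: x-degree direction intervals overlapped by the orientation
        n_ivals = 360 // self.interval
        tb = fov.theta_b % 360
        span = (fov.theta_e - tb) % 360
        first, last = int(tb // self.interval), int((tb + span) // self.interval)
        for k in range(first, last + 1):
            self.l3[k % n_ivals].append(fid)
```

The sketch makes the redundancy visible: each FOV is stored once per overlapped cell or interval at every level, which is also why Grid's space usage is the largest in Table 3 below.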
For example, for the range query Qr in Fig. 6, we need to access the first-level cell CL1(2,2) and all the second-level cells within distance R of CL2(2,5) or CL2(2,6), where R (= 8.5) is the maximum visible distance among the FOVs in Fig. 3.

Directional queries with the Grid-based Index. For a directional query, it first uses the cells at the first and second levels for range processing, and then accesses the third-level cells for orientation filtering. For example, for the directional query in our running example, it needs to visit all the first- and second-level cells accessed by the range query Qr above; besides those, it needs to access the third-level cells 0°–45° and 45°–90° that overlap with the query direction interval Qd. The Grid-based Index applies range filtering and orientation filtering separately, because it stores the location and orientation information at different levels.

However, the Grid-based Index has the following drawbacks:

It stores the location and orientation information at different levels, which is not efficient since video queries (e.g., directional queries) usually involve both the location and orientation information of FOVs at the same time during query processing.

It is not suitable for indexing FOVs with different zoom levels and camera lens properties. Such FOVs have different viewable distances, as the distances are calculated based on zoom levels and camera lenses [2]. In the refinement step of a range query, for each second-level cell that overlaps with the query range, it needs to visit all the CL2 cells within the MAXIMUM viewable distance among all the FOVs in the database, which results in many false positives.

It performs poorly for a skewed distribution of the camera positions of FOV objects, since the bucket occupancy of grid files rises very steeply for skewed distributions [6].

5. THE CLASS OF OR-TREES
To overcome the drawbacks of R-tree and Grid, we devise a class of new index structures combining the camera locations, orientations and viewable distances of videos.
5.1 Orientation Augmented R-tree: OAR-tree
Recall that with R-tree, using MBRs to estimate FOVs results in large MBRs, large "dead spaces" and the loss of orientation information. In this section, we introduce a new index called the Orientation Augmented R-tree (OAR-tree), based on smaller MBRs, reduced "dead spaces", and the incorporation of orientation information in the index nodes, to accelerate query processing.

In particular, in the leaf index nodes of an OAR-tree, instead of the MBR of each FOV object, we store three values and a pointer to the actual FOV object. Based on these, we can avoid the "dead spaces" of FOV objects and reduce false positives. Specifically, each leaf index node N of an OAR-tree contains a set of entries of the form (Oid, p, R, Θ), where, as discussed in Sec. 2.1, Oid is a pointer to an FOV in the database, p is the camera location of the FOV object, R is its visible distance, and Θ is its view orientation.

For internal index nodes, we replace 1) Oid with a pointer to the child node, 2) p with the MBR of all camera points in the child node, 3) R with an aggregate value representing all visible distances in the child node, and 4) Θ with an aggregate value representing all orientations in the child node. Specifically, each non-leaf index node N of an OAR-tree contains a set of entries of the form (Ptr, MBRp, MinMaxR, MBO), where:

Ptr is the pointer to a child node of N;

MBRp is the MBR of the camera locations of the FOVs in the subtree rooted at Ptr; as shown in Fig. 7, MBRp is obviously much smaller than the MBR of the FOVs in R-tree;

MinMaxR is a tuple <MinR, MaxR>, where MinR (resp. MaxR) is the minimum (resp. maximum) visible distance of the FOVs in the subtree rooted at Ptr;

MBO is the Minimum Bounding Orientation (MBO), defined in Definition 1 below, of the orientations of the FOVs in the subtree rooted at Ptr. From Definition 1, we can see that an MBO is a tuple <θb, θe>, which has the same form as the view orientation Θ of an FOV object. For example, the minimum bounding orientation of FOVs f3 and f4 in the running example of Fig. 3 is MBO(f3, f4), as shown in Fig. 8.

DEFINITION 1 (MINIMUM BOUNDING ORIENTATION (MBO)). Given a set of FOV orientations Ω = {Θi<θbi, θei>}, 1 ≤ i ≤ n, where n is the number of orientations in Ω, the Minimum Bounding Orientation MBO(Ω) = <θb, θe> is the minimum angle in the clockwise direction that covers all the orientations in Ω, i.e., ∠(θb, θe) = min_{θbi ∈ Ω} max_{θej ∈ Ω} ∠(θbi, θej).

Figure 7: MBR comparison between R-tree and OAR-tree (the OAR-tree bounds only the camera points of f3 and f4).

Figure 8: An MBO. f′4 is the translation of f4.

Fig. 9 illustrates the OAR-tree for the objects in Fig. 3. Taking the first entry E of node N5 as an example, entry E points to node N1, the child node of N5. N1 contains two FOVs, f1 and f5. N1.MBRp is [4.3, 4.6, 1.1, 1.8] (in the form [minx, maxx, miny, maxy]), which is the MBR of the camera locations f1.p = (4.3, 1.1) and f5.p = (4.6, 1.8). N1.MBO = <260°, 7°> is the MBO of f1.Θ = <330°, 7°> and f5.Θ = <260°, 315°>. Furthermore, N1.MinMaxR = [3.0, 8.5] holds the minimum and maximum of the viewable distances f5.R = 3.0 and f1.R = 8.5, respectively.

Figure 9: An OAR-tree example (leaf nodes N1–N4 under internal nodes N5–N7, each internal entry storing MBRp, MinMaxR and MBO aggregates).

Like the traditional R-tree [5], the OAR-tree is constructed based on the optimization heuristic of minimizing the enlarged area of the minimum bounding rectangles of the camera locations of the index nodes; i.e., the OAR-tree aims to group into the same higher-level node the index nodes whose subtrees contain FOVs with close camera locations. For example, based on this optimization, as shown in the OAR-tree in Fig. 9, FOVs f3 and f4 are grouped into node N3, and f7 and f8 into N4. Subsequently, for the running example in Fig. 3, compared with the baseline R-tree, with the OAR-tree we visit two fewer index nodes for the range query Qr (both N3 and N4 can be pruned), and one fewer index node for the directional query Qd (N4 can be pruned and all the FOVs in the subtree of N3 can be reported as results). It is non-trivial to process range and directional queries efficiently with the OAR-tree; we discuss the algorithms in Sec. 6.
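A small sketch of Definition 1 (ours, reusing the cw helper from the Sec. 2 sketch): since the optimal covering interval must start at one of the member beginnings, an O(n²) scan suffices.

```python
def mbo(orientations):
    """Minimum Bounding Orientation: the smallest clockwise interval
    <theta_b, theta_e> covering a set of orientation intervals, each
    given as (theta_b, theta_e) in degrees clockwise from north.
    (A set whose combined extent reaches 360 degrees would need
    special handling.)"""
    best_b, best_extent = None, None
    for b, _ in orientations:
        # extent so that every interval [b2, e2] fits inside an interval
        # starting at b: clockwise distance to b2 plus that interval's span
        extent = max(cw(b, b2) + cw(b2, e2) for b2, e2 in orientations)
        if best_extent is None or extent < best_extent:
            best_b, best_extent = b, extent
    return best_b, (best_b + best_extent) % 360
```

For example, mbo([(330, 7), (260, 315)]) returns (260, 7), matching N1.MBO in Fig. 9.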
The OAR-tree stores the MBRs of camera locations and incorporates the aggregate orientation and viewable-distance information of all the child nodes, to achieve smaller MBRs and orientation filtering. However, the OAR-tree is based only on the optimization of minimizing the covering area of the camera locations, which may result in many false positives for both range and directional queries. Similar to the "dead space" of an R-tree node, we formally define the "virtual dead space" of an OAR-tree node in Definition 2. Different from the dead space of an R-tree node, where the coverage of the node (i.e., its MBR) is stored, the virtual coverage of an OAR-tree node is not stored. However, both of them produce false positives for range queries. Fig. 10(a) shows the virtual dead space of the OAR-tree node containing f1 and f5, and Fig. 10(b) that of the node containing f1 and f2.

DEFINITION 2 (OAR-TREE NODE VIRTUAL DEAD SPACE). Given an OAR-tree node N(MBRp, MBO, MinMaxR), the virtual dead space of N is the area that is virtually covered by N but does not overlap with any FOVs in the subtree of N. The virtual coverage of N is the convex region such that any point in it can be covered by some FOV (p, Θ, R) with p ∈ N.MBRp, Θ ∈ N.MBO, and R ∈ N.MinMaxR.

Consider Fig. 10(a) again for the example in Fig. 3: FOV f1 is grouped together with f5 in the OAR-tree based on the camera-point optimization. However, if f1 is instead grouped together with f2, additionally considering orientation information, then the virtual dead space of the resulting OAR-tree node is significantly reduced, and so is the number of false positives. Based on this observation, we next discuss how to enhance the OAR-tree by considering orientation optimization during index construction.

Figure 10: Virtual dead spaces of OAR-tree nodes based on different optimizations; the dashed area indicates the virtual dead space. (a) Camera-point optimization (f1 with f5); (b) camera-point and orientation optimization (f1 with f2).

5.2 Orientation Optimized R-tree: O2R-tree
In this section, we propose a new index called the Orientation Optimized R-tree (O2R-tree) that optimizes based on both the covering area of the camera locations and the similarity in orientation.

The stored information of O2R-tree index nodes is the same as that of the OAR-tree. The main difference between the O2R-tree and the OAR-tree lies in the optimization criteria used during the merging and splitting of index nodes.

While the framework of the O2R-tree construction algorithm (see Algorithm 1) is similar to that of the OAR-tree, the procedures ChooseLeaf and Split consider additional orientation information and are presented as follows. The procedure ChooseLeaf chooses a leaf index node to store a newly inserted FOV object. ChooseLeaf traverses the O2R-tree from the root to a leaf node. When it visits an internal node N, it chooses the entry E of node N with the least Waste, given in Eqn (3), which combines the camera location and orientation information of the FOV objects. The procedure Split splits an index node N into two nodes when N overflows. We use the standard Quadratic Split algorithm [5] based on our proposed Waste function.

We proceed to compute Waste as a combination of the wastes of the camera locations and the view orientations. Given an O2R-tree entry E(MBRp, MinMaxR, MBO) and an FOV object f(p, R, Θ), the covering-area waste ∆Area(E, f) for the camera location is defined in Eqn (1):

∆Area(E, f) = Area(MBR(E, f)) − Area(E)    (1)

where Area(MBR(E, f)) is the area of the minimum bounding rectangle enclosing E.MBRp and f.p, and Area(E) is the area of the minimum bounding rectangle E.MBRp. The angle waste for the view orientation is computed by Eqn (2):

∆Angle(E, f) = ∠MBO(E.MBO, f.Θ) − ∠E.MBO    (2)

where ∠MBO(E.MBO, f.Θ) is the clockwise cover angle of the minimum bounding orientation enclosing E.MBO and f.Θ, and ∠E.MBO is the cover angle of E.MBO.

Combining Eqn (1) and Eqn (2) as a normalized linear combination, we compute the overall waste cost in Eqn (3):

Waste_lo(E, f) = βl · ∆Area(E, f) / max∆Area + βo · ∆Angle(E, f) / max∆Angle    (3)

In Eqn (3), max∆Area (resp. max∆Angle) is the maximum of ∆Area(Ei, Ej) (resp. ∆Angle(Ei, Ej)) over all entry pairs Ei and Ej, used to normalize the camera-location (resp. orientation) waste. The parameters βl and βo, 0 ≤ βl, βo ≤ 1, βl + βo = 1, are used to strike a balance between the area and angle wastes. A smaller Waste_lo(E, f) indicates that the entry E is more likely to be chosen for insertion of object f. Note that Eqn (3) naturally extends to compute the similarity between two entries of an O2R-tree.
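A hedged sketch of Eqns (1)–(3) follows, reusing the cw and mbo helpers above; the entry layout (MBRp as (minx, maxx, miny, maxy), MBO as an angle pair) and the caller-supplied normalizers are our assumptions, not the paper's data structures.

```python
def waste_lo(entry, f, beta_l=0.6, beta_o=0.4,
             max_d_area=1.0, max_d_angle=1.0):
    """Combined waste of Eqn (3) for inserting FOV f into `entry`.
    max_d_area / max_d_angle are assumed to be precomputed over all
    entry pairs by the caller, per Eqn (3)."""
    minx, maxx, miny, maxy = entry.MBRp
    # Eqn (1): enlargement of the camera-location MBR
    n_minx, n_maxx = min(minx, f.px), max(maxx, f.px)
    n_miny, n_maxy = min(miny, f.py), max(maxy, f.py)
    d_area = ((n_maxx - n_minx) * (n_maxy - n_miny)
              - (maxx - minx) * (maxy - miny))
    # Eqn (2): enlargement of the MBO's clockwise cover angle
    new_b, new_e = mbo([entry.MBO, (f.theta_b, f.theta_e)])
    d_angle = cw(new_b, new_e) - cw(entry.MBO[0], entry.MBO[1])
    return beta_l * d_area / max_d_area + beta_o * d_angle / max_d_angle
```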
Algorithm 1 Insert(R: an O2R-tree, E: a new leaf-node entry)
Output: the updated O2R-tree R.
 1: N ← ChooseLeaf(R, E)
 2: add E to node N
 3: if N needs to be split then
 4:   {N, NN} ← Split(N, E)
 5:   if N.isroot() then
 6:     initialize a new node M
 7:     M.append(N); M.append(NN)
 8:     store nodes M and NN  // N is already stored
 9:     R.RootNode ← M
10:   else AdjustTree(N.ParentNode, N, NN)
11: else store node N
12: if ¬N.isroot() then
13:   AdjustTree(N.ParentNode, N, null)

Procedure ChooseLeaf(R, E)
14: N ← R.RootNode
15: while N is not a leaf do
16:   E′ ← argmin_{Ei ∈ N} Waste_lo(Ei, E)  // least waste, Eqn (3)
17:   N ← E′.Ptr
18: return N

Procedure Split(N, E)
19: E1, E2 ← argmax_{Ei, Ej ∈ N ∪ {E}} Waste_lo(Ei, Ej)  // seed pair, Eqn (3)
20: for each entry E′ in N ∪ {E}, where E′ ≠ E1, E′ ≠ E2 do
21:   if Waste_lo(E′, E1) ≤ Waste_lo(E′, E2) then
22:     classify E′ into Group 1
23:   else classify E′ into Group 2
24: return Group 1 and Group 2

For the running example shown in Fig. 11, the O2R-tree groups f1 and f2 into node N1, and f5 and f6 into node N2. Hence, compared to the OAR-tree, the O2R-tree visits one fewer index node for the range query Qr (node N2 can be pruned), and one fewer index node for the directional query Qd (node N2 can be pruned).

Figure 11: Leaf nodes of the O2R-tree for the example in Fig. 3.
5.3 View Distance and Orientation Optimized R-tree: DO2R-tree
Considering only the camera location and orientation for optimization may still be insufficient. To illustrate this, consider Fig. 12: FOV f1 is packed with f2 into node N1 (Fig. 12(a)) based on the O2R-tree optimization, as their camera locations and orientations are the same. When the visible distances are additionally considered for optimization, f1 is packed with f3 into N1 (Fig. 12(b)), due to the high dissimilarity of the visible distances of f1 and f2 and the similarity of those of f1 and f3. Therefore, the range query Qr needs to visit two index nodes (i.e., N1 and N2 in Fig. 12(a)) under the O2R-tree optimization, whereas if we also consider the view distances for optimization, we only need to visit one node (i.e., N1 in Fig. 12(b)). Hence, we discuss how to construct the index based on an optimization criterion incorporating the view-distance information of FOV objects, and we call the new index the View Distance and Orientation Optimized R-tree (DO2R-tree).

Figure 12: Illustration of the optimization criteria with and without considering view distance, assuming a fanout of 2. (a) Without view distance; (b) with view distance.

Again, the stored information of DO2R-tree index nodes and the index construction framework are the same as those of the O2R-tree. The difference between the DO2R-tree and the O2R-tree is the optimization criterion. Hence, unlike the waste function in Eqn (3), the new waste function incorporates the view-distance differences, as given in Eqn (5). Given a DO2R-tree entry E(MBRp, MinMaxR, MBO) and an FOV object f(p, R, Θ), the waste of viewable distance ∆Diff(E, f) is defined in Eqn (4):

∆Diff(E, f) = Diff(E ∪ f) − Diff(E)    (4)

where Diff(E) is the difference between the maximum and minimum viewable distances of entry E, i.e., Diff(E) = E.MaxR − E.MinR, and Diff(E ∪ f) is the difference between the maximum and minimum viewable distances of the node enclosing the viewable distances of both E and f.

Combining the wastes of the camera-location area, the orientation cover angle, and the view-distance difference, we compute the overall waste cost in Eqn (5):

Waste_lod(E, f) = βl · ∆Area(E, f) / max∆Area + βo · ∆Angle(E, f) / max∆Angle + βd · ∆Diff(E, f) / max∆Diff    (5)

In Eqn (5), max∆Diff is the maximum of ∆Diff(Ei, Ej) over all entry pairs Ei and Ej, used to normalize the visible-distance waste. The parameters βl, βo and βd, 0 ≤ βl, βo, βd ≤ 1, βl + βo + βd = 1, are used to tune the impact of the three wastes. In particular, if βd = 0, then the DO2R-tree reduces to the O2R-tree, and if additionally βo = 0, then it becomes the OAR-tree.

6. QUERY PROCESSING
We proceed to present the query algorithms for range queries and directional queries, respectively, based on the DO2R-tree, which is the generalization of the three indexes discussed in Sec. 5.

6.1 Range queries
In this section, we develop an efficient algorithm to answer range queries. At a high level, the algorithm descends the DO2R-tree in a branch-and-bound manner, progressively checking whether each visited FOV object / index node overlaps with the range query object. Subsequently, the algorithm decides whether to prune an object / index node, or to report the FOV object / index node (i.e., all the FOV objects in the index node) as result(s). Before presenting the algorithm, we first present an exact approach to identify whether an FOV object overlaps with the range query object, and then exploit it to identify whether a DO2R-tree index node should be accessed, through two newly defined strategies: 1) a pruning strategy and 2) a total-hit strategy.

6.1.1 Search strategies for range queries
Let Qr(q, r) be the range query circle with center point q and radius r, and let f(p, R, Θ<θb, θe>) be an FOV object. We explain the exact approach to calculate whether FOV f overlaps with the query Qr. As shown in Fig. 13, there are four overlapping cases. Case 1) The FOV camera location f.p is within the query circle Qr; the formal condition is given in Eqn (6). Obviously, FOVs in this case must overlap with the query Qr. Case 2) f.p is outside of Qr, and the ray pq is within the FOV view orientation f.Θ. In this case, f overlaps with Qr iff Qr intersects the circle with center point p and radius R, as formally stated in Eqn (7). Case 3) f.p is outside of Qr, and the ray θb is between the rays pq and pg. In this case, f overlaps with Qr iff the segment pb intersects the arc fg, as formally stated in Eqn (8). Case 4) Analogously, f.p is outside of Qr, and the ray θe is between the rays pf and pq. In this case, f overlaps with Qr iff the segment pe intersects the arc fg, as formally stated in Eqn (9).

Figure 13: Overlap identification for an FOV object. (a) Case 1; (b) Case 2; (c) Case 3; (d) Case 4.

Therefore we can derive the lemma below to identify whether an FOV object overlaps with a range query circle.

LEMMA 1 (OVERLAP IDENTIFICATION FOR AN OBJECT). Given an FOV f(p, R, Θ<θb, θe>) and a range query Qr(q, r), f overlaps with Qr iff it satisfies Eqn (6), Eqn (7), Eqn (8), or Eqn (9):

|pq| ≤ r    (6)

|pq| ≤ r + R and ∠(θb, pq) + ∠(pq, θe) = ∠(θb, θe)    (7)

|pq|·cos∠(pq, θb) − √(r² − (|pq|·sin∠(pq, θb))²) ≤ R and ∠(θb, pq) + ∠(pq, θe) ≠ ∠(θb, θe)    (8)

|pq|·cos∠(pq, θe) − √(r² − (|pq|·sin∠(pq, θe))²) ≤ R and ∠(θb, pq) + ∠(pq, θe) ≠ ∠(θb, θe)    (9)
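For illustration, a direct transcription of Lemma 1 into code might look as follows (our sketch, reusing the FOV record and cw helper from the Sec. 2 sketch; Cases 3 and 4 are expressed as ray-circle intersection tests on the two boundary rays):

```python
import math

def fov_overlaps_circle(f, qx, qy, r):
    """Exact FOV/circle overlap test following the four cases of Lemma 1.
    Angles are degrees clockwise from north in a local planar frame."""
    dx, dy = qx - f.px, qy - f.py
    d = math.hypot(dx, dy)
    if d <= r:                        # Case 1: camera location inside Qr
        return True
    if d > r + f.R:                   # no overlap possible at this distance
        return False
    pq = math.degrees(math.atan2(dx, dy)) % 360   # bearing of ray pq
    if cw(f.theta_b, pq) <= cw(f.theta_b, f.theta_e):
        return True                   # Case 2: pq inside the view angle
    # Cases 3 and 4: does a boundary ray reach Qr within distance R?
    for theta in (f.theta_b, f.theta_e):
        beta = math.radians(min(cw(theta, pq), cw(pq, theta)))
        if d * math.sin(beta) <= r and math.cos(beta) > 0:
            entry = d * math.cos(beta) - math.sqrt(r**2 - (d * math.sin(beta))**2)
            if entry <= f.R:          # Eqns (8)/(9): entry point within R
                return True
    return False
```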
Based on Lemma 1, we develop our first strategy, the pruning strategy, to examine whether a DO2R-tree index node N can be pruned. In order to prune N without accessing the objects in its subtree, we first introduce two approximations in Definition 3. Let N(MBRp, <MinR, MaxR>, MBO<θb, θe>) be an index node in the DO2R-tree. As shown in Fig. 14, let <pbq, peq>, where pb, pe ∈ MBRp, be a ray tuple such that for every p ∈ MBRp, the ray pq lies between pbq and peq in the clockwise direction. We then have the following definition and lemmas.

DEFINITION 3 (MaxA AND MinA). The maximum (resp. minimum) cover angle in the clockwise direction from the MBO of the DO2R-tree index node N to the ray pq, denoted as MaxA(MBO, MBRp, q) (resp. MinA(MBO, MBRp, q)), is defined as:

MaxA(MBO, MBRp, q) = max(∠(θb, peq), ∠(pbq, θe))    (10)

MinA(MBO, MBRp, q) = 0 if MBO overlaps with <pbq, peq>; min(∠(θe, pbq), ∠(peq, θb)) otherwise    (11)

Figure 14: Illustration of MinA and MaxA. (a) Case 1; (b) Case 2.

LEMMA 2. For every f(p, Θ) ∈ N(MBRp, MBO) and every θ ∈ f.Θ, ∠(θ, pq) ≤ MaxA(MBO, MBRp, q) and ∠(θ, pq) ≥ MinA(MBO, MBRp, q).

PROOF. Since there are only the two cases of relationships between the MBO and <pbq, peq> shown in Fig. 14, Lemma 2 is obviously true.

LEMMA 3 (PRUNING STRATEGY). Index node N can be pruned if it satisfies Eqn (12), Eqn (13), or Eqn (14):

MinD(q, MBRp) ≥ r + MaxR    (12)

MinA(MBO, MBRp, q) ≥ arcsin(r / MinD(MBRp, q))    (13)

MinD(q, MBRp)·cos(MaxA(MBO, MBRp, q)) − √(r² − MinD²(MBRp, q)·sin²(MinA(MBO, MBRp, q))) ≥ MaxR    (14)

where MinD(MBRp, q) is the minimum distance from q to MBRp.

PROOF. For every f ∈ N, if Eqn (12) holds then the conditions of Eqn (6) and Eqn (7) in Lemma 1 cannot hold. Additionally, for every f ∈ N, if Eqn (13) (or Eqn (14)) holds then the conditions of Eqn (8) and Eqn (9) cannot hold, by Lemma 2. Therefore Lemma 3 is true.

We are now ready to discuss our second strategy. We call an index node N a "total hit" iff all the objects in N overlap with the query circle (i.e., they all belong to the results). This is a new concept that does not exist for regular R-trees. If an index node N is a "total hit", then it is not necessary to exhaustively check all the objects in N one by one, so the processing cost can be significantly reduced. Hence, based on the two approximations above, we propose a novel search strategy, the total-hit strategy, to examine whether an index node N is a total hit without accessing the FOV objects in the subtree of N.

LEMMA 4 (TOTAL-HIT STRATEGY). All the FOVs in the subtree of N can be reported as results if N satisfies Eqn (15), or all of the equations (16), (17) and (18):

MaxD(q, MBRp) ≤ r    (15)

MaxD(q, MBRp) ≤ r + MinR    (16)

MaxA(MBO, MBRp, q) ≤ arcsin(r / MaxD(MBRp, q))    (17)

MaxD(q, MBRp)·cos(MinA(MBO, MBRp, q)) − √(r² − MaxD²(MBRp, q)·sin²(MaxA(MBO, MBRp, q))) ≤ MinR    (18)

where MaxD(MBRp, q) is the maximum distance from q to MBRp.

PROOF. The proof is similar to that of Lemma 3 and is omitted.

6.1.2 Search algorithm
Based on the two strategies discussed above, we develop an efficient algorithm to answer range queries (see Algorithm 2). The algorithm descends the DO2R-tree in a branch-and-bound manner, progressively applying each of the strategies to answer the range query.

6.2 Directional queries
We next explain an efficient algorithm (see Algorithm 3) for processing directional queries with the DO2R-tree.
Given a direction interval Qd, we can easily decide whether the orientation of a DO2R-tree index node overlaps with Qd, using the orientation information in the DO2R-tree nodes. Similar to the range query algorithm, Algorithm 3 also proceeds in a branch-and-bound manner, progressively applying the search strategies to answer the directional query. Note that we apply the range search strategies and the orientation filtering methods at the same time to decide whether a DO2R-tree index node should be pruned or is a "total hit" (Lines 5 and 7).

Algorithm 2 RangeQuery(R: DO2R-tree root, Qr: range circle)
Output: all objects f ∈ results.
1: initialize a stack S with the root of the DO2R-tree R
2: while ¬S.isEmpty() do
3:   N ← S.top(); S.pop()
4:   if N satisfies Eqn (12) ∨ Eqn (13) ∨ Eqn (14) then
5:     prune N  // Lemma 3
6:   else if N satisfies Eqn (15) ∨ (Eqn (16) ∧ Eqn (17) ∧ Eqn (18)) then
7:     results.add(N.subtree())  // Lemma 4
8:   else for each child node ChildN of N do
9:     S.push(ChildN)

Algorithm 3 DirectionalQuery(R: DO2R-tree root, Qd: direction interval, Qr: range circle)
Output: all objects f ∈ results.
1: initialize a stack S with the root of the DO2R-tree R
2: while ¬S.isEmpty() do
3:   N ← S.top(); S.pop()
4:   if Qd is disjoint from N.MBO, or N satisfies Eqn (12) ∨ Eqn (13) ∨ Eqn (14) then
5:     prune N
6:   else if Qd covers N.MBO and N satisfies Eqn (15) ∨ (Eqn (16) ∧ Eqn (17) ∧ Eqn (18)) then
7:     results.add(N.subtree())
8:   else for each child node ChildN of N do
9:     S.push(ChildN)
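A compact sketch of the traversal shared by Algorithms 2 and 3 (ours): `can_prune` and `total_hit` are hypothetical callables standing in for the tests of Lemmas 3 and 4, `overlaps` for the exact Lemma 1 test, and nodes are assumed to expose `is_leaf`, `children`, `fovs` and `subtree()`.

```python
def range_query(root, q, can_prune, total_hit, overlaps):
    """Branch-and-bound traversal in the style of Algorithm 2;
    q is the query circle (qx, qy, r)."""
    results, stack = [], [root]
    while stack:
        node = stack.pop()
        if can_prune(node, q):            # Lemma 3: nothing below can match
            continue
        if total_hit(node, q):            # Lemma 4: everything below matches
            results.extend(node.subtree())
        elif node.is_leaf:
            results.extend(f for f in node.fovs if overlaps(f, *q))
        else:
            stack.extend(node.children)
    return results
```

The directional variant of Algorithm 3 is obtained by strengthening `can_prune` with the Qd-disjointness test and `total_hit` with the Qd-coverage test.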
7. ANALYSIS
We analyze the maximum improvement space of range FOV queries over R-trees.

LEMMA 5. Assume that the camera locations and orientations of N FOV objects are uniformly distributed over an area and over 0°–360°, respectively, and that the viewable distances and viewable angles of the FOVs are the same. Compared with the range query algorithm with R-tree in Sec. 4.1, the I/O cost of an optimal algorithm with an optimal index for range queries is at most 66.7% lower.

PROOF. Given a range query Qr(q, r) and an arbitrary FOV f, if the MBR of f, f.mbr, overlaps with Qr, then there is at most a 50% probability that f is a false positive, since the area of the "dead space", as shown in Fig. 5, can be proved to be at most half of that of f.mbr. Let M be the number of FOVs whose MBRs overlap with Qr. Then the number of R-tree nodes that need to be accessed is (M/f)(1 + Σ_{i=1}^{h} 1/f^i) ≤ 3M/(2f), where f is the fanout and h the height of the R-tree. The optimal search algorithm with an optimal index (i.e., accessing only results) accesses at least M/(2f) nodes, and hence visits at most (3M/(2f) − M/(2f)) / (3M/(2f)) = 66.7% fewer nodes than the approach with R-tree. Hence, Lemma 5 is true.

LEMMA 6. Assume that the camera locations and orientations of N FOV objects are uniformly distributed over an area and over 0°–360°, respectively, and that the viewable distances and viewable angles of the FOVs are the same. The improvement of the optimal algorithm based on the optimal index for directional queries is IDopt = 2·r·s1/(r + s1)² + (1 − (θe − θb)/360)·(r − s1)²/(r + s1)², where r is the query range radius; s1, the side length of a leaf node, is √((f − 1)/N) + s0; N is the total number of FOVs; f is the fanout of the R-tree; s0 is the average side length of the MBRs of FOV objects; and [θb, θe] is the query direction range.

PROOF. Obviously, IDopt ≥ IRopt, the corresponding improvement for range queries.

8. EXPERIMENTS
We conducted experimental studies to evaluate the efficiency of our methods using the two fundamental queries: range and directional.

8.1 Experimental Settings
Implemented indexes. We implemented our proposed indexes and search algorithms (OAR-tree, O2R-tree, and DO2R-tree) for range and directional queries. In addition, we implemented two baselines for comparison: R-tree and the Grid-based Index.

Datasets and queries. Their performance was evaluated using two types of datasets: a Real-World (RW) dataset and Synthetically Generated (Gen) datasets, as shown in Table 1. RW was collected by more than 100 people, mainly in Los Angeles and Singapore: around 50% of the videos at the University of Southern California (USC) and 30% in downtown Singapore and at the National University of Singapore (NUS). The distribution of FOVs in RW is very skewed. To evaluate the scalability of our solutions, we synthetically generated five Gen datasets with data sizes growing in log scale from 0.1M (million) to 1B (billion) FOVs, using the mobile video generation algorithm presented in [3]. In Gen, FOVs were uniformly distributed across a 10km × 10km area around USC, and their view distances were randomly distributed between 100 and 500 meters; see Table 2. Unless specified otherwise, the 1M generated dataset is assumed by default in the reported experimental results.

For each (RW and Gen) dataset, we generated 5 query sets for range queries by increasing the query radius from 100 to 500 meters in steps of 100 meters. Each range query set contained 10,000 queries with different query locations but the same query radius. For Gen, query points were uniformly distributed in the 10km × 10km area around USC. For RW, half of the range queries were uniformly distributed in the 10km × 10km area around USC, while the other half were distributed in a 10km × 10km area in Singapore. Again, unless specified otherwise, a query radius of 200 meters is assumed by default in the rest of the discussion. Similarly, for each dataset, we generated query sets for directional queries, varying the query direction interval from 45° to 315° in steps of 45°.

Setup and metrics. We implemented all the indexes on a server with an Intel Core 2 Duo CPU E8500 @ 3.16GHz and 4GB of RAM, and used a page size of 4KB. We evaluated their performance on disk-resident data. The fanout of R-tree was 170, and the fanouts of OAR-tree, O2R-tree and DO2R-tree were 102. For the Grid-based Index baseline, following the settings reported in [12], we set the first-level cell size to 250 meters, the second-level cell size to 62.5 meters, and the view-direction interval to 45°. As the metrics of our evaluation, we report the average query processing time and the average number of page accesses per query after executing the 10,000 queries in each set.

Table 1: Datasets for the experiments
Statistics                                   RW         Gen
total # of FOVs                              0.2M       0.1M ~ 1B
total # of videos                            1,276      100 ~ 1M
FOV # per second                             1          1
average time per video                       3 mins     0.28 hours
total time                                   62 hours   27.8 hours ~ 31.7 years
average camera moving speed (meters/s)       1.25       1.35
average camera rotation speed (degrees/s)    10         10
average viewable distance R (meters)         100        250
average viewable cover angle (degrees)       51         60

Table 2: Synthetically generated datasets
total FOV #    0.1M        1M         10M         100M       1B
total video #  100         1K         10K         100K       1M
total time     27.8 hours  11.6 days  115.7 days  3.2 years  31.7 years

Space requirement. The space usage of the index structures for the different datasets is reported in Table 3. The space requirements of OAR-tree, O2R-tree and DO2R-tree were almost identical, so we report only the space usage of DO2R-tree. DO2R-tree requires a little more space than R-tree, since it needs extra space to store the orientation and viewable-distance information in each index node. However, the space usage of DO2R-tree was significantly less (about 5 times less) than that of Grid, because Grid redundantly stores each FOV at each level.
Table 3: Sizes of indexing structures (Megabytes)
             RW      0.1M    1M      10M     100M     1B
R-tree       8.66    3.41    32.69   289     2,536    23,978
Grid         42.02   16.70   163     1,576   16,519   163,420
DO2R-tree    9.75    5.51    60.25   379     3,784    38,299

8.2 Evaluation of range queries
In this set of experiments, we evaluated the performance of the indexes for range queries using the Gen datasets. First, Fig. 15 reports the average number of page accesses for all indexes; Fig. 15(b) shows the same results but focuses on data sizes from 0.1M to 10M. Note that the data size is shown in log scale, while the page access number is shown in linear scale. The most important observation is that both O2R-tree and DO2R-tree significantly outperformed the other indexes, which demonstrates the superiority of our optimized approaches. O2R-tree (resp., DO2R-tree) accessed around 40% (resp., 50%) fewer pages than Grid, and around 50% (resp., 60%) fewer than R-tree. The improvement of DO2R-tree is very close to the theoretical maximum improvement (66.7%, as analyzed in Sec. 7), clearly showing the effectiveness of the orientation and view-distance optimizations and search algorithms of O2R-tree and DO2R-tree. Another observation is that OAR-tree needed somewhat more page accesses than R-tree. This is expected, because OAR-tree is based only on the optimization of minimizing the covering area of the camera locations. The virtual dead space of an OAR-tree node can be larger than the dead space of an R-tree node, so OAR-tree may produce more false positives than R-tree, even though it incorporates view orientation and distance information into the index nodes for filtering. This demonstrates that it is not the mere inclusion of orientation, but optimization criteria that consider orientation, that significantly reduce the dead spaces of tree nodes and consequently the number of false positives. In addition, DO2R-tree was slightly better than O2R-tree for range queries, since optimization incorporating the additional viewable-distance information helps accelerate range queries. Overall, the number of page accesses increased linearly with the dataset size, i.e., the number of FOVs.

Figure 15: Page accesses of range queries. (a) 0.1M–1B; (b) 0.1M–10M.

Second, Fig. 16 reports the average query processing time in the above experiments. The overall performance improvement showed a similar trend as in Fig. 15, but one can observe that O2R-tree and DO2R-tree provided a better percentage reduction relative to R-tree and Grid than in the case of page accesses. This is because the "total hit" search strategy applied to our indexes reduces processing time, while R-tree (resp., Grid) needs to check all the FOV objects in a node (resp., cell) even when all the FOVs in it are results.

Figure 16: Query processing time of range queries.

Finally, Fig. 17 reports the impact of the query radius on the average page accesses while varying the query radius from 100 to 500 meters. Other than for Grid, the performance trend held for all indexes as the query radius increased. One can clearly see that the page access number of Grid increases rapidly and is significantly greater than those of the other indexes once the radius exceeds 200 meters. This is because the first-level cell size of Grid was set to 250 meters in our experiments, so Grid needs to access more first-level cells, each of which contains a larger number of objects. Grid is not flexible with respect to queries with various radii.

Figure 17: Impact of the query radius on range queries.

8.3 Evaluation of directional queries
In this set of experiments, we evaluated the performance of the indexes for directional queries using the same datasets. We used the default query radius of 200 meters in these experiments.
As shown in Fig. 18, the overall performance enhancement by O2R-tree and DO2R-tree was clear and similar to the results for range queries. In fact, their percentage reduction was greater for directional queries than for range queries. Specifically, O2R-tree (resp., DO2R-tree) accessed about 70% (resp., 65%) fewer pages than Grid, and about 67% (resp., 63%) fewer than R-tree. This demonstrates that the orientation optimization in building O2R-tree and DO2R-tree was, as expected, more effective in supporting directional queries. Note that O2R-tree performed slightly better than DO2R-tree for directional queries, since the extra optimization of minimizing the view-distance information in DO2R-tree slightly hinders the effectiveness of its directional query processing, while O2R-tree focuses mainly on orientation.

Figure 18: Page accesses of directional queries. (a) Varying the dataset size; (b) varying the query direction interval.

Next, we evaluated the impact of the query direction interval. In Fig. 18(b), as expected, the page access number of R-tree was not influenced by the query direction interval. The page access numbers of the other indexes increased slightly as the interval increased. Clearly, O2R-tree and DO2R-tree demonstrated at least 2 times better performance than the others. The page access number of Grid grows much faster than those of the other indexes, because Grid needs to visit its third level for orientation filtering for each candidate second-level cell, and the number of candidate cells at the third level increases linearly as the query direction interval grows.

8.4 Impact of weight parameters
This set of experiments aimed to evaluate how our optimization criteria (i.e., location, orientation, and view distance) and their balance impact the index performance. First, we built O2R-trees with the weight of the orientation difference βo (in Eqn 3) varying from 0 to 1 (BetaO in Fig. 19). Fig. 19 shows the number of page accesses of O2R-trees built with different βo, for range queries on the 1M FOV dataset. Note that the case of βo = 0 is actually an OAR-tree, which focuses mainly on locations in building the tree, and its performance is among the worst. This is because simply augmenting nodes with orientation, without optimization, does not help. As we increase βo, i.e., apply more optimization using orientation, the performance of the O2R-tree becomes significantly better in the mid-range of βo and is best around βo = 0.4.

Figure 19: Varying βo in O2R-tree.
How- ever, close to the other extreme case of βo = 1 (i.e., β l =0), only 0 10000 20000 30000 40000 50000 60000 0.1M 1M 10M 100M 1B # of page accesses dataset size R-tree Grid OAR-tree O2R-tree DO2R-tree (a) 0.1M1B 0 100 200 300 400 500 600 700 800 0.1M 1M 10M # of page accesses dataset size R-tree Grid OAR-tree O2R-tree DO2R-tree (b) 0.1M100M Figure15: Pageaccessesofrangequeries 0 0.5 1 1.5 2 2.5 3 3.5 4 0.1M 1M 10M 100M 1B Query time (seconds) datasize R-tree Grid OAR-tree O2R-tree DO2R-tree Figure 16: Query processing timeofrangequeries 0 100 200 300 400 500 600 700 800 100 200 300 400 500 # of page accesses query radius (meter) R-tree Grid OAR-tree O2R-tree DO2R-tree Figure 17: Impact of radius in rangequeries 0 10000 20000 30000 40000 50000 60000 0.1M 1M 10M 100M 1B # of page accesses dataset size R-tree Grid OAR-tree O2R-tree DO2R-tree (a) Varyingdatasetsize 20 40 60 80 100 120 45 90 135 180 225 270 315 # of page accesses Query direction interval (degree) R-tree Grid OAR-tree O2R-tree DO2R-tree (b) Varyquerydirectioninterval Figure18: Directionalqueries 20 40 60 80 100 120 0 0.2 0.4 0.6 0.8 1 # of page accesses BetaO in O2R-tree O2R-tree Figure19: Varyβo inO 2 R-tree 20 40 60 80 100 120 0 0.2 0.4 0.6 0.8 1 # of page accesses BetaD in DO2R-tree DO2R-tree Figure20: Varyβ d inDO 2 R-tree orientationdifferenceisconsideredinbuildingO 2 R-treesotheper- formanceofrangequerysuffers. Agoodbalancebetweenareaand anglewastescanproducethebestresult. Next,webuiltDO 2 R-treeswithdifferentweightofviewabledis- tance difference β d (in Eqn 5) from 0 to 1 (BetaD in Fig. 20). Learned from the previous result, we setβ l = 0.6(1β d ) and βo = 0.4(1β d )toassumethebestcaseofthem. Whenβ d = 0, thetreeisactuallytheO 2 R-treewithβo = 0.4. Whenβ d = 1,only viewable distance difference is considered. Fig. 20 shows that the performance of DO 2 R-tree was comparable (best whenβ d = 0.2) when β d 0.6 but it suffered as β d approached 1. In this study, the distance difference among FOVs were not big so the impact of β d was minimal. For a specific application, we can find good parametervaluesempirically. 8.5 EvaluationusingRWdataset To evaluate the effectiveness of our indexes and search algo- rithms in real world environments, we also conducted a set of ex- periments using the real world (RW) dataset, which is in skewed distribution. AsshowninFig.21,theresultsshowsthesametrends as in the previous experiments. Comparing with R-tree, O 2 R-tree and DO 2 R-tree provide the same improvements. Comparing with Grid, O 2 R-tree and DO 2 R-tree provide more improvements with skewedly distributed RW dataset. Specifically, O 2 R-tree (resp. DO 2 R-tree) accesses about 45% (resp., 55%) less disk pages than Grid for range queries, and about 75% (resp., 70%) less for direc- tionalqueries. Itisbecausethebucketoccupancyofgridfilesrises verysteeplyforskeweddistribution. Tosummarize,theexperimentalresultsdemonstratethatourpro- posed indexes O 2 R-tree and DO 2 R-tree and corresponding search algorithmsoutperformthetwobaselineindexesforbothrangeand directional queries; and the stored orientation information in the OAR-treenodecanfacilitatetheperformanceofdirectionalqueries. 9. CONCLUSIONANDFUTUREWORK Thispaperrepresentsvideodataasaseriesofspatialobjectswith orientations,i.e.,FOVobjects,andproposedaclassofR-tree-based indexes that can index location, orientation and distance informa- tion of FOVs for both filtering and optimization. 
8.5 Evaluation using RW dataset

To evaluate the effectiveness of our indexes and search algorithms in real-world environments, we also conducted a set of experiments using the real-world (RW) dataset, whose distribution is skewed. As shown in Fig. 21, the results show the same trends as in the previous experiments. Compared with R-tree, O2R-tree and DO2R-tree provide improvements similar to those observed before; compared with Grid, they provide even larger improvements on the skewed RW dataset. Specifically, O2R-tree (resp., DO2R-tree) accesses about 45% (resp., 55%) fewer disk pages than Grid for range queries, and about 75% (resp., 70%) fewer for directional queries. This is because the bucket occupancy of grid files rises very steeply for skewed distributions.

[Figure 21: Comparison on the RW dataset (y-axis: number of page accesses; range and directional queries for R-tree, Grid, OAR-tree, O2R-tree, and DO2R-tree).]

To summarize, the experimental results demonstrate that our proposed indexes, O2R-tree and DO2R-tree, together with the corresponding search algorithms, outperform the two baseline indexes for both range and directional queries, and that the orientation information stored in OAR-tree nodes facilitates the performance of directional queries.

9. CONCLUSION AND FUTURE WORK

This paper represents video data as a series of spatial objects with orientations, i.e., FOV objects, and proposes a class of R-tree-based indexes that index the location, orientation, and distance information of FOVs for both filtering and optimization. In addition, our indexes can flexibly support user-generated videos that contain ad-hoc FOVs with potentially skewed distributions. Further, two novel search strategies were proposed for fast video range and directional queries on top of our index structures. Our indexes are not specific to FOV objects with sector shapes; they can be used to index any spatial object with orientation of other shapes, such as vectors, triangles, and parallelograms. We intend to extend this work in two directions. First, we intend to extend our indexes to the cloud for even larger sets of video data. Second, we would like to study the insertion and update costs of our indexes and investigate techniques for batch insertion of videos.