CITY-SCALE AERIAL LIDAR POINT CLOUD VISUALIZATION

by

Zhenzhen Gao

A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(COMPUTER SCIENCE)

August 2014

Copyright 2014 Zhenzhen Gao

Acknowledgements

My sincere appreciation and thanks to my adviser Prof. Ulrich Neumann. His advice and guidance have been invaluable, and his scientific knowledge is both motivating and inspiring. Many thanks to Prof. Suya You, Prof. C.-C. Jay Kuo, Prof. Aiichiro Nakano and Prof. Cyrus Shahabi for taking their precious time to serve on the committee of my qualifying exam and thesis defense. I benefited a lot from their brilliant comments and suggestions.

I am very grateful to the people I have worked with or shared knowledge with. Thanks to Dr. Qian-Yi Zhou, Dr. Luciano Nocera, Dr. Sang Yun Lee, Dr. Jongmoo Choi and Dr. Saty Raghavachary for encouraging my research and for helping me to grow as a research scientist. Thanks to Dr. Parag Havaldar for priceless advice for my career. I would especially like to thank the current and previous CGIT lab members who shared a great time with me: Kelvin Chung, Charalambos 'Charis' Poullis, Jonathan Mooser, Lu Wang, Quan Wang, Tanasai Sucontphunt, Wei Guan, Guan Pang, Yi Li, Jing Huang, Jiaping Zhao, Yili Zhao and Rongqi Qiu.

A special thanks to my family. Words cannot express how grateful I am to my beloved husband Miao Wang for all of the sacrifices that he has made on my behalf. I am grateful to my mother, father, grandparents, brother and sister-in-law; their support for me was what sustained me thus far. I would also like to thank all of my friends who supported me in life and research, and incented me to strive towards my goal: Shuai Hao, Zhisen Jiang, Bei Pan, Na Chen, Zaoshi Yuan, Rong Xiang, Elisabeth Zhu, Xuemei Zhao, Dian Gong, Rong Yang, Zhi Tong and many others.

Finally, I acknowledge Airborne 1 Corp. and Sanborn Corp. for providing data, and the Center for Interactive Smart Oilfield Technologies (CiSoft) for financial support.

Contents

Acknowledgements
List of Figures
List of Tables
Abstract

Chapter 1  Introduction
  1.1  Visually-Complete Aerial LiDAR Point Rendering
  1.2  Fusing Oblique Imagery with Augmented Aerial LiDAR
  1.3  Feature Enhancing Aerial LiDAR Point Cloud Refinement
  1.4  Real-time Refinement for Aerial LiDAR Cities
  1.5  Outline

Chapter 2  Related Work
  2.1  Visually-Complete Aerial LiDAR Point Rendering
    2.1.1  Triangles
    2.1.2  Points
    2.1.3  Splats
  2.2  Fusing Oblique Imagery with Augmented Aerial LiDAR
    2.2.1  2D-3D Registration
    2.2.2  Texture Mapping on Triangle Meshes
    2.2.3  Color Mapping on Points
    2.2.4  Data Management
  2.3  Feature Enhancing Aerial LiDAR Point Cloud Refinement
    2.3.1  Normal Estimation
    2.3.2  Smoothing
    2.3.3  Up-sampling
  2.4  Real-time Refinement for Aerial LiDAR Cities
    2.4.1  Hybrid Representation
    2.4.2  Real-time Point Cloud Refinement

Chapter 3  Visually-Complete Aerial LiDAR Point Rendering
  3.1  Framework Overview
  3.2  Data Management
  3.3  Pre-processing
    3.3.1  Classify Ground, Trees and Buildings
    3.3.2  Augment Ground Points Under Trees
    3.3.3  Label Building Boundaries
    3.3.4  Estimate Normals
    3.3.5  Estimate Wall Normals
    3.3.6  Augment Wall Points
  3.4  Rendering
    3.4.1  Generate Wall Quadrangles
    3.4.2  Generate Wall ARPs
    3.4.3  Rendering Extension
  3.5  Experimental Results
    3.5.1  Performance
    3.5.2  Image Quality
    3.5.3  Comparison with Other Methods
  3.6  Summary

Chapter 4  Fusing Oblique Imagery with Augmented Aerial LiDAR
  4.1  Framework Overview
  4.2  Augmentation
  4.3  Data Management
    4.3.1  Data Structures
    4.3.2  Out-of-core Strategy
    4.3.3  GPU Splatting
  4.4  Color Mapping
    4.4.1  Rendering Pass
    4.4.2  Accumulation Pass
    4.4.3  Color Normalization
  4.5  Experiments and Discussion
  4.6  Summary

Chapter 5  Feature Enhancing Aerial LiDAR Point Cloud Refinement
  5.1  Preparation
    5.1.1  Normal Estimation
    5.1.2  Boundary Detection
  5.2  Smoothing
    5.2.1  Smoothing for all points
    5.2.2  Smoothing for boundary points
  5.3  Up-sampling
    5.3.1  Interior point up-sampling
    5.3.2  Boundary point up-sampling
    5.3.3  Feature enhancing extension
  5.4  Experimental Results
  5.5  Summary

Chapter 6  Real-time Refinement for Aerial LiDAR Cities
  6.1  Multi-resolution Hierarchy
    6.1.1  Continuous Data Set
    6.1.2  Discrete Data Set
  6.2  Hybrid Representation
    6.2.1  Polygon Conversion
    6.2.2  Polygon Simplification
    6.2.3  Hybrid Representation for MRQ Non-Leaf Nodes
  6.3  Rendering
    6.3.1  LOD Selection
    6.3.2  Hybrid Rendering
    6.3.3  Smooth LOD Transitions
  6.4  Real-time Refinement
    6.4.1  Initialization
    6.4.2  Up-sampling
    6.4.3  Re-arrangement
  6.5  Experiments
    6.5.1  Hybrid Rendering
    6.5.2  Real-time Refinement
  6.6  Summary

Chapter 7  Conclusion

Reference List

List of Figures

1.1  Overview of the proposed framework. Two kinds of input data are shown on the left; the other four are framework components.
1.2  Demonstration of the 2.5D nature of aerial LiDAR point cloud. Image courtesy of Qian-Yi Zhou.
1.3  Shaded views of a point cloud color coded by classes. Green trees, brown ground and pink buildings. (a) Using basic PBR, white holes are visible where wall points and ground points under trees are missing. (b) Using the presented extended PBR, holes are effectively filled with recovered missing points: blue walls and brown ground.
1.4  Comparison of (a) rendering of original aerial LiDAR points, and (b) rendering of colored augmented aerial LiDAR points generated by the color mapping approach. The added colors greatly improve appearance, 3D perception and interpretation of the scene.
1.5  Rendering of a building roof. (a) Raw points. (b) Refined points. The refinement effectively alleviates artifacts of the original data.
1.6  Point clouds of buildings with various roofs.
1.7  The effect of refinement. (a) Original aerial LiDAR points. (b) Before refinement, edges are fuzzy. (c) After refinement, edges are sharp. (b) and (c) are rendered by visually-complete rendering [1]. Brown for ground, dark grey for buildings, light blue for augmented building walls, and green for trees.
1.8  A comparison between rendering of Gao et al. [1] (b) and the presented framework (c) on the data set of (a). Our framework provides comparable visual quality but with improved frame-rates. pts for points and fps for frames per second. Brown for ground; dark grey for buildings; and light blue for walls.
2.1  Common representations used to render point clouds: triangles (left), points (middle) and splats (right). The same data set is used for all three representations.
2.2  Rendering of urban area of Toronto using points and TINs. From [2].
2.3  Rendering of urban area of Toronto using splats, taken from Kovac et al. [3]. Missing points on building walls result in holes.
2.4  Texture mapped triangle meshes of a residential area. Vegetation is fake-looking. From Ding et al. [4].
2.5  Color mapping on extremely dense point cloud. From Pintus et al. [5].
2.6  Feature preserving filtering. From Duguet et al. [6].
2.7  Existing refinement methods cannot handle position discontinuous features. Rough boundaries are produced as a result. From Huang et al. [7].
2.8  Point cloud up-sampling. From Huang et al. [7].
2.9  POP system chooses a different number of points and triangles based on viewing location (red for points; blue for triangles). From Chen et al. [8].
3.1  Pre-processing and rendering for basic PBR (yellow), and for the presented completion methods: augmented wall points (green), procedurally generated quadrangle walls (blue) and procedurally generated ARP walls (red).
3.2  Comparison of provided wall geometries. The pink dot is a 3D point; the blue dots and surfaces are added new data. (a) Point-wall. (b) Quadrangle-wall. (c) ARP-wall.
3.3  Demonstration of augmenting ground points under trees. Tree points are rendered in green and ground points in brown. (a) Before. (b) After.
3.4  Various cases of boundary points.
3.5  Geometric interpretation of Principal Component Analysis.
3.6  Rendering of two overlapping wall pieces with different normals. (a) Using the same solid color. (b) Using the same gradient color. Aliasing on overlapped areas is alleviated by gradient blending.
3.7  Comparison of ARP-walls (red rectangles) and quadrangle-walls (blue segments) over the same group of boundary points (black dots), viewed from top. When viewed along the directions of the yellow segments, gaps are found in quadrangle-walls, while ARP-walls are watertight in all directions.
3.8  Different wall representations. (a) Original roof-only point cloud. (b) Augmented with point-walls. (c) Augmented with quadrangle-walls. (d) Augmented with ARP-walls.
3.9  Denver area. (a) Rendering with quadrangle-walls and LOD factor f = 4. (b) Aerial image from Microsoft Bing Maps. The rendering provides clear building contours that match well with the image. Vegetation is different as the two data sets were not captured at the same time.
3.10  Atlanta area. (a) Rendering with quadrangle-walls and LOD factor f = 4. (b) Aerial image from Microsoft Bing Maps. The rendering provides clear building contours that match well with the image. Vegetation is different as the two data sets were not captured at the same time.
3.11  Rendering comparison of Los Angeles area. (a) Using our rendering with quadrangle-walls, LOD is disabled. Trees are colored in green. (b) Visualization of the full geometric models from Zhou et al. [9] using a ray tracer and shadow maps, image courtesy of Qian-Yi Zhou. (a) provides comparable visual cues with (b).
4.1  The color mapping framework. Operations are in bold.
4.2  The overview of the USC campus area using colored augmented points generated by the presented color mapping and rendering system.
4.3  Comparison of (a) rendering of colored points and (b) a full-size oblique image. (a) achieves comparable visual cues to the ground truth image.
4.4  Comparison of (a) rendering of colored points and (b) visualization of textured polygonal models. (a) provides added visual cues from realistic roads and vegetation.
4.5  Comparison of different point data resolution. (a) Using the sparse data. (b) Using the dense data. The color mapping quality increases with point cloud density.
5.1  Normal estimation comparison of isotropic and anisotropic neighborhoods, under different noise levels. Wedges have various angles: 30 (left two columns), 90 (middle two columns) and 150 (right two columns). Points with errors larger than 10 are colored in red; colors of the rest of the points are interpolated between green (good) and blue (bad). The values of the sensitivity parameter for clustering are given for anisotropic neighborhood estimation. In all cases, the anisotropic neighborhood provides better normal estimation than the isotropic neighborhood.
5.2  Anisotropic neighborhood selection for point p (blue) that sits (a) near the edge, and (b) on the edge, using 16-nearest neighbors: selected (red) and unselected (black).
5.3  Boundary points (red) detection.
5.4  The smoothing process on a synthetic building with two roof blocks. One edge of the wedge sits right above one edge of the rectangular roof. The data is corrupted with Gaussian noise of zero mean and standard deviation of 1% of the diagonal of the bounding box. (b)-(d): cross-sections and close-up views of the squared area in (a). (e)-(f): close-up views of boundary points of the rectangular area in (a).
5.5  Special cases for different influence weight g. Arrows for normals; dotted lines for vectors connecting two points; blue point is p. Red points will get large g if evaluated with (a) Equation (5.1). (b) Equation (5.2).
5.6  The problem caused by displacing boundary points. (a) Boundary points (red) will be moved to the dotted line. The dotted circle is centered at the red point with radius ||v||. (b) After displacing boundary points only. The interior point (blue) is left outside the boundary. (c) After displacing boundary and affected interior points. The problem of (b) is avoided.
5.7  The up-sampling process on the data of Figure 5.4(h). (a)-(e) show close-up views of the square in Figure 5.4(a).
5.8  2D view of the up-sampling process. (a) Point p (blue) and its γ-radius neighbors (black). p_i and p_j form the under-sampled area. (b) Interpolation end point q (red) and the base point b_new (green). The inner circle has radius γ and the outer circle has radius γ_max. (c) Two candidate positions (crosses) to insert a new point based on b_new.
5.9  Interior point up-sampling for a 2D surface with both inner and outer boundaries. The original sampling is non-uniform. Inserted points are in red.
5.10  Smoothing results over different noise levels. The distance between an actual point and its ground truth is color coded as the rightmost bar. Degradation of smoothing quality of both interior surface and boundaries is negligible.
5.11  Stable refinement over various input sampling densities.
5.12  Versatility on real data with different shapes and complexity. Colored using a normal map.
5.13  Smoothing results for the data set in Figure 5.10(c). Our approach has fewer errors on boundaries.
5.14  Comparison of geometry modeling and visually-complete rendering of points refined by our approach. Our approach provides better looking and more accurate curved boundaries.
5.15  Refinement for a larger data set with several buildings.
5.16  Quality of surface reconstruction gets improved after the refinement on raw points.
6.1  A two-level MRQ. (a) Input point data marked with bounding boxes (in red or blue). Spaces between bounding boxes are left intentionally for clarity. (b) MRQ of (a). Node colors correspond to colors of bounding boxes. L0 has 1/4 the resolution of L1.
6.2  Gridding and filtering results on (a) the elevation map of a test data set. (b) The GEM after gridding. (c) After applying a box filter on (b), empty grids are filled by new red points. (d) After applying a Laplace filter on (c), building boundary grids are marked in light blue.
6.3  Rendering effect of polygon conversion. (a) Rendered by Gao et al. [1] where ground and buildings are points. (b)-(e) Rendered by our framework with polygon conversion under different parameters: L = 10^-5 ((b)(d)) and L = 10^-3 ((c)(e)). Converted polygons in (d) and (e) are randomly colored.
6.4  Rendering effect of polygon simplification. (a)(c) Rendered by Gao et al. [1]. Wall quadrilaterals are colored randomly in (c). (b)(d) Rendered by our framework with polygon simplification under L = 10^-4. Simplified walls are colored randomly in (d).
6.5  Building an RPQ_p sub-tree of parent node m based on a child node n. (a) Real scene. Red for buildings; and grey for ground. (b) GEM_p of node n with 64 grids. (c) GEM_p of node m covering only the region of n. (d)-(i) The propagation process of RPQ_p from child n to parent m. A node with a cross mark has f_rp as converted polygon; a node with a question mark needs to be re-evaluated over the GEM_p of m in (c); the rest of the nodes have f_rp as point.
6.6  A data set at different viewing distances. Converted polygons and simplified polygons are shown in red.
6.7  The relationship between a GEM (dark grey) and its extension eGEM (both light and dark grey regions).
6.8  The process of refining an MRQ leaf node m. Input: (a) GEM of node m with red shade for buildings and grey shade for ground, and (d) RPQ_p of node m. Initialization: (b) A new GEM with four times the resolution of G_m. Initial grid data (green) are set from G_m, and (e) a new RPQ_p for (b) initialized as R_m. Up-sampling: (c) Only regions of point-represented nodes in R_m are interpolated, with light yellow for grids in the direct group and dark yellow for grids in the indirect group, and (f) the RPQ_p in (e) is updated for newly interpolated grids. Re-arrangement: four child nodes n_1-n_4 are added under m in the MRQ; the GEM and RPQ_p of each child node are shown in (g)-(n).
6.9  GEM interpolation. (a) Interpolate a new grid g between the interpolation grid-pair s and e. b is the base point, d is the projection distance, p is the projected new point on the latent surface, and p' is the actual point represented by grid g. (b) Neighborhood of b (the cross) within radius γ, which contains only original grids (green).
6.10  Speed-quality trade-off under the influence of G.
6.11  Rendering comparison between VRQ and our framework with L = 10^-3 and G = (√2/2) w_g. (a) and (b) for Denver; (c) for Atlanta1; and (d) for Atlanta2. Our framework offers comparable visual details but greatly improved rendering speed.
6.12  Rendering comparison over close-ups of Los Angeles (left column), Atlanta (middle column) and Denver (right column) before and after the refinement. Green trees, brown ground, pink buildings and light blue building walls.
6.13  Refinement effect over a series of zoom-ins of a scene from Los Angeles. Before refinement (top row), roof boundaries are fuzzy. After refinement (bottom row), boundaries look sharper and the tiny top roof shows better details.
6.14  Refinement comparison without (a) and with (b) using hierarchical hybrid structures over 102k points from Los Angeles. Listed are the number of processed points and the processing time for refinement.
6.15  The percentage of points that are represented by point under different L. Polygon-represented regions are highlighted with random colors.

List of Tables

3.1  Basic information of test data sets.
3.2  Pre-processing time (in seconds) of three wall reconstruction approaches.
3.3  Rendering performance (in fps) of our methods and the basic PBR.
4.1  Basic information and performance of data sets in experiment.
5.1  Errors for Figure 5.10. MD: mean distance; SD: standard deviation.
5.2  Errors for Figure 5.11. MD: mean distance; SD: standard deviation.
5.3  Processing time (in seconds) of each refinement step. IP-N: number of input points; SM-T: time for smoothing; US-T: time for up-sampling; US-N: number of points after up-sampling; EU-T: time for feature enhancement up-sampling; OP-N: number of output points.
6.1  Basic information of test data sets.
6.2  Pre-processing (pp) time (in seconds) and rendering (rd) speed (in fps) comparisons of VRQ and our framework. pc for polygon conversion; and ps for polygon simplification.
6.3  Basic information of test data sets.
6.4  The influence of L over the percentage of points represented as point. Measured in all leaf nodes of the initial MRQ over three cities.

Abstract

Aerial LiDAR (Light Detection and Ranging) is cost-effective in acquiring terrain and urban information by mounting a downward-scanning laser on a low-flying aircraft. It produces huge volumes of unconnected 3D points. This thesis focuses on the interactive visualization of aerial LiDAR point clouds of cities, which is applicable to a number of areas including virtual tourism, security, land management and urban planning.

A framework needs to address several challenges in order to deliver useful visualizations of aerial LiDAR cities. Firstly, the data is 2.5D, in that the sensor is only able to capture dense details of the surfaces facing it, leaving few samples on vertical building walls. Secondly, the data often suffers from noise and under-sampling. Finally, the large size of the data can easily exceed the memory capacity of a computer system.

This thesis first introduces a visually-complete rendering framework for aerial LiDAR cities. By inferring classification information, building walls and occluded ground areas under tree canopies are completed either through pre-processing point cloud augmentation or through online procedural geometry generation. A multi-resolution out-of-core strategy and GPU-accelerated rendering enable interactive visualization of data of virtually unlimited size. Adding only a slight overhead to existing point-based approaches, the framework provides quality comparable to visualizations based on off-line pre-computation of 3D polygonal models.

The thesis then presents a scalable out-of-core algorithm for mapping colors from aerial oblique imagery to city-scale aerial LiDAR points. Without intensive processing of points, colors are mapped via a modified visibility pass of GPU splatting and a weighting scheme leveraging image resolution and surface orientation.

To alleviate visual artifacts caused by noise and under-sampling, the thesis shows an off-line point cloud refinement algorithm. By explicitly regularizing building boundary points, the algorithm can effectively remove noise, fill gaps, and preserve and enhance both normal and position discontinuous features for piece-wise smooth buildings of arbitrary shape and complexity.

Finally, the thesis introduces a new multi-resolution rendering framework that supports real-time refinement of aerial LiDAR cities. Without complex computation and without user intervention, hierarchical hybrid structures are constructed simply from curvature analysis of the points of uniformly sized spatial partitions, indicating whether to represent a partition as points or as a polygon. With the help of such structures, both rendering and refinement are dynamically adaptive to views and curvatures. Compared to visually-complete rendering, the new framework is able to deliver comparable visual quality with less than an 8% increase in pre-processing time and 2-5 times higher rendering frame-rates. Experiments on several cities show that the refinement improves rendering quality for large magnification under real-time constraints.
Chapter 1
Introduction

Light Detection and Ranging (LiDAR) is a terrain and urban information acquisition technique based on optical remote sensing. LiDAR is primarily categorized into two classes: aerial LiDAR is gathered by mounting a downward-scanning laser on a low-flying aircraft, while terrestrial LiDAR is gathered by mounting a horizontally-scanning portable scanner on a tripod. These techniques generate huge volumes of 3D point samples without connectivity that can be complemented by intensity, RGB colors and classification labels.

The use of aerial LiDAR data for urban modeling [10, 11] and visualization [3, 12] has received increasing attention in recent years. This thesis is concerned with the interactive visualization of city-scale aerial LiDAR point clouds, which is relevant to a number of areas including virtual tourism, security, land management and urban planning [13-15].

The direct rendering of a raw aerial LiDAR point cloud as shown in Figure 1.1(a) provides limited visual cues due to noise and occlusions. This thesis targets a processing and rendering framework that can provide interactive visualizations with important visual cues such as complete geometries and colors. Figure 1.1 shows an overview of the presented framework, which is composed of four components (Figure 1.1(b)-(e)) introduced as follows: i) Section 1.1 completes occluded geometries through point cloud augmentation and procedural rendering, ii) Section 1.2 fuses colors from aerial oblique imagery with LiDAR points, iii) Section 1.3 applies feature enhancing refinement to improve the quality of input points, and iv) Section 1.4 adapts refinement into real-time city-scale rendering.

Figure 1.1: Overview of the proposed framework. Two kinds of input data are shown on the left; the other four are framework components.

1.1 Visually-Complete Aerial LiDAR Point Rendering

Aerial LiDAR data can be characterized as being 2.5D in nature, in that the sensor is only able to capture dense details of the surfaces facing it [9]. Only a few points typically appear on vertical surfaces such as building walls, and such areas are either completely occluded or severely under-sampled, as is the case for ground areas under tree canopies. Figure 1.2 shows a building example captured by aerial LiDAR.

Figure 1.2: Demonstration of the 2.5D nature of an aerial LiDAR point cloud. Image courtesy of Qian-Yi Zhou.

To provide useful point-based visualizations of this type of data, one needs to fill in or augment missing or under-sampled areas that otherwise produce artifacts like the holes in Figure 1.3(a). This work focuses on augmenting scenes with buildings and trees, as these objects are predominant in urban areas where occlusion and under-sampling are most severe. The point-based rendering (PBR) framework is extended to augment the aerial LiDAR point cloud in offline pre-processing and/or to procedurally generate geometries in online GPU-accelerated rendering for more complete and realistic visualizations, by informing the system with cues relating to the physical properties of the scanned points, inferred from readily available classification information of point clouds.

The 2.5D characteristic of building structures is formally observed and defined in [9] as "building structures being composed of detailed roofs and vertical walls connecting roof layers".
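As an illustration of how this characteristic can be exploited, the following is a minimal sketch of augmenting a classified point cloud with vertical wall points beneath building boundary points. It is not the framework's actual pre-processing, which Chapter 3 details; the point layout, the fixed vertical spacing, and the helper names (augment_walls, ground_z) are assumptions for illustration only.

```python
import numpy as np

def augment_walls(boundary_pts, ground_z, spacing=0.5):
    """Sketch: drop vertical 'wall' points from each building-boundary
    roof point down to a given local ground elevation.

    boundary_pts : (N, 3) array of roof points labeled as building boundary
    ground_z     : (N,) array of local ground elevation under each point
    spacing      : vertical distance between augmented points (assumed, in metres)
    """
    wall_pts = []
    for (x, y, z), gz in zip(boundary_pts, ground_z):
        # sample the vertical segment between the local ground and the roof edge
        for h in np.arange(gz, z, spacing):
            wall_pts.append((x, y, h))
    return np.asarray(wall_pts)

# toy example: two boundary points of a 10 m high roof over flat ground at z = 0
roof_edge = np.array([[5.0, 2.0, 10.0], [5.5, 2.0, 10.0]])
walls = augment_walls(roof_edge, ground_z=np.zeros(len(roof_edge)))
print(walls.shape)  # roughly (2 * 10 / 0.5, 3) augmented wall points
```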
Inspired by the exploitation of this characteristic in building reconstruction [9, 10, 16-19], the presented work introduces the 2.5D characteristic to the visualization problem. To reconstruct building walls, a piece of vertical wall connecting to the ground is attached to every building point on a building boundary. Three distinct geometries are explored as wall primitives: point, quadrangle and axis-aligned rectangular parallelepiped (ARP). Point-walls are produced by augmenting the point cloud in the pre-processing. Quadrangle-walls and ARP-walls are procedurally generated during rendering. For under-sampled ground areas under tree canopies, the point cloud is augmented in the pre-processing with occluded ground points directly under tree points.

To accommodate the large size of aerial LiDAR datasets, which usually exceeds the memory capacity of a computer system, an out-of-core scheme is applied to process only a manageable subset of the data at any one time. For performance, it is coupled with a quadtree as the multi-resolution data structure to provide view-dependent level-of-detail (LOD): objects near the viewer are displayed with more detail than those that are farther away.

Figure 1.3: Shaded views of a point cloud color coded by classes. Green trees, brown ground and pink buildings. (a) Using basic PBR, white holes are visible where wall points and ground points under trees are missing. (b) Using the presented extended PBR, holes are effectively filled with recovered missing points: blue walls and brown ground.

Figure 1.3(b) shows the result produced by the presented extended PBR. Since gaps caused by under-sampled building walls and ground areas under trees are successfully covered, it provides better visual cues than basic PBR, shown in Figure 1.3(a).

Contributions:
- To the best of our knowledge, this is the first work to utilize the 2.5D characteristic of building structures to improve point cloud visualization.
- The presented approach successfully fills the gaps between ground and roofs as well as under-sampled or missing points under tree canopies, thus providing visual cues comparable to visualizations generated by off-line pre-computation of 3D polygonal urban models. The presented approach also has low overhead and achieves interactive frame-rates.
- Three distinct primitives are introduced to reconstruct building walls. Procedural geometry is implemented at the rendering level of a PBR framework. The deferred shading of PBR is extended to handle both point splats and polygonal geometries.

1.2 Fusing Oblique Imagery with Augmented Aerial LiDAR

In addition to geometric data, the use of imagery as textures in 3D landscape visualizations is an important methodology to improve appearance, 3D perception and interpretation by adding essential visual cues in color that are absent in the geometry.

Most commonly, the transfer of color from registered images to geometries involves texturing triangle meshes. Triangulation [20] and urban modeling [21-23] are used to extract meshes from raw unstructured LiDAR point clouds. Both methods require intensive processing and fail to recover complete and accurate urban scenes due to noise, occlusions and incomplete modeling capabilities. Often, less salient structures such as street lights, power cables and cars are ignored.

Fusing colors directly to points is a new visualization alternative. Some scanning systems [24] integrate cameras to provide LiDAR points with colors.
But colored points produced by these systems often exhibit confetti-style color variations due to the mixing of adjacent points with different colors obtained from multiple scans and the consequent variations in lighting. A few approaches [5, 25] map points with colors extracted from external images, but none of them has been shown to work with sparsely sampled aerial LiDAR data. With the increasingly common acquisition of high resolution aerial oblique images and recent developments in image-to-LiDAR registration techniques [20, 26, 27], the presented approach fuses colors from multiple pre-registered aerial oblique images with an augmented aerial LiDAR point cloud with complete geometries.

Figure 1.4: Comparison of (a) rendering of original aerial LiDAR points, and (b) rendering of colored augmented aerial LiDAR points generated by the color mapping approach. The added colors greatly improve appearance, 3D perception and interpretation of the scene.

The core ingredient of this work is the color mapping designed for extremely sparse point clouds, which employs a modified visibility pass of GPU splatting to project 3D LiDAR points as oriented surface splats onto 2D oblique images. To attack the challenge that both images and points are often too large to fit in memory, the presented approach also utilizes an out-of-core strategy to sequentially process images and hierarchically partitioned points. Figure 1.4(b) shows the colored augmented points, which provide clearly improved visual cues over the original data depicted in Figure 1.4(a).

Contributions:
- This is the first algorithm for mapping multiple overlapping images onto augmented aerial LiDAR points.
- A modified visibility pass of GPU splatting is utilized to map colors to points while determining visibility in a single pass. The method works for point clouds with any spatial resolution.
- The presented complete color mapping system is generic to any co-registered oblique imagery and aerial LiDAR point cloud, regardless of sources.

1.3 Feature Enhancing Aerial LiDAR Point Cloud Refinement

Aerial LiDAR point clouds captured using commercial laser range scanners are invariably noisy, mostly due to scanner artifacts and alignment errors between scans [28]. In addition, points near singular regions such as boundaries, ridges, ravines or crest lines are often under-sampled [29]. Due to noise and under-sampling, a direct rendering or surface reconstruction of raw data produces artifacts like grainy planes, gaps and holes, and bumpy boundaries. Figure 1.5(a) shows a rendering example of a raw building roof. These artifacts are especially visually disturbing in an urban scene where planes and straight lines are universal. Therefore, the refinement of aerial LiDAR point clouds is an important pre-process for other applications.

Figure 1.5: Rendering of a building roof. (a) Raw points. (b) Refined points. The refinement effectively alleviates artifacts of the original data.

In contrast to point data gathered by other scanners, aerial LiDAR points bear two unique properties that make the refinement problem more challenging. Firstly, as they are captured by laser scanners mounted on a low-flying aircraft, the sampling resolution of aerial LiDAR points is relatively low. Compared to other point data that have hundreds or even thousands of points/m², a typical aerial LiDAR data set has fewer than 20 points/m². Most refinement methods rely on dense data to infer and reconstruct the latent object.
The low sampling rate makes those approaches inapplicable. Due to the lack of information, the goal of accurately recovering the underlying buildings from aerial LiDAR points is impractical. Secondly, aerial LiDAR data can be characterized as being 2.5D [9] in nature. For a building, the sensor is only able to capture details of roofs but few points on vertical walls, leaving boundary points discontinuous in their positions. This property poses a great challenge to any energy minimization approach, as the neighbors of a boundary point cover only part of the environment, missing energy control from the rest of the environment. As far as we know, no point refinement algorithms can regularize such boundary points.

This work takes as input an unoriented aerial LiDAR point cloud of a single building, in which all points lie on the roof and no points belong to other objects like trees, walls or ground. The building consists of either a single roof block or several roof blocks of different heights, as seen in Figure 1.6.

Figure 1.6: Point clouds of buildings with various roofs.

To make the approach generally applicable to building points of arbitrary shape and complexity, the only assumption is piece-wise smoothness of roofs. A smooth surface can be described as one having smoothly varying normals [30], whereas features appear as discontinuities either in the normals or in the positions. Ridges, ravines and crest lines are formed by discontinuities in the normals where the underlying surface is still continuous; they are called normal discontinuous features. Boundaries that are formed by discontinuities in the positions are called position discontinuous features. This work aims at a fast, robust and easy-to-implement refinement approach that produces a new set of oriented points with smoothed noise, filled gaps and holes, and enhanced features of both normal and position discontinuity.

Recent developments in geometry refinement are effective for consolidating point clouds with sharp features. Fast and robust smoothing and up-sampling techniques are extended to accommodate the unique properties of aerial LiDAR building points. Importantly, position discontinuous features, i.e., boundary points, are explicitly regularized. The feature-aware two-step framework is guided by normals and boundary directions (the tangent direction of the underlying line of a boundary point), which are estimated via Principal Component Analysis (PCA) over anisotropic neighborhoods. The selection of the best anisotropic neighborhood of a point is adapted from Gauss map clustering [31].

The first, smoothing, step applies a two-stage robust bilateral filtering, which first filters normals and then updates positions under the guidance of the filtered normals. In a separate, similar pass, boundary directions are explicitly filtered and then boundary positions are updated to match the new directions. This step effectively removes noise as well as preserves features. The second, up-sampling, step uses a local detector to locate gaps in the latent surface, and then fills them with interpolated new points via a feature preserving bilateral projection operator. Gaps on boundaries are detected and filled explicitly by a similar process. A globally uniform sampling density is easily achieved and adjusted by a global neighborhood radius. An extension of the up-sampling step enables feature enhancement by inserting more points only in the vicinity of features.

The presented approach operates directly on piece-wise smooth points with singularities.
It does not require triangulation, segmentation or patch fitting, which are computationally expensive and sensitive to noise. The local character of the approach makes it memory-efficient, easy to implement, and easily extensible to large data sets. As shown in Figure 1.5(b), the presented refinement method effectively remedies the perceptual artifacts conveyed in Figure 1.5(a).

Contributions:
- To the best of our knowledge, this is the first point refinement method that preserves and enhances position discontinuous features.
- Recent developments in geometry refinement are extended to accommodate the unique properties of aerial LiDAR building points: sparse sampling density and the 2.5D characteristic [9].

1.4 Real-time Refinement for Aerial LiDAR Cities

Since the point geometry lacks connectivity information, the display quality of direct rendering of aerial LiDAR point clouds of cities is limited by the sampling density of the input points. For instance, under large magnification, silhouettes and areas of high curvature appear fuzzy and blocky, as shown in Figure 1.7(b). To improve rendering quality for under-sampled areas, increasing point density via refinement is necessary. It has also been shown to be effective in several systems [7, 32, 33]. However, it is too expensive and unnecessary to up-sample all points of a scene to a pre-defined static density, since in any view there exist regions that are occluded, back-facing, or denser than needed.

Figure 1.7: The effect of refinement. (a) Original aerial LiDAR points. (b) Before refinement, edges are fuzzy. (c) After refinement, edges are sharp. (b) and (c) are rendered by visually-complete rendering [1]. Brown for ground, dark grey for buildings, light blue for augmented building walls, and green for trees.

As the extension and optimization of Sections 1.1 and 1.3, this work presents a new visualization framework for aerial LiDAR cities that provides greatly improved rendering performance, and real-time dynamic point refinement aimed at improving rendering quality for continuous surfaces under large magnification.

Reducing rendering load is necessary to enable interactive visualization of large data sets. One common approach is to cut down the number of rendered primitives using LOD techniques. With the support of a multi-resolution hierarchy of the scene, for each frame the renderer selects a proper resolution with adequate detail view-dependently, so that only sufficient primitives are rendered. As rendering primitives, points are more efficient for displaying small details but polygons are faster for representing large flat surfaces. Therefore, to further reduce rendering load, it is desirable to support a hybrid point-polygon representation of the scene and to select the most efficient rendering primitive view-dependently. The presented framework is a combination of LOD and hybrid representation.

LOD employs an MRQ (Multi-Resolution Quadtree) hierarchy as the supporting structure. Due to the lack of connectivity, points are a more efficient LOD representation than polygons, so the MRQ is built directly out of input points. In correspondence to different processing methods, each MRQ node separately stores a continuous data set for ground and building points, whose latent surfaces are continuous, and a discrete data set for tree points, which are discrete and unstructured.
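To make this structure concrete, the following is a minimal sketch of such an MRQ node and a toy view-dependent selection pass. The class and field names (MRQNode, continuous_pts, discrete_pts, select_lod) and the distance heuristic are hypothetical; the actual data layout and LOD selection are described in Chapter 6.

```python
from dataclasses import dataclass, field
from typing import List
import numpy as np

@dataclass
class MRQNode:
    """Sketch of a Multi-Resolution Quadtree node (hypothetical layout).

    Each node covers a square ground-plane extent and stores a subsampled
    version of the data inside it; leaf nodes hold the full-resolution points.
    """
    extent: tuple                      # (xmin, ymin, xmax, ymax) in world units
    level: int                         # 0 = coarsest resolution
    continuous_pts: np.ndarray = None  # ground + building points, shape (N, 3)
    discrete_pts: np.ndarray = None    # tree points, shape (M, 3)
    children: List["MRQNode"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children

def select_lod(node: MRQNode, view_dist: float, detail: float = 500.0) -> List[MRQNode]:
    """Toy view-dependent selection: descend while the viewer is closer than
    a level-dependent threshold, otherwise render this node's stored data."""
    if node.is_leaf() or view_dist > detail / (2 ** node.level):
        return [node]
    selected = []
    for child in node.children:
        selected.extend(select_lod(child, view_dist, detail))
    return selected
```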
Figure 1.8: A comparison between the rendering of Gao et al. [1] (b) and the presented framework (c) on the data set of (a). (a) Input points. (b) 1,195,896 pts, 46,168 quads, 29.6 fps. (c) 510,541 pts, 50,524 quads, 83.8 fps. Our framework provides comparable visual quality but with improved frame-rates. pts for points and fps for frames per second. Brown for ground; dark grey for buildings; and light blue for walls.

Hybrid representation utilizes two RPQ (Rendering Primitive Quadtree) hierarchies as the supporting structure: one RPQ_p for input points and one RPQ_w for augmented walls that are missing in the original data. RPQ structures use a hierarchically built quadtree to represent partitions of an underlying continuous space, or of data sampled from that space. Any breadth-first traversal down the tree gives a family of cells that partition the space; each cell stores the best rendering primitive (point or polygon) for an approximation to the surface represented by its data. For a cell whose entire set of points forms a planar surface, the corresponding RPQ_p node stores a quadrilateral covering the same space to replace these points. Similarly, for a cell whose entire set of wall quadrilaterals approximates a planar surface, the corresponding RPQ_w node stores a simplified quadrilateral to compactly represent these walls. By sharing the same quadtree hierarchy, hybrid representations for higher levels of the MRQ are built through propagation from lower levels.

The framework builds the supporting structures completely in pre-processing. During rendering, it traverses the MRQ hierarchically to locate visible nodes with sufficient detail, then renders the data of these nodes according to the corresponding RPQ structures. When a visible MRQ node is detected as under-sampled, only its continuous data set needs refinement. By referencing the corresponding RPQ_p, refinement is further narrowed down to regions whose representation is point, because for a region whose underlying surface is flat, increasing point density does not contribute to visual quality. For an under-sampled region that is represented by points, the refinement approach quadruples the number of points via interpolation and a feature-preserving projector, and updates the affected RPQs. This process is repeated until the desired point density is reached. With a simple threshold in height and a comparison of classification labels, the interpolation is limited to points sharing the same latent surface.

Contributions:
- This work presents a framework for interactive and visually-complete rendering of aerial LiDAR point clouds of cities. The framework achieves improved rendering performance by utilizing hierarchical hybrid point-polygon structures which support both quadrilaterals converted from planar subsets of points, and simplified quadrilaterals from augmented wall quadrilaterals.
- This work introduces a real-time refinement approach for city-scale aerial LiDAR point clouds. It utilizes hierarchical hybrid point-polygon structures to concentrate the up-sampling effort on high curvature regions only. The approach provides improved rendering quality under large magnification for several cities.

1.5 Outline

The rest of the thesis is organized as follows:

Chapter 2 reviews literature related to rendering and processing of point clouds.

Chapter 3 presents various techniques for visually-complete rendering of aerial LiDAR cities.
Besides the pre-processing point cloud augmentation used to recover building walls and ground areas under tree canopies, it also shows an online method to complete building walls procedurally by generating quadrangles or axis-aligned rectangular parallelepipeds.

Chapter 4 introduces a scalable out-of-core algorithm for mapping colors from aerial oblique imagery to city-scale aerial LiDAR points. A weighting scheme leveraging image resolution and surface orientation is designed for this purpose.

Chapter 5 provides a point cloud refinement algorithm that is performed completely off-line to alleviate visual artifacts caused by noise and under-sampling. The presented two-step framework explicitly regularizes boundary points, so that both normal and position discontinuous features can be preserved and enhanced. The first, smoothing, step applies a two-stage feature preserving bilateral filtering to smooth out noise in normals, boundary directions and positions. The second, up-sampling, step employs a local gap detector and a feature preserving bilateral projector to deliver uniform up-sampling. A simple extension is also provided to enhance features.

Chapter 6 enables real-time point cloud refinement in the context of multi-resolution rendering via hierarchical hybrid point-polygon structures. It introduces multi-resolution and hybrid hierarchies that are computationally simple to construct and suitable for aerial LiDAR points. Besides details on performing real-time point up-sampling, it also shows how a hybrid rendering of both points and polygons can speed up the performance.

Chapter 7 summarizes the thesis and concludes with some directions for future work.

Chapter 2
Related Work

This study employs a variety of techniques, reviewed below and categorized by research direction. Section 2.1 briefly surveys point cloud rendering approaches. Section 2.2 introduces work related to fusing color with point clouds. Section 2.3 presents existing geometry refinement approaches. Section 2.4 summarizes prior work involving hybrid representation and real-time point cloud refinement.

2.1 Visually-Complete Aerial LiDAR Point Rendering

LiDAR point clouds are often visualized using triangles, points or splats, as illustrated in Figure 2.1.

Figure 2.1: Common representations used to render point clouds: triangles (left), points (middle) and splats (right). The same data set is used for all three representations.

2.1.1 Triangles

Many Geographical Information Systems such as ArcGIS [34] and GRASS GIS [35], and a majority of the systems listed in Fernandez et al. [36], support triangular irregular network (TIN) rendering of LiDAR data. Figure 2.2(b) shows an example rendered as a TIN. Because a TIN is generated by directly triangulating the point cloud, missing or under-sampled areas are recovered by the space-filling nature of triangle meshes. TIN visualizations also provide the user with important cues such as surface orientation, lighting and occlusion. However, TIN visualizations have known drawbacks. (i) Triangulated surfaces emphasize the noisy nature of LiDAR points, so these visualizations are often visually misleading. (ii) TIN methods are unsuited to vegetation, since vegetation is not well modeled by continuous surfaces. (iii) Triangulation is compute-intensive due to the lack of connectivity between points in raw sensor data. (iv) In the case of very dense point clouds, triangles often cover only one pixel or a partial pixel, making the rendering very inefficient and highly prone to aliasing.
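Because aerial LiDAR is essentially 2.5D, such a TIN is typically obtained by triangulating the points' ground-plane coordinates. The snippet below is a minimal sketch of that idea using SciPy's Delaunay triangulation; the GIS packages cited above use their own implementations, and the array names here are illustrative only.

```python
import numpy as np
from scipy.spatial import Delaunay

# points: (N, 3) aerial LiDAR samples; for 2.5D data a TIN is commonly
# built from the xy coordinates only, with z kept as the vertex height.
points = np.random.rand(1000, 3) * [100.0, 100.0, 20.0]   # toy stand-in data

tri = Delaunay(points[:, :2])          # triangulate in the ground plane
faces = tri.simplices                  # (M, 3) vertex indices of each triangle

# Every xy location inside the convex hull now lies in some triangle,
# which is why TINs fill the holes that pure point rendering leaves open --
# at the cost of also bridging real gaps (e.g. between roofs and ground).
print(points.shape, faces.shape)
```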
Figure 2.2: Rendering of an urban area of Toronto using (a) points and (b) TINs. From [2].

Alternatives to TIN visualizations are the 3D model extraction methods [18, 37, 38], where a point cloud is transformed into a collection of approximating polygonal surfaces. These methods have the advantage of providing data reduction for large urban scenes, but they commonly discard points for objects and details that are not modeled, such as fine architectural details, vegetation, street lights, and power cables.

2.1.2 Points

To overcome the disadvantages of triangle-based visualizations, there are methods for directly displaying points. Points without connectivity offer unaltered visualizations of scanner data. Recent systems such as Exelis ENVI [39], Exelis E3De [40], LViz [41], and most software listed in Fernandez et al. [36] allow direct rendering of raw 3D points using simple primitives like pixels, squares or smoothed circles. These primitives are fixed-size, non-oriented and non-blended; the implementations are therefore fast, but often result in aliased images with holes, lacking important 3D cues. Figure 2.2(a) shows an example rendered with points.

2.1.3 Splats

With the advance of programmable shaders, PBR techniques that represent points as oriented discs, commonly referred to as splats, are now practical for interactive use. A comprehensive survey of PBR is found in Kobbelt et al. [42]. The book [28] offers more detailed comparisons of various PBR algorithms. In the area of LiDAR visualization, the work of Kovac et al. [3] applies a two-pass rendering to efficiently blend splats (see Figure 2.3), and the system of Kuder et al. [12] combines splatting with deferred shading [43] in a three-pass rendering algorithm.

Overlapping oriented splats can create a visually continuous surface without the need for costly triangulation to form surfaces among points. Terrain and rooftops are well suited to PBR because dense splats form good approximations of continuous surfaces. Rendering vegetation as splats also produces a more realistic image than the TIN representation because branches, stems and leaves are represented without the artifacts of surfaces created by triangulation. However, an important drawback of splatting appears in areas with sparse or no points due to occlusions (facades and areas under tree canopies).

Figure 2.3: Rendering of an urban area of Toronto using splats, taken from Kovac et al. [3]. Missing points on building walls result in holes.

These areas appear as holes, often impairing the comprehension of the rendered image, as illustrated by Figure 1.3(a) and Figure 2.3. The presented methods address this problem by extending the PBR-with-deferred-shading [43] framework, which achieves the best trade-off between performance and rendering quality [28].

2.2 Fusing Oblique Imagery with Augmented Aerial LiDAR

This section is organized as follows. Section 2.2.1 introduces the prerequisite 2D-3D registration for the color mapping problem. Section 2.2.2 and Section 2.2.3 summarize prior methods of mapping texture or color from images to geometries. Section 2.2.4 describes approaches to handling large data sets.

2.2.1 2D-3D Registration

The registration between 2D images and 3D geometries is a prerequisite to fuse the two different data sets. Existing approaches work well on urban areas with high success rates. Features composed of connected lines, dubbed 3CS features, are used by Wang et al.
[26] to register oblique imagery with untextured polygonal models derived from a LiDAR point cloud, with correct pose recovery rates of 98%. Approaches operating directly on sensed points have also been developed; they differ in the type of information they use to relate 3D LiDAR points and 2D image points. Examples include the work of Wang et al. [26], which matches region contours corresponding to building footprints, and the method of Mastin et al. [20], which uses mutual information registration with reported experimental success rates between 93% and 98.5% depending on the datasets used. In this study, the aerial oblique imagery is pre-registered using the approach of Wang et al. [26], which provides a high success rate.

2.2.2 Texture Mapping on Triangle Meshes

To transfer color from images to geometries, the majority of the published work concentrates on texture mapping of triangle meshes derived from raw unstructured point clouds. Triangle meshes are either generated by triangulation [20] or from modeling [21-23]. The approach of Frueh et al. [23] attempts to select the best image per polygon or group of polygons, while most other papers [44-48] fuse color information at each vertex by introducing a per-pixel weighting scheme that combines contributions from multiple images. For urban scenes, texture mapped buildings deliver realistic results, but vegetation looks fake since triangle-modeled trees and plants are unrealistic, as seen in Figure 2.4.

Figure 2.4: Texture mapped triangle meshes of a residential area. Vegetation is fake-looking. From Ding et al. [4].

The aerial oblique imagery used in the presented approach suffers from the same varying resolutions, varying illumination conditions, occlusions and registration errors as images used to texture map triangle meshes. Therefore, heuristics used to design per-pixel weighting schemes that address these difficulties, such as image resolution [23, 47] and surface orientation [23, 44, 45], are also applicable to this problem.

2.2.3 Color Mapping on Points

Only a few approaches deal directly with color mapping on points. Colored points can be produced by some scanning systems with integrated cameras [24, 49], which often suffer from unpleasant confetti-style color variations due to data mixing from multiple scans and the variation in lighting. PointShop [50] can be used to color points through parameterization mapping, but it is limited to small point data sets.

In particular, the work closest to the presented approach is the system of Pintus et al. [5], which achieves seamless image blending directly on massive, unstructured, extremely dense point clouds; see Figure 2.5 for examples. This study devises a different color mapping scheme that can effectively exclude occluded points for both dense and sparse point clouds.

Figure 2.5: Color mapping on an extremely dense point cloud. From Pintus et al. [5].

2.2.4 Data Management

One major problem in the texture/color mapping of large-scale data sets is managing the large amount of data that cannot be loaded at once into main memory. The out-of-core strategy is a common solution. For models that are sufficiently small to completely fit in main memory, the approach of Xu et al. [48] applies an out-of-core data structure to store images and supporting mask data. The work of Callieri et al. [46] shows that dedicated out-of-core structures for models and images can be used, at the cost of extra processing time due to random access traversals. The image blending of Pintus et al.
[5] coarsely reorders the point cloud along a space filling curve in pre-processing and sequentially process both points and images as streams in an out-of-core fashion. The presented out-of-core scheme processes images in sequence similar to Pintus et al. [5], but spatially reorders points with a supporting hierarchical structure that is more suitable for aerial LiDAR point clouds. 21 2.3 Feature Enhancing Aerial LiDAR Point Cloud Refinement 2.3.1 Normal Estimation The presented normal estimation method belongs to regression based estimation pio- neered by Hoppe et al. [51]. They apply PCA over a constant k nearest neighbors of a point and take as normal the eigenvector corresponding to the smallest eigenvalue of the covariance matrix. The methods proposed in Gross et al. [28] and Pauly et al. [52] are weighted variances of this basic approach. Guennebaud et al. [53] fit an algebraic sphere to the local neighborhood of a point, and take the sphere’s gradient as the normal. This method gives stable results even for low sampling rates. All above methods rely on a constant isotropic neighborhood selection, which always implies a trade-off between the ability to handle noise and to recover small details. The work of Mitra et al. [54] uses the optimal neighborhood size adaptive to the local noise scale, curvature and sampling density. However, due to isotropic nature of the neighborhood, all these methods cannot provide reliable normals near edges as they are estimated by neighbors from both sides. Based on scale space theory, Hu et al. [55] compute normals by applying a bilateral estimation scheme on the adaptively constructed multi-layer neighbors. Yoon et al. [56] apply the ensemble technique from statistics to improve the robustness of Hoppe et al. [51]. Recent work of Li et al. [57] uses a RANSAC-like method to detect the best local tangent plane for each point. The approach of Boulch et al. [58] estimates the final normal as the maximum of the discrete probability distribution of possible normals based on a robust Randomized Hough Transform. They also provide several sampling strategies to deal with sampling anisotropy. These algorithms are capable to deal with 22 points located in singular regions in presence of noise and outliers. However, they are either computationally expensive or require dense sampling or both. 2.3.2 Smoothing The literature is abundant in feature preserving surface smoothing [59–64], mostly inspired by image processing work on scale-space and anisotropic diffusion [65]. As pointed out in Jones et al. [66], all diffusion-based feature preserving techniques are, in essence, all local and iterative, therefore they require significant computational times. A popular alternative to anisotropic diffusion is the bilateral filter, which was orig- inally proposed in image processing [67, 68]. It is a non-linear filter where the output is a weighted average of the input. Durand et al. [69] found that bilateral filtering is essentially similar to anisotropic diffusion, but a more stable non-iterative robust esti- mator. Methods of Fleishman et al. [70] and Jones et al. [66] extend the bilateral filter to feature preserving mesh smoothing based on robust statistics [71, 72] and local first- order predictors of the surface. The weight depends not only on the spatial distance, but also on a similarity term that penalizes values across features. The bilateral filtering is also extended to the application of feature preserving point cloud smoothing. The work of Mederos et al. 
[73] combines the bilateral filtering idea with moving least squares (MLS) methods [74]. The work of Duguet et al. [6] uses and filters surface curvature of an iteratively computed higher order surface approximation, as shown in Figure 2.6. (a) Original (b) Noisy (c) Smoothed by Jones et al. [66] (d) Smoothed by Duguet et al. [6] Figure 2.6: Feature preserving filtering. From Duguet et al. [6]. 23 Surface normals play an important role in surface denoising as surface features are best described with the first-order surface normals [75]. Several two-stage feature pre- serving surface denoising methods [76, 77], which first filter normals and then update positions under the guidance of the filtered normals, have shown improved results than a single pass filtering of only normals or positions. This study applies a similar two-stage smoothing based on bilateral filtering. Figure 2.7: Existing refinement methods cannot handle position discontinuous features. Rough boundaries are produced as a result. From Huang et al. [7]. Existing point refinement methods can only preserve and enhance normal discon- tinuous features on continuous surfaces, no methods found for position discontinuous features. As a result, boundaries are not regularized, see Figure 2.7 for an example. Several feature smoothing approaches [78–80] identify features in a point cloud, con- nect the feature points using a minimum spanning tree or simple snapping, and return a set of complete and smooth lines or curves via fitting. These approaches can be used to regularize position discontinuous features like boundary points, however, they are not suitable for point-based applications since feature points are converted to lines or curves. The presented boundary smoothing method is distinct to them by retaining the point representation throughout the process. 24 2.3.3 Up-sampling Lipman et al. [81] introduce the parameterization-free locally optimal projector (LOP) for surface approximation from point-set data. Up-sampling can be achieved by a few iterations of LOP. A weighted version of LOP [82] can deal with non-homogeneous point density. The point cloud consolidation scheme of Huang et al. [7] extends LOP to allow evenly distributed up-sampling, see Figure 2.8 for an example. However, the expensive computational cost of LOP limits its application to large data sets. (a) Input (515 points). (b) Surface reconstruction over (a). (c) Up-sampled by Huang et al. [7] (4,000 points). (d) Surface reconstruction over (c). Figure 2.8: Point cloud up-sampling. From Huang et al. [7]. Several up-sampling methods are based on a smooth point-based representation called MLS surface introduced in Levin et al. [74]. The surface is defined implicitly by a local projection operator. The work of Alexa et al. [83, 84] up-sample a point set through V oronoi point insertion in local tangent spaces followed by MLS projection. Dependence on a local V oronoi diagram makes the generation of a locally uniform sam- pling expensive. The uniform up-sampling of Pauly et al. [52] is also expensive as it is based on a particle simulation. In addition, the computation of the local projection operator is expensive as a non-linear optimization problem. 25 By combining the fast dynamic up-sampling algorithm of Guennebaud et al. [33] and the fast bilateral projection operator of Huang et al. [7], the presented up-sampling algorithm is efficient and effective to deliver feature-enhancing uniform point sampling for non-homogeneous point density. 
2.4 Real-time Refinement for Aerial LiDAR Cities This section first reviews rendering systems utilizing hybrid point-polygon representa- tion, then introduces existing real-time point cloud refinement approaches. 2.4.1 Hybrid Representation Given an object in both point and polygon representations, for small details where polygons occupy fewer screen pixels than corresponding points, rendering as points is more efficient. But as the projected screen area grows, rendering as polygons becomes increasingly efficient. To benefit from both rendering primitives, several hybrid approaches are developed. Figure 2.9: POP system chooses a different number of points and triangles based on viewing location (red for points; blue for triangles). From Chen et al. [8]. 26 Known as one of the first hybrid point-polygon systems, POP [8] (Figure 2.9) utilizes a multi-resolution hierarchical quadtree to store triangles only at leaf nodes and points at intermediate nodes. Those intermediate points are computed bottom-up based on bound- ing spheres of child nodes as QSplat [85]. The first hybrid simplification method [86] uses a triangle hierarchy based on edge collapsing. It adaptively replaces triangles with points via an error metric. The hybrid hierarchy of Coconu et al. [87] consists of original triangles and points sampled from triangle surfaces. Wand et al. [88] apply a hierarchy that is independent of mesh connectivity and topology through dynamic and random sampling of points. Zheng et al. [89] first organize the scene into a binary space par- titioning (BSP) tree; and then subdivide objects in each BSP leaf node into a quadtree hierarchy, which contains both the sample points and polygon rendering information at each level. With hardware acceleration, Hao et al. [90] segment model faces into regions, store triangles and vertex point clouds of each region, and render different regions using either triangles or points according to their distances from the viewpoint. All above systems require polygonal mesh as input, which is not directly applicable to point clouds. There are a few other approaches that work on point clouds directly without requir- ing a surface mesh. PMR [91] relies on V oronoi diagram and its dual Delaunay triangu- lation to convert points into triangles for the hybrid hierarchy. Wahl et al. [92] convert planar subsets in the point cloud to quadrilaterals based on the iterative random sample consensus (RANSAC) [93]. An octree structure is also used to identify planar represen- tations at different scales and accuracies for level-of-detail selection during rendering. Specialized for LiDAR point clouds, the hybrid system of Kuder et al. [94] detects con- tinuous surfaces in points by progressively growing smooth regions, and replaces them with decimated triangle meshes. Point data is non-redundantly distributed across all levels of the hierarchy, and surfaces are triangularized into meshes for each tree node 27 separately. The hierarchical hybrid structure presented in this work draws inspiration from above systems. But unlike these systems that take considerable pre-processing time trying to accurately recover latent surfaces of the point cloud, the structure pre- sented in this thesis is more efficiently constructed by evaluating only regular spacial partitions without involving expensive computation. 2.4.2 Real-time Point Cloud Refinement Targeted at real-time point cloud refinement, the up-sampling algorithms of Guennebaud et al. 
[95] [33] iteratively and uniformly inserts new points based on the construction of accurate one ring neighborhoods. Unfortunately, providing only uniform refinement is a major drawback for most applications, as it is almost impossible to avoid either over or under-sampling even in a static scene. Including adaptivity within refinement can strongly improve the overall performance by reducing the number of points in areas classified as less important (e.g. flat areas, far areas, partially hidden areas). The GPU-based real-time up-sampling algorithm of Guennebaud et al. [53] is based on this notion. It generates a regular pattern of m m new points within the quad bounding of each input point. The number of m is adaptive to both feature and view via a geometric error metric. Nevertheless, for a flat region that is parallel to the view, this algorithm still generates a prohibitively large number of new points under large magnification. The presented refinement is designed to avoid such problem. 28 Chapter 3 Visually-Complete Aerial LiDAR Point Rendering This chapter first presents an overview of the complete framework in Section 3.1, fol- lowed by a brief introduction of the data management in Section 3.2. The implementa- tion of the two parts of the framework is detailed in Section 3.3 and Section 3.4. The method is evaluated by experimental results in Section 3.5. Finally, Section 3.6 summa- rizes the work. 3.1 Framework Overview An overview of the framework is presented in Figure 3.1 where the yellow path rep- resents a basic PBR pipeline. Our framework consists of two parts: an offline pre- process and an online GPU-accelerated rendering. Pre-processing performs one time operations that are needed to augment the point cloud and to set-up rendering. The ren- dering stage provides the standard rendering passes (detailed in Section 3.4) as well as functions to procedurally generate wall geometries where needed. Users can select from three geometries (point, quadrangle and ARP) to reconstruct walls. The workflow varies depending on the choice of geometry, as shown in Figure 3.1. The input is a 3D LiDAR point cloud, as produced by most scanners. The prelimi- nary classifier attaches labels to each point for three primary classes of ground, tree and building. The point cloud is then augmented by adding missing ground points under 29 trees. Next, building points that lie on the perimeter of rooftops are labeled as boundary points. The positions of such points are later used to anchor building walls. After this step, we have four valid classes: ground, tree, building and building boundary. Normal vectors are then computed for all points, this is the last common step shared by three wall reconstructions as follows. Point-wall After the additional step to compute normals for the wall surfaces, point- walls are created by adding points connecting the building boundaries and the ground. In the rendering stage, augmented points are rendered as splats, like all other points, using the standard three PBR passes. Quadrangle-wall With the information acquired from the wall normal estimation step, all points are sent to the rendering. After the visibility pass that renders points of all classes as splats, boundary points are processed to create walls by generating a quadrangle for each boundary point. Both quadrangles and splats are sent to the attribute pass to accumulate per-pixel normals and colors. And finally the shading pass provides the Phong shading [96]. 
ARP-wall In many cases it is appropriate to consider the vertical wall primitive as axis-aligned, so the step to estimate wall normals is skipped. The remaining rendering steps are similar to the quadrangle case. Note that the view-dependent PBR processes will potentially render point splats and quads surfaces each frame. Figure 3.2 compares three wall geometries in a conceptual way. 3.2 Data Management A LiDAR data set is often too large to load and process in main memory. A common solution is to partition the data into several manageable subsets and to process these 30 Figure 3.1: Pre-processing and rendering for basic PBR (yellow), and for the presented completion methods: augmented wall points (green), procedurally generated quadrangle walls (blue) and procedurally generated ARP walls (red). subsets in sequence in an out-of-core manner [3, 12, 97]. Our system utilizes a quadtree hierarchy to manage subsets as Kovaˇ c et al. [3]. The quadtree construction is inexpensive and the domain of quadtrees is effectively 2.5D, which is a good match for aerial LiDAR. The generated structure is light-weight so it can reside in main memory and be efficiently traversed. We predefine a maximum number of points to be stored in each quadtree node and the input LiDAR point cloud is spatially partitioned as a hierarchical quadtree. The 31 (a) (b) (c) Figure 3.2: Comparison of provided wall geometries. The pink dot is a 3D point, the blue dots and surfaces are added new data. (a) Point-wall. (b) Quadrangle-wall. (c) ARP-wall. root node covers the whole area and each interior node contains its bounding box and four pointers to its children. Leaf nodes contain the bounding box of each subset of the original point cloud. The quadtree of Kovaˇ c et al. [3] only stores index information of points that belong to the bounding box in a leaf node. It does not apply to our case where indices are dynamic as the input point cloud is augmented. We write the point data of each leaf node in sequence to storage as a single file and attach index information in the file to corresponding leaf nodes. Although this takes additional time for the one- time file write operations, our per-frame read operations are faster than that of Kovaˇ c et al. [3]. Because our accesses are spatially coherent, but read operations of Kovaˇ c et al. [3] extract sparsely located points from the large original input data file. The quadtree is also used as the multi-resolution structure for managing LOD during rendering. Hierarchical traversal of the bounding boxes enables efficient visibility test- ing for the rendering stage. The LOD implementation of Kovaˇ c et al. [3] loads points of a visible leaf node into multiple OpenGL Vertex Buffer Objects [98] (VBOs) and distributes VBOs uniformly among levels of the quadtree. An interior node does not have complete data until the collection of VBOs from all descendants are accessed. Our multi-resolution method does not propagate VBOs from leaf nodes to interior nodes. Rather, for a visible leaf node, given a current camera position and the user-defined LOD factor f , we compute the number of screen pixels n by projecting the bounding 32 box of the leaf node onto the screen. We then randomly select f n points from the leaf node, load them into VBOs and send them directly to the rendering pipeline. Since n decreases as the distance between the leaf node and the camera increases, the LOD image is created by displaying more points near the camera and fewer points for areas far from the camera. 
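To make the per-leaf LOD selection concrete, here is a minimal C# sketch of the idea: select f·n random points from a visible leaf node, where n is the pixel count of the node's projected bounding box and f is the user-defined LOD factor. The class and method names are illustrative (not from the thesis), the projected pixel count is assumed to be computed by the caller, and details such as VBO upload are omitted.

```csharp
using System;

// Illustrative sketch of the per-leaf LOD point selection (not the thesis code).
public static class LodSelection
{
    // Returns the indices of f * n randomly chosen points of a leaf node,
    // where n = projectedPixels and f = lodFactor.
    public static int[] SelectPointIndices(int totalPoints, int projectedPixels,
                                           double lodFactor, Random rng)
    {
        int budget = (int)Math.Min(totalPoints, lodFactor * projectedPixels);

        var idx = new int[totalPoints];
        for (int i = 0; i < totalPoints; i++) idx[i] = i;

        for (int i = 0; i < budget; i++)              // partial Fisher-Yates shuffle
        {
            int j = i + rng.Next(totalPoints - i);
            (idx[i], idx[j]) = (idx[j], idx[i]);
        }
        return idx[..budget];                         // indices of the selected points
    }
}
```

Because n shrinks as a node recedes from the camera, the same routine automatically yields fewer points for distant nodes.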
To avoid frequent point selection and updates, we define an interval around n and reload points only when the newly calculated n′ falls outside the interval defined by the previous n. In the experiments, we use [n/√2, n√2] as the interval.
When processing a subset of points, a k-d tree is built for each class of points to enable rapid in-class neighborhood selection. To facilitate rapid between-class neighborhood selection, the k-d trees are partitioned on the x-y plane so that all points with similar (x, y) values can be accessed quickly. Given a point (P_x, P_y, P_z) in class A, its neighbors in class B are located by seeking points near (P_x, P_y) in B's k-d tree, regardless of the difference in z values.
3.3 Pre-processing
Pre-processing handles operations that only need one-time offline computation.
3.3.1 Classify Ground, Trees and Buildings
The input data are assumed to be in LAS [99] format. The LAS standard defines several standard LiDAR classes including ground, building, water, low vegetation, etc. Our test data and many other LiDAR point clouds come with classification information. In cases where classification tags are not available, third-party software such as LAStools [100] or the packages listed in Fernandez et al. [36] can be used to generate class information. Our framework requires points in the classes of ground, building, and all levels of vegetation. For simplicity, we treat all vegetation as tree points regardless of their sub-classes.
3.3.2 Augment Ground Points Under Trees
Trees do not possess the 2.5D characteristic, so laser scans often include both canopy points and ground points under the tree crown. However, sampling of the ground under the tree canopy is often sparse, as in Figure 3.3(a). To provide a hole-free ground visualization, we augment the point cloud with new ground points in under-sampled areas.
Figure 3.3: Demonstration of augmenting ground points under trees. Tree points are rendered in green and ground points in brown. (a) Before. (b) After.
Given an input data resolution of d points/m², consider a tree point with position (T_x, T_y, T_z). If there is no ground point at position (T_x, T_y), we consider the ground as possibly occluded. In this case, we count the number of ground points n at position (T_x, T_y) within a radius r in the ground k-d tree. If n < πr²d, the ground area is considered under-sampled and we augment the input data with a new ground point (G_x, G_y, G_z) under the tree point as follows:
G_x = T_x,  G_y = T_y,  G_z = (Σ_{i=1}^{n} w_i z_i) / (Σ_{i=1}^{n} w_i)    (3.1)
where each ground point i in the neighborhood of radius r has position (x_i, y_i, z_i), and the weight w_i is based on its 2D distance to the new ground point:
w_i = 1 / √((x_i − G_x)² + (y_i − G_y)²)    (3.2)
The radius of the neighborhood r is adjusted depending on the resolution d of the data set. Experiments on several data sets with d ranging from 0.5 points/m² to 25 points/m² show that 16 neighbors yield good results and performance, so we use r = √(16/(πd)) for our tests. This method effectively fills the ground holes under trees, as seen in Figure 3.3(b).
3.3.3 Label Building Boundaries
Classification only provides the categories of ground, trees, and buildings. To locate walls, points on building boundaries (roof edges) need to be identified. We use a threshold on the number of neighboring ground points to decide whether a building point belongs to the boundary.
Figure 3.4: Various cases of boundary points.
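Before turning to boundary labeling, the ground-augmentation rule of Section 3.3.2 (Equations 3.1–3.2) can be sketched in a few lines of C#. The names are illustrative; the caller is assumed to have queried the ground k-d tree for the neighbors within radius r = √(16/(πd)), and the small distance guard is an added safeguard not in the thesis.

```csharp
using System;
using System.Collections.Generic;

public struct Point3 { public double X, Y, Z; }

public static class GroundAugmentation
{
    // Inverse-distance weighted height (Eqs. 3.1-3.2): the new ground point sits
    // directly under the tree point (tx, ty); its height is a weighted average of
    // the heights of the nearby ground points.
    public static Point3 NewGroundPoint(double tx, double ty,
                                        IReadOnlyList<Point3> groundNeighbors)
    {
        double weightSum = 0.0, weightedZ = 0.0;
        foreach (var g in groundNeighbors)
        {
            double dist = Math.Sqrt((g.X - tx) * (g.X - tx) + (g.Y - ty) * (g.Y - ty));
            double w = 1.0 / Math.Max(dist, 1e-6);   // guard against a neighbor at the exact x-y position
            weightSum += w;
            weightedZ += w * g.Z;
        }
        return new Point3 { X = tx, Y = ty, Z = weightedZ / weightSum };
    }
}
```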
Figure 3.4 shows a bird's-eye view of various cases where a point lies on a building boundary. A building point may be adjacent to the ground (point A), another building (point B), or trees (point C). We claim that boundary points have half of their neighboring points within radius r as ground points. An ideal example is point A. If r does not overlap with the adjacent building, the claim also holds for point B. Since ground points under trees are recovered prior to this process (Section 3.3.2), the claim is still valid for point C. The neighborhood of any point on a roof corner contains even more ground points than that of point A, B or C, so corner points satisfy the claim as well.
However, this simple criterion assumes a constant point sample density. For areas with variable point density, and areas where only the building points near the boundary (for example, point D) instead of those right on the edge (points A, B and C) are sampled, there may be too few ground points. In these cases, we adjust a tolerance δ to find boundary points. Given a neighborhood radius r and a building point, we count the number of building points N_bp and the number of ground points N_gp in the neighborhood. The building point is labeled as a boundary point if:
N_gp / (N_gp + N_bp) ≥ 0.5δ    (3.3)
Equation (3.3) is only valid if N_bp refers to points from a single building. In practice, the neighborhood is chosen to be small enough to contain points from only a single building, while still containing a sufficient number of samples for a valid measurement. Our experiments use a fixed radius r = 2, corresponding to 2 meters (the datasets are in UTM coordinates), a distance that is smaller than the distance between any two buildings yet provides enough samples for the estimation of (3.3) given the datasets. A value of δ = 0.1 produces good results in our experiments.
Once a boundary point is identified, the corresponding ground point directly under it is estimated similarly to augmenting the ground point under a tree point (Section 3.3.2). Because these two points share the same x-y value, only the z value of the ground point, i.e. G_z, needs to be stored with the boundary point. By processing all building points, the building boundaries are approximated and stored in a separate k-d tree. Note that a point labeled as boundary is also a building point, so it occurs in both k-d trees. This threshold method is simple but provides good results in our experiments. If needed, more sophisticated methods [101, 102] can approximate building boundaries more precisely at the cost of more pre-processing time.
3.3.4 Estimate Normals
The input of this step is points labeled as trees, ground, buildings and building boundaries. Only neighbors of the same class are used to estimate the normal of a point.
We calculate the normal of a tree point as in Kovač et al. [3] by the cross product of vectors formed with two neighbors. Since tree points are usually sparse and randomly sampled, the noisy normals computed this way lead to a natural appearance of vegetation.
Since ground points form a large continuous surface, we use Principal Component Analysis [103] over the k nearest neighbors of a point for a more precise normal estimation. As in Section 3.3.2, we use k = 16. Since Principal Component Analysis determines a normal only up to a sign, we fix the sign so that the ground normal points skyward, i.e., the angle between the normal and the unit vector along the z-axis should be smaller than 90 degrees.
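The PCA-based normal estimation used here is standard; the following C# sketch shows one way to implement it. The names are illustrative, and a general-purpose linear algebra package (MathNet.Numerics is assumed here; any 3×3 symmetric eigen-solver would work) provides the eigen-decomposition.

```csharp
using System;
using System.Collections.Generic;
using MathNet.Numerics.LinearAlgebra;   // assumed numerics package; not prescribed by the thesis

public static class PcaNormal
{
    // Regression-based normal [51]: the eigenvector of the neighborhood covariance
    // matrix with the smallest eigenvalue, flipped so that it points skyward.
    // 'neighborhood' holds the point itself plus its k nearest same-class neighbors, each as {x, y, z}.
    public static double[] EstimateNormal(IReadOnlyList<double[]> neighborhood)
    {
        var mean = new double[3];
        foreach (var p in neighborhood)
            for (int d = 0; d < 3; d++) mean[d] += p[d] / neighborhood.Count;

        var cov = Matrix<double>.Build.Dense(3, 3);
        foreach (var p in neighborhood)
            for (int r = 0; r < 3; r++)
                for (int c = 0; c < 3; c++)
                    cov[r, c] += (p[r] - mean[r]) * (p[c] - mean[c]) / neighborhood.Count;

        var evd = cov.Evd();
        int smallest = 0;
        for (int i = 1; i < 3; i++)
            if (evd.EigenValues[i].Real < evd.EigenValues[smallest].Real) smallest = i;

        var n = evd.EigenVectors.Column(smallest);
        if (n[2] < 0) n = n.Multiply(-1.0);       // ground and roof normals face the sky
        return new[] { n[0], n[1], n[2] };
    }
}
```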
Building points corresponding to a single building form a continuous surface, so Principal Component Analysis is also applicable to estimate these normals. However, a segmentation into separate buildings is not available, so we analyze the k points within the neighborhood radius r = 2 as in Section 3.3.3 to ensure that all neighbors of a building point are constrained to the same building as the point. The sign of a building normal is fixed similarly to that of a ground normal, as points on building roofs also face the sky. Because boundary points are also building points, they share the normals computed as above.
Figure 3.5: Geometric interpretation of Principal Component Analysis.
3.3.5 Estimate Wall Normals
When computing the normal of a boundary point as in Section 3.3.4, Principal Component Analysis also provides the axis directions of the underlying tangent plane. Let λ_1 ≥ λ_2 ≥ λ_3 be the sorted eigenvalues of the covariance matrix, and v_1, v_2, v_3 the associated eigenvectors. As shown in the geometric interpretation in Figure 3.5, v_1 estimates the axis with the greatest variance of the tangent plane, v_2 estimates the other axis of the tangent plane, and v_3 the normal at the point.
To attach a piece of wall to a given boundary point, only the vertical direction (the unit vector along the z-axis) of the wall is assumed and fixed. We use the largest tangent-plane axis direction v_1 of the corresponding boundary point to estimate the horizontal direction of the wall, because the two vectors are parallel, as depicted in Figure 3.5. The normal of the wall is then calculated as the cross product of its horizontal direction and its vertical direction.
3.3.6 Augment Wall Points
When the user selects points as the primitive to reconstruct missing walls, a set of augmented points is created to model the walls. We know the approximate data resolution d points/m² of the input LiDAR data. For a boundary point (B_x, B_y, B_z) with G_z as the ground height at the foot of the corresponding wall, the two end points of a vertical line within the wall are known. Augmented points are created by linear interpolation along the z-direction of this line. A number of (B_z − G_z)·√d − 2 equidistant wall points are created and assigned the same wall normal calculated in Section 3.3.5 for the boundary point. The augmented points are then added to a separate k-d tree for walls. Figure 3.8(b) shows a rendering result for point-represented walls.
3.4 Rendering
Our rendering algorithm is based on a GPU-accelerated PBR with deferred shading framework [43]. The rendering of points is summarized in three basic passes.
1. Visibility pass calculates the splat size (radius) and shape of a corresponding point, and renders the splats only to the depth buffer with an ε-offset that determines the maximum distance between two blended splats.
2. Attribute pass renders the splats again. Because this pass uses the depth buffer from the visibility pass and disables writing to it, only splats within the ε-distance are blended. Besides the depth buffer, other rendering buffers supported by OpenGL Multiple Render Targets are used to store the accumulated normals and colors of blended splats.
3. Shading pass traverses the pixels of the screen to normalize the accumulated normals and colors stored in the rendering buffers of the attribute pass and to apply accurate per-pixel Phong shading.
If the point-wall representation is selected, the point data is already complete at rendering time.
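The wall points referred to here were created in pre-processing (Section 3.3.6). As a reminder, a minimal C# sketch of that interpolation follows; the names are illustrative, and the point-count expression mirrors the rule quoted above.

```csharp
using System;
using System.Collections.Generic;

public static class WallPointAugmentation
{
    // Create equidistant points between a roof boundary point (bx, by, bz) and the
    // ground height gz directly below it. Each new point reuses the wall normal
    // estimated for the boundary point (Section 3.3.5).
    public static List<double[]> CreateWallPoints(double bx, double by, double bz,
                                                  double gz, double densityPtsPerSqm)
    {
        var wall = new List<double[]>();
        int count = Math.Max(0, (int)((bz - gz) * Math.Sqrt(densityPtsPerSqm) - 2));
        for (int i = 1; i <= count; i++)
        {
            double t = (double)i / (count + 1);   // interior samples only; the two end points already exist
            wall.Add(new[] { bx, by, gz + t * (bz - gz) });
        }
        return wall;
    }
}
```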
The above mentioned 3-pass renders, blends, and shades points of all classes together to generate the final image. If quadrangle or ARP is selected to represent wall geometry, the rendering process is modified as depicted in Figure 3.1. A procedure runs after the visibility pass to generate wall-surface geometry as detailed in Section 3.4.1 and Section 3.4.2. The rendering loop is modified to handle both splats and polygonal primitives as described in Section 3.4.3. 3.4.1 Generate Wall Quadrangles The 2.5D characteristic of building structures assumes vertical walls, so it is natural to use vertically-oriented quadrangles as wall primitives. For a boundary point(B x ;B y ;B z ) with G z as the ground height at the foot of the wall, a vertical quadrangle is generated to connect the boundary point to the ground. The quadrangle height is B z G z and its normal is calculated in Section 3.3.5. The width of the quadrangle is 2r, where r is the radius of the corresponding rendered splat of the boundary point. Since the splat radius r is calculated in the visibility pass, the gener- ation of quadrangles can occur immediately after that pass. We procedurally generate a triangle fan representing the wall quadrangle in a geometry shader [104] whenever the splat of a boundary point passes the visibility pass. As splat radius is designed to ensure overlapping with neighboring splats, quadrangles created this way also overlap with neighboring quadrangles. 40 If quadrangles are rendered with a solid color, shading aliasing artifacts often occur when two quadrangles with different normals overlap, as seen in Figure 3.6(a). To alle- viate the problem, we render quadrangles with gradient color, so overlapped quadrangles have smooth color changes as shown in Figure 3.6(b). Figure 3.8(c) shows a rendering result for quadrangle represented walls. (a) (b) Figure 3.6: Rendering of two overlapping wall pieces with different normals. (a) Using the same solid color. (b) Using the same gradient color. Aliasing on overlapped areas is alleviated by gradient blending. 3.4.2 Generate Wall ARPs For point clouds that have very low data resolution, the horizontal direction and normal of a piece of wall surface estimated in Section 3.3.5 is often inaccurate. Quadrangles with inaccurate orientations may produce holes in overlapping areas with neighbors, but rectangular parallelepipeds are always watertight when viewed from any direction, as seen in Figure 3.7. We devise this approach of using ARPs to represent solid walls without computing their orientations based on normals. Because walls stand between the ground and roofs, the top and bottom faces of an ARP will always be hidden by the splats of points at the top and bottom. Therefore, only the vertical faces need to be generated for the ARP. For the four vertical faces, at most two of them can be seen at anytime. Similar to the quadrangle approach of 41 Figure 3.7: Comparison of ARP-walls (red rectangles) and quadrangle-walls (blue seg- ments) over the same group of boundary points (black dots), viewed from top. When viewed along the directions of yellow segments, gaps are found in quadrangle-walls. While ARP-walls are watertight in all directions. Section 3.4.1, we use 2r as the width of each vertical face, use the geometry shader after the visibility pass to generate a triangle fan representing the two visible faces of a ARP, and use gradient colors for faces to alleviate shading aliasing. Figure 3.8(d) shows a rendering result for ARP represented walls. 
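Whereas the thesis emits the wall quadrangle vertices from a geometry shader after the visibility pass, the corner arithmetic itself is simple and can be illustrated on the CPU side. In this hedged sketch, h = (hx, hy, 0) is the unit horizontal direction estimated in Section 3.3.5, r is the splat radius of the boundary point, and all names are illustrative.

```csharp
public static class WallQuad
{
    // Corners of the vertical wall quadrangle for one boundary point:
    // width 2r along the horizontal direction h, spanning from the roof
    // height bz down to the ground height gz.
    public static double[][] Corners(double bx, double by, double bz, double gz,
                                     double r, double hx, double hy)
    {
        return new[]
        {
            new[] { bx - r * hx, by - r * hy, bz },   // top-left
            new[] { bx + r * hx, by + r * hy, bz },   // top-right
            new[] { bx + r * hx, by + r * hy, gz },   // bottom-right
            new[] { bx - r * hx, by - r * hy, gz },   // bottom-left
        };
    }
}
```

The same arithmetic, with the horizontal direction replaced by the two axis-aligned unit vectors, yields the two visible faces of an ARP.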
Figure 3.8: Different wall representations. (a) Original roof-only point cloud. (b) Augmented with point-walls. (c) Augmented with quadrangle-walls. (d) Augmented with ARP-walls.
3.4.3 Rendering Extension
The deferred shading of point splats and polygonal geometries is enabled by sharing the same depth buffer for the visibility pass and sharing the rendering buffers that accumulate normals and colors. The extended renderer runs as follows.
1. Points in the classes of ground, trees and buildings are rendered as splats using the visibility and attribute passes.
2. Boundary points passing the visibility pass proceed to a procedure that generates wall geometries as detailed in Section 3.4.1 and Section 3.4.2. To improve performance, back-face culling is implemented for polygons based on the angle between their normals and the viewing ray. The visible polygonal geometries are then rendered using an attribute pass similar to the one used for points.
3. The same shading pass as in the basic PBR normalizes the accumulated normal and color buffers, and applies per-pixel Phong shading to generate the final image.
3.5 Experimental Results
The implementations use the C# language under the .NET 4.0 environment, OpenGL 2.1, the Tao Library [105] and the Cg [106] language for shaders. The three wall reconstruction approaches are implemented, and we also re-implemented the PBR system of [43] for comparison. A computer equipped with a dual-core Intel Xeon 2.0GHz CPU, 4GB RAM, an NVIDIA GeForce GTX260 896MB graphics card, a Western Digital 7200RPM hard disk and Windows Vista was used for testing. All tests were rendered at 1280×800 pixels.
3.5.1 Performance
Table 3.1 lists basic information about the data sets used for the experiments. The Los Angeles data is mostly urban area consisting of buildings and fewer trees; the other two data sets are residential areas where plenty of trees can be found.
Table 3.1: Basic information of test data sets.
  Data set              Denver   Los Angeles   Atlanta
  File size (MB)        44.0     49.6          104.6
  Resolution (pts/m²)   6.6      0.58          22.9
  # points (M)          1.64     2.60          5.48
  % ground              67       56            42
  % building            20       28            16
  % tree                13       12            42
  % other               0        4             0
The pre-processing is measured in two parts: (i) the time spent building up data structures, and (ii) the total time for all pre-processing operations except the classification shown in Figure 3.1. Pre-processing differs between wall reconstruction approaches, so individual times for each choice of wall primitive are listed. Under the default settings, we do ten runs for each data set with each wall primitive and record the average time in Table 3.2. The largest sum of (i) and (ii) among the three wall reconstruction approaches is also listed as "Maximum total time" for each data set.
The time of (i) increases proportionally with the size of the input file and the total number of points. It is also affected by the maximum number of points per quadtree leaf node. The time of (ii) depends on the number of points of each class and the actual scene-architectural layout. For example, given the same number of building points, different layouts will yield varied numbers of boundary points, thus impacting processing time. Although we cannot define the precise effect of every factor, "Avg. time/M points" shows that with our test data, the system takes 8-15 seconds to pre-process one million points.
Table 3.2: Pre-processing time (in seconds) of three wall reconstruction approaches.
  Data set              Denver   Los Angeles   Atlanta
  Structure build-up    2.64     4.06          8.56
  Point-wall            12.26    18.87         58.23
  Quadrangle-wall       11.35    16.64         49.01
  ARP-wall              11.16    16.54         48.12
  Maximum total time    14.90    22.93         66.79
  Avg. time/M points    9.09     8.82          14.91
Ignoring factors like point density and rendering parameters such as the blending distance ε and splat sizes, the performance of the LOD renderer depends mainly on the number of selected points rather than the total number of points. By adjusting the LOD parameter, one can always achieve interactive frame-rates.
For comparisons, we use fixed rendering parameters and render the scene in full detail, i.e. with LOD disabled. The rendering performance of each approach is measured by adjusting the viewport to fit the whole scene area horizontally and recording the average frame-rate while orbiting the camera about the scene for a minute. Table 3.3 reports the rendering performance of the three wall reconstruction approaches and of basic PBR with deferred Phong shading, where no augmentation or procedurally generated geometries are added to the input data. All three wall primitives have comparable performance; the quadrangle representation performs slightly better than the other two. Compared to basic PBR, the overhead of our method is low. For example, using the quadrangle wall impacts performance by less than 35% in these test cases.
Table 3.3: Rendering performance (in fps) of our methods and the basic PBR.
  Data set          Denver   Los Angeles   Atlanta
  Point-wall        42.6     34.1          9.6
  Quadrangle-wall   46.8     36            10.6
  ARP-wall          42.5     32.5          10.7
  Basic PBR         53.2     38.8          16.2
Figure 3.9: Denver area. (a) Rendering with quadrangle-walls and LOD factor f = 4. (b) Aerial image from Microsoft Bing Maps. The rendering provides clear building contours that match well with the image. Vegetation is different as the two data sets were not captured at the same time.
3.5.2 Image Quality
Our method provides visually complete and appealing renderings for both urban areas (Figure 3.11(a)) and residential areas (Figure 3.9(a), Figure 3.10(a)), despite the variation in data resolution. Figure 3.9 and Figure 3.10 compare our visualizations with corresponding aerial images. Since the point clouds and images were not acquired at the same time, the vegetation looks quite different. However, our renderings provide clear building contours that perfectly match those in the images.
Figure 3.8 compares the rendering quality of the three wall primitives. Points provide a unified look of the scene, but as stated in [3], round and fuzzy splats do not approximate flat surfaces and edges as well as they do curved natural features. The point-wall is often
Figure 1.3 clearly shows that by adding building walls and filling ground under canopies, our system provide a better visualization than basic PBR. Figure 3.11 compares the renderings of our quadrangle-wall approach with the visu- alization of complete building models created offline by Zhou et al. [9] on the same Los Angeles area. Our rendering provides comparable visual information for buildings as the polygon models provide. The bottom right close-up even shows finer details of the 47 stadium than the respective model. Since we only consider local neighbors, the appear- ance of walls are highly dependent on the quality and sampling of the raw data. As shown in the top right close-ups in Figure 3.11(a), buildings rendered by our method look less appealing than the corresponding models. However, the goal of this work is not to compete the image quality with the compute-intensive urban modeling but to improve the state of the art interactive city-scale visualization. (a) (b) Figure 3.11: Rendering comparison of Los Angeles area. (a) Using our rendering with quadrangle-walls, LOD is disabled. Trees are colored in green. (b) Visualization of the full geometric models from Zhou et al. [9] using a ray tracer and shadow maps, image courtesy to Qian-Yi Zhou. (a) provides comparable visual cues with (b). 3.6 Summary This is the first work to utilize the 2.5D characteristic of building structures to fill holes in the visualization of aerial LiDAR point cloud. The presented system uses a hierarchical and out-of-core approach to data management and GPU-accelerated PBR with deferred shading. The system treats points differently by classification and incorporates three distinct geometries to represent building walls: points, quadrangles and ARPs. The 48 method is generic to any data with buildings and trees regardless of point density. The rendering of the system provides comparable 3D cues to those obtained with polygonal 3D modeling while maintaining interactive frame-rates. It has been found that among the three building wall reconstruction approaches, quadrangle-walls achieve the best trade-off between performance and image qual- ity. Augmented point-walls are useful when the processed point cloud is to be ren- dered with other rendering algorithms, because the augmented points are added in pre- processing and therefore independent of run-time rendering algorithm. ARP-walls are best employed with noisy or low resolution data as this approach does not require the estimation of a normal vector. 49 Chapter 4 Fusing Oblique Imagery with Augmented Aerial LiDAR This chapter first presents an overview of the complete color mapping framework in Section 4.1, followed by a brief introduction of the augmentation process in Section 4.2 and the data management in Section 4.3. The implementation of the color mapping process is detailed in Section 4.4. The method is evaluated by experimental results in Section 4.5. Finally, Section 4.6 summarizes the work. 4.1 Framework Overview Figure 4.1 shows the workflow of the presented color mapping framework. The input oblique imagery is assumed to be pre-registered with the point cloud, i.e., for each image, the camera calibration matrix that maps 3D points in world space to 2D pix- els in image space is known. We first augment the point cloud to recover under-sampled areas such as building walls and ground under tree canopies by inferring the actual physical nature of points from classification, as detailed in Gao et al. [1]. 
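Since each image's calibration matrix is known, mapping a world-space point to an image pixel is a standard projective transform. The following minimal C# sketch (illustrative names, assuming the 3×4 matrix is stored row-major) shows the operation that the later rendering and accumulation passes rely on.

```csharp
public static class Projection
{
    // Project a world-space point into image pixel coordinates using a 3x4
    // calibration (projection) matrix m, stored row-major. Returns false if the
    // point is behind the camera or projects outside the image.
    public static bool TryProject(double[] m, double x, double y, double z,
                                  int width, int height, out double u, out double v)
    {
        double su = m[0] * x + m[1] * y + m[2]  * z + m[3];
        double sv = m[4] * x + m[5] * y + m[6]  * z + m[7];
        double s  = m[8] * x + m[9] * y + m[10] * z + m[11];
        u = su / s;
        v = sv / s;
        return s > 0 && u >= 0 && u < width && v >= 0 && v < height;
    }
}
```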
The augmented point cloud is then spatially partitioned into GPU memory-manageable subsets and stored in a quadtree. Each leaf node of the quadtree relates to one subset of the point cloud. Oblique imagery is processed one image after another in an out-of-core fashion as follows. 50 1. A hierarchical traversal of the quadtree locates leaf nodes that are visible to the image. The points subsets relate to these visible leaf nodes are then distinguished. 2. The GPU-accelerated rendering pass projects points of visible subsets to the cur- rent image as surface splats. Overlapping splats form surfaces that can effectively filter out occluded points, while per pixel weight and point-index are stored in an off-screen rendering buffer. 3. The accumulation pass processes the rendering buffer of previous pass to add weights and colors to points in visible subsets. After processing all images, each point contains the weight and color accumulated from all contributing image pixels. The color normalization operation parses all points one last time and updates the final color of each point as the weighted average of accu- mulated colors. 4.2 Augmentation Due to the imaging process of aerial LiDAR, vertical building walls and ground under tree canopies are often totally or partially occluded, resulting in gaps and sparsely sam- pled areas in the point cloud. The completion method described in Gao et al. [1] is applied to recover missing samples. Below is a brief review of the method emphasizing the aspects that relate to the color mapping problem. Given the classification information, the physical nature of a point is inferred and thus used to recover the missing samples. Points labeled as buildings share the 2.5D characteristic of building structures defined in Zhou et al. [9]: building structures being composed of detailed roofs and vertical walls connecting roof layers. For any building point on building boundaries, we construct a piece of vertical wall connecting to the 51 Figure 4.1: The color mapping framework. Operations are in bold. ground to fill the gaps between roofs and ground. For a point labeled as vegetation, regardless of the height, it is often the case that the ground directly under the point is occluded. Therefore, we recover the occluded ground points to fill the holes under trees. The work of Gao et al. [1] provides examples of offline point cloud augmentation for building walls and ground areas under tree canopies, and two types of procedurally generated geometry to recover building walls. We choose the augmentation approach for the color mapping problem because (i) missing points of both building walls and ground under trees can be recovered completely offline in a pre-processing, and (ii) the 52 added new data are also points that can be processed together with the original scanned point data. Figure 1.3 shows the difference of an aerial LiDAR point cloud before and after the augmentation. 4.3 Data Management One big challenge of color mapping for aerial LiDAR is the large amount of data that often exceeds the memory capacity of a computer system. A common solution is to partition the large data into small manageable subsets and process each subset in an out-of-core manner. 4.3.1 Data Structures The oblique imagery is naturally partitioned as individual image files that have similar sizes and are independent to each other. So the images are sequentially processed one after another as the strategy of Pintus et al. [5]. 
An oblique image often covers a small part of the scene, i.e., a large portion of points are not visible to the image. Therefore, it is not efficient to process subsets of points sequentially. We use a quadtree hierarchy to manage points subsets. Quadtree construction is simple and the domain of quadtrees is effectively 2.5D, which is a good match for aerial LiDAR. The generated structure is light-weight so it can reside in main memory and efficiently traversed. As our color mapping operation of points is GPU-accelerated, the maximum number of points to be stored in each node is defined based on the capacity of GPU memory. Given the 2D bounding box on x-y plane of the point cloud and the total number of points, a hierarchical quadtree is constructed. The root node covers the whole area. 53 Each interior node contains four pointers to its children and its bounding box. The leaf nodes contain the bounding box of each subset of the point cloud. To enable fast data accessing, points are reordered using a two-pass parsing. In the first pass, only index information of points that belong to the same bounding box is collected and stored in each leaf node. The second pass loads points per leaf node from the input file and writes the re-organized points to an output file stored on the disk. The index information in the output file is then updated in the leaf node. Compared to the Morton ordered point-stream of Pintus et al. [5] that reads data in a memory coherent way, our one-time point re-organizing takes more time due to frequent jumps when locating points of the same node in the input file. However, the quadtree hierarchy enables much faster per-frame visibility test than their streaming strategy. 4.3.2 Out-of-core Strategy The presented framework processes images sequentially in an out-of-core fashion. For each image in a sequence, the system loads the image from hard disk to main memory and processes it on CPU. After processing, the system writes the results to disk and frees the main memory occupied by the image. The main memory usage of imagery is bounded by the size of the largest image. When processing an image, the framework processes visible points subsets of the image one after another in an out-of-core fashion. For each subset, the system loads the points of the subset from hard disk to main memory and then uploads the main memory buffer to GPU. After the processing on GPU, the system writes results to disk and frees memory buffers occupied by the subset of points on both CPU and GPU. The GPU memory usage of the point cloud is bounded to the user-defined maximum number of points a quadtree node contains. 54 To hide the data accessing latency to the maximum extent, we use multiple CPU memory buffers as the bridge connecting the hard disk and the GPU. Multi-threading and the look-ahead pre-fetch policy are also applied to access both points and images. 4.3.3 GPU Splatting A GPU splatting technique is used to display the final colored points. Also, the rendering pass of the color mapping framework is a modified visibility pass of GPU splatting. So we briefly review the technique before the section of color mapping. As detailed in Kovaˇ c et al. [3], points are rendered as splats to form a hole-free image where: The size of a splat is adjusted to its distance from the viewer, the nearer the bigger. The shape of a splat is adapted to its orientation and position to make it always face the camera. A splat is associated with a radially decreasing Gaussian weight function for fil- tering. 
The projection of such a splat produces an image space elliptical fuzzy splat that can be smoothly blended with other splats. The colored points are visualized using a GPU splatting with deferred shading [43] algorithm which performs in three passes: Visibility Pass renders all points as splats only to the depth buffer with e-offset that determines the maximum distance between the two blended splats. Attribute Pass uses the depth buffer from previous pass and renders all splats again, only splats withine-distance are blended. Normals and colors are accumu- lated and stored in other rendering buffers supported by OpenGL Multiple Render Targets. 55 Shading Pass draws a fullscreen quad with a fragment shader to normalize nor- mals and colors from previous pass and to apply per-pixel Phong shading. 4.4 Color Mapping Compared to the texture mapped triangle meshes, color mapped points are light-weight in a sense that no image or texture coordinates need to be attached while rendering. Therefore, we can afford to map to points the fused colors from multiple images that could potentially deliver a better visual performance than using a single image. With multiple images available, the task is to assign a color for each LiDAR point by evalu- ating all contributing oblique image pixels. For each oblique image I i with size (W i ;H i ), given the calibration matrix M i , the camera position C i in world space is estimated. 4.4.1 Rendering Pass An oblique image often covers a small portion of the whole scanned area, so only points subsets that are potentially visible to the image need to be processed. We select these subsets by hierarchical traversal of the quadtree nodes. The visibility of each quadtree node is performed by checking the projected bounding box of the node with boundaries of the image I i . We define the screen space by the quad ((0;0);(W i ;H i )) to make one to one corre- spondence between screen pixels and image pixels of I i . Points of the list of visible leaf nodes are projected onto the screen space via M i and rendered as splats using a modified visibility pass of GPU splatting [3] as follows. The depth buffer is regular with no offset. 56 An additional off-screen rendering buffer B wi is used to store weight and point- index per pixel. The weight is used to measure the color contribution of the corresponding image pixel to associated point. We take as weight the dot product of the point’s normal and the normalized view direction. The point-index is used to locate the point. Since the depth test is on for the visibility pass, occluded points are either discarded or overwritten, therefore the resulting buffer B wi stores only weights and point-indices of points that are closest to the camera. Compared to the visibility computation of Pintus et al. [5] which first renders points as pixel-sized points and then applies a screen-space operation over the depth buffer, our method requires only one singe rendering pass to pick out visible points. Furthermore, the approach of Pintus et al. [5] identifies occluded points by thresholding the accessi- bility of a point based on the solid angle sustained by other points in the direction of the camera, while we provide a base splat size parameter to adapt to various data reso- lution and to control the portion of occluded points. Our parameter is more intuitive to adjust. Lastly, the method of Pintus et al. 
[5] aims at extremely dense point clouds only, while our splatting method is more generic and can be applied to points of any resolution, whether thousands of points/m² or fewer than ten points/m².
4.4.2 Accumulation Pass
To sample colors from image I_i, the accumulation pass parses, pixel by pixel, the buffer B_wi of the rendering pass. If a pixel (u, v) of B_wi stores a positive weight w together with a point-index j, we locate the point in the file F on disk (each point is associated with an rgba structure, where rgb stores the color and a the weight) and accumulate the color and weight of image I_i as follows:
F[j].rgb += I_i(u, v) · w
F[j].a += w
The weighting scheme applied in Section 4.4.1 and Section 4.4.2 effectively takes into account the most important factors in traditional triangle-based texture mapping [23]:
Image Resolution The splat size is calculated based on the distance between the corresponding point and the camera position C_i. If the point is near C_i, it produces a bigger splat, resulting in more pixels in B_wi and a larger accumulated weight for image I_i. This matches the idea that the image with the larger resolution for a surface should take a bigger weight.
Surface Orientation The weight w is the dot product between the splat's normal and the view direction, which ensures that images taken from a more direct and perpendicular view have larger weights, and vice versa.
4.4.3 Color Normalization
After processing all images, the contributions of image pixels associated with a LiDAR point j are gathered in the form of accumulated color and weight in F[j].rgba. Through one last pass over all points, the final color of each point j is calculated by normalizing the accumulated color as follows:
F[j].rgb /= F[j].a
After this pass, the colored point cloud can be directly rendered using any GPU splatting method.
4.5 Experiments and Discussion
The implementations use the C# language under the .NET 4.0 environment, OpenGL 2.1, the Tao Library [105] and the Cg [106] language for shaders. A computer equipped with a dual-core Intel Core(TM) i3 3.20GHz CPU, 6GB RAM, an NVIDIA GeForce GT 520 1024MB graphics card, a Seagate 7200RPM hard disk and Windows 7 was used for testing.
The aerial LiDAR data used for testing cover the area of the USC campus near downtown Los Angeles. There are two data resolutions: sparse and dense. If not specified, the rendered images use the dense point data. The aerial oblique imagery consists of 410 pre-registered images with resolution 4992×3328, for a total size of 5.34GB. A typical oblique image covers a quarter of the scene.
The basic information of the point data and the color mapping performance are shown in Table 4.1. The time measures all operations from loading the original point cloud until the colored points are produced.
Table 4.1: Basic information and performance of data sets in the experiment.
  Data set                          Sparse   Dense
  File size (MB)                    44.0     694
  Resolution (pts/m²)               0.57     5.7
  Original # points (M)             2.6      26
  After augmentation # points (M)   4.3      48.9
  Time                              5m3s     1h38m
Figure 4.2 shows the overview of the whole area with colored points. Our color mapping algorithm successfully produces textures on both building roofs and walls. Figure 4.3 compares the rendered colored points with a full-size oblique image of the same area. Because the color of a point comes from multiple images, the colored points display a comparable visual appearance from any view. For example, the stadium in Figure 4.2 is seen from a different view than that of Figure 4.3(b).
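For reference, the accumulation rule of Section 4.4.2 and the normalization of Section 4.4.3 amount to the following C# sketch. This is an in-memory illustration with illustrative names; the thesis streams the point file F from disk in an out-of-core fashion.

```csharp
public struct ColorAccum { public double R, G, B, A; }   // A holds the accumulated weight

public static class ColorMapping
{
    // Accumulation pass: called once per pixel of the weight/index buffer B_wi
    // that stores a positive weight and a point index.
    public static void Accumulate(ColorAccum[] f, int pointIndex,
                                  double r, double g, double b, double weight)
    {
        f[pointIndex].R += r * weight;
        f[pointIndex].G += g * weight;
        f[pointIndex].B += b * weight;
        f[pointIndex].A += weight;
    }

    // Normalization pass: run once after all images have been processed.
    public static void Normalize(ColorAccum[] f)
    {
        for (int i = 0; i < f.Length; i++)
        {
            if (f[i].A <= 0) continue;           // point never seen by any image
            f[i].R /= f[i].A;
            f[i].G /= f[i].A;
            f[i].B /= f[i].A;
        }
    }
}
```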
Figure 4.2: The overview of the USC campus area using colored augmented points generated by the presented color mapping and rendering system. (a) (b) Figure 4.3: Comparison of (a) rendering of colored points. (b) a full-size oblique image. (a) achieves comparable visual cues to the ground truth image. Figure 4.4 compares visualizations of colored points generated by our color map- ping algorithm and texture mapped polygonal models generated by [26] and [23] over the same area. Besides the added visual cues from roads and vegetation, our method pro- vides comparable if not better details on building facades. Note that the quality of both 60 methods is affected by the point clouds quality, data resolution, registration accuracy, and imagery quality. (a) (b) Figure 4.4: Comparison of (a) rendering of colored points. (b) visualization of textured polygonal models. (a) provides added visual cues from realistic roads and vegetation. In Figure 1.4(b) and Figure 4.3(a), we can see that the building roofs show better visual appearance than walls. This is due to the different image quality of horizontal and vertical regions. Because aerial oblique imagery uses a vertical viewing axis, which guarantees optimal resolution for horizontal regions, but depicts vertical elements only to perspective effects. Figure 4.5 compares the effect of data resolution of points, it clearly shows that more points can produce more details because the final texture is formed by points samples. This case also shows that our color mapping algorithm is adaptive to data with various resolutions. 4.6 Summary This chapter presents a scalable out-of-core technique for mapping color information from aerial oblique imagery to large scale aerial LiDAR point cloud that are from the same or different sources. The point cloud augmentation effectively recovers missing points on building walls and under tree canopies of aerial LiDAR and ensures continuous 61 (a) (b) Figure 4.5: Comparison of different point data resolution. (a) Using the sparse data. (b) Using the dense data. The color mapping quality increases with point cloud density. textures for color mapping. The novel application of the modified visibility pass of GPU splatting in color mapping allows computing visibility and color contribution in a single pass. The presented system is simple to implement, effective to produce good results, scalable to arbitrarily large data and generic to be applied to data with any resolution. The experimental results show that color mapping of points can greatly improve the quality of LiDAR points visualization. 62 Chapter 5 Feature Enhancing Aerial LiDAR Point Cloud Refinement The structure of this chapter is as follows. Section 5.1 estimates initial normals and classifies boundary points for the following steps. The framework is composed of a smoothing step as detailed in Section 5.2 to smooth out noise and an up-sampling step shown in Section 5.3 to fill gaps/holes. The chapter closes with experimental results in Section 5.4 and conclusion in Section 5.5. 5.1 Preparation This section presents preparation work prior to the smoothing and re-sampling steps. Section 5.1.1 shows how to estimate a good initial normal for each point to be used later as a normal discontinuous feature detection criterion. Section 5.1.2 labels out points lie on position discontinuous boundaries to be treated separately from other points. 
We first introduce the definition of the point cloud, the data structure used for storage and searching, and the definition of neighborhoods. The aerial LiDAR point cloud taken as input is a set of 3D points P = fp 1 ; p 2 ;:::; p N g; p i 2 R 3 , where p i is a position and the corresponding normal is denoted as n p i . The average point spacing is s. Due to the 2.5D characteristic of the data [9], positions of P cover a much larger span along x and y axis than that along z axis. We use 63 a k-d tree [107] partitioned on the x-y plane to store P for its simplicity in implementa- tion and good performance in neighbor searching. Two local neighborhood definitions are supported. k-nearest neighbors N k (p), which are k neighbors nearest to a point p2 P, regard- less of the distance. r-radius neighbors N r (p), which include all neighbors in the sphere of center p and radius r. 5.1.1 Normal Estimation Figure 5.1: Normal estimation comparison of isotropic and anisotropic neighborhoods, under different noise level. Wedges have various angles: 30 (left two columns), 90 (middle two columns) and 150 (right two columns). Points with errors larger than 10 are colored in red, colors of the rest points are interpolated between green (good) and blue (bad). The values of sensitive parameter for clustering are given for anisotropic neighborhood estimation. In all cases, anisotropic neighborhood provides better normal estimation than isotropic neighborhood. Our refinement framework will smooth out noise in normals, so the initial estimation does not need to be accurate. However, as the framework relies on normal information 64 to preserve and enhance normal discontinuous features, it is beneficial to have a good initial guess of normals that is differentiable across features. The popular regression based normal estimation [51] is simple and can produce good results on smooth regions. The method fails to handle sharp features because the nor- mal is always evaluated in an isotropic neighborhood. Li et al. [57] observe that, for each point, in most cases an anistropic neighborhood can be determined from which the local tangent plane can be reliably approximated. To achieve the best trade-off between quality and performance for low density aerial LiDAR point data, we design a normal estimation algorithm that uses a constant isotropic neighborhood for points on smooth regions and an anisotropic neighborhood for points near features. Smoothness Test Given a point p and its k-nearest neighbors N k (p), the surface vari- ance at p as introduced in Pauly et al. [108] is: n(p)= l 0 l 0 +l 1 +l 2 ; where l i are the eigenvalues of the covariance matrix evaluated over p and N k (p) with l 0 l 1 l 2 . Ifn(p)n smooth , we consider p lies on a smooth area and take N k (p) as its neighborhood for normal estimation. The thresholdn smooth 2[0; 1 3 ] is user adjustable. Anisotropic Neighborhood Selection For a point p failing the smoothness test, we adapt the idea of Gauss map clustering [31] to find its best anisotropic neighborhood where all neighbors share the similar tangent plane with p. This is motivated by the fact that these neighbor points will cluster together on the Guass map. 65 Any two neighbors p i and p j , i< j;i; j2[0;k) form a triangle4 i j together with p. The normal vector of4 i j can be estimated as n i j = pp i pp j As n i j only determines the direction not the orientation, we choose from the pair fn i j ,n i j g the one closer to the positive z-axis as the normal of4 i j . 
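A minimal sketch of the two pieces just described, the surface-variance test of Pauly et al. [108] and the construction of the oriented triangle normals that feed the Gauss-map clustering (the set N of the next paragraph), is given below. It assumes the k-nearest neighbors of p have already been gathered; function and variable names are illustrative, not from the original implementation.

```python
import numpy as np
from itertools import combinations

def surface_variance(p, neighbors):
    """sigma(p) = l0 / (l0 + l1 + l2) over the covariance of p and its k neighbors."""
    pts = np.vstack([p, neighbors])
    lam = np.sort(np.linalg.eigvalsh(np.cov(pts.T)))   # l0 <= l1 <= l2
    return lam[0] / lam.sum()                          # compare against n_smooth in [0, 1/3]

def oriented_triangle_normals(p, neighbors):
    """Normals of the C(k,2) triangles (p, p_i, p_j), flipped toward the positive z-axis."""
    normals = []
    for pi, pj in combinations(neighbors, 2):
        n = np.cross(pi - p, pj - p)
        length = np.linalg.norm(n)
        if length < 1e-12:                             # near-collinear pair; harmless to skip
            continue
        n /= length
        normals.append(n if n[2] >= 0 else -n)         # representative closer to +z
    return np.asarray(normals)
```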
We use N to denote the set of normals of all possible C k;2 triangles. Same as Boulch et al. [58], it is unnecessary to check collinearity as the random normal generated by near collinear points does not affect much the final clustering. The algorithm of Weber et al. [31] maps N to the unit sphere centered at p via a projection operation. We skip this projection and analyze directly the original N which naturally maps to the unit sphere centered at the origin, since the clustering behavior is irrelevant to the sphere location. Note that only the half sphere with positive z is covered by N. Similar to Weber et al. [31], we use the hierarchical agglomerative clustering method [109] to cluster normals in N based on geodesic distance on the unit Gauss sphere. The clustering algorithm stops when the distance between two clusters exceeds a thresholdh2[0; p 2 ]. We now analyzeV clusters containing more than c min members: IfV = 0 orV > 4, p is considered as a noise, we take N k (p) as its neighborhood. The number of four is from the assumption that most data sets generally don’t have more than four sharp features meeting in one point [31]. If 1V 4, p is considered to be near or on a sharp feature. All points of a cluster (a cluster member n i j is associated with two neighbor points p i and p j ) form an anisotropic neighborhood candidate for p. We can fit a least square plane 66 to p and points of a cluster. Based on the distance from p to the fitted plane, the cluster(s) closest to p are identified. – If only one such cluster, neighbor points of this cluster form the best anisotropic neighborhood of p, as depicted in Figure 5.2(a). – If two or more clusters, p can be categorized to any of these clusters, i.e., p lies exactly on a sharp feature. In this case (Figure 5.2(b)), the normal of p is ambiguous. To avoid abrupt normal changes along a feature edge, we don’t choose any one of the clusters but use N k (p) instead. For the value of k, 8 and 32 are considered as the lower and upper bounds to deliver acceptable results [31], we use 16 for the experiments. h is the sensitivity parameter for Gauss map clustering, the smaller the value, the more points including noise will be detected as near sharp features. A relatively low value (0:1 - 0:4) works for our data. The value of c min is locally adjusted as half of the members of the largest cluster of p, which is more effective than a global setting. (a) (b) Figure 5.2: Anisotropic neighborhood selection for point p (blue) sits (a) near the edge, and (b) on the edge, using 16-nearest neighbors: selected (red) and unselected (black). Normal Estimation We apply PCA over point p and its neighbors selected as above, and take the eigenvector corresponds to the least eigenvalue of the covariance matrix as 67 the normal at p [51]. We determine the orientation as closer to the positive z-axis since a roof point p is always pointing towards the sky. To test the effectiveness of our anisotropic neighborhood, we compare the normal difference between the estimation and the ground truth on a set of synthetic point clouds. The data is sampled from wedges with various angles: 30 , 90 , 150 . The original point clouds are uniformly sampled with spacing 0:4 in a 10 10 bounding box. The noisy point clouds are generated by perturbing each point a random vector with size 20% and 50% of the point spacing. We visualize the data with Root Mean Square with threshold (RMS t ) errors as in Boulch et al. 
[58]: RMS t = s 1 jPj å p2P v 2 p where v p = \ n p;re f n p;est if \ n p;re f n p;est <t, and v p = p 2 otherwise. n p;re f is the reference ground truth normal at p and n p;est is the estimated normal at p. Figure 5.1 compares RMS 10 of normals estimated over isotropic (k= 24) and our anisotropic neighborhood. Normals of points near edges are obviously better with anisotropic neighborhood estimation for either acute or obtuse angles, under moderate noise. 5.1.2 Boundary Detection The input point set P represents roof blocks of a single building, so roof boundary points are position discontinuous features. These features can be differentiated from interior points based on local environment analysis, as neighbors of a boundary point cover only part of the environment but neighbors of an interior point are distributed all around. For a point p2 P, its r-radius neighborhood N r (p) is analyzed similar to Linsen et al. [110] to decide whether it is a boundary point or not. To find a pertinent one-ring 68 neighborhood, we set r = bs;b2[1;2] as in Guennebaud et al. [33]. Such a neigh- borhood selection may include points from nearby roof blocks different from that of p. By removing neighbors from N r (p) whose z-coordinate difference with p is larger than h, neighbors are limited to those on the same roof block. Neighbor points p i are then projected onto the tangent plane of p and sorted such that their projections q i form increasing counterclockwise angles with the positive x-axis. Finally, we check the angle difference between any two immediate ordered neighbors to see if it exceeds a limitq c , e.g. p 2 . Once this criterion is fulfilled, p is classified as a boundary point. Figure 5.3 shows that boundary points of each roof block are correctly labeled, with parameters b = 2,q c = p 2 and h= 2s. Figure 5.3: Boundary points (red) detection. We use B to denote the set of all boundary points of the input data and I for all interior points, where I = P B. Three additional local neighborhoods for boundary points are defined as below. k-nearest same-roof boundary neighbors N B s k (p), which are k nearest boundary points that share the same roof block with p. r-radius boundary neighbors N B r (p) collect boundary points whose projection on x-y plane are within the r-radius circle at p’s projection. Using 2D distance is to cover boundary points on different roof blocks but are relevant to the regulariza- tion of p. r-radius same-roof boundary neighbors N B s r (p) collect r-radius boundary neigh- bors that share the same roof block with p. We have N B s r (p) N B r (p). 69 5.2 Smoothing (a) Input points. (b) Normal estima- tion. (c) Normal smooth- ing. (d) Position smooth- ing. (e) Boundary direc- tion estimation. (f) Boundary direc- tion smoothing. (g) Boundary posi- tion smoothing. (h) Output points. Figure 5.4: The smoothing process on a synthetic building with two roof blocks. One edge of the wedge sits right above one edge of the rectangular roof. The data is corrupted with Gaussian noise of zero-mean and standard deviation of 1% of the diagonal of the bounding box. (b)-(d): cross-sections and close-up views of the squared area in (a). (e)-(f): close-up views of boundary points of the rectangular area in (a). The smoothing step is a two-stage bilateral filtering performed on all points (Sec- tion 5.2.1) and then on boundary points only (Section 5.2.2). The first stage filters normals or boundary directions. 
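The boundary test of Section 5.1.2 can be sketched as follows. It assumes the r-radius neighbors of p are already gathered; the tangent-basis helper is a convenience added for illustration and is not part of the original implementation.

```python
import numpy as np

def tangent_basis(n):
    """Two orthonormal vectors spanning the tangent plane with normal n."""
    a = np.array([1.0, 0.0, 0.0]) if abs(n[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    u = np.cross(n, a); u /= np.linalg.norm(u)
    return u, np.cross(n, u)

def is_boundary_point(p, n_p, neighbors, h, theta_c=np.pi / 2):
    """True if the angular gap of same-roof neighbors around p exceeds theta_c."""
    same_roof = [q for q in neighbors if abs(q[2] - p[2]) <= h]   # drop other roof blocks
    if len(same_roof) < 2:
        return True
    u, v = tangent_basis(n_p)
    angles = np.sort([np.arctan2(np.dot(q - p, v), np.dot(q - p, u)) for q in same_roof])
    gaps = np.diff(np.append(angles, angles[0] + 2 * np.pi))      # include the wrap-around gap
    return gaps.max() > theta_c
```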
The second stage displaces point positions under the guidance of smoothed normals or boundary directions. Bilateral filtering is selected because it is a simple, efficient, non-iterative, and fea- ture preserving robust estimator. The estimate for a point is the weighted average of predictions from neighboring points. The weight of a neighbor depends not only on the spatial distance (spatial weight f ), but also on the influence difference (influence weight g) that penalizes values across features. 70 For a point p and one of its neighbors p i , the spatial weight f always takes the Euclidean distancekp p i k. The influence weight g has more options. Jones et al. [66] uses the distance between the prediction and the original position: kP p i (p) pk (5.1) where the prediction is simply the projection of p onto the plane through p i with normal n p i :P p i (p)= p+(p i p) n p i n p i . Fleishman et al. [70] instead uses: < n p ; p i p> (5.2) to give less weights to neighbors that do not lie on the plane through p with normal n p . Duguet et al. [6] uses normal difference: 1< n p ;n p i > (5.3) Given a point p on edge e 1 and some of its neighbors on another edge e 2 , Fig- ure 5.5(a) shows the case when Equation (5.1) gives all neighbors on e 2 a large g, and Figure 5.5(b) shows the case when Equation (5.2) gives neighbors on e 2 that are close to p a large g. But Equation (5.3) can handle both cases by assigning a much smaller g to all points on e 2 , as normal difference is the best measurement to distinguish points on different surfaces of a feature. Experiments also show that Equation (5.3) works best for both normal and position filtering, so it is taken for the influence weight g. The bilateral filter of Duguet et al. [6] applies an additional quality coefficient to weigh the sum of least square errors of a fitted second-order jet surface. Experiments with a similar quality coefficient show negligible improvement for our data, so this coefficient is not included. 71 (a) (b) Figure 5.5: Special cases for different influence weight g. Arrows for normals; dotted lines for vectors connecting two points; blue point is p. Red points will get large g if evaluated with (a) Equation (5.1). (b) Equation (5.2). The same filtering equation is used for normal, position, boundary direction as well as boundary position, the quantity to be smoothed is denoted as X: X 0 (p)= 1 k(p) å p i 2N r (p) f(kp i pk)g(1< n p ;n p i >)X p i (p) (5.4) where k is the sum of weights as the normalizing factor k(p)= å p i 2N r (p) f(kp i pk)g(1< n p ;n p i >) The spatial and influence weight functions are Gaussians: f(kp i pk)= e kp i pk s f 2 ;g(1< n p ;n p i >)= e 1n T p n p i 1cos(s g ) 2 The spatial weight f is to remove noise by including a large number of neighbors in the estimate, we use a constant s f = r 2 as Jones et al. [66]. The influence weight g is to handle outliers by decreasing the weight of neighbors with large normal differences, we use a constant angle parameters g = 15 as Huang et al. [7]. 72 5.2.1 Smoothing for all points Normal Smoothing For each point p2 P, we run the bilateral filter over its neighbor- hood N r (p), where the normal prediction from p i 2 N r (p) is the normal of p i , i.e., n p i . Figure 5.4(c) shows the smoothed normals. Position Smoothing Now, we run the bilateral filter to smooth positions of each point p2 P, where the prediction from a neighbor point p i 2 N r (p) is the projection of p onto the plane through p i with normal n p i . 
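A minimal sketch of one evaluation of Equation (5.4), here applied to normals (the prediction from a neighbor is simply its own normal, as in the next section), is shown below. The neighbor arrays and parameter names are assumptions for illustration.

```python
import numpy as np

def bilateral_filter_normal(p, n_p, nbr_pts, nbr_normals, r, sigma_g_deg=15.0):
    """One evaluation of Equation (5.4) with Gaussian f and g; sigma_f = r/2."""
    sigma_f = r / 2.0
    denom_g = 1.0 - np.cos(np.radians(sigma_g_deg))
    acc = np.zeros(3)
    k = 0.0
    for p_i, n_i in zip(nbr_pts, nbr_normals):
        f = np.exp(-(np.linalg.norm(p_i - p) / sigma_f) ** 2)
        g = np.exp(-((1.0 - np.dot(n_p, n_i)) / denom_g) ** 2)
        acc += f * g * n_i                    # prediction X_{p_i}(p) = n_{p_i}
        k += f * g
    n_new = acc / k
    return n_new / np.linalg.norm(n_new)      # smoothed, re-normalized normal
```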
Instead of displacing p to the new position p 0 , we constrain the point to move along its normal n p , i.e., move to the projection of p 0 on the line through p with direction n p . This preserves the original sampling density. Figure 5.4(d) shows the smoothed positions. 5.2.2 Smoothing for boundary points Boundary Direction Estimation To estimate boundary direction of a point b2 B, similar to Daniels et al. [78], we take the direction of the underlying line of b, which is the eigenvector corresponding to the largest eigenvalue of the covariance matrix formed by b and its neighbors. The boundary direction of b is denoted as d b . Boundary Direction Smoothing Equation (5.4) should only consider boundary points since interior points do not affect boundary alignment. The 2.5D characteris- tic of building structures [9] infers that neighboring roof blocks are connected by the same vertical walls. Therefore, boundary points of different roof blocks should also influence each other when projected on the x-y plane. We break the smoothing of boundary direction into two parts: on the x-y plane and along z-axis. For the smoothing on x-y plane, all boundary neighbors in N B r (b) are evalu- ated. In Equation (5.4), the prediction from a neighbor point b i is d b i projected on the x-y 73 plane; f weighs the Euclidean distance between b and b i projected on the x-y plane; and the normal difference in g is adapted to the difference of boundary directions projected on the x-y plane. The smoothing along z-axis only considers boundary neighbors from the same roof block (N B s r (b)). Similarly, Equation (5.4) is updated with projections on z-axis for the prediction and f . But g uses the difference of 3D boundary directions, as the projected difference on z-axis is often too small. Figure 5.4(f) shows the smoothed boundary directions. Boundary Position Smoothing Similar to the smoothing of boundary direction, boundary position is also smoothed in two parts. The complete prediction from a neigh- bor point b i is the projection of b onto the line through b i with direction d b i . It is split into two projections on the x-y plane and along z-axis for Equation (5.4). The displacement of b is restricted to move on the tangent plane defined by b and n b , and along the direction perpendicular to d b . The constraint preserves results of previous smoothing passes and the original sampling density along boundary direction. The underlying continuous surface is formed by both boundary and interior points, it will be problematic if the displacement of boundary points is independent of that of interior points. Given a boundary point b to be moved to a new position b 0 , the displacement vector is v= b 0 b. Interior points in the neighborhood N kvk (b) (dotted circle in Figure 5.6 (a)) may be left outside the boundary after re-positioning b, e.g., the blue point in Figure 5.6 (b). To avoid the problem, in addition to displacing b to b 0 , we also displace every interior neighbor that is in the way of the displacement of b. For any interior neighbor b i 2 N kvk (b)^b i 2 I, if it is in the way determined by v T (b i b)> 0, it should be displaced to a new position b 0 i = b i +v, as seen in Figure 5.6 (c). Figure 5.4(g) shows the smoothed boundary positions. 74 (a) (b) (c) Figure 5.6: The problem caused by displacing boundary points. (a) Boundary points (red) will be moved to the dotted line. The dotted circle is centered at the red point with radiuskvk. (b) After displacing boundary points only. 
The interior point (blue) is left outside the boundary. (c) After displacing boundary and affected interior points. The problem of (b) is avoided. 5.3 Up-sampling The up-sampling step treats interior and boundary points separately. The insertion of new interior points aims to fill gaps on the latent surface, while the insertion of new boundary points is to fill gaps along the underlying boundary lines. For both cases, we first use a local detector inspired by boundary detection in Section 5.1 to locate gaps, then apply an edge-aware bilateral projector [7] to insert new points. Finally, the up- sampling step is extended to enhance features. 5.3.1 Interior point up-sampling The boundary detection in Section 5.1 is based on the neighborhood environment anal- ysis of point p. The angle between two immediate ordered neighbors define the spatial clearance within the r-radius sphere centered at p. Under-sampling is most likely to happen in the area formed by two neighbors that have the maximum angle difference, as depicted in Figure 5.8(a). This idea is used to design a simple and effective local gap detector. For each interior point p2 I, given a global radius g, we project all neighbors p i 2 N g (p) onto its tangent plane centered at p. Note that p i can be either interior or boundary. 75 (a) Smoothed points. (b) Interior up- sampling. (c) Boundary up-sampling. (d) Interior enhanc- ing. (e) Boundary enhancing. (f) Output points. Figure 5.7: The up-sampling process on the data of Figure 5.4(h). (a)-(e) show close-up views of the square in Figure 5.4(a). All neighbors are ordered by their counterclockwise angle formed with the positive x- axis. A gap is found if the maximum angle difference of any two immediate neighbors exceed a threshold q c . A new point is inserted to fill the gap as detailed below. This process continuous till the maximum angle difference of p is smaller thanq c . Given an under-sampled region formed by p and two neighbors p i and p j , we now find another point q to interpolate the new sample. For each neighbor point p k lies in between the spheres with radiusg andg max , i.e., p k 2 N g max (p)^ p k 62 N g (p), ordered by increasing Euclidean distance to p, we project p k onto the tangent plane through p with normal n p , and label it as q if it fulfills the following conditions: p k is ordered in between of p i and p j by the angle criterion. p k is not too close to p i and p j , i.e., the angle formed by\p i pp k and\p k pp j are both larger thanq min . 76 To find neighbors covering the entire environment of an interior point, we useq c = p 2 andg max =bs as in Section 5.1, andq min = p 8 as in Guennebaud et al. [33]. Figure 5.8(b) shows the end point q found by the above criteria. Now, the task is to insert a new point p new with normal n new based on end points p and q. The edge-aware bilateral projector [7] is applied to constrain the projection direction to be along n new , i.e., p new = b new + d new n new (5.5) , where b new is the base point and d new the distance along n new . (a) (b) (c) Figure 5.8: 2D view of the up-sampling process. (a) Point p (blue) and its g-radius neighbors (black). p i and p j form the under-sampled area. (b) Interpolation end point q (red) and the base point b new (green). Inner circle has g-radius and outer circle has g max -radius. (c) Two candidate positions (crosses) to insert a new point based on b new . 
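The local gap detector described above reuses the same angular-gap idea as the boundary test of Section 5.1. A sketch, assuming the gamma-radius neighbors of p are already gathered (tangent_basis as in the earlier boundary-detection sketch), follows.

```python
import numpy as np

def largest_gap(p, n_p, neighbors, theta_c=np.pi / 2):
    """Return the neighbor pair (p_i, p_j) bounding the largest angular gap, or None."""
    u, v = tangent_basis(n_p)                       # from the boundary-detection sketch
    angles = np.array([np.arctan2(np.dot(q - p, v), np.dot(q - p, u)) for q in neighbors])
    order = np.argsort(angles)
    sorted_angles = angles[order]
    gaps = np.diff(np.append(sorted_angles, sorted_angles[0] + 2 * np.pi))
    k = int(np.argmax(gaps))
    if gaps[k] <= theta_c:
        return None                                 # no under-sampled wedge around p
    i, j = order[k], order[(k + 1) % len(order)]
    return neighbors[i], neighbors[j]               # wedge to be filled by a new point
```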
For regular sampling in the local region defined by the sphere at p with radiusg, the base location needs to be within that sphere, as b new = p+ minf kp qk 2 ;gg(q p) Figure 5.8(b) shows the selected base point b new . Similar to Huang et al. [7], the projec- tion distance under a given normal n is obtained by d new = å p i 2N g max (b new ) (n T (b new p i )) f(kb new p i k)g(1< n;n p i >) å p i 2N g max (b new ) f(kb new p i k)g(1< n;n p i >) 77 Constrained to the set of n p , n q , the normal n new is selected as the one with the minimum d new evaluated above. Now, a new point can be inserted with Equation (6.1). As illustrated in Figure 5.8(c), b new is inside the g-sphere at p, if n new = n p , p new will be within the sphere. Thus, the gap between p i and p j is filled, and the gap detector can proceed without stalling. However, if n new = n q , p new will be pushed outside the sphere, leaving the gap detector in an infinite loop. As a remedy, n new is constrained to be n p . In case n q wins the d new evaluation above, we skip this end point q and continue the search in neighbors. It won’t be problematic if no new points are inserted for the gap formed by p, p i and p j , as this gap will be detected and filled when processing point q. We simply skip this gap for now and continue to the second largest gap till all gaps of p are processed. (a) Original (b) g = 2:0 (c) g = 1:3 (d) g = 1:0 Figure 5.9: Interior point up-sampling for a 2D surface with both inner and outer bound- aries. The original sampling is non-uniform. Inserted points are in red. The new point p new is immediately added to the storage k-d tree as an interior point. Therefore, a single pass of all interior points can fill all gaps in a neighborhood of g radius. Compared to the up-sampling of Huang et al. [7] which relies on a global priority queue to manage the insertion order, our method is more memory efficient by locating under-sampled areas via a simpler local detector. As seen in Figure 5.9, our approach is effective to deliver global uniform sampling, which is easily adjustable by the radiusg. It also shows that the approach works for input data with non-uniform sampling while preserving both inner and outer boundaries. Figure 5.7(b) shows the result of interior up-sampling. 78 5.3.2 Boundary point up-sampling The up-sampling of boundary points is similar to that of interior points. One difference is to only consider boundary points in a neighborhood. The other difference is to use boundary directions instead of normals when evaluating d new and n new . Figure 5.7(c) shows the result of boundary up-sampling. 5.3.3 Feature enhancing extension By inserting more points near a feature, the feature is enhanced. Normal discontinuous features are interior points and position discontinuous features are boundary points. We enhance two types of features by separately up-sampling interior and boundary points near discontinuities. Given an interior point p and its N g (p), the neighbors share similar normal with p form a new neighborhood N sn g (p), where for any p i 2 N sn g (p), we have 1 n T p n p i < 1 cos(s g ). The points in N sn g (p) are all from the same smooth surface as p. With this neighborhood, an under-sampled point found by the local gap detector is a point closest to a feature in a local region. g is set as the final radius of the regular up-sampling step, in order to skip interior points that are far away from features. The set of near feature points is denoted as F. 
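Both the regular up-sampling above and the feature-enhancing pass below insert points with the same edge-aware projector of Equation (5.5). A rough sketch is given here; the sign convention follows the text's definition of d_new, the neighbor lists of the base point are assumed to be gathered beforehand, and sigma_f and cos_sg are the Gaussian-weight parameters of Section 5.2.

```python
import numpy as np

def insert_point(p, q, n_p, n_q, nbr_pts, nbr_normals, gamma, sigma_f, cos_sg):
    """Edge-aware insertion between end points p and q; returns (p_new, n_new) or None."""
    d_pq = q - p
    step = min(np.linalg.norm(d_pq) / 2.0, gamma)
    b_new = p + step * d_pq / np.linalg.norm(d_pq)   # base point kept inside the gamma-sphere

    def proj_dist(n):
        num = den = 0.0
        for p_i, n_i in zip(nbr_pts, nbr_normals):   # gamma_max-neighbors of b_new
            f = np.exp(-(np.linalg.norm(b_new - p_i) / sigma_f) ** 2)
            g = np.exp(-((1.0 - np.dot(n, n_i)) / (1.0 - cos_sg)) ** 2)
            num += np.dot(n, b_new - p_i) * f * g
            den += f * g
        return num / den

    d_p, d_q = proj_dist(n_p), proj_dist(n_q)
    if abs(d_q) < abs(d_p):
        return None                  # n_q would win the evaluation: skip this end point
    return b_new + d_p * n_p, n_p    # p_new = b_new + d_new * n_new with n_new = n_p
```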
For each point f2 F, we insert new points to fill gaps with a user defined radius g e (g e <g) as detailed in Section 5.3.1. Newly inserted points are also added to F. Note that the neighborhood of f is still formed by points in P as before. The resulting point cloud has uniform sampling of radius g in flat area and uniform sampling of radius g e near features. As g e <g, we see more densely sampled features, i.e., features are enhanced. Figure 5.7(d) shows the result with interior features enhanced. 79 The feature enhancing up-sampling for boundary points is easily achieved by replac- ing normals with boundary directions. Figure 5.7(e) shows the result with boundary features enhanced. Finally, the rendering of feature enhanced points is shown in Figure 5.7(f), which has greatly improved visual appearance than the rendering of raw points in Figure 5.4(a). 5.4 Experimental Results Robustness. Results of the smoothing step are used to show the robustness to handle noise. The kernel of the smoothing step is the well-studied bilateral filter, one is referred to previous studies [6, 66, 70] for its robustness on smoothing interior surfaces. For the smoothing of noisy boundaries, we use a synthetic square with a circular hole where both straight and curved boundaries are present. Gaussian noises are added by perturbing each point a random vector with size 50% and 100% of the point spacing. Figure 5.10 visually shows that the smoothing quality does not degrade much as the noise increases. Table 5.1 measures average distance and standard deviation from smoothed interior and boundary points to the ground truth, to quantitatively verify the robustness. Figure 5.9 shows that the up-sampling step is also robust to handle non- homogeneous point density. Table 5.1: Errors for Figure 5.10. MD: mean distance; SD: standard deviation. Data set Interior-MD Interior-SD Boundary-MD Boundary-SD (a) 0.028482 0.018486 0.051923 0.027182 (b) 0.003169 0.003243 0.025885 0.015879 (d) 0.036123 0.026051 0.060937 0.031821 (e) 0.004458 0.004472 0.030655 0.019828 80 (a) 50% noise, original (b) 50% noise, smoothed (c) 100% noise, origi- nal (d) 100% noise, smoothed Figure 5.10: Smoothing results over different noise levels. The distance between an actual point and its ground truth is color coded as the rightmost bar. Degradation of smoothing quality of both interior surface and boundaries is negligible. Stability. The test data sets are sampled from a synthetic wedge with vastly differ- ent densities, and each is corrupted with 1% Gaussian noise as that of Figure 5.4(a). Figure 5.11 compares the rendering results before and after the refinement. From left to right, wedges have decreasing number of samples: 100%, 20% and 5% proportion to the samples of the leftmost wedge (15;625 samples). Same parameters are applied, so the resulting refined wedges have approximately the same sampling density. Our approach is stable as most normal and position discontinuous features are sharply recovered for all data sets. Table 5.2 quantitatively demonstrates the stability by measuring distances from points to the ground truth surface. (a) 100%, orig- inal (b) 100%, refined (c) 20%, origi- nal (d) 20%, refined (e) 5%, origi- nal (f) 5%, refined Figure 5.11: Stable refinement over various input sampling density. Point clouds of real buildings with various density are also tested. Buildings of Figure 1.5, 5.3, 5.12(a)-(c) and 5.16 are extracted from the aerial LiDAR point cloud of 81 Table 5.2: Errors for Figure 5.11. 
MD: mean distance; SD: standard deviation. Data set Original-MD Refined-MD Original-SD Refined-SD 100% 0.013280 0.002699 0.010053 0.003021 20% 0.013131 0.003787 0.013131 0.005466 5% 0.012857 0.011546 0.009670 0.016437 Atalanta with 25 points/m 2 density, whereas buildings of Figure 5.12(d), 5.14 and 5.15 are from USC with 5:8 points/m 2 density. Versatility. Rendering results of Figure 1.5 and 5.12 show that our refinement is effec- tive to buildings with various roof shapes and complexity by providing smooth planar areas, filled gaps, and sharpened both normal and position discontinuous features. (a) Original (b) Original (c) Original (d) Original (e) Refined (f) Refined (g) Refined (h) Refined Figure 5.12: Versatility on real data with different shapes and complexity. Colored using a normal map. Extensibility. The presented approach is designed to refine roof points of a single building, however, it is easily extensible to other objects with similar surface properties, e.g. the ground, as seen in Figure 5.15. 82 Because of the local property of the approach, it can be extended to large-scale data that needs streaming. Figure 5.15 shows part of USC to demonstrate that several buildings as well as the ground can be refined in a single pass with a single set of parameters. (a) 100% noise, original (b) After RIMLS smoothing (c) After our smoothing Figure 5.13: Smoothing results for the data set in Figure 5.10(c). Our approach has less errors on boundaries. Comparison with previous work. The presented approach is superior to previous point-based refinements in the sense that position discontinuous features are explicitly regularized. Figure 5.13 visually compares smoothed results of RIMLS [111] and the presented smoothing step, our approach has obviously less errors on boundaries. A recent surface reconstruction [112] and geometry modeling [80] can regularize position discontinuous features as intersections of fitted planes. They have shown to be effective for straight boundaries. But because of the planar assumption on inter- secting surfaces, curved boundaries are not well handled, as seen in Figure 5.14(c). Without any geometric assumption, our approach provides smoother boundary as seen in Figure 5.14(d) which is also more accurate as compared to the ground truth image in Figure 5.14(a). 83 (a) Ground truth image (b) Original points (c) Geometry models (d) Refined points Figure 5.14: Comparison of geometry modeling and visually-complete rendering of points refined by our approach. Our approach provides better looking and more accurate curved boundaries. Performance. The testing computer is equipped with dual-core Intel c Core(TM) i3 3:20GHz CPU, 6GB RAM, Seagate c 7200RPM hard disk and Windows 7 c . The pro- gram uses a single core without any optimization. Table 5.3 lists the processing time for each step. The refinement is fast in the sense that every step takes only a few seconds for all test data. Table 5.3: Processing time (in seconds) of each refinement step. IP-N: number of input points; SM-T: time for smoothing; US-T: time for up-sampling; US-N: number of points after up-sampling; EU-T: time for feature enhancement up-sampling; OP-N: number of output points. 
Data set IP-N SM-T US-T US-N EU-T OP-N Figure 1.5 (b) 5,594 0.57 1.37 11,597 2.36 16,590 Figure 5.7 (f) 23,500 2.07 5.84 52,303 2.57 57,461 Figure 5.12 (e) 9,079 1.23 1.68 15,603 5.80 25,720 Figure 5.12 (f) 7,546 1.12 1.18 12,146 4.11 21,409 Figure 5.12 (g) 4,658 0.49 0.99 9,248 3.45 16,532 Figure 5.12 (h) 119,310 11.06 40.44 298,164 105.36 450,279 Figure 5.16 (b) 4,498 0.76 0.54 6,403 0.70 7,913 Applications. The refinement is a useful pre-process to deliver visual improvement for direct rendering of points as exemplified in Figure 5.12. 84 Visually-complete aerial LiDAR point cloud visualization [1] is a direct beneficiary of the refinement. As seen in Figure 5.15, walls look more visually appealing as the boundary roof points are effectively regularized. (a) Original (b) Refined Figure 5.15: Refinement for a larger data set with several buildings. To take advantages of the refinement, real-time visualization applications can pre- process the smoothing step and the regular up-sampling step withg = s to handle noise and under-sampling of the entire data; and then dynamically perform the feature enhanc- ing up-sampling step in a view dependent manner to add just enough details for distant objects. The refinement also improves visual quality for surface reconstruction of points, exemplified by APSS [113] and RIMLS [111] from MeshLab [114]. We use default parameter setting in the experiment; one can achieve better results by fine tuning of parameters. Figure 5.16 visually compares the results of these two surface reconstruc- tion methods performed on a point cloud before and after the refinement. It shows that surface reconstruction quality is greatly improved by refinement because of smoother planes and sharper features. 85 (a) Before: points (b) After: points (c) Before: APSS (d) After: APSS (e) Before: RIMLS (f) After: RIMLS Figure 5.16: Quality of surface reconstruction gets improved after the refinement on raw points. 5.5 Summary By extending recent developments in geometry refinement to accommodate unique properties of aerial LiDAR building points, this chapter presents a specialized two- step refinement framework where normal discontinuous features (boundary points) are explicitly processed. Normals and boundary directions are estimated over anisotropic neighborhoods selected by Gauss map clustering. The smoothing step applies a two- stage feature preserving bilateral filtering to smooth out noise in normals, boundary directions and positions. The up-sampling step employs a local gap detector and a fea- ture preserving bilateral projector to deliver uniform up-sampling. A simple extension allows feature enhancing by increasing samples in the vicinity of features. The pre- sented method is fast, simple to implement, and easily extensible. Experiments on both synthetic and real data show that our refinement method effectively removes noise, fills gaps, and for the first time, regularizes position discontinuous point boundaries. It is a beneficial pre-process for other applications such as rendering and surface reconstruc- tion. 86 Chapter 6 Real-time Refinement for Aerial LiDAR Cities This chapter presents a visualization framework for cities in the form of aerial LiDAR point clouds. To provide interactive rendering for large data sets, the framework com- bines level-of-detail (LOD) technique with hierarchical hybrid representations of both point and polygon of the scene. 
The supporting structure for LOD is a multi-resolution quadtree (MRQ) hierarchy that is built purely out of input points. Each MRQ node stores separately a continuous data set for ground and building points that are sampled from continuous surfaces, and a discrete data set for independent tree points. The continu- ous data is first augmented with vertical quadrilateral building walls that are missing in original points owing to the 2.5D nature of aerial LiDAR. The continuous data is then spatially partitioned into same size subsets, based on which RPQ (Rendering Primitive Quadtree) hierarchies are constructed as the hybrid point-polygon representation of the scene. Specifically, a polygon conversion operation replaces points of a subset forming a planar surface to a quadrilateral covering the same space, and a polygon simplifi- cation operation decimates wall quadrilaterals of a subset sharing the same plane to a single compact quadrilateral. Interactive hybrid visualization is retained by adapting a hardware-accelerated point based rendering with deferred shading. 87 Point cloud refinement is necessary as well as effective in improving display quality of visualizations. The presented multi-resolution hybrid rendering framework also sup- ports real-time refinement for aerial LiDAR point clouds of cities. By integrating clas- sification information, the refinement is performed only on ground and building points that are sampled from continuous surfaces. For a visible under-sampled region, by ref- erencing related hierarchical hybrid structures, the refinement process interpolates new points only to point-represented partitions via a feature-preserving bilateral projector. Without complex computation and without user interference, the presented refinement is dynamically adaptive to views and curvatures. The structure of this chapter is as follows. Section 6.1 and 6.2 show details on how to construct data structures supported for the multi-resolution hybrid rendering. The rendering routine is explained in Section 6.3. Section 6.4 presents the dynamic point cloud refinement that is performed in real-time. The chapter closes with experimental results in Section 6.5 and conclusion in Section 6.6. 6.1 Multi-resolution Hierarchy LOD rendering requires a supporting multi-resolution data structure that indexes the space of the scene. This work uses MRQ, a quadtree hierarchy, since its construction is inexpensive and its domain is effectively 2.5D, which is a good match for aerial LiDAR. The construction of the MRQ begins at the root node, which covers the whole area of the scene. The bounding box of the data contained in the root node is equally divided along the x and y axes into four sub-areas indexed by four children of the root node. This process continues until a predetermined maximum number of points per leaf node is reached. Each MRQ node stores the bounding box of the indexed space, and pointers 88 (a) (b) Figure 6.1: A two-level MRQ. (a) Input point data marked with bounding boxes (in red or blue). Spaces between bounding boxes are left intentionally for clarity. (b) MRQ of (a). Node colors correspond to colors of bounding boxes. L0 has 1 4 resolution of L1. to index point data within its bounding box. In addition, it also stores pointers to its parent and child nodes for convenient upward and downward traversal. 
Based on physical properties, point data of an MRQ node is split into two sets: one continuous data set for ground and building points that are sampled from continuous surfaces; and one discrete data set for tree points that are independent and unstructured. All points are distributed in such a way that leaf nodes represent the highest resolu- tion of the scene, and each non-leaf node has 1 4 resolution of its children, as seen in Figure 6.1. Consequently, we get a refinement when traversing downward, and a sim- plification when traversing upward. 6.1.1 Continuous Data Set Because of the 2.5D characteristic, aerial LiDAR points can be represented as an eleva- tion map without loss of information. And because the underlying surfaces of ground and building points are piece-wise continuous, interpolation-based gridding of the eleva- tion map can preserve geometry details as long as the resolution of grids is no less than the sampling density of original point data. Compared to data structures built directly 89 upon original 3D points, the 2D gridded elevation map (GEM) provides more efficient storage and neighborhood selection allowing substantial reduction of the computation complexity [115]. Therefore, continuous data sets of ground and building points are stored in GEMs. Constructing GEMs for MRQ Leaf Nodes Given an MRQ leaf node, by projecting continuous data within its bounding box onto x-y plane, an elevation map is created. Through a gridding operation, we first convert the elevation map to a GEM. A filtering operation applied to the GEM then removes possible gaps and identifies building bound- ary points. Finally, vertical quadrilaterals are augmented to building boundary points to complete missing building walls. Gridding Points of the elevation map are quantized to squares of uniform width w g , such that x, y positions are indexed by integers, and each (x, y) pair relates to one sample point. The quantized elevation map is a GEM; and each sample point of the GEM is referred to as a grid. To ensure the number of grids in GEM is approximately the same as the number of points in original data,w g is set to the average point spacing of the entire scene. This is reasonable as aerial LiDAR points are sampled approximately uniformly. Each grid g indexed by (x g , y g ) contains as its data an elevation value, a normal vec- tor, and a flag f c indicating its classification as ground or building. g’s data is determined by neighbors nearest to (x g , y g ) on the corresponding elevation map. We start with k= 8 nearest neighbors as Ochotta et al. [116], and then select a subset of k 0 (k 0 k) neigh- bors whose elevations are withinz of the largest elevation of all k neighbors, in order to ensure that the subset of neighbors come from the same surface. Evaluated from these k 0 neighbors, g’s elevation takes the weighted average, its normal estimated by PCA, and its classification flag set as the same. z is the minimum elevation difference between 90 two roof layers as detailed in Gao et al. [1], which is proved to be effective to separate ground and different roof layers of a building. Figure 6.2(b) shows a GEM resulted from gridding. (a) (b) (c) (d) Figure 6.2: Gridding and filtering results on (a) The elevation map of a test data. (b) The GEM after gridding. (c) After applying a box filter on (b), empty grids are filled by new red points. (d) After applying a Laplace filter on (c), building boundary grids are marked in light blue. 
Filtering Due to quantization errors and under-sampling of original data, a grid on GEM may be empty, leading to a gap in visualization. A box filter [117] is applied to effectively fill these empty grids, as shown in Figure 6.2(c). Following the wall completion approach of Gao et al. [1], building boundary points need to be distinguished. Similar to edge detection of an image [117], this work applies a Laplace filter to elevations of GEM grids, which results in obviously larger elevations for building boundary grids than those for non-boundary grids. Thus all building bound- ary grids can be identified by thresholding with the same z for gridding, as shown in Figure 6.2(d). Wall Augmenting To complete missing building walls, a vertical quadrilateral is augmented offline for every building boundary grid to connect it to the ground, adapted from Gao et al. [1]. 91 For clarity, the following sections use GEM p (Gridded Elevation Map for Points) for the subset of GEM including all point data; GEM w (Gridded Elevation Map for Walls) for the subset containing only building boundary points and their wall quadrilaterals; and GEM for the complete structure including both point data and augmented quadrilateral walls. Constructing GEMs for MRQ Non-Leaf Nodes GEM of an MRQ non-leaf node is built bottom-up by regularly sub-sampling 1 4 of grids from its four child nodes. Specifi- cally, the GEM of a non-leaf node is composed of grids on odd rows and even columns (or grids on even rows and odd columns) from GEMs of its child nodes. It seems that such a scheme will lead to bad aliasing, but because the LOD system only displays higher level nodes when they can provide adequate details for current frame, no notice- able aliasing is seen in practice. 6.1.2 Discrete Data Set Since tree points look natural when rendered as independent separate splats, for simplic- ity, this work applies a user-adjustable fixed splat size for tree points of any resolution in the MRQ. As a result, each tree point only needs to store one 3D position, and one normal vector. An MRQ leaf node builds its discrete data set by including all tree points within its bounding box, as the highest resolution. An MRQ non-leaf node builds its discrete data set bottom-up by sub-sampling 1 4 of discrete data sets from all its four child nodes, i.e., randomly selecting one point for every four tree points from its child nodes. 92 6.2 Hybrid Representation Hybrid representations are only applicable to continuous data sets that are stored in GEMs of each MRQ node. Corresponding to two types of GEMs, there are two types of hybrid representations in the form of RPQs. A GEM p contains point data only, which is labeled as ground, building or building boundary. Sampled from continuous surfaces, if a subset of these points form a large flat region, the point subset can be converted to a few polygons without losing details. We construct an RPQ p (Rendering Primitive Quadtree for Points) hierarchy for the GEM p indicating whether to render an area as point or converted polygon. Observing that a group of quadrilaterals sharing the same plane can be simplified, we also compactly decimate wall quadrilaterals. Accordingly, an RPQ w (Rendering Primitive Quadtree for Walls) is constructed for the GEM w indicating whether to render an area as original or simplified quadrilaterals. RPQ is used to refer to the combination of RPQ p and RPQ w hierarchies. 
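A minimal sketch of the two GEM filters of Section 6.1.1 is given below, assuming the GEM is stored as a 2D elevation array with NaN marking empty grids; the kernel and threshold follow the description above, while function names are illustrative.

```python
import numpy as np
from scipy import ndimage

def fill_empty_grids(elev):
    """Box filter: fill each empty (NaN) grid with the mean of its valid 3x3 neighbors."""
    filled = elev.copy()
    for i, j in zip(*np.where(np.isnan(elev))):
        window = elev[max(i - 1, 0):i + 2, max(j - 1, 0):j + 2]
        valid = window[~np.isnan(window)]
        if valid.size:
            filled[i, j] = valid.mean()
    return filled

def mark_boundary_grids(elev, zeta):
    """Laplace filter on elevations; responses above zeta mark building boundary grids."""
    kernel = np.array([[0.0,  1.0, 0.0],
                       [1.0, -4.0, 1.0],
                       [0.0,  1.0, 0.0]])
    response = np.abs(ndimage.convolve(elev, kernel, mode='nearest'))
    return response > zeta
```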
Besides point data, the hybrid representation also supports two classes of polygons: quadrilater- als converted from planar subsets of points, and simplified wall quadrilaterals. The following sections first introduce how to perform polygon conversion (Sec- tion 6.2.1) and simplification (Section 6.2.2) for an MRQ leaf node, then show details of propagating the hybrid representation to MRQ’s non-leaf nodes (Section 6.2.3). 6.2.1 Polygon Conversion To convert planar subsets of points into polygons, existing approaches require iterative trials of plane-fitting. The number of planes and coverage of each plane are often nec- essary to get satisfying results. To keep the polygon conversion fast and simple, we forfeit the idea of locating all planes with their maximum coverage. Instead, all points 93 are spatially partitioned into subsets of the same size, and planes are only fit to these point subsets. (a) 139.1 fps (b) 215 fps (c) 481.3 fps (d) 128,535 pts, 4,367 quads (e) 35,524 pts, 4,953 quads Figure 6.3: Rendering effect of polygon conversion. (a) Rendered by Gao et al. [1] where ground and buildings are points. (b)-(e) Rendered by our framework with poly- gon conversion under different parameters: L= 10 5 ((b)(d)) and L= 10 3 ((c)(e)). Converted polygons in (d) and (e) are randomly colored. RPQ p is a quadtree hierarchy constructed similar to MRQ, but with a different ter- mination condition. Given an RPQ p node with w h grids, if either w or h has no more than 3 grids, this node is a leaf node and no further splitting is needed. The number 3 comes from the fact that it requires at least 2 2 grids to form a polygon, and splitting a dimension of 3 will result in a node smaller than the minimum requirement. An RPQ p node stores its bound on GEM p for indexing, and a flag f rp indicating its rendering primitive as either point or converted polygon. 94 The rendering primitive evaluation starts at leaf nodes of the RPQ p . When all grids of a leaf node are non-empty, PCA is applied over their 3D positions. Only when the smallest eigenvaluel 0 of the covariance matrix is small enough (i.e., l 0 L, whereL is a user-defined threshold that is close to 0), can points of this node be converted to a quadrilateral covering the same space whose normal is the eigenvector corresponding to l 0 . f rp is set to converted polygon for this case and set to point otherwise. The rendering primitive evaluation is then propagated bottom-up to higher levels of the RPQ p . When four child nodes of a non-leaf node all have f rp as converted polygon, PCA is applied over 3D positions of all grids from four children. A quadrilateral is fit to this node if l 0 L. In this case, f rp is set to converted polygon and four child nodes are removed from the RPQ p ; otherwise, f rp is set to point. Generic homogeneous covariance [118] allows efficient PCA computation for non-leaf nodes, which involves only simple additions rather than expensive outer productions. Figure 6.3 shows that the rendering with polygon conversion achieves better perfor- mance without obvious quality degradation, compared to the rendering without polygon conversion. 6.2.2 Polygon Simplification The construction of RPQ w is similar to that of RPQ p . An RPQ w node stores its bound on GEM w , and a flag f rp indicating its rendering primitive as either original or simplified polygon. The rendering primitive evaluation starts at leaf nodes of RPQ w . 
For a leaf node with only one building boundary grid, the one original wall polygon is also the single sim- plified polygon, thus its f rp is set as simplified polygon. For a leaf node with more than one boundary grids, PCA is applied over 3D positions of all building boundary grids. When the second smallest eigenvalue l 1 is small enough (i.e., l 1 L), these building 95 boundary grids approximate a straight line segment. Thus the corresponding original wall quadrilaterals can be simplified to one vertical quadrilateral defined by the line seg- ment and the ground position with minimum elevation. The simplified quadrilateral is a conservative estimation to ensure gap-free rendering of the space covered by its original wall quadrilaterals. In this case, f rp is set to simplified polygon; in all other cases, f rp is set to original polygon. The rendering primitive evaluation is then propagated bottom-up to higher levels of the RPQ w . This work applies PCA over 3D positions of all building boundary grids from four child nodes of an RPQ w non-leaf node. If these building boundary grids form a straight line, corresponding original wall polygons are replaced by one quadrilateral, f rp is set to simplified polygon, and four child nodes are removed from the RPQ w ; otherwise, f rp is set to original polygon. Figure 6.4 shows that polygon simplification provides comparable visual quality and with slightly improved rendering frame-rates, compared to the rendering without polygon simplification. 6.2.3 Hybrid Representation for MRQ Non-Leaf Nodes Hybrid representation of an MRQ non-leaf node cannot be derived solely from its child nodes, since a parent node has different data resolution from its child nodes. It does not need to be reconstructed from scratch either, as some of the hybrid representations of higher resolutions (child nodes in the MRQ) are reusable for that of lower resolutions (parent node in the MRQ). Constructing hybrid representation of non-leaf nodes is a bottom-up propagation of RPQs of leaf nodes. As RPQ p and RPQ w share similar propagation process, we only demonstrate details for RPQ p for clarity. Given four child nodes n i ;i2 1;2;3;4, each has a GEM n i p with s n i grids and an RPQ n i p . The maximum height h n i max of RPQ n i p is log 4 s n i , which is achieved when not all leaf nodes 96 (a) 59 fps (b) 60 fps (c) 3,588 wall quads (d) 2,643 wall quads Figure 6.4: Rendering effect of polygon simplification. (a)(c) Rendered by Gao et al. [1]. Wall quadrilaterals are colored randomly in (c). (b)(d) Rendered by our framework with polygon simplification underL= 10 4 . Simplified walls are colored randomly in (d). have f rp as polygon. Representing 1 4 resolution of these child nodes, for the parent node m with GEM m p , four sub-trees under the root node of its RPQ m p are built based on RPQ n i p as follows. When the height of RPQ n i p is h n i max , we cut lowest level nodes from RPQ n i p , and re-evaluate rendering primitives of remaining parent nodes based on GEM m p . The re-evaluation continues upwards until f rp of a node remains unchanged, or the root of RPQ n i p is reached. The updated RPQ n i p is a sub-tree for RPQ m p . When the height of RPQ n i p is smaller than h n i max , all its leaf nodes are represented by polygons, so this RPQ n i p is taken directly as a sub-tree of RPQ m p . Finally, rendering primitive of RPQ m p ’s root node is evaluated over all grids in GEM m p . Figure 6.5 illustrates the above process applied to one child node. 
97 6.3 Rendering For aerial LiDAR data sets that are larger than the memory capacity of a computer system, the presented framework applies an out-of-core scheme that processes only a manageable subset of the data at any time, similar to Gao et al. [1]. By coupling it with the MRQ hierarchy, view-dependent LOD rendering is provided. We first introduce how to perform LOD selection in Section 6.3.1, then present details of hybrid point-polygon rendering in Section 6.3.2, and finally show a simple yet effective mechanism to smooth the transitions between different LODs in Section 6.3.3. 6.3.1 LOD Selection LOD selection is performed by hierarchical traversal of the MRQ from root (the low- est resolution) to leaves (the highest resolution). The framework applies three view- dependent selection criteria for each node: view-frustum culling, back-face culling and screen projection tolerance. The implementation of view-frustum culling is similar to Pajarola et al. [118]. Back-face culling is applied to ground and building points only, as tree points do not form surfaces. Its implementation is an efficient normal cone approach [119] that does not involve trigonometric functions. We only check screen projection tolerance for nodes that meet the first two criteria, as follows. A node n’s bounding quad on x-y plane is warped to screen space covering p pixels. If p is less than the number of grids in GEM n p , i.e. one grid covers less than one pixel on screen, this node is selected for rendering, and downward traversal is terminated. Figure 6.6 shows different LODs of a scene. As one of the major rendering expenses is the MRQ traversal, the framework utilizes the idea of coherence traversal [8] for performance consideration. At each frame, instead 98 (a) Real scene (b) GEM n p (c) GEM m p (d) RPQ n p (e) (f) (g) (h) (i) RPQ m p Figure 6.5: Building an RPQ p sub-tree of parent node m based on a child node n. (a) Real scene. Red for buildings; and grey for ground. (b) GEM p of node n with 64 grids. (c) GEM p of node m covering only the region of n. (d)-(i) The propagation process of RPQ p from child n to parent m. A node with a cross mark has f rp as converted polygon; a node with a question mark needs to be re-evaluated over GEM m p in (c); the rest of the nodes have f rp as point. of traversing the MRQ from its root node, it is traversed from nodes (both displayed and culled) of the last frame either up or down to find the most appropriate nodes to display. 99 (a) 6,035 quads, 70,455 pts, 152.9 fps (b) 2,818 quads, 33,474 pts, 275.1 fps (c) 482 quads, 6,605 pts, 654.8 fps (d) 125 quads, 1,925 pts, 935.8 fps Figure 6.6: A data set at different viewing distances. Converted polygons and simplified polygons are shown in red. 6.3.2 Hybrid Rendering Based on a hardware-accelerated point based rendering with deferred shading approach [1], the framework integrates rendering of point splats and polygonal geome- tries via multiple rendering targets in three basic passes: 1. Visibility Pass renders all points and polygons to a depth buffer, and records only the closest depth of a pixel. 2. Attribute Pass renders all points and polygons again, and blends only those that are within e offset of the recorded depth, resulting in pixels of accumulated nor- mals and colors. 3. Shading Pass traverses all pixels of the attribute pass to normalize the accumu- lated normals and colors, and applies accurate per-pixel Phong shading. 
The rendering of the continuous data set of an MRQ node is actually the rendering of leaf nodes of corresponding RPQ p and RPQ w . For a node with f rp as point, the framework displays each non-empty grid as a circular splat. The splat’s diameter is at least p 2w g in order to enclose the entire area covered by the grid, where w g is the width of quantization squares as seen in Section 6.1.1. For a node with f rp as polygon, either converted, original or simplified, the framework renders the corresponding stored 100 quadrilateral. The size of the quadrilateral is adjusted to its normal so that its projection on x-y plane can cover the entire bounding quad of the node projected on the same plane. This size is further enlarged with an offsetw g to ensure overlapping with adjacent quadrilaterals. By using gradient colors [1], together with a blending mechanism [120], possible cracks between neighboring primitives can be alleviated. The framework provides two rendering options for the discrete data set of an MRQ node. One is to feed tree points as splats to the above three rendering passes. It provides a unified rendering routine for all data and a smoother visualization, but it has an adverse impact on rendering performance, especially for cities with abundant trees. The other is to blend with continuous data set rendering only closest tree points that are filtered by a single pass. This provides better frame-rates, but requires additional implementation, and has obvious aliasing artifacts between neighboring tree points. Experiments of this chapter utilize the first rendering option. 6.3.3 Smooth LOD Transitions As the pre-computed MRQ has a fixed resolution per level, jumping from one level to another leads to popping artifacts. To make smooth LOD transitions, the framework blends together renderings from two adjacent levels. Given a node n that passes the screen projection tolerance test of Section 6.3.1, its parent node m is taken for blending. Let l m be the resolution of m (low detail level), i.e. the number of grids in GEM m p ; and l n be the resolution of four child nodes (including n) of m (high detail level), where l n = 4 l m . Current actual resolution l takes the number of pixels of screen space warping of m’s bounding quad on x-y plane. During rendering, the visibility pass renders all primitives from both node m and its four child nodes to the depth buffer, and the attribute pass blends normals and colors of all primitives from both node m and its four 101 child nodes that are withine depth, where the alpha of geometries of m takesa = l n l l n l m and the alpha of child nodes uses 1a. 6.4 Real-time Refinement When the screen projection of a MRQ leaf node fails the screen projection tolerance, i.e., each point covers more than one screen pixel, a direct rendering of these points leads to a blocky image as in QSplat [85]. To ensure smooth shading, points of the leaf node are continuously refined on-the-fly by object-space up-sampling, which is based on feature-preserving interpolation between adjacent points. As the rendering of a polygon is always smooth for large magnification, refinement is only necessary for regions that are represented by points. Such regions can be easily located from RPQ p structures related to leaf nodes of the MRQ. Given a MRQ leaf node m, if the root node of its corresponding RPQ p has f rp as polygon, i.e., the entire area covered by m can be rendered by a polygon, no refinement is needed for m. 
6.4 Real-time Refinement

When the screen projection of an MRQ leaf node fails the screen projection tolerance, i.e., each point covers more than one screen pixel, a direct rendering of these points leads to a blocky image as in QSplat [85]. To ensure smooth shading, points of the leaf node are continuously refined on-the-fly by object-space up-sampling, which is based on feature-preserving interpolation between adjacent points. As the rendering of a polygon is always smooth for large magnification, refinement is only necessary for regions that are represented by points. Such regions can be easily located from the RPQ_p structures related to leaf nodes of the MRQ. Given an MRQ leaf node m, if the root node of its corresponding RPQ_p has f_rp set to polygon, i.e., the entire area covered by m can be rendered by a polygon, no refinement is needed for m. Otherwise, the MRQ is appended with four child nodes n_i (i ∈ {1, 2, 3, 4}) under node m. Each n_i covers 1/4 of the region of m but at a higher resolution. The refinement continues until the level where the screen projection tolerance is reached or a node has f_rp set to polygon.

Figure 6.7: The relationship between a GEM (dark grey) and its extension eGEM (both light and dark grey regions).

Figure 6.8: The process of refining an MRQ leaf node m. Input: (a) GEM of node m with red shade for buildings and grey shade for ground, and (d) RPQ_p of node m. Initialization: (b) a new GEM with four times the resolution of G_m, whose initial grid data (green) are set from G_m, and (e) a new RPQ_p for (b) initialized as R_m. Up-sampling: (c) only regions of point-represented nodes in R_m are interpolated, with light yellow for grids in the direct group and dark yellow for grids in the indirect group, and (f) the RPQ_p in (e) is updated for newly interpolated grids. Re-arrangement: four child nodes n_1-n_4 are added under m in the MRQ; the GEM and RPQ_p of each child node are shown in (g)-(n).

To ensure smooth interpolation near node boundaries, besides the GEM covering the region of a node, an extended version eGEM covering a larger region is also provided. Given w × h as the dimensions of a GEM, its corresponding eGEM has dimensions of (w+2) × (h+2). Only the eGEM needs to be stored, as the GEM is a subset of it and can be easily retrieved via indexing, as seen in Figure 6.7. In pre-processing, only MRQ leaf nodes that can be represented by points store eGEMs, as these are the places for which refinement is possible. During rendering, eGEMs are constructed on-the-fly for newly added MRQ nodes. For simplicity and clarity, the following subsections explain refinement operations over GEMs. However, one needs to extend to eGEMs in practice.

Given an MRQ leaf node m, its GEM G_m with dimensions w × h, and its RPQ_p R_m, the refinement of m produces four child nodes n_i (i ∈ {1, 2, 3, 4}) via three stages, as detailed in Section 6.4.1 - Section 6.4.3 and illustrated in Figure 6.8.

6.4.1 Initialization

To support refinement of node m, a new GEM G'_m with dimensions 2w × 2h is created. Grids of G_m are copied to corresponding positions in G'_m, i.e., G'_m(2x, 2y) = G_m(x, y), x ∈ [0, w), y ∈ [0, h). These grids in G'_m are labeled as original, to differentiate them from interpolated grids. A new RPQ_p R'_m is also constructed, which is initialized the same as R_m.

6.4.2 Up-sampling

Each leaf node of R_m is checked; if its representation is point, further refinement is performed. The refinement is composed of interpolations over G'_m grids and updates of R'_m nodes.

GEM Interpolation. Given a leaf node l of R_m, its bounds (x_l, y_l, w_l, h_l) on G_m correspond to the bounds (2x_l, 2y_l, 2w_l, 2h_l) on G'_m. Those G'_m grids covering the region of l that need interpolation can be categorized into two groups:

Direct Group for grids that are between two original grids, as their values can be interpolated directly from original data.

Indirect Group for grids that are between two interpolated grids, as their values can not be interpolated until all grids in the direct group are processed.

For a grid g indexed as (i, j), i ∈ [0, 2x_l), j ∈ [0, 2y_l) in node l, the two nearest grids in the vertical direction are (i, j−1) and (i, j+1), and the two nearest grids in the horizontal direction are (i−1, j) and (i+1, j). If either grid-pair is original data, g belongs to the direct group, and its interpolation is based on this grid-pair; otherwise, g belongs to the indirect group, and its interpolation is between the grid-pair in either the vertical or the horizontal direction. Figure 6.8(c) shows examples of grids in both the direct and the indirect group.
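Because the original grids of G_m are copied to the even-even positions of G'_m, the direct/indirect grouping described above reduces to a parity check on the grid index. The sketch below illustrates this; the enum and method names are hypothetical.

    static class UpsampleGrouping
    {
        public enum GridKind { Original, Direct, Indirect }

        // Classify a grid (i, j) of the doubled-resolution GEM G'_m.
        // Even-even grids hold data copied from G_m; all others must be interpolated.
        public static GridKind Classify(int i, int j)
        {
            bool iEven = (i % 2) == 0;
            bool jEven = (j % 2) == 0;
            if (iEven && jEven) return GridKind.Original;  // copied, no interpolation needed
            if (iEven || jEven) return GridKind.Direct;    // one neighboring grid-pair is original
            return GridKind.Indirect;                      // both grid-pairs are interpolated grids
        }
    }

Direct-group grids interpolate from the original pair along the even axis; indirect-group grids are processed afterwards, once their neighboring interpolated grids exist.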
The following paragraphs explain the interpolation process for a grid g, given its interpolation grid-pair s and e, with 3D positions p_s, p_e, normals n_s, n_e, and class flags f_c^s, f_c^e respectively. As a GEM contains the three classes building, boundary and ground, the class flags of the grid-pair fall into one of the three cases below:

Case 1: f_c^s ≠ f_c^e, and one grid is boundary, the other is ground.
Case 2: f_c^s ≠ f_c^e, and one grid is building, the other is boundary.
Case 3: f_c^s = f_c^e.

Case 1 does not need interpolation, as grid g stands between two different objects. To preserve building boundaries, and to close the gap between ground and walls, g is set to the same data as the ground grid. Since a boundary belongs to a building, boundary points are compatible with building points; therefore, Case 2 is a special instance of Case 3 where both grids can be considered building points. For these two cases, the elevation and normal of grid g are interpolated between s and e, and its class flag is set to be the same as that of the interpolation grid-pair. By employing an edge-aware bilateral projector similar to [7], a new 3D point p with normal n_p is inserted as

p = b + d \cdot n_p,    (6.1)

where b is the base point and d is the distance to be projected along n_p, as illustrated in Figure 6.9(a). This scheme constrains the projection direction to be along the normal of the inserted point.

Figure 6.9: GEM interpolation. (a) Interpolating a new grid g between the interpolation grid-pair s and e. b is the base point, d is the projection distance, p is the projected new point on the latent surface, and p' is the actual point represented by grid g. (b) Neighborhood of b (the cross) within radius γ, which contains only original grids (green).

Base Selection. b is set as the center of the 3D positions of s and e, i.e., b = (p_s + p_e)/2.

Projection Distance. Similar to Huang et al. [7], given the base location b, to project it onto the latent surface along a given direction n, the projection distance d is determined by:

d(b, n) = \frac{\sum_{p_i \in N_\gamma(b)} \left(n^T (b - p_i)\right) \, f(\|b - p_i\|) \, g(1 - n^T n_{p_i})}{\sum_{p_i \in N_\gamma(b)} f(\|b - p_i\|) \, g(1 - n^T n_{p_i})},    (6.2)

where f and g are the spatial and influence weight functions defined by Gaussians, as detailed in Gao et al. [32]; and N_γ(b) denotes the 3D point neighbors in the sphere of center b and radius γ whose corresponding grids share the same or a compatible class flag as s and e. γ is set as √2·w_g, where w_g is the grid width of the input G_m. N_γ(b) only includes points whose related grids are original, as they provide better accuracy than interpolated ones. This selection strategy also makes the interpolation independent of order, i.e., the data of a grid does not change whether it is interpolated before or after its neighbors. Figure 6.9(b) shows the 2D view of an example N_γ(b).

Normal Determination. Constrained to the set {n_s, n_e}, the initial value of the normal n_p is selected as the one with the minimum d evaluated in Equation (6.2). Similar to Huang et al. [7], the final value of n_p is decided by:

n_p = \frac{\sum_{p_i \in N_\gamma(b)} n_{p_i} \, f(\|b - p_i\|) \, g(1 - n_p^T n_{p_i})}{\sum_{p_i \in N_\gamma(b)} f(\|b - p_i\|) \, g(1 - n_p^T n_{p_i})}    (6.3)

Note that the initial value of n_p is already a good approximation; for better performance, one can skip the evaluation of Equation (6.3) in practice. Substituting n_p back into Equation (6.2) determines d, and then p is given by Equation (6.1). Finally, to decide the normal and elevation of grid g, the point p' right over the grid is utilized, as shown in Figure 6.9(a). As a result, g's normal takes n_p, and g's elevation takes the z value of p', which is approximated by trigonometric relations over the triangle formed by p, p' and b. Although both the normal and elevation are approximations, no visible artifacts are found in experiments.
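The projection step of Equations (6.1)-(6.2), together with the initial normal selection, can be sketched as below. This is a minimal illustration under stated assumptions: the Sample type, the Gaussian widths and the reading of "minimum d" as the smaller |d| are not taken from the dissertation, the class-compatibility and originality filtering of N_γ(b) is assumed to happen when the neighbor list is built, and the optional Equation (6.3) refinement is omitted.

    using System;
    using System.Collections.Generic;

    // A 3D point with a unit normal; a stand-in for the framework's grid samples.
    struct Sample
    {
        public double X, Y, Z;     // position
        public double Nx, Ny, Nz;  // unit normal
    }

    static class BilateralProjection
    {
        static double Dot(double ax, double ay, double az,
                          double bx, double by, double bz)
            => ax * bx + ay * by + az * bz;

        // Gaussian weights; the widths are illustrative choices (the dissertation defers
        // the exact definitions of f and g to Gao et al. [32]).
        static double F(double dist, double gamma) => Math.Exp(-4.0 * dist * dist / (gamma * gamma));
        static double G(double normalDiff)         => Math.Exp(-normalDiff * normalDiff / 0.25);

        // Equation (6.2): projection distance of base point b along direction n, computed
        // from class-compatible *original* neighbors within radius gamma (= sqrt(2) * w_g).
        public static double Distance(Sample b, double nx, double ny, double nz,
                                      List<Sample> neighbors, double gamma)
        {
            double num = 0, den = 0;
            foreach (var p in neighbors)
            {
                double dx = b.X - p.X, dy = b.Y - p.Y, dz = b.Z - p.Z;
                double dist = Math.Sqrt(dx * dx + dy * dy + dz * dz);
                if (dist > gamma) continue;
                double w = F(dist, gamma) * G(1.0 - Dot(nx, ny, nz, p.Nx, p.Ny, p.Nz));
                num += Dot(nx, ny, nz, dx, dy, dz) * w;    // n^T (b - p_i)
                den += w;
            }
            return den > 0 ? num / den : 0.0;
        }

        // Chooses the candidate normal (n_s or n_e) with the smaller projection distance
        // (reading "minimum d" as minimum |d|), then applies Equation (6.1): p = b + d * n_p.
        public static Sample Project(Sample s, Sample e, List<Sample> neighbors, double gamma)
        {
            var b = new Sample { X = (s.X + e.X) / 2, Y = (s.Y + e.Y) / 2, Z = (s.Z + e.Z) / 2 };
            double ds = Distance(b, s.Nx, s.Ny, s.Nz, neighbors, gamma);
            double de = Distance(b, e.Nx, e.Ny, e.Nz, neighbors, gamma);
            bool useS = Math.Abs(ds) <= Math.Abs(de);
            double d  = useS ? ds : de;
            return new Sample
            {
                X = b.X + d * (useS ? s.Nx : e.Nx),
                Y = b.Y + d * (useS ? s.Ny : e.Ny),
                Z = b.Z + d * (useS ? s.Nz : e.Nz),
                Nx = useS ? s.Nx : e.Nx, Ny = useS ? s.Ny : e.Ny, Nz = useS ? s.Nz : e.Nz
            };
        }
    }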
RPQ_p Update. When the grid interpolation of leaf node l of R_m is complete, the region in G'_m corresponding to l is spatially partitioned into four sub-regions, each of which is evaluated to see whether it can be represented as a polygon. Accordingly, four child nodes indexing these sub-regions are created and added under node l in R'_m.

6.4.3 Re-arrangement

For the re-arrangement, G'_m is equally split into four sub-regions, and four child nodes indexing these sub-regions are added under node m in the MRQ. The RPQ_p of each child node is the corresponding sub-tree covering the same region in R'_m. Finally, quadrilateral walls are augmented on-the-fly for boundary grids, and the RPQ_w of each child node is re-evaluated from scratch. It is not possible to propagate information from the parent RPQ_w to the children's RPQ_w, since the orientations of wall polygons need to be re-estimated based on new neighbors after the refinement.
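Both the RPQ_p update above and the pre-processing polygon conversion rest on a test of whether a regularly partitioned subset of points can be represented by a single polygon; the dissertation performs this test with PCA under the error tolerance Λ. Since the exact definition of Λ is not restated in this chapter, the sketch below substitutes a simpler stand-in: a least-squares height-field plane fit whose RMS residual is compared against an absolute threshold. Names and threshold semantics are illustrative only.

    using System;
    using System.Collections.Generic;

    static class PlanarityTest
    {
        // Decide whether a sub-region's points are flat enough to be replaced by one polygon.
        // Stand-in for the PCA test with tolerance Lambda: fits z = a*x + b*y + c by least
        // squares and thresholds the RMS residual. Note that a height-field fit cannot
        // represent vertical planes; the PCA test does not share this limitation.
        public static bool IsPlanar(List<(double x, double y, double z)> pts, double maxRmsError)
        {
            int n = pts.Count;
            if (n < 3) return false;

            // Accumulate normal equations for [a b c].
            double sxx = 0, sxy = 0, sx = 0, syy = 0, sy = 0, sxz = 0, syz = 0, sz = 0;
            foreach (var p in pts)
            {
                sxx += p.x * p.x; sxy += p.x * p.y; sx += p.x;
                syy += p.y * p.y; sy  += p.y;       sz += p.z;
                sxz += p.x * p.z; syz += p.y * p.z;
            }
            // Solve the 3x3 system via Cramer's rule.
            double det = Det(sxx, sxy, sx, sxy, syy, sy, sx, sy, n);
            if (Math.Abs(det) < 1e-12) return false;       // degenerate layout (e.g. collinear)
            double a = Det(sxz, sxy, sx, syz, syy, sy, sz, sy, n) / det;
            double b = Det(sxx, sxz, sx, sxy, syz, sy, sx, sz, n) / det;
            double c = Det(sxx, sxy, sxz, sxy, syy, syz, sx, sy, sz) / det;

            double sse = 0;
            foreach (var p in pts)
            {
                double r = p.z - (a * p.x + b * p.y + c);
                sse += r * r;
            }
            return Math.Sqrt(sse / n) <= maxRmsError;
        }

        static double Det(double m00, double m01, double m02,
                          double m10, double m11, double m12,
                          double m20, double m21, double m22)
            => m00 * (m11 * m22 - m12 * m21)
             - m01 * (m10 * m22 - m12 * m20)
             + m02 * (m10 * m21 - m11 * m20);
    }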
6.5 Experiments

The experiments are performed in two parts: Section 6.5.1 focuses on evaluating the hybrid rendering framework without involving refinement, and Section 6.5.2 concentrates on tests of real-time point cloud refinement.

6.5.1 Hybrid Rendering

The test computer is equipped with a dual-core Intel Xeon 2.0 GHz CPU, 4 GB RAM, an NVIDIA GeForce GTX 260 graphics card with 896 MB memory, a Western Digital 7200 RPM hard disk and Windows Vista. All tests are rendered at 1280 × 800 pixels.

Table 6.1: Basic information of test data sets.

  Data set            Denver   LA     Atlanta1  Atlanta2
  File size (MB)      44.0     49.6   104.6     3,052.3
  Density (pts/m^2)   6.6      0.6    22.9      20.7
  # points (M)        1.6      2.6    5.5       112.6
  % ground            67.1     55.9   42.2      44.1
  % building          19.5     27.8   16.1      36.5
  % tree              13.4     12.3   41.7      19.4
  % other             0        4.0    0         0

The test data sets are from three cities, as listed in Table 6.1. The first three data sets are of medium size, and the last one is large. Atlanta1 is a small part of Atlanta2. Unless specified otherwise, a figure uses the LA (Los Angeles) data set.

Parameters. The framework provides a parameter Λ to adjust the error tolerance for both polygon conversion and simplification. Since Λ affects polygon conversion and simplification in the same way, we only exemplify its influence on polygon conversion for clarity. As seen in Figure 6.3, a larger Λ leads to larger polygons with fewer details, while a smaller Λ produces smaller polygons with more details. It also shows that the rendering speed increases with Λ, as more points are converted to polygons.

Figure 6.10: Speed-quality trade-off under the influence of Γ. (a) Γ = (√2/2)·w_g, 333 fps; (b) Γ = (3√2/4)·w_g, 237 fps; (c) Γ = √2·w_g, 153 fps.

Another user-adjustable parameter is the splat radius Γ for point rendering. As seen in Figure 6.10, increasing its value can alleviate gaps between neighboring points with different orientations. However, since a larger Γ corresponds to more pixels on screen per point, the improvement in visual quality is accompanied by a decrease in rendering performance. By tuning Λ and Γ, users can trade off rendering speed against quality. For better rendering quality, a smaller Λ and a larger Γ are suggested. In a large scene where details of distant objects are not important, a larger Λ and a smaller Γ are preferred.

Performance. The presented framework is built as an extension and optimization of visually-complete rendering [1]. Among the three wall primitives, the option of procedurally generated quadrilateral walls achieves the best speed-quality trade-off [1], thus this option (referred to as VRQ) is taken as the benchmark.

Table 6.2: Pre-processing (pp) time (in seconds) and rendering (rd) speed (in fps) comparisons of VRQ and our framework. pc stands for polygon conversion; ps for polygon simplification.

  Data set        Denver   Los Angeles  Atlanta1  Atlanta2
  VRQ pp          125.1    238.9        404.9     8,077.2
  our pp          134.2    250.6        426.5     8,252.7
  VRQ rd          18.5     24.9         18.8      17.8
  our rd (pc)     73.1     68.0         40.1      18.2
  our rd (pc+ps)  73.4     68.5         40.3      18.3

Quality. Figures 6.3 and 6.4 show that polygon conversion and simplification can preserve most of the details of small scenes under proper parameters, even though the original rendering primitives are changed. Figures 3.1, 6.11a and 6.11b show that the quality degradation in larger scenes is negligible, as our framework provides visual cues identical to VRQ.

Pre-processing Time. As shown in Table 6.2, the pre-processing time increases with file size and point number. It is also affected by the number of points of each class and the actual scene-architectural layout, since points of different classes are treated differently. The pre-processing time is roughly distributed as follows: 33% for disk file reading, 57% for normal estimation of 3D points, 2% for polygon conversion and simplification, and 8% for other operations. Compared to the pre-processing time of VRQ, our framework adds less than 8% overhead, as shown in Table 6.2.

Rendering Speed. Figures 3.1 and 6.11 list the static rendering speed of three cities, measured as the average frame-rate of the displayed static image over one minute. Table 6.2 lists the dynamic rendering speed of three cities, obtained by adjusting the viewport to fit the entire scene and recording the average frame-rate over one minute while orbiting the camera around the scene.

Figure 6.11: Rendering comparison between VRQ and our framework with Λ = 10⁻³ and Γ = (√2/2)·w_g. (a) and (b) for Denver; (c) for Atlanta1; and (d) for Atlanta2. (a) VRQ: 1,922,328 pts, 33,482 quads, 20.4 fps; (b) Our: 707,678 pts, 80,370 quads, 108.5 fps; (c) VRQ: 3,649,426 pts, 70,269 quads, 26.4 fps vs. Our: 3,071,812 pts, 140,175 quads, 85.1 fps; (d) VRQ: 2,006,927 pts, 233,705 quads, 20.2 fps vs. Our: 1,887,311 pts, 238,770 quads, 21.0 fps. Our framework offers comparable visual details but greatly improved rendering speed.

The speedup of our framework over VRQ is dominated by the number of converted points and simplified polygons that are actually displayed in a frame. Because our framework is LOD based and always displays just enough detail for a view, the speedup varies only with detail level, regardless of the scale of the input data set.
Both the static and dynamic rendering speed measurements show that our framework achieves a 2-5 times speedup when data sets of medium size fit completely in the display window, but the speedup is less impressive when the entire scene of a large data set is shown. As another example, Figure 6.11c can be considered a close-up of Figure 6.11d; while the close-up shows a 322% speedup, the original scene only has a slight improvement of 3%.

Table 6.2 shows that the speedup from polygon simplification is small compared to the speedup provided by polygon conversion. This is because the cost of point rendering is dominant in hybrid rendering, given that the number of wall polygons is only 1.7%-3.8% of the number of original points.

The presented framework can support hybrid rendering of colored point clouds by attaching textures to polygons. Similar to Wahl et al. [92], a theoretical speedup is estimated as

speedup = \frac{v_s / r_v}{(v_r + 4q) / r_v + t / r_f},

where v_s is the number of simplified points, q the number of quads, t the number of texels, v_r the number of remaining points, r_v the transformation rate, and r_f the fill-rate.

Figure 6.12: Rendering comparison over close-ups of Los Angeles (left column), Atlanta (middle column) and Denver (right column) before and after the refinement. Green for trees, brown for ground, pink for buildings and light blue for building walls. (a) Before: 35k points; (b) Before: 61k points; (c) Before: 187k points; (d) After: 1,789k points; (e) After: 2,691k points; (f) After: 3,819k points.

Considering only ambient occlusion or shadows, whose computation cost depends on the number of ray-object intersection tests, if points are rendered as surface splats, a theoretical speedup is given by

speedup = \frac{v_s}{v_r + q}.

Comparison with Existing Approaches. Existing approaches [91, 92, 94] are also applicable to converting a planar subset of points into polygons. Unlike these approaches, which either require additional knowledge (the number of planar surfaces, plane boundaries, etc.) or involve complex computations (triangulation, iterative plane fitting, etc.), our polygon conversion is much simpler, applying only PCA over regularly partitioned subsets. Therefore, one can expect faster pre-processing from our polygon conversion. For example, Wahl et al. [92] report a RANSAC plane fitting performance of about 33k-84k points per second on a 3.4 GHz Pentium 4, while the presented polygon conversion can process about 520k-680k points per second on a 2.0 GHz Xeon; taking into account that our CPU is only slightly better than theirs, our approach is still substantially faster. In addition, our approach enables more efficient texture packing for the visualization of colored points, since our polygons are of known regular sizes, as opposed to the polygons of arbitrary sizes produced by other approaches. On the downside, our approach may provide less speedup in hybrid rendering, because it does not necessarily produce the optimal polygonal representation and leaves more points unconverted, compared to existing approaches that accurately recover latent surfaces of the point cloud.
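For readers who want to plug numbers into the two theoretical estimates given earlier in this subsection, they translate directly into code; the function and parameter names are illustrative.

    static class SpeedupEstimates
    {
        // Textured hybrid rendering estimate (after Wahl et al. [92]):
        //   speedup = (v_s / r_v) / ((v_r + 4q) / r_v + t / r_f)
        // v_s: simplified points, v_r: remaining points, q: quads, t: texels,
        // r_v: vertex transformation rate, r_f: fill-rate.
        public static double TexturedHybrid(double vs, double vr, double q,
                                            double t, double rv, double rf)
            => (vs / rv) / ((vr + 4.0 * q) / rv + t / rf);

        // Ray-casting cost estimate (ambient occlusion or shadows), points rendered as splats:
        //   speedup = v_s / (v_r + q)
        public static double RayCasting(double vs, double vr, double q)
            => vs / (vr + q);
    }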
Figure 6.13: Refinement effect over a series of zoom-ins of a scene from Los Angeles. Before refinement (top row), roof boundaries are fuzzy. After refinement (bottom row), boundaries look sharper and the tiny top roof shows better details.

6.5.2 Real-time Refinement

The implementation uses the C# language under the .NET 4.0 environment, OpenGL 2.1, the Tao library [105] and the Cg language [106] for shaders. The test computer is equipped with a dual-core Intel Core i3 3.20 GHz CPU, 6 GB RAM, an NVIDIA GeForce GT 520 graphics card with 1024 MB memory, a Seagate 7200 RPM hard disk and Windows 7. All tests are rendered at 1280 × 800 pixels. The test data sets are three aerial LiDAR cities sampled at different resolutions, as detailed in Table 6.3. Unless specified otherwise, a figure uses the Los Angeles data set.

Table 6.3: Basic information of test data sets.

  Data set              Denver   Los Angeles  Atlanta
  File size (MB)        44.0     49.6         104.6
  Resolution (pts/m^2)  6.6      0.58         22.9
  # points (M)          1.64     2.60         5.48
  % ground + building   87       84           58
  % tree + other        13       16           42

Figure 6.14: Refinement comparison without (a) and with (b) using hierarchical hybrid structures over 102k points from Los Angeles. Listed are the number of processed points and the processing time for refinement. (a) 2,691k points, 7.5 seconds; (b) 1,776k points, 5.1 seconds.

Parameters. When constructing the hybrid representation hierarchies, a parameter Λ is provided to adjust the error tolerance for fitting a polygon to a planar subset of points. As shown in Figure 6.15 and Table 6.4, a larger Λ leads to a smaller number of points to be processed for refinement, which in turn leads to a faster rendering speed. It also shows that the presented hybrid representation structures are effective for aerial LiDAR cities where flat regions are abundant, as the number of points that require refinement is largely reduced. Unless specified otherwise, Λ = 10⁻³ is used.

Table 6.4: The influence of Λ on the percentage of points represented as point. Measured over all leaf nodes of the initial MRQ of three cities.

  Data set    Denver   Los Angeles  Atlanta
  Λ = 10⁻⁵    98.2     77.4         99.4
  Λ = 10⁻⁴    64.8     56.9         86.0
  Λ = 10⁻³    36.2     45.2         61.4
  Λ = 10⁻²    27.3     37.7         46.3
  Λ = 10⁻¹    23.6     32.1         39.5

Performance. Figure 1.7 and Figure 6.12 show close-up views of buildings with different shapes and complexities from all three cities. Figure 6.13 demonstrates a series of comparisons before and after the refinement under increasing magnification of a scene. The presented refinement is effective, as all images with refinement show a real improvement in quality; for example, silhouettes look sharper and small details are better displayed.

Figure 6.15: The percentage of points that are represented by point under different Λ. Polygon-represented regions are highlighted with random colors. (a) Real scene; (b) Λ = 10⁻⁵, 87%; (c) Λ = 10⁻³, 24%.

Implemented on the CPU without optimization, the presented refinement generates about 300k points per second, which does not cause noticeable delays in real-time rendering. Figure 6.14 compares the refinement with and without the presented hierarchical hybrid structures; the two show comparable visual appearance, but the refinement with hybrid structures is more efficient. One can expect a speed-up proportional to the percentage of points removed from up-sampling.

6.6 Summary

This chapter presents a framework for interactive and visually-complete rendering of aerial LiDAR point clouds of cities, by coupling LOD techniques with hierarchical hybrid point-polygon structures. The hybrid representation supports both quadrilaterals converted from planar subsets of points and simplified quadrilaterals from augmented wall quadrilaterals.
Without involving expensive computation, the construction of the hybrid representation is efficient, adding less than 8% pre-processing time. Experiments show that the framework is able to provide comparable visual details with a 2-5 times speedup in rendering. The framework is generic to any aerial LiDAR data regardless of sampling resolution. With appropriate modifications, e.g., a patch-based representation [116], it can be applied to a variety of other data sets.

To improve visualization quality under large magnification, the presented rendering framework also provides dynamic real-time refinement for aerial LiDAR points of cities. Embedded in a multi-resolution rendering framework, the refinement is adaptive to views. By referencing the hierarchical hybrid structures, the refinement concentrates on high-curvature surfaces only. Such a strategy effectively reduces the number of points that need further up-sampling for aerial LiDAR cities, where flat surfaces are abundant. Experiments show that the presented refinement can be performed on-the-fly during real-time rendering while improving display quality.

For future work, it is promising to integrate point cloud quantization and compression [121] into the framework for better storage utilization. Another direction is to further reduce the processing time of refinement. The GEM is currently used to store aerial LiDAR points; as it is highly regularized, it can be naturally mapped to a 2D texture, and the current refinement operations can then be parallelized on the GPU. Extending the adaptive refinement to point cloud color mapping is also promising: by up-sampling the input points according to the resolution of the given input images, all colors can be preserved in points efficiently.

Chapter 7
Conclusion

This thesis aims at a processing and rendering framework for city-scale aerial LiDAR point clouds that can provide interactive visualizations with important visual cues such as complete geometries and colors.

To achieve complete geometries in the city-scale aerial LiDAR point cloud visualization, the presented system uses a hierarchical, out-of-core approach to data management and GPU-accelerated PBR with deferred shading. The class of object a point belongs to augments the point cloud in pre-processing and/or adapts the online rendering, to produce visualizations that are more complete and realistic. Sparsely sampled ground areas under tree canopies are remedied through point cloud augmentation in pre-processing. Building walls are recovered with three distinct geometries: points, quadrangles and ARPs; among these, point-walls are generated through point cloud augmentation in pre-processing, while the other two geometries are procedurally generated during rendering. The system is generic to any data with buildings and trees, regardless of point density. Experiments show that the rendering of the presented PBR system provides 3D cues comparable to those obtained with polygonal 3D modeling while maintaining interactive frame-rates.

To add colors to the city-scale aerial LiDAR point cloud visualization, a scalable out-of-core technique is presented for mapping colors from aerial oblique imagery onto a city-scale aerial LiDAR point cloud, whether the two come from the same or different sources. The aerial LiDAR point cloud is first augmented to recover missing points on building walls and under tree canopies to ensure continuous textures for color mapping.
The visibility pass of GPU splatting is adapted for color mapping to allow computing visibility and color contribution in a single pass. A weighting scheme is utilized to accumulate colors from all contributing images while leveraging image resolution and surface orientation. The presented system is simple to implement, effective at producing good results, scalable to arbitrarily large data and generic to data of any resolution. Experiments show that color mapping of points can greatly improve the quality of LiDAR point visualizations.

To improve the quality of raw aerial LiDAR point clouds, which often suffer from noise and under-sampling, recent developments in geometry refinement are extended to accommodate the unique properties of aerial LiDAR building points. A specialized two-step refinement framework is designed in which normal-discontinuous features (boundary points) are explicitly processed. Normals and boundary directions are estimated over anisotropic neighborhoods selected by Gauss map clustering. The smoothing step applies a two-stage feature-preserving bilateral filtering to smooth out noise in normals, boundary directions and positions. The up-sampling step employs a local gap detector and a feature-preserving bilateral projector to deliver uniform up-sampling. A simple extension allows feature enhancement by increasing samples in the vicinity of features. The presented method is fast, simple to implement, and easily extensible. Experiments on both synthetic and real data show that the presented refinement method effectively removes noise, fills gaps, and, for the first time, regularizes position-discontinuous point boundaries. It is a beneficial pre-process for other applications such as rendering and surface reconstruction.

Finally, the thesis presents a framework for interactive and visually-complete rendering of aerial LiDAR point clouds of cities, by coupling LOD techniques with hierarchical hybrid point-polygon structures. The hybrid representation supports both quadrilaterals converted from planar subsets of points and simplified quadrilaterals from augmented wall quadrilaterals. Without involving expensive computation, the construction of the hybrid representation is efficient, adding less than 8% pre-processing time. Experiments show that the framework is able to provide comparable visual details with a 2-5 times speedup in rendering. To improve visualization quality under large magnification, the hybrid rendering framework provides a dynamic refinement approach that is adaptive to views and surface curvatures. For regions that need up-sampling, the refinement concentrates on high-curvature surfaces by referencing the hybrid structures. Such a strategy effectively reduces the number of points that need further up-sampling for aerial LiDAR cities, where flat surfaces are abundant. Experiments show that the presented refinement can be performed on-the-fly during real-time rendering while improving display quality.

There are several future research directions to further improve the speed and quality of the current visualization framework. For better storage utilization, one can apply point cloud quantization and compression [121]. For faster rendering, one option is to replace the current three-pass rendering with a more efficient rendering routine (e.g., single-pass splatting [122]); another is to parallelize the current refinement operations and migrate them completely onto the GPU.
For better visual quality, extending the adaptive refinement to point cloud color mapping is a promising future work. By up-sampling the input points according to the resolution of given input images, all colors can be preserved in points efficiently. 121 Reference List [1] Zhenzhen Gao, Luciano Nocera, and Ulrich Neumann, “Visually-complete aerial lidar point cloud rendering,” in Proceedings of the 20th International Conference on Advances in Geographic Information Systems, New York, NY , USA, 2012, SIGSPATIAL ’12, pp. 289–298, ACM. [2] “Bcal lidar data viewer,” http://www.idaholidar.org/tools/fusion-ldv. [3] Boštjan Kovaˇ c and Borut alik, “Visualization of lidar datasets using point-based rendering technique,” Comput. Geosci., vol. 36, no. 11, pp. 1443–1450, Nov. 2010. [4] Min Ding and Avideh Zakhor, “Automatic registration of aerial imagery with untextured 3d lidar models,” in CVPR, 2008. [5] Ruggero Pintus, Enrico Gobbetti, and Marco Callieri, “A streaming framework for seamless detailed photo blending on massive point clouds,” in Proc. Euro- graphics Area Papers, 2011, pp. 25–32. [6] Florent Duguet, Frédo Durand, and George Drettakis, “Robust Higher-Order Filtering of Points,” Tech. Rep. RR-5165, INRIA, Apr. 2004. [7] H. Huang, S. Wu, M. Gong, D. Cohen-Or, U. Ascher, and H. Zhang, “Edge-aware point set resampling,” ACM Transactions on Graphics, pp. 1–12, 2012. [8] Baoquan Chen and Minh Xuan Nguyen, “Pop: a hybrid point and polygon ren- dering system for large data,” in Proceedings of the conference on Visualization ’01, Washington, DC, USA, 2001, VIS ’01, pp. 45–52, IEEE Computer Society. [9] Qian-Yi Zhou and Ulrich Neumann, “2.5d dual contouring: A robust approach to creating building models from aerial lidar point clouds,” in Computer Vision - ECCV 2010, 11th European Conference on Computer Vision, Heraklion, Crete, Greece, September 5-11, 2010, Proceedings, Part III, Kostas Daniilidis, Petros Maragos, and Nikos Paragios, Eds. 2010, vol. 6313 of Lecture Notes in Computer Science, pp. 115–128, Springer. 122 [10] Charalambos Poullis and Suya You, “Automatic reconstruction of cities from remote sensor data.,” pp. 2775–2782, 2009. [11] Qian-Yi Zhou and U. Neumann, “A streaming framework for seamless build- ing reconstruction from large-scale aerial lidar data,” Computer Vision and Pat- tern Recognition, IEEE Computer Society Conference on, vol. 0, pp. 2759–2766, 2009. [12] Marko Kuder and Borut Zalik, “Web-based lidar visualization with point-based rendering,” in Proceedings of the 2011 Seventh International Conference on Sig- nal Image Technology & Internet-Based Systems, Washington, DC, USA, 2011, SITIS ’11, pp. 38–45, IEEE Computer Society. [13] John Danahy, “Visualization data needs in urban environmental planning and design,” in Photogrammetric Week ’99, D. Fritsch and R.Spiller, Eds., Toronto, 1999, pp. 351–365, Heidelberg: Herbert Wichmann Verlag. [14] Dwipen Bhagawati, “Photogrammetry and 3-d reconstruction - the state of the art,” in ASPRS Proceedings, 2000. [15] Trent C. Palmer and Jeffrey Shan, “A comparative study on urban visualization using lidar data in gis,” URISA Journal, vol. 14, no. 2, pp. 19–25, 2002. [16] Vivek Verma, Rakesh Kumar, and Stephen Hsu, “3d building detection and mod- eling from aerial lidar data,” in Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 2, Washing- ton, DC, USA, 2006, CVPR ’06, pp. 2213–2220, IEEE Computer Society. [17] Bogdan C. Matei, Harpreet S. 
Sawhney, Supun Samarasekera, Janet Kim, and Rakesh Kumar, “Building segmentation for densely built urban regions using aerial lidar data.,” in CVPR. 2008, IEEE Computer Society. [18] Qian-Yi Zhou and U. Neumann, “2.5d building modeling with topology control,” Computer Vision and Pattern Recognition, IEEE Computer Society Conference on, vol. 0, pp. 2489–2496, 2011. [19] Florent Lafarge and Clement Mallet, “Building large urban environments from unstructured point data,” Computer Vision, IEEE International Conference on, vol. 0, pp. 1068–1075, 2011. [20] Andrew Mastin, Jeremy Kepner, and John W. Fisher III, “Automatic registration of lidar and optical images of urban scenes.,” in Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2009, pp. 2639–2646, IEEE. 123 [21] M. Dellepiane, M. Callieri, M. Corsini, P. Cignoni, and R. Scopigno, “Improved color acquisition and mapping on 3d models via flash-based photography,” J. Comput. Cult. Herit., vol. 2, no. 4, pp. 9:1–9:20, Mar. 2010. [22] HANKE K., STOELLNER T., KOV ACS K., and MOSER M., “Combination of different surveying methods for archaeological documentation: the case study of the bronze age wooden chest from mitterberg,” in Proceedings of the 38th Conference on Computer Applications and Quantitative Methods in Archaeology, Granada, Spain, 2010, IEEE Computer Society. [23] Christian Frueh, Russell Sammon, and Avideh Zakhor, “Automated texture map- ping of 3d city models with oblique aerial imagery,” in Proceedings of the 3D Data Processing, Visualization, and Transmission, 2nd International Symposium, Washington, DC, USA, 2004, 3DPVT ’04, pp. 396–403, IEEE Computer Society. [24] Claus Scheiblauer, N. Zimmermann, and Michael Wimmer, “Interactive domit- illa catacomb exploration.,” in VAST, Kurt Debattista, Cinzia Perlingieri, Denis Pitzalis, and Sandro Spina, Eds. September 2009, pp. 65–72, Eurographics Asso- ciation. [25] C. Armenaki Li-Chee-Ming J., “Fusion of optical and terrestrial laser scanner data.,” in CD-ROM proceedings Canadian Geomatics Conference 2010 and ISPRS Com I Symposium. March 2011, pp. 156–161, ISPRS. [26] L. Wang and U. Neumann, “A robust approach for automatic registration of aerial images with untextured aerial lidar data,” in Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2009, pp. 2623–2630, IEEE. [27] Quan Wang and Suya You, “A vision-based 2d-3d registration system,” in IEEE Workshop on Applications of Computer Vision (WACV), Snowbird, Utah, USA, December 2009, pp. 1–8. [28] Markus Gross and Hanspeter Pfister, Eds., Point-based graphics, The Morgan Kaufmann Series in Computer Graphics, 2007. [29] Nader Salman, Mariette Yvinec, and Quentin Mérigot, “Feature Preserving Mesh Generation from 3D Point Clouds,” in Computer Graphics Forum, Olga Sorkine and Bruno Levy, Eds., Lyon, France, July 2010, vol. 29, pp. 1623–1632, Willey, Special issue for EUROGRAPHICS Symposium on Geometry Processing, Lyon 2010. [30] Kai-Wah Lee and Wen-Ping Wang, “Feature-preserving mesh denoising via bilat- eral normal filtering,” in Computer Aided Design and Computer Graphics, 2005. Ninth International Conference on, 2005, pp. 6 pp.–. 124 [31] Christopher Weber, Stefanie Hahmann, and Hans Hagen, “Sharp feature detec- tion in point clouds,” in Proceedings of the 2010 Shape Modeling International Conference, Washington, DC, USA, 2010, SMI ’10, pp. 175–186, IEEE Com- puter Society. [32] Zhenzhen Gao and Ulrich Neumann, “Feature enhancing aerial lidar point cloud refinement,” vol. 27, no. 1, pp. 
102–113, 2014. [33] Gael Guennebaud, Loô rc Barthe, and Mathias Paulin, “Real-Time Point Cloud Refinement,” in Symposium on Point-Based Graphics, Zurich, 02/06/2004- 04/06/2004, M. Alexa, M. Gross, H. Pfister, and S. Rusinkiewicz, Eds. juin 2004, pp. 41–49, Eurographics/IEEE Computer Society TCVG. [34] “Esri arcgis the complete geographic information system,” http://www.esri.com/software/arcgis/, 2008. [35] “GRASS GIS,” http://grass.osgeo.org, 1999. [36] J.C. Fernandez, A. Singhania, J. Caceres, K.C. Slatton, and R. Kumar M Starek, “An overview of lidar processing software,” Tech. Rep., Geosensing Engineering and Mapping, Civil and Coastal Engineering Department, University of Florida, 2007. [37] Jinhui Hu, Suya You, and Ulrich Neumann, “Approaches to large-scale urban modeling,” IEEE Comput. Graph. Appl., vol. 23, no. 6, pp. 62–69, Nov. 2003. [38] Qian-Yi Zhou and U. Neumann, “2.5d building modeling by discovering global regularities,” Computer Vision and Pattern Recognition, IEEE Computer Society Conference, 2012. [39] “Exelis ENVI,” http://www.exelisvis.com/ProductsServices/ENVI.aspx. [40] “Exelis E3De,” http://www.exelisvis.com/language/en- US/ProductsServices/E3De.aspx. [41] “3D LiDAR visualization tool,” http://lidar.asu.edu/LViz.html, 2008. [42] Leif Kobbelt and Mario Botsch, “A survey of point-based techniques in computer graphics,” Computers & Graphics, vol. 28, pp. 801–814, 2004. [43] Mario Botsch, Alexander Hornung, Matthias Zwicker, and Leif Kobbelt, “High- quality surface splatting on today’s gpus,” in Symposium on Point-Based Graph- ics 2005, June 2005, pp. 17–24. [44] Fausto Bernardini, Ioana Martin, and Holly Rushmeier, “High-Quality Texture Reconstruction from Multiple Scans,” IEEE Transactions on Visualization and Computer Graphics, vol. 7, pp. 318–332, 2001. 125 [45] Adam Baumberg, “Blending images for texturing 3D models,” Proceedings of the British Machine Vision Conference, pp. 404–413, 2002. [46] M Callieri, P Cignoni, M Corsini, and R Scopigno, “Masked photo blending: Mapping dense photographic data set on high-resolution sampled 3D models,” Computers & Graphics, vol. 32, no. 4, pp. 464–473, 2008. [47] Cedric Allene, Jean-Philippe Pons, and Renaud Keriven, “Seamless image-based texture atlases using multi-band blending,” 2008 19th International Conference on Pattern Recognition, pp. 1–4, 2008. [48] Lin Xu, Eric Li, Jianguo Li, Yurong Chen, and Yimin Zhang, “A General Texture Mapping Framework for Image-based 3D Modeling,” Image Processing, pp. 2713–2716, 2010. [49] Shi Pu and George V osselman, “Building facade reconstruction by fusing terres- trial laser points and images,” Sensors, vol. 9, no. 6, pp. 4525–4542, 2009. [50] Matthias Zwicker, Mark Pauly, Oliver Knoll, and Markus Gross, “Pointshop 3d: an interactive system for point-based surface editing,” ACM Trans. Graph., vol. 21, no. 3, pp. 322–329, July 2002. [51] Hugues Hoppe, Tony DeRose, Tom Duchamp, John McDonald, and Werner Stuetzle, “Surface reconstruction from unorganized points,” in Proceedings of the 19th annual conference on Computer graphics and interactive techniques, New York, NY , USA, 1992, SIGGRAPH ’92, pp. 71–78, ACM. [52] Mark Pauly, Richard Keiser, Leif P. Kobbelt, and Markus Gross, “Shape model- ing with point-sampled geometry,” in ACM SIGGRAPH 2003 Papers, New York, NY , USA, 2003, SIGGRAPH ’03, pp. 641–650, ACM. [53] Gael Guennebaud, Marcel Germann, and Markus Gross, “Dynamic Sampling and Rendering of Algebraic Point Set Surfaces,” Computer Graphics Forum, vol. 27, no. 2, pp. 653–662, 2008. 
[54] Niloy J. Mitra and An Nguyen, “Estimating surface normals in noisy point cloud data,” in Proceedings of the nineteenth annual symposium on Computational geometry, New York, NY , USA, 2003, SCG ’03, pp. 322–328, ACM. [55] Guofei Hu, Jie Xu, Lanfang Miao, and Qunsheng Peng, “Bilateral estimation of vertex normal for point-sampled models,” in Proceedings of the 2005 interna- tional conference on Computational Science and its Applications - Volume Part I, Berlin, Heidelberg, 2005, ICCSA’05, pp. 758–768, Springer-Verlag. 126 [56] Mincheol Yoon, Yunjin Lee, Seungyong Lee, Ioannis Ivrissimtzis, and Hans- Peter Seidel, “Surface and normal ensembles for surface reconstruction,” Com- put. Aided Des., vol. 39, no. 5, pp. 408–420, May 2007. [57] Bao Li, Ruwen Schnabel, Reinhard Klein, Zhiquan Cheng, Gang Dang, and Jin Shiyao, “Robust normal estimation for point clouds with sharp features,” Computers & Graphics, vol. 34, no. 2, pp. 94–106, Apr. 2010, http://www.sciencedirect.com/science/article/B6TYG-4Y9CF9M- 2/2/6983f845587e63492436be6a95e3a2a5. [58] Alexandre Boulch and Renaud Marlet, “Fast and robust normal estimation for point clouds with sharp features,” Comp. Graph. Forum, vol. 31, no. 5, pp. 1765– 1774, Aug. 2012. [59] Chandrajit L. Bajaj and Guoliang Xu, “Anisotropic diffusion of subdivision sur- faces and functions on surfaces,” ACM TRANSACTIONS ON GRAPHICS, vol. 22, 2002. [60] U. Clarenz, U. Diewald, and M. Rumpf, “Anisotropic geometric diffusion in surface processing,” 2000. [61] Mathieu Desbrun, Mark Meyer, Peter SchrÃ˝ uder, and Alan H. Barr, “Anisotropic feature-preserving denoising of height fields and bivariate data,” in In Graphics Interface, 2000, pp. 145–152. [62] Mathieu Desbrun, Mark Meyer, Peter Schröder, and Alan H. Barr, “Implicit fairing of irregular meshes using diffusion and curvature flow,” in Proceedings of the 26th annual conference on Computer graphics and interactive techniques, New York, NY , USA, 1999, SIGGRAPH ’99, pp. 317–324, ACM Press/Addison- Wesley Publishing Co. [63] Mark Meyer, Mathieu Desbrun, Peter SchrÃ˝ uder, and Alan H. Barr, “Discrete differential-geometry operators for triangulated 2-manifolds,” 2002, pp. 35–57, Springer-Verlag. [64] Hao Zhang and Eugene Fiume, “Mesh smoothing with shape or feature preser- vation,” Advances in Modeling, Animation, and Rendering, pp. 167–182, 2002. [65] P. Perona and J. Malik, “Scale-space and edge detection using anisotropic diffu- sion,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 12, no. 7, pp. 629–639, July 1990. [66] Thouis R. Jones, Frédo Durand, and Mathieu Desbrun, “Non-iterative, feature- preserving mesh smoothing,” in ACM SIGGRAPH 2003 Papers, New York, NY , USA, 2003, SIGGRAPH ’03, pp. 943–949, ACM. 127 [67] S. M. Smith and J. M. Brady, “Susan - a new approach to low level image pro- cessing,” International Journal of Computer Vision, vol. 23, pp. 45–78, 1995. [68] C. Tomasi, “Bilateral filtering for gray and color images,” 1998, pp. 839–846. [69] Frédo Durand and Julie Dorsey, “Fast bilateral filtering for the display of high- dynamic-range images,” in Proceedings of the 29th annual conference on Com- puter graphics and interactive techniques, New York, NY , USA, 2002, SIG- GRAPH ’02, pp. 257–266, ACM. [70] Shachar Fleishman, Iddo Drori, and Daniel Cohen-Or, “Bilateral mesh denois- ing,” in ACM SIGGRAPH 2003 Papers, New York, NY , USA, 2003, SIGGRAPH ’03, pp. 950–953, ACM. [71] Frank R. Hampel, Elvezio M. Ronchetti, Peter J. Rousseeuw, and Werner A. 
Sta- hel, Robust Statistics: The Approach Based on Influence Functions (Wiley Series in Probability and Statistics), Wiley-Interscience, New York, 2005. [72] Peter J. Huber, Ed., Robust statistics, John Wiley and Sons, 1981. [73] Boris Mederos, Luiz Velho, and Luiz Henrique de Figueiredo, “Robust smooth- ing of noisy point clouds,” in PROC. SIAM CONFERENCE ON GEOMETRIC DESIGN AND COMPUTING, 2003. [74] D. Levin, “Mesh-independent surface interpolation,” in Geometric Modeling for Scientific Visualization, 2003, pp. 37–49. [75] Tolga Tasdizen, Ross Whitaker, Paul Burchard, and Stanley Osher, “Geometric surface processing via normal maps,” ACM Trans. Graph., vol. 22, no. 4, pp. 1012–1033, Oct. 2003. [76] Michal Nociar and Andrej Ferko, “Feature-preserving mesh denoising via atten- uated bilateral normal filtering and quadrics,” in Proceedings of the 26th Spring Conference on Computer Graphics, New York, NY , USA, 2010, SCCG ’10, pp. 149–156, ACM. [77] Xianfang Sun, P. L. Rosin, R. R. Martin, and F. C. Langbein, “Fast and effec- tive feature-preserving mesh denoising,” IEEE Transactions on Visualization and Computer Graphics, vol. 13, no. 5, pp. 925–938, Sept. 2007. [78] Joel II Daniels, Linh K. Ha, Tilo Ochotta, and Claudio T. Silva, “Robust smooth feature extraction from point clouds,” in Proceedings of the IEEE International Conference on Shape Modeling and Applications 2007, Washington, DC, USA, 2007, SMI ’07, pp. 123–136, IEEE Computer Society. 128 [79] Mark Pauly, Richard Keiser, Markus Gross, and Eth ZÃijrich, “Multi-scale fea- ture extraction on point-sampled surfaces,” 2003. [80] Qian-Yi Zhou and Ulrich Neumann, “Fast and extensible building modeling from airborne lidar data,” in Proceedings of the 16th ACM SIGSPATIAL inter- national conference on Advances in geographic information systems, New York, NY , USA, 2008, GIS ’08, pp. 7:1–7:8, ACM. [81] Yaron Lipman, Daniel Cohen-Or, David Levin, and Hillel Tal-Ezer, “Parameterization-free projection for geometry reconstruction,” in ACM SIG- GRAPH 2007 papers, New York, NY , USA, 2007, SIGGRAPH ’07, ACM. [82] Hui Huang, Dan Li, Hao Zhang, Uri Ascher, and Daniel Cohen-Or, “Consolida- tion of unorganized point clouds for surface reconstruction,” in ACM SIGGRAPH Asia 2009 papers, New York, NY , USA, 2009, SIGGRAPH Asia ’09, pp. 176:1– 176:7, ACM. [83] Marc Alexa, Johannes Behr, Daniel Cohen-Or, Shachar Fleishman, David Levin, and Claudio T. Silva, “Point set surfaces,” in Proceedings of the conference on Visualization ’01, Washington, DC, USA, 2001, VIS ’01, pp. 21–28, IEEE Computer Society. [84] Marc Alexa, Johannes Behr, Daniel Cohen-Or, Shachar Fleishman, David Levin, and Claudio T. Silva, “Computing and rendering point set surfaces,” IEEE Trans- actions on Visualization and Computer Graphics, vol. 9, no. 1, pp. 3–15, Jan. 2003. [85] Szymon Rusinkiewicz and Marc Levoy, “Qsplat: A multiresolution point ren- dering system for large meshes,” in Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques, New York, NY , USA, 2000, SIGGRAPH ’00, pp. 343–352, ACM Press/Addison-Wesley Publishing Co. [86] Jonathan D. Cohen, Daniel G. Aliaga, and Weiqiand Zhang, “Hybrid simpli- fication: Combining multi-resolution polygon and point rendering.,” in IEEE Visualization, Thomas Ertl, Kenneth I. Joy, and Amitabh Varshney, Eds. 2001, pp. 37–539, IEEE Computer Society. 
[87] Liviu Coconu and Hans-Christian Hege, “Hardware-accelerated point-based ren- dering of complex scenes,” in Proceedings of the 13th Eurographics Workshop on Rendering, Aire-la-Ville, Switzerland, Switzerland, 2002, EGRW ’02, pp. 43–52, Eurographics Association. [88] Michael Wand, Matthias Fischer, Ingmar Peter, Friedhelm Meyer auf der Heide, and Wolfgang Straçer, “The randomized z-buffer algorithm: interactive render- ing of highly complex scenes,” in SIGGRAPH, 2001, pp. 361–370. 129 [89] Wenting Zheng, Hanqiu Sun, Hujun Bao, and Qunsheng Peng, “Rendering of virtual environments based on polygonal & point-based models,” in Proceedings of the ACM Symposium on Virtual Reality Software and Technology, New York, NY , USA, 2002, VRST ’02, pp. 25–32, ACM. [90] Aimin Hao, Guifen Tian, Qinping Zhao, and Zhide Li, “An accelerating render- ing method of hybrid point and polygon for complex three-dimensional models.,” in ICAT, Zhigeng Pan, Adrian David Cheok, Michael Haller, Rynson W. H. Lau, Hideo Saito, and Ronghua Liang, Eds. 2006, vol. 4282 of Lecture Notes in Com- puter Science, pp. 889–900, Springer. [91] Tamal K. Dey and James Hudson, “Pmr: Point to mesh rendering, a feature-based approach.,” in IEEE Visualization, 2002, pp. 155–162. [92] Roland Wahl, Michael Guthe, and Reinhard Klein, “Identifying planes in point- clouds for efficient hybrid rendering,” in The 13th Pacific Conference on Com- puter Graphics and Applications, Oct. 2005, pp. 1–8. [93] Martin A. Fischler and Robert C. Bolles, “Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography,” Commun. ACM, vol. 24, no. 6, pp. 381–395, June 1981. [94] Marko Kuder, Marjan Šterk, and Borut Žalik, “Point-based rendering optimiza- tion with textured meshes for fast lidar visualization,” Computers & Geosciences, vol. 59, no. 0, pp. 181 – 190, 2013. [95] Gael Guennebaud, Loô rc Barthe, and Mathias Paulin, “Dynamic Surfel Set Refinement for High Quality Rendering,” Computer & Graphics, vol. 28, no. 6, pp. 827–838, 2004. [96] Bui Tuong Phong, “Illumination for computer generated pictures,” Commun. ACM, vol. 18, no. 6, pp. 311–317, June 1975. [97] Oliver Kreylos, Gerald Bawden, and Louise Kellogg, “Immersive visualization and analysis of lidar data,” in Advances in Visual Computing, George Bebis, Richard Boyle, Bahram Parvin, Darko Koracin, Paolo Remagnino, Fatih Porikli, JÃ˝ urg Peters, James Klosowski, Laura Arns, Yu Chun, Theresa-Marie Rhyne, and Laura Monroe, Eds., vol. 5358 of Lecture Notes in Computer Science, pp. 846–855. Springer Berlin / Heidelberg, 2008. [98] “Vertex buffer object,” http://www.opengl.org/wiki/Vertex_Buffer_Object, 2012. [99] “LAS Specifications,” http://www.asprs.org. [100] Martin Isenberg, “LAStools: converting, filtering, viewing, gridding, and com- pressing LIDAR data,” 2012. 130 [101] Abdullatif Alharthy and James Bethel, “Heuristic filtering and 3d feature extrac- tion from lidar data,” in In ISPRS Commission III, Symposium 2002 September 9 - 13, 2002, 2002, pp. 23–28. [102] W. Cho, Y-S. Jwa, H-J. Chang, and S-H Lee, “Pseudo-grid based building extrac- tion using airborne lidar data,” International Archives of Photogrammetry and Remote Sensing, vol. 35, no. part 3, pp. 378–381, 2004. [103] K Pearson, “Pearson, k. 1901. on lines and planes of closest fit to systems of points in space.,” Philosophical Magazine, vol. 2, no. 11, pp. 559–572, 1901. [104] “Geometry shader,” http://www.opengl.org/wiki/Geometry_Shader, 2010. 
[105] “The Tao Framework, Open source library,” http://sourceforge.net/projects/taoframework/, 2008. [106] William R. Mark, R. Steven Glanville, Kurt Akeley, and Mark J. Kilgard, “Cg: a system for programming graphics hardware in a c-like language,” in ACM SIGGRAPH 2003 Papers, New York, NY , USA, 2003, SIGGRAPH ’03, pp. 896– 907, ACM. [107] Thomas H. Cormen, Clifford Stein, Ronald L. Rivest, and Charles E. Leiserson, Introduction to Algorithms, McGraw-Hill Higher Education, 2nd edition, 2001. [108] Mark Pauly, Markus Gross, and Leif P. Kobbelt, “Efficient simplification of point- sampled surfaces,” 2002. [109] Trevor J. Hastie, Robert John Tibshirani, and Jerome H. Friedman, The elements of statistical learning : data mining, inference, and prediction, Springer series in statistics. New York, N.Y . Springer, 2009. [110] Lars Linsen, “Point cloud representation,” Tech. Rep., 2001. [111] Cengiz Oztireli, Gaël Guennebaud, and Markus Gross, “Feature Preserving Point Set Surfaces based on Non-Linear Kernel Regression,” Computer Graphics Forum, vol. 28, no. 2, pp. 493–501, 2009. [112] Florent Lafarge and Pierre Alliez, “Surface reconstruction through point set struc- turing,” in Proc. of Eurographics, Girona, Spain, 2013. [113] Gaël Guennebaud and Markus Gross, “Algebraic point set surfaces,” in ACM SIGGRAPH 2007 papers, New York, NY , USA, 2007, SIGGRAPH ’07, ACM. [114] Paolo Cignoni, Massimiliano Corsini, and Guido Ranzuglia, “Meshlab: an open- source 3d mesh processing system,” ERCIM News, , no. 73, pp. 45–46, April 2008. 131 [115] Florent Lafarge and Clément Mallet, “Modeling urban landscapes from point clouds: a generic approach,” Technical Report RR-7612, INRIA, May 2011. [116] Tilo Ochotta and Stefan Hiller, “Hardware rendering of 3d geometry with eleva- tion maps,” Shape Modeling and Applications, International Conference on, vol. 0, pp. 1–10, 2006. [117] Paul Rosenthal and Lars Linsen, “Image-space point cloud rendering,” in Pro- ceedings of Computer Graphics International, 2008, pp. 136–143. [118] Renato Pajarola, “Efficient level-of-details for point based rendering.,” in Com- puter Graphics and Imaging. 2003, pp. 141–146, IASTED/ACTA Press. [119] R. Pajarola, Overview of Quadtree-based Terrain Triangulation and Visualiza- tion, Technical report (University of California, Irvine. Dept. of Information and Computer Science). Department of Information & Computer Science, University of California, Irvine, 2002. [120] Gaô nl Guennebaud, Loô rc Barthe, and Mathias Paulin, “Splat/mesh blending, perspective rasterization and transparency for point-based rendering.,” in SPBG, Mario Botsch, Baoquan Chen, Mark Pauly, and Matthias Zwicker, Eds. 2006, pp. 49–57, Eurographics Association. [121] Ruwen Schnabel and Reinhard Klein, “Octree-based point-cloud compression.,” in SPBG, Mario Botsch, Baoquan Chen, Mark Pauly, and Matthias Zwicker, Eds. 2006, pp. 111–120, Eurographics Association. [122] T. Ochotta, S. Hiller, and D. Saupe, Single-pass High-quality Splatting, Kon- stanzer Schriften in Mathematik und Informatik. Fachbereich für Mathematik und Statistik, 2006. 132
Abstract
Aerial LiDAR (Light Detection and Ranging) is cost-effective in acquiring terrain and urban information by mounting a downward-scanning laser on a low-flying aircraft. It produces huge volumes of unconnected 3D points. This thesis focuses on the interactive visualization of aerial LiDAR point clouds of cities, which is applicable to a number of areas including virtual tourism, security, land management and urban planning.

A framework needs to address several challenges in order to deliver useful visualizations of aerial LiDAR cities. Firstly, the data is 2.5D, in that the sensor is only able to capture dense details of the surfaces facing it, leaving few samples on vertical building walls. Secondly, the data often suffers from noise and under-sampling. Finally, the large size of the data can easily exceed the memory capacity of a computer system.

This thesis first introduces a visually-complete rendering framework for aerial LiDAR cities. By inferring classification information, building walls and occluded ground areas under tree canopies are completed either through pre-processing point cloud augmentation or through online procedural geometry generation. A multi-resolution out-of-core strategy and GPU-accelerated rendering enable interactive visualization of virtually unlimited size data. With adding only a slight overhead to existing point-based approaches, the framework provides comparable quality to visualizations of off-line pre-computation of 3D polygonal models.

The thesis then presents a scalable out-of-core algorithm for mapping colors from aerial oblique imagery to city-scale aerial LiDAR points. Without intensive processing of points, colors are mapped via a modified visibility pass of GPU splatting, and a weighting scheme leveraging image resolution and surface orientation.

To alleviate visual artifacts caused by noise and under-sampling, the thesis shows an off-line point cloud refinement algorithm. By explicitly regularizing building boundary points, the algorithm can effectively remove noise, fill gaps, and preserve and enhance both normal and position discontinuous features for piece-wise smoothing buildings with arbitrary shape and complexity.

Finally, the thesis introduces a new multi-resolution rendering framework that supports real-time refinement of aerial LiDAR cities. Without complex computation and without user interference, simply based on curvature analysis of points of uniform sized spatial partitions, hierarchical hybrid structures are constructed indicating whether to represent a partition as point or polygon. With the help of such structures, both rendering and refinement are dynamically adaptive to views and curvatures. Compared to visually-complete rendering, the new framework is able to deliver comparable visual quality with less than 8% increase in pre-processing time and 2-5 times higher rendering frame-rates. Experiments on several cities show that the refinement improves rendering quality for large magnification under real-time constraint.