Feature-preserving simplification and sketch-based creation of 3D models
FEATURE-PRESERVING SIMPLIFICATION AND SKETCH-BASED CREATION OF 3D MODELS

by Pei-Ying Chiang

A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(COMPUTER SCIENCE)

August 2011

Copyright 2011 Pei-Ying Chiang

Table of Contents

List Of Tables
List Of Figures
Abstract

Chapter 1: Introduction
  1.1 Significance of the Research
  1.2 Review of Previous Work
  1.3 Contributions of the Research
  1.4 Organization of the Dissertation

Chapter 2: Research Background
  2.1 Automated Metadata Indexing and Analysis (AMIA) System
    2.1.1 Introduction
      2.1.1.1 Project Goal
      2.1.1.2 Project Resources
      2.1.1.3 System Framework
      2.1.1.4 System Requirement
      2.1.1.5 System Performance
    2.1.2 Text-based Indexing and Retrieval
      2.1.2.1 File Distributor
      2.1.2.2 Tag Auto-Populator
      2.1.2.3 Filename Parser
      2.1.2.4 Maya Parser
      2.1.2.5 AlienBrain Parser
      2.1.2.6 Apache Lucene Indexer
    2.1.3 Implementation Details and Issues of Text-based Parsers
      2.1.3.1 String Segmentation
      2.1.3.2 Word Regulation
    2.1.4 System Performance Comparison
    2.1.5 Ongoing and Future Work
  2.2 Techniques for 3D Thumbnail Generation
    2.2.1 Mesh Simplification
    2.2.2 Thinning and Skeletonization

Chapter 3: Feature-Preserving 3D Thumbnail Creation via Mesh Decomposition and Approximation
  3.1 Introduction
  3.2 System Framework
    3.2.1 Mesh Decomposition
    3.2.2 Extracting the Skeleton and Body Measurements with PCA Transformation
      3.2.2.1 Shape Descriptor
    3.2.3 Iterative Approximation and Primitive Selection
      3.2.3.1 Bit Budget Determination
      3.2.3.2 Customized D-cylinder
      3.2.3.3 Primitive Approximation
      3.2.3.4 Output Thumbnail Descriptor
    3.2.4 Online Rendering with Reverse PCA Transformation
  3.3 Distortion Error Estimation
  3.4 Experimental Results
  3.5 Conclusion and Future Work

Chapter 4: Voxel-based Shape Decomposition for Feature-Preserving 3D Thumbnail Creation
  4.1 Introduction
  4.2 Overview of Proposed System
  4.3 Skeleton Refinement
    4.3.1 Skeleton Voxel (SV) Classification and Linking
    4.3.2 Skeleton Post-processing Techniques
  4.4 Shape Decomposition
  4.5 Experimental Results
    4.5.1 Comparison of Decomposed 3D Models
    4.5.2 Computational Time and File Size
    4.5.3 Subjective Evaluation
    4.5.4 Discussion
  4.6 Conclusion

Chapter 5: Sketch-based 3D (S3D) Modeling
  5.1 Introduction
  5.2 2D Sketch Editing
  5.3 Mapping 2D Sketch to 3D Model with Elements
    5.3.1 Open Tube
    5.3.2 Other Modeling Elements
  5.4 Flexible Object Grouping and Manipulation
  5.5 Applications and User Evaluation
  5.6 Conclusion and Future Work

Chapter 6: Conclusion and Future Work
  6.1 Summary of Current Research
  6.2 Future Research Topics

References

List Of Tables

2.1 System performance.
2.2 A filename that follows the naming convention and the translated tags associated with that file.
2.3 Statistical analysis of exemplary words.
2.4 The truth table for deciding segmentation.
2.5 System performance of the proposed AMIA search engine.
2.6 System performance of the commercial DAM tool.
3.1 System performance.
4.1 Computational benchmarking.
4.2 Subjective preference of algorithms.

List Of Figures

2.1 Flowchart: overall system design.
2.2 Screenshot of the AMIA search interface.
2.3 Sample data in AlienBrain .amd files.
3.1 System framework.
3.2 Decomposing the model into significant parts: (a) original models; (b) mesh decomposition results obtained with the approach of Lin et al. [34], where the main body is shown in red and the other parts in different colors; (c) the minimum bounding box of each part derived after PCA transformation; (d) the rough 3D thumbnail approximating each decomposed part with a single rectangle; and (e) approximation with a single cylinder.
3.3 The skeleton and body measurements extracted after PCA transformation. Each black line represents a part's skeleton, and each rectangular layer represents the estimated body measurements along the skeleton. In this picture, all lines and layers were reverse-PCA transformed back to the original coordinate system for visual illustration.
3.4 The shape descriptor for a single 3D model.
3.5 The D-cylinder: (a) the filled D-cylinder; (b) the wire-frame of the D-cylinder. The deformable d-cylinder is composed of an upper ellipse, a lower ellipse, and a body composed of multiple quadrangles dividing the ellipses uniformly.
3.6 Illustration of the coarse-to-fine approximation process: (a) the skeleton of part P_k within the range [SK_k1, SK_kn]; (b) the decomposed part P_k is approximated by a single d-cylinder, and the layer with the maximum distortion error is found at SK_ki; (c) the old d-cylinder is divided at SK_ki and replaced by two new d-cylinders.
3.7 The thumbnail descriptor for a single model.
3.8 The distortion error estimation: (a) the skeleton and the slices of the associated body measurements represent the original shape; (b) each decomposed part is approximated with a few d-cylinders; (c) the distortion error between the original mesh and the approximated d-cylinder is estimated at each slice.
3.9 Results of the thumbnail descriptor.
3.10 More results of the thumbnail descriptor.
3.11 Different approximation results generated by adjusting the weighting parameters. Both thumbnails are composed of 8 d-cylinders, where (a) shows the result with a smaller w_2 value and (b) shows the result with a larger w_2 value.
3.12 The Java 3D applet-based 3D thumbnail viewer, with which a remote user can browse multiple 3D thumbnails interactively within a few seconds.
4.1 The block diagram of the feature-preserving 3D thumbnail creation system introduced in Chapter 3 [17]; this work focuses on improving the three blocks related to shape decomposition, highlighted in orange.
4.2 Illustration of skeleton extraction, classification and skeleton decomposition: (a) object-voxels representing the volumetric model are shown in light gray while the thinned SVs are shown in black; (b) End-SVs, Joint-SVs, and Normal-SVs are shown in red, yellow and black, respectively; (c) SVs are divided into multiple groups shown in different colors; (d) extracted turning points (Peak-SVs) representing local peaks are shown in purple; (e) object-voxels are decomposed into multiple parts roughly by assigning them to their nearest SVs; (f) the ideal shape decomposition result.
4.3 Illustration of one central voxel and its 26-adjacent voxels, where the two black voxels sharing only a common vertex are still viewed as neighbors.
4.4 Illustration of turning point extraction using the global distance.
4.5 Examples of skeleton post-processing using the re-classifying filter, where (a) and (b) are before the linking process while (c) and (d) are after the linking process: (a) the original skeleton containing clusters of adjacent voxels which are classified as Joint-SVs and represented by yellow cubes; (b) one voxel in each cluster being selected as the representative while others are re-classified as redundant joints and denoted with gray cubes; (c) the original skeleton with many redundant small loops formed by adjacent Joint-SVs; (d) these small loops being removed.
4.6 Skeleton post-processing using two filters in cascade: (a) the original skeleton; (b) after applying the re-classifying filter to the original; (c) after applying the replacing filter to the result in (b).
4.7 Examples of three jagged skeletons and their smoothing results.
4.8 Removing a sub-branch from the skeleton of a horse model: (a) a sub-branch in the neck region of a horse model; (b) incorrect shape decomposition around the neck caused by this sub-branch; (c) the sub-branch is removed and its adjacent skeleton groups are merged; (d) the shape decomposition result with the new decomposed skeleton.
4.9 More examples of sub-branch removal, where the three skeletons in the top row contain sub-branches while the three skeletons in the bottom row are obtained by the proposed sub-branch removal algorithm.
4.10 The base part identification procedure: (a) the path along the skeleton from one part to another, where the green dots represent centers of different parts; and (b) illustration of identified base parts shown in red for a few 3D models.
4.11 Illustration of the skeleton-guided shape decomposition process: (a) the result from the initial decomposition; (b) invalidating the segment of the protruding skeleton that goes beyond its boundary; (c) extending the base skeleton by merging a segment of a protruding skeleton; (d) dividing groups into sub-groups with turning points.
4.12 Comparison of skeletons obtained by the thinning operation alone (left) and by the thinning operation followed by the skeleton refinement process (right).
4.13 Thumbnails obtained by the proposed voxel-based shape decomposition scheme, where the original 3D models were decomposed into multiple parts and each part was approximated by a fitting primitive.
4.14 Thumbnails approximated by a small number of primitives (given in the lower left corner of each thumbnail) with the voxel-based approach.
4.15 Comparison of the surface-based and the voxel-based decomposition schemes: (a1) a mesh decomposed by the surface-based technique [34]; (a2) the bottom view of the decomposed base part, where several pieces are missing; (a3) the extracted skeleton and body measurements; (a4) the resultant thumbnail with an upward base part caused by the missing bottom pieces; (b1) the shape decomposed by the voxel-based scheme; (b2) the improved skeleton and body measurements; and (b3) the resultant thumbnail of higher quality.
4.16 Comparison of simplified models at different resolutions, where models in the top row were simplified by Garland's method [23] and models in the bottom row by the proposed scheme.
4.17 Examples of skeletons that do not represent the correct structure of a shape and result in failed thumbnails.
5.1 User interface of the proposed S3D system.
5.2 Illustration of the point rearrangement process, where S is the start point and K_1 and K_n are turning points: (a) the start point lies between two turning points; (b) the start point is always a turning point.
5.3 Examples of four user-drawn 2D contours (top row) and the results of applying the natural cubic curve fitting method (left two in the bottom row) and the line fitting method (right two in the bottom row).
5.4 (a) User input contour; (b) extracted turning points (red dots) and the first contour point (blue dot); and contours refined by (c) the quadratic Bezier curve fitting, (d) the natural cubic curve fitting, (e) the line fitting, and (f) the feature-preserving curve fitting methods.
5.5 Comparison of different curve fitting methods: (a) user input contours; contours refined by (b) the quadratic Bezier curve fitting, (c) the natural cubic curve fitting, and (d) the feature-preserving curve fitting methods.
5.6 Illustration of five mapping rules used to convert 2D contours/skeletons to 3D elements (where the contour is in gray and the skeleton is in black): (a) the open-tube; (b) the closed-tube; (c) the ellipsoid; (d) the prism; and (e) the complex-prism.
5.7 The radius of the K_1 cross section.
5.8 Comparison of 3D objects created by two surface normal estimation methods: (a) a naive surface normal estimation method and (b) an improved surface normal estimation method.
5.9 Results of (a) a naive radius estimation algorithm, (b) an improved radius estimation algorithm with added constraints, and (c) an even better radius approximation obtained by adding more points.
5.10 When user-drawn skeleton points do not lie on the medial axis of the contour, the d-cylinder leans to one side of the contour, as shown in the body d-cylinder of the cartoon cat.
5.11 Examples of the skeleton correction result: (a) the user-drawn contour and skeleton; (b) the skeleton points are adjusted to the middle points of their two contour intersections; (c) the approximating open-tube cannot be well fitted in the contour.
5.12 Medial axis extraction by a distance transformation method [12].
5.13 (a) A 3D ellipsoid (right) created from an elliptical contour (left), where the 3D ellipsoid is composed of multiple parallel d-cylinders with different heights; and (b) an ellipsoid (right) created using the open-tube element with both a contour and a skeleton (left).
5.14 An example of the prism: (a) the user-drawn 2D contour; (b) specifying the thickness by drag-and-drop; and (c) the fitting prism created according to the 2D contour and the user-specified thickness.
5.15 (a) The outer contour and inner contour drawn by the user; (b) they are divided into two halves, each of which is mapped to a prism; and (c) the resulting complex-prism.
5.16 The thickness of the complex-prism can be specified by drag-and-dropping the contour.
5.17 An example of a human model composed of multiple parts and managed in a tree structure, where a part attached to another part is a child node of the attached node in this tree graph.
5.18 An example of duplicating objects, where the selected item is highlighted in red and the gray items are duplicated from the selected one.
5.19 Examples of 3D model creation: (a) 2D cartoon characters; (b) user-drawn contours based on the 2D character; (c) user-drawn skeletons, where the blue lines (in the bottom example) illustrate the estimated cross sections along the skeleton, which are used for adjusting the size of the approximating d-cylinders; (d) the resultant 3D models.
5.20 More examples of reference images and the resultant 3D models.
5.21 Illustration of the object animation idea: (a) a 2D cartoon character; (b) the resulting 3D model; and (c)-(g) different gestures of the 3D model.
5.22 Before the actual test, each subject was trained to create these five 3D elements based on their 2D images.

Abstract

A prototype of an innovative 3D thumbnail system for managing large 3D mesh databases is presented in this research. The goal is to provide an online 3D model exhibit page where the user can browse multiple 3D thumbnails interactively and efficiently.

An overall system framework for a large-scale 3D repository is described. It includes an offline process and an online process. In the offline process, a 3D mesh is first decomposed into several significant components. For each decomposed part, its skeleton and body measurements are extracted and saved as the shape descriptor. Subsequently, its thumbnail is created according to the shape descriptor and saved as the thumbnail descriptor. In the online process, according to the user's preference, the system can either render the 3D thumbnail directly from its pre-generated thumbnail descriptor or re-generate the 3D thumbnail descriptor from a pre-generated shape descriptor without starting from scratch. The data size of a thumbnail descriptor is much smaller than that of its original mesh, so it can be downloaded quickly. Rendering a simplified thumbnail demands fewer hardware resources, and the online thumbnail viewer can display multiple 3D thumbnails simultaneously within a few seconds.

Furthermore, we develop two feature-preserving thumbnail creation techniques: the surface-based and the voxel-based methods. In the surface-based technique, a 3D polygonal mesh is decomposed by a visual-salience-guided mesh decomposition approach that identifies and preserves significant components. For each decomposed part, its skeleton and body measurements are extracted after the PCA transformation. Then, a coarse-to-fine primitive approximation algorithm is used to create the 3D thumbnail.
Moreover, a customized deformable primitive, called the d-cylinder, is designed to approximate the shape better and refine the appearance of the resultant thumbnail. We generate the 3D thumbnail with different numbers of d-cylinders so that the thumbnail can represent a simplified mesh at different levels of detail. The processing time of each step and the file size of the 3D thumbnail descriptor are given to show the efficiency of the surface-based approach.

In the voxel-based approach, a polygonal model is first rasterized into a volumetric model and a coarse skeleton is extracted with a thinning operation. The skeleton derived from the thinning process is further refined to meet the required accuracy. Subsequently, the skeleton is classified into significant groups, and the volumetric model is decomposed into significant parts accordingly. As compared with the surface-based approach, the voxel-based approach can preserve more features of the model and decompose the model more precisely. Thus, the significant components of the original model are better preserved in the 3D thumbnails even when the model is extremely simplified. A thorough performance comparison between the surface-based and the voxel-based techniques is conducted.

Finally, the framework is extended to sketch-based 3D modeling. We present a sketch-based 3D system that provides a set of tools for users to create simple 3D models from 2D sketches easily. A user-drawn 2D sketch is refined and then approximated by 3D fitting primitives. There are five new customized elements developed for approximation: the open-tube, the closed-tube, the ellipsoid, the prism, and the complex-prism. When a 3D model is composed of multiple parts, they can be grouped hierarchically using editing tools. Furthermore, the system embeds a 3D character with a hierarchical skeleton for the purpose of animation.

Chapter 1
Introduction

1.1 Significance of the Research

Digital media has advanced rapidly in recent years.
Due to the demand for digital assets from various industries (e.g., education, entertainment, manufacturing), digital files are being created daily and continue to grow by orders of magnitude. Because of rapid file creation and editing, there are increased demands on systems to store and retrieve such files. Users benefit tremendously from effective reuse of existing resources. For example, the University of Southern California (USC) Institute for Creative Technologies (ICT) [6] currently holds over 400,000 files, ranging from simple photographs to complex Maya scene files with a multitude of referenced textures. It is important to be able to set up a customizable search system to quickly and accurately locate an asset. Props, characters, animation sequences, textures and other digital files are all housed in a central server, along with back-up copies. Since a large simulation can typically hold upwards of hundreds of characters, it is important to discriminate between many different fields to determine the correct character, vehicle, or other files to be used or manipulated. Most existing Digital Asset Management (DAM) systems require human input to provide asset descriptions manually, which is not practical for a large database system.

A DAM system that builds a structured database for effective search in a large repository of digital assets is in demand. With increasingly complex DAM systems commercially available, the standard is constantly being revised. An accurate and efficient method to retrieve these digital assets has become a primary challenge. The presented research is an ongoing project that includes a multimedia indexing/search system, such as "2D image/3D mesh retrieval", for a large art asset database. We began with text-based indexing, since it is still the most reliable approach compared to other content-based media features. Automatic metadata extraction is essential to the management of a real-world database system. This practical issue is discussed in Chapter 2.
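The text-based indexing idea can be illustrated with a toy inverted index. This is a minimal sketch only, not the AMIA implementation (which builds on Apache Lucene); the asset filenames and the tokenization rule here are hypothetical:

```python
import re
from collections import defaultdict

def tokenize(name):
    # Split an asset filename into lowercase terms on non-alphanumerics and
    # camelCase boundaries (a simplified stand-in for a filename parser).
    name = re.sub(r'([a-z])([A-Z])', r'\1 \2', name)
    return [t.lower() for t in re.split(r'[^A-Za-z0-9]+', name) if t]

def build_index(assets):
    # Map each term to the set of asset ids whose name contains it.
    index = defaultdict(set)
    for asset_id, name in assets.items():
        for term in tokenize(name):
            index[term].add(asset_id)
    return index

def search(index, query):
    # AND-query: return ids that match every query term.
    sets = [index.get(t, set()) for t in tokenize(query)]
    return set.intersection(*sets) if sets else set()

# Hypothetical asset names for illustration.
assets = {1: "soldierWalk_v02.ma", 2: "jeep_texture.tga", 3: "soldierIdle.ma"}
idx = build_index(assets)
print(search(idx, "soldier walk"))  # {1}
```

As the surrounding discussion notes, such a scheme only works when assets are named according to a convention; randomly named files carry no searchable terms.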
The proposed text-based indexing and retrieval tool is capable of managing assets that were named properly or that have embedded attributes already linked to the file. However, there are still many digital assets, often created years ago, which lack suitable metadata and were named randomly. These assets are inaccessible using the text-based indexing tool.

Content-based shape retrieval, i.e., the shape matching problem, remains a challenging one, since many 3D models are complicated and a 3D shape can easily be classified into different categories when its posture changes. Current DAM tools display static 2D thumbnails on the search page for easier browsing. These 2D thumbnails need to be pre-captured manually, and it is extremely time-consuming to capture thumbnails for a large-scale 3D asset database. In addition, while a 2D thumbnail might capture the best shot of a 3D object, there are still other features that cannot be seen from a fixed angle. Our first research objective is to design a system framework that achieves online 3D thumbnail preview for a large-scale database, where an interactive 3D thumbnail that can be viewed from different angles is provided to help a user efficiently browse 3D models in a large 3D database.

Since rendering complicated 3D models simultaneously requires substantial hardware resources and degrades system performance significantly, we need to simplify a 3D model for efficient rendering while preserving its features at the same time. Several approaches and algorithms for mesh/shape simplification have been proposed in the past. However, most of them did not address the issue of preserving the features of a model or avoiding their elimination. For example, the limbs and the body can meld together when the model is extremely simplified. Thus, our second research objective is to develop automatic feature-preserving 3D thumbnail generation techniques.
They are the surface-based and the voxel-based methods, which will be detailed in Chapters 3 and 4, respectively. The shape descriptor and the thumbnail descriptor for each model are pre-generated offline, so the user can browse multiple 3D thumbnails efficiently online. The shape descriptor that we create for each model can be further used for various shape analysis applications such as shape retrieval.

Finally, the framework is extended to sketch-based 3D modeling. Many techniques developed in Chapters 3 and 4, such as turning point extraction and primitive approximation, can be shared and reused for this purpose. We present a sketch-based 3D (S3D) system that offers a set of tools for users to create simple 3D models from 2D sketches easily. Although there are a few 3D modeling tools available to the public, many of them are not user friendly. It may take days or even months for users to become familiar with dedicated 3D model creation software, and casual users can easily get frustrated. To address this concern, more intuitive and simplified modeling techniques are needed. In the proposed S3D system, a user-drawn 2D sketch is refined and then approximated by 3D fitting primitives. There are five new customized elements developed for approximation: the open-tube, the closed-tube, the ellipsoid, the prism, and the complex-prism. When a 3D model is composed of multiple parts, they can be grouped hierarchically using editing tools. Furthermore, the system embeds a 3D character with a hierarchical skeleton for the purpose of animation.

1.2 Review of Previous Work

Automatic Generation of 2D Thumbnails

The problem of automatically selecting the pose of a 3D object that corresponds to the most informative view of the shape is known as the best view problem. Mortara et al. [40] proposed an approach to select the pose of a 3D object that corresponds to the most informative and intuitive view of the shape automatically.
Their solution was driven by the meaningful features of the shape, maximizing the visibility of salient components from the context or application's point of view. Meaningful features can be automatically detected by semantic-oriented segmentations. However, judging a set of views as the best ones for an object is closely related to the nature of human perception of 3D shape. A fixed selection rule does not work well for all objects, and the result can just as easily capture the wrong features. In addition, while a 2D thumbnail might capture the best shot of a 3D object, there are still other features that cannot be seen from a fixed angle.

Shape-based Retrieval and Shape Descriptors

A typical 3D shape retrieval system, such as the 3D model search engine developed at Princeton University [39], consists of a database with an index structure and an online query engine. A shape descriptor, which contains a compact description of the shape, is first created for each 3D model offline and used to represent the model in the online process. In the online query process, the query engine computes the query descriptor and searches for the models whose shape descriptors match the query descriptor. Models similar to the query model are retrieved from the database. The similarity between two descriptors is quantified by a distance measure. Sundar et al. [46] encode the geometric and topological information in terms of a skeletal graph and local shape descriptors, which are held at each node of the graph. These local shape descriptors contain relevant information such as the mean, the radius, the degree of freedom about a joint, or the degree of importance of a particular joint or node. They developed a graph matching method based upon the skeleton of a 3D shape and used the local shape descriptors for the matching purpose. A shape descriptor should capture the geometric and topological properties of a 3D object well, so that it can be used for discriminating different objects.
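The query step described above can be sketched as nearest-neighbor ranking over descriptor vectors. This is a toy illustration only: representing each shape by a fixed-length feature vector and comparing with Euclidean distance are simplifying assumptions, not the descriptors actually used by the cited systems, and the model names and values are made up:

```python
import numpy as np

def rank_by_similarity(query_desc, database):
    # database: {model_id: descriptor vector}; a smaller L2 distance
    # between descriptors means the shapes are considered more similar.
    ids = list(database)
    descs = np.stack([database[i] for i in ids])
    dists = np.linalg.norm(descs - query_desc, axis=1)
    order = np.argsort(dists)
    return [(ids[k], float(dists[k])) for k in order]

# Hypothetical 3-dimensional descriptors for three database models.
db = {
    "chair_a": np.array([0.90, 0.10, 0.30]),
    "table_b": np.array([0.20, 0.80, 0.50]),
    "chair_c": np.array([0.85, 0.15, 0.35]),
}
query = np.array([0.88, 0.12, 0.32])
for model_id, d in rank_by_similarity(query, db):
    print(model_id, round(d, 3))   # chairs rank above the table
```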
On the other hand, a shape descriptor should be insensitive to noise and small features: small changes in a shape, or versions of a model at different levels of detail, should not differ significantly from the original model. However, poor sensitivity leads to poor discriminative ability.

Mesh Simplification

Mesh simplification techniques have been developed for more than a decade to support multi-resolution applications. Since a complex 3D model contains millions of polygons and requires a large amount of memory and time to process, it can significantly degrade runtime performance. Many applications may require (or can only support) a low-resolution model instead of the original complex model. For example, when an object is far away from the camera, rendering can be sped up if its low-resolution version is used. Previous work on mesh simplification can be categorized into two approaches: surface-based and voxel-based. The surface-based simplification approach uses an iterative process to approximate the polygonal mesh with fewer and fewer polygons. Garland et al. [24] developed a surface-based simplification algorithm for producing high-quality approximations of polygonal models. Their algorithm uses iterative contractions of vertex pairs to simplify meshes and maintains minimum-surface-error approximations using the quadric metric. The voxel-based mesh simplification approach rasterizes the model into a binary grid and uses a set of 3D morphological operations to simplify the volumetric model. He et al. [27] used sampling and low-pass filtering operations to transform an object into a multi-resolution volume buffer. Visually unimportant features, such as tubes and holes, can be eliminated with the morphological operators.

Mesh Decomposition

Mesh decomposition has become an important component in many computer graphics applications such as modeling, compression, simplification, skeleton extraction and 3D shape retrieval.
Research on mesh decomposition can be classified into two types based on the way an object is partitioned.

1. Part-identifying. This type aims at identifying parts that correspond to meaningful features of the shape. Katz et al. [31] found meaningful components using a clustering algorithm while keeping the boundaries between the components fuzzy. Then, the algorithm focuses on the small fuzzy areas and finds the exact boundaries, which go along the features of the object.

2. Patch-partitioning. This type uses surface geometric properties of the mesh, such as the curvature or the distance to a fitting plane, and the mesh is segmented into a number of patches that are uniform with respect to a certain property. Attene et al. [13] presented a partitioning-based approach that segments the mesh hierarchically based on fitting primitives. Based on a hierarchical face clustering algorithm, the mesh is segmented into patches that best fit a pre-defined set of primitives such as planes, spheres, and cylinders. Initially, each triangle represents a single cluster. Then, at each iteration, all pairs of adjacent clusters are considered, and the pair that can be best approximated with one of the primitives forms a new single cluster.

Sketch-Based 3D Modeling

The sketch pad, invented by Sutherland [47] in 1963, opened up a new way for human-machine communication. A user could interact with the machine using a light pen and draw geometric figures directly on a vector display instead of typing. The sketch-based human-machine interface has become popular these days, since it is intuitive and easy to use. It offers an interaction style similar to pencil drawing for geometric modeling, animation, learning, etc. Teddy [28], the first well-known sketch-based modeling software, was developed by Igarashi et al. in 1999. It offers a sketching interface for users to draw free-form models using the inflation technique [26].
It supports several modeling operations, such as extrusion and cutting, for mesh editing. Alexe et al. [11] reconstructed a 3D shape from its 2D sketch using convolution surfaces [16] with polyline and polygon skeletons. Schmidt et al. [45] utilized hierarchical implicit volume models (BlobTrees) as an underlying shape representation to inflate 2D contours into 3D rounded implicit volumes. More recently, Gingold and Igarashi [25] developed another sketch-based 2D-to-3D modeling system that can create a 3D model by placing primitives and annotations on a 2D image.

The sketch-based 3D (S3D) system proposed in Chapter 5 shares one common idea with [25]: it attempts to create a 3D model based on the drawing of 2D contours and skeletons. However, surfaces with edges or relatively flat surfaces cannot be properly handled in [25]. As compared with previous work, one major feature of the S3D system is its capability to allow users to create not only rotund 3D objects but also man-made objects such as a flat prism or a prism with an arbitrary hole. Moreover, the created 3D model is ready to animate, since it has a hierarchical skeleton structure. For more information about sketch-based modeling, we refer to the survey in [38].

1.3 Contributions of the Research

In this research, we propose a feature-preserving thumbnail creation system. The main contributions of our work are summarized below.

• Thumbnail creation with feature-preserving shape simplification
To preserve the important topology of a 3D object, we develop a feature-preserving shape simplification technique, where the significant parts of a model are first distinguished and then simplified individually. As compared with previous shape simplification techniques, our approach treats "feature preserving" as the highest priority. Thus, significant features of the model can be better preserved by our approach. This is especially true when the model is extremely simplified.
• On-line 3D thumbnail browsing
To improve the search efficiency for 3D models, researchers mainly focused on shape-based retrieval in the past. In this work, we propose an innovative 3D thumbnail framework. The remote user can browse multiple 3D thumbnails efficiently on the search page. Unlike a static 2D thumbnail, which can only show a fixed angle of a 3D model, the 3D thumbnail can be examined from different angles interactively. The system framework is well designed and can be used for a large-scale database. The shape descriptors and the thumbnail descriptors can be pre-generated offline, and the thumbnail can be rendered efficiently in the online process.

• Iterative coarse-to-fine approximation
The developed 3D thumbnail represents a simplified shape with approximated primitives. We present an innovative primitive approximation algorithm to approximate a shape from the coarse level to the fine level in an iterative process. We first generate the roughest thumbnail, composed of a minimum number of primitives, and enhance the thumbnail representation by adding more primitives until the total bit budget is met. A 3D thumbnail containing a higher or lower level of detail can be created according to the available hardware resources or the user's preference.

• 3D shape description
We present a shape descriptor to describe the topology of a 3D model in a simplified format. The shape descriptor describes the skeleton and the body measurements of each significant part individually, and is used to represent the original model for further analysis in our system. The shape descriptor, instead of the original model, can be reused to generate 3D thumbnails at different resolutions. In addition, since the shape descriptor is designed to capture the properties of a 3D model and can be used to discriminate shapes, these shape descriptors are valuable for shape matching in our future research. Here, we present two approaches for extracting the shape descriptor of each model.
– A surface-based shape decomposition approach is presented in Chapter 3. We extend the work of [34] on mesh decomposition. The significant parts of a model are first identified. The skeleton and the body measurements describing each part are then extracted with the PCA transformation.

– A voxel-based shape decomposition approach is described in Chapter 4. The skeleton is refined, and the shape can be decomposed into sub-parts more precisely.

A performance comparison of these two approaches is discussed in Chapter 4.

• Primitive approximation with customized deformable primitives
The proposed 3D thumbnail represents a simplified shape with approximated primitives. A regular primitive such as a rectangle or a cylinder has a fixed size from the top to the bottom, so it is not flexible in fitting a generic shape. To fit a generic shape better, a customized deformable primitive, the d-cylinder, is designed.

• User-friendly sketch-based 3D modeling system
The design of the S3D system consists of three challenging tasks: 1) 2D sketch refinement, 2) 3D primitive approximation and 3) 3D object editing. The 2D sketch refinement is needed since it is difficult to draw a precise curve with today's computer input devices; user-drawn contours and skeletons are often jagged, discontinuous and unsmooth. An innovative primitive approximation scheme is demanded so that a user can create 3D models easily with a variety of primitive combinations. Finally, a 3D editing tool is provided for users to manage 3D objects with operations such as deletion, duplication and grouping. These challenging tasks will be addressed in Chapter 5.

1.4 Organization of the Dissertation

The rest of this dissertation is organized as follows. Related previous work, such as mesh simplification, mesh decomposition, pose normalization and skeletonization, is reviewed in Chapter 2. A feature-preserving thumbnail creation system built upon surface-based mesh decomposition and approximation techniques is presented in Chapter 3.
A voxel-based shape decomposition and skeletonization scheme for 3D thumbnail creation is studied in Chapter 4. A sketch-based 3D modeling system that can map user-drawn 2D sketches to 3D objects is described in Chapter 5. Finally, concluding remarks and future work are given in Chapter 6.

Chapter 2
Research Background

2.1 Automated Metadata Indexing and Analysis (AMIA) System

The AMIA project is being developed under an ICT contract managed by the United States Army Research, Development, and Engineering Command (RDECOM) Simulation and Training Technology Center (STTC). In order to maximize the ICT's research and development efforts in game-based training and learning systems, various underlying technologies have been employed. While this approach has proven beneficial for individual projects, it has resulted in a myriad of file formats and versions for the corpus of required digital assets. The sheer number of files and the diversity of applications involved create a massive asset management challenge. In response to this need, the AMIA effort was tasked with designing a set of creative asset development standards and tools, along with a repository of visual and audio assets.

2.1.1 Introduction

2.1.1.1 Project Goal

Having conducted a decade of immersion research and development, the ICT has now developed 11 projects, creating a wealth of raw art assets (e.g., images, audio, video, 3D models, etc.) with inconsistent data management standards. Without an effective digital asset management system, these valuable resources are difficult to find and retrieve. The ICT requires a great number of art assets to create an immersive environment, and the amount of digital assets for any particular project grows as the project develops. Therefore, it is important to provide guidelines, processes, tools and support to ensure that future art assets will be reusable.
The AMIA project aims to develop a system capable of automatically indexing existing multimedia databases for faster, more efficient retrieval. This system could also benefit users searching for relevant assets for use in other applications. The research vectors concentrate on specific sets of deliverables. The first set of deliverables concerns outlining a method of approach and possible software solutions (which could include computer scripts, database query strings, applications, databases, etc.) focused on one or more of the following capabilities:

I. Automatically extracting metadata.
II. Adding metadata to existing assets and repositories.
III. Automatically adding metadata as new assets are created.
IV. Organizing assets based on metadata.

The goals are to make current assets more accessible and to ensure that current and future assets have a higher degree of reusability and organization. While any software solution will initially target the existing database repositories, it will also be designed with regard to portability and applicability to other systems. The second set of deliverables consists of reports that focus on the current landscape of image and file analysis, along with various pertinent research topics in the areas of metadata and digital asset management and organization. In addition, there will be an effort to take a multidisciplinary approach towards extending and expanding current metadata standards relating to visual and animation assets.

2.1.1.2 Project Resources

The ICT creates the necessary 2D, 3D, and audio software application files from military data. The ICT currently holds over 400,000 assets, including 3D scenes, 2D images, audio, video and game engine exports. These types of assets include props (e.g., buildings, vehicles, weapons), characters (e.g., human models, rigs, animations), environments (e.g., level layouts, skies, backgrounds) and respective reference materials and textures.
These raw 2D, 3D, and audio files are the most useful and versatile files within the Digital Backlot (DB). To organize the large amount of digital assets, the ICT adopted the commercial software AlienBrain [2] and Adobe Bridge [1] as DAM tools. Parts of the assets have been manually linked to associated metadata with AlienBrain and Adobe Bridge. These metadata, which describe the attributes of the digital assets, are valuable to preserve for indexing. In addition to the metadata edited with DAM tools, there are other useful text-based attributes of assets embedded in various ways. The following list shows the resources that contribute to AMIA's text-based indexing process:

I. AlienBrain metadata
The metadata that had been manually edited with AlienBrain was stored in metadata files, separate from the assets. All metadata associated with the assets stored under the same root directory was written into a single ".amd" file. The metadata may indicate the file type (such as 2D or 3D), the file attributes (such as texture), and the security label (such as "inherited legal academic use").

II. Maya embedded metadata
In a Maya file, some useful descriptions, such as the object names (e.g., "student," "wheel," "ankle") or object attributes, were written as notes by the artist or automatically generated by the software. Maya objects may also be animated. For these objects, metadata can be extracted from the embedded metadata with our MEL scripts running in Maya.

III. Adobe Bridge metadata
The metadata previously edited at the ICT with Adobe Bridge was embedded in the file itself. Some useful file attributes are preserved along with the digital assets in Adobe Bridge format.

IV. Filenames following the naming convention
The ICT has set a standard naming convention to allow better organization and sharing of assets. This naming convention defines the file name format used to describe a file, along with a list of abbreviations to shorten its length.
For example, the filename "ChrUsaCivTeenM_Skater001_mesh.ma" can be parsed into the tags "Character," "USA," "Civilian," "Teenager," "Male," "skater," and "mesh." Thus, all filenames that follow the naming convention can be interpreted for further indexing.

2.1.1.3 System Framework

The current system we implemented is composed of two main programs: an indexer that re-organizes the multimedia assets in the designated folder, and a search interface that allows users to retrieve the assets they need. The overall system diagram is presented in Fig. 2.1. The indexer contains three parsers that process the multimedia assets, along with other related management files, to populate relevant tags to associate with each asset. The file distributor first goes through the repository and assigns the .amd files and the Maya files to the AlienBrain parser and the Maya parser accordingly. Concurrently, each file name is assigned to and interpreted by the filename parser as well. For example, for an asset depicting a US Army scene, related reference terms, such as "male," "soldier," "USA," "tank," "Army," and so on, will be populated as tags associated with this asset. More implementation detail is described in Sec. 2.1.2. The search interface allows users to type in keywords and retrieves the assets containing tags that match those keywords. Apache Lucene [3] is used as the backbone of the search engine. A Java program reads a list of tags and their associated files from the Tag Auto-Populator module. Then, the data is entered into the Apache Lucene engine for indexing.
The more keywords an asset fits, the higher the rank it will be assigned in the search result.

Figure 2.1: Flowchart of the overall system design, showing the multimedia repository, the indexer (File Distributor; Tag Auto-Populator with the Maya, AlienBrain and Filename parsers; Apache Lucene Indexer) and the Apache Lucene query interface.

The web interface in Fig. 2.2 is implemented with PHP. Other programming languages can be used to implement the search interface in the future.

2.1.1.4 System Requirements

Except for the MEL script, all of the functionalities of the indexer are written in Java. Therefore, running the indexer requires JDK 1.6.0 and Maya 7.0 (or higher) installed on the operating system. On the other hand, the AMIA interface works like Google Search, and only a web browser is required to operate it. However, the server designated to provide this search service requires the installation of Apache Web Server v2.2.4, JDK 1.6.0, and PHP v5.2.3 or higher.

Figure 2.2: Screenshot of the AMIA search interface.

2.1.1.5 System Performance

Security concerns restrict the use of some data; therefore, not all of the assets were available for testing. Table 2.1 shows the processing time for the test data. Most of the processes in our system complete within one to two minutes, except for the programs written in MEL script. In order to extract the metadata embedded in Maya files, the program must operate Maya; thus, the processing time depends on Maya's performance. Although this process is time-consuming, it is completed offline and does not affect online search performance. The current search process takes less than 1 second.

Process and time:
- Open and extract metadata in Maya files with MEL script (1352 Maya files): 30 min.
- Parse MEL output metadata into tags (1352 metadata files): 66 sec.
- Parse filenames and paths into tags (15928 filenames): 0.5 sec.
- Parse AlienBrain metadata files into tags (7 .amd files): 15 sec.
- Insert tags into Lucene (generating 14.7 MB of data): 42 sec.
- Search (within 15928 assets): 0.1 sec.

Table 2.1: System Performance

2.1.2 Text-based Indexing and Retrieval

Automating multimedia file tagging is a challenging task that warrants further study. Current techniques that search for images, videos, or audio rely primarily on the textual information associated with the media. Such an approach often relies on human interaction to provide textual annotations that facilitate media retrieval. Therefore, a more intelligent methodology for extracting the information associated with multimedia files, as well as a methodology for storing these files in a well-organized textual search framework, is desirable. In this section, we discuss the implementation details of our text-based indexer. As shown in Fig. 2.1, the indexer is composed of three modules: the File Distributor, the Tag Auto-Populator and the Apache Lucene Indexer. We describe the implementation issues and details of each module below.

2.1.2.1 File Distributor

To preserve the metadata previously stored in Maya and AlienBrain files, we wrote two different parsers to process these files according to their file formats. Within the multimedia repository, Maya and AlienBrain files are handled separately from other types of files. Thus, three lists are generated by this module and then distributed to the different parsers in the next module to populate tags. The three lists are made automatically and stored in allMayaFiles.txt, allAmdFiles.txt, and allFiles.txt by the default configuration.

2.1.2.2 Tag Auto-Populator

This module is composed of three parsers, which handle the three lists generated by the file distributor respectively. The three parsers process every file on their respective lists and generate the tags associated with each file.
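The routing step performed by the File Distributor can be sketched as follows. This is a minimal, hypothetical re-implementation in Python rather than the actual Java tool: the extension sets and the in-memory list representation are our assumptions, while the real distributor writes the three list files named above (allMayaFiles.txt, allAmdFiles.txt, allFiles.txt).

```python
import os

def classify(filename):
    """Route one file by extension: 'maya', 'amd', or None (general list only).
    The extension choices (.ma/.mb for Maya, .amd for AlienBrain) follow the
    formats named in the text."""
    ext = os.path.splitext(filename)[1].lower()
    if ext in (".ma", ".mb"):
        return "maya"
    if ext == ".amd":
        return "amd"
    return None

def distribute(paths):
    """Build the three lists consumed by the downstream parsers."""
    maya_list = [p for p in paths if classify(p) == "maya"]
    amd_list = [p for p in paths if classify(p) == "amd"]
    # Every file, regardless of type, also goes to the filename parser.
    return maya_list, amd_list, list(paths)
```

A .ma scene would thus reach both the Maya parser and the filename parser, matching the concurrent processing described in Sec. 2.1.1.3.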
The details of these three parsers, namely the filename parser, the Maya parser and the AlienBrain parser, are as follows.

2.1.2.3 Filename Parser

This parser is composed of three modules: the Naming Convention Verifier, the String Segmentor and the File Type Identifier.

I. Naming Convention Verifier
If a filename follows the naming convention, the corresponding word expansions, such as "Chr" to "Character" and "C" to "Child," are performed. Meanwhile, a filename such as "C_1.jpg" cannot be interpreted as "child," because the letter "C" might have been chosen randomly, without meaning. Thus, we need a verifier to check whether a file name properly follows the ICT's naming convention before interpreting keywords. Table 2.2 exemplifies this naming convention, as well as the tags interpreted from the file name.

II. String Segmentor
Although most filenames created years ago do not follow the naming convention, those files were named with meaningful words. These words were concatenated with delimiters or uppercase letters to form a filename (e.g., andersonheadMolly_white2.tif). The String Segmentor processes each filename and separates the concatenated words into several tags for further indexing. The delimiters, including symbols (such as "_", "-", "/", and white space), uppercase letters, and numbers, are pre-defined for segmenting words. For example, "andersonheadMolly_white2.tif" is divided into "anderson," "head," "Molly," "white2" and "white." Words with numbers and consecutive uppercase letters are treated differently, because some meaningful words contain numbers, such as "AK47," or are capitalized abbreviations, such as "USA" and "RGB," and should remain intact. Furthermore, the directory names are also segmented and output as tags, since they are meaningful. For example, a 3D female model named "Amy.ma" under the "CharacterHuman" directory will be considered a human character.
In addition, to avoid redundant tags, some common tags, such as the name of the root directory, are discarded, since all of the files are under the same directory. The various issues that we encountered when implementing the String Segmentor are outlined in further detail in Sec. 2.1.3.1.

III. File Type Identifier
Most multimedia assets can be identified by their file extension. Therefore, we wrote a File Type Identifier to generate tags based on the file extension. For example, it assigns the tag "2D" to the file types .jpg, .bmp and .tga, the tag "3D" to the file types .ma, .mb and .nif, and the tag "animation" to the file types .kfm and .kf.

Table 2.2: A filename that follows the naming convention and the translated tags associated with that file.
- Filename: ChrUsaInfantryAdultM_Star001_Mesh.ma
- Class: Character
- SubClass: Usa, Infantry, Adult, Male
- Descriptor: Star001
- Process: Mesh
- File type: ma

2.1.2.4 Maya Parser

The Maya parser includes a MEL script and a keyword filter. The MEL script, running within a Maya environment, transfers all of the embedded metadata of a Maya file to a separate text file (referred to here as the MelOut file). The parser is interested in extracting all embedded information concerning animation, characters, skeletons, LOD, lights, dynamics, and textures. For example, a MEL script checks whether the Maya file can be animated; the word "animated" is output into MelOut as a keyword for an animation-enabled Maya file. In addition, the filenames of the referenced textures have to be extracted and preserved as tags so that the references will not be lost. Furthermore, we observed that the filenames of the referenced textures, such as "sky.jpg," may provide useful information about the content of the Maya file. Thus, the tags extracted from the filename of the referenced texture by the filename parser are also saved.
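Since filename-derived tags recur throughout the system (the filename parser of Sec. 2.1.2.3 is applied both to asset names and, as just noted, to referenced texture names), a minimal sketch of the convention-based expansion may help. This is an illustrative Python re-implementation, not the ICT tool: the abbreviation table below is a tiny stand-in for the full abbreviation list, and the greedy peeling strategy is our assumption about how a verifier-approved name could be expanded.

```python
import os

# Illustrative fragment of the naming-convention table; the real list is longer.
ABBREV = {"Chr": "Character", "Usa": "USA", "Civ": "Civilian",
          "Teen": "Teenager", "M": "Male", "F": "Female"}
TYPE_TAGS = {".jpg": "2D", ".bmp": "2D", ".tga": "2D",
             ".ma": "3D", ".mb": "3D", ".nif": "3D",
             ".kfm": "animation", ".kf": "animation"}

def filename_tags(filename):
    """Expand a convention-following name such as ChrUsaCivTeenM_Skater001_mesh.ma."""
    stem, ext = os.path.splitext(filename)
    tags = []
    for field in stem.split("_"):
        # Greedily peel known abbreviations (longest first) off each field;
        # anything unmatched (descriptor, process) is kept verbatim. Ambiguity
        # with one-letter codes is exactly why the verifier exists.
        while field:
            for abbr in sorted(ABBREV, key=len, reverse=True):
                if field.startswith(abbr):
                    tags.append(ABBREV[abbr])
                    field = field[len(abbr):]
                    break
            else:
                tags.append(field)
                field = ""
    if ext in TYPE_TAGS:
        tags.append(TYPE_TAGS[ext])  # File Type Identifier contribution
    return tags
```

With the sample table, the example filename from Sec. 2.1.1.2 expands to the class, subclass, descriptor and process tags plus the "3D" type tag.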
Beyond the predefined keywords extracted by the MEL script, other useful metadata, as well as countless redundant data, are output to the MelOut file. Therefore, a keyword filter is required to discard the redundant data. The words are categorized into a keyword set, a non-keyword set, and an unknown-word set. The keyword set contains meaningful words such as "female," "student," "skeleton," and "animated." The non-keyword set contains useless words or words that are too common to have relevancy (e.g., "group," "translate," "scale," "rotate," "visibility," "x," "y," and "z"). When parsing the metadata, a word that matches the keyword set is preserved as a tag, and a word that matches the non-keyword set is discarded. All other words, which match neither the keyword nor the non-keyword set, are classified as unknown words.

One way to determine whether metadata is useful or redundant is to employ statistical analysis. To do this, we parse all words in the MelOut files and count the frequency with which each word appears. Table 2.3 shows some of the word counts.

Table 2.3: Statistical analysis of exemplary words.
- rotate: 27977
- translate: 25981
- scale: 24049
- x: 23421
- y: 23230
- z: 22425
- Gamebryo: 895
- student: 287

This statistical result is not directly used to classify whether a word belongs to the keyword or the non-keyword set. Since the appearance frequency is not that relevant to managing files in our project, we instead use this statistical result to handle words from the highest to the lowest frequency among the countless words extracted from all Maya files. For example, the words "rotate," "translate," and "scale" have the highest appearance frequencies, so they are handled first. If these words are useless, we can get rid of numerous redundant tags in our search system by classifying them into the non-keyword set. Defining the keyword/non-keyword sets is subjective and application-specific.
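The filtering rule just described, keep matches from the keyword set, drop matches from the non-keyword set, and hold everything else as unknown candidates, together with the frequency ordering behind Table 2.3, can be sketched as follows. The two example sets are small illustrative stand-ins for the trained lists, and the Python form is our assumption (the actual parser is written in Java).

```python
from collections import Counter

# Small stand-ins for the trained keyword / non-keyword lists.
KEYWORDS = {"female", "student", "skeleton", "animated"}
NON_KEYWORDS = {"group", "translate", "scale", "rotate", "visibility", "x", "y", "z"}

def filter_words(words):
    """Split MelOut words into kept tags and unknown candidates."""
    tags, unknown = [], []
    for w in words:
        lw = w.lower()
        if lw in KEYWORDS:
            tags.append(w)       # preserved as a tag
        elif lw in NON_KEYWORDS:
            continue             # discarded as redundant
        else:
            unknown.append(w)    # candidate keyword, kept for later review
    return tags, unknown

def frequency_order(words):
    """Order distinct words from highest to lowest frequency, as in Table 2.3,
    so the most frequent candidates are triaged first."""
    return [w for w, _count in Counter(words).most_common()]
```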
The set of application-specific keywords can be trained from the assets generated by a certain application. The set of general keywords can be further divided into user-oriented, domain-specific groups. In the AMIA project, the database contains a large number of military scenes and keywords used to describe models (such as "F16," "AK47," or "M1A1"). Note that the keyword list for the AMIA project may not be suitable for other DAMs. However, the list may serve as a starting point, which can be enhanced through user interaction. Thus, we leave this field editable by the system administrator, which also allows flexibility, so that end-users may define what is important to them.

The unknown-word list serves as a candidate list for keywords. Currently, we preserve these unknown words as we would preserve keywords. However, a tag scoring method can be used to differentiate priorities between tags. The system can define the importance of each term in the unknown-word list to fit users' interests. This tuning can be done manually or automatically; for example, we can tune the importance based on the query frequency of a term.

2.1.2.5 AlienBrain Parser

The AlienBrain Parser extracts the keywords and values that are stored in the AlienBrain metadata files (*.amd). These .amd files can be exported from AlienBrain and are stored in the format (Filename | Keyword | Value); each line shows a keyword-value pair referring to a certain file. Sample data from an .amd file is shown in Fig. 2.3.
In Apache’s default, to evaluate the score for document “D,” given query “Q,” is in the following equation: coord(Q,D) ! q i (tf(q i ,D)idf(q i )boost(q i )norm(q i ,D)), (2.1) where Q=q 1 ,q 2 ,...,each q i is a tag used in the query. Thus the 5 factors are: A. coord(Q, D)):ThenumberoftermsinthequerythatarefoundinthedocumentD. B. tf(q i ,D):Thesquarerootoftheoccurrenceofthetag q i in document D. C. idf(q i ):Theinversedocumentfrequencyof“q i ”. The greater the amount of docu- ments that contain the tag q i ,thelessimportantthetagisdeemed. 26 D. boost(q i )):Auseroptiontoemphasizeatagduringthesearch.Theusercantype in “female ˆ2 teacher” as the query to emphasize the tag “female.” E. norm(q i ,D)):Theproductofthreeboostingfunctionscomputeduponaddingthe document into the database. The three functions are: (i) The importance of the document: The importance of a document is user-dependent. For example, Maya docu- ments are of higher weight than Photoshop documents if a user is targeting 3D assets. On the other hand, if the user is interested in the 2D design, Photoshop documents may be deemed more relevant. The importance of a document is embedded information, which is defined when it is added to the Apache Lucene index. The importance value can be tuned upon request. In our implementation, we gave the same importance value to every document. (ii) The importance of each field in the document: This is also user-dependent as the importance of the document. (iii) The inverse of the number of tokens in each field: This will be discussed below in Item II. The order takes 5 factors into consideration. Factors A and B are applicable to our search. Factors D, E(i) and E(ii) are user options, and therefore are not applicable. However, the default setting of C and E(iii) is not very suitable for searching for files associated with the queried tags. We analyzed the effect of each factor, and here we modify them to better suit our needs. 27 I. 
Omitting the factor of the inverse document frequency of q_i.
In the default setting of C, the value of idf(q_i) is

idf(q_i) = 1 + log( (# of documents) / (# of documents containing q_i + 1) ).

The more documents contain the tag q_i, the less important q_i becomes. However, in our application this falsely penalized upper-class words such as "Characters" and "Props," as well as the tags that characterize file extensions (since they appear in more documents). Therefore, this factor was removed from the scoring evaluation process.

II. Omitting the factor of the inverse of the number of tokens in each field.
In the default setting of E(iii), the inverse of the number of tokens in each field falsely penalized the files that were associated with more tags. A Maya file containing a large set of scenes will have more tags than another Maya file containing a simple object; however, the same tag in both files should be of equal importance. For example, file 1, which contains 100 buildings, will have more tags than file 2, which contains only one building. The tag "building" will appear in both file 1 and file 2, and should be equally important in both. As a result, we removed this factor from the scoring evaluation process.

In addition, we made two more modifications to the default Apache indexer to meet the project's requirements. One modification was merging the tags of duplicated files in the system. Because the tags generated by the three different parsers can refer to the same file, the default indexer would keep the file entries separate, which may degrade search performance and search accuracy. The other modification is that we disabled the function that uses numbers as delimiters, because numbers have meaning to our users: a user may search for "student03," as opposed to using "student," and "student03" differs from "student01."

2.1.3 Implementation Detail and Issues of Text-based Parsers

Extracting useful tags from metadata and filenames was a laborious process.
One of the main efforts was to segment the words properly; another was word regulation. Here we discuss the String Segmentation and the Word Regulation processes in more detail.

2.1.3.1 String Segmentation

To provide as much information about an asset as possible, the digital assets were named with concatenated words, akin to the metadata embedded in Maya files. The search engine cannot find words that are concatenated with others. For example, "HappyFeet" cannot be found by searching for "Happy" or "Feet," or will be assigned a relatively low rank. Thus, we developed a String Segmentor to extract the keywords and save them separately as tags. The segmentation mechanism is built upon the following four principles:

a. Segment strings at delimiters: all symbols, uppercase letters, and numbers. E.g., segment "femaleChild_Low" into "female," "child," and "low."

b. Ignore connective words such as "and," "to," "from," and "or." E.g., "fromCourseProject" generates only the tags "course" and "project."

c. Special handling of uppercase letters. E.g., preserve the capitalized abbreviation "USA" by not dividing the characters into "U," "S," and "A."

d. Special handling of numbers. E.g., we generate the tags "student" and "student003" from "student003."

The first two principles are straightforward and do not require special explanation. On the other hand, principles c and d are more complicated; there will always be an exception that does not fit the segmentation rules. The following sections discuss the problems encountered with the special handling of uppercase letters and numerical values.

I. Special handling of uppercase letters
Because we use uppercase letters as delimiters, problems occur with capitalized abbreviations (here, "CAPString" stands for a string of capitalized abbreviations). We observed that different combinations of CAPStrings and regular words appear in the filename list.
For example, AFLow stands for two abbreviations, "Adult" and "Female," and a regular word, "Low." IA06project stands for IA, year 06, and project. Dividing words at uppercase letters will not work perfectly in all cases. Certain words or terms can be identified (such as ICT, VW, LOD, US, etc.); these consecutive uppercase letters on the list are preserved so that they are recognized as a word. However, there are still various unrecognized words (such as FO, GB, ES) which are difficult to parse. In addition, sometimes CAPStrings are followed by a word that begins with a lowercase letter, such as ICTproject and USArmy. The correct cut is either before or after the last uppercase letter. For example, the correct segmentation for USArmy is before the last uppercase letter (A in this example) to generate "US" and "Army"; however, the correct segmentation for ICTproject is after the last uppercase letter (T in this example) to generate "ICT" and "project."

Thus, we applied a free Java-based dictionary, API Suggester [7], to check whether a segmented word is a valid word recognized by the dictionary. In our implementation, we also added the list of ICT-defined acronyms (such as ICT and USC) to the dictionary to help uppercase-letter segmentation. With this dictionary, we check the following two conditions: (i) whether the lowercase letters form a valid word; (ii) whether combining the last uppercase letter with the lowercase letters forms a valid word. The segmentation decision is made according to the truth table in Table 2.4.

Case i   Case ii   Segment At
True     True      Before
False    True      Before
True     False     After
False    False     Before

Table 2.4: The truth table for deciding segmentation.

Although several methods of special handling are applied, it is difficult to find a perfect solution that can accurately segment all filenames that were not initially named in a standard format. Some meaningful tags would still become meaningless after word segmentation.
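As an illustration, the Table 2.4 decision for a CAPString followed by lowercase letters can be sketched as below. This is our own sketch, not the thesis code: the dictionary lookup is mocked with a small word set (the actual system used the API Suggester dictionary [7]), and all names are hypothetical.

```python
# Sketch of the Table 2.4 decision: split "<capitals><lowercase>" either
# before or after the last uppercase letter, using a dictionary check.
# VALID_WORDS is a stand-in for the real dictionary; the token is assumed
# to start with at least one uppercase letter.
VALID_WORDS = {"army", "project", "course"}

def split_capstring(token):
    i = 0
    while i < len(token) and token[i].isupper():
        i += 1
    caps, lower = token[:i], token[i:]
    case_i = lower.lower() in VALID_WORDS                # lowercase part alone is a word
    case_ii = (caps[-1] + lower).lower() in VALID_WORDS  # last capital + lowercase is a word
    if case_i and not case_ii:
        return caps, lower              # cut after the last capital, e.g. "ICT" + "project"
    return caps[:-1], caps[-1] + lower  # cut before it, e.g. "US" + "Army"
```

With this sketch, split_capstring("USArmy") yields ("US", "Army") and split_capstring("ICTproject") yields ("ICT", "project"), matching the two examples discussed above.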
For example, "noAccum" is divided into two separate tags, "no" and "accumulation."

II. Special handling of numbers

The principle for handling numbers is to always associate a number with the tag in front of it. Although numbers are used as delimiters in our parser, we do not want to discard them, because they are part of the information. Thus, if there are numbers within a filename, we generate an extra tag preserving the numbers by saving each number along with the word it is concatenated to. For example, the tags "student002" and "student" will both be preserved, and "skyDay512simple" would be divided into "sky," "day," "day512," and "simple." Saving just the numbers alone would create meaningless tags; for example, the tag "002" is meaningless if separated from "student002." There are also exceptions for some words: when 2D, 3D, or a single character occurs immediately after the numbers (e.g., final02b), the characters should not be separated. Solving this issue proves to be demanding.

Preserving the numbers by generating extra tags requires additional memory space. This is, however, needed to enhance the accuracy of the search performance. During the system design phase, end users at ICT indicated a desire to search for a particular 3D model from a set of selections that are differentiated numerically. For example, a set of 3D buildings may be named with the word "building" followed by sequentially increasing numbers, e.g., building01, building02, etc. When a user wants to retrieve a specific building from the set, he/she may not enter a query string with a number immediately after the word "building." Thus, we designed our system accordingly.

2.1.3.2 Word Regulation

To increase search power, a word regulator was developed to translate tags into a consistent format.
For example, both singular and plural forms of a noun can be saved, so that searching with a keyword in one form does not exclude the other. In addition, many capitalized abbreviations are defined to shorten the length of file names in ICT's naming convention; for example, "M" stands for "male," "A" stands for "Adult," and "Chr" stands for "character." A word regulator is needed to translate these abbreviations. However, this is not universally true for filenames that do not follow the naming convention, or for the metadata embedded in the Maya files and the AlienBrain files. These files should be handled carefully as part of semantic analysis; hopefully, the correct translation can be constructed based on the context. Ambiguous words are not presently handled with this methodology and are left as they are.

2.1.4 System Performance Comparison

The ICT database contains over half a million (571,533) art assets. Since most data have security restrictions, the evaluation given below was conducted by ICT professionals with security clearance. It aims to compare our system with the commercial DAM tool currently employed by ICT. From all asset files, our system generated 11.7 million (11,681,548) tags, including 11.1 million (11,117,168) tags extracted by the filename parser, 0.2 million (193,541) tags extracted by the Maya parser, and 0.4 million (370,839) tags extracted by the AlienBrain parser.

In this evaluation, the 10 most common queries picked by the artists were used as test queries. The results returned by our system are reported in Table 2.5.

Search Item          Search Time   Files Returned   Relevancy (n out of 10)
female civilian      1.16 sec.     10,812           9/10
military truck       9.26 sec.     92,276           6/10
military and truck   2.12 sec.     143              8/10
city building        2.79 sec.     28,575           8/10
military tank        9.2 sec.      90,288           10/10
us soldier           2.89 sec.     33,791           9/10
sgt and star         0.11 sec.     51               9/10
50 cal               2.94 sec.     2,757            9/10
brick texture        1.8 sec.      22,611           10/10
baghdad reference    7.45 sec.     78,809           7/10

Table 2.5: System performance of the proposed AMIA search engine.

The same set of statistics could not be obtained with ICT's commercial DAM tool, since that system repeatedly crashed when the queries were performed. The benchmark system also limits queries to single-word terms; as a result, it was not able to handle the test queries in our first round of testing. Thus, we set up another test set (a small database consisting of 19,422 files) and different queries for the commercial DAM tool. The results of the second test set are shown in Table 2.6.

Search Item   Search Time   Files Returned   Relevancy (n out of 10)
Iraqi Woman   4 sec.        0                0/10
Woman         3 sec.        10               4/10
Truck         2 sec.        97               8/10
civilian      5 sec.        156              9/10
tank          3 sec.        18               6/10
building      4 sec.        75               6/10

Table 2.6: System performance of the commercial DAM tool.

In the test, ICT end users entered search queries in both systems and recorded the search time and the number of files returned. In addition, subjective scores of the returned results' relevance were collected and averaged to obtain the relevancy score listed in the fourth column of both tables. The relevancy score ranges from 1 (the lowest) to 10 (the highest). Relevancy is an essential evaluation criterion recommended by end users at ICT to judge whether the first 10 returned results met their interest. For example, in searching for "military truck," the user intended to find truck models for military purposes, rather than arbitrary military scenes that contain a truck. The user can enter a more descriptive query to further refine the search results; for example, in searching for "military and truck," fewer results were returned and accuracy became higher. By comparing the results in Tables 2.5 and 2.6, we see that AMIA returns more results than the commercial DAM tool and achieves higher relevancy scores.
Such performance improvement is attributed to the additional tags generated automatically by our system as compared with the commercial DAM tool. Even with a huge number of tags, the search time of our system is still bounded by 10 seconds, and the average search time is about 4 seconds, which is comparable with that of the commercial DAM tool.

2.1.5 Ongoing and Future Work

3D meshes and associated textures are an extremely valuable resource to ICT and therefore necessitate effective management tools for reuse. Our text-based indexing and retrieval tool is capable of managing assets that were named properly, or that have embedded attributes already linked to the file. However, there are still many digital assets, many created years ago, which lack suitable metadata or were named randomly (e.g., object1.ma, 123.jpg). These assets are inaccessible using the text-based indexing tool, and a content-based indexing and retrieval tool is still a goal for accessing them.

As text-based indexing and search are nearly completed, content-based retrieval for 3D shapes is the next goal of this ongoing project. The Princeton 3D search engine [22] provides a system framework that meets our needs, so it appears proper to use it as a reference for 3D asset search. However, unlike our goal of organizing and reusing existing art assets, their system searches for available assets on the Internet.

For a large database holding over 400,000 assets, like ICT's Digital Backlot (DB), developing an indexing data structure and a shape matching function with fast performance is important. Tangelder and Veltkamp [48] presented an overview of 3D matching functions. According to their comparison, the feature-based spatial-map approaches [32, 42] and the weighted point method developed by Funkhouser et al. [21] performed relatively better than various other approaches in terms of efficiency, discriminative power, and robustness.
The geometry-based approach [21] is evaluated as the first powerful tool for part-in-whole matching. The AMIA project will start working towards content-based retrieval for 3D shapes. In addition, the database contains many image, audio, and video data as well. After text and 3D mesh retrieval, automatic content-based approaches for 2D image, audio, and video retrieval are the next priorities in future development.

2.2 Techniques for 3D Thumbnail Generation

In this section, we review several techniques related to 3D thumbnail generation, to be used in Chapters 3 and 4.

2.2.1 Mesh Simplification

Our work on the 3D thumbnail representation is closely related to research on mesh simplification. For a complete survey of earlier work, we refer to the surveys by Talton [29] and Luebke [36]. Mesh simplification techniques have been developed for more than a decade to support multi-resolution applications. Since a complex 3D model contains millions of polygons and requires a large amount of memory and time to process, it can significantly degrade runtime performance. Many applications may require (or can only support) a low-resolution model instead of the original model at full resolution. For example, when an object is far away from the camera, its rendering time can be shortened if its low-resolution version is used. Previous work on mesh simplification can be categorized into surface-based, voxel-based, and hybrid approaches.

Garland et al. [24] developed a surface-based simplification algorithm for producing high-quality approximations of polygonal models. Their algorithm used iterative contractions of vertex pairs to simplify meshes and maintained minimum-surface-error approximations using a quadric metric. Cohen et al. [19] proposed an error-driven optimization algorithm for the geometric approximation of surfaces.
Their algorithm used a similar idea inspired by the Lloyd algorithm, which reduces the distortion error through repeated clustering of faces into best-fitting regions. Their approach did not require parameterization or local estimations of differential quantities. While [19] only considered planes for approximation, Wu et al. [50] extended their optimization technique by allowing different primitives, including spheres, cylinders, and more complex rolling-ball blend patches, to represent the geometric proxy of a surface region. They segment a given mesh model into characteristic patches and provide a corresponding geometric proxy for each patch. All of the work above used an iterative approach, which produces a fine-to-coarse approximation; the mesh is simplified incrementally during the approximation process.

He et al. [27] proposed a voxel-based mesh simplification approach that used sampling and low-pass filtering to transform an object into a multi-resolution volume buffer. Then, the marching cubes algorithm [35] was used to construct a multi-resolution triangle-mesh surface hierarchy.

Nooruddin et al. [41] adopted a hybrid approach that integrated the voxel-based and surface-based mesh simplification approaches. They converted polygonal models to a volumetric representation and then repaired and simplified a model with 3D morphological operators. Visually unimportant features, such as tubes and holes, can be eliminated by the morphological operators. The volumetric result was then converted back to polygons, and a topology-preserving polygon simplification technique was used to produce the final model.

However, most of the previous works did not pay special attention to preserving significant parts of a model. For example, the limbs and the body can meld together when the model is extremely simplified. We would like to alleviate this problem by distinguishing an object's important geometric components.
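For concreteness, the quadric error metric at the core of [24] can be sketched as follows. This is our own minimal illustration in plain Python, not code from the cited work.

```python
# Sketch of the quadric error metric from Garland et al. [24]: each plane
# p = (a, b, c, d), with a^2 + b^2 + c^2 = 1 and ax + by + cz + d = 0,
# contributes a 4x4 quadric K_p = p p^T; a vertex accumulates the quadrics
# of its incident faces, and the cost of contracting into position v is
# the homogeneous form v^T Q v.

def plane_quadric(a, b, c, d):
    p = (a, b, c, d)
    return [[p[i] * p[j] for j in range(4)] for i in range(4)]

def add_quadrics(q1, q2):
    return [[q1[i][j] + q2[i][j] for j in range(4)] for i in range(4)]

def quadric_error(q, v):
    h = (v[0], v[1], v[2], 1.0)  # homogeneous coordinates of the vertex
    return sum(h[i] * q[i][j] * h[j] for i in range(4) for j in range(4))
```

For the single plane z = 0, quadric_error reduces to z^2: it vanishes for any vertex on the plane and grows quadratically with the distance from it, which is what drives the choice of low-cost vertex-pair contractions.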
2.2.2 Thinning and Skeletonization

Thinning is a morphological operation used for removing selected foreground pixels from a binary image. It can be used in several applications, but it is particularly useful for skeletonization. 3D thinning algorithms have been developed to extract medial lines by shrinking a 3D binary object to a voxel list in a topology-preserving way. The thinning process starts from an object's boundary, deletes a set of simple points, and continues eroding until no more simple points can be removed. A simple point is an object point whose deletion does not alter the topology of the object. Three types of thinning algorithms have been developed, based on how simple points are considered for deletion.

• Directional sequential

The first type is the directional or border sequential method. The directional sequential algorithm examines the 3x3x3 neighborhood of each border point. Border points of a binary object that satisfy certain topological and geometric constraints are deleted in iteration steps. Each iteration step is divided into several subiterations, and only border points that match the prescribed 3x3x3 neighborhood can be deleted in each subiteration. Each subiteration is executed in parallel using different deletion rules, and all border points satisfying the deletion condition are deleted simultaneously within a subiteration. We adopt the directional thinning algorithm in [44] in our work, where each iteration step consists of 8 subiterations, and each subiteration can be executed in parallel. The deletable points are given by a set of 3x3x3 matching templates at each subiteration. The entire process is repeated until only the skeleton is left. However, removing all simple points from an object produces excessive shortening of the curve-skeleton branches, since all end-points of the curve-skeleton are simple points themselves.

• Fully parallel

The second type is the fully parallel method.
Algorithms of this type do not divide an iteration step into multiple subiterations. The parallel algorithm considers all boundary points for deletion in a single thinning iteration. In order to preserve topology, the fully parallel algorithm in [37] examines points in the 5x5x5 neighborhood (rather than the conventional 3x3x3 neighborhood).

• Subfield sequential

The third approach is the subfield sequential method. The set of discrete points in the 3D binary space is divided into disjoint subsets which are activated alternately. At a given iteration step, only border points of the active subfield are designated for deletion. The subfield sequential 3D thinning algorithm proposed in [15] works on a cubic grid using eight subfields.

Several skeletonization techniques have been proposed in the past, including the distance field, the Voronoi diagram, and the potential field approaches, as described below.

• The distance field approach

It extracts the skeleton based on the distance map. The distance transformation yields the shortest distance from the edges of an object to a given interior point. A distance map interprets the distance information as height, and the ridges of the distance map are skeletal points. This approach is efficient, but it can produce a disconnected skeleton, and a final step has to reconnect the pieces in order to produce a set of 1D curves.

• The Voronoi diagram approach [43]

It partitions a discrete set of points into cells so that each cell contains exactly one generating point and the loci of all points that are nearer to this generating point than to other generating points. The internal edges and faces of the Voronoi diagram can be used to extract an approximation of the medial surface of the shape. A curve-skeleton can be extracted from this medial surface approximation by pruning it to a 1D structure.
• The potential field approach [18]

It uses a potential field function to determine the potential at a point interior to the object by estimating a sum of potentials generated by the boundary of the object. The curve-skeleton is extracted by detecting local extrema of the field and connecting them. Detection of local extrema can be achieved by detecting the local maximum along equi-potential contours.

For a complete review of skeletonization techniques, we refer to [20].

Chapter 3
Feature-Preserving 3D Thumbnail Creation via Mesh Decomposition and Approximation

3.1 Introduction

With the increasing amount of available multimedia digital assets, digital asset management (DAM) tools become more and more valuable. People rely on these DAM tools to search for a specific digital asset in a large database. Although the technology of text-based searching is maturing, many existing digital assets are not well named and cannot be found by a text-based query alone.

Our research aims to improve the search efficiency for digital assets, especially for 3D objects, since 3D objects play an important role nowadays. The entertainment and engineering industries create many millions of 3D objects in a decade, and an effective DAM tool is needed to find a 3D object even if it is not well named. Current DAM tools display static 2D thumbnails on the search page for easier browsing. However, those 2D thumbnails need to be pre-captured manually, and it is extremely time-consuming to capture thumbnails for a large-scale 3D asset database. Some researchers attempted to develop systems that take the best snapshot of a 3D object automatically by selecting the view angle that captures the most important features. Such a fixed selection rule does not work well for all objects, and the result can just as easily capture the wrong features. In addition, while a 2D thumbnail might capture the best shot of a 3D object, there are still other features that cannot be seen from a fixed angle.
We propose an innovative approach to create a feature-preserving 3D thumbnail automatically in this work. The resultant 3D thumbnail system aims to help the user efficiently browse 3D models in a large 3D database. The user can browse multiple 3D models with pre-generated 3D thumbnails and view the 3D thumbnails from different angles interactively. The 3D thumbnail is a simplified version of the original model and requires much less memory and time to render. It can also support devices without enough memory resources for full 3D rendering.

In our system, we separate the framework into two processes, offline and online, to improve the runtime performance. Two types of descriptors are generated offline for each 3D model so that the thumbnail can be quickly rendered online using the pre-generated descriptors. Preliminary experimental results are given to demonstrate the proposed methodology.

In the offline process, we first decompose a model into visually significant parts so that its individual parts can still be well preserved even when the model is extremely simplified. Then, the skeleton and the body measurements of all parts are extracted and saved as a shape descriptor, which describes the shape in a simple format. Third, we apply an iterative, error-driven approximation algorithm to find the best fitting primitives representing a simplified model. The approximation results are saved as a thumbnail descriptor for efficient online rendering. Our coarse-to-fine approximation differs from the prior art, where a model is often approximated using a fine-to-coarse approach. In addition, an innovative deformable cylinder, called the d-cylinder, is developed for the primitive approximation. As a result, our thumbnail results are more discernible than those of other approximation methods using regular primitives.

In the online process, the 3D thumbnail is rendered according to the pre-generated thumbnail descriptors.
Multiple thumbnails can be downloaded and displayed in a Java-applet-based 3D thumbnail viewer within a few seconds. A remote user can also regenerate the 3D thumbnail at another level of detail by re-using the existing parts descriptors; this can be done within a second.

The rest of this chapter is organized as follows. The system framework and each individual process are discussed in Sec. 3.2. The coarse-to-fine approximation algorithm is described in Sec. 3.3. Experimental results are evaluated and discussed in Sec. 3.4. Finally, concluding remarks and future research directions are given in Sec. 3.5.

3.2 System Framework

The proposed 3D thumbnail system aims to help the user browse multiple 3D models efficiently within a large 3D database. Current 3D search engines usually use 2D thumbnails to represent a 3D model, since rendering a complicated 3D model requires substantial resources and time. In this work, we create a feature-preserving 3D thumbnail by simplifying and approximating the original model with fitting primitives. While a user searches in our system, multiple 3D thumbnails can be displayed quickly, and the user can examine each thumbnail from different angles interactively.

Figure 3.1: System framework

The lower-resolution thumbnail requires much less hardware resource to render and can be transmitted quickly. Additionally, to speed up the online process, we perform most of the processing offline and leave only the rendering process online. Fig. 3.1 shows the proposed system framework, where both the offline and online processes are illustrated.

In the offline process, we first use a mesh decomposition algorithm to separate a model into significant parts. Each part is then considered as an independent unit in the following procedure. Then, the Principal Component Analysis (PCA) transformation is adopted to normalize each part's orientation, and each decomposed part is transformed from the world space to its own principal component space.
Third, we extract the skeleton and take body measurements for each part and save them as a shape descriptor. Finally, we use the shape descriptor as the input to the primitive approximation algorithm and generate the thumbnail descriptor.

In the online process, multiple 3D thumbnails can be rendered quickly by applying a reverse PCA transformation to the pre-generated thumbnail descriptors. In addition, users can generate a lower- or higher-resolution thumbnail from the existing shape descriptors according to their preference.

In the following subsections, we describe each process within the system framework in detail.

3.2.1 Mesh Decomposition

The reason for applying mesh decomposition at the beginning of the process is that we want to preserve the visually significant parts of each model. Each significant part is treated as an independent unit in the rest of the process. Thus, our 3D thumbnail will always present these parts by keeping at least the roughest shape of each decomposed part, even if the model has been extremely simplified.

To conduct mesh decomposition, we extended the approach of Lin et al. [34], which can decompose a 3D model into one main body and several protruding parts. As an example, two original models and their decomposition results are shown in Figs. 3.2(a) and (b) using different colors. By identifying the significant parts, we can generate a rough 3D thumbnail that depicts each decomposed part with a single rectangle or cylinder, as shown in Figs. 3.2(d) and (e).

Figure 3.2: Decomposing the model into significant parts. (a) Original models. (b) Mesh decomposition results obtained with the approach of Lin et al. [34]; the main body is shown in red and the other parts in different colors. (c) The minimum bounding box of each part derived after PCA transformation. (d) The rough 3D thumbnail approximating each decomposed part with a single rectangle, and (e) approximating with a single cylinder.

3.2.2 Extracting the Skeleton and Body Measurements with PCA Transformation

After the mesh is fragmented, we want to extract each part's skeleton and body measurements so that its shape can be expressed in a simple format. This extracted data will be used to find the best fitting primitives in the subsequent approximation process. Moreover, in order to preserve each decomposed part during simplification, we process each part individually until the approximation process is completed.

We apply the PCA transformation to each part individually. The PCA process normalizes each part's orientation, i.e., its principal axes are aligned with the new coordinate axes in the transformed space. Thus, we can extract the skeleton easily along its principal axis. For example, the bounding boxes in Fig. 3.2(c) show the various orientations encapsulating the decomposed parts. By applying a PCA transformation, each part is transformed to its temporary PCA-transformed space, where the bounding box is centrally located and aligned to the new coordinate axes, say (~X, ~Y, ~Z).

Moreover, the skeleton extraction process is based on sampling points from the original mesh's surface. (Note that our mesh is constructed with multiple triangle faces; these triangles are assigned to different parts by the mesh decomposition process.) For each triangle face belonging to a decomposed part, we uniformly extract sample points within the surface of the triangle by interpolation and use them to estimate the skeleton and body measurements as detailed below.

Figure 3.3: The skeleton and body measurements are extracted after PCA transformation. Each black line represents a part's skeleton, and each rectangular layer represents the estimated body measurements along the skeleton. In this picture, all lines and layers were reverse-PCA transformed to the original coordinate system for visual illustration.
We extract the skeleton by calculating the central line along the first principal axis, say the ~X-axis, in the decomposed part's PCA-transformed space. Imagine that the decomposed part is chopped into multiple slices along its ~X-axis. Assume (~x_i, ~y_xi, ~z_xi) is the center of the sample points on the slice at ~X = ~x_i, and ~x_min and ~x_max are the minimum and the maximum of ~X among all sample points. We obtain the skeleton by consecutively connecting the adjacent center points (~x_i, ~y_xi, ~z_xi) within the range ~X = [~x_min, ~x_max].

Finally, we take the body measurements by estimating the average distance from the sample points to the center on each slice. For example, for all sample points with ~X = ~x_i, we calculate their distances to the central point (~x_i, ~y_xi, ~z_xi) along the ~Y-axis and the ~Z-axis, respectively. The average distances are then taken as the approximated body measurements for the slice ~X = ~x_i, in the form (RW_xi, RH_xi).

In Fig. 3.3, we show examples of the extracted skeleton and body measurements. Each black line represents the skeleton of a decomposed part. Each colorful rectangular slice represents a body measurement whose width and height are equal to RW_xi and RH_xi, respectively, at ~X = ~x_i. Note that each part was reverse-PCA transformed to the original coordinate system in this figure to display the intermediate result.

3.2.2.1 Shape Descriptor

The extracted skeleton and body measurements are called the shape descriptor and are stored in the database. In the remaining processes, we use the shape descriptor to represent the shape instead of the original polygonal mesh. The shape descriptor for a 3D model contains the information illustrated in Fig. 3.4. Besides the skeleton and body measurements, the inverse matrix and the inverse translation vector are also saved for each part so that we can transform the parts back into the original coordinate system.
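A minimal sketch of such a per-part record is given below. The field names are our assumptions about the layout shown in Fig. 3.4, not the thesis's actual storage format.

```python
# Hypothetical layout of the shape descriptor of Fig. 3.4: per part, the
# skeleton points, the per-slice body measurements, and the inverse PCA
# matrix plus translation needed to map the part back to world coordinates.
from dataclasses import dataclass, field
from typing import List, Tuple

Point3 = Tuple[float, float, float]

@dataclass
class PartDescriptor:
    skeleton: List[Point3]                        # slice centers in PCA space
    body_measurements: List[Tuple[float, float]]  # (RW, RH) per slice
    inverse_pca: List[List[float]]                # 3x3 inverse rotation matrix
    inverse_translation: Point3                   # translation back to world space

@dataclass
class ShapeDescriptor:
    parts: List[PartDescriptor] = field(default_factory=list)
```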
Moreover, the parts descriptors can be re-used whenever a user wants to run the remaining processes to find the fitting primitives.

Figure 3.4: The shape descriptor for a single 3D model.

3.2.3 Iterative Approximation and Primitive Selection

After decomposing a mesh into salient parts and performing skeleton extraction with PCA transformation, we proceed to the last stage, namely coarse-to-fine iterative approximation and selection of proper primitives. During the iterative process, the fitting primitives are applied according to the skeleton and body measurements that we extracted earlier. We first generate the roughest thumbnail, composed of a minimum number of primitives, and enhance the thumbnail representation by adding more primitives until the total bit budget is met. Moreover, approximating an object with regular primitives may produce a relatively stiff result, as shown in Figs. 3.2(d) and (e), since the shape of such a primitive is not flexible. We propose a deformable primitive, the "d-cylinder," which can produce more pleasant 3D thumbnails.

3.2.3.1 Bit Budget Determination

The bit budget, which is closely related to the number of primitives used to approximate a single model, is determined by the available hardware resources or the user's preference. For example, if the memory on a computer can afford 3,000 primitives rendered simultaneously, and the user prefers to view 10 objects on a single web page, then the budget for each thumbnail is 300 primitives. From experience, a general-purpose computer can render 22 thumbnails simultaneously, each of which is composed of 40 d-cylinders. In our system, several thumbnails with different resolutions are pre-generated for each model. However, users can always request the system to generate a thumbnail with a certain resolution according to their preference.

3.2.3.2 Customized D-cylinder

As shown in Fig. 3.5, a deformable d-cylinder is composed of an upper ellipse, a lower ellipse, and a body that contains multiple quadrangles dividing the ellipses uniformly.
The major and the minor radii of the upper ellipse, denoted by α_1 and β_1, and those of the lower ellipse, denoted by α_2 and β_2, can be adjusted individually to fit the approximated object. Multiple quadrangles interpolate the transition of the d-cylinder between these two ellipses; that is, the smoothness of the d-cylinder curve is determined by the number of quadrangles. The data structure of this d-cylinder contains four types of elements that can be adjusted to fit the approximating shape:

(1) n_seg: the number of divisions of the body. There are n_seg points that divide each ellipse and form the n_seg quadrangles of the body. The greater n_seg is, the smoother the d-cylinder curve will be; however, more memory is required to render the d-cylinder. By default, we use 36 quadrangles to interpolate a d-cylinder.

(2) Radius_upper(α_1, β_1) and Radius_lower(α_2, β_2): the major and the minor radii of the two ellipses. These four parameters are set individually to fit the body measurements of the approximating shape.

(3) Center_upper(x_1, y_1, z_1) and Center_lower(x_2, y_2, z_2): the center of each ellipse. The two centers are set according to the locations of the skeleton points of the approximating shape.

(4) Color(r, g, b): the display color of this d-cylinder. A representative color can be assigned arbitrarily; the color could be used to represent the texture of the approximating shape in the future.

Figure 3.5: The d-cylinder. (a) The filled d-cylinder. (b) The wire-frame of the d-cylinder. The deformable d-cylinder is composed of an upper ellipse, a lower ellipse, and a body composed of multiple quadrangles dividing the ellipses uniformly.

3.2.3.3 Primitive Approximation

Existing mesh simplification techniques use a fine-to-coarse iterative approach, e.g., the work by Cohen et al. [19] and by Attene et al. [14].
We propose instead a coarse-to-fine iterative approximation method that first approximates the original shape with a minimum number of primitives and then gradually adds more primitives to produce a finer result until the budget is met. In this process, the skeleton and body measurements are used as the input for shape approximation, and the deformable d-cylinders are used to represent the approximated shape.

The main idea of our iterative approximation approach is illustrated in Fig. 3.6. Initially, we assign one approximating d-cylinder to each decomposed part, and the shape of this d-cylinder is decided according to the part's skeleton points and body measurements. For example, assume that the skeleton of a decomposed part P_k, as shown in Fig. 3.6(a), has a start point SK_{k,1} and an end point SK_{k,n}. The decomposed part P_k is approximated with one d-cylinder whose upper and lower ellipses are centered at SK_{k,1} and SK_{k,n}, respectively, and whose radii, Radius_upper and Radius_lower, equal the body measurements at SK_{k,1} and SK_{k,n}, respectively. Second, we examine the distortion

Figure 3.6: Illustration of the coarse-to-fine approximation process. (a) The skeleton of the part P_k within the range [SK_{k,1}, SK_{k,n}]. (b) The decomposed part P_k is approximated by a single d-cylinder, and the layer with the maximum distortion error is found at SK_{k,i}. (c) The old d-cylinder is divided at SK_{k,i} and replaced by two new d-cylinders.

between the original shape and the approximating d-cylinder and choose the area that has the highest distortion as the split point. Third, we divide the decomposed part into two regions and assign a new d-cylinder to each region.
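The split-at-worst-slice loop can be sketched as follows; the helper names (`slice_error`, `fit`) and the flat region representation are our own assumptions, not the thesis' implementation:

```python
def refine(parts, budget, slice_error, fit):
    """Coarse-to-fine approximation: one d-cylinder per part initially,
    then repeatedly split the region containing the worst slice.

    parts: {part_id: [sk_0, ..., sk_n]} skeleton point indices per part
    slice_error(part_id, i): distortion error of slice i (cf. Eq. (3.5))
    fit(part_id, a, b): build a d-cylinder spanning skeleton points a..b
    """
    # start with one region (hence one primitive) per decomposed part
    regions = [(pid, sk[0], sk[-1]) for pid, sk in parts.items()]
    while len(regions) < budget:
        # find the interior slice with the maximum distortion error
        best = None
        for r, (pid, a, b) in enumerate(regions):
            for i in range(a + 1, b):
                e = slice_error(pid, i)
                if best is None or e > best[0]:
                    best = (e, r, i)
        if best is None:  # no region can be split any further
            break
        _, r, i = best
        pid, a, b = regions[r]
        regions[r:r + 1] = [(pid, a, i), (pid, i, b)]  # replace by two regions
    return [fit(pid, a, b) for pid, a, b in regions]
```

A production version would also stop once the total distortion error is sufficiently small, as described in the text.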
Notice that within an individual part P_k, each rectangular slice L_{k,i} represents the body measurement at the skeleton point SK_{k,i}, where its width and height equal the average surface distances (RW_{SK_{k,i}}, RH_{SK_{k,i}}) to SK_{k,i}. Each decomposed part P_k can be approximated by multiple d-cylinders, and each d-cylinder covers several layers within its region. In the iterative approximation process, we estimate the distortion error between the original shape and its approximating d-cylinder along each slice. The details of how we estimate the distortion error for each slice between an original shape and an approximating primitive are described in Sec. 3.3. The slice with the maximum distortion error, derived from Eq. (3.5), is chosen for assigning new approximating d-cylinders. Assuming the slice at SK_{k,i} has the maximum error, as shown in Fig. 3.6(b), we mark SK_{k,i} as a split point where the current region is going to be split. The old d-cylinder assigned to fit the region [SK_{k,1}, SK_{k,n}] is then replaced by two new d-cylinders assigned to the new regions [SK_{k,1}, SK_{k,i}] and [SK_{k,i}, SK_{k,n}], respectively, as shown in Fig. 3.6(c). The first new d-cylinder has the centers of its ellipses located at SK_{k,1} and SK_{k,i}, respectively, with radii equal to the body measurements at SK_{k,1} and SK_{k,i}, respectively. Similarly, the other d-cylinder has its centers located at SK_{k,i} and SK_{k,n}, with radii equal to the body measurements associated with SK_{k,i} and SK_{k,n}. The worst approximating area is replaced in each iteration, and the iterative process is repeated until the budget is reached or the total distortion error is sufficiently small.

3.2.3.4 Output Thumbnail Descriptor

At the end of the offline procedure, the result of the primitive approximation process is saved as the thumbnail descriptor, which speeds up the online rendering. As illustrated in Fig. 3.7, the thumbnail descriptor contains items for each individual part, such as its inverse matrix, its inverse translation vector, the number of d-cylinders applied to approximate the part, and the center SK_i and the radii (RW_{SK_i}, RH_{SK_i}) of the slices where a d-cylinder has been assigned. The average size of the thumbnail descriptor is 12 KB for a thumbnail composed of 50 d-cylinders in the plain-text file format. Therefore, a remote user can download the descriptors quickly.

3.2.4 Online Rendering with Reverse PCA Transformation

We leave only the rendering part in the online process to improve the browsing performance. When rendering the 3D thumbnail, all approximating d-cylinders are reverse-transformed back to the world coordinates from the PCA-transformed space. The pre-generated thumbnail descriptors contain all the information needed to render a thumbnail. Moreover, if users prefer to view the thumbnail with another level of detail, they can re-use the shape descriptors and re-generate the thumbnail from the approximation process within a second.

Figure 3.7: The thumbnail descriptor for a single model.

3.3 Distortion Error Estimation

Fig. 3.8 illustrates the distortion between an original shape and a simple thumbnail. The original shape is represented by the slices, and the thumbnail is represented by a few primitives. To estimate the distortion error for each slice L_{k,i} between an original shape and an approximating primitive, we consider three factors: (1) the surface distortion, (2) the location distortion, and (3) the volumetric distortion. Each factor is described below.

Figure 3.8: The distortion error estimation. (a) The skeleton and the slices of the associated body measurements represent the original shape. (b) Each decomposed part is approximated with a few d-cylinders. (c) The distortion error between the original mesh and the approximating d-cylinder is estimated at each slice.
I. Surface distortion

The surface distortion measures the surface distance between the original mesh and the approximating d-cylinder at each slice. The surface distortion ψ_1 at slice L_{k,i} can be defined as:

ψ_1(L_{k,i}) = ( Σ_{∀v ∈ L_{k,i}} d(v, C_{kt}) ) / |{∀v ∈ L_{k,i}}|,   (3.1)

where v is a sampling point extracted from the original mesh, C_{kt} is the approximating d-cylinder that covers slice L_{k,i}, and d(v, C_{kt}) is the distance from sampling point v to the surface of C_{kt} at this slice; d(v, C_{kt}) can be defined as:

d(v, C_{kt}) = | |v − Center(L_{k,i}, C_{kt})| − Radius(L_{k,i}, C_{kt}) |,   (3.2)

where Center(L_{k,i}, C_{kt}) and Radius(L_{k,i}, C_{kt}) are the center and the radius of C_{kt} at slice L_{k,i}. Since the centers and the radii of both the upper and the lower ellipses of the d-cylinder C_{kt} are known, the values of Center(L_{k,i}, C_{kt}) and Radius(L_{k,i}, C_{kt}) at each slice within this d-cylinder C_{kt} can be derived via interpolation.

II. Location distortion

The location distortion measures the distance between the center of the original mesh's slice and the center of the d-cylinder's slice. The location distortion ψ_2 at slice L_{k,i} can be defined as:

ψ_2(L_{k,i}) = |SK_{k,i} − Center(L_{k,i}, C_{kt})| / max_{∀P_k} { maxCenterDistance(P_k) },   (3.3)

where SK_{k,i} is the skeleton point of slice L_{k,i}, Center(L_{k,i}, C_{kt}) is the center of C_{kt} at slice L_{k,i}, and maxCenterDistance(P_k) is the maximum distance between any pair of the decomposed part P_k's skeleton points projected on a slice of P_k, which constrains ψ_2 within [0, 1].

III. Volumetric distortion

The volumetric distortion, ψ_3, measures the non-overlapping region between the slice of the original mesh and that of the approximating d-cylinder.
The volumetric distortion ψ_3 at slice L_{k,i} can be mathematically defined as:

ψ_3(L_{k,i}) = ( (origSliceSize + cSliceSize) − 2 × overlapSize ) / ( 2 × maxSliceSize ),   (3.4)

where origSliceSize and cSliceSize are the slice sizes of the original mesh and the approximating d-cylinder C_{kt}, overlapSize is the overlapping slice size between the original mesh and C_{kt}, and maxSliceSize is the maximum slice size among all parts, which constrains ψ_3 within [0, 1].

The slice size of the original mesh, origSliceSize, can be estimated with the body measurements (RW_{SK_i}, RH_{SK_i}) associated with the skeleton point SK_i at this slice. The slice size of the approximating d-cylinder, cSliceSize, can be estimated with the interpolated radii Radius(L_{k,i}, C_{kt}). In addition, since the centers and the radii of both the original mesh and the d-cylinder at slice L_{k,i} are known, their bounding regions at this slice can be roughly estimated. Assume that L_{k,i} lies on the plane X = x, that the derived bounding region of the original mesh has its minimum at (x, origY_min, origZ_min) and its maximum at (x, origY_max, origZ_max), and that the derived bounding region of the d-cylinder has its minimum at (x, cY_min, cZ_min) and its maximum at (x, cY_max, cZ_max). Then, the size of their overlapping area can be roughly estimated as:

overlapSize = |min(cY_max, origY_max) − max(cY_min, origY_min)| × |min(cZ_max, origZ_max) − max(cZ_min, origZ_min)|

Note that if either extent above is negative, the layers do not overlap at all; in this case, we set overlapSize to zero.

To summarize, let C_{kt} be the d-cylinder approximating one of the divisions of part P_k, and let slice L_{k,i} lie within the approximating region of C_{kt}. The distortion error for slice L_{k,i} can be estimated as

E(L_{k,i}) = w_1 × ψ_1(L_{k,i}) + w_2 × ψ_2(L_{k,i}) + w_3 × ψ_3(L_{k,i}),   (3.5)

where w_1, w_2, and w_3 are weighting parameters with w_1 + w_2 + w_3 = 1. The three parameters adjust the contribution of each distortion factor. The default values are set to 1/3, i.e., equal weight.
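As a concrete rendering of the overlap estimate and Eq. (3.5), here is a hedged Python sketch; the function names are our own, and the three ψ values are assumed to be computed elsewhere from Eqs. (3.1)-(3.4):

```python
def overlap_size(orig_box, c_box):
    """Rough overlapping area of two axis-aligned slice bounding boxes,
    each given as ((y_min, z_min), (y_max, z_max)); zero when disjoint."""
    (oy0, oz0), (oy1, oz1) = orig_box
    (cy0, cz0), (cy1, cz1) = c_box
    dy = min(cy1, oy1) - max(cy0, oy0)
    dz = min(cz1, oz1) - max(cz0, oz0)
    if dy < 0 or dz < 0:  # the two slices do not overlap at all
        return 0.0
    return dy * dz

def slice_error(psi1, psi2, psi3, w=(1 / 3, 1 / 3, 1 / 3)):
    """Eq. (3.5): weighted combination of the three distortion factors."""
    return w[0] * psi1 + w[1] * psi2 + w[2] * psi3
```

Raising `w[1]` (the ψ_2 weight) in this combination is exactly what the text describes for capturing high-curvature features.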
However, the weights can be adjusted whenever we would like to emphasize a particular factor. For example, a larger w_2 value can capture the detail of high-curvature features better than a smaller w_2 value.

3.4 Experimental Results

The proposed algorithm was applied to different 3D objects in the experiments. All 3D objects were converted into the same format (.obj) and normalized into the same distribution range. Sample thumbnail results are shown in Fig. 3.9 and Fig. 3.10. From left to right, the first column of pictures shows that the original mesh is composed of a main part and several protruding parts displayed in different colors. The second column depicts the skeleton and body measurements extracted from each individual part. The third, fourth and fifth columns show our 3D thumbnail results composed of 7, 20 and 50 d-cylinders, respectively. When a thumbnail model is extremely simplified, such as the examples shown in the third column of Fig. 3.9, each part is still represented by at least one primitive, so that the significant components are preserved. A small number of primitives does not preserve the lower-level details, but it requires fewer resources to render.

Our experiment was run on a desktop computer with an Intel Core 2 Duo 2.53 GHz CPU and 4 GB of RAM. The average processing time for each process is listed in Table 3.1. Each process can be done within a second, and the total processing time from

Table 3.1: System performance

the extraction process to the thumbnail rendering is around 1 second. The generated thumbnail descriptors composed of 7, 20 and 50 d-cylinders were about 10 KB, 10 KB and 12 KB, respectively, for models whose sizes were within the range of 160-870 KB. The file size of the thumbnail descriptor is determined by the number of primitives rather than the size of the original model. Thus, a thumbnail descriptor can be much smaller than its original model.
In addition, the result of the skeleton and body measurement extraction is always the same and is not affected by the number of primitives. Note that the mesh decomposition processing time is not included in this table, since it is based on Lin's work [34] and will be replaced by another method in the future. As reported in their work, the mesh decomposition time for a model such as the dinopet model in Fig. 3.2, which contains 2,039 vertices and 3,999 faces, was 2.3 seconds.

In addition, different results can be generated by adjusting the parameters of the error-driven approximation algorithm. For example, a higher w_2 value in Eq. (3.5) can capture the detail of high-curvature features (e.g., the ankle of a bear or the beak of a penguin) better than a lower w_2 value. Fig. 3.11(a) shows the result with a lower value of w_2, while Fig. 3.11(b) shows the result with a higher value.

However, the proposed primitive approximation process is based on the skeleton and body measurements extracted from each decomposed part, and the skeleton and the body measurements are extracted based on the mesh decomposition results. Thus, thumbnail results are highly affected by how meshes are decomposed in the beginning. If a 3D model has a complex topology, or the model is not fragmented well, it will lead to an unpleasant 3D thumbnail result, since its skeleton and body measurements may be inaccurate. Moreover, our thumbnail result may lose certain levels of detail from the original model, since it is approximated by primitives.

Finally, we developed an online 3D thumbnail viewer, as shown in Fig. 3.12, which was implemented as a Java 3D applet. The viewer can be embedded into a web browser easily and allows the user to browse multiple 3D thumbnails from different angles interactively. Currently, the viewer is set to display 12 thumbnails on the same page. All 12 thumbnails can be displayed within 5 seconds.
In our experiment, the browser can display up to 22 thumbnails without running out of memory when each thumbnail is composed of 40 d-cylinders. The lower the level of detail of a thumbnail, the more thumbnails can be displayed at the same time, and vice versa.

3.5 Conclusion and Future Work

In this work, we proposed a novel feature-preserving 3D thumbnail system for efficiently browsing multiple 3D objects in a large database. The significant components of the original model are well preserved in the 3D thumbnails even when the model is extremely simplified, and the thumbnails require much less hardware resources to render. Since the data size of the thumbnail descriptor is much less than that of the original mesh, it can be downloaded quickly. Additionally, the online thumbnail viewer can display multiple 3D thumbnails within a few seconds, so that a remote user can browse them interactively and efficiently in a large database.

The limitation of the proposed system arises when a 3D model has a complex topology and cannot be well decomposed by Lin's algorithm [34]. In the future, we will focus on improving the mesh decomposition method; we are developing a new volumetric-based decomposition method to address this issue. The skeleton and body measurement extraction process will also be adjusted so as to capture more features that have not been considered in this work. Finally, more types of primitives will be evaluated for primitive approximation, and the textures of a model will also be considered.

Figure 3.9: Results of the thumbnail descriptor.

Figure 3.10: More results of the thumbnail descriptor.

Figure 3.11: Different approximation results can be generated by adjusting the weighting parameters. Both thumbnails are composed of 8 d-cylinders, where (a) shows the result with a smaller w_2 value and (b) shows the result with a larger w_2 value.

Figure 3.12: The Java 3D applet-based 3D thumbnail viewer; a remote user can browse multiple 3D thumbnails interactively within a few seconds.
Chapter 4

Voxel-based Shape Decomposition for Feature-Preserving 3D Thumbnail Creation

4.1 Introduction

The number of 3D models grows rapidly due to their popularity in several industrial sectors. There is an increasing demand for effective management of 3D models (e.g., archival, indexing and retrieval of 3D models in a large repository), since users can benefit from effective reuse of existing models. A conventional 3D search engine displays static 2D thumbnails on the search page for browsing purposes. However, a 2D thumbnail may not represent a 3D model well. Some researchers attempted to automate the process of taking 2D snapshots of a 3D object by selecting the best viewing angle [40]. However, it is difficult to find an ideal angle-selection rule for generic objects. Even if one can capture the best angle of a 3D object in a 2D thumbnail, there are still features that cannot be seen from that angle. To overcome the shortcomings of 2D thumbnails, a feature-preserving 3D thumbnail creation system was presented in chapter 3 [17], with the objective that users can browse multiple 3D models by viewing their 3D thumbnails interactively at a lower cost (e.g., smaller memory space and faster rendering speed).

Although there exist quite a few mesh simplification techniques, most of them are not designed to preserve the salient features of a given model. For example, the limbs and the body of a human model can meld together when it is extremely simplified. To preserve the shape features of a 3D model, a mesh-based shape decomposition scheme, which was initially proposed in [34], was adopted in chapter 3 to decompose a model into multiple meaningful parts; then, each part was simplified individually. However, the mesh-based shape decomposition scheme has its limitations. For example, although it can decompose the protruding parts from the main body successfully, other meaningful parts (e.g., the head and the neck) may not be properly separated.
Besides, the surface of the decomposed mesh may become fragmented. The surface radius measured using the fragmented mesh may not be accurate enough to yield a good representative thumbnail.

Here, instead of finding patches to resolve problems arising from mesh decomposition and simplification, a volumetric shape decomposition scheme is presented to overcome the above-mentioned difficulties. To be specific, we propose a voxel-based shape decomposition scheme in this work. With this scheme, a 3D model is first converted into a 3D voxel representation, and the skeleton of the voxelized model is extracted by a thinning operation. Then, a skeleton refinement process is used to fine-tune the thinned skeleton, the refined skeleton is decomposed into multiple groups, and the voxelized model is decomposed into meaningful parts accordingly. The other operations for 3D thumbnail generation remain the same as those described in chapter 3 [17]. They include: taking body measurements for each part with the PCA transformation, and generating the final thumbnail by approximating each part with fitting primitives.

The voxel-based decomposition method has a challenge of its own. That is, the skeleton obtained by the thinning operation (or other skeletonization operations) may contain artifacts that result in a messy 3D thumbnail. For example, the skeleton may contain noisy or redundant skeleton-voxels (SVs), the skeleton may be jagged and skewed, or the skeleton may not represent the principal axis of a shape correctly. These artifacts often affect the shape decomposition process and lead to an inaccurate 3D thumbnail. We show that the skeleton refinement process plays an important role in reducing the artifacts of skeletons while preserving sharp features, and we discuss this process in detail. The main contribution of this research lies in the development of a robust voxel-based shape decomposition method.

The rest of this paper is organized as follows.
An overview of the proposed system is given in Sec. 4.2. The skeleton refinement process that links, groups and fine-tunes the skeleton is discussed in Sec. 4.3. The voxel-based shape decomposition scheme is described in Sec. 4.4. Experimental results and subjective visual tests are detailed in Sec. 4.5. Finally, concluding remarks and future research directions are given in Sec. 4.6.

4.2 Overview of Proposed System

Most previous mesh simplification methods were not much concerned with preserving the significant parts of a model. For example, a greatly simplified human model tends to meld the limbs and the body together. How to address this issue is the main focus of this work.

Figure 4.1: The block diagram of the feature-preserving 3D thumbnail creation system introduced in chapter 3 [17]; in this work, we focus on the improvement of the three blocks that are related to shape decomposition and highlighted in orange.

The block diagram of the feature-preserving 3D thumbnail creation system introduced in chapter 3 [17] is depicted in Fig. 4.1. The main difference between our current work and the previous work in chapter 3 [17] is that a voxel-based shape decomposition scheme is employed here, while a surface-based shape decomposition scheme was adopted in chapter 3 [17]. Our main objective is to improve the performance of shape decomposition by introducing three new blocks, highlighted in orange in Fig. 4.1. They are explained below.

• Voxelization and Thinning

A polygonal model is first rasterized into a binary 3D voxel grid. Then, a coarse skeleton is extracted from the volumetric model using a thinning algorithm [44]. These tasks can be accomplished with the tools in [5] and [10], respectively. In the following, we use object-voxels and skeleton-voxels to denote the volumetric model and the thinned skeleton, respectively, as illustrated in Fig. 4.2(a).

• Skeleton Refinement

The skeleton refinement process links and groups the discrete SVs obtained with the thinning operation.
Since the thinned skeleton often contains defects that affect the shape decomposition result, a skeleton refinement process is developed to enhance the extracted skeleton.

• Shape Decomposition

A shape decomposition method is used to decompose the shape into meaningful parts according to the grouping of skeletons. By assigning object-voxels to the group associated with their nearest skeleton, a shape can be decomposed roughly. A refined skeleton decomposition method is developed to re-group SVs more precisely so that the shape can be decomposed more accurately as well.

Once a shape is decomposed into multiple parts, each part is handled individually. As shown in Fig. 4.1, the remaining processes include: PCA transformation, body measurement, and primitive approximation. The PCA transformation is applied to each part individually for pose normalization. The body measurement (i.e., the radius of the surrounding surface along the skeleton) is taken for each part. Then, a primitive approximation method approximates each part with fitting primitives based on the body measurement results to yield the 3D thumbnail. Moreover, the shape descriptor and the thumbnail descriptor, which describe the simplified shape of the original model, are generated off-line and stored in the database. Consequently, the thumbnail can be downloaded and rendered efficiently on-line. For more details, we refer to chapter 3 [17].

In the following, we give an in-depth treatment of two new blocks in Fig. 4.1: 1) the skeleton refinement process and 2) the shape decomposition process, which are examined carefully in Sec. 4.3 and Sec. 4.4, respectively.

4.3 Skeleton Refinement

After the voxelization and thinning process, the thinned SVs consist of a set of discrete voxels that are to be linked, grouped and fine-tuned to represent a meaningful structure of the model. This process, called "skeleton refinement", is discussed in this section.
4.3.1 Skeleton Voxel (SV) Classification and Linking

We classify SVs into the following five categories:

I. End-SV: the end point of a skeleton, which has only one neighbor;
II. Joint-SV: the joint of a skeleton, which has more than two neighbors;
III. Peak-SV: the turning point of a skeleton, which has two neighbors and is a local peak;
IV. Orphan-SV: an isolated voxel, which has no neighbors;
V. Normal-SV: a voxel which has two neighbors and is not a local peak.

The classification tasks are often easy, since most of them can be accomplished by counting the neighbors of a voxel. We consider a 3x3x3 grid surrounding a central voxel and view the surrounding 26 voxels as its neighbors, as illustrated in Fig. 4.3. The classification of Peak-SVs demands some extra effort, which is performed after the SVs

Figure 4.2: Illustration of skeleton extraction, classification and skeleton decomposition: (a) object-voxels representing the volumetric model are shown in light gray, while the thinned SVs are shown in black; (b) End-SVs, Joint-SVs, and Normal-SVs are shown in red, yellow and black, respectively; (c) SVs are divided into multiple groups shown in different colors; (d) extracted turning points (Peak-SVs) representing local peaks are shown in purple; (e) object-voxels are decomposed into multiple parts roughly by assigning them to their nearest SVs; (f) the ideal shape decomposition result.

are linked, since a local peak is more difficult to extract from discrete voxels. Examples of End-SVs, Joint-SVs and Normal-SVs are shown in Fig. 4.2(b).

In the linking process, we link neighboring SVs and divide them into groups according to the joint locations. In practice, we create each group with either an End-SV or a Joint-SV, and continuously link unvisited adjacent voxels to this group until another End-SV or Joint-SV is met. As a result, each skeleton group consists of linked SVs whose two ends are either an End-SV or a Joint-SV, as shown in Fig. 4.2(c).
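A minimal sketch of the neighbor-counting classification above, assuming the skeleton is stored as a set of integer (x, y, z) voxel coordinates (that representation and the function names are our assumptions; Peak-SVs are detected later, so two-neighbor voxels are provisionally labeled Normal-SVs):

```python
from itertools import product

def neighbors(v, skeleton):
    """26-adjacent skeleton voxels of v: any voxel sharing a face,
    an edge, or even just a corner vertex counts as a neighbor."""
    x, y, z = v
    return [(x + dx, y + dy, z + dz)
            for dx, dy, dz in product((-1, 0, 1), repeat=3)
            if (dx, dy, dz) != (0, 0, 0)
            and (x + dx, y + dy, z + dz) in skeleton]

def classify(v, skeleton):
    n = len(neighbors(v, skeleton))
    if n == 0:
        return "Orphan-SV"
    if n == 1:
        return "End-SV"
    if n > 2:
        return "Joint-SV"
    return "Normal-SV"  # two neighbors; may be re-labeled Peak-SV later
```

The linking step can then walk from each End-SV or Joint-SV through Normal-SVs until the next End-SV or Joint-SV is met.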
However, a thinned skeleton may have defects that affect the decomposition result. These are handled by the post-processing techniques described in Sec. 4.3.2.

Figure 4.3: Illustration of one central voxel and its 26-adjacent voxels, where the two black voxels sharing only a common vertex are still viewed as neighbors.

A turning point is used to separate parts that can be bent, such as the separation of the lower arm and the upper arm. After the linking process, a Peak-SV can be extracted by analyzing the curvature. However, since the thinned skeleton may be jagged (i.e., containing many unwanted local peaks), extracting turning points by detecting local peaks may not work properly. To address this issue, a hybrid scheme that integrates two commonly used feature extraction methods is developed: the turning point is first located using the global distance and then adjusted based on the local curvature.

The concept of the global distance is illustrated in Fig. 4.4. For each group of SVs, we first draw a straight line from one end to the other. Then, we compute the perpendicular distance from each point along the curve to this straight line. At each iteration, a voxel whose distance to the line is greater than a distance threshold and whose angle (formed by the two ends with itself in the middle) is smaller than an angle threshold is selected as a candidate. The candidate voxel whose distance is the greatest is then picked as the turning point, and the curve is divided into two sub-curves at this voxel. The process is applied recursively to these two sub-curves until there is no voxel that qualifies as a candidate. However, a turning point extracted using the global distance alone may be located in a flat region. The angle constraint is then used to adjust the extracted turning point by examining the local curvature at each iteration.

Figure 4.4: Illustration of turning point extraction using the global distance.
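The global-distance recursion above closely resembles Ramer-Douglas-Peucker splitting; here is a 2D sketch of it (the function names and threshold are our own, and the angle test is omitted for brevity):

```python
import math

def point_line_dist(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    num = abs((bx - ax) * (ay - py) - (ax - px) * (by - ay))
    den = math.hypot(bx - ax, by - ay)
    return num / den if den else math.hypot(px - ax, py - ay)

def turning_points(curve, dist_thresh):
    """Recursively pick the farthest-from-chord point as a turning point,
    then split the curve there and recurse on both sub-curves."""
    if len(curve) < 3:
        return []
    dists = [point_line_dist(p, curve[0], curve[-1]) for p in curve[1:-1]]
    i = max(range(len(dists)), key=dists.__getitem__) + 1
    if dists[i - 1] <= dist_thresh:  # no qualifying candidate remains
        return []
    return (turning_points(curve[:i + 1], dist_thresh)
            + [curve[i]]
            + turning_points(curve[i:], dist_thresh))
```

For an L-shaped curve, the recursion reports the corner voxel as the single turning point, which is the behavior the text relies on for separating bendable parts.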
For each turning point extracted using the global distance criterion, we examine the curvature in its local neighborhood. The point whose curvature is the largest in this neighborhood is reclassified as the turning point in place of the original one. Fig. 4.2(d) shows an example of extracted turning points. Some meaningful parts, such as the head and the neck of a 3D object, can be separated using these turning points. However, it is worthwhile to point out that we do not divide the skeleton at turning points in the beginning, since doing so would complicate the subsequent processes.

4.3.2 Skeleton Post-processing Techniques

If the obtained SVs are clean and smooth, a skeleton can be linked and grouped into several meaningful parts easily. If the SVs obtained by the thinning operation have artifacts, the skeleton-based decomposition result becomes messy. In this subsection, post-processing techniques are proposed to handle three types of artifacts: 1) clustered SVs, 2) jagged and skewed skeletons, and 3) sub-branches.

A. Clustered SVs

Ideally, a thinned skeleton should have the width of one voxel while retaining the connectivity of the original shape. However, the extracted skeleton may contain clusters of adjacent voxels, as shown in Figs. 4.5(a) and 4.6(a). These clustered SVs, represented by yellow cubes, are classified as Joint-SVs since they all have more than two neighbors. Connecting two Joint-SVs that are adjacent (or very close) to each other in the linking process may result in a redundant group or an incorrect tiny loop, as shown in Fig. 4.5(c) and Fig. 4.6(a). These are caused by the ambiguous relationships of thinned SVs.
Figure 4.5: Examples of skeleton post-processing using the re-classifying filter, where (a) and (b) are before the linking process while (c) and (d) are after the linking process: (a) the original skeleton contains clusters of adjacent voxels which are classified as Joint-SVs and represented by yellow cubes; (b) one voxel in each cluster is selected as the representative while the others are re-classified as redundant joints and denoted with gray cubes; (c) the original skeleton with many redundant small loops formed by adjacent Joint-SVs; (d) these small loops are removed.

When two SVs are adjacent to each other, they are not necessarily linked. For Joint-SVs that have more than two neighbors, it is ambiguous which adjacent voxel to choose for linking. Wang et al. encountered the same problem in [49] and proposed to check whether a connection will cause a 3-edge cycle before it is added to the skeleton. However, they did not discuss which edge is to be discarded from a 3-edge cycle. Furthermore, they did not handle some other cases that may also lead to tiny groups or loops. Here, we propose the use of two filters (i.e., the re-classifying filter and the replacing filter) to solve this linking problem without deleting any SV.

For each cluster of adjacent joints, the re-classifying filter chooses one of them as the representative and re-classifies the rest as redundant joints. The representative is the one that has the largest number of adjacent joints, where a tie can be resolved by choosing

Figure 4.6: Skeleton post-processing using two filters in cascade: (a) the original skeleton; (b) after the application of the re-classifying filter to the original; (c) after the application of the replacing filter to the result in (b).

the one that is closest to the center of the cluster. We do not allow a redundant joint to link directly to another redundant joint that is associated with the same representative joint.
In other words, the path between each pair of redundant joints in the same cluster is discarded. The effect of applying the re-classifying filter is shown in Fig. 4.5.

The replacing filter aims to solve the problem where two joints are too close to each other, such as in the example shown in Fig. 4.6(b). To address it, the replacing filter iteratively searches for a pair of joints that are connected by the shortest path with their distance less than a certain threshold (e.g., 5 voxels) and then selects the middle point of this path as a new representative to replace the original two joints. The shortest path is split into two halves at the new representative joint, and the halves are merged into their adjacent groups, respectively. A skeleton post-processing example using the re-classifying filter and the replacing filter in cascade is shown in Fig. 4.6.

B. Jagged and Skewed Skeletons
To avoid an infinite loop, a maximum number of iterations is also set. With the proposed smoothing process, a jagged line can be straightened without losing sharp features. Three skeleton smoothing examples are shown in Fig. 4.7.

C. Sub-branches

As shown in Fig. 4.8(a), a thinned skeleton may have short sub-branches such as the one around the horse neck. These sub-branches will result in incorrect shape decomposition as shown in Fig. 4.8(b). It is challenging to check whether a given branch is a main branch or a sub-branch. Although most sub-branches are short, classification based on length alone is not robust, since short branches can still be important parts of a skeleton on some occasions, and removing them may remove key features such as toes or tails.

Figure 4.7: Examples of three jagged skeletons and their smoothing results.

Although a sub-branch is not necessarily shorter than its adjacent voxel groups, it typically lies inside the surrounding voxel region of another main branch. Besides, one of its end voxels has to be an End-SV. A sub-branch removal algorithm is developed based on these two observations.

Figure 4.8: Removing a sub-branch from the skeleton of a horse model: (a) a sub-branch in the neck region of a horse model; (b) incorrect shape decomposition around the neck caused by this sub-branch; (c) the sub-branch is removed and its adjacent skeleton groups are merged; (d) the shape decomposition result with the new decomposed skeleton.

Skeleton group G_1 is said to lie within its adjacent group G_2 if the following three conditions are all satisfied:
I. G_1 is smaller than G_2, i.e., G_1 contains a smaller number of SVs.
II. The angle, θ, at the joint of branches G_1 and G_2 is within the range from 50 to 130 degrees.
III.
The distance from G_1's SV, denoted by K_i, to its nearest SV K_j in G_2 is smaller than the radius of the surrounding surface around K_j. This condition can be expressed mathematically as

d(K_i, K_j) < Radius(K_j), ∀K_i ∈ G_1,    (4.1)

where K_j is the nearest SV of K_i in G_2, d(K_i, K_j) is the distance between two voxels as defined in Eq. (4.2), and Radius(K_j) is the radius of the surrounding surface around K_j.

The radius of the surrounding surface around each SV can be estimated by assigning each surface voxel to its nearest SV. A surface voxel is an object-voxel that does not have 26 adjacent neighbors in a 3×3×3 grid. Its nearest SV is found by computing the Euclidean distance together with a connectivity check. For example, the length of the path from one finger tip to another is longer than their Euclidean distance since there is no straight path between them. The distance between surface voxel V_j and SV K_i is defined as

d(V_j, K_i) = ‖V_j K_i‖ if a straight path V_j K_i exists; ∞ otherwise.    (4.2)

To check whether a straight path exists between V_j and K_i, line segment V_j K_i is created. All voxels in V_j K_i can be derived by interpolation. If there is an empty voxel in V_j K_i, the length of V_j K_i is set to infinity. Then, all surface voxels can be assigned to their nearest SVs. Each K_i has a list, denoted by List(K_i), to record its associated surface voxels. This list is used to calculate the average radius, Radius(K_i), of its surrounding surface. Mathematically, we have

List(K_i) = {V_j | ∀V_j, K_i is the nearest SV},    (4.3)

Radius(K_i) = ( Σ_{V_j ∈ List(K_i)} d(V_j, K_i) ) / |List(K_i)|,    (4.4)

where Radius(K_i) is estimated by averaging the distances from all surface voxels in List(K_i) to K_i.

The sub-branch removal process can be summarized as follows.
I. We associate all skeleton groups with clusters using the following rule: all groups adjacent to each other belong to the same cluster. In other words, all skeleton groups in a cluster share a common joint.
Note also that a group can belong to two clusters if both of its ends are joints.
II. For each cluster, we sort skeleton groups by their sizes and examine each group sequentially from the smallest to the largest. A skeleton group will be removed if it satisfies the three conditions discussed above. More than one group may be removed from a given cluster in this process.
III. Skeleton groups previously partitioned by the joint of a removed sub-branch will be merged.

Figure 4.9: More examples of sub-branch removal, where the three skeletons in the top row contain sub-branches while the three skeletons in the bottom row are obtained by the proposed sub-branch removal algorithm.

A sub-branch removal example is shown in Fig. 4.8(c), where the two groups adjacent to the joint of the removed sub-branch around the neck are merged. More examples of sub-branch removal are shown in Fig. 4.9.

4.4 Shape Decomposition

The goal of the shape decomposition process is to divide a 3D shape into meaningful parts. We begin with a simple object-voxel assignment process to decompose a 3D object roughly and then use the skeleton decomposition result to guide the shape decomposition toward a more accurate result. This idea is described below.

• Step 1: Initial Shape Decomposition
The initial shape decomposition is achieved by assigning each object-voxel to its nearest SV based on Eq. (4.2). Thus, every object-voxel can be assigned to a part by following its associated SV. To reduce the complexity of shape analysis, we only consider surface voxels and ignore interior voxels in this process. Fig. 4.2(e) shows an example of the assignment result, where each object-voxel is assigned to the same group as its nearest SV, shown in the same color. After the assignment, each SV has an associated list recording its affiliated surface voxels and an estimated radius of its surrounding surface as described in Sec. 4.3.2. These data will be used in the next step.
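Step 1 above, together with the visibility-checked distance of Eqs. (4.2)-(4.4), can be sketched as follows; `occupied` is the set of object-voxels, and all names are illustrative assumptions:

```python
import math

def straight_path_clear(a, b, occupied):
    # Interpolate voxels along segment a-b; every sample must be an object-voxel.
    steps = max(1, max(abs(a[k] - b[k]) for k in range(3)))
    for t in range(steps + 1):
        p = tuple(round(a[k] + (b[k] - a[k]) * t / steps) for k in range(3))
        if p not in occupied:
            return False
    return True

def distance(v, k, occupied):
    # Eq. (4.2): Euclidean length if a straight path exists, infinity otherwise.
    return math.dist(v, k) if straight_path_clear(v, k, occupied) else math.inf

def assign_and_radius(surface_voxels, skeleton_voxels, occupied):
    lists = {k: [] for k in skeleton_voxels}          # List(K_i), Eq. (4.3)
    for v in surface_voxels:
        nearest = min(skeleton_voxels, key=lambda s: distance(v, s, occupied))
        lists[nearest].append(v)
    radius = {k: (sum(distance(v, k, occupied) for v in vs) / len(vs) if vs else 0.0)
              for k, vs in lists.items()}             # Radius(K_i), Eq. (4.4)
    return lists, radius
```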
• Step 2: Skeleton-guided Shape Decomposition
A skeleton-guided decomposition process is proposed to improve the initial shape decomposition result. This is motivated by the fact that most skeleton branches intersect at the base skeleton, and two problems in the initial shape decomposition arise accordingly. First, a portion of the base part (i.e., the main body) can be mistakenly decomposed into protruding parts (e.g., the limbs). Second, the base part can be mistakenly separated into two parts at the intersection. For the example in Fig. 4.2(e), the arm skeleton protrudes into the main body and intersects with the body skeleton. As a result, a portion of the main body is mistakenly assigned to the arm and the main body is mistakenly divided in the middle. The skeleton-guided shape decomposition process is used to fine-tune the result. It consists of the following three sub-steps.
– Step 2.A: Distinguish the base part (i.e., the main body) from protruding parts (e.g., limbs).
– Step 2.B: Delete a protruding skeleton that goes beyond its boundary and re-build the skeleton of the base part accordingly.
– Step 2.C: Divide a protruding skeleton into sub-groups at turning points.
In the following, we will discuss them in detail.

Step 2.A: Base Part Identification
To identify the base part, we define the weighted accumulated distance for each part P_i as

ρ(P_i) = d_E(cen(P_i), cen(P)) × s(P_i) + Σ_{∀j≠i} d_P(mid(P_i), mid(P_j)) × s(P_j),    (4.5)

where cen(P_i) and cen(P) are the centers of mass of P_i and the model P, respectively, d_E is the Euclidean distance, s(P_i) is the number of object-voxels belonging to P_i, and mid(P_i) is the SV of P_i that is closest to cen(P_i). Since the distance between two parts is not equal to their Euclidean distance, we define the distance from P_i to P_j as

d_P(mid(P_i), mid(P_j)) = ‖path(mid(P_i), mid(P_j))‖,    (4.6)

where path(mid(P_i), mid(P_j)) contains all SVs along the path from mid(P_i) to mid(P_j), and ‖·‖ is the path length.
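A sketch of evaluating Eq. (4.5); `parts` maps a part id to its (center of mass, voxel count, mid SV) and `path_len` returns the skeleton path length d_P of Eq. (4.6) between two mid SVs (both are assumed inputs):

```python
import math

def base_part(parts, model_center, path_len):
    # parts: id -> (center_of_mass, voxel_count, mid_sv); path_len: d_P of Eq. (4.6).
    def rho(i):                                   # Eq. (4.5)
        center_i, size_i, mid_i = parts[i]
        cost = math.dist(center_i, model_center) * size_i
        for j, (_cj, size_j, mid_j) in parts.items():
            if j != i:
                cost += path_len(mid_i, mid_j) * size_j
        return cost
    # The base part minimizes the weighted accumulated distance.
    return min(parts, key=rho)
```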
The idea is illustrated in Fig. 4.10(a). Finally, the base part, which is closest to all other parts, can be identified by finding the part that has the minimal accumulated distance. A few examples obtained by this algorithm are shown in Fig. 4.10(b), where the identified base part of each model is colored in red.

Figure 4.10: The base part identification procedure: (a) the path along the skeleton from one part to another, where the green dots represent centers of different parts; and (b) illustration of identified base parts shown in red for a few 3D models.

Step 2.B: Skeleton Invalidation and Mergence
To invalidate or merge protruding skeletons, we check all protruding skeleton groups connected to the base skeleton and classify them into either the invalidation or the merging category. A protruding group that intrudes into the main body should be invalidated, while a protruding group that is mistakenly separated from the base skeleton should be merged. The classification can be done based on the relative angle between the protruding skeleton and the base skeleton. If they are nearly parallel (i.e., the angle is close to 180 degrees), the protruding skeleton is likely to belong to the base skeleton and should be classified into the merging category. Otherwise, it is classified into the invalidation category. The direction of a protruding skeleton is estimated from the joint connected to the base skeleton to its nearest turning point, or to the other end if there is no turning point. The direction of the base skeleton is estimated from the end that is connected to the protruding part to the other end. After the classification, we search for the boundary between the protruding part and the main body. Only the segment that goes beyond the protruding boundary will be invalidated or merged.
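The angle-based classification of Step 2.B can be sketched as follows; the 30-degree parallelism tolerance is an assumed value, not one given in the text:

```python
import math

def classify_protruding(protruding_dir, base_dir, parallel_tol_deg=30.0):
    # Both directions are unit vectors; the tolerance is an assumed value.
    dot = sum(p * b for p, b in zip(protruding_dir, base_dir))
    angle = math.degrees(math.acos(max(-1.0, min(1.0, dot))))
    # Nearly parallel (angle close to 180 degrees): the group continues the base.
    return "merge" if angle >= 180.0 - parallel_tol_deg else "invalidate"
```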
The boundary can be detected by exploiting the fact that the radius of the surrounding surface along the protruding skeleton increases drastically when it crosses the boundary between the protruding part and the base part; see, for example, the surrounding surface along the arm skeleton in Fig. 4.2(e). The SV whose surface radius increases faster than a threshold is chosen as the boundary point of the protruding group. All SVs between the boundary point and the intersection with the base group will be invalidated or merged.

Figure 4.11: Illustration of the skeleton-guided shape decomposition process: (a) the result from the initial decomposition; (b) invalidating the segment of the protruding skeleton that goes beyond its boundary; (c) extending the base skeleton by merging a segment of a protruding skeleton; (d) dividing groups into sub-groups with turning points.

When a protruding skeleton is long or connected to other sub-parts, the variation of the surrounding surface radius may not be a reliable indicator for the boundary between the body part and the protruding part. For example, the head skeleton of the crane model in Fig. 4.11(a) (shown in orange) is bending, and the variation of the surrounding surface radius along the skeleton is irregular. It is difficult to select a threshold for this path. To resolve this problem, we narrow down the search region of the protruding boundary. For each protruding group classified into the invalidation category, the boundary search region is confined to a small area controlled by the surface radii of the base part. That is, if the average surface radius surrounding the base skeleton is δ, only protruding SVs whose distance to the base skeleton is close to δ need to be examined.
We can express this condition mathematically as

δ_1 − ε ≤ d(K_i, K_j) ≤ δ_2 + ε, ∀K_i ∈ [K_s, K_e],

where [K_s, K_e] is the search region along the protruding skeleton, K_i is an SV of the protruding group, K_j is the SV of the base group that is nearest to K_i, δ_1 and δ_2 are the average and the largest surface radii surrounding the base skeleton, respectively, and ε is an offset that slightly enlarges the search region. If no point in the search region meets the above criterion, we select K_e as the boundary point. All protruding SVs between this boundary point and the intersection joint are invalidated. Furthermore, SVs belonging to these invalidated skeletons are assigned to their nearest base skeleton. Fig. 4.11(b) shows the improved shape decomposition result.

For a protruding group classified into the merging category, we also search for the boundary between the base group and the protruding group and only extend the base skeleton to the boundary. The boundary search process is conducted as follows. First, the turning point is used to narrow down the search region so that the mergence stops at the location where the protruding part starts to bend, such as the boundary between the crane's neck and the body as shown in Fig. 4.11. Thus, the search region is set between the intersection of the two parts and the nearest turning point of the protruding group. Second, we search for the boundary point by checking for the SV whose surrounding surface radius increases dramatically. If no boundary is found this way, the turning point is selected as the boundary. The improved shape decomposition result based on skeleton merging is shown in Fig. 4.11(c).

Step 2.C: Shape Decomposition with Turning Points
Some protruding groups can be further decomposed into subgroups at turning points. For example, meaningful parts of animal models, such as legs, can be decomposed by analyzing the curvature of the skeleton. In Fig.
4.11(d), we show an improved shape decomposition result, where protruding parts such as the legs and the neck of the crane model are decomposed into subgroups at selected turning points. After the skeleton decomposition process, we do not need to re-calculate or re-assign each object-voxel to its nearest skeleton-voxel. Instead, we only update its belonging group, since its associated SV might have been merged or divided into a different group.

4.5 Experimental Results

In the experiments, we adopted a collection of polygonal 3D models used in [31]. All of them were pre-converted into the same obj format, normalized to the same range, and voxelized into the corresponding volumetric models. In the voxelization process, a smaller voxel size yields a higher-resolution model that captures more details. As a tradeoff, its processing demands more time and a larger memory size, and its thinning result is noisier. The resolution was finally chosen to be an 80×80×80 grid since it struck a good balance between model quality, memory size, and computational complexity.

4.5.1 Comparison of Decomposed 3D Models

The effect of the skeleton refinement process is illustrated in Fig. 4.12, where we show the extracted skeletons of 12 models. The left one of each pair was obtained by the thinning operation alone, while the right one was obtained by the thinning operation followed by the skeleton refinement process. It is clear that the skeleton refinement process improves the quality of the extracted skeletons. We also provide the shape decomposition result in the same figure, where the identified base parts are shown in red and protruding parts are shown in different colors. With the improved skeleton, each model can be decomposed into its significant parts more accurately.

Figure 4.12: Comparison of skeletons obtained by the thinning operation alone (left) and by the thinning operation followed by the skeleton refinement process (right).
Based on the decomposed 3D models, we constructed the feature-preserving 3D thumbnails in Fig. 4.13, each of which consists of 90 primitives. These thumbnails preserve sufficient details of their original models. We show thumbnails with 7, 11, 15 and 30 fitting primitives in Fig. 4.14, which demonstrates the robustness of the proposed voxel-based shape decomposition scheme.

Next, we compare the performance of the surface-based decomposition scheme of Chapter 3 [17] and the proposed voxel-based decomposition scheme for a crane model in Fig. 4.15. We see clearly from Figs. 4.15(a2)-(a4) that the mesh decomposed using the surface-based scheme does not represent the shape well, since it becomes fragmented with adjacent parts being taken apart in the simplified crane model. Its skeleton and body measurement extracted from the base part lean toward the upper body, which is caused by missing pieces in the bottom part. This problem is fixed by the proposed voxel-based approach as shown in Figs. 4.15(b1)-(b3). In this comparison, we kept the file sizes of both thumbnails about the same (i.e., 8 KB per thumbnail), each of which contained 90 primitives.

Figure 4.13: Thumbnails obtained by the proposed voxel-based shape decomposition scheme, where the original 3D models were decomposed into multiple parts and each part was approximated by a fitting primitive.

Figure 4.14: Thumbnails approximated by a small number of primitives (given in the lower left corner of each thumbnail) with the voxel-based approach.
Figure 4.15: Comparison of the surface-based and the voxel-based decomposition schemes: (a1) a mesh decomposed by the surface-based technique [34]; (a2) the bottom view of the decomposed base part, where several pieces are missing; (a3) the extracted skeleton and the body measurement; (a4) the resultant thumbnail with an upward base part caused by missing bottom pieces; (b1) the shape decomposed by the voxel-based scheme; (b2) the improved skeleton and the body measurement; and (b3) the resultant thumbnail of higher quality.

4.5.2 Computational Time and File Size

The experiment was run on a desktop computer with an Intel Core 2 Duo 2.53 GHz CPU and 4 GB RAM. The average processing time for creating a 3D thumbnail composed of 50 primitives is given in Table 4.1. Note that the processing time for primitive approximation may vary according to the number of d-cylinders assigned. Approximating a thumbnail using 50 d-cylinders took 0.24 seconds on average; 20 and 40 d-cylinders took 0.072 and 0.142 seconds, respectively. For 3D models whose original file sizes were in the range of 160∼870 KB, the sizes of their thumbnails composed of 7, 20 and 50 d-cylinders were about 0.9 KB, 1.7 KB and 4 KB, respectively. Since the size of a thumbnail is primarily decided by the number of primitives (rather than the size of the original model), the thumbnail can be much smaller than the original file for a complex model. It is worthwhile to emphasize that, since the extracted skeleton and the body measurement of a model are not affected by the number of assigned primitives, they can be reused to create thumbnails of different resolutions.

Table 4.1: Computational benchmarking.
  Step                                          Time (sec)
  Voxelization (80×80×80) [5]                   8.67
  Thinning [10]                                 0.18
  Skeleton refinement & shape decomposition     1.03
  PCA transformation & body measurement         0.51
  Primitive approximation (50 d-cylinders)      0.24

Figure 4.16: Comparison of simplified models with different resolutions, where models in the top row were simplified by Garland's method [23] and models in the bottom row were simplified by the proposed scheme.

4.5.3 Subjective Evaluation

We conducted a subjective test on the performance of the thumbnails obtained by the proposed method and Garland's mesh simplification method [23]. A total of 30 people took part in this experiment. We randomly chose 18 models and obtained their simplified models using the above-mentioned two methods. Each model was simplified into 4 resolutions (with 10, 20, 40 and 90 primitives). We selected pairs of simplified models (called Model A and Model B) in a random order and placed them side-by-side with their original model. Each subject was requested to choose one of three options: (1) Model A is closer to the original; (2) Models A and B are about the same; (3) Model B is closer to the original. One set of such test models is shown in Fig. 4.16.

The subjective test results for the 4 resolutions are shown in Table 4.2, where the number in each box is the percentage of subjects who preferred the corresponding algorithm. For example, the first column shows that all subjects preferred our algorithm when the simplified model was composed of 10 primitives. It is clear that our method outperforms Garland's method at all resolutions in the test. However, the gap becomes narrower as the number of primitives increases.

Table 4.2: Subjective preference of algorithms.
  Number of primitives    10      20      40      90
  Ours                    100%    100%    83.3%   53.3%
  Garland [23]            0%      0%      10.0%   26.7%
  Equivalent              0%      0%      6.7%    20.0%

4.5.4 Discussion

Finally, we would like to point out one shortcoming of the proposed voxel-based method. That is, the decomposition result is highly affected by the quality of the extracted skeleton. If the skeleton does not represent the structure of a 3D model correctly, the proposed method may fail to decompose the model. Two such examples are given in Fig. 4.17.
The skeleton in the middle part of the cactus model is skewed and does not have the same direction as the upper and lower parts. As a result, the base part of the cactus fails to extend in the correct direction. The alien model has two big holes in the face, and the skeleton obtained by thinning cannot capture this feature correctly; consequently, one half of the face is missing in the created thumbnail. To improve these cases, we need a better skeletonization technique for 3D models, which is an item for future research.

4.6 Conclusion

An innovative voxel-based shape decomposition method was presented in this work. The skeleton was first extracted and decomposed into multiple groups. Then, the volumetric model was decomposed into significant parts guided by the skeleton decomposition result. The significant parts of a 3D model can thus be well preserved by its thumbnail representation. As compared with the surface-based shape decomposition scheme of Chapter 3 [17], the new voxel-based shape decomposition scheme represents the shape of each part better and decomposes the model more accurately even if the original model is greatly simplified.

Figure 4.17: Examples of skeletons that do not represent the correct structure of a shape and result in failed thumbnails.

Chapter 5
Sketch-based 3D (S3D) Modeling

5.1 Introduction

In previous chapters, we proposed a feature-preserving shape simplification approach that approximates 3D shapes with fitting primitives. A customized d-cylinder was designed for shape approximation: each 3D model is decomposed into meaningful parts, and each part is approximated with multiple fitting d-cylinders. However, the d-cylinder can only fit tubular shapes. Man-made objects such as chairs, tables, or other flat objects cannot be approximated well with the d-cylinder. In this chapter, more customized primitives are designed so that a variety of 3D shapes can be approximated.
Moreover, we extend the framework to 2D sketches and approximate them with fitting primitives, since many techniques developed earlier, such as turning point extraction and primitive approximation, can be shared and reused for this purpose. For example, a customized 3D primitive can be created based on 2D contours and skeletons drawn by the user. The proposed sketch-based 3D modeling system provides a convenient tool for users to create simple 3D models quickly. Although quite a few 3D modeling tools have been developed, many of them are not user friendly. It may take days or even months for users to get familiar with dedicated 3D model creation software, and casual users can easily get frustrated. To address this concern, more intuitive and simplified modeling techniques are needed.

The user interface of the proposed sketch-based 3D modeling system, called S3D, is shown in Fig. 5.1. The large white panel is the canvas that a user can draw on, while the button panel on the right offers a set of editing tools. A user can load a 2D image and draw 2D contours/skeletons on the panel. The user-drawn 2D contours are approximated by fitting primitives when the approximation button is clicked. To obtain a finer approximation, the user-drawn 2D sketch is first refined (optionally) and then approximated by more fitting primitives of smaller size. Five user-customized primitives are provided for approximation in this system, namely the open-tube, closed-tube, ellipsoid, prism and complex-prism. A user can create these primitives by drawing their contours/skeletons. Meaningful parts of a 3D character can be grouped hierarchically with the 3D editing tools, and a well-grouped 3D character can be controlled with an embedded skeleton for animation purposes.

The design of the S3D modeling system consists of three challenging tasks: 1) 2D sketch refinement, 2) 3D primitive approximation, and 3) 3D object editing.
First, 2D sketch refinement is essential since it is difficult to draw a precise curve with today's computer input devices. User-drawn contours and skeletons are often jagged, discontinuous and unsmooth, and they need to be refined. Second, an innovative primitive approximation approach is demanded so that the user can create 3D models easily with a variety of primitive combinations. Third, a 3D editing tool is needed for users to manage 3D objects freely, with operations such as deletion, duplication and grouping. These challenging tasks will be addressed thoroughly in this work. It will be demonstrated by experimental results and user evaluation that the S3D system allows new users to create a variety of simple 3D models with a short learning curve. On the other hand, the S3D system is not powerful enough to create complicated 3D models for professionals; the current system targets children and hobbyists who want to create simple models quickly.

Figure 5.1: User interface of the proposed S3D system.

The rest of this chapter is organized as follows. We discuss the design of the 2D sketch editing module and the 3D primitive approximation module in Sec. 5.2 and Sec. 5.3, respectively. Then, we discuss the management of created 3D objects in Sec. 5.4. User evaluation is presented in Sec. 5.5. Finally, concluding remarks and future research directions are given in Sec. 5.6.

5.2 2D Sketch Editing

The S3D system allows a user to create a 3D model by drawing its 2D contours and skeletons. An innovative contour editing tool and a primitive approximation step are developed to achieve this goal. The contour-editing tool is provided to refine the user-input sketches, including contours and skeletons. Since most computer input devices, such as the mouse or the stylus, are not designed for drawing, the sketch often results in jagged and discontinuous lines/curves.
A sketch refinement process is applied, followed by four optional curve-fitting tools for further refinement. Then, a primitive approximation step is used to approximate a refined contour with customized 3D primitives. In this section, the sketch editing process is discussed in detail.

Basic Sketch Refinement

User-drawn contours/skeletons consist of sequences of discrete 2D points. Due to the difficulty in controlling computer input devices, the input sketch is typically jagged and irregular. Besides, the last and the first points of a desired closed contour may not be connected. To get a continuous and smooth contour, several pre-processing operations are applied. Discrete points are first down-sampled to reduce the jaggedness of curves. Then, gaps between two consecutive points are closed to create a continuous curve. Afterwards, the first and the last points are connected by interpolation to create a closed contour.

After the pre-processing operations stated above, we extract the turning points along the curve to offer a better fit. The basic idea is sketched below. First, we draw a straight line from one end of the curve to its other end and compute the perpendicular distance from each point along the curve to this straight line. If the largest distance between a point and the line exceeds a certain threshold, this point is viewed as a turning point and we divide the curve into two sub-curves at this turning point. This process is applied recursively until the maximum distance between the points and the line does not exceed the threshold.

In addition, a simple vertex list rearrangement process is applied to a closed contour to enhance the turning point extraction process. Note that, although a contour is closed and should not have two ends, there exist start and end points when we manage all contour points in a list. The start and end points are separated in the list although they are actually adjacent to each other in the closed contour.
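The recursive turning-point extraction described above follows the classic Ramer-Douglas-Peucker split; a minimal 2D sketch (the pixel threshold is illustrative):

```python
import math

def perp_dist(p, a, b):
    # Perpendicular distance from point p to the line through a and b.
    (ax, ay), (bx, by), (px, py) = a, b, p
    if (ax, ay) == (bx, by):
        return math.hypot(px - ax, py - ay)
    num = abs((bx - ax) * (ay - py) - (ax - px) * (by - ay))
    return num / math.hypot(bx - ax, by - ay)

def turning_points(curve, threshold):
    if len(curve) < 3:
        return list(curve)
    dists = [perp_dist(p, curve[0], curve[-1]) for p in curve[1:-1]]
    i = max(range(len(dists)), key=dists.__getitem__) + 1
    if dists[i - 1] <= threshold:
        return [curve[0], curve[-1]]
    # Split at the farthest point and recurse on both halves.
    left = turning_points(curve[:i + 1], threshold)
    return left[:-1] + turning_points(curve[i:], threshold)
```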
As shown in Fig. 5.2(a), S denotes a start point, and K_1 and K_n are turning points. Segment (K_1, K_n) is divided at S, and the gray arrow indicates the order of points in the list. The segment divided by S may cause a problem in the curve smoothing process to be conducted later, since turning points are adopted to fit a smooth curve. The curve between turning points K_1 and K_n is split into two segments, and two smooth curves would be constructed for (K_1, S) and (S, K_n), respectively, instead of one. To overcome this problem, we rearrange the list so that it always starts from the first turning point, as shown in Fig. 5.2(b).

User Specified Curve- and Line-Fitting

The S3D system offers three curve-fitting methods and one line-fitting method for contour refinement. All four methods adjust the contour based on the extracted turning points described in the last subsection.

Figure 5.2: Illustration of the point rearrangement process, where S is the start point, and K_1 and K_n are turning points: (a) the start point lies between two turning points and (b) the start point is always a turning point.

The three curve-fitting methods include two well-known ones (the quadratic Bezier curve [4] and the natural cubic curve [8]) as well as the feature-preserving one. The line-fitting method simply creates a straight line between two turning points. The user can choose any of the four to refine their contours. In Fig. 5.3, we show four examples and compare their original contours with those refined by the natural cubic curve fitting and the line fitting methods. Clearly, these contours are smoothened. However, some important features such as sharp corners may not be well preserved by these fitting methods. Thus, we propose a feature-preserving curve-fitting method as described below.

Consider a set of control points K_1, K_2, ..., K_n, where K_1 is the first contour point and K_2, ..., K_n are turning points. First, we check all control points and classify them into key points and regular points.
A control point K_i is a key point if it meets one of the following two criteria: (1) the curve has a C² discontinuity at K_i (i.e., K_i is a fluctuation point between concave and convex), or (2) the adjacent angle at K_i is smaller than a threshold. Otherwise, it is a regular point.

To determine whether the curve has a C² discontinuity at control point K_i, we use four control points K_{i−2}, K_{i−1}, K_i and K_{i+1}, where K_{i−2} and K_{i−1} are its previous two control points and K_{i+1} is its next control point. We draw a line from K_{i−1} to K_i, and if K_{i−2} and K_{i+1} lie on different sides of this line, the curve has a C² discontinuity at K_i. Next, we examine all control points from K_1 to K_n and apply different curve fitting methods according to the following rules: (1) If there are regular points between key points K_i and K_j, we apply the natural cubic curve fitting method to smoothen the curves between them using control points K_i, K_{i+1}, K_{i+2}, ..., K_j. (2) If there is no regular point between two key points, we perform line fitting between them. The performance of the different curve fitting methods is compared in Figs. 5.4 and 5.5.

5.3 Mapping 2D Sketch to 3D Model with Elements

To map user-drawn 2D sketches to 3D models, five 3D object elements are developed. They are the open-tube, the closed-tube, the ellipsoid, the prism, and the complex-prism.

Figure 5.3: Examples of four user-drawn 2D contours (the top row) and the results of applying the natural cubic curve fitting method (the left two in the bottom row) and the line fitting method (the right two in the bottom row).

Figure 5.4: (a) user input contour, (b) extracted turning points (red dots) and the first contour point (blue dot), and contours refined by (c) the quadratic Bezier curve fitting, (d) the natural cubic curve fitting, (e) the line fitting and (f) the feature-preserving curve fitting methods.
Figure 5.5: Comparison of different curve fitting methods: (a) user input contours, and contours refined by (b) the quadratic Bezier curve fitting, (c) the natural cubic curve fitting, and (d) the feature-preserving curve fitting methods.

A variety of 3D models can be created with different combinations of these 3D elements. The rules used to map user-drawn 2D sketches to a proper element are given below:
• A contour that has a skeleton drawn inside is mapped to an open-tube.
• A contour that has both an inner contour and an inner skeleton between the outer and inner contours is mapped to a closed-tube.
• An elliptical contour (specifically drawn with our ellipse painting tool) is mapped to an ellipsoid.
• A contour without any contour/skeleton inside is mapped to a prism.
• A contour that has an inner contour (but no inner skeleton) is mapped to a complex-prism.
Examples of the above five mapping rules are given in Fig. 5.6(a)-(e), respectively. The S3D system provides three types of drawing tools to users: 1) contour drawing, 2) skeleton drawing, and 3) ellipse drawing. Thus, a user can draw different combinations of contours and skeletons as shown in Fig. 5.6. In addition, to make the S3D system easier to use, the system can pair user-drawn skeletons and contours automatically by detecting the relationships among them. For example, if a user draws multiple contours or skeletons, a skeleton or contour lying inside another contour is detected and associated with its outer contour automatically. As a result, the element type used in the 3D object modeling can be determined automatically. In the following, we discuss each element type in detail.

Figure 5.6: Illustration of five mapping rules used to convert 2D contours/skeletons to 3D elements (where the contour is in gray and the skeleton is in black): (a) the open-tube, (b) the closed-tube, (c) the ellipsoid, (d) the prism, and (e) the complex-prism.

5.3.1 Open Tube

A user-drawn contour with an associated inner skeleton is mapped to an open-tube as shown in Fig. 5.6(a).
The open-tube is constructed from multiple d-cylinders, where each d-cylinder is used to fit a segment of a 2D contour. Recall that a d-cylinder consists of an upper ellipse, a lower ellipse, and multiple quadrangles connecting these two ellipses. The major and minor radii of the upper ellipse and those of the lower ellipse can be adjusted individually to fit the 3D model. To improve the appearance of an open-tube, two caps are attached to its ends.

A. Overview of Design Methodology

To create a fitting open-tube for a user-drawn 2D contour, we use turning points to decompose the contour into multiple segments and adopt one d-cylinder to fit each segment. An example is given in Fig. 5.7, where a skeleton is decomposed into three segments, denoted by Seg(K_0, K_1), Seg(K_1, K_2), and Seg(K_2, K_3), where K_0 and K_3 are end points and K_1 and K_2 are turning points. Each segment is fit by one d-cylinder, and we use Cylinder(K_i, K_j) to denote the fitting d-cylinder for Seg(K_i, K_j).

Figure 5.7: The radius of the K_1 cross section.

The upper and lower ellipses of Cylinder(K_i, K_j) should match the cross sections at K_i and K_j, respectively. Each ellipse can be specified by (i) its major and minor radii (to determine the size), (ii) its center point (to determine the location), and (iii) its surface normal (to determine its orientation). For the example in Fig. 5.7, the upper ellipse of Cylinder(K_1, K_2) should have its surface normal equal to the tangent of the skeleton at turning point K_1, and its diameter equal to that of the K_1 cross section. The lower ellipse can be determined at turning point K_2 similarly. Thus, the vertices E_i of each ellipse of a fitting d-cylinder can be calculated via the equation:

E_i = V(a × cos(α), b × sin(α), 0) × rotM(z-axis, N) + (C_x, C_y, 0),   (5.1)

where a is the major radius, b is the minor radius, (C_x, C_y) is the center, N is the surface normal, α ∈ [0, 2π), and rotM(z-axis, N) is the rotation matrix that rotates the z-axis to N.
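Eq. (5.1) can be sketched directly; the rotation matrix rotM(z-axis, N) is built here with Rodrigues' formula, one standard realization of the matrix referenced in [9] (our choice; the thesis implementation may differ, and we use the column-vector convention rather than the row-vector form of Eq. (5.1)):

```python
import math

def rotate_z_axis_to(n):
    # 3x3 rotation matrix taking the z-axis (0, 0, 1) onto unit vector n
    # (Rodrigues' rotation formula about axis z x n).
    ax = (-n[1], n[0], 0.0)                  # z x n
    s = math.hypot(ax[0], ax[1])             # sin of the rotation angle
    c = n[2]                                 # cos of the rotation angle (z . n)
    if s < 1e-12:                            # n (anti)parallel to z
        return [[1, 0, 0], [0, 1, 0], [0, 0, c]] if c > 0 else \
               [[1, 0, 0], [0, -1, 0], [0, 0, -1]]
    ux, uy, uz = (a / s for a in ax)
    t = 1.0 - c
    return [[c + ux*ux*t,    ux*uy*t - uz*s, ux*uz*t + uy*s],
            [uy*ux*t + uz*s, c + uy*uy*t,    uy*uz*t - ux*s],
            [uz*ux*t - uy*s, uz*uy*t + ux*s, c + uz*uz*t]]

def ellipse_vertex(a, b, alpha, center, n):
    # Eq. (5.1): one vertex of an ellipse with radii (a, b), center (C_x, C_y),
    # and surface normal n, rotated off the screen plane.
    p = (a * math.cos(alpha), b * math.sin(alpha), 0.0)
    m = rotate_z_axis_to(n)
    return (m[0][0]*p[0] + m[0][1]*p[1] + m[0][2]*p[2] + center[0],
            m[1][0]*p[0] + m[1][1]*p[1] + m[1][2]*p[2] + center[1],
            m[2][0]*p[0] + m[2][1]*p[1] + m[2][2]*p[2])
```

With N = (0, 0, 1) the rotation is the identity and the vertex stays in the screen plane, as the text notes.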
Without the rotation matrix, the vertices of the ellipse would lie on the plane of the 2D display screen, and the medial axis of the d-cylinder would be parallel to the z-axis (i.e., the vector (0, 0, 1) pointing out of the screen). We rotate the medial axis of the ellipse from vector (0, 0, 1) to the surface normal direction N for alignment. The equation of a rotation matrix that rotates one directional vector to another can be found in [9]. The estimation of the surface normal and the radius for the upper/lower ellipse of a fitting d-cylinder is discussed below. In addition, a skeleton correction process and an optional medial-axis-guided skeleton drawing tool for improving the user-drawn skeleton are also presented.

B. Surface Normal Estimation

If Seg(K_i, K_{i+1}) is nearly a straight line, the surface normal of the upper/lower ellipse can be estimated by measuring the vector from one end of the segment to the other. In other words, the surface normals of the two ellipses of d-cylinder Cylinder(K_i, K_{i+1}) can be computed as:

NormalOfEllipseAt(K_i) = V(K_i, K_{i+1}),
NormalOfEllipseAt(K_{i+1}) = V(K_i, K_{i+1}),

where NormalOfEllipseAt(K_i) is the surface normal of the K_i cross section, and V(K_i, K_{i+1}) is the vector from point K_i to K_{i+1}. However, this simple idea tends to create a 3D object composed of discontinuous d-cylinders, as shown in Fig. 5.8(a). The discontinuity is caused by the differing surface normals of two adjacent ellipses. For example, the surface normals at K_1 with respect to Cylinder(K_0, K_1) and Cylinder(K_1, K_2) are different, as shown in Fig. 5.7. Thus, the surface normal of the ellipse at K_i should be estimated from its adjacent turning points K_{i-1} and K_{i+1} instead. Mathematically, the surface normal estimation can be re-written as

NormalOfEllipseAt(K_i) = V(K_{i-1}, K_{i+1}).

Figure 5.8: Comparison of 3D objects created by two surface normal estimation methods: (a) a naive surface normal estimation method and (b) an improved surface normal estimation method.
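The naive and improved estimates differ only in which points define the direction vector; a small 2D sketch (assuming the skeleton is a list of (x, y) turning points; the names are ours):

```python
import math

def unit(vx, vy):
    n = math.hypot(vx, vy)
    return (vx / n, vy / n)

def naive_normal(skel, i):
    # V(K_i, K_{i+1}): both ellipses of Cylinder(K_i, K_{i+1}) get this normal,
    # so adjacent d-cylinders disagree at their shared turning point.
    a, b = skel[i], skel[i + 1]
    return unit(b[0] - a[0], b[1] - a[1])

def improved_normal(skel, i):
    # V(K_{i-1}, K_{i+1}) at interior points: both d-cylinders meeting at K_i
    # share one ellipse normal, so they connect smoothly (end points clamp).
    a = skel[max(i - 1, 0)]
    b = skel[min(i + 1, len(skel) - 1)]
    return unit(b[0] - a[0], b[1] - a[1])
```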
After adding the constraint that the surface normal of an ellipse should be consistent with those of its adjacent d-cylinders, we obtain an improved surface normal, and the adjacent d-cylinders are smoothly connected, as shown in Fig. 5.8(b).

C. Radius Estimation

To estimate the radius of an ellipse at a turning point, say K_i, one simple idea is to draw a line that is perpendicular to the tangent at the turning point. This line intersects the contour at two points, say (C_Ki, D_Ki), and half of the distance from C_Ki to D_Ki is taken as the radius of the ellipse. The line that is perpendicular to the tangent and passes through turning point K_i can be described by

f(P_i) = δ_x × X_i + δ_y × Y_i − (δ_x × x_0 + δ_y × y_0),   (5.2)

where P_i = (X_i, Y_i) is a contour point, K_i = (x_0, y_0) is the turning point of the skeleton, and (δ_x, δ_y) is the tangent at K_i. To find the two points where this line intersects the contour, we calculate the value of f(P_i) at each contour point. The point whose value is closest to 0 is one of the desired intersections. Then, we find another point P_j where f(P_j) is closest to 0 among all contour points on the other side of the tangent.

However, the above naive algorithm may not work well in practice for two reasons. First, user-drawn skeletons are rough: they may lean to one side of the contour or be jagged, leading to irregular tangents along the skeleton. As shown in Fig. 5.9(a), the line perpendicular to the tangent at a turning point may then intersect undesired contour points (each blue line connects a turning point to its nearest contour point). Second, for a curly contour, this line may intersect the contour more than twice, and the algorithm may return a contour point that lies on the line but is not the nearest intersection. To overcome these problems, we add several constraints. For a given intersection point P_i, we perform the following:

I.
draw a circle with radius (r − δ_r) at turning point K_i, where r is the distance between the intersection point and K_i, and δ_r is a small number that loosens this restriction, and check whether this circle is inside the contour. This constraint fixes the problem of irregular tangents along the skeleton.

II. draw a line segment from the found contour point P_i to K_i, and check whether any part of this line is outside the contour. This constraint solves the problem of the line intersecting the contour more than twice.

Figure 5.9: Results of (a) a naive radius estimation algorithm, (b) an improved radius estimation algorithm with added constraints, and (c) an even better radius approximation obtained by adding more sampling points.

The result of the improved method is shown in Fig. 5.9(b). Furthermore, we can increase the number of sampling points along the skeleton. That is, if the radius change is above a pre-selected threshold, we add one more sampling point along the skeleton and, accordingly, one more d-cylinder to approximate the segment. After that, we estimate the radius of its cross section. The result of this further improved method is shown in Fig. 5.9(c).

D. Skeleton Correction

The skeleton correction process is used to improve the quality of the user-drawn skeleton. In some cases, the approximating d-cylinder cannot fit the contour well because the user-drawn skeleton does not lie on the medial axis of the contour. As a result, parts of the d-cylinder lie outside the contour, as shown in Fig. 5.10.

Figure 5.10: When user-drawn skeleton points do not lie on the medial axis of the contour, the d-cylinder leans to one side of the contour, as shown in the body d-cylinder of the cartoon cat.

The skeleton correction process is conducted as follows.
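Both this correction and the radius estimate rely on the intersection extraction of Eq. (5.2); a compact sketch of the shared machinery (function names are ours; the contour is assumed to be a dense list of (x, y) points):

```python
import math

def cross_section(contour, k, tangent):
    # Eq. (5.2): f(P) = dx*X + dy*Y - (dx*x0 + dy*y0) vanishes on the line
    # through K_i perpendicular to the tangent direction (dx, dy).
    dx, dy = tangent
    x0, y0 = k
    f = lambda p: dx * p[0] + dy * p[1] - (dx * x0 + dy * y0)
    # Which side of the tangent line a contour point lies on.
    side = lambda p: -dy * (p[0] - x0) + dx * (p[1] - y0)
    c = min((p for p in contour if side(p) > 0), key=lambda p: abs(f(p)))
    d = min((p for p in contour if side(p) < 0), key=lambda p: abs(f(p)))
    return c, d

def correct_point(contour, k, tangent):
    # Shift the skeleton point to Middle(C_Ki, D_Ki); the ellipse radius
    # is half the distance between the two intersections.
    c, d = cross_section(contour, k, tangent)
    radius = math.dist(c, d) / 2.0
    return ((c[0] + d[0]) / 2.0, (c[1] + d[1]) / 2.0), radius
```

For a skeleton point leaning toward the top of a horizontal strip, the midpoint shift recenters it between the two contour walls.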
At each key point K_i (including turning points and the two end points of the skeleton), we extract its two intersections with the contour, C_ki and D_ki, as described before, and shift the key point to the middle of these two intersection points, Middle(C_ki, D_ki). Similarly, we adjust the skeleton points between two turning points K_i and K_{i+1} by uniformly sampling contour points between the two pairs of contour points (C_ki, C_ki+1) and (D_ki, D_ki+1), where (C_ki, D_ki) are the two intersections of K_i and (C_ki+1, D_ki+1) are the two intersections of K_{i+1}. If S_1 ∼ S_j are contour points uniformly sampled between C_ki and C_ki+1, and Q_1 ∼ Q_j are contour points uniformly sampled between D_ki and D_ki+1, the skeleton points between K_i and K_{i+1} can be shifted to the middle points Middle(S_1, Q_1), Middle(S_2, Q_2), ..., Middle(S_j, Q_j). Examples of the skeleton correction process are shown in Fig. 5.11, where the user-drawn skeleton is properly shifted and the resulting open-tube now fits the contour well.

Figure 5.11: Examples of the skeleton correction result: (a) the user-drawn contour and skeleton, (b) the skeleton points adjusted to the middle points of their two contour intersections, and (c) the approximating open-tube, which can now be well fitted in the contour.

E. Medial-Axis-Guided Skeleton Drawing

To obtain the radius more accurately, the S3D system provides the medial axis transformation [12] as an optional tool. It computes the depth map and then extracts the medial axis by connecting adjacent singularities (i.e., creases or curvature discontinuities). As shown in Fig. 5.12, the medial axis can be extracted from the depth map with all points off the skeleton suppressed to zero.

It is worthwhile to point out that the medial-axis method may not be robust for automatic skeleton extraction, since the extracted medial axis may have many branches and it is difficult to decide which branches to keep or discard.
In addition, the radius extracted from the distance transform map is not suited for d-cylinder approximation, since it can be smaller than expected. For example, the radius estimated near a bottleneck can be much smaller than the radius extracted using the tangent approach described earlier. Regardless of these shortcomings, the depth map can still be used to guide a user in drawing a skeleton along the medial axis. Thus, the S3D system offers the medial-axis method as a guiding tool.

Figure 5.12: Medial axis extraction by a distance transformation method [12].

F. Tube Caps

Finally, we improve the appearance of the open-tube by adding one cap to each end. A cap of an open-tube is constructed from three main parts: a bottom ellipse, a top-center point, and a body that connects each contour point of the bottom ellipse to the top-center point. Its bottom ellipse is the same as the one used for the fitting d-cylinder at the end point. Its top-center point is extracted by intersecting the tangent at the end of the skeleton with the contour. Its body is formed by multiple quadratic Bezier curves that connect the vertices of the ellipse and the top-center point. Each Bezier curve has two vertices of the ellipse as its two ends and the top-center point as its middle point. The two vertices, P_i and P_j, are two points selected from the ellipse such that, if we draw a line between them, the line passes through the center of the ellipse.

5.3.2 Other Modeling Elements

A. Closed-Tube

If one contour is completely inside the other and a skeleton lies between them, a donut-like closed-tube is employed for the mapping, as shown in Fig. 5.6(b). The generation of the closed-tube is similar to that of the open-tube: the skeleton is first divided into multiple segments at turning points, and each segment is mapped to a fitting d-cylinder.
The difference between an open-tube and a closed-tube is that the open-tube has two caps attached to its ends, while the two ends of a closed-tube are connected to each other.

Figure 5.13: (a) A 3D ellipsoid (right) created from an elliptical contour (left), where the 3D ellipsoid is composed of multiple parallel d-cylinders with different heights, and (b) an ellipsoid (right) created using the open-tube element with both a contour and a skeleton (left).

B. Ellipsoid

If a user wants to draw an ellipsoid object, he/she can draw a 2D elliptical contour with the ellipsoid element to create a perfect 3D ellipsoid (or sphere) that matches this 2D contour. Although an ellipsoid can also be created with the open-tube element, the ellipsoid element is more direct, as shown in Fig. 5.13. The 3D ellipsoid is constructed from multiple parallel d-cylinders with different heights. Assume that the size of the minimum bounding box of the user-drawn elliptical contour is (w, h). Since only one radius and a height can be specified by the 2D elliptical contour, the radii of the ellipse in the middle of the ellipsoid are set to (w/2, min{w/2, h/2}), where min{w/2, h/2} is the smaller of the two numbers. The radii of the ellipses at the top/bottom of the contour are 0, and the radii of the ellipses in between are estimated as a function of the height. Finally, although drawing a skeleton is not required for creating an ellipsoid, the S3D system creates a skeleton automatically along the y-axis for control purposes.

C. Prism

Figure 5.14: An example of the prism: (a) the user-drawn 2D contour, (b) specifying the thickness by drag-and-drop, and (c) the fitting prism created according to the 2D contour and the user-specified thickness.

If a contour does not contain any other skeleton or contour, the prism element is used for the modeling. The user can specify the thickness of a prism by drag-and-dropping the contour, as shown in Fig. 5.14.
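A minimal sketch of turning a contour plus a user-specified thickness into prism geometry (the vertex layout and names are our assumptions, not the S3D data structures):

```python
def build_prism(contour, thickness):
    # Top polygon: the user-drawn contour in the screen plane (z = 0).
    top = [(x, y, 0.0) for x, y in contour]
    # Bottom polygon: the contour translated along the z-axis by the
    # user-specified thickness (towards the display screen).
    bottom = [(x, y, -thickness) for x, y in contour]
    # Side faces: one quadrangle per contour edge, connecting top and bottom.
    n = len(contour)
    quads = [(top[i], top[(i + 1) % n], bottom[(i + 1) % n], bottom[i])
             for i in range(n)]
    return top, bottom, quads
```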
A prism is constructed from a top polygon, a bottom polygon, and multiple quadrangles connecting the top and bottom faces. The top polygon is created according to the user-drawn contour. The bottom polygon is then translated along the z-axis (towards the display screen) by a distance equal to the user-specified value or a default value.

D. Complex-Prism

The way to draw a complex-prism is similar to that of drawing a closed-tube, but without a skeleton. If a user draws a contour inside another contour, a complex-prism is created based on these two contours, as shown in Figs. 5.15 and 5.16. The complex-prism can create an arbitrary 3D solid shape with an inner hole of arbitrary shape.

A complex-prism is composed of two prisms. To create a modeling complex-prism, both contours are divided into two halves, as shown in Fig. 5.15(b). To divide the contours, we first select two points on the outer contour: a random point V_out1, and the point V_out2 that is farthest from V_out1. Then, we select the two points on the inner contour, V_in1 and V_in2, that are nearest to V_out1 and V_out2, respectively. The contours can then be divided by the two line segments (V_out1, V_in1) and (V_out2, V_in2).

Figure 5.15: (a) The outer contour and inner contour drawn by the user, (b) the contours divided into two halves, each of which is mapped to a prism, and (c) the resulting complex-prism.

Figure 5.16: The thickness of the complex-prism can be specified by drag-and-dropping the contour.

5.4 Flexible Object Grouping and Manipulation

In the proposed S3D system, a user can create a 3D model composed of multiple parts approximated by the modeling elements. For example, a human model can be constructed from an ellipsoid for the head and multiple open-tubes for the torso and limbs. The S3D system allows a user to adjust each of them individually and group them together. It also allows a user to select an individual part and move it to a proper position or orientation.
After all parts are settled in proper positions/orientations, the user can group them by manually attaching a selected part to another selected part. The part attached to the other part is a child. A parent can have multiple children, but a child has only one parent. Once grouped, all child parts in the group move along with their parent; however, each child can still rotate individually. For example, a human body model can have four limbs as its children, with the body as their parent: all limbs move with the body, while each limb can still rotate individually. Moreover, each child is constrained to rotate about its own root, at which the child is attached to its parent. For example, a human arm can only rotate about the shoulder joint.

Figure 5.17: An example of a human model composed of multiple parts managed in a tree structure, where a part attached to another part is a child node of the attached node in the tree graph.

To manage the relations among all grouped parts, a tree graph is constructed. One example is given in Fig. 5.17, where the left/right ears are attached to the head, so they are child nodes of the head. In addition, to rotate an individual part efficiently, the coordinates of every 3D object are stored relative to the root of that object. Thus, we can rotate each object easily with respect to its root position. For example, an arm has its root at the shoulder joint and rotates about this joint. By default, the root of a controllable part is the first point of its skeleton, while the root of a non-controllable part is the first point of its contour. The controllable parts include the open-tube, the closed-tube, and the ellipsoid, which are elements with embedded skeletons. The non-controllable parts include the prism and the complex-prism, which are elements with no embedded skeletons. However, sometimes an object may be attached to another object at the other end, away from its root.
Then, this object may not rotate properly with the default root. To address this issue, the S3D system provides a root-switch tool that allows the root to be changed to the other end of the object (i.e., the last point of the skeleton, or the point farthest from the current root) for controllable elements. When the root is changed, all shape vertices need to be re-expressed in coordinates relative to the new root as well.

The S3D system also allows a user to select a 3D object and remove it, duplicate it, or change its color. When a 3D object is deleted, its associated tree node is removed from the tree graph, and all its child nodes are disconnected from the tree. When a selected 3D object is duplicated, all the information of its associated node is copied to the duplicate except its parent and children; initially, the duplicated node has no parent or child node. The duplicate tool is provided for users who want to create a symmetric object, as shown in Fig. 5.18. Based on our experience, the delete and duplicate operations are useful for removing a poorly-created object and copying a well-created one.

Figure 5.18: An example of duplicating objects, where the selected item is highlighted in red and the gray items are duplicated from the selected one.

Finally, one objective of the proposed S3D system is to create 3D objects for animation, and the structural skeleton of a 3D object can be used for this purpose. For a 3D model constructed with multiple parts, the skeleton of each part is embedded according to the user-drawn skeleton. The user can also specify the hierarchical relationship among the skeletons of these connected parts. The open-tube, the closed-tube, and the ellipsoid are controllable with their skeletons.
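The grouping tree and the root-relative bookkeeping can be sketched as follows (a minimal 2D illustration; class and method names are ours, not the S3D implementation):

```python
class Part:
    # One modeling element: a child moves with its parent but keeps
    # its own rotation about its root.
    def __init__(self, name, skeleton):
        self.name = name
        self.parent = None
        self.children = []
        # Coordinates are stored relative to the root (the first skeleton point).
        root = skeleton[0]
        self.skeleton = [(x - root[0], y - root[1]) for x, y in skeleton]

    def attach_to(self, parent):
        # A child has exactly one parent; a parent may have many children.
        if self.parent is not None:
            self.parent.children.remove(self)
        self.parent = parent
        parent.children.append(self)

    def switch_root(self):
        # Root-switch tool: the last skeleton point becomes the new root,
        # and all coordinates are re-expressed relative to it.
        new_root = self.skeleton[-1]
        self.skeleton = [(x - new_root[0], y - new_root[1])
                         for x, y in reversed(self.skeleton)]

    def subtree(self):
        # Moving or deleting a part affects its whole subtree.
        yield self
        for c in self.children:
            yield from c.subtree()
```

For example, attaching an arm to a body makes the arm follow every body translation, while switching the arm's root moves its pivot to the far end of its skeleton.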
With these embedded hierarchical skeletons, a 3D model created by the S3D system can be further animated by applying existing skeletal animation technologies [52], [30], [33], [51]. However, the other two elements (i.e., the prism and the complex-prism) do not have embedded skeletons and cannot be animated.

5.5 Applications and User Evaluation

The system is implemented in Java3D and can run as a web-based or cross-platform application. We generated a variety of 3D models to demonstrate its functionality. Fig. 5.19 shows two examples of creating 3D models with the S3D system. Two 2D cartoon characters collected from the web are given in Fig. 5.19(a). The 2D characters are then loaded into the interface of the S3D system so that a user can draw contours, as shown in Fig. 5.19(b). Skeletons are drawn in Fig. 5.19(c); the blue lines (in one of the examples) illustrate the estimated cross sections along the skeleton, which are used for adjusting the size of the approximating d-cylinders. Finally, the resultant 3D models created by the S3D system are shown in Fig. 5.19(d).

Figure 5.19: Examples of 3D model creation: (a) 2D cartoon characters, (b) user-drawn contours based on the 2D characters, (c) user-drawn skeletons, where the blue lines (in the bottom example) illustrate the estimated cross sections along the skeleton, used for adjusting the size of the approximating d-cylinders, and (d) the resultant 3D models.

Fig. 5.20 shows more examples of 2D reference images and their resultant 3D models created with the S3D system. The S3D system can create not only tubular objects but also man-made objects with sharp angles and multiple corners, such as chairs, cars, and swords, which can be created easily with the prism and the complex-prism. The average time for creating a 3D model as shown in Figs. 5.19 and 5.20 was about 20 minutes. Moreover, since the resultant 3D model is embedded with a skeleton, one can perform animation accordingly.
This will be an interesting item for future work. One simple example is shown in Fig. 5.21, where the 3D object can take different poses by manipulating the skeleton.

Figure 5.20: More examples of reference images and the resultant 3D models.

Figure 5.21: Illustration of the object animation idea: (a) a 2D cartoon character, (b) the resulting 3D model, and (c)-(g) different gestures of the 3D model.

We conducted a user study of the S3D system with 6 subjects, none of whom had a computer science background or any 3D modeling experience. Each subject was assigned two 2D images and asked to create their 3D models. Before the actual test, there was a training session: we spent 10 minutes giving simple instructions and then asked each subject to create the five elements shown in Fig. 5.22. The completion time can be viewed as the required training time; it was about 12 minutes on average.

Figure 5.22: Before the actual test, each subject was trained to create these five 3D elements based on their 2D images.

After that, each subject was asked to create a 3D character using one of the 2D images shown in Fig. 5.19. This was used to check the ease of use and the average modeling time for a simple character. The subject drew the 2D sketch along the contour of the 2D character in the image and created a 3D character. Subjects who were more familiar with the mouse/track pad completed the test in 10 minutes, while others took longer (about 24 minutes); the average completion time was about 18 minutes. Overall, users were satisfied and excited with the proposed S3D system and felt that it was very easy to learn and use. On the other hand, users complained about the difficulty of using the input device: it is hard to draw a good contour with a mouse or a touch pad when the object is small. Although this problem is not caused by the S3D system itself, a solution should be provided to improve it in the future.
5.6 Conclusion and Future Work

We developed a 3D modeling system, called S3D, which allows a user to create 3D models by drawing 2D sketches. The user-friendly interface is intuitive and easy to use, and a user can learn the system and create a 3D model in a short period of time. The five modeling elements and their flexible combinations facilitate the creation of 3D models: not only tubular objects but also man-made objects of irregular shape can be created easily. In addition, many practical issues, such as rough user-drawn skeletons, were identified and solved by the sketch refinement process. Moreover, 3D objects created with controllable elements are embedded with a skeleton, and users can define the skeleton structure by attaching one object to another. A tree graph of skeletons can be created and used to animate the resultant 3D model.

There are several limitations of the current S3D system. First, it can only create a shape that fits well with the pre-defined elements. For example, a half eggshell cannot be easily generated; new elements have to be defined to cover more flexible objects. Second, since the created 3D model is composed of multiple parts approximated by 3D elements, the surface appearance of the 3D model is unsmooth around the joints of different parts. Third, the animation module of the S3D system is still under development. We are currently developing an easy-to-use animation system that can animate the 3D model by drawing the 2D sketch of its skeleton: the user draws a sequence of key frames of the skeleton with multiple postures, and the animation is generated by interpolating the different postures specified by the key frames.

Chapter 6

Conclusion and Future Work

6.1 Summary of Current Research

In this research, we proposed a novel feature-preserving 3D thumbnail system for efficiently browsing multiple 3D objects in a large database, based on two different techniques.
The first technique, called the surface-based technique, was developed in Chapter 3 to extract the skeleton and the body measurements of each model, which together form the shape descriptor. The shape descriptors and the thumbnail descriptors are generated for each model offline, and the system can render the pre-generated thumbnails online efficiently. To simplify a 3D model, a surface-based mesh decomposition approach is adopted to identify the significant parts of the model, and each part is approximated with primitives individually. Therefore, significant components of the original model are well preserved in the 3D thumbnail even when the model is extremely simplified. Moreover, a customized deformable primitive, the d-cylinder, was used to better approximate the shape and fine-tune the appearance of the resultant thumbnail. As a result, the data size of the thumbnail descriptor is much smaller than that of the original mesh and can be downloaded quickly. Rendering a simplified thumbnail demands fewer hardware resources, and the online thumbnail viewer can display multiple 3D thumbnails simultaneously within seconds. The limitation of the proposed system is that it was built upon the mesh decomposition work of Lin et al. [34]. Thus, when a 3D model has a complex topology and cannot be decomposed well by their approach, the extracted shape descriptor cannot represent the shape accurately, leading to an unpleasant thumbnail result.

The second technique, called the voxel-based technique, was proposed in Chapter 4, where a two-phase shape decomposition process was investigated. The polygonal model is first rasterized into a volumetric model, and a coarse skeleton is extracted using the thinning operation. With the proposed shape decomposition approach, the defects of the skeleton derived from thinning are refined to meet our requirements; subsequently, the skeleton is classified into significant groups, and the volumetric model is decomposed into significant parts accordingly.
To further improve the current results, we will explore more advanced skeletonization technologies for extracting finer skeletons from volumetric models. Moreover, more types of primitives will be evaluated for primitive approximation, and the textures of a model will also be considered.

Compared with the surface-based technique, the voxel-based technique preserves more features of the model and decomposes the model more precisely. The significant components of the original model are better preserved in the 3D thumbnails when the model is extremely simplified. The limitation of the proposed system is that, when the skeleton of a 3D model extracted by the thinning process cannot represent the structure of the shape correctly, the shape decomposition result is affected.

Finally, we developed a 3D modeling system, called S3D, in Chapter 5. The S3D system allows a user to create 3D models by drawing 2D sketches. The user-friendly interface is intuitive and easy to use, and a user can learn the system and create a 3D model in a short period of time. The five modeling elements and their flexible combinations facilitate the creation of 3D models: not only tubular objects but also man-made objects of irregular shape can be created easily. In addition, several practical issues, such as rough user-drawn skeletons, were discovered and addressed by our sketch refinement process. 3D objects created with controllable elements are embedded with a skeleton, and users can define the structure of the skeletons by attaching one object to another. A tree graph of skeletons can be created so as to animate the resultant 3D model.

6.2 Future Research Topics

To make this research more complete, we would like to extend the current results along the following directions.

I. Visually-pleasing primitive approximation and 2D sketch animation

There are several limitations of the current S3D system. First, it can only create a shape that fits well with the pre-defined elements.
New elements have to be defined to cover more flexible objects. Second, since the created 3D model is composed of multiple parts approximated by 3D elements, the surface appearance of the 3D model is unsmooth around the joints of different parts. Third, the animation module of the S3D system is still under development. We are currently developing an easy-to-use animation system that can animate the 3D model by drawing the 2D sketch of its skeleton: the user draws a sequence of key frames of the skeleton with multiple postures, and the animation is generated by interpolating the different postures specified by the key frames.

II. Automatic 3D retargeting

Over the past few years, mobile computing devices (e.g., mobile phones, handheld game consoles, laptops, etc.) have become more and more popular. Handheld devices are equipped with increasing processor speeds, advanced computer graphics capabilities, and integrated wireless technologies, and people have started to use 3D models on multiple devices. However, different devices have different memory capacities and processing power, and a 3D model can only be applied to data sets that do not exceed the size of the main memory. In addition, the display resolution may vary. A scalable 3D retargeting technique is needed to support devices of different specifications, such as display resolution, computational power, or interactivity. Existing 3D model simplification algorithms scale proportionally and fail to preserve the important geometrical characteristics of the model at low resolution. 2D image retargeting techniques have been studied for years; the trend is to collect visually salient information to identify foreground objects to be preserved during the scaling operation. However, 3D retargeting is still an open problem. The simplest approach is to scale the model globally along some direction so that it is resized to a different aspect ratio.
However, this leads to unwanted distortions of significant features.

We attempt to develop a content-aware 3D retargeting approach, where 3D models are automatically adjusted according to the resolution and the aspect ratio of the display screen. For example, a simplified character containing less detail is used when a user switches from playing on a computer to a handheld device. When a user holds a game device and changes its orientation from vertical to horizontal, a portrait scene can be retargeted to a landscape scene automatically without losing important features. With the proposed approach, a 3D model can be converted into a flexible multi-resolution representation with important features preserved. Even at the most simplified level, the major visual characteristics of the original model are still well preserved.

To preserve meaningful features while retargeting a 3D model to different aspect ratios, a mesh decomposition algorithm and a 3D shape retargeting algorithm are needed. Each model is first decomposed into several meaningful parts, which are adjusted individually according to the changed aspect ratio. For example, a human model is decomposed into parts such as the head, neck, body, and limbs, so that the head can be adjusted without modifying other parts. Although existing work can decompose simple models successfully, it fails when applied to a model with a complex topology; automatic, robust, and generic mesh decomposition is a challenging task. The 3D shape retargeting algorithm should be able to adjust the decomposed components of the 3D model with different ratios based on a subjective rule or the user's preference.

III. Motion retargeting and animation

In order to animate a 3D model, its control skeleton needs to be extracted accurately. Although some existing research has demonstrated successful decomposition and skeleton extraction for some simple models, these approaches fail for a model with a complex topology.
The challenge of the second goal is how to decompose a model and extract its skeleton correctly and efficiently for various models. We attempt to enhance the proposed skeleton extraction process so that it is robust with respect to different types of models. The data of the control skeleton should be small enough to be transmitted efficiently. Furthermore, since it is difficult to generate realistic motion from scratch, we often adopt existing captured motion and apply it to the target model. If the skeleton of the target model does not match the skeleton of the subject performing the captured motion, we have to resolve the discrepancy between them. How do we retarget the motion when the skeleton of the model does not match that of the captured subject? How do we store a rich motion database most efficiently in limited memory? One idea is given below. After the control points are matched between the captured subject and the display model, we estimate the proportional scaling factor of each bone, adjust the mocap data accordingly, and apply it to the display skeleton. For models containing extra limbs that are not part of the captured subject, the user can specify the physical properties of the extra limbs so that their motion can be adjusted accordingly.
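The per-bone scaling idea above can be sketched as follows. This is a hedged illustration, not the actual implementation: the offset-based skeleton representation, the function names, and the example bones are all hypothetical. Each bone's scaling factor is estimated as the ratio of target to source bone length, and the captured per-frame offsets are adjusted by that factor.

```python
import math

def bone_scales(src_offsets, tgt_offsets):
    """Per-bone scaling factor: target bone length / source bone length."""
    def length(v):
        return math.sqrt(sum(c * c for c in v))
    return {b: length(tgt_offsets[b]) / length(src_offsets[b])
            for b in src_offsets}

def retarget_frame(frame_offsets, scales):
    """Scale each bone's captured offset so limb lengths fit the target."""
    return {b: tuple(c * scales[b] for c in v)
            for b, v in frame_offsets.items()}

# The captured subject has a 1-unit forearm; the display model's is 2 units.
src = {"forearm": (1.0, 0.0, 0.0), "upperarm": (0.0, 1.0, 0.0)}
tgt = {"forearm": (2.0, 0.0, 0.0), "upperarm": (0.0, 1.0, 0.0)}
s = bone_scales(src, tgt)                    # forearm scale = 2.0
frame = {"forearm": (0.6, 0.8, 0.0), "upperarm": (0.0, 1.0, 0.0)}
print(retarget_frame(frame, s)["forearm"])   # forearm offset doubled
```

Scaling offsets rather than joint positions preserves the captured rotations: the limb sweeps through the same angles but with the target model's proportions, which is the intent behind estimating a proportional scaling factor per bone.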
Asset Metadata
Creator: Chiang, Pei-Ying (author)
Core Title: Feature-preserving simplification and sketch-based creation of 3D models
School: Viterbi School of Engineering
Degree: Doctor of Philosophy
Degree Program: Computer Science
Publication Date: 06/15/2011
Defense Date: 06/15/2011
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: OAI-PMH Harvest, primitive approximation, skeleton extraction, skeletonization, sketch-based 3D modeling, volumetric shape representation, voxel-based shape decomposition
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Kuo, C.-C. Jay (committee chair), Jenkins, B. Keith (committee member), Nakano, Aiichiro (committee member)
Creator Email: nacooooo@gmail.com, peiyingc@usc.edu
Permanent Link (DOI): https://doi.org/10.25549/usctheses-c127-616942
Unique Identifier: UC1359698
Identifier: usctheses-c127-616942 (legacy record id)
Legacy Identifier: etd-ChiangPeiY-26.pdf
Dmrecord: 616942
Document Type: Dissertation
Rights: Chiang, Pei-Ying
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email: cisadmin@lib.usc.edu