CRITICALLY SAMPLED WAVELET FILTERBANKS ON GRAPHS

by

Sunil K. Narang

A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(ELECTRICAL ENGINEERING)

August 2012

Copyright 2012 Sunil K. Narang

Dedication

To my teachers who taught me to be honest and hardworking.

Acknowledgements

It is a pleasure to thank those who made this dissertation possible: my teachers, collaborators, family and friends. First and foremost, I would like to thank my advisor, Professor Antonio Ortega, whose expertise, continuous guidance, and understanding helped me evolve from a naive undergraduate student into a practitioner of critical thinking. The best thing I liked about Antonio is that he gave me the freedom to explore on my own, and at the same time the guidance to remain focused on the big picture and not get lost in minor details. I could not have wished for a better or friendlier supervisor.

I would also like to thank Professor Bhaskar Krishnamachari and Professor Yan Liu for serving on my dissertation committee, as well as Professor C.-C. Jay Kuo and Professor Srikanth Narayanan for being members of my qualifying exam committee. It is a privilege to have their advice on my work. I am also indebted to my colleagues in the Compression Research Group, who taught me many things through various interactions and discussions. Especially, I owe my gratitude to Dr. Godwin Shen, from whom I learnt so much about distributed compression, and whose PhD research became the foundation and motivation for my own research. Your friendship both within and outside of our research world has been a great source of strength for me. Special thanks is also due to Dr. Woo-Shik Kim, with whom I had many memorable collaborations and who first introduced me to Korean cuisine. I would also like to acknowledge the inputs and collaborations of Sungwon Lee, Yenting (Greg) Lin, Yung-Hsuan (Jessie) Chao, and Akshay Gadde, without which this thesis would have been incomplete. Furthermore, I owe special thanks to Professor Urbashi Mitra, Professor Bhaskar Krishnamachari and Dr. Marco Levorato, for giving me a wonderful opportunity to collaborate with them. The time spent in our joint efforts has truly enriched my experience at USC. I would also like to thank Dr. Grzegorz M. Swirszcz and Dr. Tomasz Nowicki from the IBM Watson research lab, for our discussions on distributed compression during my summer internship in New York. I am also grateful to Xavier Perez Trufero from Polytechnic University of Catalonia and Apostol T. Gjika from Politecnico Di Torino, for all the wonderful discussions and collaborations.

Finally, and most importantly, I would like to thank my wife Shanu, whose encouragement, quiet patience and unwavering love have been the driving force throughout my PhD, and before. A big hug and a heartfelt "thanks" to you. I also thank my dad, Rambeer Singh, for his faith in me and for allowing and supporting me in every way to be as ambitious as I wanted.

Table of Contents

Dedication
Acknowledgements
List of Tables
List of Figures
Abstract
Chapter 1: Introduction
1.1 Motivation
1.2 Background
1.3 Contributions
1.3.1 Sampling operations in graphs
1.3.2 Two-channel wavelet filterbanks on bipartite graphs
1.3.3 Bipartite subgraph decomposition
1.4 Thesis Statement and Research Questions
1.5 Publications
1.6 Summary
Chapter 2: Basic Theory
2.1 Spatial Representation of Graph Signals
2.2 Spectral Representation of Graph Signals
2.3 Downsampling in Graphs
2.4 Two-Channel Filterbanks on Graphs
2.5 Literature Review
2.5.1 Spatial Designs
2.5.1.1 Random transforms
2.5.1.2 Graph wavelets
2.5.1.3 Lifting wavelet transforms
2.5.2 Spectral Designs
2.5.2.1 Diffusion wavelets
2.5.2.2 Spectral graph wavelets
2.6 Summary
Chapter 3: Lifting wavelet filterbanks on graphs
3.1 Problem Formulation
3.2 Maximum Bipartite Subgraph Approximation
3.2.1 Example: graph denoising
3.3 Dominating Set Approximation
3.3.1 Example: data gathering in WSN
3.4 Summary
Chapter 4: Downsampling in Graphs using Spectral Theory
4.1 Problem Formulation
4.2 Downsampling in kRBG graphs
4.3 Extension to non-regular bipartite graphs
4.4 Example: Images as kRBG
4.5 Summary
Chapter 5: Two-channel Wavelet Filterbanks on Bipartite Graphs
5.1 Problem Formulation
5.2 Two-Channel Filterbank Conditions for Bipartite Graphs
5.2.1 Aliasing cancellation
5.2.2 Perfect reconstruction
5.2.3 Orthogonality
5.3 Graph-QMF Filterbanks
5.3.1 Chebychev polynomial approximation
5.4 One-hop Localized Spectral Filterbanks
5.4.1 One-hop localized designs for arbitrary graphs
5.4.2 One-hop localized designs for bipartite graphs
5.5 Graph-Bior Filterbanks
5.5.1 Designing half-band kernel p(λ)
5.5.2 Spectral factorization of half-band kernel p(λ)
5.5.3 Nomenclature and design of graph-Bior filterbanks
5.6 Filterbank designs using asymmetric Laplacian matrix
5.6.1 Perfect Reconstruction
5.6.2 Orthogonality
5.7 Summary
Chapter 6: Separable Multi-dimensional Wavelet Filterbanks on Graphs
6.1 Proposed Design
6.1.1 Graph after downsampling
6.2 Bipartite Subgraph Decomposition
6.2.1 Harary's decomposition algorithm
6.2.2 Min-cut weighted max-cut (MCWMC) algorithm
6.3 Experiments
6.4 Summary
Chapter 7: Examples and Applications of Graph Wavelet Filterbanks
7.1 Multi-resolution Decomposition of Graphs
7.1.1 Bipartite subgraph decomposition
7.1.2 Spectral wavelet filterbank implementation
7.2 Edge Aware Image Processing
7.2.1 Graph representation of images
7.2.2 Graph Filter-banks on Images
7.2.3 Edge-aware graph representations
7.2.4 Downsampling image graphs
7.3 Experiments
7.3.1 Image non-linear approximation
7.4 Summary
Chapter 8: Conclusions and Future Work
8.1 Main Contributions
8.2 Future Work
References

List of Tables

2.1 Evaluation of graph wavelet transforms. CS: Critical Sampling, PR: Perfect Reconstruction, Comp: compact support, OE: Orthogonal Expansion, GS: Requires Graph Simplification.
5.1 Comparison between graph-QMF filterbanks and graph-Bior filterbanks
5.2 Polynomial expansion coefficients (highest degree first) of graphBior(k0, k1) filters (approximated to 4 decimal places) on a bipartite graph.
5.3 Comparison of proposed two-channel filterbank designs on bipartite graphs. DC: subspace corresponding to lowest eigenvalue, CS: Critical Sampling, PR: Perfect Reconstruction, Comp: compact support, OE: Orthogonal Expansion
6.1 Comparison of bipartite subgraph decomposition schemes
8.1 Evaluation of graph wavelet transforms. CS: Critical Sampling, PR: Perfect Reconstruction, Comp: compact support, OE: Orthogonal Expansion, GS: Requires Graph Simplification.

List of Figures

1.1 Lifting Scheme: Downsampling followed by filtering
1.2 Spectral Scheme: Filtering followed by downsampling
2.1 Block diagram of a two-channel wavelet filterbank on a graph.
2.2 Block diagram of two-channel lifting wavelet filter-banks
3.1 Even-odd assignment in routing trees designed in [38]. The dashed lines show the edges not used by the transform though they are within radio range.
3.2 Even-odd assignment on the Zachary Karate data [50] using the CFP algorithm.
3.3 (a) Similarity graph with 200 sampled points from the underlying distribution. The nodes in the shaded region are N(μ1, σ²) and the nodes in the white region are N(μ2, σ²). (b)-(f) Voronoi plots
3.4 STD of the original and denoised samples
3.5 PSNR of the original and denoised samples
3.6 Cost comparison of different lifting schemes (1: Haar-like lifting transform with first level of even/odd split on trees, 2: with 3 levels of even/odd split on trees, 3: proposed unweighted set-cover based even/odd split on graph, 4: proposed weighted set-cover based even/odd split on graph).
3.7 Number of raw data transmissions taking place in transform computations of different lifting schemes. The numbers are averages over Ns = 10 realizations of each graph size.
3.8 Transform definition on SPT and on graph. Circles denote even nodes and x's denote odd nodes. The sink is shown in the center as a square. Solid lines represent forwarding links. Dashed lines denote broadcast links.
3.9 Performance comparisons.
4.1 Block diagram of DU operations in graphs
4.2 Graph formulations of a 2D image
4.3 Fourier frequency responses of ideal spectral filters H0_ideal.
5.1 (a) Ideal kernel (blue) vs. Meyer's wavelet kernel (red). It can be seen that Meyer's wavelet has a smoother transition at λ = 1 than the ideal kernel. (b)-(f) The reconstruction error magnitudes between the original kernels and their polynomial approximations of order 2, 4, 6, 8 and 10 respectively: ideal kernel (blue curves) and Meyer's kernel (red curve).
5.2 Proposed 1-hop spectral kernels for bipartite graphs.
5.3 The spectral distribution of p(λ) with K zeros at λ = 0
5.4 Spectral responses of graphBior(k0, k1) filters on a bipartite graph. In each plot, h0(λ) and h1(λ) are low-pass and high-pass analysis kernels, C(λ) and D(λ) constitute the spectral response of the overall analysis filter Ta, as in (5.65). For near-orthogonality D(λ) ≈ 0 and C(λ) ≈ 1. Finally, (p(λ) + p(2-λ))/2 represents the perfect reconstruction property as in (5.51), and should be constant and equal to 1 for perfect reconstruction.
6.1 Block diagram of a 2D separable two-channel filterbank: the graph G is first decomposed into two bipartite subgraphs B1 and B2, using the proposed decomposition scheme. By construction B2 is composed of two disjoint graphs B2(L) and B2(H), each of which is processed independently by one of the two filterbanks at the second stage. The 4 sets of output transform coefficients, denoted yHH, yHL, yLH and yLL, are stored at disjoint sets of nodes.
6.2 Example of 2-dimensional separable downsampling on a graph: (a) original graph G, (b) the first bipartite graph B1 = (L1, H1, E1), containing all the links in G between sets L1 and H1, (c) the second bipartite graph B2 = (L2, H2, E2), containing all the links in G - B1 between sets L2 and H2.
6.3 Example of a bipartite graph-cut
6.4 Histogram of absolute difference in similarity between node-pairs in two bipartite subgraphs
7.1 (a) The Minnesota traffic graph G, and (b) the graph-signal to be analyzed. The colors of the nodes represent the sample values. (c)(d) Bipartite decomposition of G into two bipartite subgraphs using Harary's decomposition.
7.2 Output coefficients of the proposed graph-QMF filterbanks with parameter m = 24. The node-color reflects the value of the coefficients at that point. Top-left: LL channel wavelet coefficients, top-right: absolute value of LH channel wavelet coefficients, and bottom-right: absolute value of HH channel wavelet coefficients
7.3 Reconstructed graph-signals from the graph-QMF wavelet coefficients of individual channels. As before, the node-color reflects the value of the coefficients at that node. Top-left: reconstruction from LL channel only, top-right: reconstruction from LH channel only, and bottom-right: reconstruction from HH channel only. Since the HL channel is empty, the reconstruction is an all-zero signal (bottom-left figure). The reconstruction SNR of the sum of all four channels is 50.2 dB.
7.4 Output coefficients of the graph-Bior filterbanks with parameter (k0, k1) = (7, 7). The node-color reflects the value of the coefficients at that point. Top-left: LL channel wavelet coefficients, top-right: absolute value of LH channel wavelet coefficients, and bottom-right: absolute value of HH channel wavelet coefficients
7.5 Reconstructed graph-signals from the graph-Bior wavelet coefficients of individual channels. As before, the node-color reflects the value of the coefficients at that node. Top-left: reconstruction from LL channel only, top-right: reconstruction from LH channel only, and bottom-right: reconstruction from HH channel only.
Since the HL channel is empty, the reconstruction is an all-zero signal (bottom-left figure). The reconstruction SNR of the sum of the four channels is 168.57 dB.
7.6 Two-dimensional decomposition of an 8-connected image-graph
7.7 Separable two-dimensional two-channel graph filterbank on a toy image with both rectangular and diagonal edges.
7.8 Reconstruction of the binary image shown in Figure 7.7, using only 4th-level LL-channel wavelet coefficients, using (a) 2-D separable CDF 9/7 filterbanks, (b) proposed graph-QMF filterbanks with filter length (m = 28), and (c) proposed graph-Bior filterbanks with filter length (k0 = 20, k1 = 21).
7.9 Example demonstrating the importance of the edge-weighted graph formulation of images: (a) input image, (b) edge information of the image and a highlighted pixel v, (c) unweighted 8-connected image-graph formulation, (d) edgemap-weighted 8-connected image-graph formulation
7.10 (a) HH wavelet filter (dB scale) on the pixel v on the unweighted graph, (b) HH wavelet filter (dB scale) on the pixel v on the weighted graph, (c) undecimated HH band coefficients using the unweighted graph, and (d) undecimated HH band coefficients using the edge-weighted graph.
7.11 The weighted graphs computed for the Lena image, in 4 levels of decomposition
7.12 Performance comparison: non-linear approximation
7.13 Reconstruction of "Lena.png" (512 x 512) from 1% detail coefficients

Abstract

Emerging data mining applications will have to operate on datasets defined on graphs. Examples of such datasets include online document networks, social networks, and transportation networks. The data on these graphs can be visualized as a finite collection of samples, a graph-signal, which can be defined as the information attached to each node (scalar or vector values mapped to the set of vertices/edges) of the graph. Major challenges are posed by the size of these datasets, which makes it difficult to visualize, process, analyze and act on the information available. Wavelets have been popular for traditional signal processing problems (e.g., compression, segmentation, denoising) because they allow signal representations where a variety of trade-offs between spatial (or temporal) resolution and frequency resolution can be achieved. In this research, we seek to leverage novel basic wavelet techniques for graph data, and apply them to realistic information analytics problems. The primary contribution of this thesis is the design of critically sampled wavelet filterbanks on graphs, which provide a local analysis of the graph (localized within a few hops of a target node), while capturing spectral/frequency information of the graph-signals. The graphs in our study are simple undirected graphs. We first design "one-dimensional" two-channel filterbanks on bipartite graphs, and then extend them to arbitrary graphs. The filterbanks come in two flavors, depending upon the chosen downsampling method: i) lifting wavelet filterbanks and ii) spectral wavelet filterbanks. For bipartite graphs we define a spectral folding phenomenon, analogous to aliasing in regular signals, that helps us define filterbank constraints in simple terms.
For arbitrary graphs we propose two choices: a) to approximate the graph as a single bipartite graph and apply "one-dimensional" filterbanks, or b) to decompose the graph into multiple bipartite subgraphs and apply "multi-dimensional" filterbanks. All our proposed filterbank designs are critically sampled and perfect reconstruction. To the best of our knowledge, no such filterbanks have been proposed before. The tools proposed in this thesis make it possible to develop i) multiresolution representations of graphs, ii) edge-aware processing of regular signals, iii) anomaly detection in datasets, and iv) sampling of large networks.

Chapter 1

Introduction

1.1 Motivation

Our work is focused on constructing linear wavelet-like transforms for functions defined on the vertices of an arbitrary finite weighted graph. Graphs provide a very flexible model for representing data in many domains. Many networks, such as biological networks [48], social networks [7, 13] and sensor networks [38, 47], have a natural interpretation in terms of finite graphs with vertices as data sources and links established based on connectivity, similarity, ties, etc. The data on these graphs can be visualized as a finite collection of samples, which we term graph-signals. For example, graphs can be used to represent irregularly sampled datasets in Euclidean spaces, such as regular grids with missing samples. In many machine learning applications, multi-dimensional datasets can be represented as point-clouds of vectors, with links established between data sources based on the distance between their feature vectors. In computer vision, meshes are polygon graphs in 2D/3D space, and the attributes of the sampled points (coordinates, intensity, etc.) constitute the graph-signals. The graph-signal formulation can also be used to solve systems of partial differential equations using finite element analysis (grid-based solutions). The sizes (number of nodes) of the graphs in these applications can be very large, which presents computational and technical challenges for storage, analysis, etc. In some other applications, such as wireless sensor networks, data exchanges between far-off nodes can be expensive (bandwidth, latency, and energy constraints). Therefore, instead of operating on the original graph, it would be desirable to find and operate on smaller graphs with fewer nodes, whose data represent a smooth (1) approximation of the original data. Moreover, such systems need to employ localized operations which can be computed at each node by using data from a small neighborhood of nodes around it. Multi-channel wavelet filterbanks, widely used as a signal processing tool for the sparse representation of signals, possess both of these features (i.e., smooth approximations and localized operations (2)). For example, a two-channel wavelet transform splits the sample space into an approximation subspace, which contains a smoother (coarser) version of the original signal, and a detail subspace, containing the additional details required to perfectly reconstruct the original signal. A discussion of the construction and analysis of wavelet filterbanks for regular signals can be found in standard textbooks such as [44]. While wavelet transform-based techniques would seem well suited to provide efficient local analysis, a major obstacle to their application to graphs is that these, unlike images, are not regularly structured. For graphs, traditional notions of dimensions along which to filter the data do not hold.

(1) More generally, it could be any sparse approximation of the original data.
(2) In the case of FIR wavelet filters.
1.2 Background

Transform techniques for graph analysis can be broadly divided into a) global methods, e.g., those using concepts from graph spectral theory, and b) wavelet-like localized methods, which are supported on a local neighborhood around each node. Global methods are often based on the Laplacian matrix, whose eigenvalues and eigenvectors contain global information about the shape of the graph. Common applications of global methods include graph partitioning (graph-cuts) [13], simplification, graph-based feature extraction [24] and graph matching [51]. A comprehensive discussion of global methods can be found in [3] and [26]. In addition to uncovering mostly global information, global methods usually do not scale well as the graph size increases; e.g., the time required to perform the eigenvalue decomposition can be significant. The most expensive component in the computation of global methods is the eigenvalue decomposition (EVD) of graph matrices, which normally requires O(N^3) arithmetic operations. Note that decentralized algorithms have been proposed which can greatly simplify the computation of a partial set of principal eigenvectors of a graph. For example, the algorithm proposed by Kempe [20] computes the k principal eigenvectors of the graph in O(τ_mix N^2) decentralized steps, where τ_mix is the mixing time of a random walk on the network, with O(k^3) computations per node in each round. While these approaches are good for structural analysis of graphs, requiring only a partial set of eigenvectors, they have limited use in applications such as compression and denoising, which require the full spectral decomposition of the graph.

Researchers have recently focused on developing localized transforms specifically for data defined on graphs. Crovella and Kolaczyk [7] designed wavelet-like functions on graphs which are localized in space and time. These graph functions ψ_{j,k} are composed of either shifts or dilations of a single generating function ψ. Wang and Ramchandran [47] proposed graph-dependent basis functions for sensor network graphs, which implement an invertible 2-channel-like filterbank. There exists a natural spectral interpretation of graph-signals in terms of the eigenfunctions and eigenvalues of the graph Laplacian matrix L. Maggioni and Coifman [6] introduced "diffusion wavelets" as the localized basis functions of the eigenspaces of the dyadic powers of a diffusion operator. Hammond et al. [14] construct a class of wavelet operators in the graph spectral domain, i.e., the space of eigenfunctions of the graph Laplacian matrix L. These eigenfunctions provide a spectral decomposition for data on a graph similar to the Fourier transform for standard signals. A common drawback of all of these filterbank designs is that they are not critically sampled: the output of the transform is not downsampled and there is oversampling by a factor equal to the number of channels in the filterbank. Unlike classical wavelet transforms, which have well-understood downsampling/upsampling operations, there is no obvious way in graphs to downsample nodes in a regular manner, since the number of neighboring nodes varies. Lifting-based wavelet transforms have been proposed in [45, 18] for graphs in a Euclidean space. However, these transforms require a Euclidean embedding of the graph.
Shen and Ortega [38, 40] have applied wavelet lifting transforms on spanning trees for a wireless sensor network application, where invertibility is guaranteed for any tree, as long as the nodes in the tree are partitioned into two sets (even and odd nodes) and the transform is structured by modifying even nodes based on odd nodes (and vice versa).

1.3 Contributions

The objective of this thesis is to design critically-sampled wavelet filterbanks on graphs. The building blocks in our proposed designs are two-channel wavelet filterbanks on bipartite graphs, which provide a decomposition of any graph-signal into a low-pass (smooth) graph-signal and a high-pass (detail) graph-signal. These designs come in two flavors: i) lifting wavelet filterbanks, and ii) spectral wavelet filterbanks. We choose bipartite graphs because they are a natural choice for implementing lifting wavelet filterbanks, and because they provide easy-to-interpret perfect reconstruction conditions for spectral wavelet filterbanks. For arbitrary graphs we have two choices: a) we can implement our proposed wavelet filterbanks on a bipartite graph approximation of the original graph, which provides a "one-dimensional" analysis, or b) we can decompose the graph into multiple edge-disjoint bipartite subgraphs and apply our proposed filterbanks iteratively on each subgraph, leading to a "multi-dimensional" analysis. Our contributions in this thesis can be divided into three major parts, which we describe in what follows.

1.3.1 Sampling operations in graphs

One of the desired properties of wavelet transforms on graphs is critical sampling: the output wavelet coefficients are equal in number to the input samples. Critical sampling can be achieved by partitioning the set of vertices in the graph into two subsets (say L and H), such that nodes in L only sample the output of the low-pass channel and nodes in H only sample the output of the high-pass channel. Algebraically, we describe a linear downsample-then-upsample (DU) operation on graphs, in which a set of nodes in the graph is first downsampled (removed) and then upsampled (replaced) by inserting zeros. In particular, we show that for all undirected bipartite graphs, the DU operations lead to a spectral decomposition of the graph-signal in which spectral coefficients are reproduced at mirror graph-frequencies around a central frequency. We term this phenomenon spectral folding in graphs, as it is analogous to the frequency-folding or "aliasing" effect for regular one-dimensional signals. We utilize this property to define critically sampled operations for implementing wavelet filterbanks on graphs. This is described in detail in Chapter 4.

1.3.2 Two-channel wavelet filterbanks on bipartite graphs

We propose two-channel wavelet filterbanks on bipartite graphs which are critically sampled and perfect reconstruction. The filterbanks come in two flavors, depending upon the chosen downsampling method: a) lifting wavelet filterbanks, and b) spectral wavelet filterbanks. The block diagrams of these two designs are shown in Figures 1.1 and 1.2, respectively.

Figure 1.1: Lifting Scheme: Downsampling followed by filtering

The lifting wavelet filterbanks are critically sampled and invertible by construction. These transforms require splitting the vertex set into two disjoint sets, often called the even and odd sets, given which the transform is computed only on the links between nodes in different sets.
The previous splitting schemes for lifting transforms required either the coordinates of the nodes in some Euclidean embedding (e.g., in [45]) or a specific graph structure (e.g., trees in [38, 40]). Our contribution is that we formulate the problem of splitting the nodes as a bipartite subgraph approximation problem, and provide greedy heuristics to compute optimal subgraphs. We also apply these graph-based lifting transforms in a data-gathering application in wireless sensor networks (WSN).

Next, we propose spectral wavelet filterbanks, in which filtering and downsampling are chosen so as to guarantee perfect reconstruction. We design spectral filters, and use the spectral folding phenomenon to provide necessary and sufficient conditions for aliasing cancellation, perfect reconstruction and orthogonality in these filterbanks. As a practical solution we propose a graph quadrature mirror filterbank (referred to as graph-QMF) design for bipartite graphs which has all the above-mentioned properties. However, the exact realizations of the graph-QMF filters do not have compact support on the graph, and approximating the exact solutions by compactly supported solutions incurs a small reconstruction error and a loss of orthogonality. As an alternative, we design biorthogonal wavelet filterbanks which have compact support and yet provide perfect reconstruction of any graph-signal.

Figure 1.2: Spectral Scheme: Filtering followed by downsampling

1.3.3 Bipartite subgraph decomposition

In order to extend two-channel wavelet filterbanks on bipartite graphs to arbitrary graphs, we formulate a bipartite subgraph decomposition problem, which provides an edge-disjoint collection of K bipartite subgraphs, each with the same vertex set V and whose union is the original graph. Each of these subgraphs is then used as a separate "dimension" along which to filter and downsample, leading to a K-dimensional separable wavelet filterbank design. The bipartite subgraph decomposition of a graph is not unique; therefore, we propose metrics to identify optimal decompositions, and algorithms to compute them.

1.4 Thesis Statement and Research Questions

It is possible to extend standard DSP techniques to data defined on graphs. In particular, critically sampled wavelet transforms can be designed on graphs which have a spectral interpretation, and provide perfect reconstruction of any signal defined on the graph. The research questions that we answer are:

1. How to define downsampling and upsampling operations on graphs?

2. How to design wavelet filters on graphs, given a set of spatial and spectral constraints?

3. Using these concepts, how to design critically sampled wavelet filterbanks on graphs?

1.5 Publications

The work in this thesis has resulted in the following published articles:

S. K. Narang and A. Ortega, "Perfect Reconstruction Two-Channel Wavelet Filter-Banks For Graph Structured Data", IEEE Transactions on Signal Processing, June 2012.

S. K. Narang, Y. H. Chao and A. Ortega, "Graph-wavelet filterbanks for edge-aware image processing", to appear in IEEE Statistical Signal Processing (SSP'12).

S. K. Narang and A. Ortega, "Multi-dimensional separable critically sampled wavelet filterbanks on arbitrary graphs", IEEE Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP'12).

S. K. Narang and A. Ortega, "Downsampling Graphs Using Spectral Theory", IEEE Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP'11).
J. P.-Trufero, S. K. Narang and A. Ortega, "Distributed Transforms for Efficient Data Gathering in Arbitrary Networks", Intl. Conf. on Image Proc. (ICIP'11).

S. K. Narang and A. Ortega, "Local two-channel critically-sampled filter-banks on graphs", Intl. Conf. on Image Proc. (ICIP'10).

S. K. Narang, G. Shen and A. Ortega, "Unidirectional Graph-based Wavelet Transforms for Efficient Data Gathering in Sensor Networks", IEEE Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP'10).

S. K. Narang and A. Ortega, "Lifting based wavelet transforms on graphs", Asia-Pacific Sig. and Information Proc. Association (APSIPA ASC'09).

G. Shen, S. K. Narang and A. Ortega, "Adaptive Distributed Transforms for Irregularly Sampled Wireless Sensor Networks", in Proc. of 2009 IEEE Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP'09).

1.6 Summary

In this chapter, we have described the objectives of this thesis, the motivation behind it, and a summary of our contributions. The rest of the thesis is organized as follows. In Chapter 2, we describe the basic framework required to understand wavelet filterbanks on graphs, and evaluate some of the existing work on wavelet-like transforms on graphs within the proposed framework. In Chapter 3, we propose critically sampled lifting wavelet transforms on arbitrary graphs, and discuss a distributed data-gathering application in which the proposed lifting filterbanks are useful. In Chapter 4, we introduce the theory behind downsampling/upsampling operations on graphs. In particular, we describe a spectral folding phenomenon in bipartite graphs which is analogous to aliasing in standard regular signals. We use these concepts in Chapter 5 to design two-channel critically sampled wavelet filterbanks on bipartite graphs. In that chapter, we first propose graph-QMF filterbanks, which are orthogonal and perfect reconstruction but do not have compact support. Then, we consider 1-hop localized filterbanks, in which the analysis filters are exactly 1-hop localized but the synthesis filters do not have compact support. As an alternative, we propose graph-Bior wavelet filterbanks on bipartite graphs, which are perfect reconstruction and in which both analysis and synthesis filters have compact support. These filters are not orthogonal but can be designed to mostly preserve energy. In Chapter 6, we extend the filterbank designs proposed for bipartite graphs to arbitrary graphs via bipartite subgraph decomposition. In Chapter 7, we discuss some applications of the proposed filterbanks. Finally, in Chapter 8, we conclude and describe our future work.

Chapter 2

Basic Theory

In this chapter, we describe the basic theory needed to understand the construction of graph wavelet filterbanks. We will use the following notation for the rest of the thesis: we represent matrices and vectors with bold letters, mathematical sets with calligraphic capital letters, and scalars with normal letters. A graph is denoted as G = (V, E), with vertices (or nodes) in the set V and links (or edges) as tuples (i, j) in E. The graphs considered in our research are undirected graphs without self-loops and without multiple edges between nodes. The edges can only have positive weights. The size of the graph N = |V| is the number of nodes, and the geodesic distance metric is given as d(u, v), which represents the sum of edge weights along the shortest path between nodes u and v, and is considered infinite if u and v are disconnected. The j-hop neighborhood N_{j,n} = {v ∈ V : d(v, n) ≤ j} of node n is the set of all nodes which are at most j hops away from node n.
Algebraically, a graph can be represented by its node-node adjacency matrix A, such that the element A(i, j) is the weight of the edge between nodes i and j (0 if there is no edge). The value d_i is the degree of node i, which is the sum of the weights of all edges connected to node i, and D = diag({d_i}) denotes the diagonal degree matrix whose i-th diagonal entry is d_i. The Laplacian matrix of the graph is defined as L = D - A. The Laplacian matrix also has a symmetric normalized form, L_sym = I - D^{-1/2} A D^{-1/2}, and an asymmetric normalized form, L_a = I - D^{-1} A, where I is the identity matrix. We denote by <f_1, f_2> the inner product between vectors f_1 and f_2. The rest of the chapter is organized as follows. In Section 2.1, we formally define graph signals and graph transforms. In Section 2.2, we describe the spectral representation of graph signals and graph transforms in terms of the eigenvalues and eigenvectors of the graph Laplacian matrix. In Section 2.3, we introduce downsampling/upsampling operations as graph transforms, and in Section 2.4 we utilize these concepts to define a general framework for two-channel wavelet filterbanks on graphs. In Section 2.5 we compare and evaluate existing graph transforms based on our proposed framework, and finally we conclude the chapter in Section 2.6.

2.1 Spatial Representation of Graph Signals

A graph signal is a real-valued scalar function f: V -> R defined on the graph G = (V, E) such that f(v) is the sample value of the function at vertex v in V. (The extension to complex or vector sample values f(v) is straightforward but is not considered in this work.) On a finite graph, the graph-signal can be viewed as a sequence or a vector f = [f(1), f(2), ..., f(N)]^t, where the order of arrangement of the samples in the vector is arbitrary and neighborhood (or nearness) information is provided separately by the adjacency matrix A. Graph-signals can, for example, be a set of values measured by sensor network nodes [47], traffic measurement samples on the edges of an Internet graph [7], or information about the actors in a social network. Further, a graph-based transform is defined as a linear transform T: R^N -> R^M applied to the N-node graph-signal space, such that the operation at each node n is a linear combination of the value of the graph-signal f(n) at node n and the values f(m) at nearby nodes m in N_{j,n}, i.e.,

y(n) = \langle \mathbf{t}_n, \mathbf{f} \rangle = T(n,n)\,f(n) + \sum_{m \in \mathcal{N}_{j,n}} T(n,m)\,f(m),   (2.1)

where t_n is the n-th row of the transform T. In analogy with the 1-D regular case, we will sometimes refer to graph-transforms as graph-filters, and to the elements T(n, m), for m = 1, 2, ..., N, as the filter coefficients at the n-th node. (Not every linear transform is a graph-transform, since graph-transforms, by definition, are defined along the edges of the graph. For example, the filter coefficient T(n, m) can be non-zero only if nodes n and m are connected, i.e., d(n, m) < ∞, and the magnitude of T(n, m) usually decreases with increasing distance d(n, m).)
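As a concrete illustration of this notation, the short Python sketch below builds the adjacency, degree and Laplacian matrices of a small weighted graph and applies a simple 1-hop graph transform of the form (2.1). The example graph, the averaging weights and all variable names are illustrative choices made here, not taken from the thesis.

import numpy as np

# Small undirected weighted graph on N = 4 nodes, given by its adjacency matrix.
A = np.array([[0.0, 1.0, 0.5, 0.0],
              [1.0, 0.0, 0.0, 2.0],
              [0.5, 0.0, 0.0, 1.0],
              [0.0, 2.0, 1.0, 0.0]])

d = A.sum(axis=1)                       # node degrees d_i (sum of incident edge weights)
D = np.diag(d)                          # diagonal degree matrix
L = D - A                               # combinatorial Laplacian L = D - A
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
L_sym = np.eye(4) - D_inv_sqrt @ A @ D_inv_sqrt   # symmetric normalized Laplacian
L_asym = np.eye(4) - np.diag(1.0 / d) @ A         # asymmetric (random-walk) normalized Laplacian

# A 1-hop graph filter in the sense of (2.1): each output y(n) mixes f(n) with the
# values at its immediate neighbours; here a degree-normalized averaging filter.
T = 0.5 * np.eye(4) + 0.5 * np.diag(1.0 / d) @ A  # T(n,m) nonzero only for m in N_{1,n}

f = np.array([1.0, -2.0, 0.5, 3.0])     # a graph-signal, one sample per vertex
y = T @ f                               # y(n) = T(n,n) f(n) + sum_m T(n,m) f(m)
print(y)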
Note that spatial localization can also be applied in a weaker sense in which 2 k (t n ) is not exactly 1 but very close to it for all n = 1; 2;:::N. Our focus in this thesis is to propose graph-lters with compact support. . 2.2 Spectral Representation of Graph Signals The spectral decomposition of graphG is dened can be dened in terms of the set of eigenvalues (G), and the corresponding eigenvectors u , 2 (G) of the graph Laplacian matrix. The Laplacian matrices L andL are both symmetric positive semidenite matrices and therefore, from the spectral projection theorem, there exists a real unitary matrix U which diagonalizesL, such that U t LU = = diagf i g is a non-negative diagonal matrix. In our proposed designs, we use the symmetric normalized form of Laplacian matrixL = D 1=2 LD 1=2 , which is more closely related to random walks in the graphs, and is more appropriate in dealing with non-regular graphs 3 . This leads to an eigenvalue decomposition of matrixL given as L = UU t = N X i=1 i u i u t i ; (2.3) where the eigenvectors u 1 ; u 2 ;:::; u N , which are columns of U, form a basis inR N and f0 1 2 ::: N g are corresponding eigenvalues. Thus, every graph-signal f2R N can be decomposed into a linear combination of eigenvectors u i given as f = P N n=1 f(n)u n . It has been shown in [3, 26] that the eigenvectors of Laplacian matrix provide a harmonic analysis of graph signals which gives a Fourier-like interpretation. The eigenvectors act as the natural vibration modes of the graph, and the corresponding eigenvalues as the associated graph-frequencies 1 . The spectrum (G) of a graph is dened as the set of eigen-values of its normalized Laplacian matrix, and it is always a 3 This is because Q = D 1 A is the transition matrix of a Markov chain which has the same eigenvalues as IL. The eigenvalues ofL are in \normalized" form, i.e, if2(G) then 0 2, and are thus consistent with the eigenvalues in the stochastic processes. Further, the normalization reweighs the edges of graph G so that the degree of each node is equal to 1. Refer to [3] for details. 1 The mappingun!V associates the real numbersun(i);i =f1; 2;:::;Ng, with the verticesV ofG. The numbers un(i) will be positive, negative or zero. The frequency interpretation of eigenvectors can thus be understood in terms of number of zero-crossings (pair of connected nodes with dierent signs) of eigenvector un on the graphG. For any nite graph the eigenvectors with large eigenvalues have more zero-crossings (hence high-frequency) than eigenvectors with small eigenvalues. These results are related to `nodal domain theorems' and readers are directed to [8] for more details. 11 subset of closed set [0; 2] for any graph G. For the purpose of this thesis, an eigenvector u is either considered to be a \lowpass" eigenvector if eigenvalue 1, or \highpass" eigenvector if > 1. The graph Fourier transform (GFT), denoted as f, is dened in [14] as the projections of a signal f on the graph G onto the eigenvectors of G, i.e., f() =< u ; f >= N X i=1 f(i)u (i): (2.4) Note that GFT is an energy preserving transform. A signal is considered \lowpass" (or \high- pass") if the energyj f()j 2 0 for all > 1 (or for all 1). In case of eigenvalues with multiplicity greater than 1 (say 1 = 2 =) the eigenvectors u 1 ; u 2 are unique up to a unitary transformation in the eigenspaceV =V 1 =V 2 . In this case, we can choose 1 u 1 u t 1 + 2 u 2 u t 2 = P where P is the projection matrix for eigenspace V . 
In the case of eigenvalues with multiplicity greater than 1 (say λ_1 = λ_2 = λ), the eigenvectors u_{λ1}, u_{λ2} are unique only up to a unitary transformation within the eigenspace V_λ = V_{λ1} = V_{λ2}. In this case, we can choose u_{λ1} u_{λ1}^t + u_{λ2} u_{λ2}^t = P_λ, where P_λ is the projection matrix onto the eigenspace V_λ. Note that for all symmetric matrices, the dimension of the eigenspace V_λ (the geometric multiplicity) is equal to the multiplicity of the eigenvalue λ (the algebraic multiplicity), and the spectral decomposition in (2.3) can be written as

\mathcal{L} = \sum_{\lambda \in \sigma(G)} \sum_{i:\,\lambda_i = \lambda} \lambda\, \mathbf{u}_i \mathbf{u}_i^t = \sum_{\lambda \in \sigma(G)} \lambda\, \mathbf{P}_\lambda.   (2.5)

The eigenspace projection matrices are idempotent, and P_λ and P_μ are orthogonal if λ and μ are distinct eigenvalues of the Laplacian matrix, i.e.,

\mathbf{P}_\lambda \mathbf{P}_\mu = \delta(\lambda - \mu)\, \mathbf{P}_\lambda,   (2.6)

where δ(·) is the Kronecker delta function.
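The following numpy sketch (illustrative only; the example graph and the rounding tolerance are my own choices) groups eigenvectors by eigenvalue, forms the eigenspace projection matrices P_λ, and numerically checks the relations (2.5) and (2.6) on a graph whose spectrum contains a repeated eigenvalue.

import numpy as np

# Normalized Laplacian of a 4-cycle; its spectrum {0, 1, 1, 2} has a repeated eigenvalue.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
d = A.sum(axis=1)
L_sym = np.eye(4) - np.diag(d**-0.5) @ A @ np.diag(d**-0.5)

lam, U = np.linalg.eigh(L_sym)
unique_lam = np.unique(np.round(lam, 8))       # distinct eigenvalues (up to round-off)

# Eigenspace projections P_lambda = sum of u_i u_i^T over eigenvectors with eigenvalue lambda.
P = {l: sum(np.outer(U[:, i], U[:, i]) for i in range(4) if abs(lam[i] - l) < 1e-8)
     for l in unique_lam}

# Check (2.5): L = sum_lambda lambda * P_lambda, and (2.6): P_l P_m = delta(l - m) P_l.
print(np.allclose(sum(l * P[l] for l in unique_lam), L_sym))
print(all(np.allclose(P[l] @ P[m], P[l] if l == m else 0)
          for l in unique_lam for m in unique_lam))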
For critically sampled output we have: jHj +jLj = N. Using (2.9), it is easy to see from Figure 2.1 that the output an a lys i s si d e syn t h e sis sid e L L H H L H Figure 2.1: Block diagram of a two-channel wavelet lterbank on graph. signals in the lowpass and highpass channels after reconstruction are given as ^ f L = 1 2 G 0 (I + J L )H 0 f = 1 2 G 0 (H 0 f + J L H 0 f); (2.12) 14 and ^ f H = 1 2 G 1 (I + J H )H 1 f = 1 2 G 1 (H 1 f + J H H 1 f); (2.13) respectively. Thus, ^ f L is the sum of product of G 0 with signal H 0 f and a modulated signal J L H 0 f. Similarly, ^ f H is the product of G 1 with signal H 1 f and a modulated signal J H H 1 f. Note that without the DU operations, the output of two channels are simply ^ f L = G 0 H 0 f and ^ f H = G 1 H 1 f, respectively. Thus the modulated signals G 0 J L H 0 f and G 1 J H H 1 f in (2.12) and (2.13), respectively, can be interpreted as producing distortion (or aliasing) in the two channels. The overall output ^ f of the lterbank is the sum of outputs of the two channels, i.e., ^ f = ^ f L + ^ f H = Tf, where T is the overall transfer function of the lterbank. Combining (2.12) and (2.13) ^ f can be written as: ^ f = 1 2 G 0 (I + J L )H 0 f + 1 2 G 1 (I + J H )H 1 f: (2.14) Separating out modulation terms in (2.14), we get ^ f = 1 2 (G 0 H 0 + G 1 H 1 ) | {z } Teq f + 1 2 (G 0 J L H 0 + G 1 J H H 1 ) | {z } T alias f: (2.15) where T eq is the transfer function of the lterbank without the DU operation and T alias is another transform which arises primarily due to the downsampling in the two channels. For perfect reconstruction T should be equal to identity which can be ensured by requiring T eq to be a scalar multiple of identity and T alias = 0. Thus the two-channel lterbank on a graph provides distortion-free perfect reconstruction if G 0 J L H 0 + G 1 J H H 1 = 0 G 0 H 0 + G 1 H 1 = cI (2.16) In order to design perfect reconstruction lterbanks we need to determine a) how to design ltering operations H i ; G i ;i =f0; 1g, and b) the downsampling functions L and H . In Chapter 15 4, we show that the spectral folding phenomenon in bipartite graphs leads to an aliasing interpre- tation of (2.16) and we design lterbanks which cancel aliasing and lead to perfect reconstruction of any graph-signal. 2.5 Literature Review There has been some work in the past few years on developing localized transforms for data dened on graphs. While some of these works do not take into account the \lterbank perspective", our goal in this section is to place these designs in the common framework described so far in this chapter. We broadly divide the existing graph-transform designs into two types, namely, spatial and spectral designs. We rst describe spatial wavelet transform designs which are based on the spatial features of the graph, i.e., in terms of node connectivity and distances between nodes. Next, we describe spectral wavelet transforms on graphs which are based on the spectral features of the graph, i.e. in terms of the eigenvalues and eigenvectors of a matrix dened on the graph. In order to understand these designs we introduce some additional notation. We dene @N h;k to be an h-hop neighborhood ring around node k (i.e., the set of all nodes v such that the shortest hop distance between v and k is exactly equal to h), a j-hop adjacency matrix A j s.t. A j (n;m) = 1 only if m2N j;n , a j-hop diagonal degree matrix with D j (k;k) =jN j;k j s.t. d j;k =jN j;k j and a j-hop uniform Laplacian matrix L j = D j A j . 
Similarly we dene a ring adjacency matrix@A h such that@A j (n;m) = 1 only if m2@N j;n and corresponding ring degree matrix @D h =diagf@d j;k g s.t. @d j;k =j@N h;k j. 2.5.1 Spatial Designs 2.5.1.1 Random transforms Wang and Ramchandran [47] proposed spatially localized graph transforms for sensor network graphs with binary links (i.e., links which have weight either 0 or 1). The transforms proposed in [47] either compute a weighted average given as y(n) = (1a + a d j;k + 1 )x(n) + X m2Nj;n a d j;k + 1 x(m); (2.17) 16 or a weighted dierence given as y(n) = (1 +b b d j;k + 1 )x(n) X m2Nj;n b d j;k + 1 x(m); (2.18) in a j-hop neighborhood around each node in the graph. The corresponding transform matrices can be represented for a given j as T j = Ia(I + D j ) 1 L j S j = I +b(I + D j ) 1 L j : (2.19) This approach intuitively denes a two-channel wavelet lter-bank on the graph consisting of two types of linear lters: a) approximation lters as given in (2.17) and b) detail lters as given in (2.18). Note that these transforms are oversampled and produce output of the size twice that of the input, since no downsampling is applied after ltering. Further none of the transforms can be called a wavelet lter since both transforms have a non-zero DC response. 2.5.1.2 Graph wavelets Crovella and Kolaczyk [7] designed wavelet like transforms on graphs that are localized in space. They dened a collection of functions j;n :V ! R, localized with respect to a range of scale/location indices (j;n), which at a minimum satisfy P m2V j;n (m) = 0 (i.e. a zero DC response). Each function j;n is constant within hop rings @N h;n and can be written as: y(n) =a j;0 x(n) + j X h=1 X m2N h;n a j;h @d j;n x(m) (2.20) In matrix form the j-hop wavelet transform T j can be written as: T j =a j;0 I +a j;1 @D 1 1 @A 1 +:::a j;j @D 1 j @A j (2.21) Further, the constants a j;h satisfy P h=j h=0 a j;h = 0, which allows the wavelet lters to have zero DC response. and can be computed from any continuous wavelet function (x) supported on the interval [0; 1) by taking a j;h to be the average of (x) on the sub-intervals I j;h = [ h j+1 ; h+1 j+1 ]. Though these transforms are local and provide a multi-scale summarized view of the graph, they 17 do not have approximation lters and are not invertible in general. 2.5.1.3 Lifting wavelet transforms Lifting based wavelet transforms have been proposed for graphs with a Euclidean embedding in [45], and for arbitrary routing trees in [38]. In [18], lifting transform is designed in an iterative way by lifting one coecient at a time. These lterbanks provide a natural way of constructing local two-channel critically sampled lter-banks on graph-signals. A block-diagram of lifting wavelet lter-bank is shown in Figure 2.2. In this approach the vertex set is rst partitioned into f 1 even + f 0 even f 0 odd Bi- partition Block f 0 D O P + D E + U + - + f 1 odd To next level decomp Figure 2.2: Block diagram of two-channel lifting wavelet lter-banks sets of even and odd nodesV =O[E. The odd nodes compute their prediction coecients using their own data and data from their even neighbors followed by even nodes computing their update coecients using their own data and prediction coecient of their neighboring odd nodes. The equivalent transform in matrix-form can be written as: T lift = ~ U z }| { 2 6 4 I O 0 U D E 3 7 5 ~ P z }| { 2 6 4 D O P 0 I E 3 7 5 (2.22) where D O and D E are diagonal matrices of sizejOj andjEj respectively. Thus, the lifting trans- forms are critically sampled by design. 
However, the partitioning schemes for these lifting trans- forms required either the coordinates of the nodes in some Euclidean embedding (eg. in [45]), or specic structure of the graph (eq., trees in [38, 40]). In Chapter 3, we formulate the problem of partitioning nodes as a bipartite subgraph approximation problem, and provide greedy heuristics to compute optimal subgraphs. 18 2.5.2 Spectral Designs 2.5.2.1 Diusion wavelets Maggioni and Coifman [6] introduced diusion wavelets, a general theory for wavelet decomposi- tions based on compressed representations of powers of a diusion operator. Their construction interacts with the underlying graph or manifold space through repeated applications of a diu- sion operator T, such as the graph Laplacian matrix L. The localized basis functions at each resolution level are orthogonalized and downsampled appropriately to transform sets of orthonor- mal basis functions through a variation of the Gram-Schmidt orthonormalization (GSM) scheme. Although this local GSM method orthogonalizes the basis functions (lters) into well localized `bump-functions' in the spatial domain, it does not provide guarantees on the size of the support of the lters it constructs. Further the diusion wavelets form an over-complete basis and there is no simple way of representing the corresponding transform T. 2.5.2.2 Spectral graph wavelets Hammond et al [14] dened spectral graph wavelet transforms that are determined by the choice of a kernel function g :R + !R + . The kernel g(), is a continuous bandpass function in spectral domain with g(0) = 0 and lim !max g() = 0, where max is the highest magnitude eigenvalue of the Laplacian matrix L (orL). The corresponding wavelet operator T g = g(L) = Ug()U t acts on a graph signal f by modulating each Fourier mode as T g f = N X k=1 g( k ) f(k)u k (2.23) The kernel can be scaled as g(t) by a continuous scalar t. For spatial localization, the authors design lters by approximating the kernels g() with smooth polynomials functions. The ap- proximate transform with polynomial kernel of degree k is given by T ^ g = ^ g(L) = P k l=0 a l L l and is exactly k-hop localized in space. By construction the spectral wavelet transforms have zero DC response, hence in order to stably represent the low frequency content of signal f a second class of kernel function h :R + !R + is introduced which acts as a lowpass lter, and satises h(0) > 0 and lim !max h() = 0. Thus a multi-channel wavelet transform can be constructed from the choice of a low pass kernel h() and J band-pass kernelsfg(t 1 );:::;g(t J )g and it 19 has been shown that the perfect reconstruction of the original signal is assured if the quantity G() = h() 2 + P J k=1 g(t i ) 2 > 0 for all eigenvalues in the spectrum of L. However, these transforms are overcomplete, for example, a J-scale decomposition of graph-signal of size N pro- duces (J + 1)N transform coecients. As a result, the transform is invertible only by the least square projection of the output signal onto a lower dimension subspace. 2.6 Summary In this chapter, we formally introduced graph signals, graph transforms and their spectral domain representation. Further, we introduced downsampling-upsampling (DU) operations on the graphs. These concepts provide a framework for the design of critically sampled two-channel lterbanks on the graphs, and we stated the necessary conditions for these lterbanks to provide perfect reconstruction of any graph signal. 
Further, we analyzed and evaluated some of the existing graph-based transforms by representing them using the framework introduced in Chapter 2. A common drawback of most existing transforms is oversampling, i.e., the number of output wavelet coefficients generated is larger than the number of input coefficients. The lifting wavelet transforms are an exception to this, as they are critically sampled by construction. However, existing lifting-based transforms require graph simplification (i.e., approximation of the graph by a bipartite graph), which results in the loss of some graph properties. In Chapter 5, we describe two filterbank designs, namely graph-QMF and graph-Bior filterbanks, which are critically sampled and do not require any graph simplification. To conclude this chapter, Table 2.1 presents a summary of existing methods and their properties.

Method | DC response | CS | PR | Comp | OE | GS
Wang & Ramchandran [47] | non-zero | No | Yes | Yes | No | No
Crovella & Kolaczyk [7] | zero | No | No | Yes | No | No
Lifting Scheme [18, 38, 45] | zero for wavelet basis | Yes | Yes | Yes | No | Yes
Diffusion Wavelets [6] | zero for wavelet basis | No | Yes | Yes | Yes | No
Spectral Wavelets [14] | zero for wavelet basis | No | Yes | Yes | No | No
graph-QMF filterbanks (Sec. 5.3) | zero for wavelet basis⁴ | Yes | Yes | No⁵ | Yes | No
graph-Bior filterbanks (Sec. 5.5) | zero for wavelet basis⁴ | Yes | Yes | Yes | No | No

Table 2.1: Evaluation of graph wavelet transforms. CS: critical sampling, PR: perfect reconstruction, Comp: compact support, OE: orthogonal expansion, GS: requires graph simplification.

4 When designed using the asymmetric normalized Laplacian matrix.
5 The exact graph-QMF solutions provide perfect reconstruction and orthogonality, but they do not have compact support. Localization is achieved with a matrix polynomial approximation of the original filters, which incurs some loss of orthogonality and a reconstruction error. This error can be reduced to arbitrarily small levels by increasing the degree of the approximation.

Chapter 3 Lifting wavelet filterbanks on graphs

In this chapter, we describe the construction of two-channel lifting wavelet transforms on the vertices of a graph.¹ Lifting wavelet transforms, introduced earlier in Section 2.5, consist of three steps, namely a splitting step, a prediction step and an update step. In the context of graphs, the splitting step corresponds to splitting (labeling) the nodes of the graph into two sets, traditionally called the even set and the odd set, respectively. Then, in the prediction step, the odd nodes compute detail coefficients using data from their neighboring even nodes, and subsequently, in the update step, the even nodes compute update coefficients using the detail coefficients of their neighboring odd nodes. The overall lifting transform, written in matrix form in (2.22), is critically sampled and invertible by design. A block diagram of the lifting wavelet filterbank is shown in Figure 2.2.

There are two important choices involved in designing lifting filterbanks: a) how to compute the even-odd assignment of the nodes (i.e., the splitting step), and b) how to design the prediction and update filters. For the latter, there exist many choices. The prediction filters, for example, can be designed in a variety of ways, such as simple average filters [38], filters providing a planar approximation [45], filters based on spectral properties [29], or data-adaptive filters [36]. Similarly, the update filters can be designed as simple smoothing filters [38], or as filters providing orthogonality between update and detail coefficients [39], etc.
However, the choice in (a), i.e., the problem of choosing the even-odd assignment of the nodes, has received relatively little attention in the literature. While any even-odd assignment strategy guarantees invertibility of the resulting filterbank, it is not clear what constitutes an optimal split on the graph. An architecture for lifting wavelet analysis is introduced in [45] for irregular grids in 2-D or 3-D Euclidean spaces. However, the even-odd assignment strategy used there requires location information of the nodes, and hence cannot be applied to general graphs. In [18], Jansen et al. introduce lifting wavelet transforms on general graphs based on the "lifting one coefficient at a time" theme. In this scheme, a lifting filterbank is implemented iteratively in N stages (N is the number of nodes in the graph), such that in each stage only one node is assigned odd parity, while the remaining nodes are all declared even. Thus, the algorithm produces only one wavelet (i.e., detail) coefficient at each scale, and this can be very slow in decomposing graphs with a large number of nodes. Recently, Shen et al. [38] proposed lifting transforms on trees, in which the even-odd assignment of a node depends on its shortest hop distance from the root node. Their assignment strategy defines roughly 50% of the nodes as odd nodes at each scale. The starting point of this work is the observation that the idea in [38] can be extended to arbitrary graphs, no longer constrained to be planar and acyclic, as long as suitable even/odd assignment algorithms on the graph can be identified. Our main contribution in this chapter is that we formulate the even-odd assignment problem on the graph as a bipartite subgraph approximation problem, which is to approximate the original graph by a single bipartite subgraph, or by a collection of bipartite subgraphs. Each of these subgraphs is defined on the original set of vertices and a subset of the edges. This formulation helps us define optimal even/odd assignment strategies for various applications.

The outline of the rest of the chapter is as follows: in Section 3.1 we formulate the problem of even-odd assignment in lifting wavelet transforms on graphs as a bipartite subgraph approximation. In Section 3.2, we propose an even-odd assignment based on a maximum bipartite subgraph approximation of the original graph, and provide a greedy, heuristic-based algorithm to obtain such an assignment. This approach is appropriate for applications where we want to minimize the loss in the quality of the lifting filters due to the bipartite subgraph approximation, and we discuss a graph-denoising example where these lifting transforms are useful. In Section 3.3, we propose an even-odd assignment based on finding a bipartite subgraph with a dominating set as one of its natural partitions. This strategy is appropriate in compression applications, where the data stored on the even nodes require more bits for storage or transmission than the data on the odd nodes, and it is therefore desirable to have the minimum number of even nodes in the network. We use it to implement lifting transforms in a data-gathering application in wireless sensor networks (WSN), and show performance gains. Finally, we summarize the chapter in Section 3.4.

1 Parts of this research were conducted jointly with Dr. Godwin Shen and J. Perez-Trufero; see [31] and [32] for details.
3.1 Problem Formulation

The starting point of this work is the design of a unidirectional 2-D lifting transform along arbitrary trees in a wireless sensor network application, proposed by Shen and Ortega [38]. Given a tree graph, the authors split the nodes into even and odd nodes based on their minimum hop distance from the root node (see the tree defined by solid lines in Figure 3.1 as an example). A lifting transform is then applied locally on the tree using these assignments. Since trees are acyclic planar graphs, the even-odd assignment of nodes is well-defined and no pair of directly connected nodes is assigned identical (even/odd) parity. To apply this idea to arbitrary graphs (in general cyclic and non-planar) would require selecting an even-odd assignment on these graphs.

Figure 3.1: Even-odd assignment in the routing trees designed in [38]. The dashed lines show edges that are within radio range but are not used by the transform.

Referring again to Figure 3.1, if we now consider a graph that includes both solid and dashed lines, it can be seen that nodes that are neighbors in the graph are no longer guaranteed to have opposite parity (e.g., node 4 is even and connected to nodes 3 and 5, which are both even as well). Let $\tilde P$ and $\tilde U$ be the transform matrices in the prediction and update steps as defined in (2.22), and let $E$ denote the set of even nodes (blue nodes) and $O$ the set of odd nodes (red nodes). Note that $E$ and $O$ are disjoint sets and $E \cup O = V$. Given this even-odd assignment of the nodes in the graph, the lifting transform is implemented as follows: we define $f_O$ and $f_E$ as the components of the input signal on the sets $O$ and $E$, respectively, $\tilde P_{O,E}$ as the submatrix of $\tilde P$ containing the prediction weights applied by nodes in $O$ to data from nodes in $E$, and $\tilde U_{E,O}$ as the submatrix of $\tilde U$ containing the update weights applied by nodes in $E$ to the detail coefficients of nodes in $O$. The forward lifting transform is then given as:

$d_O = f_O - \tilde P_{O,E}\, f_E, \qquad s_E = f_E + \tilde U_{E,O}\, d_O$   (3.1)

This transform is invertible, and the original values can be recovered by the following inverse lifting steps:

$f_E = s_E - \tilde U_{E,O}\, d_O, \qquad f_O = d_O + \tilde P_{O,E}\, f_E$   (3.2)

In some applications, where we may want an oversampled transform on the graph, we compute one more lifting transform with the parities of the even and odd nodes swapped. In this case, each node has one detail coefficient and one update coefficient.

We observe that in the prediction step of lifting, odd nodes use only their even neighbors' data to compute prediction coefficients. Similarly, in the subsequent update step, even nodes use only their odd neighbors' data to compute their update coefficients. Thus, nodes of the same parity (even/odd) do not use each other's data, even if they are neighbors in the graph. In other words, links between any two even nodes or any two odd nodes do not participate in computing the lifting transform, and can be considered non-existent for the purpose of implementing the filterbank. Therefore, the even-odd assignment problem can be formulated as a bipartite subgraph approximation problem. A bipartite graph $B = (L, H, E)$ contains two natural clusters $L$ and $H$, such that every link connects a node in $L$ to a node in $H$. Bipartite graphs are also called two-colorable graphs, since they have no conflicting edges.
Thus, given a graph $G = (V, E)$ and an even-odd assignment which splits the vertex set $V$ into an even set $E$ and an odd set $O$, the graph actually used to compute the prediction and update filters is a bipartite graph $B = (E, O, \hat E)$, where $\hat E$ is the subset of edges that connect an even node to an odd node. Note that this formulation results in edge losses, since the edges of $G$ that are not in $\hat E$ do not participate in computing the transform. An alternative to this approach is to decompose the graph iteratively into multiple bipartite subgraphs (say $K$), and implement the lifting transform in $K$ stages, restricting the splitting and filtering operations in each stage to only one bipartite graph. The details of the bipartite subgraph decomposition of a graph are presented in Chapter 6. In the present chapter, we focus only on optimizing one stage of the lifting transform (i.e., by approximating the original graph by a single bipartite graph). The optimality criterion for the bipartite approximation depends on the application. In the next section, we discuss optimality in terms of the quality of the lifting filters.

3.2 Maximum Bipartite Subgraph Approximation

Assume an even-odd assignment $\{E, O\}$, which assigns an even or an odd parity to each vertex of a graph $G = (V, E)$ of size $|V| = N$. Given this assignment, the adjacency matrix $A$ of $G$ can be written as:

$A = \begin{bmatrix} A_{O,O} & A_{O,E} \\ A_{E,O} & A_{E,E} \end{bmatrix}$   (3.3)

where the submatrix $A_{O,O}$ of $A$ is the adjacency matrix of the subgraph containing odd nodes only. Similarly, $A_{E,E}$ is the submatrix of the subgraph containing even nodes only. These matrices contain the edges that have conflicts, since they connect nodes of the same parity. The block matrices $A_{E,O}$ and $A_{O,E}$ contain the edges that do not have conflicts. A lifting transform based on this even-odd assignment utilizes only the $A_{E,O}$ and $A_{O,E}$ blocks of the adjacency matrix, in which case the adjacency matrix actually used in computing the filters is given as:

$\hat A = \begin{bmatrix} 0 & A_{O,E} \\ A_{E,O} & 0 \end{bmatrix}$   (3.4)

and corresponds to a bipartite subgraph $B = (E, O, \hat E)$, as explained in Section 3.1. This bipartite subgraph approximation affects the quality of the prediction and update filters. Therefore, we define a metric that measures the loss in the quality of the lifting filters due to the bipartite subgraph decomposition, and choose an optimization criterion that minimizes this loss metric. In order to expand upon this, we need to choose a specific design of prediction and update filters. Let us choose prediction and update filters based on simple averages, as defined in [38]. Using this filter design, the best-quality prediction filter at node $i$ is achieved when $i$ can use data from all of its neighbors. As a result, the "best" case weight applied to node $j$ by the prediction filter at node $i$ is given as:

$w_p(i,j) = \begin{cases} 1 & \text{if } i = j \\ -\frac{A(i,j)}{D(i,i)} & \text{if } i \neq j \end{cases}$   (3.5)

where $D(i,i)$ is the degree of node $i$ ($D(i,i) = 1$ for isolated nodes). The operations in (3.5) can be written in matrix form as:

$w_p = I - D^{-1} A$   (3.6)

However, because of the bipartite subgraph approximation, some neighbors of each node cannot be used in computing the filters, and the "approximate" case weight applied to node $j$ by the prediction filter at node $i$ is given as:

$\hat w_p(i,j) = \begin{cases} 1 & \text{if } i = j \\ -\frac{\hat A(i,j)}{\hat D(i,i)} & \text{if } i \neq j \end{cases}$   (3.7)

and in matrix form as:

$\hat w_p = I - \hat D^{-1} \hat A$   (3.8)

Therefore, an optimal bipartite subgraph decomposition, in this case, is the one that minimizes the difference between the best-case filters and the approximate filters.
Using (3.6) and (3.8), we get:

$\|w_p - \hat w_p\|_1 = \|D^{-1}A - \hat D^{-1}\hat A\|_1 = \gamma$   (3.9)

where $\gamma$ denotes the entry-wise 1-norm of the difference between $w_p$ and $\hat w_p$. The filtering operations in the update step depend on the predict operations in the previous step, and are therefore non-linear. However, to keep the optimization linear, we assume that the prediction coefficients are computed according to the "best" case (i.e., by using data from all neighbors at each node). Thus, we ignore the contribution of the predict operations, and focus only on the update step. Then, the update weight applied to node $j$ by the update filter centered at node $i$ is equal to $w_u(i,j) = A(i,j)/(2D(i,i))$ in the "best" case and $\hat w_u(i,j) = \hat A(i,j)/(2\hat D(i,i))$ in the "approximate" case. Similar to the prediction case, the entry-wise 1-norm of the difference between the best-case and approximate filters can be written as:

$\|w_u - \hat w_u\|_1 = \sum_i \sum_j |w_u(i,j) - \hat w_u(i,j)| = \sum_i \sum_j \left|\frac{A(i,j)}{2D(i,i)} - \frac{\hat A(i,j)}{2\hat D(i,i)}\right| = \left\|\tfrac{1}{2}D^{-1}A - \tfrac{1}{2}\hat D^{-1}\hat A\right\|_1 = \tfrac{1}{2}\gamma$   (3.10)

Thus, minimizing $\gamma$ in (3.9) and (3.10) minimizes the norm of the difference of both the prediction and the update operations. While this derivation applies only to the average prediction and update filters, $\gamma$ in general provides a good measure of the loss in quality between the best-case and approximate filters. Further, since the degree matrix $D$ is itself derived from $A$, if $A \approx \hat A$ then $\hat D \approx D$ and $\gamma = \|D^{-1}A - \hat D^{-1}\hat A\|_1 \approx 0$. Thus, in order to minimize $\gamma$, we minimize $\|A - \hat A\|_1$, which, using (3.3) and (3.4), can be written as:

$\|A - \hat A\|_1 = \|A_{O,O}\|_1 + \|A_{E,E}\|_1$   (3.11)

According to (3.11), $\|A - \hat A\|_1$ counts the conflicting edges in the graph, i.e., the edges whose two end nodes have the same parity, and minimizing it corresponds to an even-odd assignment that minimizes the number of conflicts in the graph. The problem can also be formulated as a max-cut problem [1], for which many good centralized approximations are available. However, here we choose a decentralized, synchronous algorithm called the conservative fixed probability (CFP) colorer, given in [12], which can be computed iteratively using only 1-hop communications at each step. The algorithm operates in a distributed manner, which is more efficient for large graphs. The CFP colorer solves the corresponding 2-color graph coloring problem (2-GCP) so as to minimize the conflicts. It is based on a simple greedy local heuristic and gives competitive results compared to other k-GCP algorithms [12]. The algorithm starts with each node choosing a parity (even/odd) at random. Then the nodes repeatedly update their colors in synchronized steps. In each step, each node decides whether or not to activate by comparing a randomly generated number with some activation probability $p$. If the node activates, it chooses the color that minimizes the number of conflicting edges it has with its neighbors, based on their parities in the previous step. Those nodes that change color inform their neighbors: all of a node's operations are thus based on information that it has available locally. Figure 3.2(a) shows a sample even-odd assignment for the Zachary Karate data [50], and Figure 3.2(b) shows the reduction of conflicts with each iteration. The convergence of the solution is discussed in [12]; if the solution converges, it ensures in probability that no node has more than 50% of its neighbors with the same parity.

Figure 3.2: Even-odd assignment on the Zachary Karate data [50] using the CFP algorithm. (a) Even-odd assignment on the graph; (b) convergence of the CFP algorithm (x-axis: number of iterations; y-axis: fraction of conflicting edges).
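The CFP-style colorer admits a compact implementation. The following NumPy sketch is our own simplified reading of such a fixed-probability greedy colorer, not a reference implementation of [12]; the activation probability, iteration count, and toy graph are arbitrary choices.

```python
import numpy as np

def cfp_colorer(A, p=0.5, iters=50, seed=0):
    """Simplified sketch of a conservative fixed-probability (CFP) colorer
    for the even/odd split.  A is a symmetric adjacency matrix (0/1 or
    weighted); returns a vector of +1 (even) / -1 (odd) parities."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    parity = rng.choice([-1, 1], size=n)          # random initial parities
    for _ in range(iters):
        active = rng.random(n) < p                # each node activates w.p. p
        new_parity = parity.copy()
        for v in np.flatnonzero(active):
            nbrs = np.flatnonzero(A[v])
            if nbrs.size == 0:
                continue
            w = A[v, nbrs]
            w_even = w[parity[nbrs] == 1].sum()   # conflict weight if v is even
            w_odd = w[parity[nbrs] == -1].sum()   # conflict weight if v is odd
            new_parity[v] = 1 if w_even <= w_odd else -1
        parity = new_parity
    return parity

# Toy usage on a random graph: count the conflicting edges that remain.
rng = np.random.default_rng(1)
A = (rng.random((30, 30)) < 0.15).astype(float)
A = np.triu(A, 1); A = A + A.T
parity = cfp_colorer(A)
same = np.equal.outer(parity, parity)
print("conflicting edges:", int(A[same].sum() / 2), "of", int(A.sum() / 2))
```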
The algorithm can be easily extended to graphs with weighted edges, where we minimize the total weight of the conflicting edges at each node instead of their total number.

3.2.1 Example: graph denoising

We address the even-odd assignment problem in the design of lifting transforms for a simple graph denoising application. The graph denoising problem refers to denoising data defined on the vertices of a graph using connectivity information, and may be applied as a preprocessing tool in analyzing real-world graphs, e.g., protein interaction networks [33]. The motivation behind graph denoising is that if the original clean data is smooth, or piecewise smooth, on the graph, then the samples at any two nodes that are connected directly by a link are correlated. Hence, the sample at a node can be predicted using the samples of its 1-hop neighbors, and the prediction error corresponds to the noise as well as to transitions between smooth regions. The lifting-based wavelet transforms are useful in graph denoising because the wavelet (detail) coefficients obtained in these transforms are exactly these prediction errors. Thus, denoising can be performed by transforming the noisy data into the wavelet domain, applying thresholding in the wavelet domain, and inverse transforming the denoised wavelet coefficients. Since both the forward and the inverse lifting transforms have compact support (1-hop localized support at each node), they can be quite efficient in a distributed denoising application.

In this example, we use prediction and update filters based on simple averages, as defined in [38]. The toy graphs of our experiment are similarity graphs (see [26], Section 2.2) with $N$ nodes sampled uniformly from two partially overlapping Gaussian distributions. An edge $\{i,j\}$ between two vertices exists if the difference between the corresponding sample values is less than some threshold. In order to get optimal-quality prediction and update filters, we formulate the even-odd assignment problem as a maximum bipartite subgraph approximation problem, and apply the CFP algorithm described above to obtain a solution. For thresholding, we apply the universal threshold given by Donoho [10], $thr = \sqrt{2\log_2(N)}$, to the wavelet coefficients normalized to the noise level [19]. An example graph with $N = 200$ sample values is shown in Figure 3.3(a). Figures 3.3(b)-(f) show Voronoi tessellations of the distribution field with $N = 1500$ sampled points as Voronoi sites. For Figure 3.3(b), the value of each sample is the mean of the distribution from which it is drawn. In Figure 3.3(c), the sample values are the actual noisy values. The intensity of each cell reflects the value of the corresponding sample rescaled to the range $[0, 1]$. Figures 3.3(d), (e) and (f) are the Voronoi tessellations of the denoised samples. This problem can be seen as a 2-D version of denoising general M-dimensional discrete data. While our results are preliminary, they demonstrate promising performance compared to the simple, single-step methods operating on the Laplacian matrix that have been proposed in the literature. We compare our results to both short-time and long-time solutions of the diffusion heat equation ([3, 51]) on the graphs. The Voronoi tessellations of the field constructed from the denoised sample values are drawn in Figures 3.3(c)-(f).
The plots show that the lifting-transform-based denoising results are closer to the original distribution in Figure 3.3(b) than those of the diffusion-based methods. To quantitatively assess these results we use two quality metrics: the peak signal-to-noise ratio (PSNR) and the standard deviation (STD) of the samples. Results are shown in Figures 3.5 and 3.4. As can be seen in Figure 3.5, the PSNR achieved with lifting is higher than for the diffusion-based methods, with better results achieved with the oversampled approach. Note that the gains from oversampling are only significant for relatively sparse graphs. In Figure 3.4 we observe a reduction in STD with respect to the original signal and, here too, the STD of the oversampled transform is lower than in the critically sampled case.

3.3 Dominating Set Approximation

In the context of compression schemes which use distributed spatial transforms, the main objective is to compress the data into transform coefficients that require as few bits as possible. Note that in the case of the lifting transforms, even nodes in the network must transmit raw (original) data to their neighbors before any transform computations can take place. Moreover, the update coefficients computed on the even nodes are also very similar to the original data, and take as many bits as are required to encode the original data. Therefore, we also refer to even nodes as raw data nodes. Odd nodes then use this even-node data to compute detail coefficients; hence, odd nodes can also be called aggregating nodes. Therefore, in this section we shall refer to the even-odd assignment as the raw-aggregating node assignment, or simply RANA. If the sensed data is spatially correlated and a "sufficient" amount of data is received at the aggregating nodes, the decorrelated data (i.e., the transform coefficients) at the aggregating nodes will generally require significantly fewer bits than needed to represent the raw data. In this case, the objective of the even-odd assignment is to minimize the number of even nodes, and hence the maximum bipartite subgraph approximation described in Section 3.2 may not be optimal.

Consider an in-network transform where RANA leads to raw nodes in set $E$ and aggregating nodes in set $O$. Suppose that we compute the transform in a distributed manner using the method described above. For simplicity, suppose that the data from each raw node $m$ is encoded using $B_r$ bits and that the transform coefficient from each aggregating node $n$ is encoded using $cB_r$ bits for some constant $c \in (0, 1]$. This could be the case in fairly dense networks, since then each aggregating node is likely to receive the same amount of data from raw nodes. Thus, the amount of compression that can be achieved will be about the same for all aggregating nodes. Further, let $g(n)$ denote the cost of storing or transmitting a single bit at node $n$. The cost $g(n)$ could, for example, be the cost of transmitting a single bit from $n$ to a single or multiple sinks in the network, or could be inversely proportional to the bandwidth available to node $n$. Then, the total cost to compute a given transform in a distributed manner and route the resulting coefficients to the sink is simply

$C(E) = \sum_{m \in E} B_r\, g(m) + \sum_{n \in O} c B_r\, g(n)$   (3.12)

Note that $\sum_{m \in V} g(m) = \sum_{m \in E} g(m) + \sum_{n \in O} g(n)$, since $V = E \cup O$ and $E \cap O = \emptyset$. Therefore, (3.12) becomes

$C(E) = (1-c) B_r \sum_{m \in E} g(m) + c B_r \sum_{n \in V} g(n)$   (3.13)

Since the graph $G$ is fixed and $B_r$ and $c$ are constants, both $(1-c)$ and $c B_r \sum_{n \in V} g(n)$ are constant.
Therefore, the only term left to optimize is $\sum_{m \in E} g(m)$, which is nothing more than the sum of the routing costs of the raw nodes. Note that a given assignment of $E$ and $O$ only makes sense if every aggregating node $n$ has at least one raw-data neighbor from which it can predict its own data, i.e., for all $n \in O$, $N_{1,n} \neq \emptyset$. This is equivalent to requiring that every aggregating node be "covered" by at least one raw data node, or more precisely, that $\bigcup_{m \in E} N_{1,m} = V$. In other words, the set $E$ is a dominating set² in the graph $G$, and $\bigcup_{m \in E} N_{1,m}$ provides a set cover for the nodes of $G$. Thus, finding the minimum cost $C(E)$ in (3.13) is equivalent to finding an assignment of raw data and aggregating nodes which minimizes $\sum_{m \in E} g(m)$ under the constraint that $\bigcup_{m \in E} N_{1,m}$ provides a set cover for $G$. This can be formulated as a minimum weighted set cover (MWSC) problem, described below.

2 A dominating set for a graph $G = (V, E)$ is a subset $E$ of $V$ such that every vertex not in $E$ is joined to at least one member of $E$ by some edge.

Definition 1 (Minimum Weight Set Cover Problem). For a graph $G = (V, E)$, denote the closed neighborhood $n_{[v]} = \{v\} \cup \{u \in V : (v,u) \in E\}$ as a disk centered at node $v$, with corresponding weight $g(v)$, for all nodes $v \in V$. Given the collection $N$ of all sets $\{n_{[v]}\}$, a set cover $C \subseteq N$ is a sub-collection of the sets whose union is $V$. The MWSC problem is to find a set cover of minimum total weight.

The set-covering problem for unweighted undirected graphs is NP-hard in general. However, it can be approximately solved by a natural greedy algorithm that iteratively adds the set covering the largest number of yet-uncovered elements. This provides a good approximation [4] and can be implemented in a distributed way. The algorithm is the same for directed graphs, with the exception that the sets whose central node has the highest out-degree are added to the cover first. The algorithm for choosing a greedy set cover in a graph is given in Algorithm 1.

Algorithm 1 Greedy Minimum Set Cover
Require: $N = \{n_{[v]}\}_{v \in V}$
1: Initialize $C = \emptyset$. Define $f(C) = |\bigcup_{n_{[v]} \in C} n_{[v]}|$
2: repeat
3:   Choose $v_j \in V$ maximizing the difference $f(C \cup \{n_{[v_j]}\}) - f(C)$
4:   Let $C \leftarrow C \cup \{n_{[v_j]}\}$
5: until $f(C) = f(N)$
6: return $C$

However, this unweighted set cover does not take into account the fact that, even if the selected even nodes are few in number, their total cost $\sum_{m \in E} g(m)$ may still be very high. To avoid this, we use a minimum weighted set cover formulation. In the weighted set-covering problem, for each set $n_{[v]} \in N$ a weight $w_v \geq 0$ is also specified, and the goal is to find a set cover $C$ of minimum total weight. In the context of our problem, the weight is $w_v = g(v)$ for node $v$. The greedy algorithm for weighted set cover builds a cover by repeatedly choosing a set $n_{[v]} \in N$ that minimizes the weight $w_v$ divided by the number of elements of $n_{[v]}$ not yet covered by the chosen sets. The algorithm for choosing a greedy weighted set cover is given in Algorithm 2.

Algorithm 2 Greedy Minimum Weight Set Cover
Require: $N = \{n_{[v]}\}_{v \in V}$, $W = \{w_v\}_{v \in V}$
1: Initialize $C = \emptyset$. Define $f(C) = |\bigcup_{n_{[v]} \in C} n_{[v]}|$
2: repeat
3:   Choose $v_j \in V$ minimizing the cost per element $w_{v_j} / [f(C \cup \{n_{[v_j]}\}) - f(C)]$
4:   Let $C \leftarrow C \cup \{n_{[v_j]}\}$
5: until $f(C) = f(N)$
6: return $C$
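A minimal NumPy sketch of the greedy weighted set cover in Algorithm 2; the graph, the per-node costs, and the function name are made up for illustration.

```python
import numpy as np

def greedy_weighted_set_cover(A, g):
    """Sketch of Algorithm 2: greedy minimum-weight set cover over closed
    1-hop neighborhoods.  A is a symmetric 0/1 adjacency matrix and g[v] is
    the per-bit routing cost of node v (used as the set weight w_v).
    Returns the list of selected 'raw' (even) nodes."""
    n = A.shape[0]
    disks = (A > 0) | np.eye(n, dtype=bool)   # closed neighborhoods n_[v]
    covered = np.zeros(n, dtype=bool)
    cover = []
    while not covered.all():
        best_v, best_ratio = None, np.inf
        for v in range(n):
            newly = np.count_nonzero(disks[v] & ~covered)
            if newly == 0:
                continue
            ratio = g[v] / newly              # cost per newly covered element
            if ratio < best_ratio:
                best_v, best_ratio = v, ratio
        cover.append(best_v)
        covered |= disks[best_v]
    return cover

# Toy usage: random communication graph, random hop-count-like costs.
rng = np.random.default_rng(0)
A = (rng.random((20, 20)) < 0.2).astype(int)
A = np.triu(A, 1); A = A + A.T
g = rng.integers(1, 6, size=20).astype(float)
raw_nodes = greedy_weighted_set_cover(A, g)
print("raw (even) nodes:", raw_nodes, " total cost:", g[raw_nodes].sum())
```

Replacing the ratio by $1/\text{newly}$ (i.e., ignoring $g$) recovers the unweighted greedy cover of Algorithm 1.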
3.3.1 Example: data gathering in WSN

In this section, we discuss an application of lifting transforms in which we optimize spatial compression in wireless sensor networks (WSN) over arbitrary communication graphs. Since the nodes of a wireless sensor network are severely energy-constrained devices, it is essential to perform in-network compression for energy-efficient data gathering. The data gathering problem in the single-sink case is described as follows: suppose we have a network of $N$ nodes and a sink node (indexed by $N+1$), and suppose that each node has some data $x(n)$ that it needs to forward to the sink. Assume that each node (indexed by $n \in I = \{1, 2, \dots, N\}$) transmits using a radio range of $R_n$, and let $G = (V, E)$ be the directed communication graph which results from these choices of radio ranges. Let $N_{R_n}(n)$ denote the set of nodes within radio range $R_n$ of node $n$, i.e., $N_{R_n}(n)$ is the set of all nodes that can hear transmissions from $n$. Furthermore, let $T = (V, E')$ denote a routing tree rooted at the sink node and let $g(n)$ denote the cost to route a single bit from node $n$ to the sink along $T$. The cost $g(n)$ could, for example, be proportional to the number of hops, or to the sum of the squared distances between all nodes along the path from $n$ to the sink, etc. Such a tree can be constructed using standard routing protocols such as the Collection Tree Protocol (CTP) [42], in which case the cost $g(n)$ is simply proportional to the number of hops to the sink. The objective of data gathering is then to transmit data from all nodes to the sink with minimum transmission cost.

For the data-gathering application, lifting transforms implemented on the routing tree $T$ have been shown to perform better than other existing approaches [38]. Here, we extend these lifting transforms to the overall communication graph, using the optimal RANA strategy described above. For this, we compute the even and odd node sets $E$ and $O$, respectively, using the greedy set cover algorithms described in Algorithms 1 and 2. Given such sets, let the set of even neighbors that odd node $n$ overhears be given by $H_n$, i.e., $H_n = \{m \in E \mid n \in N_{R_m}(m)\}$. Then node $n$ can compute a prediction of its own data using data from the nodes in $H_n$, producing a detail coefficient $d(n)$, i.e.,

$d(n) = x(n) - \sum_{m \in H_n} a_n(m)\, x(m)$   (3.14)

Note that if the prediction $\sum_{m \in H_n} a_n(m) x(m)$ is close to $x(n)$, then $d(n)$ will have small magnitude and so can be encoded using fewer bits than would be needed for the raw data $x(n)$. This ultimately reduces the cost for the odd nodes, since they transmit fewer bits per coefficient. An update step could also be computed for the data from each even node to produce smooth coefficients, but the number of bits needed to encode smooth coefficients is typically the same as the number of bits needed for raw data. Therefore, we do not use an update step in this work. Distributed computation of this transform proceeds as follows. Since odd nodes predict their data from even neighbors, it is only natural for the even nodes to transmit data first along $T$. We assume that these raw data transmissions are broadcast and that they are forwarded all the way to the sink. In this way, odd nodes can utilize data received from all even neighbors, regardless of whether they are required to forward data from them. Odd nodes then compute detail coefficients, encode them, and transmit them to the sink along $T$. Since the lifting transform is computed as the data flows towards the sink, it is termed a unidirectional lifting transform.
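A small sketch of the prediction step in (3.14), assuming uniform averaging weights $a_n(m) = 1/|H_n|$ (one of several prediction designs mentioned earlier); the node indices and overheard-neighbor sets are invented for the example.

```python
import numpy as np

def detail_coefficients(x, H, weights=None):
    """Sketch of (3.14): each odd (aggregating) node n predicts its sample
    from the even neighbors it overhears (H[n]) and keeps the residual.
    H maps odd-node index -> list of even-node indices; by default the
    prediction weights a_n(m) are uniform averages (an assumption)."""
    d = {}
    for n, nbrs in H.items():
        a = weights[n] if weights is not None else np.full(len(nbrs), 1.0 / len(nbrs))
        d[n] = x[n] - np.dot(a, x[nbrs])
    return d

# Toy usage: nodes 0-3 are even (raw), nodes 4-6 are odd (aggregating).
x = np.array([10.0, 10.5, 9.8, 10.2, 10.1, 9.9, 10.3])
H = {4: [0, 1], 5: [1, 2, 3], 6: [0, 3]}
print(detail_coefficients(x, H))
```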
We now compare the unidirectional transforms with graph-based splits presented here against the transform with a tree-based split (i.e., a 1-level transform) [38], and against an extension of this transform in which odd nodes perform additional levels of decomposition on the data received from their even children (i.e., a multi-level transform). This multi-level transform is constructed in the same manner as the multi-level transform proposed in [41]. For all transforms, we use the data-adaptive prediction filter design in [36]. Figure 3.7 compares the number of raw data transmissions required by a Haar-like lifting scheme and by our proposed scheme, for networks of different sizes. It is clear that our proposed method leads to a significant reduction in raw data transmissions. Assuming a nearly uniform deployment of sensors, the distances between nodes are roughly equal. Hence, the reduction in the number of raw transmissions is directly proportional to the reduction in transmission costs, as shown in Fig. 3.6. Further, in Fig. 3.6 the cost of raw data transmissions for the weighted set cover based split is lower than for the unweighted set cover based split. This is to be expected, since the even nodes selected by the weighted set cover algorithm have lower costs for transmitting data to the sink.

We use an AR-2 model to generate (noise-free) simulation data with high spatial correlation, i.e., the amount of correlation between two nodes increases as the distance between them decreases. We also assume that raw measurements use 12 bits. A randomly generated 50-node network is used, and a shortest path routing tree (SPT) is computed and used for routing. The SPT is shown in Fig. 3.8(a), along with the even/odd splitting described in [38]. We use the transmission schedule discussed in [41]. The structure of the graph-based transform presented here is shown in Fig. 3.8(b). Note that although Figs. 3.8(a) and 3.8(b) have the same underlying routing structure (solid blue lines), the number of required even nodes is smaller for the graph-based transform than for the tree-based transform. This leads to a reduction in raw-data transmission costs. Performance comparisons are shown in Fig. 3.9, which plots energy consumption versus reconstruction quality (in terms of signal-to-quantization-noise ratio). Energy consumption is computed using the cost model in [46]. Each point corresponds to a different quantization level, with adaptive arithmetic coding applied to blocks of 50 coefficients at each node. The transform proposed here is the best overall. This is to be expected, since it seeks to minimize the number of nodes that must transmit raw data to their neighbors, thereby reducing the total energy consumed in the data gathering process. The 1-level and multi-level transforms with tree-based splits do outperform simple raw data gathering, and the multi-level transform does better than the 1-level transform since more decorrelation is achieved in the network. However, both of these methods have roughly 50% raw data nodes; hence, they are not as efficient as the two transforms with graph-based splits (which have roughly 25% raw data nodes). To see this distinction more clearly, consider the lossless coding numbers for this same network shown in Fig. 3.6. The cost for raw data forwarding and the total cost are shown separately. As we can see, the overall performance is greatly affected by the raw data forwarding cost, in that lower raw data forwarding leads to lower total cost.
In particular, the methods proposed here have the lowest raw data forwarding cost and, hence, also the lowest overall cost. Our proposed transform can be easily applied to any arbitrary WSN, since it is computed as the data is routed towards the sink. The schedule of computation and the even-odd assignment of the nodes can be preloaded into the sensors at initialization. This transform design can be seen as a precursor to a new class of algorithms focusing on minimizing raw data transmissions in a WSN by jointly optimizing the routing tree and the even/odd partition (or raw/aggregating node partition).

3.4 Summary

In this chapter, we proposed a novel lifting-based wavelet transform for graphs. We extended the lifting transforms proposed in [38] for routing trees to arbitrary graphs. For this, we formulated the even-odd assignment problem in these transforms as a bipartite subgraph approximation problem. The definition of the optimal bipartite graph depends on the application. In particular, we proposed two solutions: one based on a maximum bipartite subgraph approximation, and the second based on finding a bipartite graph with a dominating set as one of its natural partitions. For the former, we discussed a toy example of graph denoising, where lifting transforms using the proposed even-odd assignment are useful. For the latter, we discussed a data-gathering application in WSNs, where the proposed even-odd assignment leads to a low number of raw data transmissions. These lifting transforms provide a new way of applying signal processing tools to graph-based data. In the next chapter, we introduce the theory behind downsampling-upsampling operations on graphs. This will lead to the design of spectral wavelet filterbanks on graphs in Chapter 5.

Figure 3.3: (a) Similarity graph with 200 sampled points from the underlying distribution. The nodes in the shaded region are drawn from $\mathcal{N}(\mu_1, \sigma^2)$ and the nodes in the white region from $\mathcal{N}(\mu_2, \sigma^2)$. (b)-(f) Voronoi plots.

Figure 3.4: STD of the original and denoised samples.

Figure 3.5: PSNR of the original and denoised samples.

Figure 3.6: Cost comparison of different lifting schemes, showing raw data cost and total cost (1: Haar-like lifting transform with the first level of even/odd split on trees; 2: with 3 levels of even/odd split on trees; 3: proposed unweighted set cover based even/odd split on the graph; 4: proposed weighted set cover based even/odd split on the graph).

Figure 3.7: Number of raw data transmissions taking place in the transform computations of different lifting schemes (Haar-like vs. graph-based), plotted against the size of the network in number of nodes. The numbers are averages over Ns = 10 realizations of graphs of each size.

Figure 3.8: Transform definition on the SPT and on the graph: (a) transform on the SPT; (b) transform on the graph. Circles denote even nodes and x's denote odd nodes. The sink is shown in the center as a square. Solid lines represent forwarding links; dashed lines denote broadcast links.
Figure 3.9: Performance comparisons (SNR in dB versus total energy consumption in Joules) for the tree-based split (1-level and multi-level), the unweighted and weighted graph-based splits, and raw data gathering.

Chapter 4 Downsampling in Graphs using Spectral Theory

In traditional signal processing applications, downsampling and upsampling operations are an integral part of multirate wavelet filterbanks. An example of downsampling and upsampling of a one-dimensional signal $x[n]$ is by a factor of 2, so that in the resulting signal every other sample is zero. The resulting signal $x_{du}[n]$ can be expressed as:

$x_{du}[n] = \tfrac{1}{2}\left(x[n] + (-1)^n x[n]\right)$   (4.1)

The discrete-time Fourier transform of the resulting signal $x_{du}[n]$ contains the spectrum of the original signal as well as a frequency-shifted copy of it, i.e.,

$X_{du}(e^{j\omega}) = \tfrac{1}{2}\left(X(e^{j\omega}) + X(e^{j(\omega - \pi)})\right)$   (4.2)

which results in aliasing if the signal $X(e^{j\omega})$ and the shifted copy $X(e^{j(\omega-\pi)})$ have overlapping regions of support. This phenomenon is also termed frequency folding, since the frequency components of the original signal appear to fold across the center frequency $\omega = \pi/2$.

As described in Chapter 2, the data on graphs can be defined as graph-signals, which have a spectral interpretation given by the eigenvalues (and eigenvectors) of the graph Laplacian matrix, similar to the Fourier transform for regular signals. The focus of this chapter is to propose methods for downsampling graph-signals by extending downsampling results for regular signals to graphs. In particular, we show that for bipartite graphs, the effect of downsampling followed by upsampling can be seen as being analogous to the well-known aliasing: the spectral representation of the resulting signal is the sum of the spectrum of the original signal and that of a signal obtained by folding the frequency information around the middle frequency (where we work with a spectral representation based on the graph Laplacian). Although general graphs do not have this form, having a formal approach to analyze downsampling in the bipartite graph case provides a tool to address general graphs, by decomposing them into a series of bipartite graphs. This leads to a "multi-dimensional" decomposition of graphs, in which each "dimension" refers to filtering/downsampling operations restricted to only one bipartite subgraph. The bipartite subgraph decomposition is discussed in Chapter 6. We choose 2-D images as one example that can be represented as bipartite graphs and for which downsampling/upsampling has a known interpretation. Images can be represented as 4-regular graphs with either rectangular or diamond connectivity. We formulate the problem of downsampling images as bipartizing the underlying graph, and show that common downsampling methods such as rectangular and quincunx sampling can also be understood in terms of the spectral properties of these graphs. Then, we design new downsampling methods for images by way of different representations of images as bipartite graphs.

This chapter is organized as follows: in Section 4.1, we formulate the problem of downsampling graphs. In Section 4.2, we discuss downsampling results for k-regular bipartite graphs (k-RBG), and in Section 4.3 we extend them to non-regular bipartite graphs. In Section 4.4 we approximate 2-D images as bipartite graphs, and compare the spectral properties of DU operations on these bipartite graphs with the standard DU operations. Finally, we summarize the findings of this chapter in Section 4.5.
4.1 Problem Formulation

The general formulation of downsampling is described in Section 2.3. In this formulation, we define a downsampling function $\beta_H$ on the graph $G = (V, E)$ as choosing a subset $H \subset V$ such that all samples of the graph-signal $f$ corresponding to indices not in $H$ are discarded. A subsequent upsampling operation with $\beta_H$ projects the downsampled signal back to the original $\mathbb{R}^N$ space by inserting zeros in place of the discarded samples in $L = H^c$. A block diagram of the DU operation is given in Figure 4.1.

Figure 4.1: Block diagram of DU operations on graphs.

Referring again to (2.11), the GFT coefficient $\hat f_{du}(l)$ of a graph signal after the DU operation with $\beta_H$ consists of the GFT coefficient $\hat f(l)$ of the original signal and a modulated spectral coefficient $\hat f_d(l)$, which can be interpreted as the projection of $f$ onto a modulated basis function $J_{\beta_H} u_l$. A similar modulation also occurs during DU operations on finite-length 1-D signals, which can be represented in terms of their Discrete Fourier Transform (DFT) basis, given as $W_N^k(n) = \exp(2\pi j k n / N)$. Comparing (2.8) and (4.1), we can write a downsampling function for finite-length signals as $\beta_{1D}(n) = 1$ if the index $n$ is even and $-1$ otherwise. The basis functions in this case have the property:

$\beta_{1D}(n)\, W_N^k(n) = (-1)^n W_N^k(n) = W_N^{(k - N/2)_N}(n)$   (4.3)

where $W_N^{(k-N/2)_N}$ is the $(k - N/2)$ modulo $N$ basis function. In matrix form, (4.3) can be written as:

$J_{\beta_{1D}} W_N^k = W_N^{(k-N/2)_N}$   (4.4)

where $J_{\beta_{1D}} = \mathrm{diag}\{\beta_{1D}(n)\}$ is the downsampling matrix for 1-D signals. Thus, the modulated DFT basis function for finite-length signals is also a DFT basis function, with a different discrete frequency. This phenomenon is known as aliasing in the regular finite-length signal domain. We would like to consider the conditions for this to be true for graphs as well. This is important, as it will allow us to design "anti-aliasing" filters such that the distorted signal $J_{\beta_H} f$ is band-limited, with a spectral response disjoint from that of the original signal (after anti-aliasing has been applied). Further, it will also help us design filterbanks which cancel aliasing and provide distortion-free reconstruction.

For graph-signals, the basis functions are the GFT basis, i.e., the eigenvectors $\{u_k\}$, $k = 1, 2, \dots, N$, of the graph Laplacian matrix. Let us first consider regular graphs, which have the same degree at each node. For regular graphs, the eigenvectors of the Laplacian matrix $L$ and the normalized Laplacian matrix $\mathcal{L}$ are identical. Similar to the finite-length signal example, we want the modulated eigenvector $J_{\beta_H} u_k$ to be an eigenvector of the Laplacian $L$, i.e.,

$L J_{\beta_H} u_k = \hat\lambda\, J_{\beta_H} u_k$   (4.5)

This is true if

$\underbrace{J_{\beta_H} L J_{\beta_H}}_{\hat L}\, u_k = \hat\lambda\, J_{\beta_H}^2 u_k = \hat\lambda\, u_k$   (4.6)

where the matrix $\hat L$ satisfies $\hat L(i,i) = L(i,i)\,\beta_H^2(i) = L(i,i) = d_i$. Thus

$\hat L = D - \underbrace{J_{\beta_H} A J_{\beta_H}}_{\hat A}$   (4.7)

Here $\hat A$ is a modulated adjacency matrix given as:

$\hat A(i,j) = \begin{cases} A(i,j) & \text{if } \beta_H(i)\,\beta_H(j) > 0 \\ -A(i,j) & \text{otherwise} \end{cases}$   (4.8)

Clearly, one way in which the matrices $\hat A$ and $A$ will have the same eigenvectors is if $\hat A = -A$. This implies that $A(i,j) = 0$ whenever $\beta_H(i) = \beta_H(j)$, which is only true if the graph is bipartite with $H$ and $L$ as its bipartite sets. In the next section, we use this property to describe a spectral folding phenomenon in k-regular bipartite graphs (k-RBG), analogous to the well-known aliasing in the regular signal domain. Subsequently, we extend the results to general bipartite graphs by matrix normalization.
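Before moving to graphs, a quick NumPy check of the 1-D folding in (4.1)-(4.4) may be helpful; the signal and its length are arbitrary choices made for the example.

```python
import numpy as np

# Keeping the even samples and zero-filling the odd ones (downsample-then-
# upsample by 2) adds a copy of the DFT spectrum shifted by N/2 samples.
N = 16
rng = np.random.default_rng(0)
x = rng.standard_normal(N)

n = np.arange(N)
x_du = 0.5 * (x + (-1.0) ** n * x)            # equation (4.1)

X = np.fft.fft(x)
X_du = np.fft.fft(x_du)
X_pred = 0.5 * (X + np.roll(X, N // 2))       # 0.5*(X[k] + X[(k - N/2) mod N])
print("max |X_du - prediction| =", np.abs(X_du - X_pred).max())
```

The printed mismatch is at floating-point level, confirming the frequency-folding interpretation that the rest of the chapter carries over to graphs.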
4.2 Downsampling in k-RBG graphs

A bipartite graph $B = (L, H, E)$ is a graph whose vertices can be divided into two disjoint sets $L$ and $H$, such that every link connects a vertex in $L$ to one in $H$. Bipartite graphs are also known as two-colorable graphs, since the vertices can be colored perfectly with two colors so that no two connected vertices have the same color. A k-regular bipartite graph $B$ has the same degree $k$ at each of its vertices (i.e., $D = kI$). The following known results are useful for understanding the spectral properties of DU operations on a k-RBG:

Lemma 1 ([17, Section 2]). The following statements are equivalent for any graph $B$:
1. $B$ is a k-RBG with bipartitions $H$ and $L$.
2. $B$ has an even number of nodes $N = 2n$, with $|L| = |H| = n$ nodes in each partition.
3. The spectrum of the Laplacian matrix $L(B)$ is symmetric about $k$, and the minimum and maximum eigenvalues of $L(B)$ are $0$ and $2k$, respectively.
4. If $u = [u_1^T\ u_2^T]^T$ is an eigenvector of $L(B)$ with eigenvalue $\lambda$, with $u_1$ indexed on $H$ and $u_2$ indexed on $L$ (or vice versa), then the modulated eigenvector $\hat u = [u_1^T\ -u_2^T]^T$ is also an eigenvector of $L(B)$ with eigenvalue $2k - \lambda$.

Thus, given a k-RBG $B = (L, H, E)$, a downsampling function $\beta$ can be defined such that $\beta(n) = 1$ if $n \in L$ and $\beta(n) = -1$ if $n \in H$. This downsampling function has the property that $\hat A = J_\beta A J_\beta = -A$. Thus,

$\hat L = D + A = kI + A, \qquad L = D - A = kI - A$   (4.9)

In this case, both matrices $L$ and $\hat L$ have the same set of eigenvectors. This leads to the following proposition:

Proposition 1. Given a k-regular bipartite graph $B = (L, H, E)$, let the downsampling function be chosen as $\beta = \beta_H$ or $\beta = \beta_L$, and let $L$ be the graph Laplacian matrix. If $u_\lambda$ is an eigenvector of $L$ with eigenvalue $\lambda$, then the modulated eigenvector $J_\beta u_\lambda$ is also an eigenvector of $L$ with eigenvalue $2k - \lambda$.

Proof. By definition, $J_\beta L J_\beta u_\lambda = (kI + A) u_\lambda = (2kI - (kI - A)) u_\lambda = (2k - \lambda) u_\lambda$. Hence $L J_\beta u_\lambda = (2k - \lambda) J_\beta u_\lambda$, and thus $J_\beta u_\lambda$ is an eigenvector of $L$ with eigenvalue $2k - \lambda$.

Proposition 1 implies that if $\lambda$ is a unique¹ eigenvalue of $L$, then

$J_\beta u_\lambda = \pm u_{2k - \lambda}$   (4.10)

where the sign on the right side will change depending upon whether $\beta = \beta_H$ or $\beta = \beta_L$. The result in (4.10) is analogous to the aliasing result for finite-length signals in (4.3).

1 For eigenvalues with algebraic multiplicity greater than 1, the modulated eigenvector $J_\beta u_\lambda$ can be any vector in the $V_{2k-\lambda}$ subspace. A general result dealing with non-unique eigenvalues is described in Proposition 2.

Substituting this into (2.11), we get, for k-regular bipartite graphs:

$\hat f_{du}(\lambda) = \tfrac{1}{2}\left(\hat f(\lambda) + \langle J_\beta u_\lambda, f\rangle\right) = \tfrac{1}{2}\left(\hat f(\lambda) \pm \langle u_{2k-\lambda}, f\rangle\right) = \tfrac{1}{2}\left(\hat f(\lambda) \pm \hat f(2k - \lambda)\right)$   (4.11)

where $\hat f(2k - \lambda)$ is an alias of $\hat f(\lambda)$ in the GFT domain. We term this phenomenon spectral folding in k-RBG, since the spectrum of any signal after DU operations consists of the original spectrum and the spectrum folded across the eigenvalue $\lambda = k$. This leads to the following Nyquist-like sampling theorem for signals defined on a k-RBG:

Theorem 1. A graph-signal $f$ on a k-RBG $B = (L, H, E)$ can be completely described by only half of its samples, in the set $L$ or $H$, if the spectrum of $f$ is band-limited by $\lambda = k$.

Proof. The spectrum of $f$ being band-limited by $\lambda = k$ implies $\hat f(\lambda) = 0$ for $\lambda \geq k$. Since we keep only the samples in the subset $H$ and discard everything else, $f_{du}$ can be written as $[f_H^t\ 0^t]^t$ and, using (4.11), can be expanded as:

$f_{du} = \sum_\lambda \hat f_{du}(\lambda)\, u_\lambda = \frac{1}{2}\sum_\lambda \left(\hat f(\lambda) + \hat f(2k - \lambda)\right) u_\lambda = \frac{1}{2}\sum_\lambda \hat f(\lambda)\, u_\lambda + \frac{1}{2}\sum_\lambda \hat f(2k-\lambda)\, u_\lambda$   (4.12)

where the summation is over all eigenvalues $\lambda$. Discarding the terms with $\hat f(\lambda) = 0$ in (4.12), we get:

$f_{du} = \frac{1}{2}\sum_{\lambda < k} \hat f(\lambda)\, u_\lambda + \frac{1}{2}\sum_{\lambda > k} \hat f(2k - \lambda)\, u_\lambda$   (4.13)

Thus, $\hat f_{du}(\lambda) = 0.5\, \hat f(\lambda)$ for $\lambda < k$, and $\hat f_{du}(\lambda) = 0.5\, \hat f(2k - \lambda)$ for $\lambda > k$.
In order to recover $\hat f(\lambda)$ from $\hat f_{du}(\lambda)$, we define an anti-aliasing filter as:

$H_0^{ideal} = \sum_\lambda h_0^{ideal}(\lambda)\, u_\lambda u_\lambda^t$   (4.14)

where the spectral kernel $h_0^{ideal}$ is given as:

$h_0^{ideal}(\lambda) = \begin{cases} 2 & \text{if } \lambda < k \\ 0 & \text{if } \lambda \geq k \end{cases}$   (4.15)

Using (4.15) and (4.13), we get:

$H_0^{ideal} f_{du} = \sum_{\lambda, \gamma} h_0^{ideal}(\lambda)\, \hat f_{du}(\gamma)\, u_\lambda u_\lambda^t u_\gamma = \sum_\lambda h_0^{ideal}(\lambda)\, \hat f_{du}(\lambda)\, u_\lambda = \frac{1}{2}\sum_{\lambda < k} 2\,\hat f(\lambda)\, u_\lambda + \frac{1}{2}\sum_{\lambda > k} 0 \cdot \hat f(2k - \lambda)\, u_\lambda = \sum_{\lambda < k} \hat f(\lambda)\, u_\lambda = f$   (4.16)

Thus, $f$ can be recovered from $f_{du}$ by applying the anti-aliasing filter $H_0^{ideal}$.

4.3 Extension to non-regular bipartite graphs

The results obtained in the case of k-RBGs cannot be extended to other, non-regular bipartite graphs if we use the combinatorial Laplacian matrix $L$. The reason is that for non-regular graphs the node degrees are not constant, and therefore the adjacency matrix $A$ and the Laplacian matrix $L = D - A$ do not share the same set of eigenvectors. Therefore, for general bipartite graphs, instead of operating on the unnormalized adjacency matrix $A$, we operate on the symmetric normalized adjacency matrix $\mathcal{A} = D^{-1/2} A D^{-1/2}$. The normalization reweighs the edges of the graph $G$ so that the degree of each node is equal to 1. Further, we choose the eigenvectors of the normalized Laplacian matrix $\mathcal{L} = D^{-1/2} L D^{-1/2}$ as the GFT basis functions.²

2 Note that the eigenvectors of the Laplacian matrix $L$ and the normalized Laplacian matrix $\mathcal{L}$ are identical for regular graphs. The normalized Laplacian matrix, popularized by Fan K. Chung [3], is found to be a more natural tool for dealing with non-regular graphs.

To understand the spectral interpretation of DU operations on bipartite graphs, the following properties of bipartite graphs are useful:

Lemma 2 ([3, Lemma 1.8]). The following statements are equivalent for any graph $B$:
1. $B$ is bipartite with bipartitions $H$ and $L$.
2. The spectrum of $\mathcal{L}(B)$ is symmetric about 1, and the minimum and maximum eigenvalues of $\mathcal{L}(B)$ are $0$ and $2$, respectively.
3. If $u = [u_1^T\ u_2^T]^T$ is an eigenvector of $\mathcal{L}$ with eigenvalue $\lambda$, with $u_1$ indexed on $H$ and $u_2$ indexed on $L$ (or vice versa), then the modulated eigenvector $\hat u = [u_1^T\ -u_2^T]^T$ is also an eigenvector of $\mathcal{L}$ with eigenvalue $2 - \lambda$.

Note that the results in Lemma 2 are similar to the results for k-RBGs in Lemma 1. The primary difference between these results is that the Laplacian matrix used for a k-RBG is the combinatorial Laplacian matrix $L$, while for general bipartite graphs we use the symmetric normalized Laplacian matrix $\mathcal{L}$. However, since k-RBG graphs are also bipartite graphs, the results in Lemma 2 are also applicable to them. Therefore, from now on we make use only of the symmetric normalized Laplacian matrix, unless otherwise stated. This enables us to state the following spectral folding result, similar to that for k-RBG graphs:

Proposition 2. Given a bipartite graph $B = (L, H, E)$ with Laplacian matrix $\mathcal{L}$, if we choose the downsampling function $\beta$ as $\beta_H$ or $\beta_L$ as defined in (2.7), and if $P_\lambda$ is the projection matrix corresponding to the eigenspace $V_\lambda$, then

$J_\beta P_\lambda = P_{2-\lambda} J_\beta$   (4.17)

Alternatively, if $u$ is an eigenvector of $\mathcal{L}$ with eigenvalue $\lambda$, then $J_\beta u$ is also an eigenvector of $\mathcal{L}$ with eigenvalue $2 - \lambda$.

Proof. Let $\lambda$ be an eigenvalue of $B$ with multiplicity $k$. This implies that there exists an orthogonal set of $k$ eigenvectors $\{u_{\lambda,i}\}_{i=1}^{k}$ of the Laplacian matrix $\mathcal{L}$ with eigenvalue $\lambda$. The projection matrix corresponding to $\lambda$ is given by $P_\lambda = \sum_{i=1}^{k} u_{\lambda,i}\, u_{\lambda,i}^t$. If the downsampling function is chosen as $\beta_H$ or $\beta_L$, then the modulated eigenvector $\hat u$ in Lemma 2 is equal to $J_\beta u$, which is an eigenvector of $\mathcal{L}$ with eigenvalue $2 - \lambda$. It can also be seen that if the eigenvectors $\{u_{\lambda,i}\}_{i=1}^{k}$ are orthogonal to each other, then so are the modulated eigenvectors $\{J_\beta u_{\lambda,i}\}_{i=1}^{k}$, and they form a basis of the eigenspace $V_{2-\lambda}$. Therefore, $\mathcal{L} J_\beta P_\lambda J_\beta = \sum_{i=1}^{k} \mathcal{L}\, J_\beta u_{\lambda,i} (J_\beta u_{\lambda,i})^t = \sum_{i=1}^{k} (2-\lambda)\, J_\beta u_{\lambda,i} (J_\beta u_{\lambda,i})^t = (2-\lambda) P_{2-\lambda}$; therefore $J_\beta P_\lambda J_\beta = P_{2-\lambda}$, which implies that $J_\beta P_\lambda = P_{2-\lambda} J_\beta$.
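A quick numerical sanity check of this folding property (Lemma 2 / Proposition 2) on a small, non-regular bipartite graph; the node sets and edge weights below are arbitrary choices for illustration.

```python
import numpy as np

# Bipartite sets: L = {0,1,2}, H = {3,4}; arbitrary positive edge weights.
N = 5
A = np.zeros((N, N))
for i, j, w in [(0, 3, 1.0), (0, 4, 2.0), (1, 3, 0.5), (1, 4, 1.0), (2, 4, 1.5)]:
    A[i, j] = A[j, i] = w

d = A.sum(axis=1)
Dis = np.diag(1.0 / np.sqrt(d))
Lnorm = np.eye(N) - Dis @ A @ Dis             # normalized Laplacian
lam, U = np.linalg.eigh(Lnorm)

# Downsampling function beta_H: +1 on H, -1 on L; J = diag(beta_H).
beta = np.array([-1, -1, -1, 1, 1])
J = np.diag(beta)

# For every eigenpair (lambda, u), J u should be an eigenvector of Lnorm
# with eigenvalue 2 - lambda.
for k in range(N):
    u = U[:, k]
    err = np.linalg.norm(Lnorm @ (J @ u) - (2 - lam[k]) * (J @ u))
    print(f"lambda = {lam[k]:.4f},  ||L(Ju) - (2-lambda)(Ju)|| = {err:.2e}")
```

All residuals print at floating-point level, and the eigenvalues come in pairs mirrored around 1, as statement 2 of Lemma 2 predicts.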
This phenomenon is termed spectrum folding in bipartite graphs, as the modulated eigenvector (or eigenspace) for any $\lambda \in \sigma(B)$ appears as another eigenvector (or eigenspace) at a mirror eigenvalue around $\lambda = 1$. As a result, for any graph-signal $f$ on $B$, the modulated signal $J_\beta f$ is an aliased (although frequency-reversed) version of $f$. To understand this, let $f$ be an $N$-dimensional graph-signal on the bipartite graph $B = (L, H, E)$ with eigenspace decomposition

$f = \sum_{\lambda \in \sigma(B)} P_\lambda f = \sum_{\lambda \in \sigma(B)} f_\lambda$   (4.18)

where $f_\lambda = P_\lambda f$ is the projection of $f$ onto the eigenspace $V_\lambda$. Similarly, we define $(J_\beta f)_\lambda$ as the projection of $J_\beta f$ onto the eigenspace $V_\lambda$. Using (4.17), we can write $(J_\beta f)_\lambda$ as:

$(J_\beta f)_\lambda = P_\lambda J_\beta f = J_\beta P_{2-\lambda} f = J_\beta f_{2-\lambda}$   (4.19)

Thus, for bipartite graphs, $(J_\beta f)_\lambda$ is the same as the modulated $f_{2-\lambda}$. Further, using (4.19) and the fact that $J_\beta^2 = I$, we can show that:

$\|(J_\beta f)_\lambda\|_2^2 = \|J_\beta f_{2-\lambda}\|_2^2 = (f_{2-\lambda})^t J_\beta^2 f_{2-\lambda} = \|f_{2-\lambda}\|_2^2$   (4.20)

which implies that the energy of the modulated signal in the eigenspace $V_\lambda$ is equal to the energy of the original signal in the $V_{2-\lambda}$ eigenspace. Both (4.19) and (4.20) show that $J_\beta f$ is an aliased version of $f$. Using (4.19), the eigenspace decomposition of the output signal $f_{du}$ after the DU operation can be written as:

$f_{du} = \frac{1}{2}(f + J_\beta f) = \frac{1}{2}\sum_{\lambda \in \sigma(B)} (f_\lambda + J_\beta f_{2-\lambda}) = \frac{1}{2}(f + f_{alias})$   (4.21)

In other words, the output signal is the average of the original signal and a shifted and aliased version of the original signal.

4.4 Example: Images as k-RBG

In this section, we show some examples of how downsampling applied to the graph representation of images leads to familiar results that could be derived from the standard frequency formulation. Digital images are 2-D regular signals, but they can also be represented as graphs by connecting every pixel (node) in an image with its neighboring pixels (nodes) and by interpreting the pixel values as the values of the graph-signal at each node. Figure 4.2 shows some of the ways in which the pixels of an image can be connected with other pixels to formulate a bipartite graph representation of the image. The bipartite sets $L$ and $H$ on the graphs are shown as nodes of different colors. Thus, the image after downsampling would, in each graph case, consist of the samples of only one color set. The decision to choose a particular graph representation can be compared with choosing various lattice-subsampling patterns [27] in standard image processing. For example, we can choose Figure 4.2(a) to represent the image, which is equivalent to the quincunx downsampling scheme. Similarly, Figures 4.2(c) and 4.2(d) are equivalent to the rectangular downsampling schemes $(V = [2\ 0;\ 0\ 1])$ and $(V = [1\ 0;\ 0\ 2])$, respectively. The advantage of using a graph representation of images is that it provides the flexibility of linking pixels in arbitrary ways, leading to different filtering/downsampling patterns. For example, we can represent images as bipartite graphs by connecting each pixel with its 4 diagonal neighbors. This bipartization scheme is shown in Figure 4.2(b).
In this scheme, the pixels are connected along the diagonal axes, which can be useful for images with high-frequency content in the diagonal directions.

Figure 4.2: Bipartite graph representations of 2-D images: (a) a 4-connected rectangular image-graph $G_r$; (b) a 4-connected diagonal image-graph $G_d$; (c) a 2-connected image-graph $G_v$ with vertical links only; and (d) a 2-connected image-graph $G_h$ with horizontal links only.

Once we choose a graph representation of an image, we can design filtering operations on the graph based on the theory developed in this chapter. In order to compare the spectral interpretation of graph-based filtering operations, we implement ideal low-pass graph filters for different graph representations of the images. Since the image-graphs are bipartite graphs, the ideal spectral lowpass filter $H_0^{ideal}$ on these graphs can be computed as in (4.14). In Figure 4.3, we plot the DFT magnitude responses of the ideal lowpass spectral transforms on these bipartite image-graphs. Because of the regularity and symmetry of the links, the resulting filters at each node are translated versions of each other (except at the boundary nodes), and so we can compute the 2-D DFT magnitude response of a spectral transform by computing the DFT response of the filtering operations at a single node. In particular, we choose the row of the ideal spectral lowpass filter corresponding to a pixel away from the boundary, reshape it into two dimensions as in the original image, and perform a 2-dimensional DFT.

Figure 4.3: DFT magnitude responses of the ideal spectral filters $H_0^{ideal}$ on (a) $G_r$, (b) $G_d$, and (c) $G_v$.

We observe in Figure 4.2(a) that the downsampling pattern (red/blue nodes) on the rectangular subgraph $G_r$ is identical to the quincunx downsampling pattern, and in Figure 4.3(a) it can be observed that the DFT magnitude response of the $H_0^{ideal}$ filter on $G_r$ is the same as the DFT magnitude response of the standard anti-aliasing filter for quincunx downsampling. Similarly, we observe that the spectral low-pass filter for $G_v$ (or $G_h$) in Figure 4.2(c) (or 4.2(d)) has the same DFT magnitude response (Figure 4.3(c) (or Figure 4.3(d))) as the anti-aliasing filter for the vertical (or horizontal) factor-of-2 downsampling case. The graph formulation of images also allows us to explore new downsampling patterns; for example, the image pixels can be connected to their diagonally opposite neighbors, as shown in Figure 4.2(b). The DFT magnitude response of the ideal spectral low-pass filter in this case is shown in Figure 4.3(b) and has a wider passband in the diagonal directions.
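As an illustration, the following sketch builds the 4-connected rectangular image-graph of Figure 4.2(a) for a tiny image and forms an ideal low-pass filter in the spirit of (4.14)-(4.15). Here we use the normalized Laplacian and the mid-spectrum cutoff $\lambda = 1$, per the non-regular extension of Section 4.3 (the image-graph is not exactly 4-regular at its boundary); the image size and test pattern are arbitrary.

```python
import numpy as np

rows, cols = 8, 8
N = rows * cols
idx = lambda r, c: r * cols + c

# 4-connected rectangular image graph G_r (Figure 4.2(a))
A = np.zeros((N, N))
for r in range(rows):
    for c in range(cols):
        if r + 1 < rows:
            A[idx(r, c), idx(r + 1, c)] = A[idx(r + 1, c), idx(r, c)] = 1.0
        if c + 1 < cols:
            A[idx(r, c), idx(r, c + 1)] = A[idx(r, c + 1), idx(r, c)] = 1.0

d = A.sum(axis=1)
Dis = np.diag(1.0 / np.sqrt(d))
Lnorm = np.eye(N) - Dis @ A @ Dis
lam, U = np.linalg.eigh(Lnorm)

# Ideal low-pass spectral filter: kernel 2 below the cutoff, 0 above it
h0 = np.where(lam < 1.0, 2.0, 0.0)
H0_ideal = U @ np.diag(h0) @ U.T

# Filter a toy image and keep the quincunx subset (even row+column parity)
f = (np.indices((rows, cols)).sum(axis=0) % 4).reshape(-1).astype(float)
f_low = H0_ideal @ f
keep = ((np.indices((rows, cols)).sum(axis=0) % 2) == 0).reshape(-1)
print("retained samples after downsampling:", int(keep.sum()), "of", N)
print("first low-pass coefficients on kept pixels:", np.round(f_low[keep][:4], 3))
```

Reshaping one interior row of `H0_ideal` back to an 8x8 block and taking its 2-D DFT reproduces, up to boundary effects, the quincunx anti-aliasing response discussed above.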
In this chapter, we utilize this property of bipartite graphs to propose designs of critically sampled two-channel wavelet lterbanks for analyzing functions dened on the vertices of any arbitrary nite weighted undirected graph. A general framework for the design of two channel critically sampled lterbanks on graphs was in- troduced in Section 2.4. In this chapter, we expand upon this framework for the special case of bipartite graphs. We specically design wavelet lters based on the spectral decomposition of the graph, which helps us translate the aliasing-cancellation, and perfect reconstruction conditions given in (2.16), in very simple terms. In addition, we state necessary and sucient conditions for orthogonality in these lterbanks. As a practical solution, we propose quadrature mirror lterbanks (referred to as graph-QMF) for bipartite graph which have all the above mentioned properties. While the exact graph-QMF designs satisfy all the above conditions, they are not compactly supported 1 on the graph . In order to design compactly supported graph-QMF trans- forms, we perform a Chebychev polynomial approximation of the exact lters in the spectral domain. This leads to some error in the reconstruction of the signal, and loss of orthogonal- ity. As an alternative, we relax the conditions of orthogonality and design lters with compact support, which satisfy the perfect reconstruction conditions. These biorthogonal designs are of two types: the rst type of lterbanks have 1-hop localized analysis lters. These lterbanks are invertible, and we choose synthesis lters to guarantee perfect reconstruction. The second type of designs, termed as graph-Bior, have both analysis and synthesis lters with compact support. 1 see Section 2.1 for denition of compact support for functions dened on graphs. 53 This design is analogous to the standard Cohen-Daubechies-Feauveau's (CDF) [5] construction to obtain maximally half-band lters. Even though these lterbanks are not orthogonal, we show that they can be designed to nearly preserve energy. In particular, we compute expressions for Riesz bounds of the lterbanks, and choose graph-wavelets with the minimum dierence between upper and lower Riesz bounds. The rest of the chapter is organized as follows: in Section 5.1, we formulate the problem of designing two-channel wavelet lterbank on graphs. In Section 5.2, we describe the lterbank design specic to bipartite graphs and state necessary and sucient condi- tions for these lterbanks to provide perfect-reconstruction, aliasing cancellation and orthogonal basis. In Section 5.3, we describe the proposed graph-QMF solution, which satises all the above conditions. In Section 5.4, we describe the one-hop localized biorthogonal transforms on graphs. In Section 5.5, we revisit the necessary and sucient conditions for perfect reconstruction and orthogonality, and pose them as solving a system of biorthogonal spectral kernels. This leads to the proposed graphBior solution. Finally, we summarize the chapter in Section 5.7. 5.1 Problem Formulation A general formulation of two-channel wavelet lterbanks is described in Section 2.4. A block diagram of the two-channel critically sampled graph wavelet lterbanks is shown in Figure 2.1. For designing wavelet lters on graphs, we exploit similar concepts of spectral decomposition as in [14]. Because of this, it is useful to dene analysis wavelet lters H 0 and H 1 in terms of spectral kernelsh 0 () andh 1 (), respectively. 
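As a small illustration of what "filters defined through spectral kernels" means in practice (formalized in (5.1) below), the following sketch builds a spectral filter from a kernel via the eigendecomposition of the symmetric normalized Laplacian, and confirms that a degree-1 polynomial kernel can equally be applied as $a\mathcal{L} + bI$, i.e., with one-hop operations and no eigendecomposition. The random test graph and the names used are assumptions made for this sketch.

```python
import numpy as np

def normalized_laplacian(A):
    """Symmetric normalized Laplacian of an adjacency matrix A (no isolated nodes)."""
    d = A.sum(axis=1)
    Dinv_sqrt = np.diag(1.0 / np.sqrt(d))
    return np.eye(len(A)) - Dinv_sqrt @ A @ Dinv_sqrt

def spectral_filter(Lnorm, kernel):
    """H = kernel(Lnorm) = U diag(kernel(lambda)) U^T."""
    lam, U = np.linalg.eigh(Lnorm)
    return U @ np.diag(kernel(lam)) @ U.T

# Random undirected test graph, dense enough to avoid isolated nodes.
rng = np.random.default_rng(1)
N = 30
A = np.triu((rng.random((N, N)) < 0.3).astype(float), 1)
A = A + A.T
Lnorm = normalized_laplacian(A)

# A degree-1 (one-hop) kernel h(lambda) = b + a*lambda can be applied directly as
# a*Lnorm + b*I, i.e. with one-hop operations only -- no eigendecomposition needed.
a, b = -0.5, 1.0
H_spectral = spectral_filter(Lnorm, lambda x: b + a * x)
H_one_hop = a * Lnorm + b * np.eye(N)
print(np.allclose(H_spectral, H_one_hop))      # True

# Filtering a graph-signal scales its harmonic components by h(lambda).
f = rng.standard_normal(N)
f_low = H_spectral @ f
```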
In our analysis, we use the normalized form of the Laplacian matrixL = D 1=2 LD 1=2 , which in the case of regular graphs has the same set of eigenvectors as L. The normalization reweighs the edges of graph G so that the degree of each node is equal to 1. Thus, given the eigen-space decomposition of Laplacian matrixL as in (2.5), the analysis lters can be represented as: H 0 =h 0 (L) = X 2(G) h 0 ()P H 1 =h 1 (L) = X 2(G) h 1 ()P (5.1) Since the Laplacian matrixL is real and symmetric, the lters designed in (5.1) are also real and symmetric. These lters have the following interpretation: the output of a spectral lter 54 with kernel h() can be expanded as: f H = Hf = P 2(G) h i () P f, where f = P f is the component of input signal f in the -eigenspace. Thus, lter H either attenuates or enhances dierent harmonic components of input signals depending upon the magnitude of h(). For a general kernel function, the ltering operations with H i and G i require spectral decomposition of the Laplacian matrix, which is non-scalable and computationally expensive. However it has been shown in [14] that when the spectral kernels are approximated as polynomial kernels of degree k, the lters can be computed iteratively with k one-hop operations at each node. Thus, the computational complexity of the ltering operations reduces toO(kjEj), which increases linearly with the number of edgesjEj of the graph, and the degree k of polynomial approximations. Further, any spectral transform corresponding to a k degree polynomial kernel is exactly k-hop spatially localized , and can be eciently computed without diagonalizing the Laplacian matrix. The degree of the polynomial kernels can be interpreted as the length of the corresponding spectral lters, and one of the desirable feature of graph wavelets is to have shorter lter lengths. Referring again to Figure 2.1, for graph G = (L;H; E), let H = be the downsampling function for H 1 lter channel and let L = be the downsampling function for H 0 channel. Thus the nodes in H only retain the output of highpass channel and nodes in L retain the output of the lowpass channel. In our proposed design, we also choose the synthesis lters G 0 and G 1 to be spectral lters with kernels g 0 () and g 1 () respectively 3 . Let f be the graph signal on G, and f = P f is the projection of f onto the eigenspace V . Then by using (5.1), the perfect reconstruction conditions in (2.16) can be rewritten as: T eq f = (G 0 H 0 + G 1 H 1 )f = X ; 2(G) (g 0 ()h 0 ( ) +g 1 ()h 1 ( )) P P f = X 2(G) (g 0 ()h 0 () +g 1 ()h 1 ()) P f = X 2(G) (g 0 ()h 0 () +g 1 ()h 1 ()) f (5.2) 3 In general, synthesis lters do not have to be based on the spectral design. A case is presented in our previous work [29] with linear kernel spectral analysis lters and non-spectral synthesis lters. 55 T alias f = (G 1 J H 1 G 0 J H 0 )f = X ; 2(G) (g 1 ()h 1 ( )g 0 ()h 0 ( )) P J P f = X ; 2(G) (g 1 ()h 1 ( )g 0 ()h 0 ( )) P J f (5.3) In (5.2), we use the orthogonality property of eigenspaces, given in (2.6), which reduces the double summation over and in the expansion of T eq , into a single summation over . How- ever, the same cannot be applied in (5.6), where projection f of signal f onto V eigenspace, is modulated with downsampling matrix J before multiplication with P . For an arbitrary graph, modulated signal J f may not entirely belong to a single eigenspace (in fact it can have compo- nents in all eigenspaces). 
This means that for arbitrary graphs alias-free perfect reconstruction can be guaranteed for any graph-signal, if g 1 ()h 1 ( )g 0 ()h 0 ( ) = 0 g 0 ()h 0 () +g 1 ()h 1 ()) = c 2 ; (5.4) for some constant c and for all ; 2(G). Note that the system of equations in (5.4) may not have a solution for every graph G, since there are more constraints (O(N 2 )) than the number of variables (O(N)). However for bipartite graphs, (5.4) can be simplied toO(N) constraints, because of the spectral folding property. This is discussed in the next section. 5.2 Two-Channel Filterbank Conditions for Bipartite Graphs We showed in Chapter 4 that, if the underlying graph is a bipartite graphB = (L;H;E), and if we choose the downsampling function to be either H or L , then J f is an alias of signal f. Using the spectral folding property of bipartite graphs in (4.19), J f can be expressed as: J f = (J f) 2 ; (5.5) 56 where (J f) 2 , is the projection of modulated graph-signal J f, onto eigenspace V 2 . This implies that J f belongs to eigenspaceV 2 for bipartite graphs. This property, when combined with (5.6) and orthogonality property of eigenspaces in (2.6), leads to: T alias f = X ; 2(B) (g 1 ()h 1 ( )g 0 ()h 0 ( )) P J f = X ; 2(B) (g 1 ()h 1 ( )g 0 ()h 0 ( )) P (J f) 2 = X 2(B) (g 1 ()h 1 (2)g 0 ()h 0 (2)) (J f) (5.6) Thus, in case of bipartite graphs, the double summation in the expansion of T alias over and , reduces to a single summation over , as derived in (5.6), and perfect reconstruction in the lterbanks can be guaranteed if g 1 ()h 1 (2)g 0 ()h 0 (2) = 0 g 0 ()h 0 () +g 1 ()h 1 ()) = c 2 ; (5.7) for some constantc and for all2(B). The system of equations in (5.7) can also be represented in matrix form as: 2 6 4 h 0 () h 1 () h 0 (2) h 1 (2) 3 7 5 | {z } Hm() 2 6 4 g 0 () g 1 () 3 7 5 = 2 6 4 c 0 3 7 5; (5.8) and will have atleast one solution for any bipartite graph, assuming full rank of H m () for all 2(B) (i.e.,det(H m ())6= 0 , wheredet(:) is the determinant of a matrix). Before proposing a solution of (5.7), we state necessary and sucient conditions for a two-channel spectral lterbank to be aliasing-cancellation, perfect reconstruction and orthogonal on any bipartite graph. 57 5.2.1 Aliasing cancellation Combining (5.2) and (5.6) we can write the overall transfer function of the two-channel spectral lterbank on any bipartite graph as: y = Tf = X 2(B) (g 0 ()h 0 () +g 1 ()h 1 ()) f + X 2(B) (g 1 ()h 1 (2)g 0 ()h 0 (2)) (J f) (5.9) Taking projections of both LHS and RHS in (5.9) onto V space, we get: P y = y = (g 0 ()h 0 () +g 1 ()h 1 ()) | {z } Teq() f + (g 1 ()h 1 (2)g 0 ()h 0 (2)) | {z } T alias () (J f) : (5.10) Thus, projection y of the output signal for all is a weighted linear sum of projections of input signal f and alias signal J f onto theV eigenspace. Therefore, an alias-free reconstruction using spectral lters is possible if and only if the weight of alias signal in (5.10) is zero for all 2(B), i.e., T alias () =g 0 ()h 0 (2)g 1 ()h 1 (2) = 0: (5.11) 5.2.2 Perfect reconstruction Referring again to (5.5), (J f) = J f 2 . Perfect reconstruction means that the reconstructed signal ^ f is the same as (or possibly a scaled version of) the input signal f. This implies y =c 2 f in (5.9), or equivalently y =c 2 f in (5.10) for some constant c and for all 2(B). 
Therefore, in case of perfect reconstruction (5.10) can be written as: y =c 2 f =T eq ()f +T alias ()J f 2 ; (5.12) or (c 2 T eq ())f =T alias ()J f 2 ; (5.13) 58 Since f and f 2 are mutually orthogonal components of input signal f and hence are independent of each other, the only way (5.13) holds for all signals is, if T eq () =c 2 ; T alias () = 0: (5.14) Thus, a necessary and sucient condition for perfect reconstruction, using spectral lters, in bipartite graphs lterbanks is that for all in (B), g 0 ()h 0 () +g 1 ()h 1 () =c 2 ; g 0 ()h 0 (2)g 1 ()h 1 (2) = 0: (5.15) 5.2.3 Orthogonality The wavelet coecient vector w produced in the lterbank shown in Figure 2.1 is given as: w = T a f = 1 2 ((I J )H 0 f + (I + J )H 1 f) = 1 2 (H 1 + H 0 )f + 1 2 J (H 1 H 0 )f (5.16) Applying (5.1) in (5.16), we get: w = 1 2 X 2(B) (h 1 () +h 0 ())P f + 1 2 X 2(B) (h 1 ()h 0 ())J P f: (5.17) Using (4.17), and changing the variable to 2 in the second summation term in (5.17), we get: w = 1 2 X 2(B) (h 1 () +h 0 ()) | {z } C f + (h 1 (2)h 0 (2)) | {z } D 2 (J f) : (5.18) Taking projections of both LHS and RHS in (5.18) onto V space, we get: w = P w = 1 2 (C f +D 2 (J f) ): (5.19) 59 The lterbank provides an orthogonal decomposition for any graph signal if and only ifjjwjj 2 2 = jjT a fjj 2 2 =jjfjj 2 2 for all f 2 R N . For brevity we simply denote 2-norm of f asjjfjj. Since the eigenspaces ofL are orthogonal, the projections w are orthogonal and the energyjjw jj 2 can be computed as sum of energy of w in each eigenspace, i.e., jjwjj 2 = X 2(B) jjw jj 2 = X 2(B) jj 1 2 (C f +D 2 (J f) )jj 2 : (5.20) With some algebraic manipulationjjw jj 2 in (5.20) can be written as: jjw jj 2 = 1 4 C 2 jjf jj 2 +D 2 2 jj(J f) )jj 2 + 2C D 2 < f (J f) > : (5.21) Using (4.19) and (4.20) in (5.21), we get: jjw jj 2 = 1 4 C 2 jjf jj 2 +D 2 2 jjf 2 jj 2 + 2C D 2 < f J f 2 > : (5.22) It can be seen from (5.22), thatjjw jj 2 only depends on the f and f 2 components of signal f. Similarly,jjw 2 jj 2 also depends on only the f and f 2 components of signal f. Further, f and f 2 are only used to compute w and w 2 . Therefore, for all 2(B), if f = f + f 2 , then w = w + w 2 . Thus, orthogonality of lterbank is guaranteed if for all 2(B): jjw jj 2 +jjw 2 jj 2 =jjf jj 2 +jjf 2 jj 2 (5.23) Using (5.22), and with some algebraic manipulation we can write: jjw jj 2 +jjw 2 jj 2 = 1 4 (C 2 +D 2 )jjf jj 2 + 1 4 (C 2 2 +D 2 2 )jjf 2 jj 2 + 1 2 (C D 2 +C 2 D )< f J f 2 >: (5.24) 60 In (5.24), f and f 2 are orthogonal to each other, and hence are independent of each other. Therefore, for (5.23) to hold true for all 2(B) and for all signals f2R N : C 2 +D 2 = 4 C D 2 +C 2 D = 0: (5.25) Expanding C and D in terms of h 0 () and h 1 (), we get: C 2 +D 2 = 2(h 2 0 () +h 2 1 ()) = 4 C D 2 +C 2 D = h 0 ()h 0 (2)h 1 ()h 1 (2) = 0: (5.26) all . Thus, a necessary and sucient condition for orthogonality in bipartite graph lterbanks using spectral lters is : h 0 ()h 0 (2)h 1 ()h 1 (2) = 0 h 2 0 () +h 2 1 () = 2: (5.27) Note that comparing (5.15) and (5.27), the orthogonality conditions can be obtained from the perfect reconstruction conditions by selectingg 0 () =h 0 () andg 1 () =h 1 (). This is analogous to the case of standard lterbanks and leads to our proposed graph-QMF design as explained in Section 5.3. 5.3 Graph-QMF Filterbanks In this section, we extend the well-known quadrature mirror lter (QMF) solution to the case of bipartite graphs. 
Our proposed solution, termed as graph-QMF, requires the design of a single spectral kernel h 0 (), while the other spectral kernels are chosen as a function of h 0 () as: h 1 () =h 0 (2) g 0 () =h 0 () g 1 () =h 1 () =h 0 (2) (5.28) 61 Proposition 3 (QMF Filters on Graph). For a bipartite graph G = (L;H; E), let a two-channel lterbank be as shown in Figure 2.1 with the downsampling function = H and with spectral ltersfH 0 ; H 1 ; G 0 ; G 1 g corresponding to spectral kernelsfh 0 ();h 1 ();g 0 ();g 1 ()g respectively. Then for any arbitrary choice of kernel h 0 (), the proposed graph-QMF solution cancels aliasing in the lterbank. In addition for solutionh 0 () such thath 0 () 2 +h 0 (2) 2 =c 2 for all2(B) andc6= 0 the lterbank provides perfect reconstruction and an orthogonal decomposition of graph- signals. Proof. Substituting (5.28) into (5.11) leads to g 0 ()h 0 (2)g 1 ()h 1 (2) = 0 and aliasing is indeed canceled. The reconstructed signal ^ x in this case is simply equal to (1=2)T eq x and can be written as: ^ x = 1 2 X 2(B) (h 2 0 () +h 2 0 (2))x (5.29) Thus for (h 2 0 ()+h 2 0 (2)) =c 2 andc6= 0, the reconstructed signal ^ x = c 2 2 x is a scaled version of the original signal. Similarly applying the mirror designh 1 () =h 0 (2) in the conditions (5.27) we geth 0 ()h 0 (2)h 1 ()h 1 (2) = 0 andh 2 0 () +h 2 1 () =c 2 and hence the corresponding analysis side transform T a is orthogonal. 5.3.1 Chebychev polynomial approximation We now consider the design of kernelsh 0 () satisfying the design constraint of Proposition 3, i.e., for which h 2 0 () +h 2 0 (2) = c 2 for all 2 (B). For maximum spectrum splitting in the two channels of the lterbank, the ideal choice of kernelh 0 () would be a lowpass rectangular function on given as: h ideal 0 () = 8 > > > > < > > > > : c if < 1 c= p (2) if = 1 0 if > 1 (5.30) The corresponding ideal lter is given by H ideal 0 = X <1 cP + c p 2 P =1 (5.31) Note that the ideal transform has a non-analytic spectral kernel response with sharp peaks and is therefore a global transform (i.e., the lters operations are not localized). Even analytic solutions 62 of the constraint equation h 2 0 () + h 2 0 (2 ) = c 2 , such as h 0 () = c p 1=2 or h 0 () = c cos(=4), are not compactly supported in the spatial domain. By relaxing the constraints one can obtain compact support solutions at the cost of some small reconstruction error and near-perfect orthogonality. One such solution is the approximation of the desired kernel with a polynomial kernel. We choose polynomial approximations of the desired kernel due to the following localization property for corresponding transforms: Lemma 3 ([14]). Let h 0 () be a polynomial of degree k and letL be the normalized Laplacian matrix for any weighted graphG, then the matrix polynomial H 0 =h 0 (L) is exactlyk-hop localized at each node of G. In other words for any two nodes n and m if m = 2N k (n) then H 0 (n;m) = 0. Further, we choose a minimax polynomial approximation which minimizes the Chebychev norm (worst-case norm) of the reconstruction error since it has been shown in [14] that it also minimizes the upper-bound on the errorjjH ideal H poly jj between ideal and approximated lters. Thus, in order to localize the lters on the graph, we approximate h ideal 0 with the truncated Chebychev polynomials (which are a good approximation of minimax polynomials) of dierent orders. 
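The sketch below illustrates this step: it approximates the ideal kernel of (5.30) (with $c = 1$) by Chebyshev polynomials of increasing degree on the spectral interval $[0, 2]$ and reports the worst-case error. A least-squares Chebyshev fit on a dense grid is used as a simple stand-in for the truncated Chebyshev expansion of [14]; the degrees and the grid are illustrative.

```python
import numpy as np
from numpy.polynomial import Chebyshev

# Ideal low-pass QMF kernel on [0, 2], cf. (5.30), with c = 1:
# 1 below lambda = 1, 1/sqrt(2) at lambda = 1, 0 above.
def h_ideal(lam):
    out = np.where(lam < 1.0, 1.0, 0.0)
    return np.where(np.isclose(lam, 1.0), 1.0 / np.sqrt(2.0), out)

lam_grid = np.linspace(0.0, 2.0, 2001)
target = h_ideal(lam_grid)

for deg in (2, 4, 6, 8, 10):
    # Least-squares Chebyshev fit over the spectral domain [0, 2].
    approx = Chebyshev.fit(lam_grid, target, deg, domain=[0.0, 2.0])
    err = np.max(np.abs(approx(lam_grid) - target))
    print(f"degree {deg:2d}: max |h_ideal - h_poly| = {err:.3f}")
# The worst-case error decays slowly because of the discontinuity at lambda = 1,
# which motivates the smoother Meyer-type kernel discussed next.
```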
However, since $h_0^{ideal}(\lambda)$ is a rectangular function, it places a large fraction of its energy in the truncated part of the polynomial expansion, and as a result the polynomial approximation errors for $h_0^{ideal}(\lambda)$ are high. A possible solution to this problem is to soften the ideal case by finding a smooth function that is low-pass and satisfies the constraint. An analogous construction in regular signal processing is Meyer's wavelet design, which replaces the brick-wall ideal frequency response with a smooth scaling function that satisfies the orthogonality and scaling requirements. By a change of variable from $\omega \in [-1, 1]$ to $\lambda \in [0, 2]$ we can extend Meyer's wavelet construction to the case of bipartite graphs. The construction involves choosing a function $\theta(\lambda)$ such that $\theta(\lambda) = 0$ for $\lambda \leq 0$, $\theta(\lambda) = 1$ for $\lambda \geq 1$, and $\theta(\lambda) + \theta(1-\lambda) = 1$ everywhere. One such function is given as:

$$\theta(\lambda) = \begin{cases} 0 & \text{if } \lambda \leq 0 \\ 3\lambda^2 - 2\lambda^3 & \text{if } 0 \leq \lambda \leq 1 \\ 1 & \text{if } \lambda \geq 1, \end{cases} \qquad (5.32)$$

which leads to a smooth kernel given as:

$$h_0^{Meyer}(\lambda) = \sqrt{\theta\!\left(2 - \tfrac{3\lambda}{2}\right)}. \qquad (5.33)$$

In Figure 5.1(a), we plot the ideal and Meyer wavelet kernels, and in Figures 5.1(b)-(f) we plot the reconstruction errors between the desired kernels and their polynomial approximations of different orders. It can be seen that Meyer's wavelet approximations yield small reconstruction errors as compared to the ideal-filter approximations.

Figure 5.1: (a) Ideal kernel (blue) vs. Meyer's wavelet kernel (red). Meyer's wavelet has a smoother transition at $\lambda = 1$ than the ideal kernel. (b)-(f) Reconstruction error magnitudes between the original kernels and their polynomial approximations of order 2, 4, 6, 8 and 10, respectively: ideal kernel (blue curves) and Meyer kernel (red curves).

Thus, by choosing $h(\lambda)$ as a low-order polynomial approximation of a smooth low-pass function (such as Meyer's wavelet), we obtain near perfect reconstruction QMF wavelet filters on any bipartite graph which are very well localized in the spatial domain. In order to obtain a bound on the reconstruction error due to polynomial approximation, we apply the graph-QMF kernels given in (5.28) to the overall transform in (2.15). We observe that the graph-QMF filterbanks cancel the aliasing term (i.e., the term $T_{alias}$), and the reconstructed signal can be written as:

$$\hat{f} = \sum_{\lambda \in \sigma(\mathcal{B})} \frac{h_0^2(\lambda) + h_1^2(\lambda)}{2}\, \bar{f}(\lambda)\, u_\lambda. \qquad (5.34)$$

Hence, the reconstruction MSE is given as:

$$MSE = \frac{1}{N}\|f - \hat{f}\|_2^2 = \frac{1}{N}\sum_{\lambda \in \sigma(\mathcal{B})}\left(1 - \frac{h_0^2(\lambda) + h_1^2(\lambda)}{2}\right)^2 \bar{f}^2(\lambda) \;\leq\; \frac{1}{N}\underbrace{\max_{\lambda}\left(1 - \frac{h_0^2(\lambda) + h_1^2(\lambda)}{2}\right)^2}_{MSE_{max}} \|f\|_2^2, \qquad (5.35)$$

where $MSE_{max}$ is the square of the maximum deviation of the equivalent kernel response $(h_0^2(\lambda) + h_1^2(\lambda))/2$ from 1. The reconstruction SNR can thus be bounded from below using $MSE_{max}$ as:

$$SNR = 10\log_{10}\frac{\frac{1}{N}\|f\|_2^2}{\frac{1}{N}\|f - \hat{f}\|_2^2} \;\geq\; -20\log_{10}\sqrt{MSE_{max}}. \qquad (5.36)$$

Table 5.1 shows the average SNR obtained using different polynomial approximations of the graph-QMF filterbanks. The average is computed by randomly generating 20 graph-signals on 10 random instances of bipartite graphs, each with 100 nodes. We observe that increasing the degree of the polynomial approximation leads to higher reconstruction SNR, at the cost of increased computational complexity (i.e., longer filters).
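A quick numerical check of the Meyer-type construction in (5.32)-(5.33): the sketch below evaluates the kernel on a grid and verifies the graph-QMF constraint $h_0^2(\lambda) + h_0^2(2-\lambda) = c^2$ of Proposition 3 (here with $c = 1$); the grid resolution and names are illustrative.

```python
import numpy as np

def theta(x):
    """Transition function of (5.32): 0 for x <= 0, 3x^2 - 2x^3 on [0, 1], 1 for x >= 1."""
    x = np.clip(x, 0.0, 1.0)
    return 3.0 * x**2 - 2.0 * x**3

def h0_meyer(lam):
    """Smooth low-pass QMF kernel of (5.33) on the spectral interval [0, 2]."""
    return np.sqrt(theta(2.0 - 1.5 * lam))

lam = np.linspace(0.0, 2.0, 1001)
h0 = h0_meyer(lam)
h1 = h0_meyer(2.0 - lam)          # graph-QMF high-pass kernel h1(lambda) = h0(2 - lambda)

# QMF design constraint of Proposition 3: h0(lambda)^2 + h0(2 - lambda)^2 = c^2 (c = 1 here),
# which gives perfect reconstruction and an orthogonal decomposition on bipartite graphs.
print(np.allclose(h0**2 + h1**2, 1.0))                         # True
print(np.round(h0_meyer(np.array([0.0, 1.0, 2.0])), 4))        # [1.  0.7071  0.]
```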
In the next section, we relax the condition of orthogonality, and turn to compactly supported designs, which are perfect reconstruction and can be designed with shorter polynomial spectral kernels. 5.4 One-hop Localized Spectral Filterbanks We are interested in lterbanks which can be computed locally at each node (i.e., have com- pact support), critically sampled and invertible. To achieve compact support, instead of nding polynomial approximations of arbitrary kernel functions as described in Section 5.3, we propose designing low-degree polynomial kernels which have localized support, and are invertible. In this approach, we use analysis lters to be based on spectral kernels, similar to the ones dened in (5.1). We specically choose degree-1 polynomial kernels: h 0 () = a 0 +b 0 h 1 () = a 1 +b 1 ; (5.37) in which case the corresponding lters are given as: H 0 = a 0 L +b 0 I H 1 = a 1 L +b 1 I: (5.38) Referring again to (5.16) the analysis wavelet transform T a is given as: 65 T a = 1 2 (H 1 + H 0 ) + 1 2 J (H 1 H 0 ): (5.39) Thus, T a corresponds to choosingm rows of lowpass transform matrix H 0 for nodes inL and Nm rows of highpass transform H 1 for nodes in H. 5.4.1 One-hop localized designs for arbitrary graphs Proposition 4 below states that transform T a is invertible (non-singular) for any partition L[H of vertex set for our choice of degree-1 kernel functions h 0 () and h 1 (). Proposition 4. Let transforms H 0 and H 1 be two transforms on a graphG based on linear spectral kernels h 0 and h 1 as dened above, and T a is the overall analysis transform. LetV =L[H be any partition of vertex set of G. Then for a 0 ;a 1 6= 0 and b 1 =a 1 ;b 0 =a 0 0 with strict inequality for at least one b, the matrix T a is invertible. Proof. Without loss of generality, let us assume T a consists of rst m rows of matrix H 1 and remaining l =Nm rows of matrix H 0 . Represent the Laplacian matrix L in block form as L = 2 6 4 (L 1 ) mm (L 3 ) ml (L 0 3 ) lm (L 2 ) ll 3 7 5: (5.40) Using (5.39), T a can be written as: T a = 2 6 4 a 1 L 1 +b 1 I 1 a 1 L 3 a 0 L 0 3 a 0 L 2 +b 0 I 2 3 7 5 = 2 6 4 a 1 I 1 0 0 a 0 I 2 3 7 5 | {z } A 2 6 4 L 1 + b1 a1 I 1 L 3 L 0 3 L 2 + b0 a0 I 2 3 7 5 | {z } B (5.41) 66 Since matrix A is full-rank fora 1 ;a 0 6= 0. Thereforerank(T a ) =rank(B). However, matrix B is the sum of Laplacian matrix L and matrix E given as: B =L + 2 6 4 (b 1 =a 1 )I 1 0 0 (b 0 =a 0 )I 2 3 7 5 | {z } E (5.42) which is positive semi-denite (forb 1 =a 1 ;b 0 =a 0 0). Therefore B is positive semi-denite. Further for (b 1 = 0, b 0 > 0) the null-spaceN (L) =fx = cD 1=2 1 : c2 Rg of matrix L and null-space N (E) =fx2R N : x(k) = 08k2fm + 1:::Ngg of matrix E are disjoint (8x6= 0). Hence B is positive denite and is therefore full rank. Same is true for the case when (b 0 = 0,b 1 > 0). Hence matrix T a is full-rank and invertible. The advantage of the 1-hop localized design, described above, is that it can be applied to any arbitrary graph and for any partition sets L and H. However, the design is valid only for linear spectral kernelsh 0 andh 1 with the constraints mentioned in Proposition 4, and does not hold for larger lters. 5.4.2 One-hop localized designs for bipartite graphs The constraints required to satisfy invertibility in 1-hop localized lterbanks can be more relaxed, and easy to satisfy if the underlying graph is a bipartite graph. Let us consider the same 1-hop localized spectral lters on bipartite graphs. 
Referring again to (5.8), the lterbanks on a bipartite graphB with a pair of kernels h 0 () and h 1 () are invertible (i.e., have a solution of g 0 () and g 1 ()), if det(H )6= 0 for all 2(B). Using linear spectral kernels given in (5.37) in (5.8), we get: det(H ) =det 0 B @ 2 6 4 a 0 +b 0 a 1 +b 1 a 0 (2)b 0 a 1 (2) +b 1 3 7 5 1 C A (5.43) This can be written as: det(H ) = (a 0 +b 0 )(a 1 + 2a 1 +b 1 ) (a 1 +b 1 )(a 0 2a 0 b 0 ) = 2(a 0 a 1 2 2a 0 a 1 a 1 b 0 a 0 b 1 b 1 b 0 ) = 2a 0 a 1 ( (1 a )) ( (1 + a )) (5.44) 67 where a = r 1 + a 1 b 0 +a 0 b 1 +b 0 b 1 a 0 a 1 (5.45) Thus,det(H ) is a quadratic polynomial in terms of with roots at 1 a and 1 + a . Therefore, the lterbanks are invertible, as long as 1 a and 1 + a are not the eigenvalues ofB 2 . This is a much more relaxed condition than the conditions in Proposition 4, but works only in case of bipartite graphs. Let us take a specic design choice in which a 1 = 1=2;a 0 =1=2;b 1 = 0 and b 0 = 1. The kernels h 0 () and h 1 () in this case, are plotted in Figure 5.2(a). Here h 0 () can be considered a linear approximation of ideal lowpass kernelh ideal 0 in (5.30), andh 1 () =h 0 (2), is the highpass kernel. The design does not satisfy the conditions in Proposition 4, since (b 0 =a 0 )< 0. However for bipartite graphs, applying these values in (5.45), we get: a = r 1 + a 1 b 0 +a 0 b 1 +b 0 b 1 a 0 a 1 = p 1; (5.46) which is imaginary. This implies that det(H ) does not have any real roots, i.e. det(H )6= 0 for all 2 [0 2]. Therefore, the lterbank is always invertible on any bipartite graph. Solving (5.8) by applying the specic parameters we get: h 0 () = 1 2 h 1 () = 2 g 0 () = 1 1 2 2 2 + 2 g 1 () = 1 2 2 2 + 2 (5.47) To summarize this section, we have proposed 1-hop localized lterbanks on graphs which are invertible and critically sampled. The conditions for invertibility on any arbitrary graph are given in Proposition 4. These conditions are quite constrained. On the other hand, we presented analysis of 1-hop localized lterbanks for bipartite graphs and showed that the invertibility conditions in bipartite graphs are much more relaxed, and easy to nd. One example of 1-hop localized lterbanks is proposed in (5.47), which is perfect reconstruction on any bipartite graphs and for all graph-signals. However, the synthesis kernels in the proposed design are not polynomials, 2 As stated in Lemma 2, the eigenvalues of a bipartite graph occur symmetrically around = 1, i.e., if 1 +a is an eigenvalue ofB then 1a is also an eigenvalue ofB. 68 (a) analysis kernels (b) synthesis kernels Figure 5.2: Proposed 1-hop spectral kernels for bipartite graphs. which implies that they do not have compact support. In the next section, we consider designs where both analysis and synthesis kernels have compact support. 5.5 Graph-Bior Filterbanks We now revisit the perfect reconstruction conditions on bipartite graphs, described in Section 5.2 to design critically sampled perfect reconstruction lterbanks, in which both analysis and synthesis lters are compactly supported. In our proposed designs, both analysis lters H i and synthesis lters G i for i = 0; 1, of the two channels are graph transforms characterized by spectral kernels h i () and g i () for i = 0; 1 respectively. 
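Before moving on, the following sketch makes the one-hop design of this section concrete. It builds the analysis filters $H_0 = I - \mathcal{L}/2$ and $H_1 = \mathcal{L}/2$ purely spatially, obtains the (non-polynomial) synthesis kernels by solving the two-by-two perfect-reconstruction system of (5.7)-(5.8) at every graph eigenvalue, and checks perfect reconstruction on a random bipartite graph. The constant $c^2 = 2$ (which absorbs the factor 1/2 of the DU step), the sign convention for the downsampling matrix, and the test graph are choices made for this sketch, not prescriptions from the thesis.

```python
import numpy as np

rng = np.random.default_rng(2)

# Random bipartite graph: 8 nodes in L, 7 nodes in H, no isolated nodes.
nL, nH = 8, 7
N = nL + nH
Bblk = (rng.random((nL, nH)) < 0.5).astype(float)
Bblk[np.arange(nL), np.arange(nL) % nH] = 1.0           # guarantee a link per node
A = np.zeros((N, N))
A[:nL, nL:] = Bblk
A[nL:, :nL] = Bblk.T

d = A.sum(axis=1)
Lnorm = np.eye(N) - np.diag(d**-0.5) @ A @ np.diag(d**-0.5)
lam, U = np.linalg.eigh(Lnorm)

# One-hop analysis kernels: h0 = 1 - lambda/2 (low-pass), h1 = lambda/2 (high-pass).
H0 = np.eye(N) - 0.5 * Lnorm                            # purely spatial, 1-hop
H1 = 0.5 * Lnorm
h0 = lambda x: 1.0 - 0.5 * x
h1 = lambda x: 0.5 * x

# Synthesis kernels: solve the 2x2 perfect-reconstruction system at every eigenvalue,
# with constant c^2 = 2 to absorb the 1/2 of the downsample-upsample averaging.
g0 = np.empty(N)
g1 = np.empty(N)
for i, lv in enumerate(lam):
    M = np.array([[h0(lv), h1(lv)], [h0(2 - lv), -h1(2 - lv)]])
    g0[i], g1[i] = np.linalg.solve(M, [2.0, 0.0])
G0 = U @ np.diag(g0) @ U.T                              # non-polynomial, hence not compact
G1 = U @ np.diag(g1) @ U.T

# Downsampling: high-pass output kept on H (J = +1 on H, -1 on L), low-pass kept on L.
J = np.diag(np.r_[-np.ones(nL), np.ones(nH)])
T = G0 @ ((np.eye(N) - J) / 2) @ H0 + G1 @ ((np.eye(N) + J) / 2) @ H1
print(np.allclose(T, np.eye(N)))                        # True: perfect reconstruction
```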
The minimum constraints for designing biorthogonal lters are to satisfy the perfect reconstruction conditions given in (5.15), which involves designing four spectral kernels, namely lowpass analysis kernelh 0 (), highpass analysis kernelh 1 (), lowpass synthesis kernel g 0 (), and highpass synthesis kernel g 1 (). If we choose analysis and synthesis high-pass kernels to be: h 1 () = g 0 (2) g 1 () = h 0 (2); (5.48) then, (5.15) reduces to a single constraint for all eigenvalues, given as: h 0 ()g 0 () +h 0 (2)g 0 (2) = 2: (5.49) 69 Further, dene p() =h 0 ()g 0 (), then (5.49) can be written as: p() +p(2) = 2: (5.50) In our approach, we rst design h 0 () and g 0 (), and then h 1 () and g 1 () can be obtained using (5.48). Further, sincep() is the product of two low-pass kernels, it is also a low-pass kernel. Therefore, the objective is to designp() as a polynomial half-band kernel 3 , which satises (5.49), and then obtain kernels h 0 () and h 1 () via spectral factorization. The following result is useful in our analysis: Proposition 5. If h 0 () and g 0 () are polynomial kernels, then any p() = h 0 ()g 0 (), which satises (5.48) for all 2 [0 2], is an odd degree polynomial. Proof. By changing the variable to 1, we can write (5.48) as: p(1 +) +p(1) = 2; (5.51) wherep(1+) =h 0 (1+)g 0 (1+). Ifh 0 () andg 0 () are polynomial kernels, then the functions p(1 +) and p(1) are also polynomials and can be expressed as: p(1 +) = K X k=0 c k () k : p(1) = K X k=0 c k () k : (5.52) Using (5.52) in (5.51), we get: p(1 +) +p(1) = K X k=0 c k (() k + () k ) = 2c 0 + K=2 X k=1 c 2k 2k : (5.53) Thus p(1 +) +p(1) is an even polynomial function of . However, we need to design p() such that p(1 +) +p(1) = 2 for all 2 [1 1], which is true if and only if, c 0 = 1 and all 3 An half band kernelh() in the spectral domain of a graph can be dened as a kernel with h() 1 for 1 (i.e., less than Nyquist frequency.), andh() 0 otherwise. Examples of half-band kernels are ideal spectral kernel in (5.30), and Meyer's kernel in (5.33). 70 other even power coecients c n in the polynomial expansion of p(1 +) are 0. Therefore, the solution p(1 +), expressed as: p(1 +) = 1 + K X n=0 c 2n+1 2n+1 ; (5.54) is an odd degree polynomial. Thus, ignoring the trivial case p(1 +) = 1, the highest degree of p(1 +) (and hence p()) is always odd. 5.5.1 Designing half-band kernel p() While the graph-QMF lters cannot be exact polynomials, there exist non-orthogonal lters that satisfy (5.48), and (5.49). The following known results help us prove the existence of a polynomial p() that satisfy (5.49), and its spectral factorization: Lemma 4 (Bezout's identity [44, prop. 3.13]). Given any two polynomials a() and b(), a()x() +b()y() =c(); (5.55) has a solution [x(); y()], if and only ifgcd(a();b()) dividesc(), wheregcd(a();b()) refers to the greatest common divisor of polynomials a() and b(). Theorem 2 (Complementary Filters [44, prop. 3.13]). Given a polynomial kernel h 0 (), there exists a complementary polynomial kernel g 0 () which satises the perfect reconstruction relation in (5.49), if and only if h 0 (1 +) and h 0 (1) are coprime. Proof. Let us denote a() =h 0 (1 +), b() =h 0 (1), x() =g 0 (1 +), y() =g 0 (1) and c() = 2. Then, (5.49) can be written in the same form as (5.55). Following the result of Lemma 4, given polynomial kernel h 0 (), a polynomial solution of g 0 () exists if and only if gcd(h 0 (1 + );h 0 (1)) dividesc() = 2, which is a prime number. 
This impliesgcd(h 0 (1 +);h 0 (1)) is either 1 or 2 for all2 [1 1], which is true ih 0 (1 +) andh 0 (1) do not have any common roots. This implies that h 0 (1 +) and h 0 (1) are coprime. Corollary 1 ([44, exercise. 3.12]). There is always a complementary lter for the polynomial kernel (1 +) k , i.e., (1 +) k R() + (1) k R() = 2 (5.56) 71 always has a real polynomial solution R() for k 0. Proof. Let us denote a() = (1 +) k , b() = (1) k , x() = R(), y() = R() and c() = 2. Then, (5.56) can be written in the same form as (5.55). Since a() and b(), in this case are coprime, therefore gcd(a();b()) = 1 divides c() = 2. Hence, a polynomial R(), which satises (5.56) always exists. For a perfect reconstruction biorthogonal lterbank, we need to design a polynomial half-band kernel p() that satises (5.49), or equivalently (5.51). Following Daubechies' approach [5], we propose a maximally- at design, in which we assignK roots top() at the lowest eigenvalue (i.e., at = 0). Subsequently, we select p() to be the shortest length polynomial, which has K roots at = 0 and satises (5.51). This implies that p(1 +) has K roots at =1, and can be expanded as: p(1 +) = (1 +) K k X m=0 r m m : | {z } R() (5.57) where R() is the residual k degree polynomial. By Corollary 1, there always exist such a poly- nomialR(). On the other hand, Proposition 5 says that any p(1 +) that satises (5.51) has to be an odd-degree polynomial. Hence, p(1 +) can also be expanded as: p(1 +) = 1 + M X n=0 c 2n+1 2n+1 : (5.58) for a given M. Comparing (5.57) and (5.58), we get: (1 +) K k X m=0 r m m = 1 + M X n=0 c 2n+1 2n+1 : (5.59) Comparing the constant terms in the left and right side of (5.59), we get r 0 = 1. Further, comparing the highest powers on both sides of (5.59) we get: M = K +k 1 2 (5.60) Further, the right side in (5.59) has M constraints c 2n = 0 for n =f1; 2;:::Kg, and the left side 72 in (5.59) has k unknowns r m for m =f1; 2;:::kg. In order to get a unique p(1 +) that satises (5.51), we must have equal number of unknowns and constraints, i.e, M =k = K +k 1 2 ) M =K 1: (5.61) Thus, (5.59) can be written as: (1 +) K (1 + K1 X m=1 r m m ) = 1 + K1 X n=0 c 2n+1 2n+1 ; (5.62) andK 1 unknowns can be found uniquely, by solving a linear system of K 1 equations. Note that given K, the length of p() (i.e, highest degree) is K +M = 2K 1. As an example, we design p() with K = 2 zeros at = 0. In this case p(1 +) can be written as: p(1 +) = (1 +) 2 (1 +r 1 ) = 1 + (r 1 + 2) + (1 + 2r 1 ) 2 +r 1 3 Since p(1 +) is an odd polynomial, the term corresponding to 2 is zero, i.e., 1 + 2r 1 = 0 or r 1 =1=2. Therefore, p(1 +) is given as: p(1 +) = (1 +) 2 (1 1 2 ): (5.63) which implies that: p() = 1 2 2 (3): (5.64) We plot in Figure 5.3 p() for various values of K, and it can be seen that by increasing K, we get better ideal half band lter approximation of p(). Note that for graph-QMF designs h 1 () = h 0 (2), hence p() = h 2 0 () is a perfect square of a polynomial. While graph-QMF designs satisfy constraints given in both (5.48) and (5.49), they cannot be designed as exact polynomials. As proven in Proposition 5, any p() which satises (5.49) is an odd-degree polynomial. Using this result, we can prove the following: Proposition 6. The kernels in the graph-QMF design cannot be exact polynomial functions. Proof. For the graph-QMF solution in (5.28), the design was based on selecting g 0 () = h 0 () 73 Figure 5.3: The spectral distribution of p() with K zeros at = 0 and we can write p QMF () = h 2 0 (). 
Thus, if h 0 () is a polynomial kernel then p QMF () is the square of a polynomial, and therefore should have an even degree. However as proved in Proposition 5 above,p QMF () is an odd degree polynomial and cannot be factored into the square of a polynomial. Therefore,h 0 () in the graph-QMF designs, cannot be an exact polynomial. 5.5.2 Spectral factorization of half-band kernel p() Once we obtain p() by using above mentioned design, we need to factorize it into lter kernels h 0 () andg 0 (). Sincep() is a real polynomial of odd degree, it has at least one real root and all the complex roots occur in conjugate pairs. Since we want the two kernels to be polynomials with real coecients, each complex conjugate root pair of p() should be assigned together to either h 0 () or g 0 (). While any such factorization would lead to perfect reconstruction biorthogonal lterbanks, of particular interest is the design of lterbanks that are as close to orthogonal as possible. For this, we dene a criteria based on energy preservation. In particular, we compute the Riesz bounds of analysis wavelet transform T a , which are the tightest lower and upper bounds, A > 0 and B <1, ofjjT a fjj 2 , for any graph-signal f withjjfjj 2 = 1. For near-orthogonality, we require A B 1. The analysis transform T a can be expressed in terms of transforms H 0 74 and H 1 as in (5.39), and the Riesz bounds can be computed as the square-roots of the extreme eigenvalues of T t a T a . By expanding T t a T a , using (5.39) and (5.1) we obtain: T t a Ta = 1=2 X 2(B) (h 2 0 () +h 2 1 ()) | {z } C P + 1=2 X 2(B) (h1()h1(2)h0()h0(2)) | {z } D J P (5.65) In (5.65), the term D() consists of product terms h 0 ()h 0 (2) and h 1 ()h 1 (2), which are small for away from 1 (since it is the product of a low pass and a high pass kernel), and approximately cancel out each other for close to 1 (see Figure 5.4). Therefore, we can ignore D() in comparison to C(), and (5.65) can be approximately reduced to (5.66). T t a Ta 1=2 X 2(B) (h 2 0 () +h 2 1 ()) | {z } C P (5.66) Thus, T t a T a is a spectral transform with eigenvalues 1=2(h 2 0 () +h 2 1 ()) for 2 (B), and the Riesz Bounds can be given as: A = inf 1 2 (h 2 0 () +h 2 1 ()) B = sup 1 2 (h 2 0 () +h 2 1 ()) (5.67) We choose the =A=B, as the measure of orthogonality (for orthogonal lterbanks = 1). The exact computation of requires all eigenvalues of the graph. In general, the eigenvalues can be computed from the eigen-decomposition of the graph, if the graph is known. However, this incurs additional computational complexity, something we have avoided so far in our designs. Therefore, we compute an approximte as the dierence of lowest and highest values of 1=2(h 2 0 () +h 2 1 ()) at 100 uniformly sampled points from the continuous region [0 2]. We choose lters with least dissimilar lengths, and compute for all such possible factorizations (which are 2K1 K in number). Finally, we choose the factorization with the maximum magnitude . 5.5.3 Nomenclature and design of graph-Bior lterbanks The proposed biorthogonal lterbanks are specied by four parameters (k 0 ; k 1 ; l 0 ; l 1 ), where k 0 is the number of roots of low pass analysis kernel h 0 () at = 0, k 1 is the number of roots of low pass synthesis kernel g 0 () at = 0, l 0 is the highest degree of low pass analysis kernel 75 h 0 (), andl 1 is the highest degree of low pass synthesis kernel g 0 (), respectively. The other two lters, namelyh 1 () andg 1 () can be computed as in (5.48). 
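The following sketch reproduces the design of the half-band kernel as described in (5.57)-(5.62): for a given $K$ it solves the linear system that forces the even-degree coefficients of $p(1+\mu) = (1+\mu)^K R(\mu)$ to vanish, and then verifies the half-band identity $p(\lambda) + p(2-\lambda) = 2$ numerically. The spectral factorization step of this section (keeping conjugate root pairs together and selecting the factorization with the best Riesz-bound ratio) is not implemented; the function name is an assumption.

```python
import numpy as np
from math import comb

def maxflat_halfband(K):
    """Coefficients (lowest degree first, in mu = lambda - 1) of the maximally flat
    half-band kernel p(1 + mu) = (1 + mu)^K * (1 + r_1 mu + ... + r_{K-1} mu^{K-1}),
    with every even-degree coefficient of p(1 + mu), other than the constant 1, set to 0."""
    bin_k = lambda i: comb(K, i) if 0 <= i <= K else 0
    # Constraints: coefficient of mu^(2n) equals zero for n = 1 .. K-1.
    M = np.array([[bin_k(2 * n - m) for m in range(1, K)] for n in range(1, K)], float)
    rhs = -np.array([bin_k(2 * n) for n in range(1, K)], float)
    r = np.linalg.solve(M, rhs) if K > 1 else np.array([])
    R = np.r_[1.0, r]                                   # residual polynomial R(mu)
    binom = np.array([comb(K, j) for j in range(K + 1)], float)
    return np.convolve(binom, R)                        # p(1 + mu), degree 2K - 1

for K in (2, 3, 4):
    q = maxflat_halfband(K)                             # coefficients in mu, lowest first
    mu = np.linspace(-1.0, 1.0, 501)
    p_plus = np.polyval(q[::-1], mu)                    # p(1 + mu) = p(lambda)
    p_minus = np.polyval(q[::-1], -mu)                  # p(1 - mu) = p(2 - lambda)
    print(K, np.allclose(p_plus + p_minus, 2.0))        # True: half-band condition (5.51)

# K = 2 reproduces (5.63)-(5.64): p(1 + mu) = (1 + mu)^2 (1 - mu/2).
print(np.allclose(maxflat_halfband(2), [1.0, 1.5, 0.0, -0.5]))   # True
```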
Given these specications, we design p() =h 0 ()g 0 () as a maximally at half band polynomial kernel with K =k 0 +k 1 number of roots at = 0. As a result, p() turns out to be a 2K 1 degree polynomial, and we factorize it into h 0 () and g 0 (), with least dissimilar lengths (i.e., we choose l 0 = K and l 1 = K 1). We use to be the criteria to compare various possible factorizations, and choose the one with the maximum value of . This leads to a unique design of biothogonal lterbanks. We term our proposed lterbanks as graphBior(k 0 ;k 1 ). We designed graphBior lterbanks for various values of (k 0 ;k 1 ), and we observed that designs withk 0 =k 1 stand out, as they are close to orthogonal and have near- at pass-band responses. The low-pass and high-pass analysis kernels are plotted in Figure 5.4, and their coecients are shown in Table 5.2. A comparison between proposed graph- Bior lterbanks and proposed graph-QMF lterbanks, in terms of perfect reconstruction error (SNR) and orthogonality () is shown in Table 5.1. The reconstruction SNR and orthogonality are computed as an average over 20 instances of randomly generated graph-signals on 10 random bipartite graphs with 100 nodes each. In this table, the lter length of graph-Bior designs is chosen to be the maximum of the two lter lengths (i.e, K). It can be seen from Table 5.1 that all graph-Bior designs provide perfect reconstruction (SNR> 100dB). The graph-QMF lters in comparison are more orthogonal (i.e., closer to 1), but have considerably lower reconstruction SNR. In Table 5.1, the value of for graph-Bior lterbanks seem far o from 1. However, this is primarily because the lowpass and highpass kernels are not symmetric or even equal length. As a result, the basis in the biorthogonal case (i.e., rows of analysis transform T a ) corresponding to dierent kernels have dierent norms, and therefore the signal projected on to dierent basis experiences dierent gains, leading to << 1. In order to see the orthogonality of the basis in a practical case, we empirically compute the mutual coherence M of the transform matrix T a , which is dened as the maximum absolute value of the cross-correlation between the rows of T a , i.e., M = max 1i6=jN j< t ai t aj >j; (5.68) and for orthogonal basis,M = 0. We observe in Table 5.1 that the value of mutual coherence M is very close to 0 for biorthogonal lters, which we compute as the average mutual coherence over 10 instances of random bipartite graphs. This implies that the basis in T a are nearly orthogonal. 76 In order to get a uniform gain over all basis, we need to adjust the gains of lowpass and highpass basis in T a , as done in the standard biorthogonal lterbanks. This is a part of our future work. 
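The mutual coherence of (5.68) is straightforward to compute from the analysis matrix; the sketch below does so for an arbitrary matrix (a random orthogonal stand-in for $T_a$ here), taking the largest absolute inner product between distinct rows exactly as written in (5.68).

```python
import numpy as np

def mutual_coherence(Ta):
    """Mutual coherence of (5.68): the largest absolute inner product
    between two distinct rows of the analysis transform matrix Ta."""
    G = Ta @ Ta.T                           # Gram matrix of the rows
    np.fill_diagonal(G, 0.0)                # exclude the i == j terms
    return np.max(np.abs(G))

# Example with a random stand-in: an orthogonal matrix gives (numerically) zero.
rng = np.random.default_rng(3)
Q, _ = np.linalg.qr(rng.standard_normal((50, 50)))
print(mutual_coherence(Q))                                          # ~1e-16
print(mutual_coherence(Q + 0.01 * rng.standard_normal((50, 50))))   # small but nonzero
```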
Filter length | Graph-QMF SNR (in dB) | Graph-QMF orthogonality | Graph-Bior SNR (in dB) | Graph-Bior orthogonality | Graph-Bior M
4  | 32.2107 | 0.9657 | 300.3649 | 0.5042 | 0.0946
10 | 42.1521 | 0.9856 | 245.691  | 0.5777 | 0.0634
14 | 48.0403 | 0.9902 | 173.5528 | 0.6528 | 0.0518
16 | 44.6852 | 0.9854 | 154.7572 | 0.6802 | 0.0445
18 | 45.1262 | 0.9876 | 123.9701 | 0.6864 | 0.0423
20 | 54.6041 | 0.9963 | 115.0719 | 0.7112 | 0.0378

Table 5.1: Comparison between graph-QMF filterbanks and graph-Bior filterbanks.

graphBior(k_0, k_1) filter coefficients:

k_0 = 6, k_1 = 6:
h_1 = [-0.3864 4.0351 -17.0630 36.5763 -39.8098 17.6477 0 0 0 0 0 0]
h_0 = [0.4352 -4.9802 23.2396 -55.4662 67.2657 -29.0402 -13.0400 7.5253 9.5267 -4.8746 -2.0616 1.2633 1.2071]

k_0 = 7, k_1 = 7:
h_1 = [0.3115 -3.9523 21.0540 -60.3094 98.0605 -85.9222 31.7578 0 0 0 0 0 0 0]
h_0 = [-0.4975 6.8084 -39.6151 126.2423 -234.3683 241.5031 -97.6557 -46.2635 62.1232 -19.3648 -2.0766 6.5886 -4.5632 0.5775 1.5614]

k_0 = 8, k_1 = 8:
h_1 = [-0.3232 4.7284 -29.7443 104.3985 -221.0705 282.7915 -202.6283 62.8477 0 0 0 0 0 0 0 0]
h_0 = [0.4470 -6.9872 47.5460 -183.6940 440.0924 -670.0905 643.3979 -396.0713 209.9824 -154.0976 92.8617 -30.8228 16.6112 -12.7664 3.2403 -0.0284 1.3793]

Table 5.2: Polynomial expansion coefficients (highest degree first) of graphBior($k_0$, $k_1$) filters (approximated to 4 decimal places) on a bipartite graph.

5.6 Filterbank designs using the asymmetric Laplacian matrix

A DC signal on a graph corresponds to a scalar multiple of the eigenvector of the graph Laplacian matrix corresponding to the lowest eigenvalue (i.e., $\lambda = 0$). So far in this chapter, we have used the symmetric normalized Laplacian matrix $\mathcal{L}$ to design spectral filters, in which case a DC vector on a bipartite graph is of the form $f = cD^{1/2}\mathbf{1}$, where $\mathbf{1}$ is the vector with all elements equal to 1 and $D$ is the degree matrix. As described in Section 4.3, the normalization is necessary in order to extend the downsampling results for k-RBG to other non-regular bipartite graphs. Further, the matrix $I - \mathcal{L}$ has the same eigenvalues as the probability transition matrix $D^{-1}A$ of a random walk defined on the graph, and is thus consistent with the stochastic properties of the graph. However, in some applications, such as image processing, a DC signal is defined as an all-constant signal $f = c\mathbf{1}$, and a desired property of the wavelet filters in this case is to have zero response (i.e., all wavelet coefficients equal to zero) for $f = c\mathbf{1}$. In order to make our filterbanks compatible with these applications, we propose designing spectral filters using the asymmetric Laplacian matrix $\mathcal{L}_a$, which is defined as:

$$\mathcal{L}_a = D^{-1}L = D^{-1/2}\mathcal{L}D^{1/2}. \qquad (5.69)$$

Since $\mathbf{1}$ is an eigenvector of $L$ with eigenvalue $\lambda = 0$, it is also an eigenvector of $\mathcal{L}_a$ with eigenvalue $\lambda = 0$. Further, the matrix $\mathcal{L}_a$ is similar to $\mathcal{L}$, and therefore has the same set of eigenvalues as $\mathcal{L}$. The eigenvector $u_{\lambda,a}$ of $\mathcal{L}_a$ is related to the eigenvector $u_\lambda$ of $\mathcal{L}$ as:

$$u_{\lambda,a} = D^{-1/2}u_\lambda. \qquad (5.70)$$

Note that for non-regular graphs $\mathcal{L}_a$ is an asymmetric matrix, and therefore the eigenvectors of $\mathcal{L}_a$ are not orthogonal. The eigenvector decomposition of $\mathcal{L}_a$ is given as:

$$\mathcal{L}_a = (D^{-1/2}U)\,\Lambda\,(D^{-1/2}U)^{-1}. \qquad (5.71)$$

Therefore, similar to (5.1), a spectral filter using $\mathcal{L}_a$, corresponding to a spectral kernel $h(\lambda)$, can be defined as:

$$H_a = h(\mathcal{L}_a) = (D^{-1/2}U)\,h(\Lambda)\,(D^{-1/2}U)^{-1} = D^{-1/2}U\,h(\Lambda)\,U^t D^{1/2} = D^{-1/2}HD^{1/2}, \qquad (5.72)$$

where $H$ is the spectral filter with the same spectral kernel $h(\lambda)$ defined using the normalized Laplacian matrix $\mathcal{L}$. To avoid confusion, we will refer to filters designed using the asymmetric Laplacian matrix simply as asymmetric filters, and to filters designed using the symmetric Laplacian matrix as symmetric filters.
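A minimal sketch of the similarity relation (5.72): it builds a symmetric spectral filter $H$ from a kernel and the corresponding asymmetric filter $H_a = D^{-1/2}HD^{1/2}$, and checks that the all-ones (constant) signal is an eigenvector of $H_a$ with eigenvalue $h(0)$, so that a high-pass kernel with $h(0) = 0$ produces zero wavelet coefficients for constant signals. The connected test graph and the one-hop kernels are illustrative choices.

```python
import numpy as np

# Connected test graph: a cycle with a few extra chords (illustrative).
N = 12
A = np.zeros((N, N))
for i in range(N):                                   # cycle edges
    A[i, (i + 1) % N] = A[(i + 1) % N, i] = 1.0
for i, j in [(0, 5), (2, 9), (4, 11)]:               # a few chords
    A[i, j] = A[j, i] = 1.0

d = A.sum(axis=1)
Dsq = np.diag(np.sqrt(d))
Dsq_inv = np.diag(1.0 / np.sqrt(d))
Lnorm = np.eye(N) - Dsq_inv @ A @ Dsq_inv
lam, U = np.linalg.eigh(Lnorm)

def sym_filter(kernel):
    return U @ np.diag(kernel(lam)) @ U.T            # symmetric filter, cf. (5.1)

def asym_filter(kernel):
    return Dsq_inv @ sym_filter(kernel) @ Dsq        # H_a = D^{-1/2} H D^{1/2}, cf. (5.72)

ones = np.ones(N)
H1a = asym_filter(lambda x: 0.5 * x)                 # high-pass kernel, h(0) = 0
H0a = asym_filter(lambda x: 1.0 - 0.5 * x)           # low-pass kernel, h(0) = 1

# The constant signal is an eigenvector of the asymmetric filters with eigenvalue h(0):
print(np.allclose(H1a @ ones, 0.0))                  # True: zero high-pass response to DC
print(np.allclose(H0a @ ones, ones))                 # True: DC passes with unit gain
```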
According to (5.72) , any asymmetric lters H a is similar to a symmetric lters H with same spectral kernel. However, the advantage of using H a instead of H in a lterbank 78 is that 1 is an eigenvector of H a with eigenvalue h(0). Referring again to Figure 2.1, the overall transfer function of an asymmetric lterbank can be written as: ^ f = 1 2 G 0a (I + J )H 0a f + 1 2 G 1a (I J )H 1a f = 1 2 (G 0a H 0a + G 1a H 1a )f + 1 2 (G 0a J H 0a G 1a J H 1a )f: (5.73) Using the similarity relation given in (5.72), we can simplify (5.73) as: ^ f = 1 2 (D 1=2 G 0 D 1=2 D 1=2 H 0 D 1=2 + D 1=2 G 1 D 1=2 D 1=2 H 1 D 1=2 )f + 1 2 (D 1=2 G 0 D 1=2 J D 1=2 H 0 D 1=2 D 1=2 G 1 D 1=2 J D 1=2 H 1 D 1=2 )f: (5.74) In (5.74), the matrices D 1=2 ; J , and D 1=2 are diagonal matrices and hence commute with each other. Therefore, D 1=2 J D 1=2 = J D 1=2 D 1=2 = J (5.75) Thus, (5.74), can be simplied as: ^ f = 1 2 (D 1=2 G 0 H 0 D 1=2 + D 1=2 G 1 H 1 D 1=2 )f + 1 2 (D 1=2 G 0 J H 0 D 1=2 D 1=2 G 1 J H 1 D 1=2 )f = D 1=2 T eq D 1=2 f + D 1=2 T alias D 1=2 f = D 1=2 (T eq + T alias )D 1=2 f; (5.76) where T eq and T alias correspond to the overall transfer function of symmetric lterbanks, as dened in (5.2) and (5.3), respectively. Thus, the asymmetric lterbanks are equivalent to symmetric lterbanks designed with same spectral kernels, in which the input is normalized with D 1=2 prior to ltering/downsampling operations, and the output is de-normalized with D 1=2 after the ltering/downsampling operations. We use this result to nd out the perfect reconstruction conditions and orthogonality in asymmetric lterbanks. 79 5.6.1 Perfect Reconstruction If T eq +T alias =cI, then ^ f = D 1=2 (cI)D 1=2 f =cf in (5.76), therefore the asymmetric lterbanks are perfect reconstruction if the symmetric lterbanks designed using the same spectral kernels, are perfect reconstruction. As a result, the conditions mentioned in (5.15), are also necessary and sucient conditions for perfect reconstruction in asymmetric lterbanks. 5.6.2 Orthogonality Since the eigenvectors ofL a are not orthogonal, the asymmetric lterbanks using graph-QMF kernels are also not orthogonal. The following analysis explains the frame property of asymmetric lterbanks. Similar to (5.16), the wavelet coecient vector w produced in the asymmetric lterbanks can be written as: w = T aa f = 1 2 (H 1a + H 0a )f + 1 2 J (H 1a H 0a )f = 1 2 D 1=2 (H 1 + H 0 )D 1=2 f + 1 2 D 1=2 J (H 1 H 0 )D 1=2 f = D 1=2 T a D 1=2 (5.77) In (5.77), if we dene f n = D 1=2 f, and w n = D 1=2 w, then (5.77) can be written as w n = T a f n . Thus, if the corresponding symmetric lterbank is orthogonal, i.e., if the spectral kernels satisfy, (5.27), thenjjw n jj =jjf n jj (the 2-norm). However, d min N X i=1 w 2 (i)jjw n jj 2 = N X i=1 d i w 2 (i)d max N X i=1 w 2 (i) d min N X i=1 f 2 (i)jjf n jj 2 = N X i=1 d i f 2 (i)d max N X i=1 f 2 (i); (5.78) where d min is the minimum degree in the graph (1 if there is an isolated node), and d max is the maximum degree. Using (5.78), we obtain: d min jjwjj 2 jjw n jj 2 = jjf n jj 2 d max jjfjj 2 d min jjfjj 2 jjf n jj 2 = jjw n jj 2 d max jjwjj 2 ; (5.79) 80 and d min d max jjfjj 2 jjwjj 2 d max d min jjfjj 2 (5.80) Thus, the asymmetric graph-QMF lterbanks dene a frame in the graph-signal space, with lower bound A = p d min =d max and upper-bound B = p d max =d min . Note that for regular graphs d min = d max , hence A = B = 1, and the asymmetric graph-QMF lterbanks are orthogonal. Similar analysis can be done for asymmetric graphBior lterbanks. 
Thus, we can use either the symmetric normalized or the asymmetric normalized Laplacian matrix to design the spectral filters in any of our proposed filterbank designs. The decision about which filterbank design and which Laplacian matrix to choose depends on the desired properties of the filterbanks. In Table 5.3, we present a comparison of all of our proposed designs. Note that all of these filterbanks are designed for bipartite graphs. The extension of these designs to arbitrary graphs is presented in Chapter 6.

Method | Laplacian matrix | DC | CS | PR | Comp | OE
Graph-QMF (exact) | symmetric (4) | f = cD^{1/2}1 | Yes | Yes | No | Yes
Graph-QMF (exact) | asymmetric (5) | f = c1 | Yes | Yes | No | No
Graph-QMF (approx.) | symmetric (4) | f = cD^{1/2}1 | Yes | No (7) | Yes | No
Graph-QMF (approx.) | asymmetric (5) | f = c1 | Yes | No | Yes | No
One-hop localized | symmetric (4) | f = cD^{1/2}1 | Yes | Yes | Yes (6) | No
One-hop localized | asymmetric (5) | f = c1 | Yes | Yes | Yes (6) | No
Graph-Bior | symmetric (4) | f = cD^{1/2}1 | Yes | Yes | Yes | No
Graph-Bior | asymmetric (5) | f = c1 | Yes | Yes | Yes | No

Table 5.3: Comparison of proposed two-channel filterbank designs on bipartite graphs. DC: subspace corresponding to the lowest eigenvalue, CS: critical sampling, PR: perfect reconstruction, Comp: compact support, OE: orthogonal expansion. (4) Designed using the symmetric Laplacian matrix $\mathcal{L}$. (5) Designed using the asymmetric Laplacian matrix $\mathcal{L}_a$. (6) For the analysis filters only. (7) This reconstruction error can be reduced to arbitrarily small levels by increasing the degree of approximation.

5.7 Summary

In this chapter, we proposed the construction of critically sampled wavelet filterbanks for analyzing graph-signals defined on any undirected weighted bipartite graph. We designed wavelet filters based on spectral techniques, and provided necessary and sufficient conditions for aliasing cancellation, perfect reconstruction, and orthogonality in these filterbanks. As a practical solution, we have proposed a graph-QMF design for bipartite graphs which has all of the above-mentioned features. The filterbanks are, however, realized with Chebychev polynomial approximations, at the cost of a small reconstruction error and a loss of orthogonality. As alternatives to graph-QMF filterbanks, we described two further approaches for constructing two-channel filterbanks on bipartite graphs. One approach is to design spectral two-channel filterbanks in which the analysis filters are based on linear spectral kernels and the synthesis filters are chosen so as to guarantee invertibility. The other approach is to design spectral two-channel filterbanks in which both analysis and synthesis filters are based on polynomial spectral kernels. These filters are not orthogonal, but they have compact support and provide perfect reconstruction. All of these filterbanks are designed to operate on a bipartite graph. In the next chapter, we describe a separable multi-dimensional implementation of these designs on any arbitrary graph via bipartite subgraph decomposition.

Figure 5.4: Spectral responses of graphBior($k_0$, $k_1$) filters on a bipartite graph. In each plot, $h_0(\lambda)$ and $h_1(\lambda)$ are the low-pass and high-pass analysis kernels, and $C(\lambda)$ and $D(\lambda)$ constitute the spectral response of the overall analysis filter $T_a$, as in (5.65). For near-orthogonality, $D(\lambda) \approx 0$ and $C(\lambda) \approx 1$. Finally, $(p(\lambda) + p(2-\lambda))/2$ represents the perfect reconstruction property as in (5.51), and should be constant and equal to 1 for perfect reconstruction.

Chapter 6
Separable Multi-dimensional Wavelet Filterbanks on Graphs

Both the lifting wavelet filterbanks in Chapter 3 and the spectral wavelet filterbanks in Chapter 5 are designed for bipartite graphs.
This is because, bipartite graphs are a natural choice for implementing lifting wavelet lterbanks, and provide easy-to-interpret perfect reconstruction con- ditions for spectral wavelet lterbanks, in terms of simple functions of spectral kernels. However, not all graphs are bipartite. For arbitrary graphs, the results applicable to bipartite graphs can be extended in a variety of ways. One way is to approximateG with a bipartite subgraph ^ G, and im- plement designs proposed in Chapter 3 and Chapter 5 on the approximate graph. This approach results in edge-losses, since the edges between nodes belonging to the same partition are discarded while computing the transform coecients. We refer to this approach as \one-dimensional" high loss implementation. As an alternative, we decompose the graph G into K edge-disjoint bipar- tite subgraphs whose union is G and implement ltering/downsampling operation in K stages, restricting the ltering/downsampling operations in each stage to one bipartite graph. This way, all the edges in the graphs participate in computing the wavelet transform. We refer to this approach as \multi-dimensional" no loss implementation. However, the requirement of using all edges in the graph may sometimes lead to very high-dimensional representation of graph-signals, where most of the edges are contained in a few bipartite subgraphs, and the remaining subgraphs are nearly empty (in terms of edges). Therefore, another alternative is to use a hybrid approach, where we compute K edge-disjoint bipartite subgraphs whose union is G, but discard bipartite subgraphs with very few edges. This way, some edges inG are not used in computing the wavelet transform, but these edges constitute a very small fraction of the total number of edges. We refer 84 to this approach as \multi-dimensional" low-loss implementation. In this chapter, we describe the properties of \multi-dimensional" implementations of proposed two-channel lterbanks. In high-dimensional regular signals, ltering and downsampling is done along the geometrical directions (horizontal, vertical etc) of the underlying regular lattice. The subgraph decomposition in graphs can be interpreted in the same spirit as dening \graph dimen- sions" for ltering and downsampling. Thus, a graph \dimension" can be interpreted as a subset of links ^ EE for traversing the graph, starting at any node, and can also be represented as a sub- graph (V; ^ E) of graph (V;E). Further, two graph-dimensions may be considered \orthogonal", if the graph-lters implemented in these dimensions (i.e, on the corresponding subgraphs), measure non-redundant information. This can be achieved if the sets of nodes discovered, while traversing the graph starting at any node, in two dierent dimensions are mutually disjoint. Therefore, the \dimensionality" of a graph G can be dened as the minimum K, for which the graph can be de- composed intoK subgraphs (V; ^ E p ),p = 1; 2;:::K, such that thek-hop neighborhood setsfN p k (n)g, centered at a noden, corresponding to all subgraphs (V; ^ E p ), are pairwise disjoint for allk, and for all nodes n. Further, since our proposed lterbanks operate only on bipartite graphs, we dene dimensionality in terms of decomposing the graph into K bipartite subgraphs. So, the question that frames the rest of our discussion in this chapter is that of how to nd these orthogonal subgraph decompositions. 
A more relevant question, especially for the multi-dimensional low-loss case, is to nd \good" K-dimensional bipartite subgraph decompositions of a graph for a xed K. In this chapter, we propose a separable downsampling and ltering approach to apply our lterbank design to an arbitrary graph, G = (V; E), where our previously designed two-channel lterbanks are applied in a \cascaded" manner, by ltering along a series of bipartite subgraphs of the original graph. This is illustrated in Figure 6.1. We call this a \separable" approach in analogy to separable transforms for regular multidimensional signals. For example in the case of separable transforms for 2D signals, ltering in one dimension (e.g., row-wise) is followed by ltering of the outputs along the second dimension (column-wise). In our proposed approach, a stage of ltering along one \dimension" corresponds to ltering using only those edges that belong to the corresponding bipartite subgraph. As shown in Figure 6.1, after ltering along one subgraph the results are stored in the vertices, and a new transform is applied to the resulting graph signals following the edges of the next level bipartite subgraph. We study the desired properties 85 Figure 6.1: Block diagram of a 2D Separable two-channel Filter Bank: the graph G is rst decomposed into two bipartite subgraphsB 1 andB 2 , using the proposed decomposition scheme. By constructionB 2 is composed of two disjoint graphsB 2 (L) andB 2 (H), each of which is processed independently, by one of the two lterbanks at the second stage. The 4 sets of output transform coecients, denoted as y HH ; y HL ; y LH and,y LL , are stored at disjoint sets of nodes. of these bipartite subgraph decompositions, and propose metrics to quantitatively measure the separations. Subsequently, we propose greedy heuristic to optimize these metrics and compare the resulting decompositions with other non-optimized schemes. The rest of the chapter is organized as follows: in Section 6.1, we describe our proposed approach for implementing wavelet lterbanks on arbitrary graphs via bipartite subgraph decom- position. In Section 6.2, we discuss the desired properties of bipartite subgraph decomposition in the lterbank design, and dene some metrics to compare various bipartite decompositions based on these properties. In the same section we propose two algorithms, namely Harary's decomposition, and min-cut weighted max-cut (MCWMC) decomposition, to compute bipartite subgraphs. In Section 6.3, we compare proposed algorithms in terms of desired properties. Finally we conclude the chapter in Section 6.4. 6.1 Proposed Design In what follows we will assume thatG has been decomposed into a series ofK bipartite subgraphs B i = (L i ;H i ; E i ), i = 1:::K; how such a decomposition may be obtained will be discussed later. The bipartite subgraphs cover the same vertex set: L i [H i =V, i = 1; 2;:::K. Each edge in G belongs to exactly one E i , i.e., E i \ E j =;, i6=j, S i E i = E. Note that for each bipartition we need to decide both a 2-coloring (H i ;L i ) and an assignment of edges (E i ). In order to guarantee 86 invertibility for structures such as those of Figure 6.1, given the chosen 2-colorings (H i ;L i ), the edge assignment has to be performed iteratively based on the order of the subgraphs. That is, edges for subgraph 1 are chosen rst, then those for subgraph 2 are selected, and so on. 
The basic idea is that at each stage i, all edges between vertices of different colors that have not yet been assigned are included in E_i. More formally, at stage i with sets H_i and L_i, E_i contains all the links in E − ∪_{k=1}^{i−1} E_k that connect vertices in L_i to vertices in H_i. Thus E_1 will contain all edges between H_1 and L_1. Then, we assign to E_2 all the links between nodes in H_2 and L_2 that were not already in E_1. This is also illustrated in Figure 6.2. Note that by construction G_1 = G − B_1 = (V, E − E_1) now contains two disjoint graphs, since all edges between L_1 and H_1 were assigned to E_1. Thus, at the second stage in Figure 6.1, B_2 is composed of two disjoint graphs B_2(L_1) and B_2(H_1), each of which is processed independently by one of the two filterbanks at this second stage. Clearly, this guarantees invertibility of the decomposition of Figure 6.1, since it will be possible to recover the signals in B_2(L_1) and B_2(H_1) from the outputs of the second stage of the decomposition. The same argument applies to decompositions with more than two stages. That is, the output of a two-channel filterbank at level i leads to two subgraphs, one per channel, that are disconnected when considering the remaining edges (E − ∪_{k=1}^{i} E_k). The output of a K-level decomposition leads to 2^K disconnected subgraphs.

Figure 6.2: Example of 2-dimensional separable downsampling on a graph: (a) original graph G, (b) the first bipartite graph B_1 = (L_1, H_1, E_1), containing all the links in G between sets L_1 and H_1, (c) the second bipartite graph B_2 = (L_2, H_2, E_2), containing all the links in G − B_1 between sets L_2 and H_2.

We now derive expressions for the proposed cascaded transform along bipartite subgraphs. Using the K = 2 case as an example, and assuming that the original graph can be decomposed exactly into two bipartite subgraphs as shown in Figure 6.2, we choose β_i = β_{H_i} as the downsampling function for bipartite graph B_i, for i = 1, 2. Further, let J_i denote the downsampling matrices, and H_{i0} and H_{i1} the low-pass and high-pass filters, respectively, for the bipartite graph B_i, for i = 1, 2. Since the vertex sets L_1 and H_1 are disconnected in the bipartite graph B_2, the filtering and downsampling operations on the graphs B_2(L_1) and B_2(H_1) do not interact with each other. Therefore, the graph-filters H_{2j}, for j = 0, 1, on the second bipartite graph B_2 can be represented as block-diagonal matrices with diagonal blocks H_{2j}(H_1, H_1) and H_{2j}(L_1, L_1). As a result, H_{20} and H_{21} commute with the downsampling matrix J_1 of the first bipartite subgraph, i.e.,

H_{2j} J_1 = J_1 H_{2j},   (6.1)

for j = 0, 1.¹ Further, let T_{ai} be the equivalent analysis transform for B_i, for i = 1, 2. The combined analysis transform T_a in the two dimensions can be written as the product of the analysis transforms in each dimension. Using (5.16), we obtain:

T_a = T_{a2} T_{a1} = \prod_{i=1}^{2} \frac{1}{2}\left[ (H_{i1} + H_{i0}) + J_i (H_{i1} - H_{i0}) \right],   (6.2)

Note that for graph-Bior filter designs, lifting wavelet filter designs, and for exact graph-QMF filter designs such as the one with the Meyer kernel in (5.33), T_{ai} is invertible with T_{ai}^{-1} = T_{ai}^t, for i = 1, 2. As a result, T_a is invertible with T_a^{-1} = T_{a1}^t T_{a2}^t.² The transform T_a can be further decomposed into the transforms T_HH, T_HL, T_LH and T_LL corresponding to the four channels in Figure 6.1.

¹ In general, this result applies to any K-dimensional decomposition obtained with the proposed recursive method, since the downsampling matrix J_i commutes with all filter matrices H_{k0} and H_{k1} corresponding to bipartite subgraph B_k, for k > i.
² For polynomial approximations of the Meyer kernels, we incur some reconstruction error in each dimension.
For example, the transform T_HH consists of all the terms in the expansion of T_a in (6.2) containing the filters H_{11} and H_{21}. Thus,

T_{HH} = \frac{1}{4}\left( H_{21}H_{11} + H_{21}J_1 H_{11} + J_2 H_{21}H_{11} + J_2 H_{21}J_1 H_{11} \right),   (6.3)

where (1/4) H_{21} H_{11} is the transform without downsampling, and the remaining terms arise primarily due to the downsampling in the HH channel. Using (6.1), which is a property of our proposed decomposition scheme, in (6.3) we obtain:

T_{HH} = \frac{1}{4}\left( H_{21}H_{11} + J_1 H_{21}H_{11} + J_2 H_{21}H_{11} + J_2 J_1 H_{21}H_{11} \right) = \frac{1}{4}(I + J_2)(I + J_1) H_{21}H_{11}.   (6.4)

Thus, the equivalent transform in each channel of the proposed 2-dimensional separable filterbanks can be interpreted as filtering with a 2-dimensional filter, such as H_{21} H_{11} for the HH channel, followed by DU operations with two downsampling functions β_2(n) and β_1(n) in cascade. It also follows from (6.4) that the output of H_{21} H_{11} in the HH channel is stored only at the nodes in H_1 ∩ H_2. Thus, the output of each channel is stored at mutually disjoint sets of nodes, and each node stores the output of exactly one channel. Therefore, the overall filterbank is critically sampled. Further, if the spectral decompositions of B_1 and B_2 are given as {λ, P^1_λ} and {γ, P^2_γ}, then H_{21} H_{11} consists of a two-dimensional spectral kernel h_{21}(γ) h_{11}(λ) and corresponding eigenspaces P^2_γ P^1_λ. The analysis extends to any dimension K > 2, with K-dimensional graph-frequencies (λ_1, λ_2, ..., λ_K), corresponding eigenspaces P^1_{λ_1}, P^2_{λ_2}, ..., P^K_{λ_K}, and transforms with spectral response \prod_{i=1}^{K} g_i(λ_i).

Note that invertible cascaded transforms can also be constructed even when the edge-selection conditions described above are not followed, e.g., if an edge e_1 between nodes in H_1 and L_1 is not included in E_1. In such a situation, it is possible to perform an invertible cascaded decomposition if e_1 is no longer used in further stages of the decomposition. Thus, we would have an invertible decomposition, but on a graph that approximates the original one (i.e., without considering e_1). Alternatively, it can be shown that it is possible to design invertible transforms with arbitrary E_i selections (i.e., not following the rules set out in this chapter), but these transforms are not necessarily critically sampled. A more detailed study of this case falls outside the scope of this thesis.

6.1.1 Graph after downsampling

The DU operations on graphs only define the node-sets H or L to be retained after downsampling. For bipartite graphs, unlike the case of regular lattices, the resulting downsampled graphs G_L and G_H may be neither identical nor bipartite. Therefore, for the next level of decomposition, we can either operate on a single bipartite graph approximation of G_L, which leads to a one-dimensional two-channel filterbank, or on a multiple bipartite graph approximation, which leads to a multi-dimensional two-channel filterbank implementation on the downsampled graph, using the two-channel designs of Chapter 5. A small numerical sketch of the cascaded two-dimensional transform described above is given below.
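The following sketch is an illustration under simplifying assumptions, not the thesis implementation: it assembles the cascaded analysis transform of (6.2) for K = 2 on a small hand-built graph, uses crude placeholder one-hop filters in place of the graph-QMF/graph-Bior kernels of Chapter 5 (so it is not perfect reconstruction), and verifies that the HH-channel transform of (6.4) stores its output only at the nodes in H_1 ∩ H_2.

```python
import numpy as np

N = 6
L1set, H1set = {0, 2, 4}, {1, 3, 5}      # stage-1 2-coloring (chosen by hand)
L2set, H2set = {0, 1, 2}, {3, 4, 5}      # stage-2 2-coloring

def adjacency(edges):
    A = np.zeros((N, N))
    for u, v in edges:
        A[u, v] = A[v, u] = 1.0
    return A

# B1 takes every edge crossing (L1, H1); B2 takes the leftover edges, which by
# construction stay inside L1 or inside H1 (the edge-assignment rule of Section 6.1).
A1 = adjacency([(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)])
A2 = adjacency([(1, 3), (0, 4)])

def downsampling_matrix(Hset):
    return np.diag([1.0 if n in Hset else -1.0 for n in range(N)])

def toy_filters(A):
    """Placeholder one-hop low/high-pass pair (NOT the PR kernels of Chapter 5)."""
    d = np.maximum(A.sum(axis=1), 1.0)
    An = A / np.sqrt(np.outer(d, d))
    return 0.5 * (np.eye(N) + An), 0.5 * (np.eye(N) - An)

H10, H11 = toy_filters(A1)
H20, H21 = toy_filters(A2)
J1, J2 = downsampling_matrix(H1set), downsampling_matrix(H2set)

def analysis_stage(H0, H1, Jm):          # one stage of eq. (6.2)
    return 0.5 * ((H1 + H0) + Jm @ (H1 - H0))

Ta = analysis_stage(H20, H21, J2) @ analysis_stage(H10, H11, J1)

# eq. (6.4): the HH-channel transform lives only on the rows indexed by H1 ∩ H2
T_HH = 0.25 * (np.eye(N) + J2) @ (np.eye(N) + J1) @ H21 @ H11
hh = sorted(H1set & H2set)
others = [n for n in range(N) if n not in hh]
print("HH coefficients stored at nodes", hh)
print("rows outside H1 ∩ H2 are zero:", np.allclose(T_HH[others], 0.0))
```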
Further, this multiresolution decomposition of graph-signals can be extended to the case of general K-dimensional two-channel filterbanks for any arbitrary graph G, which decompose the signal into 2^K lower-resolution versions, as described in Section 6.1. In this case, the downsampled graph in each channel can be computed by reconnecting two nodes in the downsampled vertex set if they are 2^K hops away in the original graph.

6.2 Bipartite Subgraph Decomposition

So far we have described how to implement separable multi-dimensional two-channel filterbanks on a graph G, given a decomposition of G into K bipartite subgraphs. In particular, we defined a "separable" method of graph decomposition, which leads to a cascaded tree-structured implementation of the multi-dimensional filterbanks. While these multi-dimensional filterbanks can be implemented for any separable bipartite subgraph decomposition of G, the definition of a "good" bipartite decomposition is not clear. In this section, we study the desired properties of these bipartite subgraph decompositions. As described at the beginning of this chapter, the bipartite subgraph decomposition of a graph can be interpreted as decomposing the graph into different "graph-dimensions", where a "graph dimension" refers to a subset of edges in the graph. Further, two bipartite subgraphs are "orthogonal" if the neighborhood sets defined on the two subgraphs at each node are mutually disjoint. However, it can be problematic to strictly impose orthogonality in the decomposition of some graphs (especially dense graphs), where this can lead to the generation of too many bipartite subgraphs, with very few edges in most of these subgraphs. Therefore, the question is: given a fixed K (such as K = 2 in this case), what is a "good" K-dimensional bipartite subgraph decomposition of a graph? The answer we propose is a bipartite subgraph decomposition G = ∪_{p=1}^{K} B_p in which the k-hop neighborhoods N^p_k(n) defined on each subgraph B_p are maximally disjoint for all k and for all n. For simplicity, we restrict our discussion to the 2-dimensional case; the extension to higher-dimensional decompositions is straightforward.

Before finding the solution, we define some metrics which measure the neighborhood separation in bipartite subgraphs. For this, we define the k-hop adjacency matrix A_{i,k} so that A_{i,k}(n, m) represents the number of paths of length up to k from node n to node m in the bipartite subgraph B_i. The diagonal entries of A_{i,k} are set to zero. Using the matrix A_{i,k}, we measure separability in the k-hop neighborhoods of the bipartite subgraphs B_i by computing the correlation between the n-th rows of the adjacency matrices A_{i,k} at each node n. The k-hop neighborhood set correlation NSC(k) between two bipartite subgraphs is the average correlation between the k-hop neighborhoods, defined as:

\mathrm{NSC}(k) = \frac{1}{N} \sum_{n=1}^{N} \frac{\sum_m A_{1,k}(n,m)\, A_{2,k}(n,m)}{\sqrt{\sum_m A_{1,k}(n,m)^2 \; \sum_m A_{2,k}(n,m)^2}}   (6.5)

A low value of NSC implies mutually disjoint neighborhoods in the decomposed bipartite subgraphs (a small sketch of its computation is given below). At a global scale, the eigenvectors of the bipartite subgraph Laplacian matrices, which form the graph-Fourier bases, should also be decorrelated with each other. The correlation between the l-th eigenvectors u_{1,l} and u_{2,l} of the two bipartite subgraphs can be measured by their inner product.
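A minimal sketch of how NSC(k) in (6.5) can be computed from the two subgraph adjacency matrices (numpy arrays on a common vertex set); the SBC and ELF metrics defined next can be computed along the same lines. Skipping nodes with empty k-hop neighborhoods is an implementation choice of this sketch, not specified in the text.

```python
import numpy as np

def khop_adjacency(A, k):
    """A_{i,k}: number of paths of length up to k between distinct nodes (diagonal zeroed)."""
    A = np.asarray(A, dtype=float)
    P = np.eye(A.shape[0])
    acc = np.zeros_like(A)
    for _ in range(k):
        P = P @ A
        acc += P
    np.fill_diagonal(acc, 0.0)
    return acc

def nsc(A1, A2, k):
    """Average row-wise correlation of the two k-hop adjacency matrices, eq. (6.5)."""
    B1, B2 = khop_adjacency(A1, k), khop_adjacency(A2, k)
    num = (B1 * B2).sum(axis=1)
    den = np.sqrt((B1 ** 2).sum(axis=1) * (B2 ** 2).sum(axis=1))
    ratio = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    return ratio.mean()
```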
Based on this inner product, we define the spectral basis correlation SBC between bipartite subgraphs B_1 and B_2 to be the Euclidean norm of the inner products between the corresponding eigenvectors:

\mathrm{SBC} = \sqrt{\sum_{l=1}^{N} \left( u_{1,l}^{t} u_{2,l} \right)^2}   (6.6)

To measure the loss due to the approximation, we define the edge-loss fraction (ELF), which is the ratio between the total number of edges in the remaining graph G_2 = G − B_1 − B_2 and the total number of edges in G. Thus, ELF measures the fraction of edges not used in computing the transform. We next present two heuristic algorithms to find good subgraph decompositions in arbitrary graphs.

6.2.1 Harary's decomposition algorithm

In this section, we propose a bipartite subgraph decomposition method, referred to as Harary's decomposition, which provides a ⌈log₂ k⌉ bipartite decomposition of a graph G, given a k-coloring defined on it.³ The method is derived from [15] and we describe it in Algorithm 3.⁴ Although the problem of determining the chromatic number χ(G) is NP-complete, there exist several approximate minimum-coloring algorithms with various orders of accuracy, a comparison of which can be found in [23]. The complexity of these algorithms ranges from O(N²) for greedy algorithms to O(N⁴) for the backtracking sequential coloring (BSC) algorithm presented in [23]. Based on this result, we propose a ⌈log₂ k⌉-bipartite decomposition of the graph G, given a perfect k-coloring defined on it. We refer to this method as Harary's decomposition.

³ A graph is perfectly k-colorable if its vertices can be assigned k colors in such a way that no two adjacent vertices share the same color. The chromatic number χ(G) of a graph is the smallest such k.
⁴ Note that the bipartite decomposition is not unique and depends on the ordering in which the k colors are divided.

Algorithm 3 Harary's Decomposition
Require: F, s.t. F(v) is the color assigned to node v, min(F) = 1, max(F) = k.
1: Set L_1 = set of nodes with F(v) ≤ ⌈k/2⌉.
2: Set H_1 = set of nodes with F(v) > ⌈k/2⌉.
3: Set E_1 ⊂ E containing all the edges between sets H_1 and L_1.
4: Compute bipartite subgraph B_1 = (L_1, H_1, E_1).
5: Set G = G − B_1.
6: G is now a union of two disconnected subgraphs G(H_1) and G(L_1).
7: Graph G(L_1) is ⌈k/2⌉-colorable.
8: Compute coloring F_L on G(L_1) s.t. min(F_L) = 1, max(F_L) = ⌈k/2⌉.
9: Graph G(H_1) is ⌊k/2⌋-colorable.
10: Compute coloring F_H on G(H_1) s.t. min(F_H) = 1, max(F_H) = ⌊k/2⌋.
11: Repeat steps 1–4 on G(L_1) and G(H_1) to obtain bipartite subgraphs B_2(L_1) and B_2(H_1).
12: Compute bipartite subgraph B_2 = B_2(L_1) ∪ B_2(H_1).
13: Set G = G − B_2.
14: Repeat steps 1–13; after exactly ⌈log₂ k⌉ iterations the graph G becomes an empty graph.

6.2.2 Min-cut weighted max-cut (MCWMC) algorithm

The nature and complexity of finding the minimum bipartite subgraph decomposition with maximally disjoint neighborhoods is not known. We therefore propose the following greedy heuristic to find bipartite subgraphs with disjoint neighborhoods: given a graph G, let β be chosen as the first downsampling function, inducing a partition (S_1, S_2) of the graph G with sizes |S_1| = N_1 and |S_2| = N_2. Let us define p(S) to be the probability of randomly choosing a node v ∈ S in graph G, which is equal to |S|/|V|. Further, let e = E_{S_1,S_2} denote the cut-set and B = (S_1, S_2, e) denote the bipartite subgraph corresponding to β. This decomposition can be represented graphically as in Figure 6.3. The exclusion of B from G changes the neighborhood structure of the resulting graph G_1.
Thus, in the remaining graph G_1, nodes in set S_1 cannot reach nodes in set S_2 and vice versa. We define the expected change in the neighborhood size at each node, given the cut e, as:

E[\partial N \mid e] = p(S_1)|S_2| + p(S_2)|S_1| = \frac{2 N_1 N_2}{N}   (6.7)

Figure 6.3: Example of a bipartite graph-cut

Clearly, E[∂N] is maximized if N_1 ≈ N_2 at each iteration. This problem is widely studied in the graph literature as the balanced-cut problem. However, finding balanced cuts iteratively becomes problematic, as it leads to roughly log₂(N) bipartite subgraphs for a graph of size N. Further, in each bipartite subgraph the nodes that do not have edges in the cut-set e are disjoint and do not take part in the transform. Therefore, we would like to maximally pack these edges into larger and fewer bipartite subgraphs, packing the edge-sets e in the order of their importance E[∂N | e]. To do this, we assign a weight w_e = E[∂N]/|e| to each edge in the cut-set e in each iteration of the balanced-cut decomposition. The weight signifies the importance of the edge in changing the neighborhood structure of the resulting decompositions. We then perform an iterative max-cut algorithm on the resulting min-cut weighted graph, which provides bipartite subgraphs with maximum packing of the weighted edges. The algorithm is thus termed the min-cut weighted max-cut (MCWMC) algorithm, and is described in Algorithm 4.

Algorithm 4 MCWMC Decomposition
Require: G = (V, E)
1: Set (V, E_d) = min_cut_weighing(V, E).
2: Set k ← 1.
3: while |E| ≠ 0 do
4:   Compute the normalized Laplacian matrix L(G_d).
5:   Compute the eigenvector u_max of the maximum-magnitude eigenvalue of L(G_d).
6:   Set L_k = {v : u_max(v) ≥ 0}.
7:   Set H_k = {v : u_max(v) < 0}.
8:   Set E_k = {(u, v) : (u, v) ∈ E, u ∈ L_k, v ∈ H_k}.
9:   Set E_dk = {(u, v) : (u, v) ∈ E_d, u ∈ L_k, v ∈ H_k}.
10:  Set B_k = (L_k, H_k, E_k).
11:  Set E = E \ E_k; G = (V, E).
12:  Set E_d = E_d \ E_dk; G_d = (V, E_d).
13:  Set k ← k + 1.
14: end while

1: function G_d = min_cut_weighing(V, E)
2:   if |E| = 0 then
3:     Set E_d = E.
4:   else
5:     Compute the normalized Laplacian matrix L(G), where G = (V, E).
6:     Compute the eigenvector u_min of the minimum non-zero eigenvalue of L(G).
7:     Set S_1 = {v : u_min(v) ≥ 0}.
8:     Set S_2 = {v : u_min(v) < 0}.
9:     Set E_1 = {(u, v) : (u, v) ∈ E, u ∈ S_1, v ∈ S_1}.
10:    Set E_2 = {(u, v) : (u, v) ∈ E, u ∈ S_2, v ∈ S_2}.
11:    Set e = {(u, v) : (u, v) ∈ E, u ∈ S_1, v ∈ S_2}.
12:    Compute w_e = 2|S_1||S_2| / (|e|(|S_1| + |S_2|)).
13:    Set e_d = w_e · e.  {multiply the weight of all edges in e by w_e}
14:    Set (S_1, E_d1) = min_cut_weighing(S_1, E_1).
15:    Set (S_2, E_d2) = min_cut_weighing(S_2, E_2).
16:    Set E_d = E_d1 ∪ E_d2 ∪ e_d.
17:  end if
18:  return G_d = (V, E_d)
19: end function

6.3 Experiments

In order to evaluate the different schemes for bipartite subgraph decomposition, we simulate random graphs by uniformly distributing N = 100 nodes in a 2-D field and connecting nodes which are within a fixed radius of each other.⁵ For the MCWMC algorithm, we use the balanced-cut algorithm proposed in [16] and the max-cut algorithm in [1]. For each graph G, we decompose the graph iteratively into bipartite subgraphs up to two steps, to obtain bipartite subgraphs B_1 and B_2, respectively. We then evaluate the metrics NSC and SBC for the two bipartite subgraphs obtained by using a) the MCWMC algorithm, b) Harary's algorithm proposed in [30], and c) a random decomposition (in which we randomly assign downsampling functions to nodes). Table 6.1 summarizes the comparison results for 100 instances of such random graphs.

⁵ Note that the 2-D embedding of the graph is for illustration only. The MCWMC algorithm only depends on the link-structure of the graph nodes.
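Before turning to the results in Table 6.1, the following sketch illustrates the sign-based spectral bipartition used in steps 5–8 of Algorithm 4 and in its min_cut_weighing routine. It is a minimal, dense-eigendecomposition illustration; the function and variable names are ours, not from the thesis code.

```python
import numpy as np

def normalized_laplacian(A):
    d = A.sum(axis=1)
    dinv = np.where(d > 0, 1.0 / np.sqrt(d), 0.0)
    return np.eye(A.shape[0]) - dinv[:, None] * A * dinv[None, :]

def sign_bipartition(A, which="max"):
    """Split vertices by the sign of an extreme eigenvector of the normalized Laplacian.

    which="max": eigenvector of the largest eigenvalue (max-cut-like split, steps 5-8).
    which="min": eigenvector of the smallest non-zero eigenvalue (balanced, min-cut-like split).
    """
    L = normalized_laplacian(A)
    w, V = np.linalg.eigh(L)                 # ascending eigenvalues
    if which == "max":
        u = V[:, -1]
    else:
        nz = np.where(w > 1e-9)[0]           # skip (near-)zero eigenvalues
        u = V[:, nz[0]]
    S1 = np.where(u >= 0)[0]
    S2 = np.where(u < 0)[0]
    cut = [(i, j) for i in S1 for j in S2 if A[i, j] > 0]
    return set(S1), set(S2), cut
```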
A low value of ELF suggests that the MCWMC algorithm packs more edges into the subgraphs B_1 and B_2 than the other algorithms. Further, we observe that NSC(k) in general decreases for large k-hop neighborhoods, which makes sense since, at each step of the iterative decomposition, the removal of a bipartite subgraph bisects the remaining graph and thus reduces the long-hop connections between nodes. However, we observe that NSC(k) drops more sharply with the MCWMC algorithm, which implies that the neighborhoods are better separated than with the other schemes. At the global scale, SBC is lowest for MCWMC, which means that the eigenvectors of the resulting bipartite subgraphs are also better decorrelated with the proposed algorithm. To see this more clearly, we measure the similarity (i.e., the inverse of the shortest hopping distance) between all node pairs in the different subgraphs. With maximal neighborhood separation, we expect any pair of nodes in the graph to have different similarities on different subgraphs. Figure 6.4 plots the histogram of the absolute difference in the similarities of node-pairs on the two bipartite subgraphs.

Method    Random   Harary   MCWMC
ELF       0.249    0.225    0.14
NSC(2)    0.48     0.53     0.51
NSC(4)    0.50     0.54     0.51
NSC(6)    0.49     0.53     0.48
NSC(8)    0.47     0.51     0.45
NSC(10)   0.45     0.49     0.42
NSC(12)   0.43     0.48     0.39
SBC       0.60     0.61     0.55

Table 6.1: Comparison of bipartite subgraph decomposition schemes

Figure 6.4: Histogram of the absolute difference in similarity between node-pairs in the two bipartite subgraphs

In the case of Harary's decomposition the histogram is concentrated near zero, which means that most node-pairs are nearly equally similar on the two subgraphs, whereas in the case of the proposed MCWMC algorithm the histogram is shifted to the right, implying that most node-pairs are not equally similar on both subgraphs, i.e., if they are close to each other in one subgraph, then they are far from each other in the other subgraph. This further corroborates our claim that neighborhoods are better separated using the proposed decomposition scheme.

6.4 Summary

In this chapter, we have proposed a separable, multi-dimensional wavelet filterbank design for any arbitrary undirected graph. The design is an extension of the two-channel filterbanks described in Chapter 5. According to our proposed formulation, a graph is iteratively decomposed into a set of bipartite subgraphs, and then filtering and downsampling operations are carried out in cascade on each bipartite subgraph. We have proposed bipartite subgraph decompositions which provide a dimensionality to the graph similar to the case of regularly sampled signals in higher dimensions. We explained that dimensionality in graphs can be understood as neighborhood separability, and we defined some metrics to evaluate various bipartite decompositions based on this understanding. Further, we proposed two algorithms which compute bipartite subgraph decompositions, and compared them based on these metrics. In the next chapter, we describe some applications of our proposed filterbanks.

Chapter 7
Examples and Applications of Graph Wavelet Filterbanks

In this chapter, we consider some applications where the graph filterbanks proposed in Chapter 5 can be applied. We first consider an example of multi-resolution decomposition of graphs, where we analyze data defined on the vertices of an arbitrarily linked graph. This is the most general illustration of our proposed designs, where we only use topological information to compute the downsampling and filtering operations.
Next, we consider an edge-aware image-processing application, where image pixels can be connected with their neighbors to form undirected graphs.¹ Here, we propose various graph formulations of images, which capture both directionality and intrinsic edge information. The proposed graph-wavelet filterbanks provide a sparse, edge-aware representation of image signals. The outline of the rest of the chapter is as follows: in Section 7.1, we discuss an example of the proposed graph wavelet filterbanks for multi-resolution decomposition of graphs. In Section 7.2, we discuss the application of the proposed filterbanks to a graph representation of images. We present some experimental results for non-linear approximation of images in Section 7.3, and finally we conclude the chapter in Section 7.4.

¹ This research was conducted jointly with Yung-Hsuan Chao. See [28] for details.

7.1 Multi-resolution Decomposition of Graphs

Our proposed filterbanks can be a useful tool for analyzing/compressing arbitrarily linked irregular graphs. To study their feasibility, we take the example of the Minnesota traffic graph from [14]. The graph is shown in Figure 7.1(a), where the spatial coordinates are only used to display the graph and the wavelet transform, and do not affect the edge-weights. Further, we consider a graph-signal on this graph with a sharp irregular discontinuity, as shown in Figure 7.1(b), where the color of a node represents the signal value at that node.

7.1.1 Bipartite subgraph decomposition

The Minnesota graph is not bipartite (two-colorable). Therefore, in order to implement the wavelet filterbanks, we decompose the graph into bipartite subgraphs using Harary's decomposition algorithm. The graph is perfectly 3-colorable and hence it can be decomposed into ⌈log₂(3)⌉ = 2 bipartite subgraphs, which are shown in Figure 7.1(c-d).

Figure 7.1: (a) The Minnesota traffic graph G, and (b) the graph-signal to be analyzed. The colors of the nodes represent the sample values. (c)(d) Bipartite decomposition of G into two bipartite subgraphs using Harary's decomposition.

7.1.2 Spectral wavelet filterbank implementation

Given the bipartite subgraph decomposition of G, we compute the normalized Laplacian matrices L_i and the downsampling functions β_i = β_{H_i} for each bipartite subgraph B_i. Further, we compute the low-pass analysis kernel h_{i,0}(λ) on B_i as the m_i-th order Chebyshev approximation of the Meyer kernel h_0^{Meyer}(λ), for m_i = 24. The remaining spectral kernels h_{i,1}(λ), g_{i,0}(λ), g_{i,1}(λ) are computed from h_{i,0}(λ) according to the graph-QMF relations in (5.28). The corresponding analysis and synthesis transforms are then computed as H_{i,j} = h_j(L_i) and G_{i,j} = g_j(L_i), respectively, for j = 0, 1. Note that since the kernels are polynomials, the transforms are also matrix polynomials of the Laplacian matrices and do not require explicit eigenspace decompositions. In our experiments, we use m_i = m, and hence h_{i,j}(λ) = h_j(λ), j ∈ {0, 1}, for all i, in which case the resulting transforms are exactly m-hop localized on each bipartite subgraph. The order m is a parameter of our design and should be chosen based on the required level of spatial localization and on how much reconstruction error can be tolerated. The overall filterbank is designed by concatenating the filterbanks of each bipartite subgraph in the form of a tree, analogous to Figure 6.1 in the 2-dimensional decomposition case. Since the proper coloring of the Minnesota graph is 3, the output of the HL channel (i.e., the
nodes for which (β_1(n), β_2(n)) = (+1, −1)) is empty after downsampling. The output coefficients of the other three non-empty channels (LL, LH, HH) are shown in Figure 7.2. Note that after downsampling, the total number of output coefficients in the four channels is equal to the number of input samples, thus making the transform critically sampled. We observe in Figure 7.2 that the output coefficients in the LH and HH channels have significantly high magnitude along the discontinuity, reflecting the high-pass nature of these channels. Further, in order to see how much energy of the original signal is captured in each channel, we upsample and then filter the coefficients of each channel with the synthesis part of the proposed filterbank. This is shown in Figure 7.3. In this figure, we see that the signal reconstructed from the LL channel coefficients provides a low-pass approximation of the original signal (sharp boundaries blurred), whereas the signals reconstructed from the LH and HH channels provide a high-pass approximation of the input signal (highlighting the boundaries). Thus, the proposed graph-based filterbanks provide a meaningful decomposition of input signals, analogous to standard wavelet filterbanks.

Figure 7.2: Output coefficients of the proposed graph-QMF filterbanks with parameter m = 24. The node color reflects the value of the coefficient at that point. Top-left: LL channel wavelet coefficients, top-right: absolute value of LH channel wavelet coefficients, and bottom-right: absolute value of HH channel wavelet coefficients.

Figure 7.3: Reconstructed graph-signals from the graph-QMF wavelet coefficients of individual channels. As before, the node color reflects the value of the coefficient at that node. Top-left: reconstruction from the LL channel only, top-right: reconstruction from the LH channel only, and bottom-right: reconstruction from the HH channel only. Since the HL channel is empty, its reconstruction is an all-zero signal (bottom-left figure). The reconstruction SNR of the sum of all four channels is 50.2 dB.

Similarly, we design graph-Bior filterbanks. For this, we choose parameters (k_0, k_1) = (7, 7), as described in Section 5.5.3. Given these specifications, we design p(λ) = h_0(λ) g_0(λ) as a maximally flat half-band polynomial kernel with K = k_0 + k_1 = 14 roots at λ = 0. As a result, p(λ) turns out to be a polynomial of degree 2K − 1 = 27, and we factorize it into h_0(λ) and g_0(λ) with least dissimilar lengths (i.e., we choose l_0 = K = 14 and l_1 = K − 1 = 13). The other two filters, namely h_1(λ) and g_1(λ), are computed as in (5.48). Once again, we choose the same filter kernels for both bipartite subgraphs, in which case the resulting transforms H_0 and G_0 are exactly K-hop and (K − 1)-hop localized on each bipartite subgraph, respectively. The wavelet coefficients of the resulting filterbanks are shown in Figure 7.4, and the signals reconstructed from the individual channels are shown in Figure 7.5.

The proposed decomposition of graphs can be useful in many applications. From the analysis point of view, the high-magnitude samples in the signals reconstructed from the high-pass channels provide knowledge of the location and type of discontinuities in the graph-signal. For example, in a traffic sensing scenario on the Minnesota graph, the stations (nodes) with high-magnitude samples in the HH channel in Figure 7.3 and Figure 7.5 provide good locations to install traffic sensors.
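The spectral kernels used above are applied as matrix polynomials of the Laplacian; the sketch below shows, under simplifying assumptions, how a kernel h(λ) defined on [0, 2] becomes an m-hop-localized filter h(L) without any eigendecomposition. A plain least-squares polynomial fit and a generic smooth low-pass kernel stand in for the truncated Chebyshev expansion and the Meyer kernel of (5.33) used in the thesis.

```python
import numpy as np

def normalized_laplacian(A):
    d = A.sum(axis=1)
    dinv = np.where(d > 0, 1.0 / np.sqrt(d), 0.0)
    return np.eye(A.shape[0]) - dinv[:, None] * A * dinv[None, :]

def polynomial_graph_filter(L, kernel, m):
    """Degree-m least-squares fit of kernel(lambda) on [0, 2], applied to L by Horner's rule."""
    x = np.linspace(0.0, 2.0, 200)
    c = np.polyfit(x, kernel(x), m)                  # highest-degree coefficient first
    H = c[0] * np.eye(L.shape[0])
    for ck in c[1:]:
        H = H @ L + ck * np.eye(L.shape[0])
    return H                                         # = sum_j c_j L^j, hence m-hop localized

# A generic smooth low-pass kernel, standing in for the Meyer kernel h0 of (5.33).
lowpass = lambda lam: np.cos(np.pi * lam / 4.0) ** 2

# On a 10-node path graph, the filter response at node 0 vanishes beyond m hops.
Npix, m = 10, 3
A = np.diag(np.ones(Npix - 1), 1) + np.diag(np.ones(Npix - 1), -1)
H = polynomial_graph_filter(normalized_laplacian(A), lowpass, m)
print(np.allclose(H[0, m + 1:], 0.0))                # True: m-hop localization
```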
Further, since the filters in our proposed designs can be computed iteratively in a few steps, with local one-hop operations at each node in each step, the resulting filterbanks can be very useful in detecting anomalies in large distributed networks. From the compression point of view, the output coefficients in the LL channel in Figure 7.2 and Figure 7.4 provide a downsampled representation of the original graph-signal, with only 45% of the samples. This can be extended to multiple levels, by reconnecting the LL nodes to obtain a downsampled graph and applying the proposed filterbanks on the downsampled graph, treating the LL output coefficients as the new signal, and so on. In the next section, we consider a compression application for 2-D images, where we define multi-level implementations of the proposed filterbanks on graph representations of images.

Figure 7.4: Output coefficients of the graph-Bior filterbanks with parameters (k_0, k_1) = (7, 7). The node color reflects the value of the coefficient at that point. Top-left: LL channel wavelet coefficients, top-right: absolute value of LH channel wavelet coefficients, and bottom-right: absolute value of HH channel wavelet coefficients.

Figure 7.5: Reconstructed graph-signals from the graph-Bior wavelet coefficients of individual channels. As before, the node color reflects the value of the coefficient at that node. Top-left: reconstruction from the LL channel only, top-right: reconstruction from the LH channel only, and bottom-right: reconstruction from the HH channel only. Since the HL channel is empty, its reconstruction is an all-zero signal (bottom-left figure). The reconstruction SNR of the sum of the four channels is 168.57 dB.

7.2 Edge Aware Image Processing

In this section, we propose a novel method for image analysis using graph wavelets. While standard separable extensions of wavelet filterbanks to higher-dimensional signals, such as 2-D images, provide useful multi-resolution analysis, they do not capture the intrinsic geometry of the images. For example, these extensions can capture only limited (mostly horizontal and vertical) directional information. This means that if the object boundaries in an image are neither horizontal nor vertical, e.g., diagonal or of round shape, the resulting transform coefficients tend not to be sparse, and the high-pass wavelet components can have significant energy. Therefore, more powerful representations are sought for images, in which the basis functions can adapt to the directionality and edge-information contained in the image. Among the various solutions proposed, some transforms, such as 2-D Gabor wavelets [25] and complex wavelets [22], provide extra dimensionality at the cost of producing an over-sampled output. Other designs, such as curvelets [2] and contourlet transforms [9], which provide a dictionary of anisotropic edge-aware basis functions, require higher complexity and suffer from the same problem of oversampling. Some other designs, such as bandlets [34], directionlets [43] and tree-based lifting transforms [37], provide critically sampled transforms based on side-information about geometric flows in the image. Images can also be viewed as graphs, by treating pixels as nodes, pixel intensities as graph-signals, and by connecting pixels with their neighbors in various ways. The advantage of formulating images as graphs is that different graphs can represent the same image, which offers the flexibility of choosing graphs that have useful properties.
In particular, the weights of the links can be adjusted at each node in order to take into account the local edge-information present in the image. An example of a weighted image-graph formulation is the anisotropic-diffusion-based image smoothing considered in [51]. In Chapter 5, we designed two-channel wavelet filterbanks for any undirected weighted graph, with vertices (nodes) as data sources. These filterbanks are critically sampled and provide basis elements which are localized in both the spatial and the frequency domain of the graph.² Further, they can be implemented using an iterated separable filterbank structure, and thus provide a multi-resolution analysis of graph-signals. We now apply these filterbanks to undirected unweighted graph representations of images, and show that the interpretation of the resulting graph-wavelet transforms is analogous to classical wavelet decompositions. Since the DC signal in images corresponds to an all-constant signal, we design the graph filterbanks using the asymmetric Laplacian matrix, as described in Section 5.6. We provide preliminary results for image non-linear approximation that show promising gains over standard separable wavelet transforms.³

² The frequency of a graph is defined in terms of the eigenvalues of the normalized graph Laplacian matrix.
³ For results related to denoising, see [28].

7.2.1 Graph representation of images

Digital images are 2-D regular signals, but they can also be viewed as graphs by connecting every pixel (node) in an image with its neighboring pixels (nodes) and by interpreting pixel values as the values of the graph-signal at each node. Graph representations of regularly sampled signals have recently been shown to be promising in practice [35, 11]. In our experiments, we use an 8-connected representation G of an image, as shown in Figure 7.6. In this representation, each pixel has two types of connections with its neighbors: (a) rectangular connections with its N, S, E, W neighbors, and (b) diagonal connections with its diagonal neighbors. Note that adding more directions to the graph, for example by linking each pixel with its 2-hop neighbors, is possible but is not considered in our present work. In the 8-connected image graph G, separating the rectangular and diagonal links into separate graphs leads to two bipartite subgraphs B_1 and B_2, as shown in Figure 7.6. The importance of each dimension can be changed by adjusting the weights of the links in each bipartite subgraph. Given such a decomposition, we can implement a two-stage ("two-dimensional") graph-wavelet filterbank, as described in Chapter 5, where the filtering operations in the first dimension capture the variations along the rectangular directions and those in the second dimension capture the variations along the diagonal directions. The overall wavelet filterbank has 4 output channels, and the downsampling pattern in each channel is identical to the downsampling-by-4 pattern of the standard separable case. The nodes sampled in the different channels are shown by different colors in the 8-connected graph G in Figure 7.6.

Figure 7.6: Two-dimensional decomposition of the 8-connected image-graph

7.2.2 Graph Filterbanks on Images

The graph-based approach provides additional degrees of freedom (directions) to filter/downsample the image while still having a critically sampled output. To demonstrate this, we implement a graph wavelet filterbank on the 8-connected image-graph G of a given image, as shown in Figure 7.6; a small sketch of this graph construction and its rectangular/diagonal split is given below.
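A minimal sketch of the 8-connected construction and its split into rectangular and diagonal link sets (an illustration with hypothetical indexing conventions, not the code used for the experiments):

```python
import numpy as np

def image_graph_split(rows, cols):
    """Build the 8-connected image graph and split it into the rectangular
    bipartite subgraph B1 and the diagonal bipartite subgraph B2.

    Pixels are indexed p = r * cols + c. Returns two unweighted edge lists.
    """
    rect, diag = [], []
    for r in range(rows):
        for c in range(cols):
            p = r * cols + c
            if c + 1 < cols:                    # E neighbor
                rect.append((p, p + 1))
            if r + 1 < rows:                    # S neighbor
                rect.append((p, p + cols))
            if r + 1 < rows and c + 1 < cols:   # SE neighbor
                diag.append((p, p + cols + 1))
            if r + 1 < rows and c - 1 >= 0:     # SW neighbor
                diag.append((p, p + cols - 1))
    return rect, diag

# B1 is bipartite under the usual checkerboard 2-coloring of the pixel lattice
# ((r + c) even vs. odd); B2 is bipartite under the even-row vs. odd-row coloring.
rect, diag = image_graph_split(4, 4)
print(len(rect), len(diag))   # 24 rectangular links and 18 diagonal links for a 4x4 image
```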
Here we assume the graph to be unweighted (i.e., all the links in the graph have equal weight). Figure 7.7 shows the one-level output wavelet coefficients of the proposed 2-dimensional filterbank on a toy image which has both diagonal and rectangular edges. In this figure, the energy of the wavelet coefficients in the LH channel (low-pass on B_1, high-pass on B_2) is high around the rectangular edges, which is reasonable, since subgraph B_2 is diagonally connected and its low-pass spectral frequencies are oriented along the diagonal links. Similarly, we observe that the high-energy wavelet coefficients in the HL channel (high-pass on B_1, low-pass on B_2) lie around the diagonal edges, since B_1 is rectangularly connected and its low-pass spectral frequencies are oriented towards the horizontal and vertical directions.

Figure 7.7: Separable two-dimensional two-channel graph filterbank on a toy image with both rectangular and diagonal edges.

In Figure 7.8, we compare the proposed graph-based filterbanks with a standard separable implementation using the CDF 9/7 filters of JPEG2000, on the binary image of the above example. We observe in Figure 7.8(a) that the image reconstructed using the standard CDF filters has a lot of distortion near the diagonal edges. This is because the wavelet filters in the standard case are oriented only in the horizontal or vertical direction; therefore, they produce many detail coefficients at the diagonal discontinuities. On the other hand, the reconstructed images in Figures 7.8(b) and 7.8(c), corresponding to the graph-QMF filterbanks and the graph-Bior filterbanks respectively, have less distortion (roughly 0.5 dB in this case) than the standard case. This is because the graph-based filters are oriented in both rectangular and diagonal directions, and therefore produce fewer detail coefficients in these directions. However, we also observe in Figures 7.8(b) and 7.8(c) that these graph-based filters produce artifacts near the edges, because the filtering operations still cross edges in one direction (rectangular/diagonal in the graph case) or the other. Therefore, the filtering operations need to be made edge-aware. Note that in the graph-based formulation, more directions to downsample/filter can be added by increasing the connectivity of the pixels in the image-graph. Moreover, since graph-based transforms operate only over the links between nodes, the graph formulation is useful in designing edge-aware transforms (which avoid filtering across edges) by removing links between pixels across edges. We discuss the edge-aware representation of images in the next section.

Figure 7.8: Reconstruction of the binary image shown in Figure 7.7, using only the 4th-level LL-channel wavelet coefficients, with (a) 2-D separable CDF 9/7 filterbanks (PSNR = 64.16 dB), (b) the proposed graph-QMF filterbanks with filter length m = 28 (PSNR = 64.72 dB), and (c) the proposed graph-Bior filterbanks with filter lengths (k_0 = 20, k_1 = 21) (PSNR = 64.63 dB).

7.2.3 Edge-aware graph representations

Graph representations of images provide a simple way to accommodate the edge-information present in images, by adjusting the weights of the links near the edges. In this approach, first the pixels at the edges (i.e., pixels whose intensities change sharply relative to their neighboring pixels) are detected using standard edge-detection algorithms.
Subsequently, the links between each edge-pixel and its neighbors are tagged as either regular or less-reliable, depending on whether the difference between pixel intensities across the link is low or high, respectively. Then, in the edge-aware graph representation, the less-reliable links are either completely removed or assigned a lower link-weight than the regular links. Similar constructions have been proposed in recent work [11, 21], but these constructions use lifting transforms and block transforms respectively, and do not use graph filterbanks. In our proposed design, we choose to assign a lower link-weight to less-reliable links, as completely removing links around edge-pixels sometimes creates isolated pixels (holes) in the graph, which do not participate in computing the wavelet transform and thus need to be separately accounted for. Consequently, the graph-wavelet filters on the resulting weighted graph have most of their energy on one side of the edge, and produce fewer non-zero wavelet coefficients at the edges than in the case of unweighted image graphs. This is demonstrated by an example in Figures 7.9 and 7.10.

Figure 7.9: Example demonstrating the importance of the edge-weighted graph formulation of images: (a) input image, (b) edge-information of the image and a highlighted pixel v, (c) unweighted 8-connected image-graph formulation, (d) edgemap-weighted 8-connected image-graph formulation.

Figure 7.10: (a) HH wavelet filter (dB scale) at the pixel v on the unweighted graph, (b) HH wavelet filter (dB scale) at the pixel v on the weighted graph, (c) undecimated HH band coefficients using the unweighted graph, and (d) undecimated HH band coefficients using the edge-weighted graph.

7.2.4 Downsampling image graphs

As with standard wavelet transforms, the graph-wavelet filterbanks can be applied recursively at multiple levels, treating the LL channel output coefficients as the new graph-signal and operating on the downsampled graph constructed using the LL channel nodes only. In the proposed 8-connected image graph representation, since the LL channel nodes are uniformly sampled, the downsampled graph on the LL nodes is made 8-connected by connecting each LL pixel to its 8 neighboring LL pixels. Further, the link weight between two neighbors in a given orientation (horizontal, vertical, diagonal, or off-diagonal) in the downsampled graph is equal to the weight of the path in the same orientation between the two nodes in the original graph, where the weight of a path is the product of the weights of the links it consists of. For example, the edge weight between two horizontal neighbors u and v in the downsampled graph is the product of the weights of the horizontal links connecting u and v in the original graph. The graphs obtained for 4 levels of decomposition of the Lena image are shown in Figure 7.11.

Figure 7.11: The weighted graphs computed for the Lena image, over 4 levels of decomposition

7.3 Experiments

In our experiments, we choose an undirected 8-connected representation of images, as described in Section 7.2.1. For edge-detection in an image, we use standard Gaussian filtering followed by thresholding. In addition, we perform a connected-component analysis to weed out small clusters of edge-pixels (of size less than 200), and dilate the remaining edges using a 2 × 2 structuring element to fill in the empty corners of the edges. For each edge-pixel, the links between the pixel and its 8 neighbors are divided into two sets by applying a two-class clustering based on the intensity difference.
The links in the cluster with high intensity difference are declared less-reliable, and their weights are set to one-fourth of the weight of the regular links (which is set to 1). The resulting graph has a binary weight distribution of links (regular/less-reliable). The graphs in the subsequent levels of decomposition are generated by downsampling the first-level graph as described in Section 7.2.4, and have a more varied link-weight distribution. The graph at each level of decomposition is further decomposed into a rectangular-link-only and a diagonal-link-only bipartite graph, as shown in Figure 7.6, and a two-stage two-channel graph-QMF filterbank is then applied at each level. The filters in the filterbank are chosen to be polynomial approximations of the graph-QMF filters in (5.28), with the prototype kernel h_0(λ) taken to be the Meyer kernel in (5.33), with parameter m = 30 (for an m-th order approximation).

7.3.1 Image non-linear approximation

We now compare the proposed graph-based filterbanks with the CDF 9/7 filters used in JPEG2000, using non-linear approximation with the k largest wavelet coefficients. Figure 7.12 shows PSNR and SSIM [49] values plotted against the fraction of detail coefficients used in the reconstruction of the Lena (512 × 512) image. It can be seen from both plots that the graph-QMF filterbanks achieve better compression than the standard CDF 9/7 filterbanks. This is because the graph-QMF filterbanks capture signal variation in more orientations than the separable case. Among the graph-based wavelet transforms, the edge-weighted formulations perform better than the unweighted formulation. This makes sense, as the weighted graph formulations of the image are edge-aware and produce fewer wavelet coefficients near the edges compared to unweighted graphs. However, the reported performance gain does not account for the additional edge-map side information, the cost of which could eclipse the gain. Recent work [35, 21] using transforms based on similar edge-map information has shown that this trade-off is favorable. Formulating the trade-off between the extra performance gain obtained with edge-weighted graphs and the edge-information needed, as an optimization problem, constitutes part of our ongoing work.

Figure 7.12: Performance comparison: non-linear approximation

Figure 7.13 shows the reconstructed image using the largest 1% of detail coefficients for all the cases described. It can be seen that, perceptually, the images reconstructed using both graph-QMF wavelet variants look sharper than the standard CDF 9/7 wavelet reconstruction. However, the reconstructions using the unweighted graph-QMF wavelets have ringing artifacts near some edges, which disappear when we use the edge-weighted graph formulation. Thus, the edge-weighted graphs and the corresponding graph-wavelet filterbanks produce a sparser representation of edges than the standard separable wavelets.

7.4 Summary

In this chapter, we have presented some applications of our proposed spectral wavelet filterbanks. The first application is based on a multiresolution decomposition of arbitrary graphs, where graphs are downsampled and filtered into smaller graphs, with the data on the graphs representing a (smooth or sharp) approximation of the original data. This method can be useful in graph compression and anomaly detection applications. Next, we discussed a novel method of processing 2D images using the proposed graph-based wavelet filterbank design. We proposed a graph representation of images in which pixels are connected with their neighbors to form undirected graphs.
The graph formulation captures the geometric structure of the image by linking pixels in different directions and by adjusting the weights of the links near edges. Preliminary results show gains in the image non-linear approximation application over the standard wavelet filterbank.

Figure 7.13: Reconstruction of "Lena.png" (512 × 512) from 1% of the detail coefficients

Chapter 8
Conclusions and Future Work

8.1 Main Contributions

In this thesis, we have proposed wavelet filterbanks for analyzing data defined on graphs. We termed the data on the vertices of graphs graph-signals, and extended regular DSP techniques such as basis decomposition, filtering and downsampling to these graph-signals. The graphs in our research are undirected, with no self-loops or multiple edges. While many wavelet transforms have been proposed in the literature for graphs (see Chapter 2 for a discussion), a common drawback of most of these designs is the lack of critical sampling, which limits their applications in compression and denoising tasks. Therefore, we have proposed critically sampled wavelet filterbanks for graphs. These filterbanks are implemented as "one-dimensional" two-channel filterbanks on bipartite graphs, and extended as "multi-dimensional" separable filterbanks on arbitrary graphs. We have found bipartite graphs to be the natural choice for implementing critically sampled two-channel wavelet filterbanks, because of their spatial and spectral properties. For the lifting wavelet transforms discussed in Chapter 3, we showed that any even-odd assignment strategy is equivalent to finding a bipartite subgraph approximation of the graph. Further, in Chapter 4, we showed that downsampling in bipartite graphs leads to a spectral folding phenomenon, which is analogous to aliasing in regular signals. The spectral folding phenomenon allowed us to implement critically sampled spectral wavelet filterbanks on any bipartite graph, by computing filters which satisfy simple constraints (discussed in Chapter 5). In particular, we proposed wavelet filters which are based on spectral kernels, and provided necessary and sufficient conditions for aliasing cancellation, perfect reconstruction and orthogonality in the resulting filterbanks. As a practical solution, we proposed graph-QMF designs for bipartite graphs which satisfy all the above-mentioned properties. The exact orthogonal filterbanks do not have compact support. They can, however, be realized as compact-support filters by using Chebyshev polynomial approximations, at the cost of a small reconstruction error and a loss of orthogonality. As an alternative, we proposed graph-Bior filterbanks, which are not orthogonal but have compact support and provide perfect reconstruction. These filterbanks are critically sampled and invertible, and offer a multi-level subband decomposition of graph-signals. For arbitrary graphs we proposed three choices: (a) approximate the graph G as a single bipartite graph B and implement the "one-dimensional" designs proposed in Chapter 5, (b) decompose the graph into K edge-disjoint bipartite subgraphs whose union is G via the bipartite subgraph decomposition discussed in Chapter 6, which leads to separable "multi-dimensional" filterbanks on graphs, and (c) a combination of (a) and (b) in which we find K edge-disjoint bipartite subgraphs whose union is not exactly G, but very close to it. There are edge losses in approaches (a) and (c). However, the edge losses in (c) can be minimized by suitably choosing the K bipartite subgraphs.
A comparison of our proposed design vis-à-vis existing transforms is shown in Table 8.1.

Method                              DC response                CS    PR    Comp   OE    GS
Wang & Ramchandran [47]             non-zero                   No    Yes   Yes    No    No
Crovella & Kolaczyk [7]             zero                       No    No    Yes    No    No
Lifting Scheme [18, 38, 45]         zero for wavelet basis     Yes   Yes   Yes    No    Yes
Diffusion Wavelets [6]              zero for wavelet basis     No    Yes   Yes    Yes   No
Spectral Wavelets [14]              zero for wavelet basis     No    Yes   Yes    No    No
graph-QMF filterbanks (Sec. 5.3)    zero for wavelet basis¹    Yes   Yes   No²    Yes   No
graph-Bior filterbanks (Sec. 5.5)   zero for wavelet basis¹    Yes   Yes   Yes    No    No

Table 8.1: Evaluation of graph wavelet transforms. CS: Critical Sampling, PR: Perfect Reconstruction, Comp: Compact Support, OE: Orthogonal Expansion, GS: Requires Graph Simplification.
¹ When designed using the asymmetric normalized Laplacian matrix.
² The exact graph-QMF solutions are perfect reconstruction and orthogonal, but they do not have compact support. Localization is achieved with a matrix polynomial approximation of the original filters, which incurs some loss of orthogonality and reconstruction error; these can be arbitrarily reduced by increasing the degree of the approximation.

We applied the lifting wavelet filterbanks, using approach (a), to a data-gathering application in wireless sensor networks in Chapter 3. Here, we formulated the even-odd assignment problem as a minimum dominating set problem, which led to about a 44% reduction in communication cost compared to raw transmissions, and about a 10% reduction compared to state-of-the-art tree-based lifting transforms. Further, in Chapter 7 we implemented the spectral wavelet filterbanks on graph representations of images, in which pixels are connected with their neighbors to form undirected graphs. The graph formulation captures the geometric structure of the image by linking pixels in different directions and by adjusting the weights of the links near edges. Preliminary results showed gains in the image non-linear approximation and denoising applications over standard wavelet filterbanks.

8.2 Future Work

The proposed wavelet filterbanks in this thesis can operate on any arbitrary undirected graph. The building blocks of our design are filterbanks on bipartite graphs, which provide a "one-dimensional" analysis of graph-signals. The spectrum of bipartite graphs (using the normalized Laplacian matrix) lies in the closed set [0, 2], and has eigenvalues symmetrically placed on either side of λ = 1. Further, the eigenvectors of bipartite graphs corresponding to any pair of symmetric eigenvalues are identical to each other (except for some sign changes). These properties are not found in any other graph. Therefore, one question which shapes our future direction is whether some form of aliasing occurs in graphs with higher chromaticity. More precisely, can the two-channel filterbank constraints discussed in Section 5.2 be extended to non-bipartite graphs? Moreover, the bipartite-graph-based designs themselves have many degrees of freedom. The choices to be made for implementing the proposed filterbanks on any graph can be split into three major parts: (i) which bipartite subgraph decomposition to choose for a given graph, (ii) which filter designs to choose (orthogonal or biorthogonal, shorter or longer, spectral or non-spectral, etc.), and (iii) how to compute the graphs after downsampling, so that they are meaningful and approximate the properties of the original graph. Our future work includes optimizing all these design choices.
For deciding (i), we have proposed choosing bipartite subgraphs which provide mutually disjoint neighborhood sets at each node, which leads to orthogonal filtering operations on the bipartite graphs. We also proposed two algorithms, Harary's decomposition and MCWMC decomposition, to compute bipartite subgraphs, which performed well on some of the graphs we studied. However, in some applications, other bipartite subgraph decompositions, such as those favoring more links in the low-pass channels, may be more favorable. Therefore, finding the optimal bipartite subgraph decomposition for any given application is what we plan to investigate in the future. In the edge-aware image processing application, preliminary results showed gains in image non-linear approximation using the proposed graph-based filterbanks over standard wavelet filterbanks. In the future, we would like to implement the proposed graph-based filterbanks within H.264 encoders to better estimate the gains compared to the standard designs. Further, our future work includes studying a more heterogeneous distribution of link weights in the image-graphs and its impact on the graph formulation.

References

[1] B. Aspvall and J. R. Gilbert. Graph coloring using eigenvalue decomposition. Technical report, Ithaca, NY, USA, 1983.
[2] E. J. Candès and D. L. Donoho. Curvelets and curvilinear integrals. J. of Approx. Theory, 2001.
[3] F. R. K. Chung. Spectral Graph Theory (CBMS Regional Conf. Series in Math., No. 92). American Mathematical Society, February 1997.
[4] V. Chvatal. A greedy heuristic for the set-covering problem. Mathematics of Operations Research, 4:233–235, 1979.
[5] A. Cohen, I. Daubechies, and J.-C. Feauveau. Biorthogonal bases of compactly supported wavelets. Communications on Pure and Applied Mathematics, 45(5):485–560, 1992.
[6] R. Coifman and M. Maggioni. Diffusion wavelets. Applied and Computational Harmonic Analysis, 21:53–94, 2006.
[7] M. Crovella and E. Kolaczyk. Graph wavelets for spatial traffic analysis. In INFOCOM 2003, volume 3, pages 1848–1857, Mar 2003.
[8] E. B. Davies, G. M. L. Gladwell, J. Leydold, and P. F. Stadler. Discrete nodal domain theorems. Linear Algebra and its Applications, 336(1-3):51–60, 2001.
[9] M. N. Do and M. Vetterli. The contourlet transform: an efficient directional multiresolution image representation. Image Proc., IEEE Transactions on, Dec. 2005.
[10] D. L. Donoho. De-noising by soft-thresholding. Information Theory, IEEE Trans. on, 41(3):613–627, May 1995.
[11] E. M. Enriquez, F. D. Maria, and A. Ortega. Video encoder based on lifting transforms on graphs. In Intl. Conf. Image Proc. (ICIP). IEEE, Sep 2011.
[12] S. Fitzpatrick and L. Meertens. An experimental assessment of a stochastic, anytime, decentralized, soft colourer for sparse graphs. In Proc. SAGA'01, pages 49–64, 2001.
[13] M. Girvan and M. E. Newman. Community structure in social and biological networks. Proc. Natl. Acad. Sci. USA, 99(12):7821–7826, June 2002.
[14] D. K. Hammond, P. Vandergheynst, and R. Gribonval. Wavelets on graphs via spectral graph theory. Applied and Computational Harmonic Analysis, 30(2):129–150, Mar 2011.
[15] F. Harary, D. Hsu, and Z. Miller. The biparticity of a graph. Journal of Graph Theory, 1(2):131–133, 1977.
[16] J. Shi and J. Malik. Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22:888–905, 1997.
[17] D. Jakobson, S. D. Miller, I. Rivin, and Z. Rudnick. Eigenvalue spacings for regular graphs. In IMA Vol. Math. Appl., pages 317–327. Springer, 1999.
[18] M. Jansen, G. P. Nason, and B. W. Silverman. Multiscale methods for data on graphs and irregular multidimensional situations. Journal of the Royal Statistical Society, 71(1):97–125, 2009.
[19] I. M. Johnstone and B. W. Silverman. Wavelet threshold estimators for data with correlated noise. Royal Statistical Society: Series B (Statistical Methodology), 59:319–351, 1997.
[20] D. Kempe and F. McSherry. A decentralized algorithm for spectral analysis. ACM Symposium on Theory of Computing, pages 561–568, 2004.
[21] W. S. Kim, S. K. Narang, and A. Ortega. Graph based transforms for depth video coding. In ICASSP'12, Mar 2012.
[22] N. Kingsbury. Complex wavelets for shift invariant analysis and filtering of signals. Applied and Computational Harmonic Analysis, 10, 2001.
[23] W. Klotz. Graph coloring algorithms. Mathematik-Bericht, 5:1–9, 2002.
[24] R. I. Kondor and J. Lafferty. Diffusion kernels on graphs and other discrete structures. In Proc. ICML, pages 315–322, 2002.
[25] T. S. Lee. Image representation using 2D Gabor wavelets. Pattern Anal. and Mach. Intel., IEEE Trans. on, 18(10):959–971, Oct 1996.
[26] U. Luxburg. A tutorial on spectral clustering. Statistics and Computing, 17(4):395–416, 2007.
[27] R. Mersereau and T. Speake. The processing of periodically sampled multidimensional signals. ITASS, 31(1):188–194, Feb. 1983.
[28] S. K. Narang, Y. H. Chao, and A. Ortega. Graph-wavelet filterbanks for edge-aware image processing. To appear in SSP, Aug. 2012.
[29] S. K. Narang and A. Ortega. Local two-channel critically-sampled filter-banks on graphs. ICIP, pages 333–336, Sep. 2010.
[30] S. K. Narang and A. Ortega. Perfect reconstruction two-channel wavelet filter-banks for graph structured data. IEEE Trans. on Signal Processing, 60(6):2786–2799, June 2012.
[31] S. K. Narang, G. Shen, and A. Ortega. Unidirectional graph-based wavelet transforms for efficient data gathering in sensor networks. In Proc. of ICASSP'10, March 2010.
[32] J. P.-Trufero, S. K. Narang, and A. Ortega. Distributed transforms for efficient data gathering in arbitrary networks. In ICIP'10, pages 1829–1832, Sept 2010.
[33] G. Pandey, M. Steinbach, R. Gupta, T. Garg, and V. Kumar. Association analysis-based transformations for protein interaction networks: a function prediction case study. In KDD'07, pages 540–549. ACM, 2007.
[34] E. Le Pennec and S. Mallat. Sparse geometric image representations with bandelets. IEEE Trans. on Image Proc., 14(4), 2005.
[35] G. Shen, W. S. Kim, S. K. Narang, A. Ortega, J. Lee, and H. C. Wey. Edge-adaptive transforms for efficient depth map coding. In Picture Coding Symposium (PCS), Dec 2010.
[36] G. Shen, S. K. Narang, and A. Ortega. Adaptive distributed transforms for irregularly sampled wireless sensor networks. ICASSP'09, pages 2225–2228, 2009.
[37] G. Shen and A. Ortega. Compact image representation using wavelet lifting along arbitrary trees. ICIP'08, 2008.
[38] G. Shen and A. Ortega. Optimized distributed 2D transforms for irregularly sampled sensor network grids using wavelet lifting. In ICASSP'08, pages 2513–2516, April 2008.
[39] G. Shen and A. Ortega. Tree-based wavelets for image coding: Orthogonalization and tree selection.
[40] G. Shen and A. Ortega. Transform-based distributed data gathering. IEEE Transactions on Signal Processing, 58(7):3802–3815, July 2010.
[41] G. Shen, S. Pattem, and A. Ortega. Energy-efficient graph-based wavelets for distributed coding in wireless sensor networks. In ICASSP'09, pages 2253–2256, 2009.
[42] TinyOS-2. Collection tree protocol. http://www.tinyos.net/tinyos-2.x/doc/.
[43] V. Velisavljevic, B. Beferull-Lozano, M. Vetterli, and P. L. Dragotti. Directionlets: Anisotropic multidirectional representation with separable filtering. IEEE Transactions on Image Processing, 15(7), 2006.
[44] M. Vetterli and J. Kovačević. Wavelets and subband coding. Prentice-Hall, Inc., NJ, USA, 1995.
[45] R. Wagner, R. Baraniuk, S. Du, D. B. Johnson, and A. Cohen. An architecture for distributed wavelet analysis and processing in sensor networks. In IPSN'06, pages 243–253, April 2006.
[46] A. Wang and A. Chandrakasan. Energy-efficient DSPs for wireless sensor networks. IEEE Signal Processing Magazine, 19(4):68–78, July 2002.
[47] W. Wang and K. Ramchandran. Random multiresolution representations for arbitrary sensor network graphs. In ICASSP, volume 4, pages IV-IV, May 2006.
[48] M. Weber and S. Kube. Robust Perron cluster analysis for various applications in computational life science. In CompLife, pages 57–66, 2005.
[49] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli. Image quality assessment: From error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4), 2004.
[50] W. W. Zachary. An information flow model for conflict and fission in small groups. Journal of Anthropological Research, 33:452–473, 1977.
[51] F. Zhang and E. R. Hancock. Graph spectral image smoothing using the heat kernel. Pattern Recognition, 41(11), 2008.
Abstract
Emerging data mining applications will have to operate on datasets defined on graphs. Examples of such datasets include online document networks, social networks, and transportation networks. The data on these graphs can be visualized as a finite collection of samples, a graph-signal, defined as the information attached to each node (scalar or vector values mapped to the set of vertices/edges) of the graph. Major challenges are posed by the size of these datasets, making it difficult to visualize, process, analyze and act on the information available. Wavelets have been popular for traditional signal processing problems (e.g., compression, segmentation, denoising) because they allow signal representations in which a variety of trade-offs between spatial (or temporal) resolution and frequency resolution can be achieved. In this research, we seek to develop novel wavelet techniques for graph data and apply them to realistic information analytics problems. The primary contribution of this thesis is the design of critically sampled wavelet filterbanks on graphs, which provide a local analysis in the graph (localized within a few hops of a target node) while capturing spectral/frequency information of the graph-signals. The graphs in our study are simple undirected graphs. We first design "one-dimensional" two-channel filterbanks on bipartite graphs, and then extend them to arbitrary graphs. The filterbanks come in two flavors, depending upon the chosen downsampling method: i) lifting wavelet filterbanks and ii) spectral wavelet filterbanks. For bipartite graphs we define a spectral folding phenomenon, analogous to aliasing in regular signals, that helps us define filterbank constraints in simple terms. For arbitrary graphs we propose two choices: a) to approximate the graph as a single bipartite graph and apply "one-dimensional" filterbanks, or b) to decompose the graph into multiple bipartite subgraphs and apply "multi-dimensional" filterbanks. All of the proposed filterbank designs are critically sampled and achieve perfect reconstruction. To the best of our knowledge, no such filterbanks have been proposed before. The tools proposed in this thesis make it possible to develop i) multiresolution representations of graphs, ii) edge-aware processing of regular signals, iii) anomaly detection in datasets, and iv) sampling of large networks.
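To make the two-channel construction described above more concrete, the sketch below (a minimal illustration under stated assumptions, not the implementation used in the thesis) builds the symmetric normalized Laplacian of a small bipartite graph, filters a signal with an ideal half-band lowpass spectral kernel h0 and its folded counterpart h1(x) = h0(2 - x), keeps the lowpass output on one partition and the highpass output on the other (so the overall transform is critically sampled), and reconstructs by zero-filling and filtering again. On a bipartite graph the folding of the spectrum around 1 makes the aliasing terms cancel, so reconstruction is exact up to floating-point error. The specific graph, the ideal (non-polynomial) kernels, and the full eigendecomposition are assumptions of this sketch; the thesis's designs use localized approximations instead.

import numpy as np
import networkx as nx

# Small bipartite test graph (assumed for illustration); the 'bipartite' node
# attribute marks the two sides of the partition.
G = nx.complete_bipartite_graph(3, 4)
nodes = list(G.nodes())
low_set = {n for n, d in G.nodes(data=True) if d["bipartite"] == 0}

A = nx.to_numpy_array(G, nodelist=nodes)
deg = A.sum(axis=1)
Lap = np.eye(len(nodes)) - A / np.sqrt(np.outer(deg, deg))   # normalized Laplacian, spectrum in [0, 2]
lam, U = np.linalg.eigh(Lap)

def h0(x):
    # Ideal half-band lowpass kernel: sqrt(2) below 1, value 1 at 1, zero above 1.
    return np.where(x < 1 - 1e-9, np.sqrt(2.0), np.where(x > 1 + 1e-9, 0.0, 1.0))

def h1(x):
    return h0(2.0 - x)                       # highpass = spectrally folded lowpass

def spectral_filter(kernel, f):
    return U @ (kernel(lam) * (U.T @ f))

low_idx = np.array([i for i, n in enumerate(nodes) if n in low_set])
high_idx = np.array([i for i, n in enumerate(nodes) if n not in low_set])

f = np.random.randn(len(nodes))              # an arbitrary graph-signal

# Analysis: lowpass coefficients kept on one partition, highpass on the other.
y_low = spectral_filter(h0, f)[low_idx]
y_high = spectral_filter(h1, f)[high_idx]
assert len(y_low) + len(y_high) == len(f)    # critically sampled

# Synthesis: zero-fill (upsample) each channel and filter again with the same kernels.
up_low = np.zeros(len(nodes)); up_low[low_idx] = y_low
up_high = np.zeros(len(nodes)); up_high[high_idx] = y_high
f_hat = spectral_filter(h0, up_low) + spectral_filter(h1, up_high)

print(np.allclose(f_hat, f))                 # True: perfect reconstruction on a bipartite graph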
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
Sampling theory for graph signals with applications to semi-supervised learning
Lifting transforms on graphs: theory and applications
Estimation of graph Laplacian and covariance matrices
Compression of signal on graphs with the application to image and video coding
Efficient transforms for graph signals with applications to video coding
Graph-based models and transforms for signal/data processing with applications to video coding
Human motion data analysis and compression using graph based techniques
Modeling and predicting with spatial‐temporal social networks
Novel algorithms for large scale supervised and one class learning
Efficient graph learning: theory and performance evaluation
Scalable sampling and reconstruction for graph signals
Learning and control for wireless networks via graph signal processing
Distributed wavelet compression algorithms for wireless sensor networks
Efficient graph processing with graph semantics aware intelligent storage
Efficient data collection in wireless sensor networks: modeling and algorithms
Application-driven compressed sensing
Neighborhood and graph constructions using non-negative kernel regression (NNK)
Human activity analysis with graph signal processing techniques
Efficient pipelines for vision-based context sensing
Learning the geometric structure of high dimensional data using the Tensor Voting Graph
Asset Metadata
Creator: Narang, Sunil Kumar (author)
Core Title: Critically sampled wavelet filterbanks on graphs
School: Viterbi School of Engineering
Degree: Doctor of Philosophy
Degree Program: Electrical Engineering
Publication Date: 07/24/2012
Defense Date: 05/17/2012
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: digital signal processing, network theory (graphs), OAI-PMH Harvest, sampling in graphs, wavelet transforms
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Ortega, Antonio K. (committee chair), Krishnamachari, Bhaskar (committee member), Liu, Yan (committee member)
Creator Email: kumarsun@usc.edu, narang.sunil@gmail.com
Permanent Link (DOI): https://doi.org/10.25549/usctheses-c3-63114
Unique identifier: UC11289510
Identifier: usctheses-c3-63114 (legacy record id)
Legacy Identifier: etd-NarangSuni-977.pdf
Dmrecord: 63114
Document Type: Dissertation
Rights: Narang, Sunil Kumar
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA