Robust video transmission in erasure networks with network coding
(USC Thesis Other)
ROBUST VIDEO TRANSMISSION IN ERASURE NETWORKS WITH NETWORK CODING

by Hui Wang

A Dissertation Presented to the FACULTY OF THE GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (ELECTRICAL ENGINEERING)

August 2009

Copyright 2009 Hui Wang

Dedication

This dissertation is dedicated to my husband Wanli Wu, my daughter Katherine Wu, and my parents Houkang Wang and Yujia Zhang.

Acknowledgements

First of all, I would like to thank my advisor, Professor C.-C. Jay Kuo, who gave me the opportunity to pursue a Ph.D. in the Media Communication Lab five years ago. I am grateful for his trust in me and for inspiring my potential during the Ph.D. process. Ph.D. research is very challenging, especially under the guidance of Professor Kuo. After going through many intensive discussions with Professor Kuo, who always asked tough questions and provided insightful suggestions, I learned how to do research in depth and breadth. Without Professor Kuo's encouragement, patience, and guidance, I would not have had the confidence to go through the dark period of struggle, hesitation, and self-doubt during my Ph.D. study. The role model set by Professor Kuo will influence my whole life.

I would like to express my deep appreciation to my Ph.D. thesis committee members, Professor Cyrus Shahabi, Professor Antonio Ortega, Professor Zhen Zhang, Professor Bhaskar Krishnamachari, and Professor Panayiotis Georgiou, for their insightful comments and suggestions on improving the presentation and contents of my thesis.

I would also like to thank my collaborator Dr. Ronald Chang. He set a good model for me as an excellent researcher. His critical thinking and challenging questions inspired me with new ideas. I also appreciate his diligence in paper revising and proofreading, from which I learned writing skills to express my ideas precisely.
I am very grateful to my good friends and lab collaborators, Jiaying Liu, May-chen Kuo, Yongjin Cho, Jessy Lee, Bei Wang, Byung-Ho Cha, Namgook Cho, Joyce Liang, Siwei Ma, Zhiwei Yan, Shilin Xu, Meiyin Shen, Xuhua Liu, Kelvin Chou, Athena Huang, Wesley Lee, Angel Dai, Layla Tadjpour, Tanaphol Thaipanich, Jong-Dae Oh, Junghun Park, and Michael A. Enright. Without their companionship and support, my life at USC would not have been so wonderful.

Finally, my deepest gratitude goes to my husband, Wanli, who is my best friend, research collaborator, and partner in life. He is the ever-present help in my life. I thank him for supporting my Ph.D. studies, for taking good care of our Katherine when I was not present, and for being my best research partner and giving me valuable suggestions.

Table of Contents

Dedication
Acknowledgements
List Of Tables
List Of Figures
Abstract
Chapter 1: Introduction
1.1 Significance of the Research
1.2 Review of Previous Work
1.2.1 Robust Video Transmission
1.2.1.1 FEC Protection
1.2.1.2 Redundant Path
1.2.2 Applications of Network Coding
1.2.2.1 Concatenated Network Coding
1.2.2.2 File Distribution in P2P Network
1.2.3 Wireless Network Coding for Media Streaming
1.2.3.1 Wireless Information Exchange with Network Coding
1.2.3.2 Opportunistic Wireless Network Coding
1.2.3.3 Apply Network Coding for Wireless Video Streaming
1.3 Contributions of the Research
1.4 Organization of the Dissertation
Chapter 2: Research Background
2.1 Network Coding
2.1.1 Optimal Capacity of Information Flow
2.1.2 Random Linear Network Coding
2.1.3 Practical Network Coding
2.1.4 Network Coding in Erasure Network
2.2 H.264/SVC
2.2.1 Overview of H.264/SVC
2.2.2 Error Concealment Methods for H.264/SVC
2.2.2.1 Frame Copy and Temporal Direct
2.2.2.2 BLSkip
2.3 Systematic Erasure Coding
Chapter 3: NC-based Video Transmission with Ladder-Shaped Global Coding Matrix (LGCM)
3.1 Video Transmission System with Network Coding
3.1.1 System Overview
3.1.2 Average Packet Loss Rate
3.1.3 Problems of NC-based Video Transmission
3.1.4 H.264/SVC Scalable Layers
3.2 Network Coding with Ladder-Shaped Global Coefficient Matrix (LGCM)
3.2.1 Structure of Ladder-shaped GCM (LGCM)
3.2.2 NC Implementation for LGCM
3.2.3 Mapping Quality Layers to LGCM
3.2.4 Mapping Temporal Layers to LGCM
3.3 Simulation Results
3.3.1 Performance Comparison between Different GCMs
3.3.2 Optimal Redundancy Assignment
3.3.3 Congestion in Bottleneck Link
3.4 Conclusion
Chapter 4: NC-based Video Transmission with Interleaving
4.1 Interleaving Scheme
4.1.1 Traditional Interleaving Schemes
4.1.2 NC-oriented Interleaving Scheme
4.1.3 Comparison between NC-oriented Interleaving and LGCM
4.2 Joint Error Concealment and Interleaving
4.3 GOP Partition and Unequal Protection
4.3.1 GOP Partition
4.3.2 Unequal Erasure Protection
4.4 Design of Optimal Interleaving Scheme
4.4.1 Problem Formulation
4.4.2 Iterative Algorithm to the Optimization Problem
4.5 Simulation Results
4.5.1 Optimal Parameters for Interleaving Scheme
4.5.2 Comparison of NC-Oriented Interleaving and Dense GCM
4.6 Conclusion
Chapter 5: Wireless Multi-party Video Conferencing with Network Coding
5.1 Challenges
5.2 Wireless Video Conferencing with NC
5.2.1 System Overview
5.2.2 Proposed NC Process
5.2.3 Real-Time Video Scheduling
5.3 Performance Analysis
5.3.1 Comparison of Proposed and Opportunistic NC Methods
5.3.2 Effect of Unequal Path Loss in Overhearing Channels
5.3.3 Effect of NC Generation Period
5.4 Simulation Results
5.4.1 Scenario with Equal Path Loss
5.4.2 Scenario with Unequal Path Loss
5.5 Conclusion
Chapter 6: Conclusion and Future Work
6.1 Summary of the Research
6.2 Future Research Work
6.2.1 Extension in DiffServ Network
6.2.2 Deployment of Dummy Users
Bibliography

List Of Tables

3.1 An example of LGCM.
5.1 Comparison of the PSNR value of each user when = 5%.
5.2 WiMAX PER and physical distances when = 0.3; 2 = 0.2.

List Of Figures

1.1 Illustration of a concatenated network coding scheme.
1.2 Illustration of the wireless opportunistic NC.
2.1 A classical network coding example.
2.2 The RLNC process at an intermediate node.
2.3 The global coding matrix (GCM) of an end-to-end delivery system using NC.
2.4 Illustration of the RLNC process going through multiple nodes.
2.5 An example of layering structure of H.264/SVC consisting of four temporal layers and two quality layers.
3.1 Video transmission with network coding in an erasure network.
3.2 Network coding with various amounts of delay.
3.3 A line erasure network consisting of three nodes.
3.4 Three different GCM types.
3.5 Quality degradation due to frame dropping in temporal and quality layers for the test Foreman sequence.
3.6 Encoding a layered source data block with LGCM.
3.7 The structure of an RLNC packet.
3.8 NC encoding at intermediate nodes.
3.9 Two scenarios for the last-hop erasure: (a) access via a wireless communication link and (b) access via a wired broadband modem.
3.10 Performance comparison of three GCM types.
3.11 Comparison of reconstructed frames with (a) the traditional store-and-forward policy and (b) NC with dense block GCM, under a packet loss rate of p = 4%. No redundant packets are used in the frame reconstruction.
3.12 Comparison of the PSNR value for four different NC schemes as a function of the packet loss rate.
3.13 Comparison of reconstructed frames using the dense block GCM and the LGCM with the ratio of redundancy = 4% and the packet loss rate p = 4%.
3.14 Optimal redundancy assignment to the base layer and the enhancement layer as a function of the packet loss rate.
3.15 The PSNR value comparison between dense block GCM and LGCM.
3.16 Throughput of a video session as a function of time, where congestion occurs between 5 and 300 seconds.
3.17 Comparison of reconstructed frames using the general GCM and LGCM with the ratio of redundancy = 10% in the presence of congestion in the bottleneck link.
4.1 Interleaving at the bit level.
4.2 Interleaving at the packet level.
4.3 Interleaving and unequal protection for NC-based video transmission.
4.4 Numerical comparison of the receiving probability of two schemes.
4.5 Comparison of the PSNR values of concealed video with and without interleaving.
4.6 Adjustment of redundancy among partitions.
4.7 Optimal parameters of the NC-oriented interleaving scheme for the 4CIF Harbor sequence at a frame rate of 30 frames/sec.
4.8 Optimal parameters of the NC-oriented interleaving scheme for the 4CIF Soccer sequence at a frame rate of 30 frames/sec.
4.9 Optimal parameters of the NC-oriented interleaving scheme for the 4CIF Crew sequence at a frame rate of 30 frames/sec.
4.10 The probability of losing a generation.
4.11 Minimized quality degradation and the average PSNR value for the 4CIF Harbor sequence at a frame rate of 30 frames/sec.
4.12 Minimized quality degradation and the average PSNR value for the 4CIF Soccer sequence at a frame rate of 30 frames/sec.
4.13 Minimized quality degradation and the average PSNR value for the 4CIF Crew sequence at a frame rate of 30 frames/sec.
4.14 Comparison of a decoded frame (the 44th frame of the Harbor sequence of resolution 704 × 576) using NC-oriented interleaving and dense GCM with 10% redundancy and the packet loss rate p = 8%.
4.15 Comparison of a decoded frame (the 26th frame of the Soccer sequence of resolution 704 × 576) using NC-oriented interleaving and dense GCM with 10% redundancy and the packet loss rate p = 8%.
4.16 Comparison of a decoded frame (the 34th frame of the Crew sequence of resolution 704 × 576) using NC-oriented interleaving and dense GCM with 10% redundancy and the packet loss rate p = 8%.
5.1 Illustration of the NC wireless multi-party video conferencing system.
5.2 Real-time scheduling with even and odd generations of video packets.
5.3 The subspace for user node 1 with (a) the opportunistic NC method and (b) the proposed NC method.
5.4 Illustration of relative locations among users: (a) the unequal path loss case and (b) the equal path loss case.
5.5 Illustration of inserting a dummy user to improve the overall system performance.
5.6 The averaged PSNR performance as a function of the average packet loss rate, , of the downlink channel.
5.7 The downlink bandwidth consumption as a function of the average packet loss rate, , of the downlink channel.
5.8 Comparison of the decoded 275th frame of the Soccer sequence at user node 1 with = 0.05.
5.9 Comparison of the decoded 22nd frame of the Foreman sequence at user node 1 with = 0.05.
5.10 Comparison of the decoded 83rd frame of the Crew sequence at user node 2 with = 0.05.
5.11 Comparison of the decoded 97th frame of the Foreman sequence at user node 2 with = 0.05.
5.12 Comparison of the decoded 60th frame of the Crew sequence at user node 3 with = 0.05.
5.13 Comparison of the decoded 282nd frame of the Soccer sequence at user node 3 with = 0.05.
6.1 NC-based video delivery in DiffServ networks.

Abstract

To transmit video through erasure networks effectively, it is essential to reduce the impact of packet loss on transmitted video quality. The problem of robust video transmission in erasure networks using network coding (NC) is investigated in this research. NC theory, which has been developed recently, offers an alternative that encodes video packets at intermediate nodes to improve the throughput. We consider an NC-based video delivery system that performs random linear network coding (RLNC) at intermediate nodes in erasure networks. RLNC linearly combines a group of packets by randomly selecting weighting coefficients from a finite field in a distributed way. The loss of an RLNC-coded packet is equivalent to the loss of one dimension in the constrained system of equations required for RLNC decoding. Unless the global network coding coefficient matrix (or simply the global coding matrix) is of full rank, we are not able to recover all source packets by network decoding. Three innovative schemes are proposed and analyzed to address the problem of robust video transmission in erasure networks.

First, we propose a way to construct a sparse global coding matrix (GCM), called the ladder GCM, for layered H.264/SVC (scalable video coding) video transmission.
The ladder shape of the sparse matrix is maintained through the RLNC process. The sparse GCM is designed to exploit the scalable layers of H.264/SVC with two objectives: 1) to enable partial decoding of a block; and 2) to provide unequal erasure protection for H.264/SVC priority layers. Quality degradation is minimized by optimizing the amount of redundancy assigned to each layer. Graceful quality degradation is achieved by error concealment (EC).

Next, we present an interleaving scheme that facilitates an integrated NC/EC method. This scheme spreads the impact of one long burst erasure into many short ones distributed over adjacent GOPs, so that lost packets can be recovered more easily by NC and spatial/temporal EC. Moreover, we partition one GOP into priority levels. Packets from the same priority level of several GOPs form one RLNC generation. Then, unequal erasure protection can be applied to different generations. The optimal interleaving length and the redundancy assignment are determined to achieve graceful video quality degradation.

Finally, we study the problem of applying NC to multi-party wireless video conferencing. The network coding scheme can be used to enhance the robustness of video transmission in wireless channels. The erasure protection procedure can be simplified and the downlink bandwidth can be reduced by leveraging opportunistic NC and wireless broadcasting. Specifically, we propose a pipelining schedule that can meet the delay requirement for real-time video conferencing. The proposed NC method outperforms the opportunistic NC method by a significant margin in terms of video quality and downlink bandwidth.

Chapter 1: Introduction

1.1 Significance of the Research

Video transmission has received a lot of attention since it provides a convenient way for users to enjoy video with heterogeneous terminals over broadband IP networks.
Due to the high bit rate of video data, it is important to increase the throughput of the delivery network. Under traffic congestion, packets can be dropped at bottleneck links when queues overflow. Another cause of packet loss is unreliable wireless channels. It is essential to reduce the impact of packet loss on received video quality since the human vision system is sensitive to subtle video degradation. The traditional ARQ (Automatic Repeat-reQuest) scheme may not be suitable for real-time video applications since it introduces long delay and may not meet the stringent time requirement of streaming video. As an alternative, one may adopt forward-error-correction (FEC) codes to protect video packets and use multiple multicast trees to provide redundant paths for the coded video bit stream. In this delivery scheme, packets are transmitted by the store-and-forward (S/F) mechanism at intermediate nodes.

The recently developed network coding (NC) theory offers a new approach for video delivery. The NC scheme encodes packets at intermediate nodes. Although NC has been extensively studied over the last eight years by researchers in the information theory community, its application to video delivery has not yet been well examined. The application of NC to video delivery has two clear advantages. First, it can be used to improve the throughput of data multicast [5, 32]. Second, it is more resilient to packet loss because it generates redundant packets in a rateless way.

A large number of studies in the current literature focus on NC for general data transmission [9, 12, 32]. There are, however, two special considerations that differentiate video delivery from general data delivery. One is the stringent time constraint in the display of continuous media. Late-arriving packets could be useless for display. In this sense, they are equivalent to packet loss.
The other is its error resilience property with respect to packet loss, which plays an important role in video transmission in an erasure network. Video and audio data may tolerate partial data loss with degraded quality. When there is no sufficient protection for a block of data, the contents of lost packets can be compensated by exploiting other received packets for spatial- and/or temporal-domain error concealment (EC). Generally speaking, we can achieve graceful video quality degradation in a packet loss environment.

Due to the mixing process at intermediate nodes of a network in NC, it is difficult to keep the global coding matrix (GCM) of NC as a sparse generator matrix. Generally speaking, if the GCM does not have full rank, it cannot be inverted properly. Then, without proper NC decoding, received packets are useless for the video decoder. Thus, one challenge of our research is to address the insufficient rank problem of the GCM in video transmission in an erasure network using NC. We propose two solutions in this thesis. The first solution is to construct a GCM of the ladder form for layered H.264/SVC (scalable video coding) video transmission, which allows partial decoding of a data block. The second solution is to interleave GOPs to facilitate an integrated NC/EC method. In these two solutions, the GCM structure is designed specifically for the priority layers of the H.264/SVC bitstream. It enables effective unequal protection of H.264/SVC transmission against packet loss.

Our proposed solution provides a general framework of video transmission with NC. It consists of three main processing modules: 1) processing at the source node, 2) processing at intermediate nodes of the network, and 3) processing at the destination (or receiving) node. It can be applied to different network types. One example is the peer-to-peer (P2P) network, where a node can serve as both the client and server simultaneously.
The computing, storing, and routing capabilities of peers are sufficient for the NC implementation.

Another main challenge of applying NC to robust video transmission arises in the context of wireless networks. Unlike traditional wireless NC techniques [17, 20, 21, 37], which only address erasure protection over one type of wireless channel, we consider an enhanced NC scheme for robust video transmission over uplink, downlink, and overhearing channels in wireless multi-party video conferencing. This application imposes more stringent constraints than previously studied scenarios such as video multicast or broadcast. Video quality, communication bandwidth, and stringent delay requirements all pose formidable challenges in real-time video conferencing through error-prone wireless networks.

To address this challenge, we propose a scheme that simplifies the erasure protection procedure and reduces the downlink bandwidth by leveraging opportunistic NC and wireless broadcasting. We also design a pipelining schedule to meet the delay requirement for real-time video conferencing. The enhanced NC scheme can potentially be extended to more general multimedia applications such as those involving video upload and download with multiple network connection interfaces.

1.2 Review of Previous Work

1.2.1 Robust Video Transmission

Two techniques are often adopted in traditional video streaming systems for robust packet transmission. One is to add redundancy and/or priority to coded video, as in multiple description coding (MDC) and prioritized scalable video coding (SVC). The other is to provide path diversity.

1.2.1.1 FEC Protection

The application-level forward error correction (AL-FEC) scheme [13, 30, 31, 38] is a technique that uses erasure codes such as the Reed-Solomon (RS) and the Fountain codes [7] to generate protected packets at the source node, where protection is applied to a block of packets at the application level.
Several existing standards such as 3GPP [2] and DVB-H [1] adopt AL-FEC for robust video transmission in wireless networks. Let $k$ be the number of source packets and $n$ the number of encoded packets ($n > k$) in the following discussion. The RS code is a maximum distance separable (MDS) code. The complexity of the RS$(n, k)$ code is $O(n^2)$. Examples of Fountain codes include the Raptor code [47] and the LT code [35]. They behave like the MDS code asymptotically: they require only slightly more than $k$ packets to decode the source packets. They belong to the class of low-density parity-check (LDPC) codes with a sparse generator matrix. Their encoding and decoding complexities are $O(n)$. If more than $(n - k)$ packets are lost, the code redundancy fails to protect the whole data block. With error concealment (EC), lost source packets can be concealed using the information from other received source packets, resulting in graceful quality degradation. Schierl et al. [44, 45] presented a robust H.264/SVC streaming scheme protected by the Raptor FEC code in mobile ad-hoc networks (MANETs). An unequal amount of FEC protection is assigned to different video layers based on their importance. Robust video transmission via joint source-channel coding in an erasure network environment has also been extensively studied, e.g. [14, 36, 51].

1.2.1.2 Redundant Path

Another way to achieve robust video transmission is to provide path diversity [61] by building multiple multicast trees. Wei and Zakhor [53] proposed to use two path-disjoint multicast trees to improve on the video quality of a single multicast tree in an ad-hoc wireless network. They presented an algorithm to construct and maintain path-disjoint multicast trees in a dynamic network environment. A transmission scheme is adopted to distribute MDC streams effectively over multiple trees at the source node.
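Returning briefly to the $(n, k)$ erasure codes of Section 1.2.1.1: the statement that protection fails once more than $(n - k)$ packets are lost can be turned into a block failure probability. The sketch below is only illustrative. It assumes independent packet losses with a common rate `p` (the text does not specify a loss model), and the function name and the example parameters are hypothetical.

```python
from math import comb


def block_failure_probability(n: int, k: int, p: float) -> float:
    """Probability that an (n, k) MDS-protected block cannot be fully recovered.

    Assumption (not from the text): each of the n packets is lost
    independently with probability p. The block fails when more than
    n - k packets are lost.
    """
    return sum(
        comb(n, lost) * p**lost * (1 - p) ** (n - lost)
        for lost in range(n - k + 1, n + 1)
    )


if __name__ == "__main__":
    k, p = 20, 0.05  # hypothetical block size and loss rate
    for n in (20, 22, 24):
        print(f"n={n}, k={k}, p={p}: failure probability "
              f"{block_failure_probability(n, k, p):.4f}")
```

With no redundancy ($n = k$) the block fails as soon as a single packet is lost, so the failure probability is $1 - (1 - p)^k$; each added redundant packet lowers it.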
Padmanabhan, Wang and Chou [39] proposed a video multicast scheme using multiple trees and MDC to provide redundancy for live data streaming, where robustness is achieved by redundant paths in the network and redundant data representation in MDC.

1.2.2 Applications of Network Coding

1.2.2.1 Concatenated Network Coding

Walsh et al. [52] proposed a method that concatenates LDPC with NC. The decoder can start decoding even if only one packet of a data block is received. Source packets are arranged by priority and coded by a method called priority encoding transmission (PET) [9]. Suppose there are $N$ source packets of length $M$ bytes, as shown on the left side of Fig. 1.1. Assume that the importance of packets decreases from packet 1 to packet $N$. Source packet $i$ in the block is coded into an intermediate block of size $N \times \frac{M}{i}$. Intermediate blocks are assembled into an encoded data block, as shown on the right side of Fig. 1.1. Then, the rows of the encoded data block act as input packets for NC. The concatenation of the two coding methods guarantees that the $n$ most important source packets can be decoded as long as $n$ NC packets are received. Moreover, the decoding delay is small since any received NC packet contributes to the decoding of a source packet. However, this method has one shortcoming: it demands more bandwidth. Although the number of packets in the encoded data block is the same as in the source data block, as shown in Fig. 1.1, the packet length is expanded by a factor of $\sum_{i=1}^{N} \frac{1}{i}$. If $N > 4$, the bandwidth requirement is more than twice the original one.

Figure 1.1: Illustration of a concatenated network coding scheme (labels: the most important packet, the least important packet).

1.2.2.2 File Distribution in P2P Network

Traditional P2P content distribution methods such as BitTorrent partition a file into multiple pieces and distribute them to multiple peers. A node collects pieces of a file from its neighbors. A local rarest scheduling algorithm is adopted to find missing pieces.
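As a quick numerical check of the bandwidth overhead of the concatenated scheme in Section 1.2.2.1, the packet-length expansion factor $\sum_{i=1}^{N} \frac{1}{i}$ can be evaluated exactly. This is a minimal sketch; the function name is hypothetical, and it assumes that source packet $i$ contributes $M/i$ bytes to every row of the encoded block, which is how the expansion factor is read here.

```python
from fractions import Fraction


def pet_expansion_factor(num_packets: int) -> Fraction:
    """Packet-length expansion factor sum_{i=1}^{N} 1/i for N source packets.

    Assumption: source packet i contributes M/i bytes to each encoded row,
    so each encoded row is M * (1 + 1/2 + ... + 1/N) bytes long.
    """
    return sum((Fraction(1, i) for i in range(1, num_packets + 1)), Fraction(0))


if __name__ == "__main__":
    for n in range(1, 9):
        factor = pet_expansion_factor(n)
        print(f"N={n}: factor = {factor} (about {float(factor):.3f})")
```

Under this reading the factor is 11/6 for $N = 3$ and 25/12 for $N = 4$, so the doubling of bandwidth mentioned above is reached with only a handful of priority levels, and the factor keeps growing (logarithmically) with $N$.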
There is an emerging trend of using NC for file distribution in P2P networks. One famous example is Avalanche, which provides a scalable and fast file distribution solution for TV-on-demand, patch and software distribution, etc. The file download time of Avalanche is 20% to 30% less than that of BitTorrent. Gkantsidis et al. [12] explained how the file download time in a P2P network is reduced by NC. A file is partitioned into multiple pieces, and each piece is encoded using RLNC and stored in peers in a distributed way. Peers linearly combine stored file pieces and may generate new combinations from existing combinations. Such encoding ensures that any piece uploaded by a peer can be of use to another peer. The client keeps requesting NC-mixed packets from peers until its global coding matrix has full rank, so that the original file can be decoded properly. This reduces the download time compared with the traditional P2P system since there is no need to identify the rarest packets. Peers do not need to find specific pieces in the system to complete a desired file. Instead, any encoded piece will suffice. This makes the system robust even when some peers leave the system abruptly. Also, the source is no longer the bottleneck since no peer is more important than others. Moreover, the network bandwidth consumption is reduced considerably because of the benefit of NC. The main disadvantage is that the time of RLNC encoding/decoding increases with the file size.

1.2.3 Wireless Network Coding for Media Streaming

Wireless NC has been studied by several researchers. Some of the main results are briefly reviewed in this section. The benefits of wireless NC are twofold: one is to reduce bandwidth consumption, and the other is to improve the erasure protection capability. However, most of the previous wireless NC methods only address erasure protection over the downlink. Those methods do not apply to our multi-party video conferencing scenario, which requires erasure protection over uplink, downlink, and overhearing channels.
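Before turning to the wireless schemes, the "request coded packets until the global coding matrix has full rank" behavior described for RLNC-based file distribution in Section 1.2.2.2 can be sketched in a few lines. This is an illustrative toy, not the Avalanche implementation: it assumes coefficients in GF(2) (so mixing is a plain XOR; practical systems typically use a larger field), represents each piece as an integer, and the names `encode`, `Decoder`, and the 20% loss rate are hypothetical.

```python
import random


def encode(pieces, rng):
    """Return (coeff_mask, payload): a random nonzero GF(2) combination of pieces."""
    k = len(pieces)
    mask = rng.randrange(1, 1 << k)  # bit i set -> piece i is mixed in
    payload = 0
    for i in range(k):
        if (mask >> i) & 1:
            payload ^= pieces[i]
    return mask, payload


class Decoder:
    """Incremental Gaussian elimination over GF(2) on received coded packets."""

    def __init__(self, k):
        self.k = k
        self.rows = {}  # pivot bit -> (mask, payload), kept fully reduced

    @property
    def rank(self):
        return len(self.rows)

    def receive(self, mask, payload):
        """Add a coded packet; return True if it was innovative (rank grew)."""
        for pivot, (m, p) in self.rows.items():
            if (mask >> pivot) & 1:
                mask ^= m
                payload ^= p
        if mask == 0:
            return False  # linearly dependent on what we already hold
        pivot = mask.bit_length() - 1
        for q, (m, p) in list(self.rows.items()):
            if (m >> pivot) & 1:
                self.rows[q] = (m ^ mask, p ^ payload)
        self.rows[pivot] = (mask, payload)
        return True

    def decode(self):
        """Return the original pieces once the coding matrix has full rank."""
        if self.rank < self.k:
            return None
        return [self.rows[i][1] for i in range(self.k)]


rng = random.Random(7)
source = [rng.getrandbits(32) for _ in range(8)]  # eight toy file pieces
dec = Decoder(len(source))
while dec.rank < dec.k:                 # keep requesting until full rank
    mask, payload = encode(source, rng)
    if rng.random() < 0.2:              # hypothetical 20% erasure rate
        continue
    dec.receive(mask, payload)
recovered = dec.decode()
```

The client never asks for a specific piece: any innovative combination raises the rank by one, which is the property the text credits for the robustness of the scheme when peers leave.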
1.2.3.1 Wireless Information Exchange with Network Coding

Wu et al. [55] studied the benefit of performing NC in a simple wireless ad hoc network. They proposed a bandwidth-reduction and power-saving method based on the XOR operation at intermediate nodes for wireless information exchange. The proposed scheme reduces the bandwidth consumption by broadcasting NC packets over the wireless channel. However, the bandwidth reduction diminishes as the number of users increases: for $U$ users, the bandwidth is $1/U$. Moreover, when NC is adopted in an error-prone network, errors propagate. The issue of performance degradation due to error propagation was not discussed for NC-based message exchange in [55]. To address the error propagation effect, Karande et al. [20] proposed a cross-layer wireless NC method and studied the optimality condition, which can be achieved by selecting NC or S&F dynamically at intermediate nodes based on the error rate. For small SNR, it performs NC at the intermediate node. For large SNR, it performs traditional store-and-forward at the intermediate node.

1.2.3.2 Opportunistic Wireless Network Coding

Katti et al. [21] proposed the use of opportunistic scheduling in the scenario of multiple unicasts to improve throughput. This method performs optimal scheduling based on the state information of neighboring nodes. With the information overheard from neighbors, optimally decodable NC codes can be generated for neighbors. As a result, the throughput is maximized. For example, in Fig. 1.2, assume that $n_1$ wants to send packet $P_1$ to $n_2$, and $n_4$ wants to send $P_2$ to $n_5$. If $n_5$ and $n_2$ overhear $P_1$ and $P_2$, respectively, then $n_3$ broadcasts $P_1 \oplus P_2$, and $n_5$ and $n_2$ decode $P_2$ and $P_1$, respectively. This method has two problems: there is no erasure protection on the uplink and overhearing channels, and there is no throughput gain when packets are lost on both channels. For example, if $P_1$ is lost over channels $n_1 \to n_3$ and $n_1 \to n_5$ in Fig. 1.2, $n_3$ only forwards $P_2$ to $n_5$, and $n_2$ cannot decode $P_1$.
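The XOR exchange in this example, and the way it breaks down when $P_1$ never reaches the relay, can be sketched as follows. The code is a toy illustration under stated assumptions: packets have equal length, overhearing is modeled simply by the receiver already holding the other packet, and the helper names `xor_bytes` and `relay_output` are hypothetical.

```python
from typing import Optional, Tuple


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length packets."""
    if len(a) != len(b):
        raise ValueError("packets must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def relay_output(p1: Optional[bytes], p2: Optional[bytes]) -> Tuple[Optional[bytes], bool]:
    """What relay n3 broadcasts, as (payload, is_coded).

    None means the corresponding packet never reached n3 (uplink erasure).
    """
    if p1 is not None and p2 is not None:
        return xor_bytes(p1, p2), True
    return (p1 if p1 is not None else p2), False


P1, P2 = b"pkt-from-n1", b"pkt-from-n4"

# Loss-free case: n3 broadcasts P1 xor P2 once and both receivers decode.
coded, is_coded = relay_output(P1, P2)
recovered_at_n2 = xor_bytes(coded, P2)  # n2 overheard P2 and wants P1
recovered_at_n5 = xor_bytes(coded, P1)  # n5 overheard P1 and wants P2

# Lossy case: P1 is erased on n1 -> n3, so n3 can only forward P2 uncoded.
forwarded, is_coded_lossy = relay_output(None, P2)
```

In the loss-free case one broadcast serves both receivers. In the lossy case the relay falls back to plain forwarding of $P_2$, so $n_5$ is served but $n_2$ receives nothing that helps it recover $P_1$.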
There is no throughput gain.

1.2.3.3 Apply Network Coding for Wireless Video Streaming

Nguyen et al. [37] studied the application of NC in wireless networks for video broadcasting. They proposed an optimal scheme to generate erasure protection codes by performing NC for retransmitting lost packets. By gathering ARQ (Automatic Repeat Request) messages from receivers, the broadcasting source generates optimal NC packets. Seferoglu et al. [17] proposed an extension of opportunistic scheduling with NC for multiple video unicasts. The scheme takes into account not only the throughput, but also the video quality and the transmission deadlines. At the intermediate nodes, new packets are generated by XORing selected video packets from different bitstreams according to their contribution to the overall quality. The NC codes are generated based on the priority and urgency of these packets. Receiving nodes listen to the transmissions of neighbors and store overheard packets for future decoding. This introduces storage overhead at the receivers. Moreover, neighbor nodes need to exchange and update the stored content with each other, which requires extra communication in the network. Both works assume that the wireless BS/AP receives video packets without loss and then multicasts them to wireless receivers. Both works provide erasure protection over downlinks only, and the assumption that the BS/AP receives all video packets does not hold in our multi-party video conferencing scenario.

Figure 1.2: Illustration of the wireless opportunistic NC. (Nodes n1 to n5; uplink, downlink and overhearing channels.)

1.3 Contributions of the Research

The design of a robust video transmission scheme using NC in an erasure network is studied in depth in this thesis. The main research contributions are summarized below.

- A robust video transmission system using H.264/SVC and NC in an erasure network is proposed in Chapter 3.
First, at the source node, the H.264/SVC video bitstream is partitioned into priority layers and packetized into packets of equal length. Random linear network coding (RLNC) is performed at intermediate nodes of the network. At the receiver node, video data are decoded with an H.264/SVC decoder equipped with error concealment capability. The system performance is evaluated by comparing the quality of the source video bitstream with that of the decoded bitstream.

- We address the issue of partial decoding when the global coding matrix (GCM) does not have full rank. We propose the following two solutions by exploiting video properties.

  - We design a sparse ladder GCM for layered video transmission, whose shape is maintained throughout the RLNC process. The sparse ladder GCM has two functions: 1) to enable partial decoding of a block and 2) to provide unequal erasure protection for priority layers. Quality degradation is minimized by optimizing the amount of redundancy at each layer. Furthermore, graceful quality degradation is achieved with the assistance of error concealment (EC). This subject is treated in depth in Chapter 3.

  - We propose an interleaving scheme that enables NC and EC to cooperate with each other effectively. This scheme spreads the impact of one long burst erasure into many short ones distributed over adjacent GOPs, so that lost packets can be recovered more easily by NC and spatial/temporal EC. Moreover, we can partition one GOP into priority levels. Packets from the same priority level of multiple GOPs form one RLNC generation. Then, unequal erasure protection can be applied to different generations. The optimal interleaving length and the redundancy assignment are determined by a low-complexity algorithm to achieve graceful video quality degradation. This is the main theme of Chapter 4.

- We propose an unequal erasure protection of H.264/SVC video using its quality and temporal layers.
Specifically, we consider a sparse GCM and partition GOPs according to properties of H.264/SVC scalable layers. Different unequal erasure protection schemes are applied to different layers of the H.264/SVC bitstream.

- We propose an improved NC scheme for robust and cost-effective wireless multi-party video conferencing. The proposed scheme enhances robustness by protecting video transmission over uplink, downlink and overhearing channels. It simplifies the erasure protection procedure and reduces the downlink bandwidth by leveraging opportunistic NC and wireless broadcasting. We design a pipelining schedule to meet the delay requirement of real-time video conferencing. The proposed NC method outperforms the opportunistic NC method by a significant margin in terms of video quality and downlink bandwidth.

1.4 Organization of the Dissertation

The rest of the dissertation is organized as follows. The background of NC and H.264/SVC is reviewed in Chapter 2. Video transmission with sparse ladder GCM is studied in Chapter 3. Video transmission with interleaving, NC and EC is presented in Chapter 4. An improved NC scheme for robust wireless multi-party video conferencing is presented in Chapter 5. Finally, concluding remarks are given and future research directions are pointed out in Chapter 6.

Chapter 2 Research Background

Research background is provided in this chapter for the sake of completeness. We first review network coding (NC) theory in Section 2.1, including the optimal rate achieved by NC, its properties in the erasure network and practical implementation issues. Second, we give an overview of the H.264/SVC coding standard as well as error concealment methods in Section 2.2. Then, we examine sparse and dense generator matrices in Section 2.3 and conclude that the sparse matrix is desired for erasure protection codes. Finally, we review wireless NC techniques which are proposed to improve the performance of video streaming in Section 1.2.3.
2.1 Network Coding

Since the pioneering work of Ahlswede et al. [5] on NC, the properties and applications of NC have been extensively studied by researchers. For example, NC achieves the optimal multicast rate [5]. NC can reduce power consumption by broadcasting NC packets in wireless networks [54,56,57]. NC can reduce the data gathering time in sensor networks. NC also enhances a network's capability in error correction [8,59,60], security [27], network storage [4,11], etc.

Figure 2.1: A classical network coding example. (Left: store-and-forward; right: network coding; nodes S, T, U, W, X, Y, Z; bits b1, b2.)

The basic concept of NC can be explained with the classic butterfly example shown in Fig. 2.1, where each link is assumed to have unit capacity. Source node $S$ sends two bits $(b_1, b_2)$ to nodes $Y$ and $Z$. In the traditional store-and-forward network, node $W$ has to select either $b_1$ or $b_2$ to forward to node $X$. In this case, link $W \to X$ is the bottleneck. Node $Y$ receives $(b_1, b_2)$ while node $Z$ receives $b_2$ (or node $Y$ receives $b_1$ while node $Z$ receives $(b_1, b_2)$). If NC is performed at node $W$ (say, $b_1 \oplus b_2$ is sent along link $W \to X$), both nodes $Y$ and $Z$ can receive $(b_1, b_2)$ after simple decoding.

2.1.1 Optimal Capacity of Information Flow

Graph tools are often used to explain NC concepts and operations. We denote a directed graph by $G = (V, E)$, where $V$ and $E$ are the sets of vertices and edges, and $|E|$ denotes the total number of edges in $G$. $V = \{S, I, T\}$, where $S$ is the set of source nodes, $I$ is the set of intermediate nodes and $T$ is the set of receiving nodes. Every edge is assumed to have unit capacity, and two nodes can be connected by multiple links to indicate higher capacity.

Ahlswede et al. [5] proved an important result: the maximum multicast information rate from a source to a set of receivers can be achieved only by allowing coding at intermediate nodes.
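As a minimal sketch of the butterfly example in Fig. 2.1, the following Python fragment compares store-and-forward with XOR coding at node W. The function name `butterfly` and the assumed link layout (Y hears b1 directly, Z hears b2 directly, and both hear what W forwards over the bottleneck) are illustrative choices that match the store-and-forward case in which W forwards b2.

```python
def butterfly(b1, b2, use_nc):
    """Return the bit pairs (b1, b2) that sinks Y and Z end up with.

    Assumed layout: Y hears b1 directly, Z hears b2 directly, and both
    hear whatever node W forwards over the bottleneck link W -> X.
    An unknown bit is reported as None.
    """
    if use_nc:
        coded = b1 ^ b2              # W sends b1 XOR b2 over W -> X
        y = (b1, b1 ^ coded)         # Y recovers b2 = b1 XOR coded
        z = (b2 ^ coded, b2)         # Z recovers b1 = b2 XOR coded
    else:
        forwarded = b2               # store-and-forward: W picks b2
        y = (b1, forwarded)          # Y obtains both bits
        z = (None, b2)               # Z never learns b1
    return y, z


for b1 in (0, 1):
    for b2 in (0, 1):
        print(b1, b2, butterfly(b1, b2, use_nc=False), butterfly(b1, b2, use_nc=True))
```

With coding enabled, both sinks hold both bits for every input pair, whereas without coding one sink always misses a bit.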
The optimal multicast rate is defined below. In a multicast session, source node $S$ transmits information to a set of receivers, denoted by $T = \{t_1, t_2, \ldots, t_n\}$. A cut is a partition of the vertices of the multicast graph into two sets. One part, denoted by $V(S)$, has $S$ as a member. The other part, denoted by $V(t_i)$, has one of the receivers, $t_i$, as a member. For each $(S, t_i)$ pair, there exist many cuts that perform the partition. The capacity of a cut is the sum of the capacities of all links from $V(S)$ to $V(t_i)$ over the cut:

\[ C(S, t_i) = \sum_{(p,q) \in \Gamma_{+}(Q)} r_{pq}, \tag{2.1} \]

where $r_{pq}$ is the capacity of link $p \to q$ and $(p,q) \in \Gamma_{+}(Q)$ ranges over the set of all links crossing the cut. There is a minimum capacity among all these cuts. Then, the optimal rate is

\[ \mathrm{rate}(S, T) \le \min_{t \in T} \mathrm{Mincut}(S, t). \]

2.1.2 Random Linear Network Coding

Theoretically, the maximum multicast information rate can be achieved by a generic linear network coding scheme [18,24,25], where packets are generated by linear combinations of source packets with coefficients selected from a finite field. By selecting these coefficients carefully, a newly generated packet is linearly independent of all other packets on all nodes in the network. The computational complexity of the algorithm is polynomial. However, it is a centralized method that requires knowledge of the network topology and a huge field size to guarantee successful decoding. The field size can be computed by

\[ \binom{\eta + k - 1}{k - 1}, \]

where $\eta$ is the total number of channels in the network, and $k$ is the source information rate. This number is usually too large for practical implementation. Ho and Medard [15,16] proposed a randomized linear network coding (RLNC) scheme, which selects coefficients over a Galois field randomly and independently. RLNC is a distributed scheme that does not demand knowledge of the network topology. Its computational complexity is low.
Besides, it demands only a small field size to achieve a reasonably high successful decoding probability.

Figure 2.2: The RLNC process at an intermediate node. (Incoming packets $P^{in}_i$, outgoing packets $P^{out}_j$.)

At the source node, source data packets are grouped into blocks of different generations. Packets of the same generation are linearly combined at random to generate new packets for outgoing links as shown in Fig. 2.2. Suppose an intermediate node, $i$, has $l(i)_{in}$ incoming links and $l(i)_{out}$ outgoing links. Packets at the source node can be viewed as arriving from virtual incoming links. Node $i$ generates a packet for each outgoing link by randomly and independently selecting linear NC coefficients from the Galois field (GF) and performing the following linear combination of all packets received from incoming links:

\[ P_{l(i)_{out}} = \sum_{j \in \{l(i)_{in}\}} P_j \alpha_j, \tag{2.2} \]

where $\alpha_j$ is the coefficient selected for incoming packet $P_j$. For each source-node and receive-node pair, we can relate source and received packets by a global coding matrix (GCM) as

\[ R = G S, \]

where $S$ is the source data block, $S = [P_0, \ldots, P_k]^T$, $R$ is the received data block, $R = [Y_0, \ldots, Y_k]^T$, and $G$ is the GCM, which is updated when a new packet is received. This is illustrated in Fig. 2.3. At the receive node, a packet is called an innovative packet if its arrival increases the rank of $G$. Once $G$ has full rank, source packets can be decoded with Gaussian elimination at the receiver.

Figure 2.3: The global coding matrix (GCM) of an end-to-end delivery system using NC. (The received block $[Y_1, \ldots, Y_k]^T$ equals the $k \times k$ matrix $[g_{ij}]$ times the source block $[P_1, \ldots, P_k]^T$.)

The RLNC mixing process in a network can be described by linear algebra. For example, consider the RLNC process at node N5 in Fig. 2.4. Assume there are 3 incoming links and 2 outgoing links, and there are 4 original SVC prioritized packets $\{P_0, P_1, P_2, P_3\}$ being sent from the source nodes in one time unit.
The incoming packets are

\[ P^{in}_0(N5) = a_0 P_0 + a_1 P_1 + a_2 P_2 + a_3 P_3, \tag{2.3} \]
\[ P^{in}_1(N5) = b_0 P_0 + b_1 P_1 + b_2 P_2 + b_3 P_3, \tag{2.4} \]
\[ P^{in}_2(N5) = c_0 P_0 + c_1 P_1 + c_2 P_2 + c_3 P_3. \tag{2.5} \]

To generate a new packet for each outgoing link at node N5, we have

\[ P^{out}_0(N5) = d_0 P^{in}_0(N5) + d_1 P^{in}_1(N5) + d_2 P^{in}_2(N5), \tag{2.6} \]
\[ P^{out}_1(N5) = e_0 P^{in}_0(N5) + e_1 P^{in}_1(N5) + e_2 P^{in}_2(N5), \tag{2.7} \]

where $d_i$ and $e_i$, $i = 0, 1, 2$, are random coefficients selected from the Galois field. Each newly generated packet is therefore also a linear combination of the original packets $\{P_0, P_1, P_2, P_3\}$. It can be written as

\[ P^{out}_0(N5) = \begin{pmatrix} d_0 & d_1 & d_2 \end{pmatrix} \begin{pmatrix} a_0 & a_1 & a_2 & a_3 \\ b_0 & b_1 & b_2 & b_3 \\ c_0 & c_1 & c_2 & c_3 \end{pmatrix} \begin{pmatrix} P_0 \\ P_1 \\ P_2 \\ P_3 \end{pmatrix}. \tag{2.8} \]

Figure 2.4: Illustration of the RLNC process going through multiple nodes.

The above process is iteratively performed at every node in the network. The generation of each new packet is controlled by a vector of coefficients, which can be transmitted in packet headers to the next node. At the receiver end, we use the GCM to describe the end-to-end relation between source and received packets. If the rank of the GCM is the same as the number of original source packets, then source packets are recoverable. Sometimes, due to random coefficient selection, some packets may be linearly dependent on each other. Then, the receive node has to wait for more packets to arrive so that the GCM can reach full rank. The probability, $P_s$, for the GCM to be invertible is related to three parameters: the Galois field size ($|F|$), the number of receivers ($d$), and the number of links in the network ($\eta$). It was derived in [16] that the probability of successful decoding is equal to

\[ P_s = \left( 1 - \frac{d}{|F|} \right)^{\eta}. \tag{2.9} \]

Thus, the probability for a received packet to be linearly dependent on other packets is small.
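The encoding, re-encoding and decoding steps described above can be sketched as follows. This is an illustrative sketch only, under several assumptions: it works over the prime field GF(257) instead of a binary extension field, it carries the global coding vector as a header in front of each payload, and all names (`combine`, `rlnc_packet`, `solve`, `demo`) are hypothetical.

```python
import random

P = 257  # small prime; GF(P) stands in for the Galois field (assumption)


def combine(packets, coeffs):
    """Element-wise linear combination sum_j coeffs[j] * packets[j] over GF(P)."""
    return [sum(c * pkt[i] for c, pkt in zip(coeffs, packets)) % P
            for i in range(len(packets[0]))]


def rlnc_packet(packets, rng):
    """Mix the given packets with freshly drawn random coefficients."""
    return combine(packets, [rng.randrange(P) for _ in packets])


def solve(rows, k):
    """Gauss-Jordan elimination on rows [g_1..g_k | payload] over GF(P).

    Returns the k decoded payloads, or None if the GCM is rank deficient.
    """
    rows = [r[:] for r in rows]
    for col in range(k):
        pivot = next((r for r in range(col, len(rows)) if rows[r][col]), None)
        if pivot is None:
            return None                      # not enough innovative packets
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = pow(rows[col][col], P - 2, P)  # inverse via Fermat's little theorem
        rows[col] = [x * inv % P for x in rows[col]]
        for r in range(len(rows)):
            if r != col and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(x - f * y) % P for x, y in zip(rows[r], rows[col])]
    return [rows[i][k:] for i in range(k)]


def demo(seed=1, k=4, payload_len=6):
    rng = random.Random(seed)
    payloads = [[rng.randrange(P) for _ in range(payload_len)] for _ in range(k)]
    # Source packets carry a unit coding vector as header.
    source = [[int(i == j) for j in range(k)] + payloads[i] for i in range(k)]
    # A relay buffers k + 1 coded packets and re-encodes them on demand.
    relay_buffer = [rlnc_packet(source, rng) for _ in range(k + 1)]
    received, decoded = [], None
    for _ in range(10 * k):                  # receiver waits for full rank
        received.append(rlnc_packet(relay_buffer, rng))
        decoded = solve(received, k) if len(received) >= k else None
        if decoded is not None:
            break
    return payloads, decoded, len(received)


payloads, decoded, used = demo()
print(decoded == payloads, used)
```

Because the header is mixed with the same coefficients as the payload, the receiver never needs to know the topology: the rows it collects already form the GCM.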
2.1.3 Practical Network Coding

In a realistic network environment, packets are transmitted asynchronously with various delays. Packets are lost randomly due to broken links, unknown capacities and topologies, and dynamic environments with changing nodes and link failures. Building on RLNC, Chou, Wu and Jain [9] proposed a practical NC scheme that accommodates realistic network characteristics by introducing a buffer management model and addressing asynchronous packet transmission with delay and loss. The buffer policy is to flush the buffer when the first packet of a new generation arrives at any node. It is a simple and robust method, but it comes at the cost of throughput loss. Global coefficients are carried in the header of every packet and updated whenever a new RLNC process is performed at an intermediate node. The generation number is also carried in the header for ease of encoding and decoding. For a setting with 50 packets in one generation and 1400 bytes in one packet, the header overhead is less than 3%.

In a practical implementation, NC can be performed in the network layer of an IP network or the application layer of an overlay network. When it is performed in the network layer, RLNC packets are stored in the payload of IP packets. On the other hand, when it is performed in the application layer, RLNC packets are stored in the payload of UDP/TCP packets.

2.1.4 Network Coding in Erasure Network

Since NC can potentially generate a large number of coded packets at intermediate nodes, it provides redundancy to protect source packets in erasure networks. Lun et al. [32,33] and Pakzad et al. [40] studied the optimal rate in erasure networks and derived the following result:

\[ \mathrm{rate}(s, T) \le \min_{t \in T} \min_{Q \in \mathcal{Q}(s,t)} \Big\{ \sum_{(i,j) \in \Gamma_{+}(Q)} r_{ij} (1 - \varepsilon_{ij}) \Big\}, \tag{2.10} \]

where $Q \in \mathcal{Q}(s,t)$ is a cut of $(s,t)$, $(i,j) \in \Gamma_{+}(Q)$ denotes all links crossing the $(s,t)$ cut, and $\varepsilon_{ij}$ is the packet loss rate on link $(i,j)$. Lun et al.
[34] also proved that a small amount of memory is sufficient to achieve the above rate asymptotically. Although NC offers a better erasure protection capability, it does not guarantee full rank at every generation. It is possible that packets are lost and the GCM does not have full rank. Consequently, RLNC decoding fails. We may concatenate an FEC code with NC as follows.

Concatenated FEC-NC Encoding:
Step 1: Encode $[s_1, \ldots, s_k]^T$ into $[x_1, \ldots, x_n]^T$ with a Reed-Solomon $(n, k)$ code $C$.
Step 2: Encode $[x_1, \ldots, x_n]^T$ into $[y_1, \ldots, y_N]^T$ with RLNC.

Concatenated FEC-NC Decoding:
Step 1: Obtain $[x_1, \ldots, x_n]^T$ by Gaussian elimination if the rank of the GCM is equal to $n$.
Step 2: Obtain $[s_1, \ldots, s_k]^T$ from $[x_1, \ldots, x_n]^T$ by RS decoding.

However, if the rank of the GCM is less than $n$, $[x_1, \ldots, x_n]^T$ cannot be decoded in the above decoding process.

For NC-based transmission in erasure networks, Silva et al. [48,49] studied a rank-metric NC method that guarantees erasure protection or error correction, which is measured by the minimum rank distance of the code. If the number of lost packets is smaller than the minimum distance provided by the rank-metric code, packets are guaranteed to be decodable. The decoding process is similar to that of Reed-Solomon decoding. The rank-metric code is a maximum distance separable (MDS) code. Its encoding and decoding algorithms are described below.

Encoding of Rank-metric NC:
Step 1: Encode $[s_1, \ldots, s_k]^T$ into $[x_1, \ldots, x_n]^T$ with a Gabidulin code $C$, where $k$ packets are encoded into $n$ packets with $n - k$ redundant packets.
Step 2: Encode $[x_1, \ldots, x_n]^T$ into $[y_1, \ldots, y_N]^T$ with RLNC.

Decoding of Rank-metric NC:
Step 1: Obtain $[x_1, \ldots, x_n]^T$ by finding a codeword $\hat{x}$ in $C$ that satisfies $\hat{x} = \arg\min \mathrm{rank}(y - \hat{x})$.
Step 2: Obtain $[s_1, \ldots, s_k]^T$ from $[x_1, \ldots, x_n]^T$ by Gabidulin decoding.

In other words, the minimum distance erasure protection is provided by rank-metric codes instead of FEC.
The rank-metric code is a powerful tool that comes at the cost of higher encoding and decoding complexity. If the minimum rank distance is $d$, the code can detect and correct any pattern of $\epsilon$ errors and $\mu$ erasures provided that $2\epsilon + \mu \le d - 1$.

2.2 H.264/SVC

2.2.1 Overview of H.264/SVC

Scalable video coding is a compression technique that encodes a video stream into a number of meaningful (decodable) video representations. A global scalable bitstream of full resolution/quality can be truncated to obtain lower layers. Although this technique was adopted in some video standards such as MPEG-2 and MPEG-4, it was not widely used due to its poor coding gain and higher coding complexity compared with single-layer coding schemes. The emerging H.264/SVC standard has received a lot of attention recently due to its improved coding efficiency and reduced coding complexity. It encodes an image sequence into base and enhancement layers. A bitstream of multiple priority layers enables video streaming to adapt to bandwidth fluctuation in a network. By discarding packets of less importance (or truncating a bitstream), a reduced spatial-temporal-quality resolution of the full video bitstream can be obtained with graceful quality degradation. It is an ideal candidate for video transmission in many scenarios such as erasure networks and networks with heterogeneous clients. Moreover, it can be used in surveillance video systems to provide flexible storage by partially removing the low-quality portion of archived video.

Multiple description coding (MDC) encodes a video source into several correlated descriptions, where any subset of these descriptions can be decoded to reconstruct the video source partially. In contrast, SVC encodes the video source into a layered bitstream consisting of a base layer and several enhancement layers. We choose the H.264/SVC coded bitstream in our research for three reasons. First, it provides a set of layers along the temporal, spatial and quality dimensions.
It enables partial decoding of the bitstream when only a set of low-resolution layers is received. Second, unequal error protection can be used efficiently in association with priority layers. Third, error concealment can be easily performed since we can use low-resolution video to conceal high-resolution video.

Figure 2.5: An example of the layering structure of H.264/SVC consisting of four temporal layers and two quality layers. (Temporal layer order within a GOP: T0 T3 T2 T3 T1 T3 T2 T3 T0.)

2.2.2 Error Concealment Methods for H.264/SVC

Error concealment (EC) methods in H.264/SVC handle not only intra-layer concealment but also inter-layer concealment. Some methods are reviewed below.

2.2.2.1 Frame Copy and Temporal Direct

Frame copy and temporal direct methods can conceal lost frames or slices from adjacent frames. In the frame copy method, every pixel of the lost frame is copied from the first frame of the reference picture list. For example, if a frame in layer $T_3$ is lost in the hierarchical B structure shown in Fig. 2.5, it is copied from the left adjacent frame. All motion information is lost in frame copy. With the temporal direct method, motion vectors in lost slices are calculated in the same way as in the temporal direct mode [50]. Compared with the frame copy method, the temporal direct method estimates motion information from reference frames, which results in better quality. However, block artifacts may appear, which can be annoying to human eyes.

2.2.2.2 BLSkip

BLSkip is an EC tool that conceals quality or spatial layers by exploiting inter-layer correlations in the H.264/SVC decoder. It was proposed by X. Kai et al. [19]. When the base layer is received and the enhancement layer is lost, BLSkip up-samples the motion and the residual information from the base layer to reconstruct the enhancement layer.
Since the coarse quality layers are encoded similarly to spatial layers, the BLSkip method can also be used to reconstruct quality enhancement layers.

2.3 Systematic Erasure Coding

It was mentioned in Chapter 1 that Reed-Solomon or LDPC coding methods can generate systematic codes, where the source packets are embedded in the encoded packets. Suppose there are $k$ source packets and $r$ redundant packets. The generator matrix is given by

\[ G_{sparse} = \begin{bmatrix} I_k \\ P \end{bmatrix} = \begin{bmatrix} 1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & 1 \\ p_{11} & \cdots & p_{1k} \\ \vdots & \ddots & \vdots \\ p_{r1} & \cdots & p_{rk} \end{bmatrix}, \]

where $I_k$ is a $k \times k$ identity matrix, $P$ is an $(n-k) \times k$ matrix, and $r = n - k$. When the number of lost packets is larger than $r$, the built-in redundancy fails to protect the $k$ source packets. Then, some lost source packets have to be estimated from received packets by EC.

Due to the linear combination procedure adopted at each intermediate node, NC generates a non-systematic erasure code, where direct replicas of source packets are not found in encoded packets. The generator matrix is in general a dense matrix of the following form:

\[ G_{dense} = \begin{bmatrix} g_{11} & \cdots & g_{1k} \\ \vdots & \ddots & \vdots \\ g_{k1} & \cdots & g_{kk} \\ g_{k+1,1} & \cdots & g_{k+1,k} \\ \vdots & \ddots & \vdots \\ g_{n1} & \cdots & g_{nk} \end{bmatrix}, \]

where $G_{dense}$ is an $n \times k$ matrix. When the number of lost packets is larger than $r$, the built-in redundancy fails to protect the block of $k$ source packets. Because the generator matrix is dense, received packets do not directly correspond to source packets.

We conclude from the above discussion that video transmission prefers systematic erasure protection to allow effective EC. If a systematic code is not possible, other sparse generator matrices are desired to allow partial decoding of source packets. In the next chapter, we will design a sparse GCM based on the property of the H.264/SVC bitstream.
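The contrast between the two generator matrices can be illustrated with a small sketch. It assumes a prime field GF(257) with random nonzero entries for the parity and dense rows, and the helper names (`systematic_generator`, `dense_generator`, `directly_readable`) are hypothetical. With $r + 1$ erasures, neither code can recover the full block, but the systematic code still delivers plain copies of some source packets that EC can build on.

```python
import random

P = 257  # small prime field standing in for the Galois field (assumption)


def systematic_generator(k, r, rng):
    """[I_k ; P]: k identity rows followed by r random parity rows."""
    identity = [[int(i == j) for j in range(k)] for i in range(k)]
    parity = [[rng.randrange(1, P) for _ in range(k)] for _ in range(r)]
    return identity + parity


def dense_generator(k, r, rng):
    """n = k + r rows whose entries are all random and nonzero."""
    return [[rng.randrange(1, P) for _ in range(k)] for _ in range(k + r)]


def directly_readable(generator, lost_rows):
    """Indices of source packets that arrive as plain copies (unit rows)."""
    readable = set()
    for idx, row in enumerate(generator):
        if idx in lost_rows:
            continue
        nonzero = [j for j, g in enumerate(row) if g]
        if len(nonzero) == 1 and row[nonzero[0]] == 1:
            readable.add(nonzero[0])
    return sorted(readable)


rng = random.Random(0)
k, r = 4, 2
lost = {0, 4, 5}  # r + 1 = 3 erasures, beyond the built-in redundancy
sparse_ok = directly_readable(systematic_generator(k, r, rng), lost)
dense_ok = directly_readable(dense_generator(k, r, rng), lost)
print(sparse_ok)  # [1, 2, 3]
print(dense_ok)   # []
```

In the dense case the three surviving rows form an under-determined system in four unknowns, so no source packet is available without further packets.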
Chapter 3 NC-based Video Transmission with Ladder-Shaped Global Coding Matrix (LGCM)

In this chapter, we first examine a video transmission system based on the network coding (NC) technology in Section 3.1. Then, we analyze a special network coding scheme that is characterized by a ladder-shaped global coding matrix (LGCM) in Section 3.2. Finally, simulation results are presented and discussed in Section 3.3.

3.1 Video Transmission System with Network Coding

3.1.1 System Overview

The proposed H.264/SVC video transmission system with NC in an erasure network is shown in Fig. 3.1. The system consists of three main modules: the source module, the network module and the receiver module. The operations in each module are detailed below.

Figure 3.1: Video transmission with network coding in an erasure network. (Source: SVC encoding with prioritized layers, packetization, NC encoding, optimizer; network: NC cooperating network with packet erasure and delay, average end-to-end packet loss rate p; receiver: buffering with timer, NC decoding, reordering, SVC decoding, error concealment, PSNR evaluation; signals S, G, R.)

- First, at the source node, the H.264/SVC video bitstream is partitioned into priority layers and packetized into packets of equal length. The optimizer gathers the bitstream information, such as the number of packets per GOP and the mean square error (MSE) of each layer, and computes parameters including the number of packets of the source video and the redundancy required by NC. These parameters control the number of packets generated at each layer.

- Random linear network coding (RLNC) is performed at the network module. RLNC is attractive in practical implementation due to its distributed computation property. Each node in the network selects RLNC coefficients and generates RLNC-encoded packets as well as packet headers. The headers consist of several fields, including RLNC coefficients, the bitstream layer information, etc.
According to the packet header information, RLNC is performed at intermediate nodes of the network. By an erasure network, we mean that packets may be lost due to bit errors and/or buffer overflow at intermediate nodes.

- At a receiver node, received packets are buffered under the control of a timer. Then, Gaussian elimination is used for RLNC decoding, which is followed by packet reordering. Video data are decoded with an H.264/SVC decoder equipped with error concealment capability. Finally, video data are played back at each receiver node, and we evaluate the system performance by comparing the quality of the source video bitstream with that of the decoded bitstream.

As shown in Fig. 3.1, $S$ is the source data block, $R$ is the received data block and $G$ is the GCM. Specifically, $S$ is the output of the packetization functional block, which organizes a video bitstream into packets of the same size. We divide source data packets into multiple generations according to their time index. Packets can be linearly combined only when they belong to the same generation. In video transmission, $S$ may be composed of one or several GOPs. For simplicity, we assume one source data block is composed of one GOP in this work. Suppose there are a total of $n$ packets in data block $S$. Each source block lasts $\tau_b$ seconds. The packet rate is $n/\tau_b$ packets per second. $R$ is the output from the network module and the input to the NC decoding functional block in receivers. From the end-to-end point of view, the RLNC process can be described by a global coding matrix (GCM), denoted by $G$. Then, we can relate $R$ and $S$ via

\[ R = G S, \tag{3.1} \]

where $S$ is formed by source data packets of the same generation.

3.1.2 Average Packet Loss Rate

To evaluate the performance of a video transmission system, the end-to-end average packet loss and receiving probabilities are denoted by $p$ and $\rho$, respectively. Clearly, we have $p = 1 - \rho$.
In the traditional store-and-forward network, $\rho$ is simply defined as the probability for a source packet to be received at the receiver. However, since the RLNC process mixes source packets, a newly arriving packet at the receiver may not carry new information. With RLNC, a received packet is called innovative if it cannot be expressed as a linear combination of previously received packets in the same generation. Then, we can define $\rho$ as the probability of receiving an innovative packet at the receiver within a certain time duration after the source sends all source packets of the same generation.

Figure 3.2: Network coding with various amounts of delay. (Source s, intermediate node n_i, receiver R_i, routing cloud; delays d1, d2, d3 and d1', d2', d3'.)

Parameter $\rho$ is affected by three factors.

- Throughput loss of NC: Due to the delay spread and the buffer management of intermediate nodes, there could be some loss in the NC throughput, as discussed by Chou et al. in [9]. As shown in Fig. 3.2, packets experience various delays $d_1, d_2, d_3, \ldots$ from the source to an intermediate node. If we adopt the buffer flushing policy at intermediate nodes, which deletes the old generation of packets when a packet of a new generation arrives, a large delay spread among all paths may lead to a large throughput loss at intermediate nodes, which decreases the value of $\rho$.

- Packet loss over links: Packet loss may occur due to buffer overflow at intermediate nodes, congested traffic, and bit errors in wireless networks. The higher the packet loss over links, the smaller $\rho$ is.

- Time constraint imposed by video applications: As shown in Fig. 3.2, there are many paths along which packets travel from source $S$ to receiver $R_i$ with various end-to-end delays. To play back a video clip continuously, a block of video data has to be received within a certain amount of time. The timer for a block is triggered once the first packet belonging to this block is received.
When the timer expires after, say, $\tau_b$ seconds, the remaining packets of this block are discarded since they arrive too late to be played back in a timely manner.

Although the above three factors introduce packet loss, NC provides a mechanism to reduce packet loss in a rateless way [32], as illustrated by a simple example in Fig. 3.3, where the capacity of each of the two links is $C$ packets per time unit. The packet loss rates are $\varepsilon_1$ and $\varepsilon_2$ for links $A \to B$ and $B \to C$, respectively. Suppose that the source video rate is 1000 packets per time unit at node $A$ with 10% redundant packets (i.e., $1.1\mathrm{K} \le C$). We compare the erasure protection capability of three schemes below.

Figure 3.3: A line erasure network consisting of three nodes. (A to B with loss rate $\varepsilon_1$, B to C with loss rate $\varepsilon_2$.)

- Scheme 1: AL-FEC encoding at node $A$, store-and-forward at node $B$, and decoding at node $C$. The throughput from $A$ to $C$ is $C(1-\varepsilon_1)(1-\varepsilon_2)$.

- Scheme 2: AL-FEC encoding at node $A$, decoding and re-encoding at node $B$. The throughput is $\min\{C(1-\varepsilon_1), C(1-\varepsilon_2)\}$, which is generally larger than the throughput of Scheme 1. This scheme also demands more memory than Scheme 1 since the whole block of AL-FEC coded packets has to be stored for decoding at node $B$. Moreover, the decoding and encoding processes introduce extra delay, which is basically the time required to receive the whole block of packets.

- Scheme 3: RLNC encoding is performed at node $A$ and then at node $B$. Whenever there is an opportunity to send a new packet on the outgoing link, node $B$ performs RLNC on the received packets of the same block to generate a new packet. The throughput is the same as that of Scheme 2. As to the memory requirement, only a small number of packets of a block has to be stored. Besides, there is no delay in RLNC encoding/decoding. Thus, RLNC outperforms the other two schemes in terms of erasure protection and coding delay.
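The throughput expressions of the three schemes can be compared numerically with a short sketch. The function name `line_network_throughput` and the sample values ($C = 1100$, $\varepsilon_1 = 0.1$, $\varepsilon_2 = 0.2$) are assumptions chosen for illustration only and are not results from this work.

```python
def line_network_throughput(capacity, eps1, eps2):
    """Throughput from A to C over the two-hop line network of Fig. 3.3.

    capacity: link capacity C in packets per time unit (same on both links)
    eps1, eps2: packet loss rates on links A -> B and B -> C
    """
    # Scheme 1: losses on the two hops compound at a store-and-forward relay.
    store_forward = capacity * (1 - eps1) * (1 - eps2)
    # Schemes 2 and 3: the relay regenerates redundancy, so only the
    # weaker of the two links limits the end-to-end throughput.
    bottleneck = min(capacity * (1 - eps1), capacity * (1 - eps2))
    return {
        "scheme1_store_forward": store_forward,
        "scheme2_decode_reencode": bottleneck,
        "scheme3_rlnc": bottleneck,
    }


result = line_network_throughput(capacity=1100, eps1=0.1, eps2=0.2)
for name, value in result.items():
    print(f"{name}: {value:.1f} packets per time unit")
```

For these sample values the relay-coded schemes are limited by the lossier second hop (about 880 packets per time unit), while store-and-forward loses a further 10% (about 792). Schemes 2 and 3 differ only in the memory and delay they need at node B.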
Generally speaking, the packet loss probability with NC is smaller than that without NC since NC achieves a larger capacity, as proved in [5], and provides better erasure protection, as proved in [32,33]. If the GCM does not have full rank, every node can in principle generate an unlimited number of coded packets, and a receive node may wait longer for more packets until the GCM becomes a full-rank matrix. However, in practice, the number of received packets is limited by the time constraint imposed by video playback.

3.1.3 Problems of NC-based Video Transmission

Although RLNC can achieve the optimal multicast capacity with a successful decoding probability close to one [16], the decoding requirement of a full-rank GCM makes NC-based video transmission in erasure networks a nontrivial task. The RLNC decoding process may face an under-determined linear system when packet erasure happens. In this subsection, we analyze and compare the performance of different transmission methods. To evaluate the effect of different GCM types on video transmission, we study the average number of received packets ($n$) per GOP.

Figure 3.4: Three different GCM types. (Panels (a), (b) and (c).)

The following three GCM types are compared.

- Diagonal GCM: This corresponds to the traditional store-and-forward network without NC. If a packet is received, the diagonal element of the corresponding row is one. Otherwise, the corresponding element of the GCM row is zero. A lost packet can be concealed by the video decoder with EC. The average number of received packets is

\[ n_d = (1 - p) N. \tag{3.2} \]

- General dense block GCM: This is a consequence of the RLNC process at multiple nodes of a network. The cascade of RLNC processes across multiple nodes will in general lead to a dense block GCM. In a synchronous network, all packets belonging to the same data block are received and buffered before an NC-encoded packet is generated for a downstream link. The buffer size is the data block size on each node.
The encoding delay is the delay of receiving a whole data block, and the decoding process at the receivers cannot start until enough packets have arrived. If the GCM does not reach full rank, the whole GOP is treated as lost. The probability of receiving all packets in one block is $(1-p)^N$. The average number of received packets is

$$n_b = N(1-p)^N. \tag{3.3}$$

Low-triangular GCM [58]
This corresponds to the asynchronous arrival of packets at intermediate nodes. The strict lower triangular GCM has several advantages. If the delivery system is lossless, every source packet can be decoded as long as one innovative packet is received. If the $i$th NC packet is lost, we can determine source packets 1 to $(i-1)$ but cannot go beyond that since one equation is missing. Thus, the source data block is partially decoded up to the first lost NC packet. The number of decodable packets depends on the row position of the lost NC packet. The loss of higher rows leads to more loss. The expected number of received packets is

$$n_l = \sum_{k=1}^{N} (1-p)^k \, p^{N-k} \, k. \tag{3.4}$$

Decoded video quality with EC decreases as $n$ becomes smaller. Among the three numbers $n_d$, $n_b$ and $n_l$, $n_b$ is the smallest. Thus, the dense block GCM yields the worst video quality.

Next, we analyze the delay before each GOP can start to be decoded. Under the assumption of constant bit rate (CBR) transmission, with the interval between adjacent packets denoted by $\Delta$, the delays for the diagonal GCM and the low-triangular GCM are both equal to

$$d_d = d_l = \Delta \sum_{k=1}^{N} p^k (N-k).$$

In contrast, we have $d_b \ge N\Delta$, where equality holds if there is no packet loss. Clearly, $d_b$ is larger than $d_l$ and $d_d$. In conclusion, the general dense block GCM is not suitable for video transmission due to its high average packet loss and longer delay. The low-triangular GCM is a more attractive choice due to its lower packet loss and shorter delay.

3.1.4 H.264/SVC Scalable Layers

H.264/SVC encodes video into three scalable layers [46].
For temporal scalability, the hierarchical B prediction structure is adopted. A GOP consists of a key picture ($T_0$ in Fig. 2.5) and the pictures between two key pictures. Key pictures can be I or P pictures. Bidirectionally predicted pictures are inserted between key pictures. For quality scalability, the base layer is encoded with larger quantization parameters while enhancement layers are either inter-layer predicted from the base layer or temporally predicted from neighboring frames. The prediction residual is encoded with smaller quantization parameters. Spatial scalability is similar to quality scalability yet with different spatial resolutions. We focus on temporal and quality scalability here.

The H.264/SVC bitstream has $Q$ quality layers and each quality layer has $T$ temporal layers. The quality of received video is denoted by $q_i$ if all packets within quality layer $i$ are received. If one or more packets are lost in layer $i$ while all packets in layer $i-1$ are received, the quality is between $q_{i-1}$ and $q_i$. Due to the inter-layer dependence, there are two types of priorities. The first type results from different quality layers. Lower quality layers are more important than enhancement quality layers. For example, the corruption of a frame in the base layer impacts not only neighboring frames in the same quality layer but also the enhancement layers. The second type results from different temporal layers within the same quality layer. Bidirectional prediction depends on reference frames in the hierarchical B structure. Packet loss in lower temporal layers causes more error propagation. The impact of packet loss is also correlated with the intra period since the I frame can be used to reduce error propagation, as discussed in [22]. In this work, we focus on the hierarchical B structure.
Figure 3.5: Quality degradation due to frame dropping in temporal and quality layers for the test Foreman sequence. (x-axis: frame dropping index $i$; y-axis: delta PSNR (dB); curves: enhancement layer and base layer.)

To quantify the impact of packet loss, we measure the quality degradation of the reconstructed video with loss relative to that without loss. We define the quality degradation as

$$D = \sum_{i=1}^{F} d_i,$$

where $d_i$ is the quality degradation of frame $i$, and $F$ is the measurement window, which indicates the number of neighboring frames used to evaluate $D$. $F$ is typically set to the intra frame period since the refresh by intra frames erases the degradation. Then, the priority decreases in the order of $T_0$, $T_1$, $T_2$, $T_3$, as shown in Fig. 2.5.

We use the Foreman sequence with a GOP size of eight ($G = 8$) and two quality layers ($Q = 2$) as an example. The intra period is 8 frames. The quality degradation results due to frame dropping are shown in Fig. 3.5, where the x-axis is the index of the dropped frame in a GOP, re-ordered according to temporal layers $T_3$, $T_2$, $T_1$, $T_0$. For each dropped frame, the quality degradation is shown on the y-axis. We observe from this figure that the quality degradation due to frame loss in a higher temporal layer (say $T_3$) is small, so such a layer is assigned a lower priority. Moreover, the priorities of the two quality layers are indicated by the dashed and the solid curves in Fig. 3.5, which correspond to the quality degradation due to frame dropping in the base quality layer and the enhancement quality layer, respectively. It is clear that packet loss in the base layer introduces more quality degradation than that in the enhancement layer.

3.2 Network Coding with Ladder-Shaped Global Coefficient Matrix (LGCM)

To reduce the impact of GCM rank deficiency on video transmission, we propose an NC scheme with a ladder-shaped GCM (LGCM) in this section. We first present the LGCM and then discuss a way to maintain LGCM throughout the network.
3.2.1 Structure of Ladder-shaped GCM (LGCM)

A ladder-shaped GCM (LGCM) consists of submatrices $M_1$, $M_2$, $M_3$, etc., as shown in Fig. 3.6, where submatrix $M_i$ corresponds to a block of video data for NC encoding. The white area of the LGCM is filled with zeros. Submatrix $M_i$ consists of two parts. The first part has $k_i$ rows, which form a lower triangular matrix, and the second part has $r_i$ rows. The first part is used for unequal protection while the second part is used to create redundant NC packets. Thus, for every layer of source data with $k_i$ packets, the source node generates $k_i + r_i$ packets. By adjusting $r_i$, we can control the ratio $r_i/k_i$ according to the layer importance. As long as $k_i$ linearly independent packets are received, source layer $i$ is recoverable. We provide an example of LGCM in Table 3.1, where $r_1/k_1 = 10\%$ and $r_2/k_2 = 5\%$. It means that video packets in layer 1 are more important than those in layer 2. In the source data block, $K = \sum_{i=1}^{Q} k_i$ is the total number of packets while $L$ is the length of a packet. A special case of LGCM is the block diagonal GCM (BDGCM), which is also shown in Fig. 3.6.

Figure 3.6: Encoding a layered source data block with LGCM. (The figure shows a block diagonal GCM or a ladder GCM, built from submatrices $M_1$, $M_2$, $M_3$ with $k_i$ and $r_i$ rows each, multiplying a $K \times L$ source data block $S$ with prioritized layers. Legend: NC generation sub-matrix for source layer 1, for source layer 2, for source layer 3 and for redundancy; zero filling.)

Table 3.1: An example of LGCM.

  k_1    k_2    r_1    r_2    r_1/k_1    r_2/k_2
  50     40     5      2      0.1        0.05

The probability of receiving a block of data is analyzed for H.264/SVC prioritized bitstream transmission. First, we consider the case of BDGCM.
The probability of receiving layer $i$ is

$$P_{i,b} = \prod_{j=1}^{i} a_j \prod_{j=i+1}^{Q} (1-a_j),$$

where subscript $b$ stands for BDGCM, $a_i = B(n_i, k_i, p)$ with $n_i = k_i + r_i$, and $B(n,k,p)$ is the cumulative Binomial distribution function defined by

$$B(n,k,p) = \sum_{j=k}^{n} \binom{n}{j} (1-p)^j p^{n-j}.$$

Next, we consider the case of LGCM. It is straightforward to find that

$$P_{i,l} = c_i \prod_{j=i+1}^{Q} (1-a_j),$$

where subscript $l$ stands for the ladder GCM, and $c_i$ is the probability that all packets from layers 1 to $i$ (a total of $\sum_{j=1}^{i} k_j$ packets) are decodable. Then, we have

$$c_i = \begin{cases} a_1, & i = 1, \\ c_{i-1}\, a_i + b_i, & i > 1, \end{cases} \tag{3.5}$$

where $b_i$ is a parameter called the incremental factor of LGCM. The meaning of $b_i$ is given below. Since redundancy in layer $i$ can protect against erasure in layer $i-1$, $b_i$ is defined as the probability of successful decoding at layer $i$ but not at layer $i-1$. Then, we have

$$b_i = \begin{cases} 0, & i = 1, \\ \sum_{m=1}^{r_i} \beta(k_{i-1}+r_{i-1},\, k_{i-1}-m,\, p)\, B(k_i+r_i,\, k_i+m,\, p), & i > 1, \end{cases} \tag{3.6}$$

where $\beta$ is the probability mass function of the Binomial distribution, of the form

$$\beta(n,k,p) = \binom{n}{k} (1-p)^k p^{n-k}.$$

For example, when $Q = 2$, $P_1 = a_1(1-a_2)$ and $P_2 = a_1 a_2 + b_2$. As compared with BDGCM, the decoding probability of LGCM is enhanced by the factor $b_i$. Thus, the general LGCM is chosen for NC in our video transmission system.

3.2.2 NC Implementation for LGCM

We discuss the detailed NC implementation to obtain LGCM in this subsection.

First, we examine the header information of an NC packet. It consists of the NC generation number (GenerationNum), the index of the priority layer (LayerNum), the number of packets in this layer (PktNum) and the global coding vector (GlobVec), as shown in Fig. 3.7. The headers are overhead in video transmission. The amount of header information can be estimated as follows. Two bytes are sufficient for the GenerationNum field to transmit more than 9 hours of video using a 1 Mbps bitstream.
Figure 3.7: The structure of an RLNC packet. (Labels: W (packet length), N (number of packets); packet header fields GenerationNum, LayerNum, PktNum and GlobVec, followed by the payload.)

LayerNum indicates which video layer (in terms of spatial, temporal and quality layers) the packet belongs to. Usually, only several bits are needed. PktNum indicates the number of packets in each layer. Usually, 1 byte is sufficient. The length of the global coding vector field depends on the GOP size. Suppose the bit rate of a bitstream is 1 Mbps and the packet size is 1400 bytes. A GOP of duration less than half a second contains fewer than 50 packets. If the finite field size of RLNC is $2^8$, the length of the GlobVec field is no more than 50 bytes, which is about 3.2% of the video bitstream. Thus, the proposed scheme requires a small amount of overhead.

LGCM can be maintained by marking and tracing the stage number at all intermediate nodes that perform NC. Stage numbers are marked at intermediate nodes according to the stage numbers of the packets that join the NC linear combination. Suppose all packets stored in the node buffer are used to generate new packets. If these input packets belong to stages $1, \ldots, i$, the generated packet is labeled with stage $i$.

Figure 3.8: NC encoding at intermediate nodes. (Packets arrive at an intermediate node at times t1, t2, t3 and t4; the outgoing coded packets are $\gamma_1 P_1$, $\gamma_1 P_1 + \gamma_2 P_2$, $\sum_{i=1}^{4} \gamma_i P_i$, up to $\sum_{i=1}^{M} \gamma_i P_i$; the receiver obtains a lower triangular system of global coefficients.)

The number of packets generated at every RLNC stage depends on the capacities of the incoming and outgoing links, denoted by $c_{in}$ and $c_{out}$, respectively. To describe how the ladder shape is maintained, we define the expanding factor as $e = c_{in}/c_{out}$. When $e = 1$, the shape of the encoded block is the same as that of the received data block. When $e > 1$ or $e < 1$, it expands or shrinks but keeps the ladder shape.
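The header estimates given earlier in this subsection (GenerationNum range and GlobVec overhead) can be reproduced with a few lines of arithmetic. This is a rough sketch under stated assumptions: one generation per GOP, a GOP duration of exactly 0.5 seconds, one byte per coefficient for a field of size $2^8$, and overhead measured against the 1400-byte packet size. The function names are hypothetical.

```python
import math


def packets_per_gop(bit_rate_bps, gop_seconds, packet_bytes):
    """Number of packets needed to carry one GOP at the given bit rate."""
    return math.ceil(bit_rate_bps * gop_seconds / 8 / packet_bytes)


def globvec_overhead(num_packets, packet_bytes, coeff_bytes=1):
    """Fraction of a packet spent on the global coding vector,
    assuming one coefficient per packet of the generation."""
    return num_packets * coeff_bytes / packet_bytes


def generation_counter_hours(counter_bytes, gop_seconds):
    """Hours of video addressable before GenerationNum wraps around,
    assuming one generation per GOP."""
    return (2 ** (8 * counter_bytes)) * gop_seconds / 3600.0


if __name__ == "__main__":
    n = packets_per_gop(1_000_000, 0.5, 1400)
    print("packets per GOP:", n)
    print("GlobVec overhead:", globvec_overhead(n, 1400))
    print("hours covered by 2 bytes:", generation_counter_hours(2, 0.5))
```

Under these assumptions a half-second GOP needs 45 packets, which matches the overhead of about 3.2% quoted above, and a two-byte counter covers slightly more than 9 hours.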
The lower triangular sub-matrix of LGCM is constructed by taking advantage of asynchronous packet transmission. For example, consider an intermediate node where packet $x_1$ arrives at $t_1$, $x_2$ arrives at $t_2$, and $\{x_3, x_4\}$ arrive at $t_3$, where $t_1 < t_2 < t_3$, as shown in Fig. 3.8. Between $t_1$ and $t_2$, if there is an opportunity to send a packet through an outgoing link, the generated packet can be expressed as $a_1 x_1$ since there is only one packet in the buffer. Then, the NC global vector has only one nonzero coefficient, $a_1$, while the other coefficients are equal to zero. Between $t_2$ and $t_3$, the NC-generated packet is of the form $a_2 x_1 + a_3 x_2$. As more packets are buffered at this node, the global vectors of the generated outgoing packets have more coefficients. At receiver nodes, the received global vectors form a lower triangular matrix. With Gaussian elimination, packet $P_0$ can be recovered even if only $P_0^R$ is received.

For LGCM, most packets can be recovered without waiting for all packets of the block to arrive. The order of recovered packets is the same as the order of transmission from the source node. A non-strict lower triangular GCM can be maintained as long as every node generates NC packets based on the two principles described above.

3.2.3 Mapping Quality Layers to LGCM

We can map different quality layers in H.264/SVC to stages of LGCM. This provides block decomposition for efficient NC decoding as well as unequal protection when unequal redundancy is assigned according to the layer priority. If we use $D$ to quantify the quality degradation due to the loss of a layer, we can define the mapping problem as the following optimization problem:

$$\min \sum_{i=1}^{Q} D_i P_i, \tag{3.7}$$

subject to

$$\sum_{i=1}^{Q} r_i \le R. \tag{3.8}$$

The constraint given in Eq. (3.8) can be expressed as

$$R = \rho \sum_{i=1}^{Q} k_i, \quad \text{where} \quad \rho = \sum_{i=1}^{Q} r_i \Big/ \sum_{i=1}^{Q} k_i.$$

Quality degradation is evaluated by dropping all packets of the same quality layer within a GOP.
For a given redundancy ratio $\rho$ and average packet loss rate $p$, we need to find the optimal value of $\{r_1, r_2, \ldots, r_Q\}$ that solves the optimization problem in (3.7). This problem is usually solved by a gradient method. Due to the limited number of packets in each GOP and the small value of $Q$ in H.264/SVC, it is also possible to obtain the optimal solution by full search.

3.2.4 Mapping Temporal Layers to LGCM

We can also map the temporal layers of H.264/SVC to the lower triangular sub-matrices of LGCM. This provides subblock decomposition for efficient NC decoding. Moreover, it provides the second type of unequal protection. The property of the lower triangular sub-matrix is studied below. The probability of losing a row vector is equal to

$$P_l(n) = \begin{cases} p, & n = 1, \\ P_l(n-1) + p(1-p)^{n-1}, & K \ge n > 1, \end{cases} \tag{3.9}$$

where $K$ is the number of packets in a temporal layer. This probability reveals the importance of each row in the lower triangular submatrix. Since $P_l(n)$ is a monotonically increasing function, it has the same behavior as the quality degradation shown in Fig. 3.5. Frames are arranged in the order of $T_0, T_1, T_2, \ldots$ and mapped to the lower triangular area of the LGCM. Unequal protection of temporal layers is achieved by exploiting the different importance of rows in a lower triangular matrix.

3.3 Simulation Results

We use the H.264/SVC reference software JSVM 7.9 to generate the video bitstream. The CIF Foreman bitstream is encoded at a rate of 30 frames per second. The GOP size is 8, and the intra refresh period is 8. Two quality layers are encoded. One is the base layer and the other is the enhancement layer. The QP for the base layer is 40, the bit rate is 170.63 kbps, and the average PSNR is 31.26 dB. The QP for the enhancement layer is 32. The total bit rate is 588.81 kbps, and the PSNR of the two-layer video is 36.14 dB. The BLSkip method [19] is adopted as an EC tool at the H.264/SVC decoder.
When lost packets belong to the enhancement layer only, BLSkip is effective in compensating for lost packets. Those lost packets are compensated by motion vectors and residual upsampling of the base layer. When most lost packets belong to the base layer, the lost frames are compensated by frame copy from previously decoded frames instead of the BLSkip mode.

We use ns2 to simulate the network environment. The finite field size for RLNC is $2^8$. The packet size is set to 1400 bytes. To evaluate the overall performance of the video transmission system, the simulation is conducted multiple times to obtain statistically meaningful results. Since we are concerned with the impact of GCM rank deficiency on video quality in an RLNC-based multicast network, a simplified simulation scenario is adopted. That is, video data are transmitted using RLNC from the source node to a network cloud, and packet erasure is implemented only at the last hop of the network, as shown in Fig. 3.9. This implementation corresponds to the following two practical scenarios. First, a client receives video data via wireless transmission from the base station, as shown in Fig. 3.9 (a), where packet erasure is due to transmission over the wireless channel. Second, a client accesses the Internet with a modem (ADSL or cable), as shown in Fig. 3.9 (b). If a client has a video streaming session and other download sessions simultaneously, the access hop is likely to be the bottleneck and packets could be dropped in the buffer of that link.

Figure 3.9: Two scenarios for the last-hop erasure: (a) access via a wireless communication link and (b) access via a wired broadband modem.

3.3.1 Performance Comparison between Different GCMs

To evaluate the performance of various GCMs on video transmission, we first compare the results of three GCMs (namely, diagonal GCM, LGCM and general dense block GCM) under the same average packet loss rate $p$ without redundant protection.
The results are shown in Fig. 3.10. The PSNR values of the first two methods are close while the general dense block GCM results in very poor PSNR. Reconstructed frames of no-NC (i.e., the traditional store-and-forward policy) and NC with dense block GCM under a packet loss rate of 4% are shown in Fig. 3.11. We see that NC with dense GCM is more sensitive to packet loss than the traditional store-and-forward network.

Figure 3.10: Performance comparison of three GCM types. (x-axis: average packet loss rate (%); y-axis: average PSNR (dB); curves: non-coding, non-R; NC-ladder, non-R; NC-dense block, non-R.)

Figure 3.11: Comparison of reconstructed frames with (a) the traditional store-and-forward policy (30.31 dB) and (b) NC with dense block GCM (26.55 dB), under a packet loss rate of p = 4%. No redundant packets are used in the frame reconstruction.

Next, we compare the dense block GCM and LGCM with the protection of redundant packets. The results are shown in Fig. 3.12. For fairness, the ratio of redundant packets is the same for both methods under the same packet loss rate $p$. For the dense block GCM method, redundant packets are generated to protect the whole block/GOP. For the LGCM method, the number of redundant packets is assigned optimally to each layer. When we simulate the LGCM method, the optimal redundancy assignment outperforms the static redundancy assignment with $r_1 = 0$ and $r_2 = R$. Actually, it yields the best performance among all four methods under comparison. The dense block GCM yields the lowest PSNR performance. Reconstructed frames of the dense block GCM and the LGCM are shown in Fig. 3.13 for visual comparison at a packet loss rate of 4%.
Figure 3.12: Comparison of the PSNR value for four different NC schemes as a function of the packet loss rate. (x-axis: packet loss rate (%); y-axis: PSNR (dB); curves: optimized r1, r2; r1 = 0, r2 = R; optimized r1, r2 with lower triangular; dense block NC with redundancy.)

Figure 3.13: Comparison of reconstructed frames using (a) the dense block GCM (33.47 dB) and (b) the LGCM (36.14 dB), with the ratio of redundancy $\rho = 4\%$ and the packet loss rate $p = 4\%$.

3.3.2 Optimal Redundancy Assignment

Fig. 3.14 shows the optimal percentage of redundancy assigned to each layer as a function of the packet loss rate. When the packet loss rate is low, most redundancy is assigned to the enhancement layer. This can be explained by the fact that a smaller packet loss rate leads to a smaller probability of losing important packets. Even if the base layer is not protected by redundant packets, redundancy assigned to the enhancement layer can still help the base layer, as discussed in Section 3.2.1. When the packet loss rate is higher, more redundancy is assigned to the base layer. Important packets of the base layer are received with a higher probability, at the cost of reduced protection of the enhancement layer, to achieve graceful quality degradation.

Figure 3.14: Optimal redundancy assignment to the base layer and the enhancement layer as a function of the packet loss rate. (x-axis: packet loss rate (%); y-axis: allocated redundancy (%); curves: r1, r2.)

Finally, we show the PSNR of each frame with the dense block GCM and the LGCM with optimal redundancy assignment in Fig. 3.15, where the ratio of redundancy is $\rho = 0.04$ and the packet loss rate is $p = 4\%$. When there is burst loss in GOPs (from frame 33 to 49), the pre-allocated redundancy may not be sufficient to protect all packets in those GOPs. For LGCM, most lost packets belong to either the enhancement layer or the less important frames of the base layer.
These lost packets can be concealed by BLSkip, which provides graceful quality degradation. For the dense block GCM, however, the whole GOP is lost. Those frames can only be obtained by the frame copy method from the previous key picture. Thus, its quality degradation is more severe. There is an exception between frames 49 and 56, where the LGCM has a slightly lower PSNR value than the dense block GCM. Lost packets belong to the quality enhancement layer in that GOP. Since most redundancy is assigned to the base layer, LGCM cannot receive as many enhancement packets as the dense block GCM. However, LGCM guarantees the quality of the base layer over all frames. Thus, its performance is more robust.

Figure 3.15: The PSNR value comparison between dense block GCM and LGCM. (x-axis: GOP frame number; y-axis: PSNR (dB); curves: NC-dense, loss rate = 0.04; NC-ladder, loss rate = 0.04.)

3.3.3 Congestion in Bottleneck Link

To evaluate the performance in the presence of congestion in the bottleneck link, we considered the configuration in Fig. 3.9 (b), where a client accessed the Internet via a link with a capacity of 3 Mbps. The packet loss was due to congestion of the access network (or the last hop). The Mobile test sequence was transmitted at a rate of 1.3 Mbps, and the same 300-frame sequence was repeated to form a long sequence. There was another competing UDP CBR session with a rate of 2.1 Mbps. Both downlink flows competed for the bottleneck link of 3 Mbps between 50 and 300 seconds, and some packets in these two flows were dropped due to buffer overflow. The throughputs of these two sessions are plotted in Fig. 3.16. The throughput of the video session was only 1.15 Mbps between 50 and 300 seconds due to the competing flows. After that, the video transmission session returned to the normal level with a throughput of 1.3 Mbps. We compare the video quality of RLNC with the general GCM and LGCM in Fig. 3.17.
The proposed LGCM results in better video quality than the general GCM during traffic congestion.

Figure 3.16: Throughput of a video session as a function of time, where congestion occurs between 50 and 300 seconds. (x-axis: time (sec); y-axis: throughput (bps); curves: video session, competing CBR session.)

Figure 3.17: Comparison of reconstructed frames using (a) the general GCM (24.21 dB) and (b) LGCM (29.32 dB), with the ratio of redundancy $\rho = 10\%$, in the presence of congestion in the bottleneck link.

3.4 Conclusion

Efficient H.264/SVC video multicasting using NC with various GCM methods was investigated in this work. Although NC cannot generate systematic erasure codes, we showed that a layered H.264/SVC video bitstream can be efficiently delivered with RLNC that has a ladder GCM. It was demonstrated by computer simulation that the proposed LGCM scheme outperforms the dense block GCM significantly.

Chapter 4
NC-based Video Transmission with Interleaving

In Chapter 3, LGCM is maintained at all intermediate nodes of the network so that a partially received data block can still be decoded to allow graceful video quality degradation in erasure networks. In this chapter, we study an interleaving scheme which does not require LGCM in the network. Instead, it considers the scheduling at the source node and performs general RLNC at intermediate nodes of the network. The proposed interleaving scheme allows NC to cooperate with error concealment (EC) effectively. Moreover, it improves the unequal erasure protection capability of H.264/SVC priority layers.

We first propose an interleaving scheme for NC, compare it with general interleaving schemes, and discuss its pros and cons with respect to LGCM in Section 4.1. Then, we examine the EC technique together with the interleaving scheme in Section 4.2. Unequal erasure protection in the presence of the interleaving scheme is discussed in Section 4.3.
We formulate the problem of optimal interleaving design and propose an algorithm to solve it in Section 4.4. Simulation results are given in Section 4.5. Finally, concluding remarks are given in Section 4.6.

4.1 Interleaving Scheme

The interleaving scheme at the bit level is usually integrated with error control codes to reduce the damage of burst bit errors. At the application level, we can adopt interleaving to increase the robustness against packet erasure, which is called the packet-level interleaving scheme, as proposed by Liang et al. in [26]. In the following, we discuss these two interleaving schemes and compare them with the proposed interleaving scheme for NC. Then, the benefits of interleaving for video transmission using NC are described.

4.1.1 Traditional Interleaving Schemes

By interleaving, we rearrange the transmission order of a data stream at the source node. The purpose is to break a long burst error into shorter ones, which facilitates the detection and correction of burst errors. The bit-level and packet-level interleaving schemes are illustrated in Figs. 4.1 and 4.2, respectively. They are explained below.

In both schemes, the source data is protected by FEC vertically. There are $K$ source bits and $R$ redundant bits per packet, as represented by a column in these two figures. Redundancy is introduced by FEC codes. One exemplary FEC code is the Reed-Solomon code, which is a maximum distance separable (MDS) code. The $(K+R, K)$ Reed-Solomon code can correct up to $R$ erasures. However, its complexity is high. To reduce the computational complexity, another type of FEC, known as low-density parity-check (LDPC) codes, with the Raptor code [47] as an example, can be used to generate redundancy. It behaves like MDS codes asymptotically. Since its generator matrix is sparse, it only requires linear encoding and decoding complexity. Horizontally, the length of a row, $L$, is the interleaving depth, which is related to the interleaving degree.
The value of $L$ is the number of bits or GOPs in the bit-level or packet-level interleaving scheme, respectively, in the context of video transmission.

Figure 4.1: Interleaving at the bit level. (Labels: K, L, R; transmission order; FEC encoding order. Legend: source bits, redundant bits, lost bits.)

The interleaving scheme increases the erasure protection capability mostly because of $L$, which determines how far a burst erasure can be separated. The larger the value of $L$, the less redundancy is required for the FEC codes. For example, to correct the bit errors shown in Fig. 4.1, the FEC codes must correct at least 2 bit erasures. However, for a larger value of $L$, lost bits may not overlap vertically at all. Then, the FEC codes only need to correct 1 bit erasure. For the $(K+R, K)$ Reed-Solomon code, the interleaving scheme allows at most $R$ erasures.

Figure 4.2: Interleaving at the packet level. (Labels: GOP0, GOP1, ..., GOPL-1; K, R, L; transmission order; FEC encoding order. Legend: source packets, redundant packets, lost packets.)

Figure 4.3: Interleaving and unequal protection for NC-based video transmission: (a) original video bitstream (GOP1 to GOP5), (b) interleaved blocks for network coding, and (c) unequal redundant assignment.

4.1.2 NC-oriented Interleaving Scheme

We propose an interleaving scheme for NC by exploiting the property that interleaving breaks one long burst erasure into short ones. However, it is different from the general interleaving schemes described above in three aspects: 1) the process of encoding and decoding, 2) the purpose of protection, and 3) the method of protection. The main idea is shown in Fig. 4.3.

At the source node, interleaving is performed in the packetization step of the proposed video transmission system in Fig. 3.1. Each GOP $S$ is decomposed into $Q$ partitions by priority, as shown in Fig. 4.3 (a). Suppose that the interleaving length is $L$. We group partitions of the same priority from $L$ GOPs into a data block, which is called a generation.
Thus, the number of generations at each round of interleaving is equal to $Q$. The optimizer generates the control information, including $L$ and the redundancy $r_i$, $i \in [1, \ldots, Q]$, for these generations. The value of $r_i$ controls the number of redundant packets generated for the $i$th generation of data. Since all packets in one generation are of the same importance, the interleaving scheme does not need to differentiate the priority of packets for NC in the same generation. Thus, the packet header does not require the "LayerNum" field shown in Fig. 3.7. With this interleaving scheme, the RLNC process can be performed at intermediate nodes of the network. As compared with the work in Chapter 3, the NC operation is greatly simplified since we do not have to maintain the shape of LGCM. The de-interleaving process is performed after the NC decoding step at the receiver in Fig. 3.1. The receiver buffers at most $L$ GOPs (i.e., $Q$ generations), and the data blocks recovered by NC decoding are reordered into the original GOP sequence for video playback.

The bit-level, packet-level and NC-oriented interleaving schemes have different objectives in data protection. The bit- and packet-level interleaving schemes protect data against burst erasure. For instance, fading or interference in a wireless channel leads to burst bit errors or erasures. Traffic congestion leads to buffer overflow on a bottleneck link and to burst packet loss. However, when the interleaving scheme is applied in the NC context, it protects against the insufficient rank of the GCM for RLNC decoding.

Video data are protected by FEC codes with bit- and packet-level interleaving schemes, and interleaving is performed after FEC coding. In contrast, in the proposed NC-oriented interleaving scheme, the data to be interleaved are packets of the source bitstream without any FEC protection, while the protection is achieved by redundant packets generated by the RLNC process.
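The grouping of same-priority partitions into generations, and the reverse reordering at the receiver, can be sketched as follows. This is only an illustration of the data rearrangement described in this subsection: the nested-list representation, the function names and the requirement that every GOP has the same number of partitions are assumptions, and RLNC encoding, redundancy assignment and packet headers are left out.

```python
def interleave(gops):
    """Group partitions of equal priority from L GOPs into Q generations.

    gops: list of L GOPs; each GOP is a list of Q partitions ordered by
    priority; each partition is a list of packets.
    Returns a list of Q generations; generation q holds partition q of
    every GOP, kept in GOP order so the receiver can undo the mapping.
    """
    num_layers = len(gops[0])
    if any(len(gop) != num_layers for gop in gops):
        raise ValueError("every GOP must have the same number of partitions")
    return [[gop[q] for gop in gops] for q in range(num_layers)]


def deinterleave(generations):
    """Rebuild the original GOP sequence from decoded generations."""
    num_gops = len(generations[0])
    return [[gen[g] for gen in generations] for g in range(num_gops)]


if __name__ == "__main__":
    # Two GOPs (L = 2), two priority partitions each (Q = 2).
    gops = [
        [["g0-base-0", "g0-base-1"], ["g0-enh-0"]],
        [["g1-base-0"], ["g1-enh-0", "g1-enh-1"]],
    ]
    generations = interleave(gops)
    print(generations[0])  # all base-layer partitions, one per GOP
    print(deinterleave(generations) == gops)
```

Keeping the per-GOP boundaries inside each generation is a design choice of this sketch; it makes de-interleaving a pure index swap once all $Q$ generations of a round have been decoded.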
4.1.3 Comparison between NC-oriented Interleaving and LGCM

Video data protection by NC-oriented interleaving and by LGCM differs in the following four aspects.

Contents of a generation
The content of a generation in LGCM is one or several GOPs. A GOP always belongs to a single generation. However, the content of a packet in a generation with NC-oriented interleaving comes from multiple GOPs, and the content of a GOP is distributed over many generations.

Resource requirement
NC-oriented interleaving requires more buffer memory at the source node and the receiver node since a GOP is distributed among many generations. As a result, it introduces a longer delay in RLNC decoding. If $n$ GOPs are interleaved together, its resource requirement is $n$ times that of the LGCM scheme.

Redundancy utilization
The erasure protection mechanisms of these two schemes are different. LGCM obtains redundancy to protect priority layers by generating more NC packets. The amount of redundancy is optimized with respect to data in the same generation. NC-oriented interleaving assigns redundancy to protect generations. The amount of redundancy is optimized with respect to a group of $n$ GOPs. Moreover, interleaving is more effective in providing unequal erasure protection.

Figure 4.4: Numerical comparison of the receiving probability of two schemes. (x-axis: loss rate; y-axis: probability; curves: ladder and interleaving for g = 8, 6, 4 and 2.)

Overhead of RLNC
The overhead of the packet header is different for the two schemes. Suppose that there are on average 45 packets per GOP and three partitions per GOP in the interleaving scheme. The maximum interleaving length is 8 GOPs. If each partition has 15 packets, there are at most 120 packets in every generation. Then, 120 bytes are required for the GlobVec field in Fig. 3.7 to carry all the coefficients.
The overhead in carrying RLNC coefficients is 2.3 times larger than that of the ladder GCM method.

The probabilities of recovering the $i$th layer in $g$ GOPs for LGCM and interleaving are denoted by $P^L$ and $P^I$, respectively. Suppose that there are $k_i$ packets in the $i$th layer and that $r_i$ packets are used as redundancy in LGCM. Correspondingly, redundancy $r_i g$ is assigned to the $i$th generation in one interleaving range. Then, it is straightforward to obtain

\[ P^L_i = B_L(k_i + r_i, k_i, p)^g, \tag{4.1} \]
\[ P^I_i = B_I(k_i g + r_i g, k_i g, p). \tag{4.2} \]

The probability curves with $k_i = 15$ and $r_i = 5$ are shown in Fig. 4.4. When the interleaving length $g$ is large, the unequal erasure protection of interleaving outperforms that of LGCM. When $g$ is small, the performance of the two schemes is close. The efficiency of the interleaving scheme is achieved at the cost of a larger buffer size and a longer delay.

4.2 Joint Error Concealment and Interleaving

When NC-oriented interleaving is applied to video transmission, error concealment (EC) can work effectively since it is easier to locate erased packets in the video decoder. Before presenting the advantages of joint interleaving and EC, we first review the EC mechanism briefly.

EC works effectively when lost packets are spatially or temporally correlated with received packets. The degree of correlation between lost and received packets depends on the packetization scheme. The flexible macroblock ordering (FMO) technique, which splits a frame into several independently decodable slices, was developed to take advantage of spatial correlation. In particular, consider the checkerboard type of FMO. When one slice group is lost, it can be effectively compensated by the other group using spatial correlation. In contrast, for the traditional slice of macroblocks in raster scan order, a lost slice cannot be easily compensated by other received slices due to the weak spatial correlation among them.
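The difference between the checkerboard and the raster-scan layouts can be made concrete with a small Python sketch. It only counts, for every lost macroblock, how many of its four direct neighbours were received. It does not model the H.264 concealment itself, and the two-slice raster layout (top half and bottom half of the frame) is an assumption chosen for illustration.

```python
def checkerboard_groups(rows, cols):
    """Slice-group id (0 or 1) of each macroblock in a checkerboard FMO map."""
    return [[(r + c) % 2 for c in range(cols)] for r in range(rows)]


def raster_groups(rows, cols):
    """Two raster-scan slices (assumed layout): top half is slice 0, the rest slice 1."""
    return [[0 if r < rows // 2 else 1 for _ in range(cols)] for r in range(rows)]


def received_neighbours(groups, lost_group):
    """For every macroblock of the lost group, count the received 4-neighbours."""
    rows, cols = len(groups), len(groups[0])
    counts = []
    for r in range(rows):
        for c in range(cols):
            if groups[r][c] != lost_group:
                continue
            count = 0
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                inside = 0 <= rr < rows and 0 <= cc < cols
                if inside and groups[rr][cc] != lost_group:
                    count += 1
            counts.append(count)
    return counts


# A 4CIF frame (704 x 576 pixels) has 36 rows and 44 columns of 16 x 16 macroblocks.
for name, layout in (("checkerboard", checkerboard_groups(36, 44)),
                     ("raster", raster_groups(36, 44))):
    counts = received_neighbours(layout, lost_group=0)
    print(name, "lost macroblocks without any received neighbour:", counts.count(0))
```

With the checkerboard map every lost macroblock keeps all of its in-frame neighbours, whereas most macroblocks of a lost raster-scan slice have none, which is the weak spatial correlation referred to above.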
If a bi-predictive slice is lost, its prediction modes and motion vectors can be derived directly from previously encoded information, just as in the temporal direct mode [50]. H.264/SVC adopts the same mechanism [19] for temporal error concealment in a quality layer, where motion vectors are generated from the forward and the backward reference lists.

The interleaving scheme allows effective EC by spreading a long burst erasure pattern into shorter isolated ones. It increases the spatial and the temporal correlation between lost and received packets. In Chapter 3, an NC data block consists of a whole GOP or several GOPs, so the loss of an RLNC generation results in a burst erasure of packets in a GOP. By mapping a GOP into several generations, the interleaving scheme redistributes the burst erasure pattern.

In Fig. 4.5, we compare the video quality of these two loss patterns using a 4CIF sequence encoded by the H.264/SVC reference code JSVM7.9. If the burst erasure causes the loss of the 2nd GOP, the resultant average PSNR is equal to 34.65 dB. The whole GOP is concealed by the frame copy method, and most motion information is lost, which introduces a large quality degradation. The quality is even worse if the lost GOP includes a key frame. In this case, adjacent GOPs are also affected due to the hierarchical B structure in H.264/SVC.

Figure 4.5: Comparison of the PSNR values of concealed video with and without interleaving. (Axes: frame number versus PSNR in dB; curves: lossless, average PSNR 36.14 dB; 2nd GOP lost, average PSNR 34.65 dB; one B frame lost from GOP 1 to GOP 8, average PSNR 35.87 dB.)

With the interleaving scheme, the number of lost packets is the same, but they are distributed over 8 adjacent GOPs. Moreover, with unequal erasure protection, the lost packets have lower priority. Thus, the quality degradation is smaller, and the average PSNR is 35.87 dB, close to the lossless value of 36.14 dB.
In this case, the frame copy method is effective because neighboring frames are temporally correlated.

Without interleaving, a burst erasure leads to the loss of a whole GOP or several GOPs in a dense GCM. When the rank of the GCM is smaller than the number of packets, received packets are useless to the video decoder, and EC cannot be applied effectively since the location of lost source packets is ambiguous. With interleaving, it is easier to locate lost packets for EC.

4.3 GOP Partition and Unequal Protection

When the interleaving scheme is applied to prioritized video layers, unequal erasure protection can improve the visual performance of reconstructed video. In this section, we first discuss methods to partition a GOP into priority layers, and then present ways to perform unequal protection in the interleaving scheme.

4.3.1 GOP Partition

To get the full benefit of interleaving, we need to decompose a GOP into several smaller partitions. The following two conditions should be considered in the partitioning of GOPs.

• Strong spatial and temporal correlation between partitions, which enables efficient error concealment.
• Priority difference among partitions, which enables unequal protection.

We present four partition methods based on the error-resilient coding tools of H.264/AVC and H.264/SVC, and discuss below whether they are suitable for unequal protection in the context of interleaving.

• Partition a GOP into three groups that contain only I, P or B frames, respectively. This has been used in early work on unequal error protection [6]. It is a simple and effective idea, where the temporal correlation is utilized for EC. Moreover, the relative importance of the I, P, and B groups is obvious. Similarly, the H.264/SVC bitstream can be organized into groups $T_0, T_1, \ldots, T_t$ based on the hierarchical B structure as shown in Fig. 2.5. The importance of those groups is measured by delta PSNR.
We adopt this method to generate the input H.264/SVC bitstream in computer simulation. Each partition of a GOP contains one or several temporal layers.

• Partition a frame spatially with FMO in H.264/AVC. The slices that correspond to the same location in different frames are put into one generation. The spatial correlation is utilized for EC. This method provides the flexibility to define the area of slices and regions of interest (ROI), and the importance of each slice can be obtained accordingly.

• Partition a packet at the fine-grain level. Packet classification using the Relative Loss Index (RLI) was proposed by Kim et al. [23] for Quality of Service (QoS) mapping to the DiffServ network. The underlying idea of RLI is to evaluate the loss impact of every packet at the macroblock level by considering factors such as the magnitude and the direction of the motion vector, encoding types, initial error, etc. Then, packets are categorized based on the RLI value. Since the computation of the loss impact considers both spatial and temporal correlation, it is suitable for the interleaving scheme.

• H.264/AVC data partition. The coded information of each slice in H.264/AVC has three partitions.
  – Type A partition: the slice header information such as MB types, quantization parameters and motion vectors;
  – Type B partition: intra coded block patterns and intra coefficients;
  – Type C partition: inter coded block patterns and inter coefficients.
  The above partition generates an embedded bitstream. If partition A is lost, the whole slice cannot be decoded. Thus, the bandwidth used for the transmission of partitions B and C is wasted, so this method is not suitable for the interleaving scheme.

4.3.2 Unequal Erasure Protection

After decomposing a GOP into several partitions with different priorities, we can group packets of the same priority from multiple GOPs into a generation for RLNC.
Then, the loss of one generation only results in the loss of several packets of the same priority in these GOPs. To reduce the probability of losing important generations with NC, we may provide more redundancy for them to obtain stronger protection. With NC-oriented interleaving, redundancy for erasure protection is obtained by generating more packets from the RLNC process.

4.4 Design of Optimal Interleaving Scheme

4.4.1 Problem Formulation

We can optimize two parameters for the best performance of the NC-oriented interleaving scheme: 1) the interleaving length $g$; and 2) the number of redundant packets in each generation. The interleaving length should not be too large since a larger value of $g$ demands a longer encoding/decoding delay and a larger buffer size. Moreover, a large value of $g$ leads to a large GlobVec field, which carries the RLNC coefficients in the packet header. The size of the field is equal to $\sum_{j=1}^{Q} k_{ij}$, where $k_{ij}$ is the number of packets in the $i$th partition of the $j$th GOP, and $Q$ is the number of partitions. Thus, there should be an upper bound on $g$, which is denoted by $G$ in the optimization problem. The number of redundant packets of the $Q$ partitions (or generations) in each interleaving group of GOPs is denoted by a vector $\mathbf{r} = (r_1, \ldots, r_Q)^T$. The optimization problem can be formulated as follows.

We would like to minimize the expected quality degradation as given by

\[ J(g^*, \mathbf{r}^*) = \min \Big( \sum_{i=1}^{Q} D_i P_i \Big), \tag{4.3} \]

subject to the following three constraints:

\[ 1)\quad g \le G, \tag{4.4} \]
\[ 2)\quad \sum_{i=1}^{Q} r_i \le R, \tag{4.5} \]
\[ 3)\quad \frac{r_1}{k_1} > \frac{r_2}{k_2} > \cdots > \frac{r_Q}{k_Q}, \tag{4.6} \]

where

\[ P_i = 1 - B(n_i, k_i, p) = \sum_{j=1}^{k_i - 1} \binom{n_i}{j} (1-p)^j p^{n_i - j} \]

is the probability of losing generation $i$, $k_i = g m_i$ is the number of packets per interleaved data block (with $m_i$ the number of packets at the $i$th layer of a GOP), $n_i = k_i + r_i$, and $D_i$ is the quality degradation of the $i$th partition. We adopt the off-line calculation in our simulation. That is, before the transmission, the optimizer in Fig. 3.1 calculates the PSNR degradation of the bitstream by dropping the corresponding partitions. In practical video transmission, the quality degradation can be calculated on-line using the relative loss index (RLI) as discussed in [23].

Constraint (4.5) defines the total number of redundant packets, $R$, and the redundancy ratio is equal to $R / \sum_{i=1}^{Q} k_i$. Constraint (4.6) specifies the requirement that the redundancy assignment should protect important generations better.

4.4.2 Iterative Algorithm to the Optimization Problem

We propose an iterative algorithm of low computational complexity to solve the optimization problem (4.3). The process is illustrated in Fig. 4.6. Initially, every generation is treated equally so that redundancy is evenly assigned. After that, redundancy assigned to a generation of lower priority is moved to one of higher priority, which may be the closest neighbor generation or a generation further to the left, as long as the new assignment $\mathbf{r}$ satisfies Condition (4.6). The redundancy adjustment is iterated until no further improvement can be reached.

The input parameters of the iterative algorithm include the number of packets in each partition, denoted by vector $\mathbf{k}$ ($\{k_i, i \in [1, \ldots, Q]\}$), the quality of each partition, denoted by vector $\mathbf{q}$ ($\{q_i, i \in [1, \ldots, Q]\}$), and the redundancy ratio. The output parameters are the optimal interleaving length $g^*$ and the optimal redundancy values, denoted by vector $\mathbf{r}^*$. The detailed algorithm is given below.
Iterative Algorithm for Selection of Optimal Interleaving Parameters

  J* = 0
  g* = 0, r*[i] = 0 for all i ∈ 1...Q
  q_active = 0, q_next = 0
  for g = 1 to G do
    Calculate k_i for layer i in g GOPs for all i ∈ 1...Q
    r[i] = (redundancy ratio) × k[i] for all i ∈ 1...Q
    q_active = 0, q_next = 0
    while r[i] ≥ 0 for all i ∈ 2...Q do
      if J(g, r_1, r_2, ..., r_Q) > J* then
        g* = g, r*[i] = r[i] for all i ∈ 1...Q, J* = J
      end if
      repeat
        if q_active ≥ 2 then
          r[q_active − 1]++, r[q_active]−−
        end if
        if r[q_next] == 0 then
          q_next−−
        end if
      until r[i−1]/k[i−1] > r[i]/k[i] for all i ∈ 1...Q
    end while
  end for

The computational complexity of the proposed algorithm depends on the number of redundant packets to be relocated from generations of lower priority to generations of higher priority. The number of packets relocated in each round of the loop defines the granularity. A larger granularity makes the algorithm faster but may result in a sub-optimal solution. A smaller granularity tends to yield the optimal solution at the cost of higher computational complexity. The smallest granularity is to adjust one packet at each iteration, which demands the highest complexity.

Figure 4.6: Adjustment of redundancy among partitions. (a) Initial redundancy assignment; (b) intermediate redundancy assignment; (c) optimized redundancy assignment.

We can evaluate the complexity of the proposed algorithm as follows. Let $R$ be the total number of redundant packets and $Q$ the number of partitions per GOP (so that the number of generations is also equal to $Q$). Initially, every generation is assigned $R/Q$ redundant packets. For each redundant packet adjustment, we calculate and check Condition (4.6). Let the 1st generation have the highest priority, and the $Q$th generation the lowest priority. The $Q$th generation is assigned $R/Q$ redundant packets in the initial stage, so the maximum redundancy that can be moved to the other $Q-1$ generations is $R/Q$. The $(Q-1)$th generation is also assigned $R/Q$ initially.
Besides, up to $R/Q$ redundant packets may arrive from the $Q$th generation. Thus, there are at most $2R/Q$ redundant packets to be moved to the other $Q-2$ generations. By induction, the total number of operations can be written as

\[ \sum_{i=1}^{Q} \Big( i \frac{R}{Q} (Q - i) \Big) = \frac{R}{Q} \Big( Q \sum_{i=1}^{Q} i - \sum_{i=1}^{Q} i^2 \Big) = R \Big( \frac{Q(Q+1)}{2} - \frac{(Q+1)(2Q+1)}{6} \Big). \tag{4.7} \]

Thus, the complexity of the proposed algorithm is $O(RQ^2)$. For comparison, the full search method finds the optimal values of $g$ and $\mathbf{r}$ among all possible combinations of $\{\mathbf{r}, g\}$ under Constraints (4.4), (4.5) and (4.6). Its computational complexity is $O(R^Q)$, which is much higher than that of the proposed algorithm.

In the process of solving for the optimal $g$ value, some large $g$ values are explored. The intermediate values of the binomial coefficients in the cumulative binomial probability are then often so large that they overflow even with 64-bit arithmetic. To avoid the overflow, we use the regularized incomplete beta function to obtain the cumulative binomial probability [41], i.e.,

\[ I_p(k, n-k+1) = B(n, k, p), \]

where

\[ I_x(a, b) = \frac{\int_0^x t^{a-1} (1-t)^{b-1} \, dt}{\int_0^1 t^{a-1} (1-t)^{b-1} \, dt}. \]

4.5 Simulation Results

We use the H.264/SVC reference software JSVM7.9 to generate the video bitstream. Several 4CIF sequences (Harbor, Crew, and Soccer) are used in the simulation. The frame rate is 30 frames/second, the quantization parameter (QP) is 35, the GOP size is 8, and the intra period is 8. Only one quality layer is encoded, but there are four temporal layers. The finite field size is $2^8 = 256$.

4.5.1 Optimal Parameters for Interleaving Scheme

We show the optimal interleaving length $g^*$ for the three sequences in Figs. 4.7(a), 4.8(a) and 4.9(a). If the average packet loss rate $p$ is small, $g^*$ is large. If $p$ is large, $g^*$ is small. This relationship is explained by the characteristics of the regularized incomplete beta function as shown in Fig. 4.10, where the value of $I_{1-p}(kg, (n-k)g + 1)$ is calculated numerically by changing $p$ and $g$ with $k = 30$ and $r = 3$. We see an inflection point for every curve parameterized by $g$.
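The dependence of the generation-loss probability on $g$ can be reproduced numerically with a short Python sketch. It is an illustration under stated assumptions rather than the code used for Fig. 4.10. Packet losses are taken as i.i.d. with rate $p$, and a generation counts as lost when fewer than $kg$ of its $(k+r)g$ packets arrive, following the definition of $P_i$ in (4.3). Instead of the regularized incomplete beta function, the sketch sums the binomial terms in the log domain with the standard-library function math.lgamma, which is a different way of avoiding the overflow mentioned above.

```python
import math


def log_binom_pmf(n, j, q):
    """log of C(n, j) * q**j * (1 - q)**(n - j), via log-gamma to avoid overflow."""
    return (math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
            + j * math.log(q) + (n - j) * math.log1p(-q))


def generation_loss_probability(k, r, g, p):
    """P(fewer than k*g of the (k + r)*g packets arrive) for i.i.d. loss rate 0 < p < 1."""
    n_total, k_total = (k + r) * g, k * g
    return sum(math.exp(log_binom_pmf(n_total, j, 1.0 - p)) for j in range(k_total))


# Same setting as in the text: k = 30 source packets and r = 3 redundant packets per GOP.
for p in (0.02, 0.20):
    row = [generation_loss_probability(30, 3, g, p) for g in (1, 2, 4, 8)]
    print("p = %.2f:" % p, ["%.3g" % value for value in row])
```

For a loss rate below the redundancy ratio the printed probabilities shrink as $g$ grows, and for a loss rate above it they grow with $g$, which is the crossover behaviour discussed in this subsection.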
Before the inflection point, a larger $g$ value results in a smaller $I_{1-p}(kg, (n-k)g + 1)$. After the inflection point, a smaller $g$ yields the smaller value. Since $I_{1-p}(kg, (n-k)g + 1)$ corresponds to $P_i$ in the objective function (4.3), the smallest $P_i$ is required to minimize the quality degradation. Then, $J(g^*, \mathbf{r}^*)$ is obtained by minimizing $I_{1-p}(kg, (n-k)g + 1)$ and, consequently, the optimal interleaving length is

\[ g^* = \arg\min_{g \in [1, G]} I_{1-p}(kg, (n-k)g + 1). \]

The inflection point varies with $k$, which is the number of packets in a generation. Since $k$ varies among video sequences, $g^*$ is different in the above figures.

Figure 4.7: Optimal parameters of the NC-oriented interleaving scheme for the 4CIF Harbor sequence at a frame rate of 30 frames/sec. (a) Optimal interleaving length (number of GOPs versus average packet loss rate); (b) optimal assigned redundancy (percentage of optimal redundancy for r1, r2, r3 versus average packet loss rate).

Figure 4.8: Optimal parameters of the NC-oriented interleaving scheme for the 4CIF Soccer sequence at a frame rate of 30 frames/sec. (a) Optimal interleaving length; (b) optimal assigned redundancy.

Figs. 4.7(b), 4.8(b) and 4.9(b) illustrate the optimal percentage of assigned redundancy for generations of different priority. When the packet loss rate is small, the assigned redundancy is proportional to the number of packets in the three generations. The redundancy assignment does not differentiate the importance of generations because a small packet loss rate leads to a small probability of generation loss.
When the packet loss rate is higher, more redundancy is assigned to the most important layer. Redundancy for the least important layer decreases to zero when $p > 0.08$. The most important generation is received with the highest probability at the cost of less protection of other generations. Thus, graceful quality degradation is achieved.

Figure 4.9: Optimal parameters of the NC-oriented interleaving scheme for the 4CIF Crew sequence at a frame rate of 30 frames/sec. (a) Optimal interleaving length; (b) optimal assigned redundancy.

4.5.2 Comparison of NC-Oriented Interleaving and Dense GCM

Figs. 4.11(a), 4.12(a) and 4.13(a) show the minimized quality degradation for the three sequences, respectively. When the packet loss rate is small, the quality degradation is almost zero, which means that effective protection is achieved by the assigned redundancy. Figs. 4.11(b), 4.12(b) and 4.13(b) show the average PSNR values of the interleaving scheme and the dense GCM method at the same redundancy ratio of 10%. When $p$ is small, both methods achieve the same quality as lossless transmission. However, when $p$ is large, the NC-oriented interleaving scheme outperforms the dense GCM method by 3 to 5 dB.

Finally, we show the reconstructed frames using NC-oriented interleaving and dense GCM in Figs. 4.14, 4.15 and 4.16. Most frames using dense GCM have block artifacts. For example, there is a ghost shadow in Fig. 4.15(b). When all packets of the $i$th GOP are received, the decoder uses two key frames to reconstruct the remaining B frames.
If one of the key frames of the $(i-1)$th GOP is lost, it is concealed using the received key frame of the $(i-2)$th GOP. Then, decoding of the $i$th GOP is based on this concealed key frame. When the received motion vectors and residuals of the B frames in the $i$th GOP are applied to it, many block artifacts appear in the reconstructed B frames. However, with NC-oriented interleaving and unequal protection, key frames can be received with higher probability, and EC works effectively because of the high temporal correlation between lost and received frames. As a result, the decoded video has better quality.

Figure 4.10: The probability of losing a generation. (Axes: average packet loss rate versus probability of losing a generation; curves for g = 1 to g = 8.)

Figure 4.11: Minimized quality degradation and the average PSNR value for the 4CIF Harbor sequence at a frame rate of 30 frames/sec. (a) Minimized quality degradation (dB) versus average packet loss rate; (b) average PSNR (dB) of the interleaving scheme and dense GCM versus average packet loss rate.

Figure 4.12: Minimized quality degradation and the average PSNR value for the 4CIF Soccer sequence at a frame rate of 30 frames/sec. (a) Minimized quality degradation; (b) average PSNR.

Figure 4.13: Minimized quality degradation and the average PSNR value for the 4CIF Crew sequence at a frame rate of 30 frames/sec. (a) Minimized quality degradation; (b) average PSNR.

Figure 4.14: Comparison of a decoded frame (the 44th frame of the Harbor sequence of resolution 704 × 576) using NC-oriented interleaving and dense GCM with 10% redundancy and the packet loss rate p = 8%. (a) Interleaving (29.58 dB); (b) dense GCM (27.54 dB).

Figure 4.15: Comparison of a decoded frame (the 26th frame of the Soccer sequence of resolution 704 × 576) using NC-oriented interleaving and dense GCM with 10% redundancy and the packet loss rate p = 8%. (a) Interleaving (28.38 dB); (b) dense GCM (26.12 dB).

Figure 4.16: Comparison of a decoded frame (the 34th frame of the Crew sequence of resolution 704 × 576) using NC-oriented interleaving and dense GCM with 10% redundancy and the packet loss rate p = 8%. (a) Interleaving (30.24 dB); (b) dense GCM (27.07 dB).

4.6 Conclusion

We proposed an interleaving scheme, called the NC-oriented interleaving scheme, that is suitable for robust video transmission in erasure networks using NC. It performs scheduling at the source node and the RLNC operations at intermediate nodes of the network. The proposed scheme enables NC to cooperate with EC effectively. We studied the optimal interleaving scheme by optimizing the interleaving length and the redundancy selected for different priority layers. Simulation results showed that the NC-oriented interleaving scheme yields better quality than the dense GCM method.
Chapter 5

Wireless Multi-party Video Conferencing with Network Coding

In this chapter, we propose a cost-effective scheme for robust wireless multi-party video conferencing based on NC. The main idea is to adopt an NC scheme that enhances robust transmission, simplifies the erasure protection procedure, and reduces the downlink bandwidth by leveraging opportunistic NC and wireless broadcasting. Moreover, we design a pipelining schedule to meet the delay requirement of real-time video conferencing. The proposed NC method outperforms the opportunistic network coding method in terms of video quality and downlink bandwidth.

5.1 Challenges

Real-time multi-party video conferencing in wireless networks is a challenging task. Good video quality, large wireless bandwidth, and stringent delay requirements all pose formidable challenges in real-time video conferencing through error-prone wireless networks. To improve on the inefficient traditional solutions, we propose a novel network coding scheme for the multi-party real-time video conferencing application.

The largest challenge for video conferencing is that the transmission is subject to errors in wireless networks. Usually, the video data from all participants are switched at the base station (BS) or the access point (AP), and transmissions are subject to packet loss over uplink and downlink channels. With the traditional store-and-forward (S&F) method on the BS, all participants have to protect data against erasures in each channel to achieve graceful video quality degradation, which is complicated and bandwidth inefficient. Previous wireless network coding methods [17, 21, 37] only apply to video broadcast/multicast scenarios, where the BS/AP broadcasts/multicasts video data to wireless receivers. They assume that the wireless BS/AP receives video packets without loss. Therefore, those wireless network coding methods only provide erasure protection for the transmission over downlinks.
If those methods are adopted in our multi-party video conferencing scenario, additional erasure protection is required to protect the uplink transmission to achieve graceful video quality degradation.

Furthermore, the transmission of multiple video bit-streams requires a large amount of bandwidth when all video bit-streams are exchanged at the BS. This limits the number of video conferencing participants since the bandwidth of the downlink/broadcast channel from the BS to all users is a precious resource in a wireless network. The traditional S&F method usually consumes a downlink bandwidth equal to the aggregate video bitrate of all users.

Finally, real-time conferencing poses a stringent delay requirement. For example, a commercial video conference system only allows a delay of 100 to 200 ms [10]. Most research on wireless network coding [17, 37] primarily focused on the video streaming application, which allows a longer delay.

In this chapter, we propose a cost-effective approach for robust wireless multi-party video conferencing based on network coding (NC). Major contributions of this work are stated below.

• We propose a novel NC scheme for advanced erasure protection. We improve the transmitted video quality by providing erasure protection in uplink, downlink and overhearing channels. Erasure protection in uplink and overhearing channels is achieved by multi-step NC encoding, while erasure protection in downlink channels is achieved by dynamic request. The proposed NC scheme not only enhances robust transmission but also simplifies the erasure protection mechanism, which only involves the BS/AP. Computer simulation shows that the proposed NC scheme leads to a higher decoding probability in recovering source video packets at the receiver. As a result, each user can receive better video quality.

• We aim at reducing the bandwidth of the downlink channel by leveraging opportunistic NC and wireless broadcasting. Simulation results show that the downlink bandwidth can be greatly reduced by the proposed NC scheme.

• We design a pipelining schedule and a buffering policy to facilitate NC on all participants and to meet the delay requirement of real-time video conferencing. The maximum decoding delay is around 100 ms, which is within the delay requirement.

The rest of this chapter is organized as follows. We propose an NC scheme that is suitable for wireless multi-party video conferencing in Sec. 5.2. Then, the performance of the proposed NC scheme is analyzed in Sec. 5.3. Simulation results are shown in Sec. 5.4. Finally, concluding remarks are given in Sec. 5.5.

5.2 Wireless Video Conferencing with NC

We focus on multi-party video conferencing in WiMAX networks. To facilitate our analysis and discussion, we make the following assumptions on the system model.

• All channels are i.i.d. stationary, duplex and symmetric channels.
• Mutual interference among users and collisions in wireless channels are ignored.
• All channels are line-of-sight (LOS) channels, where the multi-path effect is ignored. Thus, the main channel behavior is signal propagation (with a decaying magnitude and a certain delay) corrupted by noise.
• The multi-carrier technique is adopted by all nodes in the network so that each node can transmit and receive data simultaneously.

In this section, we first give a system overview. Then we discuss the challenge of the system and describe the proposed NC process in detail. Finally, we present the real-time video scheduling.

5.2.1 System Overview

A representative system diagram is shown in Fig. 5.1. Suppose $U$ users join a video conference session, and every user expects to receive video streams from all other participants. Source video sequences are encoded at all user nodes and organized into packets of equal length. Meanwhile, every user overhears transmissions from his/her neighbors.
Each user encodes his/her own source packets and the packets overheard from neighbors into a generation with random linear NC (RLNC) [9, 16]. Then, the user sends a generation of NC packets to the base station (BS). After finishing sending a generation, users send requests to the BS for the additional packets needed to decode an NC generation successfully. The BS receives packets from all users and generates NC-coded packets for broadcasting based on the users' requests. Then, users reconstruct the Global Coding Matrix (GCM) from the received and overheard packets, and decode the source data of other users. Finally, users can display these video contents.

NC packets are stored in buffers $S$ and $R$ of the user nodes as well as in buffer $Q$ of the BS. Channels are classified as

• Uplink channels from users to the BS;
• Downlink/broadcast channels from the BS to all users;
• Overhearing channels between neighboring users when they work in the promiscuous mode.

The packet loss rate of both the uplink and the downlink channel of user $i$ is $\alpha_i$, while the packet loss rate of the overhearing channel between user nodes $i$ and $j$ is $\beta_{i,j}$. The overhearing channel is symmetric in the sense that $\beta_{i,j} = \beta_{j,i}$. In the following, we discuss the system, which consists of two parts: 1) an NC process, and 2) real-time scheduling.

Figure 5.1: Illustration of the NC wireless multi-party video conferencing system. (Three users, User1 to User3, with buffers $S_i$ and $R_i$, connected to the BS with buffer $Q$; uplink/downlink loss rates $\alpha_1$, $\alpha_2$, $\alpha_3$ and overhearing loss rates $\beta_{1,2}$, $\beta_{1,3}$, $\beta_{2,3}$.)

5.2.2 Proposed NC Process

When wireless NC is applied to the multi-party video conferencing scenario, the challenge is how to transmit packets effectively and robustly over the three types of wireless channels. In Chapter 2, we discussed several prevailing wireless NC methods [17, 20, 21, 37, 55]. None of those methods applies directly to the multi-party video conferencing scenario. Most of them [17, 37] only address robust transmission over the downlink.
The best of those methods, opportunistic wireless NC [21], provides effective transmission over overhearing channels with NC. However, it fails to provide effective transmission over all three channels when applied to our application scenario. For example, packet $P_1$ on user $n_1$ can be propagated through the network via two channels. It is possible that $n_4$ overhears $P_1$. With traditional opportunistic wireless NC, $n_4$ only sends its own packet $P_2$ without any NC procedure, which prevents $P_1$ from being propagated further through the network. If $P_1$ is lost over both channels $n_1 \to n_3$ and $n_1 \to n_5$, $P_1$ is erased permanently in the network. In this section, we propose an NC process that enables overheard packets to be propagated further through the network. Our method transmits packets effectively over uplink, downlink and overhearing channels with wireless NC and meets the challenge of multi-party video conferencing.

In the proposed system, a generation of NC packets contains data blocks from all users in a period of time of length $T$. For example, if all users begin the video conferencing session simultaneously, packets with the same frame index may be encoded into one generation. The NC process is performed on both the user nodes and the BS as described below.

• In the 1st step, NC encoding is performed at each user node by mixing its own source packets with the packets overheard from neighboring nodes. For user $i$, buffer $S_i$ stores source packets. The number of source packets for the current generation is $|S_i|$. A packet in this buffer is denoted by $s_j$ with $0 \le j \le |S_i|$. Buffer $R_i$, whose size is denoted by $|R_i|$, stores packets overheard from other users. A packet in this buffer is denoted by $r_j$ with $0 \le j \le |R_i|$. The $k$th NC-encoded packet at user $i$ can be written as

\[ Y'_k = \sum_{j}^{|S_i|} a_{j,k} s_j + \sum_{j}^{|R_i|} b_{j,k} r_j, \tag{5.1} \]

where $a_{j,k}$ and $b_{j,k}$ are randomly selected coefficients over a finite field. User $i$ generates $|S_i|$ NC packets for each NC generation.
The coefficients for one generation compose the local coding matrix on users (LCMU), which is a matrix of size $|S_i| \times (|S_i| + |R_i|)$. These NC packets are sent while packets of neighbors are received via overhearing channels. It takes a time interval of length $T$ to send all packets for one generation. After finishing sending a generation, if user $i$ finds that the overheard packets are not sufficient to decode other users' packets for this generation, it requests $m_i$ additional packets from the BS. To calculate $m_i$, user $i$ counts two numbers: $N'$ and $|R_i|$. $N'$ is the total number of packets from all other users in this generation, i.e., $N' = \sum_{j \ne i}^{U} |S_j|$, where $|S_j|$ is the number of source packets on user $j$, and $|R_i|$ is the number of overheard packets on user $i$. It is clear that $m_i = N' - |R_i|$.

In the 2nd step, the BS generates NC packets by mixing all received packets after receiving the users' requests. For every generation, the BS generates $M$ packets and broadcasts them to all participants, where $M = \max(m_i, \forall i \in [1, U])$. Each packet in the BS buffer $Q$ is denoted by $q_j$ with $1 \le j \le |Q|$, where $|Q|$ is the number of packets stored in buffer $Q$. The BS does not send any NC packet immediately but waits for $T$ until all users finish transmitting packets for a generation. Since more packets are received at the BS, these NC packets span a space of higher dimension. Thus, by waiting for $T$, the BS can generate innovative NC packets with a high probability, which increases the chance of achieving a full-rank GCM at the user end. The $k$th NC-encoded packet at the BS can be written in the form of

$$Y_k = \sum_{j=1}^{|Q|} c_{j,k}\, q_j, \tag{5.2}$$

where $c_{j,k}$ is a randomly selected coefficient over a finite field. The coefficients of one generation compose the $M \times |Q|$ local coding matrix on the BS (LCMB). Those coefficients are transmitted along with NC packets [9] from the BS to each user node.

In the 3rd step, each user performs NC decoding. For user $i$, the source packets in buffer $S_i$ are of dimension $|S_i|$ in one generation.
If they participate in the decoding process together with the received NC packets in buffer $R_i$, fewer additional packets are needed from the BS for successful decoding, and downlink bandwidth is thus saved. If user $i$ wants to decode data from all other users ($S_j$, $\forall j \in [1, U]$, $j \ne i$), it has to reconstruct an $N \times N$ matrix $GCM_i$, where $N = \sum_j |S_j|$. The upper part of $GCM_i$ is an $|S_i| \times N$ submatrix. Its columns $\sum_{k=1}^{i-1} |S_k| + 1$ to $\sum_{k=1}^{i} |S_k|$ form an identity matrix of dimension $|S_i| \times |S_i|$. The remaining entries are filled with zeros. The lower part is an $N' \times N$ submatrix, which is composed of global coding coefficients from the LCMU and the LCMB. Then, the decoding equation becomes

$$\begin{bmatrix} S_i \\ R_i \end{bmatrix} = GCM_i \begin{bmatrix} S_1 \\ \vdots \\ S_U \end{bmatrix}. \tag{5.3}$$

Source video packets from other users are decoded by applying Gaussian elimination to (5.3).

5.2.3 Real-Time Video Scheduling

We design a scheduler to meet the real-time requirement of video conferencing, as shown in Fig. 5.2. Based on the three NC steps stated in the last subsection, the following three operations can be pipelined: 1) user sending, 2) BS sending, and 3) user decoding/display. Each operation consists of two stages so as to process the odd and the even generations, respectively. As shown in the figure, $t_0$ is the start time for video conferencing, when users start to send data. The BS waits for a time interval of length $T$ and starts to send at $t_1$. The NC method proposed in [32] generates NC packets immediately after receiving any packet. In the video conferencing scenario, this may waste downlink bandwidth since fewer packets are NC-encoded at the BS and the probability of generating packets that are innovative to users becomes lower. In our scheme, the BS gathers a sufficient number of packets to generate new NC packets by waiting for an interval of length $T$ and, as a result, the chance of having innovative NC packets becomes higher.
To decode the NC generation, users wait another interval of length $T$ until a sufficient number of packets of the same generation are received for decoding at $t_2$ and then start to display.

Figure 5.2: Real-time scheduling with even and odd generations of video packets.

To enable the pipelining process for real-time video conferencing, we develop an alternating storage structure for the buffers in Fig. 5.1. That is, we store even and odd generations of video packets alternately. The life cycle to complete one even and one odd generation is $2T$. The buffers are refreshed at the end of each life cycle. Within one life cycle, we have the following buffer management processes.

Management of buffer $S$. In the first period of length $T$, $S$ provides packets for the 1st-step NC encoding. At the end of the second period of length $T$, $S$ provides packets for NC decoding.

Management of buffer $R$. In the first period of length $T$, $R$ stores packets received from overhearing channels. In the second period of length $T$, $R$ stores packets received from broadcasting channels. At the end of the second period, $R$ provides packets for NC decoding.

Management of buffer $Q$. In the first period of length $T$, $Q$ stores packets received from uplink channels. In the second period of length $T$, $Q$ provides packets for the 2nd-step NC encoding.

With the above buffer management scheme, the decoding delay of the proposed NC process is equal to $t_2 - t_0 = 2T$ and the overall delay is $d = 2T + 2\tau$, where $\tau$ is the propagation delay in the wireless medium between users and the BS. Usually, $\tau$ is small and negligible. If $T$ is sufficiently small, we can meet the real-time transmission constraint. For example, if $T = 1/30$ s, the NC decoding delay is $2 \times \frac{1}{30}\ \text{s} \approx 66.7$ ms.

5.3 Performance Analysis

In Sec. 5.1, we discussed the traditional S&F method, which requires large downlink bandwidth and complicated erasure protection.
Generally speaking, non-NC methods perform worse than NC methods. For example, a non-NC scheme with overhearing may reduce the required downlink bandwidth. However, to benefit from overhearing channels, it has to address two problems. First, to avoid sending redundant packets in uplink and overhearing channels, users need a complicated protocol to inform the BS about which packets have already been overheard. Second, packet loss impacts video quality. Packets may be lost in uplink and overhearing channels, so erasure protection is also needed in these channels. In this section, we focus on wireless NC methods. The NC scheme presented in Sec. 5.2 will be compared with the opportunistic NC scheme in [17]. We will show that the proposed NC scheme offers a simpler and more robust erasure protection mechanism.

5.3.1 Comparison of Proposed and Opportunistic NC Methods

In a wireless environment, video conferencing packets are subject to erasure in uplink, downlink and overhearing channels. The opportunistic wireless NC method in [17] has a problem in erasure protection. That is, when user $i$ receives packets from user $j$, it may receive innovative packets from the BS. However, a packet may be lost in both the overhearing and the uplink channels with probability $\alpha_j \beta_{j,i}$. When user $i$ does not achieve the full rank of $S_j$ from other users, he/she cannot achieve the full rank of $GCM_i$ in (5.3) to decode the current generation of packets regardless of the number of NC packets sent from the BS. Then, all data in this generation would be lost, which results in more severe video quality degradation. One way to alleviate this problem is to add redundancy at both the users and the BS via channel estimation and optimized erasure protection.

In the analysis framework outlined above, our main concern is the rank of $S_j$ at the BS and user $i$, which composes a subspace as illustrated in Fig.
5.3(a), where the subspace, $I_{i,j}$, consists of a set of nodes where user $j$ collects innovative packets after user $i$ sends NC packets of a generation. For $GCM_i$ to be of full rank, $S_i$ from user $i$ has to be of full rank on user $j$. Thus, it is essential to study the ranks of all $I_{i,j}$, $\forall j \in [1, U]$, $j \ne i$. For the opportunistic NC method [17], the probability of achieving full rank over $I_{i,j}$ is

$$P_o = (1 - \alpha_i \beta_{i,j})^{|S_i|}, \tag{5.4}$$

which is equal to the probability that every packet in one generation from user $i$ is received by either the BS or user $j$. As to the proposed NC method, more user nodes besides the BS can contribute innovative packets to user $j$. This situation is illustrated in Fig. 5.3(b). The subspace $I_{i,j}$ of the proposed NC method includes the node set $\{BS, user_k, \forall k \in [1, U], k \ne i\}$. The probability for $S_i$ to have the full rank becomes

$$P_m = \Big(1 - \alpha_i \prod_{k \ne i} \beta_{i,k}\Big)^{|S_i|}, \tag{5.5}$$

which is the probability that every packet in a generation $S_i$ from user $i$ is received by the BS or by at least one other user $k$, $k \ne i$.

Figure 5.3: The subspace for user node 1 with (a) the opportunistic NC method and (b) the proposed NC method.

It is clear that $P_m \ge P_o$ for $0 \le \alpha \le 1$ and $0 \le \beta \le 1$, with strict inequality when $0 < \alpha, \beta < 1$ and $U \ge 3$. If more users join the video conference, $P_m$ approaches 1 while $P_o$ does not change with $U$. The proposed NC method has a higher probability of obtaining innovative packets and, consequently, (5.3) is more likely to be decodable. Furthermore, it needs no erasure protection at user nodes, which eliminates the additional bandwidth required for redundant packets in uplink channels. The proposed NC method only has to protect packets in the downlink channel. We use a dynamic request to obtain more NC packets from the BS based on the BS sequence number of NC packets. That is, if a BS packet is lost in the downlink channel to user $i$, user $i$ will request one more packet on top of $m_i$.
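To make the comparison between Eqs. (5.4) and (5.5) concrete, the short Python sketch below evaluates both probabilities for a few conference sizes. The function names and the sample values (an uplink loss rate of 0.05, overhearing loss rates of 0.3, and 20 packets per generation) are illustrative assumptions, not figures taken from the simulations.

```python
from math import prod


def p_opportunistic(alpha_i, beta_ij, n_packets):
    """Eq. (5.4): every packet reaches the BS or the single user j."""
    return (1 - alpha_i * beta_ij) ** n_packets


def p_proposed(alpha_i, betas_i, n_packets):
    """Eq. (5.5): every packet reaches the BS or at least one other user.

    betas_i holds the overhearing loss rates from user i to all other users.
    """
    return (1 - alpha_i * prod(betas_i)) ** n_packets


if __name__ == "__main__":
    alpha, beta, n_packets = 0.05, 0.3, 20   # hypothetical values
    for num_users in (3, 4, 6):
        betas = [beta] * (num_users - 1)
        print(num_users,
              round(p_opportunistic(alpha, beta, n_packets), 4),
              round(p_proposed(alpha, betas, n_packets), 4))
```

With equal overhearing loss rates, the opportunistic value stays fixed as users are added while the proposed value moves toward 1, which mirrors the observation in the text.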
5.3.2 Effect of Unequal Path Loss in Overhearing Channels

It was assumed in Sec. 5.2 that packet erasure comes from the path loss of LOS wireless channels. For a given modulation type, the path loss model [3] is

$$PL = 130.62 + 37.6 \log_{10} R, \tag{5.6}$$

where $R$ is the physical distance in kilometers. The packet erasure rate (PER) in each overhearing channel depends on the relative locations of users. A different $R$ value results in an unequal PER. For simplicity, we assume that PERs in the uplink/downlink channels are the same, which implies that all users are located on the perimeter of a circle centered at the BS, as shown in Fig. 5.4, with $\alpha_i = \alpha_j$, where $1 \le i, j \le U$, $i \ne j$.

Figure 5.4: Illustration of relative locations among users: (a) the unequal path loss case and (b) the equal path loss case.

For the packet loss rate in overhearing channels, we use $\mu = E(\beta)$ and $\sigma^2 = \mathrm{var}(\beta)$ to denote its mean and variance, respectively. We have $\sigma^2 = 0$ and $\sigma^2 > 0$ for the equal and unequal path loss scenarios, respectively. The variance $\sigma^2$ has an effect on the downlink bandwidth and the received video quality, as detailed below.

Bandwidth consumption in the downlink channel

When $\sigma^2 = 0$, the amount of data requested by users from the BS is the smallest. Suppose the number of NC packets $S_i$ for user $1 \le i \le U$ is the same; namely, $S_i = S$. Then, we have

$$M = (U-1)S - (1-\mu)(U-1)S = \mu (U-1) S.$$

When $\sigma^2 > 0$, the requested amount becomes larger. Under this scenario, $\beta_{i,j} \le E(\beta)$ or $\beta_{i,j} \ge E(\beta)$. On one hand, user $i$ receives packets from other users with a higher probability when $\beta_{i,j} < E(\beta)$ and, as a result, he/she issues a smaller $m_i$ for fewer packets from the BS. On the other hand, user $i$ receives packets from other users with a lower probability when $\beta_{i,j} > E(\beta)$ and, consequently, issues a bigger $m_i$ for more packets from the BS.
The BS has to consider all requests from users and meet the maximum request. Therefore, the bandwidth consumption in the downlink channel increases.

Quality of received video

When $\sigma^2 > 0$, consider a user $i$ that has a smaller PER in several overhearing channels (i.e., $\beta_{i,j} \le E(\beta)$). A small value of $\prod_{k \ne j} \beta_{j,k}$ results in a large value of (5.5). The probability of achieving the full rank in these subspaces is higher, so video packets sent by those users are received and decoded with a higher probability. In contrast, for users that have one or several overhearing channels with a larger PER value, the probability of achieving the full rank in their subspaces is lower and the corresponding video quality is lower, too.

To give an example of received video quality, we consider a case with $U = 3$. The quality of the video streams received by user 1 from the other two users is evaluated when $\beta_{1,2} = \beta_{2,1} = 0.1$, $\beta_{1,3} = \beta_{3,1} = 0.5$, and $\beta_{3,2} = \beta_{2,3} = 0.3$. Let $I_{1,2}$ be the subspace for user 1 to collect NC packets from user 2, and $I_{1,3}$ be the subspace for user 1 to collect NC packets from user 3. The probabilities of achieving the full rank in $I_{1,2}$ and $I_{1,3}$ are $P_{m,1,2} = (1 - \alpha \cdot 0.1 \cdot 0.3)^{S_2}$ and $P_{m,1,3} = (1 - \alpha \cdot 0.5 \cdot 0.3)^{S_3}$, respectively. Since $P_{m,1,2} > P_{m,1,3}$, user 1 receives video packets from user 2 with higher quality and video packets from user 3 with lower quality.

To conclude, the value of $\sigma^2$ has an effect on the downlink bandwidth and the received video quality. To improve the system performance, dummy users may be inserted in the wireless network intentionally. The function of dummy users is to act as overhearing relays for the NC packets among users in a video conferencing session. No source video bitstream is generated by them. An example of inserting a dummy user is shown in Fig. 5.5. Its location should be close to all other users. Meanwhile, its introduction results in a smaller $\sigma^2$.
Thus, it can relay more NC packets from user 1 and user 2 to user 3. Consequently, user 3 can issue a smaller request, $m_3$, to the BS and the downlink bandwidth can be reduced. Moreover, video packets from user 1 and user 2 are received by user 3 with a higher probability since they are transmitted over more overhearing channels. Received video quality can be improved accordingly.

Figure 5.5: Illustration of inserting a dummy user to improve the overall system performance.

5.3.3 Effect of NC Generation Period

The period of one NC generation, $T$, also has an impact on the performance of the proposed NC method in terms of video quality, decoding delay, packet overhead and bandwidth consumption. The video quality is related to the probability of achieving the full rank of the NC decoding matrix as calculated in Eq. (3.1.3). Parameter $T$ has an impact on $N$, and $T$ should not be too large. For decoding delay, if $T > 1/\mathrm{fps}$, where fps is the number of frames per second, the decoding delay is equal to $2T$. Furthermore, as discussed in Sec. 5.2, if $T \le 1/\mathrm{fps}$, the decoding delay is $2/\mathrm{fps}$ since the display of a video frame can start only after the whole frame is decoded. For packet overhead and bandwidth consumption, parameter $T$ should strike a balance between the following two factors.

Overhead of NC coefficients

NC coefficients are transmitted as packet headers. For given video bit streams, a larger $T$ value results in a larger number of packets within one generation, so more NC coefficients have to be included in each packet header. For example, if there are 25 packets per generation, the packet length is 500 bytes, and NC coefficients are selected from the finite field $GF(2^8)$, the coefficient overhead is 5%. If there are 50 packets per generation, the overhead is 10%. Therefore, to reduce the overhead of NC coefficients, a small $T$ value is preferred.
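The two $T$-dependent quantities discussed above can be checked with a few lines of Python. This is only a sketch: it assumes one coefficient per packet of the generation in every header and measures the overhead relative to the packet length, which reproduces the 5% and 10% figures quoted in the text; the function names are chosen for this example.

```python
def coefficient_overhead(packets_per_generation, packet_bytes, field_bits=8):
    """Fraction of a packet taken by NC coefficients (one per packet in the generation)."""
    header_bytes = packets_per_generation * field_bits / 8
    return header_bytes / packet_bytes


def decoding_delay(period_s, fps):
    """Decoding delay: 2T when T > 1/fps, otherwise 2/fps."""
    return 2 * max(period_s, 1 / fps)


if __name__ == "__main__":
    print(coefficient_overhead(25, 500))   # 25 one-byte coefficients in a 500-byte packet
    print(coefficient_overhead(50, 500))   # doubling the generation size doubles the overhead
    print(decoding_delay(1 / 30, 30))      # T equal to one frame interval at 30 fps
```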
Overhead of zero alignment

When the amount of data in a period of length $T$ is not an exact multiple of the packet length, zero alignment (padding) is needed. As $T$ increases, the number of packets requiring zero alignment increases as well.

5.4 Simulation Results

We simulated the proposed NC method with ns2. According to the analysis in Sec. 5.3, when more users join the video conferencing session, the probability for the GCM to achieve the full rank increases and the received video quality improves. For simplicity, we use the case $U = 3$ to demonstrate the advantage of the proposed NC method. The 4CIF Crew sequence, coded by H.264/AVC at a bit rate of 848 kbps, is stored in the buffer of user 1. Similarly, the coded Soccer (655 kbps) and Foreman (553 kbps) sequences are stored in the buffers of users 2 and 3, respectively. The frame rate is 30 fps, the QP parameter is 35, the GOP size is 8, and the intra period is 8. We adopt frame copying as the error concealment method. The finite field for NC coefficients is set to $GF(2^8)$, and the period of a generation is $T = 1/30$ s $\approx 33.3$ ms.

5.4.1 Scenario with Equal Path Loss

In this subsection, we consider the scenario with equal path loss in overhearing channels. We set $\beta_{i,j} = 0.3$ for $i, j = 1, 2, 3$ and $j \ne i$. We use the averaged PSNR value of all six received sequences as a function of the packet loss rate (PLR), $\alpha$, of the downlink channel to evaluate the system performance in Fig. 5.6. We see that the proposed NC method outperforms the opportunistic NC method by a significant margin when the PLR is higher since the proposed NC method has a more efficient erasure protection capability. We show the bandwidth consumption of the downlink channel as a function of the average packet loss rate, $\alpha$, labeled by "Equal PL" in Fig. 5.7. The total bit rate of the three video bit streams is 2.056 Mbps. If video packets are switched at the BS with the traditional S&F method without NC, the bandwidth consumption over the downlink is 2.056 Mbps.
With the proposed NC method, it is about 1/3 of the total bit rate. This reduction comes from effective overhearing of innovative NC packets from neighboring users.

Figure 5.6: The averaged PSNR performance as a function of the average packet loss rate, $\alpha$, of the downlink channel. (Plot of average PSNR in dB versus average packet loss rate for the opportunistic NC and proposed NC methods.)

Table 5.1: Comparison of the PSNR value (dB) of each user when $\alpha = 5\%$.

Method                        Soccer1  Foreman1  Crew2  Foreman2  Crew3  Soccer3
Opportunistic NC              27.8     28.2      30.1   28.9      29.8   26.4
Proposed NC with equal PL     31.4     33.5      32.7   32.3      31.2   27.4
Proposed NC with unequal PL   32.9     30.1      34.2   31.0      33.1   31.1

Since additional NC packets are sent from the BS to each user to compensate for erased packets through the dynamic request, the throughput increases slightly with $\alpha$. Table 5.1 shows the PSNR value of the received sequences at the three users when $\alpha = 5\%$. Again, the proposed NC method outperforms the opportunistic NC method for every sequence. It provides better erasure protection in the uplink so that the probability of obtaining a full-rank GCM increases, and more packets are decoded to improve video quality.

Figure 5.7: The downlink bandwidth consumption as a function of the average packet loss rate, $\alpha$, of the downlink channel. (Plot of downlink bandwidth consumption in kbps versus average packet loss rate for the equal PL and unequal PL cases.)

Furthermore, we show one reconstructed frame of each of the two received video sequences for user 1 in Figs. 5.8 and 5.9, for user 2 in Figs. 5.10 and 5.11, and for user 3 in Figs. 5.12 and 5.13. The proposed NC method offers image frames of better quality than the opportunistic NC method. For example, the left frame misses some detailed information such as the camera's flash in the background in Fig. 5.12. This problem results from the frame-copy error concealment scheme.
In contrast, the right frame keeps the desired information since the proposed NC method receives more packets and thus recovers video of better quality.

Figure 5.8: Comparison of the decoded 275th frame of the Soccer sequence at user node 1 with $\alpha = 0.05$: (a) opportunistic NC (27.8 dB) and (b) proposed NC (31.4 dB).

Figure 5.9: Comparison of the decoded 22nd frame of the Foreman sequence at user node 1 with $\alpha = 0.05$: (a) opportunistic NC (28.2 dB) and (b) proposed NC (33.5 dB).

Figure 5.10: Comparison of the decoded 83rd frame of the Crew sequence at user node 2 with $\alpha = 0.05$: (a) opportunistic NC (30.1 dB) and (b) proposed NC (32.7 dB).

5.4.2 Scenario with Unequal Path Loss

To simulate the unequal path loss in overhearing channels, we adopt the packet error rate (PER) model without FEC coding (QPSK or 16-QAM) [28,29]. Several parameters obtained from [3] are given below: transmission power $P_t = 23$ dBm, noise density $N_0 = -174$ dBm/Hz, and system bandwidth $W = 10$ MHz. Based on the LOS path loss model [42], we obtain the relation between PER, BER and distance in Table 5.2.

Table 5.2: WiMAX PER and physical distances when $\mu = 0.3$, $\sigma^2 = 0.2$.

PER   BER                    Distance (m)
0.1   $2.1 \times 10^{-4}$   1800
0.3   $7.1 \times 10^{-4}$   2300
0.5   $1.4 \times 10^{-3}$   3000

Figure 5.11: Comparison of the decoded 97th frame of the Foreman sequence at user node 2 with $\alpha = 0.05$: (a) opportunistic NC (28.9 dB) and (b) proposed NC (32.3 dB).

Figure 5.12: Comparison of the decoded 60th frame of the Crew sequence at user node 3 with $\alpha = 0.05$: (a) opportunistic NC (29.8 dB) and (b) proposed NC (31.2 dB).

We compare the PSNR values of the proposed NC method with equal and unequal path losses in Table 5.1. The Soccer and Crew sequences are received with a higher PSNR value under unequal path loss since user 1 and user 2, which send these two sequences, are closer to each other and their overhearing channel has a smaller PER than in the equal path loss scenario.
Since user 3 is linked by an overhearing channel of a higher packet erasure rate, the received Foreman sequence has a smaller PSNR value. Overall, the average PSNR is slightly higher than that of the equal path loss scenario. The downlink bandwidth consumption of the equal and the unequal path loss scenarios is compared in Fig. 5.7. The bandwidth consumption in the unequal path loss scenario increases slightly since the request to the BS from user 3 is larger due to the higher PER in its two overhearing channels.

Figure 5.13: Comparison of the decoded 282nd frame of the Soccer sequence at user node 3 with $\alpha = 0.05$: (a) opportunistic NC (26.4 dB) and (b) proposed NC (27.4 dB).

5.5 Conclusion

The application of wireless multi-party video conferencing is more complex than wireless video streaming such as multicast or broadcast. It demands erasure protection in uplink, downlink and overhearing channels. The opportunistic NC scheme was proposed in the context of multicast or broadcast so as to offer erasure protection in the downlink channel. A novel network coding method with a simple yet robust erasure protection mechanism was proposed for multi-party video conferencing in this chapter. Since the video conferencing application imposes a more stringent time constraint than video streaming, we designed a scheduling scheme to meet the low delay requirement of real-time applications. Simulation results show that the proposed NC method outperforms the opportunistic NC method in terms of received video quality and throughput by a significant margin.

Chapter 6 Conclusion and Future Work

6.1 Summary of the Research

A robust video transmission system using NC was investigated in this research. Several main results obtained in this research are summarized below.

First, we proposed a robust video transmission system using NC in erasure networks. In the design of such a system, we followed the network design principle [43].
That is, the core network should be as simple as possible while complicated applications should be implemented at the edges of the network. Simply speaking, the H.264/SVC video bitstream is partitioned into priority layers and packetized into packets of equal length at the source node. Random linear network coding (RLNC) is performed at intermediate nodes of the network. Video data are decoded with an H.264/SVC decoder equipped with the error concealment (EC) capability at the receiver node. The system performance is evaluated by comparing the quality of the source video bitstream with that of the decoded bitstream.

Then, we pointed out a problem associated with video transmission using RLNC. That is, when the GCM is not of full rank due to packet loss, EC is actually not as efficient as that in the traditional store-and-forward network without NC. To resolve this problem, we proposed a sparse ladder global coding matrix (GCM) for layered H.264/SVC bitstream transmission, where the shape of the sparse matrix is maintained through the RLNC process. The ladder GCM has two functions: 1) to enable partial decoding of a block and 2) to provide unequal erasure protection for H.264/SVC priority layers. Graceful quality degradation was achieved with the assistance of EC. It was shown by computer simulation that the proposed ladder GCM scheme outperforms the traditional dense GCM by a significant margin.

We proposed an interleaving scheme that allows NC and EC to work together effectively. This scheme distributes the impact of one long burst erasure into many short ones spread over adjacent GOPs so that lost packets can be recovered more easily by NC and spatial/temporal EC. Moreover, we discussed many ways to partition a GOP by priority and adopted the H.264/SVC temporal hierarchical B structure in our work. Packets from the same priority level of multiple GOPs form one RLNC generation. Then, unequal erasure protection can be applied to different generations.
The optimal interleaving length and the redundancy assignment are determined by a low-complexity algorithm to achieve graceful video quality degradation. It was shown by simulation results that the interleaving scheme results in better quality than the dense GCM method.

Finally, we studied the problem of applying NC to robust wireless multi-party video conferencing. The main contribution was the adoption of a network coding scheme to enhance robust transmission in all wireless transmission channels, to simplify the erasure protection procedure, and to reduce the downlink bandwidth by leveraging the properties of opportunistic NC and wireless broadcasting. We proposed a pipelining schedule to meet the delay requirement of real-time video conferencing. The proposed NC method outperformed the opportunistic network coding method by a significant margin in terms of video quality and downlink bandwidth.

6.2 Future Research Work

6.2.1 Extension to DiffServ Networks

To make our research more complete, we would like to examine the problem of video transmission using NC over differentiated services (DiffServ) networks. In DiffServ networks, packets of different priorities may experience different loss rates and delays according to their service class. In Chapters 3 and 4, packets sent from the source node are associated with different priority levels. Different packet priority levels can be mapped to different service classes provided by a DiffServ network. NC-based video delivery in DiffServ networks is illustrated in Fig. 6.1. In the DiffServ network, the DiffServ policy is applied at the gate routers. Each router in the network is configured to differentiate traffic based on its service class. Packets are placed into different physical or virtual queues accordingly. When RLNC is applied at the routers, only packets stored in the same queue can be linearly combined. Thus, packets of the same generation should not be marked with different service classes or buffered in different queues.
We should determine a proper scheme for transmitting NC packets in the DiffServ network. We may compare the following two ideas.

Figure 6.1: NC-based video delivery in DiffServ networks. (Block diagram: the source performs SVC encoding with prioritized layers, packetization and NC encoding; a QoS-enabled network with network coding on every node gives a different average end-to-end packet loss rate $p$ to each service class; the receiver performs buffering, NC decoding, reordering, error concealment and SVC decoding, followed by PSNR evaluation; an optimizer and a timer are also shown.)

1. The proposed sparse GCM in Chapter 3 differentiates packets of the same generation. When packets generated by the sparse GCM are mapped to different service classes (namely, a generation can be marked with many different service classes), they may be buffered in different queues.

2. The interleaving scheme in Chapter 4 differentiates priorities among generations. Packets of the same generation are mapped to the same DiffServ service class and buffered in the same queue.

It would be difficult to design a buffer policy at routers for the first scheme, so it may not be suitable for the DiffServ network. In contrast, the second scheme provides a good way to generate priority packets in DiffServ networks.

It is also a challenging task to obtain the optimal video quality under a given packet loss rate, a fixed delay amount and a DiffServ network architecture. Several options under our control include the design of the mapping from priority packets to service classes and the determination of the optimal redundancy assigned to various priority packets.

6.2.2 Deployment of Dummy Users

To tackle the unequal path loss problem over overhearing channels, we may adopt two strategies in deploying dummy users.

The first strategy is to distribute dummy users in the WiMAX coverage area in advance. A simple solution is to distribute dummy users evenly. However, the locations of dummy users may then not be optimally arranged for the best performance.
A more complicated solution is to estimate the probability distribution function of the locations of video conferencing users and calculate the optimal locations of dummy users accordingly. However, the estimation may not be accurate, which is especially true for mobile users.

The second strategy is to deploy dummy users dynamically. The BS maintains a registry table that records WiMAX mobile stations volunteering to be dummy users. Once a video conferencing session is established, the BS can get the locations of users, calculate the optimal dummy user locations, and select the best available volunteers to act as dummy users.

It is worthwhile to study these strategies and evaluate their performance in the future.

Bibliography

[1] "ETSI DVB TM-CBMS1167, IP datacast over DVB-H: Content delivery protocols," Sept. 2005, draft technical specification, http://www.dvb.org.
[2] "3GPP TSG-SA WG4 S4-AHP238, Specification text for systematic Raptor forward error correction," PSM SWG, Sophia Antipolis, France, Apr 2005.
[3] "Draft IEEE 802.16m evaluation methodology," IEEE 802.16m-07/037r2, IEEE 802.16 Broadband Wireless Access Working Group, Dec 2007.
[4] S. Acedanski, S. Deb, M. Médard, and R. Koetter, "How good is random linear coding based distributed networked storage?" NetCod, 2005.
[5] R. Ahlswede, N. Cai, S.-Y. R. Li, and R. W. Yeung, "Network information flow," Information Theory, IEEE Transactions on, vol. 46, pp. 1204–1216, 2000.
[6] A. Albanese, J. Blömer, J. Edmonds, M. Luby, and M. Sudan, "Priority encoding transmission," IEEE Transactions on Information Theory, Nov 1996.
[7] J. W. Byers, M. Luby, M. Mitzenmacher, and A. Rege, "A digital fountain approach to reliable distribution of bulk data," in SIGCOMM, 1998, pp. 56–67.
[8] N. Cai and R. W. Yeung, "Network error correction, part II: Lower bounds," Communications in Information and Systems, 2006.
[9] P. Chou, Y. Wu, and K. Jain, "Practical network coding," 41st Allerton Conf. Communication, Control and Computing, Oct 2003.
[10] Cisco, "Telepresence network, http://www.cisco.com."
[11] S. Deb, C. Choutte, M. Médard, and R. Koetter, "Data harvesting: A random coding approach to rapid dissemination and efficient storage of data," IEEE INFOCOM 2005, Mar 2005.
[12] C. Gkantsidis and P. R. Rodriguez, "Network coding for large scale content distribution," INFOCOM, vol. 4, pp. 2235–2245, 2005.
[13] D. Gomez-Barquero and A. Bria, "Forward error correction for file delivery in DVB-H," Vehicular Technology Conference, 2007. VTC2007-Spring. IEEE 65th, Apr 2007.
[14] J. Goshi, R. Ladner, E. Riskin, A. Mohr, and A. Lippman, "Unequal loss protection for H.263 compressed video," in DCC '03: Proceedings of the Conference on Data Compression. Washington, DC, USA: IEEE Computer Society, 2003, p. 73.
[15] T. Ho, M. Médard, R. Koetter, D. Karger, M. Effros, J. Shi, and B. Leong, "A random linear network coding approach to multicast," Information Theory, IEEE Transactions on, Mar 2005.
[16] T. Ho, M. Médard, J. Shi, M. Effros, and D. Karger, "On randomized network coding," Proceedings of 41st Annual Allerton Conference on Communication, Control, and Computing, Oct 2003.
[17] H. Seferoglu and A. Markopoulou, "Opportunistic network coding for video streaming over wireless," Proc. of Packet Video, Nov 2007.
[18] S. Jaggi, P. Sanders, P. Chou, M. Effros, S. Egner, K. Jain, and L. Tolhuizen, "Polynomial time algorithms for multicast network code construction," IEEE Trans. Inform. Theory, Jul 2003.
[19] X. Kai, Z. Feng, P. Purvin, and B. Jill, "Frame loss error concealment for SVC," Journal of Zhejiang University Science A, vol. 5, 2006.
[20] S. Karande, K. Misra, and H. Radha, "Clix: Network coding and cross layer information exchange of wireless video," Image Processing, 2006 IEEE International Conference on, Oct 2006.
[21] S. Katti, H. Rahul, W. Hu, D. Katabi, M. Médard, and J. Crowcroft, "XORs in the Air: Practical Wireless Network Coding," IEEE/ACM Transactions on Networking, vol. 16, no. 3, pp.
497{510, June 2008. [22] J.-G. Kim, J.-W. Kim, and C.-C. J. Kuo, \Integration of adaptive intra refresh (AIR) and unequal error protection (UEP) with a corruption model for robust video trans- mission," Conference on Video Technologies for Multimedia Applications, SPIE's International Symposium on the Convergence of Information Technologies and Com- munications, Aug 2001. [23] J.-G. Kim, J. Kim, and C.-C. J. Kuo, \Coordinated packet level protection employ- ing source and channel redundancy for robust video transmission," Conference on Visual Communications and Image Processing (VCIP), Part of the Symposium on Electronic Imaging, Jan 2001. [24] R. Koetter and M. Medard, \An algebraic approach to network coding," Networking, IEEE/ACM Transactions on, pp. 782 { 795, Oct 2003. [25] S. Li, Y. R., R. W. Yeung, and N. Cai, \Linear network coding," Information Theory, IEEE Transactions on, vol. 49, pp. 371{381, 2003. [26] Y. J. Liang, J. G. Apostolopoulos, and B. Girod, \Model-based delay-distortion op- timization for video streaming using packet interleaving," Proceedings 36th Asilomar Conference on Signals, Systems and Computers, Nov 2002. [27] L. Lima, M. Mdard, and J. Barros, \Random linear network coding: A free cipher?" IEEE International Symposium on Information Theory, Jun 2007. 111 [28] Q. Liu, S. Member, S. Zhou, and G. B. Giannakis, \Queuing with adaptive modu- lation and coding over wireless links: cross-layer analysis and design," IEEE Trans. Wireless Commun, vol. 4, pp. 1142{1153, 2005. [29] Q. Liu, S. Zhou, and G. Giannakis, \Cross-layer combining of adaptive modulation and coding with truncated arq over wireless links," Wireless Communications, IEEE Transactions on, Sep 2004. [30] M. Luby, M. Watson, T. Gasiba, T. Stockhammer, and W. Xu, \Raptor codes for re- liable download delivery in wireless broadcast systems," Consumer Communications and Networking Conference, 2006. CCNC 2006, vol. 1, pp. 192 { 197, Jan 2006. [31] M. Luby, M. Watson, T. Gasiba, and T. 
Stockhammer, \Mobile data broadcasting over MBMS tradeos in forward error correction," Proceedings of the 5th interna- tional conference on Mobile and ubiquitous multimedia, 2006. [32] D. Lun, M. Mdard, and M. Eros, \On coding for reliable communication over packet networks," Proc. 42nd Annual Allerton Conference on Communication, Control, and Computing, September-October 2004. [33] D. Lun, M. Mdard, R. Koetter, and M. Eros, \Further results on coding for reliable communication over packet networks," Proc. 2005 IEEE International Symposium on Information Theory (ISIT 2005), pp. 1848{1852, Sep 2005. [34] D. Lun, P. Pakzad, C. Fragouli, M. Mdard, and R. Koetter, \An analysis of nite- memory random linear coding on packet streams," Proc. 4th International Sym- posium on Modeling and Optimization in Mobile, Ad Hoc and Wireless Networks (WiOpt '06), Apr 2006. [35] M.Luby, \LT codes," The 43rd Annual IEEE Symposium on Foundations of Com- puter Science, 2002. [36] A. Mohr, E. Riskin, and R. Ladner, \Unequal loss protection: Graceful degradation of image quality over packet erasure channels through forward error correction," JSAC special issue on Error-Resilient Image and Video Transmission, 1999. [37] D. Nguyen, T. Nguyen, and X. Yang, \Multimedia wireless transmission with net- work coding," Proc. of Packet Video, Nov 2007. [38] K. Nybom and D. Vukobratovic, \A survey on application layer forward error correc- tion codes for IP datacasting in DVB-H," EUROPEAN COOPERATION IN THE FIELD OF SCIENTIFIC AND TECHNICAL RESEARCH, Sep 2007. [39] V. N. Padmanabhan, H. J. Wang, and P. A. Chou, \Resilient peer-to-peer stream- ing," in ICNP '03: Proceedings of the 11th IEEE International Conference on Net- work Protocols, 2003, p. 16. [40] P. Pakzad, C. Fragouli, and A. Shokrollahi, \Coding schemes for line networks," ISIT, Aug 2005. 112 [41] W. H. Press, B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling, Numerical Recipes in C: The Art of Scientic Computing. 
CAMBRIDGE UNIVERSITY PRESS, 1992. [42] S. Sa-e, M. Chamchoy, and S. Promwong, \Study on propagation path loss and ber performance for xed broadband wimax," Communications, Asia-Pacic Conference on, pp. 289{292, 2007. [43] J. H. Saltzer, D. P. Reed, and D. D. Clark, \End-to-end arguments in system design," Second International Conference on Distributed Computing Systems, pp. 509{512, Apr 1981. [44] T. Schierl, H. Schwarz, D. Marpe, and T. Wiegand, \Wireless broadcasting using the scalable extension of H.264/AVC," Proc. ICME, Jul 2005. [45] T. Schierl, K. Gnger, C. Hellge, T. Stockhammer, and T. Wiegand, \Svc-based multi source streaming for robust video transmission in mobile ad-hoc networks," IEEE International Conference on Image Processing (ICIP'06), Oct 2006. [46] H. Schwarz, D. Marpe, and T. Wiegand, \Overview of the scalable video coding extension of the H.264/AVC standard," IEEE Transactions on Circuits and Systems for Video Technology, Special Issue on Scalable Video Coding, vol. 17, no. 9, Sep 2007. [47] A. Shokrollahi, \Raptor codes," Information Theory, IEEE Transactions on, vol. 52, pp. 2551 { 2567, Jun 2006. [48] D. Silva and F. Kschischang, \Rank-metric codes for priority encoding transmission with network coding," 10th Canadian Workshop on Information Theory, Jun 2007. [49] D. Silva and F. R. Kschischang, \A rank-metric approach to error control in random network coding," IEEE Canadian Workshop on Information Theory, Jul 2007. [50] A. Tourapis, F. Wu, and S. Li, \Direct mode coding for bipredictive slices in the H.264 standard," Circuits and Systems for Video Technology, IEEE Transactions on, Jan 2005. [51] M. van der Schaar and H. Radha, \Unequal packet loss resilience for ne-granular- scalability video," IEEE Transactions on Multimedia, Dec 2001. [52] J. M. Walsh and S. Weber, \A concatenated network coding scheme for multime- dia transmission," Fourth Workshop on Network Coding, Theory, and Applications (Netcod 2008), Jan 2008. [53] W. Wei and A. 
Zakhor, \Multiple tree video multicast over wireless ad hoc networks," Circuits and Systems for Video Technology, IEEE Transactions on, vol. 17, pp. 2{15, Jan 2007. [54] Y. Wu and P. A. Chou, \Network coding for the internet and wireless networks," IEEE Signal Processing Magazine, 2007. 113 [55] Y. Wu, P. A. Chou, and S.-Y. Kung, \Information exchange in wireless networks with network coding and physical-layer broadcast," Proc. 39th Annual Conference on Information Sciences and Systems (CISS), Mar 2005. [56] Y. Wu, P. A. Chou, Q. Zhang, K. Jain, W. Zhu, and S.-Y. Kung, \Network planning in wireless ad hoc networks: a cross-layer approach," IEEE Journal on Selected Areas in Communications, special issue on wireless ad hoc networks, Jan 2005. [57] Y. Wu, P. Chou, and S.-Y. Kung, \Minimum-energy multicast in mobile ad hoc networks using network coding," IEEE Trans. on Communications, Nov 2005. [58] Y. Wu, \Network coding for multicasting," Princeton University PhD Thesis, Nov 2005. [59] R. W. Yeung and N. Cai, \Network error correction, part I: Basic concepts and upper bounds," Communications in Information and Systems, 2006. [60] Z. Zhang, \Linear network error correction codes in packet networks," IEEE Trans. on Inform. Theory, vol. 54, Jan 2008. [61] X. Zhu, S. Han, and B. Girod, \Congestion-aware rate allocation for multipath video streaming," Proc. IEEE International Conference on Image Processing,(ICIP-04), pp. 2547{2550, Oct 2004. 114
Abstract
To transmit video effectively through erasure networks, it is essential to reduce the impact of packet loss on the quality of the received video. This research investigates robust video transmission in erasure networks using network coding (NC). NC theory, a recent development, allows video packets to be encoded at intermediate nodes to improve throughput. We consider an NC-based video delivery system that performs random linear network coding (RLNC) at intermediate nodes in erasure networks. RLNC linearly combines a group of packets using weighting coefficients selected at random from a finite field in a distributed way. The loss of an RLNC-coded packet is equivalent to the loss of one dimension in the constrained system of equations required for RLNC decoding. Unless the global network coding coefficient matrix (or simply the global coding matrix) is of full rank, not all source packets can be recovered by network decoding. Three innovative schemes are proposed and analyzed to address the problem of robust video transmission in erasure networks.
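The full-rank condition described in the abstract can be illustrated with a minimal sketch. It is not the system studied in the thesis: the prime field GF(257), the function names `rlnc_encode` and `rlnc_decode`, and the single-encoder setup are assumptions made only for illustration (practical RLNC systems commonly work over GF(2^8), and coding also happens at intermediate nodes). Each coded packet carries its coefficient vector, and decoding is Gauss-Jordan elimination that fails when the received coefficient matrix has rank below the number of source packets.

```python
import random

P = 257  # assumed prime field size, chosen only to keep the arithmetic simple


def rlnc_encode(packets, n_coded, rng):
    """Return n_coded (coefficients, payload) pairs.

    Each payload is a random linear combination of the source packets mod P.
    Source symbols must be integers in the range 0..P-1.
    """
    k = len(packets)
    length = len(packets[0])
    coded = []
    for _ in range(n_coded):
        coeffs = [rng.randrange(P) for _ in range(k)]
        payload = [sum(c * pkt[j] for c, pkt in zip(coeffs, packets)) % P
                   for j in range(length)]
        coded.append((coeffs, payload))
    return coded


def rlnc_decode(coded, k):
    """Gauss-Jordan elimination mod P.

    Returns the k source packets, or None when the received coefficient
    matrix has rank below k (too many coded packets were erased).
    """
    rows = [list(c) + list(p) for c, p in coded]
    rank = 0
    for col in range(k):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0),
                     None)
        if pivot is None:
            return None  # one dimension is missing: not full rank
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], P - 2, P)  # inverse via Fermat's little theorem
        rows[rank] = [(x * inv) % P for x in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col] != 0:
                f = rows[r][col]
                rows[r] = [(x - f * y) % P for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return [rows[i][k:] for i in range(k)]
```

In this sketch, sending more coded packets than source packets gives the receiver spare dimensions, so a few erasures can be tolerated; once fewer than `k` linearly independent packets arrive, `rlnc_decode` returns `None` and no source packet is guaranteed to be recoverable.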
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
Algorithms for scalable and network-adaptive video coding and transmission
Distributed source coding for image and video applications
Structured codes in network information theory
Advanced techniques for high fidelity video coding
Focus mismatch compensation and complexity reduction techniques for multiview video coding
Robust routing and energy management in wireless sensor networks
Efficient coding techniques for high definition video
PerFEC: perceptually sensitive forward error control
Compression of signal on graphs with the application to image and video coding
Efficient data collection in wireless sensor networks: modeling and algorithms
Graph-based models and transforms for signal/data processing with applications to video coding
Efficient reachability query evaluation in large spatiotemporal contact networks
Robust representation and recognition of actions in video
Elements of robustness and optimal control for infrastructure networks
Optimal distributed algorithms for scheduling and load balancing in wireless networks
Improve cellular performance with minimal infrastructure changes
Modeling and predicting with spatial‐temporal social networks
Scheduling and resource allocation with incomplete information in wireless networks
Modeling intermittently connected vehicular networks
Algorithmic aspects of energy efficient transmission in multihop cooperative wireless networks
Asset Metadata
Creator
Wang, Hui
(author)
Core Title
Robust video transmission in erasure networks with network coding
School
Viterbi School of Engineering
Degree
Doctor of Philosophy
Degree Program
Electrical Engineering
Publication Date
06/16/2009
Defense Date
04/15/2009
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
H.264/SVC, network coding, OAI-PMH Harvest, robust transmission, unequal erasure protection (UEP), video streaming, wireless video transmission
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Kuo, C.-C. Jay (committee chair), Ortega, Antonio (committee member), Shahabi, Cyrus (committee member)
Creator Email
graciawang@gmail.com, wanghui@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-m2307
Unique identifier
UC1495771
Identifier
etd-Wang-2899 (filename), usctheses-m40 (legacy collection record id), usctheses-c127-241460 (legacy record id), usctheses-m2307 (legacy record id)
Legacy Identifier
etd-Wang-2899.pdf
Dmrecord
241460
Document Type
Dissertation
Rights
Wang, Hui
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Repository Name
Libraries, University of Southern California
Repository Location
Los Angeles, California
Repository Email
cisadmin@lib.usc.edu