DATA COMPRESSION AND DETECTION

by

Guang-Cai Zhou

A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(ELECTRICAL ENGINEERING)

December 2001

Copyright 2001 Guang-Cai Zhou

Dedication

This dissertation is dedicated to my wife Zhi-Hua Zhang and my son Victor W. Zhou.

Acknowledgements

I would like to express my most profound gratitude to my Ph.D. thesis advisor, Professor Zhen Zhang, for his patience, encouragement, inspiration and guidance during my study at USC. I have benefited greatly from Prof. Zhen Zhang's extensive knowledge and invaluable experience in research, and I will benefit from these in my career and research forever. I would also like to express my deepest appreciation to him for his generous financial support, which made my study at USC possible.

I wish to thank Prof. Vijay Kumar, Prof. Keith Chugg and Prof. C.-C.
Kuo for their excellent lecturing during my study and for kindly serving on my qualifying exam committee. I will also take this chance to express my thanks to Prof. Peter Baxendale for kindly serving on both my qualifying exam committee and my dissertation committee. My thanks also go to Prof. Antonio Ortega for serving as a member of my dissertation committee.

I wish to thank all my friends for their friendship and fruitful discussions. In particular, I would like to thank my officemates, Dr. Jeng-Hong Chen, Tungsheng Lin and Alan Choy, for their technical and casual chats. My thanks especially go to Mr. Yuankai Wang, Mr. Jin Yang, Dr. Ruhua He, Mr. Lei Zhuge, Mr. Yaxin Cao, Mr. Gregory O. Dubney and Mr. Terry Lewis for their warm friendship and kind help whenever I needed it. My thanks are also extended to Dr. Hamid Sadjadpour for his friendship when I was at the AT&T Shannon Laboratory in Spring 2001. Finally, I would like to express my deepest appreciation to Ms. Milly Montenegro and Ms. Mayumi Thrasher for their support and administrative assistance.

Contents

Dedication
Acknowledgements
List Of Tables
List Of Figures
Abstract
1 Introduction
1.1 Motivation and Research Focuses
1.2 Organization of the Dissertation
2 The Redundancy of Trellis Lossy Source Coding
2.1 Introduction
2.2 Main Results
2.3 Types and d-Ball Covering
2.4 Encoding by a Modified M-Algorithm
2.5 Proof of Theorem 2.1
2.6 An Example: Uniformly Distributed Source with Hamming Distortion Measure
2.7 Conclusions
3 Synchronization Recovery of Prefix-free Codes
3.1 Introduction
3.2 Mean Error Propagation Length of Prefix-free Codes
3.3 Conditions for Robustness
3.4 Algorithms for Finding Prefix-free Codes with Short MEPL
3.4.1 Fixed Order Method
3.4.2 Maximum Gain Method
3.4.3 M-Algorithms using Partial MEPL
3.5 Performance Comparisons
3.6 Conclusions
4 Synchronization of Variable-Length Codes in Hamming Distance
4.1 Synchronization Measurements
4.2 Index Assignment of Synchronization Codes
4.3 Improving the Synchronization Probability
4.4 Simulation Results
4.5 Conclusions
5 Robustness of Prefix-free Codes for Finite State Models
5.1 A Finite State Coding Scheme
5.2 Mean Error Propagation Lengths for FSM
5.3 Index Assignment for FSM
5.4 Conclusions
6 Robust Index Assignment for Finite State Vector Quantization
6.1 Introduction and Background
6.2 Finite State Vector Quantization
6.3 Channel Distortion for FSVQ
6.4 Robustness of FSVQ
6.5 Index Assignments for Short ACD
6.6 Numerical Examples
6.7 Conclusions
7 FSVQ with Minimum Channel Distortion Detection for Noisy Channels
7.1 Introduction and Background
7.2 Joint Source and Channel Detection
7.2.1 Minimum Channel Distortion Paths for FSVQ
7.2.2 Minimum Channel Distortion Detection for VQ
7.2.3 Adaptive Models
7.3 Predictive VQ and FSVQ
7.4 Simulation Results
7.5 Conclusions
8 Multiple Description Trellis-Coded Vector Quantization
8.1 Introduction
8.2 The MD-TCVQ Scheme
8.2.1 Spatial Separation
8.2.2 Temporal Separation
8.2.3 MSTCQ
8.3 A Lower Bound for Multiple Description
8.4 Simulation Results
8.4.1 SMDTCQ
8.4.2 TMDTCQ
8.4.3 Asymptotic Performance
8.5 Discussions and Conclusions
9 Discussions and Future Work
Reference List
Appendix A Proofs of Theorems for Chapter 4
Appendix B Proofs of Chapter 5
Appendix C Proofs of Theorems for Chapter 8

List Of Tables

3.1 A Five Letter Source and Codewords
3.2 MEPL for Rudner's 4 Sources
3.3 A 26 English Letter Source with Space
3.4 Escott and Perkins' Code 4
3.5 A 26 English Letter Source without Space
3.6 The Source of 26 Motion Vectors
4.1 A Five Letter Source
4.2 A Nine Letter Source
4.3 Synchronization Probability of Motion Vectors
8.1 MSTCQ Performance, 1 b/Sample, dim = 4, 8
8.2 Losing Patterns, Weight, MSE and PSNR
8.3 Losing Patterns, Weight, MSE and PSNR
8.4 MSE and SQNR for 4, 8, 64 States
List Of Figures

3.1 S-I by F-O
3.2 S-I by M-G
3.3 S-I by M-I
3.4 S-I by M-II
3.5 S-I by Rudner
3.6 S-II by F-O
3.7 S-II by M-G
3.8 S-II by M-I
3.9 S-II by M-II
3.10 S-II by Rudner
3.11 S-III by F-O
3.12 S-III by M-G
3.13 S-III by M-I
3.14 S-III by M-II
3.15 S-III by Rudner
3.16 S-IV by F-O
3.17 S-IV by M-G
3.18 S-IV by M-I
3.19 S-IV by M-II
3.20 S-IV by Rudner
4.1 Synchronization Probability
4.2 CMEPL
4.3 Synchronization Probability
4.4 CMEPL
7.1 Predictive FSVQ
7.2 Channel and Total Distortion Gains
7.3 Channel and Total Distortion Gains
7.4 Channel and Total Distortion Gains
7.5 Channel and Total Distortion Gains
7.6 Channel and Total Distortion Gains
7.7 Channel and Total Distortion Gains
7.8 Original Lena
7.9 Quantized Lena W/O Noise
7.10 Separated Q/D
7.11 Jointed Q/D
7.12 Weighted Joint Q/D
7.13 Normalized Joint Q/D
7.14 Channel and Total Distortion Gains
7.15 Channel and Total Distortion Gains
7.16 Channel and Total Distortion Gains
7.17 Original Lena
7.18 Quantized Lena Without Noise
7.19 Separated Q/D
7.20 Jointed Q/D
7.21 Weighted Jointed Q/D
7.22 Normalized Joint Q/D
7.23 Channel and Total Distortion Gains
7.24 Channel and Total Distortion Gains
7.25 Channel and Total Distortion Gains
8.1 Simulation Results for (4, 2, 0, 1)
8.2 Simulation Results for R = 1, R1 = R2 = 1/2
8.3 Simulation Results for R = 2, R1 = R2 = 1
8.4 Original Lena
8.5 Central Quantizer
8.6 Package 1 lost
8.7 Package 2 lost
8.8 Package 3 lost
8.9 Package 4 lost
8.10 Pattern 0000
8.11 Pattern 1000
8.12 Pattern 0100
8.13 Pattern 0010
8.14 Pattern 0001
8.15 Pattern 1100
8.16 Pattern 1010
8.17 Pattern 1001
8.18 Pattern 0110
8.19 Pattern 0101
8.20 Pattern 0011
8.21 Pattern 1110
8.22 Pattern 1101
8.23 Pattern 1011
8.24 Pattern 0111
8.25 Pattern 0000
8.26 Pattern 1000
8.27 Pattern 0100
8.28 Pattern 0010
8.29 Pattern 0001
8.30 Pattern 1100
8.31 Pattern 1010
8.32 Pattern 1001
8.33 Pattern 0110
8.34 Pattern 0101
8.35 Pattern 0011
8.36 Pattern 1110
8.37 Pattern 1101
8.38 Pattern 1011
8.39 Pattern 0111
8.40 Simulation Results for R = 2, R1 = R2 = 1

Abstract

This dissertation contains two main topics: the redundancy of trellis coded quantization, and robust coding and detection for lossless and lossy data compression.

For a fully connected trellis coded quantizer with codeword block length $n$, it is proved that the redundancy of this TCQ is of order $\frac{1}{n}$, while for block coding the redundancy is of order $\frac{\ln n}{n}$ when the block length is $n$. The results theoretically explain why TCQ outperforms block codes subject to the same storage and computational complexity.

For lossless source coding, the robustness of Huffman codes is systematically studied. The robustness measures MEPL and VEPL for optimal prefix codes are introduced, and compact formulas for them are conceived. In the Hamming distance sense, robustness is measured by the synchronization probability and by the conditional MEPL and VEPL. Based on these measures, three coding index assignment methods for Huffman codes are presented. Extensive examples show that these algorithms are very effective. In order to significantly improve the synchronization probability, a simple coding method is presented. The robustness concepts are also extended to finite state models.

For finite state vector quantizers, robust index assignment is investigated. It can be considered an extension of the pseudo-Gray codes that have been extensively studied for memoryless sources and quantizers. When joint source coding and channel detection are considered, a detection method based on the objective of minimum channel distortion is addressed. It can be generalized to predictive TCQ, even though generally only a suboptimal path can be achieved.

In this thesis, both SMDTCVQ and TMDTCVQ are investigated. To estimate the performance of MDC, a lower bound is provided for the case where there are more than two side descriptions in an MD scheme.

Chapter 1

Introduction

1.1 Motivation and Research Focuses

This thesis includes the following topics:

1. The redundancy of trellis coded vector quantization;
2. The robustness of variable length codes for memoryless and memory cases;
3. The robustness of finite state vector quantizers;
4. Joint quantization and detection;
5. Multiple description trellis coded vector quantization.

Topic 1 discusses the advantages of trellis coded vector quantization, and topics 2-5 discuss the robustness of lossless and lossy source coding.

It is well known that trellis lossy source codes have a better performance/complexity trade-off than block codes, as shown by simulations. This makes the trellis coding technique attractive in practice.
To get a better understanding of this fact, this thesis studies the redundancy of trellis coding for memoryless sources and compares it with a similar result for block codes. It was known that for block codes of block length $n$, the $n$-th order distortion redundancy $\mathcal{D}_n(R)$ at fixed rate $R > 0$ equals
$$-\frac{\partial d(p,R)}{\partial R}\,\frac{\ln n}{2n} + o\!\left(\frac{\ln n}{n}\right),$$
where $\frac{\partial d(p,R)}{\partial R}$ is the partial derivative of $d(p,R)$, the distortion-rate function of a source $p$, evaluated at $R$ and assumed to exist. Since $e^{nR}$, the number of codewords of the block code, can be used as an approximate measure of both the storage complexity $C_s$ of the code and the computational complexity $C_c$ per source symbol for full-search encoding, the redundancy can be written as a function of the complexity measures (substituting $n = \ln C_s / R$ and $n = \ln C_c / R$, respectively) in the form
$$D(R, C_s) \approx -\frac{\partial d(p,R)}{\partial R}\,\frac{R \ln\ln C_s}{2 \ln C_s} \quad\text{and}\quad D(R, C_c) \approx -\frac{\partial d(p,R)}{\partial R}\,\frac{R \ln\ln C_c}{2 \ln C_c}.$$
In this thesis, it is demonstrated that for a particular trellis lossy source code with storage complexity $C_s = e^{2nR}$ and computational complexity $C_c = e^{2nR}$ (assuming the Viterbi algorithm is used for encoding), the distortion redundancy satisfies
$$\mathcal{D}_n(R) \le \frac{c(p,R)}{n},$$
where $c(p,R)$ is a constant independent of $n$. For this code the complexity/redundancy trade-off can be written as
$$D(R, C_s) \le \frac{2R\,c(p,R)}{\ln C_s} \quad\text{and}\quad D(R, C_c) \le \frac{2R\,c(p,R)}{\ln C_c},$$
which shows that trellis coding improves the redundancy/complexity trade-off over block coding by roughly a factor of $\ln\ln C_c$.

The synchronization recovery property of variable-length (VL) codes has been extensively studied. In this thesis, the mean error propagation length (MEPL) and the variance of error propagation length (VEPL), which are secondary performance criteria of a VL code (the first performance criterion is optimality in the entropy sense), are introduced to measure the synchronization recovery capability of a VL code. For the same probability distribution, there exist many different VL codes which have the same redundancy as the Huffman code but quite different MEPLs and VEPLs. Finding the VL code with the minimum MEPL is a very difficult problem. We present four design algorithms for finding minimum redundancy VL codes with short MEPL and VEPL. The first two algorithms are simple and have the property that the codewords are assigned one by one. The efficiency of the algorithms is tested extensively by comparing them with known construction methods available in the literature. In fact, VL codes obtained by the two algorithms outperform almost all available codes.

The synchronization property for lossless source coding can be characterized further and systematically. For variable-length to variable-length codes, fixed-length to variable-length codes and variable-length to fixed-length codes, it is known that these codes have the synchronization property in two senses: 1) self-recovery from wrong parsing in the insertion-deletion distance sense; 2) self-recovery from wrong parsing in the Hamming distance sense. In general, well-designed optimal prefix-free codes can recover with probability 1 in insertion-deletion distance, but have only a positive probability of recovering from parsing errors in the Hamming distance sense. In this thesis we present several formulas to characterize the synchronization ability in the above two senses. Simulation results show that the coding algorithms provided in this thesis (see also [102]) are robust in both synchronization senses.
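To make the MEPL and VEPL definitions concrete, here is a minimal Python sketch (ours, not from the dissertation) that estimates both quantities for a hypothetical five-letter prefix-free code by direct simulation: flip one bit in a long encoded stream, decode, and count how many symbols pass before the corrupted parse realigns with the correct one.

```python
import random

# Hypothetical five-letter source and prefix-free code (illustration only).
CODE = {'a': '0', 'b': '10', 'c': '110', 'd': '1110', 'e': '1111'}
PROBS = [0.4, 0.3, 0.15, 0.1, 0.05]
SYMBOLS = list(CODE)
WORDS = set(CODE.values())

def boundaries(bits):
    """Bit positions where a codeword ends under greedy prefix parsing."""
    ends, cur = [], ''
    for pos, b in enumerate(bits):
        cur += b
        if cur in WORDS:      # prefix-free, so greedy parsing is unambiguous
            ends.append(pos + 1)
            cur = ''
    return ends

def one_trial(n_symbols=2000):
    """Flip one random bit; count source symbols decoded between the flip
    and the first bit position where both parses share a codeword boundary."""
    src = random.choices(SYMBOLS, weights=PROBS, k=n_symbols)
    bits = ''.join(CODE[s] for s in src)
    i = random.randrange(len(bits))
    bad = bits[:i] + ('1' if bits[i] == '0' else '0') + bits[i + 1:]
    good_ends = boundaries(bits)
    common = set(good_ends) & set(boundaries(bad))
    sync = min((p for p in common if p > i), default=len(bits))
    start = max((p for p in good_ends if p <= i), default=0)
    return sum(1 for p in good_ends if start < p <= sync)

samples = [one_trial() for _ in range(2000)]
mepl = sum(samples) / len(samples)
vepl = sum((x - mepl) ** 2 for x in samples) / len(samples)
print(f"estimated MEPL = {mepl:.2f}, VEPL = {vepl:.2f}")
```

The dissertation computes these quantities exactly via closed-form expressions rather than by sampling; the simulation is only meant to pin down what is being measured.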
Since a high synchronization probability is expected in practice, in this thesis we present a simple and nearly costless method to improve the synchronization probability of Huffman codes, the optimal prefix-free codes, in the Hamming distance sense. We also sketch the mathematical theory supporting this method.

The robustness of optimal prefix-free codes for memoryless sources has been intensively studied, as mentioned. In this thesis we conceive two formulas to calculate the mean and the variance of the error propagation length of an optimal prefix-free code for a finite state source model. These quantities are good measures of the robustness of the code. Based on the formulas, we suggest some intuitive rules for designing a robust code for a finite state model. To the best of our knowledge, our work is original on this topic.

In this thesis we also conceive a formula to calculate the average channel distortion of a finite state machine when bit errors occur. Based on this formula, we give some conditions for a finite state machine to be robust. In order to reduce the average channel distortion of a finite state machine, we present two methods of assigning the index of each codevector.

Multiple description coding is one of the efficient methods for combating packet loss. In this thesis we introduce multiple description schemes for trellis coded vector quantizers. For these schemes, the Viterbi algorithm provides an optimal path for encoding, and the design procedure utilizes the generalized Lloyd algorithm for suboptimal codebooks. The schemes are extensions of [40, 42] in both the temporal and spatial senses. We also conceive a lower bound on the joint descriptions when there are more than two descriptions of a source. This lower bound is an extension of the well-known achievable bound for the memoryless Gaussian source presented by Ozarow [60].

1.2 Organization of the Dissertation

This dissertation consists of nine chapters and is organized as follows. In the second chapter, the redundancy of trellis lossy source coding for a special trellis structure is discussed.
We conceive the formula calculating the synchronization probability, the MEPL and VEPL conditional on the synchronization in Hamming distance. Extensive examples show that the coding algorithms provided in the pre vious chapter are also very robust in Hamming distance sense. In Chapter 5. we present a simple and nearly costless method to improve the synchronization probability for Huffman codes in Hamming distance sense. We also roughly explain the mathematical theory supporting this method. Simulation results 6 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. show that the method can significantly improve the synchronization probability while the conditional MEPL and VEPL may increase proportionally In Chapter 6. we extend the synchronization properties of memoryless source to sources o f finite state model. We can also introduce MEPL and VEPL in insertion- deletion distance sense, synchronization probability and conditional MEPL. VEPL in Hamming distance sense. In Chapter 7. we discuss the robustness of finite state vector quantizer (FSVQ) and introduce a measure of robustness—channel distortion for FSVQ. Based on this measure, we can assign the index for each codevector of each state codebook such that when bit error occurs, the distortion can be as small as possible on average. In Chapter 8. we consider the jointed source and channel detection such that when the channel distortion is considered as an objective for a Viterbi algorithm, the resulted channel distortion and hence the total distortion can be as small as possible. Simulation results show that the jointed detection can significantly reduce the channel distortion. In Chapter !). we introduced a generic multiple description scheme for TCQ. Simulation results show that with lower complexity. MDTCVQ can achieve much better performance than traditional scalar MD. In the last chapter of this thesis, we make a conclusion on this thesis and we also outline some future topics related to robustness such as space-time codes. Some open problems related to source coding and wireless communications are also addressed. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Chapter 2 The Redundancy of Trellis Lossy Source Coding 2.1 Introduction For an information source, it is well known that the performance of optimal lossy codes is characterized by the rate-distortion function of the source which gives the minimum number of bits per symbol needed to convey in order to reproduce the source with an average distortion that does not exceed a specified level. The tradi tional rate distortion theory, however, says neither the convergence rate at which the performance of block codes approaches its theoretical lim it as the block length goes to infinity nor anything about the coding complexity which is a key factor in the real world. I'he redundancy of a lossy source code defined as the difference of the perfor mance of the code and the theoretical lim it— rate-distortion function is a parameter that characterizes the convergence behavior of lossy source codes. This parameter has recently been studied intensively ([61. 90. 50. 91. 95. 96. 81. 59. -13. 29j). 8 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. The problem was originally proposed by R..J. Pile [61]. He studied for the first time the so-called distortion redundancy of block lossy source codes. 
For a memo- rvless source p with finite source and reproduction alphabets coded at a fixed rate, he proved that the n-th order distortion redundancy is upper bounded by (— (I -f t)jgd(p. 1+ o ( 1)) and argued that it is lower bounded by (--^d(p. )(1 + o (l)) where -y^d(p. R) is the derivative of the distortion rate function d(p. R) with respect to R. and n is the block length. Unfortunately, his proof for the lower bound turns out to be wrong. This problem was solved by Z. Zhang. E. Vang and V. Wei [96] where the authors proved that the distortion redundancy for block codes of block length n at a fixed rate R is equal to In the same paper, the rate-redundaney for the so-called D-semifaithful block codes is also studied. It is shown that for memoryless source p. the rate redundancy of D-semifaithful codes of block length n is upper bounded by in n / in ii -f o ----- n \ n and lower bounded bv Inn /Inn' ^ ° ----- in \ n 9 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Block codes are codes without structures. In practice, block codes are encoded either by the full search method or by some improved search methods. It is commonly believed that for an arbitrary block code without any structure, the best possible encoding method has an exponential computational complexity t cn for some c > 0 per source symbol where n is the block length of the code. Let C\ denote the storage complexity per source symbol, which is the size of memory space required to store the codebook. When the rate is R in nats. Cs = t nH where n is the block length. Let C'c denote the computational complexity per source symbol which is defined as the number of code letters checked per source symbol during the search. Then C., = cnR when full search encoding is used. The rate redundancy of optimal block codes of block length n can be converted as two functions with C's and C, as variables, respectively, which are R In In C's /'lu lu C s vcf/j.C J < + o 2 In C's \ ln£'s / and R In In C, ( In In Cc \ h T c r + n ^ ) ' Similar expressions can also be obtained for distortion redundancy. To improve the redundancy/complexity trade-ofT of lossy source coding, it is natural to consider codes with structure encoded by certain well designed encoding algorithms. For instance, codes with trellis structure encoded by Viterbi algorithm 10 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. or by .\/-aIgorithm are studied extensively in literature, as well as tree structured codes encoded either by the tree-search method or by some other improved search methods. The redundancy problem for codes with structure has been studied in some special cases. Chou. EfFros and Gray [11] have studied two stage VQ. Their definition of redundancy is different from ours. Their redundancy is actually the redundancy of a universal two stage code minus the redundancy of an optimal block code of the same block length. That is. their redundancy is the redundancy over the optimal code of the same block length instead of over the optimal performance theoretically attainable (OPTA). In [SS]. the redundancy of Gold-Washing algorithm has been studied by 'tang and Zhang. The redundancy problem for codes with trellis structure has been studied by Viterbi and Omura [SI]. Their main results are restated as Theorem 2.2 in this chapter. 
In the chapter, we consider codes with a special trellis structure and prove that such codes have much lower redundancy than block codes with similar coding com plexity. The chapter is organized as follows. In Section 2.2. we present the main result of the chapter. In Section 2.3. we discuss types and rf-ball covering which will be used throughout the chapter. In Section 2.-1. we explain a suboptimal encoding scheme—the "modified M-algorithm" which is used to prove the main result. In Section 2.o we prove Theorem 2.1—the main result of the chapter. In Section 2.6 for a uniformly distributed source with Hamming distance measure, we explicitly calculate the redundancy. 11 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 2.2 M ain R esults Let A and B be two nonempty finite alphabets. A and B w ill serve as the source alphabet and the reproduction alphabet, respectively. W ithout loss of generality, we shall assume A = { 1. • • •. J} and B = { 1 . ■ • •. f\} . where ./ and f\ are the respective cardinalities of A and B . If x = (x<) is a finite or infinite sequence of symbols from A or B. let = (xm. x m+i. • • •. x n) and. for simplicity, w rite .r" as J'ri- shall denote the set of all n-tuples drawn from A (B ) by A n(B ri). Throughout this chapter, an information source is assumed to be a sequence of independently and identically distributed (i.i.cl) random variables with finite alphabet A. Let p be the generic probability distribution of the source {.Yf }/![. that is. Pr(.Y, = j ) = pj for any j € A and t > 0. Let p : A x B [0 .x ) be a single letter distortion measure. Without loss of generality [13], we shall assume that m inp (j-.y) = 0 for any j- € A. y€B Denote by R(p.d) the rate distortion function of the source { . Y , } ^ with respect to the single letter fidelity criterion {pn } generated by p. where pn : X n x B " [0. dc ) is a map in which pn(sn. yn) = « -1 H iLi f°r any .rr‘ E A n and ij'L t B ri. From rate distortion theory [6]. it is well-known that R(p.d) is given by the following minimization problem: R(p.d) = inf f(p:Q) . (2.1) QeQu 12 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. where h p -Q) = E p>Q‘ < — ) * • '• 7 (Qk\j)j*K is a m atrix of transitional probabilities . Qk = ^P jQ k\: ■ and Qi - {(Qk\j)j*K '■ ^,PjQk\jP[j-k) < <t} • (--3) j.k (Assume that the set Q,i in (2.3) is nonempty.) The inverse of R(p.d) is the distor tion rate function of the source {.Y( which is denoted by <l(p. R). Throughout the chapter, the rate distortion function R(p.d) and the coding rates are measured in terms of nats. L'nlcss otherwise specified, the functions “ In'* and "exp" are through out to base r. Let (Qly)jxi< be an optimal transitional probability matrix which achieves the minimum in (2.1). For each k € B. let Ql = Y.PjQI\j - j=i Q~ w ill be called an optimal distribution on B associated with p and < /. From [6j. we have the following facts: (1 ) If Q'k\j = Q for some j € A. then Q ~k = 0: 13 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. where s is the slope of R{p. d) at point d whenever 0 < d < dmax = miruG B Tljta Pjpij• k)- W ithout loss of generality, we shall assume that Qm k > 0 for any k € B. Otherwise some letters can be deleted from B. Throughout the chapter, for simplicity, the optimal distribution Q' on B shall be assumed to be unique. A block code C ri of order n is simply a subset of B a. 
The block code C „ of order n is said to operate at rate level R i f |C.,| < enIi. (Here and henceforth. |5| denotes the cardinality of 5 if 5 is a finite set.) When the block code C n is used to encode the source the resulting average distortion is defined to be /?n(C„) = Epn(X n. C, where for each xn € A ". f)n(xn.C n) = min pn(xn. c) . t-*€ C r i The performance of the block code C n is measured by the distortion redundancy of the code which is defined to be Dn(C n) = pn(C fl) - </(p. R) . l-l Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. From rate distortion theory [6] [36]. it follows that Dn(C n) is nonnegative. Define the n-th order distortion redundancy of fixed-rate coding of the source p as follows T>n(R) = m in {D „(C n)|C „ is a block code of order n operating at rate level R} . (2.5) One of the the main results in [96] is the determination of the n-th order distortion redundancy V n(R) for sufficiently large n. The following proposition was proved in [96]. Proposition 2.1 Let R > 0. Assume that the optimal distribution Qm on B asso ciated with p and d(p. R) is unique. Then ~ , r.. $ , hi n ( In n \ Dn(R) = ~ — d(p. R)—— + o l . (2.6) cJR In \ n J where e)d{p. R)/i)R is the derivatirt of d(p. R) with respect to R. This proposition was proved in [96] based on Lemma 3 of [96]. In this chapter, the above theorem is improved as the following proposition. Proposition 2.2 Let R > 0. Assume that the optimal distribution Qm on B asso ciated with p and d(p. R) is unique. Then = + i2.r> oR In n where dd(p. R)/c)R is the deriratire of d(p. R) with respect to R. 15 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. The proof of this proposition is based on Lemma 2.3 of this chapter which is an improvement of Lemma 3 o f [96]. R e m a rk 2.1 Under the assumption that Q" is unique and Qm {k) > 0 for any k 6 B . it can be proved that d(p. R) is at least second order differentiable in a small neigh borhood of (p. /?). We w ill use this property repeatedly in the following sections. It was shown in [34. 3o. 37] that trellis coding systems have the potentiality for nearly optimal performance based on Shannon theoretic arguments. Their re sult says, the lim iting performance of optimal trellis lossy source codes equals the distortion-rate function for a broad class of sources. This is a fundamental result for the analysis of trellis source coding. But this result does not explain why the perfor mance of trellis coding is superior to that of block codes because the performance of optimal block codes also approaches the distortion-rate function as the block length goes to infinity. In this chapter, we will prove that for a particular class of trellis source codes with a special block structure of block length n. the redundancy is only -p-. a factor Inn better than block codes having roughly the same computational and storage complexity per source symbol. This particular family of finite state block source codes is defined as follows [31]: The input space is .V = A n where A is the source alphabet. That is. the inputs are source words of block length n. The encoder output space is U = { I. • • •. .V} where .V = [ ”e'lW ]. the smallest integer greater than or equal to cnR with R to be the rate of the code. The state space S = U and the next state function is /{■ ■ > . u) = u where .s is the current state and 16 Reproduced with permission of the copyright owner. 
Further reproduction prohibited without permission. u is the current encoder output. The trellis for such a next state function is fully connected. The decoding function is tim e dependent. At time k > L it is given as Ck : x U -> B n where B is the reproduction alphabet. For s g S. Ck s = {c-fc (.s.U) : u £ U ) is the state codebook of state .s at tim e k and ck = U cj seS is the super codebook of the finite state coding system at time k. We denote such a coding system by C„. where the subscript n indicates that the block length of source words and codewords is n. The encoding of such a trellis system can be done either by Yiterbi algorithm which always finds the optimal path or by the A/-algorithni which usually finds only a "good" path that may not be optimal. A path is a sequence of states starting from the initial state s0 at time 0 leading to the encoding state .s/. at time L. For a given path, by tracing the path through the trellis [31]. and by invoking the decoding functions, a reproduction sequence con sisting of concatenated codewords can be fully determined, which together with the 17 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. input source data, gives a path distortion for this particular path. The performance of such a coding system is measured as follows. For a source input consisting of suc cessive L source words of block length n. let a be the path selected by the encoding algorithm. Associated with the path a we have a sequence of codewords C | .c-i.---.ci. Then the path distortion for the input source data x nL = {.r'‘( L). • • •. j-'“( L )} is given by L X > u - n( o .r () f=t where j-n(t) is the f-th input source word. Since different encoding algorithms may select different paths for a given sequence of source data, this quantity depends on the encoding algorithm. If the the Viterbi algorithm is used as an encoding algorithm, then the selected path a is the optimal path, and the corresponding path distortion is a minimum. In this case, we define 1 L l \ ( C n. L ) = E i - ^ p j x ^ n . c , ) } . '=i where the expected value E is taken with respect to the distribution of the input source. The distortion redundancy per source symbol is then defined as V n(p.R.L) = inf(D n(Cn. L) - cl{p. R)) 18 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. where the infimum is taken over all possible finite state codes with the specific structure defined above, having block length n and rate R. Define T>n(p. R) = infX>n(p. R. L). (2.S) T h e o re m 2.1 Let R > 0. Assume that the optimal distribution Q* over B that achieves the distortion-rate function d(p. R) is unique. Then for any L > Inn. P ll(/>.fl.£)<— + o ( - ) C2.9) n \n. where c(p. R) > 0 /s a constant dependiny only on p and R. Therefore V n(P.R) < + o ( ~ ) . (2.10) n \ n , In the proof of this theorem, we are going to show that such a redundancy can be achieved not only by the optim al encoding algorithm —Viterbi algorithm—but also by a modified .\/-algorithm with M = e3^ for some constant 3 > 0 which has much lower computational complexity than Viterbi algorithm. In this chapter, we are only interested in the order of the redundancy and hence we may take A — I. Let the time variant finite state code consist of L super codebooks. Then the storage complexity and computational complexity per source symbol for such a coding system is given as 19 Reproduced with permission of the copyright owner. 
The trellis for such a next state function is fully connected. The decoding function is time dependent. At time $k \ge 1$ it is given as $c_k : S \times U \to B^n$, where $B$ is the reproduction alphabet. For $s \in S$,
$$C_k^s = \{ c_k(s, u) : u \in U \}$$
is the state codebook of state $s$ at time $k$, and
$$C^k = \bigcup_{s \in S} C_k^s$$
is the super codebook of the finite state coding system at time $k$. We denote such a coding system by $\mathbf{C}_n$, where the subscript $n$ indicates that the block length of source words and codewords is $n$. The encoding of such a trellis system can be done either by the Viterbi algorithm, which always finds the optimal path, or by the M-algorithm, which usually finds only a "good" path that may not be optimal. A path is a sequence of states starting from the initial state $s_0$ at time $0$ and leading to the encoding state $s_L$ at time $L$.

For a given path, by tracing the path through the trellis [31] and invoking the decoding functions, a reproduction sequence consisting of concatenated codewords can be fully determined, which, together with the input source data, gives a path distortion for this particular path. The performance of such a coding system is measured as follows. For a source input consisting of $L$ successive source words of block length $n$, let $a$ be the path selected by the encoding algorithm. Associated with the path $a$ we have a sequence of codewords $c_1, c_2, \dots, c_L$. Then the path distortion for the input source data $x^{nL} = \{x^n(1), \dots, x^n(L)\}$ is given by
$$\sum_{t=1}^{L} \rho_n(x^n(t), c_t),$$
where $x^n(t)$ is the $t$-th input source word. Since different encoding algorithms may select different paths for a given sequence of source data, this quantity depends on the encoding algorithm. If the Viterbi algorithm is used as the encoding algorithm, then the selected path $a$ is the optimal path, and the corresponding path distortion is a minimum. In this case, we define
$$D_n(\mathbf{C}_n, L) = E\left\{ \frac{1}{L}\sum_{t=1}^{L} \rho_n(X^n(t), c_t) \right\},$$
where the expected value $E$ is taken with respect to the distribution of the input source. The distortion redundancy per source symbol is then defined as
$$\mathcal{D}_n(p, R, L) = \inf\left( D_n(\mathbf{C}_n, L) - d(p, R) \right),$$
where the infimum is taken over all possible finite state codes with the specific structure defined above, having block length $n$ and rate $R$. Define
$$\mathcal{D}_n(p, R) = \inf_{L} \mathcal{D}_n(p, R, L). \qquad (2.8)$$

Theorem 2.1 Let $R > 0$. Assume that the optimal distribution $Q^*$ over $B$ that achieves the distortion-rate function $d(p,R)$ is unique. Then for any $L > \ln n$,
$$\mathcal{D}_n(p, R, L) \le \frac{c(p,R)}{n} + o\!\left(\frac{1}{n}\right), \qquad (2.9)$$
where $c(p,R) > 0$ is a constant depending only on $p$ and $R$. Therefore
$$\mathcal{D}_n(p, R) \le \frac{c(p,R)}{n} + o\!\left(\frac{1}{n}\right). \qquad (2.10)$$

In the proof of this theorem, we are going to show that such a redundancy can be achieved not only by the optimal encoding algorithm, the Viterbi algorithm, but also by a modified M-algorithm with $M = e^{\beta n}$ for some constant $\beta > 0$, which has much lower computational complexity than the Viterbi algorithm.
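To make the encoding procedure concrete, here is a small self-contained Python sketch (ours; the random codebooks and the toy parameters $n$, $N$, $L$ are hypothetical stand-ins for the construction above, where $N = \lceil e^{nR}\rceil$). It runs the exact Viterbi search over the fully connected trellis induced by $f(s,u) = u$, and optionally prunes to the best $M$ survivors per stage, which is the M-algorithm just mentioned.

```python
import random

random.seed(0)
n, N, L = 4, 8, 6        # block length, number of states/outputs, stages (toy sizes)
B = (0, 1)               # binary source and reproduction alphabets

def rho_n(x, c):
    """Average per-letter Hamming distortion between two blocks."""
    return sum(a != b for a, b in zip(x, c)) / n

# Time-varying decoding functions c_k(s, u): a random codeword for each
# (stage k, state s, output u).  With next-state function f(s, u) = u,
# every state can reach every state: the trellis is fully connected.
codebook = [[[tuple(random.choice(B) for _ in range(n)) for u in range(N)]
             for s in range(N)]
            for k in range(L)]

def trellis_encode(source_words, M=None):
    """Minimum-distortion path search.  M=None is the exact Viterbi search;
    a finite M keeps only the best M survivors per stage (M-algorithm)."""
    cost, paths = {0: 0.0}, {0: []}        # start in the initial state s_0 = 0
    for k, x in enumerate(source_words):
        new_cost, new_paths = {}, {}
        for s in cost:
            for u in range(N):             # emitting u moves the encoder to state u
                d = cost[s] + rho_n(x, codebook[k][s][u])
                if u not in new_cost or d < new_cost[u]:
                    new_cost[u], new_paths[u] = d, paths[s] + [u]
        if M is not None:                  # M-algorithm pruning step
            keep = sorted(new_cost, key=new_cost.get)[:M]
            new_cost = {s: new_cost[s] for s in keep}
            new_paths = {s: new_paths[s] for s in keep}
        cost, paths = new_cost, new_paths
    best = min(cost, key=cost.get)
    return cost[best] / L, paths[best]     # per-word distortion and output path

source = [tuple(random.choice(B) for _ in range(n)) for _ in range(L)]
print(trellis_encode(source))              # exact Viterbi search
print(trellis_encode(source, M=2))         # pruned M-algorithm search
```

Each stage of the full search examines all $N^2 = e^{2nR}$ state transitions, which is where the complexity figures given next come from.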
In this chapter, we are only interested in the order of the redundancy, and hence we may take $A = 1$. Let the time-variant finite state code consist of $L$ super codebooks. Then the storage complexity and the computational complexity per source symbol for such a coding system are given as
$$C_s = C_c = e^{2nR} L.$$
In this chapter, $\ln n < L \ll n$. Therefore the redundancy/complexity trade-off is given by
$$\mathcal{D}(p, R, C_s) \le \frac{2R\,c(p,R)}{\ln C_s} + o\!\left(\frac{1}{\ln C_s}\right)$$
and
$$\mathcal{D}(p, R, C_c) \le \frac{2R\,c(p,R)}{\ln C_c} + o\!\left(\frac{1}{\ln C_c}\right),$$
which is apparently better than fully searched block codes.

Remark 2.2 As pointed out by one of the referees, the proposed method is nevertheless a way to map a string of symbols of size $nL$ to another string of the same length. So it is a particular case of block codes with block length $nL$, and hence, by Proposition 2.2, the redundancy is at least of the order $\frac{\ln(nL)}{2nL}$, which equals $\frac{1}{2n} + o\!\left(\frac{1}{n}\right)$ when we set $L = \ln n$. Hence our result shows that fully connected trellis source codes can achieve this lower bound even though the complexity has been significantly reduced.

Here, we should specially point out the result by Viterbi and Omura. In [81] they proved the following theorem.
S(r.t.d) m ay be em pty. Therefore, we define r.t) = min{r/ : \S{r.t.d)\ > 0}. On the other hand, when d is large enough. S(r.t.d) may consists of all joint dis tributions s on A x B so that .s has a marginal r on B and a marginal t on A and hence hfjr.t.d) equals a constant ll{t) H(r). Therefore, it is convenient to define rAn.ix('‘-0 = sup{r/ : H ,jr.t.d) < H{t) + ff(r)}. We have the following results [96] L em m a 2.1 The Junctions 11, j r . t. d) and li(r.t.d ) finer thr following pro pi rties: I. Hu(r.t.d) is a concave function of r.t.d and for any fired t. I j r . t. el) as a function of r and d is convex: J. Hu(r.t.d) is upper semi-continuous in its domain and Ijr.t.d ) is fourr semi- continuous in its domain: ■ J . for fixed r.t. l l u(r.t.d ) /s a strictly increasing function in [dmujr. t). dmax(r. t)\: J. R{t.d) — m f{ !jr . t. c/)|r € "PIB)}. where R(t.d) is the rate distortion function of the distribution t. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Lem m a 2.2 For any point (r0.tQ .d0). where r0 £ ^ (B ) satisfies r0(k) > 0 for any k € B. t0 € V(A) satisfies t0(j) > 0 for any j £ A . and d0 £ (dmin(ro. t0). elm AX (r0. t0)). there exists a small neighborhood X(ru.t0.d0) of (rQ . tu. dQ ). where A (rn.t0.d0) = {(r./.r/)|r € V(B).t £ 'P(A). | | r - r 0|| < a. | | f - / 0|| < er.k\d-d0\ < o\ (2.2:5) for some a > 0. such that (I) Hn(r.t.d) is at least second order differentiable in A'(r0. tu. d0): (J) d llu( r. t. el)/ Oil = a > 0; (d) dUu(r.t.d)/i)d is bounded in A (r0./0.</0). that is. there exist two positive real numbers by and / > ■ > such that for any (r.t.d) £ A (ru. tu, e /0) . The following lemma is a sharpened version of Lemma 3 of [90], Lem m a 2.3 For sufficiently large n and for any (r.t.d) £ A (ru. t0. do). where r ( 2 .2 -1) and t are n-types. there exist two constants Ci and c>. which depend only on tin neiyhborhood but are independent of n. such that ( 2 .2 ')) Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. and In Fn(r. t.d) > nHu(r,t.d) — ntl(r) — — In n -j- c2. (2.26) Proof: Fn(r. t.d) = T T T ( ^ ) s€J?r,.d)keB W ( l- ^ ) . M 2 .k ).....n s (J .k )j = y e x p {0 (l) - — } Inn + nH{$) - riH(r)}. (2.27) s€^n(r.r.af) Let S tlr.l.J ) = |.s e £>.(.,/) : k - ~ < j|s - ,-| < - L J and Sk(r.t.d) = { - € S(r. t.d) : < ||, - s'|| < A j . where s‘ is the element in S(r. t. d) achieving the maxinuun of //(*) over s 6 S{ r.t.d) [96]. In the proof, for simplicity, we use to denote S'l.(r.t.d) and Sh to denote Sk(r.t.d). Then 1 ,1 " K I — Fn{r.t.d ) = y y e x p { 0 ( l) ----------------Inn + n//(.s) - ;///(;•)} fc=l stS: + y y exp{0( 1) - — In n + n//(.s) - nfl(r)} t= lnn+ l,e5'” = A + 1 2 (2.2S) 29 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. where / t and /2 represent the two terms, respectively. By (94) in [96]. A ff(s ) = H(s) - His') = - £ (A ? ,0 ' n )J + 0 ( ||A » f) + aidis) - d) j.k * (J'1 *) where A.s = s — s*. d{s) = J2j.k H j- k)p(j' k) and q — > g. Hence when ia - - ’ 11 > % 3 where c > 0 is a constant depending only on s’ . Therefore ^ = E E c xp {n [//u(r ./.d ) - / / ( ; • ) ] + n A //(s) - (/V'7 ^ h‘ " + 0( 1)} t= :ln ri + 1 ~ < nl u ~[ e xp {n [//„(r. t.d) - fl(r)} - r(ln n )2 + 0 (1 )} < v\p{n[Hu(r.t.d) - //(/•)] - (2.29) when n is large enough. We now estimate l {. = E E exp{ii[[[u(r.t.d) — [ [ (r)] + hA//(.<) - — -—— ---b 0( 1)} k=is&; = exp{n[//u(r. t. 
d) - H(r)} - ( ^ >ln " } x E E cxl> {-» E (A ! r 7 | } + »0(||A.s||3) + na(d{s) - d)} k=lseS£ J.k " S < exp{n[Hu(r.t.d) - H(r)} - — A illL I i} x % ™ p {- J x U - i l j ) } + I exp{»a((/,..) - ,/)}. r >:!0) 40 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Note that Y exp{na{d{s) - rf)} < nhJ~h- J+l f ^ exp{na(d(s) - d)}ds s£S£ •'s€Sjc -iUSitU 'S ic +i since Sk (including 5£) is a shell in a KJ — K — J + [ dimensional space. The region 5a—i U 5t u5a+i is used in the integral instead of the original region 5/t to guarantee that the integral is an upper bound. Let Ji = r/(.s). x>. ■ ■ ■. xkj- k - j+i be a basis (not necessarily orthogonal) of the KJ - K - J + I dimensional Euclidean space. The above integral can be upper bounded by • I [ ‘ [ ' • ' / , e ^ - % h - ld.r2 - - • d.vKJ_K_J+l where the length of the intervals L, are of the form ^4 = y- where pjk) are polynomials of k and A is a constant due to the transform of the basis. Both p jk) and .-I depend on the choice of j-t. • • • .-t k j- k -J+i an(l are independent of n. Hence Y exp{no(r/(.s) - d)} < P{k)niKJ- K~J)/‘ » € •> * where P(k) is a polynomial of k independent of n. Therefore /, < cxp{n[Ifu(r.t.d) - H(r)\ - ' ^ W- } x a V I V max{s‘ (/.y )} , 31 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. = exp{n[Hu(r .t.d) — H(r)\ — — —— + c} (2.31) where c' is a constant independent of n. Combining this with the estimate for / 2. we obtain Fn( r.t.d) < exp{n[//u(r. t. d) - //(r)] - } where C [ is a constant independent of n. In Lemma 3 of [96]. the authors have proved that Fn(r.t.d) > cxp{n[flu(r.t.d) - //(/*)] ~ + c} where c is bounded when {r.t.d) is in a small neighborhood A ( r 0./0. (/0). Hence we may replace c by c> and the lemma is proved. □ Let Ft(d) be the probability distribution function of pn( r n, Y n) subject to the fact that t(r'1) = t and V" is uniformly distributed over Ty(r). By (2.1 L) and Lemma 2.3. Ft(d) = Pr{ pn(xn.yn) < d \ t{yn) = r.t{j-'1) = /} _ Fn{r.Ld) nll{rJj )_iiip.+oli) \Tx(t)\ ‘ ( for (r.t.d) satisfying conditions in Lemma 2.2. Here we do not include r in the notation because, in the following sections, r w ill be a fixed type. It should be pointed out herein that the above distribution function Ft(d) is not a continuous function of d. It is in fact a right continuous step function of d. 32 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Furthermore, if d{ and f/2 are in a small neighborhood of do such that F-(d\) = FtUlz). then |r/[ — do\ is at most 0 ( l/n ) . As proved in [96]. l[(r.t.d) is a differentiable function of d and hence for any fixed d in a small neighborhood of d0. the jum p can be estimated as follows. By the proof of Lemma 2.3. Q£- r < F r ( r / ) < whereq and e\ are two constants independent of n . Let r/_ = sup {el1 : Ft(d') < Ft(d)} and d+ = inf {d' : Ft{d') > Ft(d)}. Then d+ — r/_ = G( l/n). By (2.33). we can de duce that F,(d_) = Ft(d) < Ft(d+) < c(d0)Ft(d.) (2.3-1) where c(du) > 1 is a constant depending 011 e l0 and independent of n. Since Ft(d) is a step function. F f l (-) is a multiple map. In the sequel, we use the following definition F ~1 (.r) = sup{d : 3d' > d. B F,(d) < Ft(d') < .r} when Ft(.v) is not bijective. Therefore we have the following lemma Lem m a 2.4 < F (F f- ‘ (.r)) < .r (2.33) c(d0 when Ft~l(.v) is in a small neighborhood of d0. 33 Reproduced with permission of the copyright owner. 
Further reproduction prohibited without permission. 2.4 Encoding by a M odified M -Algorithm Recall that our finite state code has a very special structure. The state space S and the encoder output space U are both {1. • • •. .V} where .V = R is the rate of the coding system and n is the block length. The next state function / : S x U — > S is given by /(s . u) = u. The encoder inputs are source words xn £ A '1 of block length n and the codewords are also of block length n from B '1 . The decoding function is assumed to be time variant, that is. for each time instance m. there is a decoding function v,n :S xW -> B For .s £ S. let c ;n = { t ’m(.s.u ): u e u } be the state codehook of the state .s at time m. Since the decoder is a labeled transition decoder, in a codeword c m{$.u). the state at time m is .sm = .s and the state at time m + I is .sm +1 = u. That is. i/) can be written as .sm+[). Tlie decoding is done by means of certain search algorithms over the trellis such as Viterbi algorithm and M-algorithm. We now analyze the encoding by Viterbi algorithm. Suppose that the source input x nL = {x'‘( 1 ).•••. .rr ‘ ( L )} is a sequence :U Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. of L source words from A n where xn(m) = Associated with each state sm at time ni. there is a cost function p(xnm.sm) which is defined as follows. Let 7 = (^0 "Si. • • •. .sm) be a path starting from the initial state .s 0 leading to the current state sm at time m. Along the path there is a sequence of codewords (c|.--.cm) where c, = cq— t(.s,— i.. s,). Then the path distortion is The cost function p(.r'im..sm) at time m is defined as the minimum of the path distortions over all paths starting from .s 0 and ending at sm at time m. The Viterbi algorithm gives a simple way to calculate p(j-nm..sm) as Viterbi algorithm is an optim al encoding algorithm, which always finds the optimal path. M-algorithm is a suboptimal encoding method having lower encoding complexity. It is described as follows. Define inductively a state cost function m »}• nun 6 5 vr {pn(xn(m). )} Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. where < S . u .a - is defined as follows. Let p(xnk.$k) be arranged in a non-decreasing order as P(xnk-Skd)) < p(xnk.sk(2)) < < p{xnk. .S fc ( .V )) where s^( 1). sa(2). • • •. st(.V) is a permutation of { I. • • •. .V}. Then S\r.it = {-^-f l).-st(2). • • • ,sk(.\[)}. To analyze the encoding process by these encoding algorithms, we invoke the random coding technique. Assume that the decoders (cq. v >-• ■ •) are constructed randomly as follows: all codewords (.'m(sm.s m+[ ) are selected independently with an uniform distribution from the set T£(Q) when' \\Q — Qm || < This gives a time variant random finite state code. To analyze such a coding system, the main difficulty is the complexity of the distribution function of the state cost function p(.Y,lm.s m) whose randomness comes from the randomness of the codewords as well as the randomness of the source words. The following lemma provides some insight for simplifying the analysis of the encoding. Although this lemma w ill not be used in the proof of Theorem 2.1. we state it here to help the reader to understand our approach. 26 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Lem m a 2.5 Let .Y[. • • •. .V.v be X i.i.d. random variables with continuous generic distribution Fx(d). 
Rearrange the random variables in non-decreasing order as A'i < .V2 < • < -V.v where A'i. .V2. • • •. A'.v is a permutation of X\. • ■ ■. A'.v- Let Tl = F\-( .V,). Then £'r'l' a ' - T for ang X and any i — I. 2. • • • . .V. I he above lemma can be used to guide the design of our "modified M-algorithm" which is believed to work almost as well as the traditional M-algorithm. The “ mod ified M-algorithm" is depicted as follows. As we mentioned earlier, the true dis tribution of the state cost function is very complicated. It is almost impossible to analyze the behavior of the encoding by means of the true distribution. In stead. based on Lemma 1 . if we rank the state cost function p(.ra(m_l).s) in non decreasing order and use only the rank of the cost function in this order in the analysis, the problem becomes tractable. Let be the common conditional distribution of state cost functions at time in given the values of the state cost function at time ni — 1 at all states in S. If the state cost function of a state s ranks number i in the order, then its mean is roughly — i>()- Consider the random variable(assuming that the Viterbi algorithm is in force) p{.rnm. sm) = 37 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. mins{p(j-n^m_1,..s)-(-pa(x rl(rn). When a fixed distribution level d is un der consideration, in order that the state s at tim e rn — 1 is competitive for achieving the minimum in the definition of p{xnm. sm). pn(xn(m). c m(.s. s')) should be roughly at the value such that the factor i in can compensated. This leads to the following so-called modified ,\/-algorithm. Assume that the trellis search depth is L > Inn and the blocked source data is xnL = (xn( 1 ).•••. x a( L )) where xn(i) is the /-th source word of block length n. That is we consider a block code of block length 11L with a special trellis structure. Assume that the initial state is s0 € S. Let the type of the source word x n(i) be The modified A/-algorithm works as follows: 1 . Calculate the distortions p,L (xn{ 1). ri(.s0.s)) for all .s € S. Let. Vs € S. Ysl = Ftl(pn(xn( l).fi(*o..s))). (-J.36) Rearrange V’ s l in non-decreasing order as where s [. ■ • •. s*v is a permutation of 1 . • • •. .V. Lor j = 1. • • •. .1/. s1 are sur vivor states and .s () — > sj are survivor links. AS Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 2. At time m > 2. for any s € S. define K m = j=minw { F t m ( P n ( * n ( m ) . t'm(*? " 1. s ) )) j } . (2.38) Rearrange Y’m in non-decreasing order such that — * s'lt (2.3!)) where s "\ • - •. Sy is a permutation of I. • • •. A’. For j = I. • • •. .1/. if m-1 ' r n — I ' b ')))' rn — I for an index i'J1 1 € { 1.2. • • •. .1/}. then we call — > s'" a survivor link and ; s'" a survivor state. 3. Repeat step 2 until we get survivor states sf l . • • • . s ^ ‘ . -1 . Let s*‘ be the state such that V = "ei 1 jJr'.'V/ c L(s _ J - 1. .s))) j} . (2.-10) This is the final survivor state. If the state s^ 1 achieves the minimum in the definition of YJl- -s^-1 — > ■ is the survivor link. 39 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. •5 . By tracing back from the final survivor state sL to the initial state .s 0 by following the corresponding survivor links, a path starting from the initial state .s 0 at time 0 and ending at sL at time L can be found. The channel symbols corresponding to this path are released as the encoder output. 
Clearly, the above algorithm does not guarantee that in each step the .1/ selected states have the minimum distortions. Hence the above algorithm w ill have perfor mance worse than that of the Viterbi algorithm trellis encoder in general. Since the A/-algorithm tree encoder is only a suboptimal encoder, there is no guarantee that the traditional ,l/-algorithm is better than the modified .U-algorithm. But we believe that the traditional .\/-algorithm should be better in average. We will prove that when the encoding is done by this modified A/-algorithm. the redundancy is at most ' where c(p. R) is a constant depending only on p and R. Hence, for the Viterbi algorithm trellis encoder, the redundancv is also at most L . o ri The following two lemmas play important roles in the analysis of the behavior of the modified .l/-algorithm . In the modified ,l/-algorithm. we take M = jV ^ ]. Lem m a 2.6 Let X(u) be a random variable with distribution function F\(x) where u € V is an element of the sample space V of the unde rliniiuj probability space {V. IF. P}. ft is also assumed that for any .r with F y(x -) f 0. F y(x_) < F.v(.r+ ) < 6 F.v(.r-) for a constant b > 1 . where .r_ = sup {.r' : f\\(.r') < F.y(-r )} and x+ = inf {.F : F\(x') > F\(x)}. Let .V[(»). • • •. X\(u) be X = i.i.d. copies of X (u) . -10 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Fora fixedu. rearrange .Yi(u). ■■■. .V.v(u) in nondecreasing order as A 'i( u ).- .V .v (« ) . Let M = \e'^]. Define , >' l F x ( X M ) e ‘ H , 1 r(u) = max < --------- — . 1 > • 1 = 1 I 2* J Then for I < t < 261n«. Fj(t) = Pr{T(u) < f} > 1 — ce ^ h where c is depending on b. Proof: By definition. F t(I) = Pr{T < t ) = Pr jnmN — < t f v | f> (*.("»> nR 2i < t.i = 2i > 2 / > I P r {*,(«) > / \ y ‘ 1=1 M 2it . nR = * - j > { i = l ^ at most 3/ of .V|(h). • • •. X \(u ) ^ A ,(«) < /* y - l / A/ = ‘-E i=i M i- l E 7=0 s *-E 1=1 i- i E 7 = 0 \ / \ .V \ J / ( \ A' \ J J (l “ W ^ > > 2 i t \ J / _2P_Vn t nfi/ V bcnR) Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. constant \ '-j / Note that the last step is an inequality instead of an equality because the distribution function may be a step function. Since when 1 < / < 2b Inn. ( \ S \ J ) (^ ■ )J 0 “ •V-J .V - j + I 2it ( \ S J - I jTiR 1 - in — fanR > 21 > 2 (& y o ■ ■in v v - j +1 br_nR j for large enough ri. we have < < i- i E j = 0 I \ s \ J I A / - i w-i 2it , nH l- l 'lit \ v - ‘+l brn!t / 2t (2 / f ) - ' ([ 2ti n V . , +1 2 / — 1 (/ — 1 )! btnH <cn 2 (2 i- i ,i— i -it/b — exp{/ + (/ — 1 ) ln(2 f ) — it/b — - In /} — exp{ — i(t/b — 1 — ln(2 /)) - ~ In i — ln(2 f )} where in step (1). we have used the Stirling formula and the following derivation: ■ 1 2 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 2it brnR when n is large enough. Thus ( / \ M i- i .V E E i=i j= 0 ^ j t V 2 it c nR L - — ) benFt) X-j M > 1 - \/ — ^ 2 exp{ - i ( t / b — 1 - ln(2/)) — - hi i - ln(2/)} V " ,=i = 1 - _ L c x p { - / ( / / 6 - ! - I n ( 2 / ) ) } . v - v> Let ti, lie the constant such that when t > tt,. t / b — 1 — ln(2f) > I. 
Then f r i t ) > I - J l ' - W ' e M - U / b - I - I, i(2/))}1 ~ CN|’l 7''/,('/ '' ~ 1 ~ 1 " ^ > .)l V " I - e x p { - ( / / 6 - 1 - ln(2/))} 2 c2 > I - \l ----------- e ~ c — I ■t/b In the last step, we have used the bound I — exp { — M(t/b — 1 — ln(2 /))} < 1 — e.xp{ — ( / / 6 — 1 — ln(2/))} I - e x p { - ( / / 6 - 1 - In(2/))} I c < I — 1/e c — For 1 < t < ti,. it is easy to find a constant c such that Ft(1) > 1 — ce '/b. Take c = max{c. lemma is proved. □ -13 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Lem m a 2.7 Let X\(u). ■ ■ ■. X\j{u) be \ [ = [e ^ ] i.i.d. copies of X(u) with distri bution function Ft{d) defined as (2.32) satisfying constraint(2.3.{) when d is restricted to a small neighborhood ofd0. Let V = rnin;'^, {F,(.Y ,(u))/} and D = Ft~l (Y). Then F d {<1) > c3y/nFt(d) - 0{nFt2{d)) where > 0 /.s a constant depending on e l{). Proof: FD(d) = Pr{ D < d} = Pr{F,-l (Y ) < d) > Pr{Y < F,{d)} = I - P r{\' > Ffd)} — I - P r{F t{.\,( u))i > f't(d). for i = 1. • • • ..!/} Let e l, = sup{<l' : Ft(d') < }. Note that even though i may be very large. d, is very close to e l. In fact, by (2 .. ’Li) we can see that for any / = I. -..IF O < d - e l , = 0 ( ^ ) < 0 ( ^ ) . By (2.3-1) we have Pr{F,(.Y,(u)) < i c(elQ)i 4 -1 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Hence Fo(d) > i = i V c(“o)f = I — exp{53 In Tl - 1=1 \ Ft{d) c{d0)i - > 1 j s r i c(d0)i Let c' > 0 be a constant such that 1 /c(d0) > c'. Then f u I FD{d) > 1 - exp i -c'Ft(d) - t 1=1 ' > I - exp {-c F,(d)(\n M - 1)} = c F,(d)( In .1/ — 1 ) - 0 ( n F t1{d)) > Ft(d) — 0(n Ft2(d)). □ Lemma 2.7 is a key result used in the proof of Theorem 2.1. It will be used to estimate the increment of the state cost function from one stage to the next stage. By certain techniques, this lemma shows that a factor \fn is added to the distribution function Ft(.v) in the minimization procedure defining V 's "‘ (see (2.28)). This factor removes the term in ln[F,(.r)]. Since the term in ln[Ff(.r)] corresponds to the main term - '~fRR- ^ in the redundancy of block codes, the -1 5 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. main term of the redundancy of trellis codes in the chapter becomes the next term in M n [F (x)| = /,(A \.Y ) - ^ + 0 ( l/n ) . i.e.. 0(l/n). 2.5 Proof of Theorem 2.1 From the description of the modified .\/-algorithm. at time in. we have the following random variables. 1. xn(m) : the rn-th random source word: 2 . tm: the random type of x n(m): 3. s"‘ : I < j < M . the survivor states at time in: -I. the survivor links at time in: o. V37 ‘. < Vs 7 , 1 . < • ■ ■ < V/m defined in the description of the algorithm (see (2.36)- (2.39)). Two more sets of random variables are defined as follows. Let D m = F,-1(V m). 5 t m ' 5 > Then D”1 is also a random variable. Define Fom(d) = P r { D < (l\. Since all codewords in a state codebook are uniformly distributed. Fo™(d) is independent of .s and m and hence we simply denote it by Fo(d). Rank D"1 in non-decreasing order. The order should be the same as the order for \'s m. Therefore the order is D ™ m < D”l m < ■ ■ ■ < D?m . I — 2 — ,\t -1 6 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Define f Fp(D?? )e nR Tm = , = n .ia - xu { ------- 51---------l >‘ (241) For any j = I . • • •. .1/ FD( D ^ )e nR Tm > "■ '------. i.e. 
Fd( D™ ) < 2Tm je~nR - j In the following paragraphs, we define three events and show that the total probability of the complements of the three events is at most o( -). Define w = {|km - HI - a\ ~ ~ '■ for 1 < in < I.) ( - . 1 2) for some constant a such that a1 > . / + ! . As proved in [96]. Pr { i|/, - HI > | In this chapter, the search depth L is taken as L = [In n ]. Therefore, we have P r ( £ i) = o ( ! ) n 47 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. To make Lemma 2.7 applicable, the term 0(nF\(d)) in the lemma should be at most -. Define n S2 = { v ; ‘ = < e~nft' 1 J for any m : I < m < L and any j : 1 < j < .1/} (2.-14) where R' < R. To estimate the probability of £>. we observe if £ ■ > occurs, then for any tn. there exist at least .1/ ,s £ S such that jJJ V . P x / < r ~ n f i '■ Lnder the condition that £t occurs. P r { j J}1 '1 1 1 u { Pt n ( X "'^ ' ‘ C n *S"'" ' ' - s * ^ J } - r ~" R >} = P r ^ F tJ p n(xn(m - I) . f,n« - ‘ ..s))) > j ~ j f V/ = s " ' S t ' - s W y ' U , l n ( I 1— ) where the inec|uality (1) for some constant c is due to Lemma 2.-1. Hence the probability that there are less than .1/ .s £ S such that , = ? '1 1 \i { Ft'n ^ X ^ L'm( -18 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. is less than or equal to .U-i E i = 0 enR - i In n + 0 (l| \ e -■ / In n + Q( 1) \ 1 e ) ( 1 — C c -" * ' < c nR enR - M + 1 In n + OH) \ r-nR ~ \ I + l e < e / In n+Q( 1) \ , nfi 1/a .ll M n R e [ ------crnR' J>e - ■ '' + ! ) = O ( - ) . n- Tliis implies P r(£ ;n £ l ) = o(-) n The last event is defined as follows: = { I < Tm < lb Inn for all in : I < rn < L} (2.16) for some constant 6 > 1. where 6 is a constant depending on d and we w ill specify the constant later. By Lemma 2.6 (we will prove that Lemma 2.6 is applicable later), we have P r{T m > 26 In n} < C t~ilan = 0 { —t ). n~ Then. Pr {Tm > 26In n for some m : 1 < in < L) = o( —). n -19 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Therefore P r ( ^ ) = „ ( ! ) . n Finally, we obtain the following result Pr{U?=1£ n = o < ! ) n In the sequel, we assume that the three events £, for i = 1.2. .'I occur simultaneously. That is. we will estimate the average path distortion under the condition that £ = n?=1£, occurs. We denote this conditional average distortion by Dm . The average path distortion is then D < Pr(£')Lpm„ + i r whore plim x denotes the largest possible distortion per letter, i.e.. pm ;ix = tnaXj.gA.j-eB /d-'-- jV I lie per symbol path distortion is then j D < Pr{£L ')pm,ir + jO " If we can prove that j-D~ is of the order O (^). then the theorem is proved. Recall that f FD(D?m )cn /I T™= — v— - 1 1' 50 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. For any rn. we have v / = i . . a/, r, > Fo{D?m)e J -J nR i.e.. Fd ( D ™ m ) < 2Tmje - n R By Lemma 2.7 we have Fd( D?t ) > C ;i( D"l m )s/nFtrn{ D?t ) — 0[nFfm(D?? )) where c3(.) is the constant introduced in Lemma 2.7. Hence c:i(D ” sl n ) ^ F tJD ?n ) - O ( n F f jD ^ ) ) < 2Tm je - n R Since Ftm(D”l„) < the above inequalitv can be rewritten as i n -nR < - n R r :i( D's n m )( v/77 - C) C ;j( D "Jn )S /Tl' Xote that by Lemma 2.-1 FtJD?m) > \Jn/c(F>n sl m) = FtJ Pn(xn(m ).rm( ,m - 1 s,r , where .s^J, — > s'n is a survivor link. Hence we have ; -lc( Dn ain ) T ,n J ( -n R C-.AD: 51 Reproduced with permission of the copyright owner. 
Further reproduction prohibited without permission. Therefore < F.-J + ° C ) ( « ■ ) where c '3(D7i,x) = c^{ D’lm) I c( D'lL). The term 0 {\ln ) in the inequality is due to the J J J fact that Ft~*(d) defined previously are step functions and the jumps at discontinuous points are at most 0 (l/n ). For simplicity, we denote by a the path selected by tlie encoding algorithm which consists of L + I states s™ for in = 0. • • •. L where s V ._ — su. sf = s/ with kr = L and s7‘ - 1 — > ,s 7 ‘ are all survivor links, ['hen the * 0 K L Km — l ^ m total distortion in the L steps according to the modified .[/-algorithm is D < o{-)L + Z )\ n And D" = E[pri(.ra(l). fi(.s 0 ..sj.t )) + Pn(-rn{i")' ••"["J )j rn — 2 < 1 . 0 I - ) + E [ F - ,(-l7'l< -I )+ £ F\: — I tm \ I n ,n m = 1 where the expected value E is taken with respect to the joint distribution of the information source { A j} * and the random finite state codebook. 52 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. By (2.32). the term Ftm l ( 7^"*^ ) is m fact at a distance of at most 0(±) \ km J from the solution of the equation , U T f = - n l i „ - n l l (Q.tm.d)-'-Y L+ 0 ( l ) _ ________ mft-mC________ ? c '( D £ )km. ly/H j about r/. where the type Q is the type used to select the random codewords in the random finite state code which satisfies \\Q — Q ‘ || < . Let dm be the solution of (2.-IS) for in = 2. • • •. L. We have <u = /,-• ( q . tm. /? - + 2 (1 1 ') (2.-19) V n ti n n J where [(Q.t..r) = y iff Ii(Q.t.y) = j \ Xote that c':l(D”m ) is absorbed into the km 0( 1 ) term because' D"1 is in a small neighborhood of d(p. R). For simplicity, let In 7’| In A ’i Inn 0(1) •m = I ---------------3--------------- n n In n and ^ .y hi Tm I n A . m 1 n A m _ [ ^ 0(1) . \ m — -j- -f- n n for in = 2 . • • •. L. Then oil Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Expanding /, at point {Q.p. R). we get D' = L - l r l(Q.P.R) . E [f / a / , f t ) , \ , ^ d t f ^ Q . p . R ) , , , + n h \ «?— ■<- " 7 + £ ‘ 8 « ( ~ Am) L L + - p . - A in) i r ( / m - p . - A m)r + J ] 0 (ii(t, - P.- A ,) ||3)] rn = 1 rn= l = £ • ir'(Q.p.R) + E[L Inndlt l(Q.p.R) 1 ^ 1 1 1 OR n 2/i OR , £ / d lr l(Q.p.R) t \ dir\Q.p.R) ^ In 7m + L ( 7 T n J m ~ n ---------- ttb--------- £ \ i>P 7 M ” L L + - p .-A m)li'(/m - A m)y + 5 ] 0 (||(/m - p .-A , ’> 1 = 1 rn = l where i r = - • > ;jpi :> 2i r l(Q.p.n) \ V Jp01i :<p:>R X(Q, i'll2 ■ > 2irUQ.P .R ) and r stands for the transpose of a vector or a matrix. Since U is fixed for any fixed (Q.p. R). 1 < T < lb Inn and km < M = c ^ . we have £ ( * „ , - p . - Am) i r ( / m - p . - . \ , n)r + ]T °{\\l m - p . - A m||3) 0 ( 1) m = 1 L m=i r n = 1 5-1 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Thus r - i ^ .. d\ i r dIr\Q.p.R)0(i) , In n d lr l(Q.p.R) D~ = L - l r l{Q.p.R)+L- ^ 7 v ' + OR n 2n OR + e [£ _ W / > , * ) f In T ^ m^l \ °P / ^ ,^ 1 n + £ 0 ( l ) | | ; m _ /,||2] + £ . 2 i H . (2.50) .=i « In (2.50). fm and Tm are random variables. By taking expected values of tm and Tn we obtain e - = i . , , - ( q . , . f l ) + t . g£ ' i ^ * m an n o.R) 2n OR DR . I—I L In n d l r l(Q.p.R) c)irl(Q.p.R)^ ( In 7 n + £ 0 (i)£ ,„. HI2) + £ • — . (2.51) 1=1 By discussions in [96] (pp S2). we have We now estimate Erm{h\T,n) when I < Tm < 26Inn and determine the constant 6 . Since — ) ^ P 'R'* > 0 we only need to estimate the upper bound of £ 7^ (In Tm). 
Recall that (see (2.-11)) \ F D(D?m)e'R - max < r: . 1 max < ; = 1.-..V 1 00 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. where Paid) = P r{D < d} = P r{F tf{Y™) < d}. Hence Fo{d) and Ftm{d) have the same jumps as distribution function. But when Tm 6 [1.26 Inn]. |/m - p\ < D ™ m is in a small neighborhood of d(R) as discussed in Lemma 2.7. Hence we can take the constant 6 in Lemma 2.6 as c(d) (defined in (2.34)). Hence Lemma 2.6 is applicable and we have ET m (In Tm 1 1 < Tm < 26 In n ) = / 2b In n In xdFTm(.r /•jfliia (fj* < 2FrJJ ')ln .r|fl" " - 2 / FTJ x ) ~ J 1 X / 2b In a | ££ — -rfb - i f /P >'(£:>) 26 In n ([x - d.v "2b In n ~ T/ > > ■d.r < x . Therefore we have proved D- = L • I f (Q. p. It) - I • - Ij R) — c)R n Inn d lf iQ .p . R) 0(1) 2 n OR n and the average distortion per source letter is ]-D = f - l (Q.p.R) + 0 { - } L n when L > Inn. 56 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Careful readers may ask why we do not take a larger .1/ to further reduce the redundancy. From the proof of Theorem 1 . we can see that this is impossible because a term A ■ l^ 2 ~ rnust be added to the redundancy of the trellis code. If In \ l 2 > s/n. then this term w ill be greater than O(^) and becomes the main part in the redundancy. 2.6 An Example: Uniformly D istributed Source with Hamming Distortion M easure In this section we examine an example showing explicitly the behavior of the redun dancy as the alphabet size increases. Both the source alphabet A and the reproduc tion alphabet B are {1.2. •••../} and the source has uniform distribution over A. The distortion measure is Hamming distortion measure which is given by 0 i = j: 1 i # j. From Chapter 2 of [6 ]. we know that the rate-distortion function for this source is H{d) = log./ — tlb(d) - d[og(J — 1) (2.-52) •)i Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. where d is the distortion level and //>,((/) = —d log d — ( I — d) log( I — d). We restrict d to the region [0. ^y4] since if d > we do not need to use block code or M- algorithm. As in Section 2.4. PnU'n-yn) = - p(.r,.(/,). pn(/ . C „ ) = min pn(xn.c). n c€Cn Let L ,i'd F.v(d) i P r{,f : pn{ x \ y n) < d} = £ / \ n ( • / - D‘ i=0 J n [nd\ ± Y - v i= 0 As n is large enough, for any i = 0. 1 . • • •. [m /j — 1 . ~ M +1 A, / \ n \ ' + 1 / (•/ -I) i+i / \ n ( . / - l)‘ V ' / (./ - !)(/, - z) (•/ — I )(n + 1 ) / + 1 i 4 - I ( J - l ) This implies that Fx (d) = L n J J , ■ • > 1-1 V 1 = 0 - ‘Hnrfj oS Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. [nrfj [n JJ-1 , • A L ^ J E n T j i= 0 j =i J + l [nt/J [nrfj — 1 • . = - vjE n ' ,=o J=, ( J ~ l ) { n - j ) (./ - l)(n - [nd\ + 1 ) [J — l)(n - [ru/J + I) — [nr/J yj'2~[nd\(n — |_«r/J) — — [n(l\ + 1 ) — L W ^J ,, , /1 \\ I (./ - 1 ) ( 1 - t /) I I (-/ - 1 )(L V 1 -</) X «cp 1 n = (L + o( 1)) exp{ - - nTZ(d) + ,-l(n. d../)} (2 .53) where {nd} = nd — [m/J and A(n.d.J) = {ndyjZ'(d) + \ n [ - ^ = - ^ - 1>(1- , ^ 1 , ]. W 2 ^ (J ~ 1 ) ( 1 - d ) - d \ d ( l - d ) 1 Note that even though A(n.d.J) depends on n. its bounds may be independent of ii. Fx (d) is a step function. The jum p of /•’{■'(•) is exactly 1/n. Let [r/]ri = J — i-. we have F.x(d) = F.\([d]n). (2 .0 -1) For this particular source. Lemma 2.7 has a more precise form: 59 Reproduced with permission of the copyright owner. 
Further reproduction prohibited without permission. Lem m a 2 . 8 Let .Y,.--.A'u be \ I = [e ^ ] i.i.d. copies of X(u) with distribution function Fx (d). Let Y = niin,-=i......u{/\y(A',(,u))i} and D = F ^ l(Y). then F d W - Proof: Foid) = P r { D < d } = Pr{Fx( D ) < F x(d)} = P r{Y < Fx(d)} = I - Pr{Y > Fx (d)} = I - Pr{Fx(Xt{u))i > Fx(d). V / = L • • • . A/}} = i-ft (i - \l = 1 - n ( 1 - Pr{FxiXt(u)) < Fx(d,)}) i=i where d, - tnax{.r : Fv(x) < F x }. By the analysis about Fx(d) at the beginning of this section, we can get Fx (dt) < M M l l < Fx(dt + - ) < (eR'{d) + 0(-))Fx (dt) n n = ( (J 1 ) ( 1 <l) + 0 ( - ) ) F x (d,). d n 60 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Hence FD{d) > > - n ( i - 1=1 \ i= m _ exp| £ ln ^ _ » - S ( i j r - exp - exp j - - exp | — - exp | - cl 1 ) ( 1 - d ) W ^ T ) + ° \ n - I - (3 / -L ^ u \n * ,(•/- Dfl - d ) cl 1=1 1 cl A J - m TT^/j ^ ^ v^" + /*) j S J - 0 ( 1 -d) i f f cl + O f — ) ] /\v([d]n )( \/n + // ) 2 [{(./ — 1 )( I - c l ) + 0 V 7 1 d]n)( V ^ + / 0 > i j - m - d ) ^ F x ( d ) = CJAy/nFx(cl) where fi is tlie Euler's constant anti the last inequality holds when n is sufficiently large, anti CJA = □ We are now ready to estimate the redundancy. As in Section 2.5. assume that the source data consists of L > Inn source words of block length it. Suppose the Yiterbi encoding algorithm finds a path starting from the initial state .s o and going through the states st(kt) for i = Similar to the general case, we can deal 61 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. with the total distortion except that in this case the redundancy terms are carefully estimated. The inequality (2.47) is now ( ^Tk £~nlt \ I < F v l Jl - i - i + I \ c j . 4 , k i - i \ / n J n for i = 2.3. • • •. L and the total distortion satisfies the following inequality: d < f ; > ( a r . t v - " ) + E F * - 1 ( i r ' k e" r ) + " <2"” 7^1 \ cJ . J , k i - l y / n j rl where d t satisfies l’\ (d,) = Fx(pn(-rn{i ~ 1 ). (.'.(-S-i (F-i )• -s(F ))))F-i = F(/>„(.r,l(/ — 1 ). c,,(s1_i(£•,_[). ))))k\_i for i = 2. •••.£. Before we continue, we have to estimate c jjt. By (2.3-5) we can easily deduce that Pr {/>„(-r'‘. c) < d(R) — 1 / \/n} = o( l/r/2). lienee we may assume that (// > d(R) — I / v/77 where d(R) is the distortion-rate function. Under this assumption, we have </(/?) - l / v ^ rf(fl) I 1 - ( . / - 1)( 1 -■/(«) + l/v/») (./ — 1)( 1 -</(/?)> \ / « Since G 2 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. setting InTi In A .* i Inn At = ---------+ ----------------- — n n 2n and _ In T, In A :, lnArt_t A| — -F n n n for / = 2. • • •. L. we have o = (« -A , - + x ; f t - 1 ( r - a, - J) + + - ^ V « n J n = u {m + ^ U - l ^ d l ) + L an \ n ) n m - \ R ) ( , A{n.d,.J) Incj.fi, \ \ — “ J + j <tlf‘ \ " A | ------------^------ ) ,1 ^cf--R~l( R ) ( % ,t(n.d,.J) Ino.,,V’ +a S —3^— ------— + — J , //m , dn~l( R ) ^ f i n r ,^ \n,\(n.,i = + — d’R~l(R) / I n o . j \ In n r/7v_ I( /?) , l d ^ - l (/? )/ , A(n . d i . J ) \ 2 2 " r f / p " r A l n J , l ^ ^ - l(fl) / , A(n.di.J) , I n o . , . \ - , L * h <M2 \ " « ’ Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. As we did in last section, we also restricted T, to [8 ~ 6[lna • ~b In n] and in this case. 1 dzR ' l (R) ( . 
A(n.(h,J)\ 2 2 rf/P V n ) , 1 ^ c R R ~ l (R) ( % A[n.dt.J) , \ncj.d, \ 2 i k W \ » ~ ^ J ^ L I d*n-l(R) ~ 2 ' n ’ <//P r - 1 - + -1 _______ri(«) ' I —i{R) - n (ln (.l/ — L) — In jz jflii) * We now apply for Lemma 2.3 and in this case 6 = 1 arguments show that E r,(I n T ,i - r - j 1 — < T, < 2 bIn n ). bb In n Therefore ,n . ^ cxLdK~l{R) , [nndTZ~l(R) t r , . - . r j D M ) < u m - - — j f r + ^ - ^ n r - dTZ~l (R) f In A(n.d[. J) y^lncy./, d R ~ « h L -i(R\ + - hr) 1 i - hr) Hence the redundancy per letter is „ In n d'R~l( R) V n(R) < 7TT^ + n dR d7l~l + - n (R ) dR 2 n l dR y» 111 .-l(u. (/;../) lnC/,/( i=l ii L Similar 6 -1 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 1 + 1 d(R) 1 l-d (R ) InL (ln(.W - I ) - l a ^ j ) _ Q £ + In n/2 + I ^ i Ajn.dj. J) — 1° Q .<rt ^ ( N - W - D - l n ^ y ) 1 — + ~ 1 . 1 -i(fl) T l-d{R ) 1 + - n < 2" ( ! " ( . > '- n - i ' i i W ct + 1/2 + A{p.d(R.).J) — In c j j (R) — !---- 1 L HR) “ 1—ii 2n (ln( .1/ — 1 ) - I n ^ U when L > Inn. 2.7 Conclusions In this chapter, we study the redundancy of the trellis coding of memoryless sources with finite alphabets. The main term of the distortion redundancy of block codes with block length n is We show that the term ^ is removed in trellis block 0 2n 2 n source coding. When the state cost function is calculated in the trellis coding, the path distortion is minimized over all paths leading to the same state. It happens that this minimization procedure introduces a factor \ JTi to the approximated distribution function of the state cost function(see Lemma 6 ). This factor cancels out the term ^ in the redundancy. Therefore, roughly speaking, the main term of the redundancy of the trellis source coding is the second term of the redundancy for the block codes which is of the form O(^). This is the main result of this chapter. 65 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. The redundancy o f trellis source coding is analyzed for a very special trellis code. It is still unknown whether such an analysis can be generalized to the more general sliding block trellis codes. Although we believe that such a generalization is possible, major difficulty exists. The redundancy analysis of tree structured YQ is another interesting problem in the direction of redundancy analysis of structured source codes. A by-product of the chapter is the selection of the parameters M and £(delav) in the M-algorithm tree encoding. We have shown that M = and L = Inn are enough for the distortion redundancy to achieve the order ^ . The use of larger .1/ and L will give only minor improvement. We also know that M-algorithm is also used in the decoding of convolutional codes. Our result may suggest that M = where 1 1 is the constraint length of the convolutional code may be sufficient for the decoding of convolutional code in certain sense. Since the redundancy/complexity trade-off of the trellis encoding of memoryless sources has been proved to be of the order a natural question to ask is whether we can find codes with structures which can further improve this trade-off. 66 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Chapter 3 Synchronization Recovery of Prefix-free Codes 3.1 Introduction Under certain circumstances such as in mobile communications, compressed data have to be transm itted through very noisy channels. 
Since the misdecoding er ror probability is usually high for very noisy channels and most of efficient data compression techniques are very sensitive to errors, some techniques are needed to prevent catastrophic behavior of such systems. One way to resolve the problem is to design robust data compression algorithms. A data compression algorithm is said to be robust if a single bit error causes finite number of errors on average in the decompressed file. Otherwise, the algorithm is said to be catastrophic. The robustness of source coding is a very attractive research direction. Various topics related to this concept including robustness of prefix-free coding have been studied in literature [32. 58. 7. G 3. 6 8 . -19. S. 7-1. 69. 70. 75. 20j. The robustness of prefix-free coding is called "self-synchronizing" property in [32. 58. 7. 63. 6 8 . -19. 20]. 67 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. and in order to characterize this property, they introduced the concept of {synchro nizing sequence (or word). They use the length of the shortest possible synchroniz ing sequence or the occurrence frequency of synchronizing words to measure how robust the code is. In [63]. B. Rudner presented an algorithm for constructing min imum redundancy codes having the shortest possible synchronizing sequence. In [75]. Titchner studied the synchronization properties based on the T-codes. While in these papers, the main concern is the robustness property of prefix-free codes with the minimum redundancy, some other papers study the problem of improving the robustness of prefix-free codes bv adding additional redundancies. For instance, in [2-1].[10].[57].[-IS], the authors studied the existence of self-synchronizing prefix-free codes and the method for constructing self-synchronizing codes with lower redun dancy. Instead of using the intuition based concept—length of the shortest possible syn chronizing sequence or the probability of the occurrence of synchronizing codewords to measure the error resilience property of the prefix-free code, in Section 3.2. we study a measure of the robustness of prefix-free codes, the mean error propagation length (MEPL). The concept of MEPL is called expected error span by Maxted and Robinson [53]. In [53]. the authors studied an error transition model and gave a method to compute MEPL. This model was extended by Swaszek and DiC'iccoin 68 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. [71]. In [72], for a different model, the authors presented a formula for comput ing MEPL and a search method for finding codes among equivalent Huffman codes which has seemingly short MEPL. As is known, actual data transmission systems are prone to two types of error: the phase error in which a code symbol is lost or gained, and the amplitude error, in which a code symbol is altered [58]. In this chapter, we mainly study the case of am plitude error even though we can easily generalize most of the results to phase error case. The results for phase errors are presented without extensive discussion. The concept of MEPL is closely related to the robustness of prefix-free code. Actually, a prefix-free code is robust if and only if it has finite MEPL. Therefore, in Section 3.3. we study conditions under which a prefix-free code has finite MEPL. A natural problem is how to construct minimum redundancy prefix-free codes w ith the shortest possible MEPL. This problem seems very difficult. 
In Section 3.-1 four algorithms for finding minimum redundancy prefix-free codes with short MEPL are presented. The codes constructed by these algorithms are compared with those constructed by existing methods. 69 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 3.2 Mean Error Propagation Length of Prefix-free Codes When an error occurs in a sequence of concatenated codewords from a prefix-free code, it causes parsing errors(the loss of synchronization) so that error propagates until correct parsing resumesfsynchronization is recovered). Fortunately, prefix-free codes have the synchronization recovery property as called in [32. 6 8 . 7. 63. 6 8 . 19. 20] by which we mean that correct parsing resumes with probability I under some mild conditions. To measure the synchronization recovery property quantitatively, we use the concept of the mean error propagation length (MEPL) o f the code. For relevant concepts, see [63. 72. 71]. In this chapter, the source is assumed to be memoryless. Let the source alphabet be A = {cq «ri} and let the probability mass function of the source be /;(«i). • ■ p(dn)- The source is coded by a prefix-free code C = {c[. • • • ,cri} in such a way that cij is coded by Cj. The elements of the set C = Ur ^_0C" are called sentences. When a sequence of source letters is coded by the prefix-free code C. the codewords are concatenated to form a sentence which is transmitted through the channel. We study the case where only one bit amplitude error occurs in the sentence transmitted. For simplicity, in this chapter an error means an amplitude error. As mentioned above, a bit error in a sentence from a prefix-free code may cause parsing errors. The error propagation length for a particular bit error occurred 70 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. in a particular sentence is defined as the number of codewords in the sentence from the one where the bit error occurs to the one after which the correct parsing resumes. This concept is illustrated by the following example: E xam p le 3.1 Suppose A = {« i. az. a^. «.|. coded as 01. I. 000. 0010. 0011 respectively and a sequence 2 • • • / • > ' encoded as 1.01.000. 1.01. I. • • • . Supposing that then is a hit error occurred in the channel and the first bit is received as a 0 instead of a I. then the sequence 0 0 1 0 0 0 10 1 1 ... icill he parsed as 0 0 1 0. 0 0 1 0. 1. 1.-- which is decoded as a\a.\a<a> • • - . The correct parsing is resumed after 0 code words)in the original sequence). IIV evoulel say that the error propagation length is 0 — the source letters 0-^11 let^ei-ieii art lost and misdecoded as a^aiUy. To make this definition more precise, the decompression process of a sentence corrupted by a single bit error is analyzed as follows. A prefix-free code corresponds to a code tree. A code tree has two kinds of nodes— nodes that give rise to other nodes and nodes that do not. The first kind of nodes are called internal nodes and 71 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. the second are called leaves or external nodes. Let us denote the set of internal nodes by X = {ou = A .q i. • - • .o m} (3.1) (in fact m = n — 2). where A denotes the root node of the tree or the empty string. If a single bit error occurs in a codeword (a leaf node), there are two possible cases: 1. it becomes another codeword or a sentence: 2 . 
it becomes an internal node of the code tree or the concatenation of a sentence and a non-empty internal node at the end. In any of these two cases, a bit error causes ari error of length at least 1. In Case 2. the error propagates. If a codeword containing a single bit error is in Case 2. parsing the corrupted codeword ends up with an internal node. This internal node is said to be the stall of this codeword for this particular bit error which is denoted by = S(j.k) < E 1 where 1 < k < n is the index of the codeword in the code and 1 < j < /(c;.) is the position of the bit error in the codeword. If a codeword containing a bit error is in Case 1. define .S ( = S(j.k) — A. The state internal node si will be concatenated with the next codeword and then parsed by using the prefix-free code. It again has two possible cases—Case 1 and Case 2. If it is in Case 2 . the concatenation of the state and the next codeword is the concatenation of a sentence and a non-empty internal node at the end. This internal node is said to be the second state and is denoted as s _ > . In Case I. the second state is the empty 72 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. string. In general, the tth state s, is concatenated with the next codeword— the (i + l)-th codeword. This concatenation of strings can be either a sentence (Case 1 ). or a concatenation of a sentence and a non-empty internal node at the end (Case 2). In Case 2. the internal node at the end is said to be the next state s,+ i . In Case 1. the next state is s,+i = A. Once the state s, = A occurs, the correct parsing resumes. After that, all states must be the empty string, that is. = A .V / > i . The total number of steps needed to reach an empty string state, or the total number of non-empty string states plus I in the state sequence is the error propagation length. If the letters of a data sequence are from a stationary information source {.Vf } f=[. and if it is assumed that the bit error may occur anywhere in the bit stream with equal probability, then the MEPL is defined as the average error propagation length for all possible source data and all possible bit error positions with respect to the source distribution and distribution of the bit error position which is assumed to be uniform in the compressed data. To make this definition more precise, consider all source data b € A n. Let the sentence £(b) be the compressed data for b. that is Z(b) = {C{bl ).....C (bn)). (3.2) where b = (6 L . • • • .bn) and C (6 ,) is the codeword for b, from the prefix-free code. We assume that a bit error occurred at the jt li position in £(b) and the receiver receives sj(b) which is identical to £(b) except that the j tli bit is altered. Since it is possible that the correct parsing would not resume at all. we define its error propagation 73 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. length in that case as the number of codewords in the sentence from the one where the bit error occurs to the end of the sentence. Suppose that this bit error has an error propagation length L(b.j). then the average error propagation length for b is defined as mb)) . £ ,b ' = £ w b » £ ,b -J )' The mean error propagation length for the code and the source is then defined as epl(C) = lim E /.(X -V) (3.-1) •\ - t x where X N is the random source word of length .V and the expected value is taken with respect to the source distribution. 
In the remainder of this section, we derive a formula for the MEPL of prefix-free codes. The computation of MEPL has been studied in [53. 72. 71] for two different models. Our model is the same as in [53. 71] where methods for computing M EPL was presented. Our formula is quite similar to the one presented in [72] for a different model, with precise definitions for all quantities used in the formula. Recall that the average codeword length of the code is defined as n M O = ]C/>(«,K(0 - (3-5) i=i Let Cj,i denote the probability that the bit error occurs at the j -tli bit of the /-th codeword c, under the assumption that only one bit error occurs in the sentence 74 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. corresponding to the source data and the error position as a random variable has a uniform distribution in the sentence. Then we have Recall that, when a sentence corrupted by a single bit error is decompressed, a sequence of states which are internal nodes of the code tree is obtained. If both the source data and the bit error position are random, then this sequence of states forms a Markov chain. Let (0 < i . j < m) be the transition probability from state /(internal node a,) to state ./(internal node o j which is the sum of the probabilities of such codewords that the concatenation of state / and the codeword can be expressed as the concatenation of a sentence and the state j at the end. Let T(j{i.j) = {£• 6 {1. • • • • /i} : 3c" ( E C" such that o,c* . = c'O j}. (.'Li Then (h.j = S Define Gc = ( * J £ i;= ,- Notice that in the definition of the m atrix Qq . the state o0 = A is not included. This m atrix is called the error transition matrix and when C corresponds to a code i o Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. tree T . we also denote it as Q j. In this section, since the codebook and code tree are fixed, we simply denote it as Q. Let p(cti) be the probability that the codeword containing a bit error ends up at state i after parsing by using the prefix-free code, that is / » ( « . ) = E eJ-k- ( ;*-9) Let p = (p (o i). • • • ,p(am)) and 1 = (1. • • •. I ) of length m. The following theorem is a key result of this chapter. T h e orem 3.1 The mean error propagation length for a prefix-free code when the source is memoryless is given by cpl(C) = 1 + p - f + p t ? l'+ ••• = 1 + p (/ - Q )-1! 1 (3.10) where I is the identity matrix of size in andf denotes the transpose of a matrix. The last equality holds if (I — Q)~l exists. Proof The 1 in the formula corresponds to the codeword containing the bit error which is in error with probability I. The term p - 1 ' is the total probability that the codeword containing the bit error has a non-empty state. This is also the probability that the second codeword is decoded incorrectly. In the same way. the term p Q l' is the probability that after the second codeword the correct parsing does not resume. This is also the probability that the third codeword is decoded incorrectly and so on. 70 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Therefore, the right hand side of the formula is the sum of the probabilities of the events that the i-th codeword is decoded incorrectly for all i. It is exactly the mean error propagation length of the code. When I — Q is invertible, we can conclude that the absolute values of eigenvalues of Q are less than I. 
Hence as an operator on the m-dimensional Euclidean space. ||@|| < 1. So the summation converges and the theorem is proved(see Proposition 3.2 for details). □ T h e o re m 3.2 The variance of error propagation length for a prefix-free code when the source is memoryless is given by X rr-. = 3epl(C) + 2 £ npQnl ‘ - 2 - cpl(C)2 (3.1 1 ) n=0 = cpl(C) + 2 p (/ - Q)~2l f - epl(C)2 (3.12) where the last equality holds if (I — Q)~l exists. Proof. By the proof of Theorem 3.L. we know that the probability that 1 code word is decoded incorrectly and the following codewords will be decoded correctly is 1 — p - l f. Similarly, the the probabilities that corrupt exactly n + 1 codewords are p(?n - Il ' — pQnV for n = 1.2. • • •. Hence variance of the error propagation length is er2 . = 1 - p . 1‘ + ]£(n +2)2 n=0 = 3epl(C) + 2 f ; npQ "i' - 2 - epl(C)2 n = 0 I I p Q -l1 - p Q ' ^ i 1 } -epl(C) Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. If / — Q is invertible, then |0| < 1 and by the identity that ( I — Q.u) 2 = £ ^ 0(n + D<?Sf the theorem is proved. □ E x a m p le 3.2 Consider the five-letter source given in Table S. I and the code ob tained from the Huffman algorithm (see [24]) The non-empty internal nodes are Table 3.1: A Five Letter Source and Codewords Source Letter Probability Code A 0.3 01 B 0 . 2 10 C 0 . 2 11 D 0 . 2 0 0 0 E 0.1 00 1 a i = 0 . o2 = 0 0 . a3 = 1 with p = ( / > ( « ! ) . = (0.173913.0.217391.0.0860.57) aru Q = ( \ 0.1 0.0 0.3 0.2 0.2 0.5 0.2 0.2 0.5 Therefore and epl(C) = I + p( I — Q) 1 = 3.S9S551. a* = tpl{C) + 2p(I — Q )~-T - epl(CY = 24.207099 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. This code is a bad code and using the algorithms given in Section 3.4- U -’e can obtain a code with epl(C) = 1.747283 and cr£ = 1.072003. R e m a rk 3.1 If a phase error occurs in the sentence transmitted, a formula quite similar to the one in Theorem I can be used to determine the MEPL. The only difference is the vector p. If we assume that, the phase error is an insertion with probability : 0 < '■ < I and the phase error is a deletion with probability I — and the position of the phase error has a uniform distribution in the sentence transmittal, then the probability that an insertion occurs before the jth bit of the codeword c\ is - g f j and the probability of the deletion of the jth bit of the codeword i is (1 — If we further assume that an insertion can be either a 0 or a I with equal probability, then by means of exactly the same method as used for the amplitude error case, we can derive a formula for the vector p in the phase error case. The matrix Q remains the same as in the amplitude error case. With these modifications. Theorem I still holds in the phase error case. R e m a rk 3.2 For variable length to variable length prefix free codes, the MEPL can be defined similarly by keeping in mind that a codeword now corresponds to a source word consisting oj several source letters as opposed to the fixed length to variable length code case where a codeword corresponds to one source letter. Therefore, the error propagation length is defined as the total number of source letters in all source words whose codewords are affected by the single bit error in decoding. The vector 79 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. p and the matrix Q are defined in exactly the same way as for the fixed to variable length case. 
The formula for the MEPL is epl(C) = L(S)(l + p • f* + p Q l‘ H ) = L(S)( I + p([ — Q)~l l ‘ ) where L(S) is the average source word length of the code. 3.3 Conditions for Robustness We now set about to find conditions for prefix-free codes to have finite MEPL. The following result is recognized by many researchers without mathematical proof. Proposition 3.1 If a prefix-free code is not a block code and is suffix-free, then the code is catastrophic. Proof. For suffix-free code, the matrix Q is a stochastic m atrix and hence it has I as one of its eigenvalues and l f = ( 1. • • -. 1 )^ as the corresponding eigenvector, i.e.. Q -V = I'. We then have p(?'l l ‘ = H,"=i Pi where p = (/j(r>t). • • • ./>(o,n)). Since this code is not a block code, p f 0. Therefore, the sequence x m i + p i ( + p (? i( + • • • = i + tx y . Pi) 1 = 1 1 = 1 S O Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. where p, = p(a,) is the i-th component of p. does not converge. Therefore, the MEPL is not finite. Hence the code is catastrophic. □ The following is a non-block code which is both prefix-free and suffix-free. E xam ple 3.3 The prefix-free code 0010. 0011. 000. 01. 100. 1010. 1011. 110. III. is suffix-free. So its MEPL=oc. R em ark 3.3 In [OS] (see also [40]) Stiffler claims that a code is necer self-synchronizing if and only if none of the proper sujfixcs of the codewords are themselves code- words(i.e.. suffix-free). However, suffix-free is not a necessary condition fora prefix- free code to be catastrophic. Example .[ °f this section gives a prefix-free code which is not suffix-free but is catastrophic in the sense of MEPL. The following proposition gives a sufficient condition for a code to have finite MEPL in terms of eigenvalues of Q. P ro p o sitio n 3.2 If the eigenvalues of Q of a prefix-free code C for a given source arc all strictly less than 1. then epl(C) < zc. Proof. Let Q = (</,.jU xm - P = (p (« i).” *.p(o-fl)) and 1 = (L. •••.!)'. Let A,. / = I.■■■.rn. be the eigenvalues of Q. Since eptJ > 0 and f2']l =l (p.j < 1 . we can SI Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. conclude that for any A,. — 1 < A, < 1. Let Xmar = maxl=i....,m{|A,|}. A theorem about spectrum radius of an operator [1 2 ] says that Jim \\Qn\\± = A m„ . Hence Ar;iflx < 1 means that for 0 < e < I — Amar there exists an .V > 0 such that if n > .V then ||(?a|! < (c + Am ,,x)rl. Therefore for n > A we have |jpQ nl|| < (c + Amax)nv/7«||p||. This implies that epl(C) < oc. □ R em ark 3.4 Even though the condition in Proposition 1.2 is sufficient for a code to be robust, it is not necessary. .4 trivial counter example is the block code of length n with '2n codewords. For this code. I is an eigenvalue of the matrix Q, but its mean error propagation length is I. .1 nontrivial counter example is Example 5. 1 1 > can show that the eigenvalues of Q are strictly less than I is equivalent to that the state A is an absorbing state [/t6], i.e.. there is no infinite loop among error states. It seems that the MEPL of a code for a given source depends not only on the structure of the code, but also on the source distribution. It is plausible that whether a code is robust should also depend on both. In the following, we introduce a necessary and sufficient condition for a prefix-free code to be robust. This condition shows that whether or not a prefix-free code is robust does not depend on the source distribution. 
It is solely determined by the structure of the code. Recall that I (see 82 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. (3.1) denotes the set of internal nodes of the code C. Let A be the empty node. For P c i . define C(P) = { i 6 1 : 3c" E C'.p € P such that sc'p € C} (3.13) and A(P) = { i G 1 : 3/ t C(P) such that st E C}. (3.14) Define A°(P) = P. A^(P) = A(Am~l (P)) and A'(P) = l£ = 0 -T'l (P ). Note that A(P) is the set of all states that are accessible to P in one step. A n(P) is the set of all states that are accessible to P in n steps and A'[P) is the set of all states that are accessible to P in finite number of steps. Let T, — ,-\"({s}) for s E 1. Let 11 — {-s E 1 : 3(A\y) : 1 < k < //. 1 < j < l(ck). S(k.j) = s}. (3.15) This is the set of all states which can serve as the state with positive probabilities for a codeword containing a single bit error. We have the following proposition. P ro p o s itio n 3.3 epl(C) = oc if and only if (J Tsn U ^ O (3.16) s€l -T\ whr.rt O stands for the empty set. 83 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Proof. As before we assume that 1 = {a0. • • • . a m}. Sufficiency: We assume that o 6 7 j fl li for some 3 £ I — 7a- Furthermore, we assume that a € Ak( {-i}) for some k > 0. This means that r/^ j > 0 . where q[kj is the probability of the event that starting from state a. the state 3 is achieved in k steps. Since a € U. which means p(a) > 0. when an error occurs in a sequence of concatenated codewords, the probability of the event that the error propagation state is eventually transited to the state 3 is positive. Since from state 3 it is impossible to access to the state A. once state 3 is reached, the state A w ill never be accessed. This means that the correct parsing will never resume. This implies that the MEPL of the code is infinity. Afce.s.s///y: Case 1. 1 — T\ = 0 . In this case we may assume that > 0 for k, > 0 . / = where is the probability of the event that starting from state a,, the state A is accessed in k, steps. Letting s = max{A.-[. • • •. km} and P = (n.j)mxm = Qs we have r ‘.j > H r'-j - 1 ~ m ink .,'..\} < L j=t We can then conclude that the maximum eigenvalue of R is strictly less than 1. Therefore by a proof similar to the one for Proposition 2 . t=i 1 < oc. S-l Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Case 2 . X — 7a ^ 0 but for any a € Lher-Tv a £ W ithout loss of generality. we can assume X — 7 \ = { « !.• • • .o m, } for some m { > 1 and U Ts = {« !. • • • .o mi.Qmi + 1 . • • • .o mi+m2} for some m 2 > 0 . *er-7\ In this way. the set X — A is partitioned into the union of three disjoint subsets {o i.---.Q „n }. {omi + 1- • • • .Qmi+rn,}. and {a m, {. • • •. o m }. This partition in duces a block structure of the matrix Q as follows: Q = ( \ Q i.i Qi.i Q i.i Q u Qi.i (? > ..« Qi.i Q -.i. z Qi. 3 By definition, the set of states { q i. } are accessible only from the states in the set { o i . ' " . o m|. o m|+,.- - - .n „ ll+m i}. Therefore. Q:u = 0. Q .-j.j = 0. Since { o i. • • •. ft m 2} n u = 0 /;(° i) = p{Qj) = • • • = p(Om2) = o. This implies pQ 'l‘ = (0 . • • • . 0.p(ftmI+m 2 + l )• • • • .p(am = (p(a,nl+:n2 + l ). ' ' ‘ . p( ft m ) )Q‘ X, l'. So Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. where 1 . = ( 1 . • • •. Since A is accessible from any state a, for i = nii + ni2 + I. 
• • •. m. this reduces to Case I. By the same argument as used in Case 1 . we obtain )• • • • .p(«m)) Q3.3 I I < Therefore epl{C) < x . □ R e m a rk 3.5 Let ff{P ) = (J'1= o -'^‘(P)- Bt(P) — A(Bt(P)). we can easily con clude that A'(P) = B,( P). So even though Am (P) is defined by an infinity union, we can determine it in finite steps. This makes it possible to check the robustness of a prefix-free code by computer. C o ro lla ry 3.1 The robustness of a prefix-free code depeneis only on the structure of the code and is indepe ndent of the source distribution as long as the probabilities of the letters of the source are all positive. E xa m p le 3.4 The following code is prefix-free but not suffix-free. The mean error propagation length of the code is x : 0010. 0011. 000. 01. 100. 1010. 1011. 110000.1100010. UOOOll. 11001. 110100. 1101010. HOlOll. H O lll. 110110. 111. Proof. For this code U = {0.00.001. L 10. 101. 11. n o . 11000. UOOOl. 110101. 11011} = 1 - {A. 1100. 1101. 11010}. 8 6 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. By definition, we can get .-1°({A}) = {A }. A l ({A }) = .42({A }) = • •• = {110} and T\ = -d*({A}) = {A } U { 1 1 0 }. Therefore U,€r-T, Z D l - { A. 1 1 0 }. Obviously u z n u ^ o . *€r-7\ Therefore, the code is catastrophic. □ R e m a rk 3.6 In [65], Scliutzenberger outlined a result claiming that onlg three hinds of codes were not "self-synchronizable": I. uniform codes (block codes). J. uniformly composed codes (block product of one code) and 3. anagrammatic codes (suffix-free codes). In [7], according to the statement of [65], the authors proved that binary codes with "inclusion" property are self-synchronizable if and only z/gcd(/[. • • •. /„) = 1 by proving that "inclusion codes" arc not uniformly composed, not anagrammatic. This example shows that the claim in [65] is not complete. Because there is no detailed proof in [65], w e have not yet found the wrong place there. Apparently, if .T“( { A }) = X. then the code must be robust. The following exam ple gives a case where .-l'({.\}) ^ X. but the code is still robust. E xam ple 3.5 The following code is prcjix-free and not suffix-free with = 1 and MB PL < x ; oo. oi. io o o. toot, to to. i o n . 1100. n o t . m o . ////. Proof. The internal nodes are {0. 1. 10. 11. 100.101. 110. 111}. Since the set of states {0. I. 100. 101. 110. I l l } is a closed class. I is an eigenvalue of Q. We can 87 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. check that T\ = {10.11} and Use r-7\ T, = { 0 .1. 100.1 0 1 . 1 1 0 . 1 1 1 }. Fortunately. U = {00. 01}. U Tsf)U = Q. s<Zl-Ts This implies that epl(C) < oc. □ 3.4 Algorithm s for Finding Prefix-free Codes with Short M EPL Given an information source, let the codeword lengths of the Huffman code be </.><••• < G- Let. I\ be the maximum codeword length and let m, be the number of codewords of length i. Then ms = Our goal is to find a prefix- free code with these given codeword lengths and having as short as possible MEPL. It should be pointed out that even though the code with minimum MEPL exists, it is very hard to find the code especially when the source is large. In [G .'jJ. B. Rudner gave an algorithm for constructing minimum-redundancy codes with an optimal synchronizing property. Although. Rudner"s algorithm does not intend to find codes with short MEPL. 
we checked all four examples in [63] and found that Rudner's algorithm works very well in terms of the MEPL of the code. But there are three main drawbacks in her algorithm. The first one is that, for her algorithm, the minimum length of codewords is restricted to be less than or equal to -1. The second one is that the code lengths should satisfy some conditions. S S Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. The last one is that her algorithm is much more complicated than the fixed order method—one of our algorithms presented below. For all the four sources used in [63]. our algorithms obtain better codes than the ones from [63]. There are some other techniques for the construction of minimum redundancy code with short MEPL(see. for example. [24. 57. 72]). We think the method given in [72] works well even though the search is restricted to strongly equivalent Huffman codes[24]. But for the two examples in [72]. our algorithms give better codes. The following three algorithms are motivated by Theorem L. 3.4.1 Fixed Order M ethod This algorithm uses a preference order -< of binary strings. The order is defined by induction as follows. Case I: The minimum length is bigger than 1 . i.e.. /[ > 1. • For / = 1 . 0 -< 1. (3.17) • For I = 2. 10 -< 00 -< 1 1 -< 01. (3.IS) 89 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. • Suppose the order for I = k has been defined. Then for / = k + 1. the order is defined as follows: let x.y € {0 . l } fc+l. x = (xi.x').y = (yi.y1 ) where •Ti-i/i € { 0 . 1 }. x'.y' E { 0 . l } fc . then x -< y iff either x' -< y1 or x; = y' and X| = l. y t = 0 . (3.19) Case 2: The minimum length / 1 = I . In this case everything is the same as in Case 1 except that the order of strings ending with 1 and 0 is reversed respectively, i.e.. • For / = 1. 0-<l. (3.20) • For / = 2. 00-< 1001 -< 11. (3.21) • Suppose the order for / = k has been defined. Then for / = k + 1. the order is defined as follows: let x.y € {0 . l } fc+l x = (x[.x')./y = (y\.y') where £ {0 . 1 }. x'.y' e { 0 . I}*-', then x -< y iff either x' -< y' or x' = y' and x t = 0. iji = 1 . (3.22) W ithout loss of generality, we assume that the upper branch of a code tree represents letter "0" and lower branch represents " I" . The algorithm goes as follows: 90 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Step I. Generate the block tree of depth l\. take the first m t leaves in the preference order as leaves of the code tree and assign the source letters in the non-increasing order of probability. Set i = I. Other leaves form the set of available strings. Step 2 . Set i := i + I. Extend all available strings to depth /,. These leaves form the set of extended available strings. Take the first in, extended available strings in the preference order as new codewords and assign the source letters in the non-increasing order of probability. The remaining extended available strings form the set of available strings for the next round. Step S. If I = k. then the present tree is the code tree. Otherwise go to Step 2 . Intuitively, this algorithm works well because of the following properties: 1 ). The codes constructed by this algorithm have as many as possible neighboring pairs which are pairs of codewords of the same length at Hamming distance 1. 2). 
The codes have as many as possible pairs of codewords of different length such that the shorter codeword is a suffix of the longer codeword. The first property makes the norm of p smaller whereas the second property makes the norm of Q smaller. This algorithm works very well for almost all sources we have examined. In fact, we checked all 1827 possible sources w ith dyadic distributions (with probabilities of the form 2- / ) whose maximal codeword length is less than 7. The average MEPL of the codes resulted from our algorithm is 2.3909. In the case where the minimum length 91 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. is 1, the definition of the string order in Case 2 is based on the Property 2 which makes the elements of Q in Theorem 1 smaller. Note that when the elements of Q are small. ( / — Q)~l will be small in some sense. In fact, among 2 0 2 such codes with maximal code length less than 7 and minimum length 1 . the average gain of MEPL is 0.21 when we use the fixed order given in Case 2 rather than the order in Case 1. According to this algorithm, we can see that prefix-free code can be constructed without any search. Experiments also show that once the indices m [. • • •. tn^ are given, the structure of the code tree plays the main role in MEPL and the source probabilities are minor factors in terms of MEPL. So from now on. except for some examples, in this chapter, we only consider dyadic sources. 3.4.2 Maximum Gain M ethod This method constructs the code by selecting the codewords one by one in a non increasing order of probabilities. A codeword is selected from a set of available strings based on a measure called the gain of the available string and the order of the available string defined in Subsection 3.-1.1 . The measure gain of a string partially reflects the benefit of choosing the string as a codeword. If there arc more than one strings having the same gain, we choose the one with the maximal order given in 3.4.1. 92 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Suppose we have determined a set of codewords Ck = {ci-• • • .Q.} which is ar ranged in the non-decreasing order of length. Then we can get a tree T such that Ck is a subset of its leaf set Ct and the lengths o f leaves in Cj\Ck are all lk+i if fc < n. For the tree T. just as in Section 3.3. we can find the set of internal nodes X j = {oq = aI - ' ' ’ • ftm } which is called the state space. To define the probability vector ps = (p(af ).■■•. p(a^)) as in Section 3.2. we need to keep in mind that, since not all leaves of the tree are codewords, only codewords should be used. The definition of q,,, is exactly the same as in Section II. Then we have _ />(«■) _ /'(«.) ^ L[C) Z U i M K c , ) - Let S(j.l) denote the state (internal node) obtained when a bit error occurs in the j-th bit in the codeword c/ where q £ Ck- Then P ( ° [ ) = H 1j.i where 1 < i < m. Let be the transition probability from state a [ to state o j with respect to the tree 7" (which is temporarily considered as a code tree, so qtJ can be obtained from equations (3.7)-(3.S)). The transition m atrix Q t is defined as Qt = (7.-jr=l;=1. (3 .2 5 ) 93 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Let A t = C j\C k = {c t i } and we call it the set o f available strings. 
Let (3.26) where df( represents the Flamming distance and L{C) is the average codeword length of the desired code as before. We can see that. p(at) is the probability that the error propagation stops at a, in one step when a, is selected as a new codeword. Let st J be the 1-codeword transition probability from state a f to available string tij which is defined as Setting p..j = (p(ai ).•••. p(ai))‘. the gain vector g,\ = ).••• .</(«/)) is defined as R em ark 3.7 In equation (-i.J-5). we can use Qc -k instead of Q j where Qck is the transition matrix with respect to the known codeword set Ck- to r most sources we tested, they achieve the same codebook. But for a few sources, they obtain different p{ Cm ) if 3m € { 1. •••.£•} s.t. c\Jc,n = eij 0 otherwise Then 6 a = P'-i + P s ( I - Q t ) ' Qa■ (3.2S) codes and on average, for the sources we tested, this is slightly worse. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. By the definition of g.4 we see that g(at) is roughly the probability that the error propagation stops at a, when a; is chosen as a codeword. We call y(at) the gain of choosing a, as a codeword. Having had this measure, it is reasonable to choose the available string w ith maximum gain as a codeword. The algorithm based on the maximum gain is given as follows. Step I. Initialization: Set i : = 1. If /, = 1. set c, = 1 and C, = {c,}. Otherwise set c; = 0 • • • 01 and Ct = {c,} such that / ( 0 • • • 0 1 ) = l {. Step J. Generate a tree 7~ with codeword Ct such that C, C Ct and the length of codewords in Ct \C, is ll+l. Step ■ ]. Determine the available strings A t - {«i. • • • .«/} = Cr\C, such that at -< ai-i -<•••-< (ii. Let s = inf {7 : g(a} ) - max{</(«i). • ■ • .ej( «/)}}• Set c1+i := a, and Cl + 1 := Ct U {c1+1}. Step Set <:=/ + !. If i < n — I go to Step 2. Otherwise the tree T is considered to be the code tree. R em ark 3.8 C 'ontparing to the fixed order method, stm illation results show that the maximum gain method indeed reduces the MEPL for rtiang sources. .-Is a cost, its computation complexity is higher, li e should point out here that the maximum gain method is not always better than the fixed order method. Lor a few sources, it increases their MEPLs. 9 b Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 3.4.3 M -Algorithms using Partial MEPL In this subsection we use the M-algorithm to design a codebook. To do so. we need define reasonable objective functions. Recall that the codewords are assumed to have length A < • • • < l n. the maximum codeword length of the desired codebook is A and m, is the number of codewords with length / for 1 < / < A . The same as in maximum gain method, the codewords are selected one by one in the non-increasing order of source letter probabilities. Suppose we have constructed a partial codebook C k = {cv • • - .ck}. and the tree T is defined as in Subsection 3.-1.2. Let A t — C r \ C k = {«[. • • •.« /} be the set of available strings. Assume that the tree T has totally n j leaves, then / = n j — k. Among the / available strings of length h+\- we w ill eventually choose nt = ” 6 ~ k codewords as new codewords. Assume that nt new codewords are taken according to some rules. So the se lected codewords arc = {c,. • - • . Cfc.Cfc+i. • • • . c k+n, }. Let P t(c k+ j ) denote the probability that when an error occurs in ck+J. 
the corrupted codeword cannot be the concatenation of the codewords in Ck+nt- Let s denote the string obtained when codeword ck+j has a bit error at position u. Let S(a.k + j) denote the internal node after parsing the string s by codewords in Ck+n,• Then Pt ((-'k+j) = £ cu,k+j + ! { c : c e CT\Ck+nt ■(hl( c. Ck+j) = 1} | x c [.k+j (3.29) u:S(u.k+j)£\ 96 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. for 1 < j < nt. Let ps = (p fo !). • • • ,p(arn)) where p(a,) is defined as in (.3.2-4). then the first kind of partial M EPL of the partial codebook Ck is defined as pmepl,(Cfc) = 1 + p3( / — + Y,Pt{ck+j) + £ 9 (°i)‘ (;*-30) J = l i = l.--.l.at£Ck + nt wliere </(«,) is defined as in (3.28) and 1 is the m-dimensional all 1 row vector. We want to choose codewords c^+i. • • •. Ck+n, such that n, $3/MCfc-h,)+ H </(«.)• J = l i=i.-./.-i.«Ec\+rit which is roughly the penalty when c;.+ i .• • • .C k+n, are chosen as new codewords, is minimized. Cnfortunatelv. there is no easy way to find the codewords minimizing the partial MEPL. As a substitution, we only take those nt strings in the non-increasing order of gain g{a,) as codewords. If some of them have the same gain, only the strings having the maximal order (according to the fixed order) are taken. Partial MEPL reflects the MEPL of the desired codebook in some sense. For example, if k = n then pmtpli(Ck) = tpl(Ck)- bo we can use it as an objective function in the M-algorithm. The algorithm works as follows: Sit p I . Among leaves of the block code of length / (. choose .\[x = m in{2/l_ l. .1/} codewords c}. - • • .c \t according to the fixed order defined in Subsection V-A such that r{ >- • • • >- c\f i . Set C} := {c{}. • • • .C(/i := {c'Ui }. Set t 1. 97 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Step 2. For Cj. j = 1. • • •. generate a tree 77.j with leaves Ct, , such that the length of strings in Ct,s \Cl } = {c‘ ,'J. • • •. c\'J} is ll+l. Arrange c\'J. ■ • •. c\'J in the non-increasing order of gain g(c{J). For strings having the same gain, arrange them according to the fixed order Set B1 ^ := C‘ U {cJ.'J} and calculate p rrie p l,!# ^1) by formula (3.30) for j = 1 . - - -. .\/,. 1 < k < I. Assume that there are u, different sets Bj+ kl . Among the /i, different sets, choose the first M,+i = m in {u ,..\/} sets such that their partial MEPLs are the smallest. Set the .W l+i sets to be C[+ l. • • • Set i := i + L . Step ■ ]. If /, = ln, calculate the MEPL of the tree 7i.j for 1 < j < instead of the partial MEPL. Compare the MEPLs of the .l/,_[ trees and take the one with minimum MEPL as the desired code tree. If /, < /„. go to Step 2. By the above method, generally, we can get code book with smaller MEPL than by the methods given in the above subsections. For their performance, see the next subsection. We can also define other objective functions instead of the one defined above. Because the power of M-algorithm. we can always get satisfactory codebooks if the objective functions are reasonable. We define another objective function by defining the second kind of partial MEPL of a partial codebook Ck- For Ck- assume that the number of available strings is / and the same as for the first objective function, eventually nt of them will be codewords. Let < /( < </_> < •••< < // be the rearrangement of the gain value g{a,) given in (3.28). Different from the previous M-algorithm. the 98 Reproduced with permission of the copyright owner. 
Further reproduction prohibited without permission. state (internode) probabilities are defined as the state probability of the tree T. The the second kind of partial MEPL is defined as __ I - n t pmepl.2(Ct ) = I + ps( / - Q q-Y 1! ' + Y 0(^)- (3.31) 1=1 In order to make the M algorithm work well, we combine the M-algorithm. fixed order method and maximum gain method to get the following algorithm: Step I. Among leaves of block code with length /[. choose .\[x = m in{2,1_1. M } codewords c{. • • •. according the fixed order defined in Subsection 3.1.1 such that c{ > ------>- c\l r Set C\ := {c lJ .---.C i/, := M /, }• Set i := 1. Step J. For Cj. j — I . • • • . \ f t. generate a trce Tt,j with codewords Cp , such that the length of codewords in C t,j\C 1 = {c'f-7. • • •. cfJ} is /,+|. Among the avail able codewords find those codewords c\'J. ■ ■ ■. c\'J with maximum gain. We also require that c\ J >-■■•>- cj'7. Set : = C‘ U {c ^ J and calculate pm epL(5^.‘ ) by formula (3.31) for j = 1. •• •..!/, and 1 < k < /. Assume there are ii, different set B jj} . Among the nt different codeword sets, choose the first = m in{n,. M } sets such that their partial MEPLs are the smallest. Set the sets to be CJ+ l. • • • .Cu,1 . Set / : = / + ! . Step :i. If /, = /„. calculate the MEPL of the tree T ij for 1 < j < .\/,_[ instead of partial MEPL. Compare the MEPLs of the trees and take the one with minimum MEPL as the tree desired. If /, < l n. go to Step 2. 99 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 3.5 Performance Comparisons To compare the performance of our algorithms and other existing code construction methods, we check the following examples. We can see that fixed order method is indeed simple and powerful. Even though the maximum gain method is applicable for any source distributions, in the following of this chapter, for maximum gain method, we always assume that the source probabilities are dyadic. In Table 3.3. 3.5 and 3.6. MEPL for "dyadic prob” means the MEPL when the sources have dyadic distributions. In order to avoid the case that I — Q and/or I — Qt are not invertible, for both (3.10) and (3.12). I is replaced by (1 + 10~')/. We w ill first use these algorithms to construct codes for the four sources used in [63] and compare the corresponding MEPLs w ith those given there. E xam ple 3.6 II V discuss four sources from [OS]. Sourer I is of index rn | = I . m > = 0. ni:i = l.W-i = l.m.T = 4. ma = f) . / » 7 = 6 . Source II is of index rn{ = 0. = O.ni.i = 1. m .t = 10./Us = 7. = 2. Source III is of index m i = 0. m -> - 0 .m 3 = 2. in .1 = 1 .Mr, = 19. m,; = 2. rn- = S. Source I 1 /.s of index rn( = 0. m -> = 0. m.) = 2. mi = S. Mr, = -l.rnti = T.rn- = l.rn$ = i.tn 9 = I. mm = l,(« n = 2. For these four sources, the MEPLs for each method is tabled as follows. In this example, for .1/ algorithm. M is taken to be 10. In the following figures, the upper branch represents " I" and the lower branch represents "0". Xote that in the figures. S-1. .S '-1 1 . S- Iff. S- IV mean source I. II. III. IV respectively. F-0 means fixed order method. M-C! means 100 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 
Table 3.2: MEPL for Rudner’s 4 Sources Fixed Order Max Gain M-algi M-alg 2 Rudner's Source I I.92S70 1.867S7 1.92870 1.S4902 1.88893 Source II 2.559S2 2.43175 2.41176 2.29320 2.57883 Source III 2.93694 2.51570 2.29979 2.31288 2.84377 Source IV 1.SS446 1.SS446 1.86929 2.02647 1.92S79 maximum gain method, M -l means the first M-algorithm and M-2 means the second M-algorithm. Figure 3.1: S-I by F-0 Figure 3.2: S-I by M-CI E xa m p le 3.7 .Is pointed out in [72], a code with higher occurrence probability of synchronizing codewords may not hare better synchronization recovery capability. In [20], Escott and Perkins gave two codes with more synchronizing codewords, which are Code 2 and Code ./ in [20]. Since Code 2 has the same indices as Source 11' 101 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. / / / Figure 3.-I: S-I by M -II Figure 3.3: S-I by M -l Figure 3.o: S-I by Rudner Figure 3.C: S-I I bv F-0 102 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Figure :J.7: S-II by M-G Figure 3.8: S-II bv M-I / \ Figure 3.9: S-II by M -II Figure 3.10: S-II by Rudner 103 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Figure 3.11: S-III bv F-0 Figure 3.12: S-III by M-G Figure 3.13: S-III by M-I 101 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Figure 3.15: S-III bv Rudner Figure 3.16: S-IV by F-0 105 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Figure 3.1!): S-IV by M-I I Figure 3.20: S-IV by Rudner in Example 3.6. the codes are listed in Table 3.3. For Code ire assume that probabilities are dyadic. The codes, their MEPLs and crcs are given in Table 3..[. Exam ple 3.8 For the Huffman code of the English alphabet (without space) which is cited in [3.{. 57. 73]. the optimal coeies constructed by different methods, the MEPL* and ar-s of these codes are given in Table 3.5. When considering the application in video coding, in [73]. the authors gave a rode for Motion \ ectors. The comparison of these results are included in Table 3.6. Exam ple 3.9 H Y should especially point out herein that, in [71 ]. in order to obtain a code with short MEPL. for the English alphabet(without space again) cited in [53], the authors randomly generated rarieible-length codes. The best code from 350000 trials was shown in [71] and its MEPL = 1.01.(736. erf = 1.12055!). The fixed order mclhexl gives a code with MEPL = 1.891715. Gc — 1.315187 while the maximum gain method gives a code with MEPL = 1.833.(98 and exc = 1.1-18893. Even though these 106 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. English letters have different probabilities from those given in Table Vll. the code tree is the same as the one in Table V ll which is generated for dyadic probability. So we do not give the code table here. Remark 3.9 By the above examples, we can see that when the dyadic probability is replaced by the real probability, the MEPL will decrease. This is because in our algo rithms. the probabilities of the codewords have been taken into account appropriately. 3.6 Conclusions In this chapter, the concepts of MEPL for characterizing tlie robustness of prefix- free codes are studied. Conditions for robustness of prefix-free codes of mernoryless sources are* discussed. Based on the formula of MEPL. 
throe algorithms are proposed for the construction of prefix-free codes using the statistics of an information source. These algorithms are fairly simple and efficient. But so far an algorithm for finding the prefix-free codes with the shortest possible MEPL is still not available. 107 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Table 3.3: A 26 English L e tte r Source w ith Space B it Length L e tte r P ro b ab ility Fixed O rder M a x Gain Refercnce[63j Reference[20j 3 space 0.1859 001 001 001 110 3 e 0.1031 101 101 101 100 4 0.0796 0001 1001 1111 0110 4 a 0.0642 1001 1101 1101 0000 4 0 0.0632 0101 0001 1001 0010 4 0.0575 1101 0101 1000 0100 4 n 0.0574 0111 0111 0111 0111 4 s 0.0514 m i m i 0101 1010 I r 0.0484 0000 0000 0001 1110 4 h 0.0467 1000 1000 0000 i n i 5 1 0.0321 01001 01001 inoi 00110 5 < 1 0.0317 11001 01101 1 1001 10110 5 u 0.0228 01101 11001 01101 0 0 1 1 1 5 c 0.02 IS 11101 11101 01001 OIOIO 6 r 0.0208 010001 1 11001 111001 010110 ti m 0.0198 110001 011001 111000 000110 6 V i 0.0175 OllO O l 010001 110001 000100 6 y 0.0164 111001 110001 110000 000111 ti P 0.0152 010000 010000 011001 010111 6 g 0.0152 t 10000 011000 010001 lonio ti b 0.0127 011000 110000 010000 101111 7 0.0083 1110001 1110001 0110001 0001010 S 0.0049 11100001 11100001 01100001 00010110 9 X 0.0013 111000001 111000001 011000001 000101111 It) 0.0008 1110000001 1110000001 0110000001 0001011100 11 q 0.0008 11100000001 11100000001 01100000001 00010111010 11 I 0.0005 11100000000 11100000000 onuoouoooo 00010111011 MEPL 1.84576 1.82755 1.93724 1.94658 "i- 1.19340 1.142548 1.50386 1.45379 MEPL (dyadic pr»b ) 1.88446 1.88446 1.92879 2.01881 (dyadic prob) 1.33866 1.33866 1.51747 1.70378 Table 3.4: Escott and Perkins’ Code 4 Bit Length Fixed Order Max Gain Reference[2 0 ] 2 0 1 01 11 3 0 0 1 101 O il 3 101 0 0 1 0 0 0 3 111 111 0 1 0 3 0 0 0 0 0 0 1 0 0 -I 1 0 0 1 1 1 0 1 0 0 1 1 4 1 1 0 1 1 0 0 1 LOll -1 1 0 0 0 1 0 0 0 1 0 1 0 -I 1 1 0 0 I L O O 0 0 1 0 MEPL 2 . 0 0 0 0 0 2 . 0 0 0 0 0 2.08333 • ) ac 1.48889 1.4SSS9 1.7569-1 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Table To: A 26 English Letter Source without Space Bit Length Letter Probability Fixed Order Max Gain Ref. [24] Ref. 
[72] 5 E 0.1278 001 001 0 1 1 110 5 T 0.0855 101 101 111 0 1 0 4 0 0.0804 0001 1001 0 0 0 1 1 1 10 4 A 0.0778 1001 1101 0 0 1 1 0 1 1 0 4 X 0.0686 0101 0001 0 1 0 0 1010 4 I 0.0667 1101 0101 0 1 0 1 0 0 1 0 4 R 0.0651 0 1 11 0111 1010 1000 4 S 0.0622 till 1111 1011 0 0 0 0 I II 0.0595 0 0 0 0 0 0 0 0 1100 1001 5 D 0.0404 10001 01001 1 1 011 oino 5 L 0.0572 01001 11001 IL0 10 0 0 1 1 0 0 L ' 0.0508 1 1 0 0 1 11101 1 0 0 11 11110 5 C 0.0296 0 1 101 01101 1 0 0 1 0 10110 5 M 0.0288 11101 10001 0 0 1 0 1 0 0 0 1 0 5 P 0.0225 10000 10000 0 0 1 0 0 11111 5 F 0.0197 0 1 0 0 0 11000 0 0 0 0 1 0 0 0 1 1 6 Y 0.0196 110001 111001 0 0 0 0 0 0 0 1 1 1 1 0 6 W 0.0176 0 1 1001 0 1 1 0 0 1 0 0 0 0 0 1 0 0 1 1 1 0 ( 5 g 0.0174 111001 0 1 0001 1 0 0001 1 0 1 1 10 1 5 B 0.0141 1 10000 0 1 0 0 0 0 1 0 0 0 1 0 001 111 ( 5 V 0 .0 1 1 2 0 1 1 0 0 0 0 1 1 0 0 0 1 0 0 0 11 1 0 1 1 1 1 7 K 0.0074 1110001 1110001 1 0 0 0 0 0 1 0 1 1 1 1 1 0 8 J 0.0051 11100001 11100001 1 0 0 0 0 0 0 1 0 1 1 1 1 1 1 0 9 X 0.0027 111000001 1 1 1000001 1 0 0 0 0 0 0 0 1 0 1 1 1 1 1 1 1 0 10 z 0.0017 1110000001 1 1 10000001 1 0 0 0 0 0 0 0 0 1 0 1 1 1 1 1 1 1 1 0 10 Q 0.0008 1 1 1 0 0 0 0 0 0 0 11 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 MEPL 1.90505 1.84829 2.80501 2.08875 ■ ■ ■ — i ac 1.56545 1.22944 5.55525 2.26862 MEPL (dyadic prob) 1.98546 1.91594 2.91665 2.21065 ij (dyadic prob) 1.55957 1.58050 5.95448 2.75527 109 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Table 3.6: The Source of 26 Motion Vectors Bit Length Motion Vector Probability Fixed Order Max Gain Ref. [72] 2 MYT 0.2304 01 01 10 3 MV2 0.1620 0 0 1 101 110 3 MV3 0.0799 101 0 0 1 0 1 0 4 MV4 0.0774 0 0 0 1 1 1 01 0 1 1 0 4 MV5 0.0760 1 0 01 1 0 0 1 1110 4 MVG 0.0697 1 1 01 0 0 0 1 0 0 1 0 4 MV7 0.0411 l l l l l l l l 0 0 0 0 5 MY'8 0.0399 0 0 0 0 1 1 1 101 0 0 1 1 0 0 MV9 0.0390 1 0 0 0 1 1 1 0 0 1 1 1 1 1 0 5 MYTO 0.0380 L1 0 01 1 0 0 0 1 o i n o 6 MYT 1 0.0195 0 0 0 0 0 1 1 1 1 0 0 1 0 0 1 1 1 0 6 MYT 2 0.0187 1 0 0 0 0 1 1 1 0 0 0 1 l l l l 10 G MYT 3 0.0141 1 1 0 0 0 1 1 0 0 0 0 1 0 1 1 1 1 0 (i MYT 4 0.0117 1 1 1 0 0 1 0 0 0 0 0 1 0 0 0 1 1 0 G MYT 5 0.0115 1 1 1 0 11 0 0 0 0 1 1 0 0 0 1 0 0 G MYT 6 0.0109 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 ( MYT 7 0.0097 1 0 0 0 0 0 1 1 1 1 0 0 0 1 0 0 1 1 1 1 0 1 MYTS 0.0094 1 1 0 0 0 0 1 1 1 0 0 0 0 1 1 1 1 1 1 1 0 t MYT9 0.0078 1 1 1 0 0 0 1 1 0 0 0 0 0 1 0 1 1 1 1 1 0 t MV20 0.0069 1 1 1 0 1 0 1 1 0 0 0 0 0 0 0 0 0 1 0 1 0 1 MV21 0.0069 1 0 0 0 0 0 0 1 1 0 0 0 0 0 0 1 1 1 1 1 1 ( MY’22 0.0067 1 1 0 0 0 0 0 1 1 1 0 0 0 0 0 0 1 l l l l 8 MY’23 0.0036 1 1 1 0 0 0 0 1 0 0 0 0 1 0 0 1 1 1 1 1 1 1 1 0 8 MV2-I 0.0034 1 1 1 0 1 0 0 1 0 0 0 0 1 0 1 1 0 0 0 1 0 1 1 0 8 MV25 0.0029 1 1 1 0 0 0 0 0 0 0 0 0 1 0 0 0 1 1 1 1 1 1 1 1 8 MV2G 0.0029 1 1 1 0 1 0 0 0 0 0 0 0 1 0 1 0 0 0 0 1 0 1 1 1 MEPL 1.66936 1.68404 1.68759 > ac 0.49955 0.52752 0.53315 .MEPL (dyadic prob) 1.72693 1.76088 1.74803 V ar (dyadic prob) 0.57480 0.64700 0.62306 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Chapter 4 Synchronization of Variable-Length Codes in Hamming Distance 4.1 Synchronization M easurements It is well known that the well-designed Huffman codes (exhaustive prefix codes). Tun- stall codes (variable to fixed length (V’-F) codes), variable to variable(V-V) codes are robust with respect to the IDD. So the synchronization probability is I in this case. Hence, we should introduce some methods to measure the degree of robust ness. 
We prefer the use of a simple indicator— the mean error propagation length (M EPL)-to measure the synchronization a b ility in IDD sense. W ith respect to HD. we will use synchronization probability to measure the robustness when one bit error occurs. If the data are synchronized with respect to HD. we can further discuss the average Hamming distance between the transm itted data and decoded data. The dynamic (adaptive) Huffman coding, arithmetic coding and Lampell-Ziv coding are I I I Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. catastrophic with respect to both IDD and HD and they will be excluded in this paper. V Y hen an error occurs in a sequence of concatenated codewords from a prefix code, it causes parsing errors so that error propagates until correct parsing resumes. Fortu nately. most of the prefix codes have the so-called parsing self-recovery property[ 1 0 2 ] or as called in [32. 5S. 7. 63. 6 8 . -19]. synchronizing property, by which we mean that correct parsing resumes with probability 1 under some mild conditions. This is why such coding methods are asymptotically mean robust. As discussed in the previous chapter, the formula calculating the MEPL have been given in many papers. See. for example. [63. 71. 2 0 . 102]. For a more general model, see [72]. In this paper, we w ill give another form of the formula to calculate the MEPL as a byproduct of calculating the strict synchronization probability of a code. In this paper, as in Chapter 3. the source is assumed to be memoryless. Let the source word alphabet be A = {« i........«.v} and let the probability mass function of the source words be p(ai). • • • ,p (a\). The source is coded by a prefix code C = {c[. • • • . cv} in such a way that cij is coded by cr The elements of the set C" = Ur ^_u Cn are called sentences or messages. W hen a sequence of source letters is coded by the prefix code C. the codewords are concatenated to form a sentence which is transmitted through the channel. W’e study the case where only one bit amplitude error occurs in the sentence transmitted. For simplicity, in this paper an error means an amplitude error. As mentioned above, a bit error in a sentence from a prefix 1 1 2 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. code may cause parsing errors. The error propagation length for a particular bit error occurred in a particular sentence is defined as the number of codewords in the sentence from the one where the bit error occurs to the one after which the correct parsing resumes. The concepts of error propagation length and synchronization in HD sense are illustrated by the following example: Exam ple 4.1 Suppose A = {cti.«>•«3- « s } h? coded as 01. I. 000. 0010. 0011 respectively and a sequence U 2a\a^a >a\a2 ’ ’ ' ^ encoded as 1. 01. 000. 1. 01. 1.... Supposing that there is a bit error occurred in the channel and the first bit is received as a "0 " instead of a "/ ". then the sequence 001000101 1... will b e parsed as 0010. 0010. 1. 1.... which is decoded as aA e i.\C i ? • • •• The correct parsing is resumed after .? codewords. It f would say that the error propagation length is o—the source letters rijo ^/-jdjo i are lost and misdecoeled as a.te i.iei2 . For this example, we can also see that with respect to Hamming distance, it is asynchronized. Sow see another source sequence 0 2 0 1 0 5 (12«[a.> • • •. [t is encoded as 1. 0 1 . 0011. 1. 0 1 . 1.... 
113 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. If the first bit " /" is swapped to “0”, the decoded source sequence is 0 4 0 ^ 2 0 2 0 ^ 2 • • •• In this case, we will say that the source sequence are synchronized in both distance senses with error propagation length 3. To make this definition more precise, the decompression process of a sentence corrupted by a single bit error is analyzed as follows. A prefix code corresponds to a code tree. A code tree has two kinds of nodes—nodes that give rise to other nodes and nodes that do not. The first kind of nodes are called internal nodes and the second are called leares or external nodes. Let us denote the set o f internal nodes by I = {o 0 = A. G|. • • •. a.\/} (4.1) (in fact M = .V — 2). where A denotes the root node of the tree or the empty string. A is called the synchronization state in other papers, see. for example [o.'L 71]. The formula presented hereafter to calculate the strict synchronization probabil ity is applicable to all the three kinds of coding method: V-V. F-V and V-F. We assume the source word length set is /(«,) = /* and the codeword length is l(c,) = l'~ for i = I. • • •. .V. We assume that C is an optimal code prefix cotie. i.e.. a Huffman code relative to the source word set A. Assume the message string ■ • ■ nit ■ ■ ■ is encoded as binary string cic2 ■ ■ ■ where c, € C. When there are bit errors, say q is amplitude changed to c\. there are there cases: 114 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 1 . c'[ is still a codeword or a concatenation of codewords, say c', = C [ •••c, for some i > 1 . In this case the error w ill not propagate in the IDD sense and the message number difference is d = £ ‘ = l l(~~l(cj)) - l(~~l {cl )). If d = 0. then the error will not propagate in the Hamming distance sense either. If d ^ 0 . then the error w ill propagate in Hamming distance sense till the end of the source secjuence if there is no other bit errors. 2. c\ is an internal node. In this case, the lost message number is 0 — / ( t - 1 ( c ( )). The error will definitely propagate. 3. c[ is the concatenation of some codewords and an internal node. Say. dx = c i---c tQj for some i > I and j € {1. •••..!/}. Then the increased message number is 5Z‘ = 1 /(rr- 1(cJ)) — /( - _l(c[)) (decreased if negative). Also the error will definitely propagate. In anv of these three cases, a bit error causes an error of length at least 1. In Case “ O 2 and 3. the error propagates. If a codeword containing a single bit error is in Case 2 and 3. parsing the corrupted codeword ends up with an internal node. This internal node is said to be the state of this codeword for this particular bit error which is denoted by = S{j.k) (E 1 where 1 < A . - < .V is the index of the codeword in the code and 1 < j < /(cj. ) is the position of the bit error in the codeword. If a codeword containing a bit error is in Case I. define .s, = S(j.k) = A. The state internal node si w ill be concatenated with the next codeword and then parsed by using the prefix code. It again has three possible cases— Case 1.2 and 3. If it is in Case 2 and 3. the Ha Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. concatenation of the state and the next codeword is the concatenation of a sentence and a non-empty internal node at the end. This internal node is said to be the second state and is denoted as s2. In Case 1. 
the second state is the empty string. In general, the ith state s* is concatenated with the next codeword—the (i + l)-th codeword. This concatenation of strings can be either a sentence (Case 1). or a concatenation of a sentence and a non-empty internal node at the end (Case 2 and 3). In Case 2 or 3. the internal node at the end is said to be the next state .s i+i. In Case I. the next state is s(+i = A. Once the state s, = A occurs, the correct parsing resumes in the ID distance sense. After that, all states must be the empty string, that is. .s j = A. V / > /. The total number of steps needed to reach an empty string state, or the total number of non-empty string states plus 1 in the state sequence is the error propagation length in the codeword sense with respect to the IDD. The average codeword error propagation length times the average source word length is the mean error propagation length of the encoding machine. If the decoded source sequence and the transmitted source sequence has the different length, the error does not recover in the Hamming distance sense and can not recover any more if there arc no other errors later. 116 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. To study the synchronization probability and MEPL. we need the following no tation and definitions. The average source word length and codeword length of the code are defined as )/(< *« ■ )• L{C) = f^p{a,)l(c,) (4.2) 1=1 1=1 respectively. Let denote the probability that the bit error occurs at the j -th bit of the /-th codeword c, under the assumption that only one bit error occurs in the sentence corresponding to the source data and the error position as a random variable has a uniform distribution in the sentence. Then we have _ P(«.) _ p(uj) ..... £j,‘ HC) u = lp M K c ty We now analyze how to calculate the MEPL and synchronization probability. Let .r‘ denote i additional source words are decoded when one codeword is input. We will introduce polynomial m atrix to calculate the synchronization probability. Let Cj'i denote the binary string obtained when the j -th bit of codeword c, is amplitude changed. Say eJit = ct • ■ -c*.am. W’e define k 'L , = ^ ( - - ' ( ^ - / ( T - V , ) ) . (4,1) u=l 117 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Let Piix ) = Y cJ.^n,k- (4.0) (j.k):S(j,k)=o, Define P(-^) = (Pto(-r)- — -P.\/(-r))- (4.6) Let o, and Oj be two internal nodes in 1. The codeword set making node o, transit to Qj is defined as TC(i.j) = {A: € { I. •••.«} : 3c* € C~ such that o,c;. = c'O j}. (-1.7 If T g (i.j) ^ 0 . for k € T g (i.j). assume o,c\. = c'i • • • ('mor Then we define T T l = £ / ( - _V 'u) ) - ( ( - ~ V 0 ). h -8> U=I I hen define ( kj-r) - Y P(ak)-rn , l k ■ (4.9) KTc u.j) Define an (\1 + 1) x (.\[ 4 - L) polynomial matrix t?c(-r ) = (^(-r));= o ;=o- (4.io) us Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Let O.v denote the all zero column vector of length M. Let ei = (1.0. • • •. 0)' of length .\/ + 1. Let ei = (0. 1. • • -. 1)' of length M + 1. Let Qui-c) = 0 0 O' 1st QC(*) U-ll) where /\/ is the identity m atrix of size \ [ x .1/. Define two polynomials as follows X C(.r) i ^ p (x i((? .u (.r))"e 1; (-1. 1 2 ) n=0 x H(x.y) = [ + £ p L rK Q .u U D V e ,. (-1.13) n= 0 We have the following theorems. The proofs for these results are attached in the Appendix. 
Theorem 4.1 The constant term of (!(x ) is the synchronization probability with respect to the flamming distance. is the are rage number difference between the transmitted source words and the decoded source words. Let gif'1 be the constant term of polynomial p(x)(C?A/(.r))rie1. Then £ * =0(h + 1 £ ' * = 0 < 7u° ^ th( MTPf. + ^ second moment of the error propagation length conditional on the synchronization with respect to the flamming distance. Theorem 4.2 //( I. 1 ) ;s the codeword mean error propagation length with respect to the insertion-delction distance. —.][[ ( 1. 1 ) — 2 /s the second moment e > f the codeword error propagation length. 119 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. C o ro lla ry 4.1 When all the source words are of length I. the MEPL is //(L I) = l + f>(l)<?S,(l)ei = l+ p (l)(/-(? A /(l))-Iel n = 0 if ( I ~ Q\t{ I ) ) - 1 exists. The variance of error propagation length is //(L 1) + 2p( 1)(/ - QA/( 1) r 2et - / / ( L I )2 if (I ~ Q .u(U ) - 1 exists. R em ark 4.1 The calculation of the mean error propagation length and the variance of the error propagation are derived in [53. 11. 12]. But we will sag the formulas given in tin paper and [102] are neat and easy to h e realized by the computer. We now see ail example to calculate the above quantities. Exam ple 4.2 Assume the source set is {r/|. « ».«•}} with source word length /(«,) = 1 fo r i = 1.2.3 and dyadic mass probability, i.e., p(a|) = 1/ 2 . p{a2) = />(«:;) = I/-I. One Huffman code is {1.01.00}. The average codeword length L{C) = 1.5. There is only one interned node «[ = 0. According to (4-3). £[ | = 0.5/1.5 = 1/3. £ i._ > — c->.2 — 11.3 — £ > 3 = 1/6. According to (4-4). nltl = —1. nL2 = 1. n> _> = n L:) = n2.3 - 0. Xote that 5(1.2) = 5(2.2) = 5(2.3) = 0 0 and 5(1.1) = 5(1.3) = o t. So by (4-o) Po(J') = - x + - -F - = -( 2 + x) (-1.14) 6 6 6 6 120 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. and Thus , s 1 -1 1 P i(*) =rJ- + - o b 1 P(-r) = (Po(-r)./b(-r)) = ~(2 + x.2x 1 + 1) b According to (/f.7) and (j.8) we hare the following results (-1-15) 7 ^ 0 .0 ) = {1.2.3}. 7^0.1) = 0 .7 ’ c (l.O ) = { 1. 2 }. Tc ( 1. 1 ) = { 3 } (4.17 and N o.O .l — '>0.0.2 = '>0.0.3 = '> 1.0.1 = H i . 1.3 = 0 . N 1.0.2 Then hg (/f.9) and (.{.10) we hare (-1.18) 1 0 0 0 Q c\ x) = and Q\[(.r) = 1 + £ i 2 4 1 . 1 + £ I 2 1 4 Therefore G’(.r) = ^(2 + x. 2s l + l)^2 n= 0 = - ( 2 + .r. 2 x -f 1 ) b - - n / \ 0 0 1 1 + £ 2 4 A 1 i # i \ I 0 1 _ £ 2 2 I -1 v 0 / 121 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Hence the synchronization probability is P, = | and the average number difference be tween the transmitted source and decoded source is 0. Setg^(x) = p(x)(Q.u (x )re l . li e hare the following result < J i0){-r) = ^ p < y (n)( j) = 1 1 for n = 1.2. •5 6 o x -ln Hence the conditional mean error propagation length The variance of the conditional error propagation length is / . 2 £ O W N _ CUEpL, = m _ r a y = m Y-i -lr‘ J -1 5 V15/ 225 lie now see the mean error propagation length and the variance of the error propa gation length in the IDD sense. By Corollary I. we can easily obtain that the MEPL is //(I.I) = * + <U> / ~ \ ~ 1 / \ 0 0 0 I - \ :i 1 1 4 / ^ 1 ) 122 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 
and the variance of the error propagation length is / - \ - 2 / \ <t2 = ff(l.l) - H (L I ) 2 + ( 1. 1) 0 0 0 I - 3 i v 1 / \ .4 4 / 4.2 Index Assignm ent of Synchronization Codes Last section discussed the MEPL and strict synchronization probability. A good code should have the properties that the MEPL is as short as possible and the the strict synchronization probability is as large as possible, or at least have one of the properties according to the requirement. The synchronization related to MEPL has been extensively studied and one may refer to the previous chapter(see also [ 1 0 2 ]). fo r the synchronization in HD sense, till now we have not seen any discussions. In the previous chapter, we gave two efficient algorithms (fixed-order method and maximum gain method) to encode the prefix code so that the code has very short MEPL. It happens that the two simple algorithms work very well for synchronization in HD sense. That is to say. we have two algorithms to encode a prefix code such that the MEPL is very short and at the same time, the strict synchronization probability is large. Furthermore, the codes obtained by the two algorithms are stable, which means that the variances of the error propagation length in IDD sense and the variance of the error propagation length conditional on HD sense synchronization arc small. 12d Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. In the following two examples, all the source words are of length 1. Since when the source alphabet is large. Q.u(x) is a polynomial of x and it is not easy to calculate (see Theorem 1). we set G(x) » EntoP(-r )((?.u( x ))net. In the following examples, the synchronization probabilities have converged and are nearly exact estimators of the true synchronization probabilities. E x am p le 4.3 In [53. 56. 71], a five-character source {A. B.C. D. E} with proba bilities ()..{. 0.2. 0.2. 0.1 and 0.1 is extensively studied. For the source, there are 16 distinct codes with minimum redundancy. The codes and their correspondent MEPLs and I EPLs are given in Table I The calculation results show that the Fixed-Order algorithm is robust regarding the five measures of robustness(synchronization). In the above table, the bold-faced codes (Code 5.7 and 12) are the ones that can be obtained by our algorithms after swapping "1" and "0". E x am p le 4.4 In [67], Stanfel gave a source of size 9 with probabilities {0.2S. 0.13.0.13.0.13.0.13.0.05.0.05.0.05.0.05}. There are totally 30 distinct codes (two of them have MEPL= oc corresponding to a biprcfix code and a uniformly composed code[65j) and our algorithm obtains the code with the shortest MEPL which is Code I in Table II. Code I has the shortest MEPL and the smallest \ 'EPL. In Table II. we only present eight codes whose MEPL< 2.5. 124 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. In the remainder of this section, we w ill discuss the synchronization ability of V-F. F-V and V-V encoding methods through a simple example. We characterize the compression efficiency by the normalized redundancy which is defined as R - L(C) U n L{A) where L{C) and L(A) are the average codeword length and source word length respectively and H is the source entropy. We assume that the information source set is F = { / i . / j } with p(f i ) = 0.7 and lAfi) = 0.3. For Y-F encoding method, we assume that the codeword length is 2 and the source words to codewords mapping is -(/i/i/i) = 00. 
-(/,/,/a ) = 01.-(/,/,) = 10. ~(f >) = 11. Then the average source word length is 2.10 and Rn = 0.032. When one bit error happens, the MFPL=2.L9. Since the error will not propagate in the IDD sense. c t2 = 0. The synchronization probability is Ps — 0.2-to. The conditional MEPL is C ’MEPL = 3(1 in fact in this special case) and a* = 0. Let .1 = F x F and encode .-I by the Huffman code. We get the following source words to codewords mapping - ( / i / . ) = 1. - ( / 1/ 2 ) = 0 1 . - ( / , / , ) = 0 0 0 . - ( f 2f 2) = 0 0 1 . 12.5 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Then Rn = 0.024. M E P L= 1.7S3272 x 2 = 3.566544. a 2 = 0.5S61S0 x 4 = 2.344720. P3 = 0.415414, C M E P L= L.760772 x 2 = 3.521544 and cr2 = 0.5S6459 x 4 = 2.345636. If the encoding mapping is obtained by Fixed-Order method, then -(/i/i) = 1. - ( / 1/ 2) = 01.-(/;/,) = 001 . - ( f 2f 2) = 000. and the M EPL= 1.534272 x 2 = 3.06S544. a2 = 0.354506 x 4 = 1.418024. P3 = 0.346791. C ’M E P L = l.573691 x 2 = 3.147382 and a2 = 0.358047 x 4 = 1.432188 . Let A = { f 1f 1. f i / 2-f 2 } and the the source words to codewords mapping is " (/i/t) = L “( / 1/ 2) = 00. ~(/2) = 01. Then Rn = 0.007. M E P L= L(A) x 1.5S6805 = 1.7 x 1.586805 = 2.697569. rr2 = L(A.)2 x 0.554437 = 1.602323 and Ps = 0. By the above example, it seems that Y-F encoding is good for short M EPL with respect to the IDD. But its synchronization probability is much lower than the F-Y encoding. Y-Y encoding seems to be the worst of the three encoding methods: its MEPL is larger and its Ps is very small( 0 in this example). We also test other sources of small alphabet size and it seems that we can conclude that among the three encoding types. Y-F has the best synchronization ability with respect to IDD. F-Y has the best synchronization ability with respect to LID and Y-Y has the worst synchronization ability with respect to HD. even though the compression efficiencies of the above three types are in the increasing order. 126 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 
Table 4.1: A Five Letter Source c o d * 1 code 2 code J code 4 c o d e 5 code 6 c o d e 7 code 8 A 00 00 00 01 01 01 0 0 B 01 01 10 00 00 10 10 10 C 10 11 11 10 11 11 110 111 D i io 100 010 110 100 000 1110 1100 E in 101 O il l i t 101 001 mi 1101 M E P L 3 832562 2.022727 2 060606 4.076323 1.710227 3 545455 1 555556 2 348610 < T2 34 720083 2 003271 2.107438 27 733832 1 200123 18.853334 0 370370 2 045275 P, 0.806848 0.735453 0 787877 0.755272 0 835227 0 .7 8 1 1 13 0 282828 0 286323 C M E P L 2.067218 1 630323 I 657025 2.318401 1 125170 2 258536 1 753368 2 320823 ar 7 210335 1 341632 1 461585 8 044661 0 775863 7 031573 0 353043 I 818757 code 3 code 10 code 11 c o d e 12 code 13 code 14 code 15 code 16 A 0 0 0 0 0 0 0 0 B 1 1 11 100 100 100 101 101 no c 100 101 101 110 11 I 110 111 1 11 D 1010 IQ Q Q 110 101 no to o to o 100 E 10 11 1001 11' 111 101 11 1 110 101 M E P L 1 357071 6 181818 I 852273 1 716783 l 737380 2 031042 2 203203 I 380861 1 7 J I 025233 36.231405 2 232312 I 506125 1 314234 2 351608 4 143670 2 615036 P, 0 136370 0 132366 0 743365 0 763227 0 763623 0 784806 0 780338 0 803766 C M E P L I 886752 J 035307 1 533017 1 520308 1 671160 1 700433 t 801438 I 681762 T " 1 018633 12 466135 I 641205 1 033284 1 383533 2.077768 2 813438 I 813053 Table 4.2: A Nine Letter Source P ro b a b ility c o d e I code 2 code 3 code 4 code 5 code 6 code 7 code 8 0 28 01 00 01 Qt 00 01 01 01 0 13 001 100 001 001 O il 001 001 001 0 13 101 101 000 100 100 101 ooo 100 0 13 111 110 100 101 101 1 10 I 10 101 0 13 000 111 101 I I I 111 111 101 1 10 0 05 1001 0100 1 100 0000 0100 0000 1000 0000 0 05 1000 0101 110 I 0001 0101 0001 1001 0001 0 05 1100 0110 1110 1100 1 100 1000 11 10 11 10 0 05 I 101 0111 l l l l I to I 1101 1001 l l l l l l l l M E P L 2 027337 2 051180 2 111306 2 111884 2 144733 2 265373 2 424331 2 46738 1 rr ~ 1.462463 2001461 2513759 I 434646 I 854828 2 615365 I 630581 2 823405 P, 0.770062 0 770183 0 771580 0 717363 0.763130 0 763547 0 714685 0 716364 C M E P L 1.733264 1.70074 2 1.737034 I 838332 I 830360 1 30304 2 2 071436 2 064626 < T “ I 060657 1 433337 1.773081 1 143823 1 376388 I 882458 1 304327 2 188536 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 4.3 Improving the Synchronization Probability As discussed in the previous sections(see also [100]). the synchronization probability for a prefix-free code is generally about 0.5 and may less. The synchronization prob ability for V-V codes is much less, and may be less than 0.1. Surprisingly, the syn chronization probability for variable to fixed length codes is almost the same as fixed to variable length codes. There are some techniques to improve the synchronization probability while keeping the same redundancy! the difference between average code word length and source entropy). When we consider A n = {un ■ ■ •«,„ : < /llt £ A } as a new source and the codeword length of a,, • • -aln is equal to Yl'k=.i l(ctk)- u'bere l(ct) represents the length of codeword c,. we can obtain a new code which has the same redundancy as the original one. The synchronization probability of the new code can be I — 0(£ ). One can see that when n — > D C . the synchronization probability will approach to 1 . However, when n is too large or the original source alphabet is too large . it is practically impossible to design and store the codebook of A n. The following algorithm can significantly improve the synchronization probability while the complexity is almost negligible. I. 
For source alphabet set A = {« i n.v} with mass probability function p{(ii ).-•-. p(rt.v). die source is coded by two optimal prefix-free codes C — { c i . • • •. c v } and C = {c [.---.c v } in such a way that a} is coded by c, and r, respectively for j = 1. • • •. .V. 12S Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 2. Choose an integer n > 0. For any source word sequence an at2 ■ ■ ■ a,L of length L. when k ^ n j for some integer j. alk is encoded by c,k. and otherwise, i.e.. k = nj, for some integer j. a,k is encoded by clfc. The number n is called the encoding period. 3. When the codeword sentence is received through a noise channel, the first n — 1 source words are decoded by looking up the codebook C and the n-tli source word is decoded by looking up the codebook C. After the first n source words are decoded, the decoding procedure is repeated, till the whole sentence is decoded. -1 . If the sentence cannot be decoded completely, or the source word number is not L. we w ill say that bit error occurs in the transmission and the whole data are discarded. Otherwise, we accept the data even though it is possible that some bit errors exist in the sequence. In average, the source errors in the SSS case is of order »(the encoding period) in flamming distance. In this case, the source is strongly synchronized. Generally speaking, the above algorithm can achieve synchronization probability more than 0.90 when the codes C and C are carefully chosen and n is larger than S . However, simulation results show that it seems there is an upper bound pnuix for the synchronization probability when C and C are given. In the following simulation results, we simply choose C to be the complementary code of C. i.e.. c, is the logic 129 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. complement of c, for i = 1. • • •. .V. In this case we also call the encoding period the complementary period. The mathematical theory of the above algorithm can be roughly depicted by a random walk with two stopping times T0 and Tn. Assume a gambler has x dollars at the beginning and at each game he loses I dotlar w ith probability L — p and wins 1 dollar with probability p. W ithout loss of generality, we may assume that He stops when he has totally n dollars (n is the encoding period) or loses all his money.i.e.. has 0 dollars. Let o(x) = Pr{'F0 < 7’,,} be the probability that the gambler loses all of his money when he has x dollars (0 < .r < n) at the beginning. Then we have o(x) = 1 p J . V .v (4.20) Assume the encoding period n is large enough. Then when one bit error happens, with high probability (near 1). we may assume that x is near 0 or near n. When x is near 0 and at last Tn < 7’ 0. then SSS fails and the decoded source number will be L + n. On the other hand, if x is near n and at last T0 < Tn. then SSS fails and the the decoded source number will be L — it. Otherwise the SSS occurs. Let p+ be the average probability that a codeword in C can be decoded more than one source word by C. Let /> _ be the average probability that a codeword in C is an internal 130 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. node of C. 
Then we mav think p = — ^ —- f nder the condition that ti is verv large r p+ + p- - ° and P < we can calculate that the synchronization probability is P s y n c h % Pr{x = 0} + = 1} + Pr{x = n - /} ) H-21) where Pr{x = 0 }H -5Z?= i (Pr{x = /} + f * r { j = n — /}) ~ I. Hence the synchronization probability mainly depends on p. which is determined by the codes C. C and the source statistics. When p = O.o. the synchronization probability will be about 1 — 0 {l/n ) and we see that it approaches to 1 when the encoding period n goes to infinity. Of course, the mathematical model of our scheme is more complex and it is hard to characterize the model exactly. 4.4 Simulation Results We will see the simulation results in this section. The simulation algorithm is de signed as follows. We randomly generate 10000 source word sequences of length 1000 according to the probability distribution of the source. For each sequence, we assume the bit error happens among the first 100 codewords, and for the codeword of bit error, the distribution of bit-error position is also uniform ly random. The assumption that the error happens only in the first 100 source words is because that we assume the source word sequence is in fact very long and we want to ex clude the case that the correct decoding process in ID distance has not finished 131 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. at the end. For each sequence, if the decoded source word length is 1000. we w ill say the strong synchronization occurs, otherwise we w ill say the strong synchro nization fails. In SSS case, the different source word number between transmitted source sequence and decoded source sequence is called the error propagation length conditional on synchronization. The synchronization probability is defined as the synchronized sequence number divided by 10000. The CM EPL is defined as the total error propagation length divided by the total synchronization sequence number. We now see the source given in Example 1 again. In order to show that different index assignments have different performance, we also see the case that cj = 1. c" = 01. c3 = 001. cj = 0001. q = 0000. The synchronization probabilities correspondent to complementary periods 1.2. •••‘ JO are shown in Figure 4.1. Their correspondent C.MEPLs are shown in Figure -1.2. Comparing the performance of the two codes, we will say that code C" is better than code C . We now see another example. The statistics and a robust code of motion vectors in video coding arc presented in [72] and was discussed in the previous chapter. Ac cording to the "Fixed-order" algorithm, another robust code is given. The two codes . their correspondent synchronization probabilities and CMEPLs are tabled in Table -1.3. We can see that the synchronization probabilities are about 0.6. Applying the algorithm in this section, the synchronization probabilities and C.MEPLs correspon dent to complementary period 1.2. •••.20 are given in Figure -I.- 3 and Figure 4.-1 respectively. Both the two codes are believed to be robust[102] in the ID distance 132 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Synchronization Probability 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 -© - Synch. Prob. (or Code C « Synch. Prob. (or code C* Complementary Period Figure 4.1: Synchronization Probability 1 :5 :3 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 
CMEPL 40 -© - CMEPL for Code C * CMEPL for Code C' Complementary Period Figure -1.2 : CMEPL 1 3 -1 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. sense, and we can see that they are also good codes in Hamming distance sense. However, we think the one given in [102] is a little better than the one given in [72]. For details about synchronization probability o f prefix-free codes, see [100] 0.95 0.9 >•0.85 S 0-75 0.7 0.65 0.6 Synch. Prob. for Code C « Synch. Prob. for code C* 0 55 Complementary Period Figure 4.3: Synchronization Probability Simulation results above have shown that the algorithm indeed improve the syn chronization probability. However, there are two disadvantages when applying this algorithm. Firstly, it increases the error propagation length in strong synchronized case, as shown in Figure 2 and 4. Secondly, when strong synchronization fails, the source word length w ill most likely to be n shorter or longer than L. the original 135 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Table 4.3: Synchronization P ro b a b ility of M otion Vectors Bit Length Motion Vector Probability Ref. [102] Ref. [72] 2 MV1 0.2304 01 10 3 MV2 0.1620 001 110 3 MV3 0.0799 101 010 4 MV4 0.0774 0001 0110 4 MV5 0.0760 1001 1110 4 MV6 0.0697 1101 0010 4 MV7 0.0411 m i 0000 5 MVS 0.0399 00001 00110 5 MV9 0.0390 10001 11110 0 MVIO 0.03S0 11001 OHIO 6 M YT1 0.0195 000001 00111 0 6 MV12 0.0187 100001 111110 6 MYT 3 0.0141 L1000L 011110 6 MV14 0.0117 111001 6 M VL5 0.0115 111011 000100 6 MYT 6 0.0109 000000 000111 / MYT 7 0.0097 I000001 0011110 7 MYTS 0.0094 1100001 1111110 7 MYT 9 0.007S 1110001 0111110 MV20 0.0069 1110101 0001010 < MV21 0.0069 1000000 0111111 < MV22 0.0067 1100000 0011111 s MV23 0.0036 11100001 11111110 s MY'24 0.0034 11101001 00010110 s MV25 0.0029 11100000 u i i i i ii 8 MY'26 0.0029 11101000 00010 L11 P s y n c h 0.607S56 0.597537 CMEPL 1.591804 1.616416 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 03 CMEPL - 6 - CMEPL (or CodeC * CMEPL for Code C' Complementary Period Figure -1.4: CM EPL 137 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. source word number, rather than about I or 2 source words shorter or longer cor responding to the case that the encoding period n = 1. We need to balance the trade-off between synchronization probability and mean error propagation length. We believe that this algorithm can be applied to the cases that the strong synchro nization is required. 4.5 Conclusions In this paper, we introduced the concept of synchronization with respect to insertion- deletion distance and Hamming distance. According to these concepts, we conceive several formulas to calculate the synchronization probability w ith respect to the Hamming distance, the mean error propagation length and the second moment of the error propagation length conditional on the synchronization. We also present one algorithm which achieve smaller MEPL with respect to IDD and achieve larger synchronization probability with respect to HD. There are some open problems up to this paper. There are extensive works about the index assignment for block codes in vector quantization. The similar problem exists here: could we find an efficient index assignment to variable-to-fixed length codes so that the synchronization probability with respect to the Hamming distance is as large as possible? 
In this paper, an algorithm to improve the synchronization probability in Ham ming distance for optimal prefix-free codes is provided. Its mathematical theory is l:JS Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. also roughly argued. Two examples are presented to show that this algorithm can increase the synchronization probability to be more than 0.9 when the encoding pe riod is about 8. Since the complexity and cost are negligible, it may be a practical method to improve the synchronization probability in Hamming distance sense. According to the discussions about this algorithm and examples, we can also conclude that different code have different performance and hence finding a code which is robust in both ID distance sense and Hamming distance is an interesting issue. 1:59 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Chapter 5 Robustness of Prefix-free Codes for Finite State M odels 5.1 A Finite State Coding Scheme A simple finite state coding scheme is the first order Markov sources. To explain the basic idea for the coding scheme, we start with two examples. The first one is the fixed to variable length coding of a first order Markov source and the second one is for the variable length to variable length coding of a Markov source. Example 5.1 Consider the prefix-free coding of a Markov ternary source with source alphabet {«[.«>.«3} and transitional probability matrix given as follows: M = ( \ 0.5 0.25 0.25 0.25 0.5 0.25 0.25 0.25 0.5 1-10 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. The source alphabet is used as the set of states. The conditional probabilities given the letter a, can be used to design a Huffman code denoted by C,. Assume the initial state is ax. When encoding a sequence of source letters A't . • • •. .Yr .. • • A'i is encoded by a codeword in C\. The next state is A'i which is used to select the state code used to encode X 2 and so on. Suppose the three state codes are designed as follows: States « i a 2 03 «i 0 10 11 a 2 00 I 01 «3 00 01 I Then a source sequence «(a 2«2«i«3 « 3 is encoded as 0. 10. L.00. II. I. Example 5.2 In this example, we consider the variable length to variable length prcjix-frce coding generated from a binary Markov source with transitional probability matrix ( \ 0.7 0.3 .\I = \ 0.2 0.S The set of states is {0 .1 }. Two variable to variable length codes are designed as follows. When the present state is 0. the following set of source words 00. 010. 0110. 0111. 10. 11. I l l Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. is used. The conditional probabilities o f these source words given the state 0 are 0.49.0.042.0.0336.0.1344.0.06.0.24. This set of source words is designed based on the principle that all probabilities should be close to a number of the form 2~l for some integer I so that the redundancy can be very small. By the same method, the set of source words for state 1 /s 00.01. 100. 101.110. i l l with conditional probabilities 0.14.0.06.0.112.0.048.0.128.0.512. These two sets of source words are encoded by two Huffman codes. For state 0. the state cod ebook is source word 00 010 0110 0111 10 1 1 code word 0 10010 10011 101 1000 1 1 For state 1. the state codebook is source word 00 01 100 101 n o 111 codeivord 000 0111 001 0110 010 I 142 Reproduced with permission of the copyright owner. 
Further reproduction prohibited without permission. When a source word is transmitted, the next state depends on the last letter of the source word. For instance, for the source word 101 in the first set. the next state is the last letter I. Codes designed based on such a principle usually have very small redundancies (smaller than those of variable length to fixed length code or fixed length to variable length code). In fact, for this example, the entropy of the source is 0.785673. the mean source word length is 2.587434 and the mean codeword length is 2.037256. Hence the redundancy is 2.037256 -----------------0.7S5673 = 0.001692. 2.587434 Based on the above two examples, we describe a finite state prefix-free coding scheme as follows. Let the state space be a finite set S = {.si. ■ • •. ,sm}. Let A = {«[.•••.«„} be the source alphabet. For each state .s € S. let be a set of source words, where n(.s) is the number of state source words and >1' is the string set consisting of source letter in >1. Conditional probabilities of elements of ,4S given .s are denoted by />('“ ; / 1^) for 1 < / < n(s). For each s £ S. let c 3 = {c? = os(7,3) : i t € A , for ; = I. • • •. /,(*)} 143 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. be an optimal prefix-free code designed based on the given conditional probability distributions, where Os is a state-dependent encoding mapping o, : A S ^ C s C {0.1}*. (5.1) Assume that the next state is determined by the current source word and the current state by a next-state function /: / : (J s x A s -> S. (5.2) A state-dependent decoder mapping is defined by : C, — > « 4S (5-3) and we require L’,(o J(A| J)) = ~i! (5.-1) for any s G < S and any ' f G A s. i = 1. - * -. «(-s). Suppose that the encoding starts with an initial state .S ’o G S. To encode a sequence of source letters j-[. • • • .r n. ■ ■ •. one first finds a unique position i\ such that T, = (j-[. • ■ •. ) G «45q- Then encode this string of letters by a codeword Os0(h i) = C i G Cs0■ By means of the next state function, the next state is de termined as S'[ = /(S ’o .r,). Then one finds a unique position i2 > /’i such that 1 4 -1 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. r 2 = (i,,+ i. • • •. X{2) € which is encoded as C2 = Os1( r 2). Continue this pro cedure until the whole sequence of source letters is encoded as the concatenation of a sequence of codewords C\. C2. • • •. C \. ■ • *. When this sequence of concatenated codewords is decoded, starting with state So. parse the concatenated codewords to find Ci € Cs0. This is possible because the code is prefix-free. The codeword C'i is decoded as IV By means of the next state function, one finds S\. Then find C'2 € and so on. As long as there is no error in the sequence of concatenated codewords. The original sequence of source letters can be decoded exactly. 5.2 Mean Error Propagation Lengths for FSM If a bit error occurs in the codeword C\ = Os,., (T,). then starting with this codeword, decoding error may occur. Similar to the memoryless case, parsing error is one of the reasons for decoding errors. But. since for finite state coding scheme decoding is state dependent, the recovery of correct parsing does not guarantee the recovery of correct decoding. The correct decoding can be recovered only if both the correct parsing and the correct state are recovered. 
We now check the decoding process when a single bit error occurs in the codeword C\. In general, there exists a k > 0. such that the corrupted version of C, is 145 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. parsed as cL c2 • • • cka where c, € <4s,_, • c2 € ■^/(51 _ ,.^ ._ l (cl )), c3 € -4/(s._, • • '• Ck € -4/(sI_I,ci.c2.-,cf c _,) and f{Si-i,Ci,C2) ~ f(f{Si-i.LS,_l[Ci)).li'jls,_l.VSt_ilcl))[c2)). ( 5 - O j ) / ( 5 , _ I . C [ . - • • .Cfc_i) = / ( / ( 5 , - i . C ! . • • • . C * _ 2 ). « i 7 ( 5 . _ , . c l . - . c * _ 2) ( C i k - i ) } (5.5jfe_2 ) which are defined inductively. The string a is an internal node in the code tree of the prefix-free code C/(s,_l,cl.-,ck) (including the empty node A. which is the root node of the code tree). Note that if fc = 0. the string a is an internal node in the code tree of Once the following two conditions are accessed, the decoding of the next codeword would be definitely correct: 1 . /( N .- I.r ,) = /(S ,_!.C |. •••.£*) (5.6) 2. o = A (5.7) In this case, the bit error causes the source word f, which is of length /(T,) to be decoded incorrectly. The error does not propagate to the next source word r,+1. Another case that the correct decoding resumes at O + i is f . a = A. 2'. / ( 1. F,) ^ /(.S '1 _ [.C [.---.q ). but («)• O + i 6 C /(s 1 . l,r1 ] f iC /( .f ,.1 ,L -l,..A ). 11 6 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. (c). /(/(5,_1 .r,).rI+ 1) = /(/(•?,-i.ci.• • •.Cjt).r1 + ,). Notice that, if (c) is not satisfied, the correct decoding can still resume at Cl+l if (o '). C 'l + 2 6 C /(/(s ._ 1 .r,).r1+1) n C f i f { s , _ l ,cl .~ .c k ) .v , + l )- (&')- ^7(/(a',-i.r,).r1+1) ( C + 2) = t7 (/(^ -i.^ .--.^ ).r 1+I)(C + 2 )- (c')- / ( / ( / ( 5 l_l . r l ) . r t+ l) . r 1+2) = /( / ( /( 5 't_1.cl . - . - . c , ) . r l+ l) . r 1+rj. If (c7 ) is not satisfied, it is still possible that the correct decoding resumes at C’,+1 if similar conditions are satisfied by r t+3( ^+ 3 ) and so on. Otherwise, the error propagates to the next source word r,+ i- There are some other patterns that the source word may decoded correctly. We w ill say that the analysis of source word error and code word error for FSM is much more complicated then memoryless case. Fortunately, the exceptions have very small probability and we may neglect their contribution to mean error propagation length. Define the error propagation state ( ETS) at the end of C, as a triple Et = (/(.S',_i. F,)./(S ',_ i. ct . • • •. ct ). a) = (S’ .S '.o ). In general, an ETS is a triple (t.u.ct) where t.u £ S and a is an internal node of the code tree of the prefix-free code C u . The correct decoding resumes regardless of the next codeword if the state satisfies (6.6) and (6.7). that is t = u and a = A. 147 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. These states are said to be the first class recovery states, [f t ^ u. a = A. then the correct decoding resumes if the next codeword is c and 1. cGCt C\Cu. (5.S) 2. i't(c) = L’u(c). (5.9) 3. f ( t . L'e (c)) = f( u .v u(c)). (5.10) Such states are said to be the second class recovery states. Other error propagation states are said to be non-recovery states. We call the first class recovery states, second class recovery states and non-recovery states as error transition .s/fl/e(ETS) to distinct to states of the original FSM. 
There are totally \E = rn — 1) possible error propagation states where n(s) denotes the number of codewords in the code Cs. There are tn first class recovery states, rn2 — in second class recovery states and \1' — rn2 non-recovery states. For simplicity, the set of non-recovery error propagation states and second class recovery states is denoted by S — {r t. • • •. e.u}- where M = M' — rn. The whole first class recovery states are denoted by c0 since they have the same performance when concerning only on the MEPL and VEPL of an FSM. Assuming that at time j > i. the error propagation state is E3 — (Sj.S'.Qj ) (note that Oj is an internal node o f the code C5 '). Since the next source word is r j + 1 with code word CJ+i € Csr the parsing of a 7 CJ+i according to the decoding rule of MS Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. the coding scheme results in a concatenation of strings of the form Cjic72 • • • CjtjQj+i satisfying the condition that for each t : I < t < t}: cjt 6 If the state before oJ+I is Su = /(5 j.C jj. • • • .cJt]) . we would say that the ETS E} is transfered to EJ+l = (/(5 ,. r,). a ,+ i). Notice that the next error propagation state is a function of the current error propagation state and the current source word. Assume C\C>••• is a codeword sequence with initial state Sq = so- If there is no error in the sequence, it will correspond to an ETS sequence e0e0- -. Once a bit error occurs in a sequence of concatenated codewords from such a coding scheme, it will generate a sequence of ETS Eu. Ei.--- which may consist of some non-recovery states and some second class recovery states. Once a recovery state (first or second class) occurs after some ETS in £ in this sequence, the states after it will all be the first class recovery states and the error stops propagating. If the present ETS is non-recovery but the next codeword c satisfies the following two conditions 1. ceCt n cu (o.ii) 2. c t(c) = f„(c ) (5.12) 119 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. but /(< ,0f(c)) ^ f{u,w u(c)). Then the decoding of the codeword c is correct but the next state is either a non-recovery state or a second class recovery state. In the non-recoverv state case, c w ill not be considered as decoded correctly. In the second class recovery state case c may be considered as decoded correctly. This is the reason that the definition of MEPL is more complex than the memoryless Huffman codes. To simplify the calculation of error propagation length, we give the definition of error propagation length for one bit error as follows. D e fin itio n 5.1 Assume the iniiial state is .s 0 and the codeword sequence is C\C-> ■ ■ ■ C\ C'i • ■ ■ corresponding to source word sequence ~i ~;2 • • • ~ i- 1", • ■ If a bit error occurs in C ', and the correspondent ETS sequence is £ 2 • • • E, ■ ■ • Ei+kE,+k+i •• • where £’[ = • • • = — e0. Ej = to for j > i -f k + 1 and E,. ■ ■ ■. Et+k € S. then we call the error propagation length for codewords to be k + 1 and the error propagation length for source words to be ) where l(~.,+J) means the source word length °f 1.+J- Assume that the source data encoded by the scheme is random from a given Markov information source, and that only one bit error occurs. Furthermore, assume that the position of the bit error in the sequence of concatenated codewords subjects to the uniform distribution. 
Then the sequence of ETS forms a Markov chain assuming the initial state distribution is the stationary distribution. Let the matrix Q = ((h.j) ° f dimension M x M be the submatrix of the transitional m atrix of this Markov chain where c/,,j is the transitional probability from state c, to state 150 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. ej confined on the set £. The m atrix Q is said to be the error transition m atrix. Let p(et) be the probability of the event that the first non-first class ETS is e, € £. Let p = (p(ci). — .p( e.w))- We give here the formulas for the vector p = ( P ( c i p ( e.\/)) and the error transition matrix Q. In the coding scheme, the state sequence { 5 ,} ^ 0 (assuming we code a infinite source data sequence) forms another Markov chain(different from the error propagation state Markov chain). In order to calculate p. we need to know the stationary distribution of this state Markov chain. We depict a method for finding this distribution as follows. Let pfr = pf'/J.s,) be the probability of source word ';j given the state s,. Let be the probability of state transition from s, to _ s t . Then <>■•*= £ Pj\‘ - (5-1:}) Let T = (ti.t)mxm be the state transition matrix. If this m atrix corresponds to an irreducible finite Markov chain, then (p($j ). • • • .p(.sm)) is the unique solution of the following equation: (/>(.Sl).---.p(.Sm)) = {p{si).---.p[sm))T i £ :'i,p L s ) = i 151 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Having known p(si) for i = I. • • •. m. let us see how to calculate p = {p{cx). • • • .p{e\i)). Letting L(C3l) be the mean codeword length given state s, and L(S.C) be the mean codeword length for this finite state coding scheme, we have ^ ( ^ . ) = i ; ,p ( o k ) /( o 5l(-J)) (5.14) j =i and m HS.C) = '£ p (si )L(C,.). (5.15) i=i The probability of bit error occurrence at the Arth position in o J|('7J) when the current state is s, is , . , _ P(*i)p(~u 1-s) 1 ,J L(S.C) which is independent o f A. Supposing that the bit error occurs at the A-th bit of a codeword Os,(~j)- then when decoding, an error propagation state denoted as A) is reached. Thus ,* k ,)= £ Y. < - l6 > (i.j,k):E(s,.~,j.k)=ec (j,k):E{st.~,, ,k)=r, ' ‘ ' By this formula, the vector p = (p(e[). • • • .p(e.\f)) is obtained. Now let's see how to compute Q. Assuming that the current error propagation state is c, = (s(‘L fb), a(‘> ). and the next source word is ~ . the next error ■ rz Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. propagation state is then fully determined as a function of e, and 7 denoted as E(ei.-f). Then we have Let L{Alt) be the mean source word length of the set A 3 t and let L{S.A) denote the mean source word length of this scheme. Then n(s,) L(As.)= X > ( o k ) ' b J (5.IS) j = 1 and rn L(S.A) = '£i p(si)L(Aa,). (5.19) 1— t Let e = (s.t,a) be an ETS not belonging to the first class recovery states set. Then the next source word may be decoded incorrectly. In average this ETS state causes an error in tlie decompressed data of length L(AS). Define L(c) = L{AS). We have L [c)= £ > h l * ) x / h ) . (5.20) •■SA, Let L = (L(e,) : 1 < i < M). Let X be an all one row vector of dimension M. Then T h e o rem 5.1 The source word MEPL of the finite state coding scheme is MEPLS = L(S.A) + pQnV = L(S.A) + p ( / - Q)~lV (5.21) n= 0 153 Reproduced with permission of the copyright owner. 
Further reproduction prohibited without permission. and the codeword M E P L o f the fin ite state coding scheme is \IEPLC = 1 + f] p Q " P = 1 + p (/ - Q)~l i ‘ . (5.22) n = 0 where the last “= " in each equation holds if (I — Q)~l exists. R e m a rk 5.1 For an optimal prefix-free code of an FSM. the above MEPLs and MEPLC are reliable tight bounds when considering the insertion-deletion distance between source words. For example, for an EC-TC'Q. the MEPL, is meaningless, but if all the codevectors are distinct, then the MEPLC is exact the mean number that the codevectors arc altered (regardless of the timing). C o ro llary 5.1 Let a* be the variance of the error propagation length of the code words. Then X u2 = 5MEPLC + 2 Y i npQni ‘ - 2 - MEPL; - MEPLC + 2p( / - Q )"2l ' - MEPL; n=0 (5.23) where the last " = ” holds if (1 — Q)~l exists. 5.3 Index Assignment for FSM Now wc analyze the factors that affect M EPL, and M EPLC . .Just like for niemoryless sources, if the entries of Q and p are small, generally speaking. M EPLS and M EPLC w ill be short. In order to make the entries of Q to be small, we require the transition V 5-1 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. probability from an ETS to a first class recovery error propagation state, or at least a second class recovery state is large. Therefore, if the set of codewords from all state codebooks leading to the same next state is suffix-rich for all next states, then the code has better ability to recover the correct decoding. This is the following suffix rich principle. S u ffix-rich P rin c ip le (P l): fo r a finite state prefix-free coding scheme, in order to have short mean error propagation length, the codewords with different length leading to the same next state from all state codebooks should be suffix-rich. To make the entries of p small, we require that when one bit error occurs, the parsing and next state are both correct. That is Increasing F irs t Class Recovery ETS(P2): For source words having the same codeword length and leading to the same next state, assign the state codewords in such a way that the number of codewords having Hamming distance 1 is as large as possible. We give two examples to see how the two principles work. The first one is a first order Markov source w ith five states. The results by two different coding methods are compared. E xam ple 5.3 The state transition matrix for a Markov chain is as follows: 155 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Si S2 S3 SA S3 Si 0.250 0.250 0.250 0.125 0.125 S2 0.490 0.125 0.125 0.125 0.135 S3 0 . 2 4 0 0.260 0.125 0.250 0.125 s, 0.260 0 . 2 4 0 0.125 0.125 0.250 S3 0.250 0.125 0.250 0.125 0.250 The stationary distribution is (0.29S62.0.20406.0.18312. 0.14789.0.166-32). The codes designed by traditional Huffman algorithm are as follows. state Sy s '. ■83 Sa So 00 01 10 110 111 .s '. L 001 010 o n 000 •83 11 00 100 01 101 SA 00 11 100 101 01 S3 00 n o 01 111 10 For these codes. MEPLS = \IEPLC — 13.610S6 and cr? = 148.94238. The following codes are designed based on the principles Pi and P2. 156 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 
state Si s2 S3 S4 S r, Si 00 1 1 01 100 101 S -2 0 111 101 100 110 S3 00 1 1 101 01 100 SA 00 1 1 101 100 01 So 00 111 01 110 10 We just simply encode the source words leading to Si (the most possible state) to be all 0 codewords and encode source words leading to (the second most possible state) to be all I codewords. We then try to make codewords leading to s:i ( the third most possible state) as suffix-rich as possible and then so on according to decreasing order of the stationary distribution probability. For these code. MEPLS = MEPLC = 5.S2S-II and al = 22.95068. Example 5.4 H e now revisit Example 5.2 and compare the results for the code designed based on the principles Pi and P2 and the code used in that example. Code I is designed based on the traditional Huffman coding algorithm and Code 11 is modified from Code I based on principles Pi and P2. The two codes are shown in the following table. 157 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. State 0 1 0 Source Word 00 010 0110 10 0111 1 1 Code I I 00100 00101 0011 000 01 Code II 1 01100 01101 0111 010 00 1 Source Word 00 100 110 01 101 111 Code I 100 110 1 0 1 1110 1111 0 Code II 111 100 1 0 1 1100 1101 0 ■ 4 .s in Example 5.2. for both codes, the redundancy is 0.001692 but for Code I. MEPL, = 11.7797S and \IEPLC = 4.56063 with cr* = IS.05530. for Code If. MEPL, = 7.6IS 12 and MEPLC = 2.95012 with a/ = 4.0S301. Remark 5.2 Assume that we know the mean source word length L(S.A) of an FSM. then w e can roughly say that MEPL, = L(S.A)MEPLC . For example, in Example o .f L(S.A) = 2.5S74. for Code I. MEPL,/MEPLC — 2.5SS6. for Code If. MEPL,/MEPLC = 2.5S23. So the variance of the source word error propagation can also be roughly estimated as erj = L2(S. A)er/. 5.4 Conclusions In this chapter. \vc develop a formula to calculate M EPLs-the mean error prop agation length of source words and MEPL^.-the mean error propagation length of codewords for a finite state model. We also present the formula for the variance of 15S Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. error propagation length of codewords. These formulas can measure the robustness of a code and thus can help us to design a robust code. For this end. we present an intuitive method to design a robust code for an FSM. Even though theoretically the formula is true for any finite state, because the error transition state number is of the order 0 {m 2L(S,C)). where m is the state number of the original FSM and L{S.C) is the average of codeword length, the calculation of MEPL is only applicable for moderate state size and super code book size. However, principles PI and P2 are believed to be true for any FSM. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 159 Chapter 6 Robust Index Assignment for Finite State Vector Quantization 6.1 Introduction and Background As is known, a useful and frequently studied communication system model includes a source encoder and decoder, a channel encoder and decoder, a noisy channel and a mapping of source codewords to channel codewords. The mapping of source code words to channel codewords is called index assignment[5-1]. The importance of index assignment has been aware of for a long time and a lot of work has been done for scalar quantization, vector quantization and pulse coded modulation schemes. See. for example. [15, 1 -1 . 6-1. 93. -15. 5 -1. 3S]. 
For scalar and vector quantization and differ ent applications, some of the well known index assignment techniques are the Gray coding(GC'). natural binary coding(XBC'). folded binary coding(FBC') and pseudo 160 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Gray coding! PGC). For lossless source coding, there are also a lot of index assign ment work on variable length codes and the references on this topic may be found in [1021. Joint source and channel encoding is one method to combat against the total information loss. For this topic, we refer to [IS. 4. 21. 52. 22. 23. 55. 82. 17]. When the channel error frequency is very low. the effect of joint design is dominated by the source encoding. In this case, good index assignment can reduce the channel distor tion and it is the main concern of this chapter. There are also intense works on code design for set partitioning and trellis coded modulation [62. 84. 51]. In this chapter, we only focus on the case that the codebook has been designed suboptimally in some sense and the channel error probability is very small. If the channel is noiseless, the distortion is totally caused by quantization. However, when channel error occasion ally happens, it will cause additional distortion which is called channel distortion. In [19], it has been aware that some finite state vector quantizers!FSVQ) can be catastrophic. In order to obtain robust FSVQ. some rules are presented in [26] to design robust trellis-coded vector quantizers(TC’YQ). It is also well known that a special finite state machine!FSM). shift register decoder[31] has very good perfor mance and we believe it has the shortest error propagation length, i.e.. it can recover from wrong paths as soon as possible. The error propagation of TC'Q systems using encoding circuits with and without feedback is also discussed in [52]. One question arises: is it necessary to study the robustness of trellis coded quantization!TC'Q) 161 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. again? The answer is YES and we can illustrate the necessity by Example 6.1 and 6.2 given in Section 6.5. In fact, in our opinion, the robustness of FSVQ and TCYQ can be characterized by the average channel distortion(AC'D) and the mean error propagation length (MEPL) which will be defined in Section 6.3 and 6.4. Based on these measurements, pseudo gray codes can be generalized to FSVQ. This chapter is organized as follows. In Section 6.2. we briefly review the concept and notations of FSVQ. In Section 6.3. we deduce a formula calculating the ACD for FSVQ. In Section 6.4. we discuss the robustness of an FSM based on MEPL. In Section 6.5. we present two index assignments for achieving small ACD. In Section 6.6. we offer several numerical examples. The chapter is concluded in the last section. 6.2 Finite State Vector Quantization In this section we briefly introduce FSVQ and its notations. For details about FSVQ. we refer to Chapter 13-15 in [31]. As is known, the FSVQ is a VQ with memory, which is a special case called recursive VQ or feedback VQ. Let V be the vector space, generally an Euclidean space. Given an input sequence of random vectors A n € V . n = 0. 1. • • •. the encoder produces both a sequence of channel symbols un. n = 0 ,1.---, and a sequence of states Sn. n = 0. I.---, which effectively describe the encoder's behavior in response to the sequence of input vectors. 
The channel symbols un take values in a channel alphabet A*. The states Sn take values in a set S = {so. • • • ..s\/_i} which is called the state space. In order to track the encoder 162 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. state from the sequence of channel symbols without additional side information, we require that the states be determinable from knowledge of the initial state Sq. say. s0- To characterize the FSM. we also require the next state function Sn+I = f( u n.sn). for n = 0.1.2. •• •. the encoder mapping(index mapping) un = q(.V „. Sn). for n = 0. 1.2. • • • and the decoder mapping .Y„ = J(un.Sn). for n = 0. 1.2. ••• . We call the set Cs = {J(ti.s) : all u € A } state codebook and call C = the super codebook. In this chapter, we assume that for each state, there are 2m (ni > 1 ) codevectors, i.e.. |C50| = • • • = |C'su_, | = 2 m so that all indices are encoded as binary strings of length m. In an FSM. even though the encoder mapping is essential, its values can be changed among A ” as long as there are no channel errors. 163 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. However, when there are channel errors, the performance of FSM with different encoder mapping may be very different and we will discuss it in the following sections. Assume that the initial state is 50 = s0. Given an input sequence of random vectors X n, n = 0 .1 • • •. according to some ruies. the sequence is decoded as A'„ = 3{un.Sn), where un = a (A '„.5 n). 5n+t = f(u n.Sn) for n = 0. I.---. Let r/(-.-) denote the distortion measure on V . Then the total distortion for the FSVQ is X d = Y .d (X n.3(a(Xn.Sn).Sn)). (6.1) n=0 When the source sequence is very long, the total distortion may be huge. So instead of the total distortion, we may consider the long term average distortion which is defined as [ v = J-im T £ d{Xn.3(a{Xn.Sn).Sn)). The design goal for a FSVQ is to minimize the long term average distortion. This topic has been extensively studied and a lot of techniques have been developed. In this chapter, we simply assume that the FSVQ is optim ally designed according to some rules and we use the Viterbi algorithm to achieve the minimum average distortion. In practice, because the source input is finite, minimizing the average distortion is equivalent to minimizing the total distortion. 164 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. In the following, we call a shift register decoder[31] a simple FSM. It is not only simple but also has good performance. That is also the reason that the simple FSM is so popular till the present. 6.3 Channel Distortion for FSVQ In Section 6.2 we reviewed FSVQ models. In this section, we will conceive a formula to calculate the average channel distortion (ACD) when there are bit errors in chan nel transmission. In this chapter, we assume the channel is a binary symmetric chan nel and the bit error probability is e . As long as 0 < e 1. when one error occurs, it w ill propagate only finite steps(paths) for a well-designed FSVQ. and after a long pe riod. the errors may occur again. When e is very small and the ACD is our concern, we may assume that there is only one bit error and we w ill make such assumption in the rest of this chapter. This can be justified by the fact that when the input data sequence is very long. ACD ~ (total channel distortion) /( total bit errors). 
As in Section 6.2, assume that the initial state is Sq = s0 and the input sequence is A'„. n = 0. I.---. According to some rules(for example, the Viterbi algorithm), the sequence is decoded as ,V„ = 3(un,Sn). where un = a (A '„.5 n). S’,l+ i = f{un-Sn) for n = 0. I.---. Let denote the distortion measure on V. Then the total distortion for the FSVQ without channel error is X dX = Y , d(Xn-3^ X n-Sn).Sn)). (6.2) n = 0 165 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Assume that a bit error causes the receiver to receive u, instead of u, = a (AT St) for some z = 0, 1. • • - . Then the new distortion will be < * 2 = Y d{Xn. 3{a(Xn. Sn). Sn)) + d(Xt.3{ut. 5,)) + Y d(Xn. 3(a(Xn.Sn).Sn)) n = 0 n = i + l (6.3) where 5'1+I = f{u t. St) and S ’a+I = /(« „. 5'T for n = i -f I. i + 2. • • •. We call < / _ > — d{ the channel distortion since it is caused by a channel bit error. Even though dt and r/2 may be very large, generally their difference is finite when the FSVQ is well designed. If the input sequence is encoded by the V iterbi algorithm with infinity delay (theoretically). since is minimal, generally d2 —di > 0. For the mean squared error (MSE) distortion, when all the codevectors are real, a simple calculation can show that d2 - d \ -- r/(J(iq. S,)). J(ut. S',)) + Y, d(J{a( X n. Sn). >’„). J(o( .V„. Sn). Sn)) »=i+i + 2(.V. - Jiu,. 5,). J (u,.St)~ S,)) (6.-1) X + - Y {Xn - 3{ct(Xn.Sn).Sn).3{a{Xn.Sn)-$n) - J{c*(Xn.Sn).Sn)) n = i + l where (■ , •) represents the inner product. When the codebook is optimized, the codevector is the centroid of vectors having it as reproduction. So the mean of -V, — 3{ut, St) and A'„ — 3(a(Xn. Sn). Sn) are the zero vector. Once .V, is encoded as 3[ui.St). 3(u,. S\) — S'i) is independent of AT Similarly, once A* is encoded as 166 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. &{ct(Xn,Sn),Sn). j3(a(Xn,Sn).Sn) - 3(a(Xn,Sn),Sn) is independent of .Yn. Hence taking the average over the random inputs, we have E x ^ - d , ) = E\d(i 3{ u,. Si)), 3{ Ui. Si)) +E x d(J(a(Xn, Sn). Sn). 3 (a(X n. Sn). Sn)) n = * + l ■ (6.5) Similar discussions about channel distortion in the case o f scalar and vector quanti zation can be found in [22. 95. 54. 58]. It is hard to analyze d2 — </j directly when the measure is not MSE. However, as long as </(-.-) is a metric cn V . we have the following inequality: d-t-di < d(3{ul.Si)).3(ul.Si))+ ^ d(3(a(Xn.Sn).$n). 3(a(Xn. Sn). $n)). (6.6) n = i+ 1 We call the right-hand side of (6.5) as the average channel distortion and the right- hand side of (6 .6 ) the channel distortion. We now conceive a compact formula for ACD for some source models. The source models can be further characterized bv the statistics of the FSM. According to the « O next state function, we can equivalently define the FSVQ mechanism as follows: 167 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. States ■So S l S . u - l • - ‘'o eg ex 0 f'M — 1 l-0 eg e\ c .u -‘ ^ . U - l e ? u . 1 C tr-i cf,:! where Cf is the subset of the state codebook C3 , such that when the current state is .s , and the input is decoded to C'J t . the next state is Sj. For eacli codevector set C’/. it corresponds to a binary channel codeword (index) set of length ni. Note that, some CJ , may be empty. According to our previous assumptions. I' = UjL^1 ( for i = 0. 1. • • •. .\/ — 1 and it is called the channel codebook. 
If a bit error occurs in the channel symbol un = a(Xn. Sn) such that the received channel symbol is u'n. then the next state w ill be decoded as f(u'n.Sn) which may be different from f(u n.Sn). If f( u n.Sn) = f(u r n.Sn). the next errorless channel symbol will be decoded correctly so that there is no error propagation. If f( u n. Sn) 7^ / ( u'n. Sn). the error will probably propagate. Generally, we call the pair (st. Sj) state- transition state (STS) where .s , represents the next correct state and .s } represents the state obtained by the decoder which may be incorrect, .s , = definitely means that the error propagation will stop and s, ^ Sj generally means the error will propagate. We call (s,. st) recovery states for i = 0. • • •. — 1 and call (s,. s,) for i 7^ j - i ' j € {0. • • •. .\/ — 1} error-transition states. Assume the present STS is (s,.Sj) and the error propagates. When the next input is encoded as u;.. we get another 16S Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. state pair (sj.s'-). where s[ = f{uk. s,). s'- = f{uk,Sj). By the above analysis, FSVQ induces a finite STS model where the state number is A /2. We simply denote the STS by Sij for z, j 6 {0.1. - • -. M — 1} and the transition probability m atrix is denoted by T = (tfj ).v2x.u2 where tfj means the probability from STS to Note that tfj represents the (i.\[ + j.k M + /) entry in T . Crossing the (z ’A/ + z ’)-th row and column of T for z = 0. I. • - •. A/ — 1. we can get an error-transition probability m atrix Q of size (.V/2 — A/) x (.I/2 — .1/) with entries = t^‘ for z ^ j and k ^ I. The above idea about channel error analyses can be found in [82] in some sense. The discussions in [S2] and in this chapter are in fact a special case of Markov chains with rewards. When an error occurs, it may produce a state pair (.s,.s,). which is the state of the new Markov chain. Let pt] denote the probability to generate the pair (s,.s7) and p be the row vector consisting of such that z ^ j. Let dltJ denote the average distortion caused by the error state and d be the row vector consisting of z /lt J such that z ^ j. Xote that the subscripts of elements in p. d. Q and the subscripts and superscripts of elements in Q are two dimensional. For simplicity, we arrange them by lexicographic order even though other arrangements do not affect the channel distortion. We now discuss the formulas to calculate p. Q and d. Let u'i'j = H V £CJ f°r f = 0. I. • • ■. A /— 1. where pUi-s,) is the probability of codevector r conditional on state s,. Then we can get a stochastic m atrix about the states so. • • •. s u_i. We assume that each state is accessible from other states directly 169 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. or indirectly so that the Markov chain is aperiodic. Then we can find the stationary distribution n = (~0, • • •. rr.u_i) such that IUT = n where H' = (wtj ) \ i x,\{. The component of vector p can be obtained as M - l Pt.j = ^ ~k 51 5Z P r ^L 'i vk )• (6-7) k = 0 l - f ( u ‘k .3k ) = 3 , n : /( u " ,s * ) = 5 j where Pr( vi -> ef) is the probability that the codeword (index) uJ t corresponding to rj is changed to u* corresponding to e* by the channel error. Note that by our notations in this chapter. "/ : f(u l k.Sk) = s," means that ujj. is taken from i k while the correspondent e[. € C l k. The entry qfj of the error-transition probability matrix Q can be calculated by <# = Z P (t-rh ). 
(6.8) J ( u ? . S , ) = S k .f{ U ? , 3 J )=S, Now let us see how to calculate dtJ where d,tJ is the average distortion when the correct state s, is corrupted to state Sj. Hence (l‘.j = H (/(i\ . J ( o ( r f..s,). .s _ ,))p( ef l-s,). (6.9) ‘fee,, Let t/0 denote the average distortion caused by a bit error without error propagation. Then . W - l 2 m — 1 2 m — i H Z ] 51 P(f'- ls«)^r ( ^ “ > (’f)- (6.10) 1 = 0 7 = 0 fc= 0 170 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Combining (6.7-6.10) vve have T h eo rem 6.1 The average channel distortion for an FSVQ is .4 C D = (/0 + f ; p Q nd'. (6.11) n = 0 Proof d0 is the average channel distortion at the instant the bit error happens. p Q ” d' is the average channel distortion at instant n + 1 after the bit error occurs for n = 0. 1. • • •. Hence their summation is ACD. □ 6.4 Robustness of FSVQ Another important concern is how far the error will propagate. In this case the mean error propagation length(MEPL) will measure the robustness of the FSM. Equation (6.11) is the formula for ACD. A little modification of (6.11) can achieve a formula for calculating MEPL. When we assume d0 = 1 and d = 1 = ( !.- • • . 1) be the all ones vector of size M 2 — M. ACD is in fact the MEPL which is the average number of wrong decoded codevector caused bv one bit channel error. We denote it as X MEPL = 1 + £ p Q 'l'. (6.12) n = 0 Roughly speaking, the MEPL is ACD when the distortion measure is the Hamming distance: d(vj.vl k) — 0 if / = k and j = / and d(vJ t .v! f.) = 1 otherwise. If MEPL 1 71 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. is finite, we w ill say that the FSM is robust. Otherwise, the FSM will be called catastrophic. As a matter of fact, the results can be easily generalized to general bit error models. As is known, the statistics of FSVQ is generally unknown and it seems to restrict the application of the formula (6.11). However, the statistics can be approximated by conditional histograms from training data [31]. For simplicity, we may roughly assume all states and codevectors to be equally likely. These assumptions may affect the exact values of MEPL and ACD. but will not affect the robustness of the FSM which is only determined by whether MEPL is finite or not. On the robustness of FSM. we have the following result which is similar to the one for the robustness of prefix codes[l02]. P ro position 6.1 .4 sufficient condition for an FSM to be robust is that Q defined in (().$) has no eigenvalue 1. Proof. Let A ,-, i = 1. • • •. .\/2 — M. be the eigenvalues of Q. Since 0 < r/f^ < 1 and < I- we can conclude that for any A,. —1 < A, < 1. Let Am, ix = max,-!....,a/2-.u {|A ,|}. A theorem about spectrum radius of an operator [12] says that lim ||Q'l ||" = Amar. n— ¥ x Hence Amfir < I means that for 0 < S < I — Amar there exists an .V > 0 such that if n > .V then ||Qn|| < ( < $ + Am a J-)n. Therefore for n > X we have ||pQ” l|| < (< J + Amar)nv /m ||p||. This implies that MEPL < oc since S -f Amar < 1 . □ 172 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. P ro p o s itio n 6.2 .4 sufficient condition for an F S M to be catastrophic is that fo r each state, there is no same index between paths which reach the state. That is to say. if f{ui.Si) = f(uj.sj) and Si ^ Sj conclude u, ^ uj, then the F S M is catastrophic. The above conditions for the catastrophe of FSM are not necessary. 
For example, for the following FSM. MEPL=oc ■ S 0 •si So •S 3 •*0 0 1 0 1 •S 2 1 0 •S 3 I 0 while the sliding block window method gives MEPL=3. By classifying the STS as done in Markov chains, we can find all of the equivalent classes. Then by checking the initial state probabilities, we can determine whether it is robust or not. Rather than to classify the states we present a computation based criterion. T h e o re m 6 .2 Let 6 > 0 be any positive constant. Then the F S M is robust if and only if lim p ((1 + £ )I — Q )-1 1( < oc (6.13) ( 5 — ► o where I is the identity matrix having the same size as Q. 173 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Proof: Since Q is a submatrix of the stochastic m atrix T. we know that — 1 < A < I where A is any eigenvalue of Q. Hence (1 + < 5)1 — Q is invertible and for any < 5 > 0. p ((1 + <5)1 — Q)~l I 4 is finite. Note that E PQ"1' > £ P(-TTl)nl' = (1 + *)P((1 + *)I - Q )-1 (6.14) n = 0 1 + 5 Hence if limj-^o P ((1 + < 5)1 — Q) 1 l 4 = oc then the FSM is catastrophic. On the other hand, if the FSM is catastrophic, then for any positive number A > 0. there exists a positive integer .V.i such that 5Zn=0 p Q nl 4 > .4. Hence • V 1 Q lim(l + <5)p ((I + <5)1 - Q)_I l 4 > [ i m ^ p f - — t)'1!' > .-1 Q nZ 0 *• 4 " A which means lim < 5_ + 0 p ( ( I + A")/— Q) 1 l 4 = oc. □ By Proposition 6.1 and the proof of Theorem 6.2. we can conclude that C o ro lla ry 6.1 If I —Q is invertible, then ACD = do + lim p ((1 + < 5 )1 — Q) d 4 = do + p ( I — Q) *d4 (6.15) S -* o and MEPL = 1 + lim p ((1 + <5)1- Q ) ' 1 l 4 = 1 + p ( I - Q ) " l l 4 . (6.16) 5->0 174 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Remark 6.1 Define MEPLs = 1 + p ( ( l + £ ) I - Q ) ~ I l f. (6.17) By (6.1J). we have MEPL > MEPLs for any 8 > 0. So for gome 8 > 0. if MEPLs is larger than some pre-designed threshold, we can say that the FSM is catastrophic. Furthermore, for very small 5. we have /I CD ~ f/o + p ((1 + 8)I — Q ) d' (6.18) and MEPL % 1 + p ( ( l + £ ) I - Q ) " “ 1'. (6.1!)) 6.5 Index Assignments for short ACD The goal of index assignments is to obtain an index mapping such that when bit errors occur, its ACD or MEPL is small. Since the index assignment is redundancy free, a good index assignment is expected no matter how reliable the channel is. In the rest of the chapter, for simplicity. We also denote the ACD of an FSVQ by D. 175 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Consider a simple three state case. Assume that codevectors from different states are different. The state transition is depicted in the following diagram. If it is labeled as •^0 S l ■ S 2 So 0 1 S l 0 I S i 1 0 then by Proposition 6.2 MEPL=oc and hence in general D = oc. If it is labeled as •S o ■Sl S i ■ S o 0 I •Sl 1 0 •S 2 1 0 and all the available transition probabilities are 0.5. then M E P L= 5j. Now we discuss general FSM models, including FSVQ. It is known that for a trellis coded quantizer(TCQ). the sliding block decoder (SBD) is robust, i.e.. the number of codewords that are decoded incorrectly will be L + I where L is the shift register constraint length. However, it is unnecessary that SBD has the smallest ACD. The following are two examples to illustrate it. E xam ple 6.1 Assume that the 256 gray level is quantized as levels 16. /f8. SO . 112. /././. 176. 208. 
2-iO and the distortion measure is the widely used mean squared 176 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. error(MSE). Assume we want to use 2 bits/ppx to quantize a 256 gray level image. How should we label the following TCQ or FSVQ S o Si S o l’o o = 16 i-oi = S O vq2 =144 i-'o 3 = 20S Si t’io = -1 8 t’ii = 112 V[2 =1/6 i’i3 = 240 so that it is the most efficient? One solution is that it is labeled by the natural binary code(SBC) according to its natural order. But it is only optimal for uniform sources, [f the source is Gaussian or Laplacian. we suggest a Gray code(GC). [f the block length of index is larger than 2. we suggest a folded binary code(FBC) [5.{j. Let us see the average channel distortions (ACD) corresponding to different index assignments. For a uniform source and SBC, ACD= 10752. For uniform source and GC'(or FBC. which are the same in this case). ACD= 12800. If we assume that under state sx. Pr(vlQ ) = Pr(v,:i) = 0.15. Pr(vtl) = Pr(v,2) = 0.35 for i = 0.1. then for SBC. the AC’D=10752. For GC\ the ACD=9522.2002. If the codevector is multi-dimensional, the index assignment is more complicated and we have to develop the pseudo Gray code [92. 92] applicable to vector quantization) \ Q) for the case of FSM. E xa m ple 6.2 (2-bit shift register). Let us consider the case of two bit shift register. Let the distortion measure be ,\ISE. We assume that each state and each codevector are equally likely respectively. So in this case the four states are 00. 01. 10 and II. If .3(0,00) = 100.0. .3(1.00) = 0.0. .3(0.10) = 1.0. J(1.10) = 101.0. .3(0.01) = 102.0. 177 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. j3( 1.01) = 2.0, 3(0.11) = 103.0. 3(1.11) = 3.0. then the ACD is D = 20005.0. One can see that the ACD is a iittle large even though the error propagation length is the shortest: MEPL=3. If we do not use the sliding block states, and simply change the channel symbol function (index function) such that q( 101.0.01) = 1. a(l.0.01) = 0. then the ACD= 10015.0 and MEPL=6. Another solution for the above model is that we simply assign a( 101.0.01) = 00. a (1.0.0l) = 10. But this w ill change the trellis structure so that the TC’Q machine is not optimal. We think that the joint source and channel encoding [IS. 4. S2j may avoid the happening of this situation. Note that, the next state actually depends only on the current state and codevector. Index function has no contribution to the performance of a quantizer as long as there is no channel error. In the following, we w ill discuss the index function assignment problem so that when the binary symmetric channel has low cross error probability, the ACD can be as small as possible. Before we give the cost function based search algorithm, let us give some heuristic thumb-up-rules which are the following. In tu itiv e In d ex Assignment: 1. If there are parallel paths, assign the indices of the parallel paths in such a way that as many as possible of them have Hamming distance I. 2. Make the channel codewords entering the same state be the same as many as possible and at the same time make such states as many as possible. 17S Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 3. When all of the indices have been used up. for the states that are left, repeat the above steps and keep the indices in the same order. 
The first condition makes the norm of p to be as small as possible. The second condition makes qfj to be as small as possible once and s3 may reach the same next state. The third condition makes the elements of Q n to be as small as possible for some fixed n in some sense. The sliding block window FSM exactly satisfies the above conditions and it indeed has the shortest M EPL even though it may not have the smallest ACD. In order to get smaller ACD for general FSVQ. we can use the "index switch method” to get better channel codes. For this purpose, we use the ACD as the cost function. We assume the channel codebook is the set t = {c0.---.c 2m_!}. The following algorithm is a development of pseudo Gray coding[92. 93]. A lg o rith m fo r F S M : 1 . Initial Index Assignment: Assign the initial indices according to the intuitive method. 2. Switch Inside State Codebook- (a) Set i =: 0. (b) For state .s,. assume the state codebook C '3 i — {e^.-- - . which is a permutation of L’. Calculate the average channel distortion D. set j = 0 and Dj =: D. 179 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. (c) For k = j + 1. • • •. 2m — 1, switch the codewords c‘ and c‘ k temporarily. Cal culate the average channel distortion Dfc. Let [\ = a rg m in {D j. • • •. Update the state codebook by switching c‘ and c'A -. (d) Set j =: j + 1 and DJ+l =: DK. Repeat (b)-(d) till j = 2m - 1 . (e) Set i =: i + 1. Repeat (6)-(e) until i = .1/ — I. 3. Switch Among Super C'odebook: (a) Set i 0. Calculate the average channel distortion D and set D, =: D. (b) For j = / + I. • • •. 2m — I. switch the codewords c, and c, temporarily among the super codebook. Calculate the average channel distortion Dj. Let fc = arg min{Z)t. ■ • •. Dim_v}. Update the super codebook by switching c, and ck among the super codebook. (c) Set i =: i -f 1 and D, =: Dk. Repeat (b)-(c) till i - 2m — 1 . -I. Switch Inside Parallel Paths: If there are more than two parallel paths, for such parallel paths, use the switch method in VQ to find the suboptirnal en coding method. In these cases, switches are restricted only to the parallel paths entering the same state. Because the switches may cause the ACD to change, we still use the ACD as a cost function. ISO Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 6.6 Numerical Examples In this section, we will see some simulation results for the distortion measure MSE. We assume that the initial codebook is taken randomly or according to some rule. We assume that the initial codebook is suboptimized by the well-known generalized Lloyd algorithm. In the following, we use the FSM to deal with the quantization of the famous "lena“ of size 256 x 256 and gray level 256. For simplicity, we iteratively use "Iena” to train the codebook untii the final codebook is stable. Let us see how the above algorithms can reduce the ACD. E x am p le 6.3 Consider the following FSM with suboptimal coderectors which is truncated to integers in the following table *0 *1 • s2 * 3 - s0 Coo = 23 i’oi = 94 i’o2 = 66 f,0 3 = 159 -s t r» C II CO o t - n = 28 t ’ l 2 = 4 1 c ,3 = 99 S) cjo = 61 f ’ ) | = 75 II o 1 t ’2.3 = 133 *" V 3 e-jo — 157 t'3 i = 52 t'32 = 130 {’3 3 = 195 with histogram pQ 0 = 0.28421, pot = 0.24543. poi = 0.29S55. pm - 0.17181. pl0 - 0.31978. pn = 0.50532. pn = 0.10127. pl3 = 0.07363. p20 = 0.21613. p>i = 0.13464. Pn — 0.37250. p _ > 3 = 0.27673. p3 o = 0.18411. p3 1 = 0.03143. pvl = 0.36342. 
p3 3 = 181 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 0.42104. For simple FSM, ACD = 7691.3. After switching, the .4CD = 7165.3. The gain is 0.3017 dB. The index assignment is as follows: Q ( L’o o . S o ) = 00. Q ( l * o i. S o ) = 01. o( l ’o 2 . S q ) = 10. o ( L ' 0 3 . S o ) = 11. **( l’to< - si ) = 10. q ( i*oi - So) = 0 1 , a ( i ’o i. So) = 0 0 . a ( i'oi. so) = 11. t ’o i - S o ) = 0 0 . a ( i * o i . S o ) = 0 1 . q ( i ' o i . s o ) = 1 0 . a ( i ’o i . s o ) = 1 1 . o (fo [.s0) = O l.a (i’oi.so) = 00. o(roi. So) = 10. a( t’oi. s0) = 11. For the simple FSM. MEPL=2. But fo r the channel encoding with less channel distortion. MEPL=2.J,S. Now let us see an FSM with parallel paths. Exam ple 6.4 This FSM is a generalization of 2 shift register state FSM with two paths. The codevectors are shown in the following table. So S l S'2 S3 So 25. no 88.131 S l 131. 39 6 4 .156 S'2 86.138 60.170 S.3 171.59 28.198 182 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. with histogram p(v°\s0) = 0.23522, p frj^ o ) = 0.41451, p(v$\s0) = 0.170S4, p{vf\s0) = 0.17943. p{v°\si) = 0.24878. p[v°\Sl) = 0.27097, p ( ^ | 5 ,) = 0.27547. p(r*|5 ,) = 0.20478, p{vx\s2) = 0.30503. p{v[\s2) = 0.34977, p (^ |s 2) = 0.23426, p(c?|5 2) = 0.11094. p{v^\s3) = 0.07743. = 0.17SS3, p (r0 > 3) = 0.43291. p [ij\s 3) = 0.310S2. The intuitive algorithm suggests the index assignment as follows o(25. 5 o) = 00. o ( 1 1 0 . 5 0) = 01. q (8 8 . 5 o) = 1 0 . o ( l 3 l . 5 o) = 1 1 • 0 (1 3 1 .^) = OO.o(S9.5,) = 01.0(64.^!) = 10. o(156.5i ) = 11. q(S6 . 5 2) = 00. o( 138. 5 2) = 0 l.o (6 0 .5 2) = 10. o( 170. .s 2) = 11. 0(171.5.-,) = 00. o(59.53) = 01.q(2S.53) = 10. o( 198.5 3 ) = 11. For this index assignment, MEPL=2.0 and ACD = 11377.0. After switching, the index mapping is as follows o(25.50) = OO.o(U0 . 5 0) = O l.o (8 8 . 5 0) = IO.o(131.50) = 11. 0(131.5!) = 01.0(89.5!) = 00.0(64.5!) = 10.o (156.5,) = 11. 0 ( 8 6 . 5 - 2 ) =00.q(138.52) = 0 l.o (6 0 .5 2) = 10.o(170..s2) = 11. o(171.5 3 ) = O l.o (5 9 . 5 3 ) = 00. o(2S. 5 3 ) = 10. o(l9S. 5 3 ) = 11. 183 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. In this case, the MEPL=2.0, ACD = 7303.1 and the gain is 1.9252 dB. .4 careful reader may notice that the above encoding method is exactly the intuitive one if we arrange the codevector in the increasing order. This is really the case when the codcvector is scalar. However, for multi-dimensional case, the judgment is not so easy and the generalized pseudo Gray coding must be enforced. 6.7 Conclusions In this chapter, we deduced a formula to calculate the average channel distortion. When bit errors occurred, we discussed the robustness of a finite state machine. Considering the average channel distortion as a cost function, we presented an in tuitive index assignment and an index switching method to achieve FSM with low average channel distortion. We also provided two examples to support the efficiency of the algorithms. There is an open problem which is interesting theoretically. It is known that for a uniform source and scalar quantizer, the .\BC is an optimal index assignment which minimizes the average channel distortion when the distortion measure is MSE [15. 5 -1. 38]. Having defined the average channel distortion for TC'Q or FSVQ. can we have the similar result in the FSVQ case? 
We prefer the similar result even though a mathematical proof is still not available. 18-1 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Chapter 7 FSVQ with M inim um Channel D istortion D etection for N oisy Channels 7.1 Introduction and Background Joint source and channel coding has been extensively studied. In [47]. A. Kurtenbach and P. Wintz discussed the optim um quantizer which minimize the total mean square error for any given probability density on the input data and any given channel matrix. In [21]. X. Farvardin and V. Vaishampayan developed an iterative algorithm for obtaining a locally optim al quantizer and coder where the source and channel are memoryless and the quantizer is of zero-memory. The joint source and channel coding method was generalized to other quantization structure which may cause error propagation. For the special topic of trellis-coded quantization (TC’Q) for noisy channels, we refer to [IS. -1 . 2o. 82. S9]. I So Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Even the quantizer is designed optimal for noiseless channel, for noisy channel, different index assignments may cause different channel distortion and it also affects the suboptimal jo in t source-channel quantizer. For such topic, we refer to [92. 93. 54, 99]. In [25. 82], the authors discussed the joint T C Q /T C M problems. But the de tection for TC’M is separated from TCQ. For trellis coded modulation, the authors used the Viterbi algorithm to finish the decoding and the decoded sequence is the one with minimum Euclidean squared distance from the sequence of received values which results the maximum-likelihood decoded sequence. However, even though it is optimal with respect to symbol error ratio, it is generally no longer optimal with respect to total distortion caused by quantization and noisy channel because of the memory of the quantizer. In [66]. M. Soleymani and C. Xassar applied the maximum a posterior^ MAP) detection to trellis quantizers over AWGX which is not optimal about distortion either. In this chapter, we w ill make the decision upon the distor tion. The resulting total distortion w ill be lower than MAP detection even though the bit error ratio w ill be higher. Roughly speaking, the previous methods can be thought as hard decision method and the method presented in this chapter can be considered as soft-decision method. 1S6 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 7.2 Joint Source and Channel D etection 7.2.1 Minimum Channel D istortion Paths for FSVQ As discussed in the previous chapter, channel distortion is caused by channel noise. Assume the initial input channel symbol sequence is i ' = «0u i---u .v with initial state s0- When the sequence is passed through a noisy channel, the decoded sequence may be I = uQ u{ ■ ■ ■ u \. In this case, the channel distortion is defined as •v dc = = ^2 d(J(un. Sn). J(un. Sn)) (7.1) n = 0 where SQ = 50 = s0 and 5n+i = /(« „ . 5’„). 5„+i = /( « „ . S„) for n = 0. 1. • • •. .V - I. In general case, it may happen that f( u n.Sn) is not well defined. In this case, we may assume that dc = dc. For simplicity, in the following of this chapter, we assume that for each state, there are 2m codevectors, i.e.. |C3o| = • • • = |CJu_, | = 2m so that all indices are encoded with the same length and f ( u n.Sn) always makes sense. 
W hen an FSVQ is determined according to the training data or the statistics of source data, the goal of joint quantization and detection is to find the bit stream that minimizes the channel distortion. Turbo coding and SISO coding are active and effective detecting method with soft output which makes the estimation of probabil ity of input bits possible. For example, the APP algorithm or MSM algorithm is a generalized Yiterbi algorithm. For one iteration SISO. when making hard decision. 187 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. it is exactly the Viterbi algorithm. The soft output can be used to further improve the performance when the source is memory or the quantizer is memory. Assume that according to the soft output, the probability distribution for un is P(un^) = Ph f°r n = 0.1, • ••.A ’, i = 0 .1 .-...2 "*-1. The previous work [4. IS. 21. 25. S2. S9] on joint source and channel quantization decide the channel symbols to be the one with maximal probability. This method is simple for implementation, but the channel distortion and hence the total distortion may not be minimal. In fact, for any possible path among the total 2m * ' + l) paths, there is a correspondent average channel distortion. Among the 2m*N ,+ 1* available paths, we choose the path having minimum average channel distortion. Roughly speaking, when the quantizer is well designed, the total distortion d = dc + d,, and hence when we choose the path with minimal dc. the total distortion is near minimum. For detailed analysis, please refer the previous chapter. Assume the 2m(A+I) paths are l'u. •••. £ 2 m(.v+n_t with probabilities p (i'0). • • •. p(L • 2'n(.\-+i!_1 ) ■ Then the average channel distortion when choosing i \ as the decoded path (or channel symbol sequence), is •> m( .V+ I ) _ y < C . = ' E A L 'A 'M L -,). (7.2) J = 0 So the path with minimal channel distortion is argmin1 = 0 ... 2 m ( . v + u _ I {dC t}. (7.3) 1 S S Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. When the channel symbol sequence is long, it is impractical to find the minimal path according to (7.3). Fortunately, the minimal path can be determined bv the Yiterbi algorithm. We first give the algorithm to find the path with minimal channel distortion. 1 . Assume the in itia l state is So = -s o with p(s0) = l.p (s i) = •• • = p(.s.u-i) = 0. At instant 0. the minimal channel distortion for each state is d(0)(*o) = O i % 1) = - = ( f V i ) = x . 2. Set i := 1 . 3. Assume that at instant i . the probability distribution for states is o). • • •. /^(■s.\/-i). The present minimal channel distortion entering state is S l^(sj) for j = 0. 1.- -..V/ — 1 . The probability distribution of channel symbol is A ^ttriic 4 : 1 1 = /< "1,‘A ■% ' i- we define ° lw c = i : rf (-J( « !;„> -> . .#( «<!.■>-)) c.-n l.n The channel distortion corresponding to .s a t instant / + L for j = 0. • • •. .U — 1 is defined as 4 i+ l,( ^ ) = , min U l)(*i) + Di‘]\ (7.5) 3“* ‘ .S ,)=S j 1S9 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. We also denote = arg minu The probability distribution of states Sj'+1) for j = 0, • • •, .1/ — 1 is Pli+lH *j)= £ (7.6) tt (•) \ f [ U [ .3k ) = 3j -I. If i < .\ . set i := i + 1 and go to Step 3. 
Otherwise, the minimum channel distortion is = (7-7) Let = arg niinJ=0,.....vr_i Tracing back, we can obtain the chan nel sequence ••• u^K which is the one w ith minimal average channel distortion. In the above algorithm, we assumed that the statistics of each state is equally likely and that the transitions leaving each state are also equally likely. It is clear that the encoder outputs at time n. un. conditioned on the previous state .S ’,, are not equally probable. So the above algorithm can be improved considering the prior probabilities leaving each state. Xote that when the statistics of the transitions leaving each state is known, and assuming that the process is stationary, we can obtain the statistics of the states. So we do not need the statistics of states given the statistics of transitions leaving each state. When the statistics of each transition 190 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. is given, we can use it as a weight to define the channel distortion. That is. (7.4) is replaced by )■ ( t.s ) The other steps are exactly the same as the previous algorithm. Another reason able consideration is that in the above definition of (7.S). we did not normalize the increased channel distortion for each instant. So another reasonable substitution of (7.4) is For simplicity, we call the above three algorithms Algorithm l( A l) . Algorithm 2(A2) and Algorithm - ‘i( A3) respectively. The simulation results prove that known infor mation indeed improve the performance when the SXR is high. We call the above method minimum channel distortion detection (MC’DD) method. 7.2.2 Minimum Channel Distortion D etection for VQ The previous subsection results apply to vector quantizer too. .Vote that we can consider a VQ as a I state TC'Q. The performance is almost the same as the FSVQ case. Since the source and channel are both memoryless. we can deal with the quantization and detection problem for each symbol. That is. at instant n. assume 191 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. that there are 2m possible channel symbol outputs u^ with probabilities p{u^) for i = 0.1. • • •. 2m — 1. Then the average channel distortion choosing as the decoded symbol is j= 0 So the channel symbol with minimum channel distortion is a rg m in 1=0...2m(.v+ii_ , {(/„.,}. (7.11) 7.2.3 Adaptive M odels In the above two subsections, we assume that the statistics leaving each state is known or equally likely. As we can see by the simulation results, when the statistics is correctly approximated, the SQXR really improved. So when the statistics is unknown, we can apply applicable dynamic methods to estimate the statistics. In this section, we briefly introduce several simple methods. Block Method: In this method, we initially assume that the transitions leaving each state are equally likely. We use this statistics and the minimum channel dis tortion detection method to make a decision for a block of Li, source symbols. After making decision, we update the statistics of the transitions and use the updated transition statistics to make MC'DD decision. We w ill see the simulation results later. 192 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Fixed-Lag Method: This is a typical simplified Viterbi algorithm. As is known, when the lag is about 5-7 times the constraint length of the memory o f the FSM. 
the minimum paths for each state w ill merge with high probability and the cost may be neglectable. Using this method, initially assuming that the transition probabilities are equally likely, we can update the transition statistics and using the updated statistics to realize the MC'DD decision. We will see the simulation results in the next section. 7.3 Predictive VQ and FSVQ In this section we discuss the predictive VQ (PVQ) and predictive FSVQ(PFSVQ) where the detection error w ill cost more channel distortion. PVQ and PFSVQ are applicable to Markov-Gaussian models, and as examples, we use these two structures to deal with images which may be considered roughly as Markovian. In this case, the error caused by channel detection may propagate to infinite even though generally it will decrease exponentially. The structure of PFSVQ and detection is as follows. As discussed in the previous section, assume the initial input channel symbol se quence is V = u0u i- - - u \ w ith initial state s0 and initial predictor value P.\. 193 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. FSVQ Codebook t Trellis Search Un ^ a T en Predictor P Decoder Shift Register rl" T 1 1 IT M Look-up eg Table H £ | Delay | T M I Predictor P Figure 7.1: Predictive FSVQ When the sequence is passed through a noisy channel, the decoded sequence may be L = h0U[ • • • uy. In this case, the channel distortion is defined as .v de = d(L'.L') = j> ( . J ( « „ . S n) + Pn- l ..i(un.Sn) + P n - i ) (7.12) n=0 where SQ = SQ . S'n+l = f ( u n.S n), 5Y+i = f{u n.Sn) for n = 0. I and P - 1 = P - l - P n = P { P - l . l l Q . - ■ ■ . U n ). P n = P ( P - l . U 0 . • • • , U „ ) . When a PFSY’Q is determined according to the training data or the statistics of source data as the FSVQ case, the goal of joint quantization and detection is to find the bit stream that minimizes the channel distortion. Assume that according to the 194 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. soft o utp ut, the p ro b a b ility d is trib u tio n for un is p(t/^,}) = p'n for n = 0. i = 0,1. • • • , 2m_1. Assume the 2m * ‘ v+1* paths are U0, • - •. C2 m(.v+i)_1 w ith probabilities p {i’o)- p{U^ 2m(.v+ii_,). Then the average channel distortion when choosing L\ as the decoded path (or channel symbol sequence), is 2 m (.V +I| _ j 4 = E d d ^ .C M U j). (7.13) j= o where d (i’j.L \) is defined as in (7.12). So the path with minimal channel distortion is argmin1 = 0 ... 2m(.v+i)_[ {dc, }• (7.14) When the channel symbol sequence is long, it is also impractical to find the minimal path according to (7.14). Unfortunately, the minimal path cannot be de termined by the Viterbi algorithm as the non-predictive case. However, we may use a sub-optimal Viterbi algorithm to find the path w ith low channel distortion. It is in fact a greedy algorithm. We first give the algorithm to find the path with minimal channel distortion in the greedy sense. 1 . Assume the initial state is 50 = ,s 0 with p[$o) = 1 - /»(- s x) = ••• = p(.s.\/_i) = 0. At instant 0. the minimal channel distortion for each state is f/(0,(s0) = 0 .c /( ° > ( s 0 = --- = d ( o ) k u _ i) = o c , 2. Set ( := 1 . 195 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 3. Assume tha t at instant i. the p ro b a b ility d istrib utio n for states is p ^ ( s 0). ■ ■ •. 
The present m inim al channel distortion entering state sj is d{,){s: ) with respect to channel symbol ti*'-1* for j = 0.1. • • •. M — I. We denote the determined input channel symbol sequence to be = {u*°q. • • •. }. The probability distribution of channel symbol is p( U q*). •••./>( u ^ - i )■ Assume ''I,* '1 = / ( " ! : , we define l.n (7.1o) The channel distortion correspondent to Sj at instant / + 1 for j = 0. • • • . M — I is defined as d<‘+I>(sJ) = min {< £ » (*)+ D[‘>} (7.16) 9 u fc‘ U t ‘ . S l ) = S j We also denote u[‘ J = arg m inu |f/{.‘‘r l*(sJ)} and the determined channel symbol sequence at instant (/ + 1) is updated as = {«*!*. « ^ } . The probability distribution of states s( ,‘+I) for j = 0. • • •. .U - 1 is j J p{,+l](*j) = X ! /,(,)(^t)/3 ( “ !'’ )- ( " - I” ) f ( u \ ' ] ,Sk ) = S j 196 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 4. If i < N, set i := i + 1 and go to Step 3. Otherwise, the minimum channel distortion is de = j= o ™ !. u - ( ?-l s ) Let = arg minJ=o. -..v/-i Tracing back, we can obtain the chan nel sequence ‘ ’ ‘ ui^.- which is the one with minimal average channel distortion in the greedy sense. In the above algorithm, we also assumed in default that the statistics of each state is equally likely and that the transitions leaving each state are also equally likely. It is clear that the encoder outputs at tim e n. un. conditioned on the previous state Sn are not equally probable. So the above algorithm can be improved considering the prior probabilities leaving each state. N ’ote that when the statistics of the transitions leaving each state is known, and assuming that the process is stationary, we can obtain the statistics of the states. So we do not need the statistics of states give the statistics of transitions leaving each state. When the statistics of each transition is given, we can use it as a weight to define the channel distortion. That is. (7.1o) is replaced by t.n (7.19) The other steps are exactly the same as the previous algorithm. Another reasonable consideration is that in the above definition of (7.19). we did not normalize the 197 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. increased channel distortion for each instant. So another reasonable substitution of (7.15) is D ( i ) _ + P in !1 - 11)) P ^ p j u y w u ^ ) El,n (7.20) For simplicity, we call the above three algorithms Algorithm 1P(A1P). Algorithm 2P(A2P) and Algorithm 3P(A3P) respectively. The simulation results (see next section) show that known information indeed improve the performance when the SNR is high. We call the above method minimum channel distortion detection (MC’DD) method. 7.4 Simulation Results In this section, we see the performance of the joint quantization and detection al gorithms. We assume the channel sequence is passed through a binary symmetric channel( BSC) with AWG.V The signal is modulated by BPSK or binary PAM which are equivalent in this case. Other modulation methods can be analyzed sirnilarly[62]. In the following simulation results, the SNR is defined as Et,/S'0 where Eh is the en ergy for each bit and .Vq is the noise power. W ithout loss of generality, we assume Eh = 1. When making detection with soft output, we can obtain the probability for each symbol: p(0) and p( 1). So we can obtain the channel symbol probabilities required in the algorithms introduced in the previous section. 
19S Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Let us consider the binary PAM signals where the two signal waveforms are si(0 = 9(0 an<l = ~ <7(0? an(i <7(0 IS an arbitrary pulse that is nonzero in the interval 0 < t < Tb and zero elsewhere. Let the signal energy be Eb■ Then their geometric representation is simply the one-dimensional vector = \/Eb. s2 = — \fEb- When the signal is transmitted, the received signal from the demodulator is r = sr + n = \JlLb + n where n represents the additive Gaussian noise component[62] which has zero mean and variance a* = j.V 0. Similarly, it is true when is transmitted. When the two signals are equally likely, when r is received, the probabilities for Si and s2 to be transmitted are e - ( r - V T £ ) 2/-v o ^ * ‘ 1^ " e - ( r - y i 7 ) 2 / . V 0 + e - ( r + y £ l ) 7 . V 0 (<■“ !) e— t v^*r)2 / vo = e- ( r - ^ r b)\/\a + f -(r + v /E?p/.V0 ( '- 22) As usual, the SNR is defined as Eb/-V0. 199 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. We first see an FSM with the following transition structure. •So ^ > i • S 2 • S 3 Coo = (115.114) C o i = (133.140) do = (27.28) C , = (93.51) r’o o = (28.28) coi =(194.194) c,o = (63.66) c,, = (140.134) t’o o = ( 1 S. 107) co, = (63.63) c,o = (91.91) n C T i : 7 s J t! ■ '* 3 Coo = (138. 137) co, = (29.31) c,o = (115.82) C , = (163.164) In the above table, the entry coo at the cross of row s0 and column s0 is the code vector when present state is the (row) s0 and channel index is 00. The next state is .s 0. The above codcbook is suboptimal for the famous 256 x 256 gray-scale image "Lena". Let D\i,\p denote the channel distortion when the detection is according to the MAP while Dc denote the channel distortion when the detection is accord ing to the minimum average channel distortion. The channel distortion gain is defined as 10log10( D.\/..,p/Dc) in dB. Let Damp denote the actual total distortion when the MAP detection is used and Dc denote the actual total distortion when the minimum channel distortion algorithm is applied. The total gain is defined as 10 log l0( D am p/D L -) in dB. The total gain is in fact the SQ.\R gain. The perfor mance of the minimum channel distortion algorithm w ithout the transition statistics is shown in Figure 7.2. Figure 7.3 is the performance when we apply the statistics without normalization. Figure 7.4 is the one with normalization. 200 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Without TrmaAon S a w a t u 0 Figure 7.2: Channel and Total Distortion Gains Satiates and N on-N om aM ^aton t c 3 s e Q Figure 7.3: Channel and Total Distortion Gains 201 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. c < 3 8 Figure 7.4: Channel and Total Distortion Gains Comparing Figure 7.2-1. one can see that tlie transition statistics do help us to reduce the channel distortion. Furthermore, when the weighting probabilities are normalized, the performance can be improved a little bit. One can also see that when the SXR is high, the quantizer distortion is the dominated component even though the channel distortion gain is about O.o dB. Even though the channel distortion is reduced, frankly speaking, the visual effect is not as good as expected. We now see a TCQ of constraint length '■ ) . The codebook and transition infor mation is as the following table. 
202 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. So S [ S ' 2 S 3 S 4 S 5 so s t S o I ’o = 2 6 d = 3 7 S i Co = 6 8 c i = 9 7 S2 c0 = 6 8 c i = 101 S3 c 0 = 1 3 2 c i = 1 6 4 S-l i ’o = 3 4 i ' i = 6 9 S 5 c 0 = 1 0 0 c i = 1 3 6 So c0 = 5 5 c i = 135 s? . c 0 = 1 6 4 c i = 1 9 3 This codebook is suboptimal for the 256 x 256 grayscale "Lena". In the table, the states and their correspondent shift registers are s0: 0 0 0 . 1 0 0 . * ■ > : 0 1 0 . s3: 1 1 0 . •s.i: 001. .S 5 : 101. st5 : 011. .s 7: II I. The performance of A1 is shown in Figure 7.8. Figure 7.9 and Figure 7.10 are the performance of A2 and A0 respectively, which are improved comparing to Figure 7.8. In this case, there is little gain in the high SNR case. It seems that A 1-3 can not improve the SQXR significantly. Simulation results show that our algorithms can increase the SQXR much more when the FSVQ is more complex. For example, when there arc 8 transitions leaving each state and entering -1 states(which mean there are parallel transmissions), the performance can be 3dB for low SXR. It is very clear that for 1-bit(black and white) VQ. the M AP 203 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Without TVanetnn Stettfbca 03 • 5 3 5 S 3 -0 2 -03 Figure 7.5: Channel and Total Distortion Gains t e „ 3“ 5 I 3° 0 0 Figure 7.6: Channel and Total Distortion Gains 204 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 0* -i----- — '--■ ---* --- 1 ------------ -1 0 1 2 3 4 5 6 7 6 SNR dB Figure 7.7: Channel and Total Distortion Gains S t a t e *0 * 1 « l s~ J o *'0 00 = *'■>01 = 3 5 *’o m = 4 6 l' n t I = ^ 6 *' i i o = 8 8 ' i l l = H 5 *’ 100 = 1 2 8 *•101 = 5 7 sl l' o o o = 55 *’o o i = 73 * ' o i o = 5 3 * ' o u = 1 4 5 1 n o = 154 viu = 1 6 5 * ' l o o = 183 t - m i = 118 * 0 0 0 = 5 7 f in it = 150 * 0 1 0 = 124 * o i i = 1 6 3 * 1 1 0 = 17 6 t M 1 = 18 6 *’ l o o = 1 5 2 *’ l o t = 1 5 2 <3 * 'o o o = ’ 2 * D 0 i = 8 5 * ' o i o = 1 0 3 * ' o t l = 1 ^ 8 * '1 1 0 = 1 6 5 * l l l = 1 5 4 *' l o o = 2 0 0 *T o i = 2 1 0 -*4 •’MOO = 23 •’fill 1 = 53 *'o l u = 6 9 *’o u = 5 1 " t t o = H 4 e , j , = 2 1 3 *’ l o o = 1 5 0 *’ l o t = 8 1 * 'o o o = 8 7 I ' n o t = 1 0 0 * ' o i o = 1 1 1 *■011 = M 2 *’ 1 10 = I 38 'I I I = 150 ' T o o = 174 r 1o i = 123 *6 * 'o o o = 22 *'OOl = ' 1 * ' o i o = 1 2 7 r o t t = 1 5 0 n o C S C I I II o — * T o o = 1 5 7 * ' t o t = 13 5 * 'e o o = 53 * o o 1 = 111 2 Z I I II - i s *' 11 o = 13 6 ' 1 1 1 = 2 0 2 ' T o o = 20 6 ' t o t = 21 7 detection is exactly the minimum channel distortion. As an example, we will see the following FSVQ (Table 7.4) and the visual effects for the four algorithms. The following images are the original "Lena", the quantized version without noise and the four (M AP.Al.A2.A3) recovered images after transmitted through a noise channel of SXR=L dB. The performance for different SNR. is shown in Figure 7.14-16. 205 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Q tfn tu a d Im age W ithout M am 50 100 150 200 250 100 ISO 200 250 Figure 7.8: Original Lena Figure 7.9: Quantized Lena W /O Xoise Figure 7.11: Jointed Q /D Figure 7.10: Separated Q /D Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 
Joe« OuenOolen w d Oeieden Figure 7.12: Weighted Joint Q /D Figure 7.13: Normalized Joint Q /D Wflhotf Trarwtnn J 1 & Figure 7.14: Channel and Total Distortion Gains 201 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Trwitthon S u o a tc * and N o n -M o n n a a a trm 22 I 5 Figure 7.15: Channel and Total Distortion Gains TarM en Satm<c» and NofmsUaton c 3 5 5 I Tom Qmoition Gar» Figure 7.16: Channel and Total Distortion Gains 208 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. From the above three examples, we can also see that for high SNR case, even though the channel distortion may decrease about 0.3 dB, the total distortion is mainly caused by the quantizer. Because the high complexity for the least channel distortion path, we can conclude that when the SNR is high, it is unnecessary to use the joint source and channel detection even though it indeed improve the performance. In the above algorithm, we assume that the priority probabilities are equally likely, i.e.. the state transition probabilities are equal and the out-going probabilities from each state are the same. But in practice this is not true and when the statistics about this FSM is bias, the minimum channel distortion algorithm should be mod ified and we should consider the unbalanced case. We first estimate the statistics and then modify the average minimum channel distortion algorithm. There are two methods to get the statistics. One is that the statistics are transmitted correctly as side information. Another is to obtain the statistics adaptively from the decoded information. We now see the predictive FSVQ case. This model is applicative to image pro cessing because the pixel values of an image are highly correlated. In the following simulation results, we assume the source is a Gaussian-Markovian model with cor relation coefficient a = 0.9. Since bit errors in PFSVQ will propagate for a long time, it is more important to design a better detection scheme. As shown by the following results, we can see that the joint quantization and detection can increase 209 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. the SQNR to more than 3dB for low SNR. Consider the famous “ Lena" again. We apply a PFSY'Q as the previous example except that we assume the correlation co efficient is q = 0.9. The original image and the quantized image based on PFSYQ are shown in Figure 7.17 and 7.IS respectively. We also assume that the quantized image are transmitted through a noise channel of SNR=1 dB. Figure 7.19 is the result when quantization and detection are separated. Figure 7.20 is the jointed quantization and detection while ignoring the weight (A1P). Figure 7.21 is the one when the weight are considered!A2P) and Figure 7.22 is the one when the statistics is normalized! A3P). trx) Q uantvtf im*g« Wfthovrf Now* 50 100 ISO 200 250 50 100 150 200 250 Figure 7.17: Original Lena Figure 7.18: Quantized Lena W ith out Noise The performances concerning the channel distortion gain and total distortion gain are depicted in Figure 7.23-25. 210 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Figure 7.20: Jointed Q /D Figure 7.19: Separated Q /D Figure 7.21: Weighted Jointed Q /D Figure 7.22: Normalized Joint Q /D Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 
Wthoul Trarwbon I Figure 7.23: Channel and Total Distortion Gains TranM or S atiate* and Non-Normakatfcn I 5 Figure 7.24: Channel and Total Distortion Gains 212 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. i i Figure 7.25: Channel and Total Distortion Gains 7.5 Conclusions In this chapter, we introduced the average channel distortion for each possible chan nel sequence generated bv a finite state vector quantizer. By the Viterbi algorithm, the path with minimum average channel distortion can be selected given the chan nel index probability. The maximum likelihood detection in noisy channel does not achieve the minimum channel distortion and hence does not achieve the rate- distortion even for an optimal trellis quantization codebook. We provided three simulation results to show that the new algorithm indeed achieve less distortion than the one when detection is MAP. The minimum channel distortion detection idea can generalized to memory source quantizer and non memory source quantizer. We w ill study the performance for DPSK. DTCQ later. 213 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Chapter 8 M ultiple Description Trellis-Coded Vector Quantization 8.1 Introduction A diversity-based communication system provides multiple channel to send informa tion between a transmitter and a receiver so that even if several channels fail, some data can still be delivered to the receiver. In a heterogeneous network, a typicai scenario might require data to move from a fiber link to a wireless link which ne cessitates dropping packets to accommodate the lower capacity of the latter. If the network is able to provide preferential treatment to some packets, then the use of a traditional multiresolution or layered source coding system is the obvious solution. But what if the network will not look inside packets and discriminate? Then pack ets will be dropped at random and in this case an error-resilient scheme must be 214 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. designed. The situation is similar when packets are lost due to transmission errors or traffic congestion. Diversity techniques are commonly used to increase the robustness of communica tion systems and storage devices. A source coding technique is the multiple descrip tion coding(MDC). This problem was posed by Gersho. YVitsenhausen. Wolf. Wyner. Ziv. and Ozarow at the September 1979 IEEE Information Theory Workshop[30]. For a two descriptions and three estimates case. El Gamal and C'over[30] gave an achievable rate region for a memoryless source and a single-letter fidelity criterion. Ozarow[60] constructed the rate-distortion region for a memoryless Gaussian source and the mean squared-error (MSE) distortion. The binary-symmetric memoryless source with an error frequency distortion criterion has been studied by Berger and Zhang [5. 94]. Ahlswede [I]. Witsenhausen and Wyner [So]. Wolf. Wyner. and Ziv [86]. The design of m ultiple description scalar quantizers! MDSQ) was addressed in [78, 80]. Transform based source coding algorithms have been widely applied for com pressing many types o f sources such as audio, image and video sources and this approach to MDC is also investigated [S3. 33]. Another approach named MDC via polyphase transform was investigated by Jiang and Ortega[44]. For more detailed history of MDC we refer to [30. 33. 
79], 215 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Cser Side Decoder d - Side Decoder d Channell C'liannel‘ 2 Source Central Decoder d, Encoder I Encoder2 Block Diagram of Multiple Description System As we know, there are three papers talked about the multiple description trellis coded quantization. In [79]. Vaishampayan and Battlo mentioned that they imple mented a MDTCQ of 512 states at a rate of:? bits/sample/channel and simulations indicate gains of up to 2.2 dB over multiple description quantization. In [-10] and [42], Jafarkhani and Tarokh presented a construction of MDTCQ. They used the tensor product of trellis to build a trellis which is applicable to multiple description coding. They also considered the problems of index assignment and set partitioning for the resulting trellis. As claimed. MDTC'Q provides remarkable performance with little encoding complexity. In this chapter, we restrict the distortion measure to be the mean squared error (MSE) even though the methods are applicable to any other distortion measures. 216 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Another topic closely related to MDTCQ is multistage trellis coded quantiza- tion(MSTCQ). In [2. 3], Aksu and Salehi studied concatenated MSTCQ for both scalar and vector cases. Compared with two dimensional multistage vector quanti- zation(MSVQ). MSTCQ provides up to 1.6dB performance gain in signal to quan tization noise ratio(SQNR). In [9]. Brunk and Farvardin invented embedded TCQ (E-TCQ) which achieve better performance than the concatenated MS-TC'Q with the same complexity. In [41]. Jafarkhani and Tarokh introduced the design of suc cessively refinable trellis-coded quantizers. A ll these ideas can be unified by the decomposition of central TCQ or equivalently, how to factorize the central TCQ into tensor products of side TC’Qs. This chapter is organized as follows. In Section S.2. we introduce spatial MD- rC’Q and temporal MD-TC'Q. In Section 8.3. we conceive a lower bound for multiple description when there are more than two side channels. In Section 8.4 we present simulation results and discuss the asymptotic performance for spatial MD-TC'Q. In the last section we conclude the chapter. 8.2 The M D-TCVQ Scheme A powerful source coding scheme is TC’Q. The success and advantage of TC'Q s motivated multiple description TCQ(MD-TCQ) which provides remarkable perfor mance with little encoding complexity [40. 42]. In this chapter, we further develop the techniques of MD-TCQ. In our MDC’ system, the codeword may be scalar or 217 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. vector. The codebooks for central TCQ and side TCQ's are obtained by training data similar to single description case. Assume the transmitted source is a stationary and ergodic random process .V = with marginal probability density function px{*)- -r € R. The encoder blocks the source {.Yn} into vectors of dimension A^. Each vector X Sd € R A < i is represented bv a channel codeword of length Lc. Assume the description number is V. then there are totally 2‘v description patterns. Any description pattern can be binary represented as ri ■ ■ • r.\- where r, £ {0.1} for / = I. • • •. .V. In this chapter, r, = I means that description ; is lost and r, = 0 means that description / is received correctly. Let r = r t ■ ■ ■ r \ . 
We order r in the following order: r -< r' if and only if iv(r) < w(r') or tc(r) = ic(r') but as binary numbers, r > r'. Recall that ir(r) is the weight of r as a binary string. So in this case, we can arrange the description pattern as 0. I. • • •. 2s under a mapping preserving the order of r. By the above agreement, description pattern 0 represent the central description!received all .V descriptions). • • ,2 n_1 means that only one description is lost or not available yet. 2 N — I means that all the transmitted descriptions are lost. Before we continue, we give some notations to be used in the rest of the chapter. .V: description number: .V(i: codevector dimension of the central TCQ: Xb- the state number that one state leaves for: 2 IS Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Ns: state number of the central TCQ: Np: parallel path number from one state to another in the central TCQ : L„: shift register constraint length= log2 Xs: Lp: parallel path number in bits= Iog2 .Yp; Lt>: log2 Xf, which is the bit number to indicate the state transition: Lc: channel codeword length= Lp -f Ly. /?,: coding rate corresponding to description pattern i for i = 0. 1. • • • . 2N — 1 : D,: distortion corresponding to description pattern i for / = 0. 1. • • • .2 ' — 1 . In general, the shift register consists of l\ = Ls/Lb (£f,-bit) stages. We assume that at each time. Lb bits are shifted in from the left and Lb bits in the most right are shifted out. The Ls bits in the shift register determine a central TCQ state. If there are .Yp > 1 paths transferring from state 5 to S'. then we say that the TC'Q has parallel paths. We assume that from state 5 to S', there are either 0 (disconnected) or .V p parallel paths. We denote the parallel path number in bits by Lp. For low rate TCQ. we develop spatial separation algorithm which is abbreviated as SMDTC'Q. For high rate TCQ. we develop temporal separation algorithm which is denoted as TMDTC'Q. We can also consider joint SMDTC'Q and TMDTC’Q. For SMDTC'Q. we require that Ls = KLb is a multiple of XLb (so A |/v ). For TMDTC'Q we require 219 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. that X is a divisor of Lb and Lp. Hereafter in this chapter, for description pattern i. the correspondent TCQ, has the following parameters: iVj: codevector dimension: :V6 ‘: the state number that one state leaves for: A’j: state number: Xpi parallel path number from one state to another: L‘ s: shift register constraint Iength= log2 X's: L‘ p: parallel path number in bits= log2 X'p: L‘ h: log2 ,V6 ‘. the bit number indicating state transition. L'c: channel codeword length = L’ p + L'h: Note that i = 0 corresponds to the central TCQ. The parameters for TCQ, is ( L‘ s. L‘ h. L‘ p. A'}) which can determine the structure of TCQ, except the codebook. For this TC’Q,. we denote its state set S t — {s(l.0). • • • state codebook CS { ) consists of .V6 ‘Ap codevectors of dimension A j and the super codebook C‘ = U’ i=o lC3{i]). To reduce the storage and computation complexity, we may adopt the set-partitioning rule given in [76J as did for MS-TC’Q in [4lj. 220 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 8.2.1 Spatial Separation In this scheme. :Y^ = X Xj, for any description pattern i. Assume the initial time is 0. 
Assume the shift register corresponding to state 5*"’ for the central TCQ at instant n consists of 6" • • -6^- where 6" are binary strings of length Lb (so Ls = I\ Lb)- At instant n. the parallel path bits are pn(0 if there is no parallel path). The MD-TC'Q scheme is constructed as follows: 1. At instance n = k X for some k > 0. the central TCQ has state = 6” • • • 6£-: 2. Let k = /y’/.Y. If k = I and the description pattern i has zero bit positions ^i-' ‘ ' • where w(i) means the zero bit weight. TC'Q, has state ••■6" . So its shift register constraint length is L's = ic(i)Lb. If k > 1. TCQ, has state hn ... hn An ... A'1 . . . hn ... /in °b -'+'i ■ '■ + ',.■ (.) "(i—D.v+z, °{i-i).\+twUl- •'L If ic(i)Lb bit string b is shifted in and is a component of a state corresponding to description pattern i during .Y instants, then b is the shifted in bits of TC’Q,. The parallel path bits corresponding to the shifted-in register bits are the parallel path bits of TCQ,. The parallel path bit number is L‘ p = ic(i)Lp. By the above discussion, for a TCQ when it is .Y-tuple described, there w ill be 2N TC’Qs. As can be expected, the complexity is very high. In the following, we only treat two description case. Viterbi Algorithm for Spatial Separation: Let A, > 0 for / =0.1.2 be three pre assigned constants. Assume the initial state is ^°00) = 0. Denote the minimum 221 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. cumulative distortion reaching state s™ -j to be d(s^ ^), where m represents the instant, i = 0.1.2. j represents the state index. If i = 0. s["y is a central TCQ state. Let y"- represent the codevector corresponding to state 1. d(s° 0)) = 0 for i = 0.1.2: d(s°t j)) = oc for j > 0 and i = 0.1.2. 2. Set m = 0. Assume the input to be vectors A'2m+i and .V2m+2 of dimension A’dim - Assume can transfer to -s^y, in two instants through state with parallel path indices l\ and /2. Then the distortion increment is A<jd( A 2m+[. ) + Aod( A 2m+2. i{lj. ) + A id((A2m+i. A-2m+2). v[\j) + A2(/((A 2m+I. A 2m+2). c2j) . I lie niinimutn accumulative distortion d(.s2 }) at instant 2 can be obtained by the VA algorithm [28]. 3. Continue till the end of the input data. We then choose the state with mini mum distortion and trace back to determine the survivor path. 8.2.2 Temporal Separation In this scheme. \ ‘ { = Xj for any description pattern i. Assume the initial time is 0. Assume that for the central TCQ. the shift register corresponding to state 5*'1 ' at instant n consists of 6" • • • where 6" are binary strings of length /A. At instant Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. rc, the parallel path bits are pn(0 if there is no parallel path). In this scheme, we require that N is a divisor of both L b and L p . The MD-TC'Q scheme is constructed as follows: 1. At instance n. the central TCQ has state S(n) = 6? • • - where />".•••. are further equally divided into .V parts(so each part has bit length L b / S ' ) and parallel path bits p" ■ ■ ■ where pj1 . • • •. p\- are binary strings of length L p / X . 2. If the description pattern i has description pattern r t •• • ry. for zero bit posi tions. the state of TCQ, is attained by crossing the parts in 6 jJ at the location that r[ • • • r.v has bit 1 for k = 1. • • •. A": So its shift register constraint length is Ll s = w{i)Lh: 3. 
If Lb bits string b is shifted in and is a component of state corresponding to pattern i at instant n, then the bit string obtained by crossing the bits corresponding to the location of 1 in r t • • • r.v is the shifted in bits of TCQ,. -1 . The parallel path bits at instant n for TCQ, at instant n with description pattern rt • • • r.v can be obtained by crossing the bit strings in pn corresponding to the location of 1 in rt • • • r y. Say. if rt = I. then p" w ill be crossed out. By the above discussion, for a central TCQ when it is .Y-tuple described, there w ill be 2'1 TC'Qs. This is a generalization of tensor product of TCQ. Also, we have some other methods to generalize the tensor product of TCQ. Viterbi Algorithm for Temporal Separation 223 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Let A, > 0 for i = 0, l.---.2‘v — 1 be constants. Assume the initial state is •*(0, 0 ) = Denote the minimum accumulative distortion reaching state be where m represents the instant, i = 0. L.2. •••.2A - 1. j represents the state index. If i = 0. is a central TCQ state. Let c" - represent the codevector corresponding to state .sp.j). 1. d(s°i'o)) = 0 f°r i — 0.1.2.---.2'N — 1: = oc for j > 0 and i = 0.1.2.---.2v - I. 2. Set m = 0. Assume the input to be vectors .Vm+i of dimension .V*. Assume j) can transfer to in one instant with parallel path index /[. Then the distortion increment is 2V-1 Aocf(Am+i . C qj) + ^ A,d(Am+i . 1=1 Then the minimum accumulative distortion d(.^t at instant 1 can be ob tained by the YA algorithm. 3. Continue till the end of the input data. We then choose the state with mini mum distortion and trace back to determine the survivor path. R e m a rk 8.1 Many practical \'A-based algorithms arc applicable to the TMDTC'Q scheme. ,\ote also that for description 2N — I = 1 1 • • • 1. we actually has no descrip tion at all. 1 1 e may replace the distortion caused by this TCQ by the variance of the source or simply we only consider the descriptions 0. I. • • • . 2 X — 2. 224 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Table 8.1: M STC Q perform ance. lb /S a m p le . d im = 4 . 8 Dimension Ri SQNR, VQ, D(Ri) SQNR^ S 1/2 1/2 2.637 2.215 3.01 5.195 4 1/2 1/2 2.415 1.893 3.01 4.758 2 1/2 1/2 2.033* 1.691 3.01 4.792* 2 1/2 1/2 2.101f 1.691 3.01 4.63 D 8.2.3 MSTCQ The above discussions can be easily generalized to multistage trellis coded quantiza tion (MSTCQ). In [27]. Fleming and EfFros discussed generalized multiple-description vector quantization (GMDVQ) and they argued that MSVQ is a special case of GMDVQ. Regardless of the complexity, we can say MSVQ is a generalization of MDVQ. However, the main purpose of MSVQ is reducing the complexity of VQ when the dimension is large [31]. A more general case of MSVQ is tree structured vector quantization (TSVQ). For TMDTC'Q. when we only consider the description patterns 011 • • • 1. 001 • • • 1. • • •. 0 • • • 0. we can get jointed MSTC’Q which will have better performance than the separated MSTCQ. For example, in [3]. for Gaussian source ,V(0. 1). when the codevector dimension is - 1 and S. and the total coding bit rate is 1 b/sample. for a two stage, four state trellis, after optimization, the performance is shown in Table S.2.3 In the above table, we can see that for the 2 dimensional case, the jointed per formance is almost the same as for - 1 dimensional case. Note that the results marked by * correspond to A0 = 1. A, = 0.1. 
A2 = 0 and the results marked by f correspond 225 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. to A0 = 1. At = 1, A2 = 0. For detailed jointed MSTCQ. we refer to [101]. Jointed MSTCQ has close relation w ith embedded TCQ(E-TC’Q) [9] invented by Brunk and Farvardin and successively refinable trellis-coded quantizers provided by .Jafarkhani and Tarokh[41]. 8.3 A Lower Bound for M ultiple Description In [30]. El Gamal and Cover discussed the achievable rates for m ultiple descriptions. Partially applying the results, for memoryless Gaussian source. Ozarow [60] exactly characterized the actual set of achievable points for a two channel and three receiver system. Following the proof o f the reverse part of Ozarow's theorem, we can further obtain the lower bound for more than two side channel cases. A tight lower bound is still not available. Let .V" be a block of n letters. We assume the existance of .V channels(.Y descriptions) and there is a discrete encoder /, for channel i. The cardinality of /,(.Y N ) is limited by - lo g 2 H/,(-Vn)i| < /?, for / = 1..-...Y. n 226 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. The estimate .V, is a function of /,(.V) for i = 1. • • •. .V. Using the data-processing theorem, we get /(.Y..Y,) < /(.X .M X )) < n R ,. Tlie following lemma is used in attaining the lower hound for m ultiple description Lemma 8.1 For X > 2. m \ \ . — . x x ) < f £ i n x . x ) - X J - - £ m x ) i <«><-v and the "= holds if and only if X\. • • •. X \ air independent. Proof: See Appendix C. We now set about to conceive the lower hound for memoryless Gaussian sources. Let A' denote the estimate when all the .Y descriptions are received. Then /(A'..Y) < /(A \/,(A") • • • /.v(.Y)) < H(fi(X). • • •. f\(X )) I < 1 < J < V 1=1 1=1 I< 1 < J < .V 227 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. < " E f t , - 4 E /( /( . v. ) . / ( a-)) i=l * l < s < j < A r < E '(•<'<••*;)• (S .I) i= l • 1< I<J<.V In the following, let Dt denotes the distortion corresponding to description i w ith rate Ri for / = 1 . • • -. X and Do denote the central distortion. Let n,.j = ( I — D,-)( I — D} ) and A. j = DtD} — 2-2*fi,+fid. By the converse to the source-coding theorem [6]. we have Do > > 2_2^'=i r,2 "* £-'<'<j<v (S.2) According to Ozarow's results [60], j . ----------------------------- \ ----------------------------- l - f / i E J - v / I U ) 2 Hence we have the following result P ro p o s itio n 8.1 For an X description system, the achievable lower bound is de picted as follows: Dt > 2~2R' for i = [.■■■. X: 228 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Let 2 < M < V. For any M descriptions, the destortion D0 from these M descrip tions satisfies Do > 2~2iRi+"'+Ru) ( JJ 1 l < ,< j< \( 1 - ( \f f t i\ j ~ y / X J 3 When all the description have the same rate /?=/?,••■ = Ry = /?0/.V. and all their distortions are considered to be the same and is denoted as D. for 2 < M < A . C o ro llary 8.1 Let 2 < M < A . For any M joint estimate. ^ “ " ’ i , where H = ( I - D)2. A = Dz - 2~aR. Following Ozarow's analysis, we present two simple examples to clarify the be havior of the region specified in Proposition 8.1 (or Corollary 8.1). 
In the first example, as in Corollary 8.1, R = /?!•••= R \ = Rq/S and assume that the distor tion obtained over each side channel is essentially on the appropriate rate-distortion curve, i.e.. D i = • • • = D,\ = 2~2R. Then A = 0 and by Corollary 8.1 we have Do~ D [2D - D2) (2 - D) v_1 ' 229 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. This is to say that achievable distortion over the joint channel is no better than 2V_1 distortion on the side channels. This is far worse than the value D0 ^ D N according to the rate-distortion theorem. A t the opposite extreme, assume that Dq = 2~2 (fll+fi2+'"+/?v). That is to say that the encoder is designed to provides as good a performance as possible for the jo in t estimates. From Proposition 8.1. we must have n,.j ~ which implies D, + D} % 1 + 2~2{R'+R>) for 1 < / < j < X . In this case, the performance is the same as the two description case. 8.4 Simulation Results In [42]. Jafarkhani and Tarokh have shown that MDTCQ can achieve much better performance than the traditional MDSQ. In this section, we w ill also shown the following phenomena: 1. For MDVQ. for fixed coding rate, the larger the dimension of the codevectors is. the smaller the distortion for each description pattern is. 220 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 2. For MDTCVQ, the longer the shift register constraint length is. the smaller the distortion is for each description pattern. We also apply a four description TM D-TCQ to encode the famous “ Lena" so that we can have a visual feeling about MD-TC’Q even when more than 75% the transmitted data are lost. For better performance of multiple image coding, we believe that transform coding must be applied first. 8.4.1 SMDTCQ We first see the performance of the SMDTCQ. When the stage bit number is larger than 1. we may apply TMDTC'Q which has less complexity and better performance. SMDTCQ has more complexity in that for side TC’Qs, each state can transfer to 2 s states and when VA is applied, we should do more multiplications and comparisons than TMDTC'Q. For example, for a central TCQ with parameter (4. 2. 0. 1). we may introduce SMDTCQ or TMDTC'Q to achieve diversity. Assume the initial codebooks are generated randomly. After trained by Gaussian source of mean 0 and variance 1 . we encode a newly generated Gaussian source of length 40000. The distortion curves (do. ) for the two methods are shown in the following figure.It seems that when the weights for side distortions are small. SMDTC'Q outperforms TMDTC'Q while the weights for central distortion is small, TMDTC'Q outperforms SMDTCQ. 231 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 1 0 OPTA O SMDTCQ * TMDTCQ c o c o « 5 Q 10" g c < u o O x 10 '* 10‘ Mean of Side Distortion Figure S.l: Simulation Results for (4. 2. 0. 1) 1 0 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. There is an important property for SMDTCQ and Gaussian sources. For low bit rate cases, when the central TCQ approaches the rate-distortion for single descrip tion, the side TCQs can also approach the rate-distortion given by Ozarow [60]. O zarow ’s T h e o re m : Let .Y[. A V • • - be a sequence of i.i.d. unit variance Gaussian random variables. 
The achievable set of rates and mean-squared error distortions is the union of points satisfying Di > 2~2R' (S.-l) D2 > 2~2R 2 (S.o) D0 > 2 -2[R'+[i2)----------------- = - (8.6) l - ( > / n - \ / A ) 2 where fl = ( l -£>,)(! -£>>) (8.7) and A = DiD-z - 2~2{Rl+R2). (8.8) By the above theorem, when the central TCQ achieves the rate-distortion 2 we must have IT = A which induces Di + D-i = I + 2~2(Rl+R2). 233 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. For VQ. a special one state TCVQ. assume the codebook to be {it. • • •. xn }. When the side VQ, has codebook {(iq. 0). • • •. (xn, 0)} and side VQ2 lias codebook {(O .iq). • • •. (0 .i„)} where 0 is a zero vector having the same dimension as x, for i = 1 . •••.n. the total distortion w ill be exactly Di + D? = 1 + ‘> -2(Rl+fi2'. Note that when the codevector dimension goes to infinity, the random coding method proved that for fixed rate, the distortion w ill approach to the theoretical value D(R). Hence we can conclude that spatial m ultiple description can approach the theoretical performance of MDYQ. It is also known that for fixed codevector dimension, when the shift register constraint length goes to infinity, for fixed coding rate. TC'VQ can also approach the rate-distortion. We now argue that SMDTC'Q can also achieve the OPTA in the above case. Assume that the central TCVQ has ,YS states. We require that all states corresponding to the same state of side TC'Q, have the same state codebook. Cnder such restriction, we believe that when the SRC'L goes to infinity, the central TC’Q will still approach to the optimal performance theoretically attainable(OPTA). In this case, similar discussion as for MDVQ case, the total side distortion will be exactly D\ + D> = 1 + The simulation results are shown in the following figure. 23-1 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 1 0 OPTA O L ,=4 + Ls=6 • L =8 10"' .0 t 1 0 1 0 Mean of Side Distortion Figure iS .2: Simulation Results for R = I. Ri = R2 = 1/2 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 8.4.2 TM DTCQ We now see the performance of TM DTCQ. V V e first see a two description case with R = Ro = 2 and Ri = R2 = 1. Figure X illustrates the rate-distortion performance using MDSQ. MDVQ and M DTCQ. In Figure X. the source is zero-mean, unit- variance memoryless Gaussian. For MDVQ. the codevector dimension is 2. so the central VQ has codebook size 16. For TMDTCQ. we consider a trellis with L, = 2. Lb = 2 and .V.{ = 1. So the central TC'Q has four states and each state can transfer to any state directly. — OPTA 0 MDSQ X MDVQ + MDTCQ Q 10' Mean of Side Distortion Figure 8.3: Simulation Results for R = 2. R\ = R2 = 1 236 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. From this figure, we can see that MDSQ has the worst performance while MDVQ and MDTCQ have almost the same performance. Let us analyze the complexity of MDVQ and MDTCQ shown in this figure. We assume that At = A2. In this case. MDVQ needs 50 multiplications and 16 comparisons. On the other hand. MD- TC'Q needs 26 multiplications and 4 comparisons. O f course. MD-TCQ needs more memory or the storage complexity is higher for MD-TCQ than for MDVQ. The advantage of TC'Q over VQ in redundancy, complexity and delay tradeoff can be found in [9S]. We use MD-TCQ to encode the famous "Lena" of size 256 x 256. 
8.4.2 TMDTCQ

We now examine the performance of TMDTCQ, first for a two-description case with $R = R_0 = 2$ and $R_1 = R_2 = 1$. Figure 8.3 illustrates the rate-distortion performance using MDSQ, MDVQ, and MDTCQ. In Figure 8.3, the source is zero-mean, unit-variance memoryless Gaussian. For MDVQ, the codevector dimension is 2, so the central VQ has codebook size 16. For TMDTCQ, we consider a trellis with $L_s = 2$ and $L_b = 2$, so the central TCQ has four states and each state can transfer directly to any state.

[Figure 8.3: Simulation results for R = 2, R_1 = R_2 = 1; central distortion versus mean side distortion for OPTA, MDSQ, MDVQ, and MDTCQ.]

From this figure we can see that MDSQ has the worst performance, while MDVQ and MDTCQ have almost the same performance. Let us analyze the complexity of the MDVQ and MDTCQ shown in this figure, assuming $\lambda_1 = \lambda_2$. In this case, MDVQ needs 50 multiplications and 16 comparisons; MD-TCQ, on the other hand, needs 26 multiplications and 4 comparisons. Of course, MD-TCQ needs more memory, i.e., its storage complexity is higher than that of MDVQ. The advantage of TCQ over VQ in the redundancy, complexity, and delay tradeoff can be found in [98].

We use MD-TCQ to encode the famous "Lena" image of size 256 x 256. The coding rate is 1 bit/pixel and the codevector dimension is 4. We design a central TCQ with 16 states so that each state can transfer directly to any state. The coding information is transmitted in 4 packages, and we first assume that at most one package may be lost, with the four loss events equally likely. So when one package is lost, the actual coding rate is 3/4. By the TMD-TCQ approach, when we assume that all 5 TCQs have the same weight, the central distortion is MSE = 185.808, corresponding to PSNR = 25.474 dB. When the first package is lost, MSE = 236.641 with PSNR = 24.424 dB; when the second package is lost, MSE = 231.933 with PSNR = 24.511 dB; when the third package is lost, MSE = 229.2 with PSNR = 24.563 dB; and when the last package is lost, MSE = 242.047 with PSNR = 24.326 dB. The original "Lena" and the five quantized images are shown in the following figures.

[Figure 8.4: Original Lena. Figure 8.5: Central quantizer. Figures 8.6-8.9: Package 1, 2, 3, or 4 lost.]

A more general case is that any package, and any number of packages, may be lost. Assuming that all loss situations have the same probability, we can take all the TCQs to have the same importance, i.e., $\lambda_0 = \cdots = \lambda_{14} = 1$. In this case, by optimization, the distortions for the different package loss patterns are as follows.

Table 8.2: Losing patterns, weights, MSE, and PSNR.

Pattern  Bit Rate  λ  MSE      PSNR (dB)  MSE†   SQNR† (dB)  D(R)† (dB)
0000     1         1  233.387  24.484     0.351  4.547       6.021
1000     3/4       1  314.651  23.187     0.486  3.133       4.515
0100     3/4       1  309.332  23.261     0.483  3.164       4.515
0010     3/4       1  336.442  22.896     0.480  3.183       4.515
0001     3/4       1  328.727  22.996     0.479  3.197       4.515
1100     1/2       1  576.404  20.558     0.636  1.963       3.010
1010     1/2       1  464.072  21.499     0.637  1.958       3.010
1001     1/2       1  447.116  21.661     0.634  1.979       3.010
0110     1/2       1  483.632  21.320     0.636  1.963       3.010
0101     1/2       1  436.059  21.769     0.638  1.950       3.010
0011     1/2       1  596.066  20.412     0.638  1.955       3.010
1110     1/4       1  996.141  18.182     0.809  0.919       1.505
1101     1/4       1  781.800  19.234     0.810  0.914       1.505
1011     1/4       1  951.579  18.380     0.810  0.916       1.505
0111     1/4       1  976.450  18.268     0.807  0.930       1.505

In the table, the numbers in the MSE† column are the MSEs for a memoryless Gaussian source of mean 0 and variance 1, and the numbers in the SQNR† column are the corresponding SQNRs of the Gaussian source. The corresponding images are shown in the following figures.

[Figures 8.10-8.24: Decoded "Lena" for the fifteen loss patterns 0000, 1000, 0100, 0010, 0001, 1100, 1010, 1001, 0110, 0101, 0011, 1110, 1101, 1011, 0111.]
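For reference, the PSNR values quoted above follow from the MSE by the standard conversion for 8-bit images (peak value 255). A minimal helper (a sketch, not the dissertation's code):

```python
import math

def psnr(mse, peak=255.0):
    """PSNR in dB for a given mean squared error and peak pixel value."""
    return 10.0 * math.log10(peak * peak / mse)

print(f"{psnr(185.808):.2f} dB")   # about 25.44 dB for the central quantizer
```

This reproduces the reported figures to within small rounding differences in the quoted MSE values.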
The above algorithm considers all package loss cases, which is in fact unnecessary in practice. For example, we may only care about the distortion when a single package is received; if more than one package is received, the distortion will only be reduced, since we receive more information. The simulation results for this case are shown in the following table and figures.

Table 8.3: Losing patterns, weights, MSE, and PSNR.

Pattern  Bit Rate  λ  MSE      PSNR (dB)  MSE†   SQNR† (dB)  D(R)† (dB)
0000     1         5  206.353  25.019     0.306  5.142       6.021
1000     3/4       0  374.164  22.434     0.560  2.518       4.515
0100     3/4       0  372.292  22.456     0.545  2.639       4.515
0010     3/4       0  331.877  22.955     0.544  2.640       4.515
0001     3/4       0  422.967  21.902     0.550  2.597       4.515
1100     1/2       0  493.603  21.231     0.709  1.492       3.010
1010     1/2       0  618.032  20.255     0.710  1.485       3.010
1001     1/2       0  595.953  20.413     0.713  1.470       3.010
0110     1/2       0  478.773  21.364     0.700  1.550       3.010
0101     1/2       0  675.368  19.869     0.698  1.559       3.010
0011     1/2       0  550.064  20.761     0.703  1.530       3.010
1110     1/4       1  813.632  19.061     0.850  0.704       1.505
1101     1/4       1  926.476  18.496     0.851  0.703       1.505
1011     1/4       1  999.773  18.166     0.853  0.691       1.505
0111     1/4       1  970.774  18.294     0.843  0.739       1.505

[Figures 8.25-8.39: Decoded "Lena" for the fifteen loss patterns 0000, 1000, 0100, 0010, 0001, 1100, 1010, 1001, 0110, 0101, 0011, 1110, 1101, 1011, 0111 under the weights of Table 8.3.]
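The λ column drives the design: as the text describes it, the encoder is optimized for a weighted sum of the per-pattern distortions. The sketch below assembles such an objective from the weights of Table 8.3 (the helper name is hypothetical, and the exact objective and optimizer used in the dissertation are not reproduced here); the per-pattern distortions are taken from the Gaussian MSE† column as an example:

```python
from itertools import product

# All 15 loss patterns for 4 packages (bit '1' = package lost; '1111' excluded).
patterns = [''.join(b) for b in product('01', repeat=4) if ''.join(b) != '1111']

# Weights matching Table 8.3: central pattern 5, one-package-received patterns 1.
weight = {p: 5 if p == '0000' else (1 if p.count('1') == 3 else 0) for p in patterns}

def weighted_objective(distortion):
    """Design objective: lambda-weighted sum of per-pattern distortions."""
    return sum(weight[p] * distortion[p] for p in patterns)

# Example with the Gaussian per-sample MSE values from Table 8.3 (MSE† column):
mse = {'0000': 0.306, '1000': 0.560, '0100': 0.545, '0010': 0.544, '0001': 0.550,
       '1100': 0.709, '1010': 0.710, '1001': 0.713, '0110': 0.700, '0101': 0.698,
       '0011': 0.703, '1110': 0.850, '1101': 0.851, '1011': 0.853, '0111': 0.843}
print(f"objective = {weighted_objective(mse):.3f}")
```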
8.4.3 Asymptotic Performance

It is known that when the constraint length of a TCQ goes to infinity, the rate-distortion curve of single description TCQ approaches the theoretical curve. There are as yet no theoretical results guaranteeing that the same holds in the MDTCQ case. However, the simulation results suggest that it may be true: as the constraint length of the MDTCQ approaches infinity, the distortion curve approaches the curve obtained from Ozarow's result [60]. In the following figure, $R = 2$, $R_1 = R_2 = 1$, and the codevector dimension is 1, i.e., the scalar quantization case. Note that when $L_s = 0$ it is MDSQ. When $L_s = 2, 4, 6, 8$ with $L_p = 0$ and $L_b = 2$, the central distortion and the average of the two side distortions are as follows.

[Figure 8.40: Simulation results for R = 2, R_1 = R_2 = 1, with L_s = 2, 4, 6, 8 compared against OPTA.]

We now consider a more general case in which the description number is three. In the following table, the weight vectors are all chosen as (1, 1, 1, 1, 1, 1, 1).

Table 8.4: MSE and SQNR for 1, 8, and 64 states (the three values in each cell correspond to 1, 8, and 64 states).

            Codevector Dimension = 1                    Codevector Dimension = 2
Pattern     MSE (1/8/64)        SQNR in dB (1/8/64)     MSE (1/8/64)        SQNR in dB (1/8/64)
000         0.199/0.103/0.089   7.015/9.882/10.487      0.277/0.251/0.231   5.583/5.995/6.366
100         0.208/0.197/0.174   6.829/7.060/7.587       0.397/0.390/0.364   4.008/4.085/4.385
010         0.360/0.200/0.168   4.437/7.000/7.736       0.405/0.404/0.368   3.927/3.936/4.339
001         0.202/0.185/0.171   6.950/7.330/7.664       0.403/0.409/0.373   3.950/3.878/4.280
110         0.385/0.388/0.348   4.145/4.116/4.589       0.677/0.630/0.583   1.693/2.009/2.341
101         0.419/0.411/0.346   3.781/3.850/4.609       0.679/0.628/0.584   1.682/2.018/2.316
011         0.407/0.397/0.339   3.903/4.017/4.592       0.678/0.631/0.602   1.685/2.000/2.203

From the above table we can see that as the constraint lengths get larger and larger, the distortions for all description patterns get smaller and smaller. It is reasonable to conjecture that when the constraint lengths approach infinity, TMDTCQ will approach the theoretical performance of multiple descriptions (this will be part of our future research).

8.5 Discussions and Conclusions

In this chapter, we discussed two types of MDTCQ built upon the structure of the original TCQ. Spatial MDTCQ applies to the case of a few descriptions at low rates. Temporal MDTCQ has low complexity and is applicable to high rate cases. The idea of dividing a TCQ can be applied to MSTCQ scenarios and may be studied further. Simulation results also show that when the constraint length goes to infinity, it is possible that the performance of MDTCQ approaches the theoretical curve.

Chapter 9
Discussions and Future Work

Future work over approximately the next year will address the following topics:

1. The redundancy of TCQ with general structure;
2. Rigorous analysis of complementary Huffman codes;
3. Space-time code design for the channel-known and channel-unknown cases;
4. The application of Markov chains in communication systems over the Rayleigh fading channel.

In Chapter 2, we discussed the redundancy when the TCQ is fully connected. This is a special structure. If we assume that there are $N_s$ states and each state can transfer to $N_t$ states, what is the relationship between the rate redundancy and $N_s$, $N_t$?

In Section 4.3, we invented a method that significantly improves the synchronization probability of an optimal prefix code in the Hamming distance sense. There are two aspects deserving further research. Firstly, if a codebook and its statistics are given, what are the exact and asymptotic synchronization probabilities as the complementary period $n$ goes to infinity? Secondly, is there an efficient method of designing good codes when synchronization in the Hamming distance sense is strictly required?

In wireless communication systems, multipath fading usually has a severe impact on performance. An effective technique to mitigate channel fading is to increase diversity through the use of multiple antennas at the transmitter and/or receiver. In recent years, many space-time coding and modulation schemes have been suggested [39, 73], for which the two fundamental performance parameters are the diversity gain and the coding gain. Some space-time convolutional codes are given in [73], and orthogonal-design space-time block codes are given in [77]. In [97], we presented some preliminary results on space-time convolutional codes. For the channel-unknown case, where differential encoding is desired, some further criteria are under investigation.
In some papers, semi-analytical approaches have been proposed to evaluate the performance of TCP/IP-based applications over wireless channels. The analyses are based on the channel correlation of the Rayleigh fading channel. The signal amplitudes are quantized so that each quantization level corresponds to a state in a finite Markov chain. In these papers, the authors simply fix the quantization levels, with no further study of the performance of different quantization levels. We believe that there should be optimal or suboptimal design methods for the quantizer such that the analytical results approach the simulation results more closely.

Reference List

[1] R. Ahlswede. The rate-distortion region for multiple descriptions. IEEE Transactions on Information Theory, 31:721-726, November 1985.
[2] A. Aksu and M. Salehi. Multistage trellis coded quantization (MS-TCQ) design and performance. IEE Proceedings, Communications, 144:2:61-64, April 1997.
[3] A. Aksu and M. Salehi. Design, performance, and complexity analysis of residual trellis-coded vector quantizers. IEEE Transactions on Communications, 46:8:1020-1026, August 1998.
[4] Ender Ayanoglu and Robert M. Gray. The design of joint source and channel trellis waveform coders. IEEE Transactions on Information Theory, 33:6:855-865, November 1987.
[5] T. Berger and Z. Zhang. Minimum breakdown degradation in binary multiple descriptions. IEEE Transactions on Information Theory, 29:807-814, November 1983.
[6] Toby Berger. Rate Distortion Theory: A Mathematical Basis for Data Compression. Prentice-Hall, Englewood Cliffs, New Jersey, 1971.
[7] L. S. Bobrow and S. L. Hakimi. Graph theoretic prefix codes and their synchronizing properties. Information and Control, 15:1:70-94, July 1969.
[8] Charles G. Boncelet, Jr. Block arithmetic coding for source compression. IEEE Transactions on Information Theory, 39:1546-1554, September 1993.
[9] H. Brunk and N. Farvardin. Embedded trellis coded quantization. In Proceedings of the Data Compression Conference (DCC '98), pages 93-102, 1998.
[10] R. M. Capocelli, A. De Santis, L. Gargano, and U. Vaccaro. On the construction of statistically synchronizable codes. IEEE Transactions on Information Theory, 38:2:407-414, March 1992.
[11] Philip A. Chou, Michelle Effros, and Robert M. Gray. A vector quantization approach to universal noiseless coding and quantization. IEEE Transactions on Information Theory, 42:4:1109-1138, July 1996.
[12] John B. Conway. A Course in Functional Analysis. Springer-Verlag, New York, 1990.
[13] Thomas M. Cover and Joy A. Thomas. Elements of Information Theory. John Wiley & Sons, New York, 1991.
[14] T. R. Crimmins and H. Horwitz. Mean-square-error optimum coset leaders for group codes. IEEE Transactions on Information Theory, 16:429-432, July 1970.
[15] T. R. Crimmins, H. Horwitz, C. J. Palermo, and R. V. Palermo. Minimization of mean-square error for data transmission via group codes. IEEE Transactions on Information Theory, 15:72-78, January 1969.
[16] I. Csiszar and J. Korner. Information Theory: Coding Theorems for Discrete Memoryless Systems. Academic Press, New York, 1981.
[17] Tolga M. Duman and Masoud Salehi. Optimal quantization for finite-state channels. IEEE Transactions on Information Theory, 43:2:758-765, March 1997.
[18] J. G. Dunham and R. M. Gray.
Joint source and noisy channel trellis encoding. IEEE Transactions on Information Theory, 27:516-519, July 1981.
[19] J. G. Dunham and R. M. Gray. An algorithm for the design of labeled-transition finite-state vector quantizers. IEEE Transactions on Communications, 33:83-89, January 1985.
[20] A. E. Escott and S. Perkins. Binary Huffman equivalent codes with short synchronizing codewords. IEEE Transactions on Information Theory, 44:1:346-351, January 1998.
[21] N. Farvardin and V. Vaishampayan. Optimal quantizer design for noisy channels: An approach to combined source-channel coding. IEEE Transactions on Information Theory, 33:827-838, November 1987.
[22] Nariman Farvardin. A study of vector quantization for noisy channels. IEEE Transactions on Information Theory, 36:4:799-809, July 1990.
[23] Nariman Farvardin and V. Vaishampayan. On the performance and complexity of channel-optimized vector quantizers. IEEE Transactions on Information Theory, 37:155-160, January 1991.
[24] T. J. Ferguson and J. H. Rabinowitz. Self-synchronizing Huffman codes. IEEE Transactions on Information Theory, 30:4:687-693, July 1984.
[25] Thomas R. Fischer and Michael W. Marcellin. Joint trellis coded quantization/modulation. IEEE Transactions on Communications, 39:172-176, February 1991.
[26] Thomas R. Fischer, Michael W. Marcellin, and Min Wang. Trellis-coded vector quantization. IEEE Transactions on Information Theory, 37:6:1551-1566, November 1991.
[27] M. Fleming and M. Effros. Generalized multiple-description vector quantization. In Proceedings of the Data Compression Conference, Snowbird, Utah, March 1999.
[28] G. D. Forney, Jr. The Viterbi algorithm (invited paper). Proceedings of the IEEE, 61:268-278, 1973.
[29] R. G. Gallager. A simple derivation of the coding theorem and some applications. IEEE Transactions on Information Theory, 11:3-18, 1965.
[30] A. El Gamal and T. M. Cover. Achievable rates for multiple descriptions. IEEE Transactions on Information Theory, 28:6:851-857, November 1982.
[31] A. Gersho and R. Gray. Vector Quantization and Signal Compression. Kluwer, 1992.
[32] E. N. Gilbert and E. F. Moore. Variable-length binary encodings. Bell Syst. Tech. J., 38:933-967, 1959.
[33] Vivek K. Goyal. Beyond Traditional Transform Coding. Ph.D. dissertation, University of California, Berkeley, 1999.
[34] R. M. Gray. Sliding-block source coding. IEEE Transactions on Information Theory, 21:4:357-368, 1975.
[35] R. M. Gray. Time-invariant trellis encoding of ergodic discrete-time sources with a fidelity criterion. IEEE Transactions on Information Theory, 23:71-83, 1977.
[36] R. M. Gray. Entropy and Information Theory. Springer-Verlag, New York, 1990.
[37] R. M. Gray. Source Coding Theory. Kluwer Academic Press, Boston, 1990.
[38] R. Hagen and P. Hedelin. Robust vector quantization by a linear mapping of a block code. IEEE Transactions on Information Theory, 45:1:200-218, January 1999.
[39] J.-C. Guey, M. P. Fitz, M. R. Bell, and W.-Y. Kuo. Signal design for transmitter diversity wireless communication systems over Rayleigh fading channels. In Proceedings of the IEEE Vehicular Technology Conference, Atlanta, GA, pages 136-140, 1996.
[40] Hamid Jafarkhani and Vahid Tarokh. Multiple description trellis-coded quantization. In Proceedings of the 1998 International Conference on Image Processing (ICIP 98), volume 1, pages 669-673,
1998.
[41] Hamid Jafarkhani and Vahid Tarokh. Design of successively refinable trellis-coded quantizers. IEEE Transactions on Information Theory, 45:5:1490-1497, July 1999.
[42] Hamid Jafarkhani and Vahid Tarokh. Multiple description trellis-coded quantization. IEEE Transactions on Communications, 47:6:799-803, June 1999.
[43] F. Jelinek. Tree encoding of memoryless time-discrete sources with a fidelity criterion. IEEE Transactions on Information Theory, 15:584-590, September 1969.
[44] Weiqing Jiang and Antonio Ortega. Multiple description coding via polyphase transform and selective quantization. Preprint, 1999.
[45] Petter Knagenhjelm and Erik Agrell. The Hadamard transform - a tool for index assignment. IEEE Transactions on Information Theory, 42:4:1139-1151, July 1996.
[46] Vidyadhar G. Kulkarni. Modeling and Analysis of Stochastic Systems. Chapman & Hall, 1995.
[47] A. J. Kurtenbach and Paul A. Wintz. Quantizing for noisy channels. IEEE Transactions on Communication Technology, 17:2:291-302, April 1969.
[48] Wai-Man Lam and Sanjeev R. Kulkarni. Extended synchronizing codewords for binary prefix codes. IEEE Transactions on Information Theory, 42:3:984-987, May 1996.
[49] Debra A. Lelewer and Daniel S. Hirschberg. Data compression. ACM Computing Surveys, 19:3:261-296, September 1987.
[50] T. Linder, G. Lugosi, and K. Zeger. Rates of convergence in the source coding theorem, in empirical quantizer design, and in universal lossy source coding. IEEE Transactions on Information Theory, 40:6:1728-1740, November 1994.
[51] Xueting Liu, Christos Komninakis, and Richard D. Wesel. Cross constellation with superior edge profiles. Preprint, 1998.
[52] M. W. Marcellin and T. R. Fischer. Trellis-coded quantization of memoryless and Gauss-Markov sources. IEEE Transactions on Communications, 38:82-93, January 1990.
[53] J. C. Maxted and J. P. Robinson. Error recovery for variable length codes. IEEE Transactions on Information Theory, 31:6:794-802, November 1985.
[54] Andras Mehes and Kenneth Zeger. Binary lattice vector quantization with linear block codes and affine index assignments. IEEE Transactions on Information Theory, 44:1:79-94, January 1998.
[55] D. Miller and K. Rose. Combined source-channel vector quantization using deterministic annealing. IEEE Transactions on Communications, 42:347-356, February/March/April 1994.
[56] M. E. Monaco and J. M. Lawler. Corrections and additions to "Error recovery for variable length codes". IEEE Transactions on Information Theory, 33:454-456, May 1987.
[57] B. L. Montgomery and J. Abrahams. Synchronization of binary source codes. IEEE Transactions on Information Theory, 32:6:849-854, November 1986.
[58] Peter G. Neumann. Efficient error-limiting variable-length codes. IRE Transactions on Information Theory, 8:4:292-304, July 1962.
[59] Jim K. Omura. A coding theorem for discrete-time sources. IEEE Transactions on Information Theory, 19:4:490-498, July 1973.
[60] L. Ozarow. On a source coding problem with two channels and three receivers. Bell Syst. Tech. J., 59:1909-1921, December 1980.
[61] R. J. Pilc. The transmission distortion of a source as a function of the encoding block length. Bell Syst. Tech. J., 47:827-885, 1968.
[62] John G. Proakis. Digital Communications. McGraw-Hill Book Company, New York, 1995.
[63] Beulah Rudner. Construction of minimum-redundancy codes with an optimum synchronizing property. IEEE Transactions on Information Theory, 17:4:478-487, July 1971.
[64] N. Rydbeck and C.-E. W. Sundberg. Analysis of digital errors in nonlinear PCM systems. IEEE Transactions on Communications, COM-24:59-65, January 1976.
[65] M. P. Schutzenberger. On the application of semigroup methods to some problems in coding. IRE Transactions on Information Theory, IT-2:47-60, 1956.
[66] Mohammad R. Soleymani and Carl R. Nassar. Trellis quantization with MAP detection for noisy channels. IEEE Transactions on Communications, 40:10:1562-1565, 1992.
[67] L. E. Stanfel. Mathematical optimization and the synchronizing properties of encodings. Information and Computation, 77:57-76, 1988.
[68] J. J. Stiffler. Theory of Synchronous Communications. Prentice-Hall, Englewood Cliffs, New Jersey, 1971.
[69] J. A. Storer and J. H. Reif. Error resilient optimal data compression. SIAM Journal on Computing, 26:4:934-949, 1997.
[70] J. A. Storer and J. H. Reif. Low-cost prevention of error-propagation for data compression with dynamic dictionaries. In Proceedings of the 1997 IEEE Data Compression Conference, pages 171-180, 1997.
[71] Peter F. Swaszek and Peter DiCicco. More on the error recovery for variable-length codes. IEEE Transactions on Information Theory, 41:6:2064-2071, November 1995.
[72] Yasuhiro Takishima, Masahiro Wada, and Hitomi Murakami. Error states and synchronization recovery for variable length codes. IEEE Transactions on Communications, 42:2/3/4:783-792, 1994.
[73] Vahid Tarokh, Ayman Naguib, Nambi Seshadri, and A. R. Calderbank. Space-time codes for high data rate wireless communication: Performance criteria in the presence of channel estimation errors, mobility, and multiple paths. IEEE Transactions on Communications, 47:2:199-207, February 1999.
[74] Jukka Teuhola and Timo Raita. Arithmetic coding into fixed-length codewords. IEEE Transactions on Information Theory, 40:219-223, January 1994.
[75] Mark R. Titchener. The synchronization of variable-length codes. IEEE Transactions on Information Theory, 43:683-691, March 1997.
[76] G. Ungerboeck. Channel coding with multilevel/phase signals. IEEE Transactions on Information Theory, 28:55-67, January 1982.
[77] Vahid Tarokh, Hamid Jafarkhani, and A. R. Calderbank. Space-time block coding for wireless communications: Theory of generalized orthogonal designs. IEEE Transactions on Information Theory, 45:1456-1467, July 1999.
[78] V. A. Vaishampayan. Design of multiple description scalar quantizers. IEEE Transactions on Information Theory, 39:3:821-834, May 1993.
[79] V. A. Vaishampayan and J.-C. Batllo. Asymptotic analysis of multiple description quantizers. IEEE Transactions on Information Theory, 44:278-284, January 1998.
[80] V. A. Vaishampayan and J. Domaszewicz. Design of entropy-constrained multiple description scalar quantizers. IEEE Transactions on Information Theory, 40:245-250, January 1994.
[81] Andrew J. Viterbi and Jim K. Omura. Trellis encoding of memoryless discrete-time sources with a fidelity criterion. IEEE Transactions on Information Theory, 20:3:325-332, May 1974.
[82] Min Wang and Thomas R. Fischer. Trellis-coded quantization designed for noisy channels. IEEE Transactions on Information Theory, 40:6:1792-1802, November 1994.
[83] Yao Wang, Michael T. Orchard, and Vinay Vaishampayan. Multiple description coding using pairwise correlating transforms. Submitted to IEEE Transactions on Information Theory,
1999.
[84] Richard D. Wesel and John M. Cioffi. Constellation labeling paradigms. Submitted to IEEE Transactions on Information Theory, 1999.
[85] H. S. Witsenhausen and A. D. Wyner. Source coding for multiple descriptions II. Bell Syst. Tech. J., 60:2281-2292, December 1981.
[86] J. K. Wolf, A. D. Wyner, and J. Ziv. Source coding for multiple descriptions. Bell Syst. Tech. J., 59:1417-1426, October 1980.
[87] En-Hui Yang and Zhen Zhang. An online universal lossy data compression algorithm via continuous codebook refinement - Part III: Redundancy analysis. IEEE Transactions on Information Theory, 44:5:1782-1801, September 1998.
[88] En-Hui Yang and Zhen Zhang. On the redundancy of lossy source coding with abstract alphabets. IEEE Transactions on Information Theory, 45:4:1092-1110, 1999.
[89] L. Yang and T. R. Fischer. Trellis-coded quantization for binary erasure channels. IEEE Transactions on Information Theory, 45:2:781-787, March 1999.
[90] B. Yu and T. P. Speed. A rate of convergence result for a universal d-semifaithful code. IEEE Transactions on Information Theory, 39:813-820, 1993.
[91] K. A. Zeger, A. Bist, and T. Linder. Universal source coding with codebook transmission. IEEE Transactions on Communications, 42:336-346, February 1994.
[92] K. A. Zeger and A. Gersho. Zero redundancy channel coding in vector quantization. IEE Electronics Letters, 23:654-656, June 1987.
[93] K. A. Zeger and A. Gersho. Pseudo-Gray coding. IEEE Transactions on Communications, 38:2147-2158, December 1990.
[94] Zhen Zhang and T. Berger. New results in binary multiple descriptions. IEEE Transactions on Information Theory, 33:502-521, July 1987.
[95] Zhen Zhang and En-Hui Yang. An online universal lossy data compression algorithm via continuous codebook refinement - Part II: Optimality for φ-mixing source models. IEEE Transactions on Information Theory, 42:3:822-836, May 1996.
[96] Zhen Zhang, En-Hui Yang, and Victor K. Wei. The redundancy of source coding with a fidelity criterion - Part one: Known statistics. IEEE Transactions on Information Theory, 43:1:71-91, January 1997.
[97] Guangcai Zhou, Yuankai Wang, Zhen Zhang, and Keith M. Chugg. On space-time convolutional codes for PSK modulation. In Proceedings of the IEEE International Conference on Communications (ICC 2001), 2001.
[98] Guangcai Zhou and Zhen Zhang. The redundancy of trellis coded quantization. Preprint, 1998.
[99] Guangcai Zhou and Zhen Zhang. Robust index assignment for finite state vector quantizers. Preprint, 1999.
[100] Guangcai Zhou and Zhen Zhang. Synchronization property of optimal prefix-free codes. Preprint, 1999.
[101] Guangcai Zhou and Zhen Zhang. Tree structured trellis coded quantization. Under preparation, 2000.
[102] Guangcai Zhou and Zhen Zhang. Synchronization recovery of prefix codes. IEEE Transactions on Information Theory, 48, January 2002.

Appendix A
Proofs of Theorems for Chapter 4

In this appendix, we give the proofs of the two theorems and the corollary.

Proof of Theorem 4.1: We first explain the meaning of each term in (4.12). For $n = 0$,

$$g^{(0)}(x) = p(x)\left(Q_M(x)\right)^0 e_1 = p(x) e_1 = p_0(x)$$

is the polynomial corresponding to synchronization in the IDD sense. Let $p_0(x) = \sum_i g_i^{(0)} x^i$. Then $g_i^{(0)}$ represents the probability that the source word number is $i$ more if $i > 0$, or $|i|$ less if $i < 0$, after decoding. Let

$$g^{(n)}(x) = p(x)\left(Q_M(x)\right)^n e_1 = \sum_i g_i^{(n)} x^i.$$
Then $g_i^{(n)}$ represents the probability that the source word number is $i$ more if $i > 0$, or $|i|$ less if $i < 0$, when the codeword sequence becomes synchronized (in the IDD sense) after corrupting $n + 1$ codewords; that is to say, no wide-sense synchronization occurs before instant $n$. Therefore the constant term of $G(x)$ is $\sum_{n} g_0^{(n)}$ and represents the probability that the decoded source word number is exactly the original source word number, i.e., the synchronization probability in the Hamming distance sense. By the above discussion, we can see that

$$\frac{dG(x)}{dx}\Big|_{x=1} = \sum_{n=0}^{\infty}\sum_{i=-\infty}^{\infty} i\, g_i^{(n)}$$

is the average number difference between the original source words and the decoded source words. It is also clear that $\sum_{n=0}^{\infty}(n+1)\, g_0^{(n)}$ is the MEPL conditional on synchronization with respect to the Hamming distance, and $\sum_{n=0}^{\infty}(n+1)^2\, g_0^{(n)}$ is the second moment of the error propagation length conditional on synchronization with respect to the Hamming distance. □

Proof of Theorem 4.2: This theorem is in fact an extended form of Theorem 1 in [102]. Let $p = (p_0, p_1, \cdots, p_M) = P(1)$. Then $p_i$ is the average probability that, when one bit error occurs, the erred codeword after decoding generates the internal node (state) $a_i$, for $i = 0, 1, \cdots, M$. Let $Q = Q(1)$ and $Q_M = Q_M(1)$. Then $Q$ represents the stochastic matrix on the state set $\Sigma = \{a_0 = \lambda, a_1, \cdots, a_M\}$. Hence $pQ_M^n e_1$ represents the probability that, after corrupting $n + 1$ codewords, synchronization in the wide sense is not yet achieved, i.e., the error will continue to propagate. The probability that at instant $n > 0$ the corrupted codeword number is $n + 1$ but synchronization resumes after this corrupted codeword is $pQ_M^{n-1}e_1 - pQ_M^n e_1$. At instant $n = 0$, the probability that the error will propagate is $1 - pe_1$. Therefore the mean error propagation length in the IDD sense is

$$\mathrm{MEPL} = 1 - pe_1 + \sum_{n=0}^{\infty}(n+2)\left(pQ_M^n e_1 - pQ_M^{n+1}e_1\right) = 1 + \sum_{n=0}^{\infty} pQ_M^n e_1 = H(1, 1).$$

The second moment of the error propagation length is therefore

$$\text{2nd moment} = 1 - pe_1 + \sum_{n=0}^{\infty}(n+2)^2\left(pQ_M^n e_1 - pQ_M^{n+1}e_1\right) = 1 + \sum_{n=0}^{\infty}\left[(n+2)^2 - (n+1)^2\right] pQ_M^n e_1 = 1 + \sum_{n=0}^{\infty}\left[2(n+1)+1\right] pQ_M^n e_1,$$

which can be expressed through $H(x, y)$ and its partial derivatives evaluated at $(x, y) = (1, 1)$. □

Proof of Corollary 4.1: One can easily check that $pQ^n e_1 = pQ_M^n e_1$. Since $Q(1)$ is a stochastic matrix, if $I - Q_M$ is invertible we can easily prove that $(I - Q_M)^{-1} = \sum_{n=0}^{\infty} Q_M^n$ and $(I - Q_M)^{-2} = \sum_{n=0}^{\infty}(n+1) Q_M^n$; see also [102]. Hence the corollary can easily be deduced from Theorem 4.2. □
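The closed forms above are convenient to evaluate numerically. The sketch below (toy values for $p$ and $Q$, assuming numpy; not code from the dissertation) checks the series identity behind Corollary 4.1 by comparing the closed form with a truncated sum:

```python
# Numeric sketch of the matrix-series identities behind Corollary 4.1:
# (I - Q)^-1 = sum_n Q^n and (I - Q)^-2 = sum_n (n+1) Q^n, so quantities such as
# MEPL = 1 + sum_n p Q^n e_1 collapse to 1 + p (I - Q)^-1 e_1.
import numpy as np

Q = np.array([[0.20, 0.10],    # toy substochastic matrix over the non-synchronized
              [0.30, 0.10]])   # states (row sums < 1, so I - Q is invertible)
p = np.array([0.60, 0.40])     # toy initial state probabilities after the bit error
e1 = np.array([1.0, 0.0])

closed = 1.0 + p @ np.linalg.inv(np.eye(2) - Q) @ e1
truncated = 1.0 + sum(p @ np.linalg.matrix_power(Q, n) @ e1 for n in range(200))
print(closed, truncated)       # the two values agree to machine precision
```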
Appendix B
Proofs for Chapter 5

Proof of Theorem 5.1: Assume that a bit error occurs at instant 0. The term $L(S, A)$ in (5.21) corresponds to the average corrupted source word length when a codeword contains a bit error, i.e., at instant 0. The term $p L'$ is the average source word length that will be corrupted at instant 1, which happens because the current ETS is not a first-class ETS. In the same way, the term $p Q L'$ is the average source word length that will be corrupted at instant 2, which happens when the current ETS is not a first-class ETS and the correct parsing does not resume after the second codeword. Similarly, the term $p Q^n L'$ is the average source word length that will be corrupted at instant $n + 1$. Therefore, the right-hand side of the formula is the total MEPL when a bit error happens at instant 0. When $I - Q$ is invertible, we can conclude that the absolute values of the eigenvalues of $Q$ are less than 1; hence, as an operator on the $m$-dimensional Euclidean space, $\|Q\| < 1$. So the first part of the theorem is proved.

The 1 in the formula corresponds to the codeword containing the bit error, which is in error with probability 1. The term $p\mathbf{1}'$ is the total probability that the codeword containing the bit error has a non-empty state; this is also the probability that the second codeword is decoded incorrectly. In the same way, the term $pQ\mathbf{1}'$ is the probability that the correct parsing does not resume after the second codeword; this is also the probability that the third codeword is decoded incorrectly, and so on. Therefore, the right-hand side of the formula is the sum of the probabilities of the events that the $i$-th codeword is decoded incorrectly, over all $i$; it is exactly the mean error propagation length of the code. When $I - Q$ is invertible, an analysis similar to that in the first part of the theorem finishes the proof. □

Proof of Corollary 5.1: By the proof of Theorem 5.1, we know that the probability that one codeword is decoded incorrectly and all following codewords are decoded correctly is $1 - p\mathbf{1}'$. Similarly, the probability of corrupting exactly $n + 1$ codewords is $pQ^{n-1}\mathbf{1}' - pQ^n\mathbf{1}'$ for $n = 1, 2, \cdots$. Hence the variance of the error propagation length is

$$\sigma_c^2 = 1 - p\mathbf{1}' + \sum_{n=0}^{\infty}(n+2)^2\left(pQ^n\mathbf{1}' - pQ^{n+1}\mathbf{1}'\right) - \mathrm{MEPL}_c^2 = 3\,\mathrm{MEPL}_c + 2\sum_{n=0}^{\infty} n\, pQ^n\mathbf{1}' - 2 - \mathrm{MEPL}_c^2.$$

If $I - Q$ is invertible, then $\|Q\| < 1$, and by the identity $(I - Q)^{-2} = \sum_{n=0}^{\infty}(n+1)Q^n$ the corollary is easily proved. □

Appendix C
Proofs of Theorems for Chapter 8

Proof of Lemma 8.1: In the proof of the lemma, we iteratively use the inequality

$$H(X, Y, Z) \le H(X, Y) + H(X, Z) - H(X) \tag{C.1}$$

for random variables $X$, $Y$, $Z$. Let $S_i$ be the set of marginal random vectors of $(X_1, \cdots, X_N)$ of dimension $i$, so that the cardinality of $S_i$ is $\binom{N}{i}$. Applying the above inequality repeatedly, we can obtain an inequality of the form

$$H(X_1, \cdots, X_N) \le a \sum_{s \in S_{N-1}} H(s) - b \sum_{s \in S_{N-2}} H(s) \tag{C.2}$$

with positive coefficients $a$ and $b$. Applying the same procedure to each $H(s)$, we can achieve

$$\sum_{s \in S_{N-1}} H(s) \le c_N \sum_{s \in S_{N-2}} H(s) - d_N \sum_{s \in S_{N-3}} H(s), \tag{C.3}$$

where $c_N > 0$ and $d_N > 0$ are constants depending on $N$. Plugging (C.3) into (C.2), we obtain

$$H(X_1, \cdots, X_N) \le c_N' \sum_{s \in S_{N-2}} H(s) - d_N' \sum_{s \in S_{N-3}} H(s), \tag{C.4}$$

where $c_N' > 0$ and $d_N' > 0$ are also constants depending on $N$. By induction, we can obtain the inequality

$$H(X_1, \cdots, X_N) \le C_N \sum_{1 \le i < j \le N} H(X_i, X_j) - D_N \sum_{i=1}^{N} H(X_i),$$

where $C_N > 0$ and $D_N > 0$ are constants depending on $N$. Note that equality holds both when $X_1, \cdots, X_N$ are i.i.d. and when $X_1 = \cdots = X_N$. Hence we have the following system of equations:

$$\begin{cases} N = C_N N(N-1) - D_N N \\ 1 = C_N N(N-1)/2 - D_N N \end{cases}$$

Hence $C_N = 2/N$ and $D_N = (N-2)/N$, and the lemma is proved. □
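Inequality (C.1) is the submodularity of joint entropy and can be spot-checked numerically on any small joint distribution. A quick sketch (random pmf, assuming numpy; not code from the dissertation):

```python
# Numeric check of inequality (C.1): H(X,Y,Z) <= H(X,Y) + H(X,Z) - H(X),
# i.e. submodularity of joint entropy, which the proof of Lemma 8.1 uses iteratively.
import numpy as np

rng = np.random.default_rng(1)
p = rng.random((2, 3, 2))          # joint pmf of (X, Y, Z) on a small alphabet
p /= p.sum()

def H(pmf):
    q = pmf[pmf > 0]
    return -(q * np.log2(q)).sum()

lhs = H(p)                                               # H(X, Y, Z)
rhs = H(p.sum(axis=2)) + H(p.sum(axis=1)) - H(p.sum(axis=(1, 2)))
print(lhs <= rhs + 1e-12)                                # True for every distribution
```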