Joint data detection and parameter estimation: Fundamental limits and applications to optical fiber communications
(USC Thesis Other)
JOINT DATA DETECTION AND PARAMETER ESTIMATION: FUNDAMENTAL LIMITS AND APPLICATIONS TO OPTICAL FIBER COMMUNICATIONS

by Orhan Coşkun

A Dissertation Presented to the FACULTY OF THE GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (ELECTRICAL ENGINEERING)

May 2004

Copyright 2004 Orhan Coşkun

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

UMI Number: 3140461

UMI Microform 3140461. Copyright 2004 by ProQuest Information and Learning Company. All rights reserved. This microform edition is protected against unauthorized copying under Title 17, United States Code. ProQuest Information and Learning Company, 300 North Zeeb Road, P.O. Box 1346, Ann Arbor, MI 48106-1346.

Dedication

To my wife and parents

Acknowledgements

First and foremost, I must thank God for how generous It has been to me, for all the gifts It has bestowed upon me. I would like to express my deepest gratitude to Professor Keith M. Chugg, my supervisor, for his guidance, advice, and constant support throughout the course of this work. I have benefitted a lot from his research experience and his extensive technical knowledge.
He made this rocky road joyful for me. I would like to thank Professors Antonio Ortega, P. Vijay Kumar, and Andreas Polydoros for taking the time to present such excellent courses as Wavelet Theory, Coding Theory and Communication Theory. I also would like to thank Professor Peter Baxendale for his helpful comments and for serving on my thesis committee. I am indebted to them for the effort they put into reading this dissertation under significant time constraints; their thoughtful criticisms and suggestions ultimately made this a sounder and more coherent document. My research benefitted from the discussions and criticisms from my friends: Baris Fidan, Cengizhan Cardakli, Jun Heo, Ali Taha, Gent Paparisto, Omer Oralkan, Ayhan Mutlu, Ali Ekber Gurel and Kazim Buyukboduk.

Last but by no means least, I wish to express my sincere gratitude to my parents, sister and two brothers. With my wife's love, understanding and organizational skills and with my father's encouragement, this work finally has been completed.

Table of Contents

Dedication
Acknowledgements
List Of Tables
List Of Figures
Abstract
1 Introduction
1.1 Signal Model and Optimal Detection
1.2 Decision Feedback Equalizer
1.3 Thesis Outline
2 Training Codes
2.1 Traditional Training and JML Detection
2.2 Overhead-Complexity Trade-off for Long Packets
2.3 Training Code Design Using Separability and Distance Conditions
2.3.1 Code Design Methodology
2.3.2 Noise-Free Separability
2.3.3 Example of Noise-Free Design Methodology
2.3.4 A Pairwise Distance for JMLSE
2.3.5 Example of Noisy Design Methodology
3 Optimization of Equalization Techniques for Fiber Optical Systems
3.1 Optical Channel Model
3.2 Maximum Likelihood Sequence Detection for Fiber Optical Systems
3.3 DFE for Fiber Optical Systems
3.3.1 Mean Adaptation
3.3.2 Equalizer Coefficient Adaptation
3.3.3 Sampling Phase Adaptation
3.3.4 Slicer Threshold Adaptation (STA)
4 Data-Aided Clock Recovery for Optical Receivers
4.1 Data Aided Clock Synchronizer
5 Conclusions
Reference List
Appendix A Previous Identifiability / Separability Tests and Their Relation
Appendix B PDF of the Quadratic Detector Signal in AWGN Environment
Appendix C An Algorithm for Code Design (Clique Problem)
C.1 Greedy Heuristic for Code Design
Appendix D Parallel Implementation DFE and Viterbi Algorithm

List Of Tables

2.1 Effect of training sequence selection on channel estimation error variance for L = 3. Values shown for the worst full-rank sequence and the best training sequence.
3.1 Results of SPBA algorithm for 2500 ps/nm and 80 ps DGD channels at 11 dB OSNR.
4.1 Jitter performance of the timing recovery algorithm as a function of OSNR.

List Of Figures

1.1 The direct structure for known-channel MLSE
1.2 The front end processing blocks of a receiver
1.3 (a) Transfer function of zero-forcing LE (b) Another interpretation of ZF-LE (c) Derivation of ZF-DFE from ZF-LE
2.1 Configuration of long-packet investigation. (a) The traditional approach and the split-training approach. (b) Standard initialization for a PSP or Viterbi processor. (c) Exhaustive JML initialization up to time Q for a generalized PSP receiver.
2.2 The trade-off between training overhead and receiver complexity for relatively long packets. Simulation parameters: $f = [1\ 2\ 1]^T/\sqrt{6}$, D = 60 information bearing bits, and RLS forgetting factor p = 0.9.
2.3 Bit error rate versus packet location for split training and standard training. Simulation parameters: $f = [1\ 2\ 1]^T/\sqrt{6}$, D = 60 information bearing bits, RLS forgetting factor p = 0.9, and $E_s/N_0$ = 7 dB.
2.4 The conditional PER performance of JMLCSE for 400 3-tap channels selected on the unit hemisphere as a function of $m_\Gamma(\cdot)/\sigma_\Gamma(\cdot)$. Shown under the condition of either sequence transmitted and $\sigma^2 = 1$. The empirical upper bound $Q(m_\Gamma(\cdot)/\sigma_\Gamma(\cdot))$ is also shown.
2.5 The PER for two sets of 3 sequences averaged over all 3-tap channels. Set A was designed to have good unknown-channel performance and set B was designed to have good known-channel performance.
2.6 Comparison of segregated and combined training and modulation. All systems use a packet length of N = 16, with the trained and training-code systems conveying 7 information bits per packet. Performance is averaged over all 3-tap channels.
3.1 Simplified transmission and reception model for optical fiber system
3.2 Realization of y(t) corresponding to λ(t)
3.3 Variation of $N_{shot}$, $N_{ther}$, $N_{sp-sp}$ as a function of optical bandwidth in terms of data rate, $M = B_o/R_b$
3.4 PMD model
3.5 Time domain representation of overall channel response for no dispersion, 50 ps, 100 ps DGD PMD and 2000 ps/nm CD cases
3.6 Power spectral density of no dispersion, 50 ps, 100 ps DGD PMD and 2000 ps/nm CD channels
3.7 Effect of step size on the acquisition of PSP algorithm for 80 ps DGD PMD channel at 11 dB OSNR. With a large step size, the PSP algorithm acquires the channel from its shifted version. This switch occurs at around t = 2100.
3.8 SPBA acquires the actual channel coefficients through the selection metric. (a) For 2500 ps/nm CD channel, the 2nd initialization gives the minimum selection metric. (b) For 80 ps PMD channel, the 1st initialization gives the minimum metric.
3.9 (a) Conditional and (b) unconditional probability density functions of $x_k$ for 80 ps DGD channel in the logarithm domain, respectively
3.10 3-phase process to achieve the BER performance of MLSD
3.11 Mean, sampling time, FFF and FBF coefficients, and slicer threshold adaptation of DFE
3.12 Convergence of the input mean, $m_s$, for 70 ps DGD PMD channel at 14 dB OSNR
3.13 (a) Convergence of feedforward coefficients and (b) convergence of feedback coefficients for 70 ps DGD PMD channel at 14 dB OSNR
3.14 Flowchart of the sampling phase adaptation algorithm
3.15 (a) Timing function for 70 ps DGD PMD channel at 14 dB OSNR. (b) Convergence of sampling phase.
3.16 LUT based STA for 70 ps DGD PMD channel at 14 dB OSNR
4.1 Overall time-domain responses of various dispersion channels
4.2 Variations of phase error for various dispersion channels
4.3 Variances of phase errors
4.4 Block diagram of timing recovery loop
4.5 Impact of the sampling phase on the output statistics of the received signal. (a) Q value of 50 ps DGD channel at 12 dB OSNR after the DFE. (b) The MSE of the DFE for the same channel.
4.6 Closed loop Bode plot of Type II filter
4.7 Accumulated phase jitter for 50 ps DGD PMD channel at 12 dB with Type II filter
4.8 Improvement of the Q value while the parameters are adapting. (a) Q of 50 ps DGD channel. (b) Coefficients of the DFE.
4.9 Acquisition of the timing recovery algorithm for a frequency difference of 12 MHz after a number of cycle slips
4.10 Impact of the loop filter on the output Q value for 2000 ps/nm CD channel at 14 dB OSNR with a 4 MHz frequency difference
A.1 The relation between the identifiability checks in [8] is weaker than that in [28]. Note that the relation between regular and assured channels is a conjecture.
C.1 Example of the execution of the clique algorithm. One of the maximum cliques of the graph (in circle) is emphasized by non-dashed edges.
D.1 Trellis diagram for DFE

Abstract

The traditional method of sending a training signal to identify a channel, followed by data, may be viewed as a simple code for the unknown channel. Results in blind sequence detection suggest that performance similar to this traditional approach can be obtained without training. However, for short packets and/or time-recursive algorithms, significant error floors exist due to the existence of sequences that are indistinguishable without knowledge of the channel. In this work, we first reconsider training signal design in light of recent results in blind sequence detection. We design training codes which combine modulation and training.
In order to design these codes, we find an expression for the pairwise error probability of the joint maximum likelihood (JML) channel and sequence estimator. This expression motivates a pairwise distance for the JML receiver based on principal angles between the range spaces of data matrices. The general code design problem (generalized sphere packing) is formulated as the clique problem associated with an unweighted, undirected graph. We provide optimal and heuristic algorithms for this clique problem. For short packets, we demonstrate that significant improvements are possible by jointly considering the design of the training, modulation, and receiver processing. For long packets, training codes are used to reduce the BER in the start-up mode.

As a practical blind data detection example, data reception in a fiber optical channel is investigated. To get the most out of the data detection methods, auxiliary algorithms such as sampling phase adjustment and decision threshold estimation are suggested. For the parallel implementation of detectors, a semiring structure is introduced both for the decision feedback equalizer (DFE) and for maximum likelihood sequence detection (MLSD).

Even when the data detection algorithm is optimally designed, if the jitter in the recovered clock of the baseband incoming signal is high, the BER performance of the system will be adversely affected. A data-aided clock recovery algorithm reduces the jitter of the clock with the help of decisions from the data detection module. The performance of such a data-aided clock recovery scheme is analyzed and simulated. The algorithms suggested in this thesis cover some of the basic algorithms for optimizing the performance of a receiver in both complexity and BER.
Chapter 1

Introduction

The goal of this work is to investigate the design space of two popular detection techniques: DFE and MLSD. When transmitted signals smear into each other, detectors which do not exploit the memory in the channel yield a higher BER than can be achieved with DFE or MLSD. Even though MLSD does not estimate the input data by simply filtering a function of the received data as equalizers do, we will treat it as a member of the set of equalizers. The main focus of this thesis is to search for techniques in the design space of equalization methods that use the given resources efficiently.

We start with the efficient utilization of a given overhead for channel estimation. The area of interest is the acquisition mode, where there is a complete lack of channel information except for the channel length. Two extreme approaches have been suggested before. Blind approaches use only the received signal to identify the channel, whereas trained approaches get some help from the transmitter to provide a reliable channel estimate to the decoder. Besides the channel estimates, the sampling phase should be adjusted in baud-rate sampling receivers so that sampling occurs at a phase where the impact of aliasing on the identifiability of the channel is as small as possible. The metric in MLSD and the decision threshold in DFE also play a significant role in the BER performance of an equalizer. Especially when the noise in the system is non-Gaussian and data dependent, the metric and threshold parameters should be adjusted adaptively. High-speed application of an equalization algorithm requires processing so simple that even access to an error signal might not be possible.
Under these circumstances, performance monitoring, which is an essential receiver function, is possible only through the manipulation of some of the available signals in the design space of equalization. In addition to BER-related algorithms, some mapping algorithms are necessary to break the inherent recursion bottlenecks in equalization methods. Those mappings enable the parallel implementation of the equalization blocks in the receiver modules. In the following, the signal model, JML detection, and the DFE are briefly reviewed.

1.1 Signal Model and Optimal Detection

The continuous-time received signal in an AWGN environment with an ISI channel can be modeled as

$$ r(t) = \sum_i a_i\, h(t - iT) + n(t) = y(t) + n(t) \tag{1.1} $$

where $a_i$ is the data sequence (throughout the thesis, $a_i$ and $a_k$ are used interchangeably to represent the input data), $h(t)$ is the equivalent channel arising from the convolution of the data pulse, $u(t)$, and the physical channel, $c(t)$, and $n(t)$ is the AWGN with power spectrum $N_0$. The ill effect of $c(t)$ elongating the data pulse has to be mitigated at the receiver in order to prevent power loss. In this section we focus on determining the degree to which such a mitigation process can be performed optimally. When $r(t)$ is passed through the matched filter (MF), $g(t) = h^*(-t)$, the output gives a set of sufficient statistics to decode the transmitted signal [17,20]; as such, the MF output, $x[n]$, retains as much information as the continuous signal $r(t)$. The quality of a detector is most appropriately measured by the probability of making an erroneous decision on $a_i$. The maximum a posteriori probability (MAP) detector is optimal in terms of minimizing the probability of such an event.
Provided that the a priori probabilities of the data sequence $a_i$ are all equal, MAP detection of the data sequence based on observing $r(t)$ on some interval $J$ can be achieved by minimizing the metric

$$ \Gamma(\{a_i\}) = \int_J |y(t;\{a_i\})|^2\,dt - 2\Re\left\{ \int_J r(t)\, y^*(t;\{a_i\})\,dt \right\} \tag{1.2} $$

This metric can be computed recursively by

$$ \Gamma_k(\mathbf{a}_k) = \Gamma_{k-1}(\mathbf{a}_{k-1}) + \int_{kT}^{(k+1)T} |y(t;\mathbf{a}_k)|^2\,dt - 2\Re\left\{ \int_{kT}^{(k+1)T} r(t)\, y^*(t;\mathbf{a}_k)\,dt \right\} \tag{1.3} $$

where $\mathbf{a}_k = [a_k\ a_{k-1}\ \cdots\ a_0]$ and $J = [0, (k+1)T]$ [12]. The resulting receiver structure suggested by (1.3) is depicted in Figure 1.1. The recursion (1.3) is similar to the Ungerboeck MLSD receiver [58], given by

$$ \Gamma_k(\mathbf{a}_k) = \Gamma_{k-1}(\mathbf{a}_{k-1}) + 2\Re\left\{ a_k^* \left( 2x[k] - \rho_h(0)\,a_k - 2\sum_{i=1}^{L} \rho_h(i)\,a_{k-i} \right) \right\} \tag{1.4} $$

where

$$ x[k] = \int r(t)\, h^*(t - kT)\,dt \tag{1.5} $$

$$ \rho_h[k] = \int_I h(t)\, h^*(t - kT)\,dt \tag{1.6} $$

The equivalent channel, $h(t)$, is assumed to have a time span of $I = [0, LT]$. A common feature of the receivers (1.3) and (1.4) is that they both find the data sequence whose output is closest to the received data without using a whitening filter. As a matter of fact, for uncoded systems the minimum distance between the received signal and the filtered version of the hypothesized data sequence is a function of $h(t)$ only, and therefore cannot be increased by adding a discrete-time filter, such as a whitening filter, after the sampled MF output. By means of coding, however, the separation between the allowable sequences is increased owing to the redundancy added to the transmitted data. Besides, coding is able to average the impact of the noise by adding time diversity to the transmitted data.
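The metric (1.2) can be illustrated with a small discrete-time sketch. The code below is not the receiver of Figure 1.1: it replaces the integrals by sums over samples, assumes real-valued BPSK data, and searches all sequences by brute force instead of running the Viterbi algorithm. The channel taps, the transmitted sequence, and the fixed noise perturbation are hypothetical values chosen only for illustration.

```python
from itertools import product

def channel_output(symbols, h):
    """Noise-free samples y[n] = sum_i a_i h[n - i] for a sampled channel h."""
    y = [0.0] * (len(symbols) + len(h) - 1)
    for i, a in enumerate(symbols):
        for j, tap in enumerate(h):
            y[i + j] += a * tap
    return y

def correlation_metric(r, y):
    """Discrete analogue of (1.2) for real signals: sum y^2 - 2 * sum r*y."""
    return sum(v * v for v in y) - 2.0 * sum(rv * yv for rv, yv in zip(r, y))

def ml_detect(r, h, n_symbols):
    """Brute-force search over all BPSK sequences; returns the metric minimizer."""
    return min(product((-1, 1), repeat=n_symbols),
               key=lambda a: correlation_metric(r, channel_output(a, h)))

h = [0.8, 0.5, 0.3]                      # hypothetical 3-tap sampled channel
sent = (1, -1, -1, 1, 1)                 # hypothetical BPSK data
noise = [0.05, -0.1, 0.08, 0.02, -0.07, 0.1, -0.03]   # fixed small perturbation
r = [y + n for y, n in zip(channel_output(sent, h), noise)]

detected = ml_detect(r, h, len(sent))
print(detected)
```

Because $\sum (r - y)^2 = \sum r^2 + \sum y^2 - 2\sum r y$ and $\sum r^2$ does not depend on the hypothesis, minimizing this metric picks the same sequence as minimizing the Euclidean distance to the received samples.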
[Figure 1.1: The direct structure for known-channel MLSE. Block labels: h(t), hypothesized versions of y(t), integration from kT to (k+1)T, Viterbi decoder, survivor paths, data sequence estimate.]

The upper bound for the minimum distance of an uncoded system can be achieved by a channel which satisfies the Nyquist criterion, implying that ISI cannot improve the performance. The performance degradation due to ISI can grow as the length of the channel increases. For example, in PAM signalling with AWGN, the worst-case ISI channel $f_1 = [0.38\ 0.60\ 0.60\ 0.38]$ has a performance loss of 4.2 dB, whereas the performance loss for the longer channel $f_2 = [0.29\ 0.5\ 0.58\ 0.5\ 0.29]$ is 5.7 dB [46]. One striking characteristic of these channels is that all their zeros are on the unit circle. Forney [20] applied a whitening filter to decorrelate the MF output noise. The whitening filter enables us to use the usual squared Euclidean distance metric in the Viterbi algorithm (VA). Let $S_h(e^{j\omega T})$ denote the folded spectrum of the channel $h(t)$:

$$ S_h(e^{j\omega T}) = \frac{1}{T} \sum_{m=-\infty}^{\infty} \left| H\!\left( j\omega + j\frac{2\pi m}{T} \right) \right|^2 \tag{1.7} $$

$$ \phantom{S_h(e^{j\omega T})} = \sum_{k=-L}^{L} \rho_h[k]\, e^{-j\omega k T} \tag{1.8} $$

where $H(j\omega)$ is the Fourier transform of $h(t)$. Then, so long as the folded spectrum satisfies the so-called Paley-Wiener condition

$$ \int_{-\pi/T}^{\pi/T} \left| \log S_h(e^{j\omega T}) \right| d\omega < \infty \tag{1.9} $$

$S_h(z)$ can be factorized as

$$ S_h(z) = A_h^2\, G(z)\, G^*(1/z^*) \tag{1.10} $$

$$ A_h^2 = \exp\left\{ \frac{T}{2\pi} \int_{-\pi/T}^{\pi/T} \ln S_h(e^{j\omega T})\, d\omega \right\} \tag{1.11} $$

where $G(z)$ is a causal, monic, loosely minimum-phase transfer function. The front-end processing in Figure 1.2 with $g(t) = h^*(-t)$, sampling rate $p = 1$, and precursor equalizer $C(z) = 1/(A_h^2 G^*(1/z^*))$ is altogether called a whitened matched filter (WMF). Besides providing sufficient statistics, the WMF also maximizes the output SNR, with the variance of the complex baseband noise being $2N_0/A_h^2$. By adding enough delay, the anticausal filter $C(z)$ can be turned into a realizable filter for a finite-length implementation of the precursor filter.

[Figure 1.2: The front end processing blocks of a receiver. Block labels: CT filter, sampler, precursor equalizer, baud-rate sampler.]

Since the noise process at the output of the WMF is an innovation process, $C(z)$ can also be viewed as an optimal linear predictor. The correlation between successive noise samples is the key concept here. The received signal can then be modeled as

$$ \mathbf{z}_k = \mathbf{A}_k \mathbf{f} + \mathbf{w}_k \tag{1.12} $$

$$ \mathbf{f} = [\, f_{L-1}\ \cdots\ f_1\ f_0 \,]^T \tag{1.13} $$

$$ \text{row of } \mathbf{A}_k \text{ for time } n:\quad [\, a_{n-L+1}\ \cdots\ a_n \,], \qquad n = k, k-1, \ldots, 1 \tag{1.14} $$

where $(\cdot)^T$ denotes transposition and $\mathbf{y}_k = [\, y_k\ y_{k-1}\ \cdots\ y_1 \,]^T$ is the notation for a signal vector up to time $k$ (i.e., $\mathbf{z}_k$ and $\mathbf{w}_k$). Convolution of the independent identically distributed (iid) digital sequence $a_k$ with the finite-support channel is represented by multiplication of the $(L \times 1)$ channel vector $\mathbf{f}$ by the Toeplitz $(k \times L)$ data matrix $\mathbf{A}_k$. The maximum likelihood sequence detector (MLSD) selects the data sequence which minimizes the $\ell_2$ norm of the error, $\|\mathbf{w}_k\|^2 = \|\mathbf{z}_k - \mathbf{A}_k \mathbf{f}\|^2$. Since a unitary transformation preserves the $\ell_2$ norm, any rotation of $\mathbf{z}_k$ would not change the minimum distance. In the $z$ domain, an allpass filter functions as a unitary matrix, resulting in a different factorization of $S_h(z)$. However, among all front-end discrete-time filters, the one which generates a minimum-phase channel at its output has the desired property of maximal energy concentration near zero delay. This property of a minimum-phase filter can be shown by mapping a zero of the minimum-phase filter to the outside of the unit circle using an allpass filter.
Let $G(z)$ be a minimum-phase filter with a zero at $c$, $|c| < 1$, and let

$$ H = [\, h_0\ h_1\ h_2\ \cdots\ h_n \,] \tag{1.15} $$

$$ F(z) = H(z)\,(z^{-1} - c^*) \tag{1.16} $$

$$ G(z) = H(z)\,(1 - c\,z^{-1}) \tag{1.17} $$

Then the difference of the partial sums of the squared coefficient magnitudes is always positive for $k < n$:

$$ \sum_{i=0}^{k} |g_i|^2 - \sum_{i=0}^{k} |f_i|^2 = |h_k|^2 \left( 1 - |c|^2 \right) \tag{1.18} $$

$$ > 0 \tag{1.19} $$

Suppose that the received signal, $r(t)$, is a bandlimited signal. Then, due to the Landau-Pollak theorem [36], the above assumption of the channel being time limited can only be an approximation. Thus MLSD with a truncated channel performs best when the channel is minimum phase. The phase characteristics of the channel also play a significant role in the DFE because of the energy cancellation in the post-tail components of the channel.

We indicated above that even the optimal detector incurs some performance penalty with respect to the no-ISI channel, depending on the characteristics of the channel. Ungerboeck [58] gave two different channel conditions which guarantee that ISI has essentially no impact on the error performance of MLSD:

1. There exists no frequency $\omega_0$ such that $S_h(e^{j\omega_0 T})$ is 6 dB below $\rho_h(0)$.
2. The peak distortion at the output of the matched filter is less than unity.

Whenever one of these conditions is satisfied, the single error events are assured to be the dominating error events. Thus the minimum eigenvalue of the autocorrelation matrix, $E$, of the error sequence, $e = [\, a_0 - \hat{a}_0\ \ a_1 - \hat{a}_1\ \cdots\ a_k - \hat{a}_k \,]$, is 1 [46]. The second condition is also a sufficient condition for the zero-forcing equalizer to minimize the peak distortion [38].

Up to this point, we have considered optimal detection assuming that $h(t)$ is known at the receiver. When the channel is unknown or time varying, the implementation of an adaptive analog filter poses difficult challenges to system and circuit designers.
Furthermore, for signals that satisfy the Paley-Wiener condition, baud-rate sampling introduces aliasing. Unless the sampling phase is chosen optimally, the linear phase characteristics of $S_h(e^{j\omega T})$ are corrupted and there may be a frequency where the amplitude response is severely degraded. The margin for the sampling time is maximum for the rectangular pulse. Yet, since the frequency response of the rectangular pulse decays at a rate of $1/f$, it is the pulse shape most sensitive to ISI. The minimum-bandwidth pulse shape, on the other hand, is extremely sensitive to the timing offset. All of these problems can be addressed more easily in the discrete domain. Thus, in lieu of an analog matched filter, we can implement an optimal, yet simpler, discrete-time matched filter using an ideal lowpass filter, which in the time domain is a sinc function $g(t)$, and a fractional sampler with period $T_r = pT$, where $1/T_r > 2W$ and $W$ is the bandwidth of $r(t)$. The discrete-time model of such a system can be written as in (1.12), with each component of $\mathbf{z}_k$, $\mathbf{w}_k$, and $\mathbf{f}$ being a $(1 \times N_r)$ vector, where $N_r = 1/p$. Notwithstanding the out-of-band noise included at the output of the fractional sampler due to the oversampling, the signal spectrum is still preserved and there is no penalty incurred by the described front-end processing. Moreover, since the noise samples are independent, the cascade of such a lowpass filter and fractional sampler obviates the need for a whitening filter. In [12], the lowpass filter was replaced by a continuous-time filter matched to the known data pulse, $u(t)$. The effects of suboptimal front-end processing were compared for various channels through simulation in [10]. It was shown that the impact of the suboptimal front-end processing is substantial when the signal is not bandlimited, whereas it is only subtle for bandlimited cases.
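As a numerical aside on the minimum-phase argument of (1.15) to (1.19), the partial-energy identity (1.18) can be checked directly. The sketch below uses hypothetical coefficients for $H(z)$ and a hypothetical zero $c$ inside the unit circle; it builds $G(z)$ and $F(z)$ by polynomial multiplication and compares the partial energy gap with $|h_k|^2(1 - |c|^2)$.

```python
def convolve(p, q):
    """Coefficients of the product of two polynomials in z^-1."""
    out = [0j] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def partial_energies(taps):
    """Running sums of |tap|^2, i.e. energy accumulated up to each delay."""
    total, sums = 0.0, []
    for t in taps:
        total += abs(t) ** 2
        sums.append(total)
    return sums

h = [1.0, -0.4, 0.25, 0.1]       # hypothetical coefficients h_0..h_n of H(z)
c = 0.5 + 0.3j                    # hypothetical zero with |c| < 1

g = convolve(h, [1.0, -c])                   # G(z) = H(z)(1 - c z^-1), zero inside
f = convolve(h, [-c.conjugate(), 1.0])       # F(z) = H(z)(z^-1 - c^*), zero reflected

eg, ef = partial_energies(g), partial_energies(f)
gaps = [eg[k] - ef[k] for k in range(len(h))]             # left side of (1.18)
predicted = [abs(hk) ** 2 * (1 - abs(c) ** 2) for hk in h]  # right side of (1.18)

for k, (gap, pred) in enumerate(zip(gaps, predicted)):
    print(k, round(gap, 6), round(pred, 6))
```

The two filters have the same total energy, since they differ only by an allpass factor; the gap appears only in the partial sums, which is the sense in which the minimum-phase response concentrates its energy near zero delay.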
In all the receiver structures studied above, the VA is involved. However, the VA is applicable only when the channel is known, and its performance hinges critically on the reliability of the channel estimate. In the case of an unknown channel, an exhaustive search over all possible sequences is required to implement the JML receiver, which attempts to minimize

$$ \Lambda_k(\mathbf{A}_k) = \left\| (\mathbf{I} - \mathbf{A}_k \mathbf{A}_k^{\dagger})\, \mathbf{z}_k \right\|^2 = \left\| (\mathbf{I} - \mathbf{P}_k)\, \mathbf{z}_k \right\|^2 \tag{1.20} $$

over all allowable sequences $\mathbf{A}_k$. The matrix $\mathbf{A}_k^{\dagger}$ is the pseudo-inverse of $\mathbf{A}_k$, which is $(\mathbf{A}_k^H \mathbf{A}_k)^{-1} \mathbf{A}_k^H$ for full-rank $\mathbf{A}_k$, and $\mathbf{P}_k = \mathbf{A}_k \mathbf{A}_k^{\dagger}$ is the matrix that projects onto the range of $\mathbf{A}_k$. Thus, the JML receiver may be interpreted as a deterministic version of the estimator-correlator receiver [31], since it attempts to find the combination of $\mathbf{A}_k$ and $\hat{\mathbf{f}}(\mathbf{A}_k) = \mathbf{A}_k^{\dagger} \mathbf{z}_k$ that best aligns with $\mathbf{z}_k$. The JML receiver may also be interpreted as a matched-subspace detector, which attempts to find the range space that collects the most energy in $\mathbf{z}_k$.

1.2 Decision Feedback Equalizer

Given the complexity of JML detection, the DFE is a tempting choice thanks to its rather simple structure. Complexity-performance tradeoffs favor the DFE in numerous applications, such as dispersion compensation in fiber optical links and multipath fading cancellation in microwave links.

The DFE can be viewed as a Viterbi detector with immediate decisions. In Appendix D, we give the metric which maps the DFE into a trellis decoding algorithm. The metric basically entails the comparison of the precursor output with the postcursor output. For example, for binary phase shift keying (BPSK) modulation, detection is performed as follows:

$$ \hat{a}_k = \begin{cases} +1, & x_f[k] - x_b[k] > \text{threshold} \\ -1, & x_f[k] - x_b[k] < \text{threshold} \end{cases} \tag{1.21} $$

where $x_f[k]$ and $x_b[k]$ are the feedforward filter (FFF) and feedback filter (FBF) outputs, respectively.
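Before continuing with the DFE, the JML metric (1.20) from Section 1.1 can be made concrete with a small sketch. It is an illustration under simplifying assumptions, not the receiver studied in this thesis: the data are real BPSK, the observation is noise free, the channel has $L = 2$ hypothetical taps, and the projection onto the range of the data matrix is computed by Gram-Schmidt orthogonalization rather than through an explicit pseudo-inverse. The column ordering of the data matrix differs from (1.13) and (1.14), which does not change its range space.

```python
from itertools import product

def data_matrix_columns(a, L):
    """Columns of the Toeplitz data matrix: column j holds a[n - j], n = L-1..N-1."""
    N = len(a)
    return [[a[n - j] for n in range(L - 1, N)] for j in range(L)]

def residual_energy(z, columns, tol=1e-12):
    """||(I - P) z||^2 with P the projector onto span(columns), via Gram-Schmidt."""
    basis = []
    for col in columns:
        v = [float(x) for x in col]
        for q in basis:
            dot = sum(vi * qi for vi, qi in zip(v, q))
            v = [vi - dot * qi for vi, qi in zip(v, q)]
        norm = sum(vi * vi for vi in v) ** 0.5
        if norm > tol:                       # skip linearly dependent columns
            basis.append([vi / norm for vi in v])
    res = [float(x) for x in z]
    for q in basis:
        dot = sum(ri * qi for ri, qi in zip(res, q))
        res = [ri - dot * qi for ri, qi in zip(res, q)]
    return sum(ri * ri for ri in res)

def jml_metric(z, a, L):
    """Metric (1.20) evaluated for one hypothesized data sequence a."""
    return residual_energy(z, data_matrix_columns(a, L))

L = 2
f = [0.9, 0.4]                               # hypothetical taps; f[j] weights a[n - j]
sent = (1, 1, -1, 1, -1, -1)                 # hypothetical BPSK packet
cols = data_matrix_columns(sent, L)
z = [sum(f[j] * cols[j][n] for j in range(L)) for n in range(len(cols[0]))]

scores = {a: jml_metric(z, a, L) for a in product((-1, 1), repeat=len(sent))}
best = min(scores.values())
ties = sorted(a for a, s in scores.items() if s - best < 1e-9)
print(ties)
```

The transmitted sequence attains a zero residual, but so does its negation, because both span the same range space. This is the simplest instance of sequences that are indistinguishable without knowledge of the channel.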
More generally, a DFE estimates the discrete input data by filtering the received and the previously detected signals.

[Figure 1.3: (a) Transfer function of zero-forcing LE (b) Another interpretation of ZF-LE (c) Derivation of ZF-DFE from ZF-LE]

In terms of mean square error (MSE) or BER, without any constraint on the equalizer structure, the zero-forcing equalizer is a suboptimal receiver. Unless $G(z)$ has zeros on the unit circle, a zero-forcing linear equalizer flattens the spectrum. Hence the cascade of the whitening filter, $1/(A_h^2 G^*(1/z^*))$, and the zero-forcing equalizer, $G^{-1}(z)$, gives the inverse of the folded spectrum, $S_h^{-1}(z)$. In the case of deep nulls, a ZF-LE performs poorly due to noise enhancement. Quantitatively, the noise variance goes up from $2N_0/A_h^2$ to $2N_0$ times the band average of $S_h^{-1}(e^{j\omega T})$. Another interpretation of the ZF-LE is depicted in Figure 1.3(b). The realizability of such an implementation is guaranteed by the causality of $G(z)$ unless $G(z)$ has one or more zeros on the unit circle. A zero of $G(z)$ on the unit circle leaves no margin for the phase and thereby causes instability. However, if the slicer is moved into the feedback loop as in Figure 1.3(c), the feedback is performed with clean, thresholded data. Note that the noise variance at the input of the slicer is the same as that of the WMF output. With the no-ISI constraint, this is the minimum noise variance achievable at the input of the slicer. It can easily be shown that a filter other than the whitening filter would increase this noise variance; thus the WMF is the optimal front-end structure for the ZF-DFE.
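The step from the ZF-LE to the ZF-DFE described above can be sketched in a few lines. The example assumes that the WMF output follows the monic causal model $z_k = a_k + \sum_{i \ge 1} g_i a_{k-i} + w_k$, that the data are BPSK, and that the feedback filter is exactly $G(z) - 1$ driven by past decisions. The response `g`, the data, and the fixed noise samples are hypothetical.

```python
def zf_dfe(z, g):
    """Zero-forcing DFE for a monic causal response g = [1, g_1, ..., g_m] (BPSK)."""
    decisions = []
    for k, zk in enumerate(z):
        # Feedback filter G(z) - 1 driven by past decisions (none before k = 0).
        isi = sum(g[i] * decisions[k - i] for i in range(1, len(g)) if k - i >= 0)
        decisions.append(1 if zk - isi > 0 else -1)   # slicer inside the loop
    return decisions

g = [1.0, 0.6, -0.3]                  # hypothetical monic minimum-phase response
sent = [1, -1, -1, 1, 1, -1, 1, 1]    # hypothetical BPSK data
noise = [0.1, -0.2, 0.05, 0.15, -0.1, 0.2, -0.05, 0.1]   # fixed small perturbation

# WMF output model: causal ISI plus noise.
z = [sum(g[i] * sent[k - i] for i in range(len(g)) if k - i >= 0) + noise[k]
     for k in range(len(sent))]

detected = zf_dfe(z, g)
print(detected)
```

As long as the past decisions are correct, the slicer input is $a_k + w_k$, so the noise is not enhanced. A single wrong decision would instead feed an incorrect ISI estimate back into later symbols, which this sketch does not attempt to model.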
By changing the objective function from zero forcing to mean square error minimization, the noise variance at the input of the slicer can be reduced for both linear and decision feedback equalizers. Relaxing the no-ISI condition, on the other hand, not only leaves some residual ISI but also produces biased decisions, i.e., $E(\hat{a} \mid a) \neq a$. The degradation due to the bias can be corrected either by scaling the slicer input or, equivalently, by shifting the slicer threshold values properly. For BPSK modulation, however, biased and unbiased MSE-DFE give the same performance because the slicer threshold is the same in both cases.

In the implementation of the MSE-DFE, the complexity of both the FFF and the FBF is constrained. Exploiting the second order statistics of the received data, finite length forward and backward filters can be composed such that the estimation error is orthogonal to the vector of filter inputs. The solution of the Wiener-Hopf equation produces a finite length MSE-DFE complying with the orthogonality principle. The coefficients of the MSE-DFE together with the mean square error can be found as follows

$$z_k = [x_k \cdots x_{k-N} \;\; a_m \cdots a_{m-M}] \tag{1.22}$$
$$R_{zz} = E(z_k^H z_k) \tag{1.23}$$
$$R_{za} = E(z_k^H a_{m+1}) \tag{1.24}$$
$$g = R_{zz}^{-1} R_{za} \tag{1.25}$$
$$J_{\min} = 1 - R_{za}^H g \tag{1.26}$$

where $N$ and $M$ represent the lengths of the FFF and FBF. The timing relation between $a_m$ and $x_k$^2 is adjusted to optimize $J_{\min}$ [2].

1.3 Thesis Outline

The rest of this thesis is organized as follows. In Chapter 2, we investigate the impact of training sequences on the BER performance of an MLSD receiver. We explore techniques which reduce the training sequences without degrading the performance. The main focus of Chapter 2 is an analysis of a coding strategy which optimizes the utilization of the overhead given for channel identification.
In Chapter 3, we examine the design space optimization of an adaptive, blind DFE and MLSD receiver for fiber optical channels. A baseband receiver has to recover the clock of the incoming signal with low jitter in order to initiate a reliable detection process. In Chapter 4, we analyze the rms jitter performance of a data-aided timing recovery algorithm which uses the decisions generated by an adaptive, blind DFE. In the last chapter, conclusions are drawn and directions for future research are given.

^2 Unless otherwise stated, $x_k$ and $x[k]$ are used interchangeably for the sampled received data.

Chapter 2
Training Codes

Data detection in channels with intersymbol interference (ISI) is a well-studied problem, with maximum likelihood sequence detection (MLSD) providing an optimal strategy if the channel is known at the receiver [20,21]. In many practical applications where MLSD is desirable, such as time-varying ISI channels in mobile radio systems, one must account for uncertainty in the channel. Most investigations and system designs fall into one of two extreme categories: (i) trained algorithms/systems which insert sufficient training signals to allow the receiver to reliably identify the channel, or (ii) blind algorithms/systems which attempt to identify the channel without the aid of training. The trained approach is a conservative design that allows relatively simple receiver processing, but sacrifices throughput or spectral efficiency. The blind approach is an aggressive design that relies on complicated receiver processing and may be susceptible to false-acquisition phenomena.

For trained systems, previous results have focused on the design of good training sequences [16,47].
Such sequences can, for example, be used to initialize a Viterbi algorithm to implement MLSD under the assumption that the channel estimate is accurate, or to initialize an adaptive approximation to MLSD, such as Per-Survivor Processing (PSP) based algorithms (e.g., [49]). Such algorithms may be able to reduce the required training overhead by tracking out both the initial channel estimation error and subsequent variations in the channel. Such approaches have been suggested for trained mobile radio systems such as GSM and IS-136 [37,48,49].

Improvement in throughput or spectral efficiency ostensibly gained by blind approaches may, in reality, be lost due to poor data detection performance (or lost data) during a "pull-in" period. This acquisition period is typically hundreds to thousands of symbols for traditional blind linear equalizer structures (e.g., [26,50]). There is a long history behind blind "equalization" and channel identification algorithms based on approaches such as higher order statistics (HOS) (e.g., [4,5,25]) and second order cyclostationarity (SOCS) (e.g., [23,57]). These algorithms require the collection of relatively large amounts of data and process this collection as blocks [29]. This limits their applicability for time-varying environments or for systems which use short data bursts without significant training (e.g., the Personal Access Communication System (PACS)).

Blind approaches based on joint maximum likelihood sequence and channel estimation (JML)^1 have shown promise for rapid blind acquisition. The JML has been shown to come tantalizingly close to reliable data detection shortly after blind start-up [8]. Strictly speaking, JML requires an exhaustive search process, so that its complexity increases exponentially with sequence length.
While the performance of the suboptimal approximations of this process has been shown to approach that of the known channel case for large block lengths [24], application to short burst communication has shown that misacquisitions are problematic [8]. Misacquisition for short packets is inherent in the JML optimality criterion. Thus, an important challenge has been to understand the structure of the exhaustive search space associated with JML.

^1 We use JML as a short abbreviation for Joint Maximum Likelihood Channel and Sequence Estimation.

In previous investigations this structure has been characterized by the ability to distinguish sequences in the absence of observation noise [8,28,53]. Specifically, it has been shown that some sequences cannot be distinguished during this exhaustive search procedure [8,28]. It was noted in [8] that some of these sequence pairs are indistinguishable for all channels (i.e., "equivalent" in the terminology of [8]) and other pairs are indistinguishable only for some channels. In [28], a sufficient condition for two sequences to be distinguishable for a large class of channels (i.e., "linearly independent channels" in the terminology of [28]) was developed. In summary, these results suggest that as the block length of the transmission increases, the number of problematic sequences appears to remain constant and therefore their ratio to the total number of sequences decreases exponentially. Intuitively, one may expect that if sequences that have identifiability problems are disallowed for transmission, reliable acquisition and low training overhead could be simultaneously achieved. This leads to the combined training/coding approach described below.

In this thesis, we consider the training to be part of the signal design problem that includes modulation and error correction coding.
For example, consider transmission of $N$ bits through an unknown channel. A trained system using $K < N$ bits of training may be considered a rate $(N-K)/N$ code. However, since this is typically accomplished with the segregated approach of sending $K$ consecutive training bits, followed by $(N-K)$ information-bearing bits, the code is actually a rate-zero code of block size $K$ followed by a rate-one code with block size $(N-K)$. The blind approach may be viewed as using a rate-one code. In this chapter we consider the issue of how one should select the $2^{(N-K)}$ possible $N$-bit sequences to best identify the channel and communicate the $(N-K)$ bits. We refer to such designs as training codes. The best approach is found to be a function of the values of $K$ and $N$ and the amount of complexity that one allocates to the receiver processing. We elaborate on this example and the related results in the literature below.

In Section 2.1, the notation together with traditional training sequence design are described. We compare the performance of various trained systems using Viterbi or generalized-PSP [8,11] receivers in Section 2.2. Different initialization methods are investigated for the generalized-PSP algorithms to explore the trade-off between receiver complexity and training overhead. In Section 2.3, we consider separability and pairwise error probability for JML algorithms. First, we give a necessary and sufficient condition for pairwise separability of sequences in the absence of noise which is simple to check numerically. This allows one to determine the minimum amount of redundancy required for an (exhaustive) JML algorithm to identify the channel without data errors in the absence of noise. Furthermore, we develop an appropriate distance measure for two sequences under the JML criterion in the presence of noise.
This distance measure is in terms of the noise variance and the principal angles between the range spaces of the data matrices. A basic problem emerging from the above distance/separability computations is to find the largest subsets of sequences that are reliably separated. This is a general code design (packing) problem under a minimum pairwise distance requirement. We formulate this problem as a clique problem on a graph and provide an optimal algorithm to solve it along with a greedy heuristic. These algorithms are used to design training codes for short packet communications (e.g., $N = 16$) which yield significant performance improvements relative to the traditional segregated training/modulation approach.

2.1 Traditional Training and JML Detection

The front-end filtering and sampling of the received signal were discussed in Section 1.1. The matrix representation of the received signal was given by (1.13). In the examples presented in this work, $a_k$ is drawn from $\{-1, +1\}$, which models binary phase shift keying (BPSK), but the development is applicable to arbitrary $M$-ary constellations. Except for the length of the impulse response^2, the continuous time channel is assumed to be unknown in this chapter, so that the model of (1.13) arises by some form of filtering and sampling of a continuous time observation. This may consist of an anti-aliasing filter or a pulse-matched filter, followed by a sampler and, possibly, a discrete-time noise whitening filter. In order for this conversion to obtain an approximate set of sufficient statistics, fractionally-spaced sampling is generally required. However, it has been shown that reasonable front-end processing with $N_s$ samples per symbol leads to a model in the form of (1.13) where the components of $z_k$, $w_k$, and $f$ are $N_s \times 1$ vectors [6,7,9].
For simplicity of the presentation and reduced simulation effort, we work with the $N_s = 1$ simplified version of the model. All results can be directly generalized to the oversampled case and the qualitative results will be similar. Furthermore, the primary issues addressed in this chapter are due to the structure of the data matrix, which is independent of the sampling rate^3.

^2 The approaches discussed throughout are robust to overestimation of $L$, but sensitive to underestimation of $L$.
^3 If one attempts to exploit the structure of $f$ induced by the known pulse shaping (i.e., not just a fractionally-spaced version of (1.13)), then over-sampling may alter qualitative conclusions.

A traditional approach to communicating over the channel is to first send a training sequence to estimate the channel $f$ and then use this channel estimate to perform MLSD under the assumption of perfect estimation. With perfect knowledge of the channel, MLSD can be implemented via the Viterbi algorithm [20]. Good training sequences are those that optimize an associated least squares (LS) channel estimator. The unbiased LS channel estimate and the associated LS error for a length $N$ training sequence are

$$\hat{f}_N = [A_N^T A_N]^{-1} A_N^T z_N \tag{2.1}$$
$$E\{\|\hat{f}_N - f\|^2\} = \sigma_w^2 \sum_{i=1}^{L} \frac{1}{\sigma_i^2} \geq \sigma_w^2 \frac{L}{N-L+1} \tag{2.2}$$

where $\sigma_i$ are the singular values of the training sequence^4 $A_N$ (assumed to be rank $L$) and $E\{\cdot\}$ denotes the ensemble average. The lower bound in (2.2) is obtained if and only if all singular values are equal. Because the training signal is typically drawn from the same modulation alphabet as the data, the lower bound in (2.2) may not be obtainable. Table 2.1 shows the variation of the estimation error variance with training sequence length $N$ and singular-value spread.
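The dependence of the LS error in (2.2) on the training sequence can be explored with a short brute-force sketch. It assumes a Toeplitz data matrix with N - L + 1 rows (an assumption about the form of (1.14)) and unit noise variance; the function names are hypothetical.

```python
import itertools
import numpy as np

def data_matrix(a, L):
    """(N-L+1) x L matrix whose row for time k is [a[k], a[k-1], ..., a[k-L+1]]."""
    a = np.asarray(a, dtype=float)
    return np.array([[a[k - j] for j in range(L)] for k in range(L - 1, len(a))])

def ls_error_variance(a, L, noise_var=1.0):
    """noise_var * sum_i 1/sigma_i^2 for a full-rank training matrix, else None."""
    A = data_matrix(a, L)
    if np.linalg.matrix_rank(A) < L:
        return None  # rank-deficient sequences cannot identify the channel
    s = np.linalg.svd(A, compute_uv=False)
    return noise_var * float(np.sum(1.0 / s**2))

def best_and_worst(N, L):
    """Search all BPSK sequences of length N and keep the full-rank ones."""
    values = [ls_error_variance(a, L) for a in itertools.product([-1, 1], repeat=N)]
    values = [v for v in values if v is not None]
    return min(values), max(values)

best, worst = best_and_worst(N=6, L=3)
lower_bound = 3 / (6 - 3 + 1)  # L / (N - L + 1) with unit noise variance
```

Under these assumptions the search over length-6 sequences for a three-tap channel gives a best value equal to the lower bound of 3/4 and a worst full-rank value of 1.25.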
Note that, for a three-tap channel, the error variance in (2.2) for the best sequence is reduced by 3 dB when increasing the training sequence length from 6 to 10 bits, while the corresponding reduction is 5.1 dB when increasing the training sequence length from 6 to 15 bits.

Training length   variance (worst)   variance (best)   lower bound   worst/best (dB)
6                 1.25               3/4               3/4           2.2
10                1.083              3/8               3/8           4.6
15                1.045              0.2333            3/13          6.5

Table 2.1: Effect of training sequence selection on channel estimation error variance for $L = 3$. Values shown for the worst full-rank sequence and the best training sequence.

^4 We will refer to matrices of the form in (1.14) as sequences since there is a one-to-one mapping between the sequence and matrix representation. Also, $a_k$ will denote the actual transmitted sequence, with hypothesized or conditional versions denoted by $A_k$ and/or $A_k^{(i)}$.

As discussed in Section 1.1, when the channel is unknown, the Viterbi algorithm is not applicable [9] and an exhaustive search of possible sequences is required to implement the JML receiver. Note that, even with training, the JML receiver should perform an exhaustive search of all possible sequences; the initial training sequence simply eliminates a subset of sequences from the search. With sufficient training, however, the channel is well estimated after the training and the performance difference between the exhaustive search and a Viterbi or PSP-based approach should be small.

Practical, forward-only approximations to the JML receiver can be based on PSP and its generalized version (e.g., [7,49]). Specifically, the JML metric in (1.20) can be computed recursively [13, eq. (3)], converting the problem into a tree search with per-sequence recursive LS (RLS) channel estimation. Any search algorithm can be applied to this exponentially growing tree.
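For very short blocks, the exhaustive search that the tree-search methods approximate can be written directly in batch form. The following is a minimal sketch, not the recursive computation of [13]; it assumes a Toeplitz data matrix with N - L + 1 rows, and the function names and the test sequence are hypothetical.

```python
import itertools
import numpy as np

def data_matrix(a, L):
    """(N-L+1) x L matrix whose row for time k is [a[k], a[k-1], ..., a[k-L+1]]."""
    a = np.asarray(a, dtype=float)
    return np.array([[a[k - j] for j in range(L)] for k in range(L - 1, len(a))])

def jml_metric(a, z, L):
    """Batch JML metric ||(I - P) z||^2, with P the projector onto range(A)."""
    A = data_matrix(a, L)
    P = A @ np.linalg.pinv(A)
    residual = z - P @ z
    return float(residual @ residual)

def jml_exhaustive(z, N, L):
    """Score every length-N BPSK sequence; cost grows as 2**N."""
    scored = [(jml_metric(a, z, L), a) for a in itertools.product([-1, 1], repeat=N)]
    scored.sort(key=lambda pair: pair[0])
    return scored

# Noise-free observation through the channel f = [1 2 1]^T / sqrt(6).
f = np.array([1.0, 2.0, 1.0]) / np.sqrt(6.0)
a_true = (1, 1, -1, 1, -1, -1, 1, 1)
a_flipped = tuple(-x for x in a_true)
z = data_matrix(a_true, 3) @ f
ranking = jml_exhaustive(z, N=8, L=3)
```

Both the transmitted sequence and its sign-flipped version give a zero metric here, since they span the same range space; this is the sign ambiguity that at least one bit of redundancy must resolve.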
We refer to the case when the Viterbi algorithm with $M^{L-1}$ states is applied as a suboptimal search strategy as PSP, and use the term generalized PSP (G-PSP) for other search algorithms. It was shown in [8] that significant improvements in blind acquisition performance can be obtained by increasing the tree-search complexity.

2.2 Overhead-Complexity Trade-off for Long Packets

In this section we compare the performance of several training and receiver processing strategies illustrated in Figure 2.1. In all cases, a burst of $N$ symbols is sent at the beginning of the packet to aid with channel identification, followed by data symbols and $L - 1$ tail symbols to terminate the channel state. We consider two types of training sequences: a traditional length $N$ sequence, and a split-training sequence with $K$ of the $N$ symbols fixed for training and the other $(N - K)$ freely used to convey data. Thus, the traditional approach is a special case of the split-training approach with $K = N$. Therefore, $K + L - 1$ of the $P$ symbols are overhead and $D = P - K - L + 1$ are data symbols.

Figure 2.1: Configuration of long-packet investigation, (a) the traditional approach and the split-training approach.
(b) Standard initialization for a PSP or Viterbi processor, (c) Exhaustive JML initialization up to time $Q$ for a generalized PSP receiver.

Three types of receivers are also considered: (a) the Viterbi algorithm with standard trained initialization, (b) the PSP algorithm with standard trained initialization, and (c) the PSP/G-PSP algorithm with split-training and exhaustive initialization. With standard trained initialization, the initial channel estimate and the initial trellis state are determined by the known, length $N$ training sequence. Receiver (a) uses this initial channel estimate throughout the packet, while receiver (b) updates this estimate in the standard PSP manner. In receiver (c), an exhaustive JML search is performed over the first $Q$ symbols to initialize a G-PSP receiver with $M^{L_t - 1}$ states, with $L_t \geq L$ and $Q \geq N$. This receiver considers different sequences during the initialization process and, for each state in the G-PSP trellis, selects the best of $M^{Q-K-L_t+1}$ sequences entering that state at time $Q + 1$.

Simulations were run to assess the relative effectiveness of these approaches. Unless otherwise stated, all simulations in this chapter were run under the condition that $\|f\| = 1$, $a_k \in \{-1, +1\}$, and $\sigma_w^2 = N_0/(2E_s) = N_0/(2RE_b)$, where $R$ is the rate of the system accounting for overhead and $E_b$ ($E_s$) is the energy per bit (symbol) accounting for this overhead.^5 This convention corresponds to the assumption of a constant received power and bit rate with an implicit bandwidth expansion of $R^{-1}$. From the above description, the rate is $R = D/P = (P - K - L + 1)/P$ for the current example.

^5 This differs from the case of, say, a unit norm channel with $a_k \in \{-\sqrt{E_s}, +\sqrt{E_s}\}$ and $\sigma_w^2 = N_0/2$ because of nonlinear effects. For example, the RLS step size used and the distance measure introduced are not invariant under scaling.

Care must be taken when making such rate comparisons over non-AWGN channels. For example, if the transmission bandwidth is doubled through the use of a rate 1/2 code, the length of the ISI, measured in terms of channel symbols, is also doubled.
In this chapter, we assume the length of the ISI remains fixed when the rate of the training code is varied. We adopt this convention mainly as a convenient way to include the effect of different overhead rates, and other conventions may be more appropriate based on the specific application. In general, this is a valid assumption if $R$ is nearly one. For rates significantly lower than unity, this assumption holds when the rate change is accomplished by some means other than bandwidth variation. Consider, for example, a TDMA-based system that uses 8 time slots, each with 20 bits of training and 40 bits of data. If one could effectively enable the use of 8 bits of training overhead for the same 40 data bits, then the eight 60-bit slots could be replaced by ten 48-bit slots. Thus, the reduction in training overhead yields a 25% increase in throughput or capacity without changing the channel bandwidth and thus the ISI channel.

Figure 2.2: The trade-off between training overhead and receiver complexity for relatively long packets. Simulation parameters: $f = [1\ 2\ 1]^T/\sqrt{6}$, $D = 60$ information bearing bits, and RLS forgetting factor $\rho = 0.9$. (Horizontal axis: $E_b/N_0$ (dB), including training loss. Curves: Viterbi (perfect CSI); Viterbi (K=6); Viterbi (K=15); Viterbi (K=25); PSP (4-state; K=6); G-PSP (64-state; K=6).)
Results are shown in Figures 2.2 and 2.3 for various systems simulated with $L = 3$, $f = [1\ 2\ 1]^T/\sqrt{6}$, and $D = 60$. For the systems of Figure 2.1(a), we simulated standard training and initialization with $K = N = 6$, 10 (not shown), 15, 20 (not shown), and 25. Larger training sequences provide better channel estimation, and thus BER, but the resulting overhead reduces the rate $R$. The results in Figure 2.2 suggest that, under this scenario, the best performance is achieved with $N = 15$. With the standard 4-state PSP receiver using standard training-initialization and an RLS forgetting factor of $\rho = 0.9$, a length 6 training sequence yields an improvement of approximately 0.5 dB in $E_b/N_0$ with respect to the best trained system using the Viterbi algorithm ($K = 15$). Thus, additional receiver complexity yields improved performance with less overhead.

The final performance curve in Figure 2.2 corresponds to the G-PSP approach with exhaustive initialization and split-training. Specifically, the split training format was $N = 10$ and $K = 6$, with 2 training bits followed by 4 data bits and another 4 training bits^6. The training bursts were designed separately according to the traditional methodology described in Section 2.1. A 64-state ($L_t = 7$) trellis was used for the G-PSP processor [8] with $Q = 16$. So, the observations $\{z_i\}$ up to time $Q = 16$ were used, and $2^{(N-K)+(Q-N)} = 2^{4+6} = 1024$ sequences were considered for the exhaustive JML initialization. The initial exhaustive search was done with batch-LS processing. One of $2^{10}/64 = 16$ sequences entering each state was selected according to the JML metric in (1.20). Compared to the system with standard training and PSP, this significant increase in complexity yields an improvement of approximately 0.5 dB. Note that this system performs within 0.5 dB of the perfect channel state information (CSI) case, and 0.4 dB of this degradation is irreducible due to the $K = 6$ overhead.
^6 These 4 bits are counted in the total of $D$ information bits.

Figure 2.3: Bit error rate versus packet location for split training and standard training. Simulation parameters: $f = [1\ 2\ 1]^T/\sqrt{6}$, $D = 60$ information bearing bits, RLS forgetting factor $\rho = 0.9$, and $E_s/N_0 = 7$ dB. (Horizontal axis: packet location $k$. Curves: PSP (4 states; standard training); G-PSP (64 states; split training); PSP (4 states; split training); known channel MLSD.)

Figure 2.3 shows the performance of the PSP and G-PSP systems described above versus packet location at $E_s/N_0 = 7$ dB. The PSP with split-training system is a standard 4-state PSP algorithm with the same $N = 10$, $K = 6$ split-training sequence used for the G-PSP scheme, but the initialization is simplified. Specifically, $Q = 10$ is used, so that the observations $\{z_i\}$ up to time $Q = 10$ were used to compute the JML metric of $2^{(N-K)} = 16$ sequences. Each of these 16 sequences terminates in the same state, so the best is selected to initialize the PSP state and channel estimate.

In [8], where no training was used and G-PSP algorithms were started blindly, increasing $L_t$ was found to improve the performance, but performance was relatively poor for the first 15-20 bits. Using training, of course, improves the performance at the beginning of the packet. However, using split-training is superior to traditional training for two reasons. First, detection of the $(N - K)$ bits between the training bursts can be performed reliably since the symbols on either side of this burst are known.
Second, assuming that the $2^{(N-K)} = 16$ different sequences can be reliably distinguished, the effective training length is $N = 10$. However, even for the split-training approach, the BER of the standard PSP is large for symbols at locations just after $N$. This is presumably caused by decision errors on the 4 bits in between the training bursts, which cause the PSP algorithm to be initialized with a poor channel estimate. In contrast, the G-PSP receiver that uses the same split-training format, but performs exhaustive search up to time 16 and uses 64 states, alleviates this effect and has approximately constant BER over the packet length.^7 This BER is slightly worse than the known channel MLSD performance due to the steady-state estimation error associated with PSP-RLS channel estimators.

^7 The final bits are more reliably detected due to the tail bits.

In this section, we have demonstrated that receiver complexity can be traded for training overhead and that different training schemes with the same overhead can yield different performance. In the next section, we formalize this optimization process and demonstrate it for short-packet systems.

2.3 Training Code Design Using Separability and Distance Conditions

In this section, we describe a simple check to determine whether a pair of sequences can be distinguished in the absence of noise under any channel. This can be used to determine the minimum amount of training required for a JML receiver. Similarly, for the case that includes observation noise, we develop a distance measure to be used for training code design.

Two approaches to characterize separability (i.e., without noise) have been developed independently by Gustafsson and Wahlberg [28] and Chugg [8]. In Appendix ?? we summarize and relate these two approaches. However, neither has all the desirable properties for training code design.
Before describing the separability check and distance, we describe our code design methodology and the type of design criteria that are most appropriate.

2.3.1 Code Design Methodology

We propose to design length $N$ training codes by starting with all possible $2^N$ sequences and discarding sequences until the largest set of sequences meeting a certain pairwise condition is obtained. We then select the largest integer $D$ such that $2^D$ is less than or equal to the size of this largest set, yielding an $(N, D)$ code. In the noise-free case, the pairwise condition is separability, while with noise, it is a minimum distance requirement. Toward this goal, the following properties are desirable for the condition used to discard sequences.

1. The condition should exploit the digital nature of the allowable transmitted sequences. For example, a condition that indicates that $A_k$ is a "bad" sequence based on potential confusion with $X_k$, an "analog" sequence, is not desirable because it will discard $A_k$ when it may be easily distinguished from all other digital sequences.

2. The condition should be a pairwise property. A condition that identifies $A_k$ as a bad sequence because it may be difficult to distinguish from one or several other sequences $\{A_k^{(i)}\}_i$ is not desirable. If $A_k$ is discarded from the code without the explicit indication of $\{A_k^{(i)}\}_i$, then all of these other sequences may be discarded as well when they are considered. This reduces the rate of the final code, whereas it would be preferable to maintain at least one representative member from the set.

3. The condition should be independent of the actual channel coefficients $f$. Robust designs should not exclude some class of channels.

4. The condition should be symmetric. If $A_k$ looks bad to $A'_k$, then $A'_k$ should look bad to $A_k$.

5. The condition should be relatively simple to check.
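The discard-until-consistent methodology above can be pictured with a simple greedy pass over the candidate sequences. This is a minimal sketch of one possible heuristic, not the optimal or greedy algorithm of Appendix C; the function names are hypothetical, and the Hamming-distance predicate is only a placeholder for a symmetric, channel-independent pairwise condition.

```python
import itertools

def greedy_code(candidates, compatible):
    """Keep each candidate that satisfies the pairwise condition with all kept codewords."""
    code = []
    for c in candidates:
        if all(compatible(c, kept) for kept in code):
            code.append(c)
    return code

def hamming_at_least_two(x, y):
    # Placeholder pairwise condition (NOT the separability or distance test of the thesis).
    return sum(xi != yi for xi, yi in zip(x, y)) >= 2

# All 2**N BPSK sequences for a toy block length N = 3.
candidates = list(itertools.product([-1, 1], repeat=3))
code = greedy_code(candidates, hamming_at_least_two)

# Largest integer D with 2**D <= len(code), giving an (N, D) code.
D = len(code).bit_length() - 1
```

A greedy pass depends on the order in which candidates are visited and generally returns a smaller set than an exhaustive search, which is the price paid for its low complexity.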
Given some condition with the above properties, we can formalize the code design problem as finding the largest complete subgraph (or clique) on an undirected, unweighted graph, i.e., the so-called clique problem [14]. Specifically, consider a graph representation of the code design problem with each of the $2^N$ potential codewords defining a vertex. An edge exists between two vertices if the associated sequences satisfy the condition. Due to the above properties, the resulting graph $G = (V, E)$ is undirected and unweighted. The design problem is to find the largest subset $C$ of the $|V| = 2^N$ vertices for which the pairwise condition is satisfied for each pair in $C$.

The general class of clique problems is known to be NP-hard [14]. However, in Appendix C we present an optimal algorithm which enumerates all cliques in an efficient manner when the maximum number of edges for any vertex is much smaller than $|V|$. We also describe a (suboptimal) greedy heuristic to approximate the solution to the clique problem for cases when the optimal algorithm is prohibitively complex.

2.3.2 Noise-Free Separability

It has been noted that there are sequences that cannot be separated from some other sequences by the JML receiver in (1.20) [8,28], even in the absence of noise. Specifically, we say that $A_k$ is separable or distinguishable from $A'_k$ under channel $f$ if, when $A_k$ is transmitted, $\Lambda(A'_k) > \Lambda(A_k)$, i.e., the correct sequence is decided by the rule in (1.20). The existence of indistinguishable sequences induces an error floor for JML, so that a BER of zero is not achieved even when $E_b/N_0 \to \infty$. Based on all evidence available [8,24,28,54], this error floor decreases rapidly with the length of the observation interval. Thus, an interesting issue is to characterize the minimum overhead required to reliably identify the channel when no noise is present.
For example, for BPSK signaling, at least one bit of redundancy is required to eliminate the sign ambiguity. More generally, addressing this issue requires a simple check for pairwise separability under any nonzero channel $f$. Specifically, we define two sequences to be pairwise separable if they are each separable from the other under any channel $f$.^8 A necessary and sufficient condition for pairwise separability is that the range spaces of $A^{(i)}$ and $A^{(j)}$ have dimension^9 $L$ and share only the origin, i.e., $\dim[\mathcal{R}(A^{(i)}) \cap \mathcal{R}(A^{(j)})] = 0$ [8]. As discussed in Appendix A, such a check is not available from the related previous research [8,28]. A simple test is provided by the following theorem.

^8 This was called "completely distinguishable" in [8].
^9 If the rank of $A^{(i)}$ was less than $L$, then there would be a nonzero $f$ such that $A^{(i)} f = 0$. Thus, when $A^{(i)}$ is transmitted with this channel and no noise, both metrics are zero and the two sequences cannot be distinguished.

Theorem 2.1 Two digital sequences $A^{(i)}$ and $A^{(j)}$ are pairwise separable if and only if the matrix $C = [A^{(i)}\ A^{(j)}]$ has rank $2L$.

Proof: If $C$ has rank $2L$, the set of $2L$ columns of $C$ must be linearly independent, which implies that the columns of $A^{(i)}$ and the columns of $A^{(j)}$ must be linearly independent sets. Thus, no point in $\mathcal{R}(A^{(i)})$ other than the zero vector can be expressed in terms of the columns of $A^{(j)}$, and vice versa. So, if $C$ has rank $2L$, then $\dim[\mathcal{R}(A^{(i)}) \cap \mathcal{R}(A^{(j)})] = 0$ and the two sequences are pairwise separable. To show the converse, since the ranges are disjoint, excluding the zero vector, the dimension of their union (i.e., the range space of $C$) must be equal to the sum of their dimensions, namely $2L$.

2.3.3 Example of Noise-Free Design Methodology

As an example of the applicability of the above result, consider transmission of a BPSK sequence of length $N = 8$ through an $L = 3$ channel.
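The rank test of Theorem 2.1 is straightforward to check numerically. The sketch below assumes a Toeplitz data matrix with N - L + 1 rows (an assumption about the form of (1.14)); the function names and the example sequence are hypothetical.

```python
import numpy as np

def data_matrix(a, L):
    """(N-L+1) x L matrix whose row for time k is [a[k], a[k-1], ..., a[k-L+1]]."""
    a = np.asarray(a, dtype=float)
    return np.array([[a[k - j] for j in range(L)] for k in range(L - 1, len(a))])

def pairwise_separable(a_i, a_j, L):
    """Theorem 2.1 rank test: C = [A_i A_j] must have rank 2L."""
    C = np.hstack([data_matrix(a_i, L), data_matrix(a_j, L)])
    return bool(np.linalg.matrix_rank(C) == 2 * L)

# A length-8 BPSK sequence and its sign-flipped version for an L = 3 channel.
a = (1, 1, -1, 1, -1, -1, 1, 1)
neg_a = tuple(-x for x in a)
sign_flip_separable = pairwise_separable(a, neg_a, L=3)  # False: same range space
```

A sequence and its negative always fail the test because their data matrices span the same range space, which is the sign ambiguity noted above. Applying `pairwise_separable` to every pair of candidate sequences yields the edges of the graph used in the clique formulation.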
First, consider the approach where only 1 bit of overhead is used to remove the sign ambiguity by fixing the first bit to 1. Considering 400 different channels equally spaced on the unit hemisphere (footnote 10), the BER associated with this technique is 0.04 conditioned on the "best" sequence being sent and as high as 0.5 conditioned on the "worst" sequences. Thus, even in the absence of noise, a significant floor for the average BER of approximately 0.3 is reached because not all of the $2^7 = 128$ sequences are pairwise separable. The optimal clique algorithm described in Appendix C was run on a graph with 128 vertices (footnote 11) to determine the maximum number of sequences that can be decoded without any error in the absence of noise. Two vertices were connected if the associated sequences were pairwise separable. The maximum training code size consisting of all pairwise separable sequences was determined to be 8. Moreover, 882 training codes of size $2^D = 8$ were found. Since the largest pairwise separable training code has $2^3 = 8$ codewords, $(N - D) = 5$ bits of overhead are required to uniquely determine which one of the length 8 sequences is sent without any restriction on the channel coefficients. In the absence of noise such a system will perform without error (i.e., the error floor has been removed). The BER in the presence of noise, however, can be expected to vary across the 882 pairwise separable codes. We simulated several of the 882 possible $R = 3/8$ training codes in the presence of noise and, although the results are not shown here, we observed variations in performance of up to 1 dB in SNR.

Footnote 10: Henceforth referred to as "all channels" when describing simulation results.

Footnote 11: One bit location was fixed to reduce the number of vertices from 256 to 128.

2.3.4 A Pairwise Distance for JMLSE

Next, we introduce a distance criterion for JML that accounts for AWGN.
The distance is related to the conditional second-order statistics of the random variable

$$r_k(i,j) = \Lambda_k(A_k^{(j)}; z_k) - \Lambda_k(A_k^{(i)}; z_k) = z_k^T \left(P_k^{(i)} - P_k^{(j)}\right) z_k \tag{2.3}$$

In a pairwise JML decision between $A_k^{(i)}$ and $A_k^{(j)}$, the former is selected if and only if $r_k(i,j) > 0$. In Appendix B, it is shown that, conditioned on $A_k^{(i)}$ being sent and a particular $f$, $r_k(i,j)$ can be represented as a difference of two sums, each sum consisting of $L$ mutually independent non-central chi-square random variables. It follows that the conditional probability density function (pdf) of $r_k(i,j)$ is the convolution of $2L$ densities of the $\chi^2$ form. Therefore, for a given channel $f$ and $A_k^{(i)}$, the pairwise error probability with $A_k^{(j)}$ can be found by numerical integration of this pdf. A reasonable distance between $A_k^{(i)}$ and $A_k^{(j)}$ could be defined based on such pairwise error probabilities. However, this is numerically intensive, and the result would still have to be made independent of the channel by averaging over all channels or considering the worst case. We instead define a distance that is relatively simple to compute and whose minimum value over unit norm channels is easily found. In Appendix B the conditional mean and variance of $r_k(i,j)$ are shown to be

$$m_r(A_k^{(j)} | A_k^{(i)}; f) = E\{ r_k(i,j) \,|\, A_k^{(i)}; f \} = \sum_{l=1}^{L} c_l^2 \sin^2\theta_l \tag{2.4}$$

$$\sigma_r^2(A_k^{(j)} | A_k^{(i)}; f) = \mathrm{var}\{ r_k(i,j) \,|\, A_k^{(i)}; f \} = 4\sigma^4 \sum_{l=1}^{L} \sin^2\theta_l + 4\sigma^2\, m_r(A_k^{(j)} | A_k^{(i)}; f) \tag{2.5}$$

where $\{\theta_l\}_{l=1}^{L}$ is the set of principal angles [27] between the range spaces of $A_k^{(i)}$ and $A_k^{(j)}$, and the $c_l$ are the elements of the coefficient vector of $A_k^{(i)} f$ represented with respect to a particular basis (see Appendix B). Using the exact expressions in (2.4)-(2.5), we will propose a pairwise distance measure based on the value of $m_r(\cdot)/\sigma_r(\cdot)$.
To motivate this distance, consider that if $r_k(i,j)$ were Gaussian, the pairwise probability of error would be $Q(m_r(\cdot)/\sigma_r(\cdot))$. Thus, intuitively, a large value of $m_r(\cdot)/\sigma_r(\cdot)$ should reduce the conditional pairwise error probability. To illustrate this further, the conditional pairwise error rate (PER) was simulated for a pair of separable sequences as the channel traced the unit hemisphere. For all channels of length 3, two sequences of length 8 were sent $4 \times 10^5$ times through an AWGN channel with $\sigma^2 = 0.1$. Figure 2.4 shows the simulation results of the PER versus $m_r(\cdot)/\sigma_r(\cdot)$. Also shown is the function $Q(m_r(\cdot)/\sigma_r(\cdot))$, which was empirically found to be a good approximate upper bound for the conditional PER. We performed similar experiments for 20 different sequence pairs of length 8, 9, and 10 at channel lengths 3 and 4. In all cases similar results were obtained; $m_r(\cdot)/\sigma_r(\cdot)$ characterized the PER and the Q-function provided a good approximate upper bound. Although we have not shown this to be a valid upper bound, some intuitive justification is provided by the fact that the pdf of the decision statistic is asymmetric around its mean, with the heavier tail in the direction away from the decision boundary.

[Figure 2.4, panels (a) and (b): conditional PER versus $m_r(\cdot)/\sigma_r(\cdot)$ in dB.]

Figure 2.4: The conditional PER performance of JMLSE for 400 3-tap channels selected on the unit hemisphere as a function of $m_r(\cdot)/\sigma_r(\cdot)$. Shown under the condition of either sequence transmitted and $\sigma^2 = 0.1$. The empirical upper bound $Q(m_r(\cdot)/\sigma_r(\cdot))$ is also shown.

The distance that we suggest is obtained by finding the smallest value of $m_r(\cdot)/\sigma_r(\cdot)$ considering all unit norm channels and either $A_k^{(i)}$ or $A_k^{(j)}$ as the transmitted sequence.
It can be shown that $m_r(\cdot)/\sigma_r(\cdot)$ is an increasing function of $m_r(\cdot)$, so that the minimum value of $m_r(\cdot)/\sigma_r(\cdot)$ is achieved at the minimum value of $m_r(\cdot)$. While a simple lower bound on $m_r(\cdot)$ (see (B.23) in Appendix B) could be used to define a distance criterion, it is relatively simple to obtain the explicit minimum value of $m_r(\cdot)/\sigma_r(\cdot)$ for a unit norm channel. Let $\lambda_{\min}^{(i)}$ be the smallest eigenvalue of $A_k^{(i)T} (P_k^{(i)} - P_k^{(j)}) A_k^{(i)}$. This leads to the minimum value of $m_r(\cdot)/\sigma_r(\cdot)$ and the minimum conditional distance

$$\min_{f:\|f\|=1} \frac{2\, m_r(A_k^{(j)} | A_k^{(i)}; f)}{\sigma_r(A_k^{(j)} | A_k^{(i)}; f)} = \frac{\lambda_{\min}^{(i)}}{\sigma \sqrt{\sigma^2 \sum_{l=1}^{L} \sin^2\theta_l + \lambda_{\min}^{(i)}}} \tag{2.6}$$

Finally, we define the distance as

$$d(A_k^{(i)}, A_k^{(j)}) = \min\left\{ \min_{f:\|f\|=1} \frac{2\, m_r(A_k^{(j)} | A_k^{(i)}; f)}{\sigma_r(A_k^{(j)} | A_k^{(i)}; f)},\; \min_{f:\|f\|=1} \frac{2\, m_r(A_k^{(i)} | A_k^{(j)}; f)}{\sigma_r(A_k^{(i)} | A_k^{(j)}; f)} \right\} \tag{2.7}$$

so that the approximate upper bound for the pairwise error is $Q(d/2)$. The distance in (2.7) is an approximate minimax criterion because the minimum value of $m_r(\cdot)/\sigma_r(\cdot)$ approximately determines the maximum error probability. Note that the distance is a function of $\lambda_{\min}/\sigma^2$, which may be viewed as a measure of SNR for the quadratic detector. Computation of the distance in (2.7) requires the following:

• Computation of the minimum eigenvalues of $A_k^{(i)T} (P_k^{(i)} - P_k^{(j)}) A_k^{(i)}$ and $A_k^{(j)T} (P_k^{(j)} - P_k^{(i)}) A_k^{(j)}$, denoted $\lambda_{\min}^{(i)}$ and $\lambda_{\min}^{(j)}$, respectively. This determines both conditional means.

• Determination of an orthonormal basis for $\mathcal{R}(A_k^{(i)})$ and for $\mathcal{R}(A_k^{(j)})$. This can be computed, for example, by performing a QR-decomposition on $A_k^{(i)}$ and $A_k^{(j)}$, respectively [27].

• A singular value decomposition of an $(L \times L)$ matrix of inner products between the basis vectors obtained from the above step. This yields the principal angles between the range spaces.

• Use of the above eigenvalues, eigenvectors, and principal angles to compute the distance via (2.6)-(2.7).
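The two claims used above, monotonicity in $m_r$ and the closed form of the minimum, can be checked with a short derivation. It is a sketch under the assumption of real-valued AWGN with variance $\sigma^2$, for which the conditional variance takes the form $\sigma_r^2 = 4\sigma^4 S + 4\sigma^2 m_r$ with $S = \sum_{l=1}^{L} \sin^2\theta_l$; the shorthand $S$ and $g$ is introduced here for illustration only.

```latex
% Monotonicity: with S fixed, write g(m) = m / sqrt(4 sigma^4 S + 4 sigma^2 m).
\[
  g'(m) \;=\; \frac{4\sigma^4 S + 2\sigma^2 m}{\left(4\sigma^4 S + 4\sigma^2 m\right)^{3/2}} \;>\; 0
  \qquad (m \ge 0,\ S \ge 0,\ \text{not both zero}),
\]
% so the ratio is smallest where m_r is smallest.
% Minimum of m_r over unit-norm channels: since P^{(i)} A^{(i)} = A^{(i)},
\[
  m_r \;=\; f^T A_k^{(i)T}\!\left(P_k^{(i)} - P_k^{(j)}\right) A_k^{(i)} f
  \;\;\ge\;\; \lambda_{\min}^{(i)} \quad \text{for } \|f\| = 1 .
\]
% Substituting m_r = lambda_min into 2 m_r / sigma_r:
\[
  \frac{2\lambda_{\min}^{(i)}}{\sqrt{4\sigma^4 S + 4\sigma^2 \lambda_{\min}^{(i)}}}
  \;=\; \frac{\lambda_{\min}^{(i)}}{\sigma\sqrt{\sigma^2 S + \lambda_{\min}^{(i)}}}
  \;=\; \frac{\lambda_{\min}^{(i)}/\sigma^2}{\sqrt{S + \lambda_{\min}^{(i)}/\sigma^2}} .
\]
```

The last form makes explicit that, for given principal angles, the conditional distance depends on the sequences and the noise only through the ratio $\lambda_{\min}^{(i)}/\sigma^2$.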
2.3.5 Example of Noisy Design Methodology

We consider a simple example to illustrate the appropriateness of this distance. First, we design two sets of length $N = 8$ sequences, each of size three, for an $L = 3$ channel. These two sets are designed to be good or bad with respect to a known channel receiver or a JML receiver. Specifically, for a known channel $f$, the pairwise error is monotonically decreasing in $\|(A_k^{(i)} - A_k^{(j)}) f\|$. Thus, analogous to the distance in (2.7), for the known channel case we define a known-channel distance that is the minimum of $\|(A_k^{(i)} - A_k^{(j)}) f\|/(2\sigma^2)$ over all unit norm channels, which is simply the minimum singular value of the matrix $A_k^{(i)} - A_k^{(j)}$ divided by $2\sigma^2$. Set A was designed by setting the minimum value of the distance in (2.7) between any pair to 2.19 and the maximum known channel distance to 3.7. Similarly, Set B was found by setting the maximum distance in (2.7) between any pair to 1.6 and the minimum known channel distance to 4. All distances were computed at $\sigma^2 = 0.1$. The optimal clique algorithm was run on a graph of 128 vertices to determine the sets. The maximum clique size for each set was found to be 3, and each set contains three sequences. Figure 2.5 shows the average CER obtained by simulation of the JML receiver for the unknown channel case and the ML receiver for the known channel case. As expected, Set A is superior in the unknown channel case and inferior in the known channel case. This further validates the distance in (2.7) and distinguishes the approach from known channel signal design.

[Figure 2.5: average CER versus $E_b/N_0$ (dB), including effects of overhead, for Sets A and B; curves for the unknown channel and known channel cases.]

Figure 2.5: The CER for two sets of 3 sequences averaged over all 3-tap channels.
Set A was designed to have good unknown channel performance and Set B was designed to have good known-channel performance.

As another example, the distance criterion given in (2.7) was used to design a training-coded system for a packet length of $N = 16$. A codebook of size $2^7 = 128$ was found by running the greedy clique algorithm on a graph with $2^{15}$ vertices. Thus, a (7,16) training code was designed. Each sequence was sent 1000 times over all channels, with the results summarized in Figure 2.6. Unless otherwise specified, decoding was performed using JML via exhaustive search. The performance of this combined coding and training system is compared with two traditional training systems. Trained system A uses the 9 bit overhead as a training sequence at the beginning of the packet. The training sequence was selected as one of the optimal training sequences as described in Section 2.1. This segregated design performs very poorly as compared to the combined design due to the two bits at the end of the packet. A second segregated design was considered which uses 7 bits of training at the beginning of the packet, with the last two symbols of the packet set to [1 1] to terminate the channel state in a known manner. This trained system B performs much better than the first segregated design, but there is a 2.8 dB loss in SNR relative to the performance of the combined design at a CER value of $10^{-3}$.

[Figure 2.6: CER versus $E_b/N_0$ (dB), including effects of overhead; curves for Trained A, Trained B, Trained B (PSP), Training-code, and CSI.]

Figure 2.6: Comparison of segregated and combined training and modulation. All systems use a packet length of $N = 16$, with the trained and training-code systems conveying 7 information bits per packet. Performance is averaged over all 3-tap channels.
The performance of trained system B with segregated processing is also shown. Specifically, a standard (4-state) PSP receiver initialized by the 7 bit training sequence and terminated by the tail bits shows only slight degradation relative to the JML receiver in this case. Also shown in Figure 2.6 is the performance of a known channel MLSD receiver averaged over all channels. This curve was generated by sending $N = 16$ bits with the first and last two bits specified to initialize and terminate the channel state, respectively. The performance of the system using the training code is similar to that of the perfect CSI system despite the SNR penalty due to the overhead increase (i.e., 2.3 dB). This suggests that, in principle, one could obtain coding gain relative to an uncoded, known ISI channel while jointly estimating the channel.

Chapter 3

Optimization of Equalization Techniques for Fiber Optical Systems

Optical fiber channels offer a wider bandwidth than other wired systems. In a long-haul link, optical signals traverse long distances with very low attenuation. For today's bandwidth-demanding, low-power wired system applications, optical fiber channels are the ideal medium for digital communication. However, like any other signal traveling in some environment, optical signals have limitations. As optical signals progress from transmitter to receiver, the quality of the signal degrades with distance. The power of the signal attenuates, the pulse shape of the signal spreads, and various sources of noise corrupt the signal. Attenuation is primarily caused by the scattering and the absorption of light.
Three major types of noise contaminate the information bearing signal: shot noise, $N_{shot}$; thermal noise, $N_{ther}$; and amplified spontaneous emission (ASE) noise, where ASE noise can be further categorized into spontaneous-spontaneous beat noise, $N_{sp-sp}$, and signal-spontaneous beat noise, $N_{s-sp}$. With forward error correction (FEC) methods, the performance degradation due to the several noise sources can be corrected to some extent. Even though block codes can correct very long bursts of errors, these codes perform at their best when the transmission pipe is memoryless. However, polarization mode dispersion (PMD) and chromatic dispersion (CD) add memory to the transmission channel. PMD arises because the two orthogonal principal states of polarization of the signal propagate with different velocities, which makes successive symbols smear into each other. CD, on the other hand, is caused by the difference in the velocity of the spectral components of the pulse. As the bit rate increases, dispersion becomes more severe and thus poses a significant limitation on the distance between the repeaters, unless compensated.

Due to the importance of dispersion in optical channels, there have been many papers proposing various compensation methods [39,40,45,60-62]. These methods can be classified into two groups: optical methods and electrical methods. For example, CD is usually compensated by dispersion compensating fiber (DCF). It is also possible to compensate PMD optically by employing polarization controllers and time delay devices [45,61,62]. Even though optical methods might perform slightly better than electrical methods, optical compensators are much more expensive and bulky than integrated electrical equalizers.
Moreover, electrical equalizers can easily be adapted to changing channel conditions, whereas optical equalizers lack this flexibility. Adaptability is a necessary feature due to the random time varying nature of the PMD. For wavelength division multiplexed (WDM) systems, one viable solution is to deploy one optical equalizer for all channels and to cancel the residual ISI in each channel with electrical equalizers.

Among the electrical equalizers, MLSD yields the best performance. In fact, MLSD is the optimal decoding method as long as the applicability rules are satisfied [9]. At the front end of the MLSD, a matched filter (MF) provides the sufficient statistics at the baud rate. However, due to the unknown ISI, the exact continuous time matched filter cannot be implemented. By fractionally sampling the received signal, the MF can also be constructed in the discrete domain. While this oversampling process helps timing recovery, it increases the complexity of the dispersion compensation. Another possible solution is to use a lowpass filter with a bandwidth (BW) of at least half of the bit rate. This suboptimal solution results in a negligible SNR loss, and therefore could be considered for less complex implementations. For low to moderate dispersion, DFE provides similar performance to that of MLSD. Since both DFE and MLSD share the same whitening MF, DFE can be viewed as approximating MLSD with zero decoding delay.

The goal of this chapter is to investigate two promising electrical equalizer candidates, MLSD and DFE, in terms of their applicability to optical dispersion compensation. Since it is crucial to understand the mechanism that necessitates equalization for fiber optical systems, we first examine the channel model in Section 3.1.
Once the distinctive characteristics of fiber optical channels are presented in Section 3.1, we exploit these characteristics to optimize the performance of MLSD in a rather simple way in Section 3.2. We devote Section 3.3 to the modifications of conventional DFE to improve the complexity and BER performance of the equalization process. A semiring structure is introduced in Appendix D to study the parallel implementation of both MLSD and DFE under a unified framework.

3.1 Optical Channel Model

In this section, a transmission model is developed for the received signal at the input of the equalizer. The modulation scheme for the information-bearing signal is usually on-off keying (OOK), frequency shift keying (FSK) or differential phase shift keying (DPSK). As seen in Figure 3.1, there might be periodically placed optical amplifiers on the optical fiber line to boost the signal level and to compensate the span loss. Erbium-Doped Fiber Amplifiers (EDFA) are the optical amplifiers most commonly used in practice. Due to the characteristics of their gain spectrum, these amplifiers are almost transparent to the number of channels, data rate and data format.

The incoming optical signal at the receiver is detected either directly or coherently. In coherent detection, an extra local oscillator signal is used to provide larger dynamic range and increased signal sensitivity compared to direct detection [35]. However, the advantages of the coherent detection technique come at a cost of complexity.
For optical to electrical conversion, photodetectors are used due in large part to their fast response, low noise and small size. PIN (positive-intrinsic-negative polarity) photodetectors, for example, have been designed with a modulation bandwidth of over 100 GHz [33]. Depending on the application, avalanche photodetectors (APD) can also be used instead of PINs because of their better noise sensitivity [42]. In this chapter, we are primarily concerned with the transmission and the reception of a single OOK modulated channel through a single-mode fiber detected with a PIN photodetector. The block diagram of the system is given in Figure 3.1.

[Figure 3.1 block diagram: input data, modulator, optical fiber with optical amplifiers, low pass filter, sampler, equalizer.]

Figure 3.1: Simplified transmission and reception model for optical fiber system

Photodetectors generate filtered impulses at the arrival times of photons. Since arrival times are random, the received signal can be modeled by a filtered Poisson process

$$y(t) = \sum_k u(t - t_k) \tag{3.1}$$

$$m_y(t) = \lambda(t) * u(t) \tag{3.2}$$

$$\sigma_y^2(t) = \lambda(t) * u^2(t) \tag{3.3}$$

where the $t_k$ are random arrival times and $\lambda(t)$ is the time-varying incoming photon rate. Knowing that the energy of a single photon is $h\nu$, where $h$ is Planck's constant ($6.6 \times 10^{-34}$ Joule-sec) and $\nu$ is the optical frequency, the average arrival rate, $\lambda_{ave}$, can be found as

$$\lambda_{ave} = \frac{P_s}{h\nu B} \tag{3.4}$$

where $P_s$ is the average optical power of the incident light and $B$ is the bandwidth of the OOK modulated signal. In Figure 3.2(a), $\lambda(t)$ is shown for the transmitted sequence '101'. To demonstrate the random nature of the photodetector output, a realization of $y(t)$ corresponding to $\lambda(t)$ in Figure 3.2(a) is given in Figure 3.2(b). A finite extinction ratio is assumed due to the dark current in the laser-off state. The pulse shape, $u(t)$, is chosen to be rectangular.

[Figure 3.2, panels (a) and (b), plotted versus time.]

Figure 3.2: Realization of $y(t)$ corresponding to $\lambda(t)$

The continuous-time process $y(t)$ is also known as shot noise. Comparing the moment generating function of $y(t)$ with that of a Gaussian process for large $\lambda$, we can easily show that shot noise can be approximated by Gaussian noise. For a shot noise limited system, the detection process is a hypothesis-testing problem. Since the noise distributions differ for the 'on' and 'off' states, care must be taken to model these distributions accurately. Of the four fundamental noise components, $N_{ther}$ and $N_{sp-sp}$ are independent of the data value; however, the powers of $N_{shot}$ and $N_{s-sp}$ are different for 0 and 1 data values. Therefore optimal detection requires the optimization of the decision threshold. Assuming that the summation of all noise sources generates Gaussian random variables with means $m_0$ and $m_1$ and variances $\sigma_0^2$ and $\sigma_1^2$ for 0 and 1, respectively, the optimal decision threshold can be found by minimizing the probability of error $P_e$ with respect to the threshold $x_{th}$

$$P_e = \Pr(0) \int_{x_{th}}^{\infty} f(x|0)\,dx + \Pr(1) \int_{-\infty}^{x_{th}} f(x|1)\,dx \tag{3.5}$$

For equal a priori probabilities, the optimal threshold is achieved when the two conditional pdfs are equal.
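The equal-density condition can be illustrated numerically. The sketch below assumes Gaussian conditional densities with $m_0 < m_1$ and locates the crossing point between the two means by bisection on the log-densities; the numerical values and the function names are hypothetical and chosen only for illustration.

```python
import math

def log_gauss(x, m, s):
    """Log of a Gaussian pdf with mean m and std s (constant term dropped)."""
    return -math.log(s) - (x - m) ** 2 / (2.0 * s * s)

def equal_density_threshold(m0, s0, m1, s1, iters=200):
    """Bisection for f(x|0) = f(x|1) on the interval [m0, m1], assuming m0 < m1."""
    lo, hi = m0, m1
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if log_gauss(mid, m0, s0) > log_gauss(mid, m1, s1):
            lo = mid  # still on the side where hypothesis 0 is more likely
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical levels: the 'on' state is noisier than the 'off' state.
x_th = equal_density_threshold(0.0, 0.1, 1.0, 0.2)
```

With unequal variances the crossing point moves away from the midpoint toward the level with the smaller variance, which is why a fixed mid-level threshold is generally suboptimal for signal-dependent noise.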
$$f(x|0) = \frac{1}{\sqrt{2\pi}\,\sigma_0} e^{-\frac{(x-m_0)^2}{2\sigma_0^2}}, \qquad f(x|1) = \frac{1}{\sqrt{2\pi}\,\sigma_1} e^{-\frac{(x-m_1)^2}{2\sigma_1^2}} \tag{3.6}$$

$$f(x_{th}|0) = f(x_{th}|1) \;\Rightarrow\; x_{th} = \frac{m_0\sigma_1^2 - m_1\sigma_0^2}{\sigma_1^2 - \sigma_0^2} + \frac{\sigma_1\sigma_0\left[(m_1 - m_0)^2 + 2(\sigma_1^2 - \sigma_0^2)\ln\frac{\sigma_1}{\sigma_0}\right]^{1/2}}{\sigma_1^2 - \sigma_0^2} \tag{3.7}$$

Note that the decision threshold given by (3.7) is optimal as long as the Gaussian assumption is valid. This threshold value might deviate considerably from the optimal one when the distribution function is different from Gaussian [30]. In practice, as the exponential terms are more dominant than the linear terms, a much simpler expression can be obtained by equating only the exponents in (3.6).

$$x_{th} = \frac{m_0\sigma_1 + m_1\sigma_0}{\sigma_1 + \sigma_0} \tag{3.8}$$

To show the relative magnitudes of the four noise sources, we consider a direct detection receiver with a front-end EDFA. The operating wavelength, $\lambda_{op}$, is chosen as 1.55 µm, where the overall effect of the high frequency attenuation, Rayleigh scattering, and the low frequency attenuation, infrared absorption, is minimized. Other parameters are given as follows:

$R_b$ (Data rate) = 10 Gb/s
$B_e$ (Electrical bandwidth) = 8 GHz
$B_o$ (Optical bandwidth) =
$P_{in}$ (Signal power) = 1 µW
$e$ (Electronic charge) = $1.60 \times 10^{-19}$ C
$\Re$ (Responsivity of photodetector) = 0.8 A/W
$n_{sp}$ (Spontaneous emission factor) = 2
$\eta_{ther}$ (Power spectral density of thermal noise) = 20 pA/Hz
$\eta_{in}$ (Amplifier input coupling efficiency) = 1
$\eta_{out}$ (Amplifier output coupling efficiency) = 1
$L_o$ (Optical loss between amplifier and the receiver) = 1
$T$ (Bit duration) = 100 ps

We can now find the power of each noise component [48]

$$N_{shot} = 2 B_e e\, \eta_{out} L_o \left( \Re P_{in} G \eta_{in} + \Re (G - 1) n_{sp} h\nu B_o \right) \tag{3.9}$$

$$N_{s-sp} = 4 \Re^2 P_{in} L_o^2 \eta_{in} \eta_{out}^2 G (G - 1) h\nu\, n_{sp} B_e \tag{3.10}$$
$$N_{sp-sp} = \left( \Re (G - 1) \eta_{out} n_{sp} L_o h\nu \right)^2 B_e (2B_o - B_e) \tag{3.11}$$

$$N_{ther} = \eta_{ther} B_e \tag{3.12}$$

From (3.10) and (3.11), we can infer that the power of $N_{s-sp}$ is independent of the optical bandwidth, whereas the power of $N_{sp-sp}$ increases with the optical bandwidth. In Figure 3.3 we compare the noise powers as a function of optical bandwidth. As seen in the figure, the $N_{s-sp}$ noise is the most dominant noise source in the system.

[Figure 3.3: noise power (log scale) versus optical bandwidth (M).]

Figure 3.3: Variation of $N_{shot}$, $N_{ther}$, $N_{s-sp}$ and $N_{sp-sp}$ as a function of optical bandwidth in terms of data rate, $M = B_o/R_b$

Dispersion corrupts the basic two-hump structure of the received signal histogram. Since optical sources typically have nonzero spectral width, $\Delta\lambda$, and the speed of propagation of the optical signal changes with frequency, the transmitted pulse broadens by as much as $D_\lambda \Delta\lambda L_e$, where $D_\lambda$ is the linear delay coefficient and $L_e$ is the fiber length. For example, for $\Delta\lambda = 0.2$ nm at $\lambda_{op} = 1.551$ µm, CD spans five symbols after 150 km at the OC-192 rate. As a result, the histogram of the symbol spaced received data will have many humps. The frequency response of CD for a fiber of length $L_e$ is given by [49]

$$H_c(f) = e^{j\alpha f^2} \quad \text{where} \quad \alpha = \pi D_\lambda \frac{\lambda^2}{c} L_e \tag{3.13}$$

An optical allpass filter with a prescribed phase matching the phase of the dispersion completely compensates CD [1]. PMD can be described by the differential group delay (dgd) together with the power split ratio. Figure 3.4 illustrates how the dgd, $\Delta\tau_g$, between the two polarization modes of the received optical signal broadens the pulse shape. In this regard, the effects of PMD are similar to those of CD. However, PMD is a random, slowly time varying quantity, whereas CD is almost deterministic.

[Figure 3.4: two polarization modes separated by the differential group delay, with the power split ratio indicated.]

Figure 3.4: PMD Model

Even though CD and first order PMD are linear in the optical domain, after the photodetector with direct detection the overall channel becomes
Even though CD and first order PMD are linear in the optical domain, after the photodetector with direct detection the overall channel becomes 48 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without perm ission. nonlinear. Nonlinear channels can be expressed in terms of Voiterra Series [52]. In this chapter, we use the first order approximation of Voiterra series, x(t) = aih(t — iT) + n(t) (3.14) i where a* £ A = { + 1 ,-1 } is the input data sequence, h(t — iT) is the overall impulse response and n(t) is the total noise. The modulation scheme is OOK, nevertheless we have subtracted its mean from the output of noise-limiting lowpass filter in Figure 3.1 and assumed th at the resultant signal is generated by bipolar input data. In Figure 3.5 we compare the overall time domain response of no dispersion, 50 ps and 100 ps dgd, and 2500 ps/nm CD channels after the lowpass filter. After averaging, the power spectral density of each of the channel can expressed as \H (f)\2/ T where H(f) is the Fourier Transform of h(t). These Fourier Transforms are given in Figure 3.11 As expected, while the pulse shape broadens, the spectrum shrinks. Note th at since the 100 ps dgd channel has a null at 5 GHz, symbol spaced discrete time representation will have a zero on the unit circuit which makes it hard to equalize using linear methods. It also bears mentioning th at the spectrum of the nonzero mean signal exhibits spikes at the multiples of the bit rate. These spikes should not be confused with the discrete lines observed on the spectrum analyzer due to the periodicity of Pseudo-random binary sequence (PRBS), often used to generate test data patterns. 
In the following sections we work on the discrete-time baud spaced channel model, where the sampling time is estimated by an algorithm based on that of Mueller and Müller [43] (MM). Convolution of the independent identically distributed (iid) input sequence $a_i$, with each bit taking the values in $\mathcal{A}$ with equal probability, with the finite support channel can be represented by the multiplication of the $(L \times 1)$ channel vector $h$ by the Toeplitz $(k \times L)$ data matrix $A_k$ as

$$x_k = A_k h + n_k \tag{3.15}$$

$$h = [h_L \cdots h_1]^T \tag{3.16}$$

$$a_i = [a_{i-L+1} \cdots a_i]^T \tag{3.17}$$

$$A_k = [a_k \cdots a_1]^T \tag{3.18}$$

where $(\cdot)^T$ denotes transposition. Note that we have arrived at the same channel model used in the previous chapter. Thus the techniques applied in the previous chapter can be used for optical channels.

[Figure 3.5: overall channel response versus time (x10 ps); legend: no dispersion, 50 ps dgd, 100 ps dgd, 2500 ps/nm CD.]

Figure 3.5: Time domain representation of overall channel response for no dispersion, 50 ps, 100 ps dgd PMD and 2000 ps/nm CD cases

[Figure 3.6: power spectral density versus frequency (GHz); legend: no dispersion, 50 ps dgd, 100 ps dgd, 2500 ps/nm CD.]

Figure 3.6: Power spectral density of no dispersion, 50 ps, 100 ps dgd PMD and 2000 ps/nm CD channels

3.2 Maximum Likelihood Sequence Detection for Fiber Optical Systems

In this section, starting with the definition of the optimal sequence in the sense of the generalized likelihood ratio, we first propose a suboptimal method which acquires the channel without any misacquisitions. Second, we consider the problem of noise variance dependence on the received signal and its impact on the branch metric computation in MLSD.
In general, to identify the channel characteristics of any transmission medium, training sequences are added to the information payload. Those sequences are known both at the transmitter and the receiver, and thus do not carry any information. Even when there is no training sequence, some side information can help in the identification of the channel coefficients. For instance, each digital wrapper frame for the optical transport network (OTN) has a 6 byte fixed pattern for frame alignment [OTN]. Besides determining the beginning of an OTN frame, these sequences can also be used as training sequences. Yet it is advantageous to decouple any interface protocol like SONET, SDH or OTN from the detection process to have a modular structure. Therefore detection algorithms that only work on the received signal, without any help from other layers, are preferable. Such algorithms, so-called blind algorithms, provide an uninterrupted, bandwidth efficient, lower complexity solution for detection.

The major problem in any blind equalization process is the phase ambiguity of the channel. When the channel is unknown, MLSD is an exhaustive data sequence search algorithm [9]. For AWGN $n_k$, a simple metric is checked exhaustively for candidate input sequences. Specifically, the metric given in (3.19) is minimized over all possible data matrices $A_k$ in order to find the optimal data matrix $\hat{A}_k$

$$\Lambda(A_k; x_k) = \left\| (I - A_k A_k^{\dagger}) x_k \right\|^2 \tag{3.19}$$

where $A_k^{\dagger}$ is the pseudo-inverse of $A_k$. Note that even in the noiseless case, there are some sequences which cannot be distinguished from some others with the metric given above [8,9]. However, as the sequence length increases, the probability of such cases becomes negligible. In its original form, (3.19) cannot be implemented.
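The metric in (3.19) and the ambiguity it leaves can be sketched for a toy packet. The example assumes a data matrix with one row $(a_{i-L+1}, \ldots, a_i)$ per received sample, computes the projection residual by Gram-Schmidt orthogonalization of the columns instead of forming the pseudo-inverse explicitly, and uses the two-tap channel [0.47 0.83] quoted later in this section; the sequence, packet length and function names are hypothetical.

```python
from itertools import product

def data_matrix(seq, L):
    """Toeplitz data matrix (assumed convention): one row (a[i-L+1..i]) per i."""
    return [list(seq[i - L + 1:i + 1]) for i in range(L - 1, len(seq))]

def jml_metric(seq, x, L, tol=1e-9):
    """||(I - A A^+) x||^2, via Gram-Schmidt on the columns of A."""
    A = data_matrix(seq, L)
    basis = []
    for j in range(L):
        v = [float(row[j]) for row in A]
        for q in basis:
            dot = sum(vi * qi for vi, qi in zip(v, q))
            v = [vi - dot * qi for vi, qi in zip(v, q)]
        norm = sum(vi * vi for vi in v) ** 0.5
        if norm > tol:  # skip columns that are linearly dependent
            basis.append([vi / norm for vi in v])
    resid = [float(xi) for xi in x]
    for q in basis:
        dot = sum(ri * qi for ri, qi in zip(resid, q))
        resid = [ri - dot * qi for ri, qi in zip(resid, q)]
    return sum(r * r for r in resid)

L = 2
h = [0.47, 0.83]                      # channel vector ordered as in (3.16)
sent = (1, -1, -1, 1, 1, -1)          # hypothetical BPSK packet
x = [sum(a * c for a, c in zip(row, h)) for row in data_matrix(sent, L)]  # noise-free

flipped = tuple(-a for a in sent)
best = min(product((1, -1), repeat=len(sent)), key=lambda s: jml_metric(s, x, L))
```

Both `sent` and `flipped` drive the metric to zero (up to rounding), so the exhaustive search cannot resolve the sign; `best` is simply whichever zero-metric candidate is enumerated first. The search also visits all $2^N$ candidates, which illustrates why the metric is impractical in this form.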
Per-survivor processing (PSP) approximates MLSD by embedding the estimation of unknown parameters into the Viterbi algorithm [8,15]. The unknown parameters, in our case the channel coefficients, can be estimated by the least-mean-square (LMS) algorithm [59] using the tentative decisions from trellis decoding. Let us denote the state, error, state transition metric and state metric by $\mu_k$, $e$, $\Lambda$ and $\Gamma$, respectively, where

$$\mu_k = (a_{k-L+1}, \ldots, a_{k-1}) \tag{3.20}$$
$$e(\mu_k \to \mu_{k+1}) = x_k - \mathbf{h}^T(\mu_k)\,\mathbf{a}_k(\mu_k \to \mu_{k+1}) \tag{3.21}$$
$$\Lambda(\mu_k \to \mu_{k+1}) = |e(\mu_k \to \mu_{k+1})|^2 \tag{3.22}$$
$$\Gamma(\mu_{k+1}) = \min_{\mu_k} \left[\Gamma(\mu_k) + \Lambda(\mu_k \to \mu_{k+1})\right] \tag{3.23}$$

Then the channel coefficients are updated as

$$\mathbf{h}(\mu_{k+1}) = \mathbf{h}(\mu_k) + \beta\, e(\mu_k \to \mu_{k+1})\,\mathbf{a}_k(\mu_k \to \mu_{k+1}) \tag{3.24}$$

where $\beta$ is the step size of the LMS algorithm. Taking into consideration that the error is a function of the channel coefficients, it can be shown that $\beta$ determines the bandwidth of the one-pole filter given by (3.24).

Like other blind equalization methods, blind PSP suffers from the phase ambiguity of the channel. In fact, in blind PSP there is an unavoidable possibility that the channel estimator converges to a shifted version of the actual channel [8]:

$$\mathbf{h}^{(i)} = [\underbrace{0 \cdots 0}_{i\ \text{zeros}}\ h_L \cdots h_{i+1}] \tag{3.25}$$
$$\mathbf{h}^{(-i)} = [h_{L-i} \cdots h_1\ \underbrace{0 \cdots 0}_{i\ \text{zeros}}] \tag{3.26}$$

We simulated the channel acquisition characteristics of blind PSP for the 80 ps dgd channel at 11 dB OSNR. The actual channel coefficients are computed offline as [0.47 0.83]. The channel taps are initialized to the zero vector at the beginning of the PSP algorithm. Out of 50 independent runs of a block of 5000 symbols, 21 converged to the shifted version of the channel, [0.83 0]. As suggested in [8], if the bandwidth of the LMS filter is increased (i.e., $\beta$ is increased), noise might help the PSP algorithm jump to the actual channel from the shifted one. Figure 3.7 illustrates an example of such a case.
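The error and update equations (3.21) and (3.24) can be sketched in isolation. The example below is a simplification under stated assumptions, not PSP itself: the true data symbols are substituted for the per-survivor tentative decisions, so there is a single "survivor" and no phase ambiguity. The data are noise-free, the step size 0.05 is an arbitrary choice, and the function name `lms_channel_estimate` is hypothetical.

```python
import random

def lms_channel_estimate(x, a, L, beta):
    """Run the LMS recursion with known symbols a and received samples x."""
    h = [0.0] * L  # zero initialization, as in the simulation described above
    for k in range(L - 1, len(a)):
        a_vec = a[k - L + 1:k + 1]                            # [a_{k-L+1}, ..., a_k]
        e = x[k] - sum(hi * ai for hi, ai in zip(h, a_vec))   # error, as in (3.21)
        h = [hi + beta * e * ai for hi, ai in zip(h, a_vec)]  # update, as in (3.24)
    return h

rng = random.Random(0)
h_true = [0.47, 0.83]  # stored as [h_2, h_1], matching h = [h_L ... h_1]
a = [rng.choice((-1, 1)) for _ in range(2000)]
# x[0] is a placeholder; the recursion starts at k = L - 1 = 1.
x = [0.0] + [h_true[0] * a[k - 1] + h_true[1] * a[k] for k in range(1, len(a))]
h_hat = lms_channel_estimate(x, a, L=2, beta=0.05)
```

With known data the estimate settles on the actual taps. The shifted solutions in (3.25) and (3.26) arise only in the blind case, where the decisions themselves depend on the current channel estimate.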
One drawback of using a wide-bandwidth LMS filter is the increased excess mean-square error, which has an adverse impact on BER performance. By overestimating the channel length as $2L - 1$ and using a center tap initialization, the convergence of the PSP algorithm to a shifted version of the channel can be prevented. However, considering the exponential growth of the complexity of the VA with the channel length, this approach is not feasible.

Figure 3.7: Effect of step size on the acquisition of PSP algorithm for 80 ps dgd PMD channel at 11 dB OSNR. With a large step size, the PSP algorithm acquires the actual channel from its shifted version. This switch occurs at around t = 2100. (Curves: 1st tap, 2nd tap; horizontal axis: time, x100 ps)

Our proposal, staggered-phase blind acquisition (SPBA), addresses the phase ambiguity problem by using a sequence of initial conditions for the channel taps. The main idea behind SPBA is to initialize the channel coefficients with a phase as close as possible to the phase of the actual channel. The algorithm can be described as follows:

1. Form a set of overlapping data blocks of length $64 + 4L$. [The display listing the sample indices of the overlapping blocks is illegible in this copy.]

2. Initialize the
   • 1st block with $\mathbf{h}_1 = [1\ 0 \cdots 0]$
   • 2nd block with $\mathbf{h}_2 = [0\ 1 \cdots 0]$
   • ...
   • $L$th block with $\mathbf{h}_L = [0\ 0 \cdots 1]$

3. Once the $i$th block ($i \le L$) is decoded using the VA, update the $(i + L)$th block coefficients using the LMS algorithm. In the update operation, we can either use all 64 errors or choose a subset of these 64 errors in a round-robin manner to reduce the number of operations.

4. Let $\Gamma_i$ denote the minimum metric at the end of the trellis decoding of the $i$th block. Update the selection metric, $s_i$, of the $(i + L)$th block using a one-pole lowpass filter

$$s_i = (1 - \rho)\,s_i + \rho\,\Gamma_i \tag{3.27}$$

   where all $s_i$'s are initially set to zero.

5. Repeat steps 3 and 4 for $m \times L$ successive blocks. Here $m$ is a design parameter that depends on the noise variance, the step size $\beta$, and $\rho$. It has to be set so that the $s_i$'s are no longer changing.

6. Find the index $i$ which has the minimum selection metric, then set the channel coefficients to $\mathbf{h}_i$.

7. Continue updating $\mathbf{h}_i$ for the following blocks to track the channel variations.

Figure 3.8: SPBA acquires the actual channel coefficients through the selection metric. (a) For 2500 ps/nm CD channel, 2nd initialization gives the minimum selection metric. (b) For 80 ps PMD channel 1st initialization gives the minimum metric. (Horizontal axis: block number)

Figures 3.8(a) and 3.8(b) depict the convergence curves of the $s_i$'s for the 2500 ps/nm CD and 80 ps dgd channels at 11 dB OSNR, respectively. As seen in these figures, the final values of the $s_i$'s depend on the initial channel taps. Depending on the phase difference between the actual channel and the initial channel coefficients, SPBA converges to some shifted version of the actual channel. However, at least one of the initializations makes SPBA acquire the actual channel (i.e., zero shift). We tested SPBA for various PMD and CD files and observed in all cases that the initialization set partitions the phase space sufficiently. Table 3.1 shows the final values of the channel estimates for each of the channel initializations together with their selection metrics for two typical dispersion channels.
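The bookkeeping of steps 2, 4 and 6 can be sketched as follows. This is only an illustration of the staggered initialization and of the selection-metric filter (3.27); the Viterbi decoding and the LMS tap update of step 3 are not modelled. The per-block trellis metrics are hypothetical constants, chosen equal to the CD-channel selection metrics listed in Table 3.1, and the value of the smoothing constant is an arbitrary assumption.

```python
def staggered_initializations(L):
    """Step 2: one unit-vector tap initialization per phase."""
    return [[1.0 if j == i else 0.0 for j in range(L)] for i in range(L)]

def update_selection_metric(s_i, gamma_i, rho):
    """Step 4: one-pole lowpass filter, as in (3.27)."""
    return (1.0 - rho) * s_i + rho * gamma_i

def select_phase(block_metrics, L, rho):
    """block_metrics[b] is the final trellis metric of block b.

    Block b inherits from block b - L, so it belongs to chain b % L.
    Returns the zero-based index of the winning chain and all metrics.
    """
    s = [0.0] * L  # all selection metrics start at zero
    for b, gamma in enumerate(block_metrics):
        i = b % L
        s[i] = update_selection_metric(s[i], gamma, rho)
    best = min(range(L), key=lambda i: s[i])  # step 6
    return best, s

# Hypothetical input: every chain keeps reporting the same final metric.
per_chain = [6.5, 4.1, 4.6, 5.4]
metrics = per_chain * 200          # m = 200 passes over L = 4 chains
best, s = select_phase(metrics, L=4, rho=0.05)
```

Here `best` is the zero-based index 1, i.e. the 2nd initialization, in line with Figure 3.8(a). In a real receiver the metrics fluctuate from block to block, which is why $m$ must be large enough for the filtered values to settle.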
As implied in Table 3.1, the final channel estimate of the initialization that has the minimum selection metric after $m \times L$ blocks has clearly converged to the actual channel.

Table 3.1: Results of SPBA algorithm for 2500 ps/nm and 80 ps DGD channels at 11 dB OSNR.

CD 2500 ps/nm at 11 dB OSNR, actual channel = [0.22 0.91 0.12 -0.11]

    i    Initial estimate    Final values                 s_i
    1    [1 0 0 0]           [0.91 0.13 -0.10 0.00]       6.5
    2    [0 1 0 0]           [0.22 0.91 0.13 -0.11]       4.1
    3    [0 0 1 0]           [-0.06 0.22 0.91 0.12]       4.6
    4    [0 0 0 1]           [0.00 -0.06 0.22 0.91]       5.4

PMD 80 ps dgd at 11 dB OSNR, actual channel = [0.47 0.83]

    i    Initial estimate    Final values     s_i
    1    [0 1]               [0.86 0.00]      13.4
    2    [1 0]               [0.45 0.85]      4.1

When the noise is AWGN and the only unknown parameters in the detection process are the channel coefficients, the VA with the Euclidean metric provides the minimum sequence error probability after an accurate estimation of the channel. The SPBA algorithm yields the same performance as the VA. However, as already mentioned in Section 3.1, optical fiber system noise is non-Gaussian and input dependent. A modification of SPBA that takes the non-Gaussian, input-dependent characteristics of the noise into account is described later in this section.

The proposed algorithm takes advantage of the slowly varying nature of the optical channel by assuming that the channel coefficients are fixed for each block, as opposed to updating these coefficients for each state transition as in the PSP algorithm. Therefore SPBA requires much less computation than PSP. Hence it is an attractive choice for the application at hand, which favors extreme simplicity.

As a state transition metric, we used the Euclidean metric in SPBA. Under the assumption that $n_k$ is a white Gaussian random sequence with a fixed variance irrespective of $(a_{k-1}, \cdots, a_{k-L})$, this metric is the optimal metric.
However, as shown in Figure 3.3, $n_k$ consists of various kinds of noise, among which $N_{s\text{-}sp}$ or $N_{sp\text{-}sp}$ is the most dominant, depending on the input data. Moreover, $N_{s\text{-}sp}$ is chi-squared distributed rather than Gaussian. The Euclidean metric, therefore, might yield a poor BER unless modified. With the whiteness assumption for $n_k$ still satisfied, the optimal metrics are the conditional probabilities $f(x_k \mid \mathbf{a}_k)$.

Figure 3.9: (a) Conditional and (b) unconditional probability density functions of $x_k$ for 80 ps dgd channel in the logarithm domain

Figures 3.9(a) and 3.9(b) show the conditional and unconditional probability density functions (pdf) of $x_k$ for the 80 ps dgd channel in the logarithm domain, respectively. The conditional pdf can be obtained from the SPBA algorithm by a simple counting process after the right channel is selected. For example, for a 4-bit quantized $x_k$, a 16 x 4 look-up table is formed using the decisions from the Viterbi detector along with their corresponding $x_k$'s. Once enough information is gathered so that the conditional pdf reliably provides the state transition metric, the Viterbi algorithm switches from the Gaussian-based metric computation to the conditional-pdf-based one. Even after switching to the conditional-pdf-based mode, the conditional pdfs can be updated to track the channel variations.

Figure 3.10: 3-Phase process to achieve the BER performance of MLSD (Phase I: acquisition, Phase II: CP lookup table, Phase III: update)
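The counting process that builds the conditional-pdf lookup table can be sketched as below. This is a hedged illustration only: it assumes a two-tap channel, so that a branch is identified by the pair of decisions $(a_{k-1}, a_k)$, a 4-bit quantizer producing integers 0 to 15, and add-one smoothing so that empty bins still get a finite metric. The smoothing, the function name `build_lut` and the sample data are assumptions, not taken from the text.

```python
import math
from collections import defaultdict

LEVELS = 16  # 4-bit quantized received sample

def build_lut(samples, decisions):
    """Count quantized samples per branch and return -log probabilities.

    samples[k]   : quantized received value x_k (integer 0..15)
    decisions[k] : detected symbol a_k (+1 or -1) from the Viterbi detector
    """
    counts = defaultdict(lambda: [0] * LEVELS)
    for k in range(1, len(samples)):
        branch = (decisions[k - 1], decisions[k])
        counts[branch][samples[k]] += 1
    lut = {}
    for branch, hist in counts.items():
        total = sum(hist) + LEVELS  # add-one smoothing (assumption)
        lut[branch] = [-math.log((c + 1) / total) for c in hist]
    return lut

# Hypothetical decisions and quantized samples for a short block.
decisions = [1, 1, -1, 1, 1, -1, -1, 1, 1, 1]
samples = [12, 13, 7, 6, 13, 7, 2, 6, 12, 13]
lut = build_lut(samples, decisions)
```

Each of the four branches gets a 16-entry column, which gives the 16 x 4 table mentioned above. The entry `lut[branch][x]` can then replace the squared Euclidean distance as the branch metric, and the counts can keep being refreshed to track channel variations.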
The detection operation can thus be summarized as a three-phase process: channel acquisition, conditional pdf lookup table generation, and conditional pdf lookup table updating. Figure 3.10 shows the expected BER in each phase of the detection process for a fixed dispersive channel. During Phase I, the BER starts to go down as the means of the humps are acquired. As soon as we enter Phase II, a sudden drop is observed in the BER since only the right channel is used in the Euclidean metric. We experience another drop in BER in Phase III due to the utilization of the optimal metric.

3.3 DFE for Fiber Optical Systems

Compared to the VA, the DFE offers a low-complexity solution with some degradation in performance. Especially for channels where dispersion spans many symbols, the VA is prohibitively complex for a 10 Gb/s system unless state reduction techniques are used. Since the FFF portion of the DFE does not include a recursion, it may be chosen long enough to mitigate the dispersion, particularly that due to the phase distortion, which extends over many symbols. The coefficients of the MSE-DFE together with the mean square error can be found as follows:

$$\mathbf{z}_k = [x_k \cdots x_{k-N+1}\ a_m \cdots a_{m-M+1}] \tag{3.28}$$
$$\mathbf{R}_{zz} = E(\mathbf{z}_k^T \mathbf{z}_k) \tag{3.29}$$
$$\mathbf{R}_{za} = E(\mathbf{z}_k^T a_{m+1}) \tag{3.30}$$
$$\mathbf{g} = \mathbf{R}_{zz}^{-1}\mathbf{R}_{za} \tag{3.31}$$
$$J_{\min} = 1 - \mathbf{R}_{za}^T \mathbf{g} \tag{3.32}$$

where $N$ and $M$ represent the lengths of the FFF and FBF. The FFF delay (i.e., the timing relation between $a_m$ and $x_k$) is adjusted to optimize $J_{\min}$ [2]. Since in general this timing relation causes a high correlation between some of the FFF and FBF inputs, the condition number of $\mathbf{R}_{zz}$, i.e., the ratio of its largest eigenvalue to its smallest eigenvalue, is very large.
For example, for the 2000 ps/nm CD channel at 16 dB OSNR, an 8+3 DFE that optimizes $J_{\min}$ has a condition number of 97 and a $J_{\min}$ of 0.079, whereas the timing relation that minimizes the condition number has a $J_{\min}$ of 0.1 and a condition number of 5.6.

The coefficients of the DFE can be either computed indirectly through the channel coefficients, $\mathbf{h}$, or adjusted directly using LMS or RLS. The major advantage of the indirect method over the direct one is its fast acquisition capability. Once the channel coefficients are estimated, the Wiener solution of the DFE coefficients can be obtained by applying various techniques [2,18]. However, since we consider a blind equalization scheme, the channel coefficients should be identified blindly. Such a requirement rules out the indirect method due to the prohibitive complexity of the two-step detection process: blind channel identification and channel equalization. The RLS update in the direct method also provides fast acquisition of the DFE coefficients owing to its independence from the condition number of the overall correlation matrix of the FFF and FBF inputs. Yet the RLS is model dependent, computationally complex and not numerically robust [51]. In fact, considering the slowly time-varying nature of the optical channel and the complexity constraint of the equalization method to be implemented, the direct method with LMS update is the most feasible choice.

Further simplifications are also possible in the utilization of the LMS algorithm. For instance, the LMS algorithm might be replaced with one of its signed variants. However, one has to be careful about the gradient misalignment problem of both the sign-sign and signed regressor LMS, which can cause the coefficients to diverge in some cases [55,56]. For the simplification of the update operation, signed error LMS thus stands as a safe and better alternative.
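The signed error variant keeps the regressor intact and replaces the error by its sign, so each tap update reduces to adding or subtracting a scaled input sample. The sketch below shows this on a toy identification problem rather than on a DFE: the target response, the step size and the helper names are hypothetical, and the data are noise-free.

```python
import random

def sign(v):
    """Return +1, -1 or 0 according to the sign of v."""
    return (v > 0) - (v < 0)

def signed_error_lms(inputs, desired, n_taps, mu):
    """Adapt n_taps coefficients using only the sign of the error."""
    w = [0.0] * n_taps
    for k in range(n_taps - 1, len(inputs)):
        u = inputs[k - n_taps + 1:k + 1]
        e = desired[k] - sum(wi * ui for wi, ui in zip(w, u))
        s = sign(e)  # only the sign of the error enters the update
        w = [wi + mu * s * ui for wi, ui in zip(w, u)]
    return w

rng = random.Random(1)
w_true = [0.5, -0.25]  # hypothetical target response
u = [rng.choice((-1, 1)) for _ in range(5000)]
# d[0] is a placeholder; adaptation starts at k = n_taps - 1 = 1.
d = [0.0] + [w_true[0] * u[k - 1] + w_true[1] * u[k] for k in range(1, len(u))]
w_hat = signed_error_lms(u, d, n_taps=2, mu=0.002)
```

Because every update has a fixed magnitude, the coefficients do not settle exactly but hover within a few step sizes of the target. This is the price paid for removing the multiplication by the error.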
Additionally, block implementation of signed error LMS simplifies the update operation further. Simplification of the adaptation algorithm is especially important in the discrete-time analog implementation of the DFE in terms of power consumption.

Figure 3.11: Mean, sampling time, FFF and FBF coefficients, and slicer threshold adaptation of DFE

In the digital implementation of the DFE, parallelism and pipelining are the two critical stages of the design. Due to the recursion in the DFE, parallelism requires a special mapping algorithm, which is described in Appendix D. A simplified digital implementation of the adaptive DFE is illustrated in Figure 3.11. In the block diagram, the adaptation of three parameters is emphasized: filter coefficients, decision threshold and sampling time.

3.3.1 Mean Adaptation

A block of 64 symbols is processed simultaneously at every clock, with a clock frequency of 156 MHz. At every 4th clock, a 20-bit mean register is updated using 4 x 64 input data, $x_q[k]$. The input is assumed to be quantized into 4 bits. A digital one-pole filter with a power-of-two step size is employed for mean adaptation:

$$x_s[n] = \sum_{k=0}^{63} x_q[64n + k] \tag{3.33}$$
$$m_n = \cdots \tag{3.34}$$

[The exponent of the step size and the right-hand side of (3.34) are illegible in this copy.]

Since biased estimates are sources of drift problems in stochastic adaptation algorithms, the 7 most significant bits (MSBs) of the 20-bit register are used as the mean estimate of the 4-bit input data. In Figure 3.12 we show the convergence of the 7-bit mean value starting from an initial estimate of 32. In practice, since the adaptation of the mean value is independent of the adaptation of the other parameters, the mean can be estimated in the start-up mode and then tracked afterwards together with the other parameters.
Figure 3.12: Convergence of the input mean, $m_n$, for 70 ps dgd PMD channel at 14 dB OSNR (horizontal axis: time, 64 x 100 ps)

After the mean subtraction, 64 parallel paths generate the FFF outputs with a pipelining delay of 5 clocks:

$$\mathbf{X}_q[n] = \begin{bmatrix} x_q[64n] & \cdots & x_q[64n+7] \\ \vdots & & \vdots \\ x_q[64n+63] & \cdots & x_q[64n+70] \end{bmatrix} \tag{3.35}$$
$$\mathbf{FFF}[n] = [\mathit{fff}_0[n] \cdots \mathit{fff}_7[n]]^T \tag{3.36}$$
$$\mathbf{x}_f[n+5] = \mathbf{X}_q[n]\,\mathbf{FFF}[n] \tag{3.37}$$

For 7-bit FFF coefficients, the FFF output is a 17-bit quantity. The complexity of the implementation is reduced significantly for the subsequent computations by selecting only 9 bits of this 17-bit quantity. Once the target decision values are set to ±64, the proper 9-bit value can be selected by shifting the FFF output value 4 times to the right and clipping the resultant value if it is greater than 255 or less than -256.

Figure 3.13: (a) Convergence of feedforward coefficients and (b) convergence of feedback coefficients for 70 ps dgd PMD channel at 14 dB OSNR. (Horizontal axis: time, 64 x 100 ps)

3.3.2 Equalizer Coefficient Adaptation

In the previous section the pipelining and the parallel processing are formulated in terms of the timing index. From now on, we drop the timing indexes to ease the notation. The slicer input, $x_0$, is obtained by subtracting the FBF output from the FFF output. Once the data $a$ is detected as

$$a = \begin{cases} 1, & x_0 > \text{threshold} \\ -1, & x_0 < \text{threshold} \end{cases} \tag{3.38}$$

the error value is computed as a 7-bit quantity:

$$e = \begin{cases} 63, & \lfloor (64a - x_0)/2 \rfloor > 63 \\ -64, & \lfloor (64a - x_0)/2 \rfloor < -64 \\ \lfloor (64a - x_0)/2 \rfloor, & \text{else} \end{cases} \tag{3.39}$$

The corresponding FFF and FBF input values are multiplied by the error to generate the coefficient update values. Then the coefficients are updated with a proper step size.
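The fixed-point steps above translate almost directly into integer arithmetic. The following sketch assumes that all quantities are signed integers and that the right shift is arithmetic (floor toward minus infinity), which is how Python's `>>` and `//` behave; the function names are hypothetical. Equation (3.38) does not say what happens when the slicer input equals the threshold, so the sketch maps that case to -1 as an explicit assumption.

```python
def select_9_bits(fff_out):
    """Shift the 17-bit FFF output right by 4 and clip to the 9-bit range."""
    shifted = fff_out >> 4  # arithmetic shift
    return max(-256, min(255, shifted))

def slicer(x0, threshold=0):
    """Decision as in (3.38); a tie is mapped to -1 (assumption)."""
    return 1 if x0 > threshold else -1

def error_7bit(a, x0):
    """Error as in (3.39): floor((64a - x0)/2), clipped to [-64, 63]."""
    raw = (64 * a - x0) // 2  # floor division
    return max(-64, min(63, raw))

# A slicer input slightly below the +64 target gives a small positive error.
x0 = select_9_bits(1000)      # 1000 >> 4 = 62
a = slicer(x0)
e = error_7bit(a, x0)         # (64 - 62) // 2 = 1
```

Halving the error before clipping lets the 7-bit register cover slicer inputs roughly twice as far from the ±64 targets as it could otherwise.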
In Figure 3.13, the convergence of the FFF and FBF coefficients is depicted for the 70 ps dgd PMD channel while the mean, sampling phase and slicer threshold are also adapting. In digital implementations of adaptive filters, the limit cycle oscillations due to round-off noise become more severe as the pipelining delay increases. Therefore, as the pipelining delay increases, the step size of the LMS algorithm has to be reduced in order to prevent instability.

Figure 3.14: Flowchart of the sampling phase adaptation algorithm.

3.3.3 Sampling Phase Adaptation

The MM-based timing recovery algorithm is applied to acquire and track the sampling phase of the baud-rate DFE. First, a timing function whose zero occurs at the desired sampling phase is generated. Center tap initialization is used for the FFF coefficients, i.e., FFF[0] = [0 0 0 32 0 0 0 0]. Let us assume that $[x_q[k] \cdots x_q[k+7]]$ generated the decision $a_m$; then the timing function is $a_m (x_q[k+4] - x_q[k+2])$. To obtain a low-variance timing function, the timing function is averaged over 5 blocks. Counter and dead zone concepts are also introduced to reduce the timing jitter. The flowchart of the algorithm is given in Figure 3.14.

Figure 3.15: (a) Timing function for 70 ps dgd PMD channel at 14 dB OSNR. (b) Convergence of sampling phase. (Horizontal axis: time, 320 x 100 ps)

The average timing function is represented by $g(\tau)$, where $\tau$ represents the timing phase.
Starting from the worst sampling phase and center tap FFF initialization, the phase adaptation algorithm is able to acquire a sampling phase which is 5 ps away from the optimal sampling phase in less than 71 µs for the 70 ps dgd PMD channel. The difference between the optimal sampling phase and the acquired one is due to the asymmetric nature of the overall channel response. In Figure 3.15, the convergence curve of the sampling phase is depicted together with the timing function.

3.3.4 Slicer Threshold Adaptation (STA)

The DFE can be viewed as a method that restores the histogram of the received signal to its original two-hump form. Since, as already mentioned in Section 3.1, the noise power is directly proportional to the received signal level, the optimal slicer threshold (ST) changes with the channel. Selection of the slicer threshold is of paramount importance in terms of BER performance. As the decisions at the optimal slicer threshold value can be seen as Bernoulli trials, an observed bias in the decisions can be used for STA. Basically, the sum of $n$ decisions can be approximated by a Gaussian random variable with variance $n$. If the BER goes up due to biased decisions, this event can be captured and used to adapt the ST. Even though this method can be used to adapt the ST for high BER, it is not feasible to exploit the bias of the decisions for STA at low BER. A look-up table (LUT) based STA estimates the position of the ST accurately even for very low BER. Coefficient adaptation errors are used to form the LUT. The squares of the coefficient adaptation errors for decisions 1 and -1 are accumulated in two separate registers. As the equalizer converges, the accumulated values adapt to estimate the noise variances around 1 and -1.
It should be noted here that these are not the actual variances, as the MMSE-DFE is biased; however, the LUT can be designed accordingly. The ST is adapted every 512 blocks depending on the ratio of the variances. In Figure 3.16, the convergence of the ST is shown.

Figure 3.16: LUT based STA for 70 ps dgd PMD channel at 14 dB OSNR

For a discrete-time analog implementation of the DFE, both the complexity and the power consumption prohibit the use of the conventional LMS algorithm. As already discussed above, the signed error LMS provides a reasonable compromise between power and performance. However, the STA algorithm cannot be based on the error variance for the signed error LMS algorithm. In this case a dummy decision device can be used to generate the decisions as a function of the ST. It can be shown that the conditional cumulative distribution function can be estimated through the summation of decisions. Once the cumulative distribution functions are known, the variances can be computed.

Chapter 4

Data-Aided Clock Recovery for Optical Receivers

Dispersion in gigabit fiber optical links poses new challenges in clock recovery design. Overlapping of the transmitted pulse train due to the static chromatic dispersion (CD) and random polarization mode dispersion (PMD) corrupts the timing information of the incoming signal at the receiver. Besides the clock of the baseband optical signal, the optimal sampling instant has to be extracted from the noise-contaminated, dispersed receive signal to optimize the BER. To combat intersymbol interference (ISI), equalization tends to be a major receiver process for high-rate fiber optical systems.
For the digital implementation of the equalization, the complexity and power consumption of A/D converters preclude fractional sampling. For the discrete-time analog implementation of the equalization, uncontrolled dc biases in the adaptive update operation present stability problems due to the large condition number of the correlation matrix of the fractionally sampled receive data. Therefore, the baud-rate sampler stands as a better alternative for generating the necessary signals for equalization.

Synchronous Optical Network (SONET) and Synchronous Digital Hierarchy (SDH) standards at 10 Gb/s place stringent requirements on signal timing. Specifically, jitter generation, jitter transfer and jitter tolerance specifications are stipulated in the ITU-T G.783 and ANSI T1.105.03 recommendations for SDH and SONET, respectively. For nonzero-mean receive signals, spectral lines are created at the harmonics of the baud rate due to the cyclostationary characteristics of the information-bearing signal. A memoryless nonlinear function at the receiver is in general responsible for generating such a nonzero-mean signal. Even though the nonlinear operation used by the spectral line methods for CR is provided by the photodetector in fiber optical systems, asymmetry of the narrowband bandpass filter following the nonlinear operation degrades the performance in terms of jitter variance. Moreover, ISI might be so severe that it attenuates the clock frequency and thereby reduces the gain of the spectral line.

In this chapter, we propose a Mueller and Muller (MM) [43] based data-aided clock recovery (DACR) algorithm that takes advantage of the decisions to improve the performance of the clock recovery (CR). We choose the RMS jitter as a figure of merit. Peak-to-peak jitter is also computed after the frequency acquisition for a duration of $10^5$ symbol periods.
In practice, the frequency difference between the free-running voltage controlled oscillator (VCO) frequency and the data-bearing signal frequency may be so large that it causes large jitter in a phase-locked system. Therefore, we assume that a quadricorrelator or a rotational frequency detector (FD) [3,22,41] is used to bring the VCO frequency within a ±12 MHz neighborhood of the received signal clock. The goal of this chapter is to 1) derive a sampling instant close to the optimal sampling phase, and 2) recover the clock with low jitter. The optical channel model was examined in Section 3.1. In the following sections, we investigate the three major components of CR, namely the data detector, the phase detector and the loop filter.

Figure 4.1: Overall time-domain responses of various dispersion channels. (Curves: 2000 ps/nm CD, 1000 ps/nm CD, 80 ps PMD, 50 ps PMD)

4.1 Data Aided Clock Synchronizer

In Figure 4.1 we compare the overall time domain responses of the 50 and 80 ps dgd PMD, and 1000 and 2000 ps/nm CD channels after the front end filter. The impulse responses of CD and worst case PMD exhibit an even symmetry around half of their time span. Since even symmetry in the overall channel response, $h(t)$, yields improved phase tracking performance, we have selected even pulse shapes for the transmitter and the receiver. In the following sections we work on the discrete-time baud-spaced channel model given by

$$x_k(\varepsilon) = \sum_{m=1}^{L} a_{k-m} h_m(\varepsilon) + n_k \tag{4.1}$$

where $|\varepsilon| < T/2$ is the phase error. Convolution of the independent identically distributed (iid) input sequence $a_k$ with the finite-support channel can be represented by the multiplication of the $(L \times 1)$ channel vector $\mathbf{h}(\varepsilon)$ by the Toeplitz $(k \times L)$ data matrix $\mathbf{A}_k$ as

$$\mathbf{x}_k(\varepsilon) = \mathbf{A}_k \mathbf{h}(\varepsilon) + \mathbf{n}_k \tag{4.2}$$
$$\mathbf{h}(\varepsilon) = [h_L(\varepsilon) \cdots h_1(\varepsilon)]^T \tag{4.3}$$
$$\mathbf{a}_i = [a_{i-L+1} \cdots a_i]^T \tag{4.4}$$
$$\mathbf{A}_k = [\mathbf{a}_k \cdots \mathbf{a}_1]^T \tag{4.5}$$

where $(\cdot)^T$ denotes transposition.

Figure 4.2: Variations of phase error for various dispersion channels. (Curves: 2000 ps/nm CD, 1000 ps/nm CD, 80 ps PMD, 50 ps PMD; horizontal axis: phase error, x10 ps)

The major function of the clock synchronizer is to derive the sampling instants for the detector. An inductive synchronizer generates an error function to adjust these instants. A phase-locked loop (PLL) forms the basic building block of the adjustment process. The data-aided clock synchronizer creates the phase error of the PLL out of a specified function of the channel response. However, since the channel response is generally not available at the receiver, decisions are utilized to interpolate the channel response. In this chapter we employ a 3-tap feedforward and 1-tap feedback DFE to obtain the decisions with a low BER. Among various equalizer types, a 3+1 tap DFE provides a sufficiently low BER with low complexity for phase error generation. Note that we do not propose a 3+1 DFE as the main equalizer. Instead we use it to increase the reliability of the phase error. Once the input data are decoded, the discrete-time channel response, $\mathbf{h}(\varepsilon)$, can be evaluated through the correlation of the decisions with the sampled receive data.
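The correlation step in the last sentence can be sketched numerically. For iid equiprobable ±1 symbols, averaging the product of a delayed decision and the received sample isolates one tap, because all other symbol products average to zero. The example below is an illustration under assumptions: taps are indexed from 0 rather than from 1 as in (4.1), the decisions are taken to be error-free, the noise is Gaussian with an arbitrary standard deviation, and the tap values are borrowed from the CD channel of Table 3.1 purely as sample numbers. The helper names are hypothetical.

```python
import random

def simulate(h, n, rng, noise_std=0.0):
    """Generate symbols a and samples x[k] = sum_m h[m] * a[k - m] + noise."""
    a = [rng.choice((-1, 1)) for _ in range(n)]
    x = []
    for k in range(n):
        s = sum(h[m] * a[k - m] for m in range(len(h)) if k - m >= 0)
        x.append(s + rng.gauss(0.0, noise_std))
    return a, x

def correlate_taps(decisions, x, n_taps):
    """Estimate tap m as the average of decisions[k - m] * x[k]."""
    taps = []
    for m in range(n_taps):
        terms = [decisions[k - m] * x[k] for k in range(n_taps, len(x))]
        taps.append(sum(terms) / len(terms))
    return taps

rng = random.Random(2)
h_true = [0.22, 0.91, 0.12, -0.11]  # sample tap values
a, x = simulate(h_true, 20000, rng, noise_std=0.1)
h_hat = correlate_taps(a, x, len(h_true))
```

The estimate improves only with the square root of the averaging length, since the other taps act as pattern-dependent noise on each correlation. Decision errors, which this sketch ignores, would bias the estimate further.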
In their classic paper on baud-rate sampling, Mueller and Muller suggested a class of phase error generators in terms of the channel impulse response; they then derived the structure of more general phase error functions

$$\theta_k(\varepsilon) = \mathbf{x}_{k-m+1,k}^T(\varepsilon)\,\mathbf{F} \tag{4.6}$$

where $m$ is a fixed parameter and $\mathbf{F}$ is a function of the decisions

$$\mathbf{F} = f_m(a_{k-m+1} \cdots a_k) \tag{4.7}$$

We select one of the members of this class

$$\theta_k(\varepsilon) = \hat{a}_{k-l}\,x_k(\varepsilon) - \hat{a}_{k-l+1}\,x_{k-1}(\varepsilon) \tag{4.8}$$

where $\hat{a}_i$ denotes the blindly detected input data and $l$ represents the properly aligned timing index, which is less than $k$. In fact, since the continuous channel, $h(t)$, has an even symmetry, we can safely assume that $l = \lfloor L/2 \rfloor$. The phase error functions, $\theta_k(\varepsilon)$, of the 50 and 80 ps dgd PMD and 1000 and 2000 ps/nm CD channels are shown in Figure 4.2. Since the slopes of the curves are different for each dispersion case, the closed loop bandwidth of the PLL will not be fixed for a fixed loop filter.

Figure 4.3: Variances of phase errors. (Curves: 2000 ps/nm CD, 1000 ps/nm CD, 80 ps PMD, 50 ps PMD; horizontal axis: phase error, x10 ps)

The phase error function, $\theta_k(\varepsilon)$, is exposed to three different kinds of noise, namely additive noise, pattern dependent noise and detection error noise. Decomposing $\theta_k(\varepsilon)$ into the sum of two terms,

$$\theta_k(\varepsilon) = \theta_{k,1} + \theta_{k,2}(\varepsilon) \tag{4.9}$$
$$\theta_{k,1} = a_{k-l}\,n_k - a_{k-l+1}\,n_{k-1} \tag{4.10}$$
$$\theta_{k,2}(\varepsilon) = a_{k-l} \sum_{m=1}^{L} a_{k-m} h_m(\varepsilon) - a_{k-l+1} \sum_{m=1}^{L} a_{k-m-1} h_m(\varepsilon) \tag{4.11}$$

we observe that the first term, $\theta_{k,1}$, just adds zero-mean noise, whereas the second term, $\theta_{k,2}(\varepsilon)$, carries the necessary timing information corrupted by pattern dependent noise and detection error noise. Although pattern dependent noise and detection error noise are strongly coupled, the detection error noise can be ignored when the BER of the system is less than $10^{-2}$.
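A short simulation makes the timing information in (4.8) visible. For iid ±1 data, the average of the phase error equals the difference of the two taps on either side of the centre tap, so it vanishes for an even response and changes sign with the direction of the asymmetry. The sketch below rests on several assumptions: taps are indexed from 0 with $x_k = \sum_m h[m]\,a_{k-m}$, the response has three taps with hypothetical values, $l = 2$, the decisions are error-free and there is no additive noise. Under this indexing the expected mean is `h[2] - h[0]`.

```python
import random

def mm_phase_errors(decisions, x, l):
    """theta[k] = a[k-l] * x[k] - a[k-l+1] * x[k-1], following (4.8)."""
    return [decisions[k - l] * x[k] - decisions[k - l + 1] * x[k - 1]
            for k in range(l + 1, len(x))]

def mean_phase_error(h, n, seed):
    """Average phase error for a three-tap response h over n random symbols."""
    rng = random.Random(seed)
    a = [rng.choice((-1, 1)) for _ in range(n)]
    x = [sum(h[m] * a[k - m] for m in range(len(h)) if k - m >= 0)
         for k in range(n)]
    theta = mm_phase_errors(a, x, l=2)
    return sum(theta) / len(theta)

# Hypothetical sampled responses: one asymmetric, one even about h[1].
mean_skewed = mean_phase_error([0.2, 0.9, 0.35], n=40000, seed=3)
mean_even = mean_phase_error([0.3, 0.9, 0.3], n=40000, seed=3)
```

The asymmetric response gives a mean close to 0.15, while the even response gives a mean close to zero. This is the property the loop exploits to lock onto the sampling phase at which the response is balanced about its centre tap.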
The variances of the phase error noise, $\sigma^2(\theta(\varepsilon))$, for the above channels are evaluated and plotted in Figure 4.3. As shown in this figure, the optimal sampling instants also have the minimum phase error noise variance. The variance of both the pattern dependent noise and the additive noise can be reduced by averaging over a number of symbols. Moreover, by using only a favorable subset of data patterns, the random fluctuations of the phase error can be reduced further. For instance, let us consider the case where the channel response is $\mathbf{h}(\varepsilon) = [h_2(\varepsilon)\ h_1(\varepsilon)\ h_0(\varepsilon)]^T$. Then $\theta_{k,2}(\varepsilon)$ takes four possible values

$$2h_2(\varepsilon) - 2h_0(\varepsilon),\quad 2h_2(\varepsilon),\quad -2h_0(\varepsilon),\quad 0 \tag{4.12}$$

For a channel response that is even around $h_1(\varepsilon)$, among the four terms given above, only the first one is the desired phase error. The mean phase error, $h_2(\varepsilon) - h_0(\varepsilon)$, is one-half of the first term. Note that it is possible to allow only the following patterns for $[a_k\ a_{k-1}\ a_{k-2}\ a_{k-3}]$:

$$\begin{bmatrix} -1 & -1 & 1 & 1 \\ -1 & 1 & 1 & -1 \\ 1 & -1 & -1 & 1 \\ 1 & 1 & -1 & -1 \end{bmatrix} \tag{4.13}$$

Doing so, the fluctuations can be prevented for this particular instance.

In the following we discuss the details of the implementation. The block diagram of the timing recovery loop is shown in Figure 4.4. Decisions are generated by an adaptive DFE. As an adaptation method, a slowly adapting LMS algorithm is chosen. Therefore detection error noise is not negligible at the start-up of the PLL. A block of 32 symbols is processed at each cycle with a delay of 1 cycle. After the phase detector curve has been found, the loop parameters of the PLL are adjusted so that the minimum bandwidth of the closed loop transfer function is greater than 4 MHz. To be able to track low-frequency sinusoidal jitter, the selection of the loop parameters plays a critical role in CR design.
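The four values in (4.12) and the effect of the pattern restriction in (4.13) can be checked by enumerating all sixteen data patterns. The sketch assumes the noise-free model $x_k = h_0 a_k + h_1 a_{k-1} + h_2 a_{k-2}$ with $l = 2$ in (4.8) and error-free decisions; the tap values and the function name are hypothetical.

```python
from itertools import product

def theta_data_term(pattern, h0, h1, h2):
    """Noise-free phase error for pattern = (a_k, a_{k-1}, a_{k-2}, a_{k-3})."""
    ak, ak1, ak2, ak3 = pattern
    x_k = h0 * ak + h1 * ak1 + h2 * ak2      # sample at time k
    x_k1 = h0 * ak1 + h1 * ak2 + h2 * ak3    # sample at time k - 1
    return ak2 * x_k - ak1 * x_k1            # (4.8) with l = 2

h0, h1, h2 = 0.2, 0.9, 0.35  # hypothetical tap values
values = {p: round(theta_data_term(p, h0, h1, h2), 12)
          for p in product((-1, 1), repeat=4)}

# The four patterns allowed in (4.13).
allowed = [(-1, -1, 1, 1), (-1, 1, 1, -1), (1, -1, -1, 1), (1, 1, -1, -1)]

distinct = sorted(set(values.values()))
mean_value = sum(values.values()) / len(values)
desired = [p for p, v in values.items() if abs(v - 2 * (h2 - h0)) < 1e-9]
```

The enumeration yields exactly the four values of (4.12), with the centre tap cancelling out. The patterns in `desired` coincide with the four rows of (4.13), which are the patterns where $a_k = -a_{k-2}$ and $a_{k-1} = -a_{k-3}$. Gating the phase detector on these patterns discards three quarters of the updates in exchange for removing the pattern dependent fluctuation.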
Since the closed loop bandwidth varies as the dispersion changes, it is essential to consider the bandwidth variation due to the channel variation.

[Figure 4.4: Block diagram of timing recovery loop]

The linearized analysis of a PLL is usually performed in the s-domain. However, MM based CR works in discrete time, therefore z-domain analysis is required to describe the loop behavior. Decisions are generated every 32 symbols, thus the sampling rate of the z-transform of the PLL is 156.3 MHz (987.1 Mrad/s). The target closed loop bandwidth, then, corresponds to 25.13 Mrad/s. Adaptation of the pre-equalizer is decoupled from that of the PLL to a certain extent by choosing the bandwidth of the LMS algorithm very small compared to the closed loop bandwidth of the PLL. In particular, the step size of the LMS algorithm is chosen as 0.0003. Two types of loop filters are investigated. The Type I filter is just a gain. The Type II filter, on the other hand, is comprised of both a gain and a discrete time integrator. The transfer function of the VCO is simply modelled as $z/(z-1)$. The optimal sampling instant is found offline through the Wiener solution of the 8+3 DFE for comparison purposes.

[Figure 4.5: Impact of the sampling phase on the output statistics of the receive signal. (a) Q value of 50 ps dgd channel at 12 dB OSNR after the DFE. (b) The MSE of the DFE for the same channel. Horizontal axes: sampling phase (×10 ps).]

As already mentioned above, a 3+1 pre-equalizer is used to produce the decisions with a BER of $10^{-2}$ or less for phase error generation. The sampling rate of $x(t)$ is chosen as 100 GHz.
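To make the z-domain loop model described above more concrete, the following Python sketch evaluates the closed loop response of a Type I loop (pure gain filter) with the VCO modelled as $z/(z-1)$. It assumes a phase detector gain normalized to one and no additional loop delay; the function names, the symbol `k_loop` and any numeric gain passed to them are hypothetical choices for illustration, not parameters reported in the thesis.

```python
import numpy as np

def closed_loop_gain(omega, k_loop):
    """|H(e^{jw})| for H(z) = G(z) / (1 + G(z)) with G(z) = k_loop * z / (z - 1).

    omega is in rad/sample. Written as k*z / ((1+k)*z - 1) to avoid dividing
    by zero at omega = 0. Assumes unit phase detector gain and no loop delay.
    """
    z = np.exp(1j * np.asarray(omega, dtype=float))
    return np.abs(k_loop * z / ((1.0 + k_loop) * z - 1.0))

def bandwidth_3db(k_loop, f_update_hz):
    """-3 dB closed loop bandwidth in rad/s for loop update rate f_update_hz.

    Solves |H|^2 = 1/2 in closed form:
    cos(w) = (2 + 2k - k^2) / (2 (1 + k)), valid for moderate k > 0.
    """
    cos_w = (2.0 + 2.0 * k_loop - k_loop ** 2) / (2.0 * (1.0 + k_loop))
    return np.arccos(cos_w) * f_update_hz
```

Under these assumptions, the bandwidth scales with the loop gain, so a change in the phase detector slope from one dispersion case to another shifts the closed loop bandwidth for a fixed loop filter. As a unit check, the 4 MHz target corresponds to $2\pi \cdot 4\times10^{6} \approx 25.13$ Mrad/s.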
For a 10 Gb/s system, the minimum mean square error (MMSE), $J_{\min}(i)$, of each of the 10 phases is found as follows

$$\mathbf{z}_k = [\,x(10k+i)\ \cdots\ x(10(k-7)+i)\ \ a_m\ \ a_{m-1}\ \ a_{m-2}\,] \tag{4.14}$$

$$R_{zz}(i) = E\{\mathbf{z}_k^T \mathbf{z}_k\} \tag{4.15}$$

$$R_{za}(i) = E\{\mathbf{z}_k^T a_{m+1}\} \tag{4.16}$$

$$\mathbf{f}(i) = R_{zz}^{-1}(i)\,R_{za}(i) \tag{4.17}$$

$$J_{\min}(i) = 1 - R_{za}^T(i)\,\mathbf{f}(i) \tag{4.18}$$

where the time indexes are in terms of 10 ps. The decision delay, or in other words the timing relation between $a_m$ and $x(10k+i)$, is adjusted to optimize $J_{\min}(i)$. Statistics are obtained by averaging over 20000 symbols.

[Figure 4.6: Closed loop Bode plot of Type II filter. Data marker: frequency 2.68e+007 rad/sec, magnitude -3.01 dB.]

In practice, the digital SNR, Q, is used as a figure of merit for the performance of the receiver

$$Q = \frac{\mu_{1} - \mu_{-1}}{\sigma_{1} + \sigma_{-1}} \tag{4.19}$$

where $\mu_{\pm 1}$ and $\sigma_{\pm 1}$ denote the mean and standard deviation of the equalizer output conditioned on the transmitted symbol. The optimal sampling point is defined as the phase which minimizes $J_{\min}$ or, equivalently, which maximizes the Q at the output of the equalizer. In Figure 4.5(a) and 4.5(b), Q and $J_{\min}$ of the 50 ps dgd channel after the 8+3 DFE are shown as a function of sampling phase, respectively. It turns out that the 6th and 7th sampling phases yield approximately the same performance with the maximum Q value. Therefore the optimal sampling phase should be in between these sampling phases.

We adjusted $K_d$, the phase detector gain, so that $\theta_k(\varepsilon) = \varepsilon$ for the worst case phase detector slope. The Type II filter has two parameters

$$F_2(z) = K_1 + [\text{integrator term in } K_2\text{, illegible in the source}] \tag{4.20}$$

Parameters $K_1$ and $K_2$ are tuned to get a closed loop bandwidth of at least 25.13 Mrad/s with a peaking of less than 0.5 dB. The only parameter of the Type I filter is similarly adjusted. The closed loop Bode plot of the Type II filter with $K_1 = 0.08$ and $K_2 = 0.0002$ is given in Figure 4.6.
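The per-phase Wiener computation of (4.14)-(4.18) and the Q figure of merit of (4.19) can be sketched with sample averages in place of expectations. The code below is a minimal illustration, assuming BPSK symbols, 10 samples per symbol, 8 feedforward and 3 feedback taps, and error-free feedback symbols; the function names, the default `delay` value and the argument layout are assumptions for this example rather than the thesis implementation.

```python
import numpy as np

def dfe_mmse(x, a, phase, n_ff=8, n_fb=3, delay=4, sps=10):
    """Sample estimate of the DFE Wiener solution for one sampling phase.

    x     : received signal, sps samples per symbol (len(x) >= sps * len(a))
    a     : transmitted symbols (+/-1), used as ideal past decisions
    phase : sampling phase index i in 0..sps-1
    delay : decision delay (>= 1); the target symbol is a[k - delay + 1]
    Returns (J_min, f) with f the stacked feedforward and feedback taps.
    """
    x = np.asarray(x, dtype=float)
    a = np.asarray(a, dtype=float)
    start = max(n_ff - 1, delay + n_fb - 1)
    rows, targets = [], []
    for k in range(start, len(a)):
        m = k - delay                                         # latest past decision
        ff = [x[sps * (k - j) + phase] for j in range(n_ff)]  # x(10k+i), ...
        fb = [a[m - j] for j in range(n_fb)]                  # a_m, a_{m-1}, ...
        rows.append(ff + fb)
        targets.append(a[m + 1])
    z = np.array(rows)
    t = np.array(targets)
    r_zz = z.T @ z / len(t)            # sample version of R_zz
    r_za = z.T @ t / len(t)            # sample version of R_za
    f = np.linalg.solve(r_zz, r_za)    # Wiener taps
    j_min = np.mean(t ** 2) - r_za @ f # equals 1 - R_za^T f for +/-1 symbols
    return j_min, f

def q_factor(y, a):
    """Q = (mu_1 - mu_-1) / (sigma_1 + sigma_-1) from equalizer outputs y."""
    y = np.asarray(y, dtype=float)
    a = np.asarray(a)
    y1, y0 = y[a > 0], y[a < 0]
    return (y1.mean() - y0.mean()) / (y1.std() + y0.std())
```

Sweeping `phase` over the ten sampling phases and `delay` over a few values, then keeping the smallest `j_min`, mirrors the offline search for the optimal sampling instant described above. Note that `np.linalg.solve` requires a nonsingular sample correlation matrix, which holds in practice once noise is present.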
As shown in Figure 4.6, the closed loop bandwidth of the PLL is 26.8 Mrad/s and the peaking is 0.216 dB.

[Figure 4.7: Accumulated phase jitter for 50 ps dgd PMD channel at 12 dB with Type II filter. Horizontal axis: time (×100 ps).]

It can be shown using the final value theorem that the Type II filter can acquire a frequency step up to a certain magnitude without any phase error. Jitter performance of the Type II filter is simulated for the above dispersion channels for a frequency difference of 4 MHz. In particular, the VCO frequency is set to 10.004 GHz while the data rate of the incoming signal is 10 Gb/s. Since the sampling rate of $x(t)$ is 100 GHz, the values of $x(t_k)$ are evaluated by linear interpolation. Let $t_k$ be between $t_{n,i} = 100n + 10i$ ps and $t_{n,i+1} = 100n + 10(i+1)$ ps where $i < 9$, then $x(t_k)$ can be found as

$$x(t_k) = (t_{n,i+1} - t_k)\,x(t_{n,i}) + \bigl(1 - (t_{n,i+1} - t_k)\bigr)\,x(t_{n,i+1}) \tag{4.21}$$

where the time difference $t_{n,i+1} - t_k$ is expressed in units of the 10 ps sample spacing.

[Figure 4.8: Improvement of the Q value while the parameters are adapting. (a) Q of 50 ps dgd channel; horizontal axis: time (×20000×100 ps). (b) Coefficients of the DFE, including FFF(2) and FBF(0); horizontal axis: time (×32×100 ps).]

For the 50 ps dgd PMD channel at 12 dB OSNR, the phase is locked in 5 μs with no phase error starting from the worst sampling phase. Figure 4.7 shows the accumulated phase jitter along with its histogram. The mean of the sampling phase coincides with the optimal sampling instant. Peak-to-peak jitter is measured as 8 ps while the rms jitter is 1.3 ps. While the PLL is acquiring the frequency, both the feedforward (FFF) and the feedback (FBF) filter coefficients converge to their steady state values, providing a decent Q value. The learning curve of the coefficients and the evolution of the Q value are shown in Figure 4.8(b) and 4.8(a) respectively.

[Figure 4.9: Acquisition of the timing recovery algorithm for a frequency difference of 12 MHz after a number of cycle slips]

When the frequency difference between the VCO and the incoming data rate is large, the PLL acquires the frequency after a number of cycle slips. Figure 4.9 illustrates this acquisition process for a frequency difference of 12 MHz. Cycle slips result in an oscillation in the convergence curves of the DFE coefficients. Since the coefficients are away from their optimal values during the frequency acquisition stage, the Q value is low. However, once the phase converges, the output Q value is the same as in the case of no frequency difference. The number of cycle slips before the frequency acquisition is achieved is a function of system parameters. For a fixed loop filter and dispersion channel, the number of cycle slips goes down as the OSNR of the system improves. For example, for the 50 ps dgd PMD channel with 12 MHz frequency difference, the phase is locked after 20 cycle slips at 12 dB OSNR whereas the number of cycle slips is only 4 at 22 dB OSNR. Jitter performance of the various dispersion channels is simulated and the rms jitter values are compiled in Table 4.1.

OSNR             12 dB    14 dB    16 dB    18 dB    20 dB    22 dB    24 dB
50 ps dgd PMD    1.3 ps   1 ps     0.79 ps  0.58 ps  0.48 ps  0.39 ps  0.3 ps
80 ps dgd PMD    1.2 ps   0.87 ps  0.7 ps   0.54 ps  0.47 ps  0.38 ps  0.29 ps
1000 ps/nm CD    1.1 ps   0.88 ps  0.72 ps  0.57 ps  0.46 ps  0.39 ps  0.3 ps
2000 ps/nm CD    1 ps     0.76 ps  0.65 ps  0.54 ps  0.40 ps  0.34 ps  0.28 ps

Table 4.1: Jitter performance of the timing recovery algorithm as a function of OSNR

For fixed loop parameters, the two dominating factors in determining the rms jitter are the phase error noise variance and the phase detector slope around zero. Since we used a fixed $K_d$, the larger the slope is, the lower the rms jitter is.
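The linear interpolation of (4.21), used earlier in this section to evaluate $x(t_k)$ between samples of the 100 GHz grid, can be sketched as follows. The function name `interp_sample` and its arguments are hypothetical; the sketch assumes a uniformly sampled record with 10 ps spacing and a sampling time that is not at or beyond the last stored sample.

```python
import math

def interp_sample(x, t_ps, dt_ps=10.0):
    """Linearly interpolate the uniformly sampled signal x at time t_ps.

    x[j] is the sample taken at time j * dt_ps. The weight on the left
    sample is the normalized distance to the right sample, so the two
    weights sum to one, as in a standard two-point interpolator.
    """
    u = t_ps / dt_ps
    j = math.floor(u)        # index of the sample at or before t_ps
    w_right = u - j          # normalized distance from the left sample
    return (1.0 - w_right) * x[j] + w_right * x[j + 1]
```

With a VCO running slightly faster than the symbol rate, the requested times `t_ps` drift across the 10 ps grid, so most evaluations fall strictly between two stored samples.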
A striking observation is that the higher dispersion cases yielded lower rms jitter. Even though the phase detector curve of an MM based timing recovery flattens faster as the dispersion increases, higher dispersion channels have larger phase detector slopes around zero, which reduces their rms jitter. The Type I filter is much simpler to implement than the Type II filter; however, even a slight frequency difference generates a non-zero steady state phase error. Since the performance of the receiver hinges on the sampling phase, the output Q value of the receiver with the Type I filter will be lower than that of the Type II filter, as depicted in Figure 4.10.

[Figure 4.10: Impact of the loop filter on the output Q value for 2000 ps/nm CD channel at 14 dB OSNR with a 4 MHz frequency difference. Curves: Type I filter, Type II filter.]

Chapter 5
Conclusions

Traditional training approaches may be viewed as simple codes for unknown channels that require relatively simple processing at the expense of decreased rate or throughput. In this thesis, we have established a framework for the design of training codes based on a pairwise distance that is appropriate for JML channel and sequence estimation or approximations thereof. For short data packet lengths the resulting training codes were demonstrated to provide significant improvements over traditional segmented approaches at the expense of increased receiver complexity. For longer packets, the direct application of this framework becomes prohibitively complex. However, reliable data communication starting from the beginning of the (long) packet was demonstrated with very low overhead using split training sequences and additional search effort during acquisition.
Clearly, the major obstacle to applying the approach developed for training code design to larger block lengths is the exponential complexity growth in both the code design and the decoding problem. However, one could use a short training code designed using the approach suggested to initialize a PSP-based algorithm for a long packet. The practical value of such an approach, as compared against a standard training approach or a split-training approach for long packets, is questionable because the importance of saving a small number of overhead bits diminishes with the packet length. For example, saving 3 bits of overhead in a 16 bit packet is substantial, but saving 3 bits of overhead in a packet of 100 bits has a less impressive impact on throughput. In practice, the packet length or the number of bits transmitted between training is determined by a number of factors including the dynamics of time variation in the channel and the multiple access method. An interesting area for future research is the formulation of the design in a recursive manner that would allow simplified design and decoding. Also, incorporation of additional levels of error correction coding using recent adaptive iterative detection techniques is another promising research direction.

For high-rate fiber optical systems, dispersion poses challenges in both the timing and data recovery algorithms. We have established a framework for the design of both digital and discrete time analog receivers. Low complexity, blind adaptive equalizer techniques are examined in terms of their suitability for fiber systems. A parallel pipelined architecture of the digital equalizer combined with block FEC might be another interesting research direction. The implementation of both the timing recovery and data recovery algorithms requires a high bandwidth front end filter.
A partial response signal not only reduces the bandwidth requirement but also equalizes the dispersion when used in conjunction with precoding. The improvement in the distance span of the fiber optical system with partial response signalling together with precoding is also currently attracting considerable interest.

Reference List

[1] A. F. Elrefaie, R. E. Wagner, D. A. Atlas, and D. G. Daut. Chromatic dispersion limitations in coherent lightwave transmission systems. IEEE J. Lightwave Tech., 6:704-709, May 1988.
[2] N. Al-Dhahir and J. M. Cioffi. MMSE decision-feedback equalizers: finite length results. IEEE Trans. Inform. Theory, IT-41:961-975, Jul 1995.
[3] J. A. Bellisio. New phase-locked timing recovery method for digital regenerators. Int. Conf. Communications Record, Jun 1976.
[4] A. Benveniste, M. Goursat, and G. Ruget. Robust identification of a nonminimum phase system: Blind adjustment of a linear equalizer in data communication. IEEE Trans. Automatic Control, AC-25(3):385-398, June 1980.
[5] H. H. Chiang and C. L. Nikias. Adaptive deconvolution and identification of nonminimum phase FIR systems based on cumulants. IEEE Trans. Automatic Control, AC-35:36-47, Jan 1990.
[6] K. M. Chugg. Sequence Estimation in the Presence of Parametric Uncertainty. PhD thesis, University of Southern California, Los Angeles, CA, August 1995.
[7] K. M. Chugg. Performance of optimal digital page detection in a two-dimensional ISI/AWGN channel. In Proc. Asilomar Conf. Signals, Systems, Comp., November 1996.
[8] K. M. Chugg. Blind acquisition characteristics of PSP-based sequence detectors. IEEE J. Select. Areas Commun., 16:1518-1529, October 1998.
[9] K. M. Chugg. The condition for applicability of the Viterbi algorithm with implications for fading channel MLSD. IEEE Trans. Commun., 46:1112-1116, September 1998.
[10] K. M. Chugg and A. Polydoros. Front-end processing for joint maximum likelihood channel and sequence estimation. In Proc. Globecom Conf., pages 51-55, 1994.
[11] K. M. Chugg and A. Polydoros. On the existence and uniqueness of joint channel and data estimates. In Proc. IEEE Symposium on Information Theory, page 402, 1995.
[12] K. M. Chugg and A. Polydoros. MLSE for an unknown channel - part I: Optimality considerations. IEEE Trans. Commun., 44:836-846, July 1996.
[13] K. M. Chugg and A. Polydoros. MLSE for an unknown channel - part II: Tracking performance. IEEE Trans. Commun., pages 949-958, August 1996.
[14] T. H. Cormen, C. E. Leiserson, and R. L. Rivest. Introduction to Algorithms. McGraw-Hill, 1997.
[15] O. Coskun and K. M. Chugg. Combined coding and training for unknown ISI channel. IEEE Trans. Commun., (To appear).
[16] S. N. Crozier, D. D. Falconer, and S. A. Mahmoud. Least sum of squared errors (LSSE) channel estimation. IEE Proc. Radar and Signal Processing, 138:371-378, Aug 1991.
[17] T. Ericson. Structure of optimum receiving filters in data transmission. IEEE Trans. Inform. Theory, 17:353-360, July 1971.
[18] B. Farhang-Boroujeny. Channel equalization via channel identification: Algorithms and simulation results for rapidly fading HF channels. IEEE Trans. Commun., 44:1409-1412, Nov 1996.
[19] G. Fettweis and H. Meyr. Parallel Viterbi decoding by breaking the compare-select bottleneck. IEEE Trans. Commun., pages 785-790, Aug 1988.
[20] G. D. Forney. Maximum-likelihood sequence estimation of digital sequences in the presence of intersymbol interference. IEEE Trans. Inform. Theory, IT-18:284-287, May 1972.
[21] G. D. Forney. The Viterbi algorithm. Proc. IEEE, 61:268-278, March 1973.
[22] F. M. Gardner. Properties of frequency difference detectors. IEEE Trans. Commun., COM-33:131-138, Feb 1985.
[23] W. A. Gardner. A new method of channel identification. IEEE Trans. Commun., 39(6):813-817, June 1991.
[24] M. Ghosh and C. L. Weber. Maximum-likelihood blind equalization. In SPIE, pages 181-195, July 1991.
[25] G. B. Giannakis and J. M. Mendel. Identification of nonminimum phase systems using higher-order moments. IEEE Trans. Acoustics, Speech and Signal Process., ASSP-37(3):360-377, March 1989.
[26] D. Godard. Self-recovering equalization and carrier tracking in two-dimensional data communication systems. IEEE Trans. Commun., 28:1867-1875, November 1980.
[27] G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins, 1996.
[28] F. Gustafsson and B. Wahlberg. Blind equalization by direct examination of the input sequences. IEEE Trans. Commun., 43:2213-2222, July 1995.
[29] S. Haykin. Adaptive Filter Theory. Prentice Hall, Upper Saddle River, NJ, 3rd edition, 1998.
[30] P. A. Humblet and M. Azizoglu. On the bit error rate of lightwave systems with optical amplifiers. IEEE J. Lightwave Tech., 9, 1991.
[31] T. Kailath. Correlation detection of signals perturbed by a random channel. IRE Trans. Inform. Th., IT-6:361-366, June 1960.
[32] R. M. Karp, R. E. Miller, and S. Winograd. The organization of computations for uniform recurrence equations. J. ACM, 14:563-590, 1967.
[33] K. Kato, A. Kozen, Y. Muramoto, Y. Itaya, T. Nagatsuma, and M. Yaita. 110GHz, 501.55mm wavelength. IEEE Photonics Technol. Letters, 6:719-721, Jun 1994.
[34] P. Kogge and H. Stone. A parallel algorithm for efficient solution of a general class of recurrence equations. IEEE Trans. on Computers, C-22:786-793, Feb 1973.
[35] G. Lachs, S. M. Zaidi, and A. K. Singh. Sensitivity enhancement using coherent heterodyne detection. IEEE J. Lightwave Tech., 12:1036-1941, Jun 1994.
[36] H. J. Landau and H. O. Pollak. Prolate spheroidal wave functions, Fourier analysis, and uncertainty, III: the dimension of the space of essentially time- and band-limited signals. Bell Sys. Tech. J., 41:1295, 1962.
[37] J. Lin, F. Ling, and J. G. Proakis. Joint data and channel estimation for TDMA mobile channels. In PIMRC-92, pages 235-239, 1992.
[38] R. W. Lucky. Automatic equalization for digital communications. Bell Sys. Tech. J., 44:547-588, April 1965.
[39] C. K. Madsen and G. Lenz. Optical all-pass filters for phase response design with applications for dispersion compensation. IEEE Photonics Technol. Letters, 10:994-996, July 1998.
[40] C. K. Madsen and G. Lenz. General optical all-pass filter structures for dispersion control in WDM systems. IEEE J. Lightwave Tech., 17:1248-1254, July 1999.
[41] D. G. Messerschmitt. Frequency detectors for PLL acquisition in timing and carrier recovery. IEEE Trans. Commun., COM-27:1288-1295, Sep 1979.
[42] Y. Miyamoto, K. Hagimoto, F. Ichikawa, M. Yamamoto, and T. Kagawa. A 10Gb/s high sensitivity optical receiver using InGaAs-InAlAs superlattice APD at 1.3μm/1.5μm. IEEE Photonics Technol. Letters, 3:373-374, Apr 1991.
[43] K. H. Mueller and M. Müller. Timing recovery in digital synchronous data receivers. IEEE Trans. Commun., 24:516-531, May 1976.
[44] J. Omura and T. Kailath. Some useful probability distributions. Technical report, System Theory Lab, Stanford University, CA, 1965.
[45] T. Ono, S. Yamazaki, H. Shimizu, and K. Emura. Polarization control method for suppressing polarization mode dispersion influence in optical transmission systems. IEEE J. Lightwave Tech., 12:891-898, May 1994.
[46] J. Proakis. Digital Communications. McGraw-Hill, New York, 4th edition, 2001.
[47] S. U. H. Qureshi. Fast start-up equalization with periodic training sequences. IEEE Trans. Inform. Theory, IT-23(5):553-562, September 1977.
[48] R. Raheli, A. Polydoros, and C-K. Tzou. The principle of per-survivor processing: A general approach to approximate and adaptive MLSE. In Proc. Globecom Conf., pages 33.3.1-33.1.6, 1991.
[49] R. Raheli, A. Polydoros, and C-K. Tzou. Per-survivor processing: A general approach to MLSE in uncertain environments. IEEE Trans. Commun., 43:354-364, Feb-Apr. 1995.
[50] Y. Sato. A method of self-recovering equalization for multilevel amplitude-modulation systems. IEEE Trans. Commun., 23:679-682, June 1975.
[51] A. H. Sayed and T. Kailath. A state-space approach to adaptive RLS filtering. IEEE Signal Processing Mag., 11:18-60, Jul 1994.
[52] M. Schetzen. The Volterra and Wiener Theories of Nonlinear Systems. New York: Wiley, 1980.
[53] N. Seshadri. Joint data and channel equalization using fast blind trellis search techniques. Proc. Globecom Conf., pages 1659-1663, December 1990.
[54] N. Seshadri. Joint data and channel estimation using blind trellis search techniques. IEEE Trans. Commun., 42:1000-1011, Feb./Mar./Apr. 1994.
[55] W. A. Sethares, I. M. Y. Mareels, B. D. O. Anderson, and C. R. Johnson Jr. Comparison of DC offset effects in four LMS adaptive algorithms. IEEE Trans. Circuits Syst., 35:613-624, Jun 1998.
[56] W. A. Sethares, I. M. Y. Mareels, B. D. O. Anderson, and C. R. Johnson Jr. Excitation conditions for signed regressor least mean squares adaptation. IEEE Trans. Circuits Syst., 35:613-624, Jun 1998.
[57] L. Tong, G. Xu, and T. Kailath. Blind identification and equalization based on second-order statistics: a time domain approach. IEEE Trans. Inform. Theory, 40(2):340-349, March 1994.
[58] G. Ungerboeck. Adaptive maximum likelihood receiver for carrier-modulated data-transmission systems. IEEE Trans. Commun., COM-22:624-636, May 1974.
[59] B. Widrow and S. D. Stearns. Adaptive Signal Processing. Prentice Hall, Englewood Cliffs, NJ, 1985.
[60] J. H. Winters and R. D. Gitlin. Electrical signal processing techniques in long-haul fiber-optic systems. IEEE Trans.
Commun., 38:1439-1453, Sep 1990.
[61] L. S. Yan, Q. Yu, T. Luo, A. E. Willner, and X. S. Yao. Compensation of higher order polarization-mode dispersion using phase modulation and polarization control in the transmitter. IEEE Photonics Technol. Letters, 14:858-860, Jun 2002.
[62] Q. Yu, L. S. Yan, Y. Xie, M. Hauer, and A. E. Willner. Higher order polarization mode dispersion compensation using a fixed time delay followed by a variable time delay. IEEE Photonics Technol. Letters, 13:863-865, Aug 2001.

Appendix A
Previous Identifiability / Separability Tests and Their Relation

Two approaches to characterizing separability have been developed independently by Gustafsson and Wahlberg [28] and Chugg [8]. Both approaches attempt to characterize sequences which, when transmitted, may (or will) cause a blind receiver to have problems distinguishing them from another allowable sequence. We briefly describe these approaches and relate the main concepts.

In [8], only digital sequences were considered. A sequence $A_k$ was defined to be distinguishable from $A'_k$ under channel $\mathbf{f}$ if $A_k\mathbf{f} \notin \mathcal{R}(A'_k)$. Furthermore, a sequence $A_k$ that is distinguishable from all sequences except those in $T(A_k) = \{A'_k : A'_k = cA_k \text{ for some } c\}$ is identifiable under the definition from [28]. The equivalence class of a digital sequence $A_k$ is all other such sequences with the same range space, with the trivial equivalence class defined as $T(A_k)$. Sequences with a non-trivial equivalence class cannot be distinguished from some other sequence under any circumstance under the metric in (1.20). Such sequences were characterized as having some periodic structure [8]. Furthermore, a sufficient condition was given [8, Lemma 7] for a BPSK sequence $A_k$ to be distinguishable
from all $A'_k \notin T(A_k)$ under a class of channels that were "regular" with respect to the digital alphabet. For BPSK modulation, this condition is that all of the $2^L$ possible rows occur on the "interior" of $A_k$.

The approach in [28] is based on a condition for analog sequences $X_k$ (i.e., a Toeplitz matrix with elements taking values on the continuum). Clearly, rank deficient data matrices cause channel identification problems. Considering only full rank data matrices $X_k$, the solution of

$$\mathbf{r} = P_{X'_k} X_k - X_k = 0 \tag{A.1}$$

for $X'_k$, where $P_{X'_k}$ represents the projection matrix of $X'_k$, gives the set of sequences which are equivalent to $X_k$. The notation $X_k$ is used for the Toeplitz matrix representation of the sequence $x_k$, with the symbol $x$ used in place of $a$ to emphasize that $x$ takes values on the continuum. In [28], it was shown that for a channel length $L$, if

$$B_k(x) = \begin{bmatrix} x_{k-2L+2} & \cdots & x_k \\ \vdots & \ddots & \vdots \\ x_{-L+2} & \cdots & x_L \end{bmatrix} \tag{A.2}$$

is a rank $2L-1$ matrix (in the language of [28], $X_k$ is persistently exciting (PE) of order $2L-1$), then $X'_k$ can always be represented as $cX_k$ for some constant $c$. Therefore, for digital sequences $a_k$, $B_k(a)$ being a full rank matrix is a sufficient condition for $A_k$ to have a trivial equivalence class. In general, however, this is not sufficient for $A_k$ to be identifiable with respect to all channels. We next give the definition of a class of channels implied in [28].

Definition A.1 Assured channels are those for which $(P_{A'_k} A_k - A_k)\mathbf{f} = 0$ iff $P_{A'_k} A_k = A_k$ holds for all $A_k$ and $A'_k$ rank $L$ digital data matrices with $A_k$ being PE of order $2L-1$.

[Figure A.1 labels: [Gustafsson and Wahlberg 95], [Chugg 98]; digital sequence conditions: trivial equivalence class, all interior rows condition; channel conditions: assured, regular, linearly independent.]

Figure A.1: The relation between the identifiability checks; the check in [8] is weaker than that in [28].
Note that the relation between regular and assured channels is a conjecture.

It follows that a trivial equivalence class is a sufficient condition for identifiability with respect to assured channels. Thus, focusing on assured channels eliminates the possibilities of the channel dependent indistinguishabilities described in [8]. In [28], a subset of assured channels was given as "linearly independent channels", which has much simpler characteristics. In the remainder of this appendix, we show that the set of linearly independent channels is a subset of the set of regular channels, whereas the set of sequences satisfying the "all rows" condition [8] is a subset of the set of sequences which are PE of order $2L-1$ for a length $L$ channel. These conclusions are summarized in Figure A.1. Comparing the check for identifiability in [28] to that in [8], a larger class of digital sequences was considered with respect to a smaller class of channels. Note that we have not proven any relation between regular channels and assured channels, but we conjecture that the relation is as implied by Figure A.1. Finally, we note that neither reference gives a simple, pairwise separability check of the form desired for our code design methodology.

Theorem A.2 The set of linearly independent channels is a proper subset of the set of regular channels.

Proof: Let the data belong to the finite alphabet $\mathcal{A} = \{\pm 1, \pm 3, \cdots, \pm(M-1)\}$ and assume the channel is not regular, but is linearly independent. From the definition in [28] this means that the channel cannot output a zero when the input is drawn from the set

$$Q_L(\mathcal{A}) = \{0, \pm 1, \pm 2, \cdots, \pm 2(M-1), \cdots, \pm(2M-1)^{2L+1} L^{L} (k_0 - L + 1)^{L}\} \tag{A.3}$$

Since the channel is not regular, there exists a pair of $(L \times 1)$ vectors $\mathbf{a}$ and $\mathbf{b}$ with elements from $\mathcal{A}$, such that $\mathbf{a}^T\mathbf{f} = \mathbf{b}^T\mathbf{f}$ or $(\mathbf{a} - \mathbf{b})^T\mathbf{f} = 0$.
However, the entries of the vector $\mathbf{a} - \mathbf{b}$ take values from the set $Q_L(\mathcal{A})$, therefore $\mathbf{f}$ cannot be linearly independent. Thus, every linearly independent channel is regular. On the other hand, there exist channels that are regular but not linearly independent. For example, the channel $\mathbf{f} = [2\ 3\ 10]^T$ is regular with respect to $\mathcal{A} = \{\pm 1\}$, but not linearly independent since $[2\ 2\ {-1}]\,\mathbf{f} = 0$. ■

Theorem A.3 If all the rows occur on the interior of $A_k$ with $L > 2$ and a BPSK alphabet, then $A_k$ is PE of order $2L-1$.

Proof: $A_k$ is PE of order $2L-1$ if and only if the equation (A.1), where $X_k$ is replaced by $A_k$, has only the trivial solution $X'_k = cA_k$ [28, Theorem 3.1]. Here we show that if $A_k$ has all the rows in its interior then the equation (A.1) has only the trivial solution $X'_k = cA_k$. Let $A_k$ and $X'_k$ be partitioned as follows,

$$A_k = \begin{bmatrix} t_1 & t_2 & \cdots & t_L \\ a_1 & a_2 & \cdots & a_L \\ b_1 & b_2 & \cdots & b_L \end{bmatrix}, \qquad X'_k = \begin{bmatrix} u_1 & u_2 & \cdots & u_L \\ x_1 & x_2 & \cdots & x_L \\ l_1 & l_2 & \cdots & l_L \end{bmatrix} \tag{A.4}$$

where $b_j$ and $t_j$ are $(L-1) \times 1$ vectors. The $(k+2-2L) \times 1$ vectors $a_j$ and $x_j$ comprise the interior of $A_k$ and $X'_k$, respectively [8]. Denoting $k+2-2L$ by $m$, we define the vectors $\tilde{a}_j$ ($\tilde{x}_j$) as the portion of the $j$-th column of $A_k$ ($X'_k$) starting from the second uppermost element of $a_j$ down to the uppermost element of $b_j$ ($l_j$), e.g.,

$$\tilde{a}_j = [\,a_j(2)\ \cdots\ a_j(m)\ \ b_j(1)\,]^T \tag{A.5}$$

The Toeplitz structure implies $\tilde{a}_j = a_{j-1}$ for $j = 2, \cdots, L$, with the similar relation holding for $X'_k$. Since it has been assumed that $A_k$ and $X'_k$ have the same column space,

$$x_1 = c_{11}a_1 + c_{12}a_2 + \cdots + c_{1L}a_L, \qquad \tilde{x}_1 = c_{11}\tilde{a}_1 + c_{12}a_1 + \cdots + c_{1L}a_{L-1} \tag{A.6}$$

$$x_2 = c_{21}a_1 + c_{22}a_2 + \cdots + c_{2L}a_L, \qquad \tilde{x}_2 = c_{21}\tilde{a}_1 + c_{22}a_1 + \cdots + c_{2L}a_{L-1} \tag{A.7}$$

$$\vdots \tag{A.8}$$

$$x_L = c_{L1}a_1 + c_{L2}a_2 + \cdots + c_{LL}a_L, \qquad \tilde{x}_L = c_{L1}\tilde{a}_1 + c_{L2}a_1 + \cdots + c_{LL}a_{L-1} \tag{A.9}$$

must hold for some choices of coefficients $\{c_{ij}\}$.
Two different equations can be written for $x_1$

$$x_1 = c_{11}a_1 + c_{12}a_2 + \cdots + c_{1L}a_L \tag{A.10}$$

$$x_1 = \tilde{x}_2 = c_{21}\tilde{a}_1 + c_{22}\tilde{a}_2 + \cdots + c_{2L}\tilde{a}_L \tag{A.11}$$

where (A.10) follows directly from the assumption that $\mathcal{R}(A_k) = \mathcal{R}(X'_k)$ and (A.11) follows from (A.7) and the Toeplitz structure. We now show that (A.11) and (A.7) imply that $c_{21} = 0$ by contradiction. If $c_{21} \neq 0$, (A.10) and (A.11), together with $\tilde{a}_j = a_{j-1}$, imply

$$\tilde{a}_1 = \underbrace{\frac{c_{11} - c_{22}}{c_{21}}}_{d_1} a_1 + \underbrace{\frac{c_{12} - c_{23}}{c_{21}}}_{d_2} a_2 + \cdots + \underbrace{\frac{c_{1(L-1)} - c_{2L}}{c_{21}}}_{d_{L-1}} a_{L-1} + \underbrace{\frac{c_{1L}}{c_{21}}}_{d_L} a_L \tag{A.12}$$

Assume that $J$ of the $d_i$ coefficients are nonzero and denote these nonzero coefficients by $(\beta_1, \beta_2, \cdots, \beta_J)$, with $(\alpha_1, \alpha_2, \cdots, \alpha_J)$ denoting the corresponding values of the $a_i$'s. The all-rows condition implies that all $2^J$ vectors consisting of $-1$ and $1$ occur in $(\alpha_1, \alpha_2, \cdots, \alpha_J)$. Specifically, $\beta_1 + \beta_2 + \cdots + \beta_J$ and $\beta_1 + \beta_2 + \cdots - \beta_J$ both occur as entries of $\tilde{a}_1$. However, the entries of $\tilde{a}_1$ are either $1$ or $-1$, so that a simple argument leads to the conclusion that $\beta_J = \pm 1$. Applying the same argument to $\beta_i$ for $i = 1, \cdots, J-1$, we obtain $\beta_i = \pm 1$. Since $(\alpha_1, \alpha_2, \cdots, \alpha_J)$ has all permutations in its rows, it will have $[\beta_1, \beta_2, \cdots, \beta_J]$ as a row. Thus, $\beta_1^2 + \beta_2^2 + \cdots + \beta_J^2 = J$ is an entry of $\tilde{a}_1$, which is possible only if $J = 1$. So $\tilde{a}_1 = \beta\alpha$, where $\beta = 1$ or $\beta = -1$ and $\alpha = a_i$ for some $i$. This condition, along with the Toeplitz structure of $A_k$, implies that $a_1$ (or $\tilde{a}_1$) will have a periodicity of length at most $2(L-1)$. However, all rows cannot be constructed in $A_k$ with such a column with such a period for $L > 3$. Thus, we conclude that the case $c_{21} = 0$ must hold which, along with (A.12) and the linear independence of the $a_i$, implies

$$c_{22} = c_{11},\quad c_{23} = c_{12},\quad \cdots,\quad c_{2L} = c_{1(L-1)},\quad c_{1L} = 0 \tag{A.13}$$

Similarly, two different equations can be written for $x_2$ as in (A.10) and (A.11).
With a similar reasoning,

$$c_{31} = c_{32} = 0,\quad c_{33} = c_{11},\quad c_{34} = c_{12},\quad \cdots,\quad c_{3L} = c_{1(L-2)},\quad c_{1L} = 0 \tag{A.14}$$

Applying the same argument $(L-3)$ times more, we obtain $X'_k = cA_k$. ■

In summary, from [28] a sufficient condition for identifiability of a digital sequence under the class of length $L$ linearly independent channels is that the sequence be PE of order $2L-1$. Similarly, from [8], a sufficient condition for the identifiability of a digital sequence under the class of regular channels is that the sequence satisfy the "all-rows" condition. The conclusions of this appendix are summarized in Figure A.1. Namely, for digital sequences, all sequences satisfying the all-rows condition are PE and all linearly independent channels are regular. So the check for the identifiability of sequences given in [8] was for a smaller set of sequences, but a larger set of channels compared to the one in [28]. Finally, we note that neither reference gives a simple, pairwise separability check of the form desired for our code design methodology.

Appendix B
PDF of the Quadratic Detector Signal in AWGN Environment

The quadratic detector signal is defined in (B.1).

$$r = \mathbf{z}_k^T \left(P^{(i)} - P^{(j)}\right) \mathbf{z}_k \tag{B.1}$$

For compactness, we use $A^{(i)}$ in place of $A_k^{(i)}$ and $r$ in place of $r_k(i,j)$, and most quantities discussed do not explicitly denote dependence on the two data sequences ($A^{(i)}$ and $A^{(j)}$) and the actual channel $\mathbf{f}$. Before giving the pdf of the decision statistic $r$ in (B.1), we give three lemmas about the spectral properties of $P^{(i)} - P^{(j)}$ and one theorem stating that $r$ can be represented as the sum of $2L$ independent chi-square distributed random variables.

Lemma B.1 There exists a set of orthonormal eigenvectors for $P^{(i)} - P^{(j)}$ with real eigenvalues.
Proof: This follows from the symmetric structure of $P^{(i)} - P^{(j)} = [P^{(i)} - P^{(j)}]^T$. ■

The next lemma provides a coordinate system useful for describing the eigenvectors of Lemma B.1. First, however, we give the definition of principal angles between two subspaces $\mathcal{F}$ and $\mathcal{G}$ in $\mathbb{R}^m$ [27]. Let $\mathcal{F}$ and $\mathcal{G}$ be subspaces in $\mathbb{R}^m$ whose dimensions satisfy

$$p = \dim(\mathcal{F}) \ge \dim(\mathcal{G}) = q \ge 1 \tag{B.2}$$

The principal angles $\theta_1, \ldots, \theta_q \in [0, \pi/2]$ between $\mathcal{F}$ and $\mathcal{G}$ with principal vectors $\{u_1, \ldots, u_q\}$ and $\{v_1, \ldots, v_q\}$ are defined recursively by

$$\cos(\theta_j) = u_j^T v_j = \max_{u \in \mathcal{F}} \max_{v \in \mathcal{G}} u^T v \tag{B.3}$$

$$\text{subject to: } \|u\| = \|v\| = 1 \tag{B.4}$$

$$u^T u_i = v^T v_i = 0, \quad i = 1, \ldots, j-1 \tag{B.5}$$

Intuitively, $u_1$ and $v_1$ define the two directions in $\mathcal{F}$ and $\mathcal{G}$, respectively, that are closest to being parallel (they maximize the inner product in (B.3)), and $\theta_1$ characterizes the angle between these directions. The vectors $u_j$ and $v_j$ are similarly interpreted based on the subspaces of $\mathcal{F}$ and $\mathcal{G}$ that are orthogonal to the spans of $\{u_i\}_{i=1}^{j-1}$ and $\{v_i\}_{i=1}^{j-1}$, respectively.

The principal angles and vectors can be computed using three singular value decompositions (SVDs). Let the matrix $O_i$ ($O_j$) have columns defined by an orthonormal basis for the column space of $A^{(i)}$ ($A^{(j)}$), which can be obtained from an SVD of $A^{(i)}$ ($A^{(j)}$). The SVD of $O_i^T O_j$, $Y^T (O_i^T O_j) Z = \mathrm{diag}(\sigma_1, \ldots, \sigma_q)$, yields the principal vectors

$$[u_1, \ldots, u_q] = O_i Y \tag{B.6}$$

$$[v_1, \ldots, v_q] = O_j Z \tag{B.7}$$

with the principal angles as the arc-cosines of the singular values [27].

Lemma B.2 Let $\{u_1, \ldots, u_L\}$ and $\{v_1, \ldots, v_L\}$ be the principal vectors with angles $0 < \theta_1 \le \theta_2 \le \cdots \le \theta_L$ between the column spaces of two separable data matrices, $A^{(i)}$ and $A^{(j)}$. A set of orthonormal eigenvectors of $P^{(i)} - P^{(j)}$ associated with nonzero eigenvalues can be written as

$$e_l^{(+)} = \frac{1}{\sqrt{2}} \left( \frac{u_l + v_l}{\|u_l + v_l\|} + \frac{u_l - v_l}{\|u_l - v_l\|} \right) \tag{B.8}$$

$$e_l^{(-)} = \frac{1}{\sqrt{2}} \left( \frac{u_l + v_l}{\|u_l + v_l\|} - \frac{u_l - v_l}{\|u_l - v_l\|} \right) \tag{B.9}$$

where $l \in \{1, \ldots, L\}$.
Proof: The denominators of (B.8) and (B.9) can be written in terms of principal angles.¹

$$\|u_l \pm v_l\| = \sqrt{2 \pm 2\cos\theta_l} \tag{B.10}$$

The product of $P^{(i)} - P^{(j)}$ with the given vectors $e_l^{(+)}$ and $e_l^{(-)}$ can be found by multiplication and simplification:

$$\left(P^{(i)} - P^{(j)}\right) e_l^{(\pm)} = \frac{1}{\sqrt{2}} \left[ (u_l - v_l)\,\frac{\sqrt{1 + \cos\theta_l}}{\sqrt{2}} \pm (u_l + v_l)\,\frac{\sqrt{1 - \cos\theta_l}}{\sqrt{2}} \right] \tag{B.11}$$

$$= \pm \sin\theta_l \, e_l^{(\pm)} \tag{B.12}$$

Thus, $e_l^{(+)}$ and $e_l^{(-)}$ are eigenvectors of $P^{(i)} - P^{(j)}$. ■

¹ For conciseness, we use $\pm$ and $\mp$ to express two equations in one using the standard convention; e.g., $x = y \mp m$ is short for $x = y - m$ and $x = y + m$.

Corollary B.1 If $\lambda_l$ is an eigenvalue of $P^{(i)} - P^{(j)}$ corresponding to the eigenvector $e_l^{(+)}$, then $-\lambda_l$ is also an eigenvalue of $P^{(i)} - P^{(j)}$, corresponding to the eigenvector $e_l^{(-)}$.

Lemma B.3 Let $\{u_1, u_2, \ldots, u_L\}$ and $\{v_1, v_2, \ldots, v_L\}$ be the principal vectors with principal angles $0 < \theta_1 \le \theta_2 \le \cdots \le \theta_L$ between the column spaces of two separable data matrices $A^{(i)}$ and $A^{(j)}$, and let $\{e_l^{(+)}, e_l^{(-)}\}_{l=1}^{L}$ be the orthonormal eigenvectors of $P^{(i)} - P^{(j)}$. Then the principal vectors $\{u_l\}_{l=1}^{L}$ and $\{v_l\}_{l=1}^{L}$ can be expressed in terms of the eigenvectors of $P^{(i)} - P^{(j)}$:

$$u_l = \tfrac{1}{2}\left(\sqrt{1 + \cos\theta_l} + \sqrt{1 - \cos\theta_l}\right) e_l^{(+)} + \tfrac{1}{2}\left(\sqrt{1 + \cos\theta_l} - \sqrt{1 - \cos\theta_l}\right) e_l^{(-)} \tag{B.13}$$

$$v_l = \tfrac{1}{2}\left(\sqrt{1 + \cos\theta_l} - \sqrt{1 - \cos\theta_l}\right) e_l^{(+)} + \tfrac{1}{2}\left(\sqrt{1 + \cos\theta_l} + \sqrt{1 - \cos\theta_l}\right) e_l^{(-)} \tag{B.14}$$

Proof: Solving (B.8) and (B.9) for $u_l$ and $v_l$ yields (B.13) and (B.14). ■

Theorem B.4 If the transmitted sequence is $A^{(i)}$, the channel is $f$, and the AWGN variance is $\sigma_w^2$, then $r$ can be expressed in terms of $2L$ independent non-central chi-square distributed random variables $\{r_l^{(+)}, r_l^{(-)}\}_{l=1}^{L}$ as

$$r = \sum_{l=1}^{L} \left( r_l^{(+)} - r_l^{(-)} \right) \tag{B.15}$$

with means and variances

$$m_{r_l^{(\pm)}} = \tfrac{1}{2} c_l^2 (1 \pm \sin\theta_l) \sin\theta_l + \sigma_w^2 \sin\theta_l \tag{B.16}$$

$$\sigma^2_{r_l^{(\pm)}} = 2\sigma_w^4 \sin^2\theta_l + 2(\sigma_w \sin\theta_l \, c_l)^2 (1 \pm \sin\theta_l) \tag{B.17}$$

where $\{c_l\}_{l=1}^{L}$ are the coordinates of the vector $A^{(i)} f$ with respect to the basis $\{u_l\}_{l=1}^{L}$.
Proof: Using the spectral decomposition of $P^{(i)} - P^{(j)}$, (B.1) can be written as

$$r = z_k^T \left[ \sum_{l=1}^{L} \left( \sin\theta_l \, e_l^{(+)} [e_l^{(+)}]^T - \sin\theta_l \, e_l^{(-)} [e_l^{(-)}]^T \right) \right] z_k \tag{B.18}$$

$$= \sum_{l=1}^{L} \left( \sin\theta_l \, \|z_k^T e_l^{(+)}\|^2 - \sin\theta_l \, \|z_k^T e_l^{(-)}\|^2 \right) \tag{B.19}$$

$$= \sum_{l=1}^{L} \left( [y_l^{(+)}]^2 - [y_l^{(-)}]^2 \right) \tag{B.20}$$

where $y_l^{(\pm)} = \sqrt{\sin\theta_l}\, z_k^T e_l^{(\pm)}$. Since $z_k$ is Gaussian and $\{e_l^{(+)}, e_l^{(-)}\}_{l=1}^{L}$ is an orthonormal set, the $\{y_l^{(\pm)}\}$ are independent Gaussian random variables with means $c_l \sqrt{\tfrac{1}{2}\sin\theta_l (1 \pm \sin\theta_l)}$ and variance $\sigma_w^2 \sin\theta_l$. Therefore the $\{r_l^{(\pm)} = [y_l^{(\pm)}]^2\}$ are chi-square distributed with means and variances given by (B.16) and (B.17), respectively. ■

Corollary B.2 The conditional pdf of $r$ can be expressed as

$$f_r(r \mid A^{(i)}; f) = f_{r_1^{(+)}}(r \mid A^{(i)}; f) * f_{r_1^{(-)}}(-r \mid A^{(i)}; f) * \cdots * f_{r_L^{(+)}}(r \mid A^{(i)}; f) * f_{r_L^{(-)}}(-r \mid A^{(i)}; f) \tag{B.21}$$

with the $2L$ densities defined by

$$f_{r_l^{(\pm)}}(r \mid A^{(i)}; f) = \frac{1}{\sqrt{2\pi r \sigma_w^2 \sin\theta_l}} \exp\left( -\frac{r + \tfrac{1}{2} c_l^2 (1 \pm \sin\theta_l)\sin\theta_l}{2\sigma_w^2 \sin\theta_l} \right) \cosh\left( \frac{\sqrt{\tfrac{1}{2}\, r\, c_l^2 (1 \pm \sin\theta_l)\sin\theta_l}}{\sigma_w^2 \sin\theta_l} \right), \quad r > 0 \tag{B.22}$$

Proof: Follows directly from the fact that $\{r_l^{(+)}, r_l^{(-)}\}_{l=1}^{L}$ is a set of independent chi-square random variables and the density of $-r_l^{(-)}$ is $f_{r_l^{(-)}}(-r)$. Thus $r$ is the sum of independent random variables, each of which is either chi-square distributed or the negative of a chi-square distributed variable, with means and variances given in (B.16) and (B.17). ■

Corollary B.3 The conditional mean and the variance of $r$ are as stated in (2.4) and (2.5), respectively.

Proof: Follows directly from the fact that $r$ is the sum of independent random variables with densities given in (B.22). ■

Corollary B.4 $m_r(\cdot)/\sigma_r(\cdot)$ is a monotonically increasing function of $m_r(\cdot)$.

Proof: Note that, with the principal angles known, $\sigma_r(\cdot)$ is a function of $m_r(\cdot)$ as implied by (2.5). Using this expression, it is straightforward to verify that the derivative of $m_r(\cdot)/\sigma_r(\cdot)$ with respect to $m_r(\cdot)$ is always positive. ■
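The eigenstructure in Lemma B.2 and the noiseless part of the mean in Theorem B.4 can be checked numerically in the simplest case. The sketch below is only an illustration under stated assumptions: it takes $L = 1$ in $\mathbb{R}^2$, so each projector is rank one, and the angle `theta`, the coordinate `c`, and all helper names are hypothetical choices, not values from this appendix.

```python
import math

def projector(u):
    """Rank-one orthogonal projector u u^T onto span{u} (u is a unit 2-vector)."""
    return [[u[0] * u[0], u[0] * u[1]],
            [u[1] * u[0], u[1] * u[1]]]

def sym_eigs_2x2(m):
    """Eigenvalues (largest first) of a symmetric 2x2 matrix, in closed form."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return mean + radius, mean - radius

def unit(w):
    """Normalize a 2-vector to unit length."""
    n = math.hypot(w[0], w[1])
    return (w[0] / n, w[1] / n)

theta = math.radians(35.0)                # assumed principal angle
u = (1.0, 0.0)                            # principal vector of the first subspace
v = (math.cos(theta), math.sin(theta))    # principal vector of the second subspace

P_i, P_j = projector(u), projector(v)
D = [[P_i[r][c] - P_j[r][c] for c in range(2)] for r in range(2)]

# Lemma B.2 / Corollary B.1: the nonzero eigenvalues are +sin(theta) and -sin(theta).
lam_plus, lam_minus = sym_eigs_2x2(D)

# Eigenvector e^(+) built as in (B.8), then multiplied by P_i - P_j.
sum_dir = unit((u[0] + v[0], u[1] + v[1]))
diff_dir = unit((u[0] - v[0], u[1] - v[1]))
e_plus = tuple((s + d) / math.sqrt(2.0) for s, d in zip(sum_dir, diff_dir))
D_e_plus = tuple(D[r][0] * e_plus[0] + D[r][1] * e_plus[1] for r in range(2))

# Noiseless statistic: with A f = c * u, the quadratic form equals c^2 sin^2(theta),
# which is the difference of the two means in (B.16) for L = 1.
c = 1.7                                   # assumed coordinate of A f along u
s = (c * u[0], c * u[1])
noiseless_r = sum(s[r] * D[r][k] * s[k] for r in range(2) for k in range(2))
```

For larger $L$ the same check would need a general SVD routine for (B.6) and (B.7); the closed-form 2x2 eigenvalue formula is used here only to keep the example dependency-free.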
Corollary B.5 For a unit norm channel, the mean in (2.4) is bounded as

$$\sigma^2_{\min,A_k^{(i)}} \sin^2\theta_1 \le \sum_{l=1}^{L} c_l^2 \sin^2\theta_l \le \sigma^2_{\max,A_k^{(i)}} \sin^2\theta_L \tag{B.23}$$

where $\sigma_{\min,A_k^{(i)}}$ and $\sigma_{\max,A_k^{(i)}}$ are the minimum and the maximum singular values of $A_k^{(i)}$, respectively.

Proof: It follows from (B.3) that $\sin\theta_1 \le \sin\theta_l \le \sin\theta_L$. Also, for a unit norm $f$, $\sigma^2_{\min,A_k^{(i)}} \le \|A_k^{(i)} f\|^2 \le \sigma^2_{\max,A_k^{(i)}}$ and $\|A_k^{(i)} f\|^2 = \sum_{l=1}^{L} c_l^2$. ■

Finally, we note that, in [44], the characteristic function of $r$ is given as

$$\Psi_r(jv) = E\{\exp(jvr)\} = \left| I - 2jv\sigma_w^2 \left(P^{(i)} - P^{(j)}\right) \right|^{-1/2} \exp\left\{ -\frac{1}{2\sigma_w^2}\, s_k^T \left[ I - \left( I - 2jv\sigma_w^2 \left(P^{(i)} - P^{(j)}\right) \right)^{-1} \right] s_k \right\} \tag{B.24}$$

where $s_k = A^{(i)} f$.

Appendix C

An Algorithm for Code Design (Clique Problem)

The general class of clique problems is known to be NP-hard [14]. However, we present an optimal algorithm which enumerates all cliques in an efficient manner when the maximum number of edges for any vertex is much smaller than $|V|$. We describe this algorithm via the example in Fig. C.1. The graph shown in Fig. C.1 has $|V| = 16$ with a maximum number of edges for a given node of 7. A maximum clique is shown with solid lines and all other edges are shown dashed. The algorithm proceeds as follows:

1. Label the vertices of the graph from 0 to $|V| - 1$.

2. Let each vertex define the root of a tree. Construct the tree associated with the vertex labeled $v$ as follows. The children of the root $v$ are all vertices with labels $v'$ such that there is an edge between $v$ and $v'$, and $v < v'$. For example, in Fig. C.1, the tree with root label 3 has no child corresponding to label 2 because 3 > 2. The edge between 2 and 3 has been represented in the tree rooted at label 2.

3. To grow each tree from the root level zero to level one, consider the child of $v$ with smallest label, say $v'$. The children of this node are determined by the intersection
of all other children of root $v$ with larger label than $v'$ and the children of the tree rooted at $v'$. For example, in Fig. C.1, the children of 5 under the tree rooted at 2 are determined by taking the intersection of {6, 7, 8, 9, 10} and the children of the root labeled 5, namely {6, 8, 9, 10}; yielding {6, 8, 9, 10}. The children of 6 under the root labeled 2 are determined by taking the intersection of {7, 8, 9, 10} and the children of the root 6, namely {8, 10, 11, 12}; yielding {8, 10}.

4. At level $d > 0$, the tree is extended by the same process, except the set used for intersection can be simplified. Specifically, for any of the $|V|$ rooted trees, consider extending the vertex $v$ at level $d$ which has parent $P(v)$ at level $d - 1$. The intersection of two sets provides the children of $v$. The first set is all vertices $v'$ at level $d$ in the same rooted tree with the same parent node $P(v)$ such that $v < v'$. The second set is the children of the vertex labeled $v$ at level $d - 1$ in the same rooted tree. For example, the children of vertex 6 at level 2 in the tree rooted by vertex 2 are obtained by intersecting {8, 9, 10} (set one) with the children of node 6 at level 1 in the same rooted tree, namely {8, 10}. Note that one could also use the children of 6 at level one of the tree rooted by vertex 6 for the second set, but this is less efficient.

5. This process is repeated on each of the rooted trees until it terminates. The largest clique is determined by the deepest level obtained in one of the trees. Notice that all cliques are enumerated in the collection of trees. In the example, two cliques of size 5 exist: {2, 5, 6, 8, 10} from the root 2 and {4, 7, 10, 11, 12} from the root 4.

The correctness of this algorithm can be formally proven.
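The tree-growing steps above can be sketched in a few lines. This is a minimal illustration, not the author's implementation: it uses the step 3 rule at every level (intersecting the larger-labeled siblings with the root-level children of the vertex, which the text notes is valid but less efficient than the step 4 refinement), and the small graph and the names `enumerate_cliques` and `edges` are hypothetical and unrelated to Fig. C.1.

```python
def enumerate_cliques(adj):
    """Grow one tree per vertex and return every root-to-leaf path as a clique.

    adj maps each vertex label to the set of its neighbours (undirected graph).
    """
    # Step 2: the children of root v are its neighbours with a larger label.
    larger = {v: {w for w in nbrs if w > v} for v, nbrs in adj.items()}
    cliques = []

    def grow(path, candidates):
        if not candidates:               # leaf reached: the path is a clique
            cliques.append(tuple(path))
            return
        for w in sorted(candidates):
            # Steps 3-4: larger-labeled siblings that are also adjacent to w.
            children = {x for x in candidates if x > w} & larger[w]
            grow(path + [w], children)

    for root in sorted(adj):
        grow([root], larger[root])
    return cliques

# Hypothetical 6-vertex graph containing the 4-clique {1, 2, 4, 5}.
edges = [(0, 1), (0, 2), (1, 2), (1, 4), (1, 5), (2, 4), (2, 5), (4, 5), (3, 4)]
adj = {v: set() for v in range(6)}
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)

cliques = enumerate_cliques(adj)
largest = max(cliques, key=len)          # deepest path over all trees (step 5)
```

Because every vertex on a path is adjacent to all of its ancestors, each path is a clique, and the deepest path gives a maximum clique, mirroring step 5.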
Also, the running time of this algorithm can be shown to be upper bounded as exponential in $N_e$, the maximum number of edges associated with any vertex, and polynomial in $|V|$. By comparison, an exhaustive search must check every size-$n$ subset of vertices to determine whether a clique of size $n$ exists. However, even though the complexity relative to exhaustive search is small, when $N_e$ and/or $|V|$ becomes large, this algorithm becomes prohibitively complex.

Figure C.1: Example of the execution of the clique algorithm. One of the maximum cliques of the graph (in circle) is emphasized by non-dashed edges.

C.1 Greedy Heuristic for Code Design

We propose the following suboptimal algorithm when the above algorithm becomes prohibitively complex. Construct the set of $|V|$ trees as described above. At each level, grow only the tree with the largest number of children. For example, in Fig. C.1, the tree rooted by 2 would be selected to be extended to level 1 because it has 7 children while no other root has more than 5 children. Only the vertex labeled 5 under the tree rooted at 2 would be extended to level two because it has 4 children while no other node at level two under the tree rooted at 2 has more than two children. When considering the next level there is a tie condition because both vertices 6 and 8 at level 2 have two children. In our heuristic an arbitrary tie breaker is used. If this breaks the tie in favor of 6, then the optimal solution will be found by this heuristic. However, if the tie-breaker selects 8 over 6 at level 2, then two cliques of size 4 are found (suboptimal), i.e., {2, 5, 8, 9} and {2, 5, 8, 10}.
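The greedy heuristic can be sketched as follows. This is an illustrative reading of the description above, with two labeled assumptions: ties are broken in favor of the smallest label (the text only says the tie-breaker is arbitrary), and a single clique is returned rather than every leaf under the last extended node. The graph and the names `greedy_clique` and `children_of` are hypothetical.

```python
def children_of(w, candidates, larger):
    """Children of w: larger-labeled siblings that are also adjacent to w."""
    return {x for x in candidates if x > w} & larger[w]

def greedy_clique(adj):
    """Follow, at each level, only the vertex with the most children."""
    larger = {v: {w for w in nbrs if w > v} for v, nbrs in adj.items()}
    # Level 0: choose the root with the most children (ties: smallest label).
    root = min(adj, key=lambda v: (-len(larger[v]), v))
    path, candidates = [root], larger[root]
    while candidates:
        best = min(candidates,
                   key=lambda w: (-len(children_of(w, candidates, larger)), w))
        path.append(best)
        candidates = children_of(best, candidates, larger)
    return tuple(path)

# Hypothetical 6-vertex graph containing the 4-clique {1, 2, 4, 5}.
edges = [(0, 1), (0, 2), (1, 2), (1, 4), (1, 5), (2, 4), (2, 5), (4, 5), (3, 4)]
adj = {v: set() for v in range(6)}
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)

found = greedy_clique(adj)
```

On this small graph the heuristic happens to reach the maximum clique; as the Fig. C.1 discussion shows, a different tie-break or graph can leave it at a smaller one.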
Appendix D

Parallel Implementation of DFE and Viterbi Algorithm

Recursions in algorithms prevent successive data from being processed independently and thus impede high-throughput implementations. Since the Viterbi algorithm has a recursive structure, parallel implementations of it have long been an area of interest for researchers [19, 32, 34]. Semiring algebra has been used to resolve the add-compare-select bottleneck in the Viterbi algorithm. We present in this section how the Viterbi and DFE algorithms can be transformed into a matrix multiplication problem by using the semiring structure. The goal here is not to provide a full-fledged architecture definition but to describe a general framework which enables the parallel implementation of both Viterbi and DFE. Further improvements can be achieved by using pipelining, efficient sparse matrix multiplication algorithms, etc.

A semiring is defined by two operations, $\oplus$ and $\otimes$, over a set $R$ with the following axioms:

1. $(R, \oplus)$ is an abelian monoid
2. $(R, \otimes)$ is a monoid
3. $\forall x, y, z \in R$: $x \otimes (y \oplus z) = (x \otimes y) \oplus (x \otimes z)$

A semiring $(R, \oplus, \otimes)$ can easily be extended to an $(M \times M)$-dimensional space, i.e., $(R^{M \times M}, \oplus, \otimes)$. For example, square matrices with conventional addition and multiplication over the real numbers constitute a semiring.

Now that we are equipped with the necessary tool, we can determine the operations and the set that we are going to work on to map the DFE algorithm into matrix multiplication. Figure D.1 illustrates the state representations of the DFE. Given the FFE output, $x_f[k]$, and previous decisions $d_k = [d_{k-1}\ d_{k-2}]$, the DFE generates just one decision that complies with Equation 1.2.1. Let's assign a distance 1 to the state transition path which generates this decision and 0 to the other.
For example, with DFE coefficients FBF = [−0.8 −0.3] and decision threshold $d_{th} = -0.1$, $F(x_f[k] = 0.9,\ d_k = [-1\ -1])$ generates 1, so $\Delta([-1\ -1] \to [-1\ -1]) = 0$ and $\Delta([-1\ -1] \to [1\ -1]) = 1$. We construct the state transition matrix $A_k$ with $(i,j)$th entry being $\Delta(d_k(i) \to d_{k+1}(j))$. Note that in each row of $A_k$ there is just one '1' and all the rest are '0'. Once the semiring is chosen, we can find what the decision is going to be at time $k+N$, given the state of the DFE at time $k$, by simply multiplying the state transition matrices $A_i$, where $i$ runs from $k$ to $k+N$. Let's take the example given in Figure D.1:

$$A_k A_{k+1} = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \end{bmatrix} \times \begin{bmatrix} 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix} \tag{D.1}$$

where the resultant product in (D.1) gives us the decisions at time $k+1$ conditioned on the state at time $k$. Given, for instance, that $d_k(4)$ is the correct state, the decision is found by checking the column index of the 1 in the fourth row of the $A_k A_{k+1}$ product. If this index is less than three then $d_{k+1}$ is −1, otherwise 1. In general, the decoding process can be formulated as

$$d_{k+N} \mid d_k(j) = \Theta\big( [\,\underbrace{0 \cdots 0}_{j-1\ \text{zeros}}\ 1\ 0 \cdots 0\,]\, A_k \cdots A_{k+N} \big) \tag{D.2}$$

where the $\Theta$ function is defined as

$$\Theta\big( [\,\underbrace{0 \cdots 0}_{i-1\ \text{zeros}}\ 1\ 0 \cdots 0\,] \big) = \begin{cases} -1, & i < M/2 + 1 \\ 1, & i \ge M/2 + 1 \end{cases} \tag{D.3}$$

Since there is no recursion in the computation of the state transition matrices, for a given block of received data the state transition matrices can be computed in one clock cycle in a parallel fashion. In what follows we describe how a block of $2^N + 1$ data can be decoded in $N + 1$ steps, given the $P = 2^N$ state transition matrices.

1. $A_1, \ldots, A_P$ are computed, $d_0$ is given and $d_1$ is decoded.

2. $A_1A_2,\ A_3A_4,\ A_5A_6,\ \ldots,\ A_{P-1}A_P$ are computed, $d_2$ is decoded.

3. $(A_1A_2)A_3,\ (A_1A_2)(A_3A_4),\ (A_5A_6)A_7,\ (A_5A_6)(A_7A_8),\ \ldots,\ (A_{P-3}A_{P-2})A_{P-1},\ (A_{P-3}A_{P-2})(A_{P-1}A_P)$ are computed, $d_3$ and $d_4$ are decoded.

4. ...
($N+1$). $(A_1A_2 \cdots A_{P/2})A_{P/2+1},\ (A_1A_2 \cdots A_{P/2})A_{P/2+1}A_{P/2+2},\ (A_1A_2 \cdots A_{P/2})A_{P/2+1}A_{P/2+2}A_{P/2+3},\ \ldots$ are computed and $d_{P/2+1}, \ldots, d_P$ are decoded.

Figure D.1: Trellis Diagram for DFE.

At the end of $(N + 1)$ steps, all decoding processes are completed. Except for the first step, the algorithm given above is regular in the sense of the number of operations in each step.

Similarly, the Viterbi algorithm can be mapped into matrix multiplication with the help of the $(R^{M \times M}, \min, +)$ semiring. The recursion in the Viterbi algorithm is due to the add-compare-select operation. As in the DFE, the transition matrix computation is a feedforward process and thus can be easily implemented in parallel. For a fixed-interval Viterbi implementation, forward and backward state transition matrix multiplications can be done in parallel and, at the final stage, can be combined to generate either the digital decisions or the soft quantities of decisions, state transitions or states. Note that no trace-back operation is necessary for the forward-backward algorithm.

For both the DFE and the VA, the impact of the history on the current state fades with time. Therefore, each block can be processed without waiting for the information from previous blocks. In order for the independent processing not to degrade the performance, some redundant data should be added to the decoding process. Let's say we are decoding a block of data from $x_k$ to $x_{k+N}$ using the DFE and assume that initial state information is not available. If we use a sufficient number of previous data, i.e., $x_{k-l}, \ldots, x_{k-1}$, the multiplication of their state transition matrices gives a matrix where all the '1' entries appear in the same column; in other words, all rows become identical (and hence linearly dependent).
Therefore it doesn't matter which row we choose to initialize the decoding process. The argument also holds for the VA. Multiplication of the state transition matrices $A_{k-l}, \ldots, A_{k-1}$ for the VA leads to a matrix in which each row differs from any other row by some fixed value. Hence each row provides the same soft information.
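The matrix formulation of this appendix can be illustrated with a short sketch. It is written under stated assumptions: conventional $(+, \times)$ arithmetic is used for the one-hot DFE matrices (the appendix does not name the DFE semiring here), the prefix products use a generic recursive-doubling schedule rather than the exact step list above, and the function names and the 2x2 min-plus matrices are hypothetical. The matrices `A_k` and `A_k1` are those of (D.1).

```python
def mat_mul(a, b, add, mul, zero):
    """Matrix product over a semiring with operations (add, mul) and additive identity zero."""
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[zero] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = zero
            for t in range(inner):
                acc = add(acc, mul(a[r][t], b[t][c]))
            out[r][c] = acc
    return out

def dfe_mul(a, b):
    """Product of one-hot DFE transition matrices (assumed conventional +, x)."""
    return mat_mul(a, b, lambda x, y: x + y, lambda x, y: x * y, 0)

def min_plus_mul(a, b):
    """Product in the (min, +) semiring used for the Viterbi algorithm."""
    return mat_mul(a, b, min, lambda x, y: x + y, float("inf"))

def theta_fn(row):
    """(D.3): map the column index of the single 1 in a row to a decision."""
    i = row.index(1) + 1                     # 1-based column index
    return -1 if i < len(row) // 2 + 1 else 1

def prefix_products(mats, mul):
    """All products A_1, A_1 A_2, ..., A_1 ... A_P in about log2(P) rounds."""
    pref = list(mats)
    step = 1
    while step < len(pref):
        # Each round reads only the previous round, so its products can run in parallel.
        pref = [pref[i] if i < step else mul(pref[i - step], pref[i])
                for i in range(len(pref))]
        step *= 2
    return pref

A_k = [[0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 0, 1], [0, 1, 0, 0]]
A_k1 = [[1, 0, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0]]

product = dfe_mul(A_k, A_k1)                 # right-hand side of (D.1)
decision = theta_fn(product[3])              # state d_k(4): fourth row of the product

blocks = [A_k, A_k1, A_k, A_k1]
prefixes = prefix_products(blocks, dfe_mul)

# Hypothetical 2-state branch-metric matrices for the (min, +) case.
metrics = min_plus_mul([[0, 2], [1, 0]], [[0, 5], [3, 0]])
```

Because the matrix product is associative in either semiring, the prefix products can be regrouped freely, which is what allows the logarithmic number of rounds.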
Asset Metadata
Creator
Coskun, Orhan (author)
Core Title
Joint data detection and parameter estimation: Fundamental limits and applications to optical fiber communications
Degree
Doctor of Philosophy
Degree Program
Electrical Engineering
Publisher
University of Southern California
(original),
University of Southern California. Libraries
(digital)
Tag
engineering, electronics and electrical,OAI-PMH Harvest,physics, optics
Language
English
Contributor
Digitized by ProQuest
(provenance)
Advisor
Chugg, Keith (
committee chair
), Baxendale, Peter (
committee member
)
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c16-508496
Unique identifier
UC11335965
Identifier
3140461.pdf (filename),usctheses-c16-508496 (legacy record id)
Legacy Identifier
3140461.pdf
Dmrecord
508496
Document Type
Dissertation
Rights
Coskun, Orhan
Type
texts
Source
University of Southern California
(contributing entity),
University of Southern California Dissertations and Theses
(collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the au...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus, Los Angeles, California 90089, USA