A CMOS frequency channelized receiver for serial-links
A CMOS FREQUENCY CHANNELIZED RECEIVER FOR SERIAL-LINKS
by Kyongsu Lee
A Dissertation Presented to the FACULTY OF THE GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (ELECTRICAL ENGINEERING)
December 2005
Copyright 2005 Kyongsu Lee

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

UMI Number: 3220120
Copyright 2005 by Lee, Kyongsu. All rights reserved.

INFORMATION TO USERS
The quality of this reproduction is dependent upon the quality of the copy submitted. Broken or indistinct print, colored or poor quality illustrations and photographs, print bleed-through, substandard margins, and improper alignment can adversely affect reproduction. In the unlikely event that the author did not send a complete manuscript and there are missing pages, these will be noted. Also, if unauthorized copyright material had to be removed, a note will indicate the deletion.

UMI Microform 3220120
Copyright 2006 by ProQuest Information and Learning Company. All rights reserved. This microform edition is protected against unauthorized copying under Title 17, United States Code.
ProQuest Information and Learning Company, 300 North Zeeb Road, P.O. Box 1346, Ann Arbor, MI 48106-1346

Acknowledgements

My most earnest acknowledgment must go to my academic advisor, Professor Won Namgoong, who gave me his endless support and guidance throughout this research. In every sense, none of this work would have been possible without his technical expertise, insight, and high standards in research.

I am grateful to my academic guidance committee members, Professor John Choma, Professor Roger Zimmermann, Professor Peter A. Beerel, and Professor Eunsok Kim, for their unconditional availability and valuable advice at my qualifying and defense examinations.
I would especially like to thank Professor John Choma for his insightful lectures and generous discussions, which became a great source of knowledge in this research.

I would like to express my special appreciation to Lei Feng, who assisted this research by developing signal processing algorithms, for his constant support. My colleagues Shaomin, Liang, Sotrios, Ali, and Gongrit also deserve special recognition for their technical discussions and unconditional support.

I also owe a great debt of gratitude to my friends, Dr. Pansop Kim, Youngsoo, Younggil, Dooil, Hyungjin, Professor Jaeyoon Sim, Professor Jaewan Kwon, Woohyun, and Janghyuk, who made my life at USC much easier.

My final and most heartfelt acknowledgment goes to my family for their constant patience, encouragement, and support during my study.

Table of Contents

Acknowledgements
List of Tables
List of Figures
Abstract
Chapter 1 Introduction
Chapter 2 Background
  2.1 High-Speed Electrical Signaling
  2.2 Limitations of Electrical Signaling
    2.2.1 Practical Signal Limitations
    2.2.2 Channel Medium Limitations
  2.3 Conventional Time-interleaved Receiver
  2.4 Channel Equalization
  2.5 Summary
Chapter 3 Receiver Design
  3.1 Frequency Channelized Receiver Architecture
  3.2 Comparison between FCR and TIR
  3.3 Frequency Channelization Circuit
  3.4 Analog-to-Digital Conversion Circuit
  3.5 Digital Signal Processing
  3.6 Summary
Chapter 4 Clock and Frequency Generation
  4.1 Local Oscillator Frequency Generation
  4.2 On-chip Clock Generation
  4.3 Summary
Chapter 5 Experimental Results
  5.1 Receiver Chip, Package and Board
  5.2 Measurement Results
  5.3 Receiver Performance
  5.4 Summary
Chapter 6 Conclusion
  6.1 Future works
Bibliography
Appendices
  A. Shunt-shunt feedback amplifier circuit analysis

List of Tables

Table 3.1: Performance comparison in various wideband amplifiers
Table 3.2: Performance comparison between two frequency channelizers
Table 5.1: Power consumed by each receiver block

List of Figures

Figure 2.1: A high-speed low-swing signaling
Figure 2.2: A practical high-speed signaling environment
Figure 2.3: Frequency response of package parasitics
Figure 2.4: An eye diagram showing ISI and jitter effect
Figure 2.5: A lumped-parameter model of channel medium
Figure 2.6: Frequency dependence of channel loss
Figure 2.7: Transceiver with 1:N parallelism
Figure 2.8: 1:N demultiplexer driven by N consecutive clock phases
Figure 2.9: Multi-bit ADCs with unit-gain sample/hold amplifier
Figure 2.10: Input bandwidth as a function of ADC bits
Figure 2.11: An example of 4-PAM eye diagram
Figure 2.12: A communication transceiver with equalizer
Figure 2.13: Channel equalization
Figure 2.14: The effect of transmitted waveform with pre-emphasis
Figure 2.15: Decision-feedback equalizer (DFE)
Figure 2.16: Single-carrier frequency-domain equalization system
Figure 2.17: Block processing of data sequence
Figure 3.1: Hybrid filter bank (HFB) with M subband ADCs
Figure 3.2: An example of FCR with time-domain equalization
Figure 3.3: (a) Proposed receiver block diagram, (b) Frequency relationship
Figure 3.4: (a) Proposed 3-subband receiver, (b) The frequency relationship
Figure 3.5: The 3rd LO harmonic contamination in baseband
Figure 3.6: (a) Maximum input capacitance to support 10 Gsymbols/s. (b) Typical high-speed S/H circuit
Figure 3.7: S/H circuit bandwidth requirement in FCR
Figure 3.8: (a) Received signal with ISI. Slicer input of (b) conventional receiver and (c) 3-subband FCR
Figure 3.9: The jitter effect on performance of two receivers
Figure 3.10: ADC bits and channel ISI effect
Figure 3.11: Quadrature mixing at the receiver front-end
Figure 3.12: Equivalent block diagram of passive double-balanced mixer
Figure 3.13: Simulation result for mixer bandwidth vs. gain
Figure 3.14: Signal amplification at the front-end
Figure 3.15: A shunt-shunt feedback (a) topology and (b) frequency response
Figure 3.16: Cascaded shunt-shunt feedback amplifier
Figure 3.17: A shunt-shunt feedback (a) topology with gain control and (b) frequency response
Figure 3.18: The effect of LPF bandwidth on the signal division
Figure 3.19: LPF order and bandwidth effect
Figure 3.20: A frequency channelization circuit block
Figure 3.21: LPF unit cell topologies with (a) shunt-shunt feedback and (b) differential pair
Figure 3.22: Wideband buffer circuits (a) source follower, (b) shunt-peaking, and (c) active shunt-peaking
Figure 3.23: (a) Capacitive degeneration amplifier and (b) modified Cherry-Hooper amplifier
Figure 3.24: (a) Half circuit of capacitive degeneration and (b) its frequency response
Figure 3.25: Bandwidth of various wideband amplifiers according to ADC bits
Figure 3.26: Two frequency channelization schemes
Figure 3.27: Two time-interleaved ADC in the 2nd subband
Figure 3.28: 3-bit differential comparators
Figure 3.29: Simulation result to determine comparator input dynamic range
Figure 3.30: Positive regenerative amplifier
Figure 3.31: (a) DNL definition and (b) ADC dynamic range simulation result
Figure 3.32: (a) 3-bit thermometer-to-Gray code mapping and (b) look-up table conversion block
Figure 3.33: 2:1 transmit multiplexer
Figure 3.34: (a) Cyclic-prefix coding and (b) demodulation (single-channel)
Figure 3.35: DSP structure of 3-subband channelized receiver
Figure 4.1: (a) Proposed receiver with 3-subband, (b) Frequency and clock generation scheme
Figure 4.2: Poly-phase generation steps
Figure 4.3: A two-stage poly-phase filter with output buffer
Figure 4.4: A frequency doubler based on PLL
Figure 4.5: Frequency doubler model
Figure 4.6: Bode plot of loop transfer function
Figure 4.7: PLL loop stability simulation result
Figure 4.8: Transient response of PLL control voltage
Figure 4.9: Differential injection-locked frequency divider
Figure 4.10: Injection-locked oscillator model
Figure 5.1: Receiver chip photo
Figure 5.2: On-chip parasitic capacitances along the high-speed signal path
Figure 5.3: The implementation of termination resistance with unsalicided poly resistors
Figure 5.4: Passive double-balanced mixer (a) schematic and (b) layout
Figure 5.5: Supply and ground bouncing due to package inductances
Figure 5.6: Guard rings to isolate sensitive analog blocks
Figure 5.7: Bonding wire configuration for high-speed inputs and outputs
Figure 5.8: Board trace from package to SMA connector
Figure 5.9: Data generation and acquisition
Figure 5.10: Measured energy spectrum of transmitted data at 10 Gsymbols/s
Figure 5.11: Measurement setup for LPF frequency response
Figure 5.12: (a) Frequency response at each subband, (b) LPF gain control
Figure 5.13: Measured 1 dB compression points for the 1st and 2nd subbands
Figure 5.14: The measurement setup for SNDR
Figure 5.15: Curve-fitted recovered ADC output at 500 MHz
Figure 5.16: (a) The injection of coupling noise by phase mismatch, (b) Measured SNDR at each subband
Figure 5.17: Jitter measurement setup
Figure 5.18: Measured jitter performance
Figure 5.19: Performance testing scheme
Figure 5.20: (a) Measured transmitted signal at 10 Gsymbols/s. (b) Slicer input after equalization
Figure 5.21: ADC ideal Figure-of-Merit (FOM) comparison
Figure A.1: Equivalent circuit of shunt-shunt feedback

Abstract

The endless desire to integrate more systems on a chip demands higher chip-to-chip transmission bandwidth through cost-efficient serial links. In modern multi-giga links, the achievable bandwidth per channel is limited by the filtering effect of the off-chip environment, such as wire losses in the channel medium and package parasitics. More digital communication techniques such as coding, equalization, and multilevel modulation need to be employed to increase detection accuracy. However, the lack of available multi-bit ADCs has been limiting their use.
To achieve multi-bit data conversion at the signal Nyquist rate, a novel receiver architecture that decomposes the received wideband signal into multiple frequency subbands before digitizing is proposed. Compared to the conventional time-interleaved receiver (TIR), this frequency channelized receiver (FCR) enjoys numerous advantages including simplified sample/hold circuitry, greater robustness to sampling jitter, lower power consumption, and better performance in the presence of large ISI. To verify the concept, a prototype FCR has been designed and fabricated in a 0.25 µm CMOS process. This prototype achieves an effective sampling rate of 12.5 Gsamples/sec with 3-bit resolution and requires no initial ADC offset compensation. The architectural efficiency has been proven by achieving an ideal ADC figure of merit of 4.6 pJ/step, which is much lower than that of conventional receivers. The functionality of the proposed receiver is demonstrated by correct operation at 10 Gsymbols/sec in a channel with significant inter-symbol interference (ISI).

Chapter 1 Introduction

As the integration of systems on a chip increases rapidly with technology scaling, the volume of data transmitted over short cables and backplane traces increases as well. Point-to-point signaling systems, such as serial links, draw attention since a higher data rate can be achieved without the huge number of package pins, operating power, and area overhead of the previous slow-rate bus architecture [6]. In order to increase the system data rate beyond the on-chip clock speed limitation, multiple parallel switches are employed to sample the received data with a multi-phase on-chip clock. This time-interleaved receiver (TIR) enables multi-gigabit-per-second serial-link systems in modern CMOS technology.
However, the transmission data rate in a TIR is limited not only by the on-chip bandwidth but, predominantly, by various off-chip factors that scale poorly, such as package parasitics, skin effect, cable losses, and signal reflections from impedance mismatches [45]. The overall frequency response of these channel imperfections can be represented as low-pass filtering. Therefore, when the signal bandwidth is much larger than the channel bandwidth, significant signal distortion, or inter-symbol interference (ISI), can degrade the sampling resolution at the receiver front-end. Furthermore, as technology scales down, this effect becomes a primary bottleneck on overall system performance [22].

Another important performance-limiting factor is the high sensitivity to sampling jitter, which is caused by the large amount of aliasing of the wideband input signal [22][55]. Furthermore, the equally spaced multi-phase sampling clock itself has limited jitter performance since it is often generated with a low-Q ring oscillator (Q = 1). This jitter effect becomes more problematic as the data rate increases with technology scaling. Therefore, it is necessary to develop circuits and systems that are more robust to jitter.

To overcome these limitations, more communication techniques such as multi-level modulation and advanced signal processing need to be employed to continue the increase of system bandwidth. Some examples have been attempted, such as pulse amplitude modulation (PAM) [9], transmitter pre-emphasis [45][11], and receiver equalization [59][58]. However, to make the use of these communication techniques possible, multi-bit analog-to-digital converters (ADCs) are required.
Recently, an attempt has been made to implement a 4-bit ADC on the basis of the conventional TIR [55]. Even though that receiver achieved 8 Gsymbols/s with 2-PAM modulation, it is a good example of the architectural limitations of the TIR. First of all, the receiver employs a distributed bonding wire inductance network to increase the channel bandwidth. Secondly, offset cancellation circuits are employed to compensate for the gain offsets due to device mismatches among the demultiplexing channels. This gain offset becomes a problem because higher input bandwidth needs to be achieved by employing smaller transistors, which are more sensitive to process variations. Finally, the timing margin for each sampler is so limited that phase offset compensation at each sub-channel is necessary to increase the sampling resolution. Ironically, in spite of all this additional complexity, the achieved input bandwidth (3.8 GHz) is still below the signal bandwidth, which means the on-chip circuit adds further ISI to the received signal. This is because all of this compensation and equalization adds capacitive loading at the analog front-end, where higher input bandwidth is required for the higher signal bandwidth.

To overcome the drawbacks of the conventional receiver, we proposed a new receiver architecture that channelizes the received signal into multiple subbands before digitizing [29]. This frequency channelized receiver (FCR) achieves a similar effective sampling rate using approximately the same number of ADCs, each operating at the same reduced clock frequency. The advantage of channelizing the received signal into multiple subbands before digitizing is that each ADC input bandwidth is reduced by a factor of approximately (2M-1), where M is the number of subbands. As a result, many of the problems in the conventional receiver that are caused by wide signal bandwidth are largely reduced.
For example, the sample/hold circuitry is relaxed and it becomes more robust to sampling jitter. Another important advantage is that it obviates the need to generate accurately spaced multiple clock phases, which becomes problematic as technology scales.

To validate the concept of the FCR, a prototype with a 3-bit ADC has been fabricated in 0.25 µm CMOS technology. The detection and equalization are performed off-line in the digital domain. The functionality is verified by correct operation at 10 Gsymbols/sec in the presence of significant signal ISI and jitter.

Chapter 2 introduces modern high-speed signaling techniques and their practical limitations. The conventional receiver architecture is also discussed along with its limitations, followed by various equalization techniques to 'flatten' the channel characteristics. Chapter 3 is dedicated to the proposed receiver architecture, circuit blocks, and digital signal processing. The generation of local frequencies and the on-chip clock for the proposed receiver is discussed in Chapter 4. The experimental results for the implemented receiver are described in Chapter 5, and the conclusion is given in Chapter 6.

Chapter 2 Background

As the integration volume of systems on a chip increases with rapid technology scaling, high-speed signaling techniques have been developed to satisfy the demands for higher system bandwidth and lower cost I/O design. However, as the transmission rate reaches multi-giga symbols per second in modern interconnect links, the filtering effect of the off-chip signaling environment introduces serious ISI to the received symbols and increases the probability of detection error at the receiver.
In this chapter, we discuss the high-speed signaling environment and the conventional TIR, along with their limitations. Various equalization techniques to overcome channel imperfections are also discussed.

2.1 High-speed electrical signaling

Traditional parallel bus architectures, which often employed TTL or CMOS logic to drive uncontrolled and unterminated parallel lines with full-swing signals, become unsuited for data rates over 100 MHz on a 1-meter wire [6]. One of the main disadvantages is that the high-impedance (~300 Ω) driver cannot charge up the line completely on the incident wave [7]. Instead, many round trips of signal propagation are required for complete switching, which limits the speed of the system and the line length. Also, the driver must use rail-to-rail swing to meet the threshold level requirements at the receiver side, which demands enormous power consumption. Another disadvantage is that noise coupling from reference lines (supply and substrate) is hard to remove due to the single-ended nature of the system.

Differential low-swing signaling (or low-voltage differential signaling, LVDS) was developed to overcome the problems of the previous bus approach with minimum modification of the signaling environment. As shown in Figure 2.1, this system consists of a transmitter, a communication channel, and a receiver.

Figure 2.1: A high-speed low-swing signaling (line impedance Z = 50 Ω).

A differential transmitter drives a line whose characteristic impedance is matched to the receiver termination (usually 50 Ω), which prevents reflection of signals back to the transmitter. Any signal that is reflected is absorbed by the source termination.
The impedance-controlled terminations improve signaling speed since most of the signal energy is absorbed at the first incident wave, and they also reduce the signal distortions often caused by overlaps of reflected signals. The low-impedance terminations also allow wide signal bandwidth for higher symbol rate transmission.

Ideally, assuming a lossless channel, this high-speed signaling system does not attenuate the signal amplitude, so it can drive an infinite length of wire. For this reason, a small-swing signal is sufficient for detection at the receiver, which significantly saves power at the transmitter and receiver. This system is also immune to noise because common-mode noise coupled from the references onto the two wires is cancelled differentially.

2.2 Limitations of electrical signaling

The performance of a practical high-speed link system suffers mainly from the filtering effect of off-chip elements including packages, backplane traces, and the channel medium itself, and also from unexpected timing variations at the zero-crossing points of the signal and at the sampling moments. These fundamental limitations, which are discussed in this section, become more pronounced as the transmission data rate increases with transistor technology scaling.

2.2.1 Practical signaling limitations

Figure 2.2: A practical high-speed signaling environment (transmitter, package, cable, receiver).

A practical high-speed signaling environment in a modern multi-giga serial-link system, as shown in Figure 2.2, is far from an ideal lossless transmission line. The signal bandwidth is significantly limited by the filtering effect of the on-chip capacitive load, package parasitic elements, PCB traces, and the cable medium.
The on-chip capacitive load consists of the driver output (or receiver input) capacitances, the pad cell, and the electrostatic discharge (ESD) circuits. Even though the termination resistances are small (25 Ω = 50 Ω // 50 Ω), careful design effort is necessary to keep the RC time constant from limiting the signal bandwidth.

The package parasitics, which are one of the major sources of bandwidth limitation, include pin capacitance and bonding wire inductance. Because of process restrictions, these elements are hard to control. A typical value of pin capacitance is around 700 fF, and a typical inductance is around 1 nH/mm. When a typical bonding wire of 2.5 mm is assumed, the resonance frequency, which is defined by $1/(2\pi\sqrt{LC})$, is around 3.8 GHz and is quite close to the signal bandwidth of multi-Gb/s signaling. This effect, together with impedance discontinuities, causes significant ringing in the frequency response (Figure 2.3). Special packages such as controlled collapse chip connection (C4), flip chip, multilayer ball grid array (BGA), etc., can alleviate this effect by controlling impedance, but they may not be available for low-priced high-speed signaling [44].

Another important source of frequency-dependent attenuation is the skin effect and dielectric loss of PCB traces and cable, which are often modeled as a lossy transmission line. We will discuss this further in the next section.

Figure 2.3: Frequency response of package parasitics (bonding wire = 2.5 nH for Tx/Rx, pin capacitance = 700 fF; horizontal axis: frequency [Hz], 100 MHz to 10 GHz).

In addition to the filtering effect of the practical channel, reflections of a signal from imperfect terminations or discontinuities along the signal path can interfere with the traveling signal, causing signal distortions.
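As a quick check of the package resonance estimate quoted above, the following minimal Python sketch evaluates $1/(2\pi\sqrt{LC})$ with the values given in the text (about 1 nH/mm of bonding wire, a 2.5 mm wire, and 700 fF of pin capacitance). The function and constant names are illustrative only and do not come from the dissertation.

```python
import math


def lc_resonance_hz(inductance_h: float, capacitance_f: float) -> float:
    """Return the LC resonance frequency f = 1 / (2*pi*sqrt(L*C)) in hertz."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


# Values quoted in the text; the names are illustrative.
BOND_WIRE_H_PER_MM = 1e-9   # about 1 nH per mm of bonding wire
BOND_WIRE_LENGTH_MM = 2.5   # typical bonding wire length
PIN_CAPACITANCE_F = 700e-15  # typical pin capacitance

f_res = lc_resonance_hz(BOND_WIRE_H_PER_MM * BOND_WIRE_LENGTH_MM, PIN_CAPACITANCE_F)
print(f"package resonance: {f_res / 1e9:.2f} GHz")  # about 3.80 GHz
```

The result agrees with the roughly 3.8 GHz figure stated in the text, which is why the package resonance falls inside the band occupied by a multi-Gb/s signal.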
As a result, the received signal waveform at the receiver front-end exhibits not only amplitude attenuation but also a long settling tail, especially when the signal bandwidth is much greater than the channel bandwidth. These effects can be observed in the eye diagram of the received waveform (Figure 2.4), where significant tails (or skews) in the rise/fall transitions corrupt the waveform of the subsequent bit. This "inter-symbol interference", or ISI, limits the sampling resolution of the receiver ADCs and is one of the fundamental limitations in multi-giga serial-link communication systems.

The eye diagram of the received data signal in Figure 2.4 also shows the deviation of the zero crossings from their ideal position in time (solid line), which is called jitter. The jitter is primarily due to random delay variations caused by coupling of noise through reference nodes (supply and substrate) and crosstalk between conductors.

In a serial-link transceiver system, the sampling uncertainty of the receiver ADCs is increased not only by data jitter, caused by the unavoidable use of phase-locked loops (PLLs) with practically limited jitter performance, but also by sampling clock jitter generated by the receiver PLL. Due to the need to integrate PLLs with large digital logic circuits, both kinds of jitter in a serial-link system are dominated by power supply and substrate noise and do not scale with technology [15]. Therefore, as the data rate increases with technology scaling, the portion of timing uncertainty caused by jitter grows relative to the shorter bit period. For this reason, jitter is another important fundamental limitation in multi-giga link systems.

Figure 2.4: An eye diagram showing ISI and jitter effect (annotations: bit-period, sampling clock jitter).
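The jitter argument above can be illustrated with a small Python sketch: if the absolute jitter stays fixed while the bit period shrinks, the fraction of the bit period lost to jitter grows in proportion to the symbol rate. The 5 ps jitter value below is a purely hypothetical assumption chosen for illustration; it is not a number from the text.

```python
def jitter_fraction_of_bit(jitter_s: float, symbol_rate_hz: float) -> float:
    """Fraction of one bit period consumed by a fixed amount of timing jitter."""
    bit_period_s = 1.0 / symbol_rate_hz
    return jitter_s / bit_period_s


# Hypothetical jitter value, assumed only for this illustration.
ASSUMED_JITTER_S = 5e-12

for rate_gsym in (1.0, 2.5, 5.0, 10.0):
    frac = jitter_fraction_of_bit(ASSUMED_JITTER_S, rate_gsym * 1e9)
    print(f"{rate_gsym:5.1f} Gsymbols/s: jitter is {100 * frac:.1f}% of the bit period")
```

Under this assumption the same jitter that costs 0.5% of the bit period at 1 Gsymbol/s costs 5% at 10 Gsymbols/s, which is the sense in which non-scaling jitter becomes a growing limitation.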
2.2.2 Channel medium limitations

A short section of channel medium for high-speed signaling, such as PCB traces and coaxial cables, is modeled as a lossy transmission line as shown in Figure 2.5 [18][7].

Figure 2.5: A lumped-parameter model of channel medium (elements ΔL, ΔR, ΔG, ΔC).

The frequency-dependent loss is primarily due to the series resistance of the metal (ΔR) and the shunt conductance of the dielectric (ΔG). The frequency dependency of the series resistance (ΔR) is represented by the skin effect of the conductor

$\Delta R \propto \frac{1}{\delta} \propto \sqrt{f}$ (2.1)

where δ is the skin depth and f is the signal frequency, i.e. ΔR is proportional to the square root of frequency. Dielectric loss, which corresponds to ΔG, can be explained by the loss tangent

$\tan\alpha = \frac{\Delta G}{2\pi f \cdot C}$ (2.2)

where ΔG is linearly proportional to the frequency (tan α is usually frequency independent for a given material). Generally, materials with a low dielectric constant (low C) and low tan α can alleviate this dependency but are more expensive. Typically, the dielectric loss of PCB is known to be considerably higher than that of cable. Examples of these two attenuations are shown in Figure 2.6, where skin-effect attenuation is dominant at relatively lower frequency [10].

Figure 2.6: Frequency dependence of channel loss (curves: dielectric loss and skin-effect loss; horizontal axis: frequency [GHz], 1 to 4; vertical axis: attenuation).

2.3 Conventional Time-interleaved Receiver

The performance of a high-speed signaling system is often described by a technology-independent metric, the FO4 delay (fan-out of 4). FO4 is defined as the delay of a CMOS inverter driving a capacitive load that is 4 times larger than its input capacitance, and it scales linearly with technology [57]. For a typical 0.25 µm CMOS process, FO4 corresponds to about 125 ps, which is 8 Gsymbols/s.
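The FO4 arithmetic used in this section can be sketched in a few lines of Python. The 125 ps FO4 delay is the value quoted in the text for a 0.25 µm process; the 8 FO4 logic clock period is the rule of thumb this section goes on to use. The helper name is illustrative only.

```python
FO4_S = 125e-12  # FO4 delay quoted in the text for a typical 0.25 um CMOS process


def rate_from_period(period_s: float) -> float:
    """Symbol rate in symbols per second for a given symbol period in seconds."""
    return 1.0 / period_s


# One symbol per FO4 delay.
rate_one_fo4 = rate_from_period(FO4_S)
# One symbol per logic clock period of 8 FO4 (no parallelism).
rate_eight_fo4 = rate_from_period(8 * FO4_S)

print(f"1 FO4 symbol period: {rate_one_fo4 / 1e9:.1f} Gsymbols/s")
print(f"8 FO4 symbol period: {rate_eight_fo4 / 1e9:.1f} Gsymbols/s")
```

Read this way, a symbol period of one FO4 corresponds to 8 Gsymbols/s, while a symbol period of 8 FO4 corresponds to 1 Gsymbol/s, which is the factor-of-8 gap that parallelism is meant to close.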
The on-chip clock period for the digital logic is often 8 times larger than an FO4 to provide enough time for the logic transitions. Therefore, the minimum symbol period is about 8 FO4 when no parallelism is added to overcome this limitation. A symbol rate that is N times the on-chip logic frequency can be achieved by employing an N:1 multiplexer at the transmitter and a 1:N demultiplexer at the receiver (Figure 2.7). This parallelism requires an N-input multiplexer whose inputs are selected by N consecutive clock phases at the transmitter, and N samplers at the receiver, each driven by a different phase of the receiver clock. Even though greater parallelism enables a higher symbol rate, the capacitive load at the transmit drivers and the input capacitance at the receiver front-end increase at the same rate, which becomes another source of RC filtering at multi-gigabit transmission rates.

Figure 2.7: Transceiver with 1:N parallelism (data rate = N · f_CK).

A simple MOS transistor switch is often used as a one-bit sampler for the best switching performance. For 1:N demultiplexing, N parallel switches driven by N consecutive clock phases spaced by one bit time sample the input at N different positions in time (Figure 2.8). The limitation of the one-bit sampler is well described by the sampling aperture, which is defined as the minimum input pulse width required for the successful completion of the sampling [17]. This aperture is determined by the sampler's impulse response (or sampling function) when the switch is treated as a linear time-invariant filter.
The minimum aperture is basically due to the switching RC time constant, where R is the on-resistance and C is the sum of the self-load capacitance (Cself) and the load capacitance (Cload). The aperture decreases as the switch size increases until Cself dominates Cload. Therefore, the optimum size is reached when Cself becomes equal to Cload. Also, to minimize the on-resistance of the switch, the highest (lowest for PMOS) voltage should be applied to the gate of the switch, which requires a rail-to-rail swing of the sampling clock.

Figure 2.8: 1:N demultiplexer driven by N consecutive clock phases.

In addition to the switching RC time constant, the finite transition time of the sampling clock also significantly widens the aperture time. This is because the gradual change of the switch resistance lengthens the transition from the off-state to the on-state of the switch. Ramin's analysis shows that a clock skew of one FO1 results in a 70% increase in aperture time [10]. When this effect is introduced to the differential switch topology, which is common nowadays, further degradation of the aperture time is observed, since the two differential signal inputs are sampled at different moments. Thus, the effective impulse response becomes the average of the two single-ended ones, resulting in a larger aperture time [10]. Including all these secondary effects, the minimum symbol period for a TIR with NMOS switches is known to be about 1 FO4 [57], i.e., a symbol rate 8 times higher than that of a receiver without parallelism. For multi-bit ADCs, the MOS-switch sampler is not attractive, since the on-resistance of a MOS transistor, usually several hundred ohms, is much larger than the 50 ohm termination resistance, which would otherwise allow a wider input bandwidth.
A differential unit-gain sample/hold amplifier is suggested for the implementation of multi-bit ADCs [55], as shown in Figure 2.9. When clk is high and clkb is low, it amplifies the differential inputs (inp, inn) by enabling the tail current source and the output loads. When clk goes low and clkb rises, it samples the input by disabling the tail current source, turning off the loads, and holding the sampled value. To maintain the input pair (M1, M2) in saturation during the amplification and to maximize the headroom, both the input and the output of this amplifier should swing from Vdd to Vdd-|Vthp| so that their common-mode level is around Vdd-|Vthp|/2. The PMOS in the center of the output (M6) is a differential load that reduces the amplifier output impedance, and thus its time constant, where half of the on-resistance is the actual resistive load. The gain of this sampler is usually kept at 1 to maintain wideband performance at the output.

Figure 2.9: Multi-bit ADCs with unit-gain sample/hold amplifier.

Figure 2.10: Input bandwidth as a function of ADC bits.

This switched differential amplifier can be directly connected to the 50 ohm termination resistor (Figure 2.9) so that it can make use of the low input impedance, which is 25 ohm (the 50 ohm termination in parallel with the 50 ohm characteristic impedance of the cable). However, for multi-bit ADCs, the input bandwidth decreases exponentially with the number of bits, since the number of samplers, and hence the input capacitance, increases exponentially. The input capacitance of a B-bit ADC is proportional to (# of parallel samplers) x ($2^B - 1$). The input bandwidth of 10 parallel samplers (3 um devices) with 0.9 pF pad capacitance is shown in Figure 2.10.
For example, when a 4-bit ADC is to be implemented, the input bandwidth decreases by as much as 3.5 GHz, which will significantly limit the signal bandwidth unless special techniques are used to distribute the input capacitance [55]. In a TIR, the bandwidth of the sampler itself should be wide enough, since each sampler sees the whole signal bandwidth. The bandwidth of the sampler is determined by half of the PMOS on-resistance and the following capacitive load (Cl in Figure 2.9), which is usually the comparator input device. Although a small input device allows high bandwidth, the offset voltage caused by device mismatch degrades the resolution. The primary sources of offset are mismatches in threshold voltage and current factor. This offset essentially shifts the decision threshold of the comparator and causes sampling uncertainty at the decision circuit. Studies show that the standard deviation of the offset is inversely proportional to the square root of the device area [32][28]. Thus, as the symbol rate increases with technology scaling, this effect becomes more pronounced, since a higher symbol rate implies a higher signal bandwidth and therefore a smaller input device. For this reason, a multi-Gb/s signaling receiver requires offset cancellation techniques to improve the resolution [55][53][23]. Also, because the offset is random, mismatches among the demultiplexing channels need to be cancelled separately, which adds design complexity at the receiver front-end [55].

The parallelism applied to the TIR enables a transmission symbol period that is much shorter than the on-chip sampling clock period. This condition exacerbates the effect of jitter on the performance of the receiver, since the timing margin for sampling is reduced significantly. In other words, since the input symbol rate becomes much higher than the sampling clock frequency, the sensitivity of the aperture to sampling jitter becomes significant.
As the symbol rate increases with technology scaling, the sensitivity to sampling jitter increases as the amount of aliasing of the wideband input signal grows. Therefore, this jitter effect becomes a serious bottleneck in today's multi-gigabit serial-link systems.

Figure 2.11: An example of 4-PAM eye diagram.

One of the communication techniques introduced to high-speed link systems to increase channel efficiency without directly increasing the symbol rate in a TIR is multi-level pulse amplitude modulation (PAM). In M-level PAM, each symbol conveys log2(M) bits of information, so that the symbol rate, for a given data rate, can be reduced by a factor of log2(M). An example of a 4-PAM eye diagram is shown in Figure 2.11. However, the level spacing of M-PAM decreases with M when the signal swing is limited by the transmitter. Thus, the M-PAM signal is more vulnerable to noise from various sources. Also, a recent study shows that the modulation impacts both voltage and timing margins, since the eye opening of a multi-level signal is more sensitive to ISI [56]. For this reason, a higher-resolution ADC is required to demodulate multi-level signals at the receiver [55][34]. Another communication technique, which compensates for the channel filtering effect, is channel equalization, which is discussed in the next section.

2.4 Channel Equalization

The signaling channel, which becomes the main limitation in multi-gigabit link systems, exhibits frequency-dependent loss (FDL) caused by RC filtering of the on-chip capacitive load, package parasitics, skin effect, discontinuities, and dielectric absorption. This causes the received pulse to be not only attenuated but also extended in time, resulting in ISI. Simply increasing the signal magnitude does not improve the SNR (signal-to-noise ratio) significantly, since the ISI scales with the signal and depends on the data pattern.
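The last point can be illustrated numerically. The following is a minimal sketch, assuming a made-up symbol-spaced pulse response (the values in `pulse` are hypothetical, as is the helper name `worst_case_margin`). For binary signaling it computes the worst-case margin to the decision threshold and shows that the ratio of ISI to main-cursor amplitude does not change when the transmitted amplitude is doubled.

```python
# Illustrative sketch with hypothetical numbers: ISI scales with the signal amplitude,
# so raising the swing does not change the ISI-to-signal ratio.

def worst_case_margin(pulse, cursor, amplitude=1.0):
    """Worst-case margin to a zero threshold for binary +/-amplitude signaling.

    pulse  : symbol-spaced samples of the received pulse response
    cursor : index of the main (desired) sample in `pulse`
    Returns (margin, worst_case_isi).
    """
    main = abs(pulse[cursor])
    # Worst data pattern: every neighbouring symbol pushes against the main sample.
    isi = sum(abs(p) for i, p in enumerate(pulse) if i != cursor)
    return amplitude * (main - isi), amplitude * isi

pulse = [0.05, 0.6, 0.25, 0.1]  # hypothetical pulse response, main cursor at index 1

for amplitude in (1.0, 2.0):
    margin, isi = worst_case_margin(pulse, 1, amplitude)
    ratio = isi / (amplitude * pulse[1])
    print(f"amplitude={amplitude}: margin={margin:.3f}, ISI={isi:.3f}, ISI/signal={ratio:.3f}")
```

A larger swing does help against noise sources that are independent of the signal, but the data-dependent ISI term grows in proportion, which is why equalization is needed.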
To mitigate or remove the ISI, various communication equalization techniques have been introduced in high-speed link systems [6][47][11][55]. A typical digital communication transceiver with an equalizer can be modeled with a transmitting filter $g_t(t)$ (or pulse-shaping filter), an ISI channel $c(t)$, a receiving filter $g_r(t)$ (or matched filter), and an equalizer $g_e(t)$, as shown in Figure 2.12. The baseband equalizer can also be implemented in the analog receiver front-end. However, digital equalizers are more common, since the filters are small, robust to noise, cheap, easily tunable, and very power efficient. The linear digital filter for this purpose can be implemented via an N=2L+1 tap transversal filter:

$G_e(z) = \sum_{i=-L}^{L} w_i z^{-i}$ (2.3)

where $w_i$ are the N=2L+1 filter coefficients. Now, let $f(t)$ denote the combined impulse response of the transmitting filter, ISI channel, and receiving filter, which is

$f(t) = g_t(t) * c(t) * g_r(t)$. (2.4)

Then the receiving filter output $y[n]$, sampled every $T_s$ (symbol time) seconds, becomes

$y[n] = d_n f[0] + \sum_{k=-\infty,\, k \neq n}^{\infty} d_k f[n-k] + \nu[n]$ (2.5)

where $\nu[n]$ denotes the equivalent baseband noise after the receiving filter (i.e., $\nu(t) = n(t) * g_r(t)$). Here, the 1st term in (2.5) is the desired data symbol, the 2nd term is the ISI to be removed, and the 3rd term is the sampled baseband noise. This result shows that equalizer design should balance ISI mitigation against noise enhancement [34].

Figure 2.12: A communication transceiver with equalizer.

Most equalization techniques fall into two categories: linear and nonlinear. The linear equalization techniques include zero forcing, MMSE (minimum mean square
error), and transmitter precoding (or pre-emphasis). The concept of the zero-forcing equalizer is to cancel the equivalent channel characteristic h(t) simply by applying its inverse, as shown in Figure 2.13. In the discrete frequency domain, the system function of the equalizer can be represented as

$G_e(z) = \frac{1}{F(z)}$ (2.6)

where $F(z)$ is the combined system response of the transmitting filter ($G_t(z)$), channel ($C(z)$), and receiving filter ($G_r(z)$). When the power spectral density of the additive white Gaussian noise (AWGN) is $N_0$, the noise power spectrum $N(z)$ becomes

$N(z) = \nu(z)\,|G_e(z)|^2 = \frac{N_0\,|G_r(z)|^2}{|F(z)|^2} = \frac{N_0}{|H(z)|^2}$ (2.7)

This shows that any sharp attenuation in $H(z)$ at some frequency will introduce a significant increase in noise power. Because of this noise enhancement problem, the zero-forcing equalizer is not a realistic solution [13].

Figure 2.13: Channel equalization (amplitude versus frequency for the channel, the inverse channel, and the equalized response).

In MMSE equalization, the mean square error (MSE) between each transmitted symbol $d_k$ and its estimate $\hat{d}_k$ is minimized by finding the optimal filter coefficients. Since the noise input to the equalizer is not white but colored by the receiving filter (or matched filter), a noise whitening process is added before ISI cancellation is performed at the following equalizer. The resulting equivalent equalizer becomes

$G_e(z) = \frac{1}{F(z) + N_0}$ (2.8)

Now, when $F(z)$ is highly attenuated at some frequency, the noise term $N_0$ in the denominator of (2.8) prevents the noise from being significantly amplified by the equalizer. Thus, the noise enhancement problem of the zero-forcing equalizer is largely avoided in the MMSE equalizer. However, since this equalizer still requires a matrix inversion, which entails high computational complexity (typically $N^2$ to $N^3$ operations), it is not very practical for low-cost link systems.
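The difference between (2.6) and (2.8) is easiest to see at a frequency where the combined response is small. The sketch below is an illustration under simplifying assumptions: the response $F$ is treated as a real, non-negative number at a single frequency, and the values of `F` and `N0` are made up. The helper names are hypothetical.

```python
# Illustrative sketch: noise power gain of a zero-forcing equalizer 1/F (eq. 2.6)
# versus an MMSE-style equalizer 1/(F + N0) (eq. 2.8) at a single frequency.
# F is taken as a real, non-negative response value; all numbers are assumptions.

def zf_noise_gain(F: float) -> float:
    """Squared magnitude of the zero-forcing equalizer, |1/F|^2."""
    return 1.0 / (F * F)

def mmse_noise_gain(F: float, N0: float) -> float:
    """Squared magnitude of the MMSE-style equalizer, |1/(F + N0)|^2."""
    return 1.0 / ((F + N0) ** 2)

N0 = 0.05  # assumed noise spectral density
for F in (1.0, 0.3, 0.01):  # from an unattenuated band to a deep channel notch
    print(f"F={F}: ZF gain={zf_noise_gain(F):.1f}, MMSE gain={mmse_noise_gain(F, N0):.1f}")
```

Where the channel is strong the two equalizers behave almost identically; in the notch the zero-forcing gain grows without bound as $F$ approaches zero, while the $N_0$ term caps the MMSE gain.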
Another linear equalizer, popular for its simplicity, is based on transmitter precoding (or pre-emphasis). This technique employs a finite impulse response (FIR) filter (or transversal filter) to shape the transmitted signal waveform. The important advantage of this scheme is its ease of implementation, since the previously transmitted binary information is already available at the transmitter and the design of a DAC (digital-to-analog converter) is much simpler than that of an ADC in high-speed applications. Figure 2.14 shows a pre-emphasized waveform and the received signal with and without pre-shaping [9]. The FIR filter attenuates the low-frequency portion (or pulse amplitude) of the signal power and thereby emphasizes the high-frequency part. This reduces the power of the received signal and decreases the signal-to-noise ratio (SNR). To compensate for the amplitude loss, a larger signal amplitude needs to be transmitted, and this often results in headroom issues at the driver. Another disadvantage is that the channel characteristic needs to be known in advance and the tap coefficients are set manually. The FIR filter has been implemented with digital circuits [6] such as adders and a DAC, and with current-modulated analog drivers [9]. Analog implementation allows higher operating speed with less complexity [59][58][9], but careful design effort is needed to minimize noise contamination.

Figure 2.14: The effect of transmitted waveform with pre-emphasis (pre-emphasized pulse and received signal with and without pre-emphasis, versus time in ps).

The maximum frequency that can be compensated by a symbol-spaced FIR filter is half the symbol rate [34].
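Both properties of the pre-emphasis filter, the low-frequency attenuation and the half-symbol-rate limit, can be seen from the magnitude response of a two-tap symbol-spaced FIR filter. The sketch below is only an illustration: the tap weight `alpha` is an arbitrary assumption, the taps are normalized so that the sum of their magnitudes is 1 (a simple stand-in for a peak-limited driver), and `fir_gain` is a hypothetical helper.

```python
# Illustrative sketch: magnitude response of a 2-tap symbol-spaced pre-emphasis FIR.
# alpha and the peak-limited normalization are assumptions made for illustration.
import cmath
import math

def fir_gain(taps, f_norm):
    """Magnitude response of a symbol-spaced FIR at f_norm = f / symbol_rate."""
    return abs(sum(t * cmath.exp(-2j * math.pi * f_norm * k)
                   for k, t in enumerate(taps)))

alpha = 0.25  # hypothetical pre-emphasis weight
taps = [1.0 / (1.0 + alpha), -alpha / (1.0 + alpha)]  # sum of |taps| equals 1

dc_gain = fir_gain(taps, 0.0)       # low-frequency gain: (1 - alpha) / (1 + alpha)
nyquist_gain = fir_gain(taps, 0.5)  # gain at half the symbol rate
wrapped_gain = fir_gain(taps, 1.0)  # response repeats with period of one symbol rate

print(dc_gain, nyquist_gain, wrapped_gain)
```

Under the peak constraint the filter does not add gain at high frequency; it removes low-frequency amplitude, which is the source of the SNR and headroom penalty described above. Because the response is periodic in the symbol rate, the largest relative boost occurs at half the symbol rate.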
Likewise, a half-symbol-spaced filter, which doubles the number of taps required and the complexity, can compensate frequency loss up to the symbol rate and allows sharper signal transitions. However, generating overly sharp transitions at the transmitter is not efficient, since the high-frequency portion of the signal suffers more ringing, reflections, and crosstalk in the channel.

The most popular nonlinear technique with relatively low complexity is decision-feedback equalization (DFE) [47][59]. As shown in Figure 2.15, a DFE consists of a forward filter W(z) with the digitized sequence as input and a feedback filter V(z) with the previously detected sequence as input. The forward FIR removes the preceding (precursor) ISI, while the feedback FIR is necessary to remove the ISI caused by previous bits. Often the forward FIR is implemented at the transmitter side, where the bits to be sent to the channel are accessible [59]. This approach does not suffer from the noise enhancement problem, since it only estimates the channel frequency response rather than its inverse. However, in a high-speed binary link, a latency problem occurs at the receiver side, because it is difficult to resolve the input data within a very short bit period. An approach called loop unrolling, proposed by Kasturia [19], alleviates this problem by pre-computing the outputs for the possible incoming data states (either 1 or 0 for binary) and choosing one of them with a fast multiplexer. Unfortunately, this often increases the complexity of the receiver design by multiplying the number of comparators at the receiver front-end [30]. In wireless communication, the single-carrier with frequency-domain equalization (SC-FDE) system [8] has been developed to reduce the computational complexity of time-domain FIR filters, and its performance is very similar to that of orthogonal frequency-division multiplexing (OFDM).
In particular, this approach is known to be very efficient when there is significant inter-symbol interference (ISI) caused by multipath, which requires a large number of FIR filter taps to equalize the data. The same concept can be applied to wired communication systems such as serial interconnects, where the number of data symbols spanned by channel ISI is continuously increasing with technology advancement.

Figure 2.15: Decision-feedback equalizer (DFE).

Figure 2.16 shows the block diagram of a single-carrier frequency-domain equalization receiver. The received data sequences are sampled and transformed to the frequency domain by a fast Fourier transform (FFT) before frequency-domain equalization is performed. The equalized symbols are then transformed back to the time domain by the inverse fast Fourier transform (IFFT) for detection. To perform equalization on a block of data at a time with an efficient FFT operation, a cyclic prefix is appended to each block of M data symbols, as shown in Figure 2.17. In this thesis, we explore this communication technique combined with adaptive equalization for the optimal detection of transmitted data at multi-giga symbols per second. More details about this system are covered in the digital-signal-processing part of the next chapter.

Figure 2.16: Single-carrier frequency-domain equalization system.

Figure 2.17: Block processing of data sequence (the last P symbols of each block of M data symbols are repeated as the cyclic prefix).
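The block processing of Figures 2.16 and 2.17 can be sketched in a few lines. The example below is a minimal illustration, not the receiver developed in this thesis: it assumes a known, noiseless three-tap channel (`h` is made up), uses a plain DFT instead of an FFT to stay self-contained, and applies a one-tap zero-forcing correction per frequency bin. All function names are hypothetical.

```python
# Illustrative sketch of single-carrier frequency-domain equalization with a cyclic prefix.
# Assumptions: known noiseless channel, naive DFT in place of an FFT, zero-forcing per bin.
import cmath

def dft(x, inverse=False):
    """Naive discrete Fourier transform (inverse includes the 1/n scaling)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out

def add_cyclic_prefix(block, p):
    """Repeat the last p symbols of the block in front of it."""
    return block[-p:] + block

def channel(x, h):
    """Linear convolution of x with the channel h, truncated to len(x)."""
    return [sum(h[j] * x[i - j] for j in range(len(h)) if i - j >= 0)
            for i in range(len(x))]

def fde_receive(rx, h, m, p):
    """Drop the prefix, equalize in the frequency domain, and slice to +/-1."""
    y = rx[p:p + m]                        # prefix removed: convolution is now circular
    Y = dft(y)
    H = dft(h + [0.0] * (m - len(h)))      # channel response on the same m bins
    X = [Yk / Hk for Yk, Hk in zip(Y, H)]  # one complex division per bin
    return [1 if v.real > 0 else -1 for v in dft(X, inverse=True)]

block = [1, -1, -1, 1, 1, 1, -1, 1]  # M = 8 binary symbols
h = [1.0, 0.5, 0.2]                  # hypothetical ISI channel spanning 3 symbols
P = 2                                # prefix length, at least len(h) - 1

rx = channel(add_cyclic_prefix(block, P), h)
detected = fde_receive(rx, h, len(block), P)
print(detected == block)
```

The prefix turns the linear convolution with the channel into a circular one over the block, so equalization costs one multiplication per bin instead of a long time-domain FIR. With noise present, the per-bin division would normally be replaced by an MMSE-style weight, as discussed for (2.8).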
In all the equalization techniques discussed so far, the equalizer should dynamically track channel variations caused by operating temperature, power supply noise, and different channel characteristics [46]. In this sense, an adaptive receiver equalizer is the best approach, since no knowledge of the channel is necessary in advance and the filter coefficients can be corrected periodically in an adaptive manner. As an adaptation algorithm, the least mean square (LMS) algorithm is often preferred to MMSE, since only N multiply operations per iteration are needed for an N-tap filter (at least $N^2$ operations are required for MMSE). In LMS, the tap weight vector $w(k+1)$ of the filter in (2.3) is updated linearly as

$w(k+1) = w(k) + \Delta\,\varepsilon_k\,y(k)$ (2.9)

where $y(k)$ is the vector of filter inputs and $\varepsilon_k = d_k - \hat{d}_k$ is the error between the training symbol and its estimate. The choice of $\Delta$ (the step size) dictates the convergence speed and stability of the LMS algorithm. Usually, the convergence of LMS is known to be slower than that of MMSE. However, since the channel variation over time is not very fast in most interconnect link applications, the LMS algorithm is often selected for adaptive equalization [20][12].

2.5 Summary

A high-speed signaling technique has been developed to accommodate higher transmission rates between IC chips over existing backplanes and cables. However, as the data rate exceeds gigabits per second, the performance of the receiver becomes heavily limited by two factors: the filtering effect of the signaling environment, including the channel medium itself, and the timing uncertainty caused by signal and/or sampling clock jitter. To overcome these limitations, well-developed digital communication techniques such as coding, equalization, and multi-level modulation need to be employed. To make these techniques available, higher-resolution ADCs are needed at the receiver side.
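The LMS update in (2.9), on which the adaptive equalization summarized here relies, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the equalizer used in this work: the two-tap channel `h`, the step size, the tap count, and the training length are all made up, the link is noiseless, and `lms_train` and `equalize` are hypothetical helper names.

```python
# Illustrative sketch of LMS training of a transversal equalizer.
# Channel, step size, tap count, and sequence length are assumptions for illustration.
import random

def lms_train(rx, training, num_taps, step):
    """Adapt tap weights with w <- w + step * error * inputs."""
    w = [0.0] * num_taps
    for n in range(num_taps - 1, len(rx)):
        window = [rx[n - i] for i in range(num_taps)]        # current filter inputs
        estimate = sum(wi * yi for wi, yi in zip(w, window))
        error = training[n] - estimate                       # training symbol minus estimate
        w = [wi + step * error * yi for wi, yi in zip(w, window)]
    return w

def equalize(rx, w, n):
    """Equalizer output at sample index n."""
    return sum(w[i] * rx[n - i] for i in range(len(w)))

random.seed(0)
d = [random.choice((-1.0, 1.0)) for _ in range(4000)]        # known training symbols
h = [1.0, 0.4]                                               # hypothetical ISI channel
rx = [sum(h[j] * d[n - j] for j in range(len(h)) if n - j >= 0)
      for n in range(len(d))]

w = lms_train(rx, d, num_taps=5, step=0.01)

# Count decision errors over the last 500 symbols after adaptation.
errors = sum(1 for n in range(len(rx) - 500, len(rx))
             if (1.0 if equalize(rx, w, n) > 0 else -1.0) != d[n])
print([round(wi, 3) for wi in w], errors)
```

Each iteration costs on the order of N multiplications for the estimate and N for the update, which is the complexity advantage mentioned in the text. A larger step converges faster but leaves more residual error and can become unstable.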
In this chapter, we have described the conventional time-interleaved architecture, which employs N parallel samplers driven by N consecutive clock phases. We have also shown that, as the received data rate increases with technology scaling, this receiver requires wider-bandwidth samplers and more accurate timing generation circuits, because each sampler sees the whole signal bandwidth and the sensitivity to jitter increases as the symbol period shrinks.

The basic concepts of communication equalization techniques such as zero forcing, MMSE, transmitter pre-emphasis, and DFE have been discussed. For serial-link transceivers, transmitter pre-emphasis (or precoding) is quite popular because of its simplicity. However, the transmit driver often suffers from a signal headroom problem, since a larger signal amplitude needs to be transmitted to compensate for the low-frequency attenuation. Also, its application is limited to well-defined channels, since the channel characteristics must be known in advance. We have also discussed the decision-feedback equalizer (DFE), whose practical implementation entails design complexity at the receiver front-end. The computational complexity of the equalizer can be reduced by employing frequency-domain equalization, which has been popular in wireless communication systems. This technique is known to be very efficient when the channel impulse response is long because of significant ISI. We will combine this technique with adaptive equalization, in which the filter coefficients are updated over time, and apply it to the equalization of received symbols in the proposed receiver.
Chapter 3 Receiver Design

The primary sources of performance degradation in modern high-speed serial-link systems are the limited timing accuracy resulting from ISI caused by the propagation channel and clock jitter. The performance degradation caused by both sources is expected to worsen with the continued scaling of technology toward higher transmission rates. A new receiver architecture is required that enables fully digital equalization to deal efficiently with signal distortion without adding complexity at the front-end, and that alleviates the sensitivity of sampling to clock jitter. The proposed receiver channelizes the received signal into multiple frequency subbands and then samples each at a fraction of the symbol frequency. In this way, the input signal bandwidth to each sampler is reduced, which allows higher-resolution ADCs to be employed for the following digital signal processing (DSP) circuits without offset cancellation and also reduces the sensitivity of the signal to sampling jitter. A fully digital equalization scheme is demonstrated to compensate for the ISI effect in an adaptive manner.

3.1 Frequency Channelized Receiver Architecture

The basic idea of frequency channelization starts from the generalized sampling theorem of Papoulis [31], which states that a deterministic continuous-time band-limited signal is uniquely determined from the outputs of M properly designed linear systems sampled at 1/M the Nyquist rate. This concept has been demonstrated by all-digital filter bank theory [3] for the perfect reconstruction problem and by the hybrid filter bank (HFB) for the analog-to-digital conversion problem [50]. The frequency channelization in the HFB view is shown in Figure 3.1.
A band-limited input signal, x(t), is split into M subbands by M analog band-pass filters (analysis filters, denoted by $H_k(j\Omega)$, $0 \le k \le M-1$), where each passband width is 1/M of the signal bandwidth. Each frequency-channelized subband signal is then sampled at 1/M of the effective sampling rate ($f_{eff}$) and digitized. The signal is reconstructed by up-sampling the discrete data by a factor of M and passing it through discrete synthesis filters (denoted by $F_k(j\Omega)$, $0 \le k \le M-1$) before summing. One important disadvantage of this system is that designing analog band-pass filters in integrated circuits with some pole locations in excess of several gigahertz, as is the case for multi-gigabit serial-link systems, is extremely challenging.

Figure 3.1: Hybrid filter bank (HFB) with M subband ADCs.

Since our goal is to detect the transmitted data signal and not to achieve perfect or near-perfect reconstruction, receiver functions such as matched filtering and equalization can be applied directly to the sampled signals. This perspective allows us to regard each subband as a diversity communication channel and to employ known techniques to optimally compute the digital reception filters. Applications of this concept to the ultra-wideband (UWB) receiver in the presence of narrowband interference [29] and to the serial-link system [12] have been reported. An example of an FCR [12] with time-domain equalization is shown in Figure 3.2. This receiver employs a bank of complex mixers operating at equally spaced frequencies (denoted $f_0, 2f_0, \ldots, (M-1)f_0$) and low-pass filters ($H(j\Omega)$) to decompose the analog input signal into M subbands.
The first subband signal, $x_0(t)$, is a real signal without down-conversion, and the rest of the subband signals ($x_k(t)$, $1 \le k \le M-1$) are complex signals downshifted to baseband by the mixers. Compared to the HFB approach in Figure 3.1, this system is much easier to implement in integrated circuits, since band-pass filters with multiple center frequencies can be replaced by a bank of identical LPFs. The quantized subband signals (indexed by $0 \le k \le M-1$) are upsampled and then filtered by linear filters before combining, and the real part of the resulting sum is the estimate of the desired signal d[n]. The linear filters use the correlation of the filter inputs ($y_k[n]$, $0 \le k \le M-1$) and their cross-correlation with the desired response ($z_k[n]$, $0 \le k \le M-1$) to obtain a minimum mean-square-error (MSE) estimate of the transmitted signal. Each linear filter can be implemented by the transversal filter in (2.3), such that it performs estimation and equalization at the same time. For the calculation of the linear filter tap weights ($w_k[n]$, $0 \le k \le M-1$), the least-mean-square (LMS) adaptive algorithm is used to update the filter coefficients from instantaneous estimates of the correlation and cross-correlation. As discussed in Chapter 2, this adaptive algorithm reduces the numerical complexity of the synthesis filter through iterative feedback and tracks changes over time when perfect knowledge of the analysis filter is not available.

Figure 3.2: An example of FCR with time-domain equalization.

The block diagram of the FCR is shown in Figure 3.3(a). For the frequency channelization, a bank of mixers and LPFs is employed, since it is difficult to design BPFs with high center frequencies in integrated circuits.
Compared to a conventional TIR, this channelized receiver achieves a similar effective sampling rate using approximately the same number of ADCs, each operating at the same reduced frequency. As shown in Figure 3.3(b), this receiver channelizes the received signal bandwidth (BWsig) into multiple subbands before digitizing, such that each ADC input bandwidth is reduced by a factor of approximately (2M-1), where M is the number of subbands. As a result, many of the problems of the conventional receiver discussed in Section 2.3, such as limited input bandwidth and the sampling jitter effect, are largely reduced.

Figure 3.3: (a) Proposed receiver block diagram, (b) Frequency relationship.

The proposed receiver prototype with 3 subbands is diagrammed in Figure 3.4(a), where frequency channelization is combined with two-way time-interleaving. The channelization of the band-limited input signal is achieved by the combination of down-conversion mixers and the following LPFs, such that each ADC in a subband sees only 1/5 of the signal bandwidth. The quadrature mixing at the 2nd and 3rd subbands separates the real and imaginary parts of the signal, which is required for signal detection. Since the SNR at the receiver front-end is high enough (Vp-p > 300 mV), a passive mixer topology is recommended to maintain good linearity between the input and the down-converted signal. Also, to maintain good selectivity between subband signals, a multi-order LPF topology needs to be employed for a sharp filter roll-off [12].
Figure 3.4: (a) Proposed 3-subband receiver, (b) The frequency relationship (BW_LPF = fo/2, f_LO1 = fo, f_LO2 = 2fo, BWsig ≈ 2.5fo).

To further relax the ADC sampling requirements, each channelized subband signal is sampled by time-interleaving two 3-bit ADCs, each operating at half the Nyquist rate (fo/2) of the LPF output. A total of 10 ADCs are employed to achieve an effective sampling frequency of 5fo samples/s, which is again the Nyquist rate of the input signal bandwidth (BWsig ≈ 2.5fo), as shown in Figure 3.4(b). Our previous analysis shows that a 3-bit ADC achieves an SNR of approximately 17 dB, which is enough for optimum performance [12]. The digitized samples from the receiver front-end are processed off-line. Since the proposed receiver provides sufficient statistics for detection, ideally any modulation can be processed in the digital domain. The digital signal processing (DSP) block in Figure 3.4(a) performs detection and equalization. For implementation in integrated circuits, the computational complexity of the DSP needs to be reduced without losing detection performance. The frequency-domain adaptive equalization scheme has been employed to meet these requirements. This topic is discussed further in the later part of this chapter.

Figure 3.5: The 3rd LO harmonic contamination in baseband (BWsig = 5 · BW_LPF, 3 · f_LO1 = 6 · BW_LPF).

When a broadband input signal is applied, it is multiplied not only by the fundamental frequency of the local oscillator but also by its harmonics.
Even though the even harmonics can be cancelled by a differential topology, the odd harmonics, especially the 3rd-order term, can be problematic. A good example (Figure 3.5) is observed at the first subband mixer in a three-subband system when its mixing frequency (fLO) is set to 2·BWLPF (assuming the input bandwidth is evenly divided into units of BWLPF). Contamination occurs when the frequency difference between the 3rd LO harmonic and the input signal frequency falls within the LPF bandwidth. Mathematically, this condition is expressed by

$$ 3 f_{LO} - f_{SIG} < BW_{LPF} \tag{3.1} $$

where fLO denotes the fundamental LO frequency. For this reason, we chose three subbands: only the tail of the signal energy contaminates the baseband, and the contribution of this power to the overall performance is negligible (40 to 50dB below the signal power) in the 3-subband case. However, with more than three subbands, criterion (3.1) is satisfied more easily because fLO becomes smaller relative to fSIG, which results in higher contamination in the baseband. In this case, a 2-step down-conversion system can be a solution, keeping the number of subbands for each down-conversion no more than 3.

3.2 Comparison between FCR and TIR

Compared to a conventional TIR, the FCR enjoys four important advantages: 1) simplified S/H circuitry; 2) greater robustness to sampling jitter; 3) lower power consumption; 4) improved performance in the presence of large ISI. Each of these advantages is discussed in detail in this section.

For the performance comparison, we assume that the prototype FCR has three subbands and that each subband signal is two-way time-interleaved with ADCs that operate at 1.25Gsamples/s each. Therefore, a total of 10 ADCs is required, as shown in Figure 3.4.
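The ADC count and aggregate rate of the prototype follow from simple bookkeeping, sketched below. The function name `fcr_adc_budget` is hypothetical and the counting rule (one real baseband channel plus an I/Q pair per mixed subband, each channel time-interleaved by the same factor) is an assumption read off Figure 3.4 rather than a general design rule.

```python
def fcr_adc_budget(num_subbands, interleave, adc_rate_sps):
    """Return (number of ADCs, aggregate sample rate) for a channelized receiver.

    Assumes 2M - 1 real channels (one baseband channel plus M - 1 I/Q pairs),
    each sampled by `interleave` ADCs running at `adc_rate_sps`.
    """
    n_channels = 2 * num_subbands - 1
    n_adcs = n_channels * interleave
    return n_adcs, n_adcs * adc_rate_sps


# Prototype assumptions from the text: 3 subbands, two-way interleaving,
# 1.25 Gsamples/s per ADC.
n_adcs, total_rate = fcr_adc_budget(3, 2, 1.25e9)
print(n_adcs, total_rate / 1e9)  # 10 12.5
```

With these inputs the sketch returns 10 ADCs and an aggregate rate of 12.5 Gsamples/s, the figures used for the comparison in this section.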
To make a fair comparison with this prototype FCR, we assume that the TIR employs the same number of ADCs, each sampling at the same frequency as in the prototype FCR; i.e., the TIR consists of 10 ADCs, each sampling at 1.25Gsamples/s, to achieve an effective sampling rate of 12.5Gsamples/s. We subsequently refer to this TIR as the prototype TIR.

When comparing performance via simulation, one difficulty is that the target bit-error rate (BER) in a serial-link system is extremely low (e.g., $10^{-12}$). Since simulating such a low BER would require an unrealistically long simulation time, we determined the serial-link performance by first plotting the constellation of $10^6$ symbols before the slicer and then measuring the noise margin, which we define as the average magnitude of the 100 constellation points closest to the detection threshold.

As discussed in Section 2.3, the input bandwidth of the TIR decreases exponentially with the number of ADC bits, since the input capacitance increases exponentially. This causes a severe bandwidth-resolution tradeoff for multi-bit ADCs. To better illustrate this tradeoff, the maximum input capacitance of an S/H circuit (Figure 3.6(b)) that achieves an input bandwidth of 5GHz (to support 10Gsymbols/s) is plotted in Figure 3.6(a) as a function of the number of ADC bits, assuming a 25Ω input resistance and a 1pF pad capacitance. The maximum allowed input capacitance is relatively small (e.g., 4fF for 3 bits), suggesting that device mismatch will become an important issue. Therefore, offset compensation circuitry would be required, complicating the S/H circuitry.

Figure 3.6: (a) Maximum input capacitance to support 10Gsymbols/s (10 parallel switches, pad capacitance = 1pF, resistance = 25Ω; horizontal axis: ADC bits). (b) Typical high-speed S/H circuit.
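A rough way to see where a figure of a few femtofarads comes from is a single-pole RC budget. The sketch below is an assumption-laden reconstruction, not the model behind Figure 3.6(a): it assumes the input bandwidth is 1/(2πRC), that the pad capacitance is subtracted from the total, and that the remainder is shared evenly by 10 interleaved paths with 2^b - 1 comparators each. The function name and the even-split rule are hypothetical.

```python
import math


def max_cap_per_comparator(bits, bw_hz=5e9, r_ohm=25.0, c_pad_f=1e-12, n_paths=10):
    """Capacitance budget per comparator input under a single-pole RC assumption."""
    if bits < 1:
        raise ValueError("bits must be at least 1")
    c_total = 1.0 / (2.0 * math.pi * r_ohm * bw_hz)   # total C allowed for bw_hz
    c_budget = c_total - c_pad_f                      # what is left after the pad
    n_comparators = 2 ** bits - 1                     # flash ADC comparator count
    return c_budget / (n_paths * n_comparators)


for b in range(1, 7):
    print(b, "bits:", round(max_cap_per_comparator(b) * 1e15, 2), "fF")
```

Under these assumptions the 3-bit case comes out close to 4 fF, which is at least consistent with the value quoted above, and the budget shrinks roughly by half for every added bit.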
In contrast, the input bandwidth of the proposed receiver in Figure 3.7 is determined by the number of subbands and not by the number of ADC bits. Typically, the number of subbands is no more than 3 to minimize harmonic contamination, which will be discussed in a later section. As a result, a mixer input capacitance as large as 30fF (Figure 3.6(a)) can be supported in the 3-subband FCR. In addition, the bandwidth requirement of the S/H circuitry in the FCR is greatly relaxed because the signal bandwidth presented to each ADC is reduced by the frequency channelization. Specifically, in the conventional receiver, each sampler bandwidth must be larger than the input signal bandwidth to prevent filtering. In the proposed receiver, however, the sampler bandwidth only needs to exceed the subband bandwidth, which is much smaller (i.e., divided by (2M-1)) than the input signal bandwidth. This relaxed bandwidth requirement allows signal amplification in each subband before digitizing, which is not possible in the conventional receiver since wide-bandwidth S/H circuitry is required. This amplification is especially important when the signal power is limited at the receiver front-end, because it reduces the offset problems caused by device mismatch.

Figure 3.7: S/H circuit bandwidth requirement in FCR.

Another major advantage is that the FCR is more robust to sampling jitter, because the amount of aliasing from the wideband input during sampling is much reduced [29]. To show this, the effect of jitter on the performance of the conventional receiver and the 3-subband FCR is simulated. The received 10Gsymbols/s signal with significant ISI in Figure 3.8(a) is sampled and equalized, with sampling jitter (peak-to-peak = 36ps, σ = 10ps) added during the sampling.
Also, we added mixer phase noise (peak-to-peak = 14°, σ = 4°) in the FCR. The slicer inputs of the two receivers are accumulated in Figure 3.8(b) and (c), respectively. In the conventional receiver, many points fall in the middle and cause bit errors. In the 3-subband FCR, however, two clearly separated bands are visible, representing the two binary states (0 or 1).

Figure 3.8: (a) Received signal with ISI. Slicer input of (b) TIR and (c) 3-subband FCR. (Horizontal axis: time in terms of symbol period.)

The reason for the performance difference becomes clear when the relationship between the signal bandwidth and the sampling frequency of each receiver is considered. For example, the sampling frequency of the 10-way time-interleaved receiver (1GHz) is much smaller than the signal Nyquist bandwidth (5GHz). In the 3-subband receiver, the sampling frequency is quite close to the subband signal bandwidth, which is 1/(2M-1) times the input bandwidth, where M = 3. Therefore, the sampling uncertainty caused by jitter is relatively insignificant in the FCR.

Figure 3.9: The jitter effect on performance of two receivers. (Noise margin versus jitter standard deviation [ps] for 3b TIR, 3b FCR, 4b TIR and 4b FCR.)

The clock uncertainties produce both jitter and phase noise in the channelized receiver but only jitter in the TIR. Despite the additional mixer phase noise, the FCR still outperforms the TIR because of the large improvement in robustness to sampling jitter. To show this, the performance of the prototype FCR and TIR is plotted in the presence of sampling jitter and mixer phase noise. Figure 3.9 plots the noise margin at the slicer as a function of the standard deviation of the jitter. The sampling jitter and phase noise are modeled as white, uniformly distributed noise.
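Two quick numerical checks may help in reading these jitter figures. The first uses the property of a uniform distribution that its peak-to-peak width is √12 times its standard deviation. The second is the textbook jitter-limited SNR bound for a single full-scale sine wave, which is not the aliasing-based simulation used in this chapter and is included only as a rough intuition for why a lower input frequency per ADC helps. Function names are illustrative.

```python
import math


def uniform_peak_to_peak(sigma):
    """Peak-to-peak width of a zero-mean uniform distribution with std `sigma`."""
    return math.sqrt(12.0) * sigma


def jitter_limited_snr_db(f_in_hz, sigma_j_s):
    """Textbook single-tone bound: SNR = -20*log10(2*pi*f_in*sigma_j)."""
    return -20.0 * math.log10(2.0 * math.pi * f_in_hz * sigma_j_s)


# sigma = 10 ps jitter and sigma = 4 degrees phase noise, as in the text
print(round(uniform_peak_to_peak(10.0), 1), "ps")
print(round(uniform_peak_to_peak(4.0), 1), "deg")

# Same 10 ps jitter applied to a 5 GHz tone versus a 1 GHz tone
for f_in in (5e9, 1e9):
    print(f_in / 1e9, "GHz:", round(jitter_limited_snr_db(f_in, 10e-12), 1), "dB")
```

The peak-to-peak values come out near the 36 ps and 14° quoted for the simulation, and the single-tone bound improves by 20·log10(5), about 14 dB, when the input frequency drops from 5 GHz to 1 GHz.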
The jitter standard deviation $\sigma_j$ is related to the mixer phase noise standard deviation $\sigma_{PN}$ by $\sigma_{PN} = \pi f_o \sigma_j$. Both mixers are assumed to have the same phase noise standard deviation. The propagation channel is modeled as a second-order Chebyshev filter with a cutoff frequency of 2.7GHz. As jitter increases, the FCR clearly outperforms the TIR.

We believe the proposed FCR has two advantages in power consumption. According to Uyttenhove [49][21], there exists a speed-power-accuracy tradeoff in the flash-type ADC front-end (or S/H circuit). The relationship is given by

$$ \frac{\text{Bandwidth} \cdot \text{Accuracy}^2}{\text{Power}} \approx \text{constant}. \tag{3.2} $$

Accuracy in (3.2) means the magnitude ratio between the signal and the offset voltage caused by device mismatch. Therefore, if the accuracy is held constant, the power consumption is directly proportional to the S/H circuit bandwidth. This means that the S/H in the proposed receiver consumes much less power (by a factor of 1/(2M-1)), since its bandwidth is reduced by frequency channelization. This advantage becomes more pronounced for higher-resolution ADCs, since the number of comparators increases exponentially with the number of ADC bits. Furthermore, the power consumed by additional compensation circuits, such as offset/phase cancellation at each ADC, can be saved in the FCR, since the accuracy-bandwidth tradeoff is weakened and the FCR is more robust to jitter. Another power advantage is that the power-hungry multi-phase clock generation circuits can be replaced with LC oscillators, which consume much less power.

The ADC requirement in the presence of ISI is compared between the prototype FCR and TIR. In Figure 3.10, the available noise margin is plotted as a function of the propagation channel bandwidth (fch) for different ADC resolutions, assuming no
jitter/phase noise is present. The propagation channel is modeled as a 2nd-order Chebyshev lowpass filter. For the same number of ADC bits, the prototype FCR performs worse than the prototype TIR when fch exceeds 3GHz (i.e., little ISI is present). This degradation in FCR performance occurs because the finite ADC resolution is not sufficient to adequately compensate for the aliasing between adjacent subbands. However, when fch is less than 3GHz (i.e., ISI becomes more significant), the FCR outperforms the TIR. This performance advantage of the FCR can be explained by the reduction of the quantization noise in the upper subbands (2nd and 3rd), where the high-frequency signal amplitude is largely attenuated by the propagation channel.

Figure 3.10: ADC bits and channel ISI effect. (Noise margin versus channel bandwidth [GHz] for 2b, 3b, 4b and 5b TIR and FCR.)

As expected, the performance of both the prototype FCR and TIR improves with increasing ADC resolution. From Figure 3.10, a 3-bit ADC seems to achieve a noise margin that is sufficiently wide for propagation channels with significant ISI (e.g., fch < 3GHz). Higher ADC resolution can be used to further improve the performance at the expense of increased ADC complexity and power consumption.

3.3 Frequency Channelization Circuit

As described in the last section, the frequency channelization circuits consist of front-end down-conversion mixers followed by LPFs. For the 3-subband system, two quadrature mixer pairs, as shown in Figure 3.4, are employed to convert each high-frequency band to baseband, so that the real and imaginary parts of the signal are processed separately.
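How a quadrature mixer pair separates the real and imaginary parts can be illustrated with an idealized numerical sketch. It assumes ideal multipliers, a test tone exactly at the LO frequency, and uses averaging over an integer number of LO periods as a stand-in for the LPF; none of these details model the actual passive mixer, and the function name is hypothetical.

```python
import cmath
import math


def quadrature_downconvert(phase_rad, f_lo_hz=2e9, periods=8, samples_per_period=64):
    """Mix cos(2*pi*f_lo*t + phase) with cos and -sin LOs and average the products.

    Returns the complex baseband value I + jQ, ideally 0.5 * exp(j * phase).
    """
    n = periods * samples_per_period
    dt = 1.0 / (f_lo_hz * samples_per_period)
    i_sum = 0.0
    q_sum = 0.0
    for k in range(n):
        t = k * dt
        x = math.cos(2.0 * math.pi * f_lo_hz * t + phase_rad)   # received tone
        i_sum += x * math.cos(2.0 * math.pi * f_lo_hz * t)      # in-phase branch
        q_sum += -x * math.sin(2.0 * math.pi * f_lo_hz * t)     # quadrature branch
    return complex(i_sum / n, q_sum / n)                        # averaging acts as the LPF


z = quadrature_downconvert(0.7)
print(abs(z), cmath.phase(z))  # magnitude close to 0.5, phase close to 0.7 rad
```

Either branch alone would only give cos(phase) or sin(phase); both are needed to recover the phase unambiguously, which is why the I and Q paths are digitized separately.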
Figure 3.11: Quadrature mixing at the receiver front-end.

A typical serial-link interface and quadrature mixer topology is shown in Figure 3.11. For the transmit driver, current-mode logic (CML) is more popular than voltage mode [7] in serial-link systems because the current source in the output driver isolates the data signal from ground noise. The binary logic states are therefore represented by the voltage levels VDD (logic 0) and VDD-Vp-p (logic 1), where Vp-p denotes the peak-to-peak voltage swing. A typical Vp-p is around 600 to 800mV differentially. Both the transmitter and the receiver are terminated by on-chip 50Ω resistors to reduce signal ringing.

As shown in Figure 3.4(a), the 1st subband does not need a down-conversion mixer since the targeted signal band is already at baseband. For the other subbands, a simple passive double-balanced mixer topology [43] is selected for its good linearity and simplicity. PMOS devices are chosen to satisfy the bias requirement at the front-end, where the DC level is around VDD.

Figure 3.12: Equivalent block diagram of passive double-balanced mixer.

The equivalent circuit of the double-balanced mixer is represented by a normalized mixing function ($m_n(t)$), a gain stage (A), and an LPF, as shown in Figure 3.12 [43]. The normalized mixing function is expressed by

$$ m_n(t) = \frac{g_T(t)}{g_{Tmax}} \cdot \frac{g(t) - g\left(t - \frac{T_{LO}}{2}\right)}{g(t) + g\left(t - \frac{T_{LO}}{2}\right)} \tag{3.3} $$

where g(t) is the time-varying switch conductance and $T_{LO}$ is the period of the sinusoidal LO. The Thevenin equivalent conductance ($g_T(t)$) in (3.3) is then expressed by
$$ g_T(t) = \frac{g(t) + g\left(t - \frac{T_{LO}}{2}\right)}{2}. \tag{3.4} $$

The conversion gain (or processing gain) of the mixer is defined as the ratio between the maximum value of $g_T(t)$ and its average value, which is

$$ G_c = \frac{g_{Tmax}}{\bar{g}_T}. \tag{3.5} $$

This suggests that the conversion gain is a function of the peak amplitude of the local oscillator voltage (through $g_{Tmax}$) and its DC level (through $\bar{g}_T$). Now, the bandwidth of the mixer can be expressed by

$$ \omega_{mixer} = \frac{\bar{g}_T}{C_L}, \tag{3.6} $$

where $C_L$ is the capacitive load between the differential outputs. Therefore, the mixer bandwidth is determined by the product of the PMOS on-resistance and the parasitic capacitance at the mixer output node, both of which depend on the DC biasing conditions and the device sizes. To prevent additional ISI caused by the RC time constant of the mixers, the mixer bandwidth in (3.6) must be larger than the LPF bandwidth, which is defined as BWsig/(2M-1) (where BWsig = input signal bandwidth and M = number of subbands). In practice, there is a trade-off between the bandwidth in (3.6) and the processing gain in (3.5). When the input DC level is fixed, the conversion gain is inversely proportional to the average conductance ($\bar{g}_T$), which is set by the LO DC level, whereas the mixer bandwidth is proportional to $\bar{g}_T$.

Figure 3.13: Simulation result for mixer bandwidth vs. gain.

The simulation result for this trade-off (Figure 3.13) suggests that a wide mixer bandwidth can be achieved by sacrificing gain. Since the signal-to-noise ratio (SNR) of the received signal in a serial-link system is large, a processing gain (actually a loss) of -6dB, for example, is still acceptable while maintaining a 2.5GHz bandwidth, which is comfortably larger than the LPF bandwidth of the 3-subband receiver at 10Gsymbols/s.

After down-conversion of the input signal, the actual division in the frequency domain is done by the LPF.
The design objective of the LPF is to achieve variable gain with multiple-order poles. Variable gain is necessary so that the output signal spans the full input dynamic range of the ADC, and multiple-order poles are needed to reduce the signal bandwidth presented to each subband ADC, which simplifies the design of the following ADC. The multiple-order poles also help provide enough selectivity between subband signals by reducing the signal overlap from the next subband. To satisfy these requirements, a 4th-order cascaded feedback topology is selected for this prototype.

Figure 3.14: Signal amplification at the front-end. (Input: 300mVp-p, -7dBm; output of each subband: 250mVp-p, -8dBm. 1st subband: level shifter -6dB, 4-stage LPF 5dB. 2nd and 3rd subbands: mixer -5dB, amplifier 5dB, level shifter -6dB, 4-stage LPF 5dB.)

The gain requirements for each stage are shown in Figure 3.14. For an input signal with a peak-to-peak voltage of 300mV (or -7dBm), a DC level shifter (or source follower) is employed to lower the DC level for the following LPF and ADC in the 1st subband. To compensate for the loss of the level shifter (-6dB), the 4-stage LPF should provide a gain of 5dB to reach the target output peak-to-peak voltage of 250mV (or -8dBm). For the 2nd and 3rd subbands, the mixer loss, which is around -5dB, must also be compensated. For this purpose, an additional amplification stage with a gain of 5dB is added between the mixer and the level shifter. The resulting output peak-to-peak voltage swing of each subband is around 250mV (or -8dBm).

Since no high gain is required at each LPF stage and a relatively wide bandwidth is necessary, a feedback topology is a good choice. Figure 3.15 shows a single-stage shunt-shunt feedback amplifier and its frequency response. With feedback, the closed-loop gain (Avc) is determined by the ratio of the feedback resistance
(Rf) to the source resistance (Rs), and the bandwidth (dominant pole, Po) is increased by the loop gain ($\beta A_{vo}$, where $A_{vo}$ is the open-loop gain). Ideally, the gain-bandwidth product of the amplifier remains the same before and after feedback, and equals gm/Co (gm = amplifier transconductance and Co = output capacitance).

Figure 3.15: A shunt-shunt feedback (a) topology and (b) frequency response.

Figure 3.16: Cascaded shunt-shunt feedback amplifier.

When the gain-bandwidth product increases, both the DC gain and the bandwidth of the amplifier increase (gray line in Figure 3.15). In a practical application, we need to consider the validity of the feedback approximation, because the feedback resistance loads the output [37]. However, if Rf is large enough, this effect can be minimized [Appendix A]. Figure 3.16 shows the cascaded shunt-shunt feedback topology. The equivalent source resistance of the 2nd stage is the output resistance of the 1st stage, which is expressed by

$$ R_{s,eq} \approx \frac{R_o}{(\text{Loop gain})} \tag{3.7} $$

where $R_o$ is the open-loop output resistance. Thus, the closed-loop gain of the 2nd stage becomes

$$ A_{vc} \approx -\frac{R_f}{R_{s,eq}} \approx -\frac{R_f}{R_o} \cdot (\text{Loop gain}). \tag{3.8} $$

With feedback, the equivalent input capacitance of the 3rd stage, which is the output capacitance of the 2nd stage, is increased by the loop gain:

$$ C_{i,eq} \approx C_i \cdot (\text{Loop gain}) \tag{3.9} $$

where $C_i$ is the open-loop input capacitance. The output capacitance of the 2nd stage is dominated by $C_{i,eq}$ in (3.9) because it is usually much larger than the self-loaded output capacitance. Since the increase in capacitance from the next stage is cancelled by the decrease in output resistance, assuming each stage is identical, the bandwidth of the 2nd stage remains the same.
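The cancellation described by (3.7) and (3.9) can be checked numerically. The sketch below assumes a single dominant pole at the inter-stage node, set by $R_{s,eq}$ and $C_{i,eq}$; the component values and function names are hypothetical and chosen only to make the point.

```python
import math


def interstage_pole_hz(r_o_ohm, c_i_f, loop_gain):
    """Pole at the node between two identical shunt-shunt feedback stages."""
    r_s_eq = r_o_ohm / loop_gain        # (3.7): output resistance lowered by loop gain
    c_i_eq = c_i_f * loop_gain          # (3.9): next-stage input capacitance raised
    return 1.0 / (2.0 * math.pi * r_s_eq * c_i_eq)


def second_stage_gain(r_f_ohm, r_o_ohm, loop_gain):
    """Closed-loop gain of the 2nd stage from (3.8)."""
    return -(r_f_ohm / r_o_ohm) * loop_gain


# Hypothetical values: Ro = 1 kOhm, Ci = 50 fF, Rf = 2 kOhm
for loop_gain in (2.0, 5.0, 10.0):
    pole = interstage_pole_hz(1e3, 50e-15, loop_gain)
    gain = second_stage_gain(2e3, 1e3, loop_gain)
    print(loop_gain, round(pole / 1e9, 3), "GHz", gain)
```

The pole frequency printed in each row is the same, while the gain scales with the loop gain, which is the behavior the text relies on for gain control without a bandwidth penalty.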
To achieve variable gain without affecting the bandwidth, we can make use of the amplifier transconductance (gm), since the loop gain is proportional to gm (because of the Avo term in Figure 3.15(b)). Figure 3.17 shows a single-stage shunt-shunt feedback amplifier with gm control and its frequency response. The gm control is achieved by adding an additional current path in parallel with the amplifier, where the current is controlled by the gate voltage (Vc). This relationship is expressed by

$$ g_m \propto I_D - I_C \propto I_D - (V_c - V_{th})^2 \tag{3.10} $$

where $V_{th}$ is the threshold voltage of the NMOS device. Therefore, when the amplifier gm is increased as shown in Figure 3.17(b), the closed-loop gain also increases without affecting the bandwidth of each LPF stage. Common-mode feedback is also used to maintain a fixed DC level at each stage input.

Figure 3.17: A shunt-shunt feedback (a) topology with gain control and (b) frequency response.

To investigate the effect of the LPF transfer function on the receiver performance, a simulation is performed in which each ADC samples the 10Gsymbols/s input at 2.5Gsamples/s and the mixer frequencies are fixed at 2GHz and 4GHz. If the LPF bandwidth is too narrow, a significant amount of signal energy is lost, as illustrated in Figure 3.18. Conversely, if the bandwidth is too wide, the aliasing caused by the overlap of signal energy from the next subband becomes significant. In both cases, the receiver performance is degraded. The filter order affects the performance in the same way. For example, a lower-order filter passes more signal energy, an effect similar to that of a wider bandwidth.
Figure 3.18: The effect of LPF bandwidth on the signal division (narrow bandwidth versus wide bandwidth).

The simulation results are shown in Figure 3.19, where the receiver performance is represented by the eye opening as a function of the filter order and bandwidth. As described in Figure 3.18, the eyes close at both extremes of the bandwidth. The eye closes on the low-bandwidth side because of signal loss and on the high-bandwidth side because of aliasing. The filter order also affects the performance by closing the eyes at different frequencies. For example, the eyes for the lower-order filters close at a relatively lower bandwidth because of their slower stop-band slopes. In the same manner, the eyes for the higher-order filters close at a higher bandwidth. When the filter order is lower than 3rd, the eye opening is significantly reduced, because the slow slope degrades the performance by increasing the aliasing. For this reason, we chose a 4th-order LPF in the prototype design, for which the optimal bandwidth range is from 800MHz to 1.2GHz.

Figure 3.19: LPF order and bandwidth effect. (Noise margin versus LPF bandwidth [GHz] for filter orders from 2nd to 6th.)

As the last topic of this section, we discuss various topologies for achieving frequency channelization and their advantages and disadvantages. The frequency channelization block, which requires good linearity, variable gain, multi-order poles, and wide output bandwidth, incorporates a down-conversion mixer, an LPF, and a wideband buffer, as shown in Figure 3.20.
Good linearity is necessary because this block is placed at the very front of the receiver, so that any non-linearity could cause significant signal distortion in the following stages. In this context, a passive double-balanced mixer is a better choice than an active mixer [24]. This choice makes even more sense given the large input signal amplitude (Vp-p = 300 to 400mV). If an off-chip AC coupling capacitor is available, an NMOS passive double-balanced mixer, which provides better performance, can be employed. For example, when a 1nF AC coupling capacitor is used, the 3dB frequency of the resulting HPF is around 3.2MHz, which is low enough not to affect the signal spectrum significantly.

Figure 3.20: A frequency channelization circuit block.

As discussed in section 3.1, variable gain and multi-order poles are required in the LPF stage because the following ADC usually has a limited input dynamic range and because a sharp LPF slope helps to isolate the subband signals, respectively. Two unit cell topologies for the multi-order LPF (Figure 3.21) are studied for the performance comparison. The 1st LPF cell employs shunt-shunt feedback to reduce the input and output resistance by the loop gain, and the 2nd topology is a simple differential pair whose bandwidth is determined by the time constant at the output node. Both topologies incorporate transconductance (gm) control for the gain and a varactor for the bandwidth adjustment. The input common-mode DC level of the shunt-shunt feedback cell is around half of the supply voltage, because a large open-loop voltage gain is required to provide a sufficiently wide closed-loop bandwidth. For the differential pair, in order to save power, the input DC level is raised as high as Vdd-V(sig,p-p)/2, where V(sig,p-p) is the peak-to-peak signal swing. When N
When N 53 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. stages o f identical LPF unit are cascaded, the resulting -3dB bandwidth is expressed by coo-^Jyf2-\ [39] where coo is -3dB bandwidth of each cell. Vout Cv Cv Vc Vin+ Vin- (b) Cv. Cv Vout ___ Vc Rf Rf Vin+ Vin- M2 (a) Figure 3.21: LPF unit cell topologies using (a) shunt-shunt feedback and (b) differential pair. The main performance difference between two topologies is the power consumption since the gain and bandwidth from each of these cells can be adjustable. According to our simulation on 4-stage o f each topology, the average power consumed by shunt-shunt feedback cell is about 200% lower than differential pair. This result is quite reasonable in that the bandwidth after feedback is extended by the amount o f loop-gain without drawing additional current through the output load. Another important fact to be considered in channelization circuit is its output bandwidth. Even though each subband signal bandwidth is reduced to a fraction o f input bandwidth, wide bandwidth buffer is necessary to provide enough bandwidth for the following ADC, especially when multi-bit ADC is needed. Three wideband buffer topologies are shown in Figure 3.22. A simple source follower (3.22(a)) provides wide bandwidth at the output since its effective output resistance is 1/gm. However, the usage of this topology is limited because DC level difference between 54 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. input and output is necessarily large to provide enough overdrive voltage o f M l. The gain loss o f this topology (typically 0.6 to 0.7) is another limiting factor o f its usage. Also, the size o f input transistor needs to be quite large to increase its transconductance (gm), resulting in high input capacitance. 
The next topology, shown in Figure 3.22(b), is known as the shunt-peaking (or inductive peaking) amplifier. The gain of a simple common-source amplifier falls off as frequency increases, because the impedance of the load capacitor (C) decreases with frequency. The idea of shunt peaking is to add an inductive load to compensate for this dependency (i.e., the inductor introduces a zero), which results in a roughly constant load impedance over a broader frequency range. However, this topology often causes not only a peak in the frequency response but also non-uniform phase relationships among the Fourier components of the signal. Careful design is needed to balance these effects. The bandwidth enhancement from shunt peaking ranges from 1.6 to 1.7 times the bandwidth given by the RC time constant [24]. One disadvantage of this topology is the area overhead of the on-chip inductor load.

Figure 3.22: Wideband buffer circuits (a) source follower, (b) shunt-peaking, and (c) active shunt-peaking.

To moderate the area overhead of the conventional shunt-peaking amplifier, an active device is often used to realize the inductance. This active shunt-peaking amplifier (Figure 3.22(c)) is widely used in optical communication receiver front-ends [52][41][54]. The output impedance of this topology is expressed by [39]

$$ Z_{out} = \frac{s R_s C_{gs2} + 1}{g_{m2} + s C_{gs2}} \tag{3.11} $$

where $C_{gs2}$ and $g_{m2}$ denote the gate-source capacitance and transconductance of M2, respectively. When $R_s \gg 1/g_{m2}$, the output impedance can be modeled as an inductor with series and parallel resistors, as shown in Figure 3.22(c), where $R_1 = R_s - 1/g_{m2}$, $R_2 = 1/g_{m2}$ and

$$ L = \frac{C_{gs2}}{g_{m2}} \left( R_s - \frac{1}{g_{m2}} \right). \tag{3.12} $$

For a high-quality inductance, $R_1$ should be maximized and $R_2$ should be minimized. This is simply achieved by increasing $g_{m2}$. However, increasing $g_{m2}$ consumes a lot of output voltage headroom, since the overdrive voltage of M2 must be increased.
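The inductive model of the active load can be sanity-checked numerically. The sketch below compares the impedance in (3.11) with one network that reproduces it exactly: $R_2$ in series with the parallel combination of L and $R_1$. Whether this is the exact arrangement drawn in Figure 3.22(c) is an assumption; the device values are hypothetical.

```python
import math


def z_active_load(s, r_s, g_m2, c_gs2):
    """Output impedance of the active load from (3.11)."""
    return (s * r_s * c_gs2 + 1.0) / (g_m2 + s * c_gs2)


def z_equivalent(s, r_s, g_m2, c_gs2):
    """R2 in series with (L parallel R1), using R1, R2 and L as defined in the text."""
    r1 = r_s - 1.0 / g_m2
    r2 = 1.0 / g_m2
    ind = c_gs2 * r1 / g_m2                      # (3.12)
    z_par = (s * ind * r1) / (s * ind + r1)      # inductor in parallel with R1
    return r2 + z_par


# Hypothetical values: Rs = 2 kOhm, gm2 = 5 mS, Cgs2 = 20 fF
for f_hz in (1e8, 1e9, 1e10):
    s = 1j * 2.0 * math.pi * f_hz
    za = z_active_load(s, 2e3, 5e-3, 20e-15)
    zb = z_equivalent(s, 2e3, 5e-3, 20e-15)
    print(f_hz, abs(za), abs(za - zb))
```

The magnitude rises from about 1/gm2 at low frequency toward Rs at high frequency, which is the inductive behavior exploited for peaking, and the two expressions agree to numerical precision.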
Because of this, the topology is not a good choice for supply voltages below 1.5V. One simple solution is to connect Rs to a separate supply that is higher than the main supply voltage [41], when one is available in the design.

Two more wideband amplifiers are shown in Figure 3.23. The first, a differential pair, makes use of source degeneration so that the gain roll-off caused by the pole at the output node is compensated by increasing the effective transconductance at high frequencies. The equivalent transconductance of the half circuit can be expressed by

$$ G_m = \frac{g_m}{1 + g_m \left( \frac{R_s}{2} \parallel \frac{1}{2 s C_s} \right)} = \frac{g_m (s R_s C_s + 1)}{s R_s C_s + 1 + \frac{g_m R_s}{2}}. \tag{3.13} $$

Thus, the equivalent transconductance contains a zero and a pole. The frequency response of the voltage gain can then be expressed by

$$ |A_v| = \frac{|A_{vo}|}{|1 + s R_D C_L|} = \frac{G_m R_D}{|1 + s R_D C_L|} \tag{3.14} $$

where $A_{vo}$ denotes the low-frequency gain. If the zero in (3.13) cancels the dominant pole at the output node in (3.14), the effective pole of this amplifier is extended to $(1 + g_m R_s / 2)/(R_D C_L)$, which is much higher than the original dominant pole. For the wideband buffer application, this topology has a disadvantage: the area consumed by $R_s C_s$ ($= R_D C_L$) is proportional to $R_D C_L$, which is often very large in multi-bit ADCs. This topology also consumes a lot of static current, since the current flowing through the load resistance ($R_D$), which usually has a low value, must be large enough to keep the output common-mode level around Vdd/2.

The capacitive degeneration topology shows a distinctive behavior when its pole-zero cancellation is not accurate. The half circuit of this topology is shown in Figure 3.24(a).
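The pole-zero cancellation implied by (3.13) and (3.14) can be confirmed with a short numerical sketch. It assumes the ideal half-circuit expressions above with no parasitic capacitance at the source node; the component values and function names are hypothetical.

```python
import math


def gm_degenerated(s, g_m, r_s, c_s):
    """Equivalent transconductance of the degenerated half circuit, (3.13)."""
    return g_m * (s * r_s * c_s + 1.0) / (s * r_s * c_s + 1.0 + g_m * r_s / 2.0)


def stage_gain(s, g_m, r_s, c_s, r_d, c_l):
    """Voltage gain including the output pole, following (3.14)."""
    return gm_degenerated(s, g_m, r_s, c_s) * r_d / (1.0 + s * r_d * c_l)


def single_pole_gain(s, g_m, r_s, r_d, c_l):
    """Expected response when the zero cancels the output pole exactly."""
    k = g_m * r_s / 2.0
    a0 = g_m * r_d / (1.0 + k)              # degenerated low-frequency gain
    w_p = (1.0 + k) / (r_d * c_l)           # extended pole
    return a0 / (1.0 + s / w_p)


# Hypothetical values: gm = 10 mS, Rs = 200 Ohm, RD = 100 Ohm, CL = 100 fF
g_m, r_s, r_d, c_l = 10e-3, 200.0, 100.0, 100e-15
c_s = r_d * c_l / r_s                        # choose Cs so that Rs*Cs = RD*CL
s = 1j * 2.0 * math.pi * 5e9
print(abs(stage_gain(s, g_m, r_s, c_s, r_d, c_l)), abs(single_pole_gain(s, g_m, r_s, r_d, c_l)))
```

With matched time constants the two magnitudes coincide, and the pole sits a factor of (1 + gm·Rs/2) above 1/(RD·CL), bought at the cost of the same factor in low-frequency gain.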
The parasitic capacitance ($C_p$) at node X is quite large since it includes the parasitic capacitances at the source of amplifier M1 and at the drain of the current source (Iss). For accurate pole-zero cancellation, the source capacitance ($2C_s$) should exactly match the load capacitance ($C_L$).

Figure 3.23: (a) Capacitive degeneration amplifier (b) modified Cherry-Hooper amplifier.

However, when $C_L \ll C_p$, the effective source capacitance is dominated by the parasitic capacitance ($C_p$), so that the zero (z1) is located at a lower frequency than the dominant output pole (p1), as shown in Figure 3.24(b), where p2 is the other pole from (3.13). This undesirable peak in the high-frequency region can be avoided either by adjusting $R_s$ to raise the low-frequency gain or by adding load capacitance to match the capacitance at node X (that is, $C_p + 2C_s$). This high-frequency boosting effect is often used to implement analog equalizers, since the low-frequency gain and the high-frequency peak can be controlled separately [14][5]. This is achieved by replacing the passive source elements $R_s$ and $C_s$ with a variable resistor and a varactor, respectively.

The Cherry-Hooper amplifier [4][39] in Figure 3.23(b) employs local feedback through $R_F$ to improve the output bandwidth. This circuit provides two signal paths from input to output: one through M3 and another through $R_F$. The small-signal drain current of M1 flows through $R_F$ and generates a voltage drop of $g_{m1} V_{in} R_F$. Thus, the voltage at node X is

$$ V_X = V_{out} - g_{m1} V_{in} R_F. \tag{3.15} $$

The small-signal current created by M3 equals $g_{m3} V_X$, which should flow through $R_F$ if $R_D \gg R_F$ is assumed.
Therefore, the small-signal currents created by M1 and M3 should be equal, which results in

$$\frac{V_{out}}{V_{in}} = g_{m1} \cdot R_F - \frac{g_{m1}}{g_{m3}} \qquad (3.16)$$

where Vin represents the differential input voltage.

Figure 3.24: (a) Half circuit of capacitive degeneration and (b) its frequency response.

When RF is much larger than 1/gm3, the voltage gain of this topology is simply gm1·RF. The main advantage of this circuit is that the small-signal resistance seen at node Y equals 1/gm3 (≪ RF, RD), which allows a high-frequency pole at gm3/CY, where CY is the capacitance at the output node (Y). Another advantage of this circuit is that the input capacitance is determined by transistor M1, which is typically smaller than M2, resulting in low input capacitance. The main disadvantage of the Cherry-Hooper amplifier is that its minimum supply voltage is so high that the output voltage swing is limited. To overcome this problem, an additional biasing resistance (RH) is used to set the bias level at node X so that the voltage drop across RF is removed from the minimum required supply voltage.

| Topology | Power [mW] @2.5V | Cin [fF] | Vo,sat(p-p) [mV] | Area overhead |
| Differential source-follower* | 1.5 | 70 | 520 | Small |
| Active inductance peaking* | 8.5 | 60 | 850 | Medium |
| Capacitive degeneration | 7.3 | 60 | >1000 | Large |
| Modified Cherry-Hooper | 2.9 | 8 | 530 | Medium |
* This circuit includes common-mode feedback.

Table 3.1: Performance comparison of various wideband amplifiers.

The four differential wideband buffers discussed above are simulated for performance comparison, and the results are summarized in Table 3.1. For a fair comparison, the output bandwidth is fixed and the voltage gain is kept within ±1 dB. In the power comparison, the inductive peaking amplifier and the capacitive degeneration amplifier consume a lot of power because of their bandwidth-power tradeoff.
As discussed above, the input capacitance of the Cherry-Hooper amplifier is very small since no high gain is required from the input transistor M1 in Figure 3.23(b). The maximum output saturation voltage (Vo,sat(p-p) in Table 3.1) is obtained by increasing the amplitude of a low-frequency sinusoid applied to the input of each topology. Vo,sat(p-p) of the capacitive degeneration circuit is the largest because its output structure is the same as that of a differential common-source pair. However, the area overhead of this amplifier is the worst because of the passive R and C that duplicate the output node.

Figure 3.25: Bandwidth of various wideband amplifiers according to ADC bits.

The bandwidth of the wideband amplifiers as a function of the number of ADC bits is shown in Figure 3.25. In this simulation, the input transistor size of each ADC is assumed to be 3 µm and a two-time-interleaved flash-type ADC is used. The parasitic capacitance due to wiring metal is assumed to be the same as that of the input transistors. Therefore, for an N-bit ADC, the total load capacitance of the wideband buffer becomes

$$2 \cdot 2 \cdot (\text{input transistor capacitance}) \cdot (2^N - 1). \qquad (3.17)$$

The unusually high bandwidth of the capacitive degenerated buffer for lower-bit ADCs is due to the high-frequency peaking phenomenon discussed earlier. Also, the bandwidth degradation slope of the modified Cherry-Hooper amplifier arises because the self-loading of its output transistors (M3 and M4 in Figure 3.23(b)) dominates the capacitance at the output node (CY), especially when the load capacitance is relatively small.
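To illustrate how quickly the buffer load in (3.17) grows with resolution, the sketch below evaluates the expression for several bit counts. The reading of the two factors of 2 (two interleaved ADCs, and wiring capacitance equal to the transistor capacitance) follows the assumptions stated above; the 5 fF input transistor capacitance is a hypothetical value, not a number from this work.

```python
def buffer_load_capacitance(n_bits, c_in, interleave=2, wiring_factor=2):
    """Total buffer load per (3.17): 2 * 2 * C_in * (2^N - 1).

    interleave    -- number of time-interleaved ADCs sharing the buffer
    wiring_factor -- 2 when wiring capacitance equals the transistor capacitance
    """
    comparators = 2 ** n_bits - 1  # a flash ADC needs 2^N - 1 comparators
    return interleave * wiring_factor * c_in * comparators

C_IN = 5e-15  # hypothetical input transistor capacitance [F]
loads = {n: buffer_load_capacitance(n, C_IN) for n in range(1, 7)}
```

Each additional bit roughly doubles the load, which is why the buffer bandwidth in Figure 3.25 falls as the ADC resolution increases.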
From this simulation, we conclude that all four topologies discussed provide enough bandwidth (about 1.5 GHz) for the implementation of a 4-bit ADC when the LPF bandwidth is assumed to be 1 GHz.

Figure 3.26: Two frequency channelization schemes. (Approach 1: 50 Ω input, PMOS mixer, differential pair LPF, source follower, ADC. Approach 2: 50 Ω input, NMOS mixer, shunt-shunt feedback LPF, modified Cherry-Hooper buffer, ADC.)

Based on the above discussion, two frequency channelization schemes are built as shown in Figure 3.26. The first approach consists of a PMOS double-balanced mixer, a 4-stage differential pair LPF, and a differential source-follower, whereas the second approach employs an NMOS double-balanced mixer, a shunt-shunt feedback LPF, and a modified Cherry-Hooper wideband buffer. In the first approach, the source-follower is the only choice for the wideband buffer because the output DC level of the LPF stage is close to the supply voltage. In the second approach, however, any topology except the source-follower can be used. In this experiment, the modified Cherry-Hooper amplifier is selected since it has no serious disadvantages.

The performance of the two frequency channelization approaches is simulated for comparison, as listed in Table 3.2. An adjustable gain of 6 dB is achieved by both circuits. The variable bandwidth of the shunt-shunt feedback topology is a little less than that of the resistive-load differential pair, since the active PMOS load at the output node adds capacitance, as shown in Figure 3.21(a), which limits the size of the varactor. The average power consumed by the 2nd approach is much lower, mostly because the feedback topology of the LPF cell requires much less current than the resistive-load differential pair. The maximum output saturation voltage, when a 100 MHz sinusoid is applied, is affected by the number of LPF stages, not by the output buffers.
This is because the gain at a 100 MHz input decreases as the filter order increases. The simulation result indicates that the 1st approach is slightly better than the 2nd. The system linearity, including the front-end mixer, is investigated through the input/output-referred 1 dB compression points using PSS (periodic steady-state) analysis. We found that the 2nd approach is much better than the 1st, mostly because the passive double-balanced mixer based on NMOS provides better linearity (2 dB higher than that of PMOS). However, if the output voltage swing of both topologies is limited to 250 mVp-p (-8 dBm), either topology provides good linearity (below the 1 dB compression point).

| | Adjustable Gain [dB] | Adjustable BW [GHz] | Avg. Power [mW] | Vout,sat [mVp-p] | 1 dB Compression IN/OUT [dBm]* |
| Approach 1 | 6 | 0.92 | 11.3 | 450 | -7.9/-7.4 |
| Approach 2 | 6 | 0.60 | 7.8 | 411 | -4.9/-5.2 |
* Mixer circuit is included in this simulation.

Table 3.2: Performance comparison between two frequency channelizers.

3.4 Analog-to-Digital Conversion Circuit

The digitization of the frequency-channelized in-phase/quadrature-phase subband signals is performed by the following multi-bit ADC. When Nyquist-rate sampling is assumed, it is reported [12] that a 3-bit ADC provides sufficient statistics for signal detection and equalization in the following DSP. The sampling environment and frequency relationship at the 2nd subband are shown in Figure 3.27 as an example. For Nyquist sampling, the sampling frequency (fs) of the ADC should equal 2fo, where fo is the LPF bandwidth. In order to lower the sampling frequency, two time-interleaved ADCs, each operating at fo, are employed to achieve the effective Nyquist rate. For high-speed sampling, a flash-type ADC is used, and each ADC consists of a differential comparator, an amplifier, and a latch.
Figure 3.27: Two time-interleaved ADCs in the 2nd subband.

Figure 3.28: 3-bit differential comparators.

The 7 differential voltage steps of the 3-bit ADC are generated by a combination of the balanced and unbalanced cells in Figure 3.28. The balanced comparator cell in the middle (a3) is a simple differential amplifier with resistive load, and the remaining cells are unbalanced by additional transistor pairs (M3 and M4). The unbalanced comparators adjust the amount of current through the amplifier to generate a voltage difference (ΔV) at the output [9]. The ΔV of each unbalanced cell can be defined as the magnitude of the input voltage difference (|Vp - Vn|) required to equalize the output currents (i1 = i2). For the first unbalanced comparator in Figure 3.28, when transistors M1 and M2 (M3 and M4) are matched to each other, let the output current i1 equal i2. Then i1 can be expressed by

$$i_1 = i_{ds3} + i_{ds1} = i_{ds2} = i_2 \qquad (3.18)$$

where ids is the drain-to-source current of a transistor in the saturation region. To first order, ids is

$$i_{ds} = \frac{1}{2} K_n \frac{W}{L} (V_{gs} - V_{th})^2 \qquad (3.19)$$

where Kn = µn·Cox (µn = electron mobility, Cox = oxide capacitance per unit area) and Vth is the threshold voltage. Applying (3.19) to (3.18) gives

$$W_3 (V_{gs3} - V_{th3})^2 - W_1 \left\{ \Delta V^2 + 2 \Delta V (V_{gs1} - V_{th1}) \right\} = 0 \qquad (3.20)$$

where ΔV = Vgs2 - Vgs1 and Vth1 = Vth3. Since it is reasonable to assume 2(Vgs1 - Vth1) ≫ ΔV, we can approximate (3.20) as

$$\Delta V \approx \frac{1}{2} \frac{W_3}{W_1} \frac{(V_{gs3} - V_{th3})^2}{(V_{gs1} - V_{th1})} \qquad (3.21)$$

This suggests that ΔV for each unbalanced comparator cell is set by the size ratio of transistors M3 and M1 when the bias conditions are fixed. Also, at a given size ratio, ΔV is still controlled by Vref (or Vgs3), which determines the adjustable voltage steps, or the input dynamic range, of the ADCs.

Figure 3.29: Simulation result to determine comparator input dynamic range. (Y axis: deviation [%]; X axis: 3·VLSB [mV].)

The input dynamic range of the ADC, defined as the maximum voltage difference that can be applied to the comparator input without causing non-linear voltage steps, is mostly dominated by comparator non-linearity. As shown in Figure 3.29, comparator non-linearity bounds the upper and lower limits of the input signal amplitude. The Y axis is the percentage deviation from the ideal voltage step, and the X axis is the largest voltage step (3·VLSB) generated by an unbalanced comparator cell, where VLSB is the voltage magnitude of the least-significant bit (LSB). The upper limit of the dynamic range is defined by the maximum allowable input signal amplitude before transistor M1 enters the triode (or linear) region. When a small-amplitude input is applied, Vref needs to be increased to generate small voltage steps. However, as Vref increases beyond a certain voltage level, transistor M3 no longer operates in the saturation region, where the linear relationship between the transistor size ratio (W3/W1) and ΔV in (3.21) is maintained. Thus, the lower limit is defined by the minimum input signal amplitude before transistor M3 enters the triode region. When the maximum allowable non-linearity is set to 10%, the input dynamic range in peak-to-peak voltage is about 70 to 320 mVp-p (or -19 to -6 dBm in a 50 Ω system).

Figure 3.30: Positive regenerative amplifier. (Clock low: input tracking, outputs precharged high. Clock high: positive feedback enabled, output latched.)
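Two calculations from this section are easy to verify numerically: the accuracy of the approximation (3.21) relative to the exact solution of (3.20), and the conversion of the quoted peak-to-peak voltages to dBm in a 50 Ω system. The sketch below is illustrative only; the size ratio and overdrive voltages are hypothetical, and the function names are ours.

```python
import math

def delta_v_exact(w3_over_w1, vov3, vov1):
    """Positive root of (3.20): dV^2 + 2*dV*vov1 - (W3/W1)*vov3^2 = 0.

    vov3 = Vgs3 - Vth3 and vov1 = Vgs1 - Vth1 are overdrive voltages [V].
    """
    return -vov1 + math.sqrt(vov1 ** 2 + w3_over_w1 * vov3 ** 2)

def delta_v_approx(w3_over_w1, vov3, vov1):
    """First-order approximation (3.21), valid when 2*vov1 >> dV."""
    return 0.5 * w3_over_w1 * vov3 ** 2 / vov1

def vpp_to_dbm(vpp, r_ohm=50.0):
    """Power of a sinusoid with peak-to-peak amplitude vpp into r_ohm, in dBm."""
    v_rms = vpp / (2 * math.sqrt(2))
    p_watt = v_rms ** 2 / r_ohm
    return 10 * math.log10(p_watt / 1e-3)

# Hypothetical bias point, for illustration only.
exact = delta_v_exact(0.25, 0.2, 0.4)
approx = delta_v_approx(0.25, 0.2, 0.4)
relative_error = abs(approx - exact) / exact

# Voltage limits quoted in the text.
lower_dbm = vpp_to_dbm(0.070)
upper_dbm = vpp_to_dbm(0.320)
swing_dbm = vpp_to_dbm(0.250)
```

For this bias point the approximation overestimates ΔV by under 2%, and the conversion is consistent with the quoted figures of about -19 dBm, -6 dBm, and -8 dBm for 70, 320, and 250 mVp-p respectively.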
The analog output of each comparator is further amplified by the following positive regenerative amplifier (Figure 3.30) before being latched. During the negative phase of the clock, amplifier transistors M1 and M2 track the input signal and the differential outputs are pre-charged to VDD. At the positive edge of the clock, regenerative positive feedback makes the gain increase exponentially as a function of delay time. The increase of gain is expressed by the voltage difference between nodes x and y (Vxy), which is

$$V_{xy} \propto e^{t/\tau} \qquad (3.22)$$

where τ = Cp/(gm3 - gm1) [40]. Therefore, for maximum gain, the parasitic capacitance at nodes x and y should be minimized and transistor M3 needs to be larger than M1.

Figure 3.31: (a) DNL definition and (b) ADC dynamic range simulation result.

One way to measure the input dynamic range of the ADC is through differential non-linearity (DNL), which is a measure of the non-linearity resulting from the input amplitude. As in Figure 3.31(a), DNL is defined as the voltage difference between the ideal voltage thresholds (or steps) and the real transition voltages when the state transition is completed. The simulation result is shown in Figure 3.31(b). The Y axis is the ratio of the maximum DNL to VLSB in percent, expressed by

$$\frac{\text{Max. DNL}}{V_{LSB}} \times 100\ [\%]. \qquad (3.23)$$

The simulated input dynamic range of the ADC is 80 to 330 mVp-p, which is slightly worse than the comparator dynamic range. This confirms that the non-linearity caused by input amplitude is mostly due to the comparator rather than the rest of the ADC circuits.
| a6 a5 a4 a3 a2 a1 a0 | b2 b1 b0 |
| 0 0 0 0 0 0 0 | 0 0 0 |
| 0 0 0 0 0 0 1 | 0 0 1 |
| 0 0 0 0 0 1 1 | 0 1 1 |
| 0 0 0 0 1 1 1 | 0 1 0 |
| 0 0 0 1 1 1 1 | 1 1 0 |
| 0 0 1 1 1 1 1 | 1 1 1 |
| 0 1 1 1 1 1 1 | 1 0 1 |
| 1 1 1 1 1 1 1 | 1 0 0 |

Figure 3.32: (a) 3-bit thermometer-to-Gray code mapping and (b) look-up table conversion block.

Figure 3.33: 2:1 transmit multiplexer.

The 7-bit thermometer code output (Figure 3.32(a)) from each ADC is converted to a 3-bit Gray code by a pre-stored look-up table circuit driven by the clock, as shown in Figure 3.32(b). Gray coding is used to ensure that a nearest-symbol error results in only one bit error [33]. For experimental purposes, a 2:1 multiplexer (Figure 3.33) is employed to transmit 2 data bits in one clock period so that the number of output pads is reduced by half.

Figure 3.34: (a) Cyclic-prefix coding and (b) demodulation (single-channel).

3.5 Digital Signal Processing

The FCR basically performs multi-bit analog-to-digital data conversion. Therefore, as long as sufficient statistics are provided by the converter, ideally any modulation scheme can be processed in the digital domain.
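The thermometer-to-Gray mapping of Figure 3.32(a) can be reproduced in software as a quick check. The sketch below is a behavioral model only: it counts the ones in the thermometer code and applies the standard binary-to-Gray conversion, whereas the chip uses a clocked look-up table. The function name and the dictionary are ours.

```python
def thermometer_to_gray(therm_bits):
    """Map a 7-bit thermometer code (a6..a0) to a 3-bit Gray code (b2, b1, b0)."""
    count = sum(therm_bits)        # number of comparators that tripped (0..7)
    gray = count ^ (count >> 1)    # standard binary-to-Gray conversion
    return tuple((gray >> shift) & 1 for shift in (2, 1, 0))

# Build the full look-up table: n ones at the least-significant end.
LOOKUP = {}
for n in range(8):
    code = tuple([0] * (7 - n) + [1] * n)
    LOOKUP[code] = thermometer_to_gray(code)

def hamming_distance(a, b):
    """Number of bit positions in which two codes differ."""
    return sum(x != y for x, y in zip(a, b))

gray_sequence = [thermometer_to_gray([0] * (7 - n) + [1] * n) for n in range(8)]
```

Adjacent entries of `gray_sequence` differ in exactly one bit, which is the property that limits a nearest-symbol error to a single bit error.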
As discussed in Section 2.4, a single-carrier cyclic-prefix (SC-CP) system is employed in this experiment to enable block-by-block processing of data for testing convenience and to perform computationally efficient frequency-domain equalization [8]. The main idea of this system is to copy the last p symbols to the beginning of each transmitted block (Figure 3.34(a)) so that frequency-domain equalization is possible. The length of the cyclic prefix is set to exceed the combined impulse response of the propagation channel and the receiver front-end. Although the cyclic prefix is redundant information that is not used for signal detection, it enables us to derive an efficient receiver structure.

An example of a single-channel SC-CP system is illustrated in Figure 3.34. When a transmitted signal vector a, which has a prefix of p symbols at each data block of k symbols (Figure 3.34(a)), is sent, the received signal vector x is represented by C·a + v, where C is the channel matrix and v is the additive noise along the signal path. The regularity in the transmitted data results in a circulant channel matrix (C), which is decomposed as [48]

$$C = W^H \Lambda W, \qquad (3.24)$$

where W is the discrete Fourier transform (DFT) matrix, W^H is the Hermitian of the DFT matrix (W), and Λ is a diagonal matrix. Therefore, the demodulation process (Figure 3.34(b)) employs the fast Fourier transform (FFT) and the inverse FFT, for which efficient signal processing algorithms are well developed, and a set of one-tap equalizers to equalize the circulant channel matrix.

Figure 3.35: DSP structure of 3-subband channelized receiver.

The DSP structure of the proposed 3-subband receiver is shown in Figure 3.35.
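The single-channel SC-CP idea can be illustrated with a short, noise-free simulation. This is a sketch under simplifying assumptions: the channel taps are hypothetical and known at the receiver, the one-tap equalizer is a plain zero-forcing division per frequency bin rather than the adaptive equalizer used in this work, and a direct DFT stands in for the FFT.

```python
import cmath

def dft(x, inverse=False):
    """Direct O(n^2) DFT; an FFT would be used in practice."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for k in range(n))
           for m in range(n)]
    return [v / n for v in out] if inverse else out

def add_cyclic_prefix(block, p):
    """Copy the last p symbols of the block to its beginning."""
    return block[-p:] + block

def apply_channel(tx, h):
    """Linear convolution of tx with channel taps h, truncated to len(tx)."""
    return [sum(h[j] * tx[i - j] for j in range(len(h)) if i - j >= 0)
            for i in range(len(tx))]

def sc_cp_demodulate(rx, h, k, p):
    """Drop the prefix, then equalize with one complex tap per frequency bin."""
    body = rx[p:p + k]                       # prefix removal makes the channel circulant
    h_freq = dft(h + [0.0] * (k - len(h)))   # diagonal of Lambda in (3.24)
    x_freq = dft(body)
    equalized = [x_freq[m] / h_freq[m] for m in range(k)]  # zero-forcing one-tap equalizer
    return dft(equalized, inverse=True)

symbols = [1, -1, 1, 1, -1, -1, 1, -1]   # one block of k = 8 symbols
taps = [1.0, 0.5, 0.2]                   # hypothetical channel impulse response
prefix_len = 3                           # exceeds the channel memory of 2 symbols

received = apply_channel(add_cyclic_prefix(symbols, prefix_len), taps)
recovered = [v.real for v in sc_cp_demodulate(received, taps, len(symbols), prefix_len)]
```

Because the prefix is longer than the channel memory, the samples kept after prefix removal are a circular convolution of the block with the channel, so a single division per bin recovers the block exactly in the absence of noise.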
When a block of k data symbols is sent, we can represent the block in vector form as a = [a_0, a_1, ..., a_{k-1}]^T, where T denotes transpose. In order to achieve half-symbol-spaced equalization that covers twice the signal bandwidth, the transmitted symbol sequence is upsampled by 2, resulting in s = [a_0, 0, a_1, 0, ..., a_{k-1}, 0]^T. Therefore, when M ADCs are employed to collect N samples in a block from each subband, the relationship is 2k = MN. If x_m[n] represents the nth sample collected by the mth subband, where m = 0,...,M-1 and n = 0,...,N-1, the nth samples of all subbands are expressed by the vector x[n] = [x_0[n], x_1[n], ..., x_{M-1}[n]]^T. By defining the subband sample vector x = [x[0]^T, ..., x[N-1]^T]^T and the additive noise vector v = [v[0]^T, ..., v[N-1]^T]^T, the received samples can be expressed by

$$x = G s + v \qquad (3.25)$$

where the channel matrix G represents the combined effect of the propagation channel and the frequency channelizer. The regular structure (cyclic prefix) in the transmitted block simplifies G, in the case of M subbands, to a block circulant matrix (BCM), which is

$$G = \begin{bmatrix} G_0 & G_{N-1} & \cdots & G_1 \\ G_1 & G_0 & \cdots & G_2 \\ \vdots & \vdots & \ddots & \vdots \\ G_{N-1} & G_{N-2} & \cdots & G_0 \end{bmatrix} \qquad (3.26)$$

where each matrix in the last M rows, [G_{N-1}, ..., G_1, G_0], is the impulse response of the M subbands. This block circulant matrix can be decomposed [51] as

$$G = F_B^{-1} D F_B \qquad (3.27)$$

where F_B = e^{-j2πkn/N}·I_M and F_B^{-1} = e^{j2πkn/N}·I_M (with k, n = 0,...,N-1) are the DFT-related matrix and its inverse, respectively. The D matrix in (3.27) is a block diagonal matrix represented by D = diag(G(0), ..., G(N-1)), where diag() denotes a block diagonal matrix with the matrices inside the parentheses placed on the diagonal.
The estimate of the upsampled data s can be obtained by applying the inverse channel matrix, G^{-1} = F_B^{-1}·D^{-1}·F_B, to x in (3.25). The inverse block diagonal matrix D^{-1} can be replaced by an equalization matrix (D_w^{-1}) of the same structure, i.e. a block diagonal matrix with N (M×M) matrices on the diagonal. The coefficients of D_w^{-1} are updated adaptively in time for maximum estimation performance. Therefore, the resultant estimate of s is expressed by

$$\hat{s} = F_B^{-1} \cdot D_w^{-1} \cdot F_B \cdot x \qquad (3.28)$$

where the perfect equalizer in the absence of noise is D_w^{-1} = D^{-1}. The transmitted symbols a can be recovered by removing the even rows of F_B^{-1}, resulting in F_0, and interleaving the outputs of the IFFTs in time, i.e.

$$\hat{a} = F_0 \cdot D_w^{-1} \cdot F_B \cdot x. \qquad (3.29)$$

3.6 Summary

This chapter has focused on the architectural advantages of the proposed receiver and a detailed explanation of each circuit block. We have also discussed various topologies for the frequency channelization circuits, which are the most important part of the proposed receiver design, by investigating the advantages and disadvantages of each topology in terms of linearity, bandwidth, and power. A conceptual explanation of the digital signal processing part is added to help in understanding the estimation and equalization processes.

The main challenges in implementing high-resolution ADCs in modern multi-giga serial-links are to overcome the limited bandwidth of the analog front-end and to reduce the effects of sampling jitter. We have shown that the proposed FCR overcomes these limitations by channelizing the spectrum of the received signal into multiple subbands, so that not only is the bandwidth requirement of each ADC much reduced, but the effect of sampling jitter on ADC performance is also decreased by as much. This also reduces the power consumption of each sampler, since a bandwidth as wide as the signal bandwidth is no longer necessary at its output node.
The frequency channelization circuit, which is composed of analog mixers and LPFs, requires good linearity, variable gain, multi-order poles, and wide output bandwidth. In particular, the wideband buffer at the output of the frequency channelizer is important since enough sampling bandwidth must be provided for the following multi-bit ADCs. The performance of four different wideband buffer topologies is investigated to show their advantages and disadvantages in terms of linearity, input capacitance, power, and area overhead. Based on this research, two example channelization schemes are suggested in Figure 3.26 and their performance is listed in Table 3.2. With wideband buffers at the output, the simulation results show that a 4-bit differential ADC can be achieved at each subband.

The differential topology for the ADC at each subband is selected to remove the need for offset compensation circuits. Each ADC consists of a comparator, an amplifier, and a latch, and its input dynamic range is mostly determined by the linearity of the comparator. The complexity of the ADC circuit is closely correlated with the efficiency of the digital signal processing part. Therefore, the goal of the DSP block is to reduce the computational complexity without losing bit-error-rate performance. Ideally, any demodulation scheme can be used to estimate the transmitted symbols since the frequency channelization block provides sufficient statistics for signal recovery. In this research, the single-carrier cyclic-prefix system is employed for the estimation and equalization of received symbols since it allows computationally efficient frequency-domain equalization and block processing of data for experimental purposes.
Chapter 4 Clock and Frequency Generation

The FCR requires sinusoidal frequencies to drive the front-end mixers and a clock to operate the ADCs and baseband circuits. For the 3-subband FCR, we need two 4-phase sinusoidal frequencies (fo, 2fo) and one clock (fo/2), as shown in Figure 4.1(a). The clock and frequency generation scheme is shown in Figure 4.1(b). When an external 2-phase (or differential) fo is provided, the 1st mixer frequencies (fo) are generated by a 4-phase filter (or poly-phase filter), and the 2nd mixer frequencies are generated by the combination of a frequency doubler and a 4-phase filter. Also, the differential clock (fo/2) for the 2 time-interleaved ADCs and the digital circuits is generated by a frequency divider. In this chapter, we discuss the details of each circuit block.

Figure 4.1: (a) Proposed receiver with 3 subbands, (b) Frequency and clock generation scheme.

4.1 Local Oscillator Frequency Generation

A poly-phase (or 4-phase) filter based on a simple passive RC network generates the 4-phase local oscillator frequencies for the mixers, from which down-converted complex signals are generated and processed separately (real and imaginary) in the DSP. Figure 4.2 shows the procedure for generating 4-phase frequencies from a 2-phase input.
When a 2-phase frequency (V∠0 and V∠180) is applied to the differential input nodes (1(2) and 3(4)), we can assume that the sum of two opposite sequences is applied to the 4 inputs, each separated by 90° [2]. As described at the bottom of Figure 4.2, a differential vector can be decomposed into two 4-phase vector sequences whose rotational directions are opposite to each other. The output (Vo) is then the sum of the two outputs (Vo1 and Vo2) from each sequence. When the superposition principle is applied to obtain the output of each sequence, there are two signal paths (arrows in the figure) for each output: a low-pass network and a high-pass network. Thus, Vo1 is the sum of the LP filter output driven by V∠90 and the HP filter output driven by V∠0. Since the applied frequency is 1/(2πRC), Vo1 is expressed as

$$V_{o1} = V\angle 90 \cdot \frac{1}{\sqrt{2}}\angle{-45} + V\angle 0 \cdot \frac{1}{\sqrt{2}}\angle 45 = \sqrt{2} \cdot V\angle 45 \qquad (4.1)$$

In the same manner, Vo2 becomes

$$V_{o2} = V\angle 270 \cdot \frac{1}{\sqrt{2}}\angle{-45} + V\angle 0 \cdot \frac{1}{\sqrt{2}}\angle 45 = 0 \qquad (4.2)$$

Therefore, the first input sequence creates a 3 dB gain and a 45° phase increase, but the output from the second sequence is cancelled. By applying the same principle to the rest of the outputs, the phase of each output is separated by 90°. Interestingly, the output of this passive RC network is increased by 3 dB. However, the actual gain depends on the load of the poly-phase filter [2].

Figure 4.2: Poly-phase generation steps.

Figure 4.3 shows the actual 4-phase filter and output buffer used in the receiver design. The cascaded 2-stage poly-phase filter (the RC time constants of the two stages are spaced by 15%) reduces the sensitivity to process variation and to passive R and C mismatches.
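The phasor sums in (4.1) and (4.2) can be verified with complex arithmetic. The sketch below models one output of a single unloaded RC section driven at ω = 1/(RC); the R and C values are hypothetical, and the unit input amplitude is chosen only for convenience.

```python
import cmath
import math

def phasor(magnitude, angle_deg):
    """Complex phasor with the given magnitude and angle in degrees."""
    return cmath.rect(magnitude, math.radians(angle_deg))

def rc_lowpass(w, r, c):
    """Transfer function of the low-pass path: 1 / (1 + jwRC)."""
    return 1 / (1 + 1j * w * r * c)

def rc_highpass(w, r, c):
    """Transfer function of the high-pass path: jwRC / (1 + jwRC)."""
    return (1j * w * r * c) / (1 + 1j * w * r * c)

def section_output(v_lp_in, v_hp_in, w, r, c):
    """Superposition of the low-pass and high-pass paths at one output node."""
    return v_lp_in * rc_lowpass(w, r, c) + v_hp_in * rc_highpass(w, r, c)

R, C = 1e3, 100e-15       # hypothetical component values
W0 = 1 / (R * C)          # operating point: f = 1/(2*pi*R*C)

# First sequence: LP path driven by V at 90 degrees, HP path by V at 0 degrees.
vo1 = section_output(phasor(1.0, 90), phasor(1.0, 0), W0, R, C)
# Opposite sequence: LP path driven by V at 270 degrees, HP path by V at 0 degrees.
vo2 = section_output(phasor(1.0, 270), phasor(1.0, 0), W0, R, C)

vo1_gain_db = 20 * math.log10(abs(vo1))
vo1_phase_deg = math.degrees(cmath.phase(vo1))
```

At this frequency the two paths of the first sequence add in phase, giving a magnitude of √2 (about 3 dB) at 45°, while the two paths of the opposite sequence are 180° apart and cancel.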
Also, in order to externally control the output DC level, which is the local oscillator DC level, a replica-biased common-mode feedback (CMFB) circuit is employed [7][27].

A frequency doubler based on a phase-locked loop (PLL) is designed to generate the 2nd mixer frequency (2fo) from the externally provided fo while keeping a constant phase relationship between subband signals. The 2nd-order PLL employs two mixers, one as the phase detector and one as the frequency divider in the feedback loop (Figure 4.4). Since the operating frequency is as high as several gigahertz, a digital phase detector and divider are not available in this application.

Figure 4.3: A two-stage poly-phase filter with output buffer.

A differential amplifier with a current mirror load is used as a current source for the loop filter, which acts as an integrator. Thus, a DC pole is introduced by this integrator in addition to the DC pole due to the VCO. To increase the phase margin of the loop transfer function, a series resistance Rf is added to place a zero. A differential VCO topology with accumulation-mode MOS varactors is employed to generate differential phase outputs. As a high-frequency divider, a passive double-balanced mixer subtracts the reference frequency (fo) from the VCO frequency (2fo), generating fo and its harmonics. The following band-pass filter (BPF) selects only the fundamental frequency of the mixer output and rejects the harmonic frequencies. In order to amplify the doubled frequency output (2fo), an amplifier with an inductive load is designed to drive an off-chip terminated cable.

Figure 4.4: A frequency doubler based on PLL.
Compared to a super-harmonic injection-locked frequency divider (ILFD) [36], employing a mixer as the divider in the loop has two advantages. First, an ILFD has a limited locking range of its own, which further limits the locking range of the frequency doubler. This problem is discussed in detail in the next section. The other advantage is the ease of DC biasing. Since the VCO output is directly applied to the input of the division mixer, no additional AC coupling capacitors and resistors are required to adjust the DC level of the next stage [36].

Figure 4.5: Frequency doubler model.

Figure 4.6: Bode plot of loop transfer function.

A mathematical model of the frequency doubler, used to investigate its loop stability, is diagrammed in Figure 4.5. The LPF (H(s)) in Figure 4.5 adds a DC pole and a zero to the loop, and is expressed as

$$H(s) = K_L \cdot \frac{1 + s/z}{s} \qquad (4.3)$$

where K_L is the LPF gain and z = 1/(Rf·Cf). The open-loop transfer function is expressed by

$$A_o(s) = \frac{\Phi_o}{\Phi_e} = \frac{K_{PD} \cdot K_L \cdot K_O \cdot (1 + s/z)}{s^2} \qquad (4.4)$$

where K_PD is the phase detector gain and K_O is the VCO gain. Therefore, the closed-loop transfer function is

$$A_c(s) = \frac{\Phi_o}{\Phi_i} = \frac{2 \cdot (1 + s/z)}{\dfrac{s^2}{K_{loop}} + \dfrac{s}{z} + 1} = \frac{2 \cdot (1 + s/z)}{\dfrac{s^2}{\omega_n^2} + \dfrac{2\zeta}{\omega_n} s + 1} \qquad (4.5)$$

where the overall loop gain is set by the product of the individual gains,

$$K_{loop} \propto K_{PD} \cdot K_L \cdot K_O. \qquad (4.6)$$

Also, from (4.5), the natural frequency and the damping factor are

$$\omega_n = \sqrt{K_{loop}} \qquad (4.7)$$

$$\zeta = \frac{\omega_n}{2 \cdot z}. \qquad (4.8)$$

Figure 4.7: PLL loop stability simulation result. (Bode diagram: Gm = Inf dB (at Inf rad/sec), Pm = 71.1 deg (at 2.04e+008 rad/sec).)

To reduce the effective jitter relative to the reference (fo), the loop should respond faster to noise. Thus, a high loop bandwidth is desirable.
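The relationships in (4.5), (4.7), and (4.8) can be explored with a few lines of code. The sketch below takes the loop gain K_loop and the zero frequency z as inputs and computes the natural frequency, damping factor, crossover frequency, and phase margin of an open-loop response of the form K_loop·(1 + s/z)/s². It is an idealized model: the parasitic pole at the control node is ignored, and the numeric values are hypothetical rather than taken from the design.

```python
import math

def loop_parameters(k_loop, z):
    """Natural frequency (4.7) and damping factor (4.8)."""
    wn = math.sqrt(k_loop)
    zeta = wn / (2 * z)
    return wn, zeta

def open_loop_magnitude(w, k_loop, z):
    """|K_loop * (1 + jw/z) / (jw)^2| for a two-integrator loop with one zero."""
    return k_loop * math.sqrt(1 + (w / z) ** 2) / w ** 2

def crossover_and_phase_margin(k_loop, z):
    """Unity-gain frequency [rad/s] and phase margin [deg].

    Setting the magnitude to 1 gives x^2 - (K/z)^2 * x - K^2 = 0 with x = w^2.
    The two poles at DC contribute -180 deg, so the margin is atan(wc / z).
    """
    b = (k_loop / z) ** 2
    x = 0.5 * (b + math.sqrt(b * b + 4 * k_loop ** 2))
    wc = math.sqrt(x)
    return wc, math.degrees(math.atan(wc / z))

# Hypothetical loop: pick a natural frequency, then place the zero for zeta = 1.2.
WN_TARGET = 2 * math.pi * 50e6
ZETA_TARGET = 1.2
k_loop = WN_TARGET ** 2
z = WN_TARGET / (2 * ZETA_TARGET)

wn, zeta = loop_parameters(k_loop, z)
wc, phase_margin_deg = crossover_and_phase_margin(k_loop, z)
```

In this idealized model a damping factor of 1.2 places the zero well below crossover and yields a phase margin near 80°. Any additional pole, such as the one contributed by parasitic capacitance at the control node, would reduce that figure.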
However, the achievable loop bandwidth is limited by loop stability. Since two poles of the loop transfer function are located at DC (Figure 4.6), the position of the zero is critical for loop stability. Furthermore, there is a parasitic capacitance (Cp in Figure 4.4) at the control node, which adds another pole to the loop. Therefore, to obtain enough phase margin, Cp needs to be as small as possible and the zero must be placed far below the cross-over frequency (ωc). However, the position of the zero is also related to the damping factor, which should be larger than 1 to avoid overshoot in the transient response. Figure 4.7 shows the simulated loop stability, where a phase margin of more than 70° is achieved when ζ = 1.2 and ωc = 1.2 GHz. Also, the simulated transient response (Figure 4.8) of the voltage at the control node (Vctl in Figure 4.4) shows no significant overshoot.

Figure 4.8: Transient response of PLL control voltage. (Y axis: Vctl [V]; X axis: Time [µs].)

4.2 On-chip clock generation

As shown in Figure 4.1(a), the on-chip clock drives the two time-interleaved ADCs and the rest of the digital circuits (code converter and output drivers). A differential super-harmonic injection-locked frequency divider (DILFD) [35] is designed to generate a differential clock (fo/2) from the external reference (fo) (Figure 4.9). In this topology, the incident frequency (fo) is a harmonic of the LC oscillation frequency (fo/2) [35], which is the origin of the term super-harmonic. The following cross-coupled push-pull amplifier converts the analog output to a rail-to-rail digital clock with 50% duty cycle [27].

Figure 4.9: Differential injection-locked frequency divider.
A mathematical model of the injection-locked oscillator is shown in Figure 4.10. The LC oscillator can be modeled as a non-linear block (f(e)) and a band-pass filter (BPF) in a positive feedback loop. To observe the locking phenomenon, let us define

$$V_i(t) = V_i \cos(\omega_i t + \phi) \qquad (4.9)$$

$$V_o(t) = V_o \cos(\omega_o t) \qquad (4.10)$$

$$u(t) = f(e(t)) = f(V_o(t) + V_i(t)) \qquad (4.11)$$

$$H(\omega) = \frac{H_o}{1 + j\,2Q\,\dfrac{\omega - \omega_r}{\omega_r}} \qquad (4.12)$$

where $V_i(t)$ is the incident signal, $V_o(t)$ is the oscillator output, $\phi$ is the phase difference, $\omega_r$ is the resonant frequency, and $Q$ is the quality factor. The output of the non-linear block contains various harmonics of $V_i(t)$ and $V_o(t)$ and is expressed as

$$u(t) = \sum_{m=0}^{\infty} \sum_{n=0}^{\infty} K_{m,n} \cos(m\omega_i t + m\phi) \cos(n\omega_o t) \qquad (4.13)$$

where $K_{m,n}$ are the inter-modulation coefficients of the input and output frequencies. The inter-modulation terms of interest are those that satisfy

$$|m\omega_i - n\omega_o| = \omega_o = \omega_r + \Delta\omega . \qquad (4.14)$$

For a division ratio of 2 ($\omega_i = 2\omega_o$), the terms with $n = 2m \pm 1$ have a frequency equal to half of the incident frequency. The non-linearity for the 2nd order division can be written as

$$f(e(t)) = a_0 + a_1 e + a_2 e^2 + a_3 e^3 . \qquad (4.15)$$

The locking range of the oscillator is then given by [36]

$$\left| \frac{\Delta\omega}{\omega_r} \right| < \frac{H_o\, a_2\, V_i}{2Q} \qquad (4.16)$$

Figure 4.10: Injection-locked oscillator model (non-linear block f(e) followed by the filter H(ω); signals Vi(t), e(t), u(t), and Vo(t)).

There are two ways to increase the locking range. One is to apply a large incident amplitude (Vi); however, this is limited by non-linearity. The other is to increase Ho/Q in (4.16), which is equivalent to using a larger inductance L. The inductor size is usually limited by its self-resonant frequency and area overhead, which is another limiting factor. For this reason, an ILFD usually exhibits a limited locking range [35][36].
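The bound in (4.16) can be evaluated numerically to see how the incident amplitude and the tank quality factor trade off. The following is a minimal sketch only: the function name and all numeric values (Ho, a2, Vi, Q, and the resonant frequency) are hypothetical and chosen for illustration, not taken from the design. The coefficient a2 is assumed to carry units of 1/V so that the bound is dimensionless.

```python
def ilfd_locking_range(h0, a2, v_i, q):
    """Upper bound on |delta_omega / omega_r| according to (4.16)."""
    return h0 * a2 * v_i / (2.0 * q)


# Hypothetical values, chosen only to show the trend.
f_r_hz = 1.0e9  # assumed resonant frequency of the LC tank (about fo/2)
bound = ilfd_locking_range(h0=1.0, a2=0.5, v_i=0.4, q=5.0)

print(f"fractional locking range: +/-{bound:.3f}")
print(f"absolute locking range:   +/-{bound * f_r_hz / 1e6:.1f} MHz")
```

Doubling Vi or halving Q doubles the bound, which is the trade-off described above.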
To overcome this limited locking range, varactors in the resonant circuit are often employed to track the incident frequency [35].

4.3 Summary

In this chapter, we have discussed the clock and local oscillator frequency generation scheme. For the 3-subband FCR, two 4-phase sinusoidal frequencies are necessary to drive the in-phase/quadrature down-conversion mixers, and one digital clock is needed to drive the ADCs and digital circuits. The 4-phase frequencies are generated from a 2-phase input by a passive poly-phase filter and a buffer, where the DC level of the buffer output is controlled externally for experimental purposes. A constant phase relationship to the external reference is maintained by employing a frequency doubler and a frequency divider that generate the 2fo and fo/2 frequencies, respectively. The frequency doubler is a 2nd order analog PLL with a frequency divider in the feedback loop. For high-frequency operation, two passive mixers are employed in this topology, which provide a wide locking range and ease of design. The injection-locked super-harmonic frequency divider, followed by a low-swing to high-swing converter, is used for digital clock generation.

Chapter 5 Experimental Results

This chapter describes the prototype chip design, the testing scheme, the characterization results of each on-chip component, and the receiver performance. Section 5.1 describes the receiver chip, the board design for the test setup, and the testing scheme. The measurement results of the LPF, ADC, and frequency doubler are described in section 5.2. Finally, the receiver performance and power consumption are discussed in section 5.3.

Figure 5.1: Receiver chip photo.

5.1 Receiver Chip, Package and Board
The 3-subband frequency channelization receiver front-end has been fabricated in a 0.25μm CMOS technology (TSMC). The chip photo is shown in Figure 5.1 and the chip size is 3.0x3.0mm². The chip areas occupied by the receiver front-end and the frequency doubler are 1.5x2.4mm² and 0.6x0.75mm², respectively. In order to reduce the parasitic capacitance, the electrostatic discharge (ESD) protection circuits are removed from the high-speed input and output pads.

Figure 5.2: On-chip parasitic capacitances along the high-speed signal path (100μm pad built from stacked metal layers M1 to M5).

As shown in Figure 5.2, a typical 100x100μm pad consists of stacked metal layers from metal-1 to metal-5 in a 1-poly 5-metal CMOS process. When metal-3 is selected as the interconnection metal to the internal cell, the extracted parasitic capacitances C1 and C2 are 315fF and 25fF, respectively, resulting in a total parasitic capacitance of 340fF for the pad cell. For the high-speed serial signal input, this pad cell is connected to the LPF input and the quadrature mixer inputs through interconnection metal, as shown in Figure 5.2. To minimize the metal-to-substrate interconnection capacitance C3, a significant design effort is made to place the input circuits as close as possible to the pad. The estimated value of C3 is approximately 50fF when the distance (x) is 200μm. For the 3-subband receiver input, the input capacitance C4 consists of the input gate capacitance of the LPF for the 1st subband and the source-to-substrate diffusion capacitances of the two quadrature mixers for the 2nd and 3rd subbands. The estimated value of C4 is approximately 160fF when 12μm and 15μm transistors are used for the LPF and mixer input devices, respectively.

The 50Ω termination resistance is implemented with an on-chip resistor since it helps damp any ringing caused by package and board parasitics.
When this resistance is implemented with an unsalicided N+ poly resistor (~180Ω/sq), the accuracy of the resistance value drops significantly because of the small aspect ratio. To increase the accuracy, four 200Ω resistors may be connected in parallel to produce an effective resistance of 50Ω. However, this parallel structure produces significant parasitic capacitance (Cp). To avoid this parasitic capacitance, a salicided N+ poly resistor (~5Ω/sq) is used to implement the 50Ω resistance, even though its accuracy is not as good due to the nature of the salicide process. The estimated parasitic capacitance due to the on-chip resistor is approximately 83fF. The measured average on-chip resistance is approximately 65Ω, which is 30% above the 50Ω target (equivalently, the target is 23% below the measured value).

The total input capacitance is the sum of C1 through C4 and Cp, which is approximately 630fF. The input bandwidth is calculated as 1/(2π·630fF·25Ω), where the effective resistance is 25Ω (50Ω termination resistance in parallel with the 50Ω transmission line characteristic impedance). When all the parasitic resistances are ignored, the estimated input bandwidth is approximately 10GHz, which is large enough for a 10Gsymbols/s data rate.

Figure 5.3: The implementation of termination resistance with unsalicided poly resistors.

Geometrical symmetry in the mixer layout is very important since any asymmetry can cause differential noise at the output. The schematic and layout of the double-balanced mixer are shown in Figure 5.4. To keep the geometry seen by the RF+ and RF- nodes symmetric, two dummy transistors are placed at each side of the mixer layout with their terminals connected to ground. Also, at high-frequency operation, the parasitic capacitance resulting from the overlap of metal layers should not be ignored.
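The input bandwidth estimate above is a single-pole RC calculation, and it can be reproduced directly from the capacitance values quoted in this section. The sketch below is illustrative only: the dictionary keys and function names are hypothetical, and the 65Ω case simply substitutes the measured average resistance into the same first-order model, which the text itself does not do.

```python
import math

# Capacitance estimates quoted in the text, in femtofarads.
CAPS_FF = {
    "C1 (pad)": 315.0,
    "C2 (pad)": 25.0,
    "C3 (interconnect)": 50.0,
    "C4 (LPF and mixer inputs)": 160.0,
    "Cp (termination resistor)": 83.0,
}


def parallel(r1, r2):
    """Equivalent resistance of two resistors in parallel."""
    return r1 * r2 / (r1 + r2)


def input_bandwidth_hz(r_term_ohm, z0_ohm=50.0, caps_ff=CAPS_FF):
    """Single-pole bandwidth 1 / (2*pi*R*C) of the receiver input node."""
    c_total = sum(caps_ff.values()) * 1e-15
    r_eff = parallel(r_term_ohm, z0_ohm)
    return 1.0 / (2.0 * math.pi * r_eff * c_total)


print(f"total capacitance: {sum(CAPS_FF.values()):.0f} fF")
print(f"bandwidth, 50 ohm termination: {input_bandwidth_hz(50.0) / 1e9:.1f} GHz")
print(f"bandwidth, 65 ohm termination: {input_bandwidth_hz(65.0) / 1e9:.1f} GHz")
```

With the nominal 50Ω termination the model gives roughly 10GHz, matching the estimate in the text; with the measured 65Ω it gives a somewhat lower value of roughly 9GHz under the same assumptions.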
Because of this overlap capacitance, coupling noise from crossing wires introduces significant signal distortion. To cancel this noise differentially, a dummy wire (dotted circle in Figure 5.4(b)), for example, is added to the layout to create an overlap capacitance that neutralizes the polarity of the crossing wires (IF+ and IF-).

Design techniques to prevent noise from the supply and ground are also employed in this chip. As shown in Figure 5.5, when the on-chip supply current (IDD) charges the load capacitor (CL) during the time Δt, the instantaneous current (ΔI) flowing into CL from the on-chip VDD node (VDDi) is equal to CL·ΔV/Δt. Since this current is supplied by the external voltage source (VDD) through the bonding wire inductance (L), a significant voltage drop equal to L·ΔI/Δt develops across the inductor. This causes the internal supply node to 'bounce' whenever a charging current flows inside the chip. By the same mechanism, the discharging current through the ground path causes ground bounce at the internal ground node (GNDi).

Figure 5.4: Passive double-balanced mixer (a) schematic and (b) layout.

This bouncing noise can be a serious problem when the amount of current supplied by one supply (or ground) pin increases. In particular, the digital part of the chip induces significant bouncing noise when many inverters switch at the same time. To remove noise coupling from the digital part of the chip, a separate supply node is assigned to the analog part. The number of internal pads assigned to supply and ground is also important because the effective inductance decreases as the number of these pads increases (i.e., inductances in parallel).
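The two relations above (ΔI = CL·ΔV/Δt and a drop of L·ΔI/Δt across the bonding wire) can be combined into a quick estimate of the bounce amplitude and of the benefit of paralleling pads. This is a rough sketch under stated assumptions: the function name, load capacitance, voltage swing, and switching time are hypothetical; the 2.5nH wire inductance and the count of 12 supply pads are figures quoted later in this section; and mutual inductance between neighboring bonding wires is ignored, so the parallel case is optimistic.

```python
def supply_bounce_v(l_wire_h, c_load_f, delta_v, delta_t_s, n_pads=1):
    """Estimate the supply bounce L_eff * dI/dt for n identical parallel pads."""
    delta_i = c_load_f * delta_v / delta_t_s  # current needed to charge C_L
    l_eff = l_wire_h / n_pads                 # ideal parallel inductors, no coupling
    return l_eff * delta_i / delta_t_s


# Hypothetical switching event: 10 pF charged by 2.5 V in 1 ns.
single = supply_bounce_v(2.5e-9, 10e-12, 2.5, 1e-9)
twelve = supply_bounce_v(2.5e-9, 10e-12, 2.5, 1e-9, n_pads=12)

print(f"one supply pad:     {single * 1e3:.1f} mV")
print(f"twelve supply pads: {twelve * 1e3:.1f} mV")
```

Because the bounce scales with 1/Δt², faster edges raise the noise much more quickly than larger loads do, which is why both parallel pads and on-chip decoupling are worth the area.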
To lower the effective inductance, as many pads as possible are assigned to the internal supply and ground. In this design, 12 pads are assigned to the internal supply nodes and 13 pads to the internal ground nodes. Another technique to reduce supply/ground noise is to use the on-chip decoupling capacitance (C) shown in Figure 5.5. The idea is that if C is large enough, VDDi and GNDi bounce in unison. For this purpose, any unused space in the chip is filled with decoupling capacitors, and metal-1 is mostly assigned to the routing of supply/ground nodes.

Figure 5.5: Supply and ground bouncing due to package inductances.

Figure 5.6: Guard rings to isolate sensitive analog blocks.

To isolate the sensitive analog circuit blocks from the noisy digital section of the chip, "guard rings" (Figure 5.6), each consisting of a P+ substrate tie surrounded by an n-well ring, are used for the mixers, the LPF, and the frequency generation circuits. The destructive effect of substrate noise on the analog circuits is mainly due to threshold voltage variations caused by the noise voltage between the source ground and the substrate, which directly introduces noise current flowing through the transistors.

Figure 5.7: Bonding wire configuration for high-speed inputs and outputs (serial input, frequency doubler input and output, 1st mixer LO input, and frequency divider input).

The package used for this chip is an 84-pin ceramic lidless chip-carrier (CLCC) from MOSIS. It is unknown whether this package has controlled-impedance traces from the pins to the bonding pads. Figure 5.7 shows the bond wiring configuration of the high-speed signals.
To reduce the length of the bonding wires for high-speed signals, sensitive input and output pads are placed in the middle of the chip sides, since the wire length increases for pads in the corners. Also, the distance between high-speed bonding wires should be large enough to reduce electromagnetic interference. For this reason, we tried to avoid placing different high-speed signal pads on the same side of the chip. In this package, we could not customize the distance between the receiver chip edge and the bonding pad, which is known as the package cavity. The shortest bonding wire length is approximately 2.5mm, which corresponds to 2.5nH of self-inductance.

The 84-pin CLCC is soldered directly on the PCB without a socket to reduce signal reflections caused by impedance mismatches and discontinuities. The PCB is a typical four-layer FR4 board with a thickness of about 62mils. The typical dielectric constant of this material is approximately 3.9 and the dielectric layer is about 10mils thick. Therefore, the width of the high-speed signal traces is kept at about twice the dielectric thickness to obtain a characteristic impedance of 50Ω. The traces from the package are connected to SMA connectors for the cable connections, as shown in Figure 5.8. The distance from the package to the SMA connector is kept as short as possible for high-speed input and output signals to minimize the filtering effects discussed in section 2.2.2.

Figure 5.8: Board trace from package to SMA connector.

Power decoupling capacitors are used to reduce the supply noise on the board. Each power supply pad (analog and digital) on the board is connected to capacitor networks of various types and sizes ranging from 0.1nF to 1μF. Ceramic
chip capacitors placed close to the package are used for the small capacitances since their small feature sizes provide low inductance. To reduce coupling noise, the power plane is divided into analog and digital supplies, even though the ground plane is shared on the board.

Figure 5.9: Data generation and acquisition (pattern generator with differential 4-bit outputs at 155Mbps, 4:1 serializers (MAX3693) at 622Mbps, 16:1 10Gbps transmitter, Rx board, and state analyzer).

The 10Gbps transmitter (MAX3952) evaluation board is used to generate the differential serial inputs. A pattern generator is programmed to generate 155Mbps binary codes with cyclic prefixes, as shown in Figure 5.9. The sixteen 4-bit data words from the pattern generator are serialized by 4:1 serializers (MAX3693) to generate the 622Mbps data inputs for the transmitter. The serialized 10Gsymbols/s differential signal is connected to the receiver board for digitization. The 3-bit differential data outputs from the receiver board are de-multiplexed to 156Mbps LVDS signals by 1:16 deserializers (MAX3880) on the board and then stored in the state analyzer memory for retrieval. The acquired data are transferred to a PC for evaluation.

Figure 5.10: Measured energy spectrum of transmitted data at 10Gsymbols/s (horizontal axis: symbol frequency [Hz]).

5.2 Measurement Results

The cumulative energy spectrum of the signal from the 10Gsymbols/s transmitter, after a 15cm PCB trace and a 1m-long cable, is measured to see the signal energy distribution (Figure 5.10). The result shows that the signal energy is fairly uniformly distributed from DC to several GHz, and approximately 90% of the signal energy is concentrated within 4GHz, which is less than half the symbol frequency (5GHz).
This suggests that the high-frequency components of the signal spectrum are filtered out by the board traces and cables along the transmitter signal path. If the receiver signal path is also considered, this effect becomes even worse, resulting in significant ISI at the receiver input. Based on this measurement, we set the signal bandwidth (BWsig) to approximately 5GHz and the local oscillator frequencies for the 2nd and 3rd subbands to around 2GHz (fo) and 4GHz (2fo), respectively.

Figure 5.11: Measurement setup for LPF frequency response (signal generator #1 drives the serial input through a cable, signal generator #2 drives LO1 and LO2 through a balun, and a spectrum analyzer observes the Subband0 to Subband2 outputs of the receiver board).

A measurement setup for the frequency response of each subband LPF is prepared as shown in Figure 5.11. A single-tone sinusoid is applied to the serial input by signal generator #1, and its frequency is swept from 100MHz to 5GHz. To compensate for the cable loss, the signal amplitude at the end of the cable is normalized at each frequency by adjusting the signal source power. Another signal generator provides the local oscillator frequencies through an external balun. The output spectrum of each subband LPF is measured with a spectrum analyzer. The serial input and local oscillator frequencies are synchronized by connecting the 10MHz reference ports of the signal generators.

The measured frequency response of each subband LPF is shown in Figure 5.12(a). The measured 3dB bandwidth of each subband is approximately 900MHz after accounting for the limited bandwidth of the wideband driver at the LPF output. This measured bandwidth is slightly lower than the target (1GHz); however, as discussed in the next section, the receiver performance is not very sensitive to such
deviation. The gain of each subband LPF is adjusted such that its output is within the input dynamic range of the ADC (80-330mVp-p), as discussed in section 3.3. The same measurement setup is used to obtain the variable gain characteristic of the LPF. As shown in Figure 5.12(b), the adjustable gain range of the LPF is around 10dB at approximately constant bandwidth.

Figure 5.12: (a) Frequency response at each subband, (b) LPF gain control (horizontal axes: frequency [Hz]).

The non-linearity of the mixer and LPF is investigated by means of 1dB compression points. The measurement setup is similar to Figure 5.11. In this case, the power of the input sinusoid (100MHz) is increased from -15dBm (40mVp-p) while the fundamental output power of each subband is observed on the spectrum analyzer. Figure 5.13 shows the measured input/output-referred compression points for the 1st and 2nd subbands. The input/output-referred 1dB compression points for the 1st and 2nd subbands are -5.6/-7.6dBm and -7/-9dBm, respectively. The 1st subband compression point is approximately 1.5dB higher than that of the 2nd subband. This is because the 2nd subband signal path includes the mixer stage, which introduces non-linearity related to its harmonic terms. The measured 3rd subband 1dB compression point is within +/-0.5dB of that of the 2nd subband. Also, this measurement is performed at the single-ended LPF output since the differential outputs are not available. Therefore, the actual differential 1dB compression values could be slightly higher than these measurement results.

Figure 5.13: Measured 1dB compression points for the 1st and 2nd subbands (output power versus input power [dBm] for Subband0 and Subband1).
The signal-to-noise-plus-distortion ratio (SNDR) of the receiver front-end, which includes the mixer, LPF, and ADC, is measured to estimate its performance. The measurement setup is shown in Figure 5.14. A single-tone frequency is applied to the serial input of the receiver through a cable, and the local oscillator frequencies are provided for the mixer operation in the 2nd and 3rd subbands. The 3-bit ADC output is collected in a PC to recover the sinusoidal waveform at various frequencies.

Figure 5.14: The measurement setup for SNDR (signal generator to serial input through a cable, signal generator #2 to LO1 and LO2 through a balun, and the Subband0 to Subband2 outputs of the receiver board collected by a PC).

The applied single-tone sinusoid is sampled at an effective rate of 2.5Gsamples/s by two time-interleaved on-chip ADCs. The three-parameter least-squares algorithm [16] is used to fit the recovered ADC output to an ideal sine wave at each frequency tone (swept from 100MHz to 1.4GHz). When M samples (y1, y2, ..., yM) are taken at times t1, t2, ..., tM, this algorithm finds the three parameters (Ao, Bo, and Co) that minimize the following rms noise:

$$\text{rms noise} = \left[ \frac{1}{M} \sum_{n=1}^{M} \left[ y_n - A_o \cos(\omega_o t_n) - B_o \sin(\omega_o t_n) - C_o \right]^2 \right]^{1/2} \qquad (5.1)$$

where ωo is the applied frequency. The SNDR is defined as the ratio of the rms signal to the rms noise. An example of recovered ADC data and its fitted sine wave at 500MHz is shown in Figure 5.15. This measurement helps to estimate the performance of the sampling system by quantifying the amount of distortion (due to non-linearity, harmonics, and noise) and jitter.

Figure 5.15: Curve-fitted recovered ADC output at 500MHz (ADC output code versus time [ns]).

The measured SNDR at each subband is shown in Figure 5.16(b).
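Because the applied frequency is known, the three-parameter fit in (5.1) is a linear least-squares problem in Ao, Bo, and Co. The sketch below shows one way to compute it with NumPy and to compare the result with the ideal quantization-limited SNDR of (5.2). It is an illustration under assumptions rather than the measurement code used for this work: the function names, the uniform 3-bit quantizer model, and the 503MHz test tone are all hypothetical.

```python
import numpy as np


def sine_fit_sndr_db(t, y, freq_hz):
    """Three-parameter sine fit at a known frequency; returns SNDR in dB."""
    w = 2.0 * np.pi * freq_hz
    basis = np.column_stack([np.cos(w * t), np.sin(w * t), np.ones_like(t)])
    (a0, b0, c0), *_ = np.linalg.lstsq(basis, y, rcond=None)
    residual = y - basis @ np.array([a0, b0, c0])
    rms_noise = np.sqrt(np.mean(residual ** 2))      # quantity minimized in (5.1)
    rms_signal = np.sqrt((a0 ** 2 + b0 ** 2) / 2.0)  # rms of the fitted sinusoid
    return 20.0 * np.log10(rms_signal / rms_noise)


def ideal_sndr_db(bits):
    """Quantization-limited SNDR of an ideal N-bit ADC, as in (5.2)."""
    return 6.02 * bits + 1.76


def quantize(x, bits=3, full_scale=1.0):
    """Hypothetical uniform mid-rise quantizer standing in for the on-chip ADC."""
    step = 2.0 * full_scale / 2 ** bits
    codes = np.clip(np.floor(x / step), -2 ** (bits - 1), 2 ** (bits - 1) - 1)
    return (codes + 0.5) * step


fs_hz = 2.5e9      # effective sampling rate quoted in the text
tone_hz = 503e6    # assumed test tone, deliberately not a divisor of fs
t = np.arange(2000) / fs_hz
y = quantize(np.sin(2.0 * np.pi * tone_hz * t))

print(f"fitted SNDR: {sine_fit_sndr_db(t, y, tone_hz):.1f} dB")
print(f"ideal 3-bit: {ideal_sndr_db(3):.1f} dB")
```

A tone that is not harmonically related to the sampling rate is used so that the samples cover many different phases of the sine wave; with a coherent tone such as 500MHz at 2.5Gsamples/s, only five distinct phases would be exercised.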
The SNDR at high frequency is primarily limited by the LPF bandwidth, which is around 900MHz. However, we found that the SNDR at low frequency is mainly limited by signal distortion from LO feed-through. This phenomenon is especially pronounced when there are phase mismatches (or imbalances) in the differential LO frequencies. This phase mismatch (Δφ in Figure 5.16(a)) degrades the common-mode rejection of the double-balanced mixers at the front-end such that the coupling noise from the LO becomes differential. In this case, the differential LPF and ADCs that follow cannot remove this noise. The main source of this phase mismatch is the passive elements in the poly-phase network, as discussed in section 4.1. In order to generate higher frequencies, the poly-phase network requires smaller R and C elements, which are more sensitive to process variations. This explains why the 3rd subband SNDR is worse (about 2.5dB lower) in this measurement. For a 3-bit ADC, when only the quantization noise is considered, the ideal SNDR is 19.8dB according to the expression [16]:

$$\text{SNDR} = 6.02\,N + 1.76 \ \text{[dB]} \qquad (5.2)$$

where N is the number of ADC bits. Compared to the ideal ADC, the low-frequency SNDR of the 1st and 2nd subbands is approximately 3.6dB lower.

Figure 5.16: (a) The injection of coupling noise by phase mismatch (LO phases fo∠0 and fo∠(π+Δφ)), (b) Measured SNDR at each subband (SNDR [dB] versus frequency [GHz]).

The jitter performance of the on-chip frequency doubler is also measured. The experimental setup is shown in Figure 5.17. A signal generator provides the local oscillator frequency of the 1st mixer (fo ≈ 2GHz) to the differential inputs of the frequency doubler. The output waveform is captured and displayed on a digital oscilloscope.
By enabling the accumulation function of the oscilloscope, the peak-to-peak jitter of the frequency doubler can be measured.

Figure 5.17: Jitter measurement setup (signal generator and balun driving the frequency doubler (FDB) input on the receiver board; FDB output observed on a digital oscilloscope).

Figure 5.18: Measured jitter performance (oscilloscope histogram readout: mean 200.0492ns, standard deviation 3.7573ps, peak-to-peak 29.4ps, 2.517k hits).

For the locking range measurement, a spectrum analyzer is connected to the frequency doubler output to observe the output frequency while the input frequency from the signal generator is swept. The measured jitter performance is shown in Figure 5.18. The output amplitude is approximately 300mVp-p, which is sufficient to drive the on-chip mixer. The measured peak-to-peak jitter at 4GHz is 29.4ps when the signal source has 18ps of jitter. Also, phase lock is maintained from 3.75GHz to 4.2GHz, a locking range of 450MHz.

Figure 5.19: Performance testing scheme (10Gbps transmitter to serial input through a cable; signal generator and balun driving LO1, FD, and the FDB input; FDB output fed back to LO2; subband outputs collected by a logic analyzer and PC).

5.3 Receiver Performance

The performance of the proposed receiver is measured by collecting the ADC outputs and processing them in a PC. The measurement setup for the performance test is shown in Figure 5.19. A differential reference frequency (fo) is generated by an external balun and provided to the 1st mixer, the frequency divider (FD), and the frequency doubler (FDB). The output of the FDB (2fo) is then applied back to the 2nd mixer.
A 10Gsymbols/s transmitter (MAX3952) is programmed to serialize binary data blocks with cyclic prefixes, as discussed in section 5.1. To be compatible with the current-mode logic (CML) signal interface, no AC coupling capacitor is used between the transmitter and receiver, and a semi-flexible cable of approximately 1m connects the two.

Figure 5.20: (a) Measured transmitted signal at 10Gsymbols/s. (b) Slicer input after equalization.

The eye diagram of the transmitted signal after a 15cm PCB trace and a 1m-long cable is shown in Figure 5.20(a), where ISI and jitter are significant enough to close the eye. This signal is applied to the receiver for frequency channelization and digitization. The local oscillator frequencies of the two down-conversion mixers are 2GHz and 4GHz, respectively. For digitization, each subband signal is sampled at 1.25Gsamples/s per ADC, and the 10 ADCs together achieve an effective rate of 12.5Gsamples/s. The collected ADC outputs are processed in a PC for detection and equalization. The accumulated slicer inputs of 18,000 equalized symbols show two distinct regions, as shown in Figure 5.20(b), which verifies the functionality of the proposed receiver.

The power consumed by each block at a 2.5V supply is listed in Table 5.1. Although the measured power consumption is approximately 1W, this value includes output buffers and drivers that were added to observe some of the internal nodes of the FCR. The power consumption of this test circuitry was estimated through simulation. After removing it from the measured power values, the power consumption of the prototype FCR is approximately 460mW.
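The power bookkeeping and the resulting figure of merit can be cross-checked with a few lines of arithmetic. In the sketch below, the per-block numbers are the ones listed in Table 5.1, while the variable and function names are hypothetical and the FOM definition P/(2^N·fs) is an assumption; it is adopted here because it reproduces the 4.6pJ/step value quoted in the FOM discussion that follows the table.

```python
# Per-block power in milliwatts, as listed in Table 5.1.
MEASURED_MW = {"mixer_lo": 50, "lpf_block": 200, "adc_block": 700, "freq_doubler": 35}
TEST_ONLY_MW = {"lpf_output_buffers": 110, "adc_output_drivers": 415}


def receiver_power_mw(measured=MEASURED_MW, test_only=TEST_ONLY_MW):
    """Measured total minus the circuitry added only for observation."""
    return sum(measured.values()) - sum(test_only.values())


def figure_of_merit_pj(power_w, bits, sample_rate_hz):
    """Energy per conversion step in pJ, assuming FOM = P / (2**N * fs)."""
    return power_w / (2 ** bits * sample_rate_hz) * 1e12


total_mw = sum(MEASURED_MW.values())
core_mw = receiver_power_mw()
aggregate_rate_hz = 10 * 1.25e9  # ten ADCs at 1.25 Gsamples/s each

print(f"measured total:  {total_mw} mW")
print(f"receiver only:   {core_mw} mW")
print(f"figure of merit: {figure_of_merit_pj(core_mw * 1e-3, 3, aggregate_rate_hz):.1f} pJ/step")
```

The two subtracted entries are the blocks without a "simulation" or "measured" annotation in the table, which this sketch treats as the test circuitry.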
Block                      Power [mW]   Source
Mixer + LO                     50       measured
LPF block                     200       measured
  LPF                          90       simulation
  Output buffers              110
ADC block                     700       measured
  Comparators/latches         225       simulation
  Code converters              60       simulation
  Output drivers              415
Frequency doubler              35       measured

Measured total power: 985mW (including test circuitries)
Estimated receiver power: 460mW (excluding test circuitries)

Table 5.1: Power consumed by each receiver block.

To demonstrate the architectural efficiency of the proposed receiver, the ideal figure-of-merit (FOM) formula [42], which represents the energy consumed per conversion step, is used. For an N-bit ideal ADC, the formula is

$$\text{FOM} = \frac{P}{2^{N} \cdot f_s} \ \text{[pJ/step]} \qquad (3.1)$$

where P is the power consumption and fs is the ADC sampling frequency. The estimated FOMs of various TIRs are plotted together with that of the prototype FCR in Figure 5.21 for comparison. The achieved FOM (4.6pJ/step) of the 3-bit FCR is significantly lower than that of conventional TIR architectures. When the optimized frequency channelization circuits discussed in section 3.3 are used, simulation shows that the achievable FOM can be reduced to 4.0pJ/step. This architectural advantage results mostly from the relaxed bandwidth-power tradeoff in the sample-and-hold circuitry, the robustness to sampling jitter, and the low-power frequency generation scheme. When this result is extrapolated to higher-resolution ADCs (dotted line in Figure 5.21), the advantage of the proposed receiver is more obvious, achieving less than 4pJ/step at resolutions of 4 bits or higher.

Figure 5.21: ADC ideal Figure-of-Merit (FOM) comparison (FOM versus ADC bits for TIR, 8GS/s [9]; TIR, 10GS/s [25]; TIR, 8GS/s [54]; and FCR, 12.5GS/s [26]).

5.4 Summary

A 3-subband FCR is implemented in 0.25μm CMOS technology.
The estimated input bandwidth of the 3-subband receiver is as high as 10GHz, which is sufficient for a 10Gsymbols/s data rate. Also, various layout and board design techniques to minimize parasitic effects, bouncing noise, and substrate noise coupling are presented in section 5.1.

The measured controllable gain range of each subband LPF is about 10dB at an approximately constant bandwidth of 900MHz. The non-linearity of each frequency channelizer (mixer/LPF) is measured by means of the 1dB compression point. The 1.5dB difference between the 1st and 2nd subbands can be explained by the presence of the mixer, which adds harmonic distortion.

The measured SNDR at high frequency is primarily limited by the LPF bandwidth. However, the low-frequency SNDR is limited by signal distortion mainly caused by phase mismatches in the LO frequencies. The 3rd subband SNDR is about 2.5dB lower than that of the 1st and 2nd subbands because the phase mismatches are worse in the 3rd subband LO frequencies.

The receiver performance is measured by collecting the ADC outputs and processing them in a PC. In the presence of significant ISI and jitter, the slicer inputs of 18,000 symbols at 10Gsymbols/s are detected without any errors. Also, the power consumed by the whole design is approximately 1W at a 2.5V supply.

Chapter 6 Conclusion

Modern incident-wave signaling enables high symbol rates over conventional PCB traces and cables with lower power consumption and higher noise immunity. As symbol rates increase to multiple Gsymbols/s, not only do the band-limiting factors of the off-chip environment degrade receiver performance by introducing ISI on the signal, but on-chip clock jitter also decreases the sampling accuracy.
On the basis of the conventional TIR, communication techniques such as multi-level signaling and equalization have been adopted to increase the transmission data rate. However, because of the inherent limitation of this architecture, namely high-speed sampling at the front-end, these problems are not managed efficiently [55], and the multi-bit ADC that these communication techniques critically require becomes very difficult to implement.

To overcome these limitations, we proposed an FCR that channelizes the received signal into multiple subbands before digitizing, such that the signal input bandwidth to each S/H circuit is reduced by a factor approximately equal to the number of subbands. This greatly mitigates many of the TIR implementation challenges caused by the wide signal bandwidth [55]. The reduced subband signal bandwidth also provides robustness to timing uncertainties by decreasing the sensitivity to sampling jitter, and lowers the power consumption of the S/H circuitry. In addition, the FCR requires fewer ADC bits than the conventional TIR when the propagation channel bandwidth is limited to half or less of the signal bandwidth (i.e., significant ISI is present), which is a typical case in modern multi-gigabit serial links (e.g., the channel bandwidth of a 10m cable is only 1GHz [55]). As a result, the FCR provides an efficient design environment for realizing fully digital, adaptive equalization at the receiver, which is known to be more robust in serial-link systems [58][1].

The frequency channelization is achieved by employing down-conversion mixers followed by LPFs. The passive double-balanced mixer is chosen since it offers good linearity and wide bandwidth at the front-end.
During the down-conversion, we have shown that contamination of the subband signal by the 3rd harmonic of the local oscillator frequency limits the maximum number of subbands to three. The signal loss in the passive mixer is recovered by the subsequent low-pass amplifier, which also offers variable gain to accommodate the limited dynamic range of the comparator that follows. In this dissertation, improved topologies for the frequency channelizer have been studied in terms of linearity, output bandwidth, and power. In particular, various wideband amplifiers are investigated to compare their advantages and disadvantages for this application. The suggested topologies provide sufficient bandwidth for ADCs of more than 4 bits with good linearity.

Since the proposed receiver provides statistically sufficient information, any demodulation can be performed in the digital domain. For computational simplicity and ease of testing, a single-carrier cyclic-prefix system is employed for signal detection. The computational efficiency of the digital filter can be improved by exploiting the regularity of each transmitted data block (the cyclic prefix), which simplifies the 'inversion' process. Compared with time-domain linear equalization, the number of filter taps is reduced and the processing efficiency is improved by FFT operations.

The on-chip frequencies are generated by a PLL-based frequency doubler and an injection-locked divider in order to keep a constant phase relationship between them. The frequency doubler is a 2nd-order analog PLL with a frequency divider in the feedback loop. For high-frequency operation, two passive mixers are employed in this topology, which provide a wide locking range and ease of design. The injection-locked super-harmonic frequency divider, followed by a low-swing to high-swing converter, is used for digital clock generation.
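Returning to the single-carrier cyclic-prefix detection described above, the sketch below illustrates why the cyclic prefix simplifies the 'inversion': the prefix turns the channel into a circular convolution, so equalization reduces to one division per frequency bin. It is a simplified illustration and not the receiver's actual DSP. The channel taps, block length, known-channel zero-forcing equalizer, and noise-free link are all assumptions, the subband channelization itself is not modeled, and a plain DFT stands in for the FFT to keep the example dependency-free.

```python
import cmath


def dft(x, inverse=False):
    """Direct O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
                  for m in range(n))
        out.append(acc / n if inverse else acc)
    return out


def add_cyclic_prefix(block, cp_len):
    """Prepend the last cp_len symbols of the block to the block itself."""
    return block[len(block) - cp_len:] + block


def channel(tx, taps):
    """Linear convolution with the channel taps, truncated to len(tx)."""
    return [sum(taps[j] * tx[i - j] for j in range(len(taps)) if i - j >= 0)
            for i in range(len(tx))]


def equalize_block(rx_block, taps):
    """One-tap-per-bin zero-forcing equalizer (assumes a known channel)."""
    n = len(rx_block)
    h_freq = dft(list(taps) + [0.0] * (n - len(taps)))
    y_freq = dft(rx_block)
    x_freq = [y / h for y, h in zip(y_freq, h_freq)]
    return dft(x_freq, inverse=True)


# Hypothetical ISI channel and data block, for illustration only.
H_TAPS = [1.0, 0.6, 0.3]
CP_LEN = len(H_TAPS) - 1          # prefix must cover the channel memory
symbols = [1.0, -1.0, -1.0, 1.0, 1.0, 1.0, -1.0, 1.0]

tx = add_cyclic_prefix(symbols, CP_LEN)
rx = channel(tx, H_TAPS)
rx_block = rx[CP_LEN:]            # discard the prefix at the receiver
estimate = equalize_block(rx_block, H_TAPS)
detected = [1.0 if e.real >= 0 else -1.0 for e in estimate]

print(detected == symbols)
```

Zero forcing is used here only because it is the shortest equalizer to write down; it requires that the channel response has no spectral nulls, and an adaptive or MMSE equalizer would replace the division in practice.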
A poly-phase filter network is used to convert the differential reference frequency to 4-phase local oscillator frequencies.

To validate the concept, a 3-subband receiver prototype has been implemented in 0.25 um CMOS technology. Each of the two time-interleaved 3-bit ADCs in each subband operates at 1.25 Gsamples/s and requires no initial offset compensation. The total effective sampling rate achieved by the 10 ADCs reaches 12.5 Gsamples/s. The functionality of the receiver is demonstrated by correctly detecting a 10 Gsymbols/s signal in the presence of significant ISI and jitter. The architectural efficiency of the prototype is verified by achieving an ideal Figure-of-Merit (4.6 pJ/step) that is much lower than that of conventional receivers (Figure 5.21). We have also shown that the efficiency advantage becomes more pronounced when ADCs of 4 or more bits are implemented.

6.1 Future works

In this dissertation, we have demonstrated the fundamental concept of frequency channelization and suggested an efficient design to implement the concept in silicon. For completeness of the receiver, the DSP part, which includes detection and equalization, needs to be implemented on the same chip so that real-time BER can be evaluated. Although the implementation efficiency and performance have been improved by employing a single-carrier cyclic-prefix system that incorporates digital Fourier transform blocks and an adaptive loop for updating the digital filter coefficients over time, more study may be needed to find the optimal demodulation scheme for this application. To further exploit the architectural advantage of the FCR, multi-level signaling is a good choice for multiplying the data rate without increasing the symbol rate.
It is reported that a higher-resolution ADC is required to detect multi-level signaling, since the modulation itself impacts both the voltage and timing margins of the signal [56]. For a 4-PAM scheme, according to our analysis, at least 5-bit ADCs may be needed to detect the signal reliably at 10 Gsymbols/s. However, implementing a 5-bit ADC at the Nyquist rate (~1 GHz) may not be easy even with the FCR because of its large capacitive loading. One possible way to relax the requirement for higher-resolution ADCs is to improve the accuracy of the received signal by pre-emphasizing the signal at the transmitter. Another is to use a mixed-mode equalization scheme in each subband channel. This is a reasonable approach in that the bandwidth requirement for the analog equalizer is only a fraction of the signal symbol rate. For the next-generation serial-link system, a data rate of 20 Gb/s with 4-PAM may be a reasonable challenge for the FCR.

One possible application of the FCR is optical signaling, where the channel medium no longer limits the signal bandwidth. With the help of rapid technology scaling, the serializer and deserializer for optical communication systems are often realized in CMOS technology [39][38]. The concept of frequency channelization can be directly applied to the demultiplexer of an optical receiver, which demands higher system bandwidth and sampling resolution. Another good application can be found in ultra-wideband (UWB) radio receivers [29]. In particular, cyclic-prefix modulation offers high efficiency when the SNR is seriously degraded by signal dispersion due to dense multipath propagation.

Bibliography

[1] Kamran Azadet, et al., “Equalization and FEC techniques for optical transceivers,” IEEE J. of Solid-State Circuits, vol. 37, pp. 317-327, March 2002.
[2] Farbod Behbahani, et al., “CMOS mixers and polyphase filters for large image rejection,” IEEE J. of Solid-State Circuits, vol. 36, pp. 873-887, June 2001.
[3] J. L. Brown Jr., “Generalized sampling and the perfect reconstruction problem for maximally decimated filter banks,” IEEE ICASSP, pp. 1195-1198, 1989.
[4] E. M. Cherry and D. E. Hooper, “The design of wideband transistor feedback amplifiers,” Proc. IEE, vol. 110, pp. 375-389, Feb. 1963.
[5] Jong-Sang Choi, et al., “A 0.18-um CMOS 3.5-Gb/s continuous-time adaptive cable equalizer using enhanced low-frequency gain control method,” IEEE J. of Solid-State Circuits, vol. 39, pp. 419-425, Mar. 2004.
[6] W.J. Dally, et al., “Transmitter equalization for 4-Gbps signaling,” IEEE Micro, vol. 17, pp. 48-56, Jan.-Feb. 1997.
[7] William J. Dally and John W. Poulton, Digital system engineering, Cambridge University Press, 1998.
[8] David Falconer, et al., “Frequency domain equalization for single-carrier broadband wireless systems,” IEEE Communications Magazine, pp. 58-66, April 2002.
[9] Ramin Farjad-Rad, et al., “A 0.3-um CMOS 8-Gb/s 4-PAM serial link transceiver,” IEEE J. of Solid-State Circuits, vol. 35, pp. 757-764, May 2000.
[10] Ramin Farjad-Rad, A CMOS 4-PAM Multi-Gbps Serial Link Transceiver, Ph.D. dissertation, Stanford University, 2000.
[11] R. Farjad-Rad, et al., “A 0.4-um CMOS 10-Gb/s 4-PAM pre-emphasis serial link transmitter,” IEEE J. of Solid-State Circuits, vol. 34, pp. 580-585, May 1999.
[12] Lei Feng and Won Namgoong, “A frequency channelized adaptive wideband receiver for high-speed links,” IEEE Workshop on Signal Processing Systems, SIPS 2003, pp. 24-28, Aug. 2003.
[13] Andrea Goldsmith, Wireless communications, Cambridge University Press, 2005.
[14] Srikanth Gondi, et al., “A 10Gb/s CMOS adaptive equalizer for backplane applications,” Int. Solid-State Circuits Conf., 18.1, Feb. 2005.
[15] Pavan Kumar Hanumolu, et al., “Analysis of PLL clock jitter in high-speed serial links,” IEEE Trans. on Circuits and Systems - II, vol. 50, pp. 879-886, Nov. 2003.
[16] The Institute of Electrical and Electronics Engineers, “IEEE standard for terminology and test methods for analog-to-digital converters,” IEEE Std 1241-2000.
[17] H.O. Johansson and C. Svensson, “Time resolution of NMOS sampling switches used on low-swing signals,” IEEE J. of Solid-State Circuits, pp. 237-245, Feb. 1998.
[18] David A. Johns and Daniel Essig, “Integrated circuits for data transmission over twisted-pair channels,” IEEE J. of Solid-State Circuits, vol. 32, pp. 398-406, March 1997.
[19] S. Kasturia and J. H. Winters, “Techniques for high-speed implementation of nonlinear cancellation,” IEEE J. Selected Areas Comm., vol. 9, pp. 711-717, Jun. 1991.
[20] Jinwook Kim, et al., “A four-channel 3.125-Gb/s/ch CMOS serial-link transceiver with a mixed-mode adaptive equalizer,” IEEE J. of Solid-State Circuits, vol. 40, pp. 462-471, Feb. 2005.
[21] Peter Kinget and Michiel Steyaert, “Impact of transistor mismatch on the speed-accuracy-power tradeoff of analog CMOS circuits,” IEEE Custom Integrated Circuit Conf., pp. 333-336, 1996.
[22] M.-J. Edward Lee, William J. Dally, et al., “CMOS high-speed I/Os - present and future,” Proc. of 21st International Conference on Computer Design, pp. 454-461, Oct. 2003.
[23] M.J.E. Lee, W. Dally, and P. Chiang, “A 90mW 4 Gb/s equalized I/O circuit with input offset cancellation,” Int. Solid-State Circuits Conf., pp. 252-253, 463, Feb. 2000.
[24] Thomas H. Lee, The design of CMOS radio-frequency integrated circuits, Cambridge University Press, 1998.
[25] Jaesik Lee, et al., “A 5-b 10Gsamples/s A/D converter for 10-Gb/s optical receivers,” IEEE J. of Solid-State Circuits, vol. 39, pp. 1671-1679, Oct. 2004.
[26] Kyongsu Lee and Won Namgoong, “A 0.25um CMOS 3-b 12.5GS/s frequency channelized receiver for serial-links,” IEEE ISSCC Dig. Tech. Papers, pp. 336, Feb. 2005.
[27] J.G. Maneatis, “Low-jitter process-independent DLL and PLL based on self-biased techniques,” IEEE J. of Solid-State Circuits, vol. 31, pp. 1723-1732, Nov. 1996.
[28] T. Mizuno, “Experimental study of threshold voltage fluctuation due to statistical variation of channel dopant number in MOSFET’s,” IEEE Trans. on Electron Devices, vol. 41, pp. 2216-2221, Nov. 1994.
[29] Won Namgoong, “A channelized digital ultra-wideband receiver,” IEEE Trans. on Wireless Comm., vol. 2, pp. 502-510, May 2003.
[30] Azita Emami Neyestanak, Design of CMOS receivers for parallel optical interconnects, Ph.D. dissertation, Stanford University, 2004.
[31] A. Papoulis, “Generalized sampling expansion,” IEEE Trans. on Circuits and Systems, vol. 24, pp. 652-654, Nov. 1977.
[32] M.J. Pelgrom, “Matching properties of MOS transistors,” IEEE J. of Solid-State Circuits, vol. 24, pp. 1433-1439, Oct. 1989.
[33] Rudy van de Plassche, CMOS integrated analog-to-digital and digital-to-analog converters, Kluwer Academic Publishers, 2003.
[34] John G. Proakis and Masoud Salehi, Communication systems engineering, second edition, Prentice Hall, 2002.
[35] Hamid R. Rategh, et al., “A CMOS frequency synthesizer with an injection-locked frequency divider for a 5-GHz wireless LAN receiver,” IEEE J. of Solid-State Circuits, vol. 35, pp. 780-786, May 2000.
[36] Hamid R. Rategh and Thomas H. Lee, “Superharmonic injection-locked frequency dividers,” IEEE J. of Solid-State Circuits, vol. 34, pp. 813-821, June 1999.
[37] Behzad Razavi, Design of Analog CMOS Integrated Circuits, McGraw-Hill, 2001.
[38] Behzad Razavi, “Prospects of CMOS technology for high-speed optical communication circuits,” IEEE J. of Solid-State Circuits, vol. 37, pp. 1135-1145, Sep. 2002.
[39] Behzad Razavi, Design of integrated circuits for optical communications, McGraw-Hill, 2003.
[40] Behzad Razavi, et al., “Design techniques for high-speed, high-resolution comparators,” IEEE J. of Solid-State Circuits, vol. 27, pp. 1916-1926, Dec. 1992.
[41] Eduard Sackinger and Wilhelm C. Fischer, “A 3-GHz 32-dB CMOS limiting amplifier for SONET OC-48 receivers,” IEEE J. of Solid-State Circuits, Brief Papers, vol. 35, pp. 1884-1888, 2000.
[42] Christoph Sadner, et al., “A 6bit, 1.2GSps low-power flash-ADC in 0.13um digital CMOS,” Proc. of the Design, Automation, and Test in Europe, pp. 223-226, 2005.
[43] Arvin R. Shahani, Derek K. Shaeffer, et al., “A 12-mW wide dynamic range CMOS front-end for a portable GPS receiver,” IEEE J. of Solid-State Circuits, vol. 32, pp. 2061-2070, Dec. 1997.
[44] Meigen Shen, et al., “Chip-package co-design for high-speed transmitter in serial links applications,” Electrical Performance of Electronics Packaging 2003, pp. 217-220, Oct. 2003.
[45] Vladimir Stojanovic, et al., “Transmit pre-emphasis for high-speed time-division-multiplexed serial-link transceiver,” IEEE International Conf. on Comm., vol. 3, pp. 1934-1939, May 2002.
[46] Vladimir Stojanovic, et al., “Optimal linear precoding with theoretical and practical data rates in high-speed serial-link backplane communication,” IEEE Conference on Comm., vol. 5, pp. 2799-2806, June 2004.
[47] Vladimir Stojanovic, et al., “Autonomous dual-mode (PAM2/4) serial link transceiver with adaptive equalization and data recovery,” IEEE J. of Solid-State Circuits, vol. 40, pp. 1012-1026, Apr. 2005.
[48] Gilbert Strang, Linear Algebra and its applications, 3rd edition, Thomson Learning, Inc., 1988.
[49] Koen Uyttenhove and S. J. Steyaert, “Speed-power-accuracy tradeoff in high speed CMOS ADCs,” IEEE Trans. on Circuits and Systems, vol. 49, pp. 280-287, Apr. 2002.
[50] Scott R. Velazquez, et al., “A hybrid filter bank approach to analog-to-digital conversion,” Proc. IEEE-SP Int. Symp. on Time-Frequency Time-Scale Anal., pp. 116-119, Oct. 1994.
[51] R. Vescovo, “Inversion of block-circulant matrices and circular array approach,” IEEE Trans. on Antennas and Propagation, vol. 45, pp. 1565-1567, Oct. 1997.
[52] Heinz Werker, et al., “A 10-Gb/s SONET-compliant CMOS transceiver with low crosstalk and intrinsic jitter,” IEEE J. of Solid-State Circuits, vol. 39, pp. 2349-2358, Dec. 2004.
[53] Koon-Lun Jackie Wong, et al., “Offset compensation in comparators with minimum input-referred supply noise,” IEEE J. of Solid-State Circuits, vol. 39, pp. 837-840, May 2004.
[54] Chia-Hsin Wu, et al., “A 2GHz CMOS variable-gain amplifier with 50dB linear-in-magnitude controlled gain range for 10GBase-LX4 ethernet,” Int. Solid-State Circuits Conf., 26.8, Feb. 2004.
[55] C.K.K. Yang, V. Stojanovic, S. Modjtahedi, et al., “A serial-link transceiver based on 8-GSamples/s A/D and D/A converters in 0.25-um,” IEEE J. of Solid-State Circuits, vol. 36, pp. 1684-1692, Nov. 2001.
[56] C.K.K. Yang, et al., “Analysis of timing recovery for multi-Gbps PAM transceivers,” Proc. IEEE Custom Integrated Circuits Conference, pp. 67-72, Sep. 2003.
[57] C.K.K. Yang, A multi-Gbps transceiver in CMOS technology, Ph.D. dissertation, Stanford University, 1998.
[58] Jeongsik Yang, et al., “A quad-channel 3.125Gb/s/ch serial-link transceiver with mixed-mode adaptive equalizer in 0.18um CMOS,” Int. Solid-State Circuits Conf., pp. 176-185, 2004.
[59] Jared L. Zerbe, et al., “Equalization and clock recovery for a 2.5-10Gb/s 2-PAM/4-PAM backplane transceiver cell,” IEEE J. of Solid-State Circuits, vol. 38, pp. 2121-2130, Dec. 2003.
Appendices

A. Shunt-shunt feedback amplifier circuit analysis

[Figure A.1: Equivalent circuit of shunt-shunt feedback. Labeled elements: Rs, Rf, gm·Vx, Ro, Co, Vx, Vo.]

The equivalent circuit for the shunt-shunt feedback circuit discussed in section 3.2 (Figure 3.12(a)) is shown in Figure A.1, where the inverting amplifier is replaced with an NMOS transistor. By applying Kirchhoff's Current Law at the input and output nodes, we obtain two equations,

$$ I_s = \frac{V_x}{R_s} + \frac{V_x - V_o}{R_f} \tag{A.1} $$

and

$$ \frac{V_x - V_o}{R_f} = g_m V_x + \frac{V_o}{R_o} \tag{A.2} $$

where $g_m$ is the transconductance, $R_o$ is the open-loop output resistance, and the load capacitance is removed for low-frequency analysis. To eliminate $V_x$, we can solve (A.1) for $V_x$ and substitute it into (A.2). After further simplification, we obtain the closed-loop voltage gain, which is

$$ \frac{V_o}{I_s R_s} = -\frac{1}{R_s} \cdot \frac{(R_s \parallel R_f)\,\dfrac{R_o}{R_f + R_o}\,(g_m R_f - 1)}{1 + (R_s \parallel R_f)\,\dfrac{R_o}{R_f (R_f + R_o)}\,(g_m R_f - 1)} \tag{A.3} $$

Since the transconductance ($g_m$) of the amplifier and $R_f$ (several kilo-ohms) are large enough, we may assume $g_m R_f \gg 1$. Then (A.3) becomes

$$ \frac{V_o}{I_s R_s} \approx -\frac{1}{R_s} \cdot \frac{(R_s \parallel R_f)\,\dfrac{R_o R_f}{R_f + R_o}\,g_m}{1 + (R_s \parallel R_f)\,\dfrac{R_o}{R_f + R_o}\,g_m} \tag{A.4} $$

In order to increase the open-loop voltage gain ($-g_m R_o$), we can employ a current mirror as the load resistance. Therefore, in most cases, $R_o$ is much larger than $R_f$. This simplifies (A.4) to

$$ \frac{V_o}{I_s R_s} \approx -\frac{1}{R_s} \cdot \frac{(R_s \parallel R_f)\,R_f\,g_m}{1 + (R_s \parallel R_f)\,g_m} \tag{A.5} $$

Now, in order to satisfy the condition for valid negative feedback, where the closed-loop gain is a function of $R_s$ and $R_f$ only, the $(R_s \parallel R_f)\,g_m$ term in both the numerator and the denominator should be larger than 1. This implies that the feedback resistance ($R_f$) needs to be of the same order as $R_s$. This is achievable in a multi-stage topology because the effective $R_s$ is decreased by the previous feedback stage.
In practice, $R_f$ should be larger than $R_s$ in order to create gain instead of loss. By applying the conditions discussed above, the closed-loop voltage gain is further approximated as

$$ \frac{V_o}{I_s R_s} \approx -\frac{R_f}{R_s} \tag{A.6} $$

The closed-loop voltage gain of ideal trans-impedance feedback is expressed by

$$ A_{vc} = \frac{1}{R_s} \cdot \frac{A_{vo}}{1 + \beta A_{vo}} \tag{A.7} $$

where $A_{vo}$ is the open-loop voltage gain and $\beta A_{vo}$ is the loop gain. By comparing (A.5) and (A.7), we can identify the loop gain as

$$ \beta A_{vo} = (R_s \parallel R_f)\,g_m \tag{A.8} $$

which is a linear function of $g_m$. According to shunt-shunt feedback theory, the input and output resistances are decreased by the amount of the loop gain and the bandwidth is increased by the amount of the loop gain [37]:

$$ R_{ie} = \frac{R_i}{\beta A_{vo}} \tag{A.9} $$

$$ R_{oe} = \frac{R_o}{\beta A_{vo}} \tag{A.10} $$

$$ BW_{closed} = BW_{open} \cdot (\beta A_{vo}) \tag{A.11} $$

Here $R_i$ denotes the open-loop input resistance.
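As a numerical sanity check of the derivation in this appendix, the sketch below solves the two node equations (A.1) and (A.2) directly and compares the result with the closed-form gain and its successive approximations. The component values are hypothetical, chosen only to respect the stated assumptions (gm·Rf much larger than 1, Ro much larger than Rf, and Rf somewhat larger than Rs); they are not taken from the prototype. The negative sign reflects the inverting stage and follows from solving (A.1) and (A.2) as written.

```python
def parallel(a: float, b: float) -> float:
    """Parallel combination a // b."""
    return a * b / (a + b)


def gain_exact(rs: float, rf: float, ro: float, gm: float) -> float:
    """Vo/(Is*Rs) from solving the node equations (A.1)-(A.2) with Is = 1 A.

    (1/rs + 1/rf) * Vx - (1/rf) * Vo        = Is
    (1/rf - gm)   * Vx - (1/rf + 1/ro) * Vo = 0
    """
    a11, a12 = 1.0 / rs + 1.0 / rf, -1.0 / rf
    a21, a22 = 1.0 / rf - gm, -(1.0 / rf + 1.0 / ro)
    det = a11 * a22 - a12 * a21
    vo = (a11 * 0.0 - a21 * 1.0) / det      # Cramer's rule for Vo
    return vo / rs


def gain_closed_form(rs: float, rf: float, ro: float, gm: float) -> float:
    """Closed-form gain before any approximation (the form of (A.3))."""
    p = parallel(rs, rf)
    num = p * ro / (rf + ro) * (gm * rf - 1.0)
    den = 1.0 + p * ro / (rf * (rf + ro)) * (gm * rf - 1.0)
    return -num / den / rs


def gain_large_ro(rs: float, rf: float, gm: float) -> float:
    """Approximation for gm*rf >> 1 and ro >> rf (the form of (A.5))."""
    p = parallel(rs, rf)
    return -(p * rf * gm) / (1.0 + p * gm) / rs


# Hypothetical component values, for illustration only.
RS, RF, RO, GM = 500.0, 2e3, 50e3, 20e-3

exact = gain_exact(RS, RF, RO, GM)
closed = gain_closed_form(RS, RF, RO, GM)
approx = gain_large_ro(RS, RF, GM)
ideal = -RF / RS                            # limit for large loop gain

print(f"exact {exact:.3f}, closed form {closed:.3f}, "
      f"approx {approx:.3f}, ideal {ideal:.3f}")
```

With these values the loop gain (Rs//Rf)·gm is 8, so the ideal ratio -Rf/Rs overstates the magnitude of the exact gain by roughly one part in eight, which shows how large the loop gain must be before the gain depends on the resistor ratio alone.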
Asset Metadata
Creator
Lee, Kyongsu
(author)
Core Title
A CMOS frequency channelized receiver for serial-links
School
Graduate School
Degree
Doctor of Philosophy
Degree Program
Electrical Engineering
Publisher
University of Southern California
(original),
University of Southern California. Libraries
(digital)
Tag
engineering, electronics and electrical,OAI-PMH Harvest
Language
English
Contributor
Digitized by ProQuest
(provenance)
Advisor
Namgoong, Won (
committee chair
), Beerel, Peter (
committee member
), Choma, John (
committee member
), Kim, Eun Sok (
committee member
), Zimmermann, Roger (
committee member
)
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c16-610216
Unique identifier
UC11342107
Identifier
3220120.pdf (filename),usctheses-c16-610216 (legacy record id)
Legacy Identifier
3220120.pdf
Dmrecord
610216
Document Type
Dissertation
Rights
Lee, Kyongsu
Type
texts
Source
University of Southern California
(contributing entity),
University of Southern California Dissertations and Theses
(collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the au...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus, Los Angeles, California 90089, USA
Tags
engineering, electronics and electrical