A generic spur and interference mitigation platform for next generation digital phase-locked loops
A GENERIC SPUR AND INTERFERENCE MITIGATION PLATFORM FOR NEXT GENERATION DIGITAL PHASE-LOCKED LOOPS

by Cheng-Ru Ho

A Dissertation Presented to the FACULTY OF THE USC GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (ELECTRICAL ENGINEERING)

May 2018

Copyright 2018 Cheng-Ru Ho

Dedication

To my father, Sheng-Syun Ho. To my mother, Shu-Zhen Xie. To my wife, Tzu-Yi Lin. To my beloved cats, Allpa and Hemu.

Acknowledgements

This work could not have been done without the support of the following people.

First of all, I would like to thank my advisor, Mike Shuo-Wei Chen, for his unconditional support, encouragement, and advice over the past few years. I had already worked with Prof. Chen as a voluntary researcher during my master's program. Although I had no previous experience with chip fabrication, Prof. Chen was willing to offer me opportunities to learn about tape-out so that I could start my career in this field. During my time in the University of Southern California (USC) Ph.D. program, there was a period when I was not certain that I could continue my Ph.D. project, which had come to a standstill. It was Prof. Chen's relentless encouragement and support that helped me to overcome this challenge. The lessons I learned from Prof. Chen extend beyond academics; his ambition, passion, insight, and wisdom have enabled me to become one of the most established researchers in my field worldwide. Even though it was painful and difficult at times to survive in his research group, thanks to his inflexible standards, I have become a so-called superstar without even realizing it. I will be proud of being one of his group members throughout my entire life.

I would like to extend my sincere thanks to the members of my qualifying and defense committees, including Prof. Hossein Hashemi, Prof. Peter Beerel, Prof. Keith Chugg, and Prof. Jongseung Yoon.
To help me improve and become a successful expert in my field, they shared their opinions selflessly during all discussions. The most important thing I learned from them is how to think about and deal with problems outside of the box. I would like to extend special thanks to Prof. Hashemi who, even now, offers me his kind help and treats me as a student in his research group. To me, he has played a role equivalent to that of a co-advisor.

I am grateful to the former senior members of Prof. Chen's research group, including David Chiong, Tui Tsai, Praveen Sharma, and Dylan Hand, as they built the infrastructure for this group from scratch. I also appreciate all the brainstorming and technical support from my labmates, including Jaewon Nam, Shiyu Su, Tzu-Fan Wu, Aoyang Zhang, Mohsen Hassanpourghadi, Haolin Cong, Rezwan Rasul, and Ce Yang. Special thanks to my roommates, Shiyu Su and Aoyang Zhang, who took care of my cat, Allpa, when I had to travel. I wish to thank Dr. Run Chen and Dr. SungWon Chung from Dr. Hashemi's research group as well, as I benefited not only from their technical support but also from the valuable experience they shared.

During my studies at USC, I was fortunate to have made close friends, who accompanied me and added flavor to my life. I will never forget having Kevin Jiang, Yu-Chun Shih, Chung-Yao Pai, Yu-Ju Tsai, Daniel Wang, Justin Wang, Griffey Kao, Allen Chen, Will Hsiao, Ying-Yi Chiang, Echo Xie, Venus Fu, and Emily Chu in my life. I still remember the time we stayed up late for a party together. When I felt overwhelmed by the pressure of this project, I was lucky to have a squad of friends, including Shiyu Su, Tzu-Fan Wu, Aoyang Zhang, and Zisong Wang, with whom I played video games. The memory of fighting so hard to record game data over many sleepless nights is still vivid to me.

Without funding support, none of my Ph.D. work would have been accomplished.
My research project was mainly supported by the Defense Advanced Research Projects Agency (DARPA), under the IRSI-TERCI and RF-FPGA programs, and by Google, under the ATAP-R2 program. I gratefully acknowledge these sponsors.

Last but not least, without my family's support, I would not have been able to proceed this far. Even though I am the only child in my family, my parents still let me go abroad to fulfill my dreams. I am deeply indebted to my parents for their wholehearted love. Of course, I am indebted to my wife, Tzu-Yi Lin, whose understanding, companionship, and tolerance of my irregular daily routine supported me through the most difficult times; my two adorable cats, Allpa and Hemu, were also a great comfort. Without my family's full support, none of these accomplishments would have been possible.

Table of Contents

Dedication
Acknowledgements
List Of Tables
List Of Figures
Abstract
Chapter 1: Introduction
1.1 Impact of PLL Spurs
1.2 Generation of External and Internal Spurs
1.3 Related Research for Spur Mitigation
1.4 Contributions of This Thesis in Relation to the State of the Art
Chapter 2: A Digital PLL with Single-Tone Interference Cancellation and Injection-Locked TDC
2.1 Overhead of Conventional DPLL Architecture
2.2 Injection-Locked TDC Architecture
2.2.1 Locking Range of the Injection-Locked TDC
2.2.2 Differential Nonlinearity due to Injection-Locking
2.2.3 Phase Noise Impact due to Injection Locking
2.3 Adaptive Single Tone Interference Cancellation Scheme
2.3.1 Steepest Descent Search Algorithm
2.3.2 Residue Error Energy Calculation
2.3.3 Numerical Simulation
2.4 Circuit Implementation
2.4.1 DPLL Overall Architecture
2.4.2 Implementation of Injection-Locked TDC
2.4.3 Digitally Controlled Oscillator
2.4.4 Implementation of Adaptive Spur Cancellation Loop
2.5 Experimental Results
2.6 Conclusion
Chapter 3: A Digital PLL with Feedforward Multi-Tone Spur Cancellation
3.1 Spur in Multi-tone Scenario
3.2 Proposed Multi-Tone Spur Cancellation
3.2.1 Derivation of the Cancellation Algorithm
3.2.2 Stability of PLL Loop
3.2.3 Integer and Fractional Delay
3.2.4 Complete Cancellation Loop and Adaptability
3.3 Case Study: Fractional-N Spur Cancellation
3.4 Circuit Implementation
3.4.1 Block Diagram of DPLL Prototype
3.4.2 Proposed Averaging Scheme
3.4.3 Modular Extension: Cascaded Cancellation Loops
3.4.4 Deployed Analog and Isolation Techniques
3.5 Experimental Results
3.6 Conclusion
Chapter 4: A Digital PLL with Background Dither Noise Cancellation for Near-Carrier In-Band Spur
4.1 Fractional-N Spurs at Near-Carrier Frequencies
4.2 Fractional-N Spurs in DPLLs
4.2.1 The Generation of Fractional-N Spurs
4.2.2 Phase Randomization in Injection-Locked TDC
4.3 Background Dither Noise Cancellation
4.4 Circuit Implementations
4.4.1 The Proposed DPLL Architecture
4.4.2 Implementation of the Digital-to-Time Converter
4.4.3 Implementation of the Dithering Noise Cancellation Loop
4.5 Experimental Results
4.6 Conclusions
Chapter 5: A Digital PLL with Dither-assisted Pulling Mitigation
5.1 Pulling of PLLs
5.2 Adaptive Filter Transfer Function for Spur Coupling to the DCO Path
5.3 Challenges of Mitigating DCO-induced Spur and Proposed Mitigation Scheme
5.4 Circuit Implementation of DCO-induced Spur Mitigation Scheme
5.5 Experimental Results of DCO-induced Spur Mitigation Scheme
5.6 DCO and Reference Pulling Phase Errors
5.6.1 Pulling Signal to the DCO Path
5.6.2 Pulling Signal to the Reference Path
5.6.3 Simultaneous Coupling of Pulling Signals
5.7 Dithering for Orthogonalizing Pulling Phase Errors
5.8 Proposed Pulling Mitigation Scheme
5.8.1 Update Function of the DCO Mitigation Loop
5.8.2 Update Function of the Reference Mitigation Loop
5.9 Circuit Implementation of Pulling Mitigation Scheme
5.9.1 Proposed DPLL Architecture
5.9.2 Implementation of the DCO Mitigation Loop
5.9.3 Implementation of the Reference Mitigation Loop
5.10 Experimental Results of Pulling Mitigation Scheme
5.11 Conclusions
Chapter 6: Conclusions
6.1 Summary
6.2 Recommendations for Future Work
References
Appendix A
A.1 The Reference Mitigation Loop with DPLL in Fractional-N Mode

List Of Tables

2.1 Comparison with state-of-the-art digital PLLs
3.1 Comparison with state-of-the-art digital PLLs
4.1 Comparison table with state-of-the-art digital PLLs
5.1 Comparison with state-of-the-art PLLs
5.2 Comparison of state-of-the-art PLLs with pulling mitigation

List Of Figures

1.1 Illustration of spur-induced reciprocal mixing in a wireless transceiver
1.2 Analog phase-locked loop vs. digital phase-locked loop
1.3 Generation of fractional-N spur in digital phase-locked loop (DPLL) architecture
2.1 (a) Block diagram of N-stage ring oscillator without injection locking (b) The current phasor diagram without injection locking
2.2 (a) Ring oscillator under injection locking (b) The current phasor diagram under injection locking and the boundary of injection lock, where the phase angle reaches its maximum
2.3 (a) The block diagram of injection-locked TDC under injection with N = 3 (b) The timing diagram of TDC output without injection (c) The timing diagram of TDC output with $\omega_{inj} = \omega_{RO} + \Delta\omega$ (d) The timing diagram of TDC output with $\omega_{inj} = \omega_{RO} - \Delta\omega$
2.4 DPLL transfer function with IL-TDC
2.5 DPLL phase noise profile: (a) With injection-locking bandwidth of 150 kHz (b) With injection-locking bandwidth of 25 MHz
2.6 Potential interference coupling and cancellation-leveraging DPLL architecture
2.7 High level concept of interference cancellation
2.8 (a) The loci of the updates using the steepest descent algorithm (b) The proposed coordinates descent algorithm, using five computation channels with skewed amplitude and phase
2.9 Block diagram of the residue error energy computation
2.10 (a) Numerical simulation of TDC output without (top) and with (bottom) the proposed spur cancellation, showing the convergence of estimated interference parameters (b) Numerical simulation of TDC output before and after enabling cancellation in the absence of interference
2.11 (a) Spur level versus number of updated cycles with different amplitude steps $A_{step}$ (normalized to the full swing of DDS) (b) Spur level versus number of updated cycles with different phase steps
2.12 (a) The impact of amplitude offset (normalized to the full swing of DDS) on the cancellation accuracy (b) The impact of phase offset on the cancellation accuracy
2.13 Overall block diagram of the proposed DPLL architecture
2.14 Circuit implementation of injection-locked TDC and LC-DCO
2.15 (a) Cross-coupled resistor network applied in IL-TDC (b) Equivalent circuit for one stage of IL-TDC with noise modeling
2.16 Proposed DAC interface between digital loop filter and LC oscillator
2.17 Implementation of the proposed gradient-based adaptive spur cancellation scheme
2.18 Die micrograph
2.19 Measured spectrum snapshot of IL-TDC (a) before injection locking, (b) under injection locking (c) Phase noise profile of RO before and after IL (d) Measured DNL performance of proposed IL-TDC
2.20 (a) Measured phase noise profile with different DPLL bandwidth settings under integer-N mode at 2.816 GHz carrier frequency (b) PLL output PSD of frac-N mode at 2.8165 GHz with fractional spur at 500 kHz offset with narrow bandwidth setting
2.21 Measured reference spur of -86.45 dB at 2.816 GHz carrier frequency
2.22 (a) Measured spur level improvement of >43 dB at 500 kHz offset frequency with 30 kHz bandwidth (b) Measured spur level before/after spur cancellation scheme over different injected spur frequencies with 250 kHz bandwidth
2.23 (a) Measured spur level before and after cancellation with varying interference amplitude in real time (b) Measured digital dynamic power consumption under different DDS frequencies
3.1 The property of multi-tone spurs and its harmonic decomposition
3.2 Basic concept of feedforward multi-tone spur cancellation in time domain view
3.3 z-domain model of the proposed digital PLL with feedforward cancellation path $H_{CANC}(z)$
3.4 The open loop response of digital PLL with $H_{CANC}(z) = 1 - z^{-D}$
3.5 (a) Block diagram of feedforward cancellation path with high-pass filter and (b) zero-pole diagram of the high-pass filter
3.6 Zero-pole diagram of (a) open loop response of 2nd-order type-II digital PLL (b) $H_{CANC}(z) = 1 - z^{-4}$ (c) $H_{OPEN}(z)$ with $H_{CANC}(z) = 1 - z^{-4}$ (d) $H_{OPEN}(z)$ with $H_{CANC}(z) = 1 - \left(\frac{1 - z^{-1}}{1 - p z^{-1}}\right) z^{-4}$
3.7 The magnitude response of $H_{CANC}(z) = 1 - z^{-D}$
3.8 (a) The block diagram of feedforward cancellation path with high-pass filter and signal averaging and (b) its frequency response with different $N_{win}$
3.9 Computed phase margin with (a) different settings of D-cycle delay given a certain DPLL bandwidth, and (b) different averaging cycle
3.10 Illustration of phase alignment between the actual spur and its replica with (a) 4-cycle delay, and (b) 4.5-cycle delay
3.11 Block diagram of the Lagrange fractional delay filter with adaptability
3.12 (a) Simulation and analytical results of cancellation ratio given single sinusoidal spur; (b) simulation result of cancellation ratio given sawtooth spur over different fractional delay resolution
3.13 Complete block diagram of the proposed feedforward multi-tone spur cancellation loop
3.14 Illustration of fractional-N spur cancellation
3.15 Proposed digital PLL implementation with feedforward multi-tone spur cancellation loop
3.16 (a) Proposed averaging scheme in integer mode (b) Representative examples of integer spur period with averaging in integer mode and the reconstruction of spur pattern at z[n]
3.17 (a) Proposed averaging scheme in fractional mode (b) Representative examples of fractional spur period with averaging in integer mode and fractional mode
3.18 Illustration of proposed spur cancellation with (a) single-stage configuration and (b) two-stage configuration for two series of spurs
3.19 Die micrograph
3.20 Test chip measurement setup
3.21 Measured digital PLL phase noise profile at 3.57 GHz in the absence of external or internal spur (a) without and (b) with the proposed spur cancellation activated
3.22 Measured digital PLL phase noise profile given average cycle ($N_{win}$) of (a) 2 and (b) 32
3.23 (a) Spectrum snapshot of fractional spurs at FCW = 119.25 and (b) measured worst-case fractional spur level across different FCW settings
3.24 (a) Spectrum snapshot of reference spur measurement at carrier frequency of 3.57 GHz and (b) measured reference spur across the entire synthesizer operation range
3.25 Measured (a) fundamental and (b) 2nd harmonics of external spur reduction in single-stage cancellation
3.26 Measured external spur reduction in cascaded two-stage cancellation configurations (a) without and (b) with activating the spur cancellation scheme
4.1 The hardware complexity involved in a feedforward multi-tone spur cancellation when a spur period increases
4.2 The generation of fractional-N spurs in the DPLL with a frequency accumulation path
4.3 The DPLL block diagram with reference-path dithering
4.4 A time-domain illustration of the randomization of a fractional spur error pattern (a) with and (b) without the proposed dithering scheme
4.5 A time-domain illustration of the randomization of a fractional spur error pattern (a) with and (b) without the proposed dithering scheme
4.6 A simulation with the worst-case fractional spur improvement under different stages of the Hadamard code generator
4.7 A DPLL mathematical model with injected dither noise and a corresponding cancellation path in z-domain
4.8 The overall proposed DPLL architecture
4.9 The implementation of the proposed dither and dither removal scheme
4.10 The implementation of the proposed background dither noise cancellation loop with a one-tap configuration
4.11 A digital implementation of the update function Eq. (4.12) with the compensation block inserted at (a) the input node and (b) the output node
4.12 The complete implementation of the proposed background dither noise cancellation loop with a two-tap configuration
4.13 Chip micrograph
4.14 The testing configuration
4.15 Measured phase noise profile with the dither and dither noise cancellation loop at an FCW of $120 + 2^{-10}$. The spur improvement of 30 dB and the noise improvement of 23 dB were noted. These improvements validated the efficacy of the proposed technique
4.16 Measured worst-case fractional spur with an FCW of $120 + \frac{1}{2^{14}}$ (a) before and (b) after enabling the dither noise cancellation loop
4.17 (a) Measured settling behavior of a DPLL with an FCW of $120 + \frac{1}{2^{14}}$ and (b) the worst-case fractional spur with different FCW settings
4.18 The worst-case fractional spur levels, measured with different dithering magnitudes normalized to one DCO period
4.19 Measured variations of integrated RMS jitter with an FCW of $120 + 2^{-14}$ after enabling the dither and dither noise cancellation loop under (a) 20% DCO/TDC/DTC supply variations, and (b) 27 °C to 60 °C temperature variations
4.20 Measured variations in the worst-case fractional spur with an FCW of $120 + 2^{-14}$ after enabling the dither and dither noise cancellation loop under (a) 20% DCO/TDC/DTC supply variations, and (b) 27 °C to 60 °C temperature variations
4.21 Measured integrated RMS jitter variations with a fixed update weighting after the settling of the cancellation loop under (a) 20% DTC supply variations and (b) 27 °C to 60 °C temperature variations
4.22 Measured DPLL (a) reference spur and (b) phase noise profile at 3.57 GHz and an FCW of 119
5.1 A PLL pulled by a transmitter output as PA pulling or by nearby PLLs as oscillator mutual pulling
5.2 A PLL can be pulled by the aggressor signal coupling to reference and DCO paths simultaneously
5.3 A z-domain model of a DPLL with the DCO-induced spur and the compensation signal
5.4 The mitigation of different spur sources to the PLL's (a) reference and (b) DCO paths
5.5 A block diagram of a DPLL implementation with the proposed DCO-induced spur mitigation scheme and different spur injection points
5.6 The learning phase of the proposed mitigation implementation
5.7 The proposed spur-pattern learning phase
5.8 A time-domain view of a spur replica in the IIR-based accumulator during the (a) learning and (b) cancellation phases
5.9 Chip micrograph
5.10 (a) A measured phase noise profile at 3.57 GHz and (b) a best-case reference spur level at 4.59 GHz
5.11 A measured DPLL spectrum with multi-tone spurs with a 600 kHz offset (a) before spur mitigation and (b) after spur mitigation
5.12 Measured spur levels versus spur frequencies over different coupling paths
5.13 Measured spur levels at representative spur frequencies over different (a) delay and (b) amplitude settings during the cancellation phase
5.14 z-domain DPLL transfer function with (a) DCO pulling phase error $\phi_{DCO,Pull}[k]$ and (b) reference pulling phase error $\phi_{REF,Pull}[k]$
5.15 Indistinguishable DCO and reference pulling phase errors at the TDC output
5.16 DCO pulling phase error at TDC output with and without the proposed dither-assisted scheme
5.17 Reference pulling phase error at TDC output with and without the proposed dither-assisted scheme
5.18 Proposed dither-assisted technique differentiates the simultaneous coupling signal
5.19 The derivation of the update function, based on the proposed DCO mitigation loop, and the minimization of the DCO pulling phase error
5.20 A derivation of the update function, based on the proposed reference mitigation loop, used to minimize the reference pulling phase error
5.21 Overall DPLL architecture with the proposed DCO and reference pulling mitigation scheme
5.22 The implementation of the proposed DCO pulling mitigation loop
5.23 The implementation of the proposed DCO pulling mitigation loop
5.24 The operation of a PN correlator: extracting DC information from the reference pulling phase error
5.25 A chip micrograph
5.26 Measured DPLL spectrum (a) with sinusoidal pulling to a DCO path and dithering and (b) with enabling the dither noise cancellation loop
5.27 Measured DPLL spectrum (a) with sinusoidal pulling to only the reference path, (b) with dithering, (c) with dither noise cancellation, and (d) with reference pulling mitigation loop
5.28 (a) The measured DPLL phase noise profile without injecting aggressor signal and (b) with a modulated signal generated from a 65 GS/s arbitrary waveform generator
5.29 The measured DPLL spectrum with PA pulling (a) before and (b) after activating the proposed pulling mitigation scheme
5.30 The measured DPLL spectrum with oscillator mutual pulling (a) before and (b) after activating the proposed pulling mitigation scheme
5.31 (a) The time-domain waveform of a DLF input before and after enabling both mitigation loops with disturbances removed and (b) the DPLL reference spur performance
5.32 Testing configurations
A.1 The implementation of the modified PN correlator when DPLL operates at fractional-N mode

Abstract

Phase-locked loops (PLLs) are widely deployed in most electronic systems to generate a desired clock frequency, perform clock data recovery (CDR), and carry out frequency/phase modulation, among other applications. Owing to stringent system-level specifications, the rapid growth of modern electronic systems (e.g., 5G communication or bio-medical applications) has imposed tighter design constraints on component-level blocks such as PLLs.
One challenge is to build robust, low-spur PLLs, which are highly desirable for many applications to avoid unwanted reciprocal mixing of blocker signals, prevent emission mask violation in a wireless transmitter, and minimize deterministic jitter as a clock source.

Designing a low-spur PLL is not a straightforward problem. A spur can originate externally or internally and has a variety of patterns, such as sinusoidal, sawtooth-like, or modulated waveforms. External spurs are caused by nearby interfering aggressors, which can be clocks in other domains or the power amplifier (PA) output in a system-on-chip (SoC) platform. Notably, external interference may couple into the PLL via the silicon substrate, bond wires, power supplies, or even the inductor inside an LC-tank oscillator to generate spurs. Even worse, different coupling paths imply different transfer functions from the spur input node to the final PLL output, which makes it nearly impossible to pre-calculate the coupling transfer function. Internal spurs, in contrast, including fractional and reference spurs, result from the nature of PLL operation, and they are difficult to filter externally. The implementation of an additional spur mitigation scheme sometimes causes stability issues for the loop dynamics, which may sacrifice the noise performance of PLLs. Consequently, it is challenging to design a reliable, low-spur PLL.

To resolve these issues, this thesis introduces a generic spur and interference mitigation platform for the digital PLL (DPLL) architecture by leveraging various adaptive filter algorithms, which are capable of mitigating both types of spurs. Because of its intensive digital embodiment, the DPLL is highly flexible and reconfigurable. Furthermore, the digital core of a DPLL is substantially robust to process, voltage, and temperature (PVT) variations, as it does not suffer from analog non-idealities such as limited supply headroom.
Thanks to the many advantages of the DPLL over its analog counterpart, this digital architecture is used to demonstrate the concept of the proposed spur and interference cancellation platform in this thesis.

Chapter 1

Introduction

1.1 Impact of PLL Spurs

PLLs are widely deployed in electronic systems to synthesize a desired clock, which is used to perform up/down conversion in a wireless system, clock data recovery (CDR) in a wireline system, or other functions, depending on the intended application. Ideally, a synthesized clock should contain only its carrier or fundamental component to avoid unwanted reciprocal mixing of blocker signals, prevent emission mask violation, and minimize deterministic jitter as a clock source. However, due to the non-linear operation of the PLL or disturbance from interference, the PLL will generate not only its carrier component but also undesirable tones, namely spurs, in the spectrum.

To illustrate the impact of a spur, an example of unwanted reciprocal mixing in a wireless application is displayed in Fig. 1.1. On the transmitter side, after the mixer, the input signal will be up-converted and will generate additional components that violate the emission mask at specific frequency offsets corresponding to the spur location. On the receiver side, the incoming signal usually includes close-in blockers, which are down-converted to the baseband due to the reciprocal mixing. As a result, the signal-to-noise ratio (SNR) performance will be degraded. Note that the spur-induced reciprocal mixing issue is just one of many scenarios. Another possibility is that the spur will increase the deterministic jitter, which is not desirable for a wireline communication system.

Figure 1.1: Illustration of spur-induced reciprocal mixing in a wireless transceiver
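As a rough illustration of the two effects described above, the sketch below converts a spur level into (i) the signal-to-interference ratio left after reciprocal mixing and (ii) the RMS deterministic jitter of the synthesized clock. It is not taken from this thesis: the function names, the example power levels, and the first-order assumptions (a single sinusoidal phase-modulation spur pair, a blocker sitting exactly at the spur offset, and equal conversion gain for signal and blocker) are hypothetical choices made only for this example.

```python
import math


def reciprocal_mixing_sir_db(signal_dbm: float, blocker_dbm: float, spur_dbc: float) -> float:
    """Signal-to-interference ratio (dB) when a blocker is down-converted by an LO spur.

    Assumption: the blocker sits exactly at the spur offset from the wanted
    channel, so it lands on top of the signal, attenuated by the spur level.
    """
    interference_dbm = blocker_dbm + spur_dbc  # spur_dbc is negative
    return signal_dbm - interference_dbm


def spur_rms_jitter_s(spur_dbc: float, f_carrier_hz: float) -> float:
    """RMS deterministic jitter (seconds) caused by one sinusoidal PM spur pair.

    Narrowband phase modulation with peak deviation beta (rad) produces
    sidebands at beta/2 relative to the carrier, so beta = 2 * 10**(dBc / 20).
    """
    beta_peak_rad = 2.0 * 10.0 ** (spur_dbc / 20.0)
    phase_rms_rad = beta_peak_rad / math.sqrt(2.0)
    return phase_rms_rad / (2.0 * math.pi * f_carrier_hz)


if __name__ == "__main__":
    # Hypothetical numbers: -80 dBm wanted signal, -30 dBm blocker, -60 dBc spur.
    print(reciprocal_mixing_sir_db(-80.0, -30.0, -60.0))
    # Jitter of a -60 dBc spur on an example 2.816 GHz carrier.
    print(spur_rms_jitter_s(-60.0, 2.816e9))
```

With these example numbers, the blocker is translated to -90 dBm and leaves only 10 dB of signal-to-interference ratio, while the same -60 dBc spur corresponds to roughly 80 fs of RMS deterministic jitter at 2.816 GHz. Every 20 dB of spur reduction lowers that jitter by a factor of ten.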
Before studying the generation of spurs, the architecture of a conventional PLL is introduced to explain why the digital PLL (DPLL) architecture was chosen to demonstrate the proposed mitigation platform. Fig. 1.2 shows the two most popular PLL architectures: an analog PLL at the top of the figure and a DPLL at the bottom. The analog PLL uses an analog phase-frequency detector (PFD), a charge pump (CP), and a passive loop filter to control the voltage-controlled oscillator (VCO) through a negative feedback mechanism. The output frequency ($F_{OUT}$) of a PLL is the product of the reference clock frequency ($F_{REF}$) and the feedback division ratio ($N_{div}$), which can be either an integer or a fractional value:

$$F_{OUT} = N_{div} \cdot F_{REF} \tag{1.1}$$

In contrast, the DPLL replaces the CP and passive loop filter with a digital loop filter. To bridge the analog and digital domains, a digitally controlled oscillator (DCO) and a time-to-digital converter (TDC) are used as the interface. As a result, DPLLs take advantage of Moore's law, and many design constraints common in analog PLLs are relaxed.

Regardless of the PLL topology, both architectures generate spurs [1]-[2]. A spur-induced disturbance results in phase deviation at the PLL output, which is later captured by the PFD or TDC. The captured spurious errors appear either as voltage information (in an analog PLL) or as digital codes (in a DPLL). Although a sinusoidal waveform is illustrated as a representative example in Fig. 1.2, the spurious pattern can take other waveforms, such as a sawtooth. Because digital codes can be easily post-processed and are inherently robust to analog variations, digital information is preferable to voltage information. The DPLL architecture provides abundant digital-signal-processing (DSP) opportunities to explore digitized spurious errors, which is not feasible for conventional analog PLLs. Therefore, we leverage the DPLL topology to verify the underlying concept of the proposed spur and interference mitigation scheme throughout this thesis.

Figure 1.2: Analog phase-locked loop vs. digital phase-locked loop

Although PLL spurs are a well-known issue, few solutions have been proposed because of the difficulty of characterizing spur properties. To achieve spur mitigation, the next step is to understand how a PLL generates spurs. In the following section (1.2), the DPLL architecture is used as a representative example for explaining the generation of PLL spurs, but the same concept can be applied to analog PLLs as well.

1.2 Generation of External and Internal Spurs

Spurs can be categorized into two types: external and internal. In an SoC environment, external spurs are mainly caused by interference coupling from peripheral aggressors to the victim DPLL. In most cases, the carrier frequency of the interference can be assumed to be far from the oscillation frequency of the DCO, so that it modulates the phase of the reference or DCO clock rather than the frequency. After TDC sampling, this interference-induced spur is aliased so that it can fall within the DPLL bandwidth. Another type of interference occurs when the carrier frequency of a disturbance is close to the oscillation frequency of the DCO (i.e., only a few kHz or MHz apart); this type of interference may directly affect the frequency of the victim DPLL so that spurs are introduced. This phenomenon is referred to as injection pulling [3] and is most likely to occur in an LC-tank DCO (LC-DCO). Generally, the waveform of external interference is sinusoidal or modulated. It should be noted that external interference may couple through the silicon substrate, bond wires, power supplies, or the inductor of an LC-DCO, which indicates that the coupling medium can be electrical, magnetic, or both.
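The aliasing of an out-of-band aggressor into the loop bandwidth can be illustrated with a short sketch. It assumes, purely for illustration, that the TDC takes one phase sample per reference cycle, so that the sampling rate equals $F_{REF}$; the function name `aliased_frequency`, the 50 MHz reference, and the aggressor frequencies are hypothetical and are not taken from the thesis.

```python
def aliased_frequency(f_int_hz: float, f_sample_hz: float) -> float:
    """Frequency in the range 0 .. f_sample/2 at which a tone at f_int_hz
    appears after ideal sampling at f_sample_hz."""
    folded = f_int_hz % f_sample_hz          # fold into one sampling interval
    return min(folded, f_sample_hz - folded)  # mirror the upper half down


# Hypothetical numbers: a 50 MHz reference and three aggressor tones.
F_REF = 50e6
for f_int in (100.2e6, 149.8e6, 120.0e6):
    f_alias = aliased_frequency(f_int, F_REF)
    print(f"{f_int / 1e6:7.1f} MHz aggressor -> {f_alias / 1e6:5.1f} MHz after sampling")
```

Under these assumptions, the aggressors at 100.2 MHz and 149.8 MHz both fold to 200 kHz, which would sit inside a typical loop bandwidth, whereas the 120 MHz tone folds to 20 MHz and would be attenuated by the loop filter.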
Internal spurs are mainly generated by the phase-locking operation itself, including fractional spurs [2] in the fractional-N mode and reference spurs in the integer-N mode. Fractional-N operation can be understood from the frequency accumulation in the DPLL architecture shown in Fig. 1.3. To minimize the additional error introduced by the phase injected from the accumulated frequency control word (FCW), the DPLL has to push the DCO so that it oscillates at the fractional-N frequency. However, a finite word length mismatch between the FCW path and the TDC block generates a sawtooth-like disturbance, which cannot be suppressed by the DPLL and hence causes harmonic tones at the DPLL output. Note that in this case, the PLL output frequency can be rewritten as follows:

$$F_{OUT} = FCW \cdot F_{REF} \tag{1.2}$$

Figure 1.3: Generation of fractional-N spur in digital phase-locked loop (DPLL) architecture

Another internal spur, which occurs in the integer-N mode, is the reference spur. In an analog PLL, transistor mismatch in the CP circuitry generates unwanted current glitches that modulate the VCO at the frequency of the reference clock. In DPLLs, fortunately, the issue of reference spurs is somewhat lessened by replacing the CP circuitry and passive components with a digital loop filter. However, the digital core of the DPLL then becomes the main aggressor, and its switching noise can couple severely into the DCO through the power supply or substrate, which is hard to predict.

1.3 Related Research for Spur Mitigation

To address external interference, the conventional approach is to increase the power-supply rejection ratio (PSRR) of the victim PLL, rejecting the interference before it reaches the PLL core. In this way, information about the coupling-path transfer function and spur properties is not required. The interference coupling can be filtered by inserting a low-dropout regulator (LDO) into the supply of the DCO, TDC, and reference clock buffer.
One drawback of this technique is that it is highly vulnerable to PVT variations. As a result, an extra LDO calibration loop [7] is required to guarantee sufficient rejection of interference coupling under different conditions. For interference from injection pulling, various researchers [10]-[13] have proposed schemes to mitigate strong interference, either from a surrounding oscillator (i.e., sinusoidal interference) or from the power amplifier (PA) output (i.e., modulated interference). To compensate for the unpredictable behavior of coupling, an adaptive filter algorithm was developed in [11] and [12] to track PVT variations in real time. However, those techniques focused on a scenario in which the aggressor interferes only with the DCO. In other words, if the reference path of the victim DPLL is contaminated, these schemes may not work effectively.

To suppress fractional spurs, previous research has incorporated a dithering scheme [5] to randomize the periodic sawtooth pattern. This randomization adds dither noise that raises the noise floor and severely degrades the integrated jitter, so an additional dither noise cancellation algorithm is necessary. The conventional dither noise cancellation algorithm is costly and not robust against PVT variations. Another approach that has been applied to the DPLL architecture is to truncate the FCW accumulation path [6]. By additionally quantizing the accumulated FCW phase to the nearest threshold of the DCO feedback path, the finite word length mismatch between the TDC code and the accumulated FCW can be minimized. However, this technique is sensitive to noise disturbance and to the TDC's differential nonlinearity (DNL), so the achievable spur reduction is still limited.
For reducing reference spurs, a reference spur cancellation scheme [9] can be used that preserves reference spur information in the digital domain; however, the digital core is oversampled at 8x the Nyquist rate, which increases power consumption significantly.

1.4 Contributions of This Thesis in Relation to the State of the Art

At this point, multiple prior works describing mitigation techniques have been introduced, all of which demonstrated impressive outcomes. Nevertheless, each technique is most likely constrained to a specific scenario. That is, a technique may lose its effectiveness when the spur changes to a different type. To extend the concept of spur mitigation to a more general scenario, this thesis focuses on leveraging an adaptive filter technique to resolve the spur issues mentioned in Section 1.2. The ultimate goal is to develop a generic spur and interference mitigation solution with a fully digital implementation, and the DPLL shown in Fig. 1.3 was selected as the proof-of-concept vehicle because of its hardware simplicity.

To begin with the easiest case, single-tone external interference injected only into the supply of the DPLL reference clock is studied. A steepest descent algorithm is used to demonstrate that spurs can be mitigated adaptively inside the DPLL; the details of this technique are introduced in Chapter 2. Afterwards, the algorithm is expanded to tackle a multi-tone spur scenario with a feedforward least-mean-square-based (LMS-based) adaptive filter scheme, which is described in Chapter 3. The technique is not limited to external interference; it can also be applied to reduce internally generated fractional spurs. However, the multi-tone spur mitigation scheme may increase the overhead significantly when a spurious pattern is observed over a long period. This algorithm therefore proves most effective when a high spur frequency is considered.
To reduce the design cost, Chapter 4 introduces a reference-path-based dithering technique to achieve in-band fractional spur cancellation [18], where the spur frequency is close to the carrier. An adaptive cancellation loop is then used to mitigate the introduced dither noise. Chapters 2 to 4 thus provide an in-depth theory of how to design an adaptive filter algorithm that compensates for unwanted spur or noise components coming from the reference path of a DPLL. The question of what comes next must then be addressed.

Another problem arises when external interference couples into the DCO path directly; this spurious error is referred to as a DCO-induced spur. The adaptive filter transfer function derived previously cannot mitigate the DCO-induced spur without some modification. Hence, the redesign of the adaptive filter settings is studied with experimental support in Chapter 5. To further increase the value of this research, we extend the prototype to mitigate one of the most practical but difficult interferences, namely injection pulling, including modulated interference from the PA output and sinusoidal interference from other PLLs. Furthermore, pulling interference couples into the DCO and the reference path of the victim DPLL simultaneously. With the dither-assisted technique [20] introduced in Chapter 5, the prototype allows simultaneous rejection on both paths for the first time. The potential extension of the implemented pulling mitigation scheme is also discussed in Appendix A. Finally, Chapter 6 contains concluding remarks and suggestions for future work.

Chapter 2
A Digital PLL with Single-Tone Interference Cancellation and Injection-Locked TDC

2.1 Overhead of Conventional DPLL Architecture

The TDC quantizes the phase difference between the DCO and the input reference clock and converts it into digital codes, which allows the control loop to be implemented in the digital domain.
It thus eliminates the need for a CP circuit and an analog loop filter. However, quantization errors caused by the TDC can degrade the in-band phase noise and may demand a high-resolution TDC design. Additionally, to normalize the TDC time step to the DCO period, phase normalization schemes are typically required [1], [5], which increases the design difficulty. To avoid phase normalization, prior art [21] proposed an embedded TDC concept that reuses the ring oscillator as the TDC. However, this idea is not applicable to a DPLL with an LC-tank oscillator. To minimize the TDC overhead, a multi-phase injection-locked TDC scheme is proposed to automatically normalize the TDC quantization step to the period of the LC-tank DCO (LC-DCO) over PVT without additional calibration [14], [15]. The multi-phase nature of the injection-locked TDC is further leveraged to achieve finer TDC resolution.

Another major challenge in frequency synthesizer design is robustness against interference from aggressors. As more circuit blocks are integrated on the same silicon substrate, this self-interference issue becomes more severe. Although a digital loop filter is more resilient to interference than an analog filter, some parts of a digital PLL are still vulnerable, such as the input reference buffers and the TDC. Accordingly, a gradient-based adaptive filter with steepest descent search is proposed to nullify the interference entirely inside the digital embodiment. Depending on the required specifications, the proposed approach can be used on its own or in conjunction with other isolation approaches.

2.2 Injection-Locked TDC Architecture

The injection-locking technique is widely used in the design of oscillators, frequency dividers, and PLLs [3], [22]-[25]. The proposed TDC architecture leverages this technique for low-cost and calibration-free operation inside the DPLL.
The key idea of the proposed multi-phase IL-TDC is to use a separate ring oscillator that is injection-locked to the LC-DCO and that provides finer phase information than that given by the DCO, typically only 0 and 180 degrees. The ring oscillator is designed so that its oscillation frequency is locked to that of the DCO during DPLL steady-state operation. Therefore, an N-stage ring oscillator naturally provides a fine TDC step size of $180^\circ/N$ without additional circuitry to estimate the TDC delay, as typically required in a conventional TDC implementation [1]. In addition, even under PVT variations, so long as the IL-TDC remains locked to the DCO, the fine TDC step remains unchanged, avoiding the need for background TDC calibration. Finally, mismatch filtering and passive phase interpolation are applied in the IL-TDC to improve its linearity and resolution.

There are several design considerations for the proposed IL-TDC scheme. In the following subsections, current phasor notation is used to examine the practical design tradeoffs of the proposed IL-TDC scheme, including the tolerable injection-locking frequency range, the differential nonlinearity (DNL) caused by injection locking, and the phase noise impact from the building blocks of the IL-TDC.

2.2.1 Locking Range of the Injection-Locked TDC

The IL-TDC is essentially realized as a single-input injection-locked ring oscillator where the DCO acts as the input source. In practice, the free-running frequency of the ring oscillator cannot be too far from the DCO operating frequency; the tolerable frequency difference is referred to as the frequency locking range in this subsection.

Figure 2.1: (a) Block diagram of an N-stage ring oscillator without injection locking (b) The current phasor diagram without injection locking

The conceptual block diagram of an N-stage ring oscillator (RO) is first examined without injection, as depicted in Fig.
2.1(a). For simplicity, a single-ended ring oscillator with an odd number of stages, without cross-coupled interpolation resistors, is shown. Each stage of the ring oscillator is modeled as an ideal inverter providing a 180-degree phase shift. The output of the ideal inverter injects a bias current $I_{osc}$ at the zero-crossing point into the RC load, which consists of the extracted output impedance of the inverter at the zero-crossing point and the trace capacitance. The current phasor diagram is shown in Fig. 2.1(b). This satisfies the Barkhausen criterion for oscillation, i.e., the overall phase shift around the closed loop is an integer multiple of $2\pi$, which implies a phase shift of $\pi/N$ in the RC load of each stage of the ring oscillator [22].

Figure 2.2: (a) Ring oscillator under injection locking (b) The current phasor diagram under injection locking and the boundary of injection lock with $\theta = \theta_{max}$

A block diagram of the oscillator with single-input injection is shown in Fig. 2.2(a), with an extra injection current $I_{inj}$ before the RC load of the first stage; the current phasor diagram is shown in Fig. 2.2(b). To satisfy the oscillation criterion, the required phase condition is given by

$$\theta + N\left(\frac{\pi}{N} + \phi + \pi\right) = 2k\pi \tag{2.1}$$

where $k = 1, 2, 3, \ldots$, $\theta$ is the additional phase shift introduced at the first (injected) stage, and $\phi$ is the additional phase shift of each stage after enabling injection locking. To satisfy Eq. (2.1), $\theta$ should be equal to $-N\phi$. Here, the IL-TDC operates in a weak injection condition ($|I_{inj}| \ll |I_{osc}| = |I_{load}|$). This is preferable, as a stronger injection would increase power consumption [3].

At the boundary of injection locking, the maximum phase shift ($\theta_{max}$) occurs when the injection current $I_{inj}$ is orthogonal to the output current $I_{load}$, as shown in Fig. 2.2(b). Applying the law of sines:

$$\frac{|I_{inj}|}{\sin(\theta_{max})} = \frac{|I_{osc}|}{\sin(90^\circ)} \tag{2.2}$$

The maximum value $\theta_{max}$ can then be derived as:

$$\theta_{max} \approx \left|\frac{I_{inj}}{I_{osc}}\right| \tag{2.3}$$

In the following derivation, Eq.
(2.3) will be used to calculate the locking range. In real IL-TDC operation, the frequency of the free-running ring oscillator $\omega_{RO}$ and that of the injection source $\omega_{inj}$ may differ, which further impacts the phase shifts $\theta$ and $\phi$. According to [23], the phase shift can be approximated by:

$$\tan^{-1}\left(\tan\left(\frac{\pi}{N}\right)\frac{\omega_{inj}}{\omega_{RO}}\right) = \frac{\pi}{N} + \phi \tag{2.4}$$

Now $\Delta\omega$ is defined as the frequency difference, i.e., $\omega_{inj} = \omega_{RO} + \Delta\omega$. Applying a Taylor series expansion to Eq. (2.4), $\phi$ can be expressed as:

$$\phi \approx \frac{\Delta\omega}{\omega_{RO}}\left[\sin\left(\frac{\pi}{N}\right)\cos\left(\frac{\pi}{N}\right)\right] \tag{2.5}$$

Given the condition of Eq. (2.1), the upper bound of $\phi$ is found to be:

$$|\phi| = \frac{1}{N}|\theta| \le \frac{1}{N}|\theta_{max}| \tag{2.6}$$

By combining Eqs. (2.3), (2.5), and (2.6), the maximum value of the frequency difference ($\Delta\omega$) can be found; this will henceforth be referred to as the injection-locking bandwidth ($\omega_{BW,IL}$):

$$\frac{\Delta\omega}{\omega_{RO}} \le \frac{1}{N}\left[\frac{1}{\sin(\pi/N)\cos(\pi/N)}\right]\left|\frac{I_{inj}}{I_{osc}}\right| = \frac{\omega_{BW,IL}}{\omega_{RO}} \tag{2.7}$$

In this prototype implementation, a 7-stage ($N = 7$) ring oscillator is implemented, and the injection current ($I_{inj}$) is designed to be 1% of the oscillator current ($I_{osc}$). As a result, the injection-locking bandwidth is 100 MHz, which allows an ample design margin to ensure that this operating condition is satisfied under PVT variation. Note that if a larger number of ring oscillator stages is used for finer TDC resolution, one can simply increase the injection strength to maintain the same injection-locking bandwidth of the IL-TDC, as evident from Eq. (2.7).

2.2.2 Differential Nonlinearity due to Injection Locking

Figure 2.3: (a) The block diagram of the injection-locked TDC under injection with N = 3 (b) The timing diagram of TDC output without injection (c) The timing diagram of TDC output with $\omega_{inj} = \omega_{RO} + \Delta\omega$ (d) The timing diagram of TDC output with $\omega_{inj} = \omega_{RO} - \Delta\omega$

Since injection locking introduces an asymmetric load condition between the ring oscillator stages, as shown in Fig. 2.2(b), it introduces an unequal phase shift between the stages with and without an external injection point.
This phase imbalance can contribute to the DNL of the IL-TDC and results in spurious tones at the DPLL output. If only two different phase shifts are assumed, i.e., $\phi_1$ and $\phi_2$, as annotated in Fig. 2.3 where a 3-stage RO case is shown, they can be computed from Eq. (2.1):

$$\phi_1 = \frac{\pi}{N} + \theta + \phi = \frac{\pi}{N} + (1 - N)\phi \tag{2.8}$$

$$\phi_2 = \frac{\pi}{N} + \phi \tag{2.9}$$

From the above equations, the DNL of the TDC can be derived and normalized to the nominal TDC LSB, i.e., $\pi/N$. $\phi$ is further replaced using Eq. (2.5) in order to determine the relationship between the frequency difference ($\Delta\omega$) and the DNL. They can be expressed as

$$DNL_1 = \frac{N(1 - N)}{\pi}\,\frac{\Delta\omega}{\omega_{RO}}\left[\sin\left(\frac{\pi}{N}\right)\cos\left(\frac{\pi}{N}\right)\right] \tag{2.10}$$

$$DNL_2 = \frac{N}{\pi}\,\frac{\Delta\omega}{\omega_{RO}}\left[\sin\left(\frac{\pi}{N}\right)\cos\left(\frac{\pi}{N}\right)\right] \tag{2.11}$$

Depending on the polarity of $\Delta\omega$, the DNL of the injection-locked stage can be either larger or smaller, as shown in Figs. 2.3(c) and 2.3(d). Additionally, for a given injection strength, a larger frequency difference ($\Delta\omega$) makes the TDC DNL worse. Ideally, if the injection frequency is exactly equal to the ring oscillator's free-running frequency, there is no DNL due to injection locking. Note that the worst-case DNL occurs when the injection frequency is at the boundary of the injection-locking bandwidth, which can be found via Eq. (2.7). In practice, free-running frequency tuning of the ring oscillator can be implemented to reduce $\Delta\omega$ and, hence, the DNL.

2.2.3 Phase Noise Impact due to Injection Locking

In the proposed IL-TDC scheme, the ring oscillator can contribute additional noise to the overall DPLL phase noise. In addition, the injection-locking loop behaves as a first-order phase-locking loop between the DCO and the feedback signal; therefore, it can potentially alter the loop dynamics of the DPLL. In this subsection, we examine the noise contribution from the IL-TDC, as well as the revised noise transfer function of the DCO due to the inserted IL stage. It has been found that a properly designed injection-locking bandwidth ($\omega_{BW,IL}$), in relation to the PLL bandwidth ($\omega_{BW,PLL}$), plays a crucial role in minimizing the phase noise, as elaborated below.

According to Adler's equation [25], the relationship between the input and output phases of an injection-locking loop can be described as:

$$\frac{d\theta_{out}(t)}{dt} = \omega_{RO} - \omega_{inj} - \omega_{BW,IL}\sin\left(\theta_{out}(t) - \theta_{inj}(t)\right) \tag{2.12}$$

where $\theta_{out}(t)$ is the IL-TDC output phase and $\theta_{inj}(t)$ is the phase of the injection-locking source, i.e., the input phase. Two assumptions are made here. First, the phase perturbation due to noise is so small that the high-order terms in Eq. (2.12) can be ignored. Second, the frequency mismatch between the injection source and the free-running RO is sufficiently small. As a result, Eq. (2.12) can be linearized, and applying the Laplace transform gives:

$$s\,\theta_{out}(s) = \omega_{BW,IL}\left(\theta_{inj}(s) - \theta_{out}(s)\right) \tag{2.13}$$

and

$$\frac{\theta_{out}(s)}{\theta_{inj}(s)} = \frac{\omega_{BW,IL}}{s + \omega_{BW,IL}} \tag{2.14}$$

Equation (2.14) shows that the transfer function of the injection-locking loop resembles that of a first-order phase-locked loop. Figure 2.4 shows the overall DPLL transfer function with the inserted IL-TDC response, as described by Eq. (2.14). The digital loop filter is transformed from the z-domain to the Laplace domain via the forward Euler transformation, purely as an example. In this case, the loop filter is approximated as $K_d\left(\frac{s + K_I F_{REF}}{s}\right)$, where $K_d$ and $K_d K_I$ are the gains in the proportional and integral paths of the loop filter, respectively, and $F_{REF}$ is the reference clock frequency.

Figure 2.4: DPLL transfer function with IL-TDC

Figure 2.5: DPLL phase noise profile: (a) With injection-locking bandwidth of 150 kHz (b) With injection-locking bandwidth of 25 MHz

In Fig. 2.4, two noisy phase disturbances are considered, $\theta_{n,DCO}$ and $\theta_{n,RO}$, i.e., the phase noise of the DCO and of the RO, respectively. Their phase noise transfer functions to the DPLL output can be derived as:

$$\frac{\theta_{n,PLL}}{\theta_{n,RO}} = \frac{\left(\frac{s}{s + \omega_{BW,IL}}\right)\frac{1}{N_{div}} K_d K_{dco}\,(s + K_I F_{REF})}{s^2 + \frac{1}{N_{div}} K_d K_{dco}\left(\frac{\omega_{BW,IL}}{s + \omega_{BW,IL}}\right)(s + K_I F_{REF})} \tag{2.15}$$

$$\frac{\theta_{n,PLL}}{\theta_{n,DCO}} = \frac{s^2}{s^2 + \frac{1}{N_{div}} K_d K_{dco}\left(\frac{\omega_{BW,IL}}{s + \omega_{BW,IL}}\right)(s + K_I F_{REF})} \tag{2.16}$$

The above equations indicate that the noise transfer functions are affected by $\omega_{BW,IL}$ and hence require further examination. When $\omega_{BW,IL}$ is smaller than $\omega_{BW,PLL}$, the RO noise between $\omega_{BW,IL}$ and $\omega_{BW,PLL}$ is attenuated neither by injection locking nor by the PLL loop, and causes elevated phase noise within this band. Additionally, since Eq. (2.15) contains a high-pass transfer function with a corner frequency of $\omega_{BW,IL}$, the corner frequency should be sufficiently high to suppress the in-band noise floor caused by the RO. To validate these effects, numerical simulations were performed with different $\omega_{BW,IL}$ for the same $\omega_{BW,PLL}$. Fig. 2.5(a) shows phase noise peaking and a higher in-band noise floor when $\omega_{BW,IL}$ is smaller than $\omega_{BW,PLL}$. In the second case, where $\omega_{BW,IL}$ is set to $100\,\omega_{BW,PLL}$, the in-band phase noise is significantly lowered without any noise peaking, as shown in Fig. 2.5(b).

2.3 Adaptive Single-Tone Interference Cancellation Scheme

As the level of integration continues to increase in an SoC environment, the growing number of interference sources from simultaneous analog and digital block operation poses a major challenge for achieving low-spur frequency synthesizers. Unwanted interference can be coupled into the PLL through various paths, such as substrates, bond wires, and power supplies [26]. This coupled interference can disturb different parts of the PLL, such as the input reference clock, the TDC, and the oscillator, and can generate spurious tones at the PLL output. To mitigate such effects and achieve better phase noise performance, the oscillator is typically shielded by a low-dropout (LDO) regulator or an additional supply noise cancellation loop [7] to increase the power supply rejection ratio (PSRR).
However, the remaining analog circuit blocks, such as the input clock buffers and the TDC, are still vulnerable to interference coupling. To further enhance the PLL's robustness to self-interference, we aim to fully exploit the unique property of the DPLL and explore a digital spur cancellation technique that does not exacerbate analog complexity. The targeted DSP-based approach also benefits from technology scaling.

Figure 2.6: Potential interference coupling and cancellation leveraging DPLL architecture

Figure 2.7: High-level concept of interference cancellation

2.3.1 Steepest Descent Search Algorithm

Given the digital nature of DPLL operation, the interference appears as a phase disturbance inside the loop filter and is accessible in digital form, as shown in Fig. 2.6. If a DSP block is capable of extracting the phase information and reconstructing the interference, direct interference cancellation can be realized [27]-[29]. As the first proof-of-concept prototype, the proposed interference cancellation scheme focuses on the single-sinusoid case in this chapter.

The concept of interference cancellation is illustrated in Fig. 2.7. The input of the cancellation loop is the TDC output, containing the interference part ($S_i[k]$) and the remaining phase information ($S_n[k]$), such as DCO phase noise. The key objective of the proposed cancellation loop is to generate an interference replica that matches $S_i[k]$ and to subtract it prior to the digital loop filter. In the following discussion, the variables $A_i$, $\theta_i$, and $\omega_i$ denote the amplitude, phase, and frequency of the interference $S_i[k]$, respectively. Similarly, $A_y$, $\theta_y$, and $\omega_y$ represent those of the generated replica signal $y[k]$. It is assumed that the difference between $\omega_y$ and $\omega_i$ is negligible. Theoretically, if the estimated $y[k]$ is perfectly matched with $S_i[k]$, no interference disturbance is passed on to the DCO, resulting in perfect spur cancellation.
However, this cannot be achieved in practice; instead, it is proposed to minimize the following cost function:

$$\min_{A_y,\theta_y} E\left[|e[k]|^2\right] = \min_{A_y,\theta_y} E\left[|S_i[k] - y[k]|^2\right] \tag{2.17}$$

According to Eq. (2.17), the objective is to minimize the mean square value of the residue error $e[k]$, which is the difference between the real and estimated interference. Note that, since the cancellation loop should operate in the background during normal operation, the interference replica $y[k]$ is injected inside the DPLL loop, and the residue error energy, defined as $E[e^2[k]]$ in this subsection, is continuously monitored by the adaptive filter algorithm in order to minimize the mean square error in real time.

Figure 2.8: (a) The loci of the updates using the steepest descent algorithm (b) The proposed coordinate descent algorithm, using five computation channels with skewed amplitude and phase

The concept of the adaptive filter algorithm, realized with a 2-D gradient steepest descent search, is illustrated in Fig. 2.8(a). To reach the minimum residue error energy, the optimal values of amplitude and phase can be approached independently. Note that a good initial guess can reduce the convergence time but does not affect the final convergence value. The update functions of amplitude and phase in the conventional gradient steepest descent are expressed in (2.18) and (2.19),

$$A_y[k+1] = A_y[k] + A_{step}\, e[k]\, \frac{\partial y[k]}{\partial A_y[k]} \tag{2.18}$$

$$\theta_y[k+1] = \theta_y[k] + \theta_{step}\, e[k]\, \frac{\partial y[k]}{\partial \theta_y[k]} \tag{2.19}$$

where the update is a function of the residue error, the step size, and the replica interference. To reduce the hardware cost of the derivative operations $\frac{\partial y[k]}{\partial A_y[k]}$ and $\frac{\partial y[k]}{\partial \theta_y[k]}$, the proposed spur cancellation scheme utilizes the coordinate descent algorithm [27], which is modified from the gradient descent search algorithm. Instead of updating the descent direction according to the true gradient vector, the coordinate descent algorithm selects one of the two perpendicular descent directions with a fixed step, i.e.,
either amplitude or phase in this case. To determine the descent direction, five parallel computation channels are implemented to calculate the residue error energies corresponding to different amplitudes and phases of the replica interference. For illustrative purposes, the five computation channels are mapped onto a 2-D coordinate system as depicted in Fig. 2.8(b), where amplitude is plotted on the Y-axis and phase on the X-axis.

To simplify the mathematical expression, three channels with replica interference amplitudes of $A_y(1 + A_{offset})$, $A_y$, and $A_y(1 - A_{offset})$ are grouped in the same basis with channel index numbers $i = 1$, 2, and 3. These three channels compute the corresponding residue error energy values, namely $E[e_1^2(A_y)]$, $E[e_2^2(A_y)]$, and $E[e_3^2(A_y)]$, respectively. Similarly, channels with replica interference phases of $\theta_y + \theta_{offset}$, $\theta_y$, and $\theta_y - \theta_{offset}$ are assigned to another basis with channel index numbers $j = 1$, 2, and 3, and their corresponding residue error energy values are $E[e_1^2(\theta_y)]$, $E[e_2^2(\theta_y)]$, and $E[e_3^2(\theta_y)]$. As Fig. 2.8(b) shows, the center channel with index numbers $i = 2$ and $j = 2$ should yield the lowest residue error energy after the coordinate descent algorithm reaches steady state; therefore, only the replica interference from this center channel is injected inside the PLL loop.

Figure 2.9: Block diagram of the residue error energy computation

The iterative algorithm updates either the amplitude or the phase, based on (2.20) and (2.21):

$$A_y[k+1] = A_y[k] + A_{step}\left\{2 - \arg\min_{i \in \{1,2,3\}} E\left[e_i^2(A_y)\right]\right\} \tag{2.20}$$

$$\theta_y[k+1] = \theta_y[k] + \theta_{step}\left\{2 - \arg\min_{j \in \{1,2,3\}} E\left[e_j^2(\theta_y)\right]\right\} \tag{2.21}$$

The descent path of the update that yields the least residue error energy is selected. In other words, if the residue error energy due to a change of amplitude is less than the residue error energy due to a change of phase, the update based on (2.20) is chosen instead of (2.21), and vice versa.
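The update rule of (2.20) and (2.21) can be sketched in a few lines of software. This is a minimal behavioral sketch, not the hardware implementation: it assumes a noise-free interference and replaces the five correlator channels by the ideal large-window residue energy $\frac{1}{4}\left(A_i^2 + A_y^2 - 2A_iA_y\cos(\theta_i - \theta_y)\right)$. The function names, step sizes, offsets, and initial guess are hypothetical values chosen for illustration.

```python
import math


def residue_energy(a_i, th_i, a_y, th_y):
    """Ideal (noise-free, long-window) residue energy between the
    interference (a_i, th_i) and the replica (a_y, th_y)."""
    return (a_i ** 2 + a_y ** 2 - 2.0 * a_i * a_y * math.cos(th_i - th_y)) / 4.0


def best_channel(energies):
    """Channel index (1, 2 or 3) with the lowest energy; ties keep centre channel 2."""
    best = 2
    for idx in (1, 3):
        if energies[idx - 1] < energies[best - 1]:
            best = idx
    return best


def coordinate_descent(a_i, th_i, a_y=0.5, th_y=0.0,
                       a_step=0.01, th_step=0.01,
                       a_off=0.05, th_off=0.05, n_iter=500):
    """Fixed-step coordinate descent over amplitude and phase of the replica."""
    for _ in range(n_iter):
        # Amplitude basis: channels i = 1, 2, 3 (centre channel shared with phase basis).
        e_amp = [residue_energy(a_i, th_i, a, th_y)
                 for a in (a_y * (1 + a_off), a_y, a_y * (1 - a_off))]
        # Phase basis: channels j = 1, 2, 3.
        e_ph = [residue_energy(a_i, th_i, a_y, th)
                for th in (th_y + th_off, th_y, th_y - th_off)]
        # Move along the coordinate whose best channel gives the lower energy.
        if min(e_amp) < min(e_ph):
            a_y += a_step * (2 - best_channel(e_amp))    # Eq. (2.20)
        else:
            th_y += th_step * (2 - best_channel(e_ph))   # Eq. (2.21)
    return a_y, th_y


a_hat, th_hat = coordinate_descent(a_i=1.0, th_i=0.7)
print(f"A_y = {a_hat:.3f}, theta_y = {th_hat:.3f}")
```

In this sketch the estimates stop moving once the center channel has the lowest energy in both bases, so the final error is bounded by roughly half the probing offset rather than reaching zero. Because the amplitude probes are multiplicative, an initial guess of exactly zero amplitude would give three identical amplitude channels; a nonzero starting amplitude is therefore assumed.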
Note that the step size has a linear impact on the convergence time and on the steady-state error after the descent algorithm converges.

2.3.2 Residue Error Energy Calculation

Since the coordinate descent algorithm aims to minimize the energy of a particular spurious tone, the residue error energy should be evaluated at the target spur frequency to avoid potential disturbance from other interference or random noise. Inaccurate residue error energy estimation due to noise disturbance can lead to an erroneous descent direction and, hence, convergence instability. Matched filtering and averaging over $N_{win}$ cycles are thus performed via I/Q correlators against sinusoids at the estimated spurious frequency, $\omega_i$, as shown in Fig. 2.9. The mathematical expression of the residue error energy at $\omega_i$ can be derived as:

$$E\left[e^2[k]\right]\Big|_{\omega_y = \omega_i} = \left\{\frac{1}{N_{win}}\sum_{k=0}^{N_{win}-1}\left[(S_i[k] + S_n[k] - y[k])\cos(\omega_i k T_s)\right]\right\}^2 + \left\{\frac{1}{N_{win}}\sum_{k=0}^{N_{win}-1}\left[(S_i[k] + S_n[k] - y[k])\sin(\omega_i k T_s)\right]\right\}^2 \tag{2.22}$$

Here, we substitute $S_n[k]$ with the discrete-time samples $n(kT_s)$; likewise, $S_i[k]$ with $A_i\cos(\omega_i k T_s + \theta_i)$ and $y[k]$ with $A_y\cos(\omega_y k T_s + \theta_y)$. Equation (2.22) becomes

$$\begin{aligned} E\left[e^2[k]\right]\Big|_{\omega_y = \omega_i} = {} & \left\{\left[\frac{A_i}{2}\cos(\theta_i) - \frac{A_y}{2}\cos(\theta_y)\right] + \frac{1}{N_{win}}\sum_{k=0}^{N_{win}-1}\left[\frac{A_i}{2}\cos(2\omega_i k T_s + \theta_i) - \frac{A_y}{2}\cos(2\omega_i k T_s + \theta_y) + n(kT_s)\cos(\omega_i k T_s)\right]\right\}^2 \\ & + \left\{\left[-\frac{A_i}{2}\sin(\theta_i) + \frac{A_y}{2}\sin(\theta_y)\right] + \frac{1}{N_{win}}\sum_{k=0}^{N_{win}-1}\left[\frac{A_i}{2}\sin(2\omega_i k T_s + \theta_i) - \frac{A_y}{2}\sin(2\omega_i k T_s + \theta_y) + n(kT_s)\sin(\omega_i k T_s)\right]\right\}^2 \end{aligned} \tag{2.23}$$

According to Eq. (2.23), if the averaging window duration ($N_{win}T_s$) is an integer multiple of the period $2\pi/\omega_i$, the double-frequency terms, such as $\frac{A_i}{2}\cos(2\omega_i k T_s + \theta_i)$, are completely nullified. On the other hand, if the window is not exactly an integer multiple of the period, the frequency-dependent terms are nonzero but bounded, as they are periodic, zero-mean sinusoids. Moreover, the noise terms, $n(kT_s)$, are correlated against the spur frequency, which also helps reduce the variation.
As for the frequency-independent terms containing pure spur phase and amplitude information, such as $A_i\cos(\theta_i)$ and $A_y\cos(\theta_y)$, they are independent of $N_{win}$, unlike the frequency-dependent terms, which are scaled down by $\frac{1}{N_{win}}$. As a result, the frequency-dependent terms become less significant as $N_{win}$ increases. When $N_{win}$ is sufficiently large, the frequency-dependent terms in Eq. (2.23) become negligible compared to the frequency-independent terms after matched filtering and averaging, and the residue error energy can be further approximated as:

$$\left[\frac{A_i}{2}\cos(\theta_i) - \frac{A_y}{2}\cos(\theta_y)\right]^2 + \left[-\frac{A_i}{2}\sin(\theta_i) + \frac{A_y}{2}\sin(\theta_y)\right]^2 = \begin{cases} \frac{1}{4}(A_i - A_y)^2 & \text{if } |\theta_i - \theta_y| = 0 \\ \frac{A_i^2}{4}\left[2 - 2\cos(\theta_i - \theta_y)\right] & \text{if } |A_i - A_y| = 0 \end{cases} \tag{2.24}$$

Figure 2.10: (a) Numerical simulation of TDC output without (top) and with (bottom) the proposed spur cancellation, showing the convergence of estimated interference parameters (b) Numerical simulation of TDC output before and after enabling cancellation in the absence of interference

It can be observed from Eq. (2.24) that there exists only one global minimum point over $|A_i - A_y|$ and $|\theta_i - \theta_y|$ within every $2\pi$ period of $|\theta_i - \theta_y|$ when Eq. (2.24) is drawn as a 3-D surface plot. Another perspective is to examine Eq. (2.24) in the plane of $|\theta_i - \theta_y| = 0$ or $|A_i - A_y| = 0$: the residue error energy increases monotonically around the global minimum point. Therefore, in steady state, the minimum error is achieved when $A_y$ is equal to $A_i$ and $\theta_y$ is equal to $\theta_i$, which means that the replica interference matches the real one to be canceled.

2.3.3 Numerical Simulation

To verify the proposed adaptive algorithm, a time-domain numerical simulation is performed. Note that, because the implementation is based on the cost function (2.17), the proposed adaptive algorithm mainly focuses on cancelling the interference coupled into the supply of the reference clock, TDC, and divider. Hence, the phase disturbance modeled in this subsection originates from the reference path.
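As a rough illustration of how such a reference-path disturbance and the residue error energy of Eq. (2.22) might be modeled in software, the sketch below builds the error sequence $S_i[k] + S_n[k] - y[k]$ and passes it through I/Q correlators. It is not the simulation used in the thesis: the white Gaussian noise model, the window length, the amplitudes, and all function names are assumptions made for illustration.

```python
import math
import random


def residue_energy(samples, w_ts):
    """I/Q-correlate `samples` against cos/sin at w_ts (radians per sample)
    and return I^2 + Q^2 of the window-averaged outputs, as in Eq. (2.22)."""
    n = len(samples)
    i_avg = sum(s * math.cos(w_ts * k) for k, s in enumerate(samples)) / n
    q_avg = sum(s * math.sin(w_ts * k) for k, s in enumerate(samples)) / n
    return i_avg ** 2 + q_avg ** 2


def error_sequence(a_i, th_i, a_y, th_y, w_ts, n_win, noise_rms, rng):
    """S_i[k] + S_n[k] - y[k]: sinusoidal interference plus assumed white
    Gaussian noise, minus the replica generated at the same frequency."""
    return [a_i * math.cos(w_ts * k + th_i)
            + rng.gauss(0.0, noise_rms)
            - a_y * math.cos(w_ts * k + th_y)
            for k in range(n_win)]


rng = random.Random(1)                 # fixed seed for repeatability
N_WIN = 1024                           # hypothetical averaging window (samples)
W_TS = 2 * math.pi * 32 / N_WIN        # 32 full spur periods per window
A_I, TH_I = 0.5, 0.3                   # hypothetical interference amplitude/phase

# Replica disabled (A_y = 0) versus replica matched to the interference.
e_off = residue_energy(error_sequence(A_I, TH_I, 0.0, 0.0, W_TS, N_WIN, 0.1, rng), W_TS)
e_on = residue_energy(error_sequence(A_I, TH_I, A_I, TH_I, W_TS, N_WIN, 0.1, rng), W_TS)
print(f"replica off: {e_off:.5f}   (A_i^2 / 4 = {A_I ** 2 / 4:.5f})")
print(f"replica on : {e_on:.2e}")
```

With the replica disabled, the estimate stays close to $A_i^2/4$, as expected from Eq. (2.24) with $A_y = 0$; with a matched replica only the averaged noise contribution remains, which shrinks as the window grows.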
Without the cancellation loop, the TDC output shows a periodic phase disturbance due to the coupled interference. When the cancellation loop is enabled, the phase disturbance gradually diminishes as the amplitude and phase of the replica interference approach those of the actual interference, as shown in Fig. 2.10(a). As a result, the spurious tone energy at the PLL output is reduced. Note that if the interference is absent but the cancellation loop is enabled, the adaptive algorithm converges towards zero amplitude in the steady state. Therefore, it does not disturb normal PLL loop operation if $A_{step}$ in Eq. (2.20) is set to be sufficiently small, as shown in Fig. 2.10(b). In addition, supplementary digital logic can be implemented to disable the adaptation of the cancellation loop in order to reduce power consumption once the interference is removed. In other words, the cancellation loop does not cause phase noise degradation, and is effectively activated only when the interference is present. Figs. 2.11(a) and 2.11(b) show the numerically simulated tradeoffs between the convergence speed and final convergence accuracy for several representative values of $A_{step}$ and $\phi_{step}$. It can be observed that the convergence time is inversely proportional to the step size.

Figure 2.11: (a) Spur level versus number of updated cycles with different $A_{step}$ (normalized to the full swing of DDS) (b) Spur level versus number of updated cycles with different $\phi_{step}$

Figure 2.12: (a) The impact of amplitude offset (normalized to the full swing of DDS) on the cancellation accuracy (b) The impact of phase offset on the cancellation accuracy

Figure 2.13: Overall block diagram of the proposed DPLL architecture

The impact of $A_{offset}$ and $\phi_{offset}$ is shown in Figs. 2.12(a) and 2.12(b). Since the offset value does not affect the convergence speed, only the steady-state spurious levels post-cancellation are plotted.
Better convergence accuracy is observed when the offset values are smaller, owing to the smaller difference in residue error energy between the computation channels.

2.4 Circuit Implementation

2.4.1 DPLL Overall Architecture

The proposed fractional-N DPLL architecture is illustrated in Fig. 2.13. The DCO phase is quantized by the injection-locked TDC, which generates a 14-bit thermometer code; this code is converted into binary form and scaled by 1/28 to account for the 28 phase quantization levels. It is then added to the 8-bit output of the feedback integer counter to form the complete quantized phase information. The accumulated FCW is subtracted at the TDC output to implement fractional-N operation [1], [5], [21]. The proposed spur cancellation is inserted prior to the digital loop filter, so that the interference signal can be removed before it modulates the DCO. The digital loop filter is composed of parallel proportional and integral paths with a tunable type-I, II, or III loop response, which allows for multi-standard applications. For instance, when the PLL in-band phase noise requirement is stringent, the type-III loop response can be enabled to attenuate the close-in VCO noise. Following the digital loop filter, the DAC interface and LC oscillator are implemented to provide digital frequency tuning capability.

Figure 2.14: Circuit implementation of injection-locked TDC and LC-DCO

2.4.2 Implementation of Injection-Locked TDC

The embedded TDC concept was proposed in [21] to utilize the internal phases of a ring oscillator as the time quantization basis. This minimizes the TDC hardware without the need for additional delay calibration, since the TDC quantization step automatically scales with DCO frequency by design. However, for RF LO generation, an LC oscillator is preferred for its lower phase noise when compared to an inverter-based ring oscillator [30]-[32].
Unfortunately, a single-stage LC oscillator only provides two complementary phases, which may not be sufficient to meet the stringent TDC resolution requirement, making the embedded TDC concept impractical in this case. Alternatively, LC oscillators can be cascaded into multiple stages, for example as an LC-based ring oscillator [33], to provide finer phase resolution; however, the excessive area and power consumption limit its practicality.

Based on the aforementioned observations, a calibration-free multi-phase injection-locked TDC (IL-TDC) is proposed to resolve these constraints with minimal implementation overhead. The circuit implementation is shown in Fig. 2.14. The LC-DCO serves as the injection source for the IL-TDC ring oscillator. In the steady state, the ring oscillator frequency and phase are aligned with the LC-DCO output. Therefore, the TDC quantization steps automatically track the DCO period over PVT, which avoids the need for phase normalization.

As discussed in Section 2.2.3, since the injection-locking mechanism periodically aligns the ring oscillator phase with the lower-noise LC-DCO, it suppresses the phase noise accumulation throughout the delay elements. The injection-locking interface needs to guarantee sufficient coupling strength from the LC-DCO. Additionally, the bandwidth of the injection-locking loop should be designed to be wider than the PLL bandwidth, to avoid in-band phase noise degradation. The injection-locking interface consists of two buffer stages. Due to process variation and the wide frequency-tuning range, the amplitude and common-mode voltage of the DCO output vary. As a result, the first-stage CML buffer is designed to reject the input common-mode noise and to perform a proper DC level shift and signal amplification for the following stage. The second-stage CMOS buffer further converts the sinusoidal input into a rail-to-rail square wave, resulting in stronger coupling to effectively align the ring oscillator phase.
The output common mode of the CML buffer is not biased by common-mode feedback (CMFB) circuitry, so the output voltage is controlled by adjusting the $I_{CML}$ current bias from outside the chip in order to prevent duty-cycle distortion after the CMOS buffer.

Figure 2.15: (a) Cross-coupled resistor network applied in IL-TDC (b) Equivalent circuit for one stage of IL-TDC with noise modeling

For the RO implementation, current-starved delay stages are utilized to reduce free-running oscillation frequency variation over PVT [34]. The voltage biases, $V_{bp1}$, $V_{bn1}$, $V_{bp2}$ and $V_{bn2}$, as annotated in Fig. 2.14, are designed to maximize the headroom for the output swing. In order to constrain the difference between the DCO and the free-running RO frequency, as well as to characterize its impact as discussed in Section II.A, the bias current of the current-starved delay stages can be externally adjusted in this silicon prototype. Note that this current adjustment process can be automated on the chip, as the TDC itself can provide the frequency information when the RO operates in free-running mode. A servo loop can then be implemented to adjust the bias current until the frequency difference is within a tolerable range.

For the multi-phase generation, the ring oscillator naturally divides the DCO period into $2M$ equal phases, where $M$ is the number of ring oscillator stages. To avoid the in-band phase noise of the DPLL being limited by TDC quantization, further phase refinement is achieved by passive interpolation techniques [35], as shown in Fig. 2.15. The phase interpolation is done through the center nodes of the cross-coupled resistors, i.e. nodes C1, C2, and so on. In this way, the number of RO stages does not need to be increased, which would otherwise decrease the free-running frequency and incur higher power consumption to stay within the tolerable injection-locking frequency range.
To analyze the noise impact of the cross-coupled resistor network, we derive a circuit model for the single-stage IL-TDC with noise currents, as shown in Fig. 2.15(b). Three dominant noise sources are modelled: $\tilde{I}_{n,1}$ (noise from the delay stage driving node A1), $\tilde{I}_{n,2}$ (noise from the delay stage driving node B2), and $\tilde{I}_{n,3}$ (noise from the cross-coupled resistor between nodes A1 and B2). The voltage noise contribution at node A1 is then derived for the three different sources:

\[
\tilde{V}_{n,1}^2=\tilde{I}_{n,1}^2\left\{\left[R+\left(R_{eq}\,\|\,\tfrac{1}{sC}\right)\right]\|\,R_{eq}\,\|\,\tfrac{1}{sC}\right\}^2 \tag{2.25}
\]

\[
\tilde{V}_{n,2}^2=\tilde{I}_{n,2}^2\left\{\frac{\left(R_{eq}\,\|\,\tfrac{1}{sC}\right)^2}{2\left(R_{eq}\,\|\,\tfrac{1}{sC}\right)+R}\right\}^2 \tag{2.26}
\]

and

\[
\tilde{V}_{n,3}^2=\tilde{I}_{n,3}^2\left(R_{eq}\,\|\,\tfrac{R}{2}\,\|\,\tfrac{1}{sC}\right)^2 \tag{2.27}
\]

where $g_m$ and $R_{eq}$ are the equivalent transconductance and output resistance of the RO delay stage and $R$ is the cross-coupled resistance between nodes A1 and B2. Assuming the transistors of the delay stage operate in the saturation region at zero-crossing points, the values of $\tilde{I}_{n,1}$ and $\tilde{I}_{n,2}$ are proportional to 4kT R, while $\tilde{I}_{n,3}$ can be modeled as $4kT/R$. According to Eqs. (2.25)-(2.27), a larger resistor value $R$ helps to reduce the voltage noise contribution from the cross-coupled resistor network. Additionally, a larger $R$ minimizes the loading of each delay stage, yielding a higher free-running oscillation frequency for the same current consumption. On the other hand, $R$ should not be too large, in order to ensure differential operation of the RO. Besides, the cross-coupled resistors help to equalize the delay mismatch between the stages [21], i.e., to improve the DNL of the TDC after fabrication. In this prototype, we design $R$ to be $8R_{eq}$ to achieve an acceptable tradeoff between phase noise degradation and the TDC's DNL. According to Monte Carlo circuit simulation, the designed cross-coupled resistor network improves the DNL by 19% over process variation and mismatch.
The added noise from the resistor network is constrained to be less than 30% of the total RO noise, which matches the expectation from Eqs. (2.25)-(2.27).

Figure 2.16: Proposed DAC interface between digital loop filter and LC oscillator

2.4.3 Digitally Controlled Oscillator

As shown in Fig. 2.14, the DCO is implemented with a top-biased LC oscillator and dual switching pairs for reduced power dissipation and better out-of-band phase noise [36], [37]. The DCO frequency tuning is achieved through digitally controlled varactors and metal-insulator-metal (MIM) capacitors. The capacitor settings are determined in the foreground to compensate for PVT variations before closing the PLL loop. The coarse varactor bank with a frequency resolution of 600 kHz per code, called the acquisition (ACQ) DAC, is mainly used in the initial phase-settling period. The fine varactor bank with a frequency resolution of 120 kHz per code, called the tracking (TB) and extended tracking (EX-TB) DAC, is used in the PLL-locked steady state to compensate for the phase disturbance from various noise sources.

The block diagram of the DAC interface between these capacitor banks and the digital loop filter is shown in Fig. 2.16. Since the DCO frequency is digitally controlled, its quantization noise can degrade the overall PLL phase noise, which sets the minimum bound on the DAC resolution. Instead of implementing a high-resolution Nyquist DAC, which requires a large silicon area, the fractional control word from the digital loop filter is oversampled and dithered by a third-order error-feedback delta-sigma modulator (DSM). The DSM is clocked at the DCO frequency divided by 4 or 8, and its 3-bit output is decoded into an 8-bit thermometer code to control the TB bank. Since the digital loop filter and DSM are clocked at different frequencies, a clock-domain-crossing synchronization circuit is implemented to mitigate the potential for metastability.
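The behaviour of a third-order error-feedback DSM can be sketched in a few lines of Python. The text does not give the modulator's internal structure, so this is a textbook error-feedback form with noise transfer function $(1-z^{-1})^3$ and an ideal, non-saturating rounding quantizer; the function name `error_feedback_dsm3` is hypothetical, and the mapping of the output levels onto the 3-bit code and thermometer decoder is not modeled.

```python
import math

def error_feedback_dsm3(frac_word, n_samples):
    """Third-order error-feedback DSM with noise transfer function (1 - z^-1)^3.

    frac_word : fractional control word in [0, 1)
    Returns the integer output sequence (values fall in -3..4, i.e. 8 levels).
    """
    e1 = e2 = e3 = 0.0            # quantization errors delayed by 1, 2, 3 cycles
    out = []
    for _ in range(n_samples):
        w = frac_word - (3.0 * e1 - 3.0 * e2 + e3)   # input minus shaped past errors
        y = math.floor(w + 0.5)                       # ideal rounding quantizer
        e3, e2, e1 = e2, e1, y - w                    # shift the error delay line
        out.append(y)
    return out

codes = error_feedback_dsm3(0.3, 4096)
mean_code = sum(codes) / len(codes)   # long-run average approaches the fractional word
```

In this model the output occupies eight integer levels, which is consistent with a 3-bit modulator output, while the average of the dithered codes tracks the fractional control word.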
In an ideal scenario, only the fractional control word toggles during the PLL steady state, which reduces switching noise. In reality, the control word can be close to an integer code boundary, causing both integer and fractional codes to toggle simultaneously. Therefore, the EX-TB DAC is added and controlled by the LSB part of the integer control word, to avoid toggling the ACQ DAC. Note that the required range of the EX-TB DAC should cover the maximum control word variation during the steady state.

In the initial PLL-settling phase, the DCO control word can undergo a large swing and exercise both coarse capacitor banks, the ACQ and EX-TB DACs. In order to ensure the monotonicity of frequency tuning, a saturation block is inserted at the input of the EX-TB DAC. It clips at the maximum value when the integer code word is above the saturation threshold, and vice versa. Whenever the integer control word goes beyond the saturation threshold, only the ACQ DAC is toggled, with a larger frequency step, which helps reduce the PLL's settling time. Additionally, an ACQ Bias code is added, such that the EX-TB DAC is nominally biased at the middle value in the steady state, allowing maximal margin from the saturation threshold. The setting of the ACQ Bias value is determined according to the targeted locking frequency and capacitor bank resolutions. A simple finite state machine (FSM) can be implemented to automatically adjust the ACQ Bias setting on the chip, although we perform this adjustment manually for the prototype.

Figure 2.17: Implementation of the proposed gradient-based adaptive spur cancellation scheme

2.4.4 Implementation of Adaptive Spur Cancellation Loop

Figure 2.17 shows the detailed implementation of the proposed gradient-based adaptive spur cancellation loop. PDout is the combined coarse and fine TDC output.
It is used as the input of the spur cancellation loop for estimating the spurious tone characteristics and generating a 180-degree out-of-phase spur replica. As described in Section III, the implemented coordinate descent algorithm generates five replica spurs with various amplitude and phase offsets. In the proposed implementation, the five replica spurs are generated by three coordinate rotation digital computer (CORDIC)-based direct digital synthesizers (DDSs) with one phase accumulator (ACC). To allow a maximal design margin for this prototype, the spurious tone performance of the synthesized DDS is overdesigned to achieve 100 dB SFDR, avoiding any unwanted spurs due to the DDS. In practice, the DDS finite word length can be substantially reduced, resulting in at least a 2x reduction in digital gate count, according to our digital synthesis results. The phase offset is introduced after the phase accumulator by adding a DC phase offset code. The amplitude offset, in contrast, is introduced at the center CORDIC output by multiplying with different amplitude scaling coefficients. As a result, the proposed architecture shares the CORDIC module to minimize the cost, rather than using five ACC and CORDIC pairs.

One parameter that is required by the proposed cancellation loop is the spurious tone frequency. In this work, we assume the interference frequency is either known a priori, or can be found externally by observing the DPLL output spectrum. Alternatively, we can simply sweep the center frequency of the matched filter until we find a center frequency that yields a large filter output, which indicates the existence of a spur at that particular frequency. The determined spur frequency is then applied to the phase accumulator block via a frequency control word (FCW), and the phase is accumulated at this rate.
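The phase accumulator driven by an FCW can be sketched as below. This is a behavioural Python model in which an ideal cosine stands in for the CORDIC; the function names are hypothetical, the accumulator width is treated as a parameter, and the 14-bit width, 32 MHz reference and 500 kHz spur are used only as example values.

```python
import math

def fcw_for_frequency(f_spur_hz, f_ref_hz, acc_bits):
    """Nearest frequency control word; the resolution is f_ref / 2**acc_bits."""
    return round(f_spur_hz / f_ref_hz * (1 << acc_bits)) % (1 << acc_bits)

def dds_samples(fcw, acc_bits, n_samples, phase_offset_code=0):
    """Phase accumulator followed by an ideal phase-to-cosine mapping."""
    modulus = 1 << acc_bits
    acc = 0
    out = []
    for _ in range(n_samples):
        # The DC phase offset code is added after the accumulator.
        phase_code = (acc + phase_offset_code) % modulus
        out.append(math.cos(2.0 * math.pi * phase_code / modulus))
        acc = (acc + fcw) % modulus   # wraps once per output period
    return out

# Example values (assumed): 32 MHz reference, 500 kHz spur, 14-bit accumulator.
ACC_BITS = 14
fcw = fcw_for_frequency(500e3, 32e6, ACC_BITS)
tone = dds_samples(fcw, ACC_BITS, 128)
```

With these example values the accumulator wraps every 64 reference cycles, so the synthesized tone repeats with the period of a 500 kHz spur sampled at 32 MHz.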
The CORDIC maps this accumulated phase to a digital sinusoid of the corresponding frequency, with an LSB frequency resolution of $F_{REF}/2^{14}$. Additionally, this estimated frequency is used to generate the I/Q sinusoids via the same CORDIC-based DDS. Once the five spur replicas are generated, they are subtracted from PDout and the resulting difference values are processed by I/Q correlators in order to calculate the residue error energy. The correlator first multiplies the input with the I/Q sinusoids, accumulates the value over a certain number of cycles, and then performs a square-and-sum operation to compute the averaged residue error energy. Since the number of accumulation cycles affects the precision of the residue error energy estimation, it should be sufficiently large to reduce the estimation error that degrades the algorithm convergence. In this prototype, the number of accumulation cycles is programmable up to 256. In addition, to minimize the digital implementation cost, bit-width truncation is exploited inside the correlators, with careful numerical simulations to ensure minimal impact on the spur cancellation accuracy.

After the five residue error energy values are computed, they are sorted and the one with the minimum value is identified, which indicates the update direction of the coordinate descent algorithm. The direction indication bits connect to the phase and amplitude update module, which changes either the amplitude or the phase of the five spur replicas by $A_{step}$ or $\phi_{step}$. Since these updates are applied to all spur replicas, the phase offset is simply added after the ACC block and the amplitude update is performed directly by the CORDIC module. After the update of $A_{step}$ or $\phi_{step}$, the CORDIC module uses 17 pipeline stages to generate a new sinusoidal output. This latency should be carefully considered to align the cancellation phase properly in the loop.
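The update rule described above (five replicas, pick the lowest residue energy, move one coordinate by one step) can be sketched in Python as follows. This is a simplified model under stated assumptions: the residue energy uses the large-$N_{win}$ form obtained by expanding the left-hand side of Eq. (2.24) instead of running the correlators, pipeline latency is ignored, and all names and numeric values are illustrative.

```python
import math

def residue_energy(a_i, phi_i, a_y, phi_y):
    """Large-N_win residue energy: left-hand side of Eq. (2.24), expanded."""
    return 0.25 * (a_i ** 2 + a_y ** 2
                   - 2.0 * a_i * a_y * math.cos(phi_i - phi_y))

def descend(a_i, phi_i, a_step, phi_step, a_off, phi_off, n_updates):
    """Track (a_i, phi_i) by moving one coordinate per update."""
    a_y, phi_y = 0.0, 0.0
    for _ in range(n_updates):
        # Five replicas: center, amplitude +/- offset, phase +/- offset.
        candidates = [
            (residue_energy(a_i, phi_i, a_y, phi_y), 0.0, 0.0),
            (residue_energy(a_i, phi_i, a_y + a_off, phi_y), +a_step, 0.0),
            (residue_energy(a_i, phi_i, a_y - a_off, phi_y), -a_step, 0.0),
            (residue_energy(a_i, phi_i, a_y, phi_y + phi_off), 0.0, +phi_step),
            (residue_energy(a_i, phi_i, a_y, phi_y - phi_off), 0.0, -phi_step),
        ]
        # The replica with minimum energy sets the update direction;
        # if the center replica wins, nothing changes.
        _, d_a, d_phi = min(candidates, key=lambda c: c[0])
        a_y += d_a
        phi_y += d_phi
    return a_y, phi_y

a_est, phi_est = descend(a_i=0.5, phi_i=1.0, a_step=0.01, phi_step=0.01,
                         a_off=0.01, phi_off=0.01, n_updates=400)
```

In this sketch the estimate settles to within one step of the interference amplitude and phase, which illustrates why the step size bounds the steady-state error.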
To meet the timing constraint of the overall digital logic, we introduce a total of $(19+N_{win})$ pipeline stages to execute one update. It is also noteworthy that if the coupled spur frequency is above half of the reference clock frequency, it is aliased into the Nyquist band, i.e. DC to $F_{REF}/2$, at the TDC output due to the inherent sampling operation of the TDC [26]. Therefore, the proposed spur cancellation is still capable of cancelling the aliased spur before it modulates the DCO.

2.5 Experimental Results

The silicon prototype was implemented in a 65nm CMOS process with an active area of 0.75 mm², as shown in Fig. 2.18. The analog part of the digital PLL consumes 15.8 mA, while the output buffers consume 3.5 mA from a 1V supply. The digital part consumes 1.83 mA in normal DPLL operation, while the spur cancellation loop consumes an additional 3.6 mA.

Figure 2.18: Die micrograph

Figure 2.19: Measured spectrum snapshot of IL-TDC (a) before injection locking, (b) under injection locking (c) Phase noise profile of RO before and after IL (d) Measured DNL performance of proposed IL-TDC

Figure 2.20: (a) Measured phase noise profile with different DPLL bandwidth settings under integer-N mode at 2.816 GHz carrier frequency (b) PLL output PSD of frac-N mode at 2.8165 GHz with fractional spur at 500 kHz offset with narrow bandwidth setting

To ensure the proper working of the injection-locked TDC, the in-band phase noise of the injection-locked RO should follow that of the LC DCO. Fig. 2.19(a)-(c) shows the measured RO spectrum before and after enabling the injection-locking interface, as well as the corresponding phase noise profile. To determine the injection-locking bandwidth, the DCO frequency is intentionally swept away from the RO free-running frequency until we observe the injection-pulling phenomenon at the RO output.
In this prototype, we find that a frequency difference of > 147 MHz triggers this phenomenon, and hence determines the injection-locking bandwidth of the IL-TDC. In this case, it is indeed significantly higher than the DPLL bandwidth, as discussed in Section II.C. The measured DNL of the IL-TDC after DPLL lock is shown in Fig. 2.19(d), showing a peak value of 0.55 LSB.

Figure 2.21: Measured reference spur of -86.45 dB at 2.816 GHz carrier frequency

The phase noise measurement in integer-N operation mode is shown in Fig. 2.20(a) with different PLL bandwidth settings, and the fractional-N mode at 2.8165 GHz is shown in Fig. 2.20(b). Without activating the spur cancellation algorithm, the measured worst-case fractional spur is -31 dBc at 31.25 kHz offset with an FCW of 88.00048828. The measured in-band phase noise at 40 kHz offset is -90 dBc/Hz, and the out-of-band phase noise at 3 MHz offset is -128 dBc/Hz when normalized to a carrier frequency of 3.6 GHz. Since the DPLL architecture eliminates the charge pump circuit and analog loop filter, and owing to careful layout and noise isolation, a reference spur below -86 dBc is achieved with a 32-MHz reference clock, as shown in Fig. 2.21.

Figure 2.22: (a) Measured spur level improvement of >43 dB at 500 kHz offset frequency with 30 kHz bandwidth (b) Measured spur level before/after spur cancellation scheme over different injected spur frequencies with 250 kHz bandwidth

To validate the effectiveness of the gradient-based cancellation scheme, a sinusoidal interference is intentionally injected via the power supply of the input clock buffer. In a representative case, a 500 kHz interference tone is injected, and the spectrum is captured before and after activating the spur cancellation scheme, demonstrating a spur reduction of 43 dB, as shown in Fig. 2.22(a).
The interference frequency is further swept from 125 kHz to 1 MHz offset from the carrier frequency with the bandwidth setting of 250 kHz, and a spur level reduction of > 20 dB is measured within this frequency range, as shown in Fig. 2.22(b). Note that better cancellation is obtained when the interference is at 500 kHz offset, because the value of $N_{win}$ happens to be an integer multiple of $1/\omega_i$, such that the frequency-dependent terms in Eq. (2.23) are completely removed. The proposed cancellation scheme also works for spur frequencies larger than 1 MHz, since the DDS can synthesize any sinusoidal frequency within the Nyquist band. However, since the digital PLL loop filter attenuates the coupled spur in the stop band, we report the spur frequency up to the PLL bandwidth in order to clearly observe the spur level difference before and after the cancellation.

Figure 2.23: (a) Measured spur level before and after cancellation with varying interference amplitude in real time (b) Measured digital dynamic power consumption under different DDS frequencies

To validate the real-time tracking capability of the proposed gradient-based algorithm, the coupled interference amplitude is varied manually in real time at a very slow rate, such that the adaptive loop can fully settle, and the DPLL output spectrum is continuously observed to record the spur levels. As shown in Fig. 2.23(a), the spur level indeed remains at the same low level over time, which shows that the proposed cancellation algorithm automatically tracks the amplitude variation in the background. Finally, the dynamic power consumption of the DDS increases with the signal activity. This is confirmed by the measured power consumption of the cancellation loop, with the DDS synthesizing 125 kHz to 16 MHz digital sine waves, as plotted in Fig. 2.23(b). Table 2.1 lists the key highlights in comparison with the state-of-the-art digital PLLs.
The existing spur mitigation techniques [5] involve some form of dithering to spread out the spur energy at the cost of an elevated noise floor. Our proposed spur cancellation avoids this undesirable phenomenon, since no dithering is utilized. We have also externally injected two spurious tones and confirmed that only the intended tone is canceled while the other one is left intact. In other words, each cancellation loop mitigates the target tone without interfering with the other one, i.e., orthogonality holds.

Table 2.1: Comparison with state-of-the-art digital PLLs

2.6 Conclusion

A fractional-N digital PLL that synthesizes frequencies from 2.7 to 4.8 GHz was presented. In order to minimize the TDC overhead, the proposed calibration-free injection-locked TDC is demonstrated to work with an LC-based DCO and to automatically track PVT variation. To improve the spurious tone performance, a gradient-based adaptive single-tone spur cancellation scheme is proposed, for which >40 dB spur reduction was measured. The proposed cancellation scheme takes advantage of the digital PLL architecture and thus highlights its advantage over conventional analog PLLs.

Chapter 3
A Digital PLL with Feedforward Multi-Tone Spur Cancellation

3.1 Spur in Multi-tone Scenario

No matter how the spurs are generated, they will unavoidably come in multiple tones. These tones can be harmonics of each other, which is very common as clock sources usually generate a square waveform. The spurs can also be composed of non-harmonic tones, which happens when the interferers are not coherent with each other (i.e., multiple aggressors are present). To mitigate multi-tone spurs, the DDS-based cancellation scheme introduced in Chapter 2 is not the optimal solution. The DDS block can only generate a single-tone spur replica at a given time, so multiple DDSs are required for the multi-tone spur case. Therefore, the increased cost will be proportional to the number of spurs.
While some techniques effectively reject the external spurs, they may not be able to compensate for the performance degradation due to the internal spurs. In order to address the internal spurs, other techniques are necessary. Moreover, if more than one spur source exists, multiple techniques must be combined. Consequently, the pressing problem is how to realize a cancellation technique that can address both internal and external spurs effectively with shared logic. To meet these demands, a feedforward multi-tone spur cancellation scheme with minimum design complexity is discussed in this chapter. Several case studies are explored experimentally to validate the technique's effectiveness in various spur scenarios.

3.2 Proposed Multi-Tone Spur Cancellation

3.2.1 Derivation of the Cancellation Algorithm

The proposed multi-tone spur cancellation scheme is based on the observation that either externally coupled or internally generated spurs should always present some kind of periodic pattern at the phase detector output, i.e., the TDC output in the case of a digital PLL. To illustrate the basics of the proposed cancellation concept, a sawtooth-like spur pattern with a periodicity of $D$ is examined as a representative example in Fig. 3.1, which would generate a series of spurious tones at the frequency $F_{REF}/D$ and its integer harmonics. In general, $D$ can be either an integer or a fractional number; how to implement $D$ as a fractional number is described in Sections 3.2.3 and 3.4.2.

Figure 3.1: The property of multi-tone spurs and its harmonic decomposition

As shown in Fig. 3.2, if the spur pattern ($x[n]$) is intentionally delayed by precisely $D$ cycles, this delayed spur replica ($y[n]$) should coherently match the original pattern. By subtracting this replica from the original spur pattern, the spur pattern can be completely removed before modulating the DCO, and hence the spur harmonics are cancelled.
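The delay-and-subtract idea can be checked with a short Python sketch. The sawtooth pattern, the period $D = 8$ and the function name `delay_and_subtract` are assumptions chosen for illustration; the sketch is open-loop and noise-free.

```python
def delay_and_subtract(x, d):
    """Return x[n] - x[n - d], treating samples before n = 0 as zero."""
    return [x[n] - (x[n - d] if n >= d else 0.0) for n in range(len(x))]

# Assumed example pattern: sawtooth repeating every D = 8 reference cycles.
D = 8
pattern = [(n % D) / D - 0.4375 for n in range(64)]   # zero-mean sawtooth
residual = delay_and_subtract(pattern, D)
```

After the first $D$ samples, during which the delay line is still filling, the residual is identically zero, so every harmonic of the periodic pattern is removed at once.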
From the time-domain perspective, since the harmonics repeat with exactly the same period, the delayed spur replica naturally aligns all the harmonic waveforms and potentially allows complete multi-tone cancellation. From the frequency-domain perspective, the signal components at DC and at integer multiples of $F_{REF}/D$ are essentially notched out. Note that only spurs in a harmonic relationship are considered for now; spurs in a non-harmonic relationship are considered in Section 3.4.3.

While the aforementioned cancellation algorithm works in the open-loop scenario, it has not considered the closed-loop operation of the PLL or the noise component at the TDC output. Therefore, the cancellation algorithm requires further modifications before it can be embedded inside the PLL loop. To understand the design constraints, the z-domain transfer function of the DPLL loop [40], [41] is first examined, as illustrated in Fig. 3.3.

Figure 3.2: Basic concept of feedforward multi-tone spur cancellation in time domain view

Figure 3.3: z-domain model of the proposed digital PLL with feedforward cancellation path $H_{CANC}(z)$

Figure 3.4: The open loop response of digital PLL with $H_{CANC}(z)=1-z^{-D}$

Figure 3.5: (a) Block diagram of feedforward cancellation path with high-pass filter and (b) zero-pole diagram of the high-pass filter

Assuming a second-order type-II loop response is combined with the $D$-cycle delay and subtraction (i.e. a comb filter, $H_{CANC}(z)=1-z^{-D}$), the DPLL open loop response $H_{OPEN}(z)$ is expressed as

\[
H_{OPEN}(z)=H_{CANC}(z)\,H_{LPF}(z)\,H_{DCO}(z)=\left(1-z^{-D}\right)\cdot\left(\frac{z^{-1}}{1-z^{-1}}\right)\cdot\left(1+\frac{z^{-1}}{1-z^{-1}}K_I\right)K_{DCO}K_D \tag{3.1}
\]

where $K_{DCO}$, $K_D$ and $K_D K_I$ represent the DCO gain, proportional gain and integral gain of the loop filter, respectively. Note that the introduction of $H_{CANC}(z)$ changes the frequency response of the DPLL open loop.
In the low frequency region, one of the two poles introduced by the DLF and DCO is cancelled, and hence the overall roll-off becomes -20 dB/dec rather than -40 dB/dec, as shown in Fig. 3.4. This undesirably changes the DPLL response to that of a 1st-order loop, which can be observed by plotting the frequency response of the DCO noise transfer function.

To solve the degeneration of the loop response indicated by Eq. (3.1), a high-pass filter with the response $\frac{1-z^{-1}}{1-pz^{-1}}$ is inserted in the $D$-cycle delay path as shown in Fig. 3.5(a), providing the pole/zero pair shown in Fig. 3.5(b). As a result, the open loop transfer function $H_{OPEN}(z)$ is modified as follows

\[
H_{OPEN}(z)=\left(1-\left(\frac{1-z^{-1}}{1-pz^{-1}}\right)z^{-D}\right)\cdot\left(\frac{z^{-1}}{1-z^{-1}}\right)\cdot\left(1+\frac{z^{-1}}{1-z^{-1}}K_I\right)K_{DCO}K_D \tag{3.2}
\]

If the low frequency regime of the transfer function (3.2) is examined, the overall roll-off still provides -40 dB/dec. For an intuitive understanding, the zero-pole diagram of the open loop transfer function of the original 2nd-order type-II DPLL is first examined: there are two poles at DC and one zero within the unit circle, as shown in Fig. 3.6(a). Taking the $1-z^{-D}$ block with $D=4$ as an example, four zeros are added at 0°, 90°, 180°, and 270° on the unit circle, and another four poles at the origin, as shown in Fig. 3.6(b). When the zeros and poles from Figs. 3.6(a) and (b) are combined, the zero at DC, i.e. 0°, cancels out one pole, as shown in Fig. 3.6(c). As shown in Fig. 3.6(d), by inserting the high-pass filter to block DC in the $D$-cycle delay path, the direct zero-pole cancellation at DC is avoided, which helps preserve the 2nd-order characteristic of the DPLL response in the low frequency region. Note that the value of $p$ determines the frequency range over which the slope correction is performed.
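The effect of the high-pass filter on the cancellation path can be seen by evaluating the two versions of $H_{CANC}(z)$ on the unit circle. The Python sketch below does this for $D = 4$; the pole value $p = 0.99$ is an assumed example rather than the prototype's setting, and the function names are hypothetical.

```python
import cmath

def h_comb(z, d):
    """Plain delay-and-subtract path: 1 - z^-D."""
    return 1.0 - z ** (-d)

def h_comb_hpf(z, d, p):
    """Delay path preceded by the high-pass filter (1 - z^-1) / (1 - p z^-1)."""
    hpf = (1.0 - 1.0 / z) / (1.0 - p / z)
    return 1.0 - hpf * z ** (-d)

D = 4
P = 0.99                                  # assumed pole location, close to 1
z_dc = complex(1.0, 0.0)                  # DC point on the unit circle
z_spur = cmath.exp(1j * cmath.pi / 2)     # first spur harmonic for D = 4 (90 degrees)

gain_dc_comb = abs(h_comb(z_dc, D))             # zero at DC cancels a loop pole
gain_dc_hpf = abs(h_comb_hpf(z_dc, D, P))       # DC passes through the main path
gain_spur_comb = abs(h_comb(z_spur, D))         # notch at the spur frequency
gain_spur_hpf = abs(h_comb_hpf(z_spur, D, P))   # small residual at the notch
```

In this idealized model the DC zero disappears once the filter is inserted, at the cost of a small nonzero gain at the spur harmonic that shrinks as $p$ approaches 1.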
Another design consideration is the noise component at the input of the D- cycle delay path should be rejected in an ideal scenario, in other words, the spur 59 Figure 3.6: Zero-pole diagram of (a) open loop response of 2nd-order type-II digital PLL (b)H CANC (z) = 1z 4 (c)H OPEN (z) withH CANC (z) = 1z 4 (d)H OPEN (z) with H CANC (z) = 1 ( 1z 1 1pz 1 )z 4 Figure 3.7: The magnitude response of H CANC (z) = 1z D 60 Figure 3.8: (a) The block diagram of feedforward cancellation path with high-pass lter and signal averaging and (b) its frequency response with dierent N win information at the output of the delay path should only be extracted. In the case of simple delay and subtraction, some part of the frequency spectrum will experience noise enhancement, as evident from its frequency response (Fig. 3.7). To resolve this issue, the signal-to-noise-ratio (SNR) between the spur and noise component in theD-cycle delay path should be increased. Hence, a simple averaging technique [42] is proposed to provide the processing gain and eectively suppress the noise feedthroughs. The transfer function of the averaging operation in the z-domain is expressed as H AVG (z) = 1 N win N win 1 X i=0 z iD (3.3) 61 , and its implementation is drawn in Fig. 3.8(a). Note that, by increasing the averaging cycle N win , it improves the SNR in proportion. Essentially, the D-cycle delay path including averaging operation formulates a bandpass lter response. If the passband area is integrated and normalized to that of an all-pass lter given the same frequency band, the ratio is found approximately 1 N win . Therefore, it suggests that roughly 100 (1 1 N win )% of the noise energy is rejected through the D-cycle delay path via the averaging operation. From another perspective, the averaging operation eectively sharpens the lter response ofH CANC (z) around the notch frequencies, and gain response at the passbands becomes at, i.e., unity. Fig. 
3.8(b) shows the effect of different numbers of averaging cycles on the frequency response. To conclude, the complete open-loop transfer function $H_{OPEN}(z)$ of the final implementation can be expressed as:

$$H_{OPEN}(z) = \left(1 - \left(\frac{1-z^{-1}}{1-pz^{-1}}\right) \cdot \left(\frac{z^{-D} + z^{-2D} + \cdots + z^{-D N_{win}}}{N_{win}}\right)\right) \cdot \left(\frac{z^{-1}}{1-z^{-1}}\right) \cdot \left(1 + \frac{z^{-1}}{1-z^{-1}} K_I\right) K_{DCO} K_D \tag{3.4}$$

The effectiveness of the proposed signal averaging has been confirmed in lab measurements; the phase noise profiles for different averaging cycles are shown in Section 3.5.

3.2.2 Stability of PLL Loop

Traditionally, the stability of a PLL loop is analyzed using either the phase margin [43] or Jury's stability theory [44]. In this section, both techniques are used to gain more insight into how each building block of the spur cancellation loop affects the stability. Considering the model shown in Fig. 3.3 without the $H_{CANC}(z)$ block, the characteristic polynomial $f(z)$ is found from the denominator of the closed-loop transfer function of the 2nd-order type-II DPLL. It can be written as

$$f(z) = z^2 + (K_{DCO} K_D - 2) z + (1 + K_{DCO} K_D K_I - K_{DCO} K_D) \tag{3.5}$$

To guarantee the stability of the system, Jury's criterion has to be satisfied, and the range for a stable DPLL loop can be determined as follows:

$$\begin{cases} K_{DCO} K_D K_I > 0 \\ 4 - 2 K_D K_{DCO} + K_{DCO} K_D K_I > 0 \\ 2 > K_D K_{DCO} - K_{DCO} K_D K_I > 0 \end{cases} \tag{3.6}$$

In this prototype, the tunable range of $K_I$ is constrained from zero to one. Therefore, the upper bound of $K_D K_{DCO}$ is 2, based on (3.6). Next, the constraint is re-examined after introducing $H_{CANC}(z) = 1 - z^{-D}$. To simplify the mathematical expression, D is set to 1.
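The baseline conditions in (3.6) can be cross-checked numerically before the cancellation block is considered. The sketch below is only an illustration under assumed gain values: it builds the polynomial of Eq. (3.5), applies the three second-order Jury conditions, and compares the verdict with a direct root computation. The helper names and the `(K, K_I)` test points are hypothetical; `K` stands for $K_{DCO} K_D$.

```python
import numpy as np

def baseline_poly(K, K_I):
    """Coefficients of f(z) in Eq. (3.5), highest power first; K = K_DCO * K_D."""
    return [1.0, K - 2.0, 1.0 + K * K_I - K]

def jury_second_order(coeffs):
    """Jury conditions for a monic polynomial z^2 + a1 z + a0."""
    _, a1, a0 = coeffs
    f_at_plus_one = 1 + a1 + a0       # f(1) > 0
    f_at_minus_one = 1 - a1 + a0      # f(-1) > 0
    return f_at_plus_one > 0 and f_at_minus_one > 0 and abs(a0) < 1

def roots_inside_unit_circle(coeffs):
    """Direct check: every root of the polynomial has magnitude below 1."""
    return bool(np.all(np.abs(np.roots(coeffs)) < 1))

results = {}
for K, K_I in [(0.5, 0.1), (1.5, 0.1), (2.5, 0.1)]:
    c = baseline_poly(K, K_I)
    results[(K, K_I)] = (jury_second_order(c), roots_inside_unit_circle(c))
    print(f"K={K}, K_I={K_I}: Jury={results[(K, K_I)][0]}, roots inside={results[(K, K_I)][1]}")
```

`roots_inside_unit_circle` accepts any coefficient list, so the same check can be applied to higher-order characteristic polynomials where the Jury table becomes tedious to write out by hand.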
The new characteristic polynomial and its stability range then become:

$$f(z) = z^2 + (K_{DCO} K_D - 1) z + (K_{DCO} K_D K_I - K_{DCO} K_D) \tag{3.7}$$

$$\begin{cases} K_{DCO} K_D K_I > 0 \\ 2 - 2 K_D K_{DCO} + K_{DCO} K_D K_I > 0 \\ 1 > K_D K_{DCO} - K_{DCO} K_D K_I > -1 \end{cases} \tag{3.8}$$

From (3.8), the upper bound of $K_D K_{DCO}$ is reduced to 1, i.e., the stability range is halved by $H_{CANC}(z)$ when compared to (3.6). After inserting the averaging block in the final cancellation loop, the characteristic polynomial is rewritten using the transfer function of (3.4), with $N_{win} = 2$ and $p = 1$ as an example and D still set to 1. The modified characteristic polynomial and stability region for this case are derived as follows:

$$f(z) = 2z^3 + (2 K_{DCO} K_D - 2) z^2 + (2 K_I - 1) K_{DCO} K_D z + (K_I - 1) K_{DCO} K_D \tag{3.9}$$

$$\begin{cases} 3 K_{DCO} K_D K_I > 0 \\ 4 - 2 K_D K_{DCO} + K_{DCO} K_D K_I > 0 \\ 2 > (1 - K_I) K_{DCO} K_D > -1 \\ \left| (K_I - 1)^2 K_{DCO}^2 K_D^2 - 4 \right| > \left| -2 K_{DCO} K_D \left( 3 K_I - 2 + K_{DCO} K_D (1 - K_I) \right) \right| \end{cases} \tag{3.10}$$

From (3.10), the averaging restores the bound to $K_D K_{DCO} < 2$, the same as in (3.6). This analysis suggests that the averaging block helps improve the loop stability.

Figure 3.9: Computed phase margin with (a) different settings of D-cycle delay given a certain DPLL bandwidth, and (b) different averaging cycle

While Jury's stability criterion provides insight into how each building block affects the stability of the PLL loop in a simple setting, the mathematical expressions become very complex as the order of the characteristic polynomial increases. As a consequence, the phase margin is also examined using the approximated z-domain transfer function of Eq. (3.4). As shown in Fig.
3.9(a), the phase margin is evaluated in Matlab with different settings of D for the same DPLL bandwidth, which is about 10% of the reference clock frequency. As indicated by the pole/zero diagram of the transfer function, when the delay D increases, the poles move closer to the unit circle, which implies a smaller phase margin. More averaging cycles improve the phase margin for a given delay and DPLL bandwidth, as suggested by Fig. 3.9(b). Note that the averaging block essentially generates pole-zero pairs close to the unit circle in the pole/zero diagram.

Figure 3.10: Illustration of phase alignment between the actual spur and its replica with (a) 4-cycle delay, and (b) 4.5-cycle delay

3.2.3 Integer and Fractional Delay

Up until now, the spur period has been assumed to be an integer multiple of the reference clock period. Therefore, the delay D is an integer, which can be realized simply with a chain of flip-flops clocked by the reference clock. This is always valid in the case of a fractional-N spur, which will be elaborated in Section 3.3. However, in the case of an externally coupled spur, the aggressor period may not be an integer multiple of the reference clock period. In other words, the delay D can be a non-integer number. If an integer delay chain is still applied, the delayed spur pattern will not be aligned with the original one, which leads to incomplete cancellation, as shown in Fig. 3.10. In this example, D is assumed to be 4.5, and the time-domain waveform shows that a fractional delay of 4.5 is necessary to perfectly align the delayed spur phase. To accommodate such a scenario for prototyping purposes, a Lagrange interpolation fractional delay (FD) filter [45] with some adaptive capability is used, as shown in Fig. 3.11.

Figure 3.11: Block diagram of the Lagrange fractional delay filter with adaptability
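For readers unfamiliar with Lagrange fractional-delay filters, the sketch below computes the FIR taps from the standard closed-form Lagrange interpolation formula and checks that a ramp is delayed by the requested non-integer amount. It is a generic textbook illustration, not the prototype's filter: the order (3) and delay (1.5 samples) are assumptions chosen to keep the numbers easy to verify by hand.

```python
import numpy as np

def lagrange_fd_coeffs(delay, order):
    """FIR taps h[0..order] that approximate z^-delay (delay in samples)."""
    h = np.ones(order + 1)
    for k in range(order + 1):
        for m in range(order + 1):
            if m != k:
                h[k] *= (delay - m) / (k - m)
    return h

# Example: third-order filter realizing a total delay of 1.5 samples.
h = lagrange_fd_coeffs(1.5, order=3)

# A ramp is reproduced exactly by polynomial interpolation, so the output
# should equal the input ramp shifted by 1.5 samples once the filter is full.
ramp = np.arange(20, dtype=float)
delayed = np.convolve(ramp, h)[3:20]   # skip the start-up transient
print("taps:", h)
print("first outputs:", delayed[:3])
```

An integer delay is the special case in which a single tap equals one and the others vanish, which is why the integer part of D can be left to the flip-flop chain.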
By setting proper filter coefficients, the FD filter provides the fractional part of the delay, while the integer part is still realized with an integer number of flip-flops. As the digital implementation of the fractional delay implies finite resolution, the generated delay may differ from the actual period of the spur to be cancelled. This mismatch in delay is bounded by the fractional delay resolution ($FD_{LSB}$) and limits the effectiveness of the spur cancellation. The phase error ($\phi_{err}$) in radians between the actual and the replica spur pattern is then derived as follows:

$$\phi_{err} = \frac{2\pi \cdot FD_{LSB}}{D} = \frac{2\pi \cdot FD_{LSB}}{F_{REF}/F_{spur}} \tag{3.11}$$

where $F_{spur}$ represents the fundamental frequency of the spurious tones. Assuming, to simplify the expression, that the spur pattern is a single sinusoid with a normalized amplitude of 1, the residue error energy can be calculated by applying a square-and-average function over the spur period D at the output of the cancellation loop. Hence, the residue error energy can be expressed as a function of $\phi_{err}$ as

$$\text{Residue Error Energy} = \{1 - \cos(\phi_{err})\} \tag{3.12}$$

If the residue error energy is divided by the uncompensated spur energy, the ratio indicates the amount of spur reduction with and without the cancellation. This ratio is referred to as the cancellation ratio in this work, and is expressed as follows:

$$\text{Cancellation Ratio} = 2\{1 - \cos(\phi_{err})\} \tag{3.13}$$

To verify the above analysis, the cancellation ratio is simulated with different values of $FD_{LSB}$ using the DPLL behavioral model in Matlab and compared with the analytical results from (3.11) and (3.13), as shown in Fig. 3.12(a). It can be observed that the results match well, and that a spur with lower frequency tolerates a coarser $FD_{LSB}$.

Figure 3.12: (a) Simulation and analytical results of cancellation ratio given single sinusoidal spur; (b) simulation result of cancellation ratio given saw-tooth spur over different fractional delay resolution
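Equations (3.11) to (3.13) are easy to reproduce outside the behavioral model. The sketch below compares the closed-form cancellation ratio with a direct square-and-average of a unit sinusoid minus its misaligned replica. It assumes that $FD_{LSB}$ is expressed as a fraction of one reference period and that the delay mismatch equals the full $FD_{LSB}$ (the worst case); the function names and the sampled values of D are illustrative only.

```python
import numpy as np

def phase_error(fd_lsb, D):
    """Eq. (3.11): phase error in radians, with D = F_REF / F_spur."""
    return 2 * np.pi * fd_lsb / D

def cancellation_ratio_analytic(fd_lsb, D):
    """Eq. (3.13)."""
    return 2 * (1 - np.cos(phase_error(fd_lsb, D)))

def cancellation_ratio_numeric(fd_lsb, D, n=100_000):
    """Square-and-average of (spur - misaligned replica) over one spur period."""
    theta = 2 * np.pi * np.arange(n) / n
    spur = np.sin(theta)
    replica = np.sin(theta - phase_error(fd_lsb, D))
    return np.mean((spur - replica) ** 2) / np.mean(spur ** 2)

ratios = {}
for D in (4.5, 32.0, 128.0):
    a = cancellation_ratio_analytic(0.008, D)
    b = cancellation_ratio_numeric(0.008, D)
    ratios[D] = (a, b)
    print(f"D={D:6.1f}: analytic {10 * np.log10(a):.1f} dB, numeric {10 * np.log10(b):.1f} dB")
```

The ratio shrinks as D grows, which is the numerical counterpart of the observation that lower-frequency spurs tolerate a coarser fractional delay resolution.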
Note that more complex spur patterns, such as a sawtooth waveform, can be analyzed with a similar methodology. The cancellation ratio for a sawtooth waveform with different $FD_{LSB}$ is simulated and plotted in Fig. 3.12(b). In this prototype, a 16-tap FIR-based FD filter is implemented, which provides a fractional delay resolution of 0.8% of one reference clock period to allow sufficient spur cancellation. If an even finer fractional delay is needed, more taps and a longer word length can be used. Lastly, the adaptability of the FD filter is discussed in Section 3.2.4.

3.2.4 Complete Cancellation Loop and Adaptability

The complete block diagram of the proposed feedforward multi-tone spur cancellation scheme is shown in Fig. 3.13. It consists of four key building blocks: high-pass filter, integer delay chain, FD filter, and averaging block. So far, the spur frequency has been presumed known a priori, which is almost always true in practical applications: the internal fractional-N spur frequency can be calculated deterministically, and the frequencies of external aggressors, such as other clock domains, are typically known from system-level frequency planning. To accommodate the rare case in which the external spur frequency is not known a priori, two adaptive loops are used in sequence to determine the spur frequency [27], [29]. The first adaptive loop is applied to the integer delay chain to find the optimal integer delay setting. This estimation is performed by observing the error between the output of the integer delay chain and the input of the feedforward cancellation path, i.e., $e_{int}[n]$ annotated in Fig. 3.13. The delay setting of the integer delay cell is swept, and the value that leads to the least error energy is selected. Once the integer delay is set, the second adaptive loop is enabled to determine the best FD setting by asserting the Adap_en signal.
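The integer delay sweep can be sketched in a few lines. The example below is a simplified software illustration, not the on-chip logic: it forms the error between a noisy sawtooth spur and its delayed copy for each candidate delay and selects the delay with the least error energy. The spur period (12 samples), noise level, record length, and sweep range are all assumed values, and the sweep is kept below twice the true period because integer multiples of the period align equally well.

```python
import numpy as np

def best_integer_delay(x, candidates):
    """Return the candidate delay d minimizing mean((x[n] - x[n-d])^2)."""
    energies = []
    for d in candidates:
        e_int = x[d:] - x[:-d]          # error between input and delayed copy
        energies.append(np.mean(e_int ** 2))
    return candidates[int(np.argmin(energies))], energies

rng = np.random.default_rng(0)
n = np.arange(2400)
true_period = 12                                   # hypothetical spur period in samples
spur = (n % true_period) / true_period - 0.5       # sawtooth spur pattern
x = spur + 0.05 * rng.standard_normal(n.size)      # plus wideband noise
candidates = list(range(1, 21))                    # stay below 2 * true_period
d_hat, energies = best_integer_delay(x, candidates)
print("estimated integer delay:", d_hat)
```

At the correct delay the spur cancels and only the noise contributes to the error energy, so the minimum stands out clearly against neighbouring delay settings.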
Essentially, the FD filter coefficients are adjustable via an adaptive feedback loop, which allows automatic estimation of the delay setting in the cancellation loop. The weight adjustment is performed by observing the error, $e_{frac}[n]$, at the output of the Lagrange FD filter. The adaptation uses a sign least-mean-square (LMS) algorithm, which lowers the implementation complexity at the cost of slower convergence.

With the implemented Lagrange FD filter, the achievable fractional delay is always positive. As a result, a minor implementation detail concerning the integer and fractional delays should be considered. For example, if D is 4.9, the integer delay achieving the least error energy would be 5 rather than 4. To prevent a negative fractional delay in the FD filter, the programmed integer delay should be floored, i.e., set to 4 in this case, such that the FD filter always provides a positive fractional delay. The operation of the proposed averaging scheme is detailed in Section 3.4.2.

Figure 3.13: Complete block diagram of the proposed feedforward multi-tone spur cancellation loop

3.3 Case Study: Fractional-N Spur Cancellation

When the fractional-N operation of a DPLL is derived from FCW accumulation instead of a multi-modulus divider, the implementation is simpler and the dithering noise of a delta-sigma modulator is avoided. However, the finite TDC quantization step inevitably introduces a periodic sawtooth quantization error pattern and hence harmonic spurious tones. The shape of this sawtooth waveform is related to the resolution and DNL of the TDC and to the FCW value.

Figure 3.14: Illustration of fractional-N spur cancellation

Assuming the fractional part of the FCW is df, the quantization error pattern repeats with a period determined by the smallest "1" bit (SO) of df, referred to as SO(df). For instance, if df is set to $\frac{1}{2^5} + \frac{1}{2^7}$, the periodicity is determined by SO(df) = $\frac{1}{2^7}$.
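The relation between df and the repetition length can be illustrated with exact rational arithmetic. The sketch below is a hedged illustration with hypothetical helper names: it reads SO(df) from the denominator of a dyadic fraction and, independently, accumulates df modulo 1 until the fractional phase returns to zero.

```python
from fractions import Fraction

def smallest_one_bit(df):
    """SO(df): weight of the least significant '1' bit of a dyadic fraction df."""
    den = df.denominator
    if den & (den - 1):
        raise ValueError("df must be a sum of binary fractions")
    # In lowest terms a / 2**b, the numerator is odd, so the lowest set bit is 1 / 2**b.
    return Fraction(1, den)

def accumulator_period(df, max_len=1 << 16):
    """Number of reference cycles after which the accumulated fraction repeats."""
    phase = Fraction(0)
    for k in range(1, max_len + 1):
        phase = (phase + df) % 1
        if phase == 0:
            return k
    raise RuntimeError("no repetition found within max_len cycles")

df = Fraction(1, 2**5) + Fraction(1, 2**7)
print("SO(df) =", smallest_one_bit(df), "period =", accumulator_period(df), "samples")
```

For the df used in the text, both routes give a pattern length of 128 reference cycles, the reciprocal of SO(df).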
Hence, the periodicity of the fractional spur pattern can be deterministically calculated as

$$D = \frac{F_{REF}}{F_{spur}} = \frac{F_{REF}}{F_{REF} \cdot SO(df)} = \frac{1}{SO(df)} \tag{3.14}$$

This information can then be programmed into the proposed cancellation algorithm without the need for delay estimation, as the FCW value is known a priori. Note that because the fractional-N spur period automatically tracks the reference clock, the setting of the cancellation scheme is insensitive to changes in the reference clock frequency.

Another factor that affects the sawtooth waveform is the mismatch between the TDC quantization levels, i.e., DNL. The unequal quantization steps result in a sawtooth waveform that is asymmetric around the center DC level; however, the waveform still repeats every $\frac{1}{SO(df)}$ samples, as shown in Fig. 3.14. As a result, the proposed spur cancellation scheme rejects the effects of the TDC's DNL on the fractional-N spurs, avoiding the need to estimate the DNL pattern or apply additional dithering. After the proposed spur cancellation, the residue components are mainly the random pulses caused by the phase differences between the reference input and the DCO output. They are processed by the digital loop filter, as required for proper operation of the PLL.

3.4 Circuit Implementation

3.4.1 Block Diagram of DPLL Prototype

The proposed fractional-N DPLL implementation with the feedforward spur cancellation scheme is illustrated in Fig. 3.15. An LC-based DCO is used for better out-of-band phase noise. The phase of the DCO is quantized by an injection-locked (IL) TDC [14], [15] that generates 28 fine quantization levels within one DCO period, providing 10 ps time resolution. This fine phase code is then combined with the 8-bit integer counter output to generate the complete TDC information. The IL TDC is mainly composed of a seven-stage ring oscillator (RO) that is injection-locked by the DCO.
Because the injection locking forms a 1st-order PLL, the in-band phase noise of the IL RO is dominated by the DCO, whose phase noise is low-pass filtered within the IL bandwidth, while the RO phase noise still dominates beyond the IL bandwidth. To ensure proper operation of the DPLL and minimize the phase noise contribution from the RO, the IL bandwidth is designed to be much greater than the DPLL bandwidth. In this prototype, the designed IL bandwidth can be programmed up to 150MHz and is tunable within 30%.

Figure 3.15: Proposed digital PLL implementation with feedforward multi-tone spur cancellation loop

The proposed feedforward multi-tone spur cancellation loop is inserted at the output of the FCW accumulator such that spurious signals from either external or internal sources can be removed before they modulate the DCO. Alternatively, the spur compensation can be inserted right after the digital loop filter; however, this would require duplicating the DLF response in the D-cycle delay path, and hence was not used in this prototype.

The finite word length of the entire digital logic is carefully chosen and simulated to ensure that the expected performance matches the behavioral simulation result in Matlab, which uses floating-point representation. For instance, if the finite word length in the feedforward cancellation path is not sufficient, it limits the final cancellation accuracy because the spur replica cannot precisely duplicate the pattern of the original spur. As discussed in Section 3.3, to compensate the TDC's DNL effect, the cancellation loop should provide a resolution fine enough relative to the smallest TDC quantization step. On the other hand, the maximum range of the cancellation loop is determined by the largest magnitude of the coupled external spurs or the TDC's DNL. Based on these design considerations, the feedforward loop is chosen to maintain a 14-bit dynamic range with proper signal truncation to avoid excessive digital logic.
More detailed discussions of the cancellation logic are provided in Sections 3.4.2 and 3.4.3. The digital loop filter is composed of proportional-gain and integral paths with programmable type-I and type-II response for prototyping purposes. Following the loop filter, the delta-sigma DAC interface is used to modulate the varactor bank of the LC-based DCO and provide fine frequency resolution on the order of hundredth Hz.

3.4.2 Proposed Averaging Scheme

As discussed in Section 3.2.3, the spur period may not be an integer multiple of the reference clock period, which necessitates both fractional delay and averaging in the spur cancellation loop. Therefore, a signal averaging scheme is proposed that supports both integer and fractional averaging. In principle, the incoming samples should first be downsampled at the fundamental spur frequency, i.e., $\frac{F_{REF}}{D}$. If D is an integer, the downsampler is essentially a flip-flop that is clocked every D cycles, as shown in Fig. 3.16(a); this is referred to as the integer averaging mode. In this case, the subsequent infinite-impulse-response (IIR)-based accumulator sums the subsampled spur constructively. Because the averaging block is instantiated at every internal node of the delay chain, a finite-state machine (FSM) is used to serialize the outputs of all IIR-based accumulators in the correct sequence to reconstruct the spurious pattern with the right phase, as illustrated in Fig. 3.16(b).

Figure 3.16: (a) Proposed averaging scheme in integer mode (b) Representative examples of integer spur period with averaging in integer mode and the reconstruction of spur pattern at z[n]

Figure 3.17: (a) Proposed averaging scheme in fractional mode (b) Representative examples of fractional spur period with averaging in integer mode and fractional mode

If D is a non-integer value, the integer averaging mode no longer aligns the summation phase constructively, as shown in Fig. 3.17.
In this example, D is 10.25 and the integer downsampler is set to the nearest integer, i.e., 10; the averaged waveform is noticeably distorted because the phase error accumulates over time. As a result, fractional averaging should be used, where the key component is the fractional downsampler. In the proposed implementation, the downsampling factor alternates between two integers, M and M + 1, for P and Q cycles, respectively. The average downsampling rate is thus interpolated between M and M + 1 and should approximate D as follows:

$$D = \frac{[P \cdot M + Q \cdot (M + 1)]}{P + Q} \tag{3.15}$$

Additionally, the z-domain transfer function of the fractional averaging is modified as

$$H_{AVG}(z) = \frac{1}{N_{win}} \sum_{i=0}^{N_{win}-1} \left\{ \left(\frac{P}{P+Q}\right) z^{-iM} + \left(\frac{Q}{P+Q}\right) z^{-i(M+1)} \right\} \tag{3.16}$$

In this prototype, the value of (P + Q) can be tuned between 1 and 16, which allows interpolation in steps of 6.25% of one sample delay. To simplify the circuit implementation, the switching pattern is fixed, i.e., P cycles first and then Q cycles. Due to the incomplete cancellation shown in Fig. 3.17(b), the accumulated phase error pattern is still periodic if the switching pattern is not randomized, and hence regenerates spurious tones by itself. In the representative example shown in Fig. 3.17(b), the accumulated error repeats every 51 (= 4D) samples, which is 4 (= P + Q) periods of the spur. Therefore, the location of the regenerated spur ($F_{spur,avg}$) is a function of $F_{spur}$ and (P + Q), as expressed below:

$$F_{spur,avg} = k \cdot \frac{F_{spur}}{P + Q} \tag{3.17}$$

where k is an integer representing the k-th harmonic of the regenerated spur. The magnitude of the regenerated spurs depends on the values of P and Q. Its worst-case value can be estimated via Eq. (3.13) by plugging in the maximum accumulated phase error.
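A possible way to choose M, P, and Q for a given D under the P + Q ≤ 16 constraint is sketched below. This is an assumption-laden illustration rather than the prototype's programming procedure: the helper names are hypothetical, and the closest fraction with a denominator of at most 16 is found with Python's `Fraction.limit_denominator`. The second helper simply evaluates Eq. (3.17), here with an arbitrary 400 kHz spur.

```python
from fractions import Fraction
import math

def fractional_downsampler_settings(D, max_cycles=16):
    """Pick M, P, Q so that (P*M + Q*(M+1)) / (P+Q) approximates D, per Eq. (3.15)."""
    M = math.floor(D)
    frac = Fraction(D - M).limit_denominator(max_cycles)   # target value of Q / (P + Q)
    Q, total = frac.numerator, frac.denominator
    P = total - Q
    return M, P, Q

def regenerated_spur_offsets(f_spur, P, Q, harmonics=3):
    """Eq. (3.17): k * F_spur / (P + Q) for k = 1..harmonics."""
    return [k * f_spur / (P + Q) for k in range(1, harmonics + 1)]

M, P, Q = fractional_downsampler_settings(10.25)
average_rate = (P * M + Q * (M + 1)) / (P + Q)
print(f"M={M}, P={P}, Q={Q}, average downsampling rate={average_rate}")
print("regenerated spur offsets (Hz):", regenerated_spur_offsets(400e3, P, Q))
```

For D = 10.25 the fractional part is exactly one quarter, so three cycles at M = 10 followed by one cycle at M + 1 = 11 reproduce the target rate without approximation error.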
Figure 3.18: Illustration of proposed spur cancellation with (a) single-stage configuration and (b) two-stage configuration for two series of spurs

3.4.3 Modular Extension: Cascaded Cancellation Loops

Up until this point, the spurs have been considered to be in harmonic relation, which is referred to as one series of spurs. However, the spurs may be in non-harmonic relation, i.e., a mixture of multiple series of spurs. For instance, if there are two series of spurs with spur periods $D_1$ and $D_2$, respectively, the combined spur period can be expressed and bounded by

$$\min(D_1, D_2) \le D = \mathrm{LCM}(D_1, D_2) \le \prod_{n=1}^{2} D_n \tag{3.18}$$

where LCM represents the least common multiple operator. According to (3.18), the worst case occurs when $D_1$ and $D_2$ are co-prime. In this case, the spur cancellation loop is programmed with a delay of $D_1 \cdot D_2$, which may require a long delay chain, as shown in Fig. 3.18(a). Alternatively, the proposed cancellation scheme can be extended by cascading two cancellation loops with their delays individually set to $D_1$ and $D_2$, respectively, as shown in Fig. 3.18(b). In this case, the implemented delay chain reduces to $D_1 + D_2$, which is significantly smaller than $D_1 \cdot D_2$. When cancellation stages are cascaded, careful analysis and simulation should be performed to ensure sufficient phase margin.

In this prototype, a maximum of 128 delay stages is implemented for both $D_1$ and $D_2$, which means that the lowest spur frequency that can be cancelled is 234 kHz given a reference clock frequency of 30MHz. As indicated in Eq. (3.14), if the reference clock frequency is lowered, the cancellation loop can address proportionally lower spur frequencies. Alternatively, the number of delay stages can be increased at the cost of additional hardware.
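The hardware saving of the cascaded configuration and the lowest correctable spur frequency follow from simple arithmetic, sketched below. The spur periods in the usage lines are illustrative assumptions; the 30MHz reference and 128 delay stages are the prototype values quoted above.

```python
import math

def delay_chain_lengths(d1, d2):
    """Delay stages needed by a single-stage loop (LCM) and by two cascaded loops (sum)."""
    single_stage = d1 * d2 // math.gcd(d1, d2)   # least common multiple of the two periods
    cascaded = d1 + d2
    return single_stage, cascaded

def lowest_spur_frequency(f_ref_hz, max_delay_stages):
    """Lowest spur frequency whose period still fits in the delay chain."""
    return f_ref_hz / max_delay_stages

print("co-prime periods 7 and 9:", delay_chain_lengths(7, 9))
print("periods 8 and 12:", delay_chain_lengths(8, 12))
print("lowest spur frequency (Hz):", lowest_spur_frequency(30e6, 128))
```

Co-prime periods are the worst case for the single-stage loop, since the LCM then equals the product, whereas the cascaded loops only need the sum of the two periods.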
It is therefore a tradeoff between hardware and reference clock frequency that sets the minimum correctable spur frequency, depending on the needs of the intended application.

3.4.4 Deployed Analog and Isolation Techniques

To minimize the reference spur (which cannot be mitigated via the cancellation loop), this prototype relies on analog techniques with careful layout and isolation to minimize reference spur coupling. First, the digital core itself is a severe aggressor to the DCO. A deep N-well trench with a width > 100 µm is used to increase the separation between the DCO and the digital core. The DCO cross-coupled pair is guarded with another isolation layer to suppress local clock bouncing. Second, the DCO supply uses a standalone regulator, separated from the TDC and digital-core supplies. The output of the DCO is routed differentially from the output driver all the way to the equipment to minimize common-mode noise disturbance. EM simulation is performed during PCB design to guarantee that the noise coupling from other traces is < -60dB. Finally, the input reference clock is routed differentially throughout the chip with proper shielding to minimize coupling. The layout of the reference clock traces is iterated to minimize crosstalk based on post-layout simulation results.

Figure 3.19: Die micrograph

Figure 3.20: Test chip measurement setup

3.5 Experimental Results

The silicon prototype was implemented in a 65nm CMOS process with an active area of 0.77 mm², as shown in Fig. 3.19. The analog part of the digital PLL consumes 17.2mA, while the output buffers consume 4.5mA from a 1V supply. The digital part dissipates 1.7mA in normal DPLL operation, and each of the spur cancellation loops consumes an additional 2mA. The measurement setup for the test prototype is shown in Fig. 3.20.
The Si5341 evaluation board generates a relatively clean reference clock source for the test chip, and the output of the DPLL is analyzed with the phase noise module of the E4440A spectrum analyzer. For the external spur measurement, the AFG3252 arbitrary waveform generator is used to generate external single- or multi-tone interference for the spur cancellation experiment.

To ensure that the proposed multi-tone spur cancellation loop does not disturb the DPLL operation, the digital PLL phase noise profile (Fig. 3.21) is measured at 3.57GHz with and without activating the cancellation loop in the absence of external or internal spurs, i.e., with the DPLL operating in integer-N mode. The phase noise profile remains the same in both cases, which confirms that the cancellation loop does not interfere with normal digital PLL operation. The measured in-band phase noise is -103dBc at 100kHz frequency offset and the out-of-band phase noise is -122dBc at 3MHz frequency offset. The integrated phase noise of -38.1dBc (RMS jitter of 557fs) is measured from 10kHz to 40MHz.

Figure 3.21: Measured digital PLL phase noise profile at 3.57GHz in the absence of external or internal spur (a) without and (b) with the proposed spur cancellation activated

Figure 3.22: Measured digital PLL phase noise profile given average cycle ($N_{win}$) of (a) 2 and (b) 32

Figure 3.23: (a) Spectrum snapshot of fractional spurs at FCW = 119.25 and (b) measured worst-case fractional spur level across different FCW setting

The effectiveness of signal averaging in the cancellation loop is shown in Fig. 3.22. In this case, an external sawtooth disturbance is also injected at 234kHz frequency offset from the carrier frequency of 3.57GHz, which is within the DPLL bandwidth. With a small $N_{win}$ value, such as 2, severe noise degradation is observed in the phase noise profile not only within the DPLL bandwidth but also outside the band. When the $N_{win}$ value increases, the noise degradation is mitigated, as shown in Fig.
3.22(b). The measured phase noise profiles confirm that the averaging block is effective in avoiding the elevation of the phase noise floor discussed in Section 3.2.1.

The spectrum snapshot of the fractional-N spur before and after enabling the spur cancellation scheme is shown in Fig. 3.23(a). In Fig. 3.23(b), fractional-N spurs are measured across different fractional code settings; the worst-case spur ranges from -73.66 to -117dBc, indicating a 20-50dB improvement. Reference spur performance is reported in Figs. 3.24(a) and 3.24(b), sweeping the entire DPLL operating frequency range from 3.2 to 4.8GHz. The plot shows spurious levels from -110.1 to -116.1dBc, owing to the deployed isolation and analog techniques.

Figure 3.24: (a) Spectrum snapshot of reference spur measurement at carrier frequency of 3.57GHz and (b) measured reference spur across the entire synthesizer operation range

Figure 3.25: Measured (a) fundamental and (b) 2nd harmonics of external spur reduction in single-stage cancellation

To demonstrate the modular extension capability of the proposed cancellation scheme, the external spur reduction is measured in single-stage and cascaded two-stage configurations. For the single-stage cancellation loop, a sawtooth disturbance is injected intentionally via the input buffer supply. From Figs. 3.25(a) and 3.25(b), the spurious tone is reduced by 15-35dB on both its fundamental and 2nd harmonic. The higher-order harmonics are not shown as they are buried underneath the phase noise floor after cancellation. For the cascaded two-stage cancellation, two independent series of triangular spurious tones located at 312 and 495kHz offset from the carrier frequency are combined off-chip and then injected. After cancellation, both series of spurs are suppressed below the noise floor, as shown in Fig. 3.26.

Figure 3.26: Measured external spur reduction in cascaded two-stage cancellation configurations (a) without and (b) with activating the spur cancellation scheme
Table 3.1 lists the key highlights in comparison with state-of-the-art PLLs. The table shows that this work achieves the lowest reported reference spurious levels and worst-case fractional spurs. It also demonstrates the unique capability to mitigate externally coupled spurs from the input paths. The proposed technique avoids any form of dithering, such that there is no elevated noise floor. Moreover, the effectiveness of the cancellation scheme is not limited by the TDC's DNL effect, and hence extra DNL calibration circuitry is not needed.

Table 3.1: Comparison with state-of-the-art digital PLLs

3.6 Conclusion

A 3.2-4.8GHz fractional-N digital PLL with a feedforward multi-tone spur cancellation scheme is presented. To achieve low spurious performance, the feedforward multi-tone spur cancellation scheme is proposed to nullify internal spurs, including those caused by the TDC quantization steps and its DNL effect, and external spurs coupled from the SoC environment. The worst-case fractional spur is measured to range from -73 to -117dBc with different fractional code settings. With the cancellation loop configured as single-stage and cascaded two-stage, the external spur improves by 15 to 35dB. Furthermore, owing to the explored analog techniques, a reference spur of -110.1 to -116.1dBc is measured from 3.2 to 4.8GHz. The proposed cancellation scheme benefits from the DSP algorithm, which suggests potential beyond conventional PLL architectures.

Chapter 4

A Digital PLL with Background Dither Noise Cancellation for Near-Carrier In-Band Spur

4.1 Fractional-N Spurs at Near-Carrier Frequencies

A DPLL with a feedforward multi-tone spur cancellation scheme [16] was introduced to simultaneously cancel not only TDC quantization errors but also the TDC's DNL effects. However, this multi-tone spur cancellation technique involves a challenge.
When a DPLL needs to operate at a near-integer frequency (i.e., at a very small fractional FCW value), an in-band fractional spur near the carrier frequency (i.e., a near-carrier fractional spur) is generated. Although feedforward multi-tone spur cancellation can mitigate the spur, the hardware complexity increases significantly. As illustrated in Figure 4.1, when the spur period increases from $2^2$ to $2^{14}$ samples, the number of D flip-flops in the feedforward multi-tone spur cancellation also increases. The overall hardware complexity increases in proportion to $\frac{1}{SO(df)}$, an undesirable exponential function annotated as $O(\frac{1}{SO(df)})$. Therefore, in order to minimize this overhead, this chapter introduces a dithering technique to randomize periodic spurious patterns and presents a hardware implementation that is linearly dependent on spur periods (i.e., $\log_2(SO(df))$).

Figure 4.1: The hardware complexity involved in a feedforward multi-tone spur cancellation when a spur period increases from $2^2$ to $2^{14}$

Applying dithering to reduce spurious tones is a common technique, but a side effect is the introduction of dithering noise into the PLL loop, which should be removed; otherwise, the dithering noise can severely degrade phase noise performance. To nullify the dithering noise, a feedforward noise cancellation scheme [5] can be applied to compensate for the degradation at the cost of higher complexity. However, due to its two-phase calibration and cancellation routines, this technique loses real-time voltage and temperature tracking capabilities. Alternatively, to avoid the noise cancellation overhead, a two-level dithering scheme [46] can be adopted to limit the noise degradation, but it offers only minimal spur reduction and is therefore suitable only when the spurious tone requirements are not stringent.

Figure 4.2: The generation of fractional-N spurs in the DPLL with a frequency accumulation path
To resolve the aforementioned challenges, the technique proposed in this chapter applies dithering to the input reference clock to randomize the fractional spur pattern. An adaptive filter is then applied to cancel the additional dithering noise in the background, which addresses any PVT variation. As a result, the technique allows large dithering signals while keeping the impact on the noise floor minimal. This leads to more thoroughly randomized phase error patterns and low spur magnitudes.

4.2 Fractional-N Spurs in DPLLs

4.2.1 The Generation of Fractional-N Spurs

Fractional-N operation causes fractional-N spurs at the PLL output. The generation of the fractional spur can be studied using the frequency-accumulation DPLL illustrated in Fig. 4.2. Due to the continuous injection of an accumulated phase offset into the DPLL loop via the FCW path, the phase error pushes the DCO to operate at a fractional frequency. Unfortunately, a finite word length mismatch occurs between the quantization step of the TDC and the accumulated FCW path. This mismatch introduces a periodic, sawtooth-like phase error at the output of the subtractor, referred to as $\phi_{SUM}$. The expression for $\phi_{SUM}$ can be written as the difference between the accumulator's phase ($\phi_{ACC}$) and the TDC's output phase ($\phi_{TDC}$):

$$\phi_{SUM}[k] = \phi_{ACC}[k] - \phi_{TDC}[k] = (N_{int} + df) k T_{REF} - (\phi_{INT}[k] + \phi_{FRAC}[k] + \phi_{QUAN}[k]) = -\phi_{QUAN}[k] \tag{4.1}$$

Here $(N_{int} + df) k T_{REF}$, $\phi_{INT} + \phi_{FRAC}$, and $\phi_{QUAN}$ denote the accumulated FCW phase, the DCO phase, and the TDC nonlinearities, respectively. In the steady state, the two phase ramps from the accumulated FCW phase and the DCO phase cancel each other, but the TDC nonlinearities remain inside the DPLL loop. As the TDC nonlinearities appear in a sawtooth pattern, they modulate the DCO and generate harmonic tones with a fundamental frequency equal to $F_{SPUR}$ (= $F_{REF} \cdot SO(df)$). As a result, the TDC nonlinearities can be modeled as superpositions of Fourier components as follows.
$$\phi_{QUAN}[k] = \sum_{n=1}^{\infty} A_n \cos\left(2\pi n F_{SPUR}\, k\, T_{REF}\right) \tag{4.2}$$

$A_n$ represents the magnitude of the $n$-th harmonic of the fractional spur, normalized to 1 LSB of the TDC's quantization step. Note that the TDC's nonlinearities are confined within 1 LSB of the TDC's quantization level, but if the TDC's differential nonlinearity (DNL) is considered, the fractional spur magnitudes can be approximated as having an additional gain variation in each $A_n$.

Since the TDC's nonlinearities are generated internally, they can be filtered by the DPLL closed-loop low-pass response. However, if the frequency of a fractional spur is well within the 3 dB DPLL bandwidth ($F_{SPUR} \ll BW_{3dB}$), the spurious component cannot be suppressed by the loop filter. The power spectral density of each spurious tone at the DPLL output can be derived as follows.

$$P_{FRAC}(n) = 20\log_{10}\left\{\frac{K_{DCO}}{2\pi n F_{SPUR}}\, A_n\, H_{LPF}(2\pi n F_{SPUR})\right\} \approx 20\log_{10}\left\{\frac{K_{DCO}}{2\pi n F_{SPUR}}\, A_n\right\} \quad \text{if } nF_{SPUR} \ll BW_{3dB} \tag{4.3}$$

where $H_{LPF}$ denotes the DPLL closed-loop low-pass response from $\phi_{SUM}$ to the DPLL output in the s-domain. However, for Eq. (4.2) to be valid, one must assume that the phase disturbance is smaller than one full DCO period. In general, if the power level of the spur is less than -20 dBc, Eq. (4.3) is a favorable approximation.

Figure 4.3: The DPLL block diagram with reference-path dithering

4.2.2 Phase Randomization in Injection-Locked TDC

To minimize the spur energy described by Eq. (4.2), one can introduce an additional term inside the sinusoidal expression; consequently, the sawtooth spur pattern cannot preserve its original form. If this additional term is time-varying and Gaussian distributed, the spur energy is spread as noise. To achieve this, one can dither the reference clock path to add a timing shift to the reference clock, as illustrated in Fig. 4.3. With dithering, the TDC output experiences significant disturbances, and the subtractor output no longer preserves the original sawtooth pattern; rather, it behaves as if it were random noise.
However, this noise pattern modulates the DCO and degrades DPLL performance. To understand the impact of dithering, a mathematical model of the DPLL phase with dithering should be examined. Note that the TDC input consists of not only the original DCO phase but also the introduced dither signal (i.e., $\phi_{DCO}(t) + \phi_{Dither}(t)$). After quantization by the TDC, the quantization error appears, and the phase at the TDC output can be derived as follows.

$$\phi_{TDC}[k] = \phi_{DCO}[k] + \phi_{Dither}[k] + \phi'_{QUAN}[k] = \phi_{INT}[k] + \phi_{FRAC}[k] + \phi_{Dither}[k] + \phi'_{QUAN}[k] \tag{4.4}$$

$\phi'_{QUAN}[k]$ represents the new quantization error with dithering, which differs from the quantization error shown in Eqs. (4.1) and (4.2). After combining with the accumulated FCW path, Eq. (4.4) can be substituted into Eq. (4.1). The phase at the subtractor output then becomes

$$\phi_{SUM}[k] = -\phi_{Dither}[k] - \phi'_{QUAN}[k] \tag{4.5}$$

In Eq. (4.5), the term $\phi_{Dither}[k]$ is the original dither noise, which can severely degrade the DPLL phase noise and therefore calls for a dither noise cancellation scheme. Moreover, the term $\phi'_{QUAN}[k]$ is no longer a sawtooth-like pattern because the dithered reference clock re-samples the waveform shown in Eq. (4.2). As a result, a phase error term is effectively introduced, and Eq. (4.2) can be modified as follows.

$$\phi'_{QUAN}[k] = \sum_{n=1}^{\infty} A_n \cos\left\{2\pi n\left(F_{SPUR}\, k\, T_{REF} + \frac{T_{Dither}[k]}{T_{DCO}}\right)\right\} \tag{4.6}$$

where $T_{Dither}[k]$ denotes the timing shift at the $k$-th sample due to dithering. Specifically, the timing shift can be modeled as the dither code injected at the $k$-th sample, $D_{PN}[k]$, multiplied by the digital-to-time converter (DTC) gain $G_{DTC}$. Note that this timing shift should be normalized to the DCO period and spur frequency. As illustrated in Fig. 4.4, which compares the error pattern with and without dithering, the overall phase shift term $T_{Dither}[k]/T_{DCO}$ phase-modulates the sawtooth pattern shown in Eq. (4.2).

Figure 4.4: A time-domain illustration of the randomization of a fractional spur error pattern (a) with and (b) without the proposed dithering scheme
If the phase shift term is random for every sample, it helps to whiten the spurious energy.

The overall range covered by dithering needs to be designed carefully in order to guarantee the full randomization of a spur pattern. Eq. (4.6) shows that if $T_{Dither}[k]/T_{DCO}$ is significantly smaller than $F_{SPUR}\, k\, T_{REF}$, the dithering effect is negligible and the spurious tone does not change. Hence, to achieve full randomization, the maximum value of the dithering signal, $T_{Dither,max}$, should be equal to or greater than one DCO period, such that the instantaneous phase shift covers the range from 0 to $2\pi$ when the lowest frequency of a fractional spur is considered (i.e., when $n$ is equal to one):

$$T_{Dither,max} = D_{PN,max}\, G_{DTC} \geq T_{DCO} \tag{4.7}$$

where $D_{PN,max}$ represents the peak value of the injected dithering code. Note that Eq. (4.7) guarantees the full randomization of the fundamental tone of a fractional spur, so the higher-order harmonics of the fractional spur are randomized accordingly. To verify the efficacy of Eq. (4.7), the improvement of the worst-case fractional spur for different settings of $T_{Dither,max}/T_{DCO}$ is shown in the measurement section, where the peak value of the injected dithering code can be changed digitally. The measurement shows that the best improvement is achieved when the dithering range set by $D_{PN,max}$ is equal to one DCO period.

With dithering, Eq. (4.4) can be interpreted as follows: the shifted reference clock exercises different quantization levels such that the TDC's nonlinearity, including the TDC's DNL effect, is randomly shuffled. This implies that if the injected dither code cannot exercise all possible TDC quantization levels, the final randomization effect on the fractional spur degrades.

Figure 4.5: A time-domain illustration of the randomization of a fractional spur error pattern (a) with and (b) without the proposed dithering scheme
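The range condition of Eq. (4.7) can be illustrated with a short numerical sketch. It is a simplified model under stated assumptions and not the thesis's MATLAB behavioral model: only the fundamental term of Eq. (4.6) is kept with A_1 = 1, the dither code is drawn from an ideal uniform random source instead of a Hadamard code generator, and the function name, default values, and seed are hypothetical.

```python
import cmath
import math
import random


def fundamental_tone(df, dither_range, levels=16, n_samples=1 << 14, seed=1):
    """Magnitude of the spur fundamental in cos(2*pi*(df*k + T_dither[k]/T_DCO)).

    df           : fractional part of the FCW, so the spur sits at df * F_REF
    dither_range : peak dither span, normalized to one DCO period
    levels       : number of equally spaced dither levels (assumed uniform)
    """
    rng = random.Random(seed)
    acc = 0j
    for k in range(n_samples):
        # Timing shift normalized to the DCO period (assumption: ideal DTC).
        shift = dither_range * rng.randrange(levels) / levels
        sample = math.cos(2 * math.pi * (df * k + shift))
        # Correlate against the spur frequency to extract the tone amplitude.
        acc += sample * cmath.exp(-2j * math.pi * df * k)
    return 2 * abs(acc) / n_samples


for span in (0.0, 0.25, 0.5, 1.0):
    print(f"dither span {span:4.2f} T_DCO -> fundamental {fundamental_tone(2 ** -10, span):.3f}")
```

In this simplified model the tone keeps its full amplitude without dithering, is only partly reduced when the dither spans a fraction of the DCO period, and is reduced to a small random residual when the dither spans one full DCO period.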
This degradation was observed in the prior study [46], in which two-level dithering was developed with limited spurious improvements. For the proposed scheme, in order to achieve an optimal randomization effect, 32-level dithering should be used to exercise the 28 TDC quantization levels. In practice, a numerical simulation showed that 16-level dithering already achieves the desired randomization effect.

Besides the previous two design constraints, the length of the pseudo-random (PN) pattern also affects randomization. As illustrated in Fig. 4.5, two different dithering pattern lengths cause distinct randomization effects on a spur pattern with a period of $2^{10}$ samples. With a dithering length of $2^{5}$, the TDC output can still repeat every $2^{10}$ samples, and the dithering causes only local randomization. With a dithering length of $2^{10}$, the TDC output no longer repeats every $2^{10}$ samples; rather, it repeats with a longer period. This is because the dithering smears spur energy into the low-frequency regime, so it takes longer for the TDC nonlinearities to repeat themselves. In the proposed scheme, a Hadamard code generator generates the PN codes used for dithering. Note that the PN code pattern repeats every $2^{m}$ cycles if an $m$-stage Hadamard code generator is implemented, in which each stage can be realized simply with D flip-flops. The hardware complexity of the Hadamard code generator increases only linearly (e.g., from 5 stages to 20 stages) even if the dithering length increases exponentially, as exhibited in Fig. 4.5. However, if the dithering length is shorter than the period of the spur, the randomization effect in Eq. (4.6) degrades. Hence, one should ensure the inequality $2^{m} \geq 1/SO(df)$. Fig. 4.6 shows numerical simulation results for the worst-case fractional spur with different numbers of stages in the Hadamard code generator.
The simulation was performed with a DPLL behavioral model in MATLAB in which the DPLL was set to operate in fractional-N mode with an FCW of $120 + 2^{-10}$.

Figure 4.6: A simulation with the worst-case fractional spur improvement under different stages of the Hadamard code generator

Considering Eq. (4.5), the dither noise component exists at the TDC output, and it dominates the noise floor because $T_{Dither,max}$ covers a range of one DCO period. To avoid unwanted noise degradation, $\phi_{Dither}[k]$ should be cancelled prior to the DCO; this is discussed in Section 4.3. After $\phi_{Dither}[k]$ is removed, the randomized sawtooth pattern $\phi'_{QUAN}[k]$ is not nullified by the proposed dither noise cancellation scheme. Hence, the randomized spurs create additional noise that adds to the DPLL's phase noise profile. The additional noise degradation in power spectral density can be written as follows.

$$L_{SSB} = 10\log_{10}\left\{\left(\frac{T_{LSB}}{T_{ILTDC}}\, 2\pi\right)^{2} \frac{2}{12 F_{REF}}\right\} = 10\log_{10}\left\{\left(\frac{2\pi}{28}\right)^{2} \frac{1}{6 F_{REF}}\right\} \tag{4.8}$$

4.3 Background Dither Noise Cancellation

Three parameters affect the cancellation of $\phi_{Dither}[k]$: the DTC transfer function, the DPLL closed-loop response, and the original PN code. To illustrate how dither noise cancellation is achieved, the rationale for the proposed cancellation loop is shown in the z-domain (Fig. 4.7). The PN code $D_{PN}$ is first converted into the phase domain (i.e., $\phi_{PN}[k]$) and multiplied by the DTC transfer function $h_{dtc}[k]$ via a tunable delay buffer, which is discussed in Section 4.4.1. Without any cancellation, the dither noise passing through the DTC sees a high-pass transfer function and appears at the DLF input as $\phi_{e}[k] = \phi_{PN}[k] \ast h_{loop}[k]$.

Figure 4.7: A DPLL mathematical model with injected dither noise and a corresponding cancellation path in z-domain
Assuming the same dithering information, $\phi_{PN}[k]$, is fed into the DPLL loop via the adaptive filter response $h_{af}[k]$, this noise disturbance observes a similar high-pass transfer function that involves the response $h_{af}[k]$ instead of the DTC response $h_{dtc}[k]$. If both injection paths are considered simultaneously, $\phi_{SUM}$ can be written as follows.

$$\phi_{SUM}[k] = \phi_{PN}[k] \ast h_{loop}[k] - \phi_{PN}[k] \ast h_{can}[k] = \phi_{PN}[k] \ast \left(h_{loop}[k] - h_{can}[k]\right) \tag{4.9}$$

Based on Eq. (4.9), $\phi_{PN}[k]$ can be extracted as a common term, which implies that $D_{PN}$ does not affect the adaptation of the cancellation loop. That is, if the equilibrium ($h_{loop}[k] = h_{can}[k]$) is achieved, the dithering noise does not appear at the DPLL output. However, making $h_{can}[k]$ equal to $h_{loop}[k]$ is very challenging because of PVT effects. As the DTC transfer function cannot be known a priori, the cancellation path via $h_{af}[k]$ will always introduce mismatches and cause incomplete cancellation at the DLF input. This issue can be resolved by leveraging an adaptive filter algorithm. To minimize $\phi_{e}[k]$, the residual error energy is fed back to adjust the weighting of the adaptive filter $h_{af}[k]$. The corresponding cost function at the DLF input is

$$\min E\left[\phi_{e}^{2}[k]\right] = \min E\left[\left(\phi_{PN}[k] \ast \left(h_{loop}[k] - h_{can}[k]\right)\right)^{2}\right] \tag{4.10}$$

where the transfer functions $h_{loop}[k]$ and $h_{can}[k]$ can be expressed in the z-domain as follows.

$$\begin{cases} H_{LOOP}(z) = \dfrac{N_{div}\, H_{DTC}(z) \left(\frac{1 - z^{-1}}{z^{-1}}\right)^{2}}{\left(\frac{1 - z^{-1}}{z^{-1}}\right)^{2} + \left(\frac{1 - z^{-1}}{z^{-1}}\right) K_{D} K_{DCO} + K_{D} K_{I} K_{DCO}} \\[3ex] H_{CAN}(z) = \dfrac{H_{AF}(z) \left(\frac{1 - z^{-1}}{z^{-1}}\right)^{2}}{\left(\frac{1 - z^{-1}}{z^{-1}}\right)^{2} + \left(\frac{1 - z^{-1}}{z^{-1}}\right) K_{D} K_{DCO} + K_{D} K_{I} K_{DCO}} \end{cases} \tag{4.11}$$

Both transfer functions can be converted into the s-domain by replacing $\frac{1 - z^{-1}}{z^{-1}}$ with $s$, based on the forward Euler transformation. If the frequency response of the adaptive filter $H_{AF}(z)$ gradually approaches $N_{div}\, H_{DTC}(z)$, the cancellation converges to an optimal point at which $\phi_{e}[k]$ is minimized to zero (i.e., an optimal point at which the cancellation is complete).
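The high-pass shaping shared by the two paths in Eq. (4.11) can be checked numerically with a short sketch. The loop gains used here are arbitrary illustrative values and not the thesis's design values, the frequency is normalized to the reference rate, and the DTC and adaptive filter responses are assumed to be frequency-flat gains; the function names are hypothetical.

```python
import cmath
import math


def high_pass(freq_norm, kd, ki, kdco):
    """Evaluate x^2 / (x^2 + x*Kd*Kdco + Kd*Ki*Kdco) with x = (1 - z^-1) / z^-1."""
    z_inv = cmath.exp(-2j * math.pi * freq_norm)  # z^-1 on the unit circle
    x = (1 - z_inv) / z_inv
    return x * x / (x * x + x * kd * kdco + kd * ki * kdco)


def residual(freq_norm, dtc_gain, af_gain, kd=0.1, ki=0.01, kdco=1.0):
    """|H_LOOP - H_CAN| when N_div*H_DTC and H_AF are flat gains (assumption)."""
    return abs((dtc_gain - af_gain) * high_pass(freq_norm, kd, ki, kdco))


for f in (1e-4, 1e-3, 1e-2, 0.25):
    print(f"f/F_REF = {f:g}: |H_HP| = {abs(high_pass(f, 0.1, 0.01, 1.0)):.4g}, "
          f"residual with 5% gain mismatch = {residual(f, 1.0, 0.95):.4g}")
```

With these example gains the response is strongly attenuated at low offsets and close to unity near a quarter of the reference rate, and the residual vanishes at every frequency once the two gains match.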
In the proposed scheme, the adaptive filter uses a least-mean-square (LMS) algorithm that continuously minimizes the mean square error of $\phi_{e}[k]$. Taking the derivative of the cost function in Eq. (4.10) with respect to the adaptive filter response $H_{AF}(z)$ yields the following adaptive update function.

$$w[k+1] = w[k] - 2\mu_{step}\, \phi_{PN}[k] \ast h_{hp}[k] \tag{4.12}$$

$h_{hp}[k]$ is the DPLL closed-loop high-pass response. It can be expressed as the following z-domain transfer function.

$$H_{HP}(z) = \frac{\left(\frac{1 - z^{-1}}{z^{-1}}\right)^{2}}{\left(\frac{1 - z^{-1}}{z^{-1}}\right)^{2} + \left(\frac{1 - z^{-1}}{z^{-1}}\right) K_{D} K_{DCO} + K_{D} K_{I} K_{DCO}} \tag{4.13}$$

Eq. (4.13) can be implemented in the digital domain because most of its parameters are known a priori. A duplicated transfer function can be developed as follows.

$$H_{Dup,HP}(z) = \frac{\left(\frac{1 - z^{-1}}{z^{-1}}\right)^{2}}{\left(\frac{1 - z^{-1}}{z^{-1}}\right)^{2} + \left(\frac{1 - z^{-1}}{z^{-1}}\right) K_{D} K'_{DCO} + K_{D} K_{I} K'_{DCO}} \tag{4.14}$$

Note that the only unknown parameter in Eq. (4.13) is $K_{DCO}$, which is replaced by the fixed parameter $K'_{DCO}$. However, the value preprogrammed in $K'_{DCO}$ can affect the final cancellation accuracy. For example, if $K'_{DCO}$ is initially set to zero, the response $H_{Dup,HP}(z)$ degenerates into an all-pass filter. As a result, the update function in Eq. (4.12) responds as if $H_{LOOP}(z)$ were an all-pass response. Eq. (4.12) may then not settle to the optimal condition, because the estimation error due to $K'_{DCO}$ affects the final cancellation result. A detailed implementation that compensates for this estimation error is discussed in Section 4.4.2.

4.4 Circuit Implementations

4.4.1 The Proposed DPLL Architecture

The overall DPLL architecture is shown in Fig. 4.8. The first key component is the dithering scheme used to randomize the fractional-N spurious pattern.
The input reference path is implemented with a tunable delay buffer chain, controlled by a Hadamard code generator, that delays the rising edge of the reference clock by $D_{PN}\, G_{DTC}$ before it is compared with the LC-DCO phase. The background dither noise cancellation loop, which is the other crucial component of the proposed architecture, is inserted before the DLF to cancel the introduced dither noise. The DLF's output controls an oversampled delta-sigma modulator that toggles a bank of fine-resolution varactors in the LC-DCO. An IL-TDC tracks the DCO frequency and phase variations over PVT. Moreover, the IL-TDC provides 28 levels of fine phase quantization within one DCO period. Finally, an accumulated FCW is subtracted at the TDC's output to allow integer-N and fractional-N operation.

Figure 4.8: The overall proposed DPLL architecture

4.4.2 Implementation of the Digital-to-Time Converter

Figure 4.9 shows the implementation of the reference clock buffer, the DTC, and the dithering block. An off-chip reference clock is sent differentially into the on-chip CML buffer in order to minimize common-mode noise. The differential clock is then converted into a square waveform by a differential-to-single-ended (D2S) amplifier to improve the clock's resilience to jitter. Note that differential shielding is drawn along the clock path all the way from the PCB to the D2S block in order to suppress any unwanted crosstalk to the DCO, which would induce a reference spur. This technique was proven to be effective in the prior study [17]. The dithering signal is generated by a 20-stage Hadamard code generator with a 16-level output, creating a random sequence of 4-b binary dither codes. The 4-b dither codes are converted into 16-b thermometer codes ($D_{ctrl}$) to control a bank of varactors in the DTC.

Figure 4.9: The implementation of the proposed dither and dither removal scheme
The 4-b binary dither codes are also fed into the background dither noise cancellation loop, which is described in Section 4.4.2. Note that $D_{ctrl}$ is retimed by the falling edge of the D2S output, $REF_{CLK}$, and this retiming mechanism gives $D_{PN}$ a grace period of $\frac{1}{2F_{REF}}$ to settle. Consequently, $D_{PN}$ can always resolve and settle before the next rising edge of $REF_{CLK}$. In addition, as illustrated in Fig. 4.9, this retiming scheme keeps glitches caused by switching the delay of the CML and D2S buffers away from the rising edges of the reference clock. However, this mechanism introduces a DFF delay, which can be compensated by inserting an additional delay unit at the input of the background dither noise cancellation loop.

4.4.3 Implementation of the Dithering Noise Cancellation Loop

Figure 4.10 shows the implementation of the proposed background dither noise cancellation loop, with the dither code $D_{PN}$ shifted by one delay to compensate for the latency caused by the negative-edge retiming in Fig. 4.9. The bias of $D_{PN}$ is removed before the code is sent into the cancellation loop to generate the compensation signal $\phi_{Dnoise}$. The cancellation loop is composed of two critical blocks: the high-pass filter $H_{Dup,HP}(z)$, which duplicates the closed-loop DPLL high-pass response, and a one-tap adaptive FIR filter with the updated weighting $w_{1}$, which adjusts for the DTC gain variations caused by PVT effects. As mentioned in Section 4.3, if the value of the estimated $K'_{DCO}$ is incorrect, the estimation error propagates through the entire adaptive loop and affects the compensation signal $\phi_{Dnoise}$ before it is injected into the loop. To compensate for this estimation error, one can insert a compensation block that mitigates the mismatch between $K_{DCO}$ and $K'_{DCO}$. This compensation block is the ratio of $H_{HP}(z)$ to $H_{Dup,HP}(z)$, as shown in Eq. (4.15).

Figure 4.10: The implementation of the proposed background dither noise cancellation loop with a one-tap configuration
$$H_{COMP}(z) = \frac{\left(\frac{1 - z^{-1}}{z^{-1}}\right)^{2} + \left(\frac{1 - z^{-1}}{z^{-1}}\right) K_{D} K'_{DCO} + K_{D} K_{I} K'_{DCO}}{\left(\frac{1 - z^{-1}}{z^{-1}}\right)^{2} + \left(\frac{1 - z^{-1}}{z^{-1}}\right) K_{D} K_{DCO} + K_{D} K_{I} K_{DCO}} \tag{4.15}$$

This compensation block can be inserted into the update path, as illustrated in Fig. 4.11(a). The product of $H_{Dup,HP}(z)$ and $H_{COMP}(z)$ restores $H_{HP}(z)$. The implementation of Fig. 4.11(a) can be modified into that of Fig. 4.11(b) by moving the compensation block to the output of the update function without changing the block's functionality. The compensation block can therefore be merged with the adaptive FIR filter design. A MATLAB numerical simulation showed this to be relatively simple.

Figure 4.11: A digital implementation of the update function Eq. (4.12) with the compensation block inserted at (a) the input node and (b) the output node

Considering Eq. (4.15), note that the compensation block is an IIR response, which increases the implementation difficulty of the adaptive filter. Moreover, the unknown $K_{DCO}$ term is still embedded in Eq. (4.15). To address this, some assumptions can be made to simplify Eq. (4.15). In-band dither noise cancellation matters more than out-of-band dither noise cancellation, so the behavior of the compensation block $H_{COMP}(z)$ at low frequencies is relatively important. At low frequencies, the expression can be written as follows.

$$H_{COMP}(z) \approx \frac{K'_{DCO}}{K_{DCO}} \tag{4.16}$$

This expression is only a gain scaling factor for the final compensation signal $\phi_{Dnoise}[k]$, and it turns out that this factor can be generated automatically by the LMS loop. The weighting of the one-tap FIR becomes $w'_{1}$, which is equal to $w_{1}\frac{K'_{DCO}}{K_{DCO}}$ in the steady state. Note that even if the value of $K_{DCO}$ is not known, the adaptive loop accounts for the gain automatically. If the architecture of Fig. 4.11(a) were used rather than that of Fig. 4.11(b), the gain factor of Eq. (4.16) would be more difficult to implement.

To consider the accuracy of cancellation at high frequencies, Eq. (4.15) can be approximated as in Eq. (4.17).

$$H_{COMP}(z) \approx \frac{1 + (K_{D} K'_{DCO} - 1)\, z^{-1}}{1 + (K_{D} K_{DCO} - 1)\, z^{-1}} \tag{4.17}$$

By programming the value of $K'_{DCO}$ to be $\frac{1}{K_{D}}$, Eq. (4.17) degenerates into Eq. (4.18).

$$H_{COMP}(z) \approx \frac{1}{1 + (K_{D} K_{DCO} - 1)\, z^{-1}} \tag{4.18}$$

Although Eq. (4.18) is a simplified result, it is still an undesirable IIR response. However, one can approximate the IIR response as an FIR filter with infinitely many taps, such that $H_{COMP}(z)$ is equal to $\alpha_{1} + \alpha_{2} z^{-1} + \alpha_{3} z^{-2} + \ldots$, without losing a significant amount of accuracy. Note that the coefficients ($\alpha_{1}$, $\alpha_{2}$, ...), which are functions of $K_{DCO}$, can be tuned automatically by the adaptive loop by extending the weight of Eq. (4.12) to the vector set ($w_{1}$, $w_{2}$, ...) and mapping ($\alpha_{1}$, $\alpha_{2}$, ...) into ($w_{1}$, $w_{2}$, ...). As illustrated in Fig. 4.12, instead of inserting an infinite number of taps, one can implement the adaptive FIR filter in a two-tap configuration with minimal overhead, at the expense of cancellation accuracy at high frequencies. This is discussed later in the measurement section. The adaptive FIR filter is able to adjust for the gain errors caused by DTC gain variations and the estimation error in Eq. (4.16), as well as the phase errors caused by the simplification to the two-tap configuration.

Figure 4.12: The complete implementation of the proposed background dither noise cancellation loop with a two-tap configuration

Note that when the value of $K'_{DCO}$ is programmed to be equal to $\frac{1}{K_{D}}$, Eq. (4.14) becomes the following.

$$H_{Dup,HP}(z) = \frac{\left(\frac{1 - z^{-1}}{z^{-1}}\right)^{2}}{\left(\frac{1 - z^{-1}}{z^{-1}}\right)^{2} + \left(\frac{1 - z^{-1}}{z^{-1}}\right) + K_{I}} \tag{4.19}$$

As Eq. (4.19) depends only on the $K_{I}$ value, $H_{Dup,HP}(z)$ no longer drifts with PVT. Therefore, it is unnecessary to adapt the high-pass filter coefficients of $H_{Dup,HP}(z)$ in real time. The weighting of the two-tap FIR filter should be updated based on the product of the phase error $\phi_{e}[k]$ and the output of $H_{Dup,HP}(z)$, which leads to favorable convergence stability.
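The two-tap adaptation described above can be sketched as a generic LMS system-identification loop. This is a minimal illustration under stated assumptions and not the fabricated implementation: the filtered reference is taken to be the zero-mean 16-level dither code itself (the duplicated high-pass filter is omitted for brevity), the unknown path is modeled as two hypothetical taps standing in for the DTC gain and the truncated compensation response, the loop is noise-free, and the sign of the update follows from defining the error as injected minus cancelled.

```python
import random


def lms_two_tap(unknown=(0.8, -0.15), mu=0.5, n_steps=5000, seed=7):
    """Adapt two FIR weights so the cancellation path matches an unknown two-tap path."""
    rng = random.Random(seed)
    w = [0.0, 0.0]
    x_prev = 0.0
    for _ in range(n_steps):
        # Zero-mean 16-level dither code (bias removed, as in the cancellation loop).
        x_now = (rng.randrange(16) - 7.5) / 16.0
        taps = (x_now, x_prev)
        injected = unknown[0] * taps[0] + unknown[1] * taps[1]  # dither seen by the loop
        cancel = w[0] * taps[0] + w[1] * taps[1]                # adaptive FIR output
        err = injected - cancel                                 # residual at the DLF input
        for i in range(2):
            # LMS update: step size times error times the matching reference tap.
            w[i] += 2.0 * mu * err * taps[i]
        x_prev = x_now
    return w


weights = lms_two_tap()
print("adapted weights:", weights)
```

Because the update is driven by the product of the residual error and the reference samples, the weights settle without any prior knowledge of the path gain, which is the property the background loop relies on to track PVT drift.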
4.5 Experimental Results

A proof-of-concept test chip was fabricated in a 65 nm CMOS process with an active area of 0.334 mm$^2$, of which the proposed background noise cancellation loop occupied 0.017 mm$^2$, as shown in Figure 4.13. The analog core of the DPLL consumed 16.4 mW from a 1 V supply. The digital part consumed 1.1 mW in normal DPLL operation, while the proposed dither and dither noise removal scheme dissipated an additional 0.6 mW.

Figure 4.13: Chip micrograph

The testing configuration is shown in Fig. 4.14. An 8665B signal generator synthesized a clean 30 MHz reference source for DPLL locking. A USBee QX pattern generator/logic analyzer was used for SPI programming and for capturing data from a digitized internal node. The DPLL output spectrum and phase noise profile were analyzed using an E4440A spectrum analyzer. Finally, in order to test process variations across multiple dies, an ELW10 socket was used to mechanically clamp the device under test (DUT).

Figure 4.14: The testing configuration

To test the efficacy of the proposed dither and dither cancellation scheme, the phase noise profile of the DPLL locked at an FCW of $120 + 2^{-10}$ was measured; it is shown in Figure 4.15. Specifically, the black and blue curves in Fig. 4.15(a) indicate the phase noise profile of the DPLL operating at an FCW of $120 + 2^{-10}$ without and with the dithering scheme, respectively. After the background dither noise was cancelled, a suppressed noise profile was obtained (the red curve) that demonstrated the worst-case spur and dither noise improvements (30 dB and 23 dB, respectively, as shown in Fig. 4.15(b)). Note that the in-band noise was slightly higher after cancellation, which was due to the randomized spurious pattern described by Eq. (4.8).

Figure 4.15: Measured phase noise profile with the dither and dither noise cancellation loop at an FCW of $120 + 2^{-10}$. The spur improvement of 30 dB and the noise improvement of 23 dB were noted. These improvements validated the efficacy of the proposed technique
At frequency offsets between 10 and 20 MHz, there was a noise degradation because of incomplete cancellation. This was caused by the simplified two-tap FIR filter configuration described in Section 4.3. If the number of taps were increased, this noise degradation could be mitigated.

The spectrum of the worst-case fractional spur with an FCW of $120 + 2^{-14}$ (Fig. 4.16) represents a near-carrier case. Before the dither and dither removal scheme was enabled, the worst-case spurious level was -22 dBc at a 1.83 kHz offset. After the technique was enabled, the level decreased to -62.47 dBc, while the other harmonics were beneath the noise floor. During dither noise cancellation, the internal state before the DLF was captured, as shown in Fig. 4.17(a). The measurement shows that the adaptive filter loop required 27 cycles overall to settle, which corresponds to a 0.9 us convergence time. Fractional-N spur levels were measured (Fig. 4.17(b)) with fractional frequency offsets ranging from $2^{-14}$ to $2^{-2}$, demonstrating improvements of at least 20 dB. The improvement of the fractional spur was also measured for different maximum dithering magnitudes normalized to one DCO period, including 25%, 50%, and 100% of the DCO period; the results are shown in Fig. 4.18.

Figure 4.16: Measured worst-case fractional spur with an FCW of $120 + 2^{-14}$ (a) before and (b) after enabling the dither noise cancellation loop

Figure 4.17: (a) Measured settling behavior of a DPLL with an FCW of $120 + 2^{-14}$ and (b) the worst-case fractional spur with different FCW settings

Figure 4.18: The worst-case fractional spur levels, measured with different dithering magnitudes normalized to one DCO period

To test the PVT tracking capability of the cancellation loop, the spur and noise cancellation results were examined under voltage and temperature variations. Fig. 4.19 shows the measured integrated jitter variations normalized to the nominal value (i.e., a 1.57 ps RMS jitter) when the DPLL operated with an FCW of $120 + 2^{-14}$.
The measurements show that jitter variations were under 2% when the DCO, DTC, and TDC supplies were varied from 0.9 to 1.1 V ($\pm$10%). Moreover, the measurements show that less than 2% integrated jitter variation occurred under temperature variations. Variations in the worst fractional spur level can be seen in Fig. 4.20. Overall, less than 2 dB of spur variation was measured under DCO supply, TDC supply, and temperature variations. Six chips were measured using the socket, and similar performance was observed.

Figure 4.19: Measured variations of integrated RMS jitters with an FCW of $120 + 2^{-14}$ after enabling the dither and dither noise cancellation loop under (a) 20% DCO/TDC/DTC supply variations, and (b) 27 °C to 60 °C temperature variations

Figure 4.20: Measured variations in the worst-case fractional spur with an FCW of $120 + 2^{-14}$ after enabling the dither and dither noise cancellation loop under (a) 20% DCO/TDC/DTC supply variations, and (b) 27 °C to 60 °C temperature variations

To verify the effectiveness of the adaptive loop, the update weighting in the two-tap FIR filter was fixed after the dither noise cancellation loop had settled, and the DCO supply and temperature were then changed manually. The integrated RMS jitter started to increase and showed more than 2% variation, as shown in Fig. 4.21. This was expected, and the experiment demonstrated the necessity of a background dither noise cancellation loop.

Figure 4.21: Measured integrated RMS jitter variations with a fixed update weighting after the settling of the cancellation loop under (a) 20% DTC supply variations and (b) 27 °C to 60 °C temperature variations

The integer-N performance of the DPLL was measured at 3.57 GHz with an FCW of 119. Fig. 4.22(a) shows the reference spur performance.
The carrier at 3.57 GHz with -8.07 dBm and the reference spur at a 30 MHz offset with -110.30 dBm indicate a -102.32 dBc reference spurious level at a 30 MHz offset. An integrated RMS jitter of 856 fs, with an in-band phase noise of -96 dBc/Hz at a 100 kHz offset and an out-of-band phase noise of -120 dBc/Hz at a 3 MHz offset, is shown in Fig. 4.22(b).

Figure 4.22: Measured DPLL (a) reference spur and (b) phase noise profile at 3.57 GHz and an FCW of 119.

Table 4.1 compares the key results of the proposed scheme to those of state-of-the-art PLLs with dithering. The table shows that the proposed scheme achieved the lowest near-carrier worst fractional spurs and the fastest background adaptive-loop response time. The noise and spur performance of the proposed scheme under different voltage and temperature conditions was nearly the same as under nominal conditions. This demonstrates the robustness of the proposed scheme under PVT variations.

Table 4.1: Comparison table with state-of-the-art digital PLLs

4.6 Conclusions

A 35.2 GHz fractional-N DPLL with a background dither noise cancellation loop was presented in this chapter. To achieve low spurious performance, a dithering scheme was enabled to randomize the spurious pattern, including TDC quantization steps and DNL effects. Then, a background dither noise cancellation scheme was utilized to remove the introduced dither noise.

Moreover, the worst fractional spur was measured from -62.47 to -79.2 dBc with different FCW settings. Before and after activating the cancellation loop, an equivalent integrated RMS jitter of 1.78 ps at an FCW of $120 + 2^{-14}$ was measured; the results showed a complete cancellation of the introduced dither noise. With a response time of less than 0.9 us, the spur and integrated RMS jitter at an FCW of $120 + 2^{-14}$ fluctuated by less than 2 dB and 2%, respectively, with PVT variations.
Finally, a reference spur of -102.3 dBc and an integrated jitter of 856 fs were measured when the DPLL operated at a 3.57 GHz carrier frequency.

Chapter 5

A Digital PLL with Dither-assisted Pulling Mitigation

5.1 Pulling of PLLs

Injection pulling [3] upon frequency synthesizers and PLLs is a critical and challenging design issue for high-performance RF and wireless applications, especially highly integrated multi-radio platforms. The severe performance degradation of the synthesizer due to the pulling aggressor imposes stringent design constraints on the system-level specification. As illustrated in Fig. 5.1, when the fundamental or a higher-order harmonic of the aggressor signal is close to the oscillation frequency of the victim PLL, the victim PLL can be pulled through a magnetic path, such as the inductor of the LC-tank oscillator or a bond wire. The victim PLL can also be disturbed via an electrical path, such as the supply, the current biasing, or the substrate. Note that there are two types of aggressor signals. The first type is a pulling signal introduced by oscillator mutual pulling, which causes sinusoidal disturbances that generate multi-tone spurs when other clock sources are nearby. The second type is a pulling signal induced by PA pulling, which usually occurs in a transmitter architecture when the PA output transmits a very strong modulated signal. A modulated signal can cause phase noise degradation, as shown in Fig. 5.1.

Figure 5.1: A PLL pulled by a transmitter output as PA pulling or by nearby PLLs as oscillator mutual pulling

Although an aggressor's waveform can be known a priori, the coupling path and mechanism can be unpredictable and complex. Thus, expensive and time-consuming iterations of design optimization, including system-level frequency plans, layouts, packages, and PCB manufacturing, cannot be avoided. Moreover, the transfer function of the coupling paths can vary over time, as can the operating conditions.
Hence, they cannot be predetermined or reliably compensated with foreground calibrations.

To alleviate these pulling issues in PLLs, several techniques have been explored to mitigate the effects of pulling through the oscillator path [10]-[12]. For example, for analog PLLs, the phase difference between the reference clock and the divider output was converted into a digital code, which could be post-processed with a DSP technique to generate a corresponding compensation signal [13]. The compensation signal could then be injected at the VCO input to mitigate the pulling phase error. However, the technique incurs a high design overhead, i.e., an additional analog-to-digital converter (ADC) in the case of an analog PLL. This concept was also implemented for DPLL architectures [11], which do not require data converters. To ensure the robustness of the mitigation loops, adaptive filter algorithms were developed to track PVT variations in the coupling transfer function.

In real operation, as shown in Fig. 5.2, the aggressor signal can also couple electrically to the reference path of a victim PLL [10] through the clock buffer supply or the substrate. Simultaneous coupling to the oscillator and reference paths can occur as well. Both scenarios cause existing mitigation techniques to fail. This chapter first discusses a DCO-induced spur mitigation scheme in Sections 5.2 to 5.4. The experimental results shown in Section 5.4 then provide the knowledge needed to mitigate a pulling signal that couples to the reference and DCO paths at the same time. Second, a dither-assisted scheme [20] is introduced, and it is shown that the scheme can orthogonalize the two pulling signals from the DCO and reference paths and thus allow both to be rejected by the proposed dual-mitigation scheme in the background.

Figure 5.2: A PLL can be pulled by the aggressor signal coupling to reference and DCO paths simultaneously
The dither-assisted scheme can also mitigate different aggressor waveforms, including sinusoidal and modulated signals, while sharing the same hardware for minimum design overhead. To verify the concept, a DPLL operating in integer-N mode is used to test the scheme.

5.2 Adaptive Filter Transfer Function for Spur Coupling to the DCO Path

When interference couples to the DCO, spurs appear at the DCO output. In this scenario, if the spur mitigation described in Chapter 2 is used, the DCO-induced spurious tone is not cancelled as expected. To understand this, consider the z-domain model shown in Fig. 5.3. A compensation signal $\phi_{COMP}[k]$ is injected at the TDC output, and the cancellation output, denoted as $\phi'_e[k]$, is the quantity to be minimized.

Figure 5.3: A z-domain model of a DPLL with the DCO-induced spur and the compensation signal

Note that $H_{DCO,C}[z]$ denotes the DCO coupling transfer function, which applies additional gain attenuation and phase shift to the coupled interference. The cost function at the cancellation output can be written as

\[ \min E[\phi'^{2}_{e}[k]] = \min E[(\phi_{DCO}[k] * h'_{hp}[k] - \phi_{COMP}[k] * h_{hp}[k])^2] \tag{5.1} \]

where $h'_{hp}[k]$ and $h_{hp}[k]$ denote the DPLL closed-loop transfer functions seen by the DCO-induced spur $\phi_{DCO}[k]$ and the compensation signal $\phi_{COMP}[k]$, respectively. The z-domain transfer functions of $h'_{hp}[k]$ and $h_{hp}[k]$ can be written as follows:

\[ \begin{cases} H'_{HP}[z] = H_{DCO,C}[z] \, \dfrac{\left(\frac{1-z^{-1}}{z^{-1}}\right)^2}{\left(\frac{1-z^{-1}}{z^{-1}}\right)^2 + \left(\frac{1-z^{-1}}{z^{-1}}\right) K_D K_{DCO} + K_D K_I K_{DCO}} \\[2ex] H_{HP}[z] = \dfrac{\left(\frac{1-z^{-1}}{z^{-1}}\right)^2}{\left(\frac{1-z^{-1}}{z^{-1}}\right)^2 + \left(\frac{1-z^{-1}}{z^{-1}}\right) K_D K_{DCO} + K_D K_I K_{DCO}} \end{cases} \tag{5.2} \]

where $K_{DCO}$, $K_D$, and $K_D K_I$ represent the DCO gain and the proportional and integral gains of the loop filter, respectively. The z-domain loop response can be transformed into the Laplace domain via a forward Euler transformation (i.e., by replacing $\frac{1-z^{-1}}{z^{-1}}$ with $s$). The transfer function shown in Eq.
(5.2) can be combined as

\[ \min E[\phi'^{2}_{e}[k]] = \min E[\{(\phi_{DCO}[k] * h_{dco,c}[k] - \phi_{COMP}[k]) * h_{hp}[k]\}^2]. \tag{5.3} \]

In the steady state, when $\phi'_e[k]$ is minimized to zero, $\phi_{COMP}[k]$ is equivalent to $\phi_{DCO}[k] * h_{dco,c}[k]$. However, if the final DPLL output $\phi_{OUT}[k]$ is considered, the phase relation $\phi_{OUT}/\phi_{DCO}$ becomes Eq. (5.4) after replacing $\phi_{COMP}[k]$ with $\phi_{DCO}[k] * h_{dco,c}[k]$ in Fig. 5.3.

\[ \frac{\phi_{OUT}}{\phi_{DCO}}(z) = \frac{1 + H_{DCO,C}(z) \, K_D \left(1 + K_I \frac{z^{-1}}{1-z^{-1}}\right) K_{DCO} \frac{z^{-1}}{1-z^{-1}}}{1 + K_D \left(1 + K_I \frac{z^{-1}}{1-z^{-1}}\right) K_{DCO} \frac{z^{-1}}{1-z^{-1}}} \tag{5.4} \]

Note that if the DCO path does not impose extra gain attenuation and phase shift (i.e., if $H_{DCO,C}(z) = 1$), Eq. (5.4) degenerates into $\phi_{OUT}[k] = \phi_{DCO}[k]$. Instead of being suppressed by the DPLL closed-loop high-pass transfer function $h_{hp}[k]$, the DCO-induced spur passes unattenuated to the DPLL output once the compensation signal is injected. As a result, the spur energy is not minimized at all, and the spur still appears as a tone in the output spectrum under the cost function of Eq. (5.1).

In order to effectively mitigate the DCO-induced spur, the cost function of Eq. (5.1) must be modified. Note that the loop responses seen by spurs introduced at the DCO and at the reference path are reciprocal (i.e., they are high-pass and low-pass responses, respectively). Hence, the spur mitigation scheme should possess the same reciprocity with respect to the source of the spur. That is, for the DCO-induced spur, the spur error energy should be minimized at $\phi_e$ (the input of the mitigation loop) rather than at $\phi'_e$ (the output of the mitigation loop), as illustrated in Fig. 5.3. With this configuration, the cost function of Eq. (5.1) can be redesigned as

\[ \min E[\phi_{e}^{2}[k]] = \min E[(\phi_{DCO}[k] * h'_{hp}[k] + \phi_{COMP}[k] * h_{lp}[k])^2] \tag{5.5} \]

where

\[ \begin{cases} H'_{HP}[z] = H_{DCO,C}[z] \, \dfrac{\left(\frac{1-z^{-1}}{z^{-1}}\right)^2}{\left(\frac{1-z^{-1}}{z^{-1}}\right)^2 + \left(\frac{1-z^{-1}}{z^{-1}}\right) K_D K_{DCO} + K_D K_I K_{DCO}} \\[2ex] H_{LP}[z] = \dfrac{K_D K_{DCO} \left(\frac{1-z^{-1}}{z^{-1}}\right) + K_D K_I K_{DCO}}{\left(\frac{1-z^{-1}}{z^{-1}}\right)^2 + \left(\frac{1-z^{-1}}{z^{-1}}\right) K_D K_{DCO} + K_D K_I K_{DCO}} \end{cases} \tag{5.6} \]

To achieve the minimization of Eq.
(5.5) in the steady state, the compensation signal should be equivalent to $\phi_{DCO}[k] * h'_{hp}[k] * h^{-1}_{lp}[k]$. As the compensation signal $\phi_{COMP}$ is generated digitally, if the transfer function $h'_{hp}[k] * h^{-1}_{lp}[k]$ can be synthesized within the compensation path before injection, the mean square of the error $\phi_e[k]$ can be minimized. As a result, the phase relation of Eq. (5.4) becomes

\[ \frac{\phi_{OUT}}{\phi_{DCO}}(z) = \frac{1 - H_{DCO,C}(z) \, K_D \left(1 + K_I \frac{z^{-1}}{1-z^{-1}}\right) K_{DCO} \frac{z^{-1}}{1-z^{-1}}}{1 + K_D \left(1 + K_I \frac{z^{-1}}{1-z^{-1}}\right) K_{DCO} \frac{z^{-1}}{1-z^{-1}}} \tag{5.7} \]

Figure 5.4: The mitigation of different spur sources to the PLL's (a) reference and (b) DCO paths

Considering the case in which the transfer function $H_{DCO,C}(z)$ is equal to one, the DPLL output's phase disturbance caused by the DCO-induced spur can be reduced to zero. This demonstrates a complete cancellation of the interference. Moreover, if the DCO-induced spur is completely nullified, the DPLL output is spur-free. As a result, the TDC output no longer preserves any spur information; only the random noise component remains. This shows that the correct transfer function of a mitigation loop can guarantee the minimization of unwanted spur energy at the input of the mitigation loop rather than at its output, as discussed with Eq. (5.7). Note that the compensation signal $\phi_{COMP}$ can be injected at the input or at the output of the DLF; the two options differ only by the DLF's response.

To summarize, if the spur enters through the feed-forward (reference) path of the DPLL, such as the reference clock or the TDC, a spur replica generator (e.g., a DDS) can be used to generate a 180° out-of-phase replica, followed by an adaptive filter that minimizes the residual error energy at the output of the cancellation point, as illustrated in Fig. 5.4(a). Hence, the spur-induced error does not disturb the input of the DCO, which avoids unwanted spur generation. However, if the spur comes from the output side of the DPLL, such as the DCO, the spur-induced error at the TDC output should be minimized instead, as shown in Fig. 5.4(b).
This suggests that a 180° out-of-phase spur replica should be generated before the DCO such that the sum of the spur and the spur replica reaches zero residual error energy at the DCO output.

5.3 Challenges of Mitigating DCO-induced Spur and Proposed Mitigation Scheme

There are two challenges that must be overcome to ensure the proper functionality of the DCO-induced spur mitigation scheme. First, after mitigation, the TDC output loses all the spur information that might be needed to reconstruct the spur replica in the mitigation loop. A spur pattern can be a multi-tone signal, so the reconstruction of the spur pattern can become complicated. Second, the cancellation loop should not introduce any noise into the DPLL loop. If the cancellation loop generates extra noise, that noise is not suppressed completely by the loop, and it propagates to the DCO output and degrades the phase noise profile.

To resolve these challenges, a 65 nm proof-of-concept prototype is implemented to mitigate DCO-induced spurs. The proposed DCO spur mitigation scheme consists of two key elements. First, during the initial mitigation phase (i.e., the learning phase), the scheme estimates the periodicity of the interference and finds the exact disturbing pattern, which can consist of a single tone or multiple tones. In this phase, cancellation is not enabled, so the learning process is not influenced. Second, during the subsequent mitigation phase (i.e., the cancellation phase), the learned pattern is used to subtract the interference-induced spur, while a background adaptive loop monitors the gain and phase variations of $H_{DCO,C}(z)$ in real time and minimizes the TDC output's residual error energy, such that a spurious tone does not appear at the DPLL output.

5.4 Circuit Implementation of DCO-induced Spur Mitigation Scheme

Figure 5.5 shows a simplified block diagram of the DPLL implementation with highlighted spur injection points. An LC-tank DCO is used for better far-out phase noise.
Then, the phase of the DCO is quantized using an injection-locked TDC (IL-TDC) that generates 28 fine quantization levels within one DCO period, providing a time resolution of approximately 7 to 12 ps depending on the DPLL's operating frequency. The fine code is combined with an 8-bit integer counter output to generate the complete TDC information, which preserves the spurious pattern that is studied using the proposed mitigation scheme. The IL-TDC is composed of a seven-stage ring oscillator (RO) that is injection-locked by the DCO. The IL interface forms a first-order PLL loop. DCO noise dominates the in-band phase noise of the IL-TDC, as the DCO's phase noise profile is low-passed by the IL bandwidth, while the RO's phase noise dominates beyond the IL bandwidth. To minimize the phase noise contribution from the RO, the IL bandwidth is set much greater than the DPLL bandwidth. The DLF is composed of proportional and integral paths with programmable type-I and type-II responses. Following the DLF, a delta-sigma digital-to-analog converter (DAC) interface modulates the varactors of the LC-tank DCO to provide finer DCO frequency resolution.

Figure 5.5: A block diagram of a DPLL implementation with the proposed DCO-induced spur mitigation scheme and different spur injection points

Figure 5.6: The learning phase of the proposed mitigation implementation

The proposed spur mitigation scheme is applied before the DLF. The adaptive algorithm produces a compensation signal before the DLF and adaptively monitors the minimization of the disturbance at the TDC output, as discussed previously. In order to test the capability to reject spurs from various coupling paths to the DCO, several spur injection points are designed, including the power supply of the DCO, the power supply of the capacitor bank driver, and the bond wires of the DCO current bias. The goal is to inject multi-tone spurs either electrically or magnetically into those points.
Thus, the reduction of spurious tones before and after enabling the proposed mitigation scheme can be observed.

Figure 5.6 shows the learning phase of the proposed mitigation implementation. In this phase, the proposed scheme focuses on estimating the spur period without performing an actual cancellation in the DPLL loop. The periodicity of the interference pattern, referred to as D, can be composed of an integer part and a fractional part. The integer delay is estimated by adjusting the delay of an integer delay chain (i.e., of cascaded flip-flops) such that the difference between the output of the delay chain and the TDC output (i.e., $e_{int}[n]$) is minimized, as shown in Fig. 5.6. For ease of hardware implementation, the integer delay estimator sweeps all the possible delay settings of the integer delay cell and selects the value that leads to the least residual error energy. Once the integer delay is set, a trigger signal is sent to enable a finite impulse response (FIR)-based fractional delay (FD) filter to perform the FD estimation. The estimation of the fractional period is completed using a sign least mean square (LMS) loop that minimizes the error $e_{frac}[n]$, as shown in Fig. 5.6. In most situations, the spur frequency is known a priori because the frequency of external interference is typically known during system-level planning. In this case, the adaptive mode is unnecessary, so the integer delay and the coefficients of the FIR-based FD filter can be pre-programmed. In the steady state, the phases of the spur pattern are aligned between the output of the FD filter ($y[n]$) and the TDC output ($u[n]$) such that $e_{frac}[n]$ does not contain periodic disturbances, only random noise, as illustrated in the time-domain view of Fig. 5.6. Afterwards, the estimated periodicity information D is used by the downsampler to properly decimate $y[n]$.
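As a rough illustration of the integer-delay sweep described above, the following Python sketch tries every candidate delay and keeps the one with the least residual error energy. It is a software analogy under stated assumptions, not the prototype's hardware: the function name `estimate_integer_delay`, the sawtooth test pattern, the noise level, and the sweep range are all hypothetical, and the fractional-delay (sign-LMS) stage is omitted.

```python
import random


def estimate_integer_delay(u, max_delay):
    """Sweep integer delays and return the one that minimizes the mean
    energy of e_int[n] = u[n] - u[n - d] (hypothetical helper)."""
    best_delay, best_energy = None, float("inf")
    for d in range(1, max_delay + 1):
        residue = [u[n] - u[n - d] for n in range(d, len(u))]
        energy = sum(r * r for r in residue) / len(residue)
        if energy < best_energy:
            best_delay, best_energy = d, energy
    return best_delay


# Assumed test signal: a sawtooth spur pattern with a 12-sample period
# plus small random noise standing in for TDC noise.
rng = random.Random(0)
PERIOD = 12
tdc_out = [(n % PERIOD) / PERIOD - 0.5 + rng.gauss(0.0, 0.02)
           for n in range(600)]

estimated = estimate_integer_delay(tdc_out, max_delay=20)
print(estimated)
```

An exhaustive sweep is chosen here because it mirrors the text's hardware-friendly approach: it needs only a subtractor and an energy accumulator per candidate, at the cost of a search time that grows with the number of delay settings.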
Then, an infinite impulse response (IIR)-based accumulator constructively adds the spur signal and averages out the random noise component. In this way, the spur pattern is learned and stored in the IIR-based accumulator. Note that during this learning phase, the cancellation signal is not injected into the DLF, to avoid disturbing normal PLL operation.

Figure 5.7: The proposed spur-pattern learning phase

After the learning phase, the next phase is the cancellation phase, as shown in Fig. 5.7. For the cancellation part of the mitigation scheme, the spur pattern stored in the IIR-based accumulator is read using a finite-state machine (FSM) and a multiplexer to reconstruct a spur replica in sequence, which is then subtracted prior to the DLF. During the cancellation phase, the integer delay chain and the FIR-based FD filter are disabled so that the TDC output does not change the learned spurious pattern stored in the IIR-based accumulator. However, the coupled spur may change its magnitude and phase in real time, so additional gain adjustments and phase shifts must be inserted at the output of the multiplexer, under the control of an adaptive algorithm, in order to track PVT variations and minimize the residual error energy at the TDC output. This adaptive process is completed by applying different phase settings ($\phi+\Delta$, $\phi$, and $\phi-\Delta$) and gains ($A+\Delta$, $A$, and $A-\Delta$) sequentially and observing the change in residual error energy at the TDC output. Whenever a certain amplitude or phase change leads to a smaller residual error energy, that particular gain or phase setting is used to program the tunable delay and gain stages.

Figure 5.8: A time-domain view of a spur replica in the IIR-based accumulator during the (a) learning and (b) cancellation phases.

In the proposed scheme, the adaptive algorithm uses a single-channel coordinate descent search for minimal hardware. If a faster convergence speed is desirable, multi-channel computation can be implemented without sacrificing cancellation accuracy. The adaptive algorithm operates in the background without interfering with normal PLL operation. Note that an extra detection loop can be used to detect whether the coupled interference pattern deviates dramatically. If the pattern departs from the originally learned spur waveform (e.g., due to a frequency variation), the spur mitigation scheme should be forced back to the first phase to re-learn the spurious pattern and use the re-learned pattern for cancellation. However, when only an amplitude or phase perturbation occurs, the spur can still be mitigated, provided that the cancellation loop is fast enough to follow the rate of the perturbation.

A time-domain view of the spur replica inside the IIR-based accumulator in the two phases is shown in Fig. 5.8. In the learning phase, each branch of the IIR-based accumulator stores one value of the spurious pattern, which approaches a static value as the random noise component is averaged out inside the accumulator. In the cancellation phase, the FSM time-multiplexes the accumulator branches sequentially to recover the spurious pattern, as shown in Fig. 5.8(b). Note that for complete cancellation, the reconstructed spurious pattern should match the original interference pattern.

5.5 Experimental Results of DCO-induced Spur Mitigation Scheme

A proof-of-concept prototype was implemented in 65 nm CMOS and packaged in a QFN package for testing. The analog core occupied an active area of 0.63 mm^2 and consumed 17.2 mA, excluding the output buffer. The digital feedback loop and the cancellation scheme dissipated 1.7 and 2.4 mA, respectively.
The chip micrograph is shown in Fig. 5.9. The measured phase noise at a carrier frequency of 3.57 GHz corresponded to an integrated RMS jitter of 515 fs (Fig. 5.10(a)), and the measured reference spur across DPLL operating frequencies of 3 to 5 GHz stayed below -110 dBc with the prototype. The best-case reference spur was -116.1 dBc at 4.59 GHz, as shown in Fig. 5.10(b). To demonstrate the abilities of the proposed multi-tone DCO spur mitigation technique, a saw-tooth disturbance with a fundamental at a 600 kHz offset was injected through the DCO supply. Its harmonics are shown in Fig. 5.11(a). After enabling the proposed mitigation scheme, spur reductions of 6 to 22 dB were observed, as shown in Fig. 5.11(b).

Figure 5.9: Chip micrograph

Figure 5.10: (a) A measured phase noise profile at 3.57 GHz and (b) a best-case reference spur level at 4.59 GHz

Figure 5.11: A measured DPLL spectrum with multi-tone spurs with a 600 kHz offset (a) before spur mitigation and (b) after spur mitigation

Figure 5.12: Measured spur levels versus spur frequencies over different coupling paths

To test the generic spur-rejection capability of the proposed scheme over different coupling paths, interference was introduced at the various spur injection points labeled in Fig. 5.5. First, the spur levels were equalized in the three cases prior to mitigation. Second, the residual spur levels were recorded over different spur frequencies after enabling the mitigation scheme, as shown in Fig. 5.12. In Fig. 5.13, the spur mitigation sensitivities are shown. The measurement demonstrates that, across the different phase delay and amplitude settings, only one specific setting achieves the optimal cancellation accuracy in the adaptive cancellation loop.

Figure 5.13: Measured spur levels at representative spur frequencies over different (a) delay and (b) amplitude settings during the cancellation phase

Table 5.1: Comparison with state-of-the-art PLLs
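The sequential gain and phase search described in Section 5.4, whose sensitivity is measured above, can be sketched in Python as a coordinate descent over two settings. This is only an illustrative model under stated assumptions: the replica is a single sine period, the phase setting is modeled as a circular shift in samples, and the names `residual_energy` and `coordinate_descent`, the step sizes, and the test values are hypothetical rather than taken from the prototype.

```python
import math


def residual_energy(spur, replica, gain, shift):
    """Mean-square residue after subtracting a scaled, circularly
    shifted replica from one period of the spur."""
    n = len(spur)
    return sum((spur[i] - gain * replica[(i - shift) % n]) ** 2
               for i in range(n)) / n


def coordinate_descent(spur, replica, gain=1.0, shift=0,
                       gain_step=0.05, sweeps=40):
    """Alternate between the shift and gain coordinates, keeping whichever
    of three neighbouring settings gives the smallest residue."""
    for _ in range(sweeps):
        shift = min((shift - 1, shift, shift + 1),
                    key=lambda s: residual_energy(spur, replica, gain, s))
        gain = min((gain - gain_step, gain, gain + gain_step),
                   key=lambda g: residual_energy(spur, replica, g, shift))
    return gain, shift


# Assumed scenario: the coupled spur is the learned replica attenuated
# to 0.6 and delayed by 5 samples.
N = 32
replica = [math.sin(2 * math.pi * i / N) for i in range(N)]
spur = [0.6 * replica[(i - 5) % N] for i in range(N)]

gain, shift = coordinate_descent(spur, replica)
print(round(gain, 3), shift)
```

Only one coordinate is perturbed at a time, which matches the single-channel search the text motivates by hardware cost; evaluating the candidates in parallel would correspond to the multi-channel option mentioned there.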
The proposed mitigation technique is capable of addressing not only the spur coupled through the supply of the DCO but also spurs coupled through other electrical and magnetic paths.

5.6 DCO and Reference Pulling Phase Errors

Having tested the capability of the proposed scheme to mitigate DCO-induced spur errors, we now examine how the mitigation scheme can be extended to address the simultaneous pulling scenario shown in Fig. 5.2. To mitigate the pulling signal, it is crucial to compensate accurately for the uncertain coupling transfer function and to generate the corresponding compensation signal. As a first step, the DCO and reference pulling phase errors inside the DPLL are derived mathematically. Afterwards, the result is used to create an update function for the mitigation loop and to nullify the unwanted phase errors (see Section 5.8).

5.6.1 Pulling Signal to the DCO Path

Throughout this chapter, we discuss the pulling effect for the case in which the second harmonic of the PA output couples to the victim DPLL, as illustrated in Fig. 5.1 (i.e., the conventional direct-conversion transmitter), and all the derivations are based on this case. Note that the derivation can be extended to other harmonics of the PA pulling signal or to oscillator mutual pulling scenarios with further modifications. In this subsection, the pulling aggressor is considered to couple to the DCO path only, for ease of derivation; the combined effect of the two pulling phase errors from the DCO and reference paths can be approximated as a linear superposition in the z-domain model. The interfering waveform coupled to the DPLL can be expressed as $[X_I^2(t) + X_Q^2(t)] \sin\{\omega_c t + 2\tan^{-1}[X_Q(t)/X_I(t)]\}$, where $\omega_c$ is the DPLL operating frequency and $X_I(t)$ and $X_Q(t)$ are the baseband digital I/Q signals in the time domain.
The terms $A^2_{BB}(t)$ and $\theta_{BB}(t)$ are used to replace $[X_I^2(t) + X_Q^2(t)]$ and $\tan^{-1}[X_Q(t)/X_I(t)]$ to simplify the final expressions in the following derivations. Because the DCO coupling transfer function $H_1(\omega)$ varies over time with PVT, an extra gain factor $\alpha_{DCO}(t)$ and phase shift $\theta_{DCO}(t)$ are introduced in the DCO path. According to Adler's equation, the nonlinear differential equation of the DCO phase can be derived as

\[ \frac{d\phi}{dt} = \{\omega_o + K_{DCO} D_{CTRL}\} - \left\{ \frac{\omega_o}{2Q} \cdot \frac{\alpha_{DCO}(t) A^2_{BB}(t) \sin(2\theta_{BB}(t) + \theta_{DCO}(t))}{I_s + \alpha_{DCO}(t) A^2_{BB}(t) \sin(2\theta_{BB}(t) + \theta_{DCO}(t))} \right\} \tag{5.8} \]

where $K_{DCO}$ is the DCO gain, $D_{CTRL}$ is the DCO digital control code, $\omega_o$ is the free-running frequency of the DCO, $I_s$ is the fundamental of the ac current flowing into the LC tank, and $Q$ is the quality factor of the LC tank. The first term in Eqn. (5.8) represents the steady-state DPLL locked frequency, and the second term is the unwanted DCO pulling phase error, defined as $\phi_{DCO,Pull}(t)$, which continuously modulates the oscillation frequency of the victim DPLL. Interestingly, this DCO pulling phase error depends only on the coupling transfer function of the DCO path and the digital I/Q signals. Importantly, the DCO pulling phase error is not a function of the DPLL's carrier frequency, which implies that nullifying the error term requires only knowledge of the digital I/Q signals and the coupling transfer function. Under the weak injection condition (i.e., $I_s \gg \alpha_{DCO}(t) A^2_{BB}(t)$), Eqn. (5.8) can be simplified as

\[ \frac{d\phi}{dt} = \{\omega_o + K_{DCO} D_{CTRL}\} - \left\{ \frac{\omega_o}{2 Q I_s} \alpha_{DCO}(t) A^2_{BB}(t) \sin(2\theta_{BB}(t) + \theta_{DCO}(t)) \right\} \tag{5.9} \]

According to Eqn. (5.9), the DCO pulling phase error can be minimized by increasing the quality factor $Q$, by increasing the DCO current bias $I_s$, or by reducing $\alpha_{DCO}(t)$ through a better DCO noise rejection ratio in the coupling path. However, instead of applying those analog approaches, a compensation signal, referred to as $D_{DCO}[k]$, can be generated digitally [13] to cancel the DCO pulling phase error.
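To give a feel for how good the weak-injection simplification is, the following Python sketch compares the pulling term with the full denominator, as written in Eqn. (5.8), against the simplified form of Eqn. (5.9) over one cycle of the sine argument. All numeric values (oscillation frequency, quality factor, tank current, and an injected amplitude equal to 1% of the tank current) are assumptions chosen for illustration, and `i_inj` is a hypothetical shorthand for the product of the coupling gain factor and the squared baseband amplitude.

```python
import math


def pulling_full(omega_o, q, i_s, i_inj, psi):
    """Pulling term with the full denominator, as written in Eq. (5.8)."""
    s = math.sin(psi)
    return omega_o / (2 * q) * i_inj * s / (i_s + i_inj * s)


def pulling_weak(omega_o, q, i_s, i_inj, psi):
    """Weak-injection form of Eq. (5.9): the injected term is dropped
    from the denominator."""
    return omega_o / (2 * q * i_s) * i_inj * math.sin(psi)


# Assumed values for illustration only.
OMEGA_O = 2 * math.pi * 3.57e9   # rad/s
Q = 10.0
I_S = 1.0e-3                     # A
I_INJ = 1.0e-5                   # A, i.e. 1% of I_S

worst = 0.0
for step in range(1, 360):
    psi = math.radians(step)
    weak = pulling_weak(OMEGA_O, Q, I_S, I_INJ, psi)
    full = pulling_full(OMEGA_O, Q, I_S, I_INJ, psi)
    if weak != 0.0:
        worst = max(worst, abs(full - weak) / abs(weak))

print(f"worst relative difference: {worst:.4f}")
```

With the injected amplitude at 1% of the tank current, the two forms differ by roughly 1% at most, which is why the simplified expression is a convenient basis for designing a linear compensation signal.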
To generalize, as illustrated in Fig. 5.14(a), the closed-loop DPLL response can be examined in a discrete-time approximation by transforming the DCO pulling phase error into its z-domain expression, denoted as $\phi_{DCO,Pull}[k]$.

\[ \phi_{DCO,Pull}[k] = \left\{ \frac{\omega_o}{2 Q I_s} \alpha_{DCO}[k] A^2_{BB}[k] \sin(2\theta_{BB}[k] + \theta_{DCO}[k]) \right\} \tag{5.10} \]

Figure 5.14: z-domain DPLL transfer function with (a) DCO pulling phase error $\phi_{DCO,Pull}[k]$ and (b) reference pulling phase error $\phi_{REF,Pull}[k]$

In Fig. 5.14, $\phi_{n,REF}[k]$ and $\phi_{n,DCO}[k]$ represent the noise contributions from the reference clock and the DCO clock. Note that the weak injection assumption does not mean that the pulling effect is negligible; rather, it provides more insight into how to design the corresponding mitigation loop in Section 5.8.

5.6.2 Pulling Signal to the Reference Path

The pulling signal also couples to the reference path [26], as illustrated in Fig. 5.2. Instead of coupling magnetically, the pulling signal electrically disturbs the reference clock via the supply or the substrate of the reference clock buffer. Note that in the prototype only the disturbed phase at the rising edge of the reference clock matters. Assume that the same interfering waveform as in Section 5.6.1 couples to the reference clock only. The reference pulling phase error, denoted as $\phi_{REF}(t)$, is then down-sampled and appears at the clock buffer output as shown in Eqn. (5.11).

\[ \phi_{REF}(t) = \alpha_{REF}(t) A^2_{BB}(t) \sin(2\theta_{BB}(t) + \theta_{REF}(t)) \tag{5.11} \]

where $\alpha_{REF}(t)$ and $\theta_{REF}(t)$ represent the additional gain attenuation and phase shift in the reference path due to the coupling transfer function $H_2(\omega)$.

Two things can be observed from Eqn. (5.11). First, the magnitude of this phase error is proportional to the magnitude of the pulling signal. Thus, the coupling in the reference path can be described as an amplitude-modulation-to-phase-modulation (AM-to-PM) effect. Second, we assume integer-N operation of the DPLL; hence the carrier component $\omega_c$, which is an exact integer multiple of $F_{REF}$, is not seen by the reference clock and does not appear in Eqn. (5.11). Only the digital I/Q component aliases back to the low-frequency region and contributes to Eqn. (5.11). However, the analysis can be extended to a fractional relation, when the DPLL operates in fractional-N mode, with some modifications to Eqn. (5.11). The compensation signal for mitigating Eqn. (5.11), referred to as $D_{REF}[k]$, can be derived in a similar manner to $D_{DCO}[k]$. Similarly, as illustrated in Fig. 5.14(b), the closed-loop DPLL response can be examined in a discrete-time approximation by transforming the reference pulling phase error into its z-domain expression, denoted as $\phi_{REF,Pull}[k]$.

\[ \phi_{REF,Pull}[k] = \alpha_{REF}[k] A^2_{BB}[k] \sin(2\theta_{BB}[k] + \theta_{REF}[k]) \tag{5.12} \]

The details of the compensation signals $D_{DCO}[k]$ and $D_{REF}[k]$ are left for Section 5.8, after the dither-assisted scheme is studied, as that technique changes the properties of Eqn. (5.12).

5.6.3 Simultaneous Coupling of Pulling Signals

In this subsection, a pulling signal that couples to the DCO and reference paths simultaneously is considered. In the z-domain model, the two phase errors in Fig. 5.14(a) and 5.14(b) can be approximated as a linear superposition, as the coupling mechanisms are different. If the TDC output is examined, the overall pulling phase error can be derived as $\phi_{PULL}[k]$.

\[ \phi_{PULL}[k] = \alpha_{PULL}[k] A^2_{BB}[k] \sin(2\theta_{BB}[k] + \theta_{PULL}[k]) \tag{5.13} \]

where $h_{hp,1}[k]$ and $h_{hp,2}[k]$ denote the closed-loop high-pass responses seen by $\phi_{REF,Pull}[k]$ and $\phi_{DCO,Pull}[k]$, $\alpha_{PULL}[k]$ and $\theta_{PULL}[k]$ are functions of the DPLL closed-loop response and the coupling transfer functions, and the operator $*$ denotes time-domain convolution.

Figure 5.15: Indistinguishable DCO and reference pulling phase errors at the TDC output

If we examine Eqn. (5.13) carefully, it is noticeable that the common terms $A^2_{BB}[k] \sin(2\theta_{BB}[k])$ from Eqn.
(5.10) and (5.12) are lumped together and can no longer be separated, as indicated in Fig. 5.15. Based on the discussion in Chapters 2 to 4, when the interference couples to a different PLL path, a different transfer function of the mitigation loop must be designed; otherwise, the mitigation loop fails to converge to the final optimization point. Consequently, the mitigation scheme should be able to distinguish $\phi_{DCO,Pull}[k]$ and $\phi_{REF,Pull}[k]$ as two uncorrelated waveforms that do not interfere with each other. Afterwards, two mitigation loops can be implemented to address the two phase errors individually.

5.7 Dithering for Orthogonalizing Pulling Phase Errors

To resolve the challenges discussed in Section 5.6, the dither-assisted scheme injects a pseudo-random (PN) code to randomize the phase error inside the reference clock buffer path without changing the phase error in the DCO path. As a result, $\phi_{DCO,Pull}[k]$ and $\phi_{REF,Pull}[k]$ become orthogonal. In the prototype, the rising edge of the DCO phase is injection-locked to the IL-TDC, and hence the DCO pulling phase error appears only at every rising edge of the IL-TDC. To examine the efficacy of the dither-assisted scheme, consider two different scenarios, with a DCO pulling signal and with a reference pulling signal, in Fig. 5.16 and 5.17, respectively.

Figure 5.16: DCO pulling phase error at TDC output with and without the proposed dither-assisted scheme

Figure 5.17: Reference pulling phase error at TDC output with and without the proposed dither-assisted scheme.

As illustrated in Fig. 5.16, the pulling signal is assumed to couple only to the DCO path. The pattern preserved at the TDC output is examined before and after the dithering. Without the PN code, the clean reference clock edge $REF_{out}$ samples the contaminated DCO phase, and the DCO pulling phase error appears at the TDC output, which preserves the original pattern of the pulling signal, as shown in Fig. 5.16(a).
However, when the PN code is injected as shown in Fig. 5.16(b), the shifted reference clock edge ($REF_{PN}$) observes the same DCO pulling phase error, provided that the phase shift introduced in the reference clock is constrained within one DCO cycle. This occurs because the DCO pulling phase error at the next rising edge of the IL-TDC cannot be reached. As a result, Eq. (5.10) can be modified as

\[ \phi'_{DCO,Pull}[k] = \left\{ \frac{\omega_o}{2 Q I_s} \alpha'_{DCO}[k] A'^{2}_{BB}[k] \sin(2\theta'_{BB}[k] + \theta'_{DCO}[k]) \right\} \approx \left\{ \frac{\omega_o}{2 Q I_s} \alpha_{DCO}[k] A^{2}_{BB}[k] \sin(2\theta_{BB}[k] + \theta_{DCO}[k]) \right\} = \phi_{DCO,Pull}[k]. \tag{5.14} \]

Here $\alpha'_{DCO}[k]$, $\theta'_{DCO}[k]$, $A'^{2}_{BB}[k]$, and $\theta'_{BB}[k]$ denote the variations of the coefficients that occur due to the shifted sampling point of the reference clock. Because the phase shift is constrained to less than one DCO cycle and the coupling transfer function varies slowly in real time, Eq. (5.14) degenerates to Eq. (5.10), as if the dithering had no effect. If the DCO pulling phase errors before and after injecting the PN code are compared, the waveforms look almost the same, as shown in Fig. 5.16.

For a pulling signal coupling to the reference path, the dithering effect is different. As illustrated in Fig. 5.17, the pulling signal is assumed to couple to the reference path only, and the disturbance at the TDC output is again examined before and after dithering. Without the PN code, the contaminated reference clock edge $REF_{out}$ samples the clean DCO phase, and the reference pulling error shows up at the TDC output, which preserves the original pattern of the pulling signal, as shown in Fig. 5.17(a). As shown in Fig. 5.17(b), the PN-shifted reference clock $REF_{PN}$ sees a different disturbance at the newly shifted moment compared to the original disturbance in Fig. 5.17(a).
Thus, the AM-PM effect in the reference path is essentially phase-modulated by the dithering, and the new reference pulling phase error $\phi'_{REF,Pull}[k]$ can be written as follows:

\[ \phi'_{REF,Pull}[k] = \alpha'_{REF}[k] A'^{2}_{BB}[k] \sin\{\omega_c PN[k] G_{DTC} + 2\theta'_{BB}[k] + \theta'_{REF}[k]\} \approx \alpha_{REF}[k] A^{2}_{BB}[k] \sin\{\omega_c PN[k] G_{DTC} + 2\theta_{BB}[k] + \theta_{REF}[k]\} \tag{5.15} \]

where $PN[k]$ is the injected PN code at the $k$-th sample and $G_{DTC}$ is the digital-to-time converter (DTC) gain, i.e., 1 LSB of the DTC. Compared to the DCO pulling phase error, Eqn. (5.15) contains an extra PN-modulated term that multiplies the carrier frequency component $\omega_c$. If the phase shift due to the PN code can cover a whole DCO period ($PN[k] G_{DTC} \geq T_{DCO}$), the reference pulling phase error is randomized into white noise, as the dithering exercises the full range from 0 to $2\pi$. With the proposed dither-assisted scheme, the two pulling phase errors are now orthogonal to each other: one preserves the original pattern while the other behaves like random noise, which allows the mitigation loops to distinguish them, as illustrated in Fig. 5.18.

Figure 5.18: Proposed dither-assisted technique differentiates the simultaneous coupling signal

5.8 Proposed Pulling Mitigation Scheme

5.8.1 Update Function of the DCO Mitigation Loop

In order to nullify the DCO pulling phase error, the corresponding compensation signal $D_{DCO}[k]$ can be injected at the $D_{CTRL}$ node. $D_{DCO}[k]$ can be written as follows:

\[ D_{DCO}[k] = \frac{\omega_o \alpha_{DCO}[k]}{2 Q I_s K_{DCO}} A^2_{BB}[k] \sin(2\theta_{BB}[k] + \theta_{DCO}[k]). \tag{5.16} \]

The sinusoidal term $\sin(2\theta_{BB}[k] + \theta_{DCO}[k])$ can be expanded using trigonometric identities as follows:

\[ \begin{aligned} \sin(2\theta_{BB}[k] + \theta_{DCO}[k]) &= 2\Big\{ \cos(\theta_{BB}[k]) \sin(\theta_{BB}[k]) \Big[ \cos^2\!\big(\tfrac{\theta_{DCO}[k]}{2}\big) - \sin^2\!\big(\tfrac{\theta_{DCO}[k]}{2}\big) \Big] + \sin\!\big(\tfrac{\theta_{DCO}[k]}{2}\big) \cos\!\big(\tfrac{\theta_{DCO}[k]}{2}\big) \big[ \cos^2(\theta_{BB}[k]) - \sin^2(\theta_{BB}[k]) \big] \Big\} \\ &= 2\cos(\theta_{BB}[k]) \sin(\theta_{BB}[k]) \cos(\theta_{DCO}[k]) + \sin(\theta_{DCO}[k]) \big[ \cos^2(\theta_{BB}[k]) - \sin^2(\theta_{BB}[k]) \big]. \end{aligned} \tag{5.17} \]

By substituting Eq.
(5.16), replacing cos( BB [k]) with X I [k] A BB [k] , and replacing sin( BB [k]) with X Q [k] A BB [k] , a nal expression of DDCO [k] can be achieved. DDCO [k] = ! o DCO [k] 2QI s K DCO fcos( DCO [k]) (2X I [k]X Q [k]) + sin( DCO [k]) (X 2 I [k]X 2 Q [k])g: (5.18) Using Eq. (5.18), the two variables 2X I [k]X Q [k] and X 2 I [k]X 2 Q [k] can be found. As X 2 I [k] and X 2 Q [k] originate from a digital processor, their values are known. Therefore, 2X I [k]X Q [k] and X 2 I [k]X 2 Q [k] can convolve with dierent coecients as functions of DCO [k] and DCO [k], which can be lumped together as shown in Eqn. (5.19). DDCO [k] =w 1;DCO [k]f2X I [k]X Q [k]g +w 2;DCO [k]fX 2 I [k]X 2 Q [k]g: (5.19) 153 Figure 5.19: The derivation of the update function, based on the proposed DCO mitigation loop, and the minimization of the DCO pulling phase error If the values ofw 1;DCO [k] andw 2;DCO [k] match !o DCO [k] 2QIsK DCO cos( DCO [k]) and !o DCO [k] 2QIsK DCO sin( DCO [k]), respectively, over time, the mitigation of DCO;Pull [k] can be achieved in a steady state. However, the coecientsw 1;DCO [k] andw 2;DCO [k] vary over time, so an LMS loop can be used to monitor variations ofH 1 (! by adjustingw 1;DCO [k] andw 2;DCO [k] in the background. To nd the update functions of w 1;DCO [k] and w 2;DCO [k], one can derive the mean square error of the DCO pulling phase error after injecting DDCO [k], as illustrated in Fig. 5.19. Instead of minimizing the cost function at the input of the DCO, the mean square error in the input of the adaptive lter, e;DCO [k], should be considered. Note that both DCO;Pull [k] and DDCO [k] result in 154 similar DPLL closed loop band-pass responses, and hence a modied cost function can be written as follows, minE[ 2 e;DCO [k]] = minE[( 0 DCO;Pull [k]h 0 bp [k] DDCC [k]h bp [k]) 2 ] = minE[f( DCO;Pull [k] K DCO DDCO [k])h bp [k]g 2 ]; (5.20) whereh bp [k] is the DPLL closed loop band-pass response from DDCO [k] to e;DCO [k]. By substituting Eq. 
(5.19) into (5.20), the derivative of Eq. (5.20) can be taken with respect to the filter coefficients $w_{1,DCO}$ and $w_{2,DCO}$, which leads to the following iterative update equations:

$$\begin{cases}
w_{1,DCO}[k+1] = w_{1,DCO}[k] - 2\mu_{s,DCO}\,\phi_{e,DCO}\,\{(2X_I[k]X_Q[k]) * h_{bp}[k]\} \\
w_{2,DCO}[k+1] = w_{2,DCO}[k] - 2\mu_{s,DCO}\,\phi_{e,DCO}\,\{(X_I^{2}[k] - X_Q^{2}[k]) * h_{bp}[k]\}.
\end{cases} \tag{5.21}$$

As there are two distinct update functions in Eq. (5.21), two separate adaptive filters are required: one takes $2X_I[k]X_Q[k]$ as its input and adapts $w_{1,DCO}$ toward $\frac{\omega_o\,\alpha_{DCO}[k]}{2 Q I_s K_{DCO}}\cos(\theta_{DCO}[k])$, and the other takes $X_I^{2}[k] - X_Q^{2}[k]$ as its input and adapts $w_{2,DCO}$. Note that the LMS algorithm is just one way to realize the mitigation loop; other adaptive filter algorithms can be utilized. A detailed implementation of Eq. (5.21) is discussed in Section 5.9.2.

5.8.2 Update Function of the Reference Mitigation Loop

Before deriving the update function of the reference mitigation loop and its compensation signal $D_{REF}[k]$, we first reexamine Eq. (5.15) to decouple the injected PN code and the I/Q signals from $\alpha_{REF}[k]$ and $\theta_{REF}[k]$, which can then be compensated for by an adaptive loop. Decomposing Eq. (5.15) with trigonometric identities, as was done for Eq. (5.17), gives

$$\begin{aligned}
\phi'_{REF,Pull}[k] \approx \alpha_{REF}[k]\Big\{&\cos(\theta_{REF}[k])\big[\sin(\omega_c PN[k] G_{DTC})\,(X_I^{2}[k] - X_Q^{2}[k]) + \cos(\omega_c PN[k] G_{DTC})\,(2X_I[k]X_Q[k])\big] \\
+ &\sin(\theta_{REF}[k])\big[\cos(\omega_c PN[k] G_{DTC})\,(X_I^{2}[k] - X_Q^{2}[k]) - \sin(\omega_c PN[k] G_{DTC})\,(2X_I[k]X_Q[k])\big]\Big\}.
\end{aligned} \tag{5.22}$$

According to Eq. (5.22), the compensation signal $D_{REF}[k]$ can be designed as

$$\begin{aligned}
D_{REF}[k] = {}& w_{1,REF}\big[\sin(\omega_c PN[k] G_{DTC})\,(X_I^{2}[k] - X_Q^{2}[k]) + \cos(\omega_c PN[k] G_{DTC})\,(2X_I[k]X_Q[k])\big] \\
+ {}& w_{2,REF}\big[\cos(\omega_c PN[k] G_{DTC})\,(X_I^{2}[k] - X_Q^{2}[k]) - \sin(\omega_c PN[k] G_{DTC})\,(2X_I[k]X_Q[k])\big].
\end{aligned} \tag{5.23}$$

Figure 5.20: A derivation of the update function, based on the proposed reference mitigation loop, used to minimize the reference pulling phase error

Compared to the DCO compensation signal in Eq.
(5.18), the predetermined terms include not only the baseband I/Q signals but also the injected PN code inside the trigonometric functions (i.e., $\cos(\omega_c PN[k] G_{DTC})$ and $\sin(\omega_c PN[k] G_{DTC})$). As the dithering scheme is implemented in the digital domain in this prototype, all the dither information and its corresponding trigonometric mappings can be pre-calculated. Similarly, the update function of the adaptive reference mitigation loop can be derived using the cost function shown in Fig. 5.20:

$$\begin{aligned}
\min E\big[\phi^{2}_{e,REF}[k]\big] &= \min E\big[(\phi_{REF,Pull}[k] * h'_{hp}[k] - D_{REF}[k] * h_{hp}[k])^{2}\big] \\
&= \min E\big[\{(\phi_{REF,Pull}[k] - N_{div}\,D_{REF}[k]) * h_{hp}[k]\}^{2}\big].
\end{aligned} \tag{5.24}$$

Here $h_{hp}[k]$ is the DPLL closed-loop high-pass response from $D_{REF}[k]$ to $\phi_{e,REF}[k]$. By substituting Eq. (5.23) into Eq. (5.24), the mean square error $\phi_{e,REF}[k]$ at the adaptive filter input can be minimized, which yields the update functions in Eq. (5.25) with the inputs defined in Eq. (5.26):

$$\begin{cases}
w_{1,REF}[k+1] = w_{1,REF}[k] - 2\mu_{s,REF}\,\phi_{e,REF}\,\{y_1[k] * h_{hp}[k]\} \\
w_{2,REF}[k+1] = w_{2,REF}[k] - 2\mu_{s,REF}\,\phi_{e,REF}\,\{y_2[k] * h_{hp}[k]\},
\end{cases} \tag{5.25}$$

where

$$\begin{cases}
y_1[k] = \sin(\omega_c PN[k] G_{DTC})\,(X_I^{2}[k] - X_Q^{2}[k]) + \cos(\omega_c PN[k] G_{DTC})\,(2X_I[k]X_Q[k]) \\
y_2[k] = \cos(\omega_c PN[k] G_{DTC})\,(X_I^{2}[k] - X_Q^{2}[k]) - \sin(\omega_c PN[k] G_{DTC})\,(2X_I[k]X_Q[k]).
\end{cases} \tag{5.26}$$

Equation (5.26) works as a PN correlator that extracts useful information from the reference pulling phase error to adapt the adaptive filter, which will be discussed in Section 5.9.3. In the steady state, $w_{1,REF}[k]$ and $w_{2,REF}[k]$ should approach $\alpha_{REF}[k]\cos(\theta_{REF}[k])$ and $\alpha_{REF}[k]\sin(\theta_{REF}[k])$ for complete nullification.

Figure 5.21: Overall DPLL architecture with the proposed DCO and reference pulling mitigation scheme

5.9 Circuit Implementation of Pulling Mitigation Scheme

5.9.1 Proposed DPLL Architecture

The overall DPLL architecture is shown in Fig. 5.21. There are three key components related to the dither-assisted pulling mitigation scheme.
First, the initial block is a PN sequence generator that controls the delay of the reference clock buffer via a DTC (i.e., a bank of unitary capacitors). The PN dither code is further used by the dither noise cancellation loop [18] to remove the dithering effect at the TDC output. Second, the subsequent block is the reference pulling mitigation loop, which is inserted prior to the DLF; it is an LMS loop that minimizes the mean square value of the PN correlator's output $\phi_e$. The third crucial block is the DCO pulling mitigation loop, which utilizes another LMS loop to minimize the mean square value of the node $\phi_e$. Both LMS loops take either external digital I/Q inputs (for PA pulling) or on-chip direct digital synthesis (DDS) outputs (for oscillator mutual pulling), depending on the type of pulling signal. The digital I/Q signals can be generated in MATLAB and synthesized using an arbitrary waveform generator (AWG). Before they are sent into the digital core, the I/Q signals are first retimed by an asynchronous FIFO in order to synchronize them with the DPLL reference clock. After the mitigation loop, a type-II DLF controls a delta-sigma modulator, which toggles a bank of fine-resolution varactors inside the DCO. The DLF is programmable with a bandwidth of up to about 1 MHz. A calibration-free TDC is injection-locked by the LC-DCO and can provide 28 levels of fine-phase quantization within one DCO period. An accumulated frequency control word (FCW) is subtracted at the TDC output to allow integer-N operation.

5.9.2 Implementation of the DCO Mitigation Loop

According to the derivations shown in Eqs. (5.19)–(5.21), the DCO mitigation loop can be implemented as shown in Fig. 5.22. Considering Eq. (5.19), the compensation signal $D_{DCO}[k]$ can be generated by multiplying the variables $2X_I[k]X_Q[k]$ and $X_I^{2}[k] - X_Q^{2}[k]$ with the weights $w_{1,DCO}[k]$ and $w_{2,DCO}[k]$.
Figure 5.22: The implementation of the proposed DCO pulling mitigation loop

$2X_I[k]X_Q[k]$ and $X_I^{2}[k] - X_Q^{2}[k]$ are implemented with a 10-bit finite word length and calculated using conversion logic, taking the two variables $X_I[k]$ and $X_Q[k]$ directly from the asynchronous FIFO output. Note that if an oscillator mutual pulling case is considered, the sinusoidal waveforms generated by the DDS can be used directly as $input_1[k]$ and $input_2[k]$ without extra computation. As $2X_I[k]X_Q[k]$ and $X_I^{2}[k] - X_Q^{2}[k]$ have their own adaptive loops to adjust gains and phases independently, the update functions of $w_{1,DCO}[k]$ and $w_{2,DCO}[k]$ can be realized with two adaptive FIR filters. The two paths can then be combined with an adaptive linear combiner architecture [27] before they are injected back into the DPLL loop. A variety of adaptive filter topologies can be used for the intended mitigation, but the LMS algorithm is simple and effective at obtaining the desired results.

Note that the band-pass filter response $h_{bp}[k]$ in Eq. (5.21) consists of a DCO gain, a TDC gain, and a DLF response. In order to implement the exact response $h_{bp}[k]$ in the digital domain, the DCO and TDC gains would have to be continuously measured over PVT variations. Otherwise, an inaccurate estimate of $h_{bp}[k]$ can affect the updates of $w_{1,DCO}[k]$ and $w_{2,DCO}[k]$ and degrade the final mitigation result. To avoid this design overhead, one can extend the single-tap FIR architecture to a multi-tap FIR architecture such that the weights $w_{1,DCO}[k]$ and $w_{2,DCO}[k]$ become vectors (i.e., $\vec{w}_{1,DCO}[k]$ and $\vec{w}_{2,DCO}[k]$) that compensate for the estimation errors. Based on numerical simulation, a four-tap FIR filter structure is sufficient: it achieves the desired mitigation accuracy with minimal design overhead.

Figure 5.23: The implementation of the proposed reference pulling mitigation loop
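As a rough illustration of Eqs. (5.19) and (5.21), the following Python sketch adapts the two weights with a single-tap LMS loop. It is a simplified behavioral model under stated assumptions, not the implemented circuit: the loop response $h_{bp}[k]$ is taken as a unit impulse, the factor $\omega_o \alpha_{DCO}/(2 Q I_s K_{DCO})$ is lumped into one hypothetical constant `gain`, and the residual is defined as pulling error minus compensation, so the update appears with a plus sign (in a real loop the sign depends on where, and with which polarity, the error is sensed). The names `lms_dco_mitigation`, `mu`, `gain`, and `theta_dco` are illustrative only.

```python
import math
import random


def lms_dco_mitigation(x_i, x_q, pull_error, mu=0.01):
    """Adapt w1, w2 so that w1*(2*XI*XQ) + w2*(XI^2 - XQ^2) tracks pull_error."""
    w1 = w2 = 0.0
    residual = []
    for xi, xq, phi in zip(x_i, x_q, pull_error):
        u1 = 2.0 * xi * xq        # first regressor of Eq. (5.19)
        u2 = xi * xi - xq * xq    # second regressor of Eq. (5.19)
        err = phi - (w1 * u1 + w2 * u2)  # residual after compensation
        # LMS step; plus sign because err = target - estimate in this sketch.
        w1 += 2.0 * mu * err * u1
        w2 += 2.0 * mu * err * u2
        residual.append(err)
    return w1, w2, residual


# Synthetic, noise-free test signal (all values are hypothetical).
rng = random.Random(0)
n = 20000
gain, theta_dco = 0.3, 0.7
x_i, x_q, pull = [], [], []
for _ in range(n):
    a_bb = rng.uniform(0.5, 1.0)                # baseband envelope
    theta_bb = rng.uniform(0.0, 2.0 * math.pi)  # baseband phase
    x_i.append(a_bb * math.cos(theta_bb))
    x_q.append(a_bb * math.sin(theta_bb))
    # Pulling error in the form of Eq. (5.16), with the constants lumped in gain.
    pull.append(gain * a_bb ** 2 * math.sin(2.0 * theta_bb + theta_dco))

w1, w2, residual = lms_dco_mitigation(x_i, x_q, pull)
print("w1:", w1, "target:", gain * math.cos(theta_dco))
print("w2:", w2, "target:", gain * math.sin(theta_dco))
```

In this idealized setting the weights settle at `gain*cos(theta_dco)` and `gain*sin(theta_dco)`, mirroring the steady-state condition stated after Eq. (5.19). A multi-tap version would replace each scalar weight with a short FIR vector to absorb the mismatch between the assumed and the actual $h_{bp}[k]$.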
Figure 5.24: The operation of a PN correlator: extracting DC information from the reference pulling phase error.

5.9.3 Implementation of the Reference Mitigation Loop

The implementation of the reference pulling mitigation loop is shown in Fig. 5.23. Because the update functions in Eq. (5.21) and Eq. (5.25) have the same form, two LMS adaptive filters are again required; however, their inputs are the two variables $y_1[k]$ and $y_2[k]$ instead of just $2X_I[k]X_Q[k]$ and $X_I^{2}[k] - X_Q^{2}[k]$, as the reference pulling phase error is now modulated by the PN code. Note that a closed-loop high-pass transfer function is observed here, and the injected compensation signal $D_{REF}[k]$ is filtered by the high-pass loop response $h_{hp}[k]$. The gains and phases of the reference coupling path need not be known in advance; they are tracked by the adaptive weights $w_{1,REF}[k]$ and $w_{2,REF}[k]$. Similarly, the mimic loop response $h_{hp}[k]$ introduces estimation errors and necessitates another adaptive four-tap FIR filter architecture.

The input variables $y_1[k]$ and $y_2[k]$ are obtained from the PN correlator, which realizes Eq. (5.26) and helps decorrelate the PN-dithered reference pulling phase error $\phi'_{REF,Pull}[k]$ from the DCO pulling phase error $\phi_{DCO,Pull}[k]$. The operation of the PN correlator is illustrated in Fig. 5.24. At the beginning of the mitigation, the TDC output contains the DCO and reference pulling phase errors at the same time. The PN correlator processes the TDC output with the known digital I/Q signals (i.e., $X_I[k]$ and $X_Q[k]$) and the PN code. Hence, DC information is extracted from the reference pulling phase error, whereas the DCO pulling phase error is mapped into a noise-like component. After both components are sent into the next stage of the adaptive filter, the accumulator gradually averages the noise term out, and the DC information due to the reference pulling phase error is used to determine the adaptation direction.
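The decorrelation described above can be checked numerically with a small Python sketch. It is only a behavioral illustration under several assumptions: the baseband signal has unit amplitude and a uniformly random phase, the dither phase $\omega_c PN[k] G_{DTC}$ is modelled as uniform over one full DCO period rather than generated by an actual PN sequence, loop filtering is ignored, and all gains, phases, and names (`correlate`, `alpha_ref`, `alpha_dco`, and so on) are hypothetical.

```python
import math
import random


def correlate(tdc_out, reference):
    """Average of the sample-wise product (a plain accumulate-and-average correlator)."""
    return sum(t * r for t, r in zip(tdc_out, reference)) / len(tdc_out)


rng = random.Random(1)
n = 200000
alpha_ref, theta_ref = 0.2, 0.4   # hypothetical reference-path gain and phase
alpha_dco, theta_dco = 0.5, 1.1   # hypothetical DCO-path gain and phase

y1, ref_pull, dco_pull = [], [], []
for _ in range(n):
    theta_bb = rng.uniform(0.0, 2.0 * math.pi)   # baseband phase, unit amplitude
    x_i, x_q = math.cos(theta_bb), math.sin(theta_bb)
    dither = rng.uniform(0.0, 2.0 * math.pi)     # stands in for omega_c * PN[k] * G_DTC
    diff_sq = x_i * x_i - x_q * x_q
    cross = 2.0 * x_i * x_q
    # y1[k] of Eq. (5.26)
    y1.append(math.sin(dither) * diff_sq + math.cos(dither) * cross)
    # PN-modulated reference pulling error, cf. Eq. (5.15)
    ref_pull.append(alpha_ref * math.sin(dither + 2.0 * theta_bb + theta_ref))
    # DCO pulling error, which carries no dither term
    dco_pull.append(alpha_dco * math.sin(2.0 * theta_bb + theta_dco))

corr_ref = correlate(ref_pull, y1)   # DC term, tends to alpha_ref*cos(theta_ref)/2
corr_dco = correlate(dco_pull, y1)   # noise-like term, averages toward zero
print("reference path:", corr_ref, "DCO path:", corr_dco)
```

Under these assumptions the correlation with the reference pulling error converges to a nonzero DC value proportional to $\alpha_{REF}\cos(\theta_{REF})$, while the correlation with the DCO pulling error shrinks as more samples are accumulated, even though the DCO path is the stronger of the two in this example.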
The rationale for this PN correlator is that it works like a matched filter that selects the desired signal for analysis, i.e., the PN-modulated term.

5.10 Experimental Results of Pulling Mitigation Scheme

A silicon prototype was implemented in a 65 nm CMOS process, as shown in Fig. 5.25. The main DPLL was used as the victim under the pulling aggressor; this DPLL occupied an active area of 0.48 mm². A second DPLL served as a sinusoidal aggressor to mimic an oscillator mutual pulling scenario. A modulated signal was coupled externally into the main DPLL to mimic a PA pulling scenario. With the 1 V supply, the analog part of the main DPLL consumed 16.4 mW. The digital part dissipated 4.92 mW in total: 1.1 mW for normal DPLL operation, 0.6 mW for the dither block, 1.04 mW for the DCO pulling mitigation loop, and 2.18 mW for the reference pulling mitigation loop.

Figure 5.25: A chip micrograph.

To verify the orthogonality provided by the dither-assisted technique, the impact of injecting a PN code on the DCO and reference paths was investigated. First, a sinusoidal pulling signal was coupled only to the DCO path. Based on Eq. (5.14), the dithering signal should not randomize the spurious pattern. Fig. 5.26(a) shows the spectrum of a sinusoidal DCO pulling spur before and after enabling the dithering. The spurious level was almost unchanged, with an elevated noise floor. After enabling the dither noise cancellation, the sinusoidal DCO pulling spur remained at almost the same power level, without much mitigation, as shown in Fig. 5.26(b).

Figure 5.26: Measured DPLL spectrum (a) with sinusoidal pulling to a DCO path and dithering and (b) with enabling the dither noise cancellation loop

Second, if a sinusoidal pulling signal coupled only to the reference path can be randomized by the dithering, it can be concluded that the orthogonality holds for the dither-assisted technique.
Figure 5.27(a) shows the DPLL spectrum measured with sinusoidal pulling to only the reference path. Fig. 5.27(b) shows the DPLL spectrum with dithering, in which the noise floor was dominated by the introduced dithering noise. With the dithering noise cancellation, the noise floor improved by > 20 dB and returned to its original level with an additional 12dB of degradation due to the randomized pattern of the sinusoidal reference pulling signal, as shown in Fig. 5.27(c). To further mitigate this degradation, the reference pulling mitigation loop was enabled, as shown in Fig. 5.27(d), and the DPLL spectrum had a 2 dB improvement in the noise floor. This experiment showed that the proposed technique was effective at smearing the coupled reference pulling signal after PN modulation. Note that a sinusoidal disturbance was used in the orthogonality test for clear spectral observation, but the same principle is applicable to a modulated pulling signal.

Figure 5.27: Measured DPLL spectrum (a) with sinusoidal pulling to only the reference path, (b) with dithering, (c) with dither noise cancellation, and (d) with reference pulling mitigation loop

Figure 5.28: (a) The measured DPLL phase noise profile without injecting aggressor signal and (b) with a modulated signal generated from a 65 GS/s arbitrary waveform generator

After confirming the orthogonal property, the modulated pulling case was tested with coupling to the DCO and the reference path simultaneously. Before injecting the aggressor signal, the DPLL operated at a 3.2 GHz carrier frequency with a 32 MHz reference clock and an integrated RMS jitter of 394 fs, as shown in Fig. 5.28(a). The AWG generated two channels of 64-QAM pulling signals with different amplitudes and phase shifts to mimic different channel effects; the spectrum of the modulated signal is shown in Fig. 5.28(b). The amplitude of the pulling signal coupled to the reference path was 30% of that of the signal coupled to the DCO path. Fig.
5.29 shows a wideband spectral improvement of 12 dB after the mitigation was enabled. After coupling the modulated signal, the integrated RMS jitter degraded to 3.39 ps, and it improved to 1.04 ps after the mitigation technique was used.

Figure 5.29: The measured DPLL spectrum with PA pulling (a) before and (b) after activating the proposed pulling mitigation scheme

For the oscillator mutual coupling experiment, a second on-chip DPLL was enabled to operate at a frequency 3.2 MHz away from the carrier frequency of the victim DPLL. After enabling the dithering and mitigation loops, the worst-case spurious level at the strongest sidelobe improved by 22.45 dB, as shown in Fig. 5.30. Fig. 5.31(a) shows an internal node at the DLF input that was captured and post-processed in MATLAB to show the DPLL's state before and after mitigation. Note that the time-domain waveform validated the removal of the unwanted pulling disturbance. Finally, the reference spur performance is shown in Fig. 5.31(b).

Figure 5.30: The measured DPLL spectrum with oscillator mutual pulling (a) before and (b) after activating the proposed pulling mitigation scheme

Figure 5.31: (a) The time-domain waveform of a DLF input before and after enabling both mitigation loops with disturbances removed and (b) the DPLL reference spur performance

Figure 5.32: Testing configurations

The first testing configuration is illustrated in Fig. 5.32(a); with this configuration, the results shown in Figs. 5.28(b) and 5.29 were obtained. An M9502A 65 GS/s AWG was used to synthesize the two modulated signals and the baseband I/Q signals for mitigation in the digital domain. An Si5341/40 evaluation board was used as a clean reference source. An E4440A spectrum analyzer was used to capture the DPLL output for spectrum measurements. Note that the evaluation board generated an external 10 MHz reference to synchronize all the equipment. With the testing configuration shown in Fig. 5.32(b), the results in Figs. 5.30 and 5.31(a) were obtained.
The second on-chip DPLL was enabled, and internal digital information was analyzed using a USBee QX logic analyzer. Finally, the orthogonality measurements, including those in Figs. 5.26 and 5.27, were carried out using the test configurations shown in Figs. 5.32(c) and 5.32(d). Only one channel of the 65 GS/s AWG was used to generate the sinusoidal pulling signal for the victim DPLL. It was possible to independently control the coupling to the DCO or the reference path.

Table 5.2: Comparison of state-of-the-art PLLs with pulling mitigation

Table 5.2 lists the key highlights in comparison with state-of-the-art PLLs applying pulling mitigation. The table shows that this work achieves the highest reported improvement for oscillator mutual pulling and wideband suppression for modulated PA pulling. Furthermore, it shows that the proposed technique is the first pulling mitigation for simultaneous DCO and reference path coupling.

5.11 Conclusions

At the beginning of this chapter, the DCO-induced spur cancellation scheme was discussed to demonstrate the possibility of cancelling DCO spurs by simply modifying the adaptive algorithm discussed in Chapters 2, 3, and 4. Afterwards, the pulling mitigation technique was presented together with measurement results. The technique was developed to mitigate pulling effects on DPLLs, which can be affected by strong power amplifier outputs and local PLLs. It was shown that the proposed technique orthogonalizes pulling signals that couple simultaneously to the victim PLL through the DCO and reference paths. Without additional hardware or power consumption overhead, the implemented mitigation loop uses the same logic for PA pulling (with modulated signals) and mutual oscillator pulling (with sinusoidal signals). Moreover, with external injection of PA pulling, the mitigation loop results in a spectral suppression of 12 dB and an integrated RMS jitter improvement from 3.39 ps to 1.04 ps.
The proposed mitigation loop also reduces coupled spur levels by 22.5 dB when a second on-chip DPLL is enabled and operating 3.2 MHz apart from the victim DPLL.

Chapter 6 Conclusions

6.1 Summary

Advanced CMOS technology promotes the application of discrete-time and mixed-signal processing techniques to the conventional analog PLL architecture. As a result, digital PLL and all-digital PLL (ADPLL) architectures have been explored, as these architectures benefit from technology scaling. Moving the signal processing from the voltage domain to the digital domain provides many opportunities to improve PLL performance with various DSP techniques, which is not currently feasible with an analog PLL architecture. However, due to the quantization effects of the digital architecture, the nonlinearity of the DPLL dynamics severely degrades spur performance; this is a significant DPLL challenge. To demonstrate the advantage of the DPLL architecture, several proof-of-concept prototypes were fabricated to explore the possibility of mitigating spurs and interference with digital adaptive algorithms.

For fractional spurs, the proposed feedforward multi-tone spur cancellation and reference path dithering techniques achieved a significant improvement of 20–40 dB from the near-carrier to the out-of-band spur range. Moreover, for reference spurs, layout and isolation techniques were explored, and a performance of <-110dBc was measured without the need for additional calibration. For externally coupled interference and pulling signals, more than 20 dB rejection of interference-induced artifacts was observed in the bands of interest using the different prototypes in Chapter 5, and the proposed mitigation platform demonstrates the robustness of the technique under different aggressor scenarios. To summarize, a spur and interference mitigation technique was implemented for the DPLL architecture.
The contributions of this work include the development of innovative concepts and theoretical findings. Several proof-of-concept prototypes were provided that had record spurious performance and validated the technique's effectiveness. Finally, it was shown that the proposed DPLL architecture provides a new design direction and capability for PLL designers to go beyond what existing architectures can do.

6.2 Recommendations for Future Work

The proposed technique focused on improving DPLL performance with regard to spurs. The concepts of the DSP-enabled DPLL architecture and adaptive algorithms can be extended in various directions. A few examples are given here.

First, a critical issue related to wireless communication is the cost of the frequency synthesizer, or PLL. In order to achieve stringent noise requirements, PLLs are usually designed with LC-tank oscillators to ensure favorable phase-noise performance. Unlike in PLL topologies with ring oscillators, the area occupied by the passive inductors is extremely large. In future SoC platforms, such as those for 5G communication, the integration of multiple PLLs on one SoC chip cannot be avoided. However, the cost of such an integration may be high due to the large inductors. Hence, it may be valuable to replace LC-tank oscillators with ring oscillators. However, it may be difficult to reduce the phase noise of ring oscillators to the level of LC-tank oscillators. Based on the discussion in Chapter 4, a noise component may be mitigated with a digital cancellation scheme if the contribution of the noise can be analyzed and used correctly. Future research may examine the extension of such a cancellation technique to address the phase noise of ring oscillators.

Second, low-spur DPLLs are suitable for Internet of Things (IoT) and bio-medical applications in which a major concern is low-power operations.
Under low-power conditions, the spurious performance of frequency synthesizers can be challenging because systems may not be able to afford an additional spur cancellation scheme. Note that if the power budget of the frequency synthesizer needs to be increased, the performance requirements of the other building blocks become more stringent. To avoid adding to the burden on the system-level specifications, the technique used in Chapter 4 may be appropriate, as the proposed spur mitigation scheme consumes very little power and area. Future research could examine this.

To conclude, this thesis provides multiple directions for future studies on not only the performance optimization of the DPLL but also the integration of advanced SoC platforms with the proposed DSP-enabled DPLL architecture.

References

[1] R.B. Staszewski, K. Muhammad, D. Leipold, C.-M. Hung, Y.-C. Ho, J.L. Wallberg, C. Fernando, K. Maggio, R. Staszewski, T. Jung, J. Koh, S. John, I.Y. Deng, V. Sarda, O. Moreira-Tamayo, V. Mayega, R. Katz, O. Friedman, O.E. Eliezer, E. de-Obaldia and P.T. Balsara, "All-digital TX Frequency Synthesizer and Discrete-time Receiver for Bluetooth Radio in 130-nm CMOS," IEEE J. Solid-State Circuits, vol. 39, pp. 2278-2291, Dec. 2004.
[2] S.D. Vamvakos, R.B. Staszewski, M. Sheba, K. Waheed, "Noise Analysis of Time-to-Digital Converter in All-Digital PLLs," IEEE Dallas/CAS Workshop on, pp. 87-90, Oct. 2006.
[3] B. Razavi, "A study of injection pulling and locking in oscillators," IEEE J. Solid-State Circuits, vol. 39, pp. 1415-1424, Aug. 2004.
[4] C.M. Hsu, M.Z. Straayer and M.H. Perrott, "A low-noise wide-BW 3.6-GHz digital fractional-N frequency synthesizer with a noise-shaping time-to-digital converter and quantization noise cancellation," IEEE J. Solid-State Circuits, vol. 43, pp. 2776-2786, Dec. 2008.
[5] E. Temporiti, C. Weltin-Wu, D. Baldi, M. Cusmai and F.
Svelto, "A 3.5GHz Wideband ADPLL with Fractional Spur Suppression Through TDC Dithering and Feedforward Compensation," IEEE J. Solid-State Circuits, vol. 45, pp. 2723-2736, Dec. 2010.
[6] F. Opteynde, "A 40nm CMOS all-digital fractional-N synthesizer without requiring calibration," in ISSCC Dig. Tech. Papers, pp. 346-347, Feb. 2012.
[7] A. Elshazly, R. Inti, W. Yin, B. Young and P.K. Hanumolu, "A 0.4-to-3 GHz Digital PLL With PVT Insensitive Supply Noise Cancellation Using Deterministic Background Calibration," IEEE J. Solid-State Circuits, vol. 46, pp. 2759-2771, Dec. 2011.
[8] M. Elsayed, M. Abdul-Latif and E. Sánchez-Sinencio, "A Spur-Frequency-Boosting PLL with a -74dBc Reference-Spur Suppression in 90nm Digital CMOS," IEEE J. Solid-State Circuits, vol. 48, pp. 2104-2117, Sep. 2013.
[9] H. Kim, J. Sang, H. Kim, Y. Jo, T. Kim, H. Park and S.H. Cho, "A 5GHz -95dBc-reference-spur 9.5mW digital fractional-N PLL using reference-multiplied time-to-digital converter and reference-spur cancellation in 65nm CMOS," in ISSCC Dig. Tech. Papers, pp. 258-259, Feb. 2015.
[10] R.B. Staszewski, K. Waheed, F. Dulger and O.E. Eliezer, "Spur-Free Multirate All-Digital PLL for Mobile Phones in 65 nm CMOS," IEEE J. Solid-State Circuits, vol. 46, pp. 2904-2919, Dec. 2011.
[11] G.L. Puma and C. Carbonne, "Mitigation of Oscillator Pulling in SoCs," IEEE J. Solid-State Circuits, vol. 51, pp. 348-356, Feb. 2016.
[12] R. Winoto, A. Olyaei, M. Hajirostam, W. Lau, X. Gao, A. Mitra, O. Carnu, P. Godoy, L. Tee, H. Li, E. Erdogan, A. Wong, Q. Zhu, T. Loo, F. Zhang, L. Sheng, D. Cui, A. Jha, X. Li, W. Wu, K.-S. Lee, D. Cheung, K.W. Pang, H. Wang, J. Liu, X. Zhao, D. Gangopadhyay, D. Cousinard, A.A. Paramanandam, X. Li, N. Liu, W. Xu, Y. Fang, X. Wang, R. Tsang and L. Lin, "A 2x2 WLAN and Bluetooth Combo SoC in 28nm CMOS with On-Chip WLAN Digital Power Amplifier, Integrated 2G/BT SP3T Switch and BT Pulling Cancellation," in ISSCC Dig. Tech. Papers, pp. 170-171, Feb. 2016.
[13] A.
Mirzaei and H. Darabi, "Pulling Mitigation in Wireless Transmitters," IEEE J. Solid-State Circuits, vol. 49, pp. 1958-1970, Sep. 2014.
[14] C.R. Ho and M.S.W. Chen, "A fractional-N DPLL with adaptive spur cancellation and calibration-free injection-locked TDC in 65nm CMOS," in Proc. IEEE Radio Frequency Integrated Circuits (RFIC) Symposium, pp. 97-100, June 2014.
[15] C.R. Ho and M.S.W. Chen, "A fractional-N DPLL with Calibration-Free Multi-phase injection-locked TDC and adaptive single-tone spur cancellation Scheme," IEEE Trans. Circuits Syst. I: Reg. Papers, vol. 63, pp. 1111-1122, Aug. 2016.
[16] C.R. Ho and M.S.W. Chen, "A Digital PLL with Feedforward Multi-Tone Spur Cancellation Loop Achieving <-73dBc Fractional Spur and <-110dBc Reference Spur in 65nm CMOS," in ISSCC Dig. Tech. Papers, pp. 190-191, Feb. 2016.
[17] C.R. Ho and M.S.W. Chen, "A Digital PLL with Feedforward Multi-Tone Spur Cancellation Loop Achieving <-73dBc Fractional Spur and <-110dBc Reference Spur in 65nm CMOS," IEEE J. Solid-State Circuits, vol. 51, pp. 3216-3230, Feb. 2016.
[18] C.R. Ho and M.S.W. Chen, "A fractional-N Digital PLL with Background Dither Noise Cancellation Loop Achieving <-62.5dBc Worst-Case Near-Carrier Fractional Spur in 65nm CMOS," in ISSCC Dig. Tech. Papers, to be published in Feb. 2018.
[19] C.R. Ho and M.S.W. Chen, "Interference-induced DCO Spur Mitigation for Digital Phase Locked Loop in 65-nm CMOS," in Proc. ESSCIRC, pp. 213-216, Sep. 2016.
[20] C.R. Ho and M.S.W. Chen, "A Digital Frequency Synthesizer with Dither-Assisted Pulling Mitigation for Simultaneous DCO and Reference Path Coupling," in ISSCC Dig. Tech. Papers, to be published in Feb. 2018.
[21] M.S.W. Chen, D. Su and S. Mehta, "A Calibration-Free 800MHz Fractional-N Digital PLL with Embedded TDC," in ISSCC Dig. Tech. Papers, pp. 472-473, Feb. 2010.
[22] J.C. Chien and L.H. Lu, "Analysis and Design of Wideband Injection-Locked Ring Oscillator With Multiple-Input Injection," IEEE J.
Solid-State Circuits, vol. 42, pp. 1906-1915, Sep. 2007.
[23] B. Mesgarzadeh and A. Alvandpour, "A study of Injection Locking in Ring Oscillators," IEEE International Symposium in Circuit and Systems (ISCAS), vol. 6, pp. 5456-5468, May 2005.
[24] S. Kalia, M. Elbadry, B. Sadhu, S. Patnaik, J. Qiu and R. Harjani, "A Simple, Unified Phase Noise Model for Injection-Locked Oscillators," in Proc. IEEE Radio Frequency Integrated Circuits (RFIC) Symposium, pp. 1-4, June 2011.
[25] R. Adler, "A Study of Locking Phenomena in Oscillators," in Proc. of the IRE, vol. 34, pp. 351-357, June 1946.
[26] O.E. Eliezer, R.B. Staszewski, I. Bashir, S. Bhatara and P.T. Balsara, "A Phase Domain Approach for Mitigation of Self-Interference in Wireless Transceivers," IEEE J. Solid-State Circuits, vol. 44, pp. 1436-1453, May 2009.
[27] S. Haykin, "Adaptive Filter Theory," Prentice Hall, Upper Saddle River, NJ, 1996.
[28] H.C. So, "Adaptive algorithm for sinusoidal interference cancellation," Electronics Letters, vol. 33, pp. 1910-1912, Oct. 1997.
[29] B. Widrow, J.R. Glover, J.M. McCool, J. Kaunitz, C.S. Williams, R.H. Hearn, J.R. Zeidler, Jr. E. Dong and R.C. Goodlin, "Adaptive noise cancelling: Principles and applications," Proceedings of the IEEE, vol. 63, pp. 1692-1716, Dec. 1975.
[30] D.B. Leeson, "A Simple Model of Feedback Oscillator Noise Spectrum," Proceedings of the IEEE, vol. 4, pp. 329-330, Feb. 1966.
[31] A.A. Abidi, "Phase Noise and Jitter in CMOS Ring Oscillator," IEEE J. Solid-State Circuits, vol. 41, pp. 1803-1816, Aug. 2006.
[32] T.H. Lee and A. Hajimiri, "Oscillator phase noise: a tutorial," IEEE J. Solid-State Circuits, vol. 35, pp. 326-336, Mar. 2000.
[33] R. Romano, S. Levantino, C. Samori and A.L. Lacaita, "Multiphase LC oscillators," IEEE Trans. Circuits Syst. I, vol. 53, pp. 1579-1588, July 2006.
[34] X. Zhang and A.B. Apsel, "A Low-Power, Process-and-Temperature-Compensated Ring Oscillator With Addition-Based Current Source," IEEE Trans.
Circuits Syst. I, vol. 58, pp. 868-878, May 2011.
[35] S. Henzler, S. Koeppe, D. Lorenz, W. Kamp, R. Kuenemund and D. Schmitt-Landsiedel, "A Local Passive Time Interpolation Concept for Variation-Tolerant High-Resolution Time-to-Digital Conversion," IEEE J. Solid-State Circuits, vol. 43, pp. 1666-1676, July 2008.
[36] D. Ham and A. Hajimiri, "Concepts and methods in optimization of integrated LC VCOs," IEEE J. Solid-State Circuits, vol. 36, pp. 896-909, June 2001.
[37] P. Andreani and A. Fard, "More on the 1/f² Phase Noise Performance of CMOS Differential-Pair LC-Tank Oscillators," IEEE J. Solid-State Circuits, vol. 41, pp. 2703-2712, Dec. 2006.
[38] A. Arakali, S. Gondi and P.K. Hanumolu, "Analysis and design techniques for supply-noise mitigation in phase-locked loop," IEEE Trans. Circuits Syst. I: Reg. Papers, vol. 57, pp. 2880-2889, Nov. 2010.
[39] X. Gao, E.A.M. Klumperink, G. Socci, M. Bohsali and B. Nauta, "Spur reduction techniques for phase-locked loops exploiting a sub-sampling phase detector," IEEE J. Solid-State Circuits, vol. 45, pp. 1809-1821, Sep. 2010.
[40] S. Mendel, C. Vogel and N.D. Dalt, "Signal and timing analysis of a phase-domain all-digital phase-locked loop with reference retiming mechanism," in Mixed Design of Integrated Circuits & Systems, pp. 681-687, June 2009.
[41] R.B. Staszewski, J. Wallberg and P.T. Balsara, "All-digital frequency synthesizer in deep-submicron CMOS," John-Wiley & Sons, Inc., NY, 2006.
[42] A. Gautam, Y.-D. Lee and W.-Y. Chung, "ECG signal de-noising with signal averaging and filtering algorithm," in Convergence and Hybrid Information Technology, ICCIT 08. Third International Conference on, pp. 409-415, 2008.
[43] F.M. Gardner, "Phaselock techniques," John-Wiley & Sons, Inc., NY, 1979.
[44] E.I. Jury, "Theory and Application of the Z-Transform Method," John-Wiley & Sons, Inc., NY, 1964.
[45] P. Depalle and S.
Tassart, "Fractional delay lines using LaGrange Interpola- tors, " in Proceeding of Nordical Acoustic Metting, June 1996. [46] Y. He, Y.-H. Liu, T. Kuramochi, J.V.D. Heuvel, B. Busze, N. Markulic, C. Bachmann and K. Philips, "A 673uW 1.8-to-2.5GHz Dividerless Fractional- N Digital PLL with an Inherent Frequency-Capture Capability and a Phase- Dithering Spur Mitigation for IoT Applications," in ISSCC Dig. Tech. Papers, pp. 420-421, Feb. 2017. 185 Appendix A A.1 The Reference Mitigation Loop with DPLL in Fractional-N Mode As noted in Chapter 5, an integer relationship between a reference clock and a pulling signal can be assumed. This assumption implies that a DPLL operates in an integer-N mode. However, the proposed technique can be extended to a fractional-N operation. With a fractional-N condition, the carrier frequency of a DPLL is equal to the multiplication of a reference clock frequency and an accumulated FCW value, dened as follows: F DCO = (N int +df)F REF : (A.1) 186 N int and df denote an integer and fractional division ratio. If the interfering wave- form with the carrier frequency shown in Chapter 5.6.1 is coupled to the reference path, the new reference pulling phase error 00 REF;Pull [k] can appear as follows: 00 REF;Pull [k] REF [k]A 2 BB sinf! int PN[k]G DTC + 2dfkT REF + 2 BB [k] + REF [k]g (A.2) ! int is equal to 2N int F REF . The modied update function can be rewritten as Eq. (A.3), and the new outputs y 0 1 [k] and y 0 2 [k] will be generated in the PN correlator as shown in Eq. (A.4): 8 > > > < > > > : w 0 1;REF [k + 1] =w 0 1;REF [k] 2 s;REF e;REF fy 0 1 [k]h hp [k]g w 0 2;REF [k + 1] =w 0 2;REF [k] 2 s;REF e;REF fy 0 2 [k]h hp [k]g (A.3) , and 8 > > > > > > > > > > > > > < > > > > > > > > > > > > > : y 0 1 [k] = sin(! c PN[k]G DTC + 2dfkT REF ) (X 2 I [k]X 2 Q [k]) +cos(! c PN[k]G DTC + 2dfkT REF ) (2X I [k]X Q [k]) y 0 2 [k] = cos(! c PN[k]G DTC + 2dfkT REF ) (X 2 I [k]X 2 Q [k]) sin(! 
c PN[k]G DTC + 2dfkT REF ) (2X I [k]X Q [k]) (A.4) 187 Figure A.1: The implementation of the modied PN correlator when DPLL operates at fractional-N mode The corresponding implementation is illustrated in Fig. A.1, in which the trigono- metric function of 2dfkT REF is realized using DDS logic and the value df is already known a priori. 188
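The structure of Eq. (A.1) and Eq. (A.4) can be illustrated with a short numerical sketch. This is only an illustration under stated assumptions, not the thesis implementation: the function names (dco_frequency, dds_phase, correlator_outputs), the example values, and the normalization T_REF = 1 are all hypothetical, and Eq. (A.4) is read here as a rotation of the pair (X_I^2 - X_Q^2, 2 X_I X_Q) by the angle omega_c PN[k] G_DTC + 2 pi df k T_REF.

```python
import math


def dco_frequency(n_int, df, f_ref):
    """Eq. (A.1): F_DCO = (N_int + df) * F_REF."""
    return (n_int + df) * f_ref


def dds_phase(k, df, t_ref=1.0):
    """Phase term 2*pi*df*k*T_REF, wrapped to [0, 2*pi) as a DDS would.

    t_ref defaults to 1.0 (normalized reference period), an assumption
    made for this sketch only.
    """
    return (2.0 * math.pi * df * k * t_ref) % (2.0 * math.pi)


def correlator_outputs(k, x_i, x_q, pn, omega_c, g_dtc, df, t_ref=1.0):
    """Return (y1', y2') for sample k, following the structure of Eq. (A.4)."""
    angle = omega_c * pn * g_dtc + dds_phase(k, df, t_ref)
    diff_sq = x_i * x_i - x_q * x_q   # X_I^2 - X_Q^2
    cross = 2.0 * x_i * x_q           # 2 * X_I * X_Q
    y1 = math.sin(angle) * diff_sq + math.cos(angle) * cross
    y2 = math.cos(angle) * diff_sq - math.sin(angle) * cross
    return y1, y2


if __name__ == "__main__":
    # Hypothetical numbers chosen only for illustration.
    print(dco_frequency(n_int=50, df=0.25, f_ref=40e6))
    print(correlator_outputs(k=3, x_i=0.6, x_q=0.8, pn=1,
                             omega_c=2.0 * math.pi * 2.0e9,
                             g_dtc=1.0e-12, df=0.25))
```

With X_I = A cos(theta) and X_Q = A sin(theta), the two outputs reduce to A^2 sin(angle + 2 theta) and A^2 cos(angle + 2 theta), which matches the 2 theta_BB term inside the sine of Eq. (A.2). Because df is known in advance, the extra phase ramp can be produced by a simple phase accumulator, which is the role the text assigns to the DDS logic.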
Abstract
Phase-locked loops (PLLs) are deployed in most electronic systems to generate a desired clock frequency, perform clock data recovery (CDR), and implement frequency/phase modulation, among other applications. Because of stringent system-level specifications, the rapid growth of modern electronic systems (e.g., 5G communication or biomedical applications) has imposed tighter design constraints on component-level blocks such as PLLs. One challenge is to build robust, low-spur PLLs, which many applications need in order to avoid unwanted reciprocal mixing of blocker signals, prevent emission mask violations in a wireless transmitter, and minimize deterministic jitter in a clock source.
Designing a low-spur PLL is not a straightforward problem. A spur can originate externally or internally and can take a variety of patterns, such as a sinusoidal, sawtooth-like, or modulated waveform. External spurs are caused by nearby interfering aggressors, which can be clocks in other domains or the power amplifier (PA) output in a system-on-chip (SoC) platform. Notably, external interference may couple into the PLL via the silicon substrate, bond wires, power supplies, or even the inductor inside an LC-tank oscillator to generate spurs. Even worse, each coupling path has a different transfer function from the spur input node to the final PLL output, which makes it nearly impossible to pre-calculate the coupling transfer function. Internal spurs, in contrast, include fractional and reference spurs; they result from the nature of PLL operation and are difficult to filter externally. Adding a spur mitigation scheme sometimes causes stability issues in the loop dynamics, which may sacrifice the noise performance of the PLL. Consequently, it is challenging to design a reliable, low-spur PLL.
To resolve these issues, this thesis introduces a generic spur and interference mitigation platform for the digital PLL (DPLL) architecture. The platform leverages various adaptive filter algorithms and is capable of mitigating both types of spurs. Because of its intensive digital embodiment, the DPLL is highly flexible and reconfigurable. Furthermore, the digital core of a DPLL is substantially robust against process, voltage, and temperature (PVT) variations because it does not suffer from analog non-idealities such as limited supply headroom. Given these advantages of the DPLL over its analog counterpart, this digital architecture is used to demonstrate the proposed spur and interference cancellation platform in this thesis.
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
RF and mm-wave blocker-tolerant reconfigurable receiver front-ends
Mixed-signal integrated circuits for interference tolerance in wireless receivers and fast frequency hopping
Propagation channel characterization and interference mitigation strategies for ultrawideband systems
Digital to radio frequency conversion techniques
Nonuniform sampling and digital signal processing for analog-to-digital conversion
Wideband low phase-noise RF and mm-wave frequency generation
Delay-locked loop-based fractional frequency counter for magnetic biosensors
Charge-mode analog IC design: a scalable, energy-efficient approach for designing analog circuits in ultra-deep sub-µm all-digital CMOS technologies
Silicon photonics integrated circuits for analog and digital optical signal processing
Surface acoustic wave waveguides for signal processing at radio frequencies
Bidirectional neural interfaces for neuroprosthetics
Memristive device and architecture for analog computing with high precision and programmability
Towards high-performance low-cost AMS designs: time-domain conversion and ML-based design automation
Real-time reservoir characterization and optimization during immiscible displacement processes
Analog and mixed-signal parameter synthesis using machine learning and time-based circuit architectures
Investigations of Mie resonance-mediated all dielectric functional metastructures as component-less on-chip classical and quantum optical circuits
Models and information rates for channels with nonlinearity and phase noise
A power adaptive low power low noise band-pass auto-zeroing CMOS amplifier for biomedical implants
Calibration of digital-to-analog converters in highly-integrated RF transceivers using machine learning
Engineering solutions for biomaterials: self-assembly and surface-modification of polymers for clinical applications
Asset Metadata
Creator
Ho, Cheng-Ru (author)
Core Title
A generic spur and interference mitigation platform for next generation digital phase-locked loops
School
Viterbi School of Engineering
Degree
Doctor of Philosophy
Degree Program
Electrical Engineering
Publication Date
04/17/2020
Defense Date
11/16/2017
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
adaptive filter,ADPLL,DPLL,injection‐locked TDC,injection‐locked time‐to‐digital converter,interference cancellation,OAI-PMH Harvest,PLL,pulling mitigation,spur cancellation
Format
application/pdf (imt)
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Chen, Mike Shuo-Wei (committee chair), Beerel, Peter (committee member), Chugg, Keith (committee member), Hashemi, Hossein (committee member), Yoon, Jongseung (committee member)
Creator Email
chengruh@usc.edu,scc07090234@outlook.com
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c40-492864
Unique identifier
UC11266888
Identifier
etd-HoChengRu-6256.pdf (filename),usctheses-c40-492864 (legacy record id)
Legacy Identifier
etd-HoChengRu-6256.pdf
Dmrecord
492864
Document Type
Dissertation
Format
application/pdf (imt)
Rights
Ho, Cheng-Ru
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA