Delay-Locked Loop-Based Fractional Frequency Counter for Magnetic Biosensors

by Xiao Chu

A Thesis Presented to the FACULTY OF THE USC VITERBI SCHOOL OF ENGINEERING, UNIVERSITY OF SOUTHERN CALIFORNIA, In Partial Fulfillment of the Requirements for the Degree MASTER OF SCIENCE (ELECTRICAL ENGINEERING)

May 2022

Copyright 2022 Xiao Chu

Acknowledgements

I would like to express my sincere appreciation to Prof. Constantine Sideris for allowing me to conduct research in his lab, offering insights on analog and mixed-signal design, and providing significant help on this project. I would like to thank my thesis committee members, Prof. Hossein Hashemi, Prof. Mike Chen, and Prof. Manuel Monge, for their time and support. I also appreciate the knowledge that they shared in their classes. Finally, I would like to thank my parents for their dedicated love, unconditional encouragement, and support.

Abstract

The increasing demand for point-of-care systems that help realize fast, home-based disease diagnosis drives the development of accurate and quick-response biosensors. A promising technology is the frequency-shift magnetic biosensor, which uses an LC oscillator as the sensor core and detects target biomolecules, both qualitatively and quantitatively, through the downshift of its oscillation frequency. This technology has the potential for widespread use because its simple structure, high sensitivity, and compatibility with modern CMOS technology enable time- and cost-effective biomolecule detection. The accurate and rapid counting of the frequency output from a frequency-shift magnetic biosensor is necessary to efficiently reconstruct the concentration of target biomolecules. Conventional full-cycle frequency counting, which involves recording the number of rising edges ($N$) in a certain period ($T_{count}$), cannot detect the initial and final skews between the clock that marks $T_{count}$ and the signal to be measured, thereby diminishing the accuracy of frequency reconstruction.
A longer counting time is required to improve the accuracy and resolution of full-cycle counting, but this strategy degrades the response speed of a biosensor, thus introducing a trade-off. This project developed a delay-locked loop (DLL)-based fractional frequency counter, which can count a fraction of a full clock cycle, to simultaneously enhance the accuracy, resolution, and speed of frequency counting and further improve the overall performance of frequency-shift magnetic biosensors. This fractional counter is applicable not only to biosensors but also to any procedure that requires high-speed and precise frequency counting.

Analog DLLs were chosen over their digital counterparts because of their higher delay resolution and better jitter performance. A voltage-controlled delay line (VCDL) composed of current-starved delay cells was optimized for a wide delay tuning range, low jitter, low power consumption, low supply sensitivity, and a small area. Regulated cascode current sources with switches that reduce charge injection, charge sharing, and clock feedthrough were used to construct a charge pump (CP) that operates quickly over a wide range of output control voltage. A phase frequency detector (PFD) with a false-lock-prevention circuit was inserted into the loop to enhance the robustness of the DLL under frequency changes. As a result, the DLL-based fractional counter designed in TSMC 65 nm technology can reconstruct the frequency of an input signal from 42 MHz to 216 MHz while consuming less than 7.12 mW. Counting is 32 times faster and the resolution of frequency reconstruction is 32 times higher than with conventional full-cycle frequency counting.

Table of Contents

Acknowledgements
Abstract
List of Tables
List of Figures
Chapter 1: Introduction
  1.1 Introduction to a frequency-shift-based magnetic biosensor
  1.2 Frequency counting for frequency-shift-based magnetic biosensors
Chapter 2: DLL Theories
  2.1 DLL frequency response
  2.2 DLL time domain jitter transfer function
Chapter 3: Voltage-Controlled Delay Line (VCDL)
  3.1 Review of delay cell structure
  3.2 Operation principle of current-starved delay cells
  3.3 Effect of parameter variations and mismatches
  3.4 Output jitter of a current-starved delay cell
  3.5 Structure of the VCDL and the delay cell
  3.6 Simulation results
Chapter 4: Charge Pump
  4.1 Structure
  4.2 Nonidealities of the basic CP
  4.3 CP structure improvements
  4.4 CP structure comparison
  4.5 Class AB voltage follower (VF) with complementary input used in the CP
  4.6 Simulation results
Chapter 5: Phase Frequency Detector (PFD)
  5.1 Structure and operation
  5.2 Non-idealities of the PFD
  5.3 False lock prevention
  5.4 Simulation results
Chapter 6: Fractional Frequency Counter
  6.1 Counting pattern generator (C<31:0>)
  6.2 Full-cycle counter
  6.3 Fractional counter
  6.4 Frequency reconstruction
Chapter 7: Top-level Simulation Results
Chapter 8: Conclusion and Future Work
References

List of Tables

2.1 The pattern of how the output jitters will change as the jitter source parameters change
4.1 Results of the charge pump structure comparison
4.2 Performance comparison of various kinds of VFs based on the simulation results in this report
4.3 Worst-case performance summary of the VF (0.2 V < $V_{ctrl}$ < 1 V)
4.4 Performance summary of the CP at the worst-case corner (SS, 1.08 V, 60 °C) under Monte Carlo
5.1 Performance summary of the PFD at the worst-case corner (SS, $V_{dd}$ = 1.08 V, temperature = 60 °C)
6.1 Output of Q<31:0> and $\overline{Q}$<0:31>
6.2 Truth table of an 8-to-3 priority encoder
7.1 Comparison of DLL performance in this project and previous studies

List of Figures

1.1 Illustration of the concept of frequency-shift-based magnetic biosensors [1], [10].
1.2 Conventional full-cycle frequency-counting principle and its error.
2.1 (a) Structure of the DLL and (b) example output waveforms of the DLL.
2.2 Linear s-domain model of the DLL.
3.1 (a) Resistive delay cell and (b) capacitive delay cell [36], [37].
3.2 (a) Current-starved inverter and (b) output waveforms under a rising input edge [39].
3.3 (a) Current-starved inverter and (b) output waveforms under variable MOSFET parameters under a rising input edge [39].
3.4 Conversion of delay cell voltage error ($\Delta V_{n,DC}$) into delay cell timing jitter ($\Delta t_{DC}$).
3.5 Schematic of a unit delay cell.
3.6 Two types of V-to-I converters (a) and (b).
3.7 Final structure of the delay cell and the V-to-I converter.
3.8 Schematic of the VCDL.
3.9 Layout of the VCDL.
3.10 Delay range of the VCDL at all corners.
4.1 Design flow of the CP.
4.2 (a) Conceptual diagram of a charge pump (CP) and (b) basic CP.
4.3 Description of how clock feedthrough affects the basic CP.
4.4 (a) Basic CP and (b) description of how charge sharing affects the basic CP.
4.5 (a) CP leakage model and (b) description of how $I_{up}$/$I_{dn}$ current leakage affects the basic CP.
4.6 The lock condition in the presence of $I_{up}$/$I_{dn}$ mismatches.
4.7 (a) Basic CP and (b) example $I_{up}$ and $I_{dn}$ waveform of the basic CP.
4.8 Output resistance improved PMOS current source of the CP using positive feedback.
4.9 (a) CP with feedback to the PMOS current source [54] and (b) $I_{up}$ and $I_{dn}$ waveforms of this CP.
4.10 (a) CP with feedback to both the PMOS and NMOS current sources [54] and (b) example $I_{up}$ and $I_{dn}$ waveforms of this CP.
4.11 Output resistance improved NMOS current source of the CP using negative feedback: (a) cascode and (b) regulated cascode.
4.12 (a), (b) Output resistance improved NMOS current source of the CP using regulated cascode and the biasing circuit; (c)–(f) simple amplifier structures that could be used in (a) and (b).
4.13 (a) Output resistance improved NMOS current source of the CP using regulated cascode and the biasing circuit. (b) Regulated cascode current source with a level shifter for reducing $V_{compliance\_M1}$.
4.14 Large signal voltages $V_{d1}$ and $V_{g2}$ and currents $I_{up}$ and $I_{dn}$ as a function of $V_{ctrl}$.
4.15 Complementary switch.
4.16 Complementary switch and dummy switch.
4.17 The principle of charge-sharing reduction.
4.18 Charge pump structure comparison. There are four types of current sources: ① single transistor current source, ② cascode current source, ③ regulated cascode current source, and ④ regulated cascode current source with a level shifter. There are also four types of switches: (a) switch at the output node, (b) switch at the power supply, (c) switch at the bias circuit, and (d) switch at the output node with charge-sharing reduction.
4.19 Final charge pump structure.
4.20 Classification of various kinds of VFs.
4.21 (a) Source follower, (b) FVF, and (c) FVF feedback analysis referenced from [62].
4.22 (a) Differential flipped voltage follower (DFVF), (b) DFVF with extended common mode input range (CMR), and (c) example DC characteristic of the DFVF in (a) and (b) [60], [62].
4.23 (a) Complementary class AB VF and (b) complementary class AB VF with adaptive biasing $M_3$ and $M_{10}$.
4.24 (a) Dynamic tail current source used in the DFVF, (b) output current versus differential input voltage with and without the dynamic current source, and (c) final version of VF used in the CP: complementary class AB slew rate enhanced VF [60].
4.25 The layout of the CP.
4.26 Post-layout transient simulation of the VF at all the corners.
4.27 Post-layout simulations of $I_{up}$ and $I_{dn}$ at all the corners.
5.1 Basic PFD structure.
5.2 PFD transfer characteristic, blind zone, and dead zone.
5.3 UP/DN oscillation under an insufficient reset hold time.
5.4 Examples of (a) harmonic and (b) stuck locks [66].
5.5 (a) PFD with reset, (b) harmonic lock detector, and (c) the stuck lock detector used in the DLL developed in this work [66].
5.6 Example waveform of an HLD-corrected harmonic lock ($r_i[x]$ denotes the rising edge of P in the $x$-th cycle).
5.7 Example waveform of stuck lock correction when (a) $T_{ref} < T_{feb} < (16/15)\,T_{ref}$ and (b) $T_{feb} < T_{ref}$.
5.8 Layout of the PFD and false lock prevention circuit.
5.9 PFD output characteristics at the worst-case corner (SS, $V_{dd}$ = 1.08 V, temperature = 60 °C).
5.10 Dynamic performance of the DLL.
6.1 Example VCDL outputs.
6.2 Counting pattern generator.
6.3 Counting pattern generator with full-cycle counter.
6.4 Counting pattern generator with full-cycle counter and fractional counter (32-to-5 priority encoder (L) [67]) for the last non-complete circulation of C<31:0>.
6.5 “74 x 148” 32-to-5 priority encoder [67].
6.6 “74 x 148” 8-to-3 priority encoder [67].
6.7 Complete gate-level schematic of the fractional frequency counter.
6.8 The content of the output digits from the fractional frequency counter.
7.1 Top-level block diagram of the DLL-based fractional frequency counter.
7.2 DLL layout.
7.3 Phase error of the DLL.
7.4 Reconstructed frequency versus the input frequency.
7.5 Overall power consumption in different corners.
8.1 Structure of the continuously working fractional frequency counter.

Chapter 1: Introduction

1.1 Introduction to a frequency-shift-based magnetic biosensor

Currently, medical diagnosis through blood testing usually requires long processing times at centralized labs and expensive specialized facilities [1]. Because this workflow is slow and costly, there is a high demand for low-cost, fast, at-home point-of-care (PoC) medical diagnosis [2]. Advanced PoC sensing systems require low cost, high sensitivity, battery-level power consumption, and handheld portability [3], [4]. Once fully developed, PoC systems can play an important role in medical applications, such as early diagnosis, pandemic control, and home-based health care [1], [3].

Biosensors, which convert biological signals into electrical signals, are a key component of PoC systems because of their ability to measure the concentration of target molecules, such as cells, antibodies, proteins, and strands of nucleotides (DNA/RNA), in a solution full of contaminants [5]. Although optical biosensors have been researched extensively, they suffer from bulky size and high cost due to the use of optical sources and filter setups [1].
Advanced CMOS technology allows for the development of integrated magnetic biosensors that use magnetic labels as tags. Such sensors can offer lower cost, lower detection limits, and higher sensitivity because they do not require optical setups, making them a promising alternative to fluorescence assays [5]–[13]. Furthermore, compared with label-free or dielectric biosensors, magnetically labeled biosensors provide a virtually nonexistent background and higher specificity [14], [15].

Magnetic biosensors, which utilize magnetic particles as sensing tags, are usually based on sandwich bioassays, such as the enzyme-linked immunosorbent assay (ELISA). When the detection procedure begins, the target molecules in the sample are captured by pre-deposited molecular probes. Subsequently, magnetic particles that have been biochemically functionalized are injected, and the captured target molecules immobilize them. Thus, one can qualitatively and quantitatively detect the presence of the target molecules in a sample by detecting the magnetic labels that remain on the sensor surface [3], [4].

Compared with magnetic biosensors that use other methods, such as giant magnetoresistance [16]–[20] and the Hall effect [16], [17], frequency-shift-based sensors have a simple structure; are low cost, highly sensitive, and fully compatible with modern CMOS processes; and do not require any post-process modifications. Additionally, frequency-shift-based sensors enable multiplexed biosensing (i.e., using a single sensor surface for the simultaneous detection of multiple targets) [1]. References [9] and [21] established that frequency-shift-based magnetic biosensors can sense nucleic acid and protein targets.

Fig. 1.1. Illustration of the concept of frequency-shift-based magnetic biosensors [1], [10].
The sensing core of the frequency-shift-based magnetic biosensor is an integrated LC oscillator whose inductor surface functions as the sensing region that detects the presence of magnetic beads, as shown in Fig. 1.1 [1], [3]–[5]. The sensing beads are made of paramagnetic material, so they become polarized when exposed to an external magnetic field. The current flowing through the inductor produces a magnetic field that magnetizes all beads on the inductor surface. The magnetized beads generate their own magnetization fields, which contribute to the overall magnetic flux through the surface of the inductor. The presence of these magnetized beads increases the total magnetic energy in the region and consequently causes an effective increase in the inductance of the sensing inductor, resulting in a decrease in the tank resonance frequency [1], [3]–[6], [22]. The new, downshifted tank frequency is expressed as follows [1], [4], [5], [22]:

$$ f = \frac{1}{2\pi\sqrt{LC}} = \frac{1}{2\pi\sqrt{(L_0 + \Delta L)\,C_0}} = \frac{1}{2\pi\sqrt{\left(1 + \frac{\Delta L}{L_0}\right) L_0 C_0}} = \frac{1}{2\pi\sqrt{L_0 C_0}\,\sqrt{1 + \frac{\Delta L}{L_0}}} \tag{1.1} $$

where $L_0$ is the nominal tank inductance and $C_0$ is the nominal tank capacitance. Because $\Delta L \ll L_0$, a first-order Taylor expansion gives

$$ f \approx \frac{1}{2\pi\sqrt{L_0 C_0}} \left(1 - \frac{\Delta L}{2 L_0}\right) = f_0 \left(1 - \frac{\Delta L}{2 L_0}\right) \tag{1.2} $$

where $f_0$ is the nominal tank resonance frequency.

The physical mechanism that links the presence of magnetic particles on the surface of the inductor to the increase in its inductance is quantitatively modeled as follows. A local magnetic field $\vec{H}$ is generated when current $I$ flows through the inductor.
Assume that a magnetic particle with an effective susceptibility $\chi$ and a volume $V_p$, small enough that it does not significantly perturb $\vec{H}$, is placed close to the sensing surface of the inductor. The total magnetic energy then increases by $\Delta E_m$ [1], [3]:

$$ \Delta E_m = E_m' - E_m = \frac{1}{2}\iiint \vec{H} \cdot \vec{B}'\, dv - \frac{1}{2}\iiint \vec{H} \cdot \vec{B}\, dv = \frac{\mu_0}{2} \iiint_{V_p} \left[ \|\vec{H}\|^2 (1 + \chi) - \|\vec{H}\|^2 \right] dv = \frac{\chi}{2\mu_0} \iiint_{V_p} \|\vec{B}\|^2\, dv \approx \frac{\chi}{2\mu_0} \|\vec{B}\|^2 V_p \tag{1.3} $$

where $\vec{B}'$ and $\vec{B}$ are the local magnetic flux densities with and without the magnetic particle, and $E_m'$ and $E_m$ are the total magnetic energies with and without the magnetic particle. Because the fields elsewhere remain unchanged, the volume integration is performed only over $V_p$. When the particle is sufficiently small, the polarization field is assumed to be homogeneous throughout its volume. The shift in inductance $\Delta L$ can be expressed as follows [1], [4], [5], [22]:

$$ \Delta L = \frac{2 \Delta E_m}{I^2} \approx \frac{\chi}{\mu_0} \frac{\|\vec{B}\|^2}{I^2} V_p \tag{1.4} $$

Substituting equation (1.4) into equation (1.2), the frequency shift $\Delta f$ can be expressed as

$$ \Delta f = -f_0 \frac{\chi}{\mu_0 I^2} \frac{\|\vec{B}\|^2 V_p}{2 L_0} \tag{1.5} $$

This resonance shift can be quantified by direct impedance measurement using a Wheatstone bridge [23]. However, this method can suffer from a poor signal-to-noise ratio. A straightforward option is to integrate a differential cross-coupled CMOS oscillator with the LC tank and measure the oscillation frequency in real time [6].

1.2 Frequency counting for frequency-shift-based magnetic biosensors

The output from magnetic biosensors in the GHz range can be mixed down to the MHz range to reduce the power consumption of the frequency-counting circuit. Mixing the actual output frequency down to a lower frequency subtracts a constant offset. For example, if the biosensor outputs (1 GHz + $\Delta f$) and the mixer subtracts 950 MHz, then the input to the counting circuit will be (50 MHz + $\Delta f$). Thus, the same frequency shift remains, and no information is lost.
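The mix-down example above, together with the first-order approximation of equation (1.2), can be checked numerically. The following is a minimal sketch rather than part of the original design flow: the relative inductance change `DL_OVER_L0` is a hypothetical value chosen only for illustration, the 1 GHz and 950 MHz figures come from the example in the text, and the mixer is idealized as a pure frequency subtraction.

```python
import math


def shifted_frequency_exact(f0, dl_over_l0):
    """Equation (1.1): f = f0 / sqrt(1 + dL/L0)."""
    return f0 / math.sqrt(1.0 + dl_over_l0)


def shifted_frequency_first_order(f0, dl_over_l0):
    """Equation (1.2): f ~= f0 * (1 - dL / (2 * L0))."""
    return f0 * (1.0 - dl_over_l0 / 2.0)


def mix_down(f_in, f_lo):
    """Idealized mixer: subtracts a constant offset f_lo from the input frequency."""
    return f_in - f_lo


F0 = 1.0e9           # nominal sensor frequency from the example in the text (Hz)
F_LO = 950.0e6       # offset subtracted by the mixer in the example (Hz)
DL_OVER_L0 = 1.0e-6  # hypothetical relative inductance change (assumption)

f_exact = shifted_frequency_exact(F0, DL_OVER_L0)
f_approx = shifted_frequency_first_order(F0, DL_OVER_L0)

# The frequency shift seen before and after mixing should be identical.
shift_before_mixing = f_exact - F0
shift_after_mixing = mix_down(f_exact, F_LO) - mix_down(F0, F_LO)

print(f"exact shift:        {shift_before_mixing:.6f} Hz")
print(f"first-order shift:  {f_approx - F0:.6f} Hz")
print(f"shift after mixing: {shift_after_mixing:.6f} Hz")
```

For a relative inductance change this small, the exact and first-order results differ by far less than 1 Hz, and the shift is unchanged by the subtraction, which is the sense in which no information is lost.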
The oscillation frequency of frequency-shift-based magnetic biosensors can be reconstructed by conventional full-cycle frequency counting. If $N$ (an integer) rising edges of a signal are detected in a period of $T_{count}$ seconds, the signal is regarded as having completed $N$ full cycles during $T_{count}$, and its oscillation frequency is reconstructed as follows:

$$ f_{osc} = N \cdot \frac{1}{T_{count}} = N \cdot f_{count} \tag{1.6} $$

Fig. 1.2. Conventional full-cycle frequency-counting principle and its error.

However, the conventional full-cycle frequency-counting method has two disadvantages. The first is the trade-off between $T_{count}$ and the resolution of frequency reconstruction. The resolution of the reconstructed $f_{osc}$ equals $f_{count}$; for example, $T_{count}$ = 1 s gives a resolution of 1 Hz. Although a longer $T_{count}$ yields a finer resolution, it limits the operation speed. In practice, there might be 1,000 magnetic biosensors on a chip whose frequencies must be measured. Counting each one for 1 second means waiting 1,000 seconds to read out the whole chip, which is impractically slow.

The second disadvantage of full-cycle counting is the inaccuracy of frequency reconstruction. The signal in Fig. 1.2, for example, actually passes (2 + F% + L%) cycles, rather than three full cycles, during $T_{count}$. The reconstructed frequency is therefore inaccurate, with an error of [(1 - F% - L%) × $f_{count}$].

This project aims to develop a delay-locked loop (DLL)-based fractional frequency counter. If the signal to be measured (the input signal) is passed to an $M$-phase DLL and the DLL outputs are fed to the edge counter, the edge counter receives $M$ rising edges per input cycle instead of only one. As a result, if $T_{count}$ is 1 second, a (1/$M$)-Hz resolution is achieved instead of a 1-Hz resolution.
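The resolution argument above can be illustrated with a small edge-counting model. This is a simplified sketch under stated assumptions, not the counter architecture developed later in the thesis: the DLL is modeled as ideal and jitter-free, producing exactly $M$ evenly spaced rising edges per input cycle, and the function names, the input frequency, the counting window, and the initial skew are all hypothetical.

```python
import math


def count_rising_edges(f_edge, t_count, phase=0.0):
    """Count edges at times (k + phase) / f_edge, k = 0, 1, ..., inside [0, t_count).

    `phase` in [0, 1) is the initial skew between the start of the counting
    window and the first edge, as a fraction of one edge period.
    """
    return max(0, math.ceil(f_edge * t_count - phase))


def reconstruct_frequency(f_in, t_count, m_phases=1, phase=0.0):
    """Estimate f_in from an edge count; m_phases = 1 is full-cycle counting.

    An ideal M-phase DLL is modeled as an edge rate of m_phases * f_in.
    """
    edges = count_rising_edges(m_phases * f_in, t_count, phase)
    return edges / (m_phases * t_count)


# Hypothetical operating point (assumptions, for illustration only).
F_IN = 50_000_123.4   # input frequency in Hz
T_COUNT = 1.0e-3      # counting window in seconds, so f_count = 1 kHz
M = 32                # number of DLL phases

f_full = reconstruct_frequency(F_IN, T_COUNT, m_phases=1, phase=0.3)
f_frac = reconstruct_frequency(F_IN, T_COUNT, m_phases=M, phase=0.3)

print(f"full-cycle error: {f_full - F_IN:+.1f} Hz (bound {1 / T_COUNT:.0f} Hz)")
print(f"{M}-phase error:   {f_frac - F_IN:+.1f} Hz (bound {1 / (M * T_COUNT):.2f} Hz)")
```

In this model the reconstruction error is bounded by $f_{count}$ for full-cycle counting and by $f_{count}/M$ when $M$ edges per cycle are counted, for any initial skew.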
Alternatively, one can count for (1/$M$) seconds and still obtain a 1-Hz resolution, which makes the counting $M$ times faster. Additionally, instead of simply counting the number of full cycles, the DLL-based fractional frequency counter can also count the initial and final phase differences between the signal being counted and the reference clock that generates $T_{count}$ (denoted as F% and L% in Fig. 1.2). This increases the accuracy of the frequency counting. The fractional frequency counter is broadly applicable; it suits any application that requires accurate frequency counting in a short period.

The rest of this thesis is organized as follows. Chapter 2 summarizes the theories related to the DLL, and Chapter 3 presents the design of the voltage-controlled delay line. Chapter 4 presents the design of the charge pump, and Chapter 5 presents the design of the phase frequency detector. Chapter 6 describes the fractional frequency counter, and Chapter 7 contains the top-level simulation results. Finally, Chapter 8 makes concluding remarks and describes future work.

Chapter 2: DLL Theories

As shown in Fig. 2.1, DLLs are first-order negative feedback systems that align the feedback clock (P<M-1>) with the input clock ($f_{in}$). There are four main components in a DLL: 1) a voltage-controlled delay line (VCDL), which contains $M$ delay cells in series; 2) a phase frequency detector (PFD); 3) a loop filter capacitor ($C_p$); and 4) a charge pump (CP). The VCDL generates a series of delayed clock signals that are sequentially skewed from one another based on an input clock signal and a control voltage. The last output P<M-1> is fed back to the PFD, and the PFD outputs DN and UP pulses, whose widths are based on the skew between P<M-1> and $f_{in}$. This subsequently enables the CP to discharge or charge the loop filter capacitor for a certain amount of time.
The control voltage output from the loop filter capacitor therefore falls or rises accordingly and adjusts the delay generated by the VCDL. As a result, the output and input clocks will be in phase with each other, and the time interval between the outputs of adjacent delay cells will be ($T_{in}/M$) [24].

Fig. 2.1. (a) Structure of the DLL and (b) example output waveforms of the DLL.

DLLs can be generally classified into digital and analog DLLs. Digital DLLs modify the delay of the delay line in quantized steps, regulated by a digital code received from a controller. Conversely, the delay line in analog DLLs is controlled by an analog input that adjusts the delay, thus reducing the skew between the input and output clocks of the DLL. Digital DLLs are designed mostly using standard cells, which decreases the design complexity. However, digital delay cells normally suffer from large output jitter and low delay resolution because of the quantization, thereby reducing the accuracy of frequency counting [25]. For this reason, analog DLLs are preferable for this project.

2.1 DLL frequency response

Fig. 2.2. Linear s-domain model of the DLL.

Based on Fig. 2.2, a single integrator is sufficient to eliminate the static phase error of the DLL loop because the DLL includes just one phase variable. This yields a first-order feedback operation. Therefore, the theoretical phase margin of the DLL can be 90°.
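The locking behavior described above (the PFD measures the skew, the CP integrates it onto the loop capacitor, and the control voltage retunes the VCDL) can be sketched as a per-cycle update. This is an idealized behavioral model under loudly stated assumptions, not the circuit implemented in this work: the VCDL delay is assumed to fall linearly with the control voltage, the PFD and CP are assumed ideal, and every numerical value and name below is hypothetical.

```python
def simulate_dll_lock(t_in, d0, k_vcdl, i_cp, c_p, cycles):
    """Return the per-cycle delay error (VCDL delay minus t_in) of an idealized DLL.

    Assumed behavioral model (illustrative only):
      * the VCDL delay falls linearly with the control voltage:
        delay = d0 - k_vcdl * v_ctrl
      * in each reference cycle the PFD/CP pair moves a charge of
        i_cp * error onto the loop capacitor c_p
    """
    v_ctrl = 0.0
    errors = []
    for _ in range(cycles):
        error = (d0 - k_vcdl * v_ctrl) - t_in  # skew seen by the PFD (s)
        errors.append(error)
        v_ctrl += i_cp * error / c_p           # CP charges or discharges C_p (V)
    return errors


# Hypothetical values (assumptions, for illustration only).
T_IN = 10e-9     # input clock period: 100 MHz
D0 = 14e-9       # VCDL delay at v_ctrl = 0
K_VCDL = 5e-9    # VCDL gain magnitude in s/V
I_CP = 20e-6     # charge pump current in A
C_P = 1e-12      # loop filter capacitor in F

errors = simulate_dll_lock(T_IN, D0, K_VCDL, I_CP, C_P, cycles=200)
loop_gain = I_CP * K_VCDL / C_P  # dimensionless per-cycle correction factor

print(f"per-cycle correction factor: {loop_gain:.3f}")
print(f"initial error: {errors[0]:.3e} s, final error: {errors[-1]:.3e} s")
```

In this sketch the error shrinks by the factor (1 - `loop_gain`) every cycle, which is the geometric settling expected of a first-order loop with a single integrator.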
The s-domain closed-loop and open-loop transfer functions of the DLL can be expressed as follows [26]:

$$ H_{CL}(s) = \frac{T_{D,out}(s)}{T_{D,in}(s)} = \frac{1}{1 + \frac{s}{\omega_N}} \tag{2.1} $$

$$ H_{OL}(s) = \frac{I_{cp} K_{VCDL}}{s\, T_{in} C_p} \tag{2.2} $$

$$ \omega_N = \frac{I_{cp} K_{VCDL}}{T_{in} C_p} \tag{2.3} $$

$$ K_{VCDL} = \frac{\Delta T_{D,VCDL}}{\Delta V_{ctrl}} \tag{2.4} $$

where $T_{D,out}$ is the output delay time, $T_{D,in}$ is the input delay time, $\omega_N$ is the loop bandwidth of the DLL, $I_{cp}$ is the CP current, $T_{in}$ is the input clock period, $C_p$ is the low-pass filter capacitor of the CP, $K_{VCDL}$ is the gain of the VCDL, $T_{D,VCDL}$ is the delay time of the VCDL, and $V_{ctrl}$ is the control voltage output from the CP. The normalized DLL loop bandwidth $\varepsilon_{DLL}$ is defined as [27]–[30]:

$$ \varepsilon_{DLL} = \omega_N \times T_{in} = \frac{I_{cp} K_{VCDL}}{C_p} \tag{2.5} $$

where $\omega_N$ is the DLL loop bandwidth.

2.2 DLL time domain jitter transfer function

DLL jitter is an important factor that affects the accuracy of frequency counting because it introduces uncertainty in the timing of the rising edges. Therefore, this section follows [27], [28], [32]–[34] to model the time-domain jitter transfer function, which guides the jitter reduction of the DLL.

Four main sources of jitter contribute to the DLL output jitter: 1) jitter due to the noisy CP ($\Delta t_{CP}$); 2) jitter due to VCDL noise ($\Delta t_{VCDL}$), where the VCDL is composed of $N$ delay cells and the jitter of the $i$-th delay cell is denoted $\Delta t_{DC_i}$ (the number of delay cells, written $M$ earlier in this chapter, is written $N$ in this section); 3) jitter due to the PFD noise ($\Delta t_{PFD}$), which is also the error of the PFD phase detection result; and 4) reference clock (input clock) jitter ($\Delta t_{REF}$). This section presents the time-domain analysis of how $\Delta t_{CP}$, $\Delta t_{DC}$, $\Delta t_{PFD}$, and $\Delta t_{REF}$ are converted to the output jitter of the DLL ($\Delta t_{N,DLL}$) [27], [28], [32]–[34].

To analyze the jitter of the DLL, five assumptions were made [27], [32]:

1) The DLL is locked, and it is analyzed as a linear system because the jitter is small compared with the input clock period.
2) All the jitter sources are uncorrelated, and the jitter contribution of every single delay cell is uncorrelated.
3) Thermal noise is the only noise source. In practice, for the thermal noise to be dominant, the corner frequency of the flicker noise should be one or two decades below the DLL bandwidth.
4) The CP current can be modeled as Dirac pulses.
5) The variance in jitter of every delay cell is equal. This is a reasonable assumption because the shape of the input signal to every delay cell is the same and the structure of each delay cell is the same.

The total delay of the VCDL equals the reference clock period plus the jitter due to the noisy VCDL, minus the delay change that the CP produces through the control voltage [27], [28], [32]–[34]:

$$ T_{D,VCDL} = T_{REF} + \Delta t_{VCDL} - K_{VCDL} \times V_{ctrl} \tag{2.6} $$

where $T_{D,VCDL}$ is the total delay of the VCDL when the DLL is locked and $T_{REF}$ is the reference clock period. In the DLL, the voltage generated by the CP after the $(m-1)$-th period adds to the $V_{ctrl}$ of the $(m-1)$-th period and produces the $V_{ctrl}$ of the $m$-th period.
In the resulting update equation, the minus sign before $\Delta t_{REF}(m-1)$ arises because the PFD measures the difference between the reference clock and the output of the $N$-th (last) delay cell in the VCDL [27], [28], [32]–[34]:

$$ V_{ctrl}(m) = V_{ctrl}(m-1) + \frac{I_{cp}\left[\Delta t_N(m-1) - \Delta t_{REF}(m-1) + \Delta t_{PFD}(m-1)\right]}{C_p} + \frac{q_{n,CP}(m-1)}{C_p} \tag{2.7} $$

where $V_{ctrl}(m)$ is the control voltage after the reference clock completes the $m$-th period, $V_{ctrl}(m-1)$ is the control voltage after the reference clock completes the $(m-1)$-th period, $\Delta t_N(m-1)$ is the output jitter of the $N$-th (last) delay cell in the VCDL after the reference clock completes the $(m-1)$-th period, $\Delta t_{REF}(m-1)$ is the timing jitter of the reference clock after its $(m-1)$-th period, $\Delta t_{PFD}(m-1)$ is the phase detection error of the PFD after the reference clock completes the $(m-1)$-th period, and $q_{n,CP}(m-1)$ is the charge produced by the noisy CP after the reference clock completes the $(m-1)$-th period. The term $q_{n,CP}$ can be expressed as [32]

$$ q_{n,CP}(m-1) = I_{cp} \times \Delta t_{CP}(m-1) \tag{2.8} $$

where $\Delta t_{CP}(m-1)$ is the output jitter of the CP after the reference clock completes the $(m-1)$-th period. The output jitter of the $X$-th delay cell in the VCDL is [27], [28], [32]–[34]

$$ \Delta t_X(m) = \Delta t_{REF}(m-1) - \frac{X}{N} \times K_{VCDL} \times V_{ctrl}(m) + \sum_{i=1}^{X} \Delta t_{DC_i}(m) \tag{2.9} $$

where $\Delta t_X(m)$ is the output jitter of the $X$-th delay cell in the VCDL after the reference clock completes the $m$-th period and $\Delta t_{DC_i}(m)$ is the jitter contributed by the $i$-th delay cell itself after the reference clock completes the $m$-th period.

To model the jitter due to each of the above four sources individually, two further assumptions are made [27], [32]: 1) the jitter contribution of the other sources is zero, and 2) each of the four jitter components behaves linearly, and the average of $\Delta t_{CP}$, $\Delta t_{DC_i}$, $\Delta t_{PFD}$, and $\Delta t_{REF}$ is zero.
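The recursion formed by equations (2.7)–(2.9) can be iterated numerically to see how a disturbance from one source propagates to the output of the last delay cell. The following is a minimal sketch, not a reproduction of the cited analyses: the state `v` stands for $K_{VCDL} \times V_{ctrl}$ (equation (2.7) multiplied through by $K_{VCDL}$, with (2.8) substituted), `eps` stands for $I_{cp} K_{VCDL} / C_p$, and the function name, the value of `eps`, and the number of delay cells are hypothetical.

```python
def output_jitter(eps, dt_ref, dt_pfd, dt_cp, dt_dc):
    """Iterate equations (2.7)-(2.9) for the last (N-th) delay cell.

    eps                   : I_cp * K_VCDL / C_p (dimensionless)
    dt_ref, dt_pfd, dt_cp : per-cycle jitter sequences of equal length
    dt_dc                 : per-cycle lists holding the jitter of each delay cell
    Returns the sequence dt_N(m), assuming zero jitter before m = 0.
    """
    v = 0.0          # K_VCDL * V_ctrl, the jitter-induced delay correction
    dt_n_prev = 0.0  # dt_N(m - 1)
    out = []
    for m in range(len(dt_ref)):
        ref_term = 0.0
        if m > 0:
            # Equation (2.7) with (2.8) substituted, scaled by K_VCDL.
            v += eps * (dt_n_prev - dt_ref[m - 1] + dt_pfd[m - 1] + dt_cp[m - 1])
            ref_term = dt_ref[m - 1]
        # Equation (2.9) evaluated at X = N.
        dt_n = ref_term - v + sum(dt_dc[m])
        out.append(dt_n)
        dt_n_prev = dt_n
    return out


# Hypothetical setup (assumptions): unit impulse of CP jitter, all else zero.
CYCLES = 400
EPS = 0.1
N_CELLS = 32

zeros = [0.0] * CYCLES
impulse = [1.0] + [0.0] * (CYCLES - 1)
no_cell_jitter = [[0.0] * N_CELLS for _ in range(CYCLES)]

response = output_jitter(EPS, zeros, zeros, impulse, no_cell_jitter)
variance_gain = sum(x * x for x in response)

print("first output samples:", [round(x, 6) for x in response[:4]])
print(f"sum of squared impulse response: {variance_gain:.6f}")
```

For uncorrelated, zero-mean jitter samples, the sum of the squared impulse response is the factor by which the source variance is scaled at the output, so the same loop can be reused with an impulse on any other source to compare their contributions.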
The zero-mean assumption can be written as

$$ E(\Delta t_{CP}) = E(\Delta t_{DC_i}) = E(\Delta t_{PFD}) = E(\Delta t_{REF}) = 0 \tag{2.10} $$

where $E(X)$ is the expected value of $X$. Based on the definition of variance,

$$ \sigma^2(\Delta t_{CP}) = E\left[\left(\Delta t_{CP} - E(\Delta t_{CP})\right)^2\right] = E\left[\Delta t_{CP}^2\right] - E^2\left[\Delta t_{CP}\right] = E\left[\Delta t_{CP}^2\right] \tag{2.11} $$

Similarly,

$$ \sigma^2(\Delta t_{DC_i}) = E\left[\Delta t_{DC_i}^2\right] \tag{2.12} $$

$$ \sigma^2(\Delta t_{PFD}) = E\left[\Delta t_{PFD}^2\right] \tag{2.13} $$

$$ \sigma^2(\Delta t_{REF}) = E\left[\Delta t_{REF}^2\right] \tag{2.14} $$

Based on equations (2.5)–(2.7) and (2.9)–(2.14), we can model how the variance in jitter of each of the four sources is transferred to the variance of the total output jitter of the DLL, $\sigma^2(\Delta t_{N,DLL})$.

2.2.1 Jitter generated from the CP

First, the variance in jitter of the DLL ($\Delta t_{N,CP}$) due to CP jitter ($\Delta t_{CP}$) can be calculated as follows when we assume $\Delta t_{DC_i} = \Delta t_{PFD} = \Delta t_{REF} = E(\Delta t_{CP}) = 0$ [27], [32]:

$$ \sigma^2(\Delta t_{N,CP}) = \left| \frac{1}{-\frac{2 C_p}{K_{VCDL} \times I_{cp}} + 1} \right| \times \sigma^2(\Delta t_{CP}) \tag{2.15} $$

The variance of the output jitter of the $X$-th delay cell in the VCDL ($\Delta t_{X,CP}$) due to CP jitter is [27], [32]

$$ \sigma^2(\Delta t_{X,CP}) = \left| \frac{\left(\frac{X}{N}\right)^2}{-\frac{2 C_p}{K_{VCDL} \times I_{cp}} + 1} \right| \times \sigma^2(\Delta t_{CP}) \tag{2.16} $$

2.2.2 Jitter generated from the delay cell

The variance in jitter of the DLL ($\Delta t_{N,DC}$) due to delay cell jitter ($\Delta t_{DC}$) can be calculated as follows when we assume $\Delta t_{CP} = \Delta t_{PFD} = \Delta t_{REF} = E(\Delta t_{DC_i}) = 0$.
Because all the delay cells are assumed to have the same amount of jitter [27], [32],

$$E\left[\left(\sum_{i=1}^{N} \Delta t_{DC_i}\right)^2\right] = N \times E\left[\Delta t_{DC}^2\right] \tag{2.17}$$

$$\sigma^2(\Delta t_{N,DC}) = \left| \frac{2N}{2 - \dfrac{I_{cp} \times K_{VCDL}}{C_p}} \right| \times \sigma^2(\Delta t_{DC}) \tag{2.18}$$

Due to the delay cell noise, the jitter output from the $X$th delay cell in the VCDL ($\Delta t_{X,DC}$) is [27], [32]

$$\sigma^2(\Delta t_{X,DC}) = \left| \frac{\left(\dfrac{X^2}{N}\right)}{\dfrac{2C_p}{I_{cp} \times K_{VCDL}} - 1} + X \right| \times \sigma^2(\Delta t_{DC}) \tag{2.19}$$

$$\sigma(\Delta t_{X,DC}) \approx \sqrt{X}\, \sigma(\Delta t_{DC}) \tag{2.20}$$

2.2.3 Jitter generated from the PFD

The variance in jitter of the DLL ($\Delta t_{N,PFD}$) due to the PFD can be calculated as follows when we assume $\Delta t_{DC_i} = \Delta t_{CP} = \Delta t_{REF} = E(\Delta t_{PFD}) = 0$ [27], [32]:

$$\sigma^2(\Delta t_{N,PFD}) = \left| \frac{1}{-\dfrac{2C_p}{I_{cp} \times K_{VCDL}} + 1} \right| \times \sigma^2(\Delta t_{PFD}) \tag{2.21}$$

Due to the PFD noise, the jitter output from the $X$th delay cell in the VCDL ($\Delta t_{X,PFD}$) is [27], [32]

$$\sigma^2(\Delta t_{X,PFD}) = \left| \frac{\left(\dfrac{X}{N}\right)^2}{-\dfrac{2C_p}{I_{cp} \times K_{VCDL}} + 1} \right| \times \sigma^2(\Delta t_{PFD}) \tag{2.22}$$

2.2.4 Jitter generated from the reference clock

The variance in DLL jitter due to the reference clock jitter can be derived as follows when we assume $\Delta t_{DC_i} = \Delta t_{PFD} = \Delta t_{CP} = E(\Delta t_{REF}) = 0$ [27], [32]:

$$\sigma^2(\Delta t_{N,REF}) = \sigma^2(\Delta t_{REF}) \tag{2.23}$$

Due to reference clock jitter, the jitter output from the $X$th delay cell in the VCDL is [27], [32]

$$\sigma^2(\Delta t_{X,REF}) = \sigma^2(\Delta t_{REF}) \tag{2.24}$$

2.2.5 The output jitter of the DLL

Based on equations (2.15), (2.18), (2.21), and (2.23), we can obtain the total DLL output jitter ($\Delta t_{N,DLL}$, the same as the jitter output from the $N$th delay cell in the VCDL) considering all four jitter sources (CP, delay cell, PFD, and the reference clock) as follows [27], [32]:

$$\sigma^2(\Delta t_{N,DLL}) = \left| \frac{1}{-\dfrac{2C_p}{K_{VCDL} \times I_{cp}} + 1} \right| \times \sigma^2(\Delta t_{CP}) + \left| \frac{2N}{2 - \dfrac{I_{cp} \times K_{VCDL}}{C_p}} \right| \times \sigma^2(\Delta t_{DC}) + \left| \frac{1}{-\dfrac{2C_p}{I_{cp} \times K_{VCDL}} + 1} \right| \times \sigma^2(\Delta t_{PFD}) + \sigma^2(\Delta t_{REF}) \tag{2.25}$$

Based on equations (2.16), (2.19), (2.22), and (2.24), we can obtain the total jitter output from the $X$th delay cell in the VCDL considering all four jitter sources (CP, delay cell, PFD,
and reference clock) as follows [27], [32]:

$$\sigma^2(\Delta t_{X,DLL}) = \left| \frac{\left(\dfrac{X}{N}\right)^2}{-\dfrac{2C_p}{K_{VCDL} \times I_{cp}} + 1} \right| \times \sigma^2(\Delta t_{CP}) + \left| \frac{\left(\dfrac{X^2}{N}\right)}{\dfrac{2C_p}{I_{cp} \times K_{VCDL}} - 1} + X \right| \times \sigma^2(\Delta t_{DC}) + \left| \frac{\left(\dfrac{X}{N}\right)^2}{-\dfrac{2C_p}{I_{cp} \times K_{VCDL}} + 1} \right| \times \sigma^2(\Delta t_{PFD}) + \sigma^2(\Delta t_{REF}) \tag{2.26}$$

A wider loop bandwidth makes the DLL lock faster but worsens the jitter performance. According to [26], a widely used rule of thumb for handling this trade-off is to make sure that

$$\frac{I_{cp} \times K_{VCDL}}{2\pi C_p} \le \frac{1}{10} \tag{2.27}$$

Therefore, the following relations for the denominators in equations (2.25) and (2.26) can be obtained:

$$-\frac{2C_p}{K_{VCDL} \times I_{cp}} + 1 < 0 \tag{2.28}$$

$$\left| -\frac{2C_p}{K_{VCDL} \times I_{cp}} + 1 \right| \ge \left( \frac{10}{\pi} - 1 \right) > 1 \tag{2.29}$$

$$2 - \frac{I_{cp} \times K_{VCDL}}{C_p} > 0 \tag{2.30}$$

Based on equations (2.15–2.30) and the simulation results from [27]–[35], the pattern of how the DLL output jitter changes when the jitter source parameters change is shown in Table 2.1.

Table 2.1. The pattern of how the output jitters will change as the jitter source parameters change

Columns: $\sigma^2(\Delta t_{X,CP})$, $\sigma^2(\Delta t_{X,DC})$, $\sigma^2(\Delta t_{X,PFD})$, $\sigma^2(\Delta t_{X,REF})$, $\sigma^2(\Delta t_{X,DLL})$
$K_{VCDL}$ ↑: ↑, ↑, ↑, No change, ↑
$C_p$ ↑: ↓, ↓, ↓, ↓
$I_{cp}$ ↑: ↑, ↑, ↑, ↑ exponentially
$X$ ↑: ↑, ↑, ↑, ↑
$N$ ↑: ↑ linearly
$\sigma^2(\Delta t_{CP})$ ↑: ↑, ↑
$\sigma^2(\Delta t_{DC})$ ↑: ↑, ↑
$\sigma^2(\Delta t_{PFD})$ ↑: ↑, ↑
$\sigma^2(\Delta t_{REF})$ ↑: ↑, ↑
$\varepsilon_{DLL}$ ↑: ↑

According to equations (2.25–2.30), and as stated in [28]–[32], the relative influence of the jitter sources on the DLL output jitter can be summarized as VCDL > Ref > PFD > CP.

Chapter 3: Voltage-Controlled Delay Line (VCDL)

The VCDL, which consists of cascaded variable delay stages, generates a series of delayed clock signals that are sequentially skewed from one another based on an input clock signal and a control voltage. In DLLs, the VCDL is a key block because it is the main source of jitter and of power consumption.

3.1 Review of delay cell structure

Fig. 3.1. (a) Resistive delay cell and (b) capacitive delay cell [36], [37].
Delay cells are classified into resistive delay cells, which achieve variable delay by tuning the resistance in the charging and/or discharging path, and capacitive delay cells, which tune the capacitance at the output node and thus change the delay value. Fig. 3.1(a) presents a differential pair-based resistive delay cell, which achieves variable delay by tuning the tail current source. The structure in Fig. 3.1(a) offers the advantage of common-mode rejection compared with single-ended delay cells. Nevertheless, such a structure has three disadvantages: (1) its power and area can double compared with those of single-ended delay cells; (2) it consumes DC power, whereas inverter-based delay cells do not; and (3) a complicated biasing circuit (which possibly needs op amp feedback) may be required at $V_b$ to achieve improved performance [38]. Accordingly, a differential pair-based resistive delay cell was not used in this project.

Fig. 3.1(b) shows a varactor-based capacitive delay cell, which varies the delay value by changing the output capacitance of an inverter. This cell has a narrow tuning range in contrast to resistive delay cells, as a varactor adds a large capacitance to the inverter output even when $V_{ctrl} = 0$. This limits the minimum available delay. The delay cell for a DLL-based fractional frequency counter is expected to exhibit a wide tuning range, which makes this capacitive delay cell unfavorable. Additionally, the $V_{ctrl}$ line in this delay cell is directly linked to the inverter output by the $C_{gd}$ of the varactor, causing ripples on the $V_{ctrl}$ line because of clock feedthrough and charge injection. Thus, a buffer may be required between the gate of the varactor and the output of a CP, which makes the structure more complex. Furthermore, a considerably higher dynamic power dissipation occurs in a varactor-based capacitive delay cell than in its resistive counterparts.
Although inverter-based delay cells demonstrate negligible DC power consumption, their dynamic power is determined by the slopes of the signal edges. During a transition, dynamic current flows freely from the varactor. When a transition is prolonged to realize a larger delay value, the current flowing through the varactor lasts longer, thereby increasing the dynamic power of capacitive delay cells. In contrast, the slopes of the signal edges do not affect the dynamic power of resistive delay cells because the resistance of the charging and/or discharging path increases under a prolonged delay, thus limiting the current flow during a transition [37].

The current-starved inverter in Fig. 3.2(a) is a promising alternative to the delay cells shown in Fig. 3.1 because of its wider tuning range, lower power consumption, and simpler bias structure. Therefore, it was selected as the delay cell structure for the VCDL. Sections 3.2 to 3.5 explain the current-starved delay cell in detail.

3.2 Operation principle of current-starved delay cells

Fig. 3.2. (a) Current-starved inverter and (b) output waveforms under a rising input edge [39].

The current-starved inverter in Fig. 3.2(a) operates similarly to a basic inverter, except that the current-limiting transistors limit its output slew ($M_1$ controls the falling edge and $M_4$ controls the rising edge). Fig. 3.2(b) depicts the voltage transitions of the current-starved inverter when the input is a rising edge. There are three phases of operation, namely, initial ($t < t_1$), switching ($t_1 < t < t_2$), and discharge ($t > t_2$) [39]. During the first phase, capacitance $C_1$ is charged via $M_3$ and $M_4$, while the current-limiting transistor $M_1$ discharges the capacitance $C_2$ to zero. The rising edge of $V_{in}$ switches $M_2$ on and $M_3$ off throughout the switching period.
Consequently, $V_{out}$ is isolated from the power supply, and $V_{out}$ and $V_{d1}$ converge toward the common level $V_{cm} \approx V_{dd} \times [C_1/(C_1 + C_2)]$, which marks the beginning of the discharge phase. The voltage drop of $V_{out}$ (from $V_{dd}$ to $V_{cm}$) is only a fraction of $V_{dd}$ because capacitances $C_1$ and $C_2$ share the charge. In the discharge phase, $V_{in}$ is high and $M_2$ is completely on, connecting $C_1$ and $C_2$ in parallel. The bias voltage $V_{ctrl}$ that controls the current-limiting transistor $M_1$ mainly determines the discharging rate for $C_1$ and $C_2$ [39], [40]. The timing delay $T_D$ in Fig. 3.2(b) is calculated as follows [39], [40]:

$$T_D = \frac{(C_1 + C_2)\left(V_{cm} - \dfrac{V_{dd}}{2}\right)}{I_{d1}} \tag{3.1}$$

where $I_{d1}$ is the current flowing through the current-limiting transistor $M_1$.

A similar analysis can be conducted when the input experiences a falling edge. The $(W/L)$ of the PMOSs in the delay cell is kept $(\mu_n/\mu_p)$ times larger than that of the NMOSs to compensate for the mobility asymmetry between holes and electrons [41].

The charge-sharing model only roughly approximates $V_{cm}$ because of the capacitive coupling between the input and output nodes (due to, for example, high $C_{gs}$ and $C_{gd}$ in the MOSFETs), the non-zero drain current of $M_1$, the change in input signal slopes, the charge injection effects of $M_2$ and $M_3$, and the non-zero off current of $M_3$. Nonetheless, this estimate of $V_{cm}$ is sufficient for analysis purposes [39], [40].

3.3 Effect of parameter variations and mismatches

Fig. 3.3. (a) Current-starved inverter and (b) output waveforms under variable MOSFET parameters under a rising input edge [39].

The effects of the variability in MOSFET parameters on the fluctuation of $V_{cm}$ and the generated timing delay $T_D$ when the input is a rising edge are shown in Fig. 3.3.
As stated in Section 3.2, the $V_{cm}$ value calculated from the charge-sharing model is a crude estimate, and the actual $V_{cm}$ is affected by several other factors. Hence, the variation in the MOSFETs (particularly the current-limiting transistor $M_1$), the variations in $C_1$ and $C_2$, and the leakage current of $M_3$ and $M_4$ will affect the variability in $V_{cm}$ [39], [40]. Similarly, the parameter variation of the current-limiting transistor $M_1$ will affect the discharge phase. For instance, because the threshold voltage $V_{th}$ of the current-limiting transistor varies randomly, this transistor may be somewhat "faster" (with a lower $V_{th}$ and higher drain current) or slightly "slower" (with a higher $V_{th}$ and lower drain current) than a typical one. A faster $M_1$ will sharpen the discharging edge and reduce $V_{cm}$, whereas with a slower $M_1$, the discharge phase will be prolonged, and $V_{cm}$ will eventually settle at a higher value [39], [40].

It was concluded in [39] and [40] that the variability in the timing delay of a delay cell is mostly determined by the variation in the current-limiting transistors $M_1$ and $M_4$, by the variability in the $V_{cm}$ voltage, by the leakage in $M_3$ and $M_4$, and by the variation in $C_2$. Most studies have indicated that increasing the size of a current-limiting transistor improves the accuracy of $T_D$, but the related trade-off between precision and area should also be addressed. The normalized delay variations can be derived as follows [39]:

$$\frac{\sigma_{T_D}^2}{T_D^2} = \frac{\sigma_{I_d}^2}{I_{d1}^2} + \frac{\sigma_{C_1}^2 + \sigma_{C_2}^2}{(C_1 - C_2)^2} \tag{3.2}$$

$$\frac{\sigma_{T_D}^2}{T_D^2} = \frac{4\beta\, \sigma_{V_{th}}^2\, T_D}{V_{dd}(C_1 - C_2)} + \frac{\sigma_{\beta}^2}{\beta^2} + \frac{\sigma_{C_1}^2 + \sigma_{C_2}^2}{(C_1 - C_2)^2} \tag{3.3}$$

where $\beta$ is the process transconductance of a MOSFET.
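To make the origin of the $(C_1 - C_2)$ term concrete, the following sketch substitutes the charge-sharing estimate $V_{cm} \approx V_{dd} C_1/(C_1 + C_2)$ from Section 3.2 into (3.1), which collapses to $T_D = V_{dd}(C_1 - C_2)/(2 I_{d1})$, and then evaluates (3.2). The function names and all component values are hypothetical and serve only as an illustration.

```python
import math


def common_level(v_dd, c1, c2):
    """Charge-sharing estimate of V_cm from Section 3.2."""
    return v_dd * c1 / (c1 + c2)


def delay_eq_3_1(v_dd, c1, c2, i_d1):
    """Timing delay T_D according to (3.1)."""
    return (c1 + c2) * (common_level(v_dd, c1, c2) - v_dd / 2) / i_d1


def delay_simplified(v_dd, c1, c2, i_d1):
    """Same delay after substituting V_cm: V_dd (C1 - C2) / (2 I_d1)."""
    return v_dd * (c1 - c2) / (2 * i_d1)


def normalized_delay_sigma(sigma_i_rel, sigma_c1, sigma_c2, c1, c2):
    """sigma_TD / T_D from (3.2); sigma_i_rel is sigma_Id / I_d1."""
    var = sigma_i_rel ** 2 + (sigma_c1 ** 2 + sigma_c2 ** 2) / (c1 - c2) ** 2
    return math.sqrt(var)


if __name__ == "__main__":
    # Illustrative values only (assumptions, not taken from the thesis).
    v_dd, c1, c2, i_d1 = 1.2, 20e-15, 5e-15, 10e-6
    print(delay_eq_3_1(v_dd, c1, c2, i_d1))      # delay in seconds
    print(delay_simplified(v_dd, c1, c2, i_d1))  # same delay, simplified form
    print(normalized_delay_sigma(0.05, 0.2e-15, 0.2e-15, c1, c2))
```

For these example values both delay functions evaluate to about 0.9 ns. The simplified form also shows that, within the charge-sharing model, $C_1$ must exceed $C_2$ for $V_{cm}$ to lie above $V_{dd}/2$ and for the delay to be positive.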
Correspondingly, the normalized delay variance of the VCDL consisting of $N$ delay cells in series is estimated as follows:

$$\frac{\sigma_{T_N}^2}{T_{D,VCDL}^2} = \frac{1}{N} \frac{\sigma_{T_D}^2}{T_D^2} \tag{3.4}$$

$$T_{D,VCDL} = N\, T_D \tag{3.5}$$

3.4 Output jitter of a current-starved delay cell

3.4.1 Output jitter in delay cells due to output voltage noise

Fig. 3.4. Conversion of delay cell voltage error ($\Delta V_{n,DC}$) into delay cell timing jitter ($\Delta t_{DC}$).

Fig. 3.4 shows the approximation of how the delay cell voltage noise is translated into timing jitter. As a result, the timing jitter variance is estimated as follows [42]–[44]:

$$\sigma^2(\Delta t_{X,DC}) = \frac{\sigma^2(\Delta V_{n,DC})}{Slew^2} \tag{3.6}$$

where
$\Delta t_{X,DC}$ is the timing jitter output from the $X$th delay cell in the VCDL,
$\Delta V_{n,DC}$ is the output voltage error of the $X$th delay cell in the VCDL due to its noise,
$Slew$ is the output slew rate of the current-starved delay cell, and

$$Slew = \frac{I_{d1}}{C_1 + C_2} \tag{3.7}$$

Adopting the model presented in [43] and using $\lambda_{P_o}(t)$ to represent the weight of the noise after taking the interaction between adjacent delay cells into account, $\sigma^2(\Delta V_{n,DC})$ can be expressed as

$$\sigma^2(\Delta V_{n,DC}) = \frac{kT}{2C_1}\, a_v^2\, \lambda_{P_o}(t) \left[ R_{LN}\left( \gamma_4 g_{m4} + \gamma_3 \frac{1}{r_{o3}} \right) + R_{LP}\left( \gamma_1 g_{m1} + \gamma_2 \frac{1}{r_{o2}} \right) \right] \tag{3.8}$$

$$\lambda_{P_o}(t) = \left[ 1 + 2P_o t + 2(P_o t)^2 \right] e^{-2P_o t} \tag{3.9}$$

$$P_o = \frac{1}{(R_{LN} \| R_{LP})\, C_1} \tag{3.10}$$

$$R_{LN} \approx g_{m2}\, r_{o1}\, r_{o2} \tag{3.10a}$$

$$R_{LP} \approx g_{m3}\, r_{o3}\, r_{o4} \tag{3.10b}$$

$$a_v = (g_{m2} + g_{m3})(R_{LN} \| R_{LP}) \tag{3.11}$$

where
$T$ is the absolute temperature in K,
$k$ is the Boltzmann constant,
$a_v$ is the small-signal gain of the current-starved inverter,
$R_{LN}$ is the output resistance of the NMOS half,
$R_{LP}$ is the output resistance of the PMOS half,
$g_{mx}$ is the small-signal transconductance of the transistor $M_x$,
$\gamma$ is the noise excess factor ($\gamma = 2/3$ in saturation, and $2/3 < \gamma < 1$ in triode) [45]. (The $\gamma$ here is close to the value for the triode region.)
$P_o$ is the output pole of the current-starved inverter.

3.4.2 Output jitter in delay cells due to delay cell mismatch

Assuming that the mismatches in the delay cells are uncorrelated, the DLL output jitter due to delay cell mismatch was estimated in [46] to be

$$\Delta t_{X,DC\_mismatch} = T_{in} \left( \frac{X + \sum_{i=1}^{X} e_i(V_{ctrl})}{N + \sum_{i=1}^{N} e_i(V_{ctrl})} - \frac{X}{N} \right) \tag{3.12}$$

where
$\Delta t_{X,DC\_mismatch}$ is the timing jitter output from the $X$th delay cell in the VCDL due to delay cell mismatch,
$e_i(V_{ctrl})$ is the delay cell mismatch for a certain $V_{ctrl}$ value; $e_i(V_{ctrl})$ has zero mean because the loop will remove all common changes of delay in the cells,
$N$ is the total number of delay cells in the VCDL,
$X$ is the index of the $X$th delay cell,
$T_{in}$ is the input clock cycle.

After assuming that $e_i(V_{ctrl})$ has a zero mean and $\sigma^2[e_i(V_{ctrl})] \ll 1$, and using a first-order Taylor expansion, the variance of $\Delta t_{X,DC\_mismatch}$ can be approximated as follows [46]:

$$\sigma^2(\Delta t_{X,DC\_mismatch}) = E\left[ (\Delta t_{X,DC\_mismatch})^2 \right] \approx T_{in}^2\, \frac{X(N - X)}{N^3}\, \sigma^2[e_i(V_{ctrl})] \tag{3.13}$$

It was found in [46] that the delay cell in the middle of the VCDL has the highest variance for $\Delta t_{X,DC\_mismatch}$ because the loop will keep the time error at the VCDL input and output at zero, and the middle delay cell has the longest distance to these clean points. The sigma value of the phase time error of the middle delay cell, $\sigma(\Delta t_{N/2})$, can be estimated as follows [46]:

$$\sigma(\Delta t_{N/2}) \approx \frac{\sigma[e_i(V_{ctrl})]\, T_{in}}{2\sqrt{2N}} \tag{3.14}$$

3.5 Structure of the VCDL and the delay cell

3.5.1 Replica of a current-starved delay cell that maintains the output duty cycle at 50%

The duty cycle of the delay cell output will be distorted due to the mismatch between the rise and fall times in each delay cell when many delay cells are cascaded to form a VCDL with a relatively long timing delay, and such a distortion will accumulate as the input clock passes through the cascaded delay cells.
As a result, the last delay cell in the VCDL will exhibit the duty cycle with the largest deviation from 50%, and in the worst case the output may become a flat line at $V_{dd}$ or ground. Thus, a duty cycle correction stage is sometimes needed between the VCDL and the following stage, which increases circuit complexity [47].

To avoid using a duty cycle correction stage after the VCDL, the analog delay cell in Fig. 3.5, which comprises two identical current-starved inverters, was used in this project. For example, when a clock signal passes through the delay cell, a rising input edge will fall at the drains of $M_2$ and $M_3$, and then rise again at the output. Similarly, a falling input edge will rise at the drains of $M_2$ and $M_3$, and then fall again at the output. Apart from the order, both edges are delayed by the same two edge delays: one rising delay and one falling delay. As a result, the output duty cycle can be maintained at approximately 50% [47].

Fig. 3.5. Schematic of a unit delay cell.

3.5.2 Low supply sensitivity and a simple structured voltage-to-current (V-to-I) converter

A V-to-I converter from the main loop controls all delay cells. There are different types of V-to-I converters. The schematics of the simplest V-to-I converters are shown in Fig. 3.6(a) and (b). As shown in Fig. 3.6(a), $M_{12}$ is a resistive device, and the equivalent resistances across $M_{11}$ and $M_{13}$ are small, as they are diode connected. Therefore, the equivalent circuit of the V-to-I converter in Fig. 3.6(a) is a resistive divider. As a result, $V_{gs1}$ and $V_{gs4}$ will be distorted by power supply noise, and such a distortion will eventually become variations in the delay value of the delay cell (i.e., jitter).

Fig. 3.6. Two types of V-to-I converters (a) and (b).

For the V-to-I converter shown in Fig. 3.6(b), a single NMOS transistor performs the function of V-to-I control for simplicity.
Additionally, since the impedance looking down from the $M_9$ drain is high, $V_p$ will follow the variation in the power supply, and $V_{gs10}$ will be kept constant under power supply noise. Therefore, the V-to-I converter presented in Fig. 3.6(b) was selected for the VCDL due to its simplicity and immunity to power supply variations. The final schematic of a single delay cell with the V-to-I converter is shown in Fig. 3.7, and the final schematic of the VCDL with 35 cascaded delay cells is shown in Fig. 3.8. The first and the last delay cells are dummies to ensure that the input and output loads of the 33 main delay cells in the middle are equal. P<0> and P<32> will be in phase with each other due to the closed-loop locking process of the DLL.

Fig. 3.7. Final structure of the delay cell and the V-to-I converter.

Fig. 3.8. Schematic of a VCDL.

3.6 Simulation results

A larger VCDL delay range (a higher VCDL gain) will enable the DLL-based frequency counter to operate under a wider input frequency range. However, this will simultaneously introduce more jitter, as stated in Section 2.2.2. Design effort was devoted in this project to balancing this trade-off. The simulation results for the optimized VCDL are as follows.

Fig. 3.9 shows the layout of the VCDL, which has an area of 103.88 μm × 20.8 μm. Post-layout corner and Monte Carlo simulations were performed, and the results are shown below.

Fig. 3.9. Layout of the VCDL.

Fig. 3.10 shows the available delay range of the VCDL at all corners. When $V_{ctrl} < 0.3$ V, the bias current through the delay cells is insufficient to charge and discharge the load within a short time, and a square wave cannot be generated. When $V_{ctrl} > 0.9$ V, any further increase in $V_{ctrl}$ will not change the bias current by a large amount, and the delay value of the VCDL will not decrease significantly with an increase in $V_{ctrl}$. Therefore, the valid $V_{ctrl}$ operational range of the VCDL was defined as [0.3, 0.9] V. As Fig.
3.10 shows, the VCDL demonstrates the narrowest delay range at the FF corner when $V_{dd} = 1.08$ V and the temperature is 60 °C. The narrowest delay range was [2.79 ns, 41.94 ns], which corresponds to input frequencies of [23.84 MHz, 358.42 MHz].

Fig. 3.10. Delay range of the VCDL at all corners.

The peak-to-peak jitter of the VCDL at all corners was tested using Monte Carlo simulations. The worst-case peak-to-peak jitter was 279 ps ($T_{in} = 92.44$ ns) at the SS corner, at $V_{dd} = 1.08$ V, and at 60 °C.

Chapter 4: Charge Pump

This chapter presents the design of the CP in the DLL. The design flow is shown in Fig. 4.1; details are given in Sections 4.1–4.5.

Fig. 4.1. Design flow of the CP.

4.1 Structure

Fig. 4.2. (a) Conceptual diagram of a charge pump (CP) and (b) basic CP.

A CP is a circuit that either sinks or sources a charge for a set. Figure 4.2(a) shows that a CP consists of two different parts: (1) a biasing branch and (2) an output branch. The output branch comprises (2.1) the $I_{up}$ and $I_{dn}$ current sources and (2.2) the UP and DN switches. The loop integrator, $C_p$, is also shown in the CP schematic. The basic operation of the CP is as follows: $I_{up}$ charges $C_p$ if switch UP is turned on, and $I_{dn}$ discharges $C_p$ if switch DN is turned on. The transistor-level implementation of the basic CP is shown in Fig. 4.2(b), where UPB (UP bar) is the inverted signal of UP [24], [48].

4.2 Nonidealities of the basic CP

4.2.1 Charge injection

When the switch transistors ($M_2$ and $M_3$) are turned on or off, the channel charge in the inversion layer underneath the gate is absorbed by or released from the MOSFET. The amount of this charge is [49]

$$|Q_{channel}| = W L\, C_{ox} (V_{gs} - V_{th}) \tag{4.1}$$

where
$W$ and $L$ are the width and length of the MOSFET,
$C_{ox}$ is the gate capacitance per unit area,
$V_{gs}$ is the gate-to-source voltage,
$V_{th}$ is the MOSFET threshold voltage.

One part of the charge will flow to the source of the MOSFET, and the other will go to the output node.
Consequently, a net disturbance will arise in $V_{ctrl}$ on both the rising and falling edges of the UP and DN pulses because the two switches do not necessarily carry equal charges. If the charge that flows into the output node from one switch, $M_3$, is $k Q_{channel}$, then the error in $V_{ctrl}$ caused by the charge injection from $M_3$ will be [24], [49]

$$\Delta V_{ctrl\_injection} = \left| \frac{k\, Q_{channel}}{C_p} \right| = \frac{k\, W L\, C_{ox} (V_{GS} - V_{th})}{C_p} \tag{4.2}$$

where $k$, $V_{GS}$, and $V_{th}$ depend on the input signal. It has been shown that turning off the switch MOSFETs in the saturation region will cause all the channel charge to flow into the source; the drain will not be affected because the channel is disconnected from the drain [49]. Therefore, it is desirable to keep the switch MOSFETs in the saturation region to minimize the effect of the charge injection on the output node. However, the switch MOSFETs will be pushed into the triode region if $V_{ctrl}$ goes too high or too low [49].

4.2.2 Clock feedthrough

Fig. 4.3. Description of how clock feedthrough affects the basic CP.

$C_{gd2}$ and $C_{gd3}$ conduct the edges of the UP/DN pulses to the output, producing a jump in $V_{ctrl}$ of the following value [49]:

$$\Delta V_{ctrl\_feedthrough} = V_{dd}\, \frac{|C_{gd2} - C_{gd3}|}{C_{gd2} + C_{gd3} + C_p} \tag{4.3}$$

because $C_{gd2}$ and $C_{gd3}$ are generally unequal. Contrary to the error in $V_{ctrl}$ due to the charge injection given in Section 4.2.1, the clock feedthrough error is largely independent of the input signal. In addition, unlike the charge injection error, the clock feedthrough error will remain even if the switch MOSFETs are kept in the saturation region [24], [49].

When the CP transistors are enlarged to minimize the effect of channel length modulation and random mismatch, the issues of charge injection and clock feedthrough will worsen. Thus, the loop capacitor $C_p$ should be large enough to reduce $\Delta V_{ctrl\_feedthrough}$, but should not be so large that it slows down the CP response.
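As a rough feel for the magnitudes in (4.2) and (4.3), the sketch below evaluates both errors for two loop capacitor values. The function names, the injected-charge fraction `k`, and all device values are hypothetical and chosen only for illustration; they are not taken from the design in this thesis.

```python
def charge_injection_error(k, w, l, c_ox, v_gs, v_th, c_p):
    """Delta V_ctrl from charge injection, following (4.2)."""
    q_channel = w * l * c_ox * (v_gs - v_th)
    return abs(k * q_channel / c_p)


def clock_feedthrough_error(v_dd, c_gd2, c_gd3, c_p):
    """Delta V_ctrl from clock feedthrough, following (4.3)."""
    return v_dd * abs(c_gd2 - c_gd3) / (c_gd2 + c_gd3 + c_p)


if __name__ == "__main__":
    # Hypothetical device and loop values, for illustration only.
    w, l = 1e-6, 0.2e-6   # switch width and length in m
    c_ox = 8e-3           # gate capacitance per unit area in F/m^2
    for c_p in (1e-12, 10e-12):
        dv_inj = charge_injection_error(0.5, w, l, c_ox, 1.2, 0.4, c_p)
        dv_ft = clock_feedthrough_error(1.2, 0.30e-15, 0.25e-15, c_p)
        print(f"C_p = {c_p:.0e} F: injection {dv_inj * 1e3:.3f} mV, "
              f"feedthrough {dv_ft * 1e3:.3f} mV")
```

Both errors shrink roughly in proportion to $1/C_p$ once $C_p$ dominates the gate-drain capacitances, which is the reason a larger loop capacitor helps, at the cost of a slower CP response.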
4.2.3 Charge sharing

Fig. 4.4. (a) Basic CP and (b) description of how charge sharing affects the basic CP.

As shown in Fig. 4.4, when UP = DN = 0, $V_A$ will be charged to $V_{dd}$, $V_B$ will be discharged to ground, and $V_{ctrl}$ will be floating. During the operation of the DLL, in some cases, both switches will be turned on/off simultaneously. Then, $V_A$ will be discharged and $V_B$ will be charged, leading to charge sharing among $C_A$, $C_B$, and $C_p$. As a result, $V_{ctrl}$ will experience a variation [24], [50], [51].

4.2.4 $I_{up}/I_{dn}$ current leakage

Fig. 4.5. (a) CP leakage model and (b) description of how $I_{up}/I_{dn}$ current leakage affects the basic CP.

Another nonideality is the leakage of the switch transistors ($M_2$ and $M_3$) while they are off. The difference between the charging leakage current and the discharging leakage current, $I_{leak}$, discharges the loop capacitor for the time $T_{in} - T_{res} \approx T_{in}$ (where $T_{in}$ is the input clock period and $T_{res}$ is the duration of the UP and DN pulses), resulting in a peak-to-peak ripple of $(I_{leak}/C_p)\, T_{in}$ in $V_{ctrl}$. Such a change in $V_{ctrl}$ can be compensated in steady-state operation at the expense of a phase error of

$$\Delta T = \frac{I_{leak}}{I_{cp}}\, T_{in} \tag{4.4}$$

where $I_{cp}$ is the nominal charge pump current. $\Delta T$ can be significant because the leakage occurs for one input period $T_{in}$, even though $I_{leak}/I_{cp}$ is exceedingly small. The same behavior may also be seen when the loop capacitor has its own leakage [24].

4.2.5 $I_{up}/I_{dn}$ current mismatch and variation

Fig. 4.6. The lock condition in the presence of $I_{up}/I_{dn}$ mismatches.

Fig. 4.7. (a) Basic CP and (b) example $I_{up}$ and $I_{dn}$ waveform of the basic CP.

Assume $I_{up} > I_{dn}$ and that $(I_{up} - I_{dn})$ flows through the loop capacitor for $T_{res}$ seconds in every input period, as shown in Fig. 4.6. The DLL attempts to make the input and output clocks have a constant phase difference.
Consequently, the loop settles with a static phase error ($\Delta T$), such that the larger current lasts shorter and the smaller current lasts longer, and there will be ripples in the waveform of $V_{ctrl}$, as shown in Fig. 4.6 [24]. In the ripple of the $V_{ctrl}$ waveform, $V_{ctrl}$ falls at a rate of $I_{dn}/C_p$ for $\Delta T$ seconds and rises at a rate of $(I_{up} - I_{dn})/C_p$ for $T_{res}$ seconds. In the steady state, the fall and rise must cancel; hence, the phase error is given by [24] as

$$\Delta T = \frac{I_{up} - I_{dn}}{I_{cp}}\, T_{res} \tag{4.5}$$

where $T_{res}$ is the duration of the UP and DN pulses. The peak-to-peak ripple of $V_{ctrl}$ is given by [24]

$$\Delta V_{ctrl} = \Delta T\, \frac{I_{cp}}{C_p} \approx \frac{(I_{up} - I_{dn})\, T_{res}}{C_p} \tag{4.6}$$

An example waveform of the $I_{up}$ and $I_{dn}$ of the basic CP is shown in Fig. 4.7(b). Because of the channel length modulation effect, $I_{up}$ will increase and $I_{dn}$ will decrease as $V_{ctrl}$ decreases. Therefore, $I_{up}$ and $I_{dn}$ only match each other well within a narrow $V_{ctrl}$ range. Both the mismatch between $I_{up}$ and $I_{dn}$ and the variation of $I_{cp}$ will generate spurious tones and phase noise at the VCDL output and affect the transient response of the DLL [52].

4.2.6 Output noise

The output current noise of the CP will be converted to CP jitter and, ultimately, to the output jitter of the DLL. Equations (2.15) and (2.16) in Section 2.2.1 illustrate how CP jitter is converted to the output jitter of the DLL.

4.3 CP structure improvements

4.3.1 Make $I_{up}$ match $I_{dn}$ under a wide $V_{ctrl}$ range

There are two ways to make $I_{up}$ match $I_{dn}$ under a wide $V_{ctrl}$ range: (1) Increase the output resistance of the PMOS and NMOS current sources to maintain $I_{up}$ and $I_{dn}$ at a nominal value for a wide $V_{ctrl}$ range. Thus, both the current mismatch and current variation problems can be alleviated. (2) Extend the compliance voltage ($V_{compliance}$, the maximum/minimum allowable $V_{ctrl}$ voltage to keep all the current source transistors in the saturation region).
In this report, the compliance voltage of the PMOS current source of the CP is denoted as $V_{compliance\_P}$, which is expected to be high, while that of the NMOS current source of the CP is denoted as $V_{compliance\_N}$, which is expected to be low.

4.3.1.1 Using feedback to increase the output resistance of the current source

4.3.1.1.1 Using positive voltage–voltage (series–shunt) feedback to improve output resistance

Fig. 4.8. Output resistance improved PMOS current source of the CP using positive feedback.

An example of using positive voltage–voltage (series–shunt) feedback to increase the output resistance of the PMOS current source of the CP is demonstrated in Fig. 4.8 [53]. The positive feedback loop is the loop that encloses $M_1$, as indicated in Fig. 4.8. The switch transistor $M_s$ is assumed to be on during the analysis. As a result, the output resistance seen from the drain of $M_1$ can be expressed as

$$R_{out1} = \frac{r_{o1}\left[ 1 + g_{m2} A_o (r_{o2} \| R_s) \right]}{1 + g_{m2} A_o (r_{o2} \| R_s) - g_{m1} A_o r_{o1}} \tag{4.7}$$

where
$g_{mx}$ is the small-signal transconductance of the MOSFETs,
$r_{ox}$ is the small-signal output resistance of the MOSFETs,
$A_o$ is the DC gain of the amplifier,
$R_s$ is the output resistance of the biasing current source ($I_{cp}$ in Fig. 4.8).

If $M_1$ and $M_2$ are matched well and $A_o$ is large, $R_{out1}$ can be simplified as

$$R_{out1} \approx -R_s \tag{4.8}$$

The intuitive understanding of equation (4.8) and Fig. 4.8 can be expressed as follows. It is assumed that the amplifier has a large gain, $M_1$ and $M_2$ are matched well, and the voltage drop across $M_s$ is negligible. Consequently, as $V_{ctrl}$ changes, $V_{d1}$ and $V_{d2}$ will always be equal to $V_{ctrl}$, and the drain currents through $M_1$ and $M_2$ will always match and be equal to what is provided by the biasing current source $I_{cp}$. Then, the quality of the current source $M_1$ will be determined only by the quality of the biasing current source $I_{cp}$.
Because the quality of a current source is controlled by its output resistance, the condition illustrated in Fig. 4.8 is that the output resistance of the biasing current source $I_{cp}$ ($R_s$) is passed on to $M_1$. The negative sign in equation (4.8) is due to the positive feedback. Therefore, the output resistance of the PMOS current source can be increased from $r_{o2}$ to the output resistance of the biasing current source $R_s$. $V_{compliance\_P}$ here is $(V_{dd} - V_{dsat})$, where $V_{dsat}$ is the minimum $V_{ds}$ that keeps the MOSFET in the saturation region.

However, this kind of current source suffers from two disadvantages: more effort is needed to increase $R_s$, and $R_{out1}$ is negative because of the positive feedback.

Figure 4.9 is an example of the CP in [54] that used positive feedback to the PMOS current source. From a large-signal viewpoint, the feedback amplifier keeps $V_{ds1} = V_{ds3}$ and $V_{ds2} = V_{ds4}$ as $V_{ctrl}$ changes. Then, $I_{up} = I_1$ and $I_{dn} = I_2$ for a wide $V_{ctrl}$ range. We also know that $I_1 = I_2$ because they are in the same branch. Thus, $I_{up}$ can match $I_{dn}$ under a wide $V_{ctrl}$ range. Although the current matching problem can be solved, the variation in the absolute current value of $I_{up}$ and $I_{dn}$ remains a problem because only the output resistance of the PMOS current source is boosted while that of the NMOS counterpart is not.

Fig. 4.9. (a) CP with feedback to the PMOS current source [54] and (b) $I_{up}$ and $I_{dn}$ waveforms of this CP.

Another CP example in [54] used positive feedback to both the PMOS and NMOS current sources to solve the current mismatch and current variation problems simultaneously. Figure 4.10 demonstrates its structure and example waveform. In this CP, the amplifier feedback to the PMOS current source ensures $I_{CH2} = I_{DIS1}$, while the amplifier feedback to the NMOS current source ensures $I_{CH1} = I_{DIS2}$.
Because $I_{dn} = I_{DIS1} + I_{DIS2}$ and $I_{up} = I_{CH1} + I_{CH2}$, $I_{up}$ and $I_{dn}$ match each other and can be kept constant under a wide $V_{ctrl}$ range.

Fig. 4.10. (a) CP with feedback to both the PMOS and NMOS current sources [54] and (b) example $I_{up}$ and $I_{dn}$ waveforms of this CP.

However, there are four problems with this circuit. First, according to my simulation results and the testing results in [55] and [56], the variation in the absolute current value of $I_{up}$ and $I_{dn}$ was not significantly reduced. This could be because the output resistance of the P/NMOS current sources of the CP ($R_{out\_CP}$) is only approximately $r_o$, which is equal to the output resistance of the current bias (denoted as $R_s$ in Fig. 4.8), as there is only a single transistor. Nonetheless, $r_o$ is too small to reduce the current variation as $V_{ctrl}$ changes. Additionally, because of the non-idealities of the amplifiers, $R_{out\_CP}$ cannot be reliably maintained at $r_o$ as $V_{ctrl}$ changes from rail to rail. Thus, the absolute current value of $I_{up}$ and $I_{dn}$ will still vary as $V_{ctrl}$ changes. Additional circuits may be needed at the current bias to further enlarge its output resistance.

Second, the two amplifiers are required to have a high UGB for a fast CP and DLL response, increasing the power consumption. Third, both of the two global feedback loops (each made by a positive and a negative feedback loop) are required to be stable across a wide $V_{ctrl}$ range, which may result in a trade-off between the UGB and phase margin (in some cases, compensation capacitors are needed at the output nodes of the amplifiers to increase the phase margin; however, this will limit the UGB of the amplifiers). Fourth, the circuit in Fig. 4.10(a) could increase design effort, area, and structural complexity, introducing higher risks to the circuit under PVT variations and mismatches.

4.3.1.1.2 Using negative current–voltage (series–series) feedback to increase output resistance

➢ Cascode (Fig. 4.11(a))

Fig.
4.11. Output resistance improved NMOS current source of the CP using negative feedback: (a) cascode and (b) regulated cascode.

In the cascode current source shown in Fig. 4.11(a), $M_1$ serves as the current–voltage (series–series) feedback to $M_2$. As a result, the output resistance from the drain of $M_2$ ($R_{out2}$) can be increased by the loop gain (from $r_{o2}$ to $r_{o2} \times Loop\_Gain$):

$$R_{out2} = r_{o1} + r_{o2} + g_{m2} r_{o1} r_{o2} \approx g_{m2} r_{o1} r_{o2} \tag{4.9}$$

The $V_{compliance}$ for a normal cascode current source is $2V_{dsat}$ (the switch transistor $M_s$ is assumed to be on during analysis).

➢ Regulated cascode (Fig. 4.11(b))

The output resistance of a cascode current source can be further boosted by raising $g_{m2}$. Employing negative voltage–voltage (series–shunt) feedback to $M_2$ (Fig. 4.11(b)) can increase $g_{m2}$ by the loop gain, where the improved $g_{m2}$ ($g_{m2}'$) is expressed as:

$$g_{m2}' = g_{m2}(A_o + 1) \approx g_{m2} A_o \tag{4.10}$$

where $A_o$ is the DC gain of the feedback amplifier. Thus, the boosted output resistance of the current source in Fig. 4.11(b) can be expressed as:

$$R_{out3} = r_{o1} + r_{o2} + g_{m2}(A_o + 1) r_{o1} r_{o2} \approx g_{m2} A_o r_{o1} r_{o2} \tag{4.11}$$

Regulated cascode current source using differential to single-ended feedback amplifier

Two examples of the regulated cascode NMOS output current source and its biasing branch are shown in Fig. 4.12(a) and (b), where (a) has a simpler structure and (b) has better symmetry. The simplest feedback amplifiers that could be used in (a) and (b) are shown in (c)–(f). (c) and (e), with the NMOS input differential pair, have the minimum common mode input voltage ($V_{in\_cm\_min}$, which is equal to $V_{compliance\_M1}$) of $(2V_{dsat} + V_{th})$. For (d), it is $V_{dsat}$. Hence, (d) is preferred to (c) and (e) to obtain a lower $V_{compliance\_M1}$.
According to equations (4.10) and (4.11), the feedback amplifier is required to have a large gain for a large $R_{out3}$, and a cascode active load may be needed to replace the simple active load in the amplifier, as shown in Fig. 4.12(f). However, this will upshift $V_{compliance\_M1}$ to $(2V_{dsat} + V_{th})$, introducing a trade-off between the gain of the feedback amplifier and $V_{compliance\_M1}$. If $V_{compliance\_M1}$ increases, $V_{compliance\_N}$ (the minimum allowable $V_{ctrl}$ voltage to keep all transistors in the NMOS current source in the saturation region) also increases, as $V_{compliance\_M1}$ is part of $V_{compliance\_N}$.

Fig. 4.12. (a), (b) Output resistance improved NMOS current source of the CP using regulated cascode and the biasing circuit; (c)–(f) simple amplifiers that could be used in (a) and (b).

Therefore, structural modification should be made to the current sources in Fig. 4.12(a) and (b) to solve two trade-offs: (1) achieve symmetry without adding too much complexity and (2) increase the gain of the feedback amplifier without increasing $V_{compliance\_M1}$ and $V_{compliance\_N}$. The circuits in Fig. 4.13(a) and (b) are introduced to solve these trade-offs (the switch transistor $M_s$ is assumed to be on during analysis).

Regulated cascode current source using single-input–single-output common-source feedback amplifier

Fig. 4.13. (a) Output resistance improved NMOS current source of the CP using regulated cascode and the biasing circuit. (b) Regulated cascode current source with a level shifter for reducing $V_{compliance\_M1}$.

(a) Principle

In Fig. 4.13(a) and (b), $M_1$ and $M_2$ form the cascode output current source of the CP. $M_3$ and $I_{cs}$ form the single-input–single-output common source feedback amplifier. $M_4$ and $I_{ls}$ in (b) form the level shifter. The biasing branches of both (a) and (b) are replicas of the output branch with the gate of $M_5$ connected to the drain of $M_6$ to construct a current mirror.
The use of a single-input–single-output common source feedback amplifier to boost $g_{m2}$ simplifies the amplifier structure, so the overall circuit does not become complex even when replicas are used as the bias branch to achieve symmetry. This solves the first trade-off of the circuits in Fig. 4.12.

(b) Small-signal analysis of the current sources in Fig. 4.13(a) and (b)

From a small-signal viewpoint, both output current sources in Fig. 4.13(a) and (b) have approximately the same output resistance:

$$R_{out4} = r_{o1} + r_{o2} + g_{m2}\left[g_{m3}(r_{o3} \| R_{cs}) + 1\right] r_{o1} r_{o2} \approx g_{m2} g_{m3} (r_{o3} \| R_{cs}) r_{o1} r_{o2} \tag{4.11}$$

where $R_{cs}$ is the output resistance of the current source $I_{cs}$, and $g_{m3}(r_{o3} \| R_{cs})$ is the absolute value of the gain of the $g_{m2}$-boosting amplifier.

For the current sources in Fig. 4.13(a) and (b), the common source feedback amplifier ($M_3$ and $I_{cs}$) and the follower $M_2$ regulate the drain voltage of $M_1$ ($V_{d1}$) to suppress the channel-length modulation of $M_1$. Any small variation in $V_{d1}$ is amplified by the common source feedback amplifier and reflected to the gate voltage of $M_2$ ($V_{g2} = V_{d3}$), forcing $V_{d1}$ to stay constant.

The frequency response of the output current source in Fig.
4.13(a) (containing $M_1$, $M_2$, $M_3$, and $I_{cs}$) can be expressed as [57]:

$$H(s) = \frac{V_{d2}(s)}{V_{g1}(s)} = -g_{m1} g_{m2} r_{o1} r_{o2} \left[1 + A_f(s)\right] \cdot \frac{1}{1 + \dfrac{s}{1 / \left( g_{m2} C_L r_{o1} r_{o2} \left[1 + A_f(s)\right] \right)}} \tag{4.12}$$

$$A_f(s) = g_{m3}(r_{o3} \| R_{cs}) \cdot \frac{1}{1 + \dfrac{s}{1 / \left( C_f (r_{o3} \| R_{cs}) \right)}} \tag{4.13}$$

where $C_L$ is the load capacitance at the drain of $M_2$ and $C_f$ is the load capacitance of the feedback amplifier at the gate of $M_2$. Therefore:

$$H(s) = \frac{V_{d2}(s)}{V_{g1}(s)} = A_{vo} \frac{\left(1 + \dfrac{s}{Z}\right)}{\left(1 + \dfrac{s}{P_{dominant}}\right)\left(1 + \dfrac{s}{P_{non\_dominant}}\right)} \tag{4.14}$$

$$A_{vo} = -g_{m1} g_{m2} r_{o1} r_{o2} \left[1 + g_{m3}(r_{o3} \| R_{cs})\right] \approx -g_{m1} g_{m2} g_{m3} r_{o1} r_{o2} (r_{o3} \| R_{cs}) \tag{4.15}$$

$$Z = \frac{1 + g_{m3}(r_{o3} \| R_{cs})}{(r_{o3} \| R_{cs}) C_f} \approx \frac{g_{m3}}{C_f} \tag{4.16}$$

$$P_{dominant} = \frac{1}{(r_{o3} \| R_{cs}) C_f + g_{m2} r_{o1} r_{o2} C_L \left[1 + g_{m3}(r_{o3} \| R_{cs})\right]} \approx \frac{1}{g_{m2} g_{m3} r_{o1} r_{o2} (r_{o3} \| R_{cs}) C_L} \tag{4.17}$$

$$P_{non\_dominant} = \frac{(r_{o3} \| R_{cs}) C_f + g_{m2} r_{o1} r_{o2} C_L \left[1 + g_{m3}(r_{o3} \| R_{cs})\right]}{g_{m2} r_{o1} r_{o2} (r_{o3} \| R_{cs}) C_f C_L} \approx \frac{g_{m3}}{C_f} = Z \tag{4.18}$$

Finally:

$$H(s) = \frac{V_{d2}(s)}{V_{g1}(s)} \approx A_{vo} \frac{1}{1 + \dfrac{s}{P_{dominant}}} \tag{4.19}$$

$$UGB_{rcg} = \left| A_{vo} P_{dominant} \right| = \frac{g_{m1}}{C_L} \tag{4.20}$$

Here, “rcg” denotes the regulated cascode. The power spectral density of the output current noise of the output current source in Fig. 4.13(a) is:

$$\overline{I_{n,out\_rcg}^2} \approx 4kT\gamma \left[ g_{m1} + \frac{1}{g_{m2} \left( g_{m3} r_{o1} (r_{o3} \| R_{cs}) \right)^2} + \frac{1}{g_{m3} r_{o1}^2} \right] \approx 4kT\gamma g_{m1} \tag{4.21}$$

This indicates that the output current noise of the regulated cascode current source is approximately equal to that of a single transistor if $g_m r_o \gg 1$.

The approximated small-signal model of the current source in Fig. 4.13(b) is similar to that of Fig. 4.13(a), except that $A_f(s)$ has an additional pole at a high frequency caused by the level shifter.

(c) Large-signal analysis of the current sources in Fig. 4.13(a) and (b)

Fig. 4.14. Large-signal voltages $V_{d1}$ and $V_{g2}$ and currents $I_{up}$ and $I_{dn}$ as a function of $V_{ctrl}$.
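As a numerical sanity check on equations (4.15)–(4.18) and (4.20), the sketch below evaluates the unapproximated expressions and lets the reader compare them with the approximations $UGB_{rcg} \approx g_{m1}/C_L$ and $P_{non\_dominant} \approx Z$. The function name and all device values are hypothetical and chosen only for illustration; they are not taken from this design.

```python
def parallel(a, b):
    """Resistance of two resistors in parallel."""
    return a * b / (a + b)


def regulated_cascode_response(gm1, gm2, gm3, ro1, ro2, ro3, r_cs, c_f, c_l):
    """Evaluate (4.15)-(4.18) and (4.20) without the final approximations.

    Frequencies are returned in rad/s.
    """
    r_f = parallel(ro3, r_cs)            # r_o3 || R_cs
    a_f0 = gm3 * r_f                     # DC gain of the feedback amplifier
    tau_sum = r_f * c_f + gm2 * ro1 * ro2 * c_l * (1 + a_f0)
    a_vo = -gm1 * gm2 * ro1 * ro2 * (1 + a_f0)               # (4.15)
    p_dom = 1 / tau_sum                                       # (4.17)
    return {
        "A_vo": a_vo,
        "Z": (1 + a_f0) / (r_f * c_f),                        # (4.16)
        "P_dominant": p_dom,
        "P_non_dominant": tau_sum / (gm2 * ro1 * ro2 * r_f * c_f * c_l),  # (4.18)
        "UGB_rcg": abs(a_vo * p_dom),                         # (4.20)
    }


# Hypothetical device values (assumptions, not extracted from the thesis).
result = regulated_cascode_response(
    gm1=1e-3, gm2=1e-3, gm3=1e-3,
    ro1=50e3, ro2=50e3, ro3=50e3, r_cs=50e3,
    c_f=10e-15, c_l=1e-12,
)
for name, value in result.items():
    print(f"{name}: {value:.4g}")
```

With these assumed values, the zero and the non-dominant pole nearly coincide, which is what allows the single-pole form of equation (4.19).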
Figure 4.14 illustrates the behavior of the large-signal voltages $V_{d1}$ and $V_{g2}$ and the currents $I_{up}$ and $I_{dn}$ as $V_{ctrl}$ varies ($V_{gs1}$ is assumed to be constant during analysis). When $V_{ctrl}$ decreases from high values to $V_2$, $M_2$ moves from the saturation region to the triode region. To conduct the saturation current set by $M_1$, a larger $V_{gs2}$ is needed, and the common source feedback amplifier helps to realize the respective increase in $V_{g2}$. When $V_{ctrl}$ decreases even lower to $V_1$, both $M_1$ and $M_2$ are in the triode region. Under this condition, further increasing $V_{g2}$ cannot force $M_2$ to conduct the saturation current from $M_1$ [58].

Fig. 4.14 shows that the slope of the $V_{g2}$ trace is steeper than that of $V_{d1}$ when $V_{g2}$ is less than $V_{dsat\_Ics}$. This is because of the amplification in the feedback loop. If $V_{g2}$ exceeds $V_{dsat\_Ics}$, the current source $I_{cs}$ moves from the saturation region to the triode region, and the gain of the common source feedback amplifier drops.

According to the simulation in this project, the DLL has a <1.5% phase error if the mismatch between $I_{up}$ and $I_{dn}$ is less than 5%. $V_{N\_5\%}$, which is in region 2 in most cases, denotes the point where $I_{dn}$ deviates 5% from the nominal CP current value $I_{cp\_nominal}$. Thus, the valid $V_{ctrl}$ range of the CP is defined as $V_{N\_5\%} \le V_{ctrl} \le V_{P\_5\%}$. Because $V_{N\_5\%}$ is usually located in region 2, its value is mainly affected by the slope of $I_{dn}$ ($slope_{Idn}$) rather than the value of $V_{dsat\_Ics}$. The amplifier gain determines the slope of $V_{d1}$ ($slope_{Vd1}$) in Fig. 4.14, and $slope_{Vd1}$ determines $slope_{Idn}$.

Using the cascode current source for $I_{cs}$ to increase the gain of the common source feedback amplifiers in Fig. 4.13(a) and (b) limits $V_{dsat\_Ics}$. However, this is not the main factor affecting $V_{N\_5\%}$. In fact, using the cascode current source for $I_{cs}$ is beneficial for reducing $slope_{Vd1}$, leading to a lower $slope_{Idn}$ and $V_{N\_5\%}$.
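The 5% criterion above can be applied to swept simulation data in a few lines. The following is a minimal sketch, assuming that both $I_{up}$ and $I_{dn}$ are compared against the nominal current and that the valid range is the longest contiguous run of sweep points satisfying the tolerance; the function name and the synthetic sweep are hypothetical and do not reproduce the simulation setup used in this work.

```python
def valid_vctrl_range(vctrl, i_up, i_dn, i_nominal, tol=0.05):
    """Return (V_N, V_P): ends of the longest contiguous V_ctrl span in which
    both currents stay within tol of the nominal value, or None if no point does."""
    ok = [
        abs(u - i_nominal) <= tol * i_nominal and abs(d - i_nominal) <= tol * i_nominal
        for u, d in zip(i_up, i_dn)
    ]
    best = None
    start = None
    # A trailing False closes any run that reaches the end of the sweep.
    for k, flag in enumerate(ok + [False]):
        if flag and start is None:
            start = k
        elif not flag and start is not None:
            if best is None or (k - start) > (best[1] - best[0]):
                best = (start, k)
            start = None
    if best is None:
        return None
    return vctrl[best[0]], vctrl[best[1] - 1]


# Synthetic sweep (illustrative only): I_dn collapses below 0.2 V,
# I_up collapses above 0.8 V.
v = [k / 10 for k in range(11)]
i_nom = 100e-6
i_dn = [i_nom * min(1.0, x / 0.2) for x in v]
i_up = [i_nom * min(1.0, (1.0 - x) / 0.2) for x in v]
print(valid_vctrl_range(v, i_up, i_dn, i_nom))
```

A finer sweep step gives a correspondingly finer estimate of the two range limits.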
Finally, the second trade-off of the current sources in Fig. 4.12 has been overcome: (1) using the cascode structure for $I_{cs}$ instead of a single transistor increases the gain of the common source feedback amplifier and reduces $V_{N\_5\%}$ ($\approx V_{compliance\_N}$) at the same time, and (2) pushing $M_3$ into weak inversion ($V_{gs3} = V_{ds1}$ is reduced) further boosts the gain of the common source feedback amplifier and also reduces $V_{compliance\_N} = V_{gs3} + V_{dsat2}$ concurrently.

(c1) $V_{compliance\_N}$ of the current source in Fig. 4.13(a)

$V_{compliance}$ of the current source in Fig. 4.13(a) can be calculated as:

$$V_{compliance\_N\_4.13(a)} = V_{gs3}\,(= V_{ds1}) + V_{ds2} \tag{4.22}$$

If $M_3$ in Fig. 4.13(a) is designed to be in saturation, $V_{compliance\_N\_4.13(a)\_M3\_sat}$ is:

$$V_{compliance\_N\_4.13(a)\_M3\_sat} = V_{gs3} + V_{ds2} = 2V_{dsat} + V_{th} \tag{4.23}$$

If $M_3$ in Fig. 4.13(a) is designed to be in weak inversion (WI), $V_{compliance\_N\_4.13(a)\_M3\_WI}$ can be calculated as [58]:

$$V_{compliance\_N\_4.13(a)\_M3\_WI} = V_{gs3\_WI} + V_{ds2} < V_{compliance\_N\_4.13(a)\_M3\_sat} = 2V_{dsat} + V_{th} \tag{4.24}$$

where $V_{gs3\_WI}$ is the gate-source voltage of $M_3$ when it is in weak inversion. The minimum $V_{ds}$ to keep $M_1$ in saturation when $M_3$ is in weak inversion ($V_{dsat1\_M3\_WI}$) can be calculated by [58]:

$$V_{dsat1\_M3\_WI} = n \frac{kT}{q} \ln\left( \frac{L_3 I_{cs}}{W_3 I_{dn}} \right) \tag{4.25}$$

where $n$ is $[(C_{ox} + C_{depletion})/C_{ox}] \approx 1.5$. Then, the minimum possible $V_{compliance\_N\_4.13(a)\_M3\_WI}$ when $M_3$ is in weak inversion can be approximated as:

$$V_{compliance\_N\_4.13(a)\_M3\_WI\_min} \approx 2V_{dsat} \tag{4.26}$$

In practice, the minimum allowable output voltage of the current source in Fig. 4.13(a) can be lower than $2V_{dsat}$, because this current source can still maintain the output current value close to its nominal value, even if $M_2$ goes into the triode region. Thus, [58] defined the minimum allowable output voltage of the current source in Fig.
4.13(a) to be the voltage where its small-signal output resistance ($R_{out4}$) is decreased to be equal to that of the regular cascode current source ($R_{out2}$). Then, the new minimum output voltage of the current source in Fig. 4.13(a) ($V_{out\_N\_4.13(a)\_min}$) can be calculated by:

$$V_{out\_N\_4.13(a)\_min} = (V_{gs1} - V_{th}) + (V_{gs2} - V_{th}) \sqrt{\frac{\varphi}{2 + \varphi}} < 2V_{dsat} \tag{4.27}$$

$$\varphi = \frac{g_{m2} r_{o2}}{g_{m3} (r_{o3} \| R_{cs})} \tag{4.28}$$

Comparing the current source in Fig. 4.13(a) to that in Fig. 4.11(a), $R_{out4}$ is about 100 times larger than $R_{out2}$, while $V_{out\_min}$ can be lowered by 30% to 60%, based on the result in [58].

(c2) $V_{compliance\_N}$ of the current source in Fig. 4.13(b)

$V_{compliance}$ of the current source in Fig. 4.13(b) can be calculated as:

$$V_{compliance\_N\_4.13(b)} = (V_{gs3} - V_{gs4}) + V_{ds2} = (V_{gs3} - V_{gs4}) + V_{dsat} \tag{4.29}$$

The minimum output voltage of the current source in Fig. 4.13(b) ($V_{out\_N\_4.13(b)\_min}$) can be smaller than that in Fig. 4.13(a) because $(V_{gs3} - V_{gs4})$ can be exceedingly small [57].

4.3.2 Reduce clock feedthrough and charge injection

One way to reduce the clock feedthrough and charge injection is to use a complementary switch, as shown in Fig. 4.15.

The principle of how the clock feedthrough is reduced is explained as follows. If $(W/L)_n / (W/L)_p = \mu_p / \mu_n$, then $C_p$ is approximately equal to $C_n$. Therefore, the voltage ripple caused by the rising clock edge, which feeds into $C_p$ ($C_n$), is approximately cancelled by the falling edge that feeds into $C_n$ ($C_p$).

The principle of how the charge injection is reduced is explained as follows. The charge injected to $V_{out}$ by $M_n$ and $M_p$ can be calculated by:

$$|Q_{ch\_Mn}| = W_n L_n C_{ox} (V_{up\_high} - V_{in} - V_{thn}) \tag{4.30}$$

$$|Q_{ch\_Mp}| = W_p L_p C_{ox} (V_{in} - V_{upb\_low} - |V_{thp}|) \tag{4.31}$$

Similarly, $Q_{ch\_Mn}$ (negative) can cancel $Q_{ch\_Mp}$ (positive) if $(W/L)_n / (W/L)_p = \mu_p / \mu_n$.
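Equations (4.30) and (4.31) can be evaluated directly to see how closely the two channel charges balance for a given input level. The sketch below is only an illustration: the function names, device dimensions, oxide capacitance, and voltages are all assumed values, not parameters of this design.

```python
def channel_charge_nmos(w_n, l_n, c_ox, v_up_high, v_in, v_thn):
    """|Q_ch_Mn| from equation (4.30), in coulombs."""
    return w_n * l_n * c_ox * (v_up_high - v_in - v_thn)


def channel_charge_pmos(w_p, l_p, c_ox, v_in, v_upb_low, v_thp_abs):
    """|Q_ch_Mp| from equation (4.31), in coulombs."""
    return w_p * l_p * c_ox * (v_in - v_upb_low - v_thp_abs)


# Assumed values: equal gate areas, mid-rail input, symmetric thresholds.
q_n = channel_charge_nmos(w_n=1e-6, l_n=0.1e-6, c_ox=0.01,
                          v_up_high=1.2, v_in=0.6, v_thn=0.4)
q_p = channel_charge_pmos(w_p=1e-6, l_p=0.1e-6, c_ox=0.01,
                          v_in=0.6, v_upb_low=0.0, v_thp_abs=0.4)
print(f"|Q_n| = {q_n:.3e} C, |Q_p| = {q_p:.3e} C, residual = {q_n - q_p:.3e} C")
```

Moving `v_in` away from mid-rail in this sketch makes the two magnitudes diverge, which illustrates why the cancellation is only approximate across the $V_{ctrl}$ range.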
As a result, the ripple at $V_{out}$ caused by clock feedthrough and charge injection can be significantly reduced by using a complementary switch.

Fig. 4.15. Complementary switch.

Another strategy for canceling out clock feedthrough and absorbing charge injection is to add a dummy switch ($M_n'$ and $M_p'$) with half the size of the main switch transistor, as shown in Fig. 4.16. If $(W/L)_n = 2(W/L)_{n'}$, then $C_{n'1} \| C_{n'2} = C_n$. Therefore, the voltage ripple caused by the rising clock edge, which feeds into $C_n$, is approximately cancelled by the falling edge that feeds into $C_{n'1} \| C_{n'2}$. In addition, the charge injected by $M_n$ to the output node is approximately absorbed by $M_n'$, following the direction indicated by the red arrow. Thus, the dummy switch in Fig. 4.16 can also significantly reduce the ripple at $V_{out}$ caused by clock feedthrough and charge injection. Finally, the switch in Fig. 4.16 was selected as the switch in the CP.

Fig. 4.16. Complementary switch and dummy switch.

4.3.3 Reduce the charge-sharing effect

Fig. 4.17. The principle of charge-sharing reduction.

As shown in Fig. 4.17, the charge-sharing problem in the basic CP in Fig. 4.4 can be solved by using a voltage follower (VF) to always connect the drains ($V_{dp}$ and $V_{dn}$) of the current sources $I_{up}$ and $I_{dn}$ with $V_{ctrl}$ [51], [59]. Consequently, there will be no charge sharing among the parasitic capacitors at the drains of the current sources and the loop filter capacitor at $V_{ctrl}$.

4.4 CP structure comparison

Fig. 4.18. Charge pump structure comparison. There are four types of current sources: ① single transistor current source, ② cascode current source, ③ regulated cascode current source, and ④ regulated cascode current source with a level shifter. There are also four types of switches: (a) switch at the output node, (b) switch at the power supply, (c) switch at the bias circuit, and (d) switch at the output node with charge-sharing reduction.
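The four current-source types and four switch types listed in the caption of Fig. 4.18 pair into 4 × 4 = 16 candidate CP structures. As a small illustration, the sketch below enumerates them; the plain digits 1–4 stand in for the circled labels ①–④, and the variable names are hypothetical.

```python
from itertools import product

# Labels follow the caption of Fig. 4.18 (digits replace the circled numbers).
current_sources = {
    "1": "single transistor current source",
    "2": "cascode current source",
    "3": "regulated cascode current source",
    "4": "regulated cascode current source with a level shifter",
}
switches = {
    "a": "switch at the output node",
    "b": "switch at the power supply",
    "c": "switch at the bias circuit",
    "d": "switch at the output node with charge-sharing reduction",
}

# Every pairing of one current-source type with one switching scheme.
combinations = [f"[{cs}+({sw})]" for cs, sw in product(current_sources, switches)]

print(len(combinations))
print(" ".join(combinations))
```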
As stated in Section 4.1, the output branch of the CP consists of two parts: (1) the $I_{up}$ and $I_{dn}$ current sources and (2) the UP and DN switches. Four types of current sources and four types of switching schemes form 16 combinations, such as [①+(a)], [②+(b)], etc. All 16 CPs were simulated and compared.

For the CP, switching at the power supply in Fig. 4.18(b) or the biasing branch in Fig. 4.18(c) shields the switches from the $V_{ctrl}$ pin. This alleviates the problems of charge injection, charge sharing, and clock feedthrough. However, this measure increases the time required for the CP current to settle, and it exacerbates the leakage of $I_{up}$ and $I_{dn}$ because the switches are not directly connected to the output node.

Five combinations require unrealistically small transistor sizes to enable them to be turned on within several hundred ps: [③+(b)], [④+(b)], [②+(c)], [③+(c)], and [④+(c)]. Therefore, those five CPs cannot work well in this application, and their results were not summarized. The performances of the other 11 kinds of CPs are summarized in Table 4.1. The VF used in this comparison test is demonstrated in detail in Section 4.5.

Finally, the combination [③+(d)] was selected as the structure of the CP because of its wide $V_{ctrl}$ range, fast switching speed, and reasonable power and noise. The complete schematic of the charge pump [③+(d)] is shown in Fig. 4.19. The biasing branch was realized by regulated cascode current mirrors for better symmetry and accurate current copying.

Table 4.1.
Results of the charge pump structure comparison

| Combinations | $V_{ctrl}$ range for < 5% $I_{cp}$ mismatch | Turn-on speed | Output current noise @ 50 MHz | Power |
|---|---|---|---|---|
| ①+(a) | [0.38, 0.77] V | 110 ps | 3.75 pA/√Hz | 80 μW |
| ①+(b) | [0.38, 0.77] V | 210 ps | 3.75 pA/√Hz | 80 μW |
| ①+(c) | [0.36, 0.79] V | 382 ps | 3.87 pA/√Hz | 82 μW |
| ①+(d) | [0.36, 0.77] V | 49 ps | 4.23 pA/√Hz | 325 μW |
| ②+(a) | [0.31, 0.82] V | 128 ps | 4.41 pA/√Hz | 91 μW |
| ②+(b) | [0.31, 0.82] V | 262 ps | 4.41 pA/√Hz | 91 μW |
| ②+(c) | [0.33, 0.86] V | 450 ps | 4.50 pA/√Hz | 97 μW |
| ②+(d) | [0.30, 0.82] V | 56 ps | 4.69 pA/√Hz | 338 μW |
| ③+(a) | [0.20, 1.01] V | 391 ps | 4.72 pA/√Hz | 185 μW |
| ③+(d) | [0.20, 1.01] V | 60 ps | 4.96 pA/√Hz | 429 μW |
| ④+(a) | [0.12, 1.11] V | 421 ps | 6.22 pA/√Hz | 656 μW |
| ④+(d) | [0.12, 1.11] V | 286 ps | 6.79 pA/√Hz | 901 μW |

Fig. 4.19. Final charge pump structure.

4.5 Class AB voltage follower (VF) with complementary input used in the CP

4.5.1 VF classification

The VF in Fig. 4.19 requires a wide input and output common mode range. Additionally, its output current should support $I_{up}$ and $I_{dn}$ [51]. Fig. 4.20 shows different classes of VFs.

Fig. 4.20. Classification of various kinds of VFs.

A good VF should have a wide input/output DC operation range with a smaller output/input voltage offset ($|V_{buffered} - V_{in}|$), a larger -3 dB bandwidth, less power consumption, and less complexity.

Table 4.2. Performance comparison of various kinds of VFs based on the simulation results in this report

| | With global feedback: class AB $g_m$ cell-based VF | With global feedback: class A $g_m$ cell-based VF | With local feedback: VFs that are derived from SF |
|---|---|---|---|
| Input/output DC operation range | Almost rail-to-rail | Wide | Narrow |
| $\lvert V_{buffered} - V_{in} \rvert$ | Small | Small | Large |
| -3 dB bandwidth | Medium | Slow | Fast |
| Power | Medium | High | Low |
| Complexity | High | High | Low |

To achieve a high power efficiency, a high signal-to-noise ratio, and a wide DC input/output operation range, a class AB $g_m$ cell-based VF with complementary input is preferred [60], [61]. The VF structure used in the CP for charge-sharing reduction is referenced from [60].
The way this structure was developed from the flipped voltage follower (FVF) is explained below.

4.5.2 FVF analysis

Fig. 4.21. (a) Source follower, (b) FVF, and (c) FVF feedback analysis referenced from [62].

From the analysis done in [62], the conventional source follower in Fig. 4.21(a) has a gain less than unity when it is connected to resistive loads or to capacitive loads under high frequencies because the output current will influence the current through $M_1$ so that $V_{gs1}$ cannot be fixed. In contrast, the FVF shown in Fig. 4.21(b) has a constant current through $M_1$, independent of the output current. Hence, its $V_{gs1}$ can be fixed, and the gain is closer to unity with resistive loads or capacitive loads under high frequencies. $M_1$ and $M_2$ form a negative voltage–voltage (series–shunt) feedback loop with two poles.

From Fig. 4.21(c), we can find the expression of the closed-loop gain of the FVF from $V_{in}$ to $V_{out}$ ($A_{FVF}$), the loop gain ($A_{OL}$), the output voltage variation due to output noise $(\sigma_{V_{n,out}})^2$, the dominant pole in the loop gain transfer function at node Y ($P_Y$), the non-dominant pole in the loop gain transfer function at node X ($P_X$), the unity gain bandwidth for $V_{out}(s)/V_{in}(s)$ ($UGB_{FVF}$), and the closed-loop output resistance of the FVF ($R_{out,CL}$) as follows [62], [63]:

$$A_{FVF} = \frac{1}{1 + \dfrac{1}{g_{m1} r_{o1}} + \dfrac{1}{g_{m1} r_{o1} g_{m2} r_{o2}}} \tag{4.32}$$

$$A_{OL} = -g_{m2} R_{Y,OL} \tag{4.33}$$

$$(\sigma_{V_{n,out}})^2 = \frac{kT}{C_X} g_{m2} r_{o1} \tag{4.34}$$

$$P_Y = \frac{1}{C_Y R_{Y,OL}} \tag{4.35}$$

$$P_X = \frac{1}{C_X R_{X,OL}} \tag{4.36}$$

$$UGB_{FVF} = \frac{g_{m1}}{C_X} g_{m2} r_{o1} \tag{4.37}$$

$$R_{out,CL} = \frac{R_{X,OL}}{1 + |A_{OL}|} \approx \frac{\dfrac{1}{g_{m1}}\left(1 + \dfrac{r_b}{r_{o1}}\right) \Big\| \, r_{o2}}{g_{m2} \left( r_b \| g_{m1} r_{o1} r_{o2} \right)} \tag{4.38}$$

where $R_{Y,OL}$ is the open-loop resistance at node Y, with $R_{Y,OL} \approx r_b \| g_{m1} r_{o1} r_{o2}$; $R_{X,OL}$ is the open-loop resistance at node X, with $R_{X,OL} \approx \dfrac{1 + r_b / r_{o1}}{g_{m1}} \Big\| \, r_{o2}$; $C_X$ is the parasitic capacitance at node X, including the load capacitance; and $C_Y$ is the parasitic capacitance at node Y.
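To get a feel for the magnitudes involved, equations (4.32)–(4.38) can be evaluated for a set of device values. The sketch below does this; the function name and every numerical value are assumptions chosen for illustration and are not taken from the VF designed in this work.

```python
def parallel(a, b):
    """Resistance of two resistors in parallel."""
    return a * b / (a + b)


def fvf_small_signal(gm1, gm2, ro1, ro2, rb, c_x, c_y):
    """Evaluate the FVF expressions (4.32)-(4.38); poles and UGB in rad/s."""
    r_y_ol = parallel(rb, gm1 * ro1 * ro2)          # open-loop resistance at node Y
    r_x_ol = parallel((1 + rb / ro1) / gm1, ro2)    # open-loop resistance at node X
    a_ol = -gm2 * r_y_ol                            # (4.33)
    return {
        "A_FVF": 1 / (1 + 1 / (gm1 * ro1) + 1 / (gm1 * ro1 * gm2 * ro2)),  # (4.32)
        "A_OL": a_ol,
        "P_Y": 1 / (c_y * r_y_ol),                  # (4.35)
        "P_X": 1 / (c_x * r_x_ol),                  # (4.36)
        "UGB_FVF": gm1 / c_x * gm2 * ro1,           # (4.37)
        "R_out_CL": r_x_ol / (1 + abs(a_ol)),       # (4.38)
    }


# Assumed values; rb = ro1 models a single-transistor bias current source.
fvf = fvf_small_signal(gm1=1e-3, gm2=1e-3, ro1=50e3, ro2=50e3, rb=50e3,
                       c_x=100e-15, c_y=20e-15)
for name, value in fvf.items():
    print(f"{name}: {value:.4g}")
```

With these assumed values the closed-loop output resistance comes out at a few tens of ohms, far below the roughly $1/g_{m1}$ of a plain source follower.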
If the current source $I_b$ is a single transistor, $r_b = r_{o1}$ and $R_{out,CL}$ can be approximated as

$$R_{out,CL\_1} \approx \frac{2}{g_{m1} g_{m2} r_{o1}} \tag{4.39}$$

If the current source $I_b$ is a cascoded transistor ($r_b = g_{m1} r_{o1} r_{o2}$), or $r_b$ is very large, $R_{out,CL}$ can be approximated as

$$R_{out,CL\_2} \approx \frac{1}{g_{m1} g_{m2} r_{o1}} \tag{4.40}$$

Both $R_{out,CL\_1}$ and $R_{out,CL\_2}$ are very small compared to the output resistance of a conventional source follower, giving the FVF a large sourcing capability.

To ensure the stability of the feedback loop, $P_X > 2 \times UGB_{FVF}$ must be satisfied. If $r_b \approx r_{o1}$, the following condition should be met:

$$\frac{C_X}{C_Y} < \frac{g_{m1}}{4 g_{m2}} \tag{4.41}$$

If $r_b$ is large enough to reduce the value of $C_X / C_Y$ to below that of $1/(g_{m2} r_{o2})$, a compensation capacitor $C_c$ will be needed at node Y, as shown in Fig. 4.21(c).

From the above analysis, it can be concluded that the UGB of the FVF is enlarged by a factor of $g_{m2} r_{o1}$, and the output resistance is reduced by a factor of $g_{m2} r_{o1}$, compared to a conventional source follower. However, the output noise of the FVF is $g_{m2} r_{o1}$ times greater than that of the source follower [62], [63]. Therefore, careful design to reduce the noise of the FVF is necessary.

4.5.3 Differential flipped voltage follower (DFVF) analysis

Fig. 4.22. (a) Differential flipped voltage follower (DFVF), (b) DFVF with extended common mode input range (CMR), and (c) example DC characteristic of the DFVF in (a) and (b) [60], [62].

The fundamental circuit cell of the VF in the CP is a DFVF, the schematic of which is shown in Fig. 4.22(a). For the DFVF in Fig. 4.22(a) and (b), the impedance at the source of $M_1$ ($s_1$) is very low (derived in Section 4.5.2), keeping $V_{s1}$ approximately constant for a large $I_{out}$.
Any increase in the differential input voltage ($V_{in+} - V_{in-}$) will generate a current in $M_3$ that varies under the square-law regime, making $I_{out}$ able to exceed the quiescent current $I_{B1}$ by a large amount and enhancing the slew rate of the DFVF. Fig. 4.22(c) is an example waveform of $I_{out}$ as a function of ($V_{in+} - V_{in-}$) in [60].

The common mode input range (CMR) of the DFVF in Fig. 4.22(a) can be calculated as shown below [60]. Since $V_{in+} = V_{in-}$ during common mode analysis, the currents through $M_1$ and $M_3$ are both equal to $I_{B1}$.

$$V_{cm\_min\_a} = V_{ss} + V_{dsat2} + V_{gs1} = V_{ss} + \sqrt{\frac{2 I_{B1}}{\beta_2}} + \left( \sqrt{\frac{I_{B1}}{\beta_{1,3}}} + V_{thn} \right) \tag{4.42}$$

$$V_{cm\_max\_a} = V_{ss} + V_{gs2}\,(V_{D1}) + V_{thn} = V_{ss} + \left( \sqrt{\frac{2 I_{B1}}{\beta_2}} + V_{thn} \right) + V_{thn} \tag{4.43}$$

$$CMR_a = V_{cm\_max\_a} - V_{cm\_min\_a} = V_{thn} - \sqrt{\frac{I_{B1}}{\beta_{1,3}}} \tag{4.44}$$

where $\beta_x$ is the process transconductance of a MOSFET.

It can be seen that $CMR_a$ is narrow; therefore, the insertion of the level shifter, $M_s$ and $I_B$, in Fig. 4.22(b) is needed to shift $V_{d1}$ up to $(V_{dd} - V_{thn})$, allowing $V_{cm\_max\_b}$ to be $V_{dd}$. Thus, the CMR of the DFVF in Fig. 4.22(b) can be extended to [60]:

$$V_{cm\_min\_b} = V_{ss} + V_{dsat2} + V_{gs1} = V_{ss} + \sqrt{\frac{2 I_{B1}}{\beta_2}} + \left( \sqrt{\frac{I_{B1}}{\beta_{1,3}}} + V_{thn} \right) = V_{cm\_min\_a} \tag{4.45}$$

$$V_{cm\_max\_b} = V_{DD} \tag{4.46}$$

$$CMR_b = V_{cm\_max\_b} - V_{cm\_min\_b} = V_{DD} - V_{base} \tag{4.47}$$

where, by equation (4.47), $V_{base}$ denotes $V_{cm\_min\_b}$.

As shown in Fig. 4.23(a), a complementary class AB VF can be constructed by using one PMOS DFVF (lower part) and one NMOS DFVF (upper part) with the same structure shown in Fig. 4.22(b). The current mirrors $M_6$-$M_7$ and $M_{13}$-$M_{14}$ convey the output currents of the PMOS and the NMOS parts to the output branch.

Fig. 4.23. (a) Complementary class AB VF and (b) complementary class AB VF with adaptive biasing $M_3$ and $M_{10}$.

VF performance can be improved by adding adaptive biasing transistors $M_3$ and $M_{10}$ (in the shaded area in Fig. 4.23(b)) to deal with the condition of $V_{in}$ being too high or too low.
When $V_{in}$ goes high, all the transistors of the PMOS part will be cut off, and $M_3$ will be responsible for $I_{d7} = I_{d3} = I_B$. Similarly, when $V_{in}$ goes low, all the transistors of the NMOS part will be cut off, and $M_{10}$ will be responsible for $I_{d14} = I_{d10} = I_B$. Such a modification is beneficial in allowing the VF to operate nearly rail-to-rail with less power and a more compact design while retaining similar slewing and settling performance [60].

However, the VF in Fig. 4.23(b) has two kinds of limitations on the maximum allowable output current $I_{out\_max}$. The following analysis takes the NMOS half as an example.

During the operation of the VF, when $V_{in}$ increases dramatically in a very short period, $V_{out}$ will take some time to follow $V_{in}$, so there will be some instants at which ($V_{in} - V_{out}$) suddenly goes high. Theoretically, $I_{d1} = I_{d7} = I_{out}$ will increase with ($V_{in} - V_{out}$) following the square law. However, in practice, since $I_{d5} = I_{d1} + (I_{d2} + I_{d3}) = I_{out} + (2I_B)$ (the $V_{gs}$ of $M_2$ and $M_3$ are equal, so both of them are assumed to conduct $I_B$), $V_{g5}$ should rise to accommodate the increase of $I_{out}$. Nonetheless, $V_{g5}$ cannot exceed ($V_{dd} - V_{dsat\_IB} - V_{gs4}$) if the current source $I_B$ at the gate of $M_4$ is to stay in saturation. Thus, the sinking capability of $M_5$ will limit $I_{out\_max}$, yielding the first limitation on $I_{out\_max}$. Based on this limitation, the maximum achievable and allowable output current ($I_{out\_max\_1}$) can be expressed as [60]:

$$I_{out\_max\_1} = \beta_5 (V_{gs5\_max} - V_{th})^2 - 2I_B \tag{4.48}$$

$$I_{out\_max\_1} = \beta_5 \left( V_{dd} - (V_{dsat\_IB} + V_{dsat4}) - 2V_{th} \right)^2 - 2I_B \tag{4.49}$$

Secondly, as $I_{out}$ increases, the increase of $V_{sg6}$ lowers the drain voltage of $M_1$ and will push $M_1$ into the triode region, restricting its ability to conduct current and introducing the second limitation on $I_{out\_max}$. A similar analysis can be done for the PMOS half [60].
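The first limit can be put into numbers with equation (4.49). The sketch below is a minimal illustration in which the function name and all parameter values are assumed rather than taken from the design; it also clamps the result at zero when no overdrive headroom is left for $M_5$, which is an added convention and not part of the equation.

```python
def iout_max_1(beta5, vdd, vdsat_ib, vdsat4, vth, i_b):
    """Maximum output current set by the tail transistor M5, equation (4.49).

    Returns 0.0 when the available overdrive of M5 is not positive or when
    the tail cannot even supply the 2*I_B quiescent current (added convention).
    """
    vov5_max = vdd - (vdsat_ib + vdsat4) - 2 * vth   # V_gs5_max - V_th
    vov5_max = max(vov5_max, 0.0)
    return max(beta5 * vov5_max ** 2 - 2 * i_b, 0.0)


# Assumed values for illustration only.
i_max = iout_max_1(beta5=2e-3, vdd=1.2, vdsat_ib=0.1, vdsat4=0.1,
                   vth=0.35, i_b=10e-6)
print(f"I_out_max_1 = {i_max * 1e6:.1f} uA")
```

Because the headroom term enters squared, a small reduction in $V_{dd}$ in this sketch shrinks the limit sharply, which is consistent with the motivation for relieving $M_5$ of the signal current.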
The two limitations can be solved by adding a current-controlled current source (CCCS) in parallel with $M_5$, as shown in Fig. 4.24(a), enhancing the sinking capability at the tail. The CCCS copies $I_{out}$ to the tail, which allows $M_5$ to serve as a constant current source that conducts $2I_B$, and there is no signal swing headroom requirement for $V_{g5}$. Therefore, $I_{out\_max}$ is able to break both the first limit and the second limit. The enlarged $I_{out\_max}$ can be estimated as [60]:

$$I_{out\_max\_2} \approx \beta_1 \beta_6 \left( \frac{V_{dd} + \sqrt{I_B / \beta_1}}{\sqrt{\beta_1} + \sqrt{\beta_6}} \right)^2 \tag{4.50}$$

The output current will follow triode region behavior beyond this range, where $M_1$ enters the triode region when $V_{in+}$ reaches $(V_{dd} + V_{thn} - |V_{thp}| - \sqrt{I_{out\_max\_2} / \beta_6})$ [60]. An example comparison of the achievable $I_{out\_max}$ with and without the CCCS is shown in Fig. 4.24(b).

Fig. 4.24. (a) Dynamic tail current source used in the DFVF, (b) output current versus differential input voltage with and without the dynamic current source, and (c) final version of the VF used in the CP: complementary class AB slew rate enhanced VF [60].

The final version of the VF used in the CP is constructed as shown in Fig. 4.24(c), where the CCCS is realized by transistors $M_{C1-3}$ and $M_{D1-3}$. $I_{out}$ is scaled down by $K$ at $M_{C3}$ to save power, while the current mirror $M_{C2}$–$M_{C1}$ has a scale-down factor of $K^{-1}$, so that $M_{C1}$ conducts $I_{out}$. The CCCS allows the VF to reach a higher slew rate than a unity-gain buffer made of a complementary folded cascode OTA, with lower power consumption. The output swing of this VF is ($V_{dd} - V_{dsat7} - V_{dsat14}$), which is larger than that of a unity-gain buffer made of OTAs with a cascode output stage. The closed-loop -3 dB bandwidth of the VF is calculated as

$$UGB_{Class\_AB\_VF} = \frac{g_{m1} + g_{m3} + g_{m8} + g_{m10}}{C_L} \tag{4.51}$$

and $UGB_{Class\_AB\_VF}$ should be greater than the maximum operation frequency of the DLL.
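Equation (4.51) lends itself to a quick bandwidth check against the DLL operating frequency. The sketch below assumes that the expression yields an angular frequency in rad/s, so it divides by $2\pi$ to obtain hertz; that unit interpretation, the function name, and all numerical values are assumptions and are not stated in the text.

```python
import math


def vf_bandwidth_hz(gm1, gm3, gm8, gm10, c_l):
    """Closed-loop bandwidth from equation (4.51), converted to Hz.

    Assumes (4.51) gives rad/s; drop the 2*pi if it is meant in Hz.
    """
    return (gm1 + gm3 + gm8 + gm10) / c_l / (2 * math.pi)


# Assumed transconductances (1 mS each) and a 1 pF load.
bw = vf_bandwidth_hz(gm1=1e-3, gm3=1e-3, gm8=1e-3, gm10=1e-3, c_l=1e-12)
f_dll_max = 500e6  # hypothetical maximum DLL operating frequency

print(f"VF bandwidth = {bw / 1e6:.1f} MHz")
print("meets requirement" if bw > f_dll_max else "too slow for the DLL")
```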
4.6 Simulation results

Post-layout corner and Monte Carlo simulations were performed to test the VF and CP. Their worst-case performances are summarized in Tables 4.3 and 4.4, respectively.

Fig. 4.26 shows the transient simulation result of the VF at all the corners. In this test, the input voltage ($V_{ctrl}$) of the VF ramped up from 0 to $V_{dd}$ in 1.2 μs and then ramped back down to 0 in another 1.2 μs. $V_{ctrl\_buf}$ was captured to measure the $|V_{ctrl} - V_{ctrl\_buf}|$ offset. From Section 3, the valid $V_{ctrl}$ range to operate the VCDL was defined as [0.3 V, 0.9 V]. Therefore, $|V_{ctrl} - V_{ctrl\_buf}|$ was measured when $V_{ctrl} \in$ [0.2 V, 1 V]. As a result, the worst-case offset $|V_{ctrl} - V_{ctrl\_buf}|$ was 18.89 mV at the SS corner when $V_{dd}$ = 1.08 V and Temperature = 60 ℃.

The $I_{up}$ and $I_{dn}$ of the CP were simulated as $V_{ctrl}$ changes from 0 to $V_{dd}$. The worst-case condition was at the SS corner, when $V_{dd}$ = 1.08 V and Temperature = 60 ℃, where the narrowest valid $V_{ctrl}$ range covers 66.5% of the rail-to-rail range.

Fig. 4.25. The layout of the CP.

Table 4.3. Worst-case performance summary of the VF (0.2 V < $V_{ctrl}$ < 1 V)

| Parameter | Worst corner performance |
|---|---|
| -3 dB bandwidth | 417.6 MHz @ SS, $V_{dd}$ = 1.08 V, 60 ℃ |
| $\lvert V_{ctrl} - V_{ctrl\_buf} \rvert$ offset | 18.89 mV @ SS, $V_{dd}$ = 1.08 V, 60 ℃ |
| Input referred noise | 19.25 nV/√Hz @ SS, $V_{dd}$ = 1.08 V, 60 ℃ |
| Power consumption | 0.46 mW @ FF, $V_{dd}$ = 1.32 V, 60 ℃ |
| Phase margin of all the loops in the VF | > 65.55° @ FF, $V_{dd}$ = 1.08 V, 10 ℃ |

Table 4.4. Performance summary of the CP at the worst-case corner (SS, 1.08 V, 60 ℃) under Monte Carlo

| Parameter | Value |
|---|---|
| $V_{ctrl\_min}$ for $I_{up}$-$I_{dn}$ mismatch < 5% | 0.185 V |
| $V_{ctrl\_max}$ for $I_{up}$-$I_{dn}$ mismatch < 5% | 0.903 V |
| Settling time | 105 ps |
| Power consumption | 0.80 mW |
| Phase margin of all the loops in the CP | > 63.06° |

Fig. 4.26. Post-layout transient simulation of the VF at all the corners.

Fig. 4.27. Post-layout simulations of $I_{up}$ and $I_{dn}$ at all the corners.
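The 66.5% coverage figure quoted above follows from the $V_{ctrl}$ limits in Table 4.4 and the 1.08 V supply of the worst-case corner. A short check, using only those three numbers (the variable names are illustrative):

```python
# Values taken from Table 4.4 and the worst-case corner (SS, 1.08 V, 60 C).
v_ctrl_min = 0.185   # V, lower limit for < 5 % I_up-I_dn mismatch
v_ctrl_max = 0.903   # V, upper limit for < 5 % I_up-I_dn mismatch
v_dd = 1.08          # V, supply at the worst-case corner

coverage_percent = (v_ctrl_max - v_ctrl_min) / v_dd * 100
print(f"valid V_ctrl range covers {coverage_percent:.1f} % of the rail-to-rail range")
# prints about 66.5 %
```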
Chapter 5: Phase Frequency Detector (PFD)

5.1 Structure and operation

As explained in Section 2, the PFD creates UP/DN pulses on the basis of the phase difference between the input clock and the output of the last delay cell in the VCDL. Fig. 5.1 illustrates the basic structure of the PFD [24], [48].

Fig. 5.1. Basic PFD structure.

5.2 Non-idealities of the PFD

Fig. 5.2. PFD transfer characteristics, blind zone, and dead zone.

5.2.1 Dead zone

Fig. 5.2 shows the transfer characteristics of the PFD, wherein the phase difference between the two clocks at the PFD inputs is represented by the x axis, and the average voltage of (UP-DN) is denoted by the y axis. As indicated in the figure, the dead zone is the region where the input phase difference cannot be accurately and linearly transferred to the UP-DN skew. The dead zone determines the minimum detectable phase error between the input and feedback clocks [64], [65].

5.2.2 Blind zone

The blind zone of the PFD is related to the maximum detectable phase error in one cycle. In this zone, the PFD senses the upcoming rising edge in the next cycle rather than detecting the correct rising edge in the current cycle. This mistakenly detected phase error causes cycle slipping. The width of the blind zone is calculated with [64], [65]

$$Blind\ Zone = 2\pi \frac{T_{reset}}{T_{in}} \tag{5.1}$$

where $T_{reset}$ is the reset delay and $T_{in}$ is the input clock cycle.

5.2.3 Reset hold time

The DFF has a fixed hold time that is needed for the reset. If Reset = 0, the UP/DN signals go low, but if the Reset is held for an insufficient amount of time, the state of the DFF does not reset internally, and the UP/DN signals go high again. Given that the NAND gate is fast, the reset quickly goes low again. As a result, the UP/DN signals oscillate under an inadequate hold time for the reset pin (Fig. 5.3).

Fig. 5.3. UP/DN oscillation under an insufficient reset hold time.
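Equation (5.1) shows that the blind zone scales linearly with the reset delay. As a brief illustration, the sketch below evaluates it for a hypothetical reset delay and clock period; neither number is taken from the design in this work, and the function name is likewise illustrative.

```python
import math


def blind_zone_rad(t_reset, t_in):
    """Blind-zone width in radians, equation (5.1)."""
    return 2 * math.pi * t_reset / t_in


# Hypothetical example: 100 ps reset delay with a 1 ns input clock period.
width = blind_zone_rad(t_reset=100e-12, t_in=1e-9)
print(f"blind zone = {width:.3f} rad = {math.degrees(width):.1f} deg")
```

For these assumed numbers the blind zone occupies one tenth of the clock cycle, so halving the reset delay would halve the phase range lost to cycle slipping.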
In practice, therefore, a buffer delay is typically needed between the NAND gate and the reset pin. The buffer delay cannot be excessively large. The reset delay determines the minimum width of the UP/DN pulses, which should be long enough to enable the appropriate activation of the charge pump. However, a long reset delay limits the maximum operating frequency of the PFD, which can be expressed as [64]

$$F_{max} = \frac{1}{2 T_{reset}} \tag{5.2}$$

A long reset delay also enlarges the blind zone of the PFD, according to equation (5.1), and exacerbates static offsets and current mismatches, thereby worsening jitter in the DLL.

In summary, the reset delay in the feedback path of the PFD should have the following attributes:
(1) A length sufficient to satisfy the hold time requirement
(2) A length sufficient to ensure that the CP can be properly switched on via UP/DN pulses
(3) A length that does not constrain the maximum operating frequency of the PFD
(4) A length that does not excessively expand the blind zone
(5) A length that ensures reduced current mismatch in the charge pump, static DLL phase error, and DLL jitter.

5.2.4 Rise/fall times of the reset pulse and UP/DN pulses

The rise/fall times of the reset and UP/DN signals are equally important because a sharp transition reduces the amount of time within which noise affects the succeeding components of the circuit. Additionally, if the threshold voltages of the two DFFs and the UP/DN switches in the charge pump differ owing to mismatch, a sharp transition alleviates the issue, as less time is spent crossing the two threshold voltages.

5.3 False lock prevention

The correct locking condition of the DLL occurs when the VCDL delay ($T_{VCDL}$) is exactly $T_{in}$. The DLL can suffer from two false locking conditions: harmonic and stuck locking. As shown in Fig. 5.4(a) [66], harmonic locking refers to the DLL locking at $T_{VCDL} = N \times T_{in}$ ($N$: integer, $N > 1$) under a wide VCDL range with $T_{VCDL,max} / T_{VCDL,min} \ge 2$.
This locking condition may be caused by (1) abrupt input frequency changes, especially a shift from a large $T_{in}$ to a small $T_{in}$, (2) electrostatic discharge (ESD) noise, and (3) power supply noise. Stuck locking refers to the DLL locking at either the minimum or the maximum VCDL delay because of improper PFD states, as illustrated in Fig. 5.4(b) [66].

Fig. 5.4. Examples of (a) harmonic and (b) stuck locks [66].

False locking can be prevented in two ways: using a dual delay loop or modifying the PFD. PFD modifications can be further classified into converting the PFD into a three-phase PFD and adding a reset to the PFD. However, a dual delay loop almost doubles the overall power consumption of a DLL, and a three-phase PFD can be very sensitive to the input duty cycle. Therefore, the false lock prevention method presented in [66], which incorporates a reset into the PFD, was adopted in this project. The circuit structures of the harmonic lock detector (HLD) and the stuck lock detector are shown in Fig. 5.5.

Fig. 5.5. (a) PFD with reset, (b) harmonic lock detector, and (c) the stuck lock detector used in the DLL developed in this work [66].

5.3.1 Harmonic lock prevention

There are 33 delay cells (P<0>–P<32>) in the VCDL, and the odd delay stages, that is, the even-indexed outputs (P<0>, P<2>, ……, P<28>, and P<30>), are used for harmonic lock detection. In the HLD, C_lock is the signal generated to detect a harmonic lock. C_lock = 0 indicates that not all the rising edges of P<2:30> are sequentially placed within 1 × $T_{in}$, which potentially causes harmonic locking. In that case, the PFD is forced to output UP = 1, and the VCDL delay is continuously reduced until $T_{VCDL} < (16/15)\,T_{in}$.

The working principle of the HLD is as follows. First, the frequency of P<0> is divided by two using a T flip-flop to obtain P<0>DIV2. Under this scheme, the duty cycle of P<0>DIV2 is independent of the input clock and guaranteed to be 50%.
Then, P<2>, P<4>, …, and P<30> sample P<0>DIV2 to generate Q<1:15>, and D<1:14> is produced through a second sampling process. Finally, C_lock is generated by ANDing D<1:14>. Examples of the output waveforms of P<0:30>, Q<1:15>, D<1:14>, and C_lock in the HLD are shown in Fig. 5.6. All the rising edges that fall within the high-level half of P<0>DIV2 are considered sequentially placed within 1 × $T_{in}$. If $r_2[x]$ to $r_{30}[x]$ all fall within the high-level half of P<0>DIV2 and $r_2[x-1]$ to $r_{30}[x-1]$ all fall outside the high-level half of P<0>DIV2, the VCDL delay is less than $(16/15)\,T_{in}$, and C_lock is equal to 1. Otherwise, C_lock is 0.

Fig. 5.6. Example waveform of an HLD-corrected harmonic lock ($r_i[x]$ denotes the rising edge of P<i> in the x-th cycle).

In this example, from $t_0$ to $t_1$, $r_2[x]$ to $r_{28}[x]$ are located within the high-level half of P<0>DIV2, but $r_{30}[x]$ is not. Because P<0>–P<32> are periodic, $r_{30}[x-1]$ must fall within the high-level half of P<0>DIV2 together with $r_2[x]$ to $r_{28}[x]$, whereas $r_{28}[x-1]$ must fall within the low-level half of P<0>DIV2 in the previous cycle. As a result, the rising edge of Q<15> samples the low level of Q<14>, generating D<14> = 0 and C_lock = 0. In general, if $r_i[x]$ is the first among $r_2[x]$ to $r_{30}[x]$ that lies outside the high-level half of P<0>DIV2 (meaning that not all the rising edges of P<2:30> are sequentially placed within 1 × $T_{in}$), the rising edge of Q<i/2> samples the low level of Q<(i − 2)/2>. This produces D<(i − 2)/2> = 0 and C_lock = 0.

As stated in [66], the HLD works only if the delay between adjacent delay cells is less than $T_{in}$. Delay mismatches do not affect the performance of this HLD because, while it is operating, the gain of the PFD stays constant at 1.

5.3.2 Stuck lock prevention

The stuck lock detector detects an inappropriate PFD state and resets the PFD when the rising edge of the middle delay cell (P<16>) arrives. As reflected in Fig.
5.7(a), the PFD is supposed to output UP and reduce the VCDL delay at $t_2$. However, the DN signal is generated instead because of an improper PFD state. Therefore, P<16> monitors the DN signal, and if the rising edge of P<16> samples DN = 1, the PFD is reset and reverts to normal operation. A similar principle applies when the UP signal is generated improperly, as shown in Fig. 5.7(b).

Fig. 5.7. Example waveform of stuck lock correction when (a) $T_{ref} < T_{feb} < (16/15)\,T_{ref}$ and (b) $T_{feb} < T_{ref}$.

5.4 Simulation results

The layout of the PFD, together with the false lock prevention circuits, is shown in Fig. 5.8. Post-layout corner and Monte Carlo simulations were run. The worst performance occurred at the SS corner with $V_{dd}$ = 1.08 V and a temperature of 60 ℃. The performance results of the PFD at this corner are presented in Fig. 5.9 and Table 5.1. To test the false lock prevention circuit, the DLL was simulated with input clocks of continuously changing frequency (Fig. 5.10). The circuit works robustly under abrupt input frequency changes (even a change from $f_{in}$ to a frequency below $0.5 f_{in}$), and the DLL locks accurately at the desired $V_{ctrl}$ value.

Fig. 5.8. Layout of the PFD and false lock prevention circuit.

Fig. 5.9. PFD output characteristics at the worst-case corner (SS, $V_{dd}$ = 1.08 V, temperature = 60 ℃).

Table 5.1. Performance summary of the PFD at the worst-case corner (SS, $V_{dd}$ = 1.08 V, temperature = 60 ℃)
Parameter | Value
Dead zone | 0.08%
Minimum detectable input skew when $T_{in}$ = 20 ns | 16 ps
Blind zone | 0.55%
Maximum detectable input skew when $T_{in}$ = 20 ns | 19.89 ns
Output jitter (under DLL locking) | UP = 393.6 fs, DN = 342.77 fs
Input-to-output delay | 39.86 ps
Minimum UP/DN pulse width | 220 ps
UP/DN rise and fall time | 11 ps
Worst-case UP/DN skew from Monte Carlo simulations (when inputs are in phase) | 1.06 ps

Fig. 5.10. Dynamic performance of the DLL.
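The relationships in equations (5.1) and (5.2) and the skew figures in Table 5.1 can be checked with a short calculation. The following Python sketch is only an illustration: the function names are hypothetical, the dead and blind zones are treated as fractions of the input period $T_{in}$ (an interpretation that reproduces the 16 ps and 19.89 ns entries of Table 5.1), and the 200 ps reset delay used for equation (5.2) is an assumed value, not one reported in this work.

```python
import math

def blind_zone_rad(t_reset, t_in):
    """Blind-zone width in radians, following equation (5.1)."""
    return 2 * math.pi * t_reset / t_in

def pfd_f_max(t_reset):
    """Maximum PFD operating frequency in Hz, following equation (5.2)."""
    return 1.0 / (2.0 * t_reset)

def zone_to_time(zone_fraction, t_in):
    """Convert a zone given as a fraction of the input period into seconds."""
    return zone_fraction * t_in

T_IN = 20e-9  # input period used in Table 5.1 (20 ns)

# Dead zone of 0.08% -> minimum detectable input skew
min_skew = zone_to_time(0.0008, T_IN)
# Blind zone of 0.55% -> maximum detectable input skew
max_skew = T_IN - zone_to_time(0.0055, T_IN)

# Assumed (hypothetical) reset delay, used only to exercise equation (5.2)
T_RESET_ASSUMED = 200e-12
f_max_assumed = pfd_f_max(T_RESET_ASSUMED)

print(f"min detectable skew: {min_skew * 1e12:.1f} ps")
print(f"max detectable skew: {max_skew * 1e9:.2f} ns")
print(f"F_max for the assumed reset delay: {f_max_assumed / 1e9:.2f} GHz")
```

The sketch makes the trade-off in Section 5.2.3 explicit: a longer reset delay lowers `pfd_f_max` and widens `blind_zone_rad` at the same time.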
Chapter 6: Fractional Frequency Counter

The fractional frequency counter built in this section has three main components: (1) a counting pattern generator, (2) a full-cycle counter, and (3) a fractional cycle counter. The structure of the fractional frequency counter is shown in Fig. 6.7. All input pins, including “P<31:0>” and “Start/Stop”, and all outputs, including “L<4:0>”, “Overflow”, “F<4:0>”, and “N<39:0>”, are arranged on the right side of Fig. 6.7 and marked in green.

6.1 Counting pattern generator (C<31:0>)

Fig. 6.1 is an example of the output waveforms of the VCDL. The rising edges of P<0>–P<32> arrive one after another, and P<0> is in phase with P<32> when the DLL is locked. Each of P<0>–P<31> is connected to the clock pin (T1<0>–T1<31>) of a T flip-flop (TFF), as shown in Fig. 6.2. The following analysis assumes that the Q outputs of all TFFs are reset to 0 at the beginning and that T1<0> is the first among T1<0>–T1<31> to receive a rising edge. Additional circuits in the complete fractional frequency counter schematic in Fig. 6.7 ensure these conditions. When the rising edge of P<0> appears at T1<0>, the output of that TFF toggles: Q<0> changes from 0 to 1, and Q̅<0> changes from 1 to 0. As the rising edges of P<0>–P<31> arrive one after another, Q<0>–Q<31> change from 0 to 1 in turn, and Q̅<0>–Q̅<31> change from 1 to 0 in the same order. As a result, the TFF outputs Q<31:0> change from all-0 to all-1 and then from all-1 back to all-0. This pattern of Q<31:0> repeats continuously, whereas Q̅<31:0> circulates in the opposite pattern.

Fig. 6.1. Example VCDL outputs.

Fig. 6.2. Counting pattern generator.
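The behaviour of the TFF line described above can be reproduced with a small behavioural model. The Python sketch below is an illustration only, under the same assumptions stated in the text (all TFFs reset to 0, and P<0> delivers the first rising edge); the function name and the integer bit-mask representation of Q<31:0> are choices made for this sketch, not part of the circuit.

```python
N_PHASES = 32
MASK = (1 << N_PHASES) - 1  # all-1 pattern for a 32-bit word

def tff_line_states(n_edges):
    """Return Q<31:0> (as an int) after each of n_edges successive rising
    edges, assuming all TFFs start at 0 and P<0> arrives first."""
    q = 0
    states = []
    for k in range(n_edges):
        # The rising edge of P<k mod 32> toggles the corresponding TFF.
        q ^= 1 << (k % N_PHASES)
        states.append(q)
    return states

# Two input clock periods: Q fills up with ones, then empties again.
states = tff_line_states(2 * N_PHASES)
q_bar = [s ^ MASK for s in states]  # complementary outputs

for k in (0, 3, 31, 32, 35, 63):
    print(f"edge {k:2d}: Q = {states[k]:032b}  Q_bar = {q_bar[k]:032b}")
```

During the first input period Q<31:0> follows a thermometer code that grows from 0000……0001 to all-1; during the second period it shrinks back to all-0 while the complementary outputs trace the same growing thermometer code.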
If a mux is added to select between Q<31:0> and Q̅<31:0> and to output whichever of them is currently circulating from all-0 to all-1, the final output of TFF Line (1), C<31:0>, always circulates from state S0 (0000……0001) to all-1, as shown in Table 6.1. The SQ2 selection bit can be generated from Q<31>, Q<0>, Q̅<31>, and Q̅<0> using two AND gates, one OR gate, and one TFF. SQ2 is flipped when the output of one of the AND gates is 1 (meaning that the circulation has reached the all-1 condition). C<31:0> can then be used as the counting pattern to calculate the frequency of the input clock ($f_{in}$).

Table 6.1. Output of Q<31:0> and Q̅<31:0>
Q<31:0> | Q̅<31:0> | SQ2 | C<31:0> | State
0000……0001 | 1111……1110 | 1 | 0000……0001 | S0
0000……0011 | 1111……1100 | 1 | 0000……0011 | S1
0000……0111 | 1111……1000 | 1 | 0000……0111 | S2
0000……1111 | 1111……0000 | 1 | 0000……1111 | S3
…… | …… | 1 | …… | ……
0001……1111 | 1110……0000 | 1 | 0001……1111 | S28
0011……1111 | 1100……0000 | 1 | 0011……1111 | S29
0111……1111 | 1000……0000 | 1 | 0111……1111 | S30
1111……1111 | 0000……0000 | 1 | 1111……1111 | S31
1111……1110 | 0000……0001 | 0 | 0000……0001 | S0
1111……1100 | 0000……0011 | 0 | 0000……0011 | S1
1111……1000 | 0000……0111 | 0 | 0000……0111 | S2
1111……0000 | 0000……1111 | 0 | 0000……1111 | S3
…… | …… | 0 | …… | ……
1110……0000 | 0001……1111 | 0 | 0001……1111 | S28
1100……0000 | 0011……1111 | 0 | 0011……1111 | S29
1000……0000 | 0111……1111 | 0 | 0111……1111 | S30
0000……0000 | 1111……1111 | 0 | 1111……1111 | S31

6.2 Full-cycle counter

When C<31:0> finishes $N$ (an integer) circulation cycles from S0 to all-1, $N$ full input clock cycles ($T_{in}$) have been counted. Recall from Section 1 that if $N$ cycles of the input clock are counted in a time period of $T_{count}$ seconds, the input clock frequency can be reconstructed as

$$f_{in} = N \cdot \frac{1}{T_{count}} = N \cdot f_{count} \qquad (6.1)$$

A 40-bit up counter is used as the full-cycle counter, where N<39:0> increases from all-0 to all-1 in binary order within the valid counting range. The output of the AND gate on the left of the up counter changes from 0 to 1 after every full circulation cycle of C<31:0>.
Then, the output of the up counter increments by 1. The up counter overflows once N<39:0> = all-1. According to the testing results, the DLL never operates at a frequency greater than 450 MHz. Therefore, the maximum $T_{count}$ can be calculated as

$$T_{count\_max} = 2^{40} \times \frac{1}{450\ \text{MHz}} = 2443.36\ \text{s} \qquad (6.2)$$

In other words, the full-cycle counter does not overflow if counting lasts less than 2443.36 seconds, which is sufficient in practice. However, an overflow detection circuit is still needed to help identify invalid counting. This can be realized with an AND gate and a DFF, as shown in Fig. 6.3. The condition N<39:0> = all-1 triggers the DFF, after which the overflow flag remains 1.

Fig. 6.3. Counting pattern generator with full-cycle counter.

6.3 Fractional counter

6.3.1 Fractional counting for the last, incomplete circulation cycle of C<31:0>

If C<31:0> finishes $N$ complete circulation cycles from S0 to all-1 and one incomplete circulation from S0 to 0000……1111 (state S3 in Table 6.1), a fractional counter can be added to handle the last incomplete circulation and realize more accurate frequency counting. Additionally, counting a fraction of a full cycle increases the resolution of frequency reconstruction while requiring a smaller $T_{count}$, which breaks the $T_{count}$-resolution trade-off stated in Section 1. As indicated by the name, the fractional counter named “32-to-5 Priority Encoder L (L stands for last)” in Fig.
6.4 was implemented with priority encoders, which output a digital number that uniquely identifies the pattern of C<31:0> in the last incomplete circulation and thus indicates what fraction of a C<31:0> circulation has been completed. For the example above, the last incomplete C<31:0> circulation is 3/32 of a full cycle (C<31:0> = S0 indicates the start of a cycle).

Fig. 6.4. Counting pattern generator with full-cycle counter and fractional counter (32-to-5 priority encoder (L) [67]) for the last incomplete circulation of C<31:0>.

The priority encoder uses the structure of the “74 x 148” 32-to-5 priority encoder in [67], shown in Fig. 6.5, which cascades four 8-to-3 priority encoders. The truth table of an 8-to-3 priority encoder is given in Table 6.2, and its gate-level schematic is shown in Fig. 6.6.

Table 6.2. Truth table of an 8-to-3 priority encoder (inputs: EI, I0–I7; outputs: A2, A1, A0, GS, EO)
EI | I0 | I1 | I2 | I3 | I4 | I5 | I6 | I7 | A2 | A1 | A0 | GS | EO | Percentage | State
1 | X | X | X | X | X | X | X | X | 1 | 1 | 1 | 1 | 1 | X | SX1
0 | X | X | X | X | X | X | X | 0 | 0 | 0 | 0 | 0 | 1 | X | SX2
0 | X | X | X | X | X | X | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 0% or 100% | S(0%/100%)
0 | X | X | X | X | X | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 12.5% | S12.5%
0 | X | X | X | X | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 25% | S25%
0 | X | X | X | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 37.5% | S37.5%
0 | X | X | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 1 | 50% | S50%
0 | X | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 1 | 62.5% | S62.5%
0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 75% | S75%
0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 87.5% | S87.5%

Fig. 6.5. “74 x 148” 32-to-5 priority encoder [67].

Fig. 6.6. “74 x 148” 8-to-3 priority encoder [67].
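The role of the priority encoder (L) can be illustrated with a behavioural model. The Python sketch below is not the gate-level “74 x 148” design of [67]: it ignores the active-low signalling of Table 6.2, represents C<31:0> as an integer, and uses hypothetical function names. It follows the convention of the example above, in which state Sk corresponds to k/32 of a cycle, and it combines that fraction with equation (6.1) for the trailing partial cycle only (the partial cycle at the start of counting is not handled here).

```python
N_PHASES = 32

def encode_last_state(c):
    """Return k for a thermometer pattern C<31:0> in state Sk
    (k + 1 ones, right-aligned). Behavioural stand-in for a 32-to-5
    priority encoder: the index of the highest set bit."""
    if c <= 0 or c >= (1 << N_PHASES):
        raise ValueError("C<31:0> must be a non-zero 32-bit pattern")
    k = c.bit_length() - 1
    if c != (1 << (k + 1)) - 1:
        raise ValueError("C<31:0> is not a valid thermometer code")
    return k

def last_fraction(c):
    """Fraction of a full circulation completed in the last, incomplete
    circulation, with S0 taken as the start of a cycle."""
    return encode_last_state(c) / N_PHASES

def reconstruct_with_last_fraction(n_full, c_last, t_count):
    """Equation (6.1) extended by the trailing fraction only."""
    return (n_full + last_fraction(c_last)) / t_count

# Example from the text: the last circulation stops in state S3.
S3 = 0b1111
print(last_fraction(S3))  # 3/32 of a cycle

# Hypothetical reading: 1000 full cycles counted in 10 microseconds.
print(reconstruct_with_last_fraction(1000, S3, 10e-6))
```

With these assumed readings, the full-cycle result would be 100 MHz, whereas including the trailing 3/32 of a cycle shifts the estimate up by about 9.4 kHz.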
6.3.2 Fractional counting for the first, incomplete circulation cycle of Q2<31:0> in Fig. 6.7

Fig. 6.7 shows the complete gate-level schematic of the fractional frequency counter. Right after power-up, an off-chip short low-level pulse is sent to the Start/Stop pin. The outputs of all the DFFs and TFFs are then reset to zero when the low-level Rst pulse arrives. After the Rst pulse, CK1=Start1=Start2=NC=0, CK2=SQ1=SQ2=1, Q1<31:0>=N<39:0>=C<31:0>=M<31:0>=all-0, and Q̅1<31:0>=all-1.

Fig. 6.7. Complete gate-level schematic of the fractional frequency counter.

The timing of $T_{count}$ starts when the Rst signal returns to high. Because the DLL needs some time to lock after a reset, the DLL cannot be reset together with the fractional frequency counter. Because the DLL operates continuously, it is uncertain which of T1<0>–T1<31> will first receive a rising edge from the VCDL after the Rst signal returns to high. For example, if the rising edge of P<30> is the first one received by TFF Line (1) and Q1<30> is the first among Q1<0>–Q1<31> to toggle from 0 to 1, C<31:0> will go from all-0 to 0000……0010 instead of 0000……0001.
Then, the counting pattern will be distorted, and the fractional frequency counter will not work properly. For this reason, the start box is added to ensure that T1<0> is the first among T1<0>–T1<31> to receive a rising edge (P<0>) from the VCDL after the start box connects 64-to-32 Mux (2) to the VCDL. C<31:0> circulates following the pattern in Table 6.1 only when this condition is guaranteed. The scheme works as follows. As shown in Fig. 6.7, after the Rst signal returns to high, Mux (1) does not pass P<31:0> to Mux (2) until the rising edge of P<31> makes Start2=Start1=1. After that, it is guaranteed that T1<0> is the first in TFF Line (1) to receive a rising edge (from P<0>) and that C<0> is the first to toggle from 0 to 1 once the start box passes P<31:0> to Mux (2). In this case, C<31:0> = all-0 after the Rst signal returns to high, and after Start2=1, the rising edge of P<0> makes C<31:0> become S0 = 0000……0001. The desired counting pattern is thus realized.

The waiting period for P<31> to rise is timed as part of $T_{count}$; however, the rising edges from P<X> to P<31> (0 ≤ X ≤ 30) before Start2=1 are not counted. This would lead to an error in the frequency reconstruction. Therefore, the “32-to-5 priority encoder (F)” (F stands for first), 64-to-32 Mux (3), the DFF Line, and TFF Line (2) were added to count the rising edges from P<X> to P<31> in the very first incomplete circulation of P<31:0> before Start2=1. Because CK2=Rst=1 after the Rst pulse, Mux (3) passes P<31:0> to TFF Line (2) right after the Rst signal rises. The rising edges of P<X>–P<31> then toggle Q2<X>–Q2<31> from 0 to 1 one after another. After Q2<31> is toggled from 0 to 1, CK1 changes from 0 to 1 and loads Q2<31:0> into the priority encoder (F). Next, CK2 becomes 0, which forces Mux (3) to ground T2<31:0> and freezes Q2<31:0>, I<31:0>, and F<4:0>.
Then, Mux (1) and (2) pass P<31:0> to TFF Line (1), and C<31:0> circulates in the expected counting pattern.

6.4 Frequency reconstruction

To collect the digital bits from the full-cycle counter, priority encoder (F), and priority encoder (L), a low pulse is sent to the Start/Stop pin. This low pulse first freezes C<31:0>, N<39:0>, L<4:0>, Overflow, and F<4:0>. The external reading circuit then has time to store the output digits corresponding to the input frequency before the Rst pulse arrives to reset everything. As shown in Fig. 6.8, the output digits contain three parts: (1) the fractional counting result for the incomplete circulation of Q2<31:0> at the beginning, (2) the full-cycle counting result for $N$ whole circulations of C<31:0>, and (3) the fractional counting result for the incomplete circulation of C<31:0> at the end. As a result, the equivalent number of cycles counted during $T_{count}$ is

$$N_{eq} = N + (F\% + L\%) \qquad (6.3)$$

The input frequency can then be reconstructed as

$$f_{in\_fraction} = N_{eq} \cdot \frac{1}{T_{count}} = N_{eq} \cdot f_{count} \qquad (6.4)$$

where $N_{eq}$ can be a non-integer number. As an illustration with assumed readings, $N = 1000$, $F\% = 5/32$, and $L\% = 3/32$ give $N_{eq} = 1000.25$.

Fig. 6.8. The content of the output digits from the fractional frequency counter.

In comparison, the frequency reconstructed by using only a full-cycle counter is

$$f_{in\_full} = N \cdot \frac{1}{T_{count}} = N \cdot f_{count} \qquad (6.5)$$

where $N$ must be an integer and the two incomplete circulations at the beginning and the end are not counted, which leads to errors in frequency reconstruction. As a result, $f_{in\_fraction}$ is more accurate than $f_{in\_full}$. In addition, because the fractional frequency counter uses 32 rising edges from the DLL during one input cycle, the resolution of $N$ in equation (6.5) is 1, whereas the resolution of $N_{eq}$ in equation (6.4) is 1/32. This improves the resolution of the frequency reconstruction from $f_{count}$ to $f_{count}/32$. Finally, the minimum value of $T_{count}$ for the full-cycle counting method is $T_{in}$; otherwise, there is no valid result.
However, the minimum $T_{count}$ for the fractional counting method is $T_{in}/32$, which enables frequency counting to be completed in a much shorter time.

Chapter 7: Top-level Simulation Results

The extracted layout of the DLL was incorporated into the schematic of the fractional frequency counter and tested at the TT, SS, and FF corners under supply variations of ±10% and temperatures ranging from 10 ℃ to 60 ℃. The DLL layout has an area of 159.3 μm × 44.2 μm.

Fig. 7.1. Top-level block diagram of the DLL-based fractional frequency counter.

Fig. 7.2. DLL layout.

The worst-case peak-to-peak output jitter of the DLL was 323 ps at the SS corner with $V_{dd}$ = 1.08 V and a temperature of 60 ℃. Fig. 7.3 illustrates the phase error of the DLL versus the input frequency. The narrowest DLL locking range was 31.56 to 235.92 MHz. This worst case occurred at the FF corner with $V_{dd}$ = 1.08 V and a temperature of 60 ℃.

Fig. 7.3. Phase error of the DLL.

Table 7.1 compares the performance of the DLL in this project with that of DLLs in other studies [68]–[71]. The parameter values listed for this DLL in Table 7.1 are those at the worst-case corners. The figure of merit (FoM) of the DLL in the last row was calculated as [72]

$$\text{FoM} = \frac{\text{Power (mW)}}{\text{Maximum Operating Frequency (GHz)}} \qquad (7.1)$$

The DLL built in this project has the smallest area as well as a competitive jitter performance and delay range.
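Equation (7.1) and the delay range quoted for this work can be verified from the reported worst-case numbers. The short Python sketch below is only a consistency check: it uses the power (4.05 mW), the maximum operating frequency (235.92 MHz), and the locking range (31.56 to 235.92 MHz) reported for this DLL in this chapter's comparison table, and the function names are hypothetical.

```python
def dll_fom(power_mw, f_max_ghz):
    """Figure of merit from equation (7.1): power in mW divided by the
    maximum operating frequency in GHz."""
    return power_mw / f_max_ghz

def period_ns(freq_mhz):
    """Clock period in ns for a frequency in MHz; for a DLL locked at one
    input period, this equals the VCDL delay."""
    return 1e3 / freq_mhz

# Values reported for this work at the worst-case corners
fom = dll_fom(4.05, 0.23592)
delay_min = period_ns(235.92)
delay_max = period_ns(31.56)

print(f"FoM: {fom:.2f}")
print(f"Delay range: {delay_min:.2f} to {delay_max:.2f} ns")
```

Both results agree with the entries listed for this work: an FoM of 17.17 and a delay range of 4.24 to 31.69 ns.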
Table 7.1. Comparison of DLL performance in this project and previous studies
Parameter | T VLSI SYST 13 [68] | ASSCC 13 [69] | TCASII 14 [70] | TCASII 16 [71] | This work
Process | 180 nm | 130 nm | 130 nm | 65 nm | 65 nm
Supply | 1.8 V | 1.2 V | 1.2 V | 1–1.5 V | 1.2 V
Operating frequency | 5 MHz – 120 MHz | 100 MHz – 1.5 GHz | 400 MHz – 800 MHz | 120 MHz – 2 GHz | 31.56 MHz – 235.92 MHz
Delay range | 8.33 – 200 ns | 0.67 – 10 ns | 1.25 – 2.5 ns | 0.5 – 8.33 ns | 4.24 – 31.69 ns
Power | 28 mW @ 120 MHz | 5.9 mW @ 1 GHz | 7.2 mW @ 800 MHz | 6.6 mW @ 2 GHz, 1.2 V | 4.05 mW @ 235.92 MHz
Area | 0.270 mm² | 0.110 mm² | 0.025 mm² | 0.059 mm² | 0.007 mm²
Normalized jitter $\Delta T/T_{ref}$ (%) | 0.12 | 1.69 | 1.60 | 2.80 | 0.51
FoM (pJ/Hz) | 233.33 | 5.90 | 9.00 | 3.30 | 17.17

The green zigzag line in Fig. 7.4 represents the reconstructed frequency versus the input frequency at the corner with the worst-case locking range (FF, $V_{dd}$ = 1.08 V, 60 ℃). The blue line denotes the ideal result of frequency reconstruction, and the orange zigzag line indicates the ideal frequency reconstruction result of a 32-phase DLL-based fractional frequency counter. The DLL-based fractional frequency counter can determine the frequency of an input signal with an error of less than 1 LSB when the input frequency ranges from 42 to 216 MHz.

Fig. 7.4. Reconstructed frequency versus input frequency.

Fig. 7.5 presents the overall power consumption of the DLL and the counter. The worst-case power consumption within the locking range of the DLL was 7.12 mW, which occurred at the FF corner with $V_{dd}$ = 1.32 V and a temperature of 60 ℃.

Fig. 7.5. Overall power consumption at different corners.

Chapter 8: Conclusion and Future Work

A DLL-based fractional frequency counter was constructed in this project. In the worst case, the circuit counts correctly over an input frequency range of 42 to 216 MHz. The results confirmed that the fractional counter can increase the accuracy, speed, and resolution of frequency counting to levels higher than those achieved using conventional full-cycle frequency counters.
This feature enables the use of the fractional frequency counter to improve the measurement speed and precision of frequency-shift magnetic biosensors.

Future work on this project will include the following:
1. Finishing the layout of the digital counter
2. Debugging the layout of the DLL and determining the limiting factor for the phase error
3. Running Monte Carlo simulations with 10,000 points for the entire chip
4. Addressing the inability of the DLL-based fractional frequency counter to handle input frequency changes during $T_{count}$

The DLL-based fractional frequency counter can also be modified so that it outputs real-time frequency data for a continuously changing input frequency (Fig. 8.1). With this improvement, the fixed-frequency oscillator and the fractional frequency counter work simultaneously. The cycle counter counts the number of cycles of the oscillator, thereby obtaining $T_{count}$ (i.e., the time used for frequency counting). In this case, $T_{count}$ can be very small because of the fractional frequency counter.

If $f_{in}$ changes, the DLL shifts from the locked to the unlocked state. This shift is sensed by the lock detector, which immediately freezes the fractional frequency counter for a certain amount of time (by sending a low pulse to the “Start/Stop” pin, as illustrated in Fig. 6.7). It then allows the memory to store the values of $T_{count}$ and $D_{freq}$ (the output digits from the fractional frequency counter that indicate the frequency of the input clock before the freeze). This process yields the frequency data for the input clock before the frequency change. Accordingly, the frequency counter can work with continuously changing input frequencies and output the reconstructed input frequency in real time, all on a single chip.

Fig. 8.1. Structure of the continuously working fractional frequency counter.

References

[1] C. Sideris and A.
Hajimiri, “Design and implementation of an integrated magnetic spectrometer for multiplexed biosensing,” IEEE Trans. Biomed. Circuits Syst., vol. 7, no. 6, pp. 773–784, Dec. 2013, doi: 10.1109/TBCAS.2013.2297514.
[2] N. K. Tran and G. J. Kost, “Worldwide point-of-care testing: Compendiums of POCT for mobile, emergency, critical, and primary care and of infectious diseases tests,” J. Near-Patient Test. Technol., vol. 5, no. 2, pp. 84–92, Jun. 2006, doi: 10.1097/00134384-200606000-00010.
[3] H. Wang, C. Sideris, and A. Hajimiri, “A frequency-shift based CMOS magnetic biosensor with spatially uniform sensor transducer gain,” in IEEE Custom Integr. Circuits Conf. 2010, 2010, pp. 1–4, doi: 10.1109/CICC.2010.5617603.
[4] H. Wang, S. Kosai, C. Sideris, and A. Hajimiri, “An ultrasensitive CMOS magnetic biosensor array with correlated double counting noise suppression,” in 2010 IEEE MTT-S Int. Microw. Symp., 2010, pp. 616–619, doi: 10.1109/MWSYM.2010.5514719.
[5] C. Sideris, “Electromagnetic field manipulation: Biosensing to antennas,” Ph.D. dissertation, Dept. Elect. Eng., California Inst. Technol., Pasadena, CA, USA, 2017. [Online]. Available: https://resolver.caltech.edu/CaltechTHESIS:06082017-193807440
[6] H. Wang, Y. Chen, A. Hassibi, A. Scherer, and A. Hajimiri, “A frequency-shift CMOS magnetic biosensor array with single-bead sensitivity and no external magnet,” in 2009 IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers, 2009, pp. 438–439, 439a, doi: 10.1109/ISSCC.2009.4977496.
[7] H. Wang, A. Mahdavi, D. A. Tirrell, and A. Hajimiri, “A magnetic cell-based sensor,” Lab on a Chip, vol. 12, no. 21, pp. 4465–4471, 2012, doi: 10.1039/C2LC40392G.
[8] C. Sideris and A. Hajimiri, “An integrated magnetic spectrometer for multiplexed biosensing,” in 2013 IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers, 2013, pp. 300–301, doi: 10.1109/ISSCC.2013.6487744.
[9] A. Pai, A. Khachaturian, S. Chapman, A. Hu, H.
Wang, and A. Hajimiri, “A handheld magnetic sensing platform for antigen and nucleic acid detection,” Analyst, vol. 139, no. 6, pp. 1403–1411, 2014, doi: 10.1039/C3AN01947K.
[10] S.-J. Han, H. Yu, B. Murmann, N. Pourmand, and S. X. Wang, “A high-density magnetoresistive biosensor array with drift-compensation mechanism,” in 2007 IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers, 2007, pp. 168–594, doi: 10.1109/ISSCC.2007.373347.
[11] S. Gambini et al., “A CMOS 10kpixel baseline-free magnetic bead detector with column-parallel readout for miniaturized immunoassays,” in 2012 IEEE Int. Solid-State Circuits Conf., 2012, pp. 126–128, doi: 10.1109/ISSCC.2012.6176948.
[12] N. Sun, Y. Liu, H. Lee, R. Weissleder, and D. Ham, “CMOS RF biosensor utilizing nuclear magnetic resonance,” IEEE J. Solid-State Circuits, vol. 44, no. 5, pp. 1629–1643, May 2009, doi: 10.1109/JSSC.2009.2017007.
[13] B. Jang, P. Cao, A. Chevalier, A. Ellington, and A. Hassibi, “A CMOS fluorescent-based biosensor microarray,” in IEEE ISSCC Dig. Tech. Papers, Feb. 2009, pp. 436–437, 437a, doi: 10.1109/ISSCC.2009.4977495.
[14] E. Timurdogan, B. E. Alaca, I. H. Kavakli, and H. Urey, “MEMS biosensor for detection of hepatitis A and C viruses in serum,” Biosens. Bioelectron., vol. 28, no. 1, pp. 189–194, 2011, doi: 10.1016/J.BIOS.2011.07.014.
[15] A. L. Washburn, L. C. Gunn, and R. C. Bailey, “Label-free quantitation of a cancer biomarker in complex media using silicon photonic microring resonators,” Anal. Chem., vol. 81, no. 22, pp. 9499–9506, 2009, doi: 10.1021/AC902006P.
[16] D. A. Hall, R. S. Gaster, K. A. A. Makinwa, S. X. Wang, and B. Murmann, “A 256 pixel magnetoresistive biosensor microarray in 0.18μm CMOS,” IEEE J. Solid-State Circuits, vol. 48, no. 5, pp. 1290–1301, May 2013, doi: 10.1109/JSSC.2013.2245058.
[17] M. Crescentini, M. Marchesi, A. Romani, M. Tartagni, and P. A. Traverso, “A broadband, on-chip sensor based on Hall effect for current measurements in smart power circuits,” IEEE Trans.
Instrum. Meas., vol. 67, no. 6, pp. 1470–1485, Jun. 2018, doi: 10.1109/TIM.2018.2795248.
[18] T. Costa, F. A. Cardoso, J. Germano, P. P. Freitas, and M. S. Piedade, “A CMOS front-end with integrated magnetoresistive sensors for biomolecular recognition detection applications,” IEEE Trans. Biomed. Circuits Syst., vol. 11, no. 5, pp. 988–1000, Oct. 2017, doi: 10.1109/TBCAS.2017.2743685.
[19] A. De Marcellis et al., “Giant magnetoresistance (GMR) sensors for 0.35μm CMOS technology sub-mA current sensing,” in Proc. IEEE SENSORS, Nov. 2014, pp. 444–447, doi: 10.1109/ICSENS.2014.6985030.
[20] S.-J. Han et al., “CMOS integrated DNA microarray based on GMR sensors,” in IEDM Tech. Dig., Dec. 2006, pp. 1–4, doi: 10.1109/IEDM.2006.346887.
[21] A. Pai, A. Khachaturian, S. Chapman, A. Hu, H. Wang, and A. Hajimiri, “A handheld magnetic sensing platform for antigen and nucleic acid detection,” 16th Int. Conf. Miniaturized Syst. Chemistry Life Sci. (MicroTAS 2013), Freiburg, Germany.
[22] C. Sideris, P. P. Khial, and A. Hajimiri, “Design and implementation of reference-free drift-cancelling CMOS magnetic sensors for biosensing applications,” IEEE J. Solid-State Circuits, vol. 53, no. 11, pp. 3065–3075, Nov. 2018, doi: 10.1109/JSSC.2018.2865480.
[23] K. Hoffmann, Applying the Wheatstone Bridge Circuit. Berlin, Germany: HBM, 1974.
[24] B. Razavi, Design of CMOS Phase-Locked Loops: From Circuit Level to Architecture Level. Cambridge, UK: Cambridge University Press, 2020.
[25] B. I. Abdulrazzaq, I. Abdul Halin, S. Kawahito, R. M. Sidek, S. Shafie, and N. A. M. Yunus, “A review on high-resolution CMOS delay lines: Toward sub-picosecond jitter performance,” SpringerPlus, vol. 5, p. 434, 2016, doi: 10.1186/S40064-016-2090-Z.
[26] H.-H. Chang, J.-W. Lin, C.-Y. Yang, and S.-I. Liu, “A wide-range delay-locked loop with a fixed latency of one clock cycle,” IEEE J. Solid-State Circuits, vol. 37, no. 8, pp. 1021–1027, Aug. 2002, doi: 10.1109/JSSC.2002.800922.
[27] R. C. H.
van de Beek, E. A. M. Klumperink, C. S. Vaucher, and B. Nauta, “Low-jitter clock multiplication: A comparison between PLLs and DLLs,” IEEE Trans. Circuits Syst. II Analog Digit. Signal Process., vol. 49, no. 8, pp. 555–566, Aug. 2002, doi: 10.1109/TCSII.2002.806248.
[28] Y. Chen and W. Li, “Modeling and analysis of DLLs for locking and jitter based on Simulink,” in 2015 IEEE Int. Circuits Syst. Symp. (ICSyS), Langkawi, Kedah, 2015, pp. 146–150, doi: 10.1109/CIRCUITSANDSYSTEMS.2015.7394083.
[29] B. Kim, T. C. Weigandt, and P. R. Gray, “PLL/DLL system noise analysis for low-jitter clock synthesizer design,” in Proc. Int. Symp. Circuits Systems, 1994, pp. 31–34 vol. 4, doi: 10.1109/ISCAS.1994.409189.
[30] J. G. Maneatis, “Low-jitter process-independent DLL and PLL based on self-biased techniques,” IEEE J. Solid-State Circuits, vol. 31, pp. 1723–1732, Nov. 1996, doi: 10.1109/JSSC.1996.542317.
[31] R. L. Aguiar and D. M. Santos, “Simulation and modelling of digital delay locked loops,” in 42nd Midwest Symp. Circuits Systems, vol. 2, Las Cruces, NM, USA, 1999, pp. 843–846, Cat. no. 99CH36356, doi: 10.1109/MWSCAS.1999.867766.
[32] M. Gholami, “Total jitter of delay-locked loops due to four main jitter sources,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 24, no. 6, pp. 2040–2049, June 2016, doi: 10.1109/TVLSI.2015.2494741.
[33] M. Gholami and G. Ardeshir, “Jitter of delay-locked loops due to PFD,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 22, no. 10, pp. 2176–2180, Oct. 2014, doi: 10.1109/TVLSI.2013.2284501.
[34] M. Gholami and G. Ardeshir, “Analysis of DLL jitter due to voltage-controlled delay line,” Circuits Syst. Signal Process., vol. 32, pp. 2119–2135, 2013, doi: 10.1007/s00034-013-9584-5.
[35] M.-J. E. Lee et al., “Jitter transfer characteristics of delay-locked loops: Theories and design techniques,” IEEE J. Solid-State Circuits, vol. 38, no. 4, pp. 614–621, Apr. 2003, doi: 10.1109/JSSC.2003.809519.
[36] J. D.
Vandersand, “An analog multiphase self-calibrating DLL to minimize the effects of process, supply voltage, and temperature variations,” Ph.D. diss., Univ. Tenn., Knoxville, TN, USA, 2008. [Online]. Available: https://trace.tennessee.edu/utk_graddiss/351 [37] H. J. Ng, A. Fischer, R. Feger, R. Stuhlberger, L. Maurer and A. Stelzer, “A DLL-supported, low phase noise fractional-N PLL with a wideband VCO and a highly linear frequency ramp generator for FMCW radars,” IEEE Trans. Circuits Syst. I Regul. Pap., vol. 60, no. 12, pp. 3289–3302, Dec. 2013, doi: 10.1109/TCSI.2013.2265966. [38] M. Gholami, “A novel low-power architecture for DLL-based frequency synthesizers,” Circuits, Syst. Signal Process., vol. 32, pp. 781–801, 2013, doi: 10.1007/S00034-012-9488-9. [39] P. Mroszczyk and P. Dudek, “Tunable CMOS delay gate with improved matching properties,” IEEE Trans. Circuits Syst. I Regul. Pap., vol. 61, no. 9, pp. 2586–2595, Sept. 2014, doi: 10.1109/TCSI.2014.2312491. [40] P. Mroszczyk and P. Dudek, “Tunable CMOS delay gate with reduced impact of fabrication mismatch on timing parameters,” in 2013 IEEE 11 th Int. New Circuits and Systems Conf. 109 (NEWCAS), 2013, pp. 1–4, doi: 10.1109/NEWCAS.2013.6573595. [41] D. K. Jeong, G. Borriello, D. A. Hodges and R. H. Katz, “Design of PLL-based clock generation circuits,” IEEE J. Solid-State Circuits, vol. 22, no. 2, pp. 255–261, April 1987, doi: 10.1109/JSSC.1987.1052710. [42] T. C. Weigandt, Beomsup Kim and P. R. Gray, “Analysis of timing jitter in CMOS ring oscillators,” in Proc. IEEE Int. Symp. Circuits and Systems—ISCAS ’94, vol. 4, 1994, pp. 27– 30, doi: 10.1109/ISCAS.1994.409188. [43] L. Xiao, W. Liu and L. Yang, “Low jitter design for ring oscillator in Serdes,” 2007 7 th Int. Conf. ASIC, 2007, pp. 307–310, doi: 10.1109/ICASIC.2007.4415628. [44] A. A. Abidi and R. G. Meyer, “Noise in relaxation oscillators,” IEEE J. Solid-State Circuits, vol. 18, no. 6, p. 795502, Dec. 1983, doi: 10.1109/JSSC.1983.1052034. [45] B. Wang, J. 
R. Hellums and C. G. Sodini, “MOSFET thermal noise modeling for analog integrated circuits,” IEEE J. Solid-State Circuits, vol. 29, no. 7, pp. 833–835, July 1994, doi: 10.1109/4.303722. [46] R. C. H. van de Beek, E. A. M. Klumperink, C. S. Vaucher and B. Nauta, “Low-jitter clock multiplication: A comparison between PLLs and DLLs,” IEEE Trans. Circuits Syst. II Analog Digit. Signal Process., vol. 49, no. 8, pp. 555–566, Aug. 2002, doi: 10.1109/TCSII.2002.806248. [47] H. Chang, C. Sun and S. Liu, “Low jitter Butterworth delay-locked loops,” in 2003 Symp. VLSI Circuits: Dig. Tech. Papers, 2003, pp. 177–180, IEEE cat. no. 03CH37408, doi: 10.1109/VLSIC.2003.1221196. [48] B. Razavi, "The Delay-Locked Loop [A Circuit for All Seasons]," in IEEE Solid-State Circuits Mag., vol. 10, no. 3, pp. 9-15, Summer 2018, doi: 10.1109/MSSC.2018.2844615. [49] L. Dai and R. Harjani, “CMOS switched-op-amp-based sample-and-hold circuit,” IEEE J. Solid-State Circuits, vol. 35, no. 1, pp. 109–113, Jan. 2000, doi: 10.1109/4.818927. [50] P. Larsson, “A 2-1600-MHz CMOS clock recovery PLL with low-Vdd capability,” IEEE J. Solid-State Circuits, vol. 34, no. 12, pp. 1951–1960, Dec. 1999, doi: 10.1109/4.808920. [51] H. Yu, Y . Inoue and Y . Han, “A new high-speed low-voltage charge pump for PLL applications,” in 2005 6 th Int. Conf. ASIC, 2005, pp. 387–390, doi: 10.1109/ICASIC.2005.1611344. 110 [52] Q. Huang, X. Lin, and J. He, “A low current mismatch and deviation charge pump with symmetrical complementary half-current circuits,” in Recent Advances in Computer Science and Information Engineering, vol. 127, Lecture Notes in Electrical Engineering, Qian Z., Cao L., Su W., Wang T., Yang H., Eds., Berlin, Heidelberg, Germany: Springer, doi: 10.1007/978-3-642-25769-8_103. [53] F. You, H. K. Embabi, J. F. Duque-Carrillo and E. Sanchez-Sinencio, “Am improved tail current source for low voltage applications,” IEEE J. Solid-State Circuits, vol. 32, no. 8, pp. 1173–1180, Aug. 1997, doi: 10.1109/4.604073. 
[54] M.-S. Hwang, J. Kim and D.-K. Jeong, “Reduction of pump current mismatch in charge-pump PLL,” Electron. Lett., vol. 45, pp. 135–136, 2009, doi: 10.1049/el:20092727. [55] K. Park, W. Bae, J. Lee, J. Hwang and D. Jeong, “A 6.7–11.2 Gb/s, 2.25 pJ/bit, single-loop referenceless CDR with multi-phase, oversampling PFD in 65-nm CMOS,” IEEE J. Solid- State Circuits, vol. 53, no. 10, pp. 2982–2993, Oct. 2018, doi: 10.1109/JSSC.2018.2859947. [56] H.-G. Ko, W. Bae, G.-S. Jeong and D.-K. Jeong, “Reference spur reduction techniques for a phase-locked loop,” IEEE Access, vol. 7, pp. 38035–38043, 2019, doi: 10.1109/ACCESS.2019.2905767. [57] A. L. Coban and P. E. Allen, “A 1.75 V rail-to-rail CMOS op amp,” in Proc. IEEE Int. Symp. Circuits and Systems—ISCAS ’94, vol. 5, 1994, pp. 497–500, doi: 10.1109/ISCAS.1994.409420. [58] E. Sackinger and W. Guggenbuhl, “A high-swing, high-impedance MOS cascode circuit,” IEEE J. Solid-State Circuits, vol. 25, no. 1, pp. 289–298, Feb. 1990, doi: 10.1109/4.50316. [59] W. Rhee, “Design of high-performance CMOS charge pumps in phase-locked loops,” in 1999 IEEE Int. Symp. Circuits and Systems (ISCAS), vol. 2, 1999, pp. 545–548, doi: 10.1109/ISCAS.1999.780807. [60] C. Sawigun, A. Demosthenous, X. Liu and W. A. Serdijn, “A compact rail-to-rail class-AB CMOS buffer with slew-rate enhancement,” IEEE Trans. Circuits Syst. II Express Briefs, vol. 59, no. 8, pp. 486–490, Aug. 2012, doi: 10.1109/TCSII.2012.2204843. [61] V . Kasemsuwan and W. Nakhlo, “A simple rail-to-rail CMOS voltage follower,” Microelectronics J., vol. 26, no. 1, pp. 17–21, 2009, doi: 10.1108/13565360910923124. [62] R. G. Carvajal et al., “The flipped voltage follower: A useful cell for low-voltage low-power 111 circuit design,” IEEE Trans. Circuits and Syst. I Regul. Pap., vol. 52, no. 7, pp. 1276–1291, July 2005, doi: 10.1109/TCSI.2005.851387. [63] J. Liu, D. Li, Y . Zhong, X. Tang and N. 
Sun, “27.1 A 250kHz-BW 93dB-SNDR 4 th -order noise-shaping SAR using capacitor stacking and dynamic buffering,” in 2021 IEEE Int. Solid- State Circuits Conf. (ISSCC), 2021, pp. 369–371, doi: 10.1109/ISSCC42613.2021.9366008. [64] H. Lad Kirankumar, S. Rekha and T. Laxminidhi, “A dead-zone-free zero blind-zone high- speed phase frequency detector for charge-pump PLL,” Circuits Syst. Signal Process., vol. 39, pp. 3819–3832, 2020, doi: 10.1007/s00034-020-01366-1. [65] M. Gholami, “Phase detector with minimal blind zone and reset time for Gsamples/s DLLs,” Circuits Syst. Signal Process., vol. 36, pp. 3549–3563, 2017, doi: 10.1007/s00034- 016-0485-2. [66] Y . Moon, I. Kong, Y . Ryu and J. Kang, “A 2.2-mW 20–135-MHz false-lock-free DLL for display interface in 0.15-μm CMOS,” IEEE Trans. Circuits Syst. II Express Briefs, vol. 61, no. 8, pp. 554–558, Aug. 2014, doi: 10.1109/TCSII.2014.2327338. [67] “ENGR/ECE 212 21 February 2003,” https://slidetodoc.com/engrece-212-21-february-2003- all-slides-except/ (accessed Oct. 10, 2021). [68] S. Hwang, K. Kim, J. Kim, S. W. Kim, and C. Kim, “A self-calibrated DLL-based clock generator for an energy-aware EISC processor,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 21, no. 3, pp. 575‒579, March 2013, doi: 10.1109/TVLSI.2012.2188656. [69] S. Han, T. Kim, and J. Kim, “A 0.1–1.5 GHz all-digital phase inversion delay-locked loop,” in Proc. IEEE Asian Solid State Circuits Conf., 2013, pp. 341–344. [70] K. Ryu, D.-H. Jung, and S.-O. Jung, “Process-variation-calibrated multiphase delay locked loop with a loop-embedded duty cycle corrector,” IEEE Trans. Circuits Syst., II, Exp. Briefs, vol. 61, no. 1, pp. 1–5, Jan. 2014. [71] J. -H. Lim et al., “A delay locked loop with a feedback edge combiner of duty-cycle corrector with a 20%–80% input duty cycle for SDRAMs,” IEEE Trans. Circuits Syst., II, Exp. Briefs, vol. 63, no. 2, pp. 141‒145, Feb. 2016, doi: 10.1109/TCSII.2015.2468911. [72] S. U. Rehman, M. M. Khafaji, A. Ferschischi, C. 
Carta, and F. Ellinger, “A 0.2‒1.3 ns range delay-control scheme for a 25 Gb/s data-receiver using a replica delay-line-based delay- locked-loop in 45-nm CMOS,” IEEE Trans. Circuits Syst., II, Exp. Briefs, vol. 67, no. 5, pp. 806‒810, May 2020, doi: 10.1109/TCSII.2020.2980813.
Abstract
The increasing demand for point-of-care systems that enable fast, home-based disease diagnosis drives the development of accurate and quick-response biosensors. A promising technology is the frequency-shift magnetic biosensor, which uses an LC oscillator as the sensor core to qualitatively and quantitatively detect target biomolecules by downshifting its oscillation frequency. Its potential for widespread use stems from the time- and cost-effective biomolecule detection enabled by its simple structure, high sensitivity, and compatibility with modern CMOS technology.
Accurate and rapid counting of the frequency output of a frequency-shift magnetic biosensor is necessary to efficiently reconstruct the concentration of target biomolecules. Conventional full-cycle frequency counting, which records the number of rising edges (N) within a fixed period (T_count), cannot detect the initial and final skews between the clock that marks T_count and the signal being measured, which reduces the accuracy of frequency reconstruction. A longer counting time improves the accuracy and resolution of full-cycle counting, but it degrades the response speed of the biosensor, creating a trade-off.
This project developed a delay-locked loop (DLL)-based fractional frequency counter, which can count a fraction of a full clock cycle, to simultaneously enhance the accuracy, resolution, and speed of frequency counting and further improve the overall performance of frequency-shift magnetic biosensors. This fractional counter is applicable not only to biosensors but also to any procedure that requires high-speed and precise frequency counting.
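The trade-off described above can be illustrated with a small numerical sketch. It is not the thesis circuit: it assumes an idealized signal whose rising edges fall at t = k / f_sig, and an idealized fractional counter that resolves the signal phase at the start and end of the counting window to 1/m of a cycle. The function names, the test frequency, the 1 µs window, and m = 32 (chosen only to match the 32-fold figure quoted in this abstract) are all illustrative assumptions.

```python
import math


def full_cycle_estimate(f_sig, t_count, t_start=0.0):
    """Frequency estimate from whole rising edges inside the counting window.

    Assumes rising edges at t = k / f_sig for integer k (idealized signal).
    """
    first_edge = math.ceil(t_start * f_sig)            # index of first edge at or after window start
    end_edge = math.ceil((t_start + t_count) * f_sig)  # index of first edge at or after window end
    n = end_edge - first_edge                          # edges with t_start <= t < t_start + t_count
    return n / t_count


def fractional_estimate(f_sig, t_count, m, t_start=0.0):
    """Frequency estimate when phase is resolved to 1/m of a cycle (idealized)."""
    start_ticks = math.floor(t_start * f_sig * m)            # start phase in units of 1/m cycle
    end_ticks = math.floor((t_start + t_count) * f_sig * m)  # end phase in units of 1/m cycle
    return (end_ticks - start_ticks) / (m * t_count)


if __name__ == "__main__":
    f_sig = 100.3e6   # hypothetical input frequency, Hz
    t_count = 1e-6    # hypothetical counting window, s
    m = 32            # assumed number of sub-cycle steps

    cases = {
        "full-cycle, T_count": full_cycle_estimate(f_sig, t_count),
        "full-cycle, m * T_count": full_cycle_estimate(f_sig, m * t_count),
        "fractional, T_count": fractional_estimate(f_sig, t_count, m),
    }
    for name, estimate in cases.items():
        print(f"{name}: error = {abs(estimate - f_sig) / 1e3:.2f} kHz")
```

In this model the full-cycle error is bounded by 1 / T_count, while the fractional error is bounded by 1 / (m * T_count). A full-cycle counter therefore needs a window m times longer to reach the same bound.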
Analog delay-locked loops (DLLs) were chosen over their digital counterparts because of their higher delay resolution and better jitter performance. A voltage-controlled delay line (VCDL) composed of current-starved delay cells was optimized for a wide delay tuning range, low jitter, low power consumption, low supply sensitivity, and small area. Regulated cascode current sources with switches that reduce charge injection, charge sharing, and clock feedthrough were used to construct a charge pump (CP) that operates at high speed and provides a wide output control voltage range. A phase frequency detector (PFD) with a false-lock-prevention circuit was inserted into the loop to enhance the robustness of the DLL under frequency changes.
As a result, the DLL-based fractional counter, designed in TSMC 65 nm technology, can reconstruct the frequency of an input signal from 42 MHz to 216 MHz while consuming less than 7.12 mW. Counting is 32 times faster and the resolution of frequency reconstruction is 32 times higher than with conventional full-cycle frequency counting.
Asset Metadata
Creator
Chu, Xiao (author)
Core Title
Delay-locked loop-based fractional frequency counter for magnetic biosensors
School
Viterbi School of Engineering
Degree
Master of Science
Degree Program
Electrical Engineering
Degree Conferral Date
2022-05
Publication Date
04/02/2024
Defense Date
10/22/2021
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
fractional frequency counter,OAI-PMH Harvest
Format
application/pdf (imt)
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Sideris, Constantine (committee chair), Chen, Mike Shuo-Wei (committee member), Hashemi, Hossein (committee member), Monge, Manuel (committee member)
Creator Email
chuxiao@usc.edu,chuxiaory@163.com
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-oUC110879202
Unique identifier
UC110879202
Document Type
Thesis
Rights
Chu, Xiao
Type
texts
Source
20220406-usctheses-batch-919 (batch), University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright. The original signature page accompanying the original submission of the work to the USC Libraries is retained by the USC Libraries and a copy of it may be obtained by authorized requesters contacting the repository e-mail address given.
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email
cisadmin@lib.usc.edu