Timing analysis of coupled interconnect and CMOS logic cells in the presence of crosstalk noise
TIMING ANALYSIS OF COUPLED INTERCONNECT AND CMOS LOGIC CELLS IN THE PRESENCE OF CROSSTALK NOISE

by
Shahin Nazarian

A Dissertation Presented to the FACULTY OF THE GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (ELECTRICAL ENGINEERING)

December 2006

Copyright 2006 Shahin Nazarian

Table of Contents

List of Figures
List of Tables
Abstract
1 INTRODUCTION
  1.1 Background and Discussion of state-of-the-art
  1.2 Contribution of the Dissertation
  1.3 Outline of the Dissertation
2 EMPIRICAL STUDY OF PROPAGATION DELAY OF COUPLED INTERCONNECTS
  2.1 Introduction
  2.2 Crosstalk Sensitivity Analysis: Dependence on Input Skew
    2.2.1 Output Slowdown as a function of Input Skew
    2.2.2 Output Transition Time as a function of Input Skew
  2.3 Crosstalk Sensitivity Analysis: Dependence on Transition Time
    2.3.1 Both Input Transition Times Change
    2.3.2 Only One Transition Time Changes
  2.4 Crosstalk Sensitivity Analysis: Dependence on Circuit Parameters
    2.4.1 Dependence on the Coupling Capacitance Value
    2.4.2 Dependence on the Wire Capacitance
    2.4.3 Dependence on the Wire Resistance
  2.5 Crosstalk Induced Speedup
  2.6 Driver Strength
    2.6.1 Dependence on the Drive Strength
    2.6.2 Unbalanced Cells
  2.7 Interaction of Sites
    2.7.1 Interaction of Two Slowdown Effects
    2.7.2 Interaction of Slowdown and Speedup Effects
  2.8 Side-Load Routing
  2.9 Summary
3 STATISTICAL ANALYSIS OF COUPLED INTERCONNECTS
  3.1 Introduction
  3.2 Interconnect Modeling
    3.2.1 Crosstalk Terminology
    3.2.2 Coupled Interconnect Characterization/Modeling
    3.2.3 Interconnect Sources of Variation
  3.3 Experimental Results
    3.3.1 Statistical Model
    3.3.2 The Simulation Setup
    3.3.3 Statistical Comparison of Crosstalk Models
    3.3.4 Variation Shielding
    3.3.5 Analytical Crosstalk-Aware Delay Analysis
  3.4 Summary
4 CROSSTALK-AWARE LOGIC CELL TIMING ANALYSIS USING VOLTAGE-BASED DELAY MODELING
  4.1 Introduction
  4.2 Previous Voltage-based Cell Delay Modeling Techniques
    4.2.1 Point-based Techniques
    4.2.2 Least Square Fitting-based Technique
    4.2.3 Weighted Least Squared Error-based Technique
    4.2.4 Elmore-based Technique
  4.3 Our Voltage Gain-Based Cell Delay Modeling Techniques
  4.4 Experimental Result for our Voltage Gain-based Cell Delay Analysis Technique
  4.5 Summary
5 CROSSTALK-AWARE LOGIC CELL TIMING ANALYSIS USING CURRENT-BASED DELAY MODELING
  5.1 Introduction
  5.2 Previous Non-voltage-based Cell Delay Modeling Techniques
    5.2.1 Equation-based techniques
    5.2.2 Current-based techniques
  5.3 ROCC-based Cell Delay Model
    5.3.1 Impetus for our cell delay model
    5.3.2 Cell characterization and output waveform computation
  5.4 Experimental Results for the ROCC Model
  5.5 Current Source Modeling of Logic Cells with Voltage Dependent Parasitic Effects
    5.5.1 Experimental Results
  5.6 Summary
6 STAX: STATISTICAL CROSSTALK TARGET SET COMPACTION
  6.1 Introduction
  6.2 Modeling, Extraction, Analysis
    6.2.1 Coupled Interconnect Characterization/Modeling
    6.2.2 Parameter Extraction/Estimation
    6.2.3 Statistical Timing Analysis Tools
  6.3 Filtering
  6.4 Problem Statement and Solution
  6.5 Experimental Results
    6.5.1 Statistical Analysis Tool in STAX
    6.5.2 Target Set Compaction in STAX
  6.6 Summary
7 CONCLUSIONS
Bibliography

List of Figures

Figure 2-1: (a) The configuration used in our experiments, (b) The load configuration used in some of the experiments as the side-load, connected to the intermediate point i of line f
Figure 2-2: Delay from in_x (in_y) to out_u (out_v) as a function of input skew between in_x and in_y
Figure 2-3: Delay from in_x (in_y) to in_u (in_v) and from in_x (in_y) to out_u (out_v) as a function of input skew
Figure 2-4: Transition times of out_u and out_v as a function of input skew
Figure 2-5: Transition times of in_u (in_v) and out_u (out_v) as a function of input skew
Figure 2-6: Delay from in_x (in_y) to out_u (out_v) as a function of input transition time (both transition times change) for different coupling capacitance values
Figure 2-7: Transition times of out_u and out_v as a function of input transition time (both change) for different coupling capacitance values
Figure 2-8: Delay from in_x (in_y) to out_u (out_v) as a function of transition time of in_x for different coupling values
Figure 2-9: Transition time out_u (out_v) as a function of transition time of in_x
Figure 2-10: Delay from in_x (in_y) to out_u (out_v) as a function of coupling value for three different input skews
Figure 2-11: Transition times of out_u and out_v as a function of coupling value for different input skews
Figure 2-12: Delay from in_x (in_y) to out_u (out_v) as a function of input skew between in_x and in_y for different coupling values
Figure 2-13: The transition times of out_u and out_v as a function of input skew
Figure 2-14: Delay from in_x (in_y) to out_u (out_v) as a function of input skew for two 300μm-long parallel lines
Figure 2-15: Delay from in_x (in_y) to out_u (out_v) as a function of the wire capacitance for three different input skew values
Figure 2-16: Transition times of out_u and out_v as a function of the wire capacitance for different input skew values
Figure 2-17: Delay from in_x (in_y) to out_u (out_v) as a function of wire capacitance of line x for three different input skew values
Figure 2-18: Transition times of out_u and out_v as a function of wire capacitance of line x for different input skew values
Figure 2-19: Delay from in_x (in_y) to out_u (out_v) as a function of the wire resistance for three different input skew values
Figure 2-20: Transition times of out_u and out_v as a function of the wire resistance for different input skew values
Figure 2-21: Delay from in_x (in_y) to out_u (out_v) as a function of the wire resistance of line x for three different input skew values
Figure 2-22: Transition times of out_u and out_v as a function of the wire resistance of line x for different input skew values
Figure 2-23: Delay from in_x (in_y) to out_u (out_v) as a function of the wire resistance of line x (ranging from 0 to 5kΩ) for three different input skew values
Figure 2-24: Transition times of out_u and out_v as a function of the wire resistance of line x (ranging from 0 to 5kΩ) for different input skew values
Figure 2-25: Delay from in_x (in_y) to out_u (out_v) as a function of input skew for different coupling values for the speedup case
Figure 2-26: Transition time of out_u and out_v as a function of input skew for different coupling values for the speedup case
Figure 2-27: Delay from in_x (in_y) to out_u (out_v) as a function of input driver size ratio
Figure 2-28: Transition times of out_u and out_v as a function of input driver size ratio
Figure 2-29: Delay from in_x (in_y) to out_u (out_v) as a function of input skew using an unbalanced cell
Figure 2-30: Transition times of out_u and out_v as a function of input skew for an unbalanced cell
Figure 2-31: Delay from in_x (in_y) to out_u (out_v) as a function of input skew for different side-load connection points for coupling values 0, 50, and 300fF. size(INV_side) = size(INV_x)
Figure 2-32: Transition times of out_u and out_v as a function of input skew for different side-load connection points for coupling values
Figure 2-33: Delay from in_x (in_y) to out_u (out_v) as a function of input skew for different side load locations for coupling values 0, 50, and 300fF. size(INV_side) = 4 × size(INV_x)
Figure 3-1: Distributed capacitive modeling of coupled interconnects
Figure 3-2: Resistive and Capacitive line distribution for a 0.1mm long metal-4 interconnect
Figure 3-3: Distributed RC-π modeling of crosstalk site
Figure 3-4: Comparison of different approaches with our statistical crosstalk model
Figure 3-5: Accuracy improvement of RC-π crosstalk models
Figure 3-6: Crosstalk-aware victim interconnect delay (mean)
Figure 3-7: Crosstalk-aware victim delay (variance)
Figure 3-8: (a) Delay (from in_x (in_y) to) out_u (out_v) vs. coupling for three different input skew values. (b) Transition Time of out_u (out_v)
Figure 3-9: Transition times of out_u and out_v vs. wire capacitance of line x for different input skew values
Figure 3-10: Crosstalk-aware output delay distribution for a 1mm long metal-4 interconnect pair
Figure 3-11: Accuracy comparison vs. Hspice a) Mean Interconnect delay b) Interconnect delay variance
Figure 4-1: Elmore-based pessimism: Total coupling 350fF
Figure 4-2: Vgain(K×L): the cell current gain lookup table used in our model
Figure 4-3: ρ_v for: (a) a noiseless waveform (b) a typical crosstalk-induced noisy waveform
Figure 4-4: gcdm: the actual and equivalent output voltage waveforms
Figure 5-1: i_out is calculated as a function of v_in and θ_c
Figure 5-2: An example of the ROCC-based cell delay model used on a typical ramp input. θ_c and i_out have been scaled up to improve visibility
Figure 5-3: The actual and equivalent waveforms by our model for waveforms subjected to (a) one aggressor, as well as (b) and (c) three aggressors
Figure 5-4: Absolute errors in calculated delays vs. Spice simulation results for ROCC-based model
Figure 5-5: Our proposed current-based circuit model for a logic cell
Figure 5-6: The actual and equivalent waveforms by our model for some crosstalk-induced noisy waveforms
Figure 5-7: Absolute errors in calculated delay for a min size inverter
Figure 5-8: Waveform similarity (mean square error) comparison to Hspice for our model and the KTV model
Figure 6-1: (a) Distributed RC-π model of a crosstalk site, (b) Lumped RC-π model of the crosstalk site
Figure 6-2: Slowdown curves as a function of skew for different C_C
Figure 6-3: (a) Comparison of distributed, RC-π, and 2RC-π models (b) accuracy improvement using our heuristic
Figure 7-1: (a) Crosstalk effect at the output of a cell

List of Tables

Table 2-1: Crosstalk-affected delay sensitivity to the input transition time (in ps)
Table 2-2: Crosstalk-affected output delay and transition time sensitivity to timing and circuit parameters
Table 3-1: The mean and standard deviation of the resistance and capacitance line variations depicted in Figure 3-2
Table 4-1: Experimental results for gcdm vs. other techniques
Table 6-1: Notation and descriptions
Table 6-2: Extractors modeled in STAX
Table 6-3: Sequences with highest compaction degrees for training circuits
Table 6-4: Efficiency results of STAX

Abstract

This dissertation investigates the effect of capacitive crosstalk on interconnect and logic cell (gate) delay modeling and calculation in state-of-the-art CMOS VLSI designs. First, based on distributed RC-π modeling of an interconnection, a detailed simulation-based study of the propagation delay of a pair of crosstalk-affected interconnect lines is presented. This is followed by a detailed model and delay analysis of coupled interconnect lines subject to manufacturing process and environmental variations. Next, the focus is shifted to delay analysis of logic cells (gates) in a VLSI circuit. Two different approaches to logic cell delay analysis, one motivated by voltage-based modeling of a CMOS gate; the other driven by current-based modeling of the same, are presented and compared. In addition, a new current-based model that accurately models the parasitic and nonlinear behavior of the logic cells and maintains a close-to-Spice accuracy is presented. Finally, the dissertation addresses the problem of developing a crosstalk target set compaction framework to reduce the complexity of the crosstalk ATPG process by pruning non-fault-producing targets.
1 INTRODUCTION

1.1 Background and Discussion of state-of-the-art

As the layout geometries in recent CMOS process technologies scale down to 65nm and below, increases in transistor packing density and operational frequency of VLSI circuits aggravate noise effects, including crosstalk noise. This type of noise is caused by unwanted capacitive coupling between a pair of interconnect lines, referred to as a crosstalk site. Crosstalk noise analysis has hence become a key challenge in the design process of CMOS VLSI circuits. In general, crosstalk noise modeling is based on simplifying assumptions, such as linear resistance modeling of the interconnect driver and a single RC-π model of the interconnect, because developing accurate analytical formulas using distributed RC modeling together with more realistic modeling of the driver cells is very complicated.

Timing analysis is an essential aspect of determining whether a noise source can create a faulty output in a circuit. Input-pattern-dependent circuit-level timing simulation is very accurate but infeasible for large VLSI circuits. Logic cell (gate) level timing analysis is an efficient alternative. Timing analysis is conventionally performed based on signal transition waveforms with simple shapes such as a saturated ramp. With the increasing effect of noise, such as coupling crosstalk, signal transitions of arbitrary shapes may exist. Therefore, new delay models are required that can accurately consider the impact of the shape of the signal transition waveforms in timing analysis.

On the other hand, the rapid increase in different sources of variation, such as process variations, makes accurate circuit analysis and optimization even more demanding. The conventional corner-based techniques that worked in previous technology nodes are not effective in the nanometer era.
For example, it is very difficult to estimate the worst-case possible slowdown considering process variations and the non-monotone properties of the crosstalk effect with respect to process and environmental parameters.

A VLSI circuit may include a large number of coupled interconnect lines with non-zero crosstalk effect. However, only a small set of crosstalk targets can result in faulty circuit behavior. Therefore it is reasonable, and perhaps necessary, to prune as many crosstalk targets as possible before performing more expensive analysis, validation, or testing.

In this thesis we focus on techniques that address the aforementioned issues. More precisely, logic cell and interconnect models are presented that can accurately consider the effect of crosstalk noise in analysis, and more specifically in timing analysis. The concern about process variations is addressed by the development of a statistical model for the coupled interconnect. This model is used to identify the crosstalk sites that may force a circuit into faulty behavior and to prune the crosstalk sites that would not result in faulty circuit behavior.

1.2 Contribution of the Dissertation

To answer the growing concerns explained in the previous section, we have developed models for both logic cells and interconnects that can function in the presence of crosstalk noise. We utilize an accurate distributed RC-π model of the interconnections to obtain realistic results. This model is extended to a statistical one so that it can handle process variations. The distributed modeling makes it possible to consider the effect of local variations and the correlation among the parameters of neighboring wire segments. We have also developed STAX, which is, to the best of our knowledge, the first statistical crosstalk target set compaction tool that can statistically analyze the effect of crosstalk targets on circuit performance and prune the ones that would not result in circuit faults.
The main shortcoming of the existing logic cell delay models is that they ignore the effect of the shape of the signal transition waveforms in delay calculation by approximating an arbitrary waveform with a saturated ramp waveform. Our current-based logic cell delay models resolve this issue by considering the exact shape of the input waveform using pre-characterized lookup tables, with the input voltage and output voltage of the logic cell as the keys to the tables and the cell's parasitic and nonlinear output current as the outputs of the tables. This makes it possible to have close-to-Spice accuracy in the estimation of logic cell delay parameters.

1.3 Outline of the Dissertation

Chapter 2 focuses on crosstalk issues in interconnect lines. It presents the results of a detailed investigation of various crosstalk scenarios in VDSM technologies by using a distributed model of the crosstalk site. As an example of these results, it is shown that the combination of one crosstalk event at a site and another crosstalk event at a different site in the transitive fan-out of the first site may cause a slowdown or speedup of the circuit by an amount that can significantly exceed the sum of the crosstalk effects caused by each site in isolation. As another example, it is reported that the common assumption that zero skew between the input transitions of aggressor and victim lines causes the worst-case crosstalk effect is not always valid, and therefore optimization or test based on such an assumption may be invalid. Equally important, this chapter demonstrates the non-monotone behavior of the crosstalk effect with respect to the skew between the input transitions of aggressor and victim lines. Our sensitivity analysis shows that the conventional assumptions, based on lumped modeling of the crosstalk site, regarding the monotone behavior of crosstalk with respect to wire capacitance may be invalid.
In contrast, it is legitimate to assume a monotone property for crosstalk with respect to the coupling capacitance and wire resistance and to apply linear modeling with respect to those parameters. Finally, the impact of our findings on driver sizing and side-load routing techniques is studied. For example, it is shown that increasing the size of the driver of an aggressor line may not necessarily increase the slowdown of the coupled lines, as is usually assumed by driver sizing algorithms. This chapter is mainly based on our work presented in [49],[53].

Chapter 3 explains our statistical analysis of coupled interconnects, which is an extension of the work described in Chapter 2 to deal with process variations. Process variations have become a key concern of circuit designers because of their significant, yet hard to predict, impact on performance and signal integrity of VLSI circuits. Statistical approaches have been suggested as the most effective substitute for corner-based approaches to deal with the variability of present process technology nodes. This chapter introduces a statistical analysis of the crosstalk-aware delay of coupled interconnects considering process variations. The few existing works that have studied this problem suffer not only from shortcomings in their statistical models, but also from inaccurate crosstalk circuit models. We utilize an accurate distributed RC-π model of the interconnections to be able to model process variations close to reality. The considerable effect of correlation among the parameters of neighboring wire segments is also demonstrated. Statistical properties of the crosstalk-aware output delay are characterized and presented as closed-form expressions. Monte Carlo Spice-based experimental results demonstrate the effectiveness of the proposed approach in accurately modeling the correlation-aware process variations and their impact on interconnect delay when crosstalk is present.
Chapters 4 and 5 focus on logic cell delay modeling and analysis. In Chapter 4 we present our logic cell timing analysis technique, which considers the impact of noise, mainly crosstalk noise, using voltage-based cell delay modeling. Cell delay models are classified into two main groups: voltage-based and current-based models. Voltage-based (current-based) techniques calculate the cell timing information by modeling the output voltage (output current) as a function of the input parameters. Conventional cell delay modeling approaches are not accurate, mainly because they calculate the propagation delay and output transition time of a CMOS logic cell that is subjected to a noisy input waveform by approximating this noisy waveform with a saturated ramp signal and then utilizing cell library delay look-up tables to report the output timing information. Modeling the input waveform as a saturated ramp may, however, result in significant error in the timing parameters of interest, because the actual output waveform can be very different from the one that is implied by a simple saturated ramp input. We present a gain-based cell delay analyzer, called gcdm, to accurately consider the impact of the waveform shape in cell delay analysis. The key contribution of the gcdm cell delay model is that the output waveform is calculated without the need to approximate the input waveform, using the sensitivity of the output to the input voltage waveform. This significantly increases the accuracy of logic cell timing analysis, mainly due to accurate consideration of the shape of the noisy input waveform instead of approximating it with a saturated ramp.

Voltage-based models may not be accurate enough in considering the growing impact of noise in VDSM geometries and below. Current-based models have been suggested as an accurate alternative. They model the cell output current as a function of cell input parameters.
The existing current-based cell delay models are reviewed in Chapter 5 and their shortcomings are highlighted. Chapter 5 also proposes our ROCC-based cell delay analysis technique to address these shortcomings. To consider the nonlinear behavior of the cell, the logic cell is modeled with a current source that is dependent on the input voltage and the output load. To calculate the output current, the rate of output current change, i.e., the time derivative of the output current, is utilized. More precisely, a pre-characterized table of time derivatives of the output current as a function of input voltage and output load values is constructed. The data in this table, in combination with the Taylor series expansion of the output current, is utilized to progressively compute the output current waveform, which is then integrated to produce the output voltage waveform. Experimental results show the effectiveness and efficiency of both delay models.

Chapter 5 also introduces a new current-based model intended for logic cell delay analysis with close-to-Spice accuracy. The existing cell delay analysis techniques in general do not accurately model the parasitic and/or nonlinear behavior of logic cells. The parasitic capacitances of a logic cell are highly dependent on the input and output voltage values of the cell. In this model, the cell capacitive parasitics as well as the nonlinear resistivity of the cell are pre-characterized as a function of the input and the output voltage values. This model can then construct the actual shape of the output voltage waveform for an arbitrary input voltage waveform while maintaining close-to-Spice accuracy.

Chapter 6 deals with STAX, a crosstalk target set compaction framework to reduce the complexity of the crosstalk ATPG process by pruning non-fault-producing targets. In general, existing pruning techniques do not employ their processes in a cost-effective manner. Neither do they handle process variations properly.
To address the first weakness, this chapter presents a framework to determine a sequence of available analysis and pruning tool invocations that prunes as many of the crosstalk targets as possible, as fast as possible. As a result, an initially enormous collection of crosstalk targets is usually reduced to a very small set of targets via a vectorless process. A statistical static timing analyzer is developed and embedded to address the second shortcoming of existing approaches. The statistical crosstalk modeling explained in Chapter 3 is used in the timing analyzer of STAX. Lastly, Chapter 8 gives our concluding remarks and possible extensions to this work.
2 EMPIRICAL STUDY OF PROPAGATION DELAY OF COUPLED INTERCONNECTS
2.1 Introduction
The drastic downscaling of layout geometries to 90nm and below, along with the increase in the operational frequency of VLSI circuits to multiple GHz, has resulted in the aggravation of capacitive crosstalk effects in these circuits. As a result, crosstalk analysis and management have become some of the most important problems in the IC design flow. The crosstalk effect has been analyzed using lumped RC models to find simple closed-form formulas for the crosstalk-induced pulse and slowdown in [15],[17],[64],[72]. However, this model is inaccurate for global interconnects, especially at high clock frequencies. Using a distributed coupling capacitance model produces more accurate and realistic results. Closed-form expressions using 2π and 4π configurations (which are based on linear circuit models) have been developed in [20] and [7], respectively. However, the quality of analysis and optimization tools degrades when linear equations are used to model the nonlinear behavior of drivers. In [3],[31], distributed RC modeling has been used to estimate the pulse induced by the crosstalk effect.
In [34], by using a distributed RC model with a circuit simulation engine, a number of interesting observations have been reported for weak spot defects in the presence of crosstalk noise. Indeed, deriving accurate closed-form expressions based on distributed RC modeling has been a difficult, and so far unsatisfactory, undertaking. As a result, it is common practice to make simplifying, yet realistic, assumptions about the crosstalk noise in order to mitigate the complexity of crosstalk-affected analysis and optimization. This chapter provides the results of extensive simulations using distributed RC modeling of a crosstalk site. In addition, it reports a number of important properties of crosstalk that may be exploited in STA and ATPG tools to increase the accuracy of crosstalk effect analysis and reduce the computational time of these tools. For example, take the common belief, which is often employed in crosstalk fault models [15],[77], ATPG tools [17],[16], and STA tools [13],[33], that the worst-case slowdown of a crosstalk event occurs precisely when the aggressor and victim line inputs switch simultaneously, i.e., their inputs have zero-skew transitions. The assumption is mainly a consequence of using a lumped capacitive model. In contrast, in the present work, by using distributed RC modeling of a crosstalk site, we demonstrate that this assumption is not always valid. More generally, this chapter reports a number of important properties of the crosstalk effect that may be exploited in optimization tools to increase the accuracy of crosstalk effect analysis and reduce the computational complexity of these tools. In addition, we study the sensitivity of the delay and transition time of the output of a crosstalk site to circuit parameters such as its coupling, wire capacitance, and resistance values, and the driver strength, as well as to the timing parameters, namely the input skew and the input transition time.
Timing analysis is an essential aspect of determining whether a crosstalk event can create a faulty output in a circuit. In particular, the signal arrival times and transition times (inverse of slew rates) in a circuit can change as a function of the crosstalk noise that is present in the circuit. Therefore, the accuracy of timing analysis tools strongly depends on the accuracy of arrival time and transition time calculations in the presence of crosstalk noise. In this chapter, we adopt the standard definitions of arrival time and transition time that are commonly used in static and statistical timing analysis and ATPG tools: the arrival time of a signal transition is the time instant at which the signal waveform crosses the 0.5Vdd voltage level, whereas the transition time of a signal transition is derived from the slope of the line connecting two specific points on the noisy waveform, namely the points at which the signal waveform crosses the 0.1Vdd and 0.9Vdd voltage levels. The skew between two signal transitions is the difference between their arrival times.
All our experiments are performed in Hspice and use configurations that are similar to the one depicted in Figure 2-1(a). In this configuration, the inverter 4INV_x is fed by a long interconnect line which is a potential crosstalk victim (see footnote 1). Aggressor and victim lines run parallel to one another.
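The arrival time, transition time, and skew definitions adopted in this chapter can be illustrated with a minimal Python sketch. The function names, the piecewise-linear sampled-waveform representation, and the choice of the first crossing of each voltage level are assumptions made here for illustration only; they are not taken from the Hspice setup used in the experiments.

```python
# Illustrative sketch only: helper names and the waveform format are hypothetical.

def crossing_time(times, volts, level):
    """First time at which a piecewise-linear waveform crosses `level`."""
    for i in range(len(times) - 1):
        v0, v1 = volts[i], volts[i + 1]
        if v0 != v1 and (v0 - level) * (v1 - level) <= 0:
            frac = (level - v0) / (v1 - v0)  # linear interpolation in the segment
            return times[i] + frac * (times[i + 1] - times[i])
    raise ValueError("waveform never crosses the requested level")

def arrival_time(times, volts, vdd=1.2):
    """Time of the 0.5*Vdd crossing."""
    return crossing_time(times, volts, 0.5 * vdd)

def transition_time(times, volts, vdd=1.2):
    """Time between the 0.1*Vdd and 0.9*Vdd crossings (rising or falling)."""
    t10 = crossing_time(times, volts, 0.1 * vdd)
    t90 = crossing_time(times, volts, 0.9 * vdd)
    return abs(t90 - t10)

def skew(arrival_a, arrival_b):
    """Skew between two transitions, taken as A(a) - A(b)."""
    return arrival_a - arrival_b

# Two saturated ramps (times in ps, voltages in V); made-up sample data.
rise_t, rise_v = [0.0, 100.0, 200.0], [0.0, 1.2, 1.2]
fall_t, fall_v = [25.0, 125.0, 225.0], [1.2, 0.0, 0.0]
```

For a noisy waveform that crosses a level more than once, a tool could equally use the last crossing; the first-crossing choice above is only one possible convention.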
Every 100μm-long and 0.2μm-wide wire segment is modeled by a single stage of an RC-π structure. For example, for a 1000μm wire pair we use 10 RC-π stages, as depicted in Figure 2-1. The coupling of each stage is modeled by C_m; clearly, for a 1000μm wire pair the total coupling value is 10 times the C_m value. We perform our circuit simulations with a 0.13μm technology and use standard cells from a 130nm, 1.2V production cell library (see footnote 2). The sheet resistance of metal interconnect in this technology is 0.074Ω/square. The unit line capacitance is 22.6pF/m. Therefore, for a 1000μm wire, the total line resistance is 370Ω and the total self capacitance (capacitance to ground) of each line is 22.6fF. From now on, we will refer to INV_x and INV_y as the line drivers. Similarly, 4INV_x and 4INV_y will be called the line receivers. We will refer to out_x and out_y (in_u and in_v) as the near-end (far-end) of the lines. Either line can be considered as a victim when the other is an aggressor. Figure 2-1(b) shows the side-load, which is used in some of our experiments.
We define the (normalized) sensitivity of variable p to variable q as σ = (q·Δp)/(p·Δq). We say p is insensitive to q when σ≅0; furthermore, p is weakly sensitive to q if 0<σ<0.1, and moderately sensitive if 0.1<σ<1; otherwise we say that p is highly sensitive to q. At times we will refer to Δp as slowdown or speedup of p depending on whether Δp is positive or negative, respectively. We will also use the term crosstalk-affected delay and transition time of node p to refer to the delay and transition time of that node when the impact of crosstalk capacitances is considered.
Footnote 1: The size of eINV_f is e times that of INV_f, where e can be 1, 4, 16, or 64 and f can be x or y.
Footnote 2: We have also performed all of these experiments with a 0.25μm process technology and found similar outcomes. Only the results for 0.13μm are reported here.
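The wire parasitics quoted above and the sensitivity classes can be reproduced with a short Python sketch. The function names are hypothetical, and three choices are assumptions not stated in the text: the tolerance used for "σ≅0", the use of the magnitude of σ, and the assignment of the boundary values 0.1 and 1.

```python
# Illustrative sketch only: names, zero tolerance, and boundary handling are assumptions.

def wire_parasitics(length_um, width_um, sheet_res_ohm_per_sq, cap_pf_per_m,
                    stage_len_um=100.0):
    """Total resistance (ohm), total self capacitance (fF), and RC-pi stage count."""
    squares = length_um / width_um
    r_total = sheet_res_ohm_per_sq * squares
    c_total_ff = cap_pf_per_m * (length_um * 1e-6) * 1e3  # pF/m * m = pF, then pF -> fF
    n_stages = round(length_um / stage_len_um)
    return r_total, c_total_ff, n_stages

def normalized_sensitivity(p, q, dp, dq):
    """sigma = (q * dp) / (p * dq), as defined in the text."""
    return (q * dp) / (p * dq)

def classify(sigma, zero_tol=1e-3):
    """Map sigma to the four sensitivity classes used in this chapter."""
    s = abs(sigma)              # assumption: classify by magnitude
    if s <= zero_tol:           # assumption: tolerance for "sigma approximately 0"
        return "insensitive"
    if s < 0.1:
        return "weakly sensitive"
    if s < 1.0:                 # assumption: 0.1 itself is treated as moderate
        return "moderately sensitive"
    return "highly sensitive"

# Values from the text: 1000um x 0.2um wire, 0.074 ohm/square, 22.6 pF/m.
r_total, c_total_ff, n_stages = wire_parasitics(1000.0, 0.2, 0.074, 22.6)

# Made-up numbers: a 25ps delay change on a 500ps delay for a 100ps change on a 100ps input.
example_sigma = normalized_sensitivity(p=500.0, q=100.0, dp=25.0, dq=100.0)
```

With the values from the text, the sketch gives roughly 370Ω, 22.6fF, and 10 stages, so each RC-π stage carries about 37Ω and 2.26fF.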
Figure 2-1: (a) The configuration used in our experiments (the crosstalk model for long parallel lines); (b) the load configuration used in some of the experiments as the side-load, connected to the intermediate point i of line f, where f ≡ x or y, sub0_x ≡ out_x, and sub0_y ≡ out_y.
The remainder of this chapter is organized as follows. Sections 2.2 to 2.4 focus on the slowdown effect of the crosstalk. More precisely, in Section 2.2, we investigate the sensitivity of the crosstalk-affected delay and transition time of the output of the victim line driver to the input skew, whereas in Section 2.3 we study their sensitivity to the transition time of the inputs of the victim line and aggressor line drivers. Section 2.4 analyzes the crosstalk sensitivity to circuit parameters, namely coupling capacitance, wire capacitance, and wire resistance. Section 2.5 deals with the crosstalk-induced speedup effect. In Section 2.6 we study the driver strength. The interaction of crosstalk sites is discussed in Section 2.7, and Section 2.8 investigates whether side-load routing would be effective in crosstalk reduction. Concluding remarks and a summary are provided in Section 2.9.
2.2 Crosstalk Sensitivity Analysis: Dependence on Input Skew
The sensitivity of crosstalk-induced slowdown to timing parameters, namely input skew and input transition times, is studied in this section. It is shown that crosstalk-induced delay is highly sensitive to the input skew and weakly sensitive to the input transition time. Sensitivity analysis is useful for better understanding the impact of crosstalk noise on circuit performance.
Considering the increase in process variations in current process technologies, the results of the sensitivity analysis may be used to create more accurate statistical models based on the variability of input parameters. As stated earlier, the accuracy of noisy signal arrival time and transition time calculation is crucial in timing analysis; therefore we report the dependence of the delay as well as the transition time of the outputs (out_u and out_v) on the crosstalk noise.
2.2.1 Output Slowdown as a function of Input Skew
To study the sensitivity of the crosstalk-affected output delay to input skew, the coupled lines x and y (shown in Figure 2-1) have been considered. They run in parallel and each of them is 1000μm long. We create signal transitions with opposite directions at the inputs of the line drivers, namely in_x and in_y; hence the signal transitions of both lines will be slowed down. The arrival time of a falling transition at in_y is set to 1000ps and the arrival time of a rising transition at in_x is swept from -1000 to +1000ps; therefore the input skew between in_x and in_y changes from -1000ps to +1000ps. We also set their transition times to 100ps. Both out_u and out_v exhibit a crosstalk-induced slowdown in this case. Figure 2-2 shows the slowdown of out_u (delay of out_u w.r.t. in_x) and out_v (delay of out_v w.r.t. in_y). The coupling capacitance is 300fF (C_m = 30fF). Inverter cells with nearly equal fall and rise times are used for all INV cells in the configuration. We refer to such cells as balanced cells in this chapter. Since the cells in this experiment are balanced cells, the maximum slowdowns at out_u and out_v are very close, e.g., they differ by less than about 75ps in the case of a coupling capacitance of 300fF.
Figure 2-2: Delay from in_x (in_y) to out_u (out_v) as a function of input skew between in_x and in_y.
P1: Crosstalk-affected delay can be highly sensitive to the input skew. In particular, for skew values that are close to the one generating the worst-case delay, a small change in the skew can significantly change the delay. For example, in Figure 2-2, a 25ps change in the input skew can change the delay of the transition at out_u by more than 105ps.
P1 highlights the importance of accurately computing the arrival time of signal transitions at circuit lines in the presence of crosstalk noise.
P2: The worst-case crosstalk slowdown at the output of the victim line receiver occurs at a certain skew, but a significant slowdown (e.g., more than 20% delay increase) occurs over a large range of skews.
One way to reduce the crosstalk effect of a site is to deliberately change the delay of the circuit lines driving the corresponding victim and/or aggressor lines (e.g., by using buffers). This can change the input skew such that the slowdown created by that crosstalk site cannot create any error. P2 shows that in order to significantly reduce the slowdown from its worst-case level, the input skew will have to be changed by a rather large amount.
There is a common belief that is relied on in crosstalk fault models [15],[77], ATPG tools [17],[16], and STA tools [13],[33]. According to this view, the worst-case slowdown of a crosstalk event occurs precisely when the aggressor and victim line inputs switch simultaneously, i.e., the inputs have zero-skew transitions. This concept is mainly a consequence of using a lumped capacitive model for studying the crosstalk effects. However, Figure 2-2 (also Figure 6) shows that this concept may not be true even for two completely symmetric interconnects, i.e., with the same lengths, drivers and receivers, output loads (fan-out), and input transition times. Our experiments confirm that even a zero skew between transitions at the near-end of the lines, i.e., out_x and out_y, may not necessarily create the worst-case crosstalk-induced slowdown.
The reason is that the crosstalk coupling of the aggressor and victim lines is distributed along the length of the lines, and the crosstalk effect at one point of the victim line propagates and affects the subsequent points along the victim line. Therefore, the crosstalk effect at each point of the victim line is the summation of the coupling effects at that point plus the delayed effects propagated from the preceding points. As a result, the maximum crosstalk slowdown occurs over a much wider window of time than is usually assumed (refer to P2).
P3: The maximum crosstalk slowdown does not necessarily occur for the zero input skew condition, even for completely symmetric interconnects.
P3 provides motivation for establishing a framework for the alignment of multiple aggressors and the victim line such that the worst-case crosstalk effect is generated. An algorithm is suggested in [68] to solve the multiple aggressor alignment problem. Unfortunately, this algorithm is based on lumped modeling of crosstalk coupling.
2.2.1.1 Slowdown effect at the far-end of the victim line
In Figure 2-3 we report the slowdown of in_u (in_v) w.r.t. in_x (in_y) and compare this slowdown with the results of Figure 2-2 (slowdown of out_u (out_v) w.r.t. in_x (in_y)).
P4: The delay at the output of the victim line receiver, out_u (out_v), follows the shape of the delay at the far-end of the victim line, in_u (in_v).
2.2.2 Output Transition Time as a function of Input Skew
A new experiment similar to the one described in Section 2.4.1 is set up. The only difference is that now the transition time change at the interconnect output (out_u/out_v) due to the crosstalk effect is simulated. Figure 2-4 shows the dependence of the transition time of out_u and out_v on the input skew. The following summarizes the observations made from this experiment.
Figure 2-3: Delay from in_x (in_y) to in_u (in_v) and from in_x (in_y) to out_u (out_v) as a function of input skew.
P5: The output transition time can be highly sensitive to the input skew. In particular, for skew values that are close to the one generating the worst-case increase in transition time, a small change in the skew can significantly change the transition time. For example, in Figure 2-4 a change of less than 20ps in skew can result in an increase of more than 200ps in the transition time of out_u.
P6: The maximum transition time at the output of the victim line occurs for a certain input skew, with a significant increase in transition time occurring over a large range of input skew values.
P7: The maximum transition time at the output of the victim line receiver does not occur for zero input skew, even for completely symmetric interconnects.
Figure 2-4: Transition times of out_u and out_v as a function of input skew.
2.2.2.1 Transition time change at the far-end of the victim line
In Figure 2-5 we compare the transition time of the signal transitions at the far-end of the victim line, i.e., in_u (in_v), with that of the output of the victim line receiver, i.e., out_u (out_v). In contrast to what we observed in Figure 2-3 for slowdown, the transition time comparison shows different characteristics.
P8: The transition times at the far-end of the victim line and at the output of the victim line receiver are very different in terms of their waveform characteristics. In general, the transition at the input of a gate tends to be smoothed out, and hence the transition at the gate's output will not change as drastically as the gate's input transition.
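The skew-related properties of this section (a worst case at a particular skew, and a wide window of skews with significant slowdown) suggest a simple post-processing step on a skew sweep. The following Python sketch shows one way to do it. The function names are hypothetical, and the sample numbers are made up for illustration; they are not read from Figure 2-2 or any other figure. The 20% threshold follows the example given in P2.

```python
# Illustrative sketch only: function names and the sample data are hypothetical.

def worst_case_skew(skews, delays):
    """Return (skew, delay) of the sample with the largest delay."""
    i = max(range(len(delays)), key=lambda k: delays[k])
    return skews[i], delays[i]

def significant_window(skews, delays, baseline, threshold=0.20):
    """Smallest and largest sampled skew whose delay exceeds baseline by `threshold`."""
    hits = [s for s, d in zip(skews, delays) if d >= baseline * (1.0 + threshold)]
    return (min(hits), max(hits)) if hits else None

# Made-up sweep (ps); the baseline is the delay far away from the worst case.
sample_skews = [-1000, -500, -250, -100, 0, 100, 250, 500, 1000]
sample_delays = [400, 420, 520, 700, 760, 800, 600, 430, 400]

worst = worst_case_skew(sample_skews, sample_delays)
window = significant_window(sample_skews, sample_delays, baseline=400)
```

In this made-up sweep the largest delay is not at zero skew, and the window with more than 20% slowdown spans several hundred picoseconds, which is the kind of behavior P2 and P3 describe.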
2.3 Crosstalk Sensitivity Analysis: Dependence on Transition Time
To study the effect of the transition time of signals at the input of the victim line driver and/or the aggressor line driver, we keep the skew between the transitions at in_x and in_y fixed at zero. We apply a falling transition at in_y and a rising transition at in_x with identical arrival times so that both out_u and out_v will experience crosstalk-induced slowdown. We consider a reasonable range of transition times from 0 to 600ps. We will consider two scenarios for the transition time change. In the first scenario, the transition times of both in_x and in_y are changed. In the second scenario, only one of the input transition times is changed while the other one is kept constant. A balanced inverter cell has been used for both INV cells in the configuration of Figure 2-1.
Figure 2-5: Transition times of in_u (in_v) and out_u (out_v) as a function of input skew.
2.3.1 Both Input Transition Times Change
The transition times of in_x and in_y are identical and vary in lockstep from 0 to 600ps. Figure 2-6 illustrates how the slowdown of the transitions at out_u and out_v changes with the transition times of in_x and in_y. It is seen that a 600ps increase in the input transition time of both in_x and in_y causes only a 145ps slowdown for out_u (with a coupling capacitance value of 300fF). Therefore, assuming equal transition times for the aggressor and victim inputs, the slowdown at the output of the victim line receiver is only weakly sensitive to its input transition time. Comparing P9 with P1, we conclude that the sensitivity of the crosstalk-affected delay to the input transition time is much lower than its sensitivity to the input skew. This has the implication that, as far as crosstalk is concerned, the accuracy of arrival time computation is more important than the accuracy of transition time computation.
Figure 2-7 illustrates how the transition time at the output of the crosstalk site, i.e., out_u/out_v, would change when the transition times of both in_x and in_y change. From this figure, a 600ps increase in the input transition time of both in_x and in_y changes the transition time at out_v by only 10ps.
Figure 2-6: Delay from in_x (in_y) to out_u (out_v) as a function of input transition time (both transition times change) for different coupling capacitance values (0fF, 50fF, 200fF, and 300fF).
2.3.2 Only One Transition Time Changes
We simulate the crosstalk effect by keeping the transition time of the signal transition at in_y constant at 100ps and changing the transition time of in_x from 0 to 600ps. Other parameters have been set similar to those of the experiment reported in Section 2.3.1. Figure 2-8 shows the effect of the transition time change at one input (in_x) on the slowdown seen at the outputs. Considering in_y and out_v as the input and output of the victim line, there will be less slowdown at out_v if the transition time at the input of the aggressor line driver, in_x, increases.
P9: The crosstalk-affected delay and transition time of the output of the victim line are only weakly sensitive to its input transition time.
P10: For a given transition time at the input of the victim line driver, a faster aggressor causes a larger worst-case slowdown.
P11: The maximum slowdown occurs when the victim has the largest transition time whereas the aggressor has the smallest transition time.
Figure 2-7: Transition times of out_u and out_v as a function of input transition time (both change) for different coupling capacitance values (0fF, 50fF, 200fF, and 300fF).
Figure 2-8: Delay from in_x (in_y) to out_u (out_v) as a function of transition time of in_x for different coupling values.
Table 2-1 lists the slowdown for several interesting transition times taken from Figure 2-6 and Figure 2-8. In the last row, the slowdown values for the last three columns (664, 718, and 720) substantiate P10. Comparing the second column entry (920) with the entries of columns 1, 3 and 4 (919, 805, and 785) substantiates P11. In Figure 2-9 we study an effect similar to the one presented in Figure 2-7; however, only the transition time at in_x is changed.
P12: A slower aggressor creates slower transitions at the output of the victim line receiver.
Table 2-1: Crosstalk-affected delay sensitivity to the input transition time (in ps).
transition time(in_x)   600   600   100   0
transition time(in_y)   600   100   100   100
delay(out_u)            919   920   805   785
delay(out_v)            841   664   718   720
Figure 2-9: Transition time of out_u (out_v) as a function of transition time of in_x.
2.4 Crosstalk Sensitivity Analysis: Dependence on Circuit Parameters
In this section we investigate the dependence of crosstalk-induced slowdown on the coupling capacitance value, wire capacitance, and wire resistance.
2.4.1 Dependence on the Coupling Capacitance Value
The coupling (capacitance) value is the main factor in determining the magnitude of any crosstalk noise. Noise sensitivity analysis with respect to the coupling value is thus important to optimization algorithms such as wire spacing and buffer insertion, which aim at reducing the coupling value and thereby minimizing the crosstalk effect. We ran the experiment described in Section 2.1 with different values of the coupling capacitance.
The coupling capacitance value, C_m, was swept from 0 to 50fF, i.e., the total coupling between lines x and y was changed from zero to 500fF. The input skew was changed from -1ns to +1ns. However, for the sake of readability of the plots, only the results for three input skew values (i.e., -1ns, zero skew, and +1ns) are provided. Figure 2-10 shows the slowdown of the outputs, out_u (delay of out_u w.r.t. in_x) and out_v (delay of out_v w.r.t. in_y). Figure 2-11 provides the corresponding transition times. As expected, the crosstalk-affected delay is highly sensitive to the coupling value. Both output delay and output transition time are well approximated by a linear model. An important implication is that statistical analysis of the crosstalk effect as a function of variability in C_m can be accurately modeled by a first-order canonical model [11], i.e., there is no need for more complicated models such as the quadratic ones suggested in [78].
P13: Both the slowdown and the transition time increase at the output of the victim line receiver are highly sensitive to the coupling capacitance value. Furthermore, both of these quantities are well approximated by assuming a linear dependence on the coupling value.
In general, lower input skews generate larger slowdown and longer output transition times; however, as stated before, the zero skew condition does not necessarily generate the worst-case slowdown. Now we provide results for the same experiments as above, but this time with respect to the input skew. For the sake of improved readability, results for only four coupling values are provided. Figure 2-12 shows the slowdown at the outputs, out_u and out_v, whereas Figure 2-13 shows the corresponding data for the transition time increase.
P14: The input skew values that cause the maximum slowdown and largest transition time at the output of the victim line receiver are a strong function of the coupling capacitance value.
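The linear dependence stated in P13 can be checked on swept data with an ordinary least-squares line fit. The sketch below is a minimal pure-Python version; the function names are hypothetical and the sample delays are made up for illustration (they are not taken from Figure 2-10). A small maximum residual relative to the delay values would support using a first-order model of the form delay ≈ intercept + slope · coupling.

```python
# Illustrative sketch only: function names and the sample data are hypothetical.

def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def max_residual(xs, ys, slope, intercept):
    """Largest absolute deviation of the samples from the fitted line."""
    return max(abs(y - (slope * x + intercept)) for x, y in zip(xs, ys))

# Made-up sweep: total coupling (fF) versus output delay (ps).
coupling_ff = [0, 100, 200, 300, 400, 500]
delay_ps = [400, 520, 640, 760, 880, 1000]

slope, intercept = fit_line(coupling_ff, delay_ps)
residual = max_residual(coupling_ff, delay_ps, slope, intercept)
```

The same check could be repeated per input skew value, since the text reports a separate curve for each skew.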
It is worthwhile to mention that although throughout this chapter we report the results of experiments performed on 1000μm parallel lines, we have confirmed that the results are equally valid for shorter lines. As an example, Figure 2-14 shows the crosstalk-affected delay of the outputs versus the input skew for two lines which are 300μm each and run in parallel to one another (modeled by three stages of the RC-π structure). It is seen that the results follow similar patterns to those for the 1000μm-long parallel lines.
Figure 2-10: Delay from in_x (in_y) to out_u (out_v) as a function of coupling value for three different input skews. (Horizontal axis: total coupling in fF; vertical axis: seconds; skew = A(in_x)-A(in_y).)
Figure 2-11: Transition times of out_u and out_v as a function of coupling value for different input skews. (Horizontal axis: total coupling in fF; vertical axis: seconds; skew = A(in_x)-A(in_y).)
Figure 2-12: Delay from in_x (in_y) to out_u (out_v) as a function of input skew between in_x and in_y for different coupling values (0fF, 50fF, 200fF, and 300fF).
Figure 2-13: The transition times of out_u and out_v as a function of input skew for different coupling values (0fF, 50fF, 200fF, and 300fF).
Figure 2-14: Delay from in_x (in_y) to out_u (out_v) as a function of input skew for two 300μm-long parallel lines. (Horizontal axis: input skew in ps; vertical axis: seconds; coupling values 0fF, 15fF, and 90fF.)
2.4.2 Dependence on the Wire Capacitance
We performed a set of experiments similar to the ones in Section 2.4.1 in order to assess the sensitivity of the crosstalk-induced output delay and transition time to the (self) wire capacitance value. The total wire capacitance for both wires is swept from zero to 500fF. Other parameters have been set as explained in Section 2.4.1. Figure 2-15 and Figure 2-16 show the output delay and transition time vs. the wire capacitance value, respectively. As before, for readability purposes, the curves for only one coupling value (300fF) and three input skew values (-1ns, 0, +1ns) are shown. Results for other coupling and input skew values are similar. As expected, both the output delay and transition time exhibit a linear relationship with the wire capacitance.
Figure 2-15: Delay from in_x (in_y) to out_u (out_v) as a function of the wire capacitance for three different input skew values. (Horizontal axis: wire capacitance in fF; vertical axis: seconds; skew = A(in_x)-A(in_y).)
Figure 2-16: Transition times of out_u and out_v as a function of the wire capacitance for different input skew values. (Horizontal axis: wire capacitance in fF; vertical axis: seconds; skew = A(in_x)-A(in_y).)
To evaluate the crosstalk effect when the wire capacitance of only one of the coupled lines varies, we swept the wire capacitance of line x from zero to 500fF while keeping that of line y at a constant value of 22.6fF. Results are provided in Figure 2-17 and Figure 2-18.
Based on a lumped RC model for crosstalk noise analysis, references [15],[17] state that the crosstalk noise is inversely proportional to the wire capacitances of both the aggressor line and the victim line.
However, our experiments show that this is not true. For example, in Figure 2-17, as the wire capacitance of line x increases, the delay of that line also increases, i.e., the crosstalk-induced delay increases monotonically with the victim line wire capacitance. However, the delay of line y decreases for zero skew but increases for large skew values. This highlights the fact that the inverse proportionality relationship of the crosstalk-induced effect to the aggressor line wire capacitance is in general not valid. Similar behavior is seen in Figure 2-18 for the output transition time.
P15: The crosstalk-affected propagation delay is moderately sensitive to the wire capacitance value. In addition, it monotonically increases as the victim wire capacitance increases (victim output: out_u); however, it does not show monotone behavior with respect to the aggressor wire capacitance (victim output: out_v).
In Figure 2-18, even the output transition time of out_u shows a non-monotone relationship with respect to the victim line capacitance, i.e., the assumption that the crosstalk-affected output transition time behaves monotonically with respect to the wire capacitance of the victim or the aggressor line is in general invalid.
Figure 2-17: Delay from in_x (in_y) to out_u (out_v) as a function of wire capacitance of line x for three different input skew values. (Horizontal axis: wire capacitance of line x in fF; vertical axis: seconds; skew = A(in_x)-A(in_y).)
P16: The crosstalk-affected output transition time is moderately sensitive to the wire capacitance value; however, in general it does not exhibit a monotone relationship with respect to the wire capacitance of the victim or the aggressor line.
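The practical consequence of the non-monotone behavior noted in P15 and P16 is that evaluating a quantity only at the two ends of a parameter range can miss an interior maximum. The Python sketch below contrasts an end-point evaluation with a swept evaluation. The function names and the quadratic delay curve are hypothetical and chosen only to be non-monotone; the curve is not a model of the simulated circuit.

```python
# Illustrative sketch only: names and the toy delay curve are hypothetical.

def end_point_max(f, lo, hi):
    """Worst case when only the two ends of the parameter range are evaluated."""
    return max(f(lo), f(hi))

def swept_max(f, lo, hi, steps=100):
    """Worst case over a uniform sweep of the parameter range."""
    return max(f(lo + (hi - lo) * k / steps) for k in range(steps + 1))

def toy_delay_ps(cap_ff):
    """Made-up non-monotone delay curve with a peak inside the range."""
    return 400.0 + 0.4 * cap_ff - 0.001 * cap_ff ** 2

end_points = end_point_max(toy_delay_ps, 0.0, 500.0)
swept = swept_max(toy_delay_ps, 0.0, 500.0)
underestimate = swept - end_points
```

For this toy curve the end-point evaluation reports a smaller worst case than the sweep, so an analysis that relies on monotonicity would be optimistic here.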
2.00E-10 2.20E-10 2.40E-10 2.60E-10 2.80E-10 3.00E-10 3.20E-10 3.40E-10 3.60E-10 0 100 200 300 400 500 second Wire capacitance of line x (fF) out_v out_u skew =0 skew = A(in_x)-A(in_y) -1ns 1ns -1ns skew =0 1ns Figure 2-18: Transition times of out_u and out_v as a function of wire capacitance of line x for different input skew values. 33 The corner-based worst-case analysis used in conventional STA or ATPG tools is generally assumed to be pessimistic. However, the above results show that it can also be optimistic. Due to lack of the monotone behavior property and the misconception about inverse proportionality of the crosstalk effect on the wire capacitances, the corner-based analysis can indeed underestimate the magnitude and severity of the crosstalk problem. For example, from Figure 2-17 the lack of the monotonic behavior for the slowdown w.r.t. the wire capacitance produces around 20ps delay underestimation at out_v for wire capacitance of 50fF (50% error considering the offset delay of 40ps for delay of out_v w.r.t. in_y.) 2.4.3 Dependence on the Wire Resistance We swept the wire resistance of both lines from zero to 500 ohms and ran a similar set of experiments to the one described in Section 2.4.1. Figure 2-19 and Figure 2-20 show the results for a total coupling value of 300fF. The delay of both outputs monotonically increases; however, the output transition time may exhibit non- monotone behavior in some cases (cf. out_v transition time at zero skew in Figure 2-20.) Figure 2-21 and Figure 2-22 contain the data for the case when the wire resistance of only line x is changed while keeping that of line y at a constant value of 370Ω. From Figure 2-21 the crosstalk-affected delay of the victim output (out_v) decreases when the wire resistance of the aggressor line (line x) increases. Also the delay of out_u increases which shows that the delay of the victim line increases with the increase in the victim line wire resistance. 
It is seen that both the crosstalk-affected delay and the transition time of the outputs can be well approximated by linear equations.

Figure 2-19: Delay from in_x (in_y) to out_u (out_v) as a function of the wire resistance for three different input skew values.
(Plot: delay in seconds versus wireline resistance, 0 to 500Ω; curves for out_u and out_v at skew = A(in_x)-A(in_y) of -1ns, 0, and 1ns.)

Figure 2-20: Transition times of out_u and out_v as a function of the wire resistance for different input skew values.
(Plot: transition time in seconds versus wireline resistance, 0 to 500Ω; curves for out_u and out_v at skews of -1ns, 0ns, and 1ns.)

P17: The crosstalk-affected output delay and transition time are weakly sensitive to the wire resistance value. In particular, they monotonically increase as the victim wire resistance increases (victim output: out_u) while monotonically decreasing as the aggressor wire resistance increases (victim output: out_v). In both cases the effect can be well approximated by linear equations.

Figure 2-21: Delay from in_x (in_y) to out_u (out_v) as a function of the wire resistance of line x for three different input skew values.
(Plot: delay in seconds versus wireline resistance of line x, 0 to 500Ω; curves for out_u and out_v at skew = A(in_x)-A(in_y) of -1ns, 0ns, and 1ns.)

Figure 2-22: Transition times of out_u and out_v as a function of the wire resistance of line x for different input skew values.
(Plot: transition time in seconds versus wireline resistance of line x, 0 to 500Ω; curves for out_u and out_v at skews of -1ns, 0ns, and 1ns.)
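As a minimal sketch of how the linear approximation noted in P17 might be checked on swept data, the Python below fits a least-squares line to a delay sweep, reports the largest fitting error, and tests whether the sweep is monotone. The function names and the sample values are hypothetical placeholders; they are not data read from the figures.

```python
from typing import Sequence, Tuple


def linear_fit(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit y ~ slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def is_monotone(ys: Sequence[float]) -> bool:
    """True if the sequence never changes direction (ties allowed)."""
    diffs = [b - a for a, b in zip(ys, ys[1:])]
    return all(d >= 0 for d in diffs) or all(d <= 0 for d in diffs)


def max_abs_residual(xs: Sequence[float], ys: Sequence[float],
                     slope: float, intercept: float) -> float:
    """Largest absolute deviation of the samples from the fitted line."""
    return max(abs(y - (slope * x + intercept)) for x, y in zip(xs, ys))


# Hypothetical sweep: wire resistance (ohm) vs. victim delay (ps).
# These numbers are illustrative only, not taken from the thesis.
resistance_ohm = [0.0, 100.0, 200.0, 300.0, 400.0, 500.0]
delay_ps = [400.0, 412.0, 425.0, 436.0, 449.0, 460.0]

slope, intercept = linear_fit(resistance_ohm, delay_ps)
print(f"slope = {slope:.4f} ps/ohm, intercept = {intercept:.1f} ps")
print("monotone over sweep:", is_monotone(delay_ps))
print("worst fit error (ps):",
      round(max_abs_residual(resistance_ohm, delay_ps, slope, intercept), 2))
```

Running the same check over a wider sweep is one simple way to see whether a linear, monotone model still holds outside the range for which it was characterized.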
The reader is reminded that the range of the parameters has an important role in extracting the properties of crosstalk noise and its impact on circuit performance; it is sufficient to consider realistic ranges of parameters to reduce the complexity of analysis and modeling. The results presented previously are all qualified to the range of parameters considered during the simulation. These results can change if the parameter range is modified. To make this point clearer, let’s suppose that the wire resistance of line x can range from 0 to 5kΩ (notice that the upper end of this range is too high for typical interconnects in VLSI circuits). The last two experiments (as reported in Figure 2-21 and Figure 2-22) are repeated with this extended wire resistance range and the results are provided in Figure 2-23 and Figure 2-24. Notice that the monotone behavior, which was observed for the reasonable range of wire resistances (0 to 500Ω), does not exist for this new range (0 to 5kΩ).

Figure 2-23: Delay from in_x (in_y) to out_u (out_v) as a function of the wire resistance of line x (ranging from 0 to 5kΩ) for three different input skew values.
(Plot: delay in seconds versus wireline resistance of line x, 0 to 5000Ω; curves for out_u and out_v at skew = A(in_x)-A(in_y) of -1ns, 0ns, and 1ns.)

Figure 2-24: Transition times of out_u and out_v as a function of the wire resistance of line x (ranging from 0 to 5kΩ) for different input skew values.
(Plot: transition time in seconds versus wireline resistance of line x, 0 to 5000Ω; curves for out_u and out_v at skews of -1ns, 0ns, and 1ns.)

2.5 Crosstalk Induced Speedup

To study the crosstalk-induced speedup effect, we consider transitions with the same direction of change at inputs, in_x and in_y.
We set the arrival time of a rising transition at in_y to 1000ps and sweep the arrival time of a rising transition at in_x from 0 to 2000ps relative to the same reference, so that the input skew between in_x and in_y changes from -1000ps to +1000ps. We also set the input transition times to 100ps. Figure 2-25 illustrates the speedups that occur at outputs out_u and out_v as the input skew changes. Figure 2-26 illustrates the output transition time vs. the input skew for the speedup case. Comparing Figure 2-25 with Figure 2-2, we find that the maximum speedup at the victim’s output is 233ps whereas the maximum slowdown was 390ps. Figure 2-4 and Figure 2-26 show that the maximum decrease in transition time at the victim’s output is 110ps whereas the maximum increase in transition time for the slowdown case was 390ps. Hence, the amount of speedup for the same configuration is lower than the slowdown, and the transition time change in the speedup case is less than that in the slowdown case. Since both lines make transitions in the same direction, the out_u and out_v curves are symmetric to each other. Other observations are similar to those of the slowdown case.

Figure 2-25: Delay from in_x (in_y) to out_u (out_v) as a function of input skew for different coupling values for the speedup case.
(Plot: delay in seconds versus input skew, -1000 to 1000ps; curves for out_u and out_v at coupling values of 0, 50, 200, and 300fF.)

Figure 2-26: Transition time of out_u and out_v as a function of input skew for different coupling values for the speedup case.
(Plot: transition time in seconds versus input skew, -1000 to 1000ps; curves for out_u and out_v at coupling values of 0, 50, 200, and 300fF.)

Figure 2-25 and Figure 2-26 show that even for transitions in the same direction, zero skew may not create the worst-case slowdown or output transition time. So P3 and P7 must be true even for completely balanced cells with equal rise and fall transition times.
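The observation that zero skew need not be the worst case suggests sweeping the skew instead of evaluating a single point. The sketch below assumes the delay has already been sampled at a set of skew values; the helper names and the numbers are hypothetical and are not taken from Figure 2-25 or Figure 2-26. It locates the sampled worst case and reports how far the zero-skew value falls short of it.

```python
from typing import Dict, Tuple


def worst_case(delay_by_skew: Dict[float, float]) -> Tuple[float, float]:
    """Return the (skew, delay) pair at which the sampled delay is largest."""
    if not delay_by_skew:
        raise ValueError("no samples given")
    skew = max(delay_by_skew, key=lambda s: delay_by_skew[s])
    return skew, delay_by_skew[skew]


def zero_skew_underestimate(delay_by_skew: Dict[float, float]) -> float:
    """Gap between the sampled worst-case delay and the delay at zero skew."""
    if 0.0 not in delay_by_skew:
        raise KeyError("the sweep must include a zero-skew sample")
    _, worst = worst_case(delay_by_skew)
    return worst - delay_by_skew[0.0]


# Hypothetical sweep: input skew (ps) -> victim delay (ps). Illustrative only.
samples_ps = {
    -400.0: 250.0,
    -200.0: 260.0,
    0.0: 310.0,
    100.0: 380.0,
    200.0: 340.0,
    400.0: 255.0,
}

skew, delay = worst_case(samples_ps)
print(f"worst sampled delay {delay} ps occurs at skew {skew} ps")
print(f"zero-skew assumption underestimates by {zero_skew_underestimate(samples_ps)} ps")
```

A sampled sweep only bounds the worst case to the resolution of the skew grid, so a finer grid near the peak would be needed for a tighter estimate.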
2.6 Driver Strength

Driver sizing is considered one of the most effective means of crosstalk reduction in optimization tools [8],[66]. In this section, we first study the effect of unbalanced cells on crosstalk noise and then do sensitivity analysis of this noise with respect to the driver strength.

2.6.1 Dependence on the Drive Strength

To study the behavior of the crosstalk-affected output delay as a function of the driver strength, the size of INV_y is kept constant, while that of INV_x is swept from 0.2 to 10 times the size of INV_y. Note that the sizes of the receivers 4INV_x and 4INV_y are kept constant at 4 times the size of INV_y in this experiment; other parameters are set as explained in Section 2.4.1. Figure 2-27(a) shows the output delay versus the ratio of size(INV_x) to size(INV_y) for coupling capacitance values of 0, 50, and 300fF. The driver sizing techniques usually attempt to take advantage of the trade-off whereby increasing the driver size aggravates the crosstalk effect for coupled lines [8],[66]. This trade-off is reasonable when a lumped RC model is applied in which the drivers are modeled as linear circuit elements. Considering line x as the victim line and line y as the aggressor line, Figure 2-27 confirms that the crosstalk-induced slowdown of out_u will decrease exponentially if the input driver size of the victim line is increased. It also shows that the crosstalk-induced slowdown of out_v is aggravated by an increase in size(INV_x). However, the amount of slowdown aggravation at out_v is much less than (and almost negligible compared to) the slowdown decrease at out_u. This is more remarkable when noticing that the two lines are completely symmetric in everything but the size of the input drivers.
Inspired by the typical concept of fanout-of-4 (FO4) for gate delay estimation, it is reasonable to assume that the driver size of an interconnect is at least one quarter of the size of its receiver. This implies that the minimum ratio of size(INV_x) to size(INV_y) is set to 1. With this sizing constraint, Figure 2-27(b) shows that the so-called trade-off may not always be valid. This is because the increase in crosstalk noise of the coupled line y is very small and quite negligible compared to the crosstalk reduction obtained in line x (even for a large coupling value of 300fF). It is also worth noticing that increasing the driver size ratio beyond a certain point, say 4, can hardly change the crosstalk-affected delay. Figure 2-28 shows similar results for the transition time versus the driver size change. In this case a non-monotone behavior exists.

Figure 2-27: Delay from in_x (in_y) to out_u (out_v) as a function of input driver size ratio.
(Plot: delay in seconds versus size(INV_x)/size(INV_y); curves for out_u and out_v at coupling values of 0, 50, and 300fF. Panel (a): ratio swept from 0.2; panel (b): zoomed in on the reasonable size ratio, 1 to 10.)

P18: Crosstalk-affected delay (transition time) at the output of the victim line receiver can be highly (moderately) sensitive to the driver strength of the victim line, but weakly sensitive to that of the aggressor line.
Figure 2-28: Transition times of out_u and out_v as a function of input driver size ratio.
(Plot: transition time in seconds versus size(INV_x)/size(INV_y); curves for out_u and out_v at coupling values of 0, 50, and 300fF. Panel (a): ratio swept from 0.2; panel (b): zoomed in for a reasonable sizing ratio, 1 to 10.)

2.6.2 Unbalanced Cells

So far we reported experiments on configurations with inverter cells with nearly equal rise and fall times. We call these cells balanced. To see how different rise and fall times may affect the results, we use driver and receiver inverter cells with different pulldown and pullup strengths. We refer to these types of logic gates as unbalanced cells. Figure 2-29 and Figure 2-30 show the delay and transition time change vs. input skew, similar to the configurations of Figure 2-2 and Figure 2-4 respectively, but with unbalanced cells used as line drivers and receivers. The falling transition at in_y occurs at +2000ps whereas the rising transition at in_x occurs between 0 and +4000ps, i.e., the input skew changes from -2000ps to +2000ps. The delay value for very large negative or positive skews actually captures the delay of the interconnect output which is not affected by any crosstalk. For example, the delay of out_u for the skew of -2000ps is around 470ps and that for the skew of +2000ps is around 410ps. The difference between the two delay values is the delay of an interconnect line that is influenced by the voltage level of the other interconnect through the coupling capacitance.

P19: Crosstalk-affected delay and transition time at the output of the victim line receiver are highly sensitive to the ratio of pull-up and pull-down strengths of the inverter cells.
Figure 2-29: Delay from in_x (in_y) to out_u (out_v) as a function of input skew using an unbalanced cell.
(Plot: delay in seconds versus input skew, -2000 to 2000ps; curves for out_u and out_v.)

Figure 2-30: Transition times of out_u and out_v as a function of input skew for an unbalanced cell.
(Plot: transition time in seconds versus input skew, -2000 to 2000ps; curves for out_u and out_v.)

Figure 2-29 contains interesting data to show that the assumption that zero input skew results in maximum crosstalk effect is indeed a misconception (cf. P3). Assume that a test generator which works based on this zero input skew assumption finds a test that excites zero-skew input transitions at the input of the crosstalk site with the slowdown vs. skew curves in Figure 2-29. Assume that if a circuit generates less than 960ps slowdown at that crosstalk site it passes the test. Now, based on the zero input skew assumption, the circuit passes the test, but in fact if the test generator had applied input excitations with a skew of around 300ps, then the crosstalk error would have been observed.

Figure 2-29 also points out a fact about crosstalk, which we refer to as the non-monotone property of the crosstalk effect. Assume that the arrival time of in_x is sped up (e.g., as a result of the speedup effect of a crosstalk site in the transitive fan-in of node in_x) such that the input skew between in_x and in_y is reduced from 400 to 300ps. This skew reduction creates a 650ps increase in the crosstalk-induced slowdown at out_v. Now, looking at the same scenario in the opposite direction, we can see that an input skew increase from 300 to 400ps will reduce the delay at out_v by 650ps. So, in general, circuit scenarios can be found such that a speedup at the input line of a crosstalk site can result in either a speedup or a slowdown effect at the output of the site.
Similarly, an input slowdown may cause an output slowdown or output speedup. In the next section we will further explore the impact of this non-monotone behavior when the crosstalk sites interact with one another.

P20: Crosstalk effect exhibits a non-monotone behavior with respect to the skew between the arrival times of the inputs of the aggressor and victim line drivers.

2.7 Interaction of Sites

The crosstalk-induced speedup or slowdown effects of two crosstalk sites, each similar to the one shown in Figure 2-1, may interact with each other if one is in the transitive fan-out of the other. In this section we show that this interaction can generate a total delay effect that is more significant than the delay effects caused by each site in isolation.

2.7.1 Interaction of Two Slowdown Effects

Assume each site has a total coupling value of 300fF. Assume both sites use the same moderately balanced cell that was used in Section 2.4.1 for all INV cells in the model. Consider a falling and a rising transition at the inputs in_x and in_y of the first crosstalk site. Figure 2-12 showed the slowdown effect vs. skew for this site. From Figure 2-12, a 30ps decrease in the arrival time of the transition at in_x of the first crosstalk site, which is equivalent to a 30ps increase in its input skew, can result in a slowdown of 150ps at out_u of this site. Let’s consider a worst-case scenario where all of this slowdown effect reaches the input of the second crosstalk site. Referring to Figure 2-12, a 150ps change in input skew can cause an output slowdown of up to 400ps. Therefore the total slowdown along the path is 30+150+400 or 580ps. If we study the slowdown effect of the crosstalk sites one at a time, then we will incorrectly conclude that a 30ps change in the input skew of the site will create 150ps slowdown on each site, and thus that the total slowdown of the path is 30+150+150 or 330ps.
Therefore, separate worst-case analysis of the two crosstalk sites would underestimate the total path slowdown by 93%. In addition, we should take into account the transition time change created at the output of the crosstalk sites. For example, in the case that one crosstalk site directly feeds into the other, from Figure 2-13, a 30ps change in the input skew causes a 90ps transition time change at the output of the first site and input of the second site. This in turn creates around 30ps extra slowdown at the output of the second site. This means that the total path slowdown is actually 580+30 or 610ps.

P21: Crosstalk sites along a path may result in a significant increase in circuit delay, which can be much higher than the summation of delay increases caused by each site individually.

2.7.2 Interaction of Slowdown and Speedup Effects

Assume a first site with a total coupling value of 50fF uses the same moderately balanced cell that was used in Section 2.4.1 for all INV cells in the model. Assume rising transitions at the inputs in_x and in_y of the first crosstalk site. Figure 2-25 showed the speedup effect vs. skew for this site. A 240ps decrease in the arrival time of the transition at in_x of the first site, which is equivalent to a 240ps increase in the input skew, can in the worst case cause a 60ps speedup at the output of the site. This speedup is in turn equivalent to a decrease in the input skew of the second site that is in the transitive fanout of the first one. The second site has a total coupling value of 300fF. It uses the unbalanced cell used in Section 2.6.2. Therefore, from Figure 2-29, a 60ps decrease in the input skew can create up to 650ps increase in slowdown. Therefore the total slowdown along the path is 240-60+650 or 830ps.
Studying the second site in isolation, a 240ps increase in the input skew of the second site, which from Figure 2-29 means no slowdown at the output of this site, could be generated by the second site if the first site did not exist. The total slowdown created by each site in isolation is 240-60=220ps. Therefore the total slowdown caused by the interaction of the sites is more than 3.7 times as large as the summation of crosstalk effects in isolation. This example highlights the non-monotone behavior of a crosstalk site described in P12. The key to the synergistic interactions discussed in Sections 2.7.1 and 2.7.2 is that crosstalk-affected delay is highly sensitive to the input skew (refer to properties P3 and P4).

2.8 Side-Load Routing

Consider the side-load of Figure 2-1(b). Assume it has to be routed in connection with line x. In the process it may be connected to any intermediate point along line x, i.e., out_x, sub1_x, … , or in_u. The question is which point gives the best performance in terms of crosstalk-affected delay. The following experiment was conducted to answer this question. We connected the side-load to out_x and then swept the size of the side-load, INV_side, from 0.2 to 4 times the size of the line x driver, i.e., INV_x. We repeated this experiment for the side-load connected to other intermediate points. Figure 2-31 and Figure 2-32 illustrate the output delay and transition time for both lines vs. the input skew for size(INV_side) equal to that of INV_x for three different coupling values (coupling values in the range 0 to 500fF showed similar behavior).

Figure 2-31: Delay from in_x (in_y) to out_u (out_v) as a function of input skew for different side-load connection points for coupling values 0, 50, and 300fF. size(INV_side) = size(INV_x).
(Plot: delay in seconds versus skew (psec) = A(in_x)-A(in_y), -1000 to 1000; curves for out_u and out_v at coupling values of 0, 50, and 300fF.)
Figure 2-33 illustrates the delay for the case of size(INV_side) four times as large as that of INV_x (the transition time figure is not shown since the results are similar to Figure 30). Note that for a certain coupling value, the curves corresponding to different connection points almost fully overlap each other (they are hardly distinguishable from one another), meaning that the connection point of the side-load of the interconnect does not change the crosstalk-induced output delay (and transition time) of the interconnect.

P22: Crosstalk-affected output delay and transition time have zero sensitivity with respect to the connection point of the side-load.

Figure 2-32: Transition times of out_u and out_v as a function of input skew for different side-load connection points for coupling values.
(Plot: transition time in seconds versus skew (psec) = A(in_x)-A(in_y), -1000 to 1000; curves for out_u and out_v at coupling values of 0, 50, and 300fF.)

Figure 2-33: Delay from in_x (in_y) to out_u (out_v) as a function of input skew for different side load locations for coupling values 0, 50, and 300fF. size(INV_side) = 4 × size(INV_x).
(Plot: delay in seconds versus skew (psec) = A(in_x)-A(in_y), -1000 to 1000; curves for out_u and out_v at coupling values of 0, 50, and 300fF.)

2.9 Summary

This section presented a number of key observations obtained from our extensive simulations of crosstalk-induced slowdown and speedup effects in the 0.13µ process technology used in experiments. The distributed modeling of capacitive coupling was used to create more realistic results compared to the previous work in the literature and industry. We reported that the sensitivity of crosstalk-affected delay and transition time of the output of the victim line receiver to the input skew is much higher than that to input transition times (P1, P5, and P9).
We also showed that the concept of zero skew used in ATPG tools for post-silicon testing and characterization, and pre-silicon validation may fail (P3 and P7). Table 2-2 summarizes the sensitivity strength of crosstalk effect to each parameter. Our sensitivity analysis shows that the conventional assumptions based on lumped modeling regarding the monotone behavior of crosstalk with respect to wire capacitance may be invalid (P15 and P16). It is however legitimate to assume a monotone property for crosstalk with respect to the coupling capacitance and wire resistance and apply linear modeling with respect to those parameters (P13 and P17). We also performed some experiments and made some key observations regarding driver sizing and side-load routing, which are useful for crosstalk optimization purposes. For example, we reported cases where increasing the driver size may not aggravate the slowdown of other coupled lines as envisioned by recent driver sizing algorithms (P18). We also identified scenarios where the interaction of two crosstalk sites creates delay effects well in excess of the sum of their individual delay effects (P21). These findings can be applied to improve the accuracy of static timing analysis tools. The results presented in this chapter were reported mainly in [53] and [49]. We plan to extend this work as will be explained in the future work section.

Table 2-2: Crosstalk-affected output delay and transition time sensitivity to timing and circuit parameters.
Parameter | Sensitivity
Input Skew | high
Input Transition Time | weak
Coupling Capacitance | high
Wire Capacitance | moderate
Wire Resistance | weak
Victim Driver Strength | high
Aggressor Driver Strength | weak
Pull-up to Pull-down Ratio | high
Side-Load Connection Point | almost insensitive

3 STATISTICAL ANALYSIS OF COUPLED INTERCONNECTS

3.1 Introduction

The increase in package density as well as the clock frequency of VLSI circuits has made noise, such as the capacitive coupling noise, one of the most challenging problems in the design and verification of modern VLSI circuits. Furthermore, the interconnect lines get thicker and narrower (and longer in the case of global interconnects), which all result in the aggravation of crosstalk noise amplitude and duration, and the circuit faults caused by such noise sources. Therefore, as the VLSI technology scales down, the role of interconnect parasitic effects in signal integrity becomes increasingly significant.

Another unwanted side effect of CMOS process technology scaling is the increase in process variations. Differences between identical features in a certain lithographic process are referred to as process variations. Lithography steps generate more process variations in smaller geometric feature sizes. Therefore, cell and interconnect delay characterization methods should consider the increasing impact of process variations on circuit performance and reliability. In addition to IC manufacturing process variations, environmental variations and device/interconnect aging processes create a rather large deviation of key circuit parameters from their designed values. These phenomena in turn produce timing uncertainty and demand highly sophisticated and robust crosstalk-aware analysis and optimization tools. The conventional corner-based techniques for handling the variability of parameters will not be effective in nanometer technologies due to their highly pessimistic (and sometimes optimistic) views.
Statistical analysis is viewed as an essential methodology for nanometer process technologies, which enables application of the actual statistics of the process technology parameters for the accurate calculation of design characteristics such as delay and noise [67],[73],[74],[79]. Although a great deal of research has been done on statistical static timing analysis, only a few approaches exist in the literature that investigate the impact of process variations on crosstalk and, in turn, on circuit performance. The statistical model of [45] uses a lumped RC model to explore the crosstalk-induced pulse (glitch) effect, where a single resistance is extracted to capture the effect of the total self resistance of the interconnect, regardless of its length. The case for self and coupling capacitances is similar. Also, the correlation between the circuit parameters, such as interconnect line resistance and capacitance, is assumed to be zero. The statistical model proposed in [5] is more sophisticated and uses a circuit model with a higher number of nodes; however, still a single capacitance is extracted to model the total coupling effect, which makes it inappropriate for long interconnect lines. The authors of [14] apply special exponential waveform shapes to analytically study the statistics of the crosstalk effect. However, the exponential-type waveforms cannot accurately represent noise-affected signal waveforms. Additionally, the above approaches are unable to consider the correlation between neighboring wire segments.

The goal in the current chapter is to study the effect of process variations on some existing crosstalk analysis techniques, resolve their shortcomings, and finally propose an efficient model to statistically calculate the crosstalk-aware delay of the interconnect victim line.
More precisely, first a distributed RC-π model is used to accurately capture the statistical variations in the physical dimensions of the interconnect lines and the corresponding electrical parameters. The local effects of process variations on the coupled wire segments and the correlations among variations in neighboring segments are considered in statistical analyses. This information is then used to evaluate the correctness of the existing crosstalk analysis techniques in the presence of process variations using extensive sets of Monte Carlo simulations to calculate the actual statistical distribution of victim line timing parameters. Finally, based on the observations, we propose a set of heuristic solutions for each technique to improve their applicability in statistical analysis of crosstalk effects.

The work described in this chapter is a major extension of the work which was explained in the first chapter on crosstalk analysis [53] where process variations were ignored. The main outcome of this work is the generation of an accurate, yet efficient statistical model for coupled interconnects. This statistical model will be the basis of the crosstalk set compaction which will be explained in the next chapter.

3.2 Interconnect Modeling

3.2.1 Crosstalk Terminology

Capacitive coupling between a pair of interconnect lines can induce spurious pulses and/or cause delay effects. We refer to such effects as crosstalk-induced effects. The portion of the layout where the coupling occurs is referred to as a crosstalk site. Crosstalk-induced slowdown occurs when an aggressor line, A, and a victim line, V, make signal transitions (state changes) in opposite directions. The net effect, in theory, of the coupling between the two lines is that the transition on the aggressor line tends to slow down the transition on the victim line, making it appear to be delayed in time.
The amount of slowdown is the difference between the time the signal transition at the far end of the victim line crosses 0.5V_dd when the aggressor has made a transition in the opposite direction, and that when the aggressor remains quiet. Slowdown is dependent on the victim and aggressor signal transition times, the skew between their signal arrival times, and the parameter values that are reflected in the capacitive and resistive model components. The uncertainty about the crosstalk-induced delay increase/decrease may be due to variations in any of the above parameters.

3.2.2 Coupled Interconnect Characterization/Modeling

Consider a pair of coupled interconnect lines in a given metal layer which lies in between two dielectric plates (cf. Figure 3-1(a), where one segment of the coupled interconnect is shown). The two interconnect lines run in parallel and are capacitively coupled. Either line can be considered as a victim, when the other is an aggressor. The goal is to statistically analyze the slowdown of the victim line that has been induced by the aggressor line. A distributed RC-π model (cf. Figure 3-1(b)) is used to accurately model the abovementioned interconnect line configuration. In this circuit, each RC-π stage represents an interconnect segment of a predefined length, L_seg. The coupling between two interconnect lines along segment i is captured by the coupling capacitance C_mi. The self capacitance and resistance of the victim interconnect in segment i are denoted by C_vi and R_vi, respectively. Note that although the lengths of all wire segments are identical, due to process variations, parameter values for each segment are different from those for other segments.

Figure 3-1: Distributed capacitive modeling of coupled interconnects.
(Panel (a): one segment of the coupled interconnect lines; panel (b): the distributed RC-π circuit model.)

The complexity of the distributed RC-π circuit model significantly limits its application in real-world designs where millions of interconnect lines and hence crosstalk sites are present.
Therefore, circuit designers try to derive the electrical behavior of this complex circuit model by approximating its transfer function using different model order reduction techniques [62],[42].

Reference [17] analyzes the crosstalk-induced effects by using a simple lumped RC model. In this model, each resistor represents the total resistance of each line, whereas two capacitors capture the total self and coupled capacitances along the length of the interconnect line, respectively. The lumped models are highly inaccurate for global interconnects, especially at high clock frequencies. Closed-form expressions using 2π and 4π configurations have been developed in [20] and [7], respectively. However, the quality of their analysis and optimization tools degrades when using linear equations to model the nonlinear behavior of the drivers. In [32] distributed RC modeling has been used to obtain quantitative measures of crosstalk-induced pulse peak and width. However, the authors have not considered the slowdown of the victim’s transition, which occurs when the aggressor switches in the opposite direction to the victim. Generally speaking, the existing models for coupled interconnects tend to be inaccurate when significant process variations exist. On the other hand, the complexity of model order reduction techniques significantly increases when considering process variations [42].

3.2.3 Interconnect Sources of Variation

The variation of physical parameters, such as width and thickness along interconnect lines, is in general due to the imprecision in the IC manufacturing process. This is in turn due to effects of the neighboring interconnect lines, non-uniform metal densities, the Non-Linear Resistance (NLR) effect, the Selective Process Bias (SPB) [59] effect, and thickness variation due to etching and CMP (Chemical Mechanical Polishing). Other sources of variation are die-to-die, wafer-to-wafer, lot-to-lot, and fab-to-fab variations.
Thickness variation modeling is highly dependent on the accuracy of local wire density calculation. The resultant model is expressed in terms of the size of a density box surrounding the critical wire segment. This box typically covers any neighbor wires that can influence the thickness of the critical wire segment. However, wires farther away from the critical wire segment do not contribute to the thickness variation as much as those which are closer. So a weighting function is applied by using a rectangular prism for modeling. The following can be used to compute the effective local density, D_eff:

$$D_{eff} = \sum_{i} d(X_i) \cdot w(X_i) \qquad (3\text{-}1)$$

where X_i is the size of the density box of segment i, w(X_i) is its weighting factor, and d(X_i) is the density of the density box of segment i. The number of boxes (note that the number of boxes is equal to that of wire segments), the size of the density box, X_i, and the weighting factor, w(X_i), are derived from silicon measurements by the semiconductor manufacturer’s technology development group.

The spatial correlation among the variation of neighboring segments is quite important in the crosstalk-induced delay of the victim interconnect. For example, most of the variation resulting from chemical-mechanical polishing (CMP) of the inter-layer dielectric (ILD) is based on systematic spatial effects and varies within-die [47]. The methodology proposed in this chapter uses the spatial information to develop a systematic variation model of the coupled interconnects. The parameters corresponding to every pair of points on the interconnect line are correlated. The correlation relation is a function of the distance between those two points.
More precisely, for each physical parameter, p_i (e.g., thickness), the correlation between the parameter values corresponding to two points along the interconnect length at locations x_1 and x_2 is as follows:

\[ Corr(p_i(x_1), p_i(x_2)) = e^{-|x_1 - x_2| / L_{seg}} \tag{3-2} \]

where p_i(x_1) and p_i(x_2) are the values of parameter p_i at locations x_1 and x_2 along the interconnect. L_seg is a predefined segment length that is assumed to be given. It is found through a series of characterization and extraction experiments by the manufacturer.

3.3 Experimental Results

3.3.1 Statistical Model

To capture the effect of variations of physical parameters such as width, height, and interlayer dielectric thickness on the circuit metrics, the following two-step procedure is used for the calculation of electrical parameters of the distributed circuit model:

• Physical outline generation: Complete physical outlines of the coupled interconnect lines are generated in this step, including the information about their width, height, and interlayer dielectric thickness along their length and their correlation as described in section 3.2.3.
• RC-π stage parameter calculation: The generated physical outline of the coupled interconnects is used to calculate the corresponding electrical parameters for each interconnect segment.

The key advantage of the proposed modeling approach is the ability to locally capture the effect of process variations on each interconnect segment. This is done by directly calculating the corresponding values of local resistance and capacitance of the RC-π model based on the exact information about the actual geometry of the interconnect lines in each segment. This is in direct contrast with previous approaches [5],[14],[45], where a single sample of a given parameter is drawn from the corresponding distribution to extract the electrical parameters of the complete circuit model.
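The physical outline generation step can be sketched as follows. This is a minimal illustration rather than the procedure used for the reported experiments: it assumes equally spaced sample points and a single Gaussian parameter, and it uses a first-order autoregressive recursion so that the sampled values follow the exponential correlation of Equation (3-2). The function names and the numeric values (mean, sigma) are hypothetical.

```python
import math
import random

def correlation(x1, x2, l_seg):
    """Equation (3-2): correlation of a parameter between locations x1 and x2."""
    return math.exp(-abs(x1 - x2) / l_seg)

def sample_outline(mean, sigma, n_points, spacing, l_seg, rng):
    """Draw one correlated Gaussian profile of a physical parameter.

    For equally spaced points the recursion below gives
    Corr(p[i], p[j]) = exp(-|i - j| * spacing / l_seg).
    """
    rho = correlation(0.0, spacing, l_seg)
    values = [rng.gauss(mean, sigma)]
    for _ in range(n_points - 1):
        # Innovation is scaled so that every point keeps standard deviation sigma.
        innovation = rng.gauss(0.0, sigma * math.sqrt(1.0 - rho * rho))
        values.append(mean + rho * (values[-1] - mean) + innovation)
    return values

# Hypothetical example: wire width (um) sampled every 100 um along a 1 mm line.
rng = random.Random(0)
widths = sample_outline(mean=0.20, sigma=0.01, n_points=10,
                        spacing=100.0, l_seg=100.0, rng=rng)
```

Each sampled profile would then be handed to the RC-π stage parameter calculation, one segment at a time.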
Next, each interconnect line is divided into multiple segments of length L_seg = 100μm, and a set of parameters is randomly assigned to each segment based on the assumed distribution. To emulate the variations in the physical outline of the interconnect lines, a Gaussian distribution is used for each physical parameter. The choices of distribution type, its characterizing parameters, and the correlations among physical parameters are based on the information made available by the semiconductor manufacturer as described in section 3.2.3. In the second step, a parameter extraction scheme similar to [19] is used to calculate the values of the different electrical parameters in the RC-π model. Note that although the approach used for parameter extraction may lack absolute accuracy, it provides the required fidelity with respect to different physical dimensions. Moreover, since all of the experiments use the same extraction method, the presented results only capture the differences in the models, and are not impacted by the accuracy of the extraction procedure.

Figure 3-2 shows the line resistance and capacitance distribution of a segment of two 100μm-long coupled interconnect lines in metal-4. Note that none of these distributions is Gaussian: the extraction procedures that transform the physical parameters of interconnect lines into the corresponding electrical parameters are non-linear, and therefore they tend to result in non-Gaussian distributions for the electrical parameters. This is in contrast with the Gaussian-based assumptions typically used in statistical timing analysis methodologies [59],[41]. The best maximum-likelihood fits for these distributions are lognormal distributions, the parameters of which are listed in Table 3-1 with a confidence level of 98%. The corresponding fitted probability distribution functions (pdf) are shown by the red outlines in Figure 3-2.
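As an illustration of the fitting step, a lognormal maximum-likelihood fit reduces to taking the mean and standard deviation of the logarithm of the samples. The sketch below uses synthetic data in place of extracted resistances; the function names and the sample parameters are assumptions made for the example, not values from the experiments.

```python
import math
import random

def fit_lognormal(samples):
    """Return the maximum-likelihood (mu, sigma) of ln(x) for positive samples."""
    logs = [math.log(x) for x in samples]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / len(logs))
    return mu, sigma

def lognormal_mean_std(mu, sigma):
    """Mean and standard deviation of the fitted lognormal in the original units."""
    mean = math.exp(mu + 0.5 * sigma ** 2)
    std = mean * math.sqrt(math.exp(sigma ** 2) - 1.0)
    return mean, std

# Synthetic stand-in for extracted segment resistances (hypothetical values).
rng = random.Random(1)
resistances = [rng.lognormvariate(1.2, 0.05) for _ in range(20000)]
mu_hat, sigma_hat = fit_lognormal(resistances)
mean_r, std_r = lognormal_mean_std(mu_hat, sigma_hat)
```

The second helper converts the fitted log-domain parameters back to a mean and standard deviation in the original units, which is the form in which such a fit is usually tabulated.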
It is mentioned in [59] that variations in the electrical parameters of interconnect may be approximated as normal distributions, with the exception of via and contact resistances, which should be approximated as lognormal distributions. The Gaussian distribution assumption in this case is found to have an error of 3.5%.

Figure 3-2: Resistive and capacitive line distribution for a 0.1mm long metal-4 interconnect.

Table 3-1: The mean and standard deviation of the resistance and capacitance line variations depicted in Figure 3-2.

      Resistance (Ω)   Capacitance (pF)
μ     3.3504           0.6271
σ     0.0861           0.2266

3.3.2 The Simulation Setup

To simulate the crosstalk-aware delay of interconnect lines, a distributed circuit consisting of 10 RC-π stages is used (cf. Figure 3-3). From now on, we will treat the interconnect with input V_in and output V_out as the victim line and the other as the aggressor line. The victim and aggressor lines have drivers Cell_v and Cell_a and receivers 4Cell_v and 4Cell_a, respectively. The cells are taken from a standard 130nm, 1.2V production library.

Figure 3-3: Distributed RC-π modeling of crosstalk site.

To capture the variations of each physical parameter, the two-step procedure described in section 3.3.1 is used to create a large population of samples for each physical parameter, which in turn is transformed into the electrical parameters of the distributed RC-π model. Next, Hspice simulation is performed. To achieve convergence in the desired statistical properties of output variables, Monte Carlo simulation is performed.
Based on a normality assumption for the interconnect delay distribution, a sample size of 2500 is used, i.e., the population generation and electrical parameter extraction steps are iterated 2500 times to achieve convergence in the desired statistical properties for each sample. The number of samples is selected so that a 98% confidence level with 1ps error in the estimates of mean and variance of interconnect delay is achieved.

3.3.3 Statistical Comparison of Crosstalk Models

To show the necessity of a statistical approach, first the statistical model based on the distributed RC-π circuit is compared against the conventional corner-based approach. As seen in Figure 3-4, the corner-based “µ+3σ” value of the victim delay is equal to 435ps, which shows more than 46% pessimism compared to that in our statistical model with distributed circuit modeling (µ+3σ = 290ps).

As discussed earlier, the accuracy of the existing models for coupled interconnects can severely degrade in the presence of process variations. A goal of this work is to investigate the source of inaccuracies for some common crosstalk site models and try to resolve them. The single RC-π model, as well as the 2RC-π model of [36], are considered here as two such models. The statistical distribution of the delay of the victim line for both approaches is illustrated in Figure 3-4. The mean delay is found to be very close to the one found by the distributed model (in fact, the mean error for the single and two RC-π models is 5% and 3.2%, respectively). However, the µ+3σ value for the single RC-π and 2RC-π models is 330ps and 313ps, respectively, meaning there is more than 13% and 7.9% pessimism in the µ+3σ calculation when the single and 2RC-π models are used, respectively.
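The pessimism figures can be checked with a few lines of arithmetic. The sketch below assumes that pessimism is defined as the relative excess of a model's µ+3σ delay over the distributed-model value of 290ps; this definition is an assumption, chosen because it reproduces the quoted 13% and 7.9% figures and is consistent with the "more than 46%" stated for the corner-based approach.

```python
def pessimism(estimate_ps, reference_ps):
    """Relative overestimate of a mu+3sigma delay with respect to a reference."""
    return (estimate_ps - reference_ps) / reference_ps

REFERENCE_PS = 290.0  # mu+3sigma of the distributed RC-pi statistical model
ESTIMATES_PS = {"corner-based": 435.0, "single RC-pi": 330.0, "2RC-pi": 313.0}

# Pessimism of each approach, in percent.
results = {name: 100.0 * pessimism(value, REFERENCE_PS)
           for name, value in ESTIMATES_PS.items()}
for name, pct in results.items():
    print(f"{name}: {pct:.1f}% pessimism")
```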
This error mainly exists because when the extraction tool extracts the parameter values for each segment, it extracts the same value for each segment as long as the topology of the wire does not change; subsequently, these identical values are used by a model order reduction technique to create a reduced model such as the lumped RC or RC-π model. However, in reality the local process variations can cause parameter values to vary between segments (cf. section 3.2.3).

Figure 3-4: Comparison of different approaches with our statistical crosstalk model

3.3.4 Variation Shielding

To improve the accuracy of crosstalk models when used in a statistical analysis methodology, we propose the following heuristic algorithm. First, the physical parameters of the interconnect line are extracted based on the conventional scheme. Next, before applying the model order reduction formulas to find the capacitive or resistive value of each element in the reduced model with respect to those of each segment of the distributed model, we select the mean and variance of the values in each segment as follows. The mean values for all segments are set to the extracted value. The variances are drawn from a family of distributions whose variances decrease geometrically as we proceed towards the far-end of the interconnects. The reason behind this type of variance assignment is that, based on extensive simulations, we have found that the variation of parameters in each segment is affected by a phenomenon that we refer to as variation shielding. By variation shielding, we mean that while moving from an interconnect driver towards the far-end of the interconnect line, the effect of parameter value variations on the output delay is reduced. We model the variation shielding phenomenon for each parameter p by the following expression:

\[ \sigma_p(\text{segment}_i) = \alpha \times \sigma_p(\text{segment}_{i-1}) \tag{3-3} \]

where σ_p(segment_i) is the variance of parameter p of segment i, 1 ≤ i ≤ 10, and α < 1; α is set to 0.95 in our heuristic tool.
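The recursion of Equation (3-3) can be sketched as below. This is an illustrative reading of the heuristic, not the tool itself: the spread of the first (driver-side) segment, `sigma_first`, is a hypothetical input, σ_p is treated as the spread passed to a Gaussian sampler, and every segment shares the extracted value as its mean.

```python
import random

def shielded_sigmas(sigma_first, alpha=0.95, n_segments=10):
    """Equation (3-3): spread of each segment shrinks geometrically toward the far end."""
    sigmas = [sigma_first]
    for _ in range(n_segments - 1):
        sigmas.append(alpha * sigmas[-1])
    return sigmas

def sample_segments(extracted_value, sigmas, rng):
    """One sample per segment; every segment is centered on the extracted value."""
    return [rng.gauss(extracted_value, s) for s in sigmas]

# Hypothetical numbers: a 3.35-ohm extracted segment resistance, spread 0.09 at the driver.
sigmas = shielded_sigmas(sigma_first=0.09)
segment_resistances = sample_segments(3.35, sigmas, random.Random(0))
```

The sampled per-segment values are what would then be passed to the model order reduction formulas in place of the identical extracted values.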
Having selected the values for each segment, the model reduction is applied by using these modified values, and the value of each circuit parameter in the reduced model is determined. We then repeat the experiments corresponding to Figure 3-4 using the values found through our heuristic. The results are shown in Figure 3-5. Compared to Figure 3-4, the pessimism is drastically reduced (e.g., for the case of the 2RC-π model, from 7.9% to 4% error in µ+3σ and from 3.2% to 1.4% error in mean, if α = 0.95). The intuitive reason is that when the distributed parameters are summed into a single value, the variations tend to cancel each other, and thus the pessimism of the conventional approach for the extraction of a single component is reduced.

3.3.5 Analytical Crosstalk-Aware Delay Analysis

The simulation setup of section 3.3.2 is used to derive the mean and variance of the crosstalk-aware output delay of the victim vs. the aggressor and victim line widths. They are shown in Figure 3-6 and Figure 3-7, respectively. Similar simulations have been performed considering different physical and electrical parameters. Each point in these figures is the result of 2500 sampled data in a Monte Carlo based environment. The large sample size of 2500 points guarantees that the sample means have a normal distribution and that the closed-form expressions that model the effect of process variations on delay are accurate.

Figure 3-5: Accuracy improvement of RC-π crosstalk models.

There is a tradeoff between the level of accuracy and the complexity of closed-form expressions. As the number of input parameters increases, lower order models such as linear modeling of variation become more suitable.
According to our experimental setup, we found 2nd order modeling to be the most effective for the distribution properties of the crosstalk-aware output delay and transition time of the victim line:

\[ \text{mean}(\text{delay}) = \sum_{\text{parameter } i} (A_i x_i^2 + B_i x_i) \tag{3-4} \]

\[ \text{variance}(\text{delay}) = \sum_{\text{parameter } i} (C_i x_i^2 + D_i x_i) \tag{3-5} \]

\[ \text{mean}(\text{transition time}) = \sum_{\text{parameter } i} (E_i x_i^2 + F_i x_i) \tag{3-6} \]

\[ \text{variance}(\text{transition time}) = \sum_{\text{parameter } i} (G_i x_i^2 + H_i x_i) \tag{3-7} \]

where the coefficients A_i to H_i are found empirically by using our statistical analysis and curve fitting of the results.

Figure 3-6: Crosstalk-aware victim interconnect delay (mean)

Figure 3-7: Crosstalk-aware victim delay (variance)

3.3.5.1 Sensitivity Analysis

To increase the efficiency of Equations (3-4) to (3-7), we performed an extensive sensitivity analysis of crosstalk-affected delay to all circuit parameters to be able to choose the right model (linear or higher order) with respect to each parameter. The distributed circuit model of Figure 3-3 was used. Figure 3-8 illustrates an example of such experiments, which is the sensitivity of crosstalk-aware delay to the coupling capacitance value. The following observations were made with respect to each parameter:

P1: Both the crosstalk-affected output delay and transition time of the victim line are highly sensitive to the coupling capacitance value. Furthermore, both of these quantities are well approximated by assuming a linear dependence on the coupling value.

P2: Both the crosstalk-affected output delay and transition time are moderately sensitive to the wire capacitance value. The delay monotonically increases as the victim wire capacitance increases; however, it does not show a monotone behavior with respect to the aggressor wire capacitance. Also, the output transition time does not exhibit a monotone relationship with respect to the wire capacitance of the victim or the aggressor line (cf. Figure 3-9).
P3: The crosstalk-affected output delay and transition time are weakly sensitive to the wire resistance value. In particular, they monotonically increase as the victim wire resistance increases, while monotonically decreasing as the aggressor wire resistance increases. In both cases the effect can be well approximated by linear equations.

The above observations are helpful to decide whether a linear or a higher order model should be used with respect to each parameter in Equations (3-4) to (3-7). For example, linear modeling is enough for the resistance and coupling values; however, for wire capacitance a 2nd or higher order model is necessary.

Figure 3-8: (a) Delay (from in_x (in_y) to) out_u (out_v) vs. coupling for three different input skew values. (b) Transition time of out_u (out_v). [Plots of time (s) vs. total coupling (fF) for out_u and out_v at skew = -1ns, 0, and 1ns, where skew = A(in_x) - A(in_y).]

Figure 3-9: Transition times of out_u and out_v vs. wire capacitance of line x for different input skew values. [Plot of time (s) vs. wire capacitance of line x (fF) for out_u and out_v at skew = -1ns, 0, and 1ns, where skew = A(in_x) - A(in_y).]

Figure 3-10 shows the fitted curve for the distribution of the crosstalk-aware output delay of the victim line.

Figure 3-10: Crosstalk-aware output delay distribution for a 1mm long metal-4 interconnect pair

To evaluate the accuracy of our statistical expressions, we tested many cases of coupled interconnects from various sections of an industrial design by using Hspice.
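A closed-form statistical expression of this kind is cheap to evaluate and to fit. The sketch below assumes the per-parameter form a·x² + b·x with no constant term; a parameter that only needs linear modeling (such as resistance or coupling) simply gets a zero quadratic coefficient. The function names and all coefficient values are hypothetical and serve only to show the mechanics.

```python
def evaluate_model(coeffs, x):
    """Sum of a_i * x_i**2 + b_i * x_i over all parameters."""
    return sum(a * xi * xi + b * xi for (a, b), xi in zip(coeffs, x))

def fit_single_parameter(xs, ys):
    """Least-squares fit of y ~ a*x**2 + b*x for one parameter (2x2 normal equations)."""
    s4 = sum(x ** 4 for x in xs)
    s3 = sum(x ** 3 for x in xs)
    s2 = sum(x ** 2 for x in xs)
    t2 = sum(y * x * x for x, y in zip(xs, ys))
    t1 = sum(y * x for x, y in zip(xs, ys))
    det = s4 * s2 - s3 * s3
    if det == 0.0:
        raise ValueError("need at least two distinct nonzero sample points")
    a = (t2 * s2 - t1 * s3) / det
    b = (s4 * t1 - s3 * t2) / det
    return a, b

# Hypothetical sweep of one parameter against simulated mean delay.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x * x + 3.0 * x for x in xs]
a_fit, b_fit = fit_single_parameter(xs, ys)

# Two parameters: one quadratic (e.g., wire capacitance), one linear (e.g., coupling).
mean_delay = evaluate_model([(a_fit, b_fit), (0.0, 1.0)], [1.0, 5.0])
```

Fitting one parameter at a time is a simplification made for brevity; a joint fit over all parameters would solve a larger system of the same kind.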
We then compared the results with those found by using our statistical expressions. Figure 3-11 illustrates the corresponding results. The average and maximum error magnitudes for the mean calculation of crosstalk-aware delay are 1.7% and 5.1%, respectively. The corresponding errors for variance are 1.3% and 3.9%, respectively.

Figure 3-11: Accuracy comparison vs. Hspice: (a) mean interconnect delay, (b) interconnect delay variance

Performing an Hspice-based study of the coupled interconnects while using the distributed RC-π model resolves the shortcomings of the previous approaches, which rely on simpler, yet less accurate, circuit or waveform models. To increase the reliability of our statistical model, we had to simulate our circuit model under the whole range of physical parameters as well as variations in the input timing parameters, namely, skew and input transition times, for many coupled interconnects with different geometries. This kind of extensive simulation needs to be run only once per process technology. The resulting statistical crosstalk-aware delay expressions can subsequently be used throughout the whole design and testing stages.

3.4 Summary

A statistical model was presented in this chapter that captures the effect of process variations on the crosstalk-affected delay of coupled interconnects, considering the large impact of correlation. A local process variation-aware distributed RC-π circuit model was developed for coupled interconnect pairs. Extensive Monte Carlo based Hspice simulations were performed to calculate the statistical properties of crosstalk with respect to variations in physical parameters, and closed-form expressions for mean and variance have been derived. These expressions were shown to have near-to-Hspice accuracy.
A statistical analysis method was also presented which is based on a heuristic that is applied prior to model order reduction so as to modify the values of the parameters such that process variations are properly accounted for. The effectiveness of this heuristic algorithm is confirmed for reduced models, namely, the single and two RC-π crosstalk site models. Most of the theories and results in this chapter have been presented in [58].

4 CROSSTALK-AWARE LOGIC CELL TIMING ANALYSIS USING VOLTAGE-BASED DELAY MODELING

4.1 Introduction

The drastic down scaling of layout geometries to 90nm and below has resulted in a significant increase in the packing density and the operational frequency of VLSI circuits. An unfortunate side effect of this technology advancement has been the aggravation of noise effects, such as capacitive crosstalk noise, in VLSI circuits. This is mainly because the metal wires have become narrower and thicker (and in fact longer in the case of global interconnects) and are laid out closer to one another, which in turn increases the capacitive coupling noise. Furthermore, IC manufacturing process variations, device/interconnect aging phenomena, and dynamic circuit parameter changes (such as power plane fluctuations and temperature gradients in the substrate) give rise to a rather significant deviation of the electrical parameters of the circuit components from their designed (nominal) values. This effect can produce excessive timing uncertainty, which in turn requires sophisticated crosstalk-aware delay analysis techniques and tools to overcome it.

Timing analysis is an essential aspect of determining whether a noise source can create a faulty output in a circuit. In particular, the signal arrival times in a circuit can change as a function of the noise that is present in the circuit.
Input pattern dependent circuit-level timing analysis with tools such as Spice is very accurate, but requires significant computational resources, which makes this approach impractical for large VLSI circuits. Gate-level timing analysis tools such as STA (static timing analysis) and SSTA (statistical static timing analysis) tools are used as efficient alternatives with an acceptable level of accuracy. These tools utilize delay models for both interconnect lines and logic cells. The function of an interconnect delay model is to take as input the transient waveform at the near-end of an interconnect line and produce as output the corresponding waveform at the far-end of the line while accounting for the effect of various noise sources that couple to the line. This process is known as interconnect delay (or timing) analysis. Similarly, the function of a cell delay model is to take a noisy input waveform and produce the waveform for the cell output. This process is known as cell delay (or timing) analysis.

The fact that the interconnect delay dominates the cell delay in modern VLSI circuits has drawn attention toward producing faster and more accurate interconnect delay models. While researchers have produced excellent interconnect delay calculators, they have mainly ignored the deficiencies of conventional cell delay models to such a degree that, at the present time, the main source of inaccuracy in timing analysis tools in the presence of noise is the cell delay modeling and not the interconnect delay calculation.

Cell delay is pre-characterized based on input slew and capacitive output load by using a circuit level timing analyzer such as Spice. Therefore the resulting pre-characterized look-up tables are inherently incompatible with RC/RLC interconnect loads. This incompatibility is resolved by finding an effective capacitive load, which is in some way equivalent to the more complex RC load [22],[38],[46],[61] or RLC load [25],[3].
An iterative or non-iterative approach may be used to calculate the effective capacitance.

Conventional timing analysis tools start with the arrival time and slope (transition time or slew) at the near-end of a line and (by using the interconnect delay model) produce the arrival time and slew at the far-end of the interconnect and then (by using the cell delay model) produce the output delay and slew times for a fanout cell which is driven by the far-end of that line. Most timing analysis tools model the noisy signal at the input of a cell with a single reference point, i.e., an input arrival time, and a constant slope, i.e., an equivalent input slew. This implies that the noisy waveform is modeled by an equivalent line with a certain arrival time and slew. STA commonly uses the minimum and maximum arrival times and the fastest and slowest slews for each line in the circuit and applies them to the model of the component driven by that line to find the bounds on the arrival time and slew of the output line of that component [9],[13],[63]. The interconnect model should account for the worst-case noise-induced slowdown and speedup in the calculation of bounds for the interconnect far-end [75]. In the case of crosstalk noise, the arrival times and slews of the aggressor lines should be chosen so as to give rise to the worst case slowdown or speedup at the far-end of the line [28],[68],[80]. The calculation of the output bounds from the input bounds is also referred to as propagation. The propagation starts at the circuit primary inputs and concludes at the primary outputs. The upper and lower bound arrival times and slews are then used to verify whether a circuit under design (pre-silicon) or test (post-silicon) meets the desired timing constraints. References [28],[68],[75] focus on the interconnect delay propagation. Similar to [23],[30], this chapter focuses on the cell delay propagation of noisy inputs.
The problem may be stated as follows: Given a noisy voltage waveform at the input of a cell, statically determine the output voltage waveform which has the minimum error with respect to the actual output waveform. We point out that this problem statement is more general than the conventional statement: Given a noisy waveform at the input of a cell, find an equivalent input voltage waveform that, when applied to the cell, generates an output waveform which is as close as possible to the actual output waveform in terms of its arrival time and slew.

Consider the configuration of Figure 2-1 in a 0.13µ process technology, where an inverter (4INV_x) is fed by a long interconnect line that is a potential crosstalk victim. Aggressor and victim lines run in parallel and are modeled by using a π structure. Each π stage is 100µm long. We use standard inverter cells of an industrial 0.13µ cell library in our experiments. Figure 2-2 shows the crosstalk-induced slowdown as a function of the skew between the victim and aggressor arrival times at their driver inputs (in_x and in_y). The arrival time of a signal transition at a node w is denoted by AR(w). An input skew of less than 25ps between the victim and the aggressor can create a slowdown of more than 200ps. This implies that a relatively small arrival time miscalculation (e.g., as much as 25ps) at the near-end of a capacitive crosstalk site can result in a large error at its far-end (200ps over/under-estimation). This in turn can significantly increase the error in the arrival time calculation of the gate that is fed by this crosstalk site. Note that any inaccuracy in a stage can be magnified when propagated through the following stages of a circuit. Hence it is crucial to calculate the arrival times very accurately in the presence of crosstalk noise.
The interesting fact about the shape of the waveform is that different voltage waveforms with identical arrival time and slew at the input of a cell can result in very different propagation delays through that cell. This is because the exact shape of the input voltage waveform can greatly influence the cell output waveform behavior. Generally speaking, as the crosstalk noise becomes more significant in current technologies, using only a reference point (arrival time) and a constant slope (slew) to convey the timing information for a signal transition adversely impacts the robustness of timing analysis tools. Hence the shape of the waveform should be considered more effectively.

The cell delay modeling techniques can be classified into two general groups, voltage-based and current-based ones, according to whether they compute the output voltage or the output current. In this chapter, we present some new voltage-based techniques to model the input waveform in the presence of noise such that the estimated output is as close as possible to the actual one. Without any additional library characterization, we define the sensitivity of the output to the noisy input, i.e., the derivative of the output voltage waveform with respect to the noisy input voltage waveform. The sensitivity is then used to model the effect of the shape of the input waveform on the output waveform. This information may be subsequently utilized to generate an equivalent linear waveform as required by conventional timing analysis tools. Our cell delay calculation techniques are capable of directly building the output waveforms without the need to create an equivalent input waveform as is done by conventional techniques. However, they can also generate the equivalent input line if that is indeed required by the conventional timing analysis tools. Our cell modeling approach is simple and quite efficient to implement.

The remainder of this chapter is organized as follows.
In section 4.2 we review the previous voltage-based approaches for cell delay modeling. Section 4.3 describes gcdm, our voltage gain-based techniques. Sections 4.4 and 4.5 present the experiments and conclusions, respectively.

4.2 Previous Voltage-based Cell Delay Modeling Techniques

The conventional logic cell delay analysis techniques model the waveform by an equivalent linear waveform that has a constant slope and a certain arrival time, because the model should match the current gate delay libraries, which have two-dimensional lookup tables with the input slew and output load as their key. The tables are utilized to estimate the arrival time and slew of the signal transition at the output of the gate. Hence the objective is to find an equivalent input line (denoted by Γ_eff in this report) such that, when applied to the input of a gate, it generates an output waveform that matches the actual waveform as closely as possible in arrival time and slew.

4.2.1 Point-based Techniques

To construct Γ_eff, techniques in this class generally pass an equivalent line through the latest 0.5V_dd crossing point of the noisy voltage waveform. A technique called noiseless point-based sets the input slew of Γ_eff to be equal to the time from 0.1V_dd to 0.9V_dd of the noiseless waveform, i.e., as if the waveform had not been affected by the noise (this technique is described in [29] as a method which is practiced in industry). Another technique, called noisy point-based, uses the time from the earliest 0.1V_dd crossing point to the latest 0.9V_dd crossing point of the noisy waveform as the effective slew of Γ_eff (this method is described in [9]). Both point-based techniques may be too pessimistic in some cases because they set the 0.5V_dd point of Γ_eff to be the latest 0.5V_dd crossing point. Conversely, they may be too optimistic in other cases because of the way that they calculate the slew of Γ_eff.
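Both point-based constructions reduce to locating threshold crossings on a sampled waveform. The following sketch assumes a rising input given as piecewise-linear samples; the helper names and the sample waveforms are hypothetical and only illustrate the definitions above.

```python
def crossing_times(times, volts, level):
    """Times at which a piecewise-linear waveform crosses `level`."""
    found = []
    for k in range(len(times) - 1):
        v0, v1 = volts[k], volts[k + 1]
        if (v0 - level) * (v1 - level) < 0 or (v1 == level and v0 != level):
            found.append(times[k] + (level - v0) * (times[k + 1] - times[k]) / (v1 - v0))
    return found

def noisy_point_based(times, noisy, vdd):
    """Latest 0.5*Vdd crossing; slew from earliest 0.1*Vdd to latest 0.9*Vdd crossing."""
    arrival = max(crossing_times(times, noisy, 0.5 * vdd))
    slew = (max(crossing_times(times, noisy, 0.9 * vdd))
            - min(crossing_times(times, noisy, 0.1 * vdd)))
    return arrival, slew

def noiseless_point_based(times, noisy, noiseless, vdd):
    """Latest 0.5*Vdd crossing of the noisy input; slew taken from the noiseless input."""
    arrival = max(crossing_times(times, noisy, 0.5 * vdd))
    slew = (min(crossing_times(times, noiseless, 0.9 * vdd))
            - min(crossing_times(times, noiseless, 0.1 * vdd)))
    return arrival, slew

# Hypothetical waveforms sampled every 100 ps, Vdd = 1.2 V.
VDD = 1.2
T = [0.0, 100.0, 200.0, 300.0, 400.0, 500.0]
NOISY = [0.0, 0.8, 0.4, 0.9, 1.2, 1.2]      # crosses 0.5*Vdd three times
NOISELESS = [0.0, 0.6, 1.2, 1.2, 1.2, 1.2]

arrival_noisy, slew_noisy = noisy_point_based(T, NOISY, VDD)
arrival_clean, slew_clean = noiseless_point_based(T, NOISY, NOISELESS, VDD)
```

In this example the two techniques share the same arrival time but report very different slews, which illustrates why either one can be pessimistic or optimistic depending on the noise shape.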
Clearly, it is possible to revise point-based techniques to use a different reference point as the 0.5V_dd crossing point or to calculate the slew of Γ_eff differently. Although this modification may improve the accuracy of the point-based techniques in certain cases, it cannot overcome the fundamental difficulty that arises from the fact that a combination of a single 0.5V_dd crossing point and an effective slope is inadequate to accurately characterize the input waveform for the purpose of gate delay and output slew calculation.

A more sophisticated technique in this class is presented in [68], which uses four-dimensional lookup tables with noise width and height as the two additional dimensions. This technique has three shortcomings: 1) using the noise width and height is not sufficient to model all types of noise distortions; 2) it entails a new and costly cell delay characterization process to initialize the lookup tables; 3) it requires a major change to the STA tools, i.e., 4-D lookup tables must be adopted by EDA vendors and semiconductor manufacturing companies, which is unlikely at this point in time.

4.2.2 Least Square Fitting-based Technique

A technique denoted by LSF (which is explained, but not cited, in [29]) finds Γ_eff such that the sum of the squares of the sampled differences (for P sampling points in the range of interest) between Γ_eff and the noisy voltage waveform is minimized, i.e., a line Γ_eff with coefficients a and b is found such that Equation (4-1) is minimized.

\[ \sum_{t = t_{first}^{noisy}}^{t_{last}^{noisy}} \{ v_{in}^{noisy}(t) - (a \times t + b) \}^2 \tag{4-1} \]

where v_in^noisy(t) is the noisy input voltage value at time t. t_first^noisy and t_last^noisy are selected to only consider the critical region of the noisy waveform, i.e., they are defined as the time instances at which the noisy input voltage crosses the 0.1V_dd level for the first time and the 0.9V_dd level for the last time, respectively.
Note that noise distortions outside the noisy critical region cannot affect the output waveforms and may thus be ignored. We will use the term “critical crossing points of the noisy input” to refer to t_first^noisy and t_last^noisy. LSF can randomly show pessimistic or optimistic behavior, since it is more of a mathematical approach to match a waveform with a line, with no consideration of logic gate behavior.

4.2.3 Weighted Least Squared Error-based Technique

Recently a technique, which we will denote as weighted LSF, has been suggested in [30],[29]. This technique multiplies each squared term in Equation (4-1) by a weight factor. The following explains the two main steps of weighted LSF.

Weighted LSF, Step 1: Finding the derivative for the noiseless input

For each logic cell, the derivative of the output waveform with respect to the noiseless input waveform, ρ_noiseless, is calculated as:

\[ \rho_{noiseless}(t) = \frac{\partial v_{out}^{noiseless}(t)}{\partial v_{in}^{noiseless}(t)} = \frac{\partial v_{out}^{noiseless}(t) / dt}{\partial v_{in}^{noiseless}(t) / dt} \tag{4-2} \]

where v_in^noiseless(t) and v_out^noiseless(t) are the noiseless input and its resulting output voltage values at time t, respectively. Note that ρ_noiseless is equal to the ratio of the output slew to the noiseless input slew. This weight factor is non-zero only for points in a critical region and is considered to be zero outside that region (this region is called the noiseless critical region). The region is defined between t_first^noiseless and t_last^noiseless, which are in turn set to be equal to the 0.1V_dd and 0.9V_dd crossing points of the noiseless input, respectively. We will refer to t_first^noiseless and t_last^noiseless as the “critical crossing points of the noiseless input.”

Weighted LSF, Step 2: Finding Γ_eff

Weighted LSF finds Γ_eff with coefficients a and b such that the following expression is minimized:

\[ \sum_{k=0}^{P-1} \{ \rho_{noiseless}(t_k) \, ( v_{in}^{noisy}(t_k) - (a \times t_k + b) ) \}^2 \tag{4-3} \]

where P is the number of sampling points.
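The minimization in Equation (4-3) has a closed-form solution. The sketch below is an illustration under stated assumptions: the weights ρ_noiseless(t_k) are taken as already computed and supplied by the caller, with zeros outside the noiseless critical region; the function name and the sample data are hypothetical. With all weights equal to one, the same routine solves the unweighted problem of Equation (4-1).

```python
def weighted_lsf_line(times, volts, rho):
    """Return (a, b) minimizing sum_k {rho_k * (v_k - (a * t_k + b))}**2."""
    w = [r * r for r in rho]  # each residual is scaled by rho_k before squaring
    sw = sum(w)
    swt = sum(wk * t for wk, t in zip(w, times))
    swv = sum(wk * v for wk, v in zip(w, volts))
    swtt = sum(wk * t * t for wk, t in zip(w, times))
    swtv = sum(wk * t * v for wk, t, v in zip(w, times, volts))
    det = sw * swtt - swt * swt
    if det == 0.0:
        raise ValueError("need at least two distinct sample times with nonzero weight")
    a = (sw * swtv - swt * swv) / det
    b = (swv - a * swt) / sw
    return a, b

# Hypothetical samples: the last point is a noise bump outside the critical region,
# so its weight is zero and it does not influence the fitted line.
a_w, b_w = weighted_lsf_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 100.0],
                             [1.0, 1.0, 1.0, 0.0])

# Uniform weights reduce to plain least-squares fitting.
a_u, b_u = weighted_lsf_line([0.0, 1.0, 2.0], [0.0, 1.0, 4.0], [1.0, 1.0, 1.0])
```

The zero-weighted sample shows the filtering behavior directly: a distortion that falls outside the region where the weights are non-zero has no effect on the fitted line.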
The noiseless critical region in weighted LSF, [$t_{first}^{noiseless}$, $t_{last}^{noiseless}$], acts as a filter. If the noise distortion occurs outside the noiseless critical region, then it is ignored. Our experiments confirm that limiting the noise consideration to this range causes inaccuracy in weighted LSF. More precisely, the higher the number of aggressors, the higher the probability that weighted LSF underestimates the arrival time and/or slew at the output of the gate by a large amount. Another shortcoming of this technique is that it is meaningful only as long as the noiseless input and the output waveform overlap each other; otherwise the derivative of output to input is undefined. Therefore, weighted LSF cannot be applied to gates with large intrinsic delay, such as multi-stage gates, and/or gates with large fanout loadings, where the input and output transitions do not overlap. (In Section 4.3 we will discuss how our gain-based approach resolves these shortcomings.)

4.2.4 Elmore-based Technique

Inspired by the Elmore delay idea [24], one technique is to pass $\Gamma_{eff}$ through the latest $0.5V_{dd}$ crossing point of the noisy voltage waveform. The slope is then selected such that the area enclosed by that line and the straight lines $v_1(t) = 0.5 \times V_{dd}$ and $v_2(t) = V_{dd}$ is equal to the area enclosed by the noisy input and the lines $v_1$ and $v_2$. This cell delay analysis technique, called Elmore-based, is simple to implement and employ in practice. Our experimental results demonstrate that it generates very accurate results as long as the noisy waveform does not pass through the $0.5V_{dd}$ level more than once. However, in the case of multiple $0.5V_{dd}$ crossing points, the logic cell output may make its transition before the last $0.5V_{dd}$ crossing point, so setting the arrival time of $\Gamma_{eff}$ to the last $0.5V_{dd}$ crossing point of the noisy input introduces pessimism in delay calculation.
Figure 4-1 is an example of one such case. In general, the more times the noisy waveform passes through the $0.5V_{dd}$ level, the higher the probability that this approach produces pessimistic delay estimates.

[Figure 4-1: Elmore-based pessimism: total coupling 350fF. Curves: $v_{in}$, $\Gamma_{eff}$ (Elmore-based), output (Elmore-based), output (Hspice).]

4.3 Our Voltage Gain-Based Cell Delay Modeling Techniques

This section describes our new voltage-based logic cell delay analysis techniques that use the gain (sensitivity) of the cell output voltage over its input voltage to directly calculate the output voltage waveform. As stated earlier, what distinguishes our delay model from all existing techniques is that it builds the output voltage waveform directly instead of first finding an equivalent input ramp. We notice that the output voltage of a cell is a function of the input voltage itself, the output parasitic capacitances, the output load, and the supply voltage, $V_{dd}$. We define the voltage gain, $\rho_v$, as the derivative of the output voltage with respect to the input voltage waveform. Each cell is pre-characterized with a 2-D lookup table with input voltage and effective output capacitance as the keys into the table and $\rho_v$ as the returned value. The effective capacitance captures the effect of the parasitic as well as load capacitances. The output voltage waveform is built directly by using the lookup table gain information and by performing a Taylor series expansion of the output voltage values.

4.3.1 gcdm Cell Delay Model

Each logic cell in the library is pre-characterized with a lookup table, which is used for output voltage calculations of the cell. This table will be referred to as $V_{gain}(K \times L)$, where K and L denote the number of input voltage levels and effective capacitance values, respectively.
$V_{gain}$ contains $\rho_v(V_{in}^i, C_{eff}^j)$, which is simply the derivative of the cell’s output voltage $v_{out}$ with respect to its input voltage at voltage value $V_{in}^i$ when the cell output is connected to an effective load with a value of $C_{eff}^j$. Note that $\rho_v$ is stored only for points in a critical region where the input voltage satisfies $0.1V_{dd} \le V_{in}^i \le 0.9V_{dd}$. As expected, $\rho_v$ is typically zero outside this region (for a well-designed logic cell the voltage gain, i.e., the ratio of output voltage change to input voltage change, is zero near its two stable operating points of logic zero and logic one).

$$\rho_v(V_{in}^i, C_{eff}^j) = \rho_v(i,j) = \left. \frac{\Delta v_{out}}{\Delta v_{in}} \right|_{v_{in}(t) = V_{in}^i} \tag{4-4}$$

$\rho_v$ quantitatively shows how sensitive the output voltage is to the input voltage, at a certain input voltage value and for a certain effective output capacitance value. The $\rho_v(i,j)$ value is stored in row i and column j of the $V_{gain}$ lookup table. The size of the $V_{gain}$ tables for our library cells has been set to (20,5). Figure 4-2 depicts an example of such a lookup table. The iterative effective capacitance calculation technique of [61] is used to find the effective capacitance seen by the output of the cell. We then use the $V_{gain}$ lookup table combined with an interpolation method to find $\rho_v$ for the noisy input waveform of the cell. Figure 4-3 (a) and (b) illustrate $\rho_v$ for a noiseless and a typical crosstalk-induced noisy waveform, respectively.

[Figure 4-2: $V_{gain}(K \times L)$: the cell voltage gain lookup table used in our model. Rows are indexed by the input voltage levels $v_{in}^1, \ldots, v_{in}^K$, columns by the effective capacitance values $C_{eff}^1, \ldots, C_{eff}^L$, and entry (i, j) holds $\rho_v(i,j)$.]

We assume that the noisy input voltage waveform, $v_{in}$, has been characterized by the user (or a timing analysis tool) by specifying the input waveform voltage levels at P equidistant sample points ($t_0, \ldots, t_{P-1}$). gcdm takes this data and uses the $V_{gain}$ table to construct the output waveform by reporting the output voltage levels at P equidistant points.
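The table lookup with interpolation mentioned above can be sketched as a bilinear interpolation over the two table keys. This is only an illustration under stated assumptions: the function names, the 3×2 table, and all numeric values below are hypothetical (the tables in this chapter are 20×5), and the text does not specify the interpolation beyond a weighted average of the nearest entries.

```python
from bisect import bisect_right


def _bracket(axis, x):
    """Return (lo, hi, frac): x lies frac of the way from axis[lo] to axis[hi]."""
    x = min(max(x, axis[0]), axis[-1])  # clamp to the characterized range
    hi = min(max(bisect_right(axis, x), 1), len(axis) - 1)
    lo = hi - 1
    frac = (x - axis[lo]) / (axis[hi] - axis[lo])
    return lo, hi, frac


def lookup_gain(v_levels, c_levels, table, v_in, c_eff):
    """Interpolate rho_v(v_in, c_eff) from a K x L table (rows: v_levels, columns: c_levels)."""
    # Outside the characterized critical region the gain is taken as zero.
    if v_in < v_levels[0] or v_in > v_levels[-1]:
        return 0.0
    i0, i1, fv = _bracket(v_levels, v_in)
    j0, j1, fc = _bracket(c_levels, c_eff)
    # Weighted average along the capacitance axis, then along the voltage axis.
    row0 = table[i0][j0] * (1.0 - fc) + table[i0][j1] * fc
    row1 = table[i1][j0] * (1.0 - fc) + table[i1][j1] * fc
    return row0 * (1.0 - fv) + row1 * fv


# Hypothetical table: input levels in volts, effective capacitances in fF.
V_LEVELS = [0.2, 0.6, 1.0]
C_LEVELS = [10.0, 20.0]
V_GAIN = [[-1.0, -2.0],
          [-5.0, -7.0],
          [-1.0, -3.0]]

rho_mid = lookup_gain(V_LEVELS, C_LEVELS, V_GAIN, 0.4, 15.0)
```

Clamping the capacitance axis while zeroing the gain outside the voltage range is a design choice of this sketch: it mirrors the statement that the gain is stored only inside the critical region, but how out-of-range capacitances are handled is an assumption.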
Therefore, it is easy to see that gcdm can be used as the main delay calculation engine in a timing analysis tool which starts from the primary inputs of the circuit and calculates the voltage waveforms for all intermediate signals and the primary outputs during a linear-time traversal of the circuit netlist. In general, to detect noise, P should be selected such that the time between two consecutive sampling points is no larger than one half of the smallest crosstalk noise width. In practice, we have used a sampling time interval of 50ps, e.g., for an input waveform with a rise time of 1ns, 20 sampling points are used. To be more precise, gcdm builds an equivalent output voltage waveform by using the truncated Taylor series expansion of $v_{out}$:

$$v_{out}(t_{k+1}) = v_{out}(t_k) + \rho_v(t_k) \cdot \left( v_{in}(t_{k+1}) - v_{in}(t_k) \right) + \frac{1}{2} \frac{\Delta \rho_v}{\Delta v_{in}}(t_k) \cdot \left( v_{in}(t_{k+1}) - v_{in}(t_k) \right)^2 \tag{4-5}$$

where $v_{out}(t_0)$ is initialized to zero. $\rho_v(t_k)$ is a more concise notation for $\rho_v(v_{in}(t_k))$. In general, the computed P output values may not be equidistant. This is undesirable when doing the timing analysis of a logic circuit. To avoid this, a set of P equidistant points is computed based on a weighted average of the two nearest values found from Equation (4-5). As pointed out earlier, $\rho_v(v_{in}(t_k))$ is found from the $V_{gain}$ table (if necessary by using an interpolation method based on a weighted average of the two nearest values in the lookup table). $\frac{\Delta \rho_v}{\Delta v_{in}}(t_k)$ is found from the $V_{gain}$ table as follows:

$$\frac{\Delta \rho_v}{\Delta v_{in}}(t_k) = \frac{\Delta \rho_v(v_{in}(t_k))}{\Delta v_{in}(t_k)} = \frac{\rho_v(v_{in}(t_k)) - \rho_v(v_{in}(t_{k-1}))}{v_{in}(t_k) - v_{in}(t_{k-1})} \tag{4-6}$$

$\frac{\Delta \rho_v}{\Delta v_{in}}(t_k)$ is defined to be zero if the input voltage does not change from the previous sampling point to the current one, i.e., when $\Delta v_{in}(t_k) = 0$.
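The stepwise construction of Equations (4-5) and (4-6) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the C implementation evaluated later in this chapter: `gcdm_output` and its arguments are hypothetical names, the gain is supplied as a plain function of the input voltage (standing in for the table lookup at a fixed effective capacitance), and the re-sampling onto equidistant points is omitted.

```python
def gcdm_output(v_in, rho_v, v_out0=0.0):
    """Build the output samples from input samples v_in and a gain function rho_v(v)."""
    v_out = [v_out0]
    for k in range(len(v_in) - 1):
        dv = v_in[k + 1] - v_in[k]
        rho_k = rho_v(v_in[k])
        # Equation (4-6): backward difference of the gain over the input voltage.
        # It is taken as zero when there is no previous sample or the input did
        # not change between the previous sample and the current one.
        d_rho = 0.0
        if k > 0:
            dv_prev = v_in[k] - v_in[k - 1]
            if dv_prev != 0.0:
                d_rho = (rho_k - rho_v(v_in[k - 1])) / dv_prev
        # Equation (4-5): first two terms of the Taylor expansion.
        v_out.append(v_out[k] + rho_k * dv + 0.5 * d_rho * dv * dv)
    return v_out


# Hypothetical check with a gain that is linear in the input voltage,
# rho_v(v) = 2*v, whose exact integral is v_out = v_in ** 2.
samples = [k / 10.0 for k in range(11)]
waveform = gcdm_output(samples, lambda v: 2.0 * v)
```

For this synthetic gain every step after the first is reproduced exactly by the second-order term; only the first step, where Equation (4-6) has no previous sample to difference against, loses accuracy.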
[Figure 4-3: $\rho_v$ for: (a) a noiseless waveform (b) a typical crosstalk-induced noisy waveform. Curves: $v_{in}$, $v_{out}$, $|0.2 \times \rho_v|$.]

Figure 4-4 illustrates the equivalent output voltage waveform for the noisy input waveform of Figure 4-3(b). The voltage waveform of our cell model closely matches the actual waveform generated by Hspice. A Padé approximation can be used to calculate the output voltage, instead of the Taylor series expansion of Equation (4-5). Padé approximations are usually superior to Taylor expansions when functions contain poles, because the use of rational functions allows them to be well-represented. However, our experimental results demonstrate that using the first two terms of the Taylor series to find the output voltage provides sufficient accuracy, and the Taylor series approximation is much more efficient than the Padé approximation. This makes Equation (4-5) more suitable than an equivalent Padé formula for use in a cell delay analysis tool.

[Figure 4-4: gcdm: the actual and equivalent output voltage waveforms. Curves: $v_{in}$, $|0.2 \times \rho_v|$, $v_{out}$ (Hspice), $v_{out}$ (gcdm).]

4.4 Experimental Result for our Voltage Gain-based Cell Delay Analysis Technique

Our proposed cell delay propagation model was written in C and compiled on a Sun Blade 1000 machine. The cells used in the experiments are from a 130nm, 1.2V production cell library using parasitically extracted netlists. An automated test/evaluation system was devised to assess the proposed cell modeling and compare its delay accuracy and run-time with Hspice. We demonstrate the accuracy of our model on realistic circuit configurations that are part of a large high-performance ASIC design obtained from industry.
The circuit configurations appraise our model under different scenarios, i.e., for different numbers of aggressor lines, interconnect lengths, coupling capacitance values, and input slews, to create various noisy waveform shapes. Configuration I is a pair of 1000μm coupled interconnect lines running parallel to one another with a total distributed coupling value of 100fF. Both aggressor and victim line inputs have a slew of 150ps. For all configurations we set the arrival time and slew (transition time) of the victim line input to 1000ps and 150ps, respectively. For configuration I we swept the arrival time of the aggressor line input from 500 to 1500ps in steps of 5ps. Configuration II includes two aggressor lines, each with 100fF total coupling, and a victim, all of which are 500μm long. We maintained a fixed offset of -100ps between the arrival times of the 1st and 2nd aggressor line inputs, while sweeping the arrival time of the 2nd aggressor line input. The two aggressor inputs have slews of 200ps and 400ps, respectively. Configuration III contains three aggressor lines, each 300μm long with 50fF total distributed coupling. The victim line is 500μm long. We maintained a fixed offset of -50ps between the arrival times of the 1st and 3rd aggressor line inputs and -100ps between those of the 2nd and 3rd. The arrival time of the 3rd aggressor line input was then swept from 500 to 1500ps in steps of 5ps. The slews of the three aggressor lines are 200ps, 350ps, and 400ps, respectively. Table 4-1 shows the maximum and average delay errors of the existing voltage-based cell delay models and of gcdm compared to Hspice. The cell delays were calculated as the difference between the $0.5V_{dd}$ crossing points of the input and output waveforms. The maximum error for gcdm was observed in circuit configuration III, where the number of aggressors is the highest. However, the average error of our cell delay model is quite low in all cases.
All of the existing cell delay models approximate the noisy waveform by an equivalent linear waveform, $\Gamma_{eff}$, the arrival time of which is set to the time instance of the latest $0.5V_{dd}$ crossing point of the noisy input waveform. These models differ in the way they calculate the transition time of $\Gamma_{eff}$. The noiseless point-based model sets this value to the time from the $0.1V_{dd}$ to the $0.9V_{dd}$ crossing point of the noiseless input waveform, i.e., as if the input had not been affected by the noise. The noisy point-based model sets the transition time to the time from the $0.1V_{dd}$ to the $0.9V_{dd}$ crossing point of the noisy input waveform. In the case of LSF (Least Squares Fitting), the transition time of $\Gamma_{eff}$ is found such that the sum of the squares of the sampled differences between the equivalent and the noisy input waveforms is minimized. The Elmore-based model sets the transition time of $\Gamma_{eff}$ such that the area enclosed by that line and the straight lines $v_1(t) = 0.5V_{dd}$ and $v_2(t) = V_{dd}$ is equal to the area enclosed by the noisy input waveform and the lines $v_1$ and $v_2$. Finally, the weighted LSF model uses a concept similar to the LSF model; however, the squared terms are weighted by a factor that reflects the gain of the output voltage waveform with respect to the input voltage waveform. The reader is encouraged to refer to [51] for more details regarding these cell models.

Table 4-1: Experimental results for gcdm vs. other techniques. Delay Error (ps) = |Delay(Hspice) – Delay(Method)|

| Method | Config. I Max | Config. I Avg | Config. II Max | Config. II Avg | Config. III Max | Config. III Avg |
| Noiseless Point-based | 81.3 | 29.3 | 134.2 | 48.5 | 153.4 | 55.3 |
| Noisy Point-based | 82.7 | 24.5 | 144.5 | 51.3 | 151.6 | 56.4 |
| Least Square Fitting (LSF) | 75.1 | 30.9 | 110.8 | 45.4 | 124.6 | 49.4 |
| Elmore-based | 82.3 | 14.5 | 145.3 | 33.4 | 166.3 | 35.3 |
| Weighted LSF | 42.4 | 10.3 | 49.3 | 17.4 | 48.5 | 15.6 |
| gcdm | 34.5 | 8.5 | 39.5 | 9.2 | 39.7 | 9.7 |

As can be seen from the results in Table 4-1, gcdm is more accurate than all existing models.
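The size of the improvement over weighted LSF, the most accurate of the conventional techniques, follows directly from the entries of Table 4-1. The short calculation below is only a worked example; the dictionary layout and the helper name `error_reduction` are illustrative choices, while the numbers are copied from the table.

```python
# Delay errors in ps from Table 4-1: {method: {configuration: (max, avg)}}.
TABLE_4_1 = {
    "Weighted LSF": {"I": (42.4, 10.3), "II": (49.3, 17.4), "III": (48.5, 15.6)},
    "gcdm": {"I": (34.5, 8.5), "II": (39.5, 9.2), "III": (39.7, 9.7)},
}


def error_reduction(baseline, improved):
    """Return (absolute reduction in ps, relative reduction in percent), one decimal each."""
    absolute = baseline - improved
    return round(absolute, 1), round(100.0 * absolute / baseline, 1)


for cfg in ("I", "II", "III"):
    base_max, base_avg = TABLE_4_1["Weighted LSF"][cfg]
    new_max, new_avg = TABLE_4_1["gcdm"][cfg]
    print(cfg, "max:", error_reduction(base_max, new_max),
          "avg:", error_reduction(base_avg, new_avg))
```

For configuration II this gives an average error reduction of 8.2ps (47.1%) and a maximum error reduction of 9.8ps (19.9%).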
For example, for configuration II, the average (maximum) delay error reduction is 8.2ps (9.8ps), i.e., a 47.1% (19.9%) delay error improvement compared to weighted LSF, which is the most accurate technique among the conventional ones. The number of input (and output) waveform sampling points, P, affects the accuracy and runtime of cell delay modeling approaches. The runtime can be reduced by using a small P value; however, this reduces the accuracy of the results. In general, P is selected such that crosstalk-induced noise pulses with widths above a certain level are detectable in the input waveforms. It takes around 70µs for gcdm to process an input waveform on a Sun Blade 1000 machine and report the delay of a cell using P = 35. The results reported in this chapter are based on using the first two terms of the Taylor series. We have concluded from our experiments that using the first two terms of the Taylor series provides the best tradeoff between accuracy and run time.

4.5 Summary

We presented gcdm, our voltage-based cell delay modeling technique that utilizes the gain of the cell output voltage with respect to its input voltage to directly compute the output voltage waveform (and hence the associated timing information) without the need to approximate the input voltage waveform. Our techniques can be easily embedded in conventional timing analysis tools. Our experiments demonstrate that our voltage-based techniques are more accurate than other existing cell delay models, with low CPU runtime. The discussion of gcdm in this chapter has mainly been presented in [57]. However, it is worth mentioning that we also developed other voltage-based cell delay analysis techniques, namely, SDP [50],[51] and SGDP [52]. We also developed a hybrid technique that judiciously chooses between the gain (sensitivity)-based and the Elmore-based technique to increase the efficiency of cell timing analysis [50].
5 CROSSTALK-AWARE LOGIC CELL TIMING ANALYSIS USING CURRENT-BASED DELAY MODELING

5.1 Introduction

The goal of cell timing analysis is conventionally stated as: Given a noisy waveform at the input of a cell, find an equivalent input voltage waveform that, when applied to the cell, generates an output waveform which is as close as possible to the actual output waveform in terms of its arrival time and slew. As silicon technology is driven into the nanometer regime, conventional voltage-based lookup tables are nearing the end of their useful life. In [30],[50] the common voltage-based cell timing analyzers are reviewed and their shortcomings are highlighted. In addition to being unable to accurately account for the shape of the noisy waveform, the voltage-based timing analysis tools are inefficient in low power design styles that incorporate two or more logic “islands”, each running at a different operating voltage. Traditional library cell characterization that accurately covers a wide range of operating voltages can be prohibitively time consuming. To consider the shape of the waveform more effectively, the problem is restated in a more general form as follows: Given a noisy voltage waveform at the input of a cell, determine the output voltage waveform which has the minimum error with respect to the actual output waveform. Current-based logic cell timing analysis has been shown to be more accurate than voltage-based analysis [40], [6]. In fact, some industrial current-based timing analyzers, such as CCSM and ECSM, are already in use [6]. Existing current-based approaches may still exhibit large deviations from Spice simulation when presented with complex interconnect models or non-monotonic input voltage waveforms. Their complexity is a barrier to applying them in novel design tools.
In this chapter, we present a rate-of-current-change (ROCC) based logic cell timing analyzer, which utilizes a pre-characterized table of the time derivatives of the output current waveform to compute the output current and subsequently the output voltage waveforms. The data in this table, together with the Taylor series expansion of the output current, is utilized to compute the output current waveform in a step-by-step manner. Having computed the output current, the output voltage waveform can be computed based on the output load. To address the aforementioned more general problem, our model directly builds the output waveforms without the need for creating an equivalent input waveform as is done by conventional techniques. The characterization and application steps are simple and efficient to implement. Furthermore, the application of pre-characterized ROCC parameter values can accurately model the behavior of a logic cell as it receives a noisy input. Experimental results demonstrate that the ROCC-based delay calculator can accurately capture the impact of the shape of the input voltage waveform on the output current waveform and eventually the voltage waveform. We will also review our new current source model that accurately models the nonlinear and parasitic behavior of the logic cell using pre-characterized lookup tables for the model components. The remainder of this chapter is arranged as follows. In section 5.2 the previous logic cell delay modeling techniques, including the current-based ones, are reviewed. Section 5.3 describes our ROCC-based cell delay modeling. Section 5.4 presents the experimental results for the ROCC-based model. A new current source model that accurately captures the nonlinearity of the logic cell in its parasitic capacitance as well as its output resistance is explained in section 5.5. The final section gives a summary of this chapter.
5.2 Previous Non-voltage-based Cell Delay Modeling Techniques

The cell delay modeling techniques can be classified into two general groups, voltage-based and current-based, according to whether they compute the output voltage or the output current. Independently of this classification, cell delay models may apply lookup tables and/or equations. Most of today’s logic cell delay models used in integrated circuit design flows consist of lookup tables or characteristic equations that rely on linear or ramp voltage waveforms and simplified loads as inputs and create linear or ramp voltage waveform approximations as output. We reviewed the various voltage-based cell timing analyzers and discussed their shortcomings and strengths in section 4.2. Two recently developed approaches, i.e., equation-based and current-based techniques, contend to replace voltage-based lookup tables. Both are better able to predict nanometer timing across a range of supply voltages.

5.2.1 Equation-based techniques

The equation-based timing analyzers generally use a polynomial with multiple coefficients relating timing to a variety of input parameters. The goal is to model delay variation due to environmental factors such as supply voltage and substrate temperature. However, it is difficult to fit the actual non-linear behavior of the timing quantity of interest with a polynomial that has a limited (and relatively small) number of terms. In practice, the extreme effort needed to fit equation-based models to real silicon has made this approach unpopular. Sophisticated optimization algorithms are required to perform curve fitting of a polynomial to simulation data, and the accuracy and turnaround time of the library creation are limited by the quality of the optimization algorithms.

5.2.2 Current-based techniques

Current-based cell timing analyzers generally base their delay calculations on the amount of current flow into or out of a cell.
Current-based cell modeling is much easier to characterize than equation-based modeling. Rather than a mathematical abstraction, current-based modeling is a physical model patterned after the actual construction of transistors. It improves delay calculation accuracy by modeling a cell’s output drive as a current source rather than a voltage source. Current sources are more effective at tracking non-linear transistor switching behavior and permit highly accurate modeling of long complex interconnects, which are common in many of today’s largest nanometer low power designs. One example of a current-based cell delay model is proposed in [40], where cells under crosstalk-induced pulse (glitch) attack are modeled by an analytical current model consisting of four parameters, namely a DC current source, a linear resistance, an output capacitance, and the internal delay of the gate. Another current-based model, called Blade [21], consists of a voltage-controlled current source, an internal capacitance, and a time shift of the output waveform. First, $I_{out}(V_{in}, V_{out})$, the amount of current sourced by a cell in response to DC voltage levels on the input and output pins of interest, is determined, and a lookup table (denoted the cell I-V table) is created for each cell by sweeping the DC values of the input and output voltages and measuring the current sourced by the cell output pin. However, a response derived exclusively from the DC-based I-V table results in an overly optimistic timing analysis, as the DC sweep of the input and output ignores the effects of parasitic elements. Therefore, a calibration procedure is performed to account for the cell parasitic effects. This procedure determines an internal capacitive load which, when applied to the Blade model, results in a transient waveform that matches the shape of a Spice-generated waveform for the cell under identical conditions.
Once the waveform shapes have been matched, a time shift is calculated by examining the time difference between the 50% points of the Spice output and the calibrated Blade output. A runtime engine consisting of 31×31 I-V lookup tables and a secant iteration-based nonlinear solver is used to compute the output waveforms. A more complete current-based cell delay technique is presented in [39], where the current drawn by a cell during output switching is computed while considering the Miller effect between the input and output nodes along with the effect of internal parasitic capacitances. As a result, the current drawn by a cell during output switching is essentially represented by the following equation:

$$i_{out} = I(V_{in}, V_{out}) + C_g(v_{in}, v_{out}) \frac{\Delta v_{out}}{\Delta t} - C_M(v_{in}, v_{out}) \frac{\Delta v_{in}}{\Delta t} \tag{5-1}$$

The coefficients of the last two terms in Equation (5-1) capture the current charging an effective capacitance between the cell input and output, i.e., the Miller capacitance, $C_M$, and the current charging an effective ground capacitance at the output, $C_o$. Also, $C_g = C_M + C_o$. $C_M$ and $C_o$ are assumed to be constant and are calculated through a series of transient simulations with voltage transitions applied at the input and output nodes, during which the current flowing through the output node is measured. A 2-D lookup table similar to the I-V tables of Blade [21] is used to store the values of $I(V_{in}, V_{out})$, which are found through a series of DC simulations using Hspice. The output voltage waveform can be iteratively computed using Equation (5-1). The cell characterization of this technique is more accurate than the ones in [40],[21]; however, based on our observation, the Miller and output capacitive effects can vary significantly depending on the cell input and output voltages. The assumption of fixed values for $C_M$ and $C_o$ can give rise to large inaccuracy, especially for complex cells. In [43] this weakness is resolved by introducing a nonlinear output capacitance model.
The nonlinearity of the input of the logic cell is captured by a two-stage RC model. The current source model proposed in [6] models each input and output pin of the cell with a nonlinear resistor and a nonlinear capacitor, each of which depends on all the input voltage values as well as on the output voltage. Since the complexity of the model grows exponentially with the number of inputs, it becomes very complex for logic cells with more than 2 inputs, which makes the model impractical for an STA tool. Finally, none of the above-mentioned models addresses the effect of process variations on cell delay analysis. We have developed a current-based technique, called CGTA [54]. Instead of the rate-of-current-change used in our ROCC-based technique, CGTA uses the gain (sensitivity) of output current to input voltage, defined as the derivative of the output current waveform with respect to the input voltage waveform. The gain is then used to accurately model the impact of the shape of the input voltage waveform on the output current waveform and eventually the voltage waveform. CGTA is close in accuracy to the ROCC-based technique; however, the ROCC-based technique is simpler to use, mainly because it does not share the problem CGTA has with the gain factor where the input voltage does not change. More precisely, the gain factor defined in CGTA must be forced to zero at points where the input voltage change is zero, whereas in the ROCC-based technique the rate-of-current-change is always well defined. Next we will present our ROCC-based cell delay analyzer.

5.3 ROCC-based Cell Delay Model

This section describes our ROCC-based cell delay modeling for the purpose of timing analysis. The key innovation in this model originates from its construction of the output current signal as a function of the input voltage signal. Therefore, we substitute the DC and transient steps of existing current-based cell delay models with a simpler computational model, while maintaining the accuracy.
On the other hand, unlike the voltage-based methods, which first need to find an equivalent linear input waveform, the ROCC-based delay calculator directly builds the output voltage waveform. We utilize the instantaneous rate of current change, $\theta_c$, i.e., the derivative of the output current with respect to time. Each cell is pre-characterized with a 2-D lookup table with input voltage and effective output capacitance as the input keys and $\theta_c$ as the returned value. The output current waveform is computed by using the lookup table data in conjunction with the Taylor series expansion of the output current at time instance $t_k$ around its value at time instance $t_{k-1}$. Given the output current waveform, the output voltage waveform can be computed by taking the load into account.

5.3.1 Impetus for our cell delay model

As described in 5.2.2, the characterization steps in the existing current-based cell timing analyzers are quite involved. Their major source of complexity is the fact that both input and output voltages must be considered as input parameters to the cell model. The DC output current and parasitic effects depend on both the input and output voltages. These voltages must then be swept during the DC characterization step in order to find the DC output current and fill in the I-V lookup tables. However, the parasitic capacitances (i.e., $C_M$ and $C_o$) are assumed to be constant to simplify the model. It is not clear how valid this assumption is for different cells which are subjected to noisy waveforms of various shapes. The transient simulations required to find the constant values of the parasitic capacitances are another source of complexity. To resolve the abovementioned shortcomings, we notice that the output voltage of a cell is a function of the input voltage, the parasitic capacitors, the output load, and the supply voltage, $V_{dd}$.
For a given load and power supply voltage level, it is reasonable to assume that the output voltage and the parasitic capacitances inside the logic cell are only a function of the applied input voltage waveform. Consequently, the output current can be written as a function of the input voltage for a certain load. This observation is important since it enables us to calculate the output current and voltage waveforms, starting from a given input voltage waveform, through a constructive stepwise approach.

5.3.2 Cell characterization and output waveform computation

Each logic cell in the standard library is pre-characterized with a lookup table, which is used for output voltage calculations of the cell. This table will be referred to as $CC_R(K \times L)$, where K and L denote the number of input voltage levels and effective capacitance values, respectively. $CC_R$ contains $\theta_c(V_{in}^i, C_{eff}^j)$, which is simply the rate of change of the cell’s output current, $i_{out}$, with respect to time for input voltage value $V_{in}^i$, when the cell output is connected to an effective load with a value of $C_{eff}^j$:

$$\theta_c(V_{in}^i, C_{eff}^j) = \theta_c(i,j) = \left. \frac{\Delta i_{out}}{\Delta t} \right|_{v_{in}(t) = V_{in}^i} \tag{5-2}$$

The $\theta_c(i,j)$ value is stored in row i and column j of the $CC_R$ lookup table. Note that $CC_R$ tables are created for each pair of input and output pins of the logic cell by a series of transient Hspice simulations, in which noiseless (saturated ramp) input waveforms are applied while the output current change is monitored. This process is repeated for different effective load capacitances. $\theta_c$ is a function of the output load; therefore, an effective output capacitance is used to model the load at the output. The iterative effective capacitance calculation technique of [61] is used to determine the effective capacitance.
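The characterization step around Equation (5-2) amounts to differentiating a simulated output current and indexing the result by input voltage level. The sketch below shows one way to fill a single column of such a table; it is an assumption-laden illustration, not the characterization flow used in this work. The function name `characterize_theta`, the central-difference and linear-interpolation choices, and the synthetic waveform standing in for transient simulation output are all hypothetical.

```python
def characterize_theta(t, v_in, i_out, v_levels):
    """Return theta_c at each requested input level for one effective load.

    t, v_in, i_out are equal-length sample lists from a transient run with a
    rising ramp input. theta_c is estimated by central differences of i_out
    and then linearly interpolated over the input voltage.
    """
    theta = [(i_out[k + 1] - i_out[k - 1]) / (t[k + 1] - t[k - 1])
             for k in range(1, len(t) - 1)]
    v_mid = v_in[1:-1]  # input voltage at the samples where theta is defined
    column = []
    for level in v_levels:
        for k in range(len(v_mid) - 1):
            lo, hi = v_mid[k], v_mid[k + 1]
            if hi > lo and lo <= level <= hi:
                frac = (level - lo) / (hi - lo)
                column.append(theta[k] + frac * (theta[k + 1] - theta[k]))
                break
        else:
            raise ValueError("level %r is not covered by the ramp" % level)
    return column


# Synthetic stand-in for simulator output: a 0 to 1.2 V ramp over one time unit
# and an output current that grows quadratically in time.
time_pts = [k * 0.1 for k in range(11)]
ramp = [1.2 * tk for tk in time_pts]
current = [3.0 * tk * tk for tk in time_pts]
theta_column = characterize_theta(time_pts, ramp, current, [0.3, 0.6])
```

Repeating the call for each effective load capacitance would fill the remaining columns of the table.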
Effective capacitance depends on the input transition time; therefore, given a noisy waveform, the effective capacitance changes for different regions of the waveform due to their different slews. We thus divide the noisy waveform into different parts by performing a piecewise linear approximation of the waveform. Each part of the noisy waveform is approximated by a fixed transition time and therefore has its own effective capacitance. It is empirically found that the effective capacitance calculation converges in fewer than 3 iterations. The effective capacitance calculation is done only for the purpose of obtaining $\theta_c$ values from the $CC_R$ lookup table. Note that when calculating the output voltage, we use the actual load. The ROCC-based model is able to consider arbitrary loads including simple capacitive, RC-π, or more complex interconnect RC models.

[Figure 5-1. $i_{out}$ is calculated as a function of $v_{in}$ and $\theta_c$. The diagram shows a cell driven by $v_{in}$ whose output current $i_{out}$ drives an arbitrary load at node $v_{out}$.]

The transition time (slew) of the input voltage ramp waveform used for cell characterization can affect the $\theta_c$ values. However, the dependency is weak. Therefore, in practice we can perform cell characterization for a single input ramp (with an effective value based on typical waveforms applied to the cell). The input voltage waveform, $v_{in}$, is represented by a time-indexed voltage array, i.e., by using P equidistant sample points ($t_0, \ldots, t_{P-1}$). The cell model takes this data and uses the $CC_R$ table to find the $\theta_c$ value for each point. Figure 5-2 depicts the waveform of the $\theta_c$ values of an inverter in our 130nm library under a ramp input (shown in red). We assume that these samples of the noisy input voltage waveform have been specified by the user (or a timing analysis tool). The output waveforms are constructed by reporting the output current and voltage levels at the equidistant points.
Therefore, it is easy to see that the ROCC-based cell modeling technique can be used as the main delay calculation engine in a timing analysis tool which starts from the primary inputs of the circuit and calculates the voltage waveforms for all intermediate signals and the primary outputs during a linear-time traversal of the circuit netlist. To detect noise, P should be selected such that the time between two consecutive sampling points is no larger than one half of the smallest crosstalk noise width. An equivalent output current waveform is then built, in response to the noisy input voltage waveform, v_in, using the Taylor series expansion of i_out:

$$i_{out}(t_{k+1}) = i_{out}(t_k) + \theta_c(t_{k+1})\,\Delta t + \frac{1}{2}\,\theta_c'(t_{k+1})\,(\Delta t)^2 + \dots + \frac{1}{n!}\,\theta_c^{(n-1)}(t_{k+1})\,(\Delta t)^n \qquad (5\text{-}3)$$

where i_out(t_0) is initialized to zero. θ_c(t_k) is a shorthand notation for θ_c(v_in(t_k)). As pointed out, θ_c(v_in(t_k)) is found from the CC_R table (using interpolation if necessary). The quantity

$$\theta_c^{(n-1)}(t_k) = \frac{\Delta^{(n-1)}\theta_c(t_k)}{\Delta t^{(n-1)}} = \frac{\Delta^{n} i_{out}(t_k)}{\Delta t^{n}}$$

is the n-th rate of change of the output current over time, which can be calculated directly during the initial library characterization process or can be approximately calculated from the entries in the CC_R table. In practice, n=1 is sufficient for accurate timing analysis of a logic cell subjected to a noiseless ramp, and n=2 for a noisy input waveform. Δt = t_{k+1} − t_k is the sampling time. In general, the P computed output values may not be equidistant. This is undesirable when doing the timing analysis of a logic circuit. To avoid this, a set of P equidistant points is computed based on a weighted average of the two nearest values found from Equation (5-3). A Padé approximation can be used to calculate the output current, instead of the Taylor series expansion of Equation (5-3).
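The stepwise update of Equation (5-3) can be sketched in Python as follows. This is a minimal illustration, not the C implementation used in the experiments: the function names are hypothetical, and the caller is assumed to supply, for each step, the value of θ_c and as many of its time derivatives as the chosen truncation order requires (from characterization data or from differences of table entries).

```python
from math import factorial

def taylor_step(i_prev, derivs, dt):
    """Advance the output current by one sampling interval dt.

    derivs[m] is the m-th time derivative of theta_c used for this step
    (derivs[0] is theta_c itself), so len(derivs) is the truncation order n.
    """
    return i_prev + sum(d * dt ** (m + 1) / factorial(m + 1)
                        for m, d in enumerate(derivs))

def output_current(deriv_samples, dt):
    """Build the output current waveform, starting from i_out(t_0) = 0."""
    i_out = [0.0]
    for derivs in deriv_samples:
        i_out.append(taylor_step(i_out[-1], derivs, dt))
    return i_out
```

Passing one-element lists gives the n=1 update, and two-element lists give the n=2 update that the text recommends for noisy inputs.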
Padé approximations are usually superior to Taylor expansions when functions contain poles, because the use of rational functions allows them to be well represented. However, our experimental results demonstrate that using a truncated Taylor series to find the output current provides sufficient accuracy, yet it is much more efficient than using the Padé approximation. This makes Equation (5-3) more suitable than an equivalent Padé formula for use in a logic cell timing analysis tool. Having calculated the output current, the output voltage can be found for an arbitrary load connected to the output. Figure 5-2 illustrates the equivalent output current waveform and also the resulting output voltage waveform for a ramp input as well as the actual waveforms generated by Hspice.

Figure 5-2. An example of the ROCC-based cell delay model used on a typical ramp input. θ_c and i_out have been scaled up to improve visibility. (Plotted curves: v_in, 3E5 × θ_c, −1E5 × i_out (ROCC-based and Hspice), and v_out (ROCC-based and Hspice).)

The underlying principle of our approach to handling compound cells (i.e., multi-stage cells, for example an AND gate) is similar to that described in [21]. We repeat the characterization process for each logic function (the NAND function and the NOT function). Therefore, two runs of the calculation steps are required for output waveform computation of an AND gate. Each cell exhibits a kind of low-pass filtering effect, which prunes a certain amount of input noise. This is not considered in current-based approaches in general. To increase the accuracy, similar to [21], a low-pass filter may be applied to the noisy input waveforms prior to presenting the waveform to the ROCC-based waveform calculator.

5.4 Experimental Results for the ROCC Model

The ROCC-based cell timing analysis was coded in C and compiled on a Sun Blade 1000 machine.
The cells used in the experiments are from a 130nm, 1.2V production cell library using parasitically extracted netlists. An automated test system was devised to assess the model and compare its delay accuracy and run-time with Hspice. A variety of cells in the production library were tested considering waveforms with a large variety of shapes, from pure ramp to noisy waveforms. The set of experiments included RC-π structures as well as capacitive-only loads. The size of the CC_R lookup table for each cell was set to (20,5), meaning that 20 input voltage values between 0 and 1.2V and 5 output capacitance values are considered. No low-pass filters were used to generate the results in this chapter. The output voltage waveforms generated by the ROCC-based delay calculator matched those of Hspice with only a 1-3% error. Figure 5-3 compares some examples of such output waveforms with Hspice. In part (a) of this figure, the crosstalk-induced noisy input waveform is generated under a single aggressor attack. In parts (b) and (c), the noisy waveform is subjected to three aggressor signal transitions; therefore, there are multiple crosstalk-induced distortions. The equivalent output waveforms generated by our model nearly match the Hspice waveforms in parts (a) and (b). Part (c) shows an extreme case where the input signal transition to the cell is the victim of three strong couplings. To be precise, coupling capacitances of 200, 200, and 220fF exist, and the signal transitions on the aggressor lines occur close enough to create large crosstalk-induced fluctuations around the 0.5V_dd level and hence cause multiple 0.5V_dd crossing points at the output of the victim. Although the error in the 0.5V_dd propagation delay value is quite low (less than 1%), it is seen that the equivalent output waveform does not match the Hspice waveform as closely as those in parts (a) and (b).
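Because a noisy waveform may cross the 0.5V_dd level more than once, it helps to locate every crossing of a sampled waveform before picking one for a delay measurement. The following Python sketch shows one way to do this; the function name and the use of linear interpolation between samples are assumptions made for illustration, and the text does not state which crossing was used when multiple crossings occur.

```python
def crossing_times(times, volts, level):
    """Return the times at which a sampled waveform crosses `level`.

    Crossings between two samples are located by linear interpolation;
    a sample that sits exactly on `level` is reported once.
    """
    found = []
    for k in range(len(times) - 1):
        a = volts[k] - level
        b = volts[k + 1] - level
        if a == 0.0:
            found.append(times[k])
        elif a * b < 0.0:
            # Sign change inside the interval: interpolate the crossing point.
            found.append(times[k] + (times[k + 1] - times[k]) * a / (a - b))
    if volts[-1] == level:
        found.append(times[-1])
    return found
```

For a 1.2V supply, calling the function with `level=0.6` on the input and output waveforms gives the candidate crossing points from which a propagation delay can be formed.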
The accuracy of the ROCC-based cell model is next demonstrated on some circuit configurations that are part of a large high-performance ASIC design obtained from industry. The circuit configurations appraise our model under different scenarios, i.e., for different numbers of aggressor lines, interconnect lengths, coupling capacitance values, and input slews to create various noisy waveform shapes. Configuration I is a pair of 1000μm coupled interconnect lines running parallel to one another with a total distributed coupling value of 200fF. Both aggressor and victim line inputs have a slew of 150ps. For all configurations we set the arrival time and slew (transition time) of the victim line input to 1000ps and 150ps, respectively. For configuration I we swept the arrival time of the aggressor line input from 500 to 1500ps in steps of 5ps. Configuration II includes two aggressor lines, each with 200fF total coupling, and a victim, all of which are 500μm long. We maintained a fixed offset of -100ps between the signal arrival times of the 1st and 2nd aggressor line inputs, while sweeping the arrival time of the 2nd aggressor line input. The two aggressor inputs have slews of 200ps and 400ps, respectively. Configuration III contains three aggressor lines, with 200fF, 200fF, and 220fF total distributed coupling, respectively, all 500μm long. The victim line is also 500μm long. We maintained a fixed offset of -50 between the arrival times of the 1st and 3rd aggressor line inputs and -100 between those of the 2nd and 3rd. The arrival time of the 3rd aggressor line input was then swept from 500 to 1500ps in steps of 5ps. The slews of the three aggressor lines are 200ps, 350ps, and 400ps, respectively.

Figure 5-3.
The actual waveforms and the equivalent waveforms generated by our model for inputs subjected to (a) one aggressor, as well as (b) and (c) three aggressors.

Figure 5-4 shows the maximum and average delay errors of the ROCC-based cell delay model compared to Hspice. The cell delays were calculated as the difference between the 0.5V_dd crossing point of the output waveform and that of the input waveform. Compared to Hspice and in terms of percentage errors, the average and maximum errors for our model are about 1% and 3%, respectively. The average run-time of output waveform computation for a typical logic cell is less than 100μsec for our model.

Figure 5-4. Absolute errors in calculated delays vs. Spice simulation results for the ROCC-based model. (Bar chart for Configurations I, II, and III; vertical axis: delay error (ps) = |Delay(Hspice) − Delay(ROCC-based model)|.)

5.5 Current Source Modeling of Logic Cells with Voltage Dependent Parasitic Effects

The existing cell delay analysis techniques in general do not accurately model the parasitic and/or non-linear behavior of logic cells. The parasitic capacitances of a logic cell are highly dependent on the input and output voltage values of the cell. We developed a current-based circuit model to calculate the output voltage waveform (cf. Figure 5-5). It consists of two main components, namely, parasitic capacitances to model the loading at the input and output nodes of the cell and the Miller effect between the two nodes, as well as a current source at the output node to model the nonlinear behavior of the logic cell. Each component is in turn a function of the input and output voltage values.
As a result, our proposed cell model is represented by the following KCL equation, which essentially models the current at the output pin of the cell during switching:

$$i_o + I_o(V_i,V_o) + \bigl(C_o(V_i,V_o) + C_M(V_i,V_o)\bigr)\frac{\Delta V_o}{\Delta t} - C_M(V_i,V_o)\frac{\Delta V_i}{\Delta t} = 0 \qquad (5\text{-}4)$$

where the Miller capacitance C_M(V_i,V_o) and output capacitance C_o(V_i,V_o) values are pre-characterized through a series of SPICE-based transient simulations, in which saturated ramp input and output voltages are applied to input and/or output nodes while the output current is monitored. 2-D lookup tables are used to store the C_M(V_i,V_o) and C_o(V_i,V_o) values. The amount of current sourced by a logic cell in response to DC voltage levels on the input and output pins of interest, I_o(V_i,V_o), is also determined for each logic cell by sweeping the DC values of input and output voltages and measuring the current sourced by the cell output pin in SPICE. As a result, to model the nonlinear behavior of a logic cell w.r.t. input and output voltage values, a 2-D lookup table is created to store the values of I_o(V_i,V_o).

Figure 5-5. Our proposed current-based circuit model for a logic cell. (Elements shown: input capacitance C_i(V_i) at the input node, Miller capacitance C_M(V_i,V_o) between the input and output nodes, and current source I_o(V_i,V_o) with output capacitance C_o(V_i,V_o) at the output node; i_i and i_o are the input and output currents.)

Precise estimation of the output load is critical for accurate output voltage calculation of a cell. The output node of a cell is usually connected to several fanout cells directly or indirectly through an interconnect line. The input parasitic capacitances of fanout cells should hence be considered as part of the load for output voltage calculation of the driver cell. The following equation is used to characterize the parasitic effect seen at the input of a cell:

$$i_i = \bigl\{C_i(V_i) + C_M(V_i,V_o)\bigr\}\frac{\Delta V_i}{\Delta t} - C_M(V_i,V_o)\frac{\Delta V_o}{\Delta t} \qquad (5\text{-}5)$$

A SPICE-based transient analysis is used to determine C_i.
In this analysis, a saturated ramp is applied to the input, while the output node is connected to a DC voltage source, and the input current, i i , is measured. Although the input parasitic capacitance, C i , is in fact a function of the input and output voltage values, in practice an input-voltage-dependent C i is all that can be efficiently utilized. This is because when calculating the output voltage waveform of a logic cell, the output voltage values of its fanout cells are unknown, and therefore, calculation of C i values of the fanout cells cannot make use of any information about the output voltage levels of these fanout cells. That is why we say that making C i dependent on V o is not useful in practice. Note that Equation (5-4) is enough to calculate the output voltage waveform, and Equation (5-5) is only used to characterize C i . Distributed RC circuit modeling is an accurate way of representing an interconnect line and its parasitic effects. However the complexity of this model limits its application in real world designs where millions of interconnect lines are present. Therefore, circuit designers try to derive the electrical behavior of this complex circuit model by approximating its transfer function using different model order reduction techniques [60],[62]. 
The pre-characterization steps of our model are load-independent; therefore, the output voltage waveform can be constructed for a given input voltage waveform considering any arbitrary load. Without loss of generality, we consider a capacitive-only load, C_L, to simplify the presentation:

$$C_L\frac{\Delta V_o}{\Delta t} + C_o\frac{\Delta V_o}{\Delta t} + I_o(V_i,V_o) + C_M\frac{\Delta V_o}{\Delta t} - C_M\frac{\Delta V_i}{\Delta t} = 0 \qquad (5\text{-}6)$$

Equation (5-6) can be rewritten with respect to output voltage values, resulting in:

$$V_o(t_{k+1}) = V_o(t_k) + \frac{1}{C_L + C_o + C_M}\times\bigl\{C_M\bigl(V_i(t_{k+1}) - V_i(t_k)\bigr) - I_o(V_i,V_o)\times\Delta t\bigr\} \qquad (5\text{-}7)$$

As far as process variation is concerned, our new current source model has been extended to calculate the output voltage considering different sources of variability [26]. The output voltage waveform of logic cells is modeled by a stochastic process in which the voltage value probability distribution at each time instance is computed from that of the previous time instance. Next, the probability distribution of the α%V_dd crossing time, i.e., the hitting time of the output voltage stochastic process, is computed. Experimental results in [26] demonstrate the high accuracy of our cell delay model compared to Monte-Carlo-based SPICE simulations. In addition to timing analysis, our current source modeling technique can be adapted to do power analysis more accurately. In [27] we discuss how sensitive the power consumption of a logic cell is to the shape of the signal transition waveform at the cell input and output nodes. For example, we show in [27] that approximating crosstalk-induced noisy waveforms with saturated ramps can lead to short circuit energy estimation errors as high as orders of magnitude for a minimum-sized inverter. Our current source model is capable of calculating the short circuit energy dissipation caused by glitches in VLSI circuits, which in some cases can be a key contributor to the total circuit energy dissipation.
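The time-stepping form of Equation (5-7) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the actual implementation: the function names are hypothetical, the three pre-characterized 2-D tables are represented by callables `c_o`, `c_m`, and `i_o` taking (V_i, V_o), and the tables are evaluated at the voltages at the start of each step, which the text does not specify.

```python
def step_output_voltage(v_o, v_i_now, v_i_next, dt, c_load, c_o, c_m, i_o):
    """One update of Equation (5-7) for a capacitive-only load c_load.

    c_o, c_m, i_o are callables (V_i, V_o) -> value standing in for the
    2-D lookup tables; they are evaluated at the start-of-step voltages
    (an assumption of this sketch).
    """
    cm = c_m(v_i_now, v_o)
    c_total = c_load + c_o(v_i_now, v_o) + cm
    dv = (cm * (v_i_next - v_i_now) - i_o(v_i_now, v_o) * dt) / c_total
    return v_o + dv

def output_waveform(v_in, dt, v_o0, c_load, c_o, c_m, i_o):
    """Construct the sampled output voltage for a sampled input waveform v_in."""
    v_out = [v_o0]
    for k in range(len(v_in) - 1):
        v_out.append(step_output_voltage(v_out[-1], v_in[k], v_in[k + 1],
                                         dt, c_load, c_o, c_m, i_o))
    return v_out
```

With the current source set to zero, a rising input step pushes the output up through the Miller capacitance alone, which is the coupling term C_M ΔV_i in Equation (5-7).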
5.5.1 Experimental Results

To evaluate our current source model, it was compared with Hspice. The set of experiments involved various logic cells, including simple inverter and NAND gates, as well as complex cells such as AOI (And-Or-Invert). Figure 5-6 shows a comparison with Hspice for some examples of crosstalk-induced noisy waveforms applied to a minimum-size inverter in the 130nm cell library. The equivalent output waveforms generated by our model match the Hspice results closely. Next, a comparison is presented with the most accurate existing current source model, by Keller et al. [39] (denoted by KTV for the rest of this chapter). Figure 5-7 illustrates the absolute delay errors w.r.t. Hspice for a minimum-size inverter in our 130nm cell library. The input line to the inverter is coupled by a 50fF capacitance and is under attack by an aggressor line. Both the victim and aggressor are driven by minimum-size inverters. The cell under consideration has a FO4 load. The signal arrival time of the input of the victim line driver is set to 10ps while that of the aggressor (i.e., the noise injection time) is swept from 100ps to 200ps with a time step of 1ps. Compared to KTV, the accuracy of delay calculation for the minimum-size inverter is improved by 8.8% on average and by 17.3% at maximum.

Figure 5-6. The actual and equivalent waveforms by our model for some crosstalk-induced noisy waveforms. (Plotted against time (sec): noisy input, output (Hspice), and output (our model).)

(Plot axes for the following figure: noise injection time at cell input (sec) versus delay error vs. HSpice (sec); curves: KTV model and our model.)
Figure 5-7.
Absolute errors in calculated delay for a minimum-size inverter.

As discussed earlier, the shape of the waveform greatly impacts the delay calculation; therefore, delay and output slew metrics may not be sufficient to model the waveform shape. Our current-based model is capable of producing output waveforms whose shapes are very close to those produced by Hspice. Mean square error (MSE) is a good metric to compare waveform shapes. Figure 5-8 shows the MSE for our cell model and KTV compared to Hspice. It is seen that this value is lower for our model than for KTV in most of the cases. In fact, the average MSE improvement for the aforementioned experimental setup is 11.3% for the inverter and 24.5% for the AOI22 cell.

Figure 5-8. Waveform similarity (mean square error) comparison to Hspice for our model and the KTV model. (Plot axes: noise injection time at cell input (sec) versus MSE vs. Hspice.)

5.6 Summary

In this chapter the ROCC-based logic cell delay analysis model was first presented to address the complexity issue with the existing current-based cell delay models. A pre-characterized table of ROCC, i.e., time derivatives of the cell output current as a function of input voltage and output load values, is utilized in combination with the Taylor series expansion to progressively compute the output current waveform. The output voltage is then produced by integrating the output current. Experimental results show the accuracy and efficiency of this new delay model. The ROCC-based technique has been presented in [55]. We had also developed a current-based technique, called CGTA [54], prior to the ROCC-based technique. A new current source logic cell delay model for the purpose of timing analysis was presented in this chapter.
This model captures various cell parasitic and nonlinear effects in the computation of the output voltage waveform in the presence of crosstalk-induced noise. Using this model and for an arbitrary input voltage waveform, the output voltage waveform can be accurately constructed. Experimental results showed the high accuracy of our cell delay model compared to the existing cell delay models. This current source modeling technique was also presented in [26].

6 STAX: STATISTICAL CROSSTALK TARGET SET COMPACTION

6.1 Introduction

As the layout geometries in recent CMOS process technologies scale down to 65nm and below, increases in transistor packing density and operational frequency of VLSI circuits aggravate the noise effects, including crosstalk noise. This noise is caused by unwanted capacitive coupling between a pair of interconnect lines, referred to as a crosstalk site. Three types of crosstalk effects, namely pulse (glitch), slowdown, and speedup effects, can be associated with each site [15]. This chapter focuses on slowdown target set compaction. When the signals on the interconnect lines of a crosstalk site make opposite transitions and their arrival times are close to each other, a slowdown effect occurs on both signal transitions. Each line can be considered as the victim line, with the other as the aggressor. Each of these lines is associated with two slowdown targets, namely the slowdown of the rising or the falling signal transition; hence there are four slowdown targets at each site. The slowdown of a signal transition can result in faulty circuit behavior, in which case the target is referred to as a fault-producing target (FT). Otherwise, the target is called a safe target (ST). It seems necessary to generate tests for the FTs associated with each coupled interconnect line. However, a large VLSI circuit might contain a huge number of coupled interconnect lines, and thus a large population of crosstalk targets.
In practice, only a small set of targets can result in faulty circuit behavior. Considering the high complexity of test generation for each target, it is reasonable to try to prune as many targets as possible and thereby reduce the size of the target set that must be considered during test generation. This procedure, which is referred to as crosstalk target set compaction, outputs a set of crosstalk targets that must include all of the FTs but may also contain some non-fault-producing targets. This could happen, for example, if a non-fault-producing target cannot be proven safe by the available pruning tools. Differences between identical features in a certain lithographic process are referred to as process variations, and such mask differences tend to rise rapidly as the technology scales down. In addition to these manufacturing-induced variations, environmental variations along with device/interconnect aging processes (e.g., hot electron effects and electro-migration) tend to generate a rather large deviation of key circuit parameters from their designed values. These phenomena create parasitic and electrical parameter uncertainties for various elements in the circuit and cause significant timing variations. Consequently, highly sophisticated and robust crosstalk-aware performance analysis and optimization tools are needed to account for these variational effects. The impact of a crosstalk target on the correct operation of a circuit depends on logic values, signal arrival times and slews, and parasitic and electrical parameters. The timing and electrical parameters are strongly dependent on process variations. Until recently, corner-based timing analysis techniques, such as static timing analysis (STA), were used as relatively fast techniques to address the concerns related to various sources of variation in VLSI circuits. In general, corner-based techniques tend to overestimate circuit delay and noise effects.
These techniques can also result in underestimation of circuit delay and noise because these metrics are non-monotone functions of some circuit parameters. Exacerbating the situation, it is nontrivial to find the worst-case value for each parameter that would result in the worst-case delay or noise. Therefore, crosstalk target set compaction using corner-based timing analysis tools will not be effective in future process technology nodes. Statistical analysis is thus viewed as an essential methodology for nanometer process technologies, which enables application of the actual statistics of the process technology parameters for accurate calculation of circuit characteristics such as gate and interconnect delay [11],[1]. The idea of applying a crosstalk target set compaction tool prior to using ATPG was first introduced in [15]. A qualitative and detailed discussion of the target identification ideas was then given in [44]. Following this work, a pruning method was proposed in [70] for crosstalk target identification in sequential circuits. A different method was proposed in [65] to prune redundant crosstalk faults in sequential and combinational circuits. Unfortunately, none of these techniques address the problem of how to utilize the filtering resources in a cost-effective manner. In addition, none considers the effect of process variations. Departing from this practice, the XIDEN target identification framework was introduced for crosstalk pulse [48] and slowdown [10] to find effective sequences of available timing and parameter extraction tools and filtering tools to perform target set compaction. On the weak side, process variations in that framework were handled by employing corner-based analysis techniques. In this chapter we present STAX, a STatistical Xtalk (crosstalk) target set compaction methodology to evaluate crosstalk slowdown in the presence of process variations and to efficiently filter as many targets as possible.
A number of timing analysis, extraction, and filtering tools with different qualities and runtime complexities are incorporated in the STAX framework. The first question that comes to mind is: why not use the best timing analyzer, extractor, and filtering tools available to prune as many targets as possible? The answer is that we can achieve the same level of pruning in much shorter CPU times by not restricting ourselves to only these tools. The general idea is to use less accurate but fast tools in the initial stages to process a large initial set of targets and prune as many targets as possible, and then use the more accurate, but computationally more expensive, tools on the remaining (much smaller) set of targets in later stages. Target set compaction is dependent on the ordering of pruning tools. The ratio of the number of remaining targets after a sequence of tool invocations to the number of initial crosstalk targets is referred to as the compaction degree. Different tool sequences with identical compaction degrees can have computational costs that differ by orders of magnitude. Therefore, the goal is to find effective sequence(s) of tool invocations to provide the highest compaction degree in the least amount of CPU time. In this chapter, the variation-aware crosstalk model, extraction, and statistical timing analysis tools used in STAX are reviewed first. The crosstalk slowdown filters are then discussed. We also review how statistical timing analyzers, extractors, and filters are placed into an efficient order. Notation to be used throughout this chapter is summarized in Table 6-1.
Table 6-1: Notation and descriptions

Symbol          Description
V               Victim
A               Aggressor
δ               AT(V) − AT(A)
C_c             Coupling capacitance
PO              Primary output
R(V)            Victim's required time
E_i             Extractor i
F_i             Filter i
T_i             Timing analyzer i
AT_{μ−3σ}(V)    Earliest arrival time at victim
AT_{μ+3σ}(V)    Latest arrival time at victim
SD_{max-Cc}     Maximum slowdown by C_c

6.2 Modeling, Extraction, Analysis

6.2.1 Coupled Interconnect Characterization/Modeling

The distributed RC-π model of Figure 6-1(a) is used to model a pair of capacitively coupled interconnect lines while considering the local variations of physical parameters, such as line width and thickness. In this circuit, each RC-π stage represents an interconnect segment of predefined length, L_seg. The coupling between two interconnect lines along segment i is captured by the coupling capacitance C_ci. Moreover, the self capacitance and resistance of the victim line in segment i are denoted by C_vi and R_vi, respectively. Although the lengths of all segments are identical, due to process variations, the parameters (i.e., the values of the different elements) of the corresponding electrical circuit are different. The variation of physical parameters, such as interconnect width and thickness, along the interconnect line is due to IC manufacturing defects, neighboring metal lines, optical proximity, the chemical mechanical polishing (CMP) metal process, etc. The following procedure is used for calculating electrical parameters of the distributed model. First, complete physical outlines of the coupled interconnect lines are generated, including information about their width, height, and interlayer dielectric thickness along their length. This physical outline is used to calculate the resultant electrical parameters for each interconnect segment by using a scheme similar to that introduced in [19].
Figure 6-1: (a) Distributed RC-π model of a crosstalk site, (b) Lumped RC-π model of the crosstalk site. (Part (a) shows n segments with victim elements R_v1…R_vn and C_v1…C_vn, aggressor elements R_a1…R_an and C_a1…C_an, and coupling capacitances C_c1…C_cn between the near and far ends of the victim and aggressor lines; part (b) shows the single-stage equivalent with R_v, C_v, R_a, C_a, and C_c.)

The heuristic explained below is used as a model order reduction technique to construct the variational circuit model of coupled interconnect lines as a variational coupled single RC-π model, as depicted in Figure 6-1(b). In this RC-π model, the mean value of each quantity of interest (i.e., R_v, C_v, R_a, C_a, C_c) is calculated as the summation of the mean values of all the coupled segments in the distributed circuit model of Figure 6-1. The variance of each quantity, however, is calculated as a weighted summation of the variances of the coupled segments. In particular, the weights are designed to monotonically decrease from the near-end of the coupled line toward the far-end. This is because we have empirically observed that the effect of segment variations on the output delay at the far-end of the coupled lines decreases as one visits segments starting from the near-end toward the far-end. The key advantage of the proposed modeling approach is the ability to locally capture the effect of process variations on each interconnect segment. This is done by directly calculating the corresponding values of local resistance and capacitance of the RC-π model based on the exact information about the actual geometry of the interconnect lines in each segment. To achieve convergence in the desired statistical properties of the output variables, Monte Carlo simulation is performed, where we calculate the mean and variance of a collection of samples, each comprising a large number of units in the population under study.
According to our experiments, a sample size of 2500 is suitable to use, i.e., the population generation and electrical parameter extraction steps are iterated 2500 times to achieve convergence in the desired statistical properties for each sample. The number of samples (or sample count) is then selected so that a 98% confidence level with 1ps error in the estimates of mean and variance of interconnect delay is achieved. (Recall that the effectiveness of the Monte Carlo simulation technique is based on the fact that, regardless of the population distribution, the distribution of the sample mean approaches a normal distribution; hence, a well-defined stopping criterion exists from which the confidence interval for the final estimate can be calculated.) There is a tradeoff between the level of accuracy and the complexity of closed-form expressions. As the number of input parameters increases, lower order models, such as modeling the delay random variable as a linear function of the sources of variation, become more suitable. In our experimental setup, we have found that 2nd order modeling is the most successful for capturing the distribution properties of crosstalk-affected delay on the victim line:

$$\mathrm{mean(delay)} = \sum_{\text{parameter } i} \left(A_i x_i + B_i x_i^2\right) \qquad (6\text{-}1)$$

$$\mathrm{variance(delay)} = \sum_{\text{parameter } i} \left(C_i x_i + D_i x_i^2\right) \qquad (6\text{-}2)$$

where x_i is a physical parameter, such as wire width or length, and A_i to D_i are regression coefficients found by using statistical analysis and curve fitting techniques.

6.2.2 Parameter Extraction/Estimation

To determine the quantitative value of the crosstalk-affected delay of a crosstalk target, knowledge of the parametric electrical values associated with the aggressor and victim lines is required. The fastest way to estimate the value of an electrical parameter is to keep track of pre-computed upper and lower bounds for the parameter, given the CMOS process manufacturing technology and the circuit.
It may be sufficient to rule out a target as a fault by using these bounds for some parameters, and approximate and/or extracted values for other parameters. In this case, there is no need to determine more accurate values of the parameters associated with the crosstalk site. Estimating the value of a parameter by extraction is more accurate, but the cost of extraction is higher than that of bound approximation. An extractor is a tool that determines (estimates) the values of a set of parameters within a certain degree of accuracy. The list of extractor models utilized in STAX is reported in Table 6-2. The total cost of extraction is approximated by the cost per site of the utilized extractor tool multiplied by the number of crosstalk sites in the input set of targets passed to the extractor.

Table 6-2: Extractors modeled in STAX

E      Extracted parameters and extraction accuracies    Cost (Sec/Site)
E_1    {(C_c, 45%), (C_v, 45%), (R_v, 45%)}              0.005
E_2    {(C_c, 45%), (C_v, 30%), (R_v, 35%)}              0.02
E_3    {(C_c, 30%), (C_v, 30%), (R_v, 30%)}              0.05
E_4    {(C_c, 15%), (C_v, 15%), (R_v, 15%)}              0.18
E_5    {(C_c, 10%), (C_v, 10%), (R_v, 10%)}              0.45

6.2.3 Statistical Timing Analysis Tools

Logic-level statistical timing analysis is an effective way of modeling IC manufacturing process variations. It is also an important aspect of determining whether or not a coupling capacitance subjected to process variations can transform a crosstalk target into a crosstalk fault. Using statistical static timing analysis (SSTA) for all signal transitions in a circuit, one can determine statistical characteristics such as the mean and variance of key design attributes at intermediate circuit nodes, such as signal arrival times and required times.
The required time R(V) associated with a line V is the maximum (latest) time at which a transition can occur at this line and still propagate to all primary output lines before the end of the clock period (while satisfying the flip-flop setup time). In principle, the upper bound (µ+3σ) on the arrival time at victim node V, i.e., AT_{μ+3σ}(V), should be less than the required time at V, namely R(V).

6.2.3.1 SSTA Tools

The accuracy of SSTA depends on the delay models used for logic cells and crosstalk sites. We have incorporated an SSTA tool in STAX that exploits the variation-aware modeling technique discussed in Section 6.2.1 for the coupled interconnects and relies on the nonlinear delay modeling of [11] for logic cells. In this way, the SSTA tool calculates the arrival time distribution of each line through a forward traversal. Required times are computed using a backward traversal as in [9]. Similar to the trade-off between computation time and accuracy in extraction, the SSTA tool can operate at different levels of accuracy and run-time complexity. In this chapter, the run-time complexity of the SSTA tools is considered part of the cost of the filter(s) that employ them. (The STAX filters and their characteristics are explained in Section 6.3.) In general, SSTA computes the μ±3σ bounds for the timing parameter of interest, which are needed by the filters. Based on the computed bounds, the filters determine whether to prune a target. Details are provided below.

SSTA T1: Arrival time calculation considering the crosstalk effects

T1 calculates the (variational) arrival times (i.e., μ±3σ values) of all circuit lines by a forward traversal algorithm. It is executed on a site-by-site basis for each site associated with the set of crosstalk targets passed to T1 by some filter. The arrival times are computed by considering the slowdown effect of the crosstalk target under consideration.
This is a CPU-intensive task that requires iterative calculation of the maximum slowdown of the victim output as a function of the overlap between the arrival time ranges of the victim and the aggressor. Each time T1 is called to process a crosstalk target, the crosstalk-affected arrival times (denoted by CT) of the victim and aggressor lines, as well as of all nodes in their fanout cones, are computed.

SSTA T2: Required time calculation ignoring the crosstalk effects

T2 calculates the (variational) required times (i.e., μ±3σ values) of all circuit lines through a backward traversal. To do this, T2 must first calculate arrival times by a forward traversal. In this case, however, the arrival times are calculated without considering the slowdown effect of any crosstalk target in the circuit. As a result, given the current (undecided) set of crosstalk targets in the circuit, T2 is executed once to calculate the required times for all nodes in the circuit.

6.3 Filtering

A filter is a tool that assesses a sufficient set of conditions that can confirm that a crosstalk target is non-fault-producing. The conditions can be related to circuit parameters, such as the coupling capacitance of a target, or to physical dimensions, such as the width of the coupled interconnect line. For example, if it is known through pre-processing that the coupling capacitance values, C_c, of all FTs are equal to or greater than a threshold value, say C_{c-th}, then a filter can be devised that simply states the following: “For each target T_i in the input target set, if C_c < C_{c-th} ⇒ the target is safe and is pruned.”

The pruning power of a filter is defined as the ratio of the number of targets pruned to the number of targets in the input set passed to the filter. The pruning power of a filter depends on both the effectiveness of its pruning conditions and the set of targets passed to it.
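The threshold filter and the pruning-power ratio just described can be sketched in a few lines of Python. The target records, the capacitance values, and the threshold are hypothetical and serve only to make the two definitions concrete.

```python
def cc_threshold_filter(targets, cc_threshold):
    """Keep only targets whose coupling capacitance reaches the threshold.

    Targets with cc < cc_threshold are considered safe and are pruned.
    """
    return [t for t in targets if t["cc"] >= cc_threshold]


def pruning_power(num_input, num_remaining):
    """Ratio of pruned targets to targets passed to the filter."""
    if num_input == 0:
        return 0.0
    return (num_input - num_remaining) / num_input


# Hypothetical targets with coupling capacitances in fF.
targets = [{"id": 1, "cc": 10.0}, {"id": 2, "cc": 50.0},
           {"id": 3, "cc": 120.0}, {"id": 4, "cc": 300.0}]

remaining = cc_threshold_filter(targets, cc_threshold=100.0)
first_pass = pruning_power(len(targets), len(remaining))

# Re-applying the same filter to its own output prunes nothing more.
again = cc_threshold_filter(remaining, cc_threshold=100.0)
second_pass = pruning_power(len(remaining), len(again))
```

On this toy input the first pass has a pruning power of 0.5 and the second pass a pruning power of 0, so the same filter yields different figures depending on the set it receives.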
For example, if the smallest set of targets (generated as a result of the strongest pruning possible) is passed to a filter, then the pruning power is zero, because no more pruning is possible. To compare the relative pruning power of filters, the same set of targets should be passed to them. We say that filter F_x is dominated by filter F_y with respect to an initial target set S if S_y is contained in S_x, where S_x (S_y) is the set of targets remaining in S after application of F_x (F_y). The values of the circuit and timing parameters required by a filter are determined by extractor(s) and timing analyzer(s). After a filter has been applied to a set of targets, the remaining targets can be passed to another non-dominated filter or to one with relatively higher pruning power. Alternatively, a new extractor or timing analysis tool can be run that determines at least one circuit or timing parameter value with higher accuracy than previously known. Then some filter (even the same one as before) can be applied to achieve additional pruning.

The cost of a filter is a function of the number of targets in the input set passed to it. Typically, a filter with higher cost has higher pruning power, because more elaborate conditions must be checked to increase the pruning power. At each stage of target set compaction, the relative pruning power of the filters as well as the filter costs must be known in order to decide which filter is best to use next. The pruning power of a filter in going from one stage to the next, however, is a function of the pruning tools that have been executed previously. It also varies from one circuit to the next. This variability of the filter pruning power is one of the key factors that make the formulation of target set compaction difficult. Next we describe the main filters used in STAX to process crosstalk slowdown targets.
Filter F1: Required time-based pruning based on looked-up maximum slowdown values

Figure 6-2 depicts the crosstalk-affected signal arrival time at the victim, AT_{μ+3σ}(V), of a crosstalk site as a function of the input skew, δ. Let SD_{max-Cc} be the highest possible slowdown that can be generated by the crosstalk site with coupling value C_c. Filter F1 checks whether the maximum arrival time plus the worst-case slowdown of the victim can violate its minimum required time. For target i:

\[ AT_{\mu+3\sigma}(V) + SD_{\max\text{-}C_c} \le R_{\mu-3\sigma}(V) \;\Rightarrow\; \text{target } i \text{ is safe and can be pruned} \tag{6-3} \]

F1 uses the required times computed by T2. This filter has two modes, depending on whether the value of C_c is known. If an extractor is used to extract C_c, then the slowdown vs. input skew lookup tables (such as the one depicted graphically in Figure 6-2) are used to determine SD_{max-Cc}. If, on the other hand, C_c is unknown, a worst-case value is assumed when looking up SD_{max-Cc} in the table.

[Figure 6-2: Slowdown curves as a function of skew for different C_c. Horizontal axis: input skew δ (psec); vertical axis: crosstalk-affected delay at victim output; curves for C_c = 300 fF, 200 fF, and 50 fF, with maxima marked SD_{max-300f}, SD_{max-200f}, and SD_{max-50f}.]

Filter F2: Required time-based pruning based on crosstalk-affected arrival times

For each target under consideration, this filter uses the crosstalk-affected arrival times of the victim and the aggressor as computed by T1, as well as the required time of the victim as calculated by T2. The target is pruned if the µ+3σ value of the victim’s arrival time does not violate the µ-3σ value of its required time. For target i:

\[ CT_{\mu+3\sigma}(V) \le R_{\mu-3\sigma}(V) \;\Rightarrow\; \text{target } i \text{ is safe and can be pruned} \tag{6-4} \]

Filter F3: PO arrival time-based pruning

For a given crosstalk target, this filter uses T1 to compute the crosstalk-affected arrival time at the primary outputs (PO) of the circuit.
The target is pruned if none of the µ+3σ values of the PO arrival times violates the clock cycle time, D. For target i:

\[ \forall\, PO_j:\ CT_{\mu+3\sigma}(PO_j) \le D \;\Rightarrow\; \text{target } i \text{ is safe and can be pruned} \tag{6-5} \]

A cost-per-target value is associated with each filter, accounting for the computation cost of the timing tool that it uses. For F1, this cost is approximately 50 μsec/target on a Sun Blade 1000 machine. The costs for F2 and F3 are approximately 35 and 55 msec/target, respectively.

With respect to a random set of sites, filter F3 usually shows the highest pruning power, since it uses T1 to compute an accurate distribution of arrival times and a forward traversal to determine whether the additional delay at a site actually violates the clock sampling. F1 is the fastest of the filters described, but also the most pessimistic. F2 is faster than F3 because in F2 the backward traversal of T2 that calculates the required times of circuit lines is done only once and subsequently used for all targets, whereas in F3 the forward traversal of the additional delay must be repeated for each site.

6.4 Problem Statement and Solution

Assume a CMOS VLSI circuit with an initial set of crosstalk targets, Set_0. There are n filter tools, F_1, ..., F_n, m extractor tools, E_1, ..., E_m, and p statistical timing analyzers, T_1, ..., T_p, readily available. For a given circuit, there exists an optimal sequence of extractors, filters, and timing analyzers that finds a compact set of targets in the minimal amount of CPU time. The problem is to find the optimal sequence, consisting of (a subset of) the available tools, that provides the best pruning possible (or a desired amount of pruning). Unfortunately, this sequence is usually different for each circuit. Three factors help in identifying a good pruning sequence. First, there is a partial ordering among many of the tools.
For example, once some set of extractors has been executed, the only extractors that can be executed subsequently are those that compute at least one parameter value to a higher degree of accuracy than any previously executed extractor. Similarly, once a filter has been executed, a dominated filter cannot be executed unless the accuracy of at least one variable is improved. The second factor is that a good sequence based on a suite of benchmark circuits is often a good sequence for a new circuit. Finally, the application of one tool might imply the application of another. For example, T2 must run at least once prior to running F1. Additionally, it is known that F3 does not require T2. Reference [48] shows how the subset property (i.e., every subset of a frequent set is frequent) and the corresponding association rules among target sets can drastically reduce the complexity of finding good sequences.

6.5 Experimental Results

6.5.1 Statistical Analysis Tool in STAX

To show the need for a statistical approach to computing the necessary timing information, the statistical model based on the distributed RC-π circuit is first compared against the conventional corner-based approach. A coupled global interconnect pair, each line 1000 µm long, is used to study the effect of line width and height variations on the crosstalk-affected output delay of the victim line. From Figure 6-3(a), the corner-based value of the victim delay shows more than 46% pessimism compared to that of the statistical model. We also replaced our distributed model with lumped RC-π and 2RC-π models and performed statistical analysis. The mean delay was found to be close to that of the distributed model. The µ+3σ value for the single RC-π (2RC-π) model shows about 13% (8%) pessimism.

We next repeated the experiment using the RC values found by the variational lumped RC-π model construction heuristics described in Section 6.2.1. The results are shown in Figure 6-3(b).
[Figure 6-3: (a) Comparison of distributed, RC-π, and 2RC-π models; (b) accuracy improvement using our heuristic]

Compared to Figure 6-3(a), the overestimation is drastically reduced (e.g., for the 2RC-π model, from 7.9% to 4% error for the µ+3σ value and from 3.2% to 1.4% error for the mean value). The intuitive explanation is that when the distributed parameters are summed into a single value, the variations tend to cancel each other, and thus the pessimism of the conventional approach to extracting a single component is reduced. Using lumped models greatly increases the efficiency of STAX.

6.5.2 Target Set Compaction in STAX

Our experiments on different sequences show that the sequences with the highest compaction degrees for different circuits are similar to each other. For example, all sequences end with the highest-quality tools available. Therefore, we use a training procedure in which the most efficient sequences for pruning the targets in a number of (training) circuits are found. Table 6-3 shows the sequence with the highest compaction degree found for each of the training circuits. As mentioned in the previous section, these sequences are also effective for other circuits. All sequences end by executing the most accurate extractor (E5) and the filter with the highest pruning power (F3).

Table 6-3: Sequences with highest compaction degrees for training circuits

Circuit   Sequence
C17       S1 = E1 T2 F1 E2 T2 F1 T1 T2 F2 F3 E3 T1 F3 E4 T1 F3 E5 T1 F3
C432      S2 = E1 T2 F1 E2 T2 F1 T1 T2 F2 F3 E4 T1 F3 E5 T1 F3
C499      S3 = E1 T2 F1 E2 T2 F1 T1 T2 F2 E4 T1 F3 E5 T1 F3
C880      S4 = E1 T2 F1 E2 T2 F1 T1 F3 E5 T1 F3

The sequences shown in Table 6-3 were applied to five benchmark circuits, namely C1355, C1908, C3540, C5315, and C7552. None of these circuits was included in the training set.
To demonstrate the efficacy of the tool/filter sequences found by STAX, we generated several semi-random sequences and ran them on those circuits. The selection is “semi-random” because the last pruning stage is always the one with the highest compaction degree, i.e., SP ≡ E5 T1 F3. Thus, all sequences produce the same final set of potential crosstalk faults. Table 6-4 shows the execution times for all STAX-generated sequences (S1 to S4) and a few semi-random ones (SR1 to SR4), as well as for SP. We see that for each of the five circuits, the four training sequences result in about the same computation time. Sequence SR4 is very similar to S4 and performs fairly well, but the rest require up to one order of magnitude more time to generate the final set of targets.

To compare our framework with previous work on target identification [10],[15],[44],[48],[65],[70], we assume that the highest-quality filter and the most accurate extractor used in these earlier works are the same as ours. This means that they use something equivalent to SP. Hence, for C7552, their system would run more than 49 times slower than the sequences produced by our framework. To show the effectiveness of this framework as the first phase of a test generation system, similar to [10], we assume that an ATPG system requires at least 40 seconds, which translates to 1,450,000 seconds to process all initial targets in C7552. Using SP followed by ATPG would take about 35,296 seconds (33,056 for SP and 2,240 for ATPG to process 14 sites). Using STAX and ATPG together would cost around 2,950 seconds (710 for STAX and 2,240 for ATPG to process 14 sites). The ratios of these three times are 491:15:1.

Finally, compared to previous work in [10], an average of 20% more pruning was observed on the ISCAS85 benchmarks; e.g., for the C7552 circuit, more than 33% improvement in pruning efficacy was achieved (going from 21 targets in [10] to 14 in STAX).
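To illustrate why a staged sequence can reach the same final target set as SP at a fraction of the cost, the following Python sketch simulates the CPU time of a tool sequence. The extractor and filter costs are those quoted in Table 6-2 and Section 6.3, and the timing analyzers add no cost of their own because their run time is counted as part of the filter cost. Everything else is an assumption made for illustration: the integer "hardness" of a target, the filter "strength" scores, the rule that a filter prunes a target when strength times extractor level reaches its hardness, and the population of 1000 targets. None of these is part of STAX.

```python
# Costs taken from Table 6-2 (sec/site) and Section 6.3 (sec/target).
EXTRACTOR_COST = {"E1": 0.005, "E2": 0.02, "E3": 0.05, "E4": 0.18, "E5": 0.45}
FILTER_COST = {"F1": 50e-6, "F2": 35e-3, "F3": 55e-3}
# Timing analyzers each filter relies on; their run time is treated as part
# of the filter cost, so T1/T2 add no cost of their own here.
FILTER_NEEDS = {"F1": {"T2"}, "F2": {"T1", "T2"}, "F3": {"T1"}}
# ASSUMPTION: toy "strength" scores, not measured pruning powers.
FILTER_STRENGTH = {"F1": 1, "F2": 2, "F3": 3}


def run_sequence(sequence, hardness):
    """Return (cpu_seconds, remaining) for a space-separated tool sequence.

    ASSUMPTION: each target has an integer hardness and is pruned when
    filter strength * best extractor level so far >= hardness.
    """
    remaining = list(hardness)
    level = 0                  # most accurate extractor run so far
    analyzers_run = set()
    cost = 0.0
    for tool in sequence.split():
        if tool in EXTRACTOR_COST:
            level = max(level, int(tool[1:]))
            cost += EXTRACTOR_COST[tool] * len(remaining)
        elif tool in FILTER_COST:
            missing = FILTER_NEEDS[tool] - analyzers_run
            if missing:
                raise ValueError(f"{tool} needs {sorted(missing)} first")
            cost += FILTER_COST[tool] * len(remaining)
            power = FILTER_STRENGTH[tool] * level
            remaining = [h for h in remaining if h > power]
        elif tool in ("T1", "T2"):
            analyzers_run.add(tool)
        else:
            raise ValueError(f"unknown tool {tool!r}")
    return cost, remaining


# Hypothetical population of 1000 targets; hardness 99 is never pruned.
population = [1] * 600 + [2] * 200 + [6] * 150 + [15] * 45 + [99] * 5

S4 = "E1 T2 F1 E2 T2 F1 T1 F3 E5 T1 F3"   # sequence S4 from Table 6-3
SP = "E5 T1 F3"

cost_s4, left_s4 = run_sequence(S4, population)
cost_sp, left_sp = run_sequence(SP, population)
print(f"S4: {cost_s4:.2f} s, {len(left_s4)} targets left")
print(f"SP: {cost_sp:.2f} s, {len(left_sp)} targets left")
```

Under these assumed numbers both sequences leave the same five targets, but the staged sequence spends about 49 seconds against about 505 seconds for SP alone, because the expensive extractor and filter only see the targets that the cheap stages could not prune. The actual gap depends on the circuit and on the real pruning behavior of each filter.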
Table 6-4: Efficiency results of STAX

(a) List of the semi-random sequences

Sequence   Sequence elements
SR1        E4 T2 F1 T1 T2 F2 F3 E5 T1 F3
SR2        E1 T2 F1 E1 T2 F1 E1 T1 T2 F2 E5 T1 F3
SR3        E2 T1 T2 F2 E4 T1 T2 F2 E4 T1 F3 E5 T1 F3
SR4        E1 T1 F1 E2 T1 F1 T1 T2 F2 E5 T1 F3
SP         E5 T1 F3

(b) Results of using the “best” sequences on large circuits, as well as the rest of the sequences in Table 6-4(a)

Sequence   C1355 (sec)   C1908 (sec)   C3540 (sec)   C5315 (sec)   C7552 (sec)
S1         105           50            380           507           656
S2         100           47            384           563           631
S3         103           41            402           492           651
S4         98            46            383           627           624
SR1        752           767           2725          5025          8055
SR2        211           189           1655          1772          1467
SR3        905           878           2281          4121          671
SR4        105           51            625           882           1015
SP         2425          2677          9463          16323         32067

6.6 Summary

This chapter presented STAX, a framework for statistical crosstalk slowdown target set compaction, which incorporates an efficient sequence of filters, statistical timing analyzers, and extraction tools. A variation-aware coupled interconnect model was used to account for the effects of local process variations along coupled interconnects. Experimental results confirm that STAX can significantly improve the efficiency of identifying crosstalk targets to be considered for test generation. The results presented in this chapter have mainly been published in [56].

7 CONCLUSIONS

The focus of this thesis was on the development of models and analysis techniques that accurately take into account the growing impact of crosstalk noise as well as process variations. Our logic cell delay models consider the nonlinear and parasitic behavior of logic cells in the presence of crosstalk noise. They can construct the output voltage waveform from the input voltage waveform while accurately accounting for the impact of the shape of the waveforms.
To resolve the shortcomings of the existing current-based models, our current source model captures the nonlinear behavior of the logic cell as well as its Miller and output parasitic effects, considering their dependence on the input and output voltage values. One application of our current source modeling (other than timing analysis) is accurate power estimation of CMOS circuits. For example, [27] presents an accurate model for short-circuit power consumption that can handle input waveforms of arbitrary shapes, including glitches.

One possible extension to this work is to consider the coupling effect in the load of the logic cells. Existing cell delay analyzers use model order reduction to convert a complex load, such as a distributed RC network, to a much simpler one, such as an effective capacitance or a single-stage (lumped) RC. These tools, however, ignore the crosstalk effect that may exist at the output of the cell when calculating its delay. Figure 7-1 shows an example of such a case. The input of cell B is subjected to crosstalk noise through C_m1. In addition, the output of cell B is under attack via C_m2. The conventional techniques assume that the interconnect A_L is quiet, i.e., at its steady-state value, and then use a model order reduction technique to find the effective load. In reality, however, a signal transition at A_L can change the loading of cell B and hence its delay. Therefore, a new load modeling technique can be developed to compute the loading of a cell when its output is under possible crosstalk attack.

[Figure 7-1: Crosstalk effect at the output of a cell. Labels: V_in, V_out, C_m1, C_m2, A_L, B.]

Another possible direction for future work is the consideration of the MIS (Multiple Input Switching) effect in logic cell delay modeling. One of the most important sources of error in cell delay analysis is the use of pin-to-pin delay models, i.e., considering only SIS (Single Input Switching) [4],[6],[69].
This model assumes that only one transition occurs at the inputs of a cell at any time instance. Therefore, in the case of MIS, i.e., simultaneous or near-simultaneous transitions at two or more inputs, it independently calculates the delay from each of the inputs to the output using the cell delay model and reports the maximum of these delays as the cell delay. It has been shown in [4] that ignoring MIS may create errors as high as 26%. Reference [69] goes further and reports errors as high as 100% for stage delay and slew calculation if MIS is not modeled. The goal is to create an accurate current-based model that considers MIS and works efficiently at least for cells with a few inputs, such as three- or four-input cells. However, the cell delay model in [4] and [69] uses a voltage-based technique, and the proposed MIS algorithm is not applicable to current-based cell delay modeling. [69] used the simplest approach that may come to mind, which is to pre-characterize CMOS logic cells with lookup tables that use all input voltage values as well as the output voltage value as the keys to the table. That means that, for a 4-input gate, lookup tables of size 5x5 will be created. Although a simple approach like this can be applied to any current source model, including ours, the large size of the lookup tables makes it impractical.

Finally, the models developed in this work can be enhanced to consider spot defects. The challenging fact in spot defect analysis is that the defect occurrence and its circuit parametric value, such as the resistance of a bridging defect, are probabilistic [35]. Therefore, the analysis of a crosstalk-and-bridge site should contain not only the statistical information regarding process variations, but also the probabilistic information regarding the occurrence of the bridge at each site and regarding its value.

Bibliography

[1] S. Abbaspour, H. Fatemi, and M.
Pedram, “VITA: Variation-aware interconnect timing analysis for symmetric and skewed sources of variation considering variational ramp input,” Proc. Great Lakes Symposium on VLSI (GLSVLSI), pp. 426-430, 2005.
[2] K. Agarwal, Y. Cao, T. Sato, D. Sylvester, C. Hu, “Efficient generation of delay change curves for noise-aware static timing analysis,” Proc. Asia South Pacific Design Automation Conf. (ASP-DAC), pp. 77-84, 2002.
[3] K. Agarwal, D. Sylvester, D. Blaauw, “An effective capacitance based driver output model for on-chip RLC interconnects,” Proceedings of Design Automation Conference (DAC), pp. 376-381, 2003.
[4] A. Agarwal, F. Dartu, D. Blaauw, “Statistical gate delay model considering multiple input switching,” Proc. DAC, pp. 658-663, 2004.
[5] M. Agarwal, K. Agarwal, D. Sylvester, D. Blaauw, “Statistical modeling of cross-coupling effects in VLSI interconnects,” Proc. ASP-DAC, Vol. 1, pp. 503-506, Jan. 2005.
[6] C. Amin, C. Kashyap, N. Menezes, K. Killpack, E. Chiprout, “A Multi-port Current Source Model for Multiple-Input Switching Effects in CMOS Library Cells,” Proc. Design Automation Conference (DAC), pp. 247-252.
[7] M.R. Becer, D. Blaauw, V. Zolotov, R. Panda, I.N. Hajj, “Analysis of noise avoidance techniques in DSM interconnects using a complete crosstalk noise model,” Proc. Design, Automation, & Test Eur. (DATE), pp. 456-463, 2002.
[8] M.R. Becer, D. Blaauw, I. Algor, R. Panda, C. Oh, V. Zolotov, I.N. Hajj, “Postroute gate sizing for crosstalk noise reduction,” Transactions on Computer-Aided Design of Integrated Circuits and Systems, Volume 23, Issue 12, pp. 1670-1677, 2004.
[9] D. Blaauw, V. Zolotov, and S. Sundareswaran, “Slope propagation in static timing analysis,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, pp. 1180-1195, 2002.
[10] M.A. Breuer, S.K. Gupta, S. Nazarian, “Efficient identification of crosstalk induced slowdown targets,” Proc. Asian Test Symp. (ATS), pp. 124-131, Nov. 2004.
[11] H.
Chang, V. Zolotov, C. Visweswariah, S. Narayan, “Parameterized Block-Based Statistical Timing Analysis with Non-Gaussian Parameters and Nonlinear Delay Functions,” Proc. of Design Automation Conf. (DAC), pp. 71-76, 2005.
[12] P. Chen, D.A. Kirkpatrick, K. Keutzer, “Switching window computation for static timing analysis in presence of crosstalk noise,” Proceedings of International Conference on Computer Aided Design (ICCAD), pp. 331-337, 2000.
[13] P. Chen, D.A. Kirkpatrick, K. Keutzer, “Switching window computation for static timing analysis in presence of crosstalk noise,” Proc. Int’l Conf. on Computer-Aided Design (ICCAD), pp. 331-337, 2000.
[14] T. Chen, A. Hajjar, “Statistical timing analysis of coupled interconnects using quadratic delay-change characteristics,” IEEE Trans. on Comp.-Aided Design of Integ. Cir. and Systems, Vol. 23, No. 12, pp. 1677-1683, 2004.
[15] W. Chen, S.K. Gupta, M.A. Breuer, “Analytic models for crosstalk delay and pulse analysis under non-ideal inputs,” Proc. Int’l Test Conf. (ITC), pp. 809-818, 1997.
[16] W.Y. Chen, S.K. Gupta, M.A. Breuer, “Test generation for crosstalk induced faults: Framework and computational results,” Journal of Electronic Testing, Theory and Applications (JETTA), pp. 17-28, 2000.
[17] W.Y. Chen, S.K. Gupta, M.A. Breuer, “Analytical models for crosstalk excitation and propagation in VLSI circuits,” Trans. on Computer-Aided Design of Integ. Cir. & Sys., Vol. 21, No. 10, pp. 1117-1131, 2002.
[18] K.T. Cheng, S. Dey, M. Rodgers, K. Roy, “Test challenges for deep sub-micron technologies,” Proc. Design Automation Conf. (DAC), pp. 142-149, 2000.
[19] J.H. Chern, J. Huang, L. Arledge, P.C. Li, P. Yang, “Multilevel metal capacitance models for CAD design synthesis systems,” Elc. Dev. Letters, Vol. 13, Issue 1, pp. 32-34, 1992.
[20] J. Cong, D. Pan, P.V. Srinivas, “Improved crosstalk modeling for noise constrained interconnect optimization,” Proc. Asia South Pacific Design Automation Conf. (ASP-DAC), pp. 373-378, 2001.
[21] J.F. Croix, D.F. Wong, “Blade and razor: cell and interconnect delay analysis using current-based models,” Proc. Design Automation Conf. (DAC), pp. 386-389, 2003.
[22] F. Dartu, N. Menezes, L.T. Pileggi, “Performance computation for precharacterized CMOS gates with RC loads,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Volume 15, Issue 5, pp. 544-553, 1996.
[23] F. Dartu and L.T. Pileggi, “Modeling signal waveshapes for empirical CMOS gate delay models,” Proc. Int’l Workshop on Power & Timing Modeling, Optimization and Simulation (PATMOS), pp. 57-66, 1996.
[24] W.C. Elmore, “The transient response of damped linear networks with particular regard to wideband amplifiers,” Journal of Applied Physics, Vol. 19, pp. 55-63, Jan. 1948.
[25] Y. Eo, S. Shin, W.R. Eisenstadt, J. Shim, “A decoupling technique for efficient timing analysis of VLSI interconnects with dynamic circuit switching,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Volume 23, Issue 9, pp. 1321-1337, 2004.
[26] H. Fatemi, S. Nazarian, M. Pedram, “Statistical Logic Cell Delay Analysis Using a Current-based Model,” Proc. Design Automation Conference (DAC), pp. 253-256, 2006.
[27] H. Fatemi, S. Nazarian, M. Pedram, “A Current-based Method for Short Circuit Power Calculation under Noisy Input Waveforms,” Proc. Asia Pacific Design Automation Conference (ASP-DAC), 2007.
[28] P.D. Gross, R. Arunachalam, K. Rajagopal, L.T. Pileggi, “Determination of worst-case aggressor alignment for delay calculation,” Proceedings of International Conference on Computer-Aided Design (ICCAD), pp. 212-219, 1998.
[29] M. Hashimoto, Y. Yamada, H. Onodera, “Equivalent waveform propagation for static timing analysis,” International Conference on Computer Aided Design (ICCAD), pp. 169-175, 2003.
[30] M. Hashimoto, Y. Yamada, H. Onodera, “Equivalent waveform propagation for static timing analysis,” IEEE Trans. Computer-Aided Design of Integ.
Circuits & Systems, Vol. 23, No. 4, pp. 498-508, 2004.
[31] P. Heydari, M. Pedram, “Analysis and reduction of capacitive coupling noise in high-speed VLSI circuits,” Proc. Int’l Conf. on Computer Design (ICCD), pp. 104-109, 2001.
[32] P. Heydari, M. Pedram, “Analysis and reduction of capacitive coupling noise in high-speed VLSI circuits,” IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, Vol. 24, Issue 3, pp. 478-488, March 2005.
[33] I. Huang, S.K. Gupta, M.A. Breuer, “Accurate and efficient static timing analysis with crosstalk,” Int’l Conf. on Computers and Processors (ICCD), pp. 265-272, 2002.
[34] S. Irajpour, S. Nazarian, L. Wang, S.K. Gupta, M.A. Breuer, “Analyzing crosstalk in the presence of weak bridge defects,” Proc. VLSI Test Symp. (VTS), pp. 385-392, 2003.
[35] A. Jee, F.J. Ferguson, “Carafe: an inductive fault analysis tool for CMOS VLSI circuits,” IEEE Digest of Papers on VLSI Test Symposium, pp. 92-98, 1993.
[36] A. Kahng, S. Muddu, D. Vidhani, “Noise and delay uncertainty studies for coupled RC interconnects,” Proc. Int’l ASIC/SOC Conference, pp. 3-8, 1999.
[37] S. Kang, Y. Leblebici, “CMOS Digital Integrated Circuits: Analysis and Design,” McGraw-Hill College, 3rd edition, 2002.
[38] C.V. Kashyap, C.J. Alpert, A. Devgan, “An “effective” capacitance based delay metric for RC interconnect,” Proceedings of IEEE/ACM International Conference on Computer Aided Design (ICCAD), pp. 229-234, 2000.
[39] I. Keller, K. Tseng, N. Verghese, “A robust cell-level crosstalk delay change analysis,” Proceedings of International Conference on Computer-Aided Design (ICCAD), pp. 147-154, 2004.
[40] A. Korshak, J.C. Lee, “An effective current source cell model for VDSM delay calculation,” Proceedings of International Symposium on Quality of Electronic Designs (ISQED), pp. 296-300, 2001.
[41] J. Le, X. Li, L.T. Pileggi, “STAC: Statistical timing analysis with correlation,” Proc. of DAC, pp. 343-348, 2004.
[42] P. Li, F. Liu, X. Li, L.T.
Pileggi, S.R. Nassif, “Modeling interconnect variability using efficient parametric model order reduction,” Proc. of DATE, pp. 958-963, 2005.
[43] P. Li and E. Acar, “Waveform independent gate models for accurate timing analysis,” Proc. ICCD, pp. 363-365, 2005.
[44] M.A. Margolese, F.J. Ferguson, “Using temporal constraints for eliminating crosstalk candidates for design and test,” Proc. VLSI Test Symp. (VTS), pp. 80-85, 1999.
[45] M. Martina, G. Masera, “A statistical model for estimating the effect of process variations on crosstalk noise,” Proceedings of the 2004 International Workshop on System Level Interconnect Prediction (SLIP), pp. 115-120, 2004.
[46] S. Mei, J. Kawa, C. Chiang, Y.I. Ismail, “An accurate low iteration algorithm for effective capacitance computation,” IEEE Int’l Workshop on SoC for Real-Time Applications (IWSOC), pp. 99-104, 2004.
[47] S. Nassif, et al., “A methodology for modeling the effects of systematic within-die variation,” Proc. of DAC, pp. 172-175, 2000.
[48] S. Nazarian, H. Huang, S. Natarajan, S.K. Gupta, and M.A. Breuer, “XIDEN: Crosstalk target identification framework,” Proc. Int’l Test Conf. (ITC), pp. 365-374, 2002.
[49] S. Nazarian, M. Pedram, “Delay analysis of coupled interconnects in VDSM technologies,” journal paper under review.
[50] S. Nazarian, M. Pedram, E. Tuncer, T. Lin, “Sensitivity-based gate delay propagation in static timing analysis,” ACM/IEEE Workshop on Timing Issues in the Specification and Synthesis of Digital Systems (TAU), pp. 20-25, Feb. 2005.
[51] S. Nazarian, M. Pedram, E. Tuncer, T. Lin, “SDP: Sensitivity-based gate delay propagation in static timing analysis,” Proc. of International Symposium on Quality of Electronic Designs (ISQED), pp. 536-541, Mar. 2005.
[52] S. Nazarian, M. Pedram, E. Tuncer, T. Lin, A.H. Ajami, “Modeling and propagation of noisy waveforms in static timing analysis,” Proc. of Design Automation and Test in Europe (DATE), pp. 776-777, Feb. 2005.
[53] S. Nazarian, M. Pedram, E.
Tuncer, “An empirical study of crosstalk in VDSM technologies,” Proc. of GLSVLSI, pp. 104-109, April 2005.
[54] S. Nazarian and M. Pedram, “CGTA: Current gain-based timing analysis for logic cells,” to appear in Proc. of Asia and South Pacific Design Automation Conference (ASPDAC), Jan. 2006.
[55] S. Nazarian and M. Pedram, “Cell delay analysis based on rate-of-current change,” to appear in Proc. of Design Automation and Test in Europe (DATE), March 2006.
[56] S. Nazarian, M. Pedram, S.K. Gupta, M.A. Breuer, “STAX: Statistical crosstalk target set compaction,” Proc. Design Automation and Test in Europe (DATE), Vol. 2, pp. 1-6, 2006.
[57] S. Nazarian and M. Pedram, “Gain-based Cell Delay Modeling,” to appear in Proc. of International Symposium on VLSI Design, Automation, and Test (VLSI-DAT), April 2006.
[58] S. Nazarian, A. Iranli, and M. Pedram, “Crosstalk analysis in nanometer technologies,” Proc. of Great Lakes Symposium on VLSI, pp. 253-258, April 2006.
[59] N. NS, T. Bonifield, A. Singh, C. Bittlestone, U. Narisimha, V. Le, A. Hill, “BEOL variability and impact on RC extraction,” Proc. of DAC, pp. 758-759, 2005.
[60] A. Odabasioglu, M. Celik and L. Pileggi, “PRIMA: passive reduced-order interconnect macromodeling algorithm,” IEEE Trans. on CAD, Vol. 17, No. 8, pp. 645-653, August 1998.
[61] J. Qian, S. Pullela, L. Pillage, “Modeling the effective capacitance for the RC interconnect of CMOS gates,” Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 13, pp. 1526-1535, 1994.
[62] C.L. Ratzlaff, L.T. Pillage, “RICE: rapid interconnect circuit evaluation using AWE,” IEEE Trans. on CAD, Vol. 13, Issue 6, pp. 763-776, June 1994.
[63] M. Ringe, T. Lindenkreuz, E. Barke, “Static timing analysis taking crosstalk into account,” Proceedings of Design, Automation, and Test in Europe Conference (DATE), pp. 451-455, 2000.
[64] A. Rubio, N. Itazaki, X. Xu, K.
Kinoshita, “An approach to the analysis and detection of crosstalk faults in digital VLSI circuits,” Tran. Comp.-Aided Design of Integ. Cir. & Sys., Vol. 13, pp. 387-395, 1994. 156 [65] A.D. Sathe, M.L. Bushnell, V.D. Agrawal, “Analog macromodeling of capacitive coupling faults in digital circuit interconnects,” Proc. Int’l Test Conf. (ITC), pp. 375-383, 2002. [66] G.D. Sinah, H. Zhou, “Gate sizing for crosstalk reduction under timing constraints by Lagranian relatxation,” Proceedings, of Computer Aided Design (ICCAD), pp. 14-19, 2004. [67] J. Singh, S. Sapatnekar, “Statistical timing analysis with correlated non- gaussian parameters using independent component analysis,” Proc. of Design Automation Conference (DAC), pp. 155-160, July 2006. [68] S. Sirichotiyakul, D. Blaauw, C. Oh, R. Levy, V. Zolotov, and J. Zuo, “Driver modeling and alignment for worst-case delay noise,” Proceedings of Design Automation Conference (DAC), 2001, pp. 720-725. [69] J. Sridharan, T. Chen, “Modeling Multiple Input Switching of CMOS Gates in DSM Technology Using HDMR,” Proc. Design, Automation and Test in Europe (DATE), 2006. pp. 1-6. [70] H. Takahashi, K.J. Keller, K.T. Le, K.K. Saluja, Y. Takamatsu, “A method for reducing the target fault list of crosstalk faults in synchronous sequential circuits,” IEEE Tran. On Computer-Aided Design of Integrated Circuits and Systems, Vol. 24, Issue 2, pp. 252-263, Feb. 2005. [71] H. Takahashi, K.J. Keller, K.T. Le, K.K. Saluja, Y. Takamatsu, “A method for reducing the target fault list of crosstalk faults in synchronous sequential circuits,” IEEE Trans. on CAD, Vol. 24, Issue 2, pp. 252-263, Feb. 2005. [72] C. Tsai, M. Marek-Sadowska, “Modeling crosstalk induced delay,” Proc. Int’l Symp. on Quality Electronic Design (ISQED), pp. 189-194, 2003. [73] C. Visweswariah, “Death, taxes and failing chips” Proc. of Design Automation Conf. (DAC), pp. 343-347, 2003. [74] C. Visweswariah, K. Ravindran, K. Kalafala, S.G. Walker, S. Narayan, D.K. Beece, J. 
Piaget, N. Venkateswaran, J.G. Hemmett, “First-Order Incremental Block-Based Statistical Timing Analysis,” IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, Volume 25, Issue 10, pp. 2170-2180, Oct. 2006. [75] T. Xiao, M. Marek-Sadowska, “Worst delay estimation in crosstalk aware static timing analysis,” Proceedings of International Conference on Computer Design (ICCD), pp. 115-120, 2000. 157 [76] J. Yu, F.J. Ferguson, “Maximum likelihood estimation for failure analysis [IC yield],” IEEE Trans. Semiconductor Manufacturing, Vol. 11, Issue 4, pp. 681-691, 1998. [77] S.T. Zachariah, Y. Chang, S. Kundu, C. Tirumurti, “On modeling crosstalk faults,” Proc. Design, Automation & Test Eur. (DATE), pp. 490-495, 2003. [78] Y. Zhan, A.J. Strojwas, X. Li, L.T. Pileggi, D. Newmark, M. Sharma “Correlation-Aware Statistical Timing Analysis with Non-Gaussian Delay Distributions,” Proc. of Design Automation Conf. (DAC), pp. 77-82, 2005. [79] L. Zhang, Y. Hu, C.C.P. Chen, “Statistical timing analysis with path reconvergence and spatial correlations,” Proc. of Design, Automation and Test in Europe (DATE), Volume 1, 5 pages, March 2006. [80] V. Zolotov, D. Blaauw, S. Sirichotiyakul, M. Becer, C. Oh, R. Panda, A. Grinshpon, R. Levy, “Noise propagation and failure criteria for VLSI designs,” ICCAD, pp. 587-594, 2002.
Abstract
This dissertation investigates the effect of capacitive crosstalk on interconnect and logic cell (gate) delay modeling and calculation in state-of-the-art CMOS VLSI designs. First, based on distributed RC-π modeling of an interconnection, a detailed simulation-based study of the propagation delay of a pair of crosstalk-affected interconnect lines is presented. This is followed by a detailed model and delay analysis of coupled interconnect lines subject to manufacturing process and environmental variations. Next, the focus shifts to delay analysis of logic cells (gates) in a VLSI circuit. Two different approaches to logic cell delay analysis, one motivated by voltage-based modeling of a CMOS gate
Asset Metadata
Creator
Nazarian, Shahin (author)
Core Title
Timing analysis of coupled interconnect and CMOS logic cells in the presence of crosstalk noise
School
Viterbi School of Engineering
Degree
Doctor of Philosophy
Degree Program
Electrical Engineering (VLSI Design)
Publication Date
10/30/2006
Defense Date
09/25/2006
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
coupled interconnect, current-based model, OAI-PMH Harvest, process variation, statistical, timing analysis
Language
English
Advisor
Pedram, Massoud (committee chair), Draper, Jeffrey T. (committee member)
Creator Email
shahin@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-m114
Unique identifier
UC1224776
Identifier
etd-Nazarian-20061030 (filename), usctheses-m40 (legacy collection record id), usctheses-c127-30728 (legacy record id), usctheses-m114 (legacy record id)
Legacy Identifier
etd-Nazarian-20061030.pdf
Dmrecord
30728
Document Type
Dissertation
Rights
Nazarian, Shahin
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Repository Name
Libraries, University of Southern California
Repository Location
Los Angeles, California
Repository Email
cisadmin@lib.usc.edu