Trustworthiness of integrated circuits: a new testing framework for hardware Trojans
TRUSTWORTHINESS OF INTEGRATED CIRCUITS: A NEW TESTING FRAMEWORK FOR HARDWARE TROJANS

by Byeongju Cha

A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the Requirements for the Degree
DOCTOR OF PHILOSOPHY (ELECTRICAL ENGINEERING)

May 2015

Copyright 2015 Byeongju Cha

Dedication

To my family

Acknowledgements

It is a pleasure to thank all those who helped me in this long journey. I would never have been able to finish this dissertation without the guidance of my advisor and committee members, help from my colleagues, and support from my family.

First and foremost, I would like to express my deepest gratitude to my advisor, Dr. Sandeep K. Gupta, for his excellent mentorship, support, patience, and immense knowledge. His valuable advice and guidance always helped me in all the times of research and writing of this dissertation. I would like to thank my committee members, Dr. Melvin A. Breuer, Dr. Massoud Pedram, Dr. Aiichiro Nakano and Dr. Paul Bogdan, for their time and feedback to my work. I would also like to thank Prof. Gandhi Puvvada, for providing me the valuable opportunity to participate in his classes.

I would like to thank Hsunwei Hsiung, as a good mentor, who was always willing to help me from the beginning of my study. I would also like to thank Prasanjeet Das for his work and implementation on test generation. I also thank my colleagues in my research group, Doochul Shin, Yue (Sam) Gao, Da Cheng, Jianwei Zhang and Mohammad Mirza-Aghatabar, for their comments on the work and dissertation. Special thanks go to other group members, Jizhe Zhang, Fangzhou Wang, Yang Zhang and Xuan Zuo.

I would also like to thank my wife, Julia, for her invaluable encouragement. She always cheered me up and stood by me throughout my study and writing this dissertation. I sincerely thank my parents and brother, for their unconditional love and support.
Without their support and faith in me, I would not be the man I am today. Last but not least, I thank God I believe for everything that happened for me.

Table of Contents

Dedication
Acknowledgements
List of Tables
List of Figures
Abstract

CHAPTER 1. INTRODUCTION
  1.1 Background and Motivation
  1.2 Related Research and Limitations
    1.2.1 Trojan Taxonomies and Trojan Designs
    1.2.2 Trojan Detection Methods
  1.3 Challenges
  1.4 Key Research Tasks
  1.5 Dissertation Outline

CHAPTER 2. CHARACTERIZATION OF TROJANS
  2.1 Introduction
  2.2 Capabilities and Effectiveness of Delay Measurements
    2.2.1 Delay Fault Models and Test Generation Methods
    2.2.2 Delay Measurement Methods
    2.2.3 Effectiveness of Delay Measurements
  2.3 Trojan Models
    2.3.1 Conditions for Maximally Challenging Trojans
    2.3.2 Minimally Delay-Invasive Models
    2.3.3 Maximally-Matched Models
    2.3.4 Analysis of the Models

CHAPTER 3. DETECTION OF MINIMALLY DELAY-INVASIVE TROJANS VIA DELAY MEASUREMENT
  3.1 Introduction
  3.2 Process Variations and Simulation Models
  3.3 Problem Statement
  3.4 Shortest Delay Path Selection
  3.5 Conditions for Vector Generation
  3.6 Calibration of Process Variations
  3.7 Path Delay Measurement
  3.8 Estimation of the Number of Chips to be Tested
  3.9 Algorithm
  3.10 Experimental Results
  3.11 Conclusions

CHAPTER 4. A METHOD TO MINIMIZE THE IMPACT OF TROJANS VIA RESIZING
  4.1 Introduction
  4.2 Problem Statement
  4.3 Gate Resizing Method and Analysis
    4.3.1 Fitness Function
    4.3.2 Analysis of the Impact of Trojans on Delay, Area, and Power
  4.4 Proposed Approach
    4.4.1 Candidate Gate Selection
    4.4.2 Candidate Pruning
    4.4.3 Optimal Gate Resizing
    4.4.4 Iterative Improvement
    4.4.5 Algorithm
    4.4.6 Implementation of Gate Resizing
    4.4.7 Complexity of the Approach
  4.5 Experimental Setup
  4.6 Experimental Results
  4.7 Conclusions

CHAPTER 5. DETECTION OF MAXIMALLY-MATCHED MODELS OF TROJANS
  5.1 Introduction
  5.2 Problem Statement
  5.3 Definitions and Notations
  5.4 Gate Delay Model
  5.5 Proposed Approach
    5.5.1 The Effects of Single Instance of Circuit Transformations on Delays
      5.5.1.1 Transformations using De Morgan’s law
      5.5.1.2 Decomposition of a Large Fanin Gate into Gates with Smaller Fanins
      5.5.1.3 Algebraic Substitution
      5.5.1.4 Theoretical Proofs for Non-Reconvergent Circuits
    5.5.2 Sensitivity Analysis
      5.5.2.1 Finding Vectors Invoking the Maximum Difference in Delays
      5.5.2.2 Sub-Circuit Analysis
  5.6 Experimental Results
  5.7 Extensions of the Proposed Approach

CHAPTER 6. FUTURE RESEARCH

References

List of Tables

Table 1: The description of minimally delay-invasive models and maximally-matched models of Trojans
Table 2: Symbols for variation sources
Table 3: Variation models for three cases, in terms of four sources of variations
Table 4: The test costs and Trojan coverages (TC) for some benchmark circuits using various combinations of our approaches. (a) A baseline method of classical delay testing without calibration of process variations. (b) Our approach of selecting shortest delay paths without calibration of process variations. (c) Our approach of classical delay testing with calibration of process variations. (d) Our approach of selecting shortest delay paths with calibration of process variations.
Table 5: The minimum, maximum, and average differences in total power consumption (from Figure 21(a) and (b))
Table 6: The difference (percentage) in the total power consumption and delay (A) between C and C1, and (B) between C and CT for the selected benchmark circuits. NTS: the number of Trojan sites, NTB: the number of gates and components in the Trojan block.
Table 7: The average number of chips required to detect a Trojan in the Trojan-affected circuit without resizing (No), and the Trojan-affected circuit after resizing (NResize) for various ISCAS ’85 and ’89 benchmark circuits for 65nm and 45nm technologies, where γ0 = 0.01, γ1 = 0.05 and confidence level of 95%. The maximum number of iterations for the algorithm is 50.
Table 8: SIS and MIS of a 3-input NAND gate
Table 9: Delays of two circuits shown in Figure 31 for two different vectors to the input of the circuits, V1 and V2.
Table 10: Delays of two circuits shown in Figure 34 for two different vectors to the input of the circuits, V1 and V2.
Table 11: Summary of minimally delay-invasive Trojans and three circuit transformations studied
Table 12: Sub-circuit analysis on three possible transformations (targets) for a 4-input NAND, using the 65nm industrial technology
Table 13: SIS-TNC coverage, and the average number of chips to be tested to detect each one of the target scenarios applied to each multi-input gate, in selected ISCAS ’85 and ’89 benchmark circuits.

List of Figures

Figure 1: The scenario for IC design and fabrication considered in this dissertation.
Figure 2: Delay measurement versus power/ground currents measurement.
Figure 3: A minimally delay-invasive Trojan.
Figure 4: Trojan taxonomies proposed in [77][86].
Figure 5: (a) Original circuit, C, with the original netlist and sizing factors. (b) A Trojan-affected circuit with a minimally delay-invasive Trojan (but without resizing), C1, where a Trojan block is connected to an arbitrary line in the original circuit via a gate having minimum input capacitance.
Figure 6: The distribution of delay at an output of s420 considering process variations for the original circuit version, and a version with a Trojan sited on line 371. For a vector that excites (a) a longer delay path, and (b) a shorter delay path.
Figure 7: Conditions for robust delay fault testing for an on-path NAND gate. The thick and thin lines denote on-path and off-path lines, respectively.
Figure 8: Conditions for a NAND gate along a surrogate path, to detect above category of Trojans.
Figure 9: Measurement of delay of path Pk, where the size of the logic block is smaller than the fabricator-specified local area, represented with dotted lines.
Figure 10: Delay distribution of path k: (a) full-statistical variations, and (b) local-nearby variations.
Figure 11: Probability density functions of delays measured on an inverter chain with twelve inverters for three cases of variations.
Figure 12: Obtaining calibrated delay values from N data points measured from N fabricated chips.
Figure 13: Path delay measurement architecture.
Figure 14: Type I (false positive) and Type II (false negative) error probabilities given two probability density functions, fA (original) and fB (Trojan-affected), of a particular parameter, and rejection threshold.
Figure 15: An overview of our prototype tool for generating vectors for detecting Trojans.
Figure 16: The number of chips to be tested for detecting a minimally delay-invasive Trojan sited at each line in c17, using our approach of targeting shortest delay path and classical delay testing method. (a) Student’s t-test. (b) Likelihood-ratio based test.
Figure 17: (a) An original circuit, C. (b) A Trojan-affected circuit with a minimally delay-invasive Trojan (with resizing in order to maximally match delays of every path with (a)), CT, where a Trojan block is connected to an arbitrary line in the original circuit via a gate having minimum input capacitance.
Figure 18: Delay distributions of the same path, from the original circuit (solid curve) and the Trojan-affected circuit (dotted curve)
Figure 19: Example of paths to be considered when resizing a gate
Figure 20: Current measured from three versions of the circuit using a vector that excites a Trojan at line 627 in c880.
Figure 21: The difference in power consumption (a) between the original circuit and the Trojan-affected circuit without our approach, and also (b) between the original circuit and the Trojan-affected circuit with our approach, for c880
Figure 22: Example of gate categorization
Figure 23: Example of candidate selection in the first iteration. Three gates in the circle are candidates and their optimal sizing factors are computed
Figure 24: The circuit’s fitness during the course of applying the algorithm on c880, where γ0 = 0.01 and γ1 = 0.05. ‘Our approach’ indicates the result with the new metric βg, and ‘only fitness’ represents the result using (y0-y) only in the best candidate selection.
Figure 25: Gate resizing algorithm
Figure 26: Two versions of the layout of s510, (a) without gate resizing and (b) with gate resizing and the minimum-sized inverter whose input is connected to line 35
Figure 27: (a) An original circuit, C. (b) A Trojan-affected circuit with a maximally-matched Trojan, Ci (with resizing in order to maximally match delays of every path with (a)).
Figure 28: RC equivalent circuits of a 2-input NAND for two input patterns, (a) (A, B) = (R, S1) and (b) (A, B) = (S1, R), where xi’s and yi’s are the sizings of p- and n-transistors, vCdp(x) and vCdn(x) are diffusion capacitances of p- and n-transistors with sizing x, respectively, and vRp(x) and vRn(x) are drain-to-source resistances of p- and n-transistors with sizing x, respectively.
Figure 29: (a) Transformation of an arbitrary multi-input gate, g, into its structurally-dual gate, g', where F is the logic subcircuit in fanin of g, I1 and I2 are inverters added/moved due to the transformation, and fi is logic function implemented by the ith input of g and fi=NOT(fi). (b) Transistor-level diagrams of g and g', where capacitors within dashed lines are internal capacitances whose values are the sum of drain/source capacitances of two consecutively connected transistors.
Figure 30: Two different implementations of 4-input AND function, NAND4-INV (left) and INV-NOR4 (right)
Figure 31: RC equivalent circuits for (a) a 4-input NAND and (b) a 4-input NOR gate when V1 and V2 are applied to the circuit, where ai’s and bi’s are the sizings of p- and n-transistors of the 4-input NAND, and xi’s and yi’s are the sizings of p- and n-transistors of the 4-input NOR gate, respectively. The blue arrows drawn at each RC equivalent circuit indicate the direction of charge/discharge paths of the gates when the corresponding vectors shown below are applied to the input of the gates.
Figure 32: (a) Decomposition of an arbitrary multi-input gate, g, into gates with smaller fanins, gi', and a logic block H, where F is the subcircuit in fanin of g, and fi is logic function implemented by ith input of g. (b) Transistor-level diagrams of g and g1', where the two gates have different numbers of internal capacitances.
Figure 33: Two different implementations of 4-input NAND function, NAND4-INV (left) and NAND2-NOR2-INV (right)
Figure 34: RC equivalent circuits for (a) a 4-input NAND, g, and (b) a 2-input NAND, g1', when V1 and V2 are applied to the circuit, where ai’s and bi’s are the sizings of p- and n-transistors of g, and xi’s and yi’s are the sizings of p- and n-transistors of g1', respectively. The blue arrows drawn at each RC equivalent circuit indicate the direction of charge/discharge paths of the gates when the same vectors are applied to the input of the gates.
Figure 35: An arbitrary circuit, C, is transformed to Ci via algebraic substitution, where two lines with identical logic functions, say li and lj, are selected and line li is substituted into lj.
Figure 36: A non-reconvergent combinational logic block, C0, is transformed into Ci, where l is an arbitrary line, CAi is the input cone of l, and CBi is the rest of the circuit.
Figure 37: Two SIS-TNC for a 4-input NAND gate causing the greatest delay difference

Abstract

High cost differentials are causing many stages of integrated circuit (IC) development, including design and fabrication, high-volume testing, and packaging, to move overseas at an increasing rate. Consequently, it is increasingly common for a new IC’s original designers to lose direct control of many design and fabrication steps. Designers therefore face the possibility of hardware tampering during the manufacturing process, known as hardware Trojan insertion. These hardware Trojans are expected to be designed and inserted by an intelligent and resourceful adversary to gain unauthorized access to information or unauthorized control. Developing techniques to detect hardware Trojans is thus becoming more important for ensuring the trustworthiness of digital ICs, especially when they are fabricated by untrusted vendors.

Detection of hardware Trojans is challenging because the specifics of hardware Trojans are unknown and difficult to predict. Furthermore, the most challenging types of Trojans do not change the logic behavior of the original circuit and cause only minimal deviation in the circuit’s parameters, while the levels of process variations are high and continue to increase.
Moreover, Trojan designers are expected to be well-financed, to be familiar with state-of-the-art detection techniques, and to keep improving their techniques to make their Trojans more sophisticated and harder to detect. In this context, this dissertation introduces a new framework for evaluating the trustworthiness of digital ICs, especially those fabricated by untrusted vendors, and addresses all of the abovementioned challenges of Trojan detection.

First, our new framework comprehensively identifies and characterizes the changes caused by Trojans. Unlike traditional methods that enumerate many specific types of Trojans, we provide a systematic approach for developing a universal set of Trojans, based on our canonical models of deviation from the original design that span all possible circuit-level modifications. From the derived set of Trojans, we study the conditions for Trojans that are maximally challenging for any parametric measurement method, namely minimally-invasive Trojans and maximally-matched Trojans, in order to make our detection approaches effective. In addition, our approaches are designed to be maximally effective, in the sense that they maximally harness the information that can be gathered by applying selected vector sequences and measuring parameter values, and produce results with minimum cost at a given level of process variations, a given measurement noise, and a given level of confidence.

In particular, we first evaluate the effectiveness of measuring delays for detecting Trojans, and show that even maximally challenging Trojans can be detected by delay measurements. We also propose methods to estimate the delay change caused by each model of Trojans when both the circuit and the Trojans are designed to cause only minimal deviation in the circuit’s delay.
Furthermore, we develop approaches to reduce the cost of measurements for detecting Trojans, including methods for selecting paths, generating vectors, and calibrating the effects of process variations, which further improve the effectiveness of delay measurements. Last, we develop techniques for adding specially designed features to the design that make it difficult, and preferably impossible, for untrusted vendors to insert Trojans.

CHAPTER 1. INTRODUCTION

1.1 Background and Motivation

The continued scaling of IC fabrication technology has profoundly changed most areas of digital circuit design. In particular, the high fixed costs of building and maintaining state-of-the-art semiconductor IC fabrication facilities have caused dramatic consolidation, in which a decreasing number of vendors serve large numbers of fabless IC design houses. In addition, designers of relatively low-volume applications cannot develop state-of-the-art fabrication facilities by themselves, and hence they are increasingly forced to use the services of outside vendors. For example, the fabrication capacity of the United States has been falling (to 24% of the capacity in 2003), and many steps of IC design and fabrication are moving overseas to reduce cost [22]. Thus, it is becoming more common for a new IC’s original designers to lose direct control of many design and fabrication steps. This increases the opportunities for intelligent and resourceful adversaries to tamper with the circuit by introducing hardware Trojans during fabrication.

A hardware Trojan is a malicious modification of a design, which may result in incorrect behavior of the circuit. An adversary might tamper with the circuit by adding malicious circuitry to the circuit’s layout during fabrication, so that the circuit serves the adversary’s purpose (Figure 1).
The purpose of a hardware Trojan depends on the intent of the Trojan designer, and may include changing the logic behavior of the circuit (unauthorized control), degrading the performance (sabotage), or leaking information (unauthorized access). The possible insertion of hardware Trojans is of special concern to two sectors, sensitive applications and defense, which typically manufacture chips in relatively small volumes. However, recent trends have accelerated the use of overseas fabricators, even for chips in the cryptography and computer security domains.

Figure 1: The scenario for IC design and fabrication considered in this dissertation. [Figure labels, in extraction order: RTL design; Logic design; Layout design; Testing fabricated chips; Untrusted vendor; Trustworthy or not?; Specifications; RTL Netlist; Gate-level design (netlist+sizings); Circuit layout; Fabricated chips; Chip Fabrication]

The importance of detecting and preventing hardware Trojans has been addressed in many reports, including one from the United States Department of Defense [22]. In particular, the trustworthiness of chips for sensitive and defense applications is reported to be greatly threatened, as defense industries tend to rely on untrusted fabricators for economic reasons. The Defense Advanced Research Projects Agency (DARPA) has also warned of the potential threat of hardware Trojans [101]. According to DARPA’s scenario, an intelligent adversary that is well-financed and able to understand and manufacture the original design may add carefully designed Trojans that serve the adversary’s purpose and are extremely difficult for the original designer to detect. The potential threat of Trojan attacks is also reported to be a growing concern for industry as well, since many IC design companies go fabless and fabricate their chips using the facilities of outside fabricators [51][53]. Furthermore, the threat of hardware Trojans is becoming an increasingly important issue for many other countries that have no or limited capability to manufacture chips. For example, a report from the Australian Department of Defence warned that hardware Trojans may be designed to defeat any or all security mechanisms and subvert or augment the normal operation of electronic devices [8].

Due to this trend, there has been an increasing need for trusted IC fabrication. In an attempt to secure the production of ICs, the National Security Agency (NSA) launched the Trusted Access Program Office (TAPO) to provide a service for reliable production of ICs using the facilities of only trusted foundries [80]. However, it is not feasible to manufacture every chip for all types of strategically important applications using only a few trusted vendors.

Although concerns about hardware Trojans continue to grow, Trojans have been identified in very few real-life chips. One widely reported case is a Trojan in an external hard drive by Maxtor that captured passwords and forwarded them to a remote site [40]. The authors of [73] reported a backdoor in Actel ProASIC3 FPGAs, which are presently used in military devices. Using that backdoor, an adversary can dump data stored in the FPGA’s memory by applying a simple undocumented command to the FPGA’s JTAG interface. In addition, some researchers proposed a special Trojan that can sabotage the cryptographic capabilities of a microprocessor, and demonstrated its use by inserting it into Intel’s Ivy Bridge processor [9].

Our challenge is compounded by the fact that hardware Trojans are likely to be designed and inserted by a well-financed and highly intelligent adversary, with a highly trained team of experts who are aware of state-of-the-art practices in design, post-silicon (PS) validation [1], and high volume manufacturing (HVM) testing [33] for manufacturing defects, as well as emerging practices in Trojan detection. A well-designed Trojan is therefore expected to pass all traditional methods, such as PS validation and HVM testing, without being detected. Thus, we need a new
So a well-designed Trojan is expected to pass all traditional methods, such as PS validation or HVM testing, without being detected. Thus, we need a new testing framework for investigating the nature of hardware Trojans and developing detection techniques to defeat every one of them.

1.2 Related Research and Limitations

A number of studies have enumerated possible hardware Trojans to develop taxonomies, and have developed detection techniques against these kinds of Trojans.

1.2.1 Trojan Taxonomies and Trojan Designs

As discussed in Section 1.1, only a few cases of real Trojans are presently known, possibly due to confidentiality reasons and the difficulty of Trojan detection. Thus, many researchers have developed taxonomies by imagining and enumerating a wide range of Trojans and identifying their characteristics. These taxonomies consider specific scenarios of hardware Trojans and try to enumerate them in terms of a few attributes.

Among early works, the authors in [2] proposed a definition of Trojans and provided some examples of possible Trojans. According to the definition in [2], a Trojan is circuitry that consists of intrusions and attacks, where intrusions denote hostile modifications of a circuit that occur before deployment, and attacks are actions that occur after deployment. Similarly, many Trojan taxonomies have been proposed that develop a few attributes, like intrusions and attacks, which can be used to identify Trojans [38][63][77][86][93]. In [93], the authors developed a taxonomy that categorizes Trojans in terms of two attributes, ‘trigger’ and ‘payload’, where the ‘trigger’ is the part of the Trojan logic that monitors the logic values of some lines in the original circuit and activates the Trojan, and the ‘payload’ is the part of the Trojan that takes some action when activated.
The taxonomies proposed in [77] and [86] include one more attribute, called ‘physical characteristics’, which specifies the layout size, type, and location of a Trojan; the remaining attributes, ‘activation characteristics’ and ‘action characteristics’, are similar to the ‘trigger’ and ‘payload’ proposed by [93], respectively. On the other hand, the taxonomies proposed in [38][63] introduce three more attributes in addition to ‘activation mechanism’ (equivalent to ‘trigger’) and ‘effects’ (equivalent to ‘payload’). Two of them are ‘design phase’ (or ‘insertion phase’) and ‘abstraction level’, which identify the design/fabrication stage at which a Trojan is inserted and the abstraction level of the Trojan. For example, malicious code can be inserted into a register-transfer level (RTL) design during the design phase, or a GDSII-formatted layout can be manipulated during the fabrication steps. The last attribute is ‘location’, which defines the physical location of a Trojan. So these taxonomies consist of a total of five attributes: design phase, abstraction level, activation mechanism, effects, and location.

Many detection strategies are developed and evaluated based on very specific types of Trojans, such as specific designs for extra Trojan circuitry or insertion of simple components, e.g., gates, counters, comparators, etc. [77]. However, it is dangerous to rely upon only a few specific types of Trojans while developing detection techniques. Such approaches are unable to cover every possible Trojan scenario, because doing so would require specific details that Trojan detectors do not have.

On the other hand, many techniques have been proposed to design the most challenging Trojans.
First, the Polytechnic Institute of New York University has hosted a competition, the Embedded System Challenge (ESC), for many years, in which the participants design Trojans on FPGAs against a given detection approach, or propose approaches to detect unknown Trojans, specifically those embedded in FPGAs [25]. The results of the competition have shown that well-designed Trojans often escape existing detection techniques [7][35]. The authors in [24][82][96] show approaches to insert Trojans into an original design written in RTL, which indicates that insertion of hardware Trojans during the RTL design phase may be a feasible scenario. However, in this dissertation we focus on possible attack scenarios at the layout level in ASIC or custom designs, as the fabrication stage is the one most likely to be carried out by untrusted foundries and hence is the most vulnerable to attacks by the adversary. In addition, every attack on an RTL design, e.g., the use of untrusted third-party IPs containing malicious circuitry, must be carried out before fabrication, when the original designer still has full control over the design and the capability to detect the attack. Thus, for now we exclude such Trojans from our framework.

In [87], the authors propose approaches for reducing the impact of Trojans on the values of parameters such as static/dynamic power and delay, by selecting the lines to which the input of the Trojan gate is connected and by intentionally aging the Trojan gate. In addition, the approach in [67] introduces a Trojan that inserts an additional gate to leak the value of an internal line in the circuit. Though this approach does not change the logic behavior of the circuit (the additional gate just behaves as an additional fanout of the attacked circuit line), it induces additional load and increases the delays of paths passing via that line. Also, another Trojan design technique has been proposed that targets a specific Trojan detection method called UCI [75].
However, this approach does not consider the changes in values of parameters due to Trojan insertion.

1.2.2 Trojan Detection Methods

Existing approaches to detect Trojans can be categorized into destructive and non-destructive methods. Detecting Trojans by destructive physical inspection or reverse engineering of a few sampled chips has been proposed [79]. But the costs of destructive techniques are very high, and physical inspection may not always succeed in locating well-designed Trojans [66]. Thus, it is important to detect the impact of a Trojan on the logic functionality or on the values of circuit parameters using non-destructive methods, prior to embarking on any destructive method.

Existing non-destructive Trojan detection methods can be viewed as belonging to two major categories. In the first category are logic test methods [5][34][70][71][76][98], which apply vectors and examine logic values at the circuit's outputs. In the second category are parametric measurement methods [4][41][50], which apply vectors and measure values of parameters, such as power/ground currents [6][59][61][85][97], temperature [28][31][56], or delays [23][36][42][44][45][59][62][88][94]. In addition, some on-line detection techniques that observe the frequency or temperature during run-time have been proposed [28][97].

Among the various logic test methods, one method proposes to characterize every output of the circuit by obtaining the probability distribution of each output's logic values [34]. Another approach partitions the circuit into several regions to achieve higher accuracy in Trojan detection [5]. The authors in [70] proposed to increase the probability of activating a Trojan by inserting dummy flip-flops. In addition, the authors in [76] proposed to evaluate the trustworthiness of a particular circuit in terms of the occurrence of rare events and controllability.
However, most logic test methods require activation of Trojans, either fully or partially, which has been shown to be extremely difficult [38], since the specifics of the Trojan are unknown and we can never be sure of activating it. Moreover, a Trojan can easily be designed to be triggered under extremely rare conditions, and hence it is almost impossible to guarantee finding the conditions for Trojan activation. In addition, Trojans that do not affect the logic behavior of the circuit at all, e.g., additional circuitry designed for unauthorized access to internal line(s), will not be detected by logic test methods.

In contrast, parametric measurement methods do not necessarily require activation of Trojans, as these methods focus on any deviation in parameters, such as delay, power, temperature, etc., caused by the Trojans even without activation. Parametric measurement methods aim to capture changes in the values of circuit parameters, e.g., additional leakage current caused by Trojan logic, or additional delay due to extra gates/capacitances induced by Trojans. However, every parametric measurement method suffers from increasing levels of process variations, i.e., variations in the attributes of devices (transistor length, width, gate oxide thickness, etc.) that occur during fabrication [55], especially when the impact of Trojans on the values of parameters is much smaller than the effects of process variations [62].

A number of approaches focus on measuring power/ground currents: steady-state currents (IDDQ) and/or transient currents (IDDT). To mitigate the effects of process variations, some approaches improve the measurement resolution by making measurements at multiple power pins [85], or by calibrating for the effects of process variations [61]. The authors in [97] measure IDDQ at higher resolution by adding more features, e.g., ring oscillators. The synergy between measuring power/ground currents and temperature was studied in [28][31][56].
However, because power/ground current measurements are performed over large regions of a chip, namely each power/ground pin or even the entire chip, it is difficult to obtain sufficient resolution under high levels of process variations. A small difference due to the Trojan (a change in the power consumption of a few transistors) may not be visible against the power consumed by a large number of transistors (≫10^3) under high levels of process variations. Furthermore, some approaches try to fully or partially activate Trojans in order to capture the difference in dynamic power consumption caused by the switching activities of Trojan gates. However, as discussed above, it is nearly impossible to guarantee activation of Trojans without knowing their specifics.

Figure 2: Delay measurement versus power/ground currents measurement. [Figure labels: Input; Output; Only a few gates on the path; 1000+ gates per each power/ground pin.]

Several approaches to detect Trojans by measuring the delays of paths have been proposed. In general, every delay-based approach focuses on selecting paths and/or vectors and measuring each individual path's delay for a sequence of vectors (Figure 2). One of the early studies on delay measurement [36] proposed to measure the delays of a few paths by applying test vectors for transition delay faults, and to compare these delay values with those measured from Trojan-affected circuits. However, a Trojan differs from a transition delay fault in that it may add only a very small extra delay to the delay of a path; it may therefore go undetected, since transition delay fault vectors do not guarantee that small changes in delay are propagated to the circuit's outputs.

The authors in [59][88] proposed to characterize the delay of every individual gate in the circuit for different input patterns, by measuring the delays of multiple paths using test vectors, expressing path delays as sums of the delays of the gates along the paths for the corresponding input patterns, and solving LP problems to estimate the delay of each gate. To characterize the delays of every gate in the circuit for many different input patterns, the authors inserted additional test points at some nodes in the circuit to break reconvergent paths into single paths. This approach is able to identify differences in individual gates' delays within a single chip and hence has been shown to be useful also for diagnosing Trojans [89]. However, in order to characterize every gate's delays for many different input patterns, this approach requires a large number of vectors and also solves an LP problem for each individual chip, which makes its complexity impractically high for today's circuit sizes.

In addition, all the above approaches do not consider the actual steps for delay measurements. On the other hand, the authors in [94] proposed to apply vectors for transition delay faults and use a clock sweeping method to measure delays by controlling the skew between the launch and capture clocks. A number of approaches have also been proposed to improve the accuracy of delay measurements. For example, one approach introduces a delay measurement architecture to measure path delays more accurately [42][44][45]. The work in [62] presents a method to detect a Trojan using vectors for functional testing; however, vectors for functional testing are not designed to propagate the Trojan-induced extra delay to the circuit's output.
Last, every existing approach except [62] considers a Trojan as many additional capacitive loads and/or gate(s) inserted into path(s) in the circuit, and measures the rate of detecting a target Trojan from a fixed number of chips into which that particular Trojan is inserted. However, as claimed in [62], if a Trojan has only a minimal impact on the circuit's delays, e.g., by inducing only a minimal capacitive load on a line in the circuit, then it will be extremely difficult to detect the Trojan by measuring delays from only a small and fixed number of chips, especially when the effects of process variations are high.

1.3 Challenges

In this section, we summarize our findings from previous studies on Trojan detection and development, and we identify two fundamental challenges that must be addressed to develop the proposed framework.

First, we need a systematic approach that can identify every possible Trojan, without simply enumerating every specific type of Trojan and developing a solution against each type. Trojan taxonomies are continuously developed by many researchers, and every Trojan that has already been proposed or detected, or that will be designed in the future, is very likely to be explained by the latest versions of Trojan taxonomies. However, using these taxonomies to enumerate all kinds of Trojans is a different problem: the characteristics of Trojans, e.g., type, size, and purpose, may all vary with the intent of Trojan designers, which makes the number of possible combinations tremendously large. Also, as shown above, some detection techniques imagine specific kinds of Trojans and are evaluated based on them, but we want to avoid such a strategy. Instead, we aim to focus on general cases that cover every possible hardware tampering caused by Trojans. This gives us the following question: how can we systematically explore the entire space of possible Trojans, without enumerating specific types of Trojans?
Second, as discussed in Section 1.2.2, it is impossible to guarantee activation of well-designed Trojans without knowing their specifics. So Trojans are unlikely to be detected by only observing logic values at the output. Also, parametric measurement methods suffer from increasing levels of process variations, and the effectiveness of these approaches is reduced if Trojans have only minimal impact on the values of parameters. So every parametric measurement method should answer the following question: how can we effectively detect a target Trojan that imposes only minimal impact on values of parameters, under increasing levels of process variations and with measurement noise?

1.4 Key Research Tasks

From the above challenges, we derive the following two research tasks.

The first task is to derive a universal set of Trojans that can explain every possible scenario of Trojan attacks. Successful derivation of the universal set of Trojans, especially a set of relatively small size, will enable us to systematically design detection approaches and achieve the maximum Trojan coverage without enumerating every kind of Trojan. To accomplish that, we develop canonical models of Trojans that can explain every kind of deviation from the original design. Furthermore, we focus on the Trojans that are maximally challenging for our approaches to detect. In this context, we have identified the following two kinds of scenarios and their combinations: (1) minimally-invasive models of Trojans designed for unauthorized access or control, and (2) maximally-matched models of Trojans in terms of every metric of the circuit.

The second task is to develop a new Trojan detection strategy that can successfully detect even maximally challenging Trojans with maximal effectiveness (at minimal cost), for given levels of process variations and measurement noise, and at a desired confidence level.
1) For every parameter, analyze the capability of using that parameter for testing, based on the maximally challenging Trojan models derived in the first task. For example, we evaluate whether a selected parameter with arbitrary vector sequences and measurements can provide sufficient measurement resolution for Trojan detection even under increasing levels of process variations and with measurement noise. This analysis is particularly important for applications for which chips are manufactured in small volume or with limited capabilities for testing.
2) Develop a method to maximize the effectiveness of testing for the Trojans that are maximally challenging to detect. For example, we develop a method that can maximize the resolution of measurements of a selected parameter to minimize test cost. In particular, we focus on maximizing the impact of a Trojan on the values of parameters relative to the effects of process variations, so that our parametric measurement approaches can be maximally effective.

1.5 Dissertation Outline

In this chapter, we have explained the motivations, challenges, and directions of our framework. The rest of the dissertation is dedicated to addressing the above two research challenges. First, in Chapter 2 we explain the benefits of using delay measurements for detecting Trojans, and introduce our new notion of a universal model of Trojans. Specifically, we focus on two scenarios that are maximally challenging for delay measurement approaches, namely minimally delay-invasive and maximally-matched models of Trojans. In Chapters 3 and 4, we explain our approaches for detecting minimally delay-invasive Trojans via delay measurements. In particular, in Chapter 3 we show the basic principles for applying vectors and measuring delays to detect minimally delay-invasive Trojans, and in Chapter 4 we evaluate these approaches when the original circuit is redesigned to minimize changes in delays.
In Chapter 5, we show our theoretical foundations for detecting maximally-matched Trojans. Finally, in Chapter 6 we conclude this dissertation with proposed future work on extending our framework by further improving the capabilities of delay measurements for Trojan detection.

CHAPTER 2. CHARACTERIZATION OF TROJANS

In this chapter, we analyze the shortcomings of existing approaches for identifying Trojans, introduce basic principles for deriving Trojan models, and finally present our universal model of Trojans.

2.1 Introduction

A Trojan is any kind of malicious modification to a given design (referred to as the original circuit). Its type, size, and purpose vary with the intent of the Trojan designer. Thus, the Trojan may affect many aspects of the original circuit, e.g., its functionality or the values of parameters such as power/ground currents, temperature, or delay. However, the degrees of these alterations may differ between Trojans. For example, some types of Trojans may not be detected by measuring power/ground currents because they consist of only a few gates, yet they significantly change some paths' delays, which makes them easier to detect by delay measurement based approaches, or vice versa. While some previous studies tried to address this issue by introducing taxonomies for Trojans and using them for developing detection strategies, in Chapter 1 we have shown that common limitations exist among all these approaches.

First of all, Trojan taxonomies have a unique advantage in identifying Trojans and have been widely used in the literature to develop detection techniques. However, taxonomies cannot guarantee coverage of every new kind of Trojan, as there are numerous kinds of Trojans with different types, sizes, functionalities, and so on.
Though all the existing taxonomies try to include every possible kind of Trojan and are continuously extended, it is inefficient to use them to enumerate every possible type of Trojan. In addition, only a few real cases of Trojans are known, since the existence of Trojans is expected to be kept secret and information about them may not always be shared amongst researchers, companies, and even governments.

In addition, the performance of every detection method is determined by the specifics of the target Trojan. Many detection techniques imagine specific types of Trojans and are developed based on these targets. For example, some existing techniques use simple logic components such as counters, muxes, or comparators [77], or their own Trojan circuitries (the structures of such Trojans are not fully explained to the readers), for evaluation.

Since characterization of Trojans is a fundamental step of our methodology, we have taken a completely new approach. Instead of expanding the existing taxonomies to include all specific types of Trojans that we can imagine, we develop a universal set of Trojans which can cover every attempt to alter the original circuit.

To begin our discussion of deriving models of Trojans, we briefly explain how we selected the parameter to be measured to detect a Trojan. In particular, we show that delay measurements are particularly useful, and briefly explain the existing delay testing and measurement approaches that are used for PS validation and HVM testing and how they can be used in our approaches (a detailed discussion can be found in Chapter 3).

2.2 Capabilities and Effectiveness of Delay Measurements

2.2.1 Delay Fault Models and Test Generation Methods

The idea of delay fault modeling and testing starts from the fact that a path's delay varies from chip to chip due to a number of factors, including process variations and manufacturing defects.
For example, a spot defect occurring at a particular gate or wire may increase one or more paths' delays, and hence make a chip fail at a specified clock frequency. Thus, a number of delay fault models have been proposed to capture the effects of manufacturing defects and process variations, including transition delay faults [43][83], path delay faults [47], small-delay defects [74], etc.

For a given delay fault model, a test for a combinational logic block consists of a sequence of two vectors, <V1, V2>, where V1 and V2 are respectively called an initialization vector and a test vector. The purpose of the sequence <V1, V2> is to create transition(s) at desired line(s) in the logic block and propagate the transition(s) to the output of the logic block. Here, V1 is scanned in through a scan chain of the circuit and sets every line in the logic block to a known value, and V2 sets certain line(s) to value(s) opposite to the initial value(s) set by V1, so that the desired transition(s) occur at these lines. The logic response (logic values) at the output of the logic block is captured at the sampling time and compared with the correct response to determine whether the target delay faults exist.

To ensure generation of transition(s) and their propagation to the output, a number of conditions for generating vectors have been proposed, including robust tests [69], non-robust tests [19], hazard-free robust tests [60], validatable non-robust tests [64], and many variants of these test generation conditions. Among them, a robust test for a particular path delay fault is a sequence of vectors that applies a transition at the input of the path and makes the output of the path erroneous at the sampling time if the fault is present, independent of the delays of other paths through the side-inputs of the gates along the path. We explain the robust delay test conditions in more detail in Section 3.5.
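The two-vector idea can be illustrated with a minimal sketch. The Python below simulates a tiny hypothetical combinational block and reports which lines make a transition between V1 and V2. The netlist, the line names (a, b, c, n1, n2, z), and the two vectors are illustrative assumptions and are not taken from this dissertation; timing is not modeled, only logic values.

```python
from typing import Dict, List, Tuple

# Hypothetical netlist in topological order: (output line, gate type, input lines).
# Primary inputs are a, b, c; the primary output is z.
NETLIST: List[Tuple[str, str, Tuple[str, ...]]] = [
    ("n1", "AND", ("a", "b")),
    ("n2", "NOT", ("c",)),
    ("z", "OR", ("n1", "n2")),
]


def simulate(inputs: Dict[str, int]) -> Dict[str, int]:
    """Return the steady-state logic value of every line for one input vector."""
    values = dict(inputs)
    for out, kind, ins in NETLIST:
        operands = [values[name] for name in ins]
        if kind == "AND":
            values[out] = int(all(operands))
        elif kind == "OR":
            values[out] = int(any(operands))
        elif kind == "NOT":
            values[out] = 1 - operands[0]
        else:
            raise ValueError(f"unknown gate type: {kind}")
    return values


def transitions(v1: Dict[str, int], v2: Dict[str, int]) -> Dict[str, str]:
    """List the lines whose value differs between V1 and V2, with the direction."""
    before, after = simulate(v1), simulate(v2)
    return {
        line: "rising" if after[line] == 1 else "falling"
        for line in before
        if before[line] != after[line]
    }


V1 = {"a": 0, "b": 1, "c": 1}  # initialization vector (assumed)
V2 = {"a": 1, "b": 1, "c": 1}  # test vector: only input a changes
print(transitions(V1, V2))
```

In this example the rising transition launched at a travels along a, n1, z, while the side-inputs b and n2 keep steady non-controlling values for the AND and OR gates, so the output transition depends only on the delay of that one path.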
An automatic test pattern generator (ATPG) is a program that generates vectors automatically using one or more of the above test generation conditions. Interested readers may refer to [33] for more details about delay fault models, tests, and ATPG. We note that we simply use the term ‘vector’ to also denote ‘a sequence of vectors’ in the rest of this dissertation.

2.2.2 Delay Measurement Methods

As shown above, delay fault tests aim to detect erroneous timing behavior of a combinational logic block by capturing the logic response at the output. Methods for delay measurement take a slightly different approach, in the sense that they measure the actual values of delays at some granularity instead of just capturing the logic response at a certain sampling time. Thus, they require special hardware components or external equipment for measurements, whereas delay fault tests require only conventional DFT components, i.e., scan flip-flops.

Due to the large cost of measuring delays using external equipment at a fine granularity with low measurement error, several approaches have been proposed to design special on-chip circuitries to measure delays [20][26][44][45][57][81]. For example, an architecture called RAZOR [26] was proposed to dynamically measure the delays of paths, and the authors of [26] also proposed an improved version of RAZOR, RAZOR II, in [20]. The authors of [44][45] proposed to use delay measurement architectures to detect hardware Trojans. By using [45], we are able to apply vectors and measure delays precisely by controlling the skew between the edges of two clocks applied to the scan flip-flops at the input side and at the output side of the logic block, respectively. A more detailed discussion of the delay measurement architectures used in our approaches is provided in Section 3.7.
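To make the skew-controlled measurement concrete, the following is a minimal, idealized sketch in Python. It is not the architecture of [45]: the pass/fail model (a capture succeeds exactly when the skew is at least the path delay), the function names, and all numeric values are assumptions chosen for illustration, and setup time, clock jitter, and measurement noise are ignored.

```python
from typing import Optional


def captures_correctly(path_delay_ps: int, skew_ps: int) -> bool:
    """Idealized launch/capture test: pass if the transition arrives before capture."""
    return skew_ps >= path_delay_ps


def measure_delay(path_delay_ps: int, start_ps: int, stop_ps: int,
                  step_ps: int) -> Optional[int]:
    """Sweep the launch-to-capture skew upward and return the first passing skew.

    The returned value overestimates the true delay by less than one step,
    so step_ps is the measurement resolution. Returns None if no skew passes.
    """
    for skew_ps in range(start_ps, stop_ps + 1, step_ps):
        if captures_correctly(path_delay_ps, skew_ps):
            return skew_ps
    return None


# Hypothetical path delays: a reference path and the same path with 12 ps extra load.
reference = measure_delay(path_delay_ps=730, start_ps=500, stop_ps=1000, step_ps=10)
suspect = measure_delay(path_delay_ps=742, start_ps=500, stop_ps=1000, step_ps=10)
print(reference, suspect)
```

With a 10 ps step, the assumed 12 ps difference is reported as a 20 ps difference (730 ps versus 750 ps), which shows why the step size of the sweep bounds how small an extra delay can be resolved on a single chip.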
2.2.3 Effectiveness of Delay Measurements

In Chapter 1, we showed that there are two categories of Trojan detection techniques, logic test methods and parametric measurement methods, and that logic test methods cannot guarantee activation of Trojans and may therefore be ineffective. In addition, any Trojan that does not change the logic functions of the original circuit, e.g., a Trojan designed only to observe internal logic values (unauthorized access), can never be detected by just observing the logic behavior of the circuit.

Compared to logic test methods, parametric measurement methods are particularly useful because the impact of a Trojan and the effects of process variations on the values of parameters are fundamentally different. To illustrate this idea, consider an arbitrary Trojan inserted into a circuit. For any particular vector, the Trojan introduces a uni-directional shift in the values of parameters, e.g., the delay values for that vector. In contrast, process variations typically introduce bi-directional changes. Hence, even Trojans that cause minimal deviations in the values of parameters can be detected with high confidence. This is true even if the uni-directional shift is significantly smaller than the bi-directional changes due to process variations, provided that we are able to make measurements on a sufficient number of ICs sampled from the fabricated batch of chips.

Among the various parametric measurement methods, delay measurements benefit from the fact that the delay of each path can be measured separately by applying vectors. The resolution of the delay measurement for one path is independent of the other paths in the logic block and of other logic blocks on the chip, whereas power/ground currents are measured from several power/ground pins, and these measurements are expected to be highly correlated.
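The claim that a small uni-directional shift can be detected given enough sampled chips can be illustrated with a textbook one-sided z-test. The Python sketch below is not the detection method of this dissertation: it assumes that the per-chip path delay is Gaussian with a known nominal mean and a known standard deviation, that chips are independent, and that the Trojan adds a fixed shift. The function names and all numeric values are hypothetical.

```python
import math
from statistics import NormalDist
from typing import Sequence


def chips_needed(shift_ps: float, sigma_ps: float,
                 false_alarm: float = 0.01, miss: float = 0.01) -> int:
    """Sample size for a one-sided z-test to detect a mean shift of shift_ps.

    Standard formula n >= ((z_alpha + z_beta) * sigma / shift)^2, under the
    Gaussian and independence assumptions stated above.
    """
    z_alpha = NormalDist().inv_cdf(1.0 - false_alarm)
    z_beta = NormalDist().inv_cdf(1.0 - miss)
    return math.ceil(((z_alpha + z_beta) * sigma_ps / shift_ps) ** 2)


def shift_detected(measured_ps: Sequence[float], nominal_ps: float,
                   sigma_ps: float, false_alarm: float = 0.01) -> bool:
    """Flag a batch if its mean delay exceeds the nominal value significantly."""
    n = len(measured_ps)
    mean = sum(measured_ps) / n
    z_stat = (mean - nominal_ps) / (sigma_ps / math.sqrt(n))
    return z_stat > NormalDist().inv_cdf(1.0 - false_alarm)


# Hypothetical numbers: a 2 ps Trojan-induced shift against 20 ps of variation.
print(chips_needed(shift_ps=2.0, sigma_ps=20.0))
print(chips_needed(shift_ps=20.0, sigma_ps=20.0))
```

Under these assumptions the required number of chips grows with the square of the ratio of variation to shift: averaging over n chips shrinks the spread of the mean by a factor of sqrt(n) while the systematic shift is unchanged.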
Also, delay measurements can achieve greater resolution in detecting Trojans than power and current measurements, since a path's delay consists of the delays of only a few gates along the path plus the additional delay induced by the Trojan (or equivalently, the additional delay due to charging/discharging the extra capacitances induced by the Trojan), whereas the total power consumption measured at each power/ground pin consists of the power consumed by a large number of transistors plus a small difference (the change in power consumption of a few transistors). In addition, we can measure the delays of individual paths or even sub-paths by improving the controllability and observability of the circuit, e.g., by using additional observable points to test more paths [88]. Hence, we pursue Trojan detection via delay measurements.

2.3 Trojan Models

In our study, we ensure the universality of our set of Trojans by using the following two principles during our derivation process.

First, we make no specific assumptions about the Trojan(s). In our framework, we start from the original design and develop canonical models of deviation from the original design that can cover all possible circuit modifications. Thus, any Trojan that causes any deviation in any of the original circuit's metrics is covered by our set of Trojans. For each model of deviation, we capture a set of necessary conditions that the deviation model must satisfy, and use them as surrogate targets.

Second, we focus on Trojans that are maximally challenging for delay measurement approaches to detect, in order to ensure that our detection approach is developed under the most challenging conditions. In Section 2.2, we have shown that delay has several benefits over other parameters and hence is used in our approach. However, increasing levels of process variations make it more difficult to detect Trojans.
In addition, Trojan designers will try to design Trojans that are most challenging for all approaches that measure circuit parameters, in order to prevent them from being detected by any parametric measurement approach. Thus, the main challenge of our framework is how to detect a Trojan that has the smallest impact on delays under increasing levels of process variations, since the deviation induced by the most challenging Trojan becomes much smaller in magnitude than the impact of process variations. Since our first approach for Trojan detection is based on measuring the delays of paths in combinational logic blocks, we pose the above principle in the form of the following question: Which Trojans will minimally change the delay of circuit paths?

In particular, we focus on two scenarios that are maximally challenging for delay measurement approaches (i.e., any other scenario will be easier than these for our approaches to detect) at two different ends of the spectrum. The first one minimally alters the original design by making only a single connection to accomplish unauthorized access or unauthorized control, while ensuring that this connection has minimal impact on the original circuit's parameters. The second one captures or subsumes, in terms of the severity of the challenges posed, all other cases where one or more changes may be made to the structure of the circuit and/or the sizing of circuit elements, in a manner that maximally matches the resulting design with the original design in terms of circuit parameters. Hence, we have the following canonical models of deviations and their combinations: (1) minimally-invasive models of Trojans designed for unauthorized access or unauthorized control, and (2) maximally-matched models of Trojans. For Trojan detection, we consider each of the above two canonical models as well as their combinations.
In the following subsections, we describe the necessary conditions that all maximally challenging Trojans should satisfy, and introduce details of the minimally-invasive and maximally-matched models of Trojans.

2.3.1 Conditions for Maximally Challenging Trojans

First, we introduce necessary conditions that every maximally challenging Trojan should satisfy. Any Trojan that does not satisfy one or more of the following conditions is either not maximally challenging or undetectable by any parametric measurement approach.

Condition 1: A Trojan has some impact on the original circuit in terms of at least one attribute, such as the logic behavior of the circuit or the values of parameters like path delays, power/ground currents, etc. (The term ‘original circuit’ refers to the Trojan-free design; its information, including the netlist and the sizings of the gates in the netlist, is completely known to the original designer.) Otherwise, the Trojan has absolutely no impact on any attribute of the original circuit and is completely independent of it. Trojans that are completely independent of the original circuit and do not alter any of its attributes, e.g., the logic behavior or values of parameters such as delay or power, are either harmless or cannot be detected by making measurements on the circuit. There might be some kinds of Trojans that have no impact on the original circuit, such as ones that measure power/ground currents or temperature using sensors and send the measured values to an extra output or to the outside via a radio signal. Detection of such Trojans will be a subject of our future work. However, if they have any impact on the original circuit’s parameters, e.g., incur extra coupling capacitances and cause any deviation in delays, then such deviation in delays can be detected by our approach (to be introduced in Chapter 3).
Condition 2: A Trojan does not change the functionality of the original circuit, i.e., the Boolean function implemented by every output of the original circuit, unless it is activated. Suppose that there is a combinational logic block that has $N_I$ input and $N_O$ output pins. We choose a combinational logic block as an example because we assume a fully scannable design, i.e., every input and output of every combinational logic block is controllable and observable. We expect that every flip-flop is part of a scan chain and hence any input vector can be applied to any combinational logic block. Then each output of the original circuit corresponds to a Boolean function, $f_i(X): B^{N_I} \to B$, $i \in \{1, \ldots, N_O\}$, where $f_i$ is the Boolean function that corresponds to the $i$-th output, $B$ is the set of binary values, i.e., $B = \{0, 1\}$, and $X$ is an $N_I$-tuple representing an input vector, i.e., $X = (x_1, \ldots, x_{N_I}) \in B^{N_I}$, $x_j \in B$. If a Trojan is added, then it may introduce extra inputs or outputs, depending on the purpose of the Trojan designer. Let the manipulated version of the circuit be $C_T$, with $N_I'$ input and $N_O'$ output pins (of which only $N_I$ input and $N_O$ output pins are known to the detector), where each one of the original output pins implements a Boolean function $f_i'$. Here, every $f_i'$ should either be the same as the corresponding $f_i$ or contain $f_i$; otherwise, the Trojan will eventually be detected by applying a vector that causes a difference in logic value at any original output pin.
For example, suppose that a 2-input NAND gate of the original logic block, which implements a Boolean function $f_i = \overline{g_{i1} \cdot g_{i2}}$, is replaced by a 3-input NAND gate implementing $f_i' = \overline{g_{i1} \cdot g_{i2} \cdot g_{extra}}$, whose extra input is connected to the extra input pin of the circuit. Here, $g_{i1}$ and $g_{i2}$ are the Boolean functions implemented at the inputs of the gate, and $g_{extra}$ is the Boolean function implemented at the extra input. This modification alone will not change the logic behavior of the original circuit if the non-controlling value of the NAND gate, 1, is applied to the extra input of the 3-input NAND gate during normal operation, i.e., when the Trojan is not activated. However, if the same 2-input NAND gate is instead replaced by, for example, a 2-input NOR gate, and the (0, 1) or (1, 0) input combination can be applied to this NOR gate’s inputs and the gate’s output can be propagated to any output of the original circuit, then this change will eventually be detected by finding and applying a vector that applies (0, 1) or (1, 0) to the NOR gate and propagates its output to an output of the original circuit. The former and latter examples show that a Trojan can be inserted in a manner that may or may not change the logic behavior of the original circuit during normal operation. As illustrated by these examples, a Trojan should not produce an incorrect logic value at any output of the original circuit for any vector applied to the inputs of the original circuit, especially during normal operation. In other words, a Trojan is expected to be stealthy in terms of the circuit’s logic behavior during normal operation. Otherwise, it will cause an error at some output of the original circuit for at least one vector applied to the inputs of the original circuit.
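The two gate replacements in this example can be checked exhaustively for the gate in isolation. The following Python sketch is only an illustration and not part of the original framework: the helper names (`nand`, `nor`, `differing_vectors`) are hypothetical, and it assumes that the gate inputs are directly controllable and the gate output directly observable, which in a real circuit depends on the surrounding logic.

```python
from itertools import product

def nand(*inputs):
    """NAND of any number of binary inputs."""
    return int(not all(inputs))

def nor(*inputs):
    """NOR of any number of binary inputs."""
    return int(not any(inputs))

def differing_vectors(original, modified, n_inputs):
    """Enumerate all input vectors on which the two functions disagree."""
    return [v for v in product((0, 1), repeat=n_inputs)
            if original(*v) != modified(*v)]

def original(g1, g2):
    return nand(g1, g2)

def nand3_inactive(g1, g2):
    # Extra input held at the non-controlling value 1 (Trojan not activated).
    return nand(g1, g2, 1)

def nand3_active(g1, g2):
    # Extra input driven to the controlling value 0 (Trojan activated).
    return nand(g1, g2, 0)

def replaced_nor(g1, g2):
    return nor(g1, g2)

print(differing_vectors(original, nand3_inactive, 2))  # no differing vector
print(differing_vectors(original, replaced_nor, 2))    # (0, 1) and (1, 0)
print(differing_vectors(original, nand3_active, 2))    # (1, 1)
```

Under these assumptions, the 3-input NAND with its extra input at 1 is indistinguishable from the original gate by logic testing, whereas the NOR replacement differs exactly on the (0, 1) and (1, 0) combinations mentioned in the text.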
Trojans that tamper with the functionality of the original circuit in this way will be detected by traditional testing methods and are therefore easier to detect.

Condition 3: Any change in any metric of the original circuit after Trojan insertion should be minimized. When a Trojan is inserted and has any impact on any of the original circuit’s metrics (Trojans that do not cause any change to any of the metrics do not satisfy Condition 1 and hence will not be detected by any parametric measurement method), the amount of the impact should be minimal. In addition, any change in the metrics should be further minimized whenever possible. For example, if there is any change in a path’s delay due to the insertion of a Trojan, then this change should be minimized by every possible means, e.g., resizing gates to reduce the change in delays, as long as doing so does not cause larger changes to other metrics. All Trojans that do not satisfy this condition will be easier to detect than those inducing only minimal change(s) to the circuit’s metric(s).

2.3.2 Minimally Delay-Invasive Models

We target a Trojan that minimally changes the original circuit and induces the smallest additional delay. In this context, we first consider a scenario in which the adversary designs the Trojan logic as a separate logic block, say, a Trojan block, which is placed in the unused area (empty space) of the layout, and establishes a connection between at least one line in the original circuit, called a Trojan site, and the input or the output of the Trojan block, in order to use or alter the logic value of the Trojan site, as shown in Figure 3. Thus, the netlist of the original circuit remains the same while only an extra fanin/fanout is added to the Trojan site. We discuss two different scenarios of Trojans that do not conform to the above description.
First, the adversary may design a Trojan as a completely independent logic block which does not have any connection with the original circuit. In this scenario, the netlist of the original circuit is still kept unchanged, and no extra fanin or fanout is added to the original circuit. We have already excluded some of these Trojans according to our first condition introduced in Section 2.3.1. Since these kinds of Trojans are not able to control the logic value of any line in the original circuit, they may not affect the logic behavior of the original circuit. All other Trojans belonging to this scenario that have any impact on the circuit’s parameters, e.g., incur extra coupling capacitance on any of the original circuit’s lines and increase delay, will cause some deviation in parameters (e.g., delays) and can be detected due to the uni-directionality of the deviation caused by Trojans, compared to the bi-directionality of process variations.

Figure 3: A minimally delay-invasive Trojan.

The second alternative scenario is that the adversary changes the netlist to implement the logic function of a Trojan. This scenario is captured by our maximally-matched models of Trojans, and will be discussed in Section 2.3.3.

To improve (or, preferably, guarantee) the completeness of our models of Trojans, we use our other new principle: Characterize Trojans in terms of sets of necessary conditions that any Trojan must satisfy. In the context of the characteristics of Trojans identified above, we identify the following necessary conditions.
1) A Trojan must involve a connection between at least one line, say x, called the Trojan site, in at least one original circuit block and the newly added Trojan block(s).
2) This connection may use the value at line x in the original circuit block to drive line(s) in the added Trojan block(s). In the minimally-invasive case, this will take the form of an additional fanout of minimum load at line x.
This will cause a small additional delay at line x.
3) Some or all gates/wires in the circuit will be resized to match delays, in order to hide the impact of this additional minimum load on delays (still without manipulating the original netlist).
Alternatively, a connection to line x may be used by the added Trojan block(s) to modify the value at line x. In the minimally-invasive cases, this might be achieved by inserting one additional circuit element (e.g., a gate or a multiplexor) at line x, or by adding an extra fanin to the gate driving line x. Again, this will cause a small additional delay at line x. Since it causes the minimum impact on the original circuit’s delay, and makes only the smallest change to the original circuit, we believe that our minimally delay-invasive model represents one of the most challenging Trojans for any delay measurement approach to detect.

2.3.3 Maximally-Matched Models

In addition, we also consider an alternative scenario of changing the netlist of the original circuit instead of making a separate logic block, and resizing gates to hide the impact of the changes on delays. We have derived necessary conditions that every such Trojan should satisfy.
1) At least one change is made to the netlist, where any change to the netlist should make the netlist implement the same logic functions as the original netlist.
2) The impact of changes to the netlist should be eliminated to the extent possible, e.g., the delay of every path should be maximally matched, by making other changes to the netlist or by resizing the gates in the circuit.
All Trojans that satisfy the above conditions belong to the maximally-matched models of Trojans. Here, our question is: is it possible for the Trojan designer to eliminate the impact of one change in the netlist of a logic block by making other changes to the netlist or by resizing the gates in the circuit?
To answer this question, we study a wide range of circuit transformations used to design or modify circuits that implement a desired logic function, which are very likely to be used for redesigning the netlist. The results of our investigations on maximally-matched Trojans are provided in Chapter 5.

2.3.4 Analysis of the Models

In summary, we have derived two models for maximally challenging Trojans, and captured necessary conditions for each one of the models. Table 1 summarizes the description and necessary conditions of these two models.

Table 1: The description of minimally delay-invasive models and maximally-matched models of Trojans
Models: Necessary conditions
Minimally delay-invasive models of Trojans:
1) No change in the netlist of an original logic block
2) A single connection that gives minimum additional load to a line in the circuit
3) Gates/wires in the netlist are resized to eliminate the change in delays
Maximally-matched models of Trojans:
1) At least one change to the netlist of an original logic block, but the logic functions of the original netlist remain unchanged
2) Make changes to other parts of the netlist, or resize gates/wires in the netlist, in order to eliminate the change in delays

To show the validity of our models, we employ the Trojan taxonomies proposed in previous studies [38][63][77][86], which are, to the best of our knowledge, the most updated and extended versions. We demonstrate the effectiveness of our Trojan models by comparing them with every Trojan defined in each one of these taxonomies, and show that almost every Trojan is covered by our models, with a few exceptions. First, we examine our models against the taxonomies proposed in [77][86]. As shown in Figure 4, a Trojan is characterized in these taxonomies by three attributes: trigger, payload, and physical characteristics.
Trigger and Payload: Here, trigger refers to the mechanism that activates a Trojan, which can be internal or external. An internal trigger looks for a specific event (triggering event) that occurs during normal operation, whereas an external trigger requires input from outside the circuit for activation. The payload of a Trojan refers to the behavior of the Trojan. In our Trojan models, however, we do not make any assumption on how the Trojans will be triggered and how they will behave. Instead, we only categorize the Trojans into two complementary scenarios: (1) Trojans that do not change the netlist of an original logic block and may have (1-1) at least one connection to the original logic block, or (1-2) no connection at all, and (2) Trojans that make at least one change to the original netlist. We capture only the necessary conditions that every Trojan in each scenario should satisfy, and refine them to capture the necessary conditions of Trojans that are maximally challenging for any approach to detect. For example, a minimally delay-invasive Trojan (scenario 1-1) makes a single connection to the original circuit in order to use or alter the logic value of a line in the circuit (Trojans making multiple connections to the original circuit will be easier to detect than the single-connection case, and a demonstration of our approaches using such Trojans will be shown in Section 4.6), and hence we make no specific assumption on the triggering mechanism except that it induces an additional connection between the input or the output of the Trojan block and a line in the circuit. Of course, this may imply that the logic value(s) of the Trojan site can be used for activation of the Trojan block, where the logic value(s) used for Trojan activation may originate from the normal operation of the circuit (internal trigger) or from special input applied to the circuit by the adversary (external trigger).
However, all of the triggering mechanisms stated above can still be explained by our minimally delay-invasive Trojan models. For the maximally-matched models of Trojans (scenario 2), we only capture the necessary conditions that they make at least one change to the netlist of the circuit and that the new netlist implements the same logic functions. We do not make any assumption on how the maximally-matched Trojans will be triggered or behave. Last, there might be Trojans that do not introduce any interconnection between the Trojan block and the Trojan site in the original circuit, and also make no change to the original netlist (scenario 1-2). As discussed in Sections 2.3.1 and 2.3.2, we expect that these kinds of Trojans might measure values of parameters such as power/ground currents or temperature, or leak information using an extra output pin or radio wave signals. However, detecting these kinds of Trojans by making measurements on the original circuit may not be possible, unless they have some impact on a metric of the original circuit. Again, such Trojans, which do not satisfy Condition 1, are explained in Section 2.3.1 and are excluded from our search for now; they are subjects of our future research.

Figure 4: Trojan taxonomies proposed in [77][86].

Physical Characteristics: Next, our models assume that every minimally delay-invasive Trojan consists of a Trojan block and an additional interconnect, where the Trojan block consists of extra gates and wires. These properties correspond to the third attribute, the physical characteristics of Trojans. In Section 2.3.2, we assume that these extra gates and wires of the Trojan block may be placed in the unused area of the layout and can be either spatially distributed or clustered.
Trojans that do not correspond to this description are the ones that are implemented using only existing gates and wires of the netlist, and such Trojans are covered by the maximally-matched models in our universal set of Trojans.

[Figure 4 content: Trojans are characterized by Trigger (Internal, External), Payload (Functionality Change, Performance Degradation, Leak Information, …), and Physical Characteristics (Type, Size, Location).]

Design Phase and Abstraction Level: Next, we compare our models with Trojans defined in the taxonomies proposed in [38][63]. In these taxonomies, the authors introduced two more attributes to identify Trojans, in addition to the three attributes proposed in [77][86] (two attributes, ‘physical characteristics’ used in [77][86] and ‘location’ used in [38][63], are similar to each other). These two additional attributes represent the point at which a Trojan is inserted (design phase) and the format of the Trojan design (abstraction level). Following our problem definition introduced in Chapter 1, our framework focuses on the case where a Trojan is inserted during fabrication by untrusted vendors. Our models are therefore focused on Trojans that are inserted into the layout of the original design during fabrication. Thus, in our framework, Trojans are defined as modifications to the layout of the original design. Trojans that might be inserted during other design phases, e.g., malicious code inserted into a soft IP core by a third-party IP designer, are not of our immediate concern. In summary, every Trojan in our universal set of Trojans can be categorized into two complementary classes, minimally delay-invasive and maximally-matched Trojans, and can be successfully identified by the existing Trojan taxonomies. However, we note that there are a few exceptional cases that are not captured by our Trojan models, namely Trojans that do not introduce any additional connection with the original circuit and also do not make any change to the netlist of the original circuit.
In our future research, we will investigate such Trojans and try to derive another set of necessary conditions to capture the impact of these Trojans. We will also continue to apply our approach to derive other necessary conditions that a Trojan may satisfy.

CHAPTER 3. DETECTION OF MINIMALLY DELAY-INVASIVE TROJANS VIA DELAY MEASUREMENT

In this chapter, we present our approach for detecting minimally delay-invasive Trojans using delay measurements while significantly reducing the cost of detection. This work has been published, and some figures, tables, and texts from [16][17][95] are used under IEEE copyright policy (© 2012, 2013 IEEE).

3.1 Introduction

As discussed in Chapter 2, due to its benefits over other parameters, we have chosen delay as the primary parameter to measure for detecting Trojans. We also introduced our concept of the uni-directionality of the impact of Trojans, compared to the bi-directionality of the effects of process variations. Thus, we are able to detect even a minimally delay-invasive Trojan (Figure 5) with a high level of confidence, provided that delays from a sufficiently large number of chips can be measured. However, the levels of process variations continue to increase with each scaling generation, and the effect of process variations on the measured values of circuit parameters can easily exceed the impact of a Trojan. This trend will significantly increase the number of chips required for detecting a Trojan with a given level of confidence, which will be unacceptable, especially for chips used in sensitive applications, which are often manufactured in small volumes. Thus, the main challenge is how to detect, with a given level of confidence, a minimally delay-invasive Trojan under increasing levels of process variations by testing a minimum number of chips.

Figure 5: (a) Original circuit, $C$, with the original netlist and sizing factors.
(b) A Trojan-affected circuit with a minimally delay-invasive Trojan (but without resizing), $C_1$, where a Trojan block is connected to an arbitrary line in the original circuit via a gate having minimum input capacitance.

To solve the above problem, we devise a novel approach for reducing and estimating the cost of delay measurements for Trojan detection. First, our approach reduces the test cost by focusing on two parameters: (1) the effects of process variations on the average value of the total path delay, and (2) the impact of a Trojan on the average value of the total path delay. Next, we develop a method to estimate the number of chips and vectors required for detecting a given set of Trojans, for given levels of process variations and confidence. The major contributions of our approaches can be summarized as follows.
1) We develop a path selection scheme for a target Trojan which maximizes the impact of the Trojan on the measured delay. As Trojans are expected to cause minimal delay deviations, our goal is to select paths which maximize the additional delay induced by the Trojan relative to the effects of process variations. In contrast to some existing methods that target critical paths, our path selection scheme targets paths having the smallest delay values to maximize the impact of a Trojan on each path’s delay. The complexity of our path selection scheme is $O(m)$ ($m$: the number of lines in the circuit), as it requires only one path to be tested using a vector to detect a particular Trojan, and hence our method is applicable to large circuits.
2) We derive new logic and timing conditions that a sequence of vectors must satisfy to detect a particular Trojan.
Our new logic and timing conditions guarantee that, if a particular path is sensitized by the generated vectors, then any change in the delay of any line along the path can be propagated to the circuit’s output, unlike the traditional transition delay fault tests used by other approaches.
3) To reduce the effects of process variations on the path delay, we present a new approach that minimizes the test cost by reducing these effects via some additional measurements on other parts of the chips.
4) We design a new hypothesis testing method based on the likelihood-ratio test, which can estimate the number of chips required for detecting an arbitrary Trojan by measuring delays of an arbitrary path and vector, for given levels of process variations and confidence.
5) We demonstrate the effectiveness of our approaches using several benchmark circuits and an industrial 65nm technology, for high levels of process variations provided by a foundry. In particular, we show the minimum number of paths whose delays should be measured to cover every possible Trojan in the circuit, and estimate the test cost to achieve a given level of confidence in Trojan detection, in terms of the sum of the number of chips to be tested for each selected path.
In this chapter, we mainly focus on explaining our approaches for reducing the test cost of detecting minimally delay-invasive Trojans using delay measurements. We also developed an algorithm for resizing gates to reduce and estimate the impact of minimally delay-invasive Trojans on delays, and the effectiveness of our approaches with gate resizing will be demonstrated in Chapter 4.

3.2 Process Variations and Simulation Models

Before introducing our main idea, we explain the model of process variations used in our approach.
Process variations are typically divided into two components: global variations (inter-die variations) between chips, across wafers, and across wafer lots, and local variations (intra-die variations) on device parameters across each chip. Both global and local variations can be further classified into systematic and random components [48][55]. These sources of process variations contribute to a path’s delay to different extents and in different ways [13]. Global variations, including their systematic and random components, equally affect parameters of all devices on a particular chip. Hence, every device in a particular chip will exhibit the same amount of shift in the value of a parameter, such as the width of a transistor. On the other hand, local variations affect differently the parameters of each individual device along a path through a logic block on a chip. Furthermore, local systematic variations affect devices differently depending on whether these devices are placed near or far away from each other [18]. Also, local random variations affect parameters of each device in the chip independently. Table 2 illustrates four main components of process variations on the devices in a chip: global (including systematic and random), systematic local on nearby devices, systematic local on far-away devices, and random local variations. More detailed explanation on these four main components of process variations can be found in Section 3.6.
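The distinction between global and local random variations can be illustrated with a small numerical sketch. The Python code below is a simplified illustration under stated assumptions, not the foundry model used in this work: it assumes that the components are independent so that their variances add (as in the total variation of Table 2), it treats each sigma as a relative shift of a gate's nominal delay, and it omits the local systematic components, which depend on device placement. The function names and the nominal delay values are hypothetical.

```python
import math
import random

def total_sigma(sigma_g, sigma_lsn, sigma_lsf, sigma_lr):
    """Combine independent variation components by adding their variances."""
    return math.sqrt(sigma_g**2 + sigma_lsn**2 + sigma_lsf**2 + sigma_lr**2)

def sample_chip(nominal_delays, sigma_g, sigma_lr, rng):
    """Draw gate delays for one chip (relative sigmas are an assumption).

    One global shift is shared by every gate on the chip; each gate also
    receives its own independent local random shift.
    """
    global_shift = rng.gauss(0.0, sigma_g)
    return [d * (1.0 + global_shift + rng.gauss(0.0, sigma_lr))
            for d in nominal_delays]

rng = random.Random(0)          # fixed seed so the sketch is repeatable
nominal = [10.0, 12.0, 8.0]     # hypothetical nominal gate delays in ps

# With only the global component, every gate on the chip shifts by the
# same relative amount.
chip = sample_chip(nominal, sigma_g=0.05, sigma_lr=0.0, rng=rng)
ratios = [c / d for c, d in zip(chip, nominal)]
print(ratios)
```

With the local random component set to zero, all entries of `ratios` coincide, which mirrors the statement that global variations shift every device on a chip equally; a nonzero `sigma_lr` makes the per-gate ratios differ.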
Table 2: Symbols for variation sources
Symbol: Representation
$\sigma_g$: Global variations (including wafer-to-wafer, lot-to-lot, and chip-to-chip), considering the effects of systematic and random components
$\sigma_{l,s,n}$: Local systematic variations on devices located nearby
$\sigma_{l,s,f}$: Local systematic variations on devices located far away
$\sigma_{l,r}$: Local random variations on devices, location independent
$\sigma$: Total value of variations, $\sigma^2 = \sigma_g^2 + \sigma_{l,s,n}^2 + \sigma_{l,s,f}^2 + \sigma_{l,r}^2$

For any given vector, we characterize delays of paths in benchmark circuits using realistic delay values and under realistic levels of process variations supplied by the fabricators for the fabrication process in the form of technology files. In particular, we use an industrial 65nm technology and use the delay model, including global and local variations, provided by the foundry which fabricates chips using this technology [10]. We perform Monte Carlo simulations to obtain realistic distributions of path delay values, using the Cadence Spectre simulator [99] with the foundry-supplied model of process variations, which is expressed in terms of variations in about 50 device parameters including $L_{eff}$, $V_{th}$, $t_{ox}$, etc.

3.3 Problem Statement

For minimally delay-invasive Trojans, we formulate the problem of Trojan detection at an arbitrary line (Trojan site) in the circuit. Let $m$ be the number of lines in the original design of a logic block, say $C$. The question is how to detect a Trojan which is suspected to exist on an arbitrary line, say $i$, where the Trojan induces a minimum additional load at the line (in Chapter 4 we will also consider resizing of gates). We assume that the same Trojan will be inserted in every copy of the design $C$, i.e., in every fabricated chip, since inserting Trojans into a subset of chips requires additional masks and is very expensive.
As mentioned above, our goal is to detect the minimally delay-invasive Trojan by measuring delays under high levels of process variations. For this purpose, we have identified several mathematical characteristics of Trojans’ impact on delays and use them as the basis for developing our approach for Trojan detection using the following key steps. First, we target the impact of Trojans on delays, and develop path selection and vector generation schemes that maximize the impact of Trojans. Furthermore, we investigate several categories of process variations and provide a way to minimize the effects of process variations on delays via calibration, by inserting additional test structures into the circuit and making measurements. Finally, we use our hypothesis testing method to estimate the number of chips required for detecting each Trojan.

3.4 Shortest Delay Path Selection

The total delay of an arbitrary path passing via line $i$, say $P$, using vector $V$ can be expressed as:

$D(P,V) = D_N(P,V) + \Delta D_{var}(P,V) + \Delta D_T(P,V)$, (1)

where the three parameters are (1) the nominal delay of path $P$ using vector $V$, $D_N(P,V)$, (2) the effect of process variations on the delay of $P$ using vector $V$, $\Delta D_{var}(P,V)$, which is the overall effect of global and local variations on the path delay, and (3) the extra delay induced by a Trojan at line $i$, $\Delta D_T(P,V)$. Among these three parameters, the effect of process variations, $\Delta D_{var}(P,V)$, follows a random distribution with standard deviation $\sigma_{var}(P,V)$ which is typically bi-directional around a mean, e.g., a normal or truncated normal distribution [48]. In contrast, the extra delay induced by the Trojan, $\Delta D_T(P,V)$, is uni-directional and always changes the total delay in the same direction. For the design with the Trojan, the delay of the gate/line at the Trojan site changes uni-directionally in every copy of the design, i.e., in every fabricated chip.
This observation enables us to prove that a minimally delay-invasive Trojan can always be detected, provided that we make measurements on a sufficient number of chips. However, this observation only guarantees the possibility of detecting a Trojan causing even a minimal deviation in the delay. The number of chips required for detecting a Trojan will still be high, especially when the uni-directional change due to the Trojan is small compared to the bi-directional change due to the effects of process variations, i.e., when $\Delta D_T(P,V)/\sigma_{var}(P,V)$ is small. In other words, if we can select a path and a vector that maximize this ratio, $\Delta D_T(P,V)/\sigma_{var}(P,V)$, then the number of chips required for detecting the Trojan will be minimized. (Footnote 2: In our likelihood-ratio based hypothesis testing method (Section 3.8), $\Delta D_T(P,V)/\sigma_{var}(P,V)$ is inversely proportional to the number of chips required to detect the Trojan by measuring delays of path $P$ using vector $V$. A formal proof for this relationship can be found in Section 4.3.1.) Then the next question is how to find a path and vector that maximize $\Delta D_T(P,V)/\sigma_{var}(P,V)$. We solve the above problem by making a second observation: For the paths that pass via an arbitrary Trojan site, say $i$, and for vectors that sensitize the delays of these paths, the Trojan sited at line $i$ produces an almost constant extra delay. To illustrate this idea, suppose that in an arbitrary combinational logic block, a minimally delay-invasive Trojan is sited at line $i$ and there exists more than one path passing via the Trojan site. We represent the delay of an arbitrary path for an arbitrary vector as the sum of the delays of the gates along the path that the vector sensitizes. Since a minimally delay-invasive Trojan adds the minimum amount of additional capacitive load to the Trojan site, the gate driving the Trojan site will see an increase in its load capacitance, and hence its delay will also increase.
The original gate’s delay can be formulated using a linear delay model, $gh + p$, where $g$ is the logical effort of the gate, which may vary with the direction of the output transition (rising or falling), $h$ is the electrical effort, and $p$ is the parasitic delay [91]. After a Trojan is inserted at line $i$, the delay of the gate driving line $i$ becomes $gh' + p$, where $h'$ is the new electrical effort when the additional load is induced at line $i$ by the Trojan. So the increase in the delay can be estimated as $g(h' - h)$, which is a fixed constant for every vector applied to the gate, as long as the logical effort of the gate, $g$, is identical for both rising and falling transitions. Thus, we model the extra delay due to the Trojan, for every path passing via the Trojan site and for different vectors sensitizing the delay of the path, as an almost fixed constant. However, when we evaluate our approach and other approaches, we accurately measure the extra delay due to the Trojan for any given path and vector by performing transistor-level simulations (see Section 3.10 for more details about how we model a Trojan-induced minimum additional capacitive load and perform transistor-level simulations). Computation of $\sigma_{var}(P,V)$ involves Monte Carlo simulations to estimate the effects of process variations on delays, and it is impractical to obtain $\sigma_{var}(P,V)$ for every path $P$ and vector $V$ due to the high complexity. Instead, we observed that $\sigma_{var}(P,V)$ is typically large when the nominal value of the delay, $D_N(P,V)$, is large. In other words, paths with larger delays tend to have larger variations in their delays than paths with smaller delays.
This is due to the fact that the standard deviation of the delay for a particular path and vector can be represented as the square root of the sum of squares of the standard deviations of the gates constituting the path, i.e.,

$$\sigma_{var}(P,V) = \sqrt{\sum_{k \in P} \sigma_{var,k}^2(V)},$$

where $\sigma_{var,k}(V)$ is the standard deviation of the delay of gate $k$ when vector $V$ is applied to the circuit. In turn, $\sigma_{var,k}(V)$ can be modeled as some fraction of the nominal value of the delay, $d_{N,k}(V)$. Thus, for a given $V$, $\sigma_{var}(P,V)$ depends on the number and sum of nominal delays of the gates along path $P$, and hence paths with smaller delays, $D_N(P,V) = \sum_{k \in P} d_{N,k}(V)$, tend to have smaller values of $\sigma_{var}(P,V)$. Since we are interested in finding paths with the maximum $\Delta D_T(P,V)/\sigma_{var}(P,V)$, where $\Delta D_T(P,V)$ is almost constant and $\sigma_{var}(P,V)$ roughly follows $D_N(P,V)$, we select the paths with the smallest delays (shortest delay paths).

To show the effectiveness of selecting the shortest delay path, we choose two paths with significantly different path delays that pass via the same Trojan site, line 371, in the s420 benchmark circuit. We perform Monte-Carlo simulations to obtain the delay values shown in Figure 6. The distribution of delay for the original version of the s420 benchmark circuit for a specific vector is shown by the darker curve in Figure 6(a). We then obtain a version of this circuit with a Trojan by inserting a Trojan at line 371, and repeat the simulations to obtain the distribution shown by the lighter curve in Figure 6(a). Note that the Trojan causes a relatively small change in delay, namely 8ps, compared to both the nominal delay of the original benchmark circuit and the variations in the delay of the original circuit caused by process variations. For the same circuit and the same Trojan, we repeat the simulations for a different vector that is selected to excite a much shorter delay path that passes via the same Trojan site.
It is easy to see in Figure 6(b) that, while the expected value of the additional delay due to the Trojan remains around 8ps, i.e., at the same level as in Figure 6(a), the impact of the Trojan increases significantly as a percentage of the average delay of the path. The Trojan's impact also increases with respect to the variance due to process variations, and hence it is easier to detect.

Figure 6: The distribution of delay at an output of s420 considering process variations for the original circuit version, and a version with a Trojan sited on line 371, for a vector that excites (a) a longer delay path, and (b) a shorter delay path. (Axes: path delay (ps) versus probability density; curves: original circuit, circuit with an additional fanout.)

The idea of targeting the shortest delay paths is also useful because a Trojan always increases the total path delay. The conventional delay testing method targets the longest delay paths and checks for erroneous logic values at primary outputs, which arrive later than the desired clock period. Even though the impact of a Trojan on delay is very small, the Trojan might be detected if it increases the delay of any of the longest delay paths and the path delay goes beyond the clock period. Since the adversary is aware of every commonly used conventional testing method, he/she will try to insert a Trojan on paths other than the longest delay paths to avoid detection. Yet another benefit of targeting shortest delay paths is that shorter paths tend to have fewer off-path inputs than longer paths. For this reason, the coverage is higher, since the probability that a test vector satisfying all our conditions exists is greater for a shorter path. Hence, in order to detect a Trojan at line i using path delay measurement, we select the shortest delay path passing via line i.
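The selection of the shortest delay path via a given line can be illustrated with a small sketch. It is not the tool described in this dissertation: it assumes the block is given as a directed acyclic graph with one fixed nominal delay per node, it ignores whether a sensitizing vector exists, and it replaces path enumeration with a simple dynamic program. The names `fanout`, `delay`, and `shortest_path_through`, and the toy netlist, are hypothetical.

```python
from functools import lru_cache

def shortest_path_through(fanout, delay, line):
    """Return (total_delay, path) of the minimum-delay input-to-output path via `line`.

    fanout: dict node -> list of successor nodes (assumed acyclic).
    delay:  dict node -> nominal delay of that node (hypothetical values).
    """
    # Build the reverse (fan-in) relation once.
    fanin = {}
    for src, dsts in fanout.items():
        for dst in dsts:
            fanin.setdefault(dst, []).append(src)

    @lru_cache(maxsize=None)
    def best_prefix(node):
        # Cheapest path from any primary input up to and including `node`.
        preds = fanin.get(node, [])
        if not preds:
            return delay[node], (node,)
        d, p = min(best_prefix(q) for q in preds)
        return d + delay[node], p + (node,)

    @lru_cache(maxsize=None)
    def best_suffix(node):
        # Cheapest continuation from `node` (exclusive) to any primary output.
        succs = fanout.get(node, [])
        if not succs:
            return 0.0, ()
        return min((delay[s] + best_suffix(s)[0], (s,) + best_suffix(s)[1])
                   for s in succs)

    d_pre, p_pre = best_prefix(line)
    d_suf, p_suf = best_suffix(line)
    return d_pre + d_suf, list(p_pre + p_suf)

# Toy netlist: inputs a, b; gates g1..g4 with made-up delays in ps.
fanout = {"a": ["g1", "g2"], "b": ["g2"], "g1": ["g3", "g4"],
          "g2": ["g4"], "g3": [], "g4": []}
delay = {"a": 0, "b": 0, "g1": 10, "g2": 30, "g3": 5, "g4": 20}

for site in ("g1", "g2", "g4"):
    print(site, shortest_path_through(fanout, delay, site))
```

In a real flow, the candidate returned here would still have to pass vector generation; if no vector exists, the next-shortest path via the same line would be tried.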
We define the set of surrogate paths as follows: for each line in the circuit, the surrogate path is the shortest delay path passing via that line. As we select one surrogate path for each Trojan site, in the worst case the total number of surrogate paths is only proportional to the number of lines in the circuit. Hence, testing of all possible Trojans using surrogate paths only needs O(m) time in the worst case, where m is the number of lines in the circuit.

3.5 Conditions for Vector Generation

Our next task is to generate a vector that invokes the delay of a path $P_i$ in the above set of surrogate paths. This task has some similarity to the problem of test vector generation for path delay testing during HVM testing, with two important differences. First, here we have selected the shortest delay paths, in contrast to the longest delay paths in delay testing. Second, our objective is to excite exactly the delay of the selected path, whereas in delay testing the goal is to invoke a delay that is greater than or equal to that of the target path.

For any given target path selected as a surrogate path, we have derived conditions that must be satisfied by a vector to ensure that our objective is met. Figure 7 shows the conditions that must be satisfied by a vector generated for robust delay fault testing. In contrast, Figure 8 shows the conditions we have derived for generating a vector for a surrogate path, for an on-path NAND gate. In the first case, where the on-path input of the NAND gate has a falling transition, delay testing and Trojan detection both require the off-path input to have a steady 1. This is because in both cases we must ensure that a transition at any off-path input does not decrease the delay of the target path. In the second case, where the on-path input of the NAND gate has a rising transition, robust delay fault testing only focuses on invoking a delay that is equal to or greater than the delay of the path.
Since in Trojan detection our goal is to invoke the delay of the target path, we have modified these conditions to preclude cases where a late transition at an off-path input invokes a delay greater than that of the target path.

Figure 7: Conditions for robust delay fault testing for an on-path NAND gate. The thick and thin lines denote on-path and off-path lines, respectively.

Figure 8: Conditions for a NAND gate along a surrogate path, to detect the above category of Trojans.

If the transition at a gate's input on the path being tested is from a controlling value ($c$) to a non-controlling value ($\bar{c}$), then we have two conditions:
- Condition I: The off-path signal values should be of the form $\langle x, \bar{c} \rangle$, where $x$ can be either 0 or 1.
- Condition II: The off-path signal values should change to the non-controlling value before the on-path input arrives.

We have derived new conditions for all types of gates and integrated these into a vector generation framework.

3.6 Calibration of Process Variations

The first two steps, path selection and vector generation, focus on how to maximize the impact of a Trojan on path delay. However, as the levels of process variations continue to increase with each technology scaling generation, the effect of process variations on the measured values of circuit parameters can easily exceed the impact of a minimally invasive Trojan. Hence, new approaches are required to maintain the quality of Trojan detection methods that are based on measuring circuit parameters. Based on the model of a Trojan discussed above, we developed a calibration method which dramatically reduces the effects of process variations on delays via some additional measurements on test structures inserted into the circuit [16].
Our idea originates from the fact that the parameters of devices in a particular chip follow local variations, which are smaller than global variations, around a unique operating point that is specific to the chip and determined by global variations, as discussed in Section 3.2. Hence the delay of a path in a particular chip is expected to lie within a smaller range determined by local variations, around a mean that is shifted by global variations. In addition, as our approach measures delays of paths on a block-by-block basis, even for some of the largest combinational logic blocks, every device along a path whose delay is measured resides in a relatively small part of the chip's area. It is generally accepted that parameters of devices within a small area have significantly higher correlations than those of devices placed far apart. Thus, the devices in a typical logic block follow a distribution (local nearby) which is narrower than the distribution of devices placed far apart in the chip (local far-away). We therefore arrived at the observation that we can significantly reduce the number of chips that must be tested to detect a particular Trojan by making measurements on test structures to quantify the effects of global variations, estimating a global mean shift for a particular logic block in a particular chip, and calibrating the effects of process variations on delays using the estimated mean value of delay.

To understand this concept, we further look at three different cases of process variations where measurements are performed in different ways. The first case is full-statistical process variations, including all global and local variations (Case I). This case represents delay distributions measured on chips from several lots and wafers manufactured over long periods of time.
The second case represents the model of process variations where only local variations are considered, including nearby and far-away systematic components and random components (Case II). The second case is applicable to the situation where the delays of different devices within the same chip are measured and used to evaluate the effects of local variations. Since this assumes that the effect of global variations, which is represented as a mean shift in delay, is factored out, the global variation component is assumed to be zero. The third case considers only systematic local variations on devices placed near each other, i.e., only local nearby, and random local variations (Case III). One additional assumption made in Case III is that all devices are placed within a small area (as defined by the technology vendor) and measurements are performed on these clustered devices. Table 3 summarizes the above variation models and the corresponding notations used in the rest of this dissertation. In order to estimate the mean shift due to global variations, we insert test structures near each combinational logic block and estimate the mean shift for each block using the information extracted from the corresponding test structures. Then we use the estimated mean shift values to calibrate the distribution of delay values measured from fabricated chips.

Table 3: Variation models for three cases, in terms of four sources of variations

Case | Representation
I | Full-statistical variation model considering every source of variation. It represents the situation where measurements are performed on devices in several chips from different wafers and lots fabricated over a long period. Variance: $\sigma_1^2 = \sigma^2 = \sigma_g^2 + \sigma_{l,s,n}^2 + \sigma_{l,s,f}^2 + \sigma_{l,r}^2$
II | Local variation model. Applicable to the situation when measurements are performed on devices on the same chip. Variance: $\sigma_2^2 = \sigma_{l,s,n}^2 + \sigma_{l,s,f}^2 + \sigma_{l,r}^2$
III | Local-nearby variation model on devices placed nearby. Similar to Case II, but measurements are performed on a block-by-block basis where each logic block resides within a small area (as defined by the foundry). Variance: $\sigma_3^2 = \sigma_{l,s,n}^2 + \sigma_{l,r}^2$

To illustrate our new idea, we assume that there is a single combinational logic block whose size is less than the fabricator-specified local area, shown in Figure 9 surrounded by dotted lines. The delay of an arbitrary path, say $P_k$, is measured with vector $V$ and is denoted by $D(P_k,V)$, or simply $d_k$, since our approach applies to any vector that can excite the path's delay, i.e., any vector that does not hide the impact on the path delay of a Trojan located (sited) on a line along the path. If measurements are performed on chips from several wafers and several wafer lots without calibration, then the distribution of $d_k$ is determined by the full-statistical variations and follows the normal distribution $N(\mu, \sigma_1)$, as depicted in Figure 10(a). With our new approach, we insert one or more test structures in or near every logic block and estimate the mean shift due to global variations, $\mu_{shift}$, using the above measurement and estimation method. The resulting mean value $\hat{\mu}$ $(= \mu - \mu_{shift})$ is obtained once calibration is completed, and delay values after calibration follow the normal distribution $N(\hat{\mu}, \sigma_2)$, as depicted in Figure 10(b). Hence, the effect of process variations on delay is dramatically reduced and it becomes much easier to detect the Trojan.

Figure 9: Measurement of delay of path $P_k$, where the size of the logic block is smaller than the fabricator-specified local area, represented with dotted lines.

Figure 10: Delay distribution of path k: (a) full-statistical variations, and (b) local-nearby variations.
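The effect of calibration can be illustrated numerically. The sketch below is not the experimental flow used in this work: it assumes that the global variation adds the same shift to a path delay and to the readings of nearby test structures, that all components are Gaussian, and that the shift is estimated as the average deviation of the test-structure readings from their nominal value. All numbers and function names are hypothetical.

```python
import random
import statistics

def estimate_mean_shift(test_structure_delays, nominal_delay):
    """Average deviation of test-structure readings from their nominal delay."""
    return statistics.fmean(d - nominal_delay for d in test_structure_delays)

def calibrate(measured_path_delay, mean_shift):
    """Remove the estimated global shift from one measured path delay."""
    return measured_path_delay - mean_shift

def simulate(num_chips=2000, num_structures=8,
             sigma_global=10.0, sigma_local=3.0, seed=1):
    """Return (std of raw delays, std of calibrated delays) over simulated chips."""
    rng = random.Random(seed)
    path_nominal, ts_nominal = 200.0, 50.0  # made-up nominal delays in ps
    raw, calibrated = [], []
    for _ in range(num_chips):
        shift = rng.gauss(0.0, sigma_global)  # one global shift per chip
        path = path_nominal + shift + rng.gauss(0.0, sigma_local)
        readings = [ts_nominal + shift + rng.gauss(0.0, sigma_local)
                    for _ in range(num_structures)]
        raw.append(path)
        calibrated.append(calibrate(path, estimate_mean_shift(readings, ts_nominal)))
    return statistics.stdev(raw), statistics.stdev(calibrated)

raw_sd, cal_sd = simulate()
print(f"std before calibration: {raw_sd:.2f} ps, after: {cal_sd:.2f} ps")
```

Under these assumptions the spread after calibration is governed by the local component plus the residual error of the shift estimate, so adding test structures narrows it further.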
In our experiment, we capture all four of the above types of variation parameters, $\sigma_g$, $\sigma_{l,s,n}$, $\sigma_{l,s,f}$, and $\sigma_{l,r}$, separately for each device parameter, and vary every device parameter according to the corresponding coefficients defined in an industrial 65nm technology and the associated process variation model. The three cases illustrated in Table 3 are also provided as standard variation models and are controllable via 23 Monte-Carlo simulation switches. In particular, Case III, i.e., local nearby, is defined to be applicable when measurements are performed on devices that are placed within a 100µm × 100µm area of the chip. Note that the size of most combinational logic blocks in 65nm technology is smaller than 100µm × 100µm. For example, a 5-stage 16x16 multiplier has an area of less than 3000µm² [39]. So we can assume that Case III holds for almost every individual combinational logic block designed using this technology.

To show how the three cases of variation models are specified in this 65nm technology, we conducted a simple experiment on an inverter chain with 12 inverters. Using the process variation models provided by the technology, we performed Monte-Carlo simulations to obtain realistic delay distributions using the Cadence Spectre simulator. Figure 11 shows the probability density functions (pdfs) of delays measured on the inverter chain for the three cases of variation models, with the 3σ values of the three cases shown on the x-axis. Compared to Cases I and II, for Case III the standard deviation of delay with respect to the mean delay reduces by 24.7% and 5%, respectively. This is because $\sigma_g$ is factored out and all device parameters are affected only by $\sigma_{l,s,n}$ and $\sigma_{l,r}$.
Thus, the effects of process variations are greatly reduced when the global variation component can be factored out, and can be further decreased if the devices to be measured are clustered in a small area as specified in the process variation model.

Figure 11: Probability density functions of delays measured on an inverter chain with twelve inverters for three cases of variations. (Axes: mean delay − delay (ps) versus probability density; curves: Case I, Case II, Case III.)

To measure the mean global shift in path delay in a particular chip, we insert test structures, such as ring oscillators and various types of gates, into each combinational logic block and estimate the global variation component of the path delay via calibration, using variability information extracted from actual chips. Many approaches to measure variations using test structures and to estimate variation components from the extracted data have been proposed and validated [11][52]. With delay values measured from test structures, we use a minimum mean square error (MMSE) estimator which statistically estimates the mean shift that has occurred in a particular region of a particular chip [52]. If there are M measurements $(d_1, \dots, d_M)$ from M test structures for a particular logic block, then the MMSE estimator, denoted by $\hat{\mu}$, is the value which minimizes $\sum_{i=1}^{M} (d_i - \hat{\mu})^2$. The more test structures we have, the more precisely we can estimate the mean delay.

One drawback of our approach is the area overhead caused by the insertion of test structures. However, we can apply our new approach at low area overhead by utilizing unused spaces in the layout. In addition, our new idea can be applied even when only a few test structures are placed on the entire chip, instead of within each combinational logic block. In this case, we can use Case II by estimating the mean shift of the chip itself, so much of the benefit still remains.
This is because a significant reduction in the standard deviation occurs between Case I and Case II, while the standard deviations for Case II and Case III are much closer. Hence the use of Case II preserves most of the benefits of our approach while requiring only a small number of test structures for the entire chip.

The value of the mean shift is determined for each path delay distribution, and a measured delay value from a particular chip can be adjusted to a new value, called a calibrated value. For example, if the delay value measured for path k of a particular logic block on a particular chip, say i, is $d_k^i$ and the mean shift of that block is $\hat{\mu}_i$, then we can compute the calibrated value, $\hat{d}_k^i$, by subtracting $\hat{\mu}_i$ from $d_k^i$, i.e., $\hat{d}_k^i = d_k^i - \hat{\mu}_i$. In this way, we can map all values measured from fabricated chips into a single calibrated delay distribution, as shown in Figure 12. This distribution will be used to estimate the number of chips to be tested to detect a Trojan along the path.

Figure 12: Obtaining calibrated delay values from N data points measured from N fabricated chips.

3.7 Path Delay Measurement

Finally, we consider the problems of testing short delay paths [62][78]. Targeting short delay paths may require very fast clocks, which are not available in most testers. Even when available, a fast clock can cause excessive heat dissipation. In addition, measurement noise might occur at the connections between tester probes and circuit pins. It is also important to measure path delays in a manner that can capture suitably small differences in delay caused by a Trojan, around several picoseconds in our 65nm technology. Finally, our approach must measure delays at the flip-flop at the output of the selected path, and not at all flip-flops of every combinational logic block. This is unlike conventional delay testing, which checks for timing violations at all flip-flops.

Figure 13: Path delay measurement architecture.
To satisfy the above requirements, we choose the on-chip delay measurement architecture because it does not rely on external automatic test equipment (ATE) and has been shown to obtain precise timing information of paths from silicon under real operating conditions [57]. We adapt the architecture provided in [45] and change the structure of the scan registers connected to the inputs and outputs of logic blocks to meet our requirements. The scan register connected to each output pin includes an additional shadow register operated by a separate shadow clock to measure the path delay at the primary outputs. The path delay is determined by comparing the logic values captured at both registers while controlling the skew size, shown as $\Delta$ in Figure 13. Measurement errors which might be caused by temperature and voltage changes are reported to be accounted for in this approach, by estimating the effect of measurement noise through monitoring of circuit parameters. Other ways have been proposed to minimize the measurement noise by inserting delay sensors [46] or test structures [16] in the unused spaces in the layout. Thus, measurement noise can be reduced while sacrificing only a small amount of area. In addition, this approach is reported in [45] to provide sufficient measurement resolution, which depends on how precisely we can control the skew of the shadow clock. This approach uses digitally variable resistors to control the skew size to 1ps, which provides sufficient resolution, $\Delta_{min}$, for our purposes [68]. Other approaches using programmable delay elements (PDE) are also reported to be able to control the skew size precisely at the level of several picoseconds [37]. Moreover, we can avoid using fast clocks by introducing multiple clocks, each with the same frequency as the original clock, but with controllable phase shifts to obtain the desired skews.
We use a combination of multiple skewed clocks followed by the original clock (a slight modification of the classical approach used for path delay fault testing [33]) and capture logic values for different skew sizes, between $\Delta_{min}$ and $k\Delta_{min}$. By capturing logic values at the output using multiple clocks with different skews (binary search can be used to minimize the number of measurements) and comparing these values with the expected output values, we can find which skew size leads to an erroneous logic value and hence determine the delay of the selected path. Thus, our approach does not require high-frequency clocks and avoids all problems associated with excessive heat dissipation during testing of short delay paths. Lastly, the area overhead caused by using this architecture is relatively low. This overhead is higher if surrogate paths arrive at many different flip-flops. Thus, we have the option of reducing the area overhead by choosing surrogate paths that terminate at fewer outputs. This leads to an interesting tradeoff, since doing so might increase the number of chips to be tested.

3.8 Estimation of the Number of Chips to be Tested

Based on the measured delay values for the selected paths and generated vectors, we determine whether a Trojan exists or not. In this phase of the study, we show that our problem of Trojan detection differs from classical hypothesis testing and classification problems, and we develop a likelihood-ratio based method which can effectively determine the existence of the target Trojan for a given level of confidence.

A number of approaches have been proposed to determine the existence of a Trojan using statistical methods, while some techniques classify chips into several categories and analyze the result of the classification to detect a Trojan. We have categorized all these approaches based on their characteristics.
One category corresponds to methods that pre-process data by reducing the dimensionality or volume of the data, and then analyze the pre-processed data in order to determine the existence of Trojans, e.g., by visualizing them on a plot or using outlier analysis. The authors in [31][36][97] chose principal component analysis (PCA) to remove correlated factors from measurement results in order to reduce the dimensionality of the data, and visualized only uncorrelated factors on a graph to determine whether there is a notable dissimilarity between the data obtained from measurements and the data of the original design. Furthermore, some researchers chose outlier analysis using scatterplot data [61]. The second category includes approaches that use machine-learning techniques such as the Support Vector Machine (SVM). The authors in [32] used an SVM to classify data points obtained from ICs and used this result to identify any suspicious ICs. Other researchers proposed to use hypothesis testing methods, which compute statistics based on the obtained data and a set of probability distribution curves, and choose one of the competing hypotheses depending on the values of the statistics. The authors in [34] used Student's t-test to determine the existence of a Trojan, where data points are assumed to be normally distributed.

Figure 14: Type I (false positive) and Type II (false negative) error probabilities given two probability density functions, $f_A$ (original) and $f_B$ (Trojan-affected), of a particular parameter, and the rejection threshold.

Every approach introduced above has its own advantages, and we choose hypothesis testing to solve this problem for the following reasons. First, every approach that pre-processes and visualizes data still needs additional statistical methods to make decisions, such as outlier analysis to distinguish suspected data points from the entire data set, or hypothesis testing.
Since we aim to develop an approach for making decisions on the existence of Trojans, these pre-processing and visualizing approaches can be used in conjunction with our approach. Second, some methods lack the ability to systematically control the probabilities of Type I and Type II errors, which are necessary for evaluating the accuracy of Trojan detection. For example, some machine-learning based approaches like SVM might be useful, but they can only provide empirical procedures for controlling the probabilities of Type I and II errors. Also, the performance of every supervised machine-learning based technique is not deterministic and depends highly on the quality of the training data set. In contrast, hypothesis testing methods are able to systematically control the probabilities of Type I and II errors. For example, [31] uses a hypothesis testing approach to determine the existence of a Trojan while controlling the probability of Type I error. Because of these advantages, we choose to use hypothesis testing to solve our problem.

However, existing hypothesis testing methods are far from ideal for our objective, since they either assume generic distributions for delay values (parametric tests, e.g., Student's t-test) or require a large number of samples to be tested (non-parametric tests, e.g., goodness-of-fit tests). Due to these restrictions, existing methods are not applicable to our problem of determining the existence of the target Trojan, where delays are not perfectly normally distributed and some designs may be fabricated in small quantities. In addition, our problem is to select the more likely model between two competing models, "Trojan-free" and "circuit with the target Trojan", whereas existing hypothesis testing methods decide whether the data follows a certain distribution or not. Thus we need to develop a method for efficiently solving our selection/classification problem.
Thus, we propose a more efficient non-parametric test to identify the existence of the target Trojan using a likelihood-ratio test. It utilizes every single path delay value measured from fabricated chips by dividing the entire sample space into a fixed number of intervals and computing the conditional probability for each interval, based on the probability distributions that result from process variations for both models (hypotheses): "Trojan-free" and "circuit with the target Trojan". Then it computes the exact conditional probabilities. Because our goal is to choose the more probable of these two competing models, we use the likelihood ratio between the conditional probabilities to make a decision. We then compute the actual Type II error probability. We have also developed an ILP that estimates the number of chips required for measurements for a given vector, a given level of confidence, and a given probability of Type II error.

Suppose that we have a probability distribution $f_A$ for the delay values of path k of the original circuit, and $f_B$ for the circuit with the target Trojan. For some unknown distribution $f_X$ and i.i.d. calibrated delay values $\hat{d}_k^1, \dots, \hat{d}_k^{N_k}$ of path k from $N_k$ fabricated chips using a test vector, the likelihood-ratio test decides between the following hypotheses.

H0: $f_X = f_A$, i.e., the target Trojan does not exist,
H1: $f_X = f_B$, i.e., the target Trojan does exist.

The entire sample space is divided into r mutually exclusive intervals, and we compute $p_j^0$ and $p_j^1$, which are the conditional probabilities that one data point belongs to interval j given that H0 or H1, respectively, is true:

$$p_j^0 = P[\hat{d}_k^1 \in (\text{interval } j) \mid H0], \quad j = 1, \dots, r, \qquad (2)$$

with $p_j^1$ defined in the same way given H1. For the calibrated delay values and r intervals, the numbers of samples that belong to the intervals $j = 1, \dots, r$ are $n_1, \dots, n_r$, respectively, where $\sum_{j=1}^{r} n_j = N_k$. Let the event $X$ be a combination of $n_1, \dots, n_r$, i.e., $X = \{n_1, \dots, n_r\}$.
The conditional probability of the event $X$ given that H0 (H1) is true is

$$p(X \mid H0) = \frac{N_k!}{n_1!\, n_2! \cdots n_r!} \prod_{j=1}^{r} (p_j^0)^{n_j}, \qquad p(X \mid H1) = \frac{N_k!}{n_1!\, n_2! \cdots n_r!} \prod_{j=1}^{r} (p_j^1)^{n_j}. \qquad (3)$$

The likelihood-ratio test statistic can be written as

$$\Lambda(X) = \frac{p(X \mid H0)}{p(X \mid H1)}. \qquad (4)$$

The decision rule is as follows: if $\Lambda(X) \ge c$, do not reject H0; otherwise, reject H0, where c is a threshold value. In addition, the probability of Type II error, $\beta$, is computed as

$$\beta = \sum_{X:\, \Lambda(X) \ge c} p(X \mid H1), \qquad (5)$$

where $\beta$ represents the sum of the conditional probabilities given that H1 is true over the outcomes for which the test decides to accept H0 (i.e., when $\Lambda(X) \ge c$). This problem can be expressed as the integer linear program (ILP) shown below:

Objective: minimize $N_k$ for path k and H0 and H1
Constraints:
$c \ge 1$,
$\beta_{max} \ge \sum_{X:\, \Lambda(X) \ge c} p(X \mid H1)$,
$\alpha_{max} \ge \sum_{X:\, \Lambda(X) < c} p(X \mid H0)$,

where $\alpha_{max}$ and $\beta_{max}$ are the maximum values of the Type I and Type II error probabilities allowed for the test. By solving this ILP problem, the number of chips to be tested, $N_k$, for path k can be computed for given calibrated delay values.

In summary, our hypothesis testing method can successfully identify a Trojan with given values of Type I and II error probabilities, where the distributions of data points are not necessarily normal. Our proposed research will focus on developing a new method to statistically detect a Trojan, but it will not be constrained to hypothesis testing. We are still investigating new statistical methods that are suitable for our problem, including machine-learning techniques. Methods for visualizing data, such as PCA, also have advantages in reducing the volume of data, and may be considered in our future research.

3.9 Algorithm

We have integrated and implemented all of the above results in a single framework. Figure 15 provides a high-level overview of our integrated approach. We have used our prototype tool to demonstrate the effectiveness of our approach.
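The likelihood-ratio test of Section 3.8 can be made concrete with a short sketch. It is not the MATLAB implementation or the ILP used in this work: it assumes the interval probabilities under H0 and H1 are already known and strictly positive, it computes both error probabilities by brute-force enumeration of all count vectors, and it finds the number of chips by a plain linear search. The function names and the example probabilities are hypothetical.

```python
import math
from itertools import combinations

def log_likelihood_ratio(counts, p0, p1):
    """log of Lambda(X) = p(X|H0) / p(X|H1); the multinomial coefficient cancels."""
    return sum(n * (math.log(a) - math.log(b))
               for n, a, b in zip(counts, p0, p1) if n)

def multinomial_pmf(counts, probs):
    """Probability of the count vector X under the given interval probabilities."""
    coeff = math.factorial(sum(counts))
    for n in counts:
        coeff //= math.factorial(n)
    return coeff * math.prod(p ** n for p, n in zip(probs, counts))

def compositions(total, parts):
    """All ways to place `total` chips into `parts` intervals (stars and bars)."""
    end = total + parts - 1
    for bars in combinations(range(end), parts - 1):
        prev, out = -1, []
        for b in bars + (end,):
            out.append(b - prev - 1)
            prev = b
        yield tuple(out)

def error_probabilities(num_chips, p0, p1, c=1.0):
    """Return (alpha, beta) when H0 is kept exactly for Lambda(X) >= c."""
    alpha = beta = 0.0
    log_c = math.log(c)
    for counts in compositions(num_chips, len(p0)):
        if log_likelihood_ratio(counts, p0, p1) >= log_c:
            beta += multinomial_pmf(counts, p1)   # H0 kept although H1 holds
        else:
            alpha += multinomial_pmf(counts, p0)  # H0 rejected although H0 holds
    return alpha, beta

def min_chips(p0, p1, alpha_max, beta_max, c=1.0, limit=60):
    """Smallest number of chips (up to `limit`) meeting both error bounds, else None."""
    for n in range(1, limit + 1):
        alpha, beta = error_probabilities(n, p0, p1, c)
        if alpha <= alpha_max and beta <= beta_max:
            return n
    return None

# Made-up interval probabilities for r = 3 intervals.
p0 = [0.6, 0.3, 0.1]   # Trojan-free
p1 = [0.3, 0.4, 0.3]   # with the target Trojan (delays shifted upward)
n = min_chips(p0, p1, alpha_max=0.2, beta_max=0.2)
if n is not None:
    print(n, error_probabilities(n, p0, p1))
```

The enumeration grows quickly with the number of intervals and chips, so this form is only practical for small r and small sample counts.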
We have also compared our approach with an adaptation of the existing delay testing approach to demonstrate the dramatic increase in effectiveness and/or dramatic decrease in cost. To evaluate the benefit of our approach, we introduce two metrics, Test Cost and Trojan Coverage (TC).

Since there might be multiple surrogate paths passing via the same Trojan site, it is possible that some surrogate paths may be used to detect more than one Trojan. The goal is to find a minimal set of surrogate paths that detects every detectable Trojan, in order to minimize the test cost. This problem can be stated as an ILP.

Objective: minimize $|\mathbf{T}|$, $\mathbf{T} \subseteq \mathbf{P}$
Constraints:
$\sum_{P_j \in \mathbf{T}} x_{i,j} \ge 1$ for $1 \le i \le m$,
$x_{i,j} = 1$, if $i \in \mathbf{S}_{P_j}$ and $P_j \in \mathbf{P}$,
$x_{i,j} = 0$, if $i \notin \mathbf{S}_{P_j}$ or $P_j \notin \mathbf{P}$,

where $x_{i,j}$ is an indicator that shows whether Trojan i is detectable using surrogate path $P_j$ ($x_{i,j} = 1$) or not ($x_{i,j} = 0$), and $\mathbf{T}$ is a minimal set of surrogate paths to be used for testing. The above problem can be solved using greedy heuristics. The two metrics, test cost and Trojan coverage, are computed as follows:

$$\text{Test Cost} = \sum_{P_j \in \mathbf{T}} \left( \max_i n_{i,j} \right) \qquad (6)$$

$$\text{Trojan Coverage (TC)} = \frac{|\mathbf{S}|}{m} \times 100\ (\%) \qquad (7)$$

That is, the test cost is the sum of the number of chips required to be tested for every surrogate path in $\mathbf{T}$.

    1. Initialize P_i = NULL (surrogate path i), S = ∅ (set of detectable Trojans),
       P = ∅ (set of surrogate paths) and S_{P_i} = ∅ (subset of Trojans that are
       detectable by surrogate path i)
       for every line i = 1, …, m in the circuit
       begin
         Enumerate every path passing via the target line i.
         Sort paths in increasing delay order and add them into S_i
    2.   while S_i ≠ ∅
           Choose the path k having the smallest delay in S_i
           Generate a test vector for the corresponding path k
    2-1.   if (test generation is successful) then
             Update P_i = k
             Add P_i to the set of surrogate paths, P
             Add line i to S
             Add every line along the path P_i to S_{P_i}
             break
    2-2.   else
             Remove k from S_i
           end if;
         end loop;
    3.   if (P_i = NULL) then
           Surrogate path for line i does not exist
         end if;
       end loop;
    4. Simulate and measure the delay of each surrogate path P_k in P, without and in
       the presence of a Trojan sited on each line i in S_{P_k}, using Monte-Carlo
       simulations
    5. for every Trojan i in every non-empty S_{P_k}
       begin
    5-1. Calibrate the effect of process variations by eliminating the global
         components of the total variations (optional)
    5-2. Compute the number of fabricated chips, n_{i,k}, to be applied to P_k to
         detect Trojan i.
       end

Figure 15: An overview of our prototype tool for generating vectors for detecting Trojans.

3.10 Experimental Results

For our experiments, we use the combinational parts of nine ISCAS89 benchmark circuits. The timing-aware ATPG tool for the proposed test generation procedure has been implemented on an Intel Core i7 with 2.67GHz processors and 4GB of main memory, for the industrial 65nm technology. As a Trojan which induces a minimum load at a line in the circuit, we use a minimum-sized inverter as an extra fanout that induces extra delay at each particular Trojan site. The hypothesis testing methods and our Trojan detection algorithm are implemented using MATLAB.

Figure 16 shows the number of chips to be tested for each surrogate path with our approach and with the classical delay testing method. For each testing method, we apply two different hypothesis testing methods, Student's t-test and our likelihood-ratio based approach. We use Student's t-test as a baseline to compare with our method, since Student's t-test requires fewer chips to be tested than any type of non-parametric test. Since our approach is a non-parametric test, this comparison is more than fair (in fact, it is pessimistic). It can be seen that by using shortest delay paths as surrogate paths we significantly reduce the required number of chips for every possible surrogate site in c17.
In addition, our hypothesis testing method is significantly more effective than Student's t-test, since it chooses the more likely of two competing models, "Trojan-free" and "circuit with the Trojan", whereas Student's t-test focuses only on the closeness of the measured delay values to the Trojan-free model.

Figure 16: The number of chips to be tested for detecting a minimally delay-invasive Trojan sited at each line in c17, using our approach of targeting the shortest delay path and the classical delay testing method. (a) Student's t-test. (b) Likelihood-ratio based test. [Both panels plot the number of chips to be tested (0 to 500) against the surrogate path number (1 to 17), for "Our approach" and "Classical delay testing".]

In Table 4, we compute the Trojan detection cost of four different methods for eight benchmark circuits, and we compare the costs of (a) and (d) to see the improvement in the cost of Trojan detection. For each method, we use a 95% confidence level and a 5% Type II error probability. Note that method (a) is the classical method targeting the longest delay paths without calibration of process variations, method (d) is the method we have proposed, and (b) and (c) are intermediate methods. Our new approach with the likelihood-ratio based test (method (d)) dramatically improves, by 4.51X, the test cost compared to the existing delay testing that targets the longest delay paths with Student's t-test and without calibration of process variations (method (a)), for identical Trojan coverage. For designs that are fabricated in small volumes, our approach may be the only one that can perform Trojan detection with the desired level of confidence.
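To make the likelihood-ratio based test of Equations (3) to (5) concrete, the following sketch evaluates the Type I and Type II error probabilities by brute-force enumeration of all multinomial outcomes, and then searches for the smallest number of chips that meets both error bounds. It is a minimal illustration, not our MATLAB implementation: the threshold is fixed at c = 1 rather than optimized, the enumeration replaces an ILP solver, every interval probability is assumed to be strictly positive, and the interval probabilities in the example are hypothetical.

```python
from itertools import combinations
from math import factorial, log, prod


def compositions(total, parts):
    """Yield every tuple of `parts` non-negative integers that sums to `total`."""
    for bars in combinations(range(total + parts - 1), parts - 1):
        prev, counts = -1, []
        for b in bars:
            counts.append(b - prev - 1)
            prev = b
        counts.append(total + parts - 2 - prev)
        yield tuple(counts)


def error_rates(n_chips, p0, p1, c=1.0):
    """Return (alpha, beta) for the rule: accept H0 iff Lambda(X) >= c.

    p0[j] and p1[j] are the probabilities of interval j under H0 and H1.
    """
    alpha = beta = 0.0
    for counts in compositions(n_chips, len(p0)):
        coef = factorial(n_chips)
        for n in counts:
            coef //= factorial(n)          # multinomial coefficient
        prob_h0 = coef * prod(p ** n for p, n in zip(p0, counts))
        prob_h1 = coef * prod(p ** n for p, n in zip(p1, counts))
        # The multinomial coefficient cancels in the ratio of Equation (4).
        log_ratio = sum(n * (log(a) - log(b)) for n, a, b in zip(counts, p0, p1))
        if log_ratio >= log(c):
            beta += prob_h1                # H0 accepted although H1 may hold
        else:
            alpha += prob_h0               # H0 rejected although H0 may hold
    return alpha, beta


def min_chips(p0, p1, alpha_max, beta_max, c=1.0, n_limit=200):
    """Smallest N with alpha <= alpha_max and beta <= beta_max, or None."""
    for n in range(1, n_limit + 1):
        alpha, beta = error_rates(n, p0, p1, c)
        if alpha <= alpha_max and beta <= beta_max:
            return n
    return None


# Hypothetical two-interval example with 5% bounds on both error types.
print(min_chips((0.9, 0.1), (0.1, 0.9), 0.05, 0.05))  # -> 3
```

The enumeration grows combinatorially with the number of chips and intervals, so it is only practical for small instances such as this one.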
In summary, the results clearly demonstrate that our approach, which targets the shortest delay paths and calibrates the effect of process variations, gives better results than the classical delay testing method. In addition, the results show that the Trojan detection procedure that uses our hypothesis testing method reduces test cost significantly.

Table 4: The test costs and Trojan coverages (TC) for some benchmark circuits using various combinations of our approaches. (a) A baseline method of classical delay testing without calibration of process variations. (b) Our approach of selecting shortest delay paths without calibration of process variations. (c) Our approach of classical delay testing with calibration of process variations. (d) Our approach of selecting shortest delay paths with calibration of process variations.

Benchmark  Number    TC     Test cost                                       Improvement/change in test cost
circuit    of gates  (%)    (a)        (b)        (c)        (d)            (a)/(b)   (a)/(c)   (a)/(d)
s1238      508       94.3   4016155    2182693    1731101    940816         1.84X     2.32X     4.27X
s1488      653       98.2   13304287   7310048    5685593    3123952        1.82X     2.34X     4.26X
s5378      2779      98.7   18226483   9592886    7994071    4207406        1.90X     2.28X     4.33X
s9234      5597      93.4   40691887   22860611   20144499   11317134       1.78X     2.02X     3.60X
s15850     9772      94.2   75014631   41216830   38469042   21136836       1.82X     1.95X     3.55X
c2670      1193      92.0   5907916    3537674    2924711    1751324        1.67X     2.02X     3.37X
c5315      2307      91.4   15695471   9342543    7887172    4694745        1.68X     1.99X     3.34X
c7552      3512      89.4   31685354   19927896   15763858   9914376        1.59X     2.01X     3.20X

3.11 Conclusions

We have identified several principles for selection of target paths and generation of vectors to enable identification of minimally delay-invasive Trojans. Our approach is also demonstrated to be efficient in the presence of increasing levels of process variations, which cannot be tackled by classical testing and validation approaches. The experimental results show that the proposed approach reduces test cost significantly compared to classical methods.

CHAPTER 4.
A METHOD TO MINIMIZE THE IMPACT OF TROJANS VIA RESIZING

In this chapter, we show our approach for resizing gates, which is designed to reduce the changes in delays caused by the one additional connection (or possibly more than one, in the case of less challenging types of Trojans) induced by a minimally delay-invasive Trojan. This work has been published, and some figures, tables, and text from [15] are used under IEEE copyright policy. (Footnote 3: © 2014 IEEE.)

4.1 Introduction

In Chapter 2, we identified the properties of maximally challenging Trojans: such Trojans minimally change the values of the circuit's parameters without changing the logic functions implemented by the circuit. In Chapter 3, we showed our approach for detecting minimally delay-invasive models of Trojans that add only a single connection giving the minimum additional load. However, that approach was evaluated under the assumption that no further action is taken to minimize the effects of the inserted Trojan, whereas in practice gates and wires in the original circuit can be resized to reduce the changes in delays. Figure 17 illustrates a scenario in which the original circuit is manipulated via gate resizing after insertion of a minimally delay-invasive Trojan, in contrast to the scenario where no gate resizing takes place (Figure 5).

Figure 17: (a) An original circuit, $C$. (b) A Trojan-affected circuit with a minimally delay-invasive Trojan (with resizing in order to maximally match the delays of every path with (a)), $C_T$, where a Trojan block is connected to an arbitrary line in the original circuit via a gate having minimum input capacitance.

In this chapter, we show that a Trojan can be designed to maximally match delays with the original circuit, i.e., to maximize the difficulty of detection using delay measurements, by further reducing the impact of the Trojan on delay.
In particular, we focus on the problem of hiding the impact of a Trojan on delay by resizing gates in the original circuit, and we propose a new strategy that enables Trojan designers to hide this impact effectively at minimal cost. In this problem, our primary objective is to maximize the cost of detecting a target Trojan by reducing the impact of the Trojan on every path's delay via gate resizing, and hence to make every delay-based technique less effective. At the same time, unless we are careful, resizing might increase the effectiveness of parametric measurement methods that detect Trojans using other metrics, e.g., power or area. Thus, we strictly limit the changes that resizing causes to other parameters, such as the area and power of the circuit, and thereby minimize any benefit to other parametric methods, such as those that measure power/ground currents.

To accomplish this, we develop an algorithm for gate resizing which targets path delays and provides a version of the circuit with a Trojan that minimizes the difference between the delays of corresponding paths, while strictly constraining the changes to the area of the original circuit. The proposed algorithm also does not change the functionality of the original circuit, and the locations of the cells (gates) relative to the power/ground network remain mostly unchanged, as we end up changing the sizes of a small number of gates, mostly by using the empty spaces around the corresponding gates. Consequently, every logic test method for Trojan detection remains ineffective, and the change in power consumption is shown to be very small (see Section 4.6 for the experimental results). In addition, our approach focuses on redesigning the original circuit rather than the Trojan itself, and hence it can be applied to any kind of Trojan that is connected to arbitrary line(s) in the original design.
Last, this method reduces the delay difference caused by Trojan insertion, and hence makes every statistical method for Trojan detection require a much larger number of measurements to determine the existence of a target Trojan at a given level of confidence. Hence, our approach is especially effective for strategically sensitive chips (e.g., specialized chips for military applications) that are typically fabricated in small volumes.

Our problem is different from the traditional gate sizing problem in two important ways. First, we do not target only the critical or near-critical paths. Instead, we consider every path whose delay might be affected by gate resizing or Trojan insertion, since we cannot make any assumption regarding which paths will be excited by the vectors used for Trojan detection. Second, as we must minimize the impact of the Trojan on every path's delay, we cannot focus on optimizing metrics such as slack, power, or area.

The long-term objective of this research is not to create a new approach for Trojan design, but to develop an approach that can create the most challenging Trojans, in order to inspire and guide researchers in Trojan detection to develop the next generation of detection approaches.

4.2 Problem Statement

Minimally delay-invasive Trojan with resizing gates: A circuit $C$ is a combinational logic block with $N_g$ gates and $N_p$ paths, where gate $g_i$ is characterized by an initial sizing factor $\alpha_i$, for every $g_i \in \mathbf{G}$. $\mathbf{G}$ is the set of every gate in $C$, $\{g_1, \ldots, g_{N_g}\}$. A Trojan designer chooses line(s), called Trojan site(s), in $C$, and adds interconnect(s) between the Trojan site(s) and the Trojan block to create the Trojan-affected circuit $C_1$. After that, the Trojan designer resizes the gates in this circuit $C_1$, i.e., determines the new sizing factors $\alpha_i'$, to obtain the final version of the circuit, $C_T$.
Thus, $C_T$ has the same set of gates and wires, as well as the same interconnect topology, as $C$, except that the sizing factor of every gate may be changed and a minimal additional load is induced at the Trojan site by the additional interconnect. However, gate resizing may incur changes in other circuit parameters, such as power/ground current and area. Thus, in order to minimize the additional impact of resizing on these other parameters, the change in those parameters must be strictly limited.

Our problem is: assuming that an arbitrary line is chosen as the Trojan site, how should the new sizing factor of every gate in $C_T$ be determined so as to maximize the number of chips to be tested for Trojan detection under a given level of process variations, while satisfying the area and power constraints?

For the given Trojan site, our problem can be formally defined as:

Objective: Maximize $\min_{\forall P_i \in \mathbf{P}} \{ n_{P_i} \}$
Constraints: $|A_{C_T} - A_C| \le \Delta$,

where $n_{P_i}$ is the number of chips required to detect a Trojan at the Trojan site by measuring the delay of $P_i \in \mathbf{P}$ under a given level of process variations, and $\mathbf{P}$ is the set of every path in the circuit, $\{P_1, \ldots, P_{N_p}\}$. $A_C$ and $A_{C_T}$ are the sums of the areas of the gates in $C$ and $C_T$, respectively, i.e., $A_C = \sum_{i \in \mathbf{G}} s_i \alpha_i$ and $A_{C_T} = \sum_{i \in \mathbf{G}} s_i \alpha_i'$, where $s_i$ is a constant multiplier for the area of gate $g_i$ whose value depends on the gate and the original circuit. $\Delta$ is the maximum allowable change in the value of $A_C$. Note that the change in the area of the entire layout is never greater than the sum of the individual gate area changes, $|A_{C_T} - A_C|$, since many gates are resized by utilizing the empty space around the corresponding gates in the original circuit.

To simplify the problem, every tested chip is assumed to have the identical design, $C_T$, plus the Trojan block.
The Trojan designer may carefully insert Trojans into only a subset of chips in order to increase the difficulty of detection, but this requires additional sets of masks and would be very expensive [78]. However, we note that the quality of the result, $C_T$, is independent of this assumption, and the assumption does not affect the benefit of our gate resizing approach. This is because, instead of directly computing $n_{P_i}$, we utilize the numerical values of path delays and of the additional delay induced by a Trojan, and aim to minimize the impact of the Trojan on delay relative to the effects of process variations by introducing a new notion of fitness.

4.3 Gate Resizing Method and Analysis

We start by presenting our key ideas and then describe our algorithm for resizing.

4.3.1 Fitness Function

The objective of our problem is, under a given level of process variations, to maximize $\min_{\forall P_i \in \mathbf{P}} \{ n_{P_i} \}$, i.e., the smallest number of chips required over all paths whose delay may be tested. $n_{P_i}$ can be derived from $fit(P_i)$, called the fitness function of $P_i$, which is defined as

$$fit(P_i) = \left| \frac{D_N(P_i) - D_N'(P_i)}{\sigma_D(P_i)} \right|.$$

The fitness function for a particular path is the ratio of the change in the path's nominal delay due to a Trojan to the standard deviation of the original path delay due to process variations. The numerator of $fit(P_i)$, $D_N(P_i) - D_N'(P_i)$, is the difference between the nominal delays of $P_i$ in the original circuit $C$ and in the Trojan-affected circuit $C_T$. Its denominator, $\sigma_D(P_i)$, is the standard deviation of the delay.

Theorem 1 (fitness function): $fit(P_i)$ is inversely proportional to the number of chips to be tested, $n_{P_i}$. Hence, minimizing $\max_{\forall P_i \in \mathbf{P}} \{ fit(P_i) \}$, which is defined as the circuit's fitness, results in maximizing the number of chips to be tested to detect a Trojan via path delay measurement.

Proof: In the likelihood-ratio based test used in our detection approach, the whole sample space is divided into a fixed number of intervals and the conditional probability of each interval is obtained. The likelihood ratio is computed using the conditional probabilities of all intervals, and the number of chips to be tested is determined by solving the LP problem at the given level of confidence. In our proof, we will show that identical values of the fitness correspond to identical values of the conditional probabilities of the intervals and of the likelihood ratio, and hence give the same result for given path delay values. We will also show that the number of chips required for the test is a function of the fitness function.

In Figure 18, there are two curves that respectively represent the distribution of delay in the original circuit $C$ (solid curve) and in the Trojan-affected circuit $C_T$ (dotted curve) for the same path. $I_A$ and $I_B$ are two of the intervals that constitute the whole sample space (x-axis). The vertical dotted line that separates $I_A$ and $I_B$ is the point where the delay is

$$D_N(P_i) + \left| \frac{D_N(P_i) - D_N'(P_i)}{2} \right|. \tag{9}$$

Figure 18: Delay distributions of the same path, from the original circuit (solid curve) and the Trojan-affected circuit (dotted curve). [Axes: delay (x-axis), probability density function (y-axis).]

The probability that the delay belongs to each interval is then computed. $p_{I_A|C}$ ($p_{I_B|C}$) is the conditional probability that one delay sample belongs to $I_A$ ($I_B$), given that the circuit has no Trojan (the solid curve):

$$p_{I_A|C} = \int_{D_N(P_i)}^{D_N(P_i) + \left| [D_N(P_i) - D_N'(P_i)]/2 \right|} f(x)\, dx, \tag{10}$$

where $f(x)$ is the probability density function of delay $x$, represented by the solid curve in Figure 18. $p_{I_A|C}$ can be expressed in terms of the error function, $erf(x)$, which is defined as

$$erf(x) = \frac{1}{\sqrt{\pi}} \int_{-x}^{x} e^{-t^2}\, dt.$$
Here, we make the additional assumption that the delay is approximately normally distributed. Because the integral in Equation (10) covers only one side of the mean, it evaluates to

$$p_{I_A|C} = \frac{1}{2}\, erf\!\left( \frac{|D_N(P_i) - D_N'(P_i)|}{2\sqrt{2}\, \sigma_D(P_i)} \right) = \frac{1}{2}\, erf\!\left( \frac{fit(P_i)}{2\sqrt{2}} \right). \tag{11}$$

We note that the conditional probability of the interval is a function of $fit(P_i)$, and that $erf(x)$ increases monotonically with $x$. Since the conditional probabilities of the intervals, which are the only inputs to the LP problem that computes the number of chips to be tested, are solely determined by the value of the fitness function of the path to be tested, we can conclude that minimizing the fitness of a particular path results in maximizing the number of chips to be tested for the path. Furthermore, this idea also applies to other delay distributions, including those that are not normal, as long as the probability density function of the delay is monotonic on each side of its mean.

The idea of minimizing the fitness function to maximize the number of chips to be tested is also consistent with other statistical methods, such as Student's t-test, which uses the statistic

$$\frac{\bar{\mu} - \mu}{\bar{\sigma} / \sqrt{n}},$$

where $\bar{\mu} - \mu$ is the difference between the means of the original and the Trojan-affected circuit, $\bar{\sigma}$ is the standard deviation, and $n$ is the number of chips required for the test. For a fixed value of this statistic, $n$ is inversely proportional to the square of $(\bar{\mu} - \mu)/\bar{\sigma}$, which conforms to our idea of the fitness function. Q.E.D.

4.3.2 Analysis of the Impact of Trojans on Delay, Area, and Power

We investigate the effect of gate sizing on delay, area, and power, which is essential to explain how we solve the problem.

Property 1: The size of a gate affects its delay and its input capacitance. Thus, if the size of a particular gate is changed, it will affect the delays of paths passing via the resized gate itself, as well as of paths passing via its immediate fanins.

Figure 19 shows gate G, its fanins (A, B, and C), and its fanouts (D and E).
If G is resized, then its input capacitance, which is the load capacitance seen by its fanin gates, changes. As a result, resizing G affects the delays of its immediate fanins, and hence changes the delay of every path that passes via A, B, or C. In addition, the delay of every path passing via G is also affected. Thus, if a particular gate is resized, the algorithm needs to consider (1) every path that passes via the resized gate, and (2) every path that passes via its immediate fanins.

Figure 19: Example of paths to be considered when resizing a gate

Property 2: In contrast to delay, the overall circuit area is not greatly affected by gate resizing.

We note that our problem is different from the traditional discrete gate sizing problem, which selects gates from a library. Instead, the adversary might be capable of changing the width of each transistor in a gate in a fine-grained manner, where the resolution of the width change is determined by the technology. Also, resizing of a particular gate can often be done by utilizing the empty space around the gate and does not necessarily increase the area of the entire layout. In our experiments, we computed the sum of the changes in every gate's area, $|A_{C_T} - A_C|$, which is an upper bound on the change in the area of the entire layout, and it is less than 1% for every benchmark circuit (Table 7).

Property 3: By resizing gates to match the timing of every gate, changes in dynamic power/ground currents as well as in power signatures can be reduced, while changes in static power consumption are limited by putting a constraint on $|A_{C_T} - A_C|$.

Power consumption can be decomposed into static and dynamic components. The static component can be expressed as the sum of the static power consumed by every device in the circuit, where the static power consumed by each device is commonly modeled as being roughly proportional to the area of the device.
In addition, our approach does not affect the interconnects or the structure of the circuit, but only the gates. Hence, the difference in static power consumption before and after applying our approach will be roughly proportional to $|A_{C_T} - A_C|$, the difference between the sums of every gate's area. Since our resizing problem limits the difference in circuit area through the area constraint, it also restricts the change in static power consumption.

On the other hand, the dynamic component of power depends on the switching activities of the gates in the circuit. Because the extra load induced by a Trojan may delay the switching of some gates, $C_1$ will show a different dynamic power signature. Since our solution minimizes the change in the delay of every path, we will show that it also reduces the difference in the switching activities of every gate in the circuit. This is achieved by matching the timings of the switching activities occurring at every gate. Thus, the dynamic power signatures of $C$ and $C_T$ are closer to each other than those of $C$ and $C_1$.

As an example, consider c880 as the original circuit ($C$). We added a minimum-sized inverter whose input is connected to line 627, which is chosen as the Trojan site, and thereby created a Trojan-affected circuit ($C_1$). Here, the minimum-sized inverter is selected to introduce a minimal additional load at the Trojan site. Finally, we applied our gate resizing method to $C_1$ and obtained a new version of the Trojan-affected circuit with gate resizing ($C_T$). Figure 20 shows the current measured at a vdd/gnd pin for these three versions of the circuit using a 65 nm technology. In our experiment, there are 383 gates in c880 after synthesis, so it is a realistic assumption that there is only one vdd/gnd pin for the entire logic.
We find that the power signatures of all three versions of the circuit are almost identical, since the currents are measured for a large number of transistors. However, the current curve of the circuit with the Trojan after resizing (gray curve) is closer to the curve of the original circuit (dotted curve), with a 0.85% difference, than the curve of the circuit with the Trojan but without our resizing approach (black curve), which has a 2.25% difference. Also, the average static power consumption remains almost constant (<0.2% difference) across all three versions of the circuit, since our approach strictly limits the area change and hence restricts the static power change.

Figure 20: Current measured from three versions of the circuit using a vector that excites a Trojan at line 627 in c880. [Plot over simulation time (ps) with three curves: the original circuit, the Trojan-affected circuit without our approach, and the Trojan-affected circuit with our approach.]

Figure 21: The difference in power consumption (a) between the original circuit and the Trojan-affected circuit without our approach, and (b) between the original circuit and the Trojan-affected circuit with our approach, for c880. [Both panels plot the power consumption change w.r.t. the original circuit (0.00% to 8.00%) against the line number.]

In Figure 21, we repeat the experiment for every line in the circuit and show the total power consumption of all three versions of the circuit, $C$, $C_1$, and $C_T$ of c880, in the presence of a Trojan sited at each line in c880. Figure 21(a) represents the difference in total power consumption between the original circuit ($C$) and the Trojan-affected circuit without our approach ($C_1$), i.e., a minimum-sized inverter's input is connected to a target Trojan site in c880 and no resizing is done.
Figure 21(b) represents the difference in total power consumption between the original circuit ($C$) and the Trojan-affected circuit with our approach ($C_T$), i.e., resizing is done after Trojan insertion. The important statistics from Figure 21 are provided in Table 5.

Table 5: The minimum, maximum, and average differences in total power consumption (from Figure 21(a) and (b))

Comparison                        Min     Max     Average
$C$ vs. $C_1$ (Figure 21(a))      0.09%   7.07%   0.85%
$C$ vs. $C_T$ (Figure 21(b))      0.22%   0.98%   0.51%

From Table 5, we note that the average difference in power consumption between the original circuit and the Trojan-affected circuit with our approach is smaller than that between the original circuit and the Trojan-affected circuit without our approach. The maximum change in power consumption is dramatically reduced, from 7.07% to 0.98%, after our resizing, due to closer matching of the dynamic power signatures. However, the minimum difference in power consumption increases slightly, from 0.09% to 0.22%, because our approach resizes several gates and hence increases static power consumption (in our approach, the majority of the resized gates are sized up, and they contribute to the increase in static power consumption). Nevertheless, the total power consumption change with our approach is less than 1% for every Trojan-affected version of the circuit, for every possible Trojan site, which conforms to the average area change reported in Table 7. Other benchmark circuits exhibit the same trend.

In this way, our approach successfully limits the changes to static and dynamic power consumption by introducing the area constraint and matching the dynamic power signatures.

4.4 Proposed Approach

Based on the findings and observations from Section 4.3, our resizing problem to minimize the delay impact of a Trojan can be reformulated as follows.
Objective: Minimize $\max_{\forall P_i \in \mathbf{P}} \{ fit(P_i) \}$
Constraints:
  $|A_{C_T} - A_C| \le \Delta$,
  $D(P_i) = \sum_{\forall g_j \in P_i} d(g_j, \alpha_j)$,
  $D'(P_i) = \sum_{\forall g_j \in P_i} d(g_j, \alpha_j')$,

where $D(P_i)$ and $D'(P_i)$ are the delays of path $P_i$ in $C$ and $C_T$, respectively, and $d(g, \alpha)$ is the nominal delay of gate $g$ with sizing factor $\alpha$. In our algorithm, the delay of a single gate is estimated using the linear delay model $d = gh + p$, where $d$ is the delay of the gate, $g$ is the logical effort of the gate, $h$ is the electrical effort, and $p$ is the parasitic delay [91]. A more accurate delay model can be used to improve the accuracy of the delay computation. However, while our algorithm for resizing uses the linear delay model, all our evaluations of its effectiveness are carried out using accurate circuit-level simulations in an industrial 65 nm technology that consider process variations; these are discussed in Section 4.6.

In order to obtain the optimal solution to the above problem, we need to consider the delay of every path that may be affected by the Trojan. We could iteratively consider one path at a time until every path has been handled. However, sizing a particular gate affects all other paths that pass via the resized gate or via its immediate fanins, as discussed in Section 4.3.2. Thus, optimizing a single path at a time may lead to oscillations, and the solution may not converge. In addition, this would limit the scalability of the solution since, in the worst case, the number of paths increases exponentially with the size of the circuit. Instead, if a particular gate is selected and then resized to balance the fitness values of all the paths that pass via the gate, then we can always obtain a benefit by reducing the circuit's fitness. Hence, we solve the problem by iteratively selecting gates and resizing the selected gates, while applying the observations discussed in the previous sections.
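The linear delay model and the fitness-based objective above can be illustrated with a small sketch. It assumes a simple chain of gates, an input capacitance that scales linearly with the sizing factor, and delays expressed in normalized units; the function names, the gate parameters, the Trojan load, and the value of the standard deviation are all hypothetical and chosen only for illustration.

```python
def gate_delay(g, p, c_in, c_load):
    """Linear delay model d = g*h + p, with electrical effort h = c_load / c_in."""
    return g * (c_load / c_in) + p


def path_delay(gates, sizes, output_load, side_loads=None):
    """Delay of a chain of gates.

    gates[k] holds the logical effort "g", the parasitic delay "p", and the
    unit-size input capacitance "c_in"; sizes[k] is the sizing factor alpha_k.
    side_loads[k] is extra capacitance at the output of gate k (e.g., a Trojan).
    """
    side_loads = side_loads or [0.0] * len(gates)
    total = 0.0
    for k, (gate, alpha) in enumerate(zip(gates, sizes)):
        c_in = alpha * gate["c_in"]  # assumption: input cap grows linearly with size
        if k + 1 < len(gates):
            c_next = sizes[k + 1] * gates[k + 1]["c_in"]
        else:
            c_next = output_load
        total += gate_delay(gate["g"], gate["p"], c_in, c_next + side_loads[k])
    return total


def fitness(d_original, d_trojan, sigma):
    """fit(P) = |D_N(P) - D_N'(P)| / sigma_D(P)."""
    return abs(d_original - d_trojan) / sigma


def circuit_fitness(paths):
    """Circuit fitness: the largest fitness over (d_original, d_trojan, sigma) tuples."""
    return max(fitness(*path) for path in paths)


inv = {"g": 1.0, "p": 1.0, "c_in": 1.0}
chain = [inv, inv, inv]
trojan = [0.0, 1.0, 0.0]  # minimum-sized inverter tapping the second gate's output

d_c = path_delay(chain, [1.0, 1.0, 1.0], output_load=4.0)
d_c1 = path_delay(chain, [1.0, 1.0, 1.0], output_load=4.0, side_loads=trojan)
d_ct = path_delay(chain, [1.0, 1.5, 1.0], output_load=4.0, side_loads=trojan)

print(d_c, d_c1)               # -> 9.0 10.0
print(fitness(d_c, d_c1, 0.5))  # -> 2.0
print(fitness(d_c, d_ct, 0.5) < fitness(d_c, d_c1, 0.5))  # -> True
```

In this example, sizing up the gate that drives the Trojan site shortens its own delay but lengthens the delay of its fanin, so a single resizing step recovers only part of the delay difference. This is the coupling described in Property 1.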
4.4.1 Candidate Gate Selection

First, at the start of every iteration, our algorithm identifies a set of candidates to be resized. Since we try to minimize $\max_{P_i \in \mathbf{P_t}} \{ fit(P_i) \}$, every gate that lies along any path whose delay is affected by resizing or by the Trojan is considered. The algorithm categorizes all gates as follows.

Category 1: Gates that are not connected to the Trojan site via any gates or lines.

Category 2: Gates that are connected to the Trojan site via gates or lines, but do not presently belong to the transitive fanin or fanout cones of the Trojan site or any of the resized gates.

Category 3: Gates that presently belong to the transitive fanin or fanout cones of the Trojan site or any of the resized gates.

Figure 22: Example of gate categorization

The set of candidates to be searched in the algorithm includes gates in category 3 only. The gates in category 1 are permanently excluded during the initial candidate search and will never be considered as candidates. The gates in category 2 are not targeted at the present stage, since every path passing via every such gate has zero fitness value. However, in subsequent steps a subset of the gates in category 2 may move to category 3. An example of gate categorization is shown in Figure 22.

4.4.2 Candidate Pruning

At each iteration of the algorithm, for the present version of the circuit, $C_k$, every gate is categorized and the categories are used to prune the set of candidates. The next step is to determine the optimal sizing factor that yields the best solution achievable by resizing a particular gate selected from the candidate set. However, since the number of paths increases exponentially as the circuit size increases, it is not feasible to search every gate that belongs to category 3.
This is especially important because, after a few iterations in which multiple gates are resized, a majority of the gates may have moved to category 3. To reduce the number of candidates, we consider the following: the amount of benefit that can be obtained by resizing a particular gate depends on the fitness values of the paths that pass via the gate. Thus, we introduce a second threshold, $\gamma_1$, to further classify the gates in category 3 into category 3a and category 3b, where $\gamma_1$ is determined empirically during the course of the algorithm. The algorithm focuses on gates in category 3b only, which significantly reduces the search space.

Category 3a: Gates that belong to the transitive fanin or fanout cones of the Trojan site or any of the resized gates, and for which $fit(P_i) \le \gamma_1$ for every path $P_i$ that passes via the gate.

Category 3b: Gates that belong to the transitive fanin or fanout cones of the Trojan site or any of the resized gates, and for which $fit(P_i) > \gamma_1$ for at least one path $P_i$ that passes via the gate. The gates in this category constitute our candidate set, $\mathbf{C}$, which is given by $\mathbf{C} = \bigcup_{i:\, fit(P_i) > \gamma_1} \mathbf{G}_i$, where $\mathbf{G}_i$ is the set of gates that lie along path $P_i$.

4.4.3 Optimal Gate Resizing

To find the best candidate, the algorithm computes the optimal sizing factor of every gate in the candidate set, $\mathbf{C}$, for the present version of the circuit $C_k$. For a particular gate $g \in \mathbf{C}$, the optimal sizing factor can be obtained by solving the following linear programming problem.
Objective: Minimize $y$
Constraints:
  $|fit(P_i)| \le y$, $\forall P_i \in (\mathbf{P_g} \cup \mathbf{P_{fanin}})$,

$$D'(P_i) = \begin{cases} \sum_{g_j \in (P_i - g - g_{fin})} d(g_j, \alpha_j') + d(g, \alpha_g') + d(g_{fin}, \alpha_{g_{fin}}'), & \text{if } P_i \in \mathbf{P_g}, \\ \sum_{g_j \in (P_i - g_{fin})} d(g_j, \alpha_j') + d(g_{fin}, \alpha_{g_{fin}}'), & \text{if } P_i \in \mathbf{P_{fanin}}, \end{cases}$$

$$D(P_i) = \sum_{g_j \in P_i} d(g_j, \alpha_j),$$

where $\mathbf{P_g}$ is the set of paths that pass via $g$, and $\mathbf{P_{fanin}}$ is the set of paths that pass only via $g$'s fanin, not via $g$ itself. In the above problem, only $\alpha_g'$, the sizing factor of $g$, is controllable; the other gate sizing factors are treated as fixed constants. $D(P_i)$, the original delay of path $P_i$, is also a fixed real number. $D'(P_i)$ is the sum of (i) the delays of the gates along path $P_i$ other than $g$ and its fanin gate ($g_{fin}$), (ii) the delay of $g_{fin}$, and (iii) the delay of $g$, where the last term is present only if $g$ lies along $P_i$ ($P_i \in \mathbf{P_g}$) and is absent otherwise ($P_i \in \mathbf{P_{fanin}}$). The solution to this problem is $\alpha_g'$, the optimal sizing factor of $g$, and $y$, the smallest value of the maximum fitness among the paths passing via $g$ that is achievable by resizing $g$.

4.4.4 Iterative Improvement

After computing $y$ and the optimal sizing factor for every gate in the candidate set, the algorithm chooses the best candidate to be resized. However, we also need to consider the number of paths passing via each candidate, since the candidate sets determined in subsequent iterations will be affected by the present decision. We demonstrate the necessity of considering the number of paths with the following example.

In Figure 23, line 4 is affected by the Trojan, and the algorithm will consider three gates, C, E, and F, depending on whether they belong to category 3a or 3b. Assume that all three gates belong to category 3b and hence will be searched. Also assume that the number of paths whose fitness is greater than $\gamma_1$ is 2 for C and 3 for E. These paths are I3-3-4-O2 and I4-3-4-O2 for C, and I3-3-4-O2, I4-3-4-O2, and I5-4-O2 for E.
If we consider only the amount of the fitness decrease, $(y_0 - y)$, and choose C as the best candidate, then this gate will eventually be resized and the delays of all the paths passing via this gate will also change. However, this will change the delay of path I3-3-O1, which may make the algorithm include the inverter, D, as a new candidate at the subsequent iteration. More efficient moves can be made subsequently if we instead select E, which drives the Trojan site, as the best candidate. This is because every path that passes via E is affected by the Trojan, so we can easily obtain the benefit by resizing this gate. Furthermore, this move does not change the set of gates that are affected by either resizing or the Trojan, so the candidate set in the subsequent iteration remains the same. The only difference between these two candidates is the number of paths, so we use the number of paths with fitness greater than $\gamma_1$ for selecting the best candidate.

Figure 23: Example of candidate selection in the first iteration. The three gates in the circle are candidates, and their optimal sizing factors are computed.

Figure 24: The circuit's fitness during the course of applying the algorithm on c880, where $\gamma_0 = 0.01$ and $\gamma_1 = 0.05$. 'Our approach' indicates the result with the new metric $\beta_g$, and 'only fitness' represents the result using $(y_0 - y)$ only in the best candidate selection.

The new metric, $\beta_g$, for gate $g$ is the product of $(y_0 - y)$, the difference between the original and new maximum fitness of paths passing via $g$, and $|\mathbf{P}_g'|$, the number of paths passing via $g$ with fitness greater than $\gamma_1$. The benefit of using $\beta_g$ instead of $(y_0 - y)$ is demonstrated on c880 in Figure 24. It shows that our approach steadily reduces the circuit fitness until it reaches the desired result. Also, we see that the area change is less than 1% after resizing gates.
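The selection metric $\beta_g$ described above can be sketched in a few lines of Python. This is only an illustrative sketch: the `Candidate` record, its field names, and the numeric values are hypothetical, and the post-resizing fitness `y` is assumed to be supplied by the LP of Section 4.4.3 rather than computed here.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str          # gate label (hypothetical)
    y0: float          # current maximum fitness over paths through the gate
    y: float           # maximum fitness after optimal resizing (assumed LP output)
    n_hot_paths: int   # |P_g'|: paths through the gate with fitness > gamma_1

def beta(c: Candidate) -> float:
    """Selection metric beta_g = (y0 - y) * |P_g'|."""
    return (c.y0 - c.y) * c.n_hot_paths

def best_candidate(cands):
    """Return the candidate with the largest beta_g."""
    return max(cands, key=beta)

# Illustrative numbers only: C and E achieve the same fitness decrease,
# but E lies on three paths above gamma_1 while C lies on two.
cands = [Candidate("C", 0.20, 0.10, 2), Candidate("E", 0.20, 0.10, 3)]
chosen = best_candidate(cands)
print(chosen.name)
```

With equal fitness decreases, the path count breaks the tie in favor of E, which mirrors the argument made for Figure 23.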
Thus, the algorithm uses this metric to choose the best candidate and finally resizes the selected gate using the optimal sizing factor computed from the LP.

4.4.5 Algorithm

Our overall approach is shown in the algorithm in Figure 25. The goal of the algorithm is to select and resize gates iteratively until the circuit's fitness, $\max_{P_i \in \mathbf{P}_t}\{\mathit{fit}(P_i)\}$, falls below the threshold $\gamma_0$, where $\mathbf{P}_t$ is the set of paths whose delays are affected by resizing or by the Trojan. $\gamma_0$ is computed from $N$, the number of fabricated chips, since $N$ serves as an upper bound on the number of chips that can be tested. If the number of chips that must be tested to detect a particular Trojan in the circuit at a given confidence level exceeds the total number of chips, then any test that tries to detect the Trojan using delay measurements may fail. Our approach is particularly useful for chips for sensitive applications, where $N$ is typically small.

INPUT: The original circuit ($C$), the circuit with the Trojan but without resizing ($C_1$), and the total number of fabricated chips ($N$).
OUTPUT: The circuit with the Trojan after resizing ($C_T$)
 1: $\mathbf{P}_t \leftarrow$ every path that passes via the Trojan site
 2: Measure $D(P_i)$ and $D'(P_i)$ from $C$ and $C_1$ respectively, $\forall P_i \in \mathbf{P}_t$
 3: Compute $\gamma_0$ and $\gamma_1$ for $N$
 4: $k \leftarrow 1$
 5: while $\max_{\forall P_i \in \mathbf{P}_t}\{\mathit{fit}(P_i)\} > \gamma_0$ OR improvement possible begin
 6:     $\mathbf{S} \leftarrow$ every gate along path $P_i$ with $\mathit{fit}(P_i) > \gamma_1$, $\forall P_i \in \mathbf{P}_t$
 7:     for every gate $g$ in $\mathbf{S}$ begin
 8:         $\mathbf{P}_g \leftarrow$ every path that passes via $g$
 9:         $\mathbf{P}_g' \leftarrow$ every path $P_i$ with $\mathit{fit}(P_i) > \gamma_1$, $\forall P_i \in \mathbf{P}_g$
10:         $y_0 \leftarrow \max_{\forall P_i \in \mathbf{P}_g}\{\mathit{fit}(P_i)\}$
11:         $[y, \alpha_g'] \leftarrow$ solve_LP($g$, $C_k$)
12:         $\beta_g = (y_0 - y) \times |\mathbf{P}_g'|$
13:         if $\beta_g > \beta_{max}$ then
14:             $\beta_{max} \leftarrow \beta_g$, $\alpha_{max} \leftarrow \alpha_g'$, and $g_{max} \leftarrow g$
15:         end
16:     end
17:     $\mathbf{P}_{g_{max}} \leftarrow$ every path passing via $g_{max}$ and its immediate fanins
18:     Resize $g_{max}$ with sizing factor $\alpha_{max}$
19:     $\mathbf{P}_t \leftarrow \mathbf{P}_t \cup \mathbf{P}_{g_{max}}$
20:     Update $D(P_i)$ and $D'(P_i)$ from $C$ and $C_k$ respectively, $\forall P_i \in \mathbf{P}_t$
21:     $C_{k+1} \leftarrow C_k$ and $k \leftarrow k+1$
22: end
23: $C_T \leftarrow C_k$

Figure 25: Gate resizing algorithm

4.4.6 Implementation of Gate Resizing

Our proposed algorithm for gate resizing terminates when the circuit's fitness, $\max_{\forall P_i \in \mathbf{P}_t}\{\mathit{fit}(P_i)\}$, becomes less than or equal to the threshold, $\gamma_0$, and it produces the new version of the circuit, $C_T$, as shown in Figure 25. As a result, a new gate sizing factor, $\alpha_i'$, is determined for every gate $g_i$ in the circuit. However, the LP problem explained in Section 4.4.3 outputs a real number for $\alpha_i'$, so it can only be used after discretization. In Chapter 1, we assumed that the Trojan is inserted into the layout during the fabrication steps, and that the Trojan designer is fully aware of both the original design and the manufacturing technology. Hence, the Trojan designer would not be constrained to choosing only gates from the standard library, but may try to adjust the size of each gate with respect to the minimum resolution allowed by the design rule of the technology.
For example, if the original and new widths of a PMOS transistor in a particular gate computed from the proposed algorithm are $w$ and $w'$, respectively, then the Trojan designer will choose $w''$ for the actual width of the PMOS transistor by adjusting $w'$ with respect to $w_{min}$, where $w_{min}$ is the minimum resolution defined in the design rule of the technology. In our experiments, we followed the design rules provided by the corresponding technologies, NCSU 45nm and the industrial 65nm PDK, and the final layouts before and after applying the gate resizing approach successfully passed design rule checking (DRC).

As an example, we consider the s510 benchmark circuit, where line 35 is affected by the Trojan (line 35 is the Trojan site). Here, a minimum-sized inverter is inserted into the layout after place & route, and we connect its input to line 35 to introduce the minimal additional load to the Trojan site. Figure 26 shows two versions of the layout, where Figure 26(a) and (b) correspond to the original layout without resizing and the new layout with the Trojan and gate resizing, respectively. The layout is drawn using Cadence Encounter [100] and the Nangate 45nm Open Cell Library [54]. From the original layout, we obtained the new layout by incrementally changing the sizes of the gates selected by the algorithm, using Cadence Virtuoso. As explained in Sections 4.3.2 and 4.6, the impact of gate resizing on the area of the circuit is negligible, and in this example there is actually no difference in the total area of the layout (core size in Encounter) between the two versions of the circuit, since we utilized the empty space around each gate for resizing. However, the static power consumption of the circuit will change due to gate resizing, and it will follow the area change of individual gates, not just the total area of the layout. Thus, we report the sum of the area changes of every gate in the circuit in Table 7, instead of the total area change of the layout.
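The adjustment of $w'$ to $w''$ described at the start of this example can be sketched as follows. The text does not specify the rounding rule, so this sketch assumes rounding to the nearest multiple of the resolution $w_{min}$ and clamping to a smallest drawable width; the function name, the `w_floor` parameter, and all numeric values are hypothetical and do not come from any PDK.

```python
def discretize_width(w_new, w_step, w_floor):
    """Snap a real-valued width to the nearest multiple of w_step,
    never going below w_floor (assumed minimum drawable width)."""
    steps = round(w_new / w_step)          # nearest grid index (assumed rule)
    return max(w_floor, steps * w_step)

# Hypothetical values in nanometres.
w_lp = 137.3                               # real-valued width implied by the LP
w_actual = discretize_width(w_lp, w_step=5.0, w_floor=90.0)
print(w_actual)
```

A designer could equally round up or down depending on which direction better preserves the matched delays; the nearest-multiple rule is used here only to keep the sketch short.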
Figure 26: Two versions of the layout of s510, (a) without gate resizing and (b) with gate resizing and the minimum-sized inverter whose input is connected to line 35

4.4.7 Complexity of the Approach

For a circuit consisting of $N_g$ gates and $N_p$ paths, the size of the candidate set is $N_g$ in the worst case, which means that the algorithm may invoke solve_LP at most $N_g$ times in every iteration. Also, the number of constraints in the LP problem is $N_p$ in the worst case. However, we employ the linear delay model in order to compute and update $D'(P_i)$ quickly (the final solution, $C_T$, is nevertheless evaluated using accurate transistor-level simulations with process variations). Moreover, the size of the candidate set is greatly reduced by candidate pruning, and in practice it is much smaller than $N_g$. In addition, solve_LP runs faster in our algorithm because it searches for the solution, the optimal sizing factor, starting from the current sizing factor, and these two sizing factors are typically close together. The actual CPU time of the algorithm is much lower than the worst case and is 138 minutes for c7552 in our experiments. It is also important to note that our approach focuses on reducing the delay of every path in a single logic block at a time. Finally, we can obtain significant further speed-up by parallelizing the for-loop between lines 7 and 16 of the algorithm in Figure 25, as every iteration of this loop is independent of every other.

4.5 Experimental Setup

To evaluate the effectiveness of our algorithm, we conducted two kinds of experiments. For all experiments, we performed Monte-Carlo simulations using Cadence Spectre with the process variation models described in Section 3.2. Our algorithm is implemented in C and runs on a workstation with 2.4GHz AMD Opteron 6234 processors and 32GB RAM.
First, we selected benchmark circuits from the ISCAS'85 and ISCAS'89 suites for our experimentation, and simulated them using two different technologies (65nm and 45nm) with process variation models to show that the benefits of our algorithm are independent of the particular technology. For every line in a benchmark circuit, we insert a minimum-sized inverter load to mimic the minimum additional load of a Trojan, connect its input to that line, and create a Trojan-affected circuit as described in Chapter 3. We note that for this set of experiments, we do not add a particular Trojan block, in order to observe the pure benefit of our resizing approach on the values of delays and power/ground currents of the original circuit. Our algorithm then produces a new version of the Trojan-affected circuit after gate resizing. Finally, we apply vectors to all three versions of the circuit, measure path delays, and compute the number of chips required to detect a Trojan in the Trojan-affected circuit without resizing ($N_0$) and in the Trojan-affected circuit after resizing ($N_{Resize}$) using the likelihood-ratio based hypothesis testing method. The vectors applied to the benchmarks are obtained from Chapter 3. The average values of $N_0$ and $N_{Resize}$ over every line in each circuit are reported in Table 7.

In addition, we also evaluate our approach using the actual Trojan implementations in [104]. Among these benchmarks, we selected three Trojan benchmark circuits, s38417-T100, T200, and s35932-T300, and synthesized and simulated them using the 65nm technology with and without applying our approach. The Trojan in each benchmark consists of a Trojan block and multiple Trojan sites and is triggered with very low probability, which makes Trojan activation extremely difficult. For example, s38417-T100 contains a Trojan block consisting of 16 additional gates, which is connected to 17 Trojan sites in the original netlist and is only triggered with a probability of 1.4243e-70 [104].
Basic information about each benchmark, such as the number of Trojan sites and the size of the Trojan block, is provided in Table 6 (more information about each benchmark is available in [104]).

4.6 Experimental Results

As shown in Table 7, the benefit of our approach is between 3X and 12X (6.5X on average) in terms of the average number of chips that must be tested to detect the Trojan for the selected ISCAS'85 and ISCAS'89 benchmark circuits at a 95% confidence level, while the average area change caused by our approach is less than 1%. This shows that the difficulty of Trojan detection is greatly increased by our new approach, and that the area overhead is negligible. Also, as indicated in Table 7, our approach causes an extremely low change in power/ground currents. In practice, to connect the Trojan block to the original circuit, a Trojan designer may use a smaller gate than the minimum-sized inverter we have used in our experiments, in order to reduce the additional load. In this scenario, the additional delay induced at the Trojan site would be smaller than that of the minimum-sized inverter, and hence the Trojan designer may be able to obtain the same degree of benefit from our approach with even smaller area overhead.

In addition, we measure the average CPU time of running the algorithm for every line in each benchmark circuit. The results show that the CPU time increases with the size of the circuit, but remains practical (a maximum of 138 minutes) for benchmark circuits of various sizes, because our approach only needs to be applied to one combinational logic block at a time and is inherently amenable to parallelization. We also measure the average number of gates chosen by the algorithm to be resized, for every benchmark circuit. The results show that only a very small number of gates (fewer than 30 for c7552) needs to be resized.
We also observe that there is typically empty space around a gate to be resized in a layout, which allows us to resize the corresponding gate without changing the locations of nearby gates. Also, our approach targets a single logic block at a time and hence redesigns only a small fraction of the entire chip. Thus, our approach can be easily applied without changing the locations of gates in the layout every time.

Table 6: The difference (percentage) in the total power consumption and delay (A) between $C$ and $C_1$, and (B) between $C$ and $C_T$ for the selected benchmark circuits. $N_{TS}$: the number of Trojan sites, $N_{TB}$: the number of gates and components in the Trojan block.

| Benchmark | $N_{TS}$ | $N_{TB}$ | (A) Power | (A) Delay | (B) Power | (B) Delay |
|---|---|---|---|---|---|---|
| s38417-T100 | 17 | 16 | 0.501% | 15.6% | 0.845% | 3.1% |
| s38417-T200 | 20 | 26 | 1.212% | 10.5% | 2.093% | 2.1% |
| s35932-T300 | 21 | 39 | 2.56% | 23.6% | 2.93% | 4.3% |

Table 7: The average number of chips required to detect a Trojan in the Trojan-affected circuit without resizing ($N_0$), and the Trojan-affected circuit after resizing ($N_{Resize}$) for various ISCAS'85 and ISCAS'89 benchmark circuits for 65nm and 45nm technologies, where $\gamma_0 = 0.01$, $\gamma_1 = 0.05$ and the confidence level is 95%. The maximum number of iterations for the algorithm is 50.

65nm:

| Benchmark circuit | No. of gates | $N_0$ | $N_{Resize}$ | $N_{Resize}/N_0$ | Average area change | Average no. of resized gates | CPU runtime (sec) |
|---|---|---|---|---|---|---|---|
| s298 | 119 | 697 | 6242 | 8.96X | 0.57% | 5 | 0.8 |
| s386 | 159 | 2151 | 13978 | 6.50X | 0.62% | 7 | 1.1 |
| s420 | 196 | 587 | 7307 | 12.44X | 0.48% | 6 | 2.4 |
| s510 | 211 | 1030 | 8594 | 8.35X | 0.56% | 9 | 8.3 |
| s1238 | 508 | 1852 | 9574 | 5.17X | 0.32% | 16 | 40 |
| s1488 | 653 | 4784 | 14579 | 3.05X | 0.45% | 18 | 52 |
| s5378 | 2779 | 1514 | 9487 | 6.27X | 0.12% | 13 | 438 |
| s9234 | 5597 | 2022 | 8776 | 4.34X | 0.09% | 10 | 1102 |
| s15850 | 9772 | 2163 | 10234 | 4.73X | 0.59% | 12 | 3021 |
| c880 | 383 | 1070 | 5660 | 5.29X | 0.78% | 12 | 132 |
| c2670 | 1193 | 1468 | 6566 | 4.47X | 0.56% | 17 | 893 |
| c5315 | 2307 | 2035 | 7317 | 3.60X | 0.45% | 20 | 4298 |
| c7552 | 3512 | 2823 | 12921 | 4.58X | 0.32% | 28 | 8324 |

45nm:

| Benchmark circuit | No. of gates | $N_0$ | $N_{Resize}$ | $N_{Resize}/N_0$ | Average area change | Average no. of resized gates | CPU runtime (sec) |
|---|---|---|---|---|---|---|---|
| s298 | 119 | 775 | 9374 | 12.09X | 0.32% | 5 | 0.7 |
| s386 | 159 | 2207 | 17572 | 7.96X | 0.54% | 7 | 1.3 |
| s420 | 196 | 887 | 10119 | 11.41X | 0.49% | 6 | 2.3 |
| s510 | 211 | 1394 | 12010 | 8.61X | 0.82% | 9 | 10.0 |
| s1238 | 508 | 2235 | 11795 | 5.28X | 0.44% | 15 | 37 |
| s1488 | 653 | 5034 | 15161 | 3.01X | 0.36% | 19 | 54 |
| s5378 | 2779 | 1861 | 12224 | 6.57X | 0.23% | 14 | 523 |
| s9234 | 5597 | 2325 | 13232 | 5.69X | 0.22% | 11 | 1302 |
| s15850 | 9772 | 1982 | 14521 | 7.33X | 0.61% | 13 | 3892 |
| c880 | 383 | 1258 | 8262 | 6.57X | 0.65% | 12 | 234 |
| c2670 | 1193 | 1910 | 12014 | 6.29X | 0.43% | 18 | 1021 |
| c5315 | 2307 | 2103 | 13204 | 6.28X | 0.33% | 21 | 4692 |
| c7552 | 3512 | 3829 | 15392 | 4.02X | 0.23% | 28 | 8124 |

Finally, we measure the delay of a path and the total power consumption from each version of the three Trust-hub benchmark circuits, as shown in Table 6. In these circuits, the Trojans contribute only a few percent to the total power consumption, whereas the delay difference caused by Trojan insertion is much larger (more than 10%). The difference in the total power consumption remains almost unchanged before and after applying our approach, while the path delay difference is greatly reduced (5.12X) by our gate resizing. However, these results also demonstrate that our approaches for detecting minimally delay-invasive Trojans remain effective and merely require a larger number of chips to detect the Trojans.

4.7 Conclusions

In this chapter, we introduced a new attack scenario that effectively hides the impact of the Trojan on delay. Our approach minimizes the impact of the Trojan on delay by gate resizing that does not alter the functionality of the circuit and allows only a very small change in area. We formulated the gate resizing problem and developed an algorithm to solve it. Our experiments show that the benefit of our approach is 6.74X in terms of the number of chips required to detect a Trojan. Furthermore, our approach changes area minimally (<1%) and has negligible impact on other circuit parameters.

CHAPTER 5.
DETECTION OF MAXIMALLY-MATCHED MODELS OF TROJANS

5.1 Introduction

In this chapter, we consider an alternative scenario of Trojans, namely the maximally-matched model of Trojans, which involves one or more changes to the netlist of the original circuit and/or resizing of gates to maximally match the delays of every path. Such Trojans always incur at least one gate or wire change to the netlist of the original circuit, and hence may cause changes in the delays of many paths. In addition, even though the circuit is redesigned and resized so that the Trojans cause only minimal deviations in delays, it is shown to be possible to detect the Trojans by making measurements on a sufficient number of chips, due to the fundamental difference between the uni-directionality of delay changes caused by Trojans and the bi-directionality of changes due to process variations.

In this chapter, we aim to formally prove that the maximally-matched models of Trojans can be detected using delay measurements. To accomplish that, we have derived the following two research tasks:

(1) Exploration of a wide range of circuit-level transformations, which can span the entire space of possible circuit-level transformations.

(2) Investigation of the effects of logic transformations on delays, where the rest of the circuit can be redesigned and transistors resized to maximally match delays between the original and transformed circuits.

In our study, we do not make any assumption about how the adversary will re-implement the same logic function. Instead, we focus on general cases of hardware tampering that cover every possible circuit modification, without adding any particular Trojan to the netlist, in order to ensure that our investigations are not constrained by the specifics of Trojans.
Thus, we consider at least one change (or transformation) to a gate-level netlist, where these transformations are arbitrary except that they ensure that the new netlist implements the same logic function, and the gates in the netlist are resized to minimize the impact on delays, as shown in Figure 27. Hence, our exploration focuses on the following question: For a netlist $N_i$, does there exist a set of sizing factors $S_i$ for the gates, such that the resulting circuit $C_i$ has a delay signature that matches (within the margin of the measurement resolution) the delay signature of the original circuit $C$ with netlist $N_0$, where $N_0 \ne N_i$?

Figure 27: (a) An original circuit, $C$ (the original netlist and gate sizings). (b) A Trojan-affected circuit with a maximally-matched Trojan, $C_i$ (the new netlist and gate sizings, with resizing in order to maximally match the delays of every path with (a)).

In the rest of the chapter, we present our analysis of several of the most common types of logic transformations and their effects on delays. We also provide theoretical foundations showing that, for every logic transformation we study, the mismatch in delays caused by the transformation can be detected by measuring delays, provided that we are able to apply the desired vectors to a target and propagate its delay to the circuit's output.

5.2 Problem Statement

For a given design of circuit $C$, which consists of a netlist $N_0$ and gate sizings $S_0$, the adversary manipulates its netlist and sizings via an arbitrary circuit transformation, say $T_i$, and hence $C$ is transformed into a different design, $C_i$, which consists of a different netlist $N_i$ and gate sizings $S_i$. Here, the netlist is manipulated in such a manner that the original and the manipulated netlists implement the same logic functions, and the delays of every path in $C$ and $C_i$ are maximally matched via gate resizing.
In other words, we apply vectors to the inputs of $C$ and $C_i$, measure delays at the outputs of $C$ and $C_i$, and compare the delays. That is, for given $C$ and $C_i$, the delays measured at a particular output, $j$, of $C$ and $C_i$ after applying a particular vector $V_k$ are $d_{C,O_j}(V_k)$ and $d_{C_i,O_j}(V_k)$, where $O_j$ is the $j$th output of $C$ and $C_i$. The difference between the delays measured at the $j$th output is $|d_{C,O_j}(V_k) - d_{C_i,O_j}(V_k)|$. The question is whether it is possible to detect the transformation from $C$ to $C_i$ by selecting vectors and measuring delays, in the presence of process variations and given the capabilities (resolution) of delay measurements.

First, we assume that an attempt to apply only one vector to detect such a difference may fail, since there will be ways to eliminate the delay difference by resizing not only gates, but also individual transistors. That is to say, consider one set of linear inequalities for a particular vector $V_k$,

$$\forall j,\quad |d_{C_i,O_j}(V_k) - d_{C,O_j}(V_k)| \le \Delta_m, \qquad (12)$$

where $d_{C,O_j}(V_k)$ is the delay of $C$ and hence is a constant value, say $M_{j,k}$, and $d_{C_i,O_j}(V_k)$ is a function of the sizings of the transistors in $C_i$. The question is whether we can find sizings of the transistors in $C_i$ that make $|d_{C_i,O_j}(V_k) - M_{j,k}| \le \Delta_m$. If the circuit has only one output, or only one output has a transition when $V_k$ is applied, then it is usually possible to find a solution that makes the above inequality true. This holds except for a few special cases, such as the scenario in which the circuit is delay-optimized in terms of the number of stages, sizings, etc., so that changing the logic and resizing transistors cannot make the delay lower than the initial (optimal) value. Thus, we avoid detecting a target using only one vector, and instead use multiple vectors that always cause a mismatch in delays independent of transistor sizings. For example, we can think of multiple linear inequalities like Eq. (12) that together have no solution.
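To illustrate how several inequalities of the form of Eq. (12) can jointly have no solution, the following sketch uses a deliberately simplified setting that is not taken from the experiments: a single sizing variable $x$ and two hypothetical linear delay models, one increasing and one decreasing in $x$. The coefficients, target delays, and function names are all assumptions made purely for illustration.

```python
def sizing_interval(a, b, target, delta):
    """Interval of x satisfying |a + b*x - target| <= delta (requires b != 0)."""
    lo = (target - delta - a) / b
    hi = (target + delta - a) / b
    return (min(lo, hi), max(lo, hi))

def common_sizing(intervals):
    """Intersect the intervals; return None when no x >= 0 satisfies all of them."""
    lo = max([0.0] + [i[0] for i in intervals])
    hi = min(i[1] for i in intervals)
    return (lo, hi) if lo <= hi else None

# Hypothetical linear delay models (in ps) of C_i for two vectors:
#   V1: 40 + 5x  (x adds load on the sensitized path)
#   V2: 70 - 5x  (x drives the sensitized path)
delta_m = 1.0
v1 = sizing_interval(40.0, 5.0, target=50.0, delta=delta_m)
v2 = sizing_interval(70.0, -5.0, target=50.0, delta=delta_m)
print(common_sizing([v1]))        # a single vector can be matched
print(common_sizing([v1, v2]))    # both vectors together cannot
```

In this toy setting either vector alone admits a matching sizing, but no single sizing satisfies both, so at least one of the two measurements exposes a mismatch larger than the assumed threshold.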
So our principle is as follows: if we carefully select several vectors, of which at least one causes a difference in delays independent of transistor sizings, then we are able to detect the difference, provided that these vectors propagate the difference to the circuit's output. For example, consider a scenario in which a 4-input NAND gate within the original circuit is decomposed into gates with smaller fanin, and only one vector, say (R, S1, S1, S1), is used to detect this transformation. It is possible to hide the difference in delays for this vector by specifically resizing the transistors of the decomposed gates. Table 12 shows that, for many other kinds of logic transformations, it is also possible to hide the delay difference for one particular vector, but not to hide it completely when many vectors are applied and delays are measured.

Thus, our goal is to find a set of vectors, $\mathbf{V}$, for an arbitrary pair of $C$ and $C_i$, such that the difference in delays at an arbitrary output, say $j$, is greater than the detection threshold for at least one vector, say $V_k \in \mathbf{V}$, provided that we have the ability to apply the desired vectors to sensitize the difference and propagate the difference to the circuit's output. In other words, except for a few special cases of transformations and original circuits, we assume that it is always possible to detect the mismatch caused by any arbitrary logic transformation with transistor resizing, given the capabilities described above (the ability to apply vectors and propagate the difference). Based on the above assumption, we formally introduce our problem of selecting the vectors that invoke the difference and of quantifying this difference.
Problem statement: For a given circuit $C$ and an arbitrary type of logic transformation and transistor resizing $T_i$ which transforms $C$ into $C_i$, find vectors $\mathbf{V}$ of which at least one vector, say $V_k \in \mathbf{V}$, invokes a difference greater than $\Delta_m$ between the delays measured at the outputs of $C$ and $C_i$, where $\Delta_m$ is the detection threshold determined by the measurement resolution. Furthermore, also find a subset of $\mathbf{V}$ that invokes the maximum difference between the delays of $C$ and $C_i$.

The term $\Delta_m$ is the detection threshold determined by the resolution of the delay measurements. If we are able to measure delays at finer granularities, then $\Delta_m$ can become very small. In Section 3.7, we showed that we can achieve a very small $\Delta_m$, around 1ps. However, improving the measurement resolution will incur extra overhead on the chip and may be costly. Thus, the selection of vectors is highly constrained by the value of $\Delta_m$, and it is important to find vectors that invoke the maximal difference in delays.

5.3 Definitions and Notations

For an arbitrary circuit with primitive gates, $C$, and a circuit after logic transformation, $C_i$, suppose that we select a pair of arbitrary gates, say $g_1$ and $g_2$, from $C$ and $C_i$, respectively, and that these gates have $n_{g_1}$ and $n_{g_2}$ inputs, respectively. The Boolean functions implemented by the inputs of $g_1$ and $g_2$ are denoted by $f_{g_1,I_1}, \ldots, f_{g_1,I_{n_{g_1}}}$ and $f_{g_2,I_1}, \ldots, f_{g_2,I_{n_{g_2}}}$, respectively. Similarly, the Boolean functions implemented by the outputs of $g_1$ and $g_2$ are denoted by $f_{g_1,O}$ and $f_{g_2,O}$, respectively.

Definition 1 (multi-input gate): A multi-input gate is a gate with more than one fanin, e.g., NAND or NOR. In contrast, a single-input gate refers to an inverter. In this chapter, we only consider two types of multi-input gates: NAND and NOR.
Definition 2 (identical gates): Two multi-input gates, $g_1$ and $g_2$, are identical if and only if (1) they have the same number of fanins, $n_{g_1} = n_{g_2}$, and (2) all of their inputs and their outputs implement the same Boolean functions, i.e., $\forall i,\ f_{g_1,I_i} \equiv f_{g_2,I_i}$ and $f_{g_1,O} \equiv f_{g_2,O}$. Thus, if the gates satisfy the following conditions, then they are identical gates.
- Inputs implement the same Boolean functions, i.e., $\forall i,\ f_{g_1,I_i} \equiv f_{g_2,I_i}$
- The numbers of fanins of the gates are identical, i.e., $n_{g_1} = n_{g_2}$
- The gate type is also identical

Definition 3 (structurally-dual gates): Two multi-input gates, $g_1$ and $g_2$, are structurally-dual if and only if (1) they have the same number of fanins, and (2) the Boolean functions of every input and of the output are complementary. Thus, if the gates satisfy the following conditions, then they are structurally-dual gates.
- Inputs implement complementary Boolean functions, i.e., $\forall i,\ f_{g_1,I_i} \equiv \overline{f_{g_2,I_i}}$
- The numbers of fanins of the gates are identical, i.e., $n_{g_1} = n_{g_2}$
- The gate types are opposite to each other, i.e., $g_1$ is NAND and $g_2$ is NOR, or vice versa.

For each pair of multi-input gates in two different circuits, whether the two gates are identical or structurally-dual can be determined by (1) comparing the number of fanins of the two gates, and (2) comparing the Boolean functions implemented by their inputs and outputs.

If a vector is applied to the inputs of the circuit, then a specific vector will be applied to the inputs of each multi-input gate as a consequence. Every input vector causing a transition at the gate's output can be categorized into one of two classes, namely single-input switching pattern (SIS) and multiple-input switching pattern (MIS).
Definition 4 (single-input switching (SIS) and multiple-input switching (MIS) patterns): A single-input switching (SIS) pattern of a multi-input gate refers to a sequence of vectors at the gate's inputs that consists of a rising or falling transition at only one input of the gate and a steady non-controlling value at the rest of the inputs. A multiple-input switching (MIS) pattern of a gate refers to a sequence of vectors consisting of transitions at more than one input of the gate, all in the same direction, and a steady non-controlling value at the rest of the inputs. So the only difference between SIS and MIS is the number of inputs having transitions.

As an example, Table 8 shows an exhaustive list of the SIS and MIS patterns of a 3-input NAND gate, where the symbols S1, S0, R, and F refer to steady-1, steady-0, rising transition, and falling transition, respectively. The controlling and non-controlling values of a 3-input NAND gate are 0 and 1, respectively.

Table 8: SIS and MIS of a 3-input NAND gate

| Type | Values applied to inputs |
|---|---|
| Single-input change (SIS) | SIS-TNC: (S1, S1, R), (S1, R, S1), (R, S1, S1) |
| | SIS-TC: (S1, S1, F), (S1, F, S1), (F, S1, S1) |
| Multiple-input change (MIS) | MIS-TNC: (S1, R, R), (R, S1, R), (R, R, S1), (R, R, R) |
| | MIS-TC: (S1, F, F), (F, S1, F), (F, F, S1), (F, F, F) |

SIS can be further classified into single-input switching with to-controlling transitions (SIS-TC) and single-input switching with to-non-controlling transitions (SIS-TNC), depending on the direction of the transition. The transition in SIS-TC is a to-controlling transition, i.e., a transition from the non-controlling value to the controlling value. On the other hand, the transition in SIS-TNC is a to-non-controlling transition. Similarly, MIS can be classified into MIS-TC and MIS-TNC. For example, for a 3-input NAND gate, (S1, S1, R) and (S1, S1, F) are SIS-TNC and SIS-TC, respectively, and (S1, R, R) and (S1, F, F) are MIS-TNC and MIS-TC, respectively.
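The classification in Definition 4 can be sketched as a small Python function. The function name and the string encoding of the symbols are assumptions made for this illustration; the NOR case is not tabulated in the text and is derived here from the general definition, using the fact that the controlling value of a NOR gate is 1.

```python
def classify_pattern(pattern, gate_type):
    """Classify a per-input pattern of a NAND/NOR gate following Definition 4.

    pattern: tuple of 'S0', 'S1', 'R', 'F', one symbol per gate input.
    Returns 'SIS-TNC', 'SIS-TC', 'MIS-TNC', 'MIS-TC', or None if the
    pattern is neither an SIS nor an MIS pattern.
    """
    if gate_type not in ("NAND", "NOR"):
        raise ValueError("only NAND and NOR gates are considered")
    nc_steady = "S1" if gate_type == "NAND" else "S0"   # steady non-controlling value
    to_nc = "R" if gate_type == "NAND" else "F"         # to-non-controlling transition
    transitions = [p for p in pattern if p in ("R", "F")]
    steady = [p for p in pattern if p not in ("R", "F")]
    if not transitions or len(set(transitions)) != 1:
        return None    # no transition, or transitions in mixed directions
    if any(s != nc_steady for s in steady):
        return None    # some steady input is not at the non-controlling value
    kind = "SIS" if len(transitions) == 1 else "MIS"
    direction = "TNC" if transitions[0] == to_nc else "TC"
    return f"{kind}-{direction}"

print(classify_pattern(("S1", "S1", "R"), "NAND"))
print(classify_pattern(("S1", "F", "F"), "NAND"))
```

Applied to the entries of Table 8, the function returns the class label under which each pattern is listed.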
5.4 Gate Delay Model

In this chapter, we analyze the effects of logic transformation and resizing on delays, and hence require quantification of delays using an appropriate delay model. Specifically, we use first-order delay models to estimate the delays of the circuit for a particular vector applied to the circuit's inputs, where the delay of a path sensitized by the vector is represented as the sum of the delays of the gates along the sensitized path. For a particular gate whose delay needs to be estimated for a particular input pattern, we first construct an RC equivalent circuit of the gate, using (1) the sizings of the transistors constituting the gate, (2) the resistances and capacitances, and (3) an input vector, and then compute the delay in terms of equivalent resistances and capacitances. To accomplish this, we use the well-known MOSFET and RC delay models described in [27][91]. In the rest of the chapter, we assume that every transistor has the minimum length, and hence the term 'sizing' of a particular transistor refers to the width of the transistor.

Suppose that a particular transistor is converted to an equivalent RC circuit, which consists of diffusion capacitances at the drain and source nodes, a gate capacitance, and an equivalent resistance between the drain and source nodes (source-to-drain resistance). Here, the values of the gate, source, and drain capacitances are all proportional to the transistor sizing (again, we note that every transistor in our study has the minimum length, and hence sizing a transistor means changing its width). At the same time, the source-to-drain resistance is inversely proportional to the transistor sizing. The two types of transistors, i.e., p- and n-transistors, have significantly different source-to-drain resistances for the same size, because of the difference between the carrier mobilities of the p-transistor (hole) and the n-transistor (electron) [91].
For simplicity of constructing the RC equivalent circuit, we ignore the drain-to-body capacitance and the gate-to-drain capacitance.

Figure 28: RC equivalent circuits of the pull-down network of a 2-input NAND for two input patterns, (a) (A, B) = (R, S1) and (b) (A, B) = (S1, R), where the $x_i$'s and $y_i$'s are the sizings of the p- and n-transistors, $v_{C_{dp}}(x)$ and $v_{C_{dn}}(x)$ are the diffusion capacitances of a p- and an n-transistor with sizing $x$, respectively, and $v_{R_p}(x)$ and $v_{R_n}(x)$ are the drain-to-source resistances of a p- and an n-transistor with sizing $x$, respectively.

An internal node between the drain/source nodes of two or more adjacent transistors corresponds to an internal capacitance whose size is the sum of all the diffusion capacitances of the transistors. Every internal capacitance is initially precharged or discharged depending upon the initialization vector, and hence RC equivalent circuits are also constructed differently for different initialization vectors. For example, as shown in Figure 28, if the initialization vector for a 2-input NAND gate is (R, S1), then the internal capacitance between the two serially-connected n-transistors will be initially discharged, since the n-transistor connected to the gnd pin is initially turned on and discharges the internal capacitance. However, if the initialization vector is (S1, R), then this internal capacitance will be initially charged because the upper n-transistor is turned on.

In CMOS technologies, the amount of diffusion capacitance of a transistor is determined by several parameters, including doping densities of the drain/source/substrate/channel stop regions, the area of each region, etc.
The diffusion capacitance can be largely decomposed into two components: the junction capacitance between the source/drain regions and the substrate (whose value depends on the bottom area of the source/drain regions), and the junction capacitances between the source/drain regions and the channel-stop regions (whose value depends on the sum of the side-wall areas of the source/drain regions). Thus, the diffusion capacitance of a transistor is roughly proportional to the sizing of the transistor. In addition, the gate capacitance of a transistor is determined by the overlap area between its gate and the substrate, and also depends on the voltage applied to the gate. For simplicity, we express the gate capacitance as a function of only the sizing of the transistor. On the other hand, the source-to-drain resistance of a transistor is inversely proportional to its sizing, since the current flowing through the channel is proportional to the width of the transistor.

Thus, the source-to-drain resistance $vR_n(x)$, and the diffusion and gate capacitances, $vC_{dn}(x)$ and $vC_{gn}(x)$, of an n-transistor with sizing $x$ can be formulated as follows:

$$vR_n(x) = R_n / x, \qquad vC_{dn}(x) = C_{dn}\,x, \qquad vC_{gn}(x) = C_{gn}\,x, \tag{13}$$

where $R_n$, $C_{dn}$, and $C_{gn}$ are the source-to-drain resistance, diffusion capacitance, and gate capacitance of an n-transistor having the unit width and minimum length. Similarly, we can represent the equivalent resistance and capacitances of a p-transistor with sizing $x$ as follows:

$$vR_p(x) = R_p / x, \qquad vC_{dp}(x) = C_{dp}\,x, \qquad vC_{gp}(x) = C_{gp}\,x, \tag{14}$$

where $R_p$, $C_{dp}$, and $C_{gp}$ are the source-to-drain resistance, diffusion capacitance, and gate capacitance of a p-transistor having the unit width and minimum length. In addition, we can represent $R_p$ as a function of $R_n$ and the ratio of the mobilities of p- and n-transistors, $\mu = \mu_p / \mu_n$.
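As a minimal sketch of the scaling relations in Eqs. (13) and (14), the following Python snippet derives the equivalent resistance and capacitances of a transistor of sizing $x$ from its unit-width values. The class name `UnitDevice`, the helper `scaled`, and all numeric values are illustrative assumptions in arbitrary units; they are not taken from any of the technologies discussed in this dissertation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitDevice:
    """Unit-width, minimum-length transistor parameters (hypothetical values)."""
    r: float    # source-to-drain resistance of the unit device
    c_d: float  # diffusion capacitance of the unit device
    c_g: float  # gate capacitance of the unit device


def scaled(unit: UnitDevice, x: float) -> tuple[float, float, float]:
    """Return (resistance, diffusion cap, gate cap) for a device of sizing x.

    Resistance scales as 1/x and both capacitances scale as x,
    following the first-order model of Eqs. (13) and (14).
    """
    if x <= 0:
        raise ValueError("sizing must be positive")
    return unit.r / x, unit.c_d * x, unit.c_g * x


# Illustrative numbers only, in arbitrary units.
n_unit = UnitDevice(r=10.0, c_d=2.0, c_g=3.0)
r4, cd4, cg4 = scaled(n_unit, 4.0)
```

Under this first-order model, the product of a transistor's resistance and its own diffusion capacitance does not depend on $x$, which is the property the sizing argument in Section 5.5.1.1 relies on.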
The source-to-drain resistances of a p- and an n-transistor with the same unit width and minimum length are related as follows:

$$R_p = \left(\frac{\mu_p}{\mu_n}\right) R_n = \mu R_n \tag{15}$$

On the other hand, the diffusion capacitances of p- and n-transistors with the same unit width and minimum length are roughly the same, compared to the large difference between $R_p$ and $R_n$ due to the large mobility difference. This is true in many technologies, e.g., the value of $\mu$ is estimated as around 4 in IBM 90nm technology [102], around 2.7 in TSMC 180nm technology [103], and about 2.5 in the 65nm industrial technology that is extensively used throughout this dissertation. In contrast, the value of $C_{dp}/C_{dn}$ is roughly 1.39 in IBM 90nm technology, 0.88 in TSMC 180nm technology, and 1.15 in the industrial 65nm technology (capacitances are computed using the equations in [91][92]). The same holds for gate capacitances. So we assume that $C_{dp}$ and $C_{gp}$ are roughly identical to $C_{dn}$ and $C_{gn}$, respectively:

$$C_d = C_{dp} \approx C_{dn}, \qquad C_g = C_{gp} \approx C_{gn} \tag{16}$$

Based on the MOSFET model described above, we can estimate the delay of an RC equivalent circuit using the Elmore delay model. For example, the delays of a 2-input NAND gate for two input patterns, (A, B) = (R, S1) and (A, B) = (S1, R), can be calculated from the RC equivalent circuits constructed in Figure 28, where the delay of Figure 28(a) is

$$\left(\frac{R_n}{y_1} + \frac{R_n}{y_2}\right)\left(C_d y_1 + C_d x_1 + C_d x_2 + C_L\right), \tag{17}$$

and the delay of Figure 28(b) is

$$\frac{R_n}{y_2}\left(C_d y_1 + C_d y_2\right) + \left(\frac{R_n}{y_1} + \frac{R_n}{y_2}\right)\left(C_d y_1 + C_d x_1 + C_d x_2 + C_L\right). \tag{18}$$

5.5 Proposed Approach

As the first step of solving the problem formulated in Section 5.2, we start by investigating various common methods for circuit transformations, which are very likely to be used by the adversary to implement Trojans.
First, we identify three transformation methods: (a) transformations using De Morgan’s law, (b) decomposition of a large fanin gate into gates with smaller fanins, and (c) algebraic substitution based transformation. Since one or more instances of these three transformations are extremely likely to be used in the design of any maximally-matched Trojan, investigating them helps us study the applicability of delay signatures to detecting the maximally-matched models of Trojans. Next, we prove that every single or multiple instance of these transformations causes a mismatch in delays, and that the mismatch can be detected by applying vectors that sensitize the mismatch and propagate it to the circuit’s output.

We begin by generalizing the effects of each of the three identified circuit transformation methods on the netlist of an arbitrary circuit. We find that (a) transformations using De Morgan’s law and (b) decomposition of a multi-input gate into gates with smaller fanins involve changes in at least one multi-input gate. We then observe that these transformations significantly change the gate’s delays, and that the delays of the two gates (the original and the transformed gate) cannot be matched for every possible vector that can be applied to the inputs of the gates. In addition, we observe that every instance of (c) algebraic substitution based transformation, which always causes the fanouts of the substituting line to be re-connected to the substituted line, can be detected using the same vectors used for detecting minimally delay-invasive Trojans.
Furthermore, we present theoretical proofs that, for any non-reconvergent combinational logic block, we can always detect any single or multiple instances of transformation using the above three methods, independent of how the gates/transistors of the circuit are resized and how the input/output cones of the transformed part of the circuit are redesigned. We then extend this idea to general circuits: we can detect any of the abovementioned instances of transformations, provided that we are able to apply the desired input vectors to a target and propagate its delays to the circuit’s output.

5.5.1 The Effects of Single Instance of Circuit Transformations on Delays

In this subsection, we consider the case of only a single instance of circuit transformation using the methods identified above. Once a particular single instance of transformation is applied to the netlist, the transistors in the transformed part of the circuit are resized to maximally match the delays of the original and the transformed circuits. We will show that, for any single instance of transformation using the three methods identified above, there is no way to hide the mismatch in the delays for every possible vector.

5.5.1.1 Transformations using De Morgan’s law

First, we describe how transformations using De Morgan’s law affect a netlist. Every single instance of transformation using De Morgan’s law transforms a multi-input gate into its structurally-dual gate by adding/pushing inverters around the gate’s inputs and output. Here, structural duality between a multi-input gate in the original circuit and the corresponding gate in the transformed netlist means that (1) the two gates are of dual types, i.e., NAND vs. NOR, (2) they have the same number of fanins, and (3) the logic functions implemented by their corresponding inputs are complementary. Figure 29(a) depicts how such a single instance of transformation occurs at an arbitrary multi-input gate.
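The logical side of this transformation can be checked exhaustively. The sketch below assumes one consistent inverter placement (every input complemented and the output of the dual gate inverted) and verifies that the result computes the same function as the original gate for fanins 2 through 5. The function names are hypothetical, and the check covers only logic equivalence, not delays.

```python
from itertools import product


def nand(bits):
    """m-input NAND: output is 0 only when every input is 1."""
    return not all(bits)


def nor(bits):
    """m-input NOR: output is 1 only when every input is 0."""
    return not any(bits)


def dual_of_nand(bits):
    """NOR-based replacement of a NAND: inverted inputs, inverted output."""
    return not nor([not b for b in bits])


def dual_of_nor(bits):
    """NAND-based replacement of a NOR: inverted inputs, inverted output."""
    return not nand([not b for b in bits])


def equivalent(f, g, m):
    """Compare two m-input Boolean functions on all 2**m input patterns."""
    return all(f(list(v)) == g(list(v))
               for v in product([False, True], repeat=m))


# Map each fanin m to whether both dual replacements preserve the function.
checks = {
    m: equivalent(nand, dual_of_nand, m) and equivalent(nor, dual_of_nor, m)
    for m in range(2, 6)
}
```

Because the two structures are logically indistinguishable, any detection of this transformation has to rely on a non-functional signature such as delay.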
After a particular transformation, every transistor of the transformed gate, $g'$, will be resized to eliminate the difference between the delays of the two circuits.

Figure 29: (a) Transformation of an arbitrary multi-input gate, $g$, into its structurally-dual gate, $g'$, where $F$ is the logic subcircuit in the fanin of $g$, I1 and I2 are inverters added/moved due to the transformation, $f_i$ is the logic function implemented by the $i$-th input of $g$, and $\bar{f_i} = \mathrm{NOT}(f_i)$. (b) Transistor-level diagrams of $g$ (an m-input NAND gate) and $g'$ (an m-input NOR gate), where the capacitors within dashed lines are internal capacitances whose values are the sum of the drain/source capacitances of two consecutively connected transistors.

Lemma 1 (De Morgan’s law): The delays of any two structurally-dual gates cannot be matched for every possible vector, independent of the sizings of the transistors.

Proof: First, we focus on the structures of gates $g$ and $g'$, and analyze the fundamental difference between any $g$ and $g'$. Figure 29(b) shows transistor-level representations of an m-input gate and its structurally-dual gate (if $g$ is an m-input NAND gate then $g'$ is an m-input NOR gate, and vice versa). We note that the pull-up and pull-down networks of $g$ and $g'$ correspond to each other: the pull-up network of $g$ is symmetric to the pull-down network of $g'$ except for the transistor types, and so are the pull-down network of $g$ and the pull-up network of $g'$. Thus, there also exists a correspondence between every transistor and internal node (equivalently, the equivalent resistance of every transistor and the internal capacitance at every node) of $g$ and $g'$. For example, in Figure 29(b), the p-transistor connected to input $f_1$ in the m-input NAND gate corresponds to the n-transistor connected to input $\bar{f_1}$ in its structurally-dual gate, the m-input NOR gate. Due to this property of correspondence, if a particular vector applied to the inputs of the circuit causes a transition at the input of $g$ and sensitizes a charge/discharge path of $g$ (which starts at the vdd/gnd pin, passes via some equivalent resistance(s) and internal capacitance(s) of $g$, and ends at the output of $g$), then the same vector applied to the transformed version of the circuit containing $g'$ will sensitize the corresponding charge/discharge path of $g'$, which consists of only the corresponding equivalent resistance(s) and internal capacitance(s).

Despite the correspondences between inputs, charge/discharge paths, equivalent resistances, and internal capacitances, a mismatch in the delays occurs because the two transistor types do not scale in the same way with sizing: there is a significant difference between the mobilities of p- and n-transistors, and hence the equivalent resistances and internal capacitances of p- and n-transistors, and even their products, cannot be made the same for any sizing of the transistors in $g'$.

Figure 30: Two different implementations of the 4-input AND function, NAND4-INV (left) and INV-NOR4 (right).

As an example, we pick a circuit consisting of only one 4-input NAND gate followed by an inverter (hence the circuit has four inputs and one output), and transform it by applying De Morgan’s law. Figure 30 shows these two versions of the circuit, where the first circuit, NAND4-INV, corresponds to the original circuit, and the second circuit, INV-NOR4, is the transformed version of the circuit. Here, $g$ and $g'$ are the 4-input NAND and 4-input NOR gates, respectively.
However, due to the changes in the location and number of inverters, $g$ and $g'$ may drive different loads, so we denote these two loads differently, namely $C_L$ and $C_L'$.

Figure 31: RC equivalent circuits for (a) a 4-input NAND and (b) a 4-input NOR gate when $V_1$ and $V_2$ are applied to the circuit, where $a_i$’s and $b_i$’s are the sizings of the p- and n-transistors of the 4-input NAND, and $x_i$’s and $y_i$’s are the sizings of the p- and n-transistors of the 4-input NOR gate, respectively. The blue arrows drawn at each RC equivalent circuit indicate the direction of the charge/discharge paths of the gates when the corresponding vectors are applied to the inputs of the gates. In (a), the two vector pairs are (A, B, C, D): (0, 1, 1, 1) -> (1, 1, 1, 1) and (1, 0, 1, 1) -> (1, 1, 1, 1); in (b), the corresponding vector pairs are (A', B', C', D'): (1, 0, 0, 0) -> (0, 0, 0, 0) and (0, 1, 0, 0) -> (0, 0, 0, 0).

Let us choose two vectors, say $V_1$ and $V_2$, which apply (A, B, C, D) = (R, S1, S1, S1) and (S1, R, S1, S1) to the input of $g$, where these two vectors are designed to sensitize different numbers of internal capacitances of $g$. If the same vectors $V_1$ and $V_2$ are applied to the transformed version of the circuit, then the vectors applied to the input of $g'$ will be (A', B', C', D') = (F, S0, S0, S0) and (S0, F, S0, S0), respectively, where F denotes a falling input and S0 a steady-0 input, because of the structural duality between $g$ and $g'$. Figure 31 shows that only the corresponding internal capacitances and equivalent resistances are sensitized by the same vector applied to the input of the circuit. The delays for these two vectors can be computed using the RC delay model explained in Section 5.4, and are tabulated in Table 9.
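To make the delay expressions for $V_1$ and $V_2$ concrete, the sketch below evaluates them with the first-order RC model of Section 5.4 for a series transistor stack driving an output node. All sizings, the load value, and $\mu = 2.5$ are hypothetical numbers in arbitrary units, and the helper names are assumptions introduced only for illustration. The sketch picks the NOR load so that the $V_1$ delays agree, and then reports the remaining $V_2$ gap.

```python
def stack_delays(r_unit, c_d, stack, parallel, c_load):
    """Return (delay_V1, delay_V2, internal_term) for a gate whose series
    stack has sizings `stack` (stack[0] is adjacent to the output) and whose
    parallel network has sizings `parallel`."""
    r_total = r_unit * sum(1.0 / w for w in stack)
    out_cap = c_d * (sum(parallel) + stack[0]) + c_load
    d_v1 = r_total * out_cap
    # V2 additionally moves charge on the first internal node of the stack.
    r_below = r_unit * sum(1.0 / w for w in stack[1:])
    internal = r_below * c_d * (stack[0] + stack[1])
    return d_v1, d_v1 + internal, internal


def load_for_v1_match(target, r_unit, c_d, stack, parallel):
    """Load capacitance that makes the V1 delay equal to `target`."""
    r_total = r_unit * sum(1.0 / w for w in stack)
    return target / r_total - c_d * (sum(parallel) + stack[0])


R_N, C_D, MU = 1.0, 1.0, 2.5  # hypothetical unit values

# 4-input NAND: n-transistors b_i in series, p-transistors a_i in parallel.
nand_v1, nand_v2, nand_int = stack_delays(
    R_N, C_D, stack=[4.0] * 4, parallel=[2.0] * 4, c_load=10.0)

# 4-input NOR: p-transistors x_i in series (unit resistance MU * R_N),
# n-transistors y_i in parallel; the load is chosen to match V1.
nor_stack, nor_parallel = [10.0] * 4, [1.0] * 4
c_load_nor = load_for_v1_match(nand_v1, MU * R_N, C_D, nor_stack, nor_parallel)
nor_v1, nor_v2, nor_int = stack_delays(
    MU * R_N, C_D, nor_stack, nor_parallel, c_load_nor)

v2_gap = nor_v2 - nand_v2  # equals nor_int - nand_int, whatever the load
```

For these particular numbers the $V_1$ delays coincide while the $V_2$ delays differ by exactly the difference of the two internal-node terms. This illustrates only that the load cannot remove the $V_2$ gap once $V_1$ is matched; it does not by itself establish the result for every possible sizing.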
The sizings of the transistors in $g'$, i.e., $x_i$’s and $y_i$’s, can be freely changed to maximally match the delays between the circuits in Figure 31(a) and (b), whereas $a_i$’s and $b_i$’s are the transistor sizings of $g$ and hence are fixed constants. A solution that makes these two sets of delays identical should satisfy at least one of the following two conditions.

Condition 1: Every corresponding internal capacitance and equivalent resistance matches, i.e., $\forall i$, $R_n/b_i = \mu R_n/x_i$, $\mu R_n/a_i = R_n/y_i$, $C_d(a_i + a_{i+1}) = C_d(y_i + y_{i+1})$, and $C_d(b_i + b_{i+1}) = C_d(x_i + x_{i+1})$.

Condition 2: The product of every corresponding internal capacitance and equivalent resistance through a sensitized charge/discharge path matches.

Table 9: Delays of the two circuits shown in Figure 31 for two different vectors applied to the inputs of the circuits, $V_1$ and $V_2$.

Vector $V_1$, delay of Figure 31(a):
$$R_n\left(\frac{1}{b_1}+\frac{1}{b_2}+\frac{1}{b_3}+\frac{1}{b_4}\right)\left\{C_d(a_1+a_2+a_3+a_4+b_1)+C_L\right\}$$
Vector $V_1$, delay of Figure 31(b):
$$\mu R_n\left(\frac{1}{x_1}+\frac{1}{x_2}+\frac{1}{x_3}+\frac{1}{x_4}\right)\left\{C_d(y_1+y_2+y_3+y_4+x_1)+C_L'\right\}$$
Vector $V_2$, delay of Figure 31(a):
$$R_n\left(\frac{1}{b_2}+\frac{1}{b_3}+\frac{1}{b_4}\right)C_d(b_1+b_2)+R_n\left(\frac{1}{b_1}+\frac{1}{b_2}+\frac{1}{b_3}+\frac{1}{b_4}\right)\left\{C_d(a_1+a_2+a_3+a_4+b_1)+C_L\right\}$$
Vector $V_2$, delay of Figure 31(b):
$$\mu R_n\left(\frac{1}{x_2}+\frac{1}{x_3}+\frac{1}{x_4}\right)C_d(x_1+x_2)+\mu R_n\left(\frac{1}{x_1}+\frac{1}{x_2}+\frac{1}{x_3}+\frac{1}{x_4}\right)\left\{C_d(y_1+y_2+y_3+y_4+x_1)+C_L'\right\}$$

In fact, satisfying condition 1 also satisfies condition 2. However, before we move on to a formal argument on the satisfiability of condition 2, we explain why condition 1 can never be satisfied.

First, condition 1 can never be satisfied due to the difference between the mobilities of p- and n-transistors, $\mu$. Suppose that a particular p-transistor has the sizing of $a$ and its corresponding n-transistor has the sizing of $y$.
In this case, choosing $y = a/\mu$, which makes the equivalent resistances of these two transistors the same, i.e., $\frac{\mu R_n}{a} = \frac{R_n}{y}$, will cause a mismatch in the values of the internal capacitances, because $C_d\,a \neq C_d\,y \;(= C_d\,a/\mu)$. Thus it is impossible to match both the equivalent resistances and the internal capacitances only by changing the transistor sizing.

Second, satisfying condition 2 is also impossible because the product of the equivalent resistance and the diffusion capacitance of a particular transistor does not change with its sizing. For example, suppose that a particular p-transistor has the sizing of $a$ and its corresponding n-transistor has the sizing of $y$. Then the product of the equivalent resistance and diffusion capacitance of the p-transistor is $\frac{\mu R_n}{a} \cdot C_d\,a = \mu R_n C_d$, whereas the product of the equivalent resistance and diffusion capacitance of the n-transistor is $\frac{R_n}{y} \cdot C_d\,y = R_n C_d$, and $R_n C_d \neq \mu R_n C_d$. Since the delay of a particular charge/discharge path is the sum of products of equivalent resistances and diffusion capacitances of the transistors along the path, plus the load capacitance, it is not possible to match the delays by only resizing transistors.

In addition, changing the load capacitance, e.g., changing the sizing of the gate(s) driven by $g'$, also cannot provide a solution. For the two sets of delays in Table 9, we can first match the delays of the two circuit versions for $V_1$ by changing the load capacitance, $C_L'$. As a result, the following equality becomes true.

$$R_n\left(\frac{1}{b_1}+\frac{1}{b_2}+\frac{1}{b_3}+\frac{1}{b_4}\right)\left\{C_d(a_1+a_2+a_3+a_4+b_1)+C_L\right\} = \mu R_n\left(\frac{1}{x_1}+\frac{1}{x_2}+\frac{1}{x_3}+\frac{1}{x_4}\right)\left\{C_d(y_1+y_2+y_3+y_4+x_1)+C_L'\right\} \tag{19}$$

The next question is whether it is also possible to simultaneously match the delays of the two circuit versions for $V_2$. We note that this is always impossible, because matching the delays for one vector causes a mismatch in the delays for the other vector. In other words, we want to make the following equality also true.
$$R_n\left(\frac{1}{b_2}+\frac{1}{b_3}+\frac{1}{b_4}\right)C_d(b_1+b_2)+R_n\left(\frac{1}{b_1}+\frac{1}{b_2}+\frac{1}{b_3}+\frac{1}{b_4}\right)\left\{C_d(a_1+a_2+a_3+a_4+b_1)+C_L\right\} = \mu R_n\left(\frac{1}{x_2}+\frac{1}{x_3}+\frac{1}{x_4}\right)C_d(x_1+x_2)+\mu R_n\left(\frac{1}{x_1}+\frac{1}{x_2}+\frac{1}{x_3}+\frac{1}{x_4}\right)\left\{C_d(y_1+y_2+y_3+y_4+x_1)+C_L'\right\} \tag{20}$$

Given Eq. (19), the above equality holds only if

$$R_n\left(\frac{1}{b_2}+\frac{1}{b_3}+\frac{1}{b_4}\right)C_d(b_1+b_2) = \mu R_n\left(\frac{1}{x_2}+\frac{1}{x_3}+\frac{1}{x_4}\right)C_d(x_1+x_2).$$

But this condition is independent of the size of the load, $C_L'$, and hence changing the size of the load cannot make the delays equal for the two circuit versions when $V_2$ is applied. Similarly, if we match the delays for $V_2$, then we cannot match the delays for $V_1$. Thus, we are not able to match the delays of a 4-input NAND and a 4-input NOR by transistor resizing, and hence cannot make the two circuits in Figure 30 have the same delays. We can also prove this for all other pairs of primitive gates, i.e., NAND or NOR with arbitrary fanins. Hence the delays of structurally-dual gates cannot be equal, especially for input vectors that charge/discharge different numbers of internal capacitances of the gates.

Thus, the difference caused by any single instance of transformation using De Morgan’s law can be detected by applying vectors that charge/discharge different numbers of internal capacitances through serially-connected transistors.

5.5.1.2 Decomposition of a Large Fanin Gate into Gates with Smaller Fanins

A single instance of decomposition of a large fanin gate into smaller fanin gates refers to the transformation of an arbitrary multi-input gate into a number of parallel gates with smaller fanins, as depicted in Figure 32(a). A number of possibilities exist, such as different numbers of decomposed gates with different fanins, and correspondingly different choices of $H$, and all should preserve the same logic function of the circuit.
For example, an 8-input NAND gate can be transformed in many ways, such as two parallel 4-input NAND gates followed by INV-NAND2, four parallel 2-input NAND gates followed by INV-NAND4, or two parallel 4-input NAND gates followed by NOR2-INV, etc. Here, all possible single instances of transformation that occur at a particular multi-input gate share a common characteristic: each one of the lines that used to drive an input of the original gate now drives an input of one of the decomposed gates with smaller fanins. After the transformation, we assume that the transistors of the transformed gates will be resized to match delays.

Figure 32: (a) Decomposition of an arbitrary multi-input gate, $g$, into gates with smaller fanins, $g_i'$’s, and a logic block $H$, where $F$ is the subcircuit in the fanin of $g$, and $f_i$ is the logic function implemented by the $i$-th input of $g$. (b) Transistor-level diagrams of $g$ (an m-input NAND gate with m internal capacitances) and $g_1'$ (an $m_1$-input NAND gate with $m_1$ internal capacitances), where the two gates have different numbers of internal capacitances.

Lemma 2 (gates with different fanins): The delays of gates with different fanins cannot be matched.

Proof: The delays of two gates with different fanins cannot be matched as they have different numbers of transistors, and hence have different numbers and values of equivalent resistances and internal capacitances. For example, suppose that an m-input NAND gate is decomposed into several parallel smaller-fanin gates as depicted in Figure 32(b). We choose gate $g$ and one of these decomposed gates, say $g_1'$, and show that the delays of these two gates cannot be matched for every possible vector. Specifically, for every vector that sensitizes the discharge path from the gnd pin to the output node of the gate through the m serially-connected n-transistors in $g$, a discharge path consisting of a smaller number, $m_1$, of serially-connected n-transistors of $g_1'$ will be sensitized.
We show that this change cannot be eliminated by resizing the transistors of every $g_i'$ and $H$, given that we can apply vectors that sensitize a fixed delay of a path through $H$ and excite only one of the decomposed gates. As an example, consider a circuit consisting of only a 4-input NAND gate. After decomposition, the 4-input NAND gate becomes two parallel 2-input NAND gates, followed by NOR2-INV, as depicted in Figure 33. In terms of Figure 32(a), $g$ and $g_1'$ are the 4-input NAND and the 2-input NAND gates, and $H$ corresponds to the NOR2-INV network in Figure 33.

Figure 33: Two different implementations of 4-input NAND function, NAND4-INV (left) and NAND2-NOR2-INV (right).

Figure 34: RC equivalent circuits for (a) a 4-input NAND, $g$, and (b) a 2-input NAND, $g_1'$, when $V_1$ and $V_2$ are applied to the circuit, where $a_i$’s and $b_i$’s are the sizings of the p- and n-transistors of $g$, and $x_i$’s and $y_i$’s are the sizings of the p- and n-transistors of $g_1'$, respectively. The blue arrows drawn at each RC equivalent circuit indicate the direction of the charge/discharge paths of the gates when the same vectors are applied to the inputs of the gates. The two vector pairs are (A, B, C, D): (0, 1, 1, 1) -> (1, 1, 1, 1) and (1, 0, 1, 1) -> (1, 1, 1, 1).

For the two circuits in Figure 33, we apply two different vectors that sensitize a fixed delay of $H$, e.g., (A, B, C, D) = (R, S1, S1, S1) and (S1, R, S1, S1), and we call these $V_1$ and $V_2$, respectively.
Since the inputs C and D are assigned steady non-controlling values in both vectors, they will not cause any transition at the output of the corresponding 2-input NAND gate, and hence make the gate in its fanout, the 2-input NOR, have a fixed delay for both vectors. Figure 34 shows the RC equivalent circuits of Figure 33 when $V_1$ and $V_2$ are applied. The delays for $V_1$ and $V_2$ are computed and tabulated in Table 10.

Table 10: Delays of the two circuits shown in Figure 34 for two different vectors applied to the inputs of the circuits, $V_1$ and $V_2$.

Vector $V_1$, delay of Figure 34(a):
$$R_n\left(\frac{1}{b_1}+\frac{1}{b_2}+\frac{1}{b_3}+\frac{1}{b_4}\right)\left\{C_d(a_1+a_2+a_3+a_4+b_1)+C_L\right\}$$
Vector $V_1$, delay of Figure 34(b):
$$R_n\left(\frac{1}{y_1}+\frac{1}{y_2}\right)\left\{C_d(x_1+x_2+y_1)+C_L'\right\}$$
Vector $V_2$, delay of Figure 34(a):
$$R_n\left(\frac{1}{b_2}+\frac{1}{b_3}+\frac{1}{b_4}\right)C_d(b_1+b_2)+R_n\left(\frac{1}{b_1}+\frac{1}{b_2}+\frac{1}{b_3}+\frac{1}{b_4}\right)\left\{C_d(a_1+a_2+a_3+a_4+b_1)+C_L\right\}$$
Vector $V_2$, delay of Figure 34(b):
$$R_n\left(\frac{1}{y_2}\right)C_d(y_1+y_2)+R_n\left(\frac{1}{y_1}+\frac{1}{y_2}\right)\left\{C_d(x_1+x_2+y_1)+C_L'\right\}$$

For the delay equations obtained for $V_1$ and $V_2$, we adopt an approach similar to the one used in Section 5.5.1.1. First, we assume that the delays of the two circuits match when we apply $V_1$:

$$R_n\left(\frac{1}{b_1}+\frac{1}{b_2}+\frac{1}{b_3}+\frac{1}{b_4}\right)\left\{C_d(a_1+a_2+a_3+a_4+b_1)+C_L\right\} = R_n\left(\frac{1}{y_1}+\frac{1}{y_2}\right)\left\{C_d(x_1+x_2+y_1)+C_L'\right\} \tag{21}$$

We then show that it is not possible to simultaneously match the delays when we apply $V_2$. Matching the delays for $V_2$ means that the equation below is satisfied.

$$R_n\left(\frac{1}{b_2}+\frac{1}{b_3}+\frac{1}{b_4}\right)C_d(b_1+b_2)+R_n\left(\frac{1}{b_1}+\frac{1}{b_2}+\frac{1}{b_3}+\frac{1}{b_4}\right)\left\{C_d(a_1+a_2+a_3+a_4+b_1)+C_L\right\} = R_n\left(\frac{1}{y_2}\right)C_d(y_1+y_2)+R_n\left(\frac{1}{y_1}+\frac{1}{y_2}\right)\left\{C_d(x_1+x_2+y_1)+C_L'\right\} \tag{22}$$

If Eq. (21) is substituted into Eq. (22), then it implies the following.

$$R_n\left(\frac{1}{b_2}+\frac{1}{b_3}+\frac{1}{b_4}\right)C_d(b_1+b_2) = R_n\left(\frac{1}{y_2}\right)C_d(y_1+y_2) \tag{23}$$

From Eq. (23), it is clear that the terms (1) $R_n\left(\frac{1}{y_2}\right)C_d(y_1+y_2) = R_n C_d\left(1+\frac{y_1}{y_2}\right)$ and (2) $R_n\left(\frac{1}{b_2}+\frac{1}{b_3}+\frac{1}{b_4}\right)C_d(b_1+b_2) = R_n C_d\left(1+\frac{b_1}{b_2}+\frac{b_1+b_2}{b_3}+\frac{b_1+b_2}{b_4}\right)$ cannot be identical. For example, assume the conventional case in which the values of $a_i$’s and $b_i$’s are 2 and 4, respectively, so that the 4-input NAND gate has equal worst-case rising and falling delays when $\mu = 2$. Then the second term, $R_n C_d\left(1+\frac{b_1}{b_2}+\frac{b_1+b_2}{b_3}+\frac{b_1+b_2}{b_4}\right)$, is $6 R_n C_d$, and the only solutions that make $R_n C_d\left(1+\frac{y_1}{y_2}\right) = 6 R_n C_d$ are those satisfying $y_1 = 5 y_2$, which means that one n-transistor in the 2-input NAND gate has 5 times the sizing of the other. Hence, resizing transistors to match the delays between these two versions of the circuit will not succeed in this case. In a similar way, we can easily extend the proof to an arbitrary case of decomposition for an arbitrary multi-input gate.

So we can capture the difference caused by any single instance of transformation using decomposition by selecting vectors that sensitize a fixed delay of a path through $H$ for only one of the decomposed gates, and simultaneously charge/discharge different numbers of internal capacitances of the decomposed gate.

5.5.1.3 Algebraic Substitution

Every single instance of algebraic substitution based transformation targets two circuit lines with identical logic functions, and one of these two lines is substituted into the other as a result of the transformation. Algebraic substitution moves the fanouts of the substituting line to the substituted line, and hence the substituted line drives a larger load. To hide the impact of substitution, gates of the circuit will be resized. Figure 35 illustrates the above scenario.

Figure 35: An arbitrary circuit, $C$, is transformed to $C_i$ via algebraic substitution, where two lines with identical logic functions, say $l_i$ and $l_j$, are selected and line $l_i$ is substituted into $l_j$.
The result of algebraic substitution is similar to the case of the minimally delay-invasive Trojans, each of which adds a single connection to the original circuit, in that it also increases the load driven by a particular circuit line and then resizes gates. Thus, we can identify every single instance of algebraic substitution based transformation by using the same set of vectors used for detecting the minimally delay-invasive Trojans.

Table 11 summarizes each canonical model of Trojans and the vectors for detecting each Trojan, based on the observations and findings described in the above sections and Chapter 3.

Table 11: Summary of minimally delay-invasive Trojans and the three circuit transformations studied.

Type of Trojan: Minimally delay-invasive Trojans
  Phenomenon: Change in delays of some gate(s) due to additional load or resizing gate(s), or both
  Vectors: Vectors sensitizing a path passing via a target line that propagate the delay of the target line to the output

Type of Trojan: Maximally-matched designs, De Morgan’s law
  Phenomenon: Mismatch in the values of internal capacitances/equivalent resistances between serially-connected transistors
  Vectors: Vectors charging/discharging different numbers of internal capacitances at the target gate

Type of Trojan: Maximally-matched designs, decomposition of a large fanin gate into gates with smaller fanins
  Phenomenon: Mismatch in the numbers/values of internal capacitances between serially-connected transistors
  Vectors: Vectors charging/discharging different numbers of internal capacitances of only one of the decomposed gates

Type of Trojan: Maximally-matched designs, algebraic substitution
  Phenomenon: Increase in the number of fanouts of the substituted line
  Vectors: The same as the vectors for detecting the minimally delay-invasive models of Trojans

Theorem 2 (single instance of circuit transformations using De Morgan’s law, decomposition of a large fanin gate into smaller fanin gates, and algebraic substitution): Provided that we have the capability to apply the desired vectors to a target and propagate the delay to the output of the circuit, it is always possible to detect the mismatch in delays when a single instance of the above three circuit transformations is applied, using the vectors summarized in Table 11.

The proof that such capabilities really exist is provided in a later section.

5.5.1.4 Theoretical Proofs for Non-Reconvergent Circuits

We have shown that, provided that we are able to apply the desired vectors to a target and propagate its delay to the circuit’s output, we can detect the mismatch in delays for any single instance of the minimally delay-invasive Trojans and the above three circuit transformations, using the vectors summarized in Table 11. In this section, we explain the theoretical foundations showing that we are always able to generate vectors with such capabilities in any non-reconvergent combinational logic block, i.e., a combinational logic block which consists of only NAND, NOR, and INV gates and has no reconvergent path. We then extend our approach to general circuits, i.e., circuits with reconvergent paths, and show the necessary conditions for achieving such capabilities.

First, we present proofs showing that, in any non-reconvergent combinational logic block, delay measurements are guaranteed to detect all minimally delay-invasive Trojans and maximally-matched Trojans, including transformations using De Morgan’s law, decomposition of a large fanin gate into gates with smaller fanins, and algebraic substitution based transformations.

Lemma 3: In an arbitrary non-reconvergent combinational logic block, it is always possible to apply any desired vector to any multi-input gate.

Proof: First, we show that each input of any multi-input gate is driven by a distinct input cone which does not overlap with the input cones of the other inputs of the gate. Otherwise, two input cones driving two different inputs of the gate would share a gate, which implies that a reconvergent path exists, a contradiction.
Thus, the input cones of the inputs of the gate are independent of each other. Since the input cones are independent of each other and no line in a non-reconvergent circuit is redundant, i.e., there always exists at least one input vector that assigns the logic value ‘0’ to the line and another input vector that assigns the logic value ‘1’, we are always able to find a vector that assigns a desired logic value at the line. Hence any logic values at the inputs of the gate can be assigned independently.

Lemma 4: The delays of an arbitrary multi-input gate can be propagated to the output of the logic block in any non-reconvergent combinational logic block.

Proof: According to Lemma 3, for any path from the output of the gate to any of the circuit’s outputs, any vector can be applied to the input of any gate constituting the path. Hence it is always possible to propagate the delays of the gate to the circuit’s output by assigning non-controlling values at the side-inputs of all subsequent gates of the path.

From Lemmas 3 and 4, we arrive at the following theorem.

Theorem 3: For every instance of circuit transformations in an arbitrary non-reconvergent circuit, it is possible to generate vectors that excite the difference caused by any single instance of circuit transformations, and propagate this difference to the circuit’s output.

Thus, we have shown that, for any non-reconvergent circuit, by applying any desired vectors and measuring delays we can guarantee detection of every single instance of the above circuit transformations.

The second step of our exploration is to determine whether multiple instances of the transformations followed by resizing, i.e., arbitrary numbers and combinations of the above circuit transformations and resizing of gates to maximally match the delays, can be detected by measuring delays. We start by proving that it is possible to detect any multiple instances of the transformations in any non-reconvergent combinational logic block by measuring delays.
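The non-reconvergence property that these lemmas assume can be tested mechanically on a netlist. The sketch below is one possible check, assuming the circuit is given as an acyclic fanout map from each node name to the list of nodes it drives; this representation and the function name are assumptions made for illustration. A block is reported as reconvergent when two fanout branches of the same node can reach a common node.

```python
def has_reconvergence(fanouts):
    """Return True if some pair of nodes is joined by two distinct paths.

    `fanouts` maps every node name to the list of nodes it drives and is
    assumed to describe an acyclic combinational block.
    """
    def reachable(start):
        # Iterative depth-first search collecting start and everything
        # downstream of it.
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(fanouts.get(node, ()))
        return seen

    for branches in fanouts.values():
        covered = set()
        for branch in branches:
            cone = reachable(branch)
            if covered & cone:  # two branches of one stem meet again
                return True
            covered |= cone
    return False


# Fanout-free example: a and b drive g1; g1 and c drive g2.
tree = {"a": ["g1"], "b": ["g1"], "c": ["g2"], "g1": ["g2"], "g2": []}

# Reconvergent example: a fans out to g1 and g2, which both drive g3.
diamond = {"a": ["g1", "g2"], "g1": ["g3"], "g2": ["g3"], "g3": []}
```

For the two small examples, `has_reconvergence(tree)` evaluates to False and `has_reconvergence(diamond)` to True. The check recomputes a reachable set per fanout branch, so it favors clarity over efficiency on large netlists.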
Lemma 5: Applying any single instance of any of the transformations to an arbitrary non-reconvergent combinational logic block does not produce a reconvergent path.

Proof: We first show that De Morgan’s law and decomposition based transformations do not produce a reconvergent path. First, transforming a multi-input gate to its structural dual by moving inverters does not produce any extra fanout. Also, decomposition of a large fanin gate into gates with smaller fanins breaks the target multi-input gate into a number of parallel gates with smaller fanins, where no two decomposed gates share an input and the outputs of the decomposed gates converge to a single line via a single gate or a fanout-free network of gates. Hence, no extra fanout is created by decomposition and no reconvergent path is produced. Last, algebraic substitution re-connects every fanout of the substituting line to the substituted line. Here, the two lines subject to substitution must implement the same logic function, which means that they originate at the same set of inputs of the circuit. Thus, the output cones of these two lines do not have a common gate; otherwise, there would exist a reconvergent path from an input of the circuit to the common gate. Hence, no reconvergent path is produced by a single instance of a transformation based on algebraic substitution.

Theorem 4: For an arbitrary non-reconvergent combinational logic block after one or more instances of transformations, we can always generate input vectors that apply desired vectors to one target transformation in the circuit and propagate the delay of the target to the circuit’s output.

Proof: Suppose that an arbitrary non-reconvergent combinational logic block, $C_0$, is transformed to $C_i$ after one or more instances of transformations, where the target line whose delays need to be propagated to the circuit’s output is $l$ and its input cone is $C_{Ai}$ (Figure 36).
It is always possible to apply any vector to the gate driving line $l$, because $C_{Ai}$ is non-reconvergent and this property has been proven to be unchanged by circuit transformations. Thus, we can apply vectors to the gate driving $l$ by controlling inputs $\beta_i$. Propagating the delays of $l$ via an arbitrary path to the output of the circuit is also always possible independent of $C_{Bi}$, because $C_{Bi}$ is also non-reconvergent: since we proved that the property of non-reconvergence does not change due to circuit transformations, $C_{Bi}$ must be non-reconvergent. Hence, we are able to control the side-inputs of the path via inputs $\alpha_i$ and $\gamma_i$, and can propagate the delays to the output of the circuit using the same vector, independent of the different vectors we apply at inputs $\beta_i$ of $C_{Ai}$.

Figure 36: A non-reconvergent combinational logic block, $C_0$, is transformed into $C_i$, where $l$ is an arbitrary line, $C_{Ai}$ is the input cone of $l$, and $C_{Bi}$ is the rest of the circuit.

We have already shown empirically in Chapter 3 that we can detect some of these canonical models of Trojans, i.e., minimally delay-invasive Trojans, in arbitrary combinational logic blocks. However, the above theoretical proofs provide guarantees only for non-reconvergent combinational logic blocks. In Sections 5.6 and 5.7, we will explain how the above approach can be extended to a more general problem.

5.5.2 Sensitivity Analysis

In this section, we present our approach for quantifying the difference in delays and for finding the vectors that invoke the maximum mismatch for each of the logic transformation methods discussed in Section 5.5.1. Furthermore, we examine whether the deviation caused by each transformation exceeds a given detection threshold. We start with a quantitative analysis of every single instance of the three transformations by constructing a sub-circuit that consists of only a few gates, i.e.,
only the part of the circuit that is modified by the given instance of transformation, and we find the vectors that invoke the maximum difference by analyzing the delays of the sub-circuit for the various vectors listed in Table 11. To quantify the delays, we use the first-order delay model described in Section 5.4. However, our approach should also use very accurate delay models, so that the delay differences caused by circuit transformations can be accurately estimated, which is particularly important when the detection threshold is large. Thus, we also perform realistic transistor-level simulations on each of the sub-circuits to verify our selection of vectors.

5.5.2.1 Finding Vectors Invoking the Maximum Difference in Delays

In our problem of detecting a maximally-matched Trojan formulated in Section 5.2, our goal is to find a subset of $\mathbf{V}$, $\mathbf{V_m}$, which maximizes the mismatch, i.e., the difference between delays. The problem of finding $\mathbf{V_m}$ can be formulated as the following linear programming (LP) problem.

Objective:

$$\text{maximize} \quad \min_{\text{transistor sizings in } C_i} \Big\{ \max_{\forall j,\; V_k \in \mathbf{V_m}} \big| d_{C,O_j}(V_k) - d_{C_i,O_j}(V_k) \big| \Big\}$$

Constraint:

$$\min_{\text{transistor sizings in } C_i} \Big\{ \max_{\forall j,\; V_k \in \mathbf{V_m}} \big| d_{C,O_j}(V_k) - d_{C_i,O_j}(V_k) \big| \Big\} \ge \Delta_m$$

The objective function refers to finding a set of vectors $\mathbf{V_m}$ that maximizes the difference between delays, $|d_{C,O_j}(V_k) - d_{C_i,O_j}(V_k)|$, for $V_k \in \mathbf{V_m}$, where this difference is minimized by resizing the transistors in $C_i$. In this section, we find $\mathbf{V_m}$ for each of the circuit transformation methods explained in Section 5.5.1, from the vectors summarized in Table 11.

De Morgan’s law: To detect any single instance of transformations using De Morgan’s law, we showed that vectors charging/discharging different numbers of internal capacitances can be used for detection.
To achieve that, we can apply various SIS-TNC to a target multi-input gate, since different SIS-TNC charge/discharge different numbers of internal capacitances of the target gate. Among the various SIS-TNC, we can use the two vectors with the greatest difference in the number of internal capacitances charged/discharged. For example, for a 4-input NAND gate, we can choose two vectors where one gives a to-non-controlling transition to the input of the transistor which is closest to the vdd/gnd pin (no internal capacitance is initially charged/discharged), and the other gives the transition to the input of the transistor which is closest to the output node (every internal capacitance is initially charged/discharged), as shown in Figure 37. We can always find two such vectors for an arbitrary multi-input gate.

Figure 37: Two SIS-TNC for a 4-input NAND gate causing the greatest delay difference

Decomposition of a large fanin gate into gates with smaller fanins: The mismatch in the delays is caused by the different numbers of serially-connected transistors in the decomposed gates and the original gate, and hence we can also use SIS-TNC to detect the difference. However, unlike the transformation using De Morgan’s law, we cannot select only two vectors, because there are many ways of decomposing a gate, including different numbers of decomposed gates with different fanins. Thus, the vectors invoking the maximum difference may differ depending on how a target multi-input gate is decomposed. We can still sensitize the difference by using any two vectors among the SIS-TNC, where the magnitude of the difference depends on the way of decomposition as well as on the selected vectors. So we can basically apply any two SIS-TNC to a target multi-input gate to detect the transformation; however, the greatest difference is guaranteed to be invoked only when every SIS-TNC can be used.
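The vector selection described above can be sketched in a few lines. The sketch assumes that an SIS-TNC is a two-pattern vector in which exactly one input makes a to-non-controlling transition while all other inputs hold the non-controlling value, as suggested by the (R, S1, S1, S1) notation, and that the inputs are listed in the order of the series transistor stack as in Figure 37. The function names are hypothetical.

```python
# Non-controlling value per gate type: NAND -> 1, NOR -> 0.
NON_CONTROLLING = {"NAND": 1, "NOR": 0}


def sis_tnc_vectors(gate_type, fanin):
    """Enumerate two-pattern vectors (initial, final) in which exactly one input
    makes a to-non-controlling transition and all others hold the non-controlling value."""
    nc = NON_CONTROLLING[gate_type]
    final = tuple([nc] * fanin)
    vectors = []
    for switching in range(fanin):
        initial = list(final)
        initial[switching] = 1 - nc  # the switching input starts at the controlling value
        vectors.append((tuple(initial), final))
    return vectors


def extreme_pair(gate_type, fanin):
    """Pick the SIS-TNC that switch the first and the last input of the stack
    (assumption: these are the two ends of the series transistor chain)."""
    vectors = sis_tnc_vectors(gate_type, fanin)
    return vectors[0], vectors[-1]


for initial, final in extreme_pair("NAND", 4):
    print(initial, "->", final)
```

For a 4-input NAND gate this reproduces the two input assignments of Figure 37; for the decomposition case, the full list returned by `sis_tnc_vectors` would be applied instead of only the extreme pair.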
Algebraic substitution: As discussed in Section 5.5.1.3, we can use the same vectors as those used for detecting minimally delay-invasive Trojans. In any non-reconvergent circuit, the SIS-TNC for every multi-input gate can also be used for detecting minimally delay-invasive Trojans, because they also make a transition at each circuit line and propagate it to the circuit’s output. This property of SIS-TNC conforms to the conditions for generating vectors for minimally delay-invasive Trojans that we discussed in Section 3.5.

[Figure 37 input assignments: (A, B, C, D): (0, 1, 1, 1) -> (1, 1, 1, 1) and (A, B, C, D): (1, 1, 1, 0) -> (1, 1, 1, 1); the switching input is rising and the remaining inputs are steady at 1.]

5.5.2.2 Sub-Circuit Analysis

Based on the vectors obtained above for detecting maximally-matched Trojans using the three circuit transformation methods, we present our approach for quantifying the delay differences at a practical complexity. In particular, we analyze and simulate only the sub-circuit that is modified by the transformation. The advantage of the sub-circuit analysis is that we avoid simulating the delays of the whole circuit, and hence the complexity of simulation is lower. In addition, according to Theorems 3 and 4, in any non-reconvergent circuit, the difference caused by a sub-circuit is guaranteed to be propagated to the circuit’s output. The sub-circuit analysis refers to the process of extracting the part of the netlist that is changed by the Trojan and obtaining the maximum difference caused by the Trojan using simulations, given (1) the extracted part of the netlist, (2) the sizings of the transistors in this part of the netlist, and (3) the vectors to be used for detection. We determine the new sizings of the transistors from the delays computed using the first-order RC delay models, where the transistor sizings are chosen to minimize the delay difference for the given vectors.
As an example, we performed experiments on a sub-circuit consisting of a 4-input NAND gate, where the possible transformations of this sub-circuit include (a) INV-NOR4-INV, (b) NAND2-NOR2-INV, (c) NAND2-INV-NAND2, and so on. The simulations are performed using the industrial 65nm technology, and the experimental results are provided in Table 12. The sizings of the transistors are determined so as to maximally match the delays between the 4-input NAND and its possible transformations (a), (b), and (c) for the four SIS-TNC, (R, S1, S1, S1), (S1, R, S1, S1), (S1, S1, R, S1), and (S1, S1, S1, R). In Table 12, the numbers in the row entitled (R, S1, S1, S1) are almost identical, where the slight difference between the values is due to several second-order effects of transistor sizings on delays that we ignored when we determined the sizings using first-order models. Moreover, Table 12 clearly shows that the four implementations of the 4-input NAND function cannot be matched for all four SIS-TNC at once. For the transformation using De Morgan’s law, (a), the maximum difference is caused when (R, S1, S1, S1) and (S1, S1, S1, R) are used, which conforms to our findings in Section 5.5.2.1.

Table 12: Sub-circuit analysis on three possible transformations (targets) for a 4-input NAND, using the 65nm industrial technology (simulation results)

Input vectors      NAND4      (a) INV-NOR4-INV   (b) NAND2-NOR2-INV   (c) NAND2-INV-NAND2
(R, S1, S1, S1)    25.87ps    26.87ps            24.01ps              24.87ps
(S1, R, S1, S1)    34.23ps    43.21ps            30.23ps              30.21ps
(S1, S1, R, S1)    39.12ps    55.42ps            31.42ps              34.23ps
(S1, S1, S1, R)    42.35ps    60.21ps            38.43ps              36.12ps

5.6 Experimental Results

In the previous sections, we demonstrated the effectiveness of delay measurements for identifying maximally-matched Trojans which use one or more of the three common circuit transformation methods, including transformations using De Morgan’s law, gate decomposition, and algebraic substitution.
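As a quick check of this claim against the numbers in Table 12, the sketch below recomputes the per-vector delay mismatch between the original NAND4 and each of the three transformed sub-circuits and compares the largest mismatch with the 1ps detection threshold used in the experiments of this section. The delay values are copied from Table 12; the variable and function names are hypothetical.

```python
# Delays in ps, copied from Table 12 (one entry per SIS-TNC, in table row order).
VECTORS = ["(R,S1,S1,S1)", "(S1,R,S1,S1)", "(S1,S1,R,S1)", "(S1,S1,S1,R)"]
NAND4 = [25.87, 34.23, 39.12, 42.35]
TARGETS = {
    "(a) INV-NOR4-INV": [26.87, 43.21, 55.42, 60.21],
    "(b) NAND2-NOR2-INV": [24.01, 30.23, 31.42, 38.43],
    "(c) NAND2-INV-NAND2": [24.87, 30.21, 34.23, 36.12],
}
DELTA_M_PS = 1.0  # detection threshold used in the experiments


def max_mismatch(reference, target):
    """Return (vector, |delay difference| in ps) for the vector with the largest mismatch."""
    diffs = [round(abs(r - t), 2) for r, t in zip(reference, target)]
    worst = max(range(len(diffs)), key=diffs.__getitem__)
    return VECTORS[worst], diffs[worst]


for name, delays in TARGETS.items():
    vector, diff = max_mismatch(NAND4, delays)
    print(f"{name}: max mismatch {diff:.2f} ps at {vector}, "
          f"above threshold: {diff > DELTA_M_PS}")
```

Each of the three transformed sub-circuits shows a worst-case mismatch of several picoseconds for at least one SIS-TNC, even though the sizings were matched for (R, S1, S1, S1).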
First, we showed that each of these transformations always causes a mismatch in the delays when specific vectors are applied to a target gate. Second, we proved that it is always possible to detect every single or multiple instance of the above transformations in any non-reconvergent circuit, since for any non-reconvergent circuit we are able to apply any vector to any gate and propagate its delay to the circuit’s output. However, this may not be true for general circuits, including circuits with reconvergent paths, and hence we may fail to apply a desired vector to a target, to propagate the delay it invokes at a target gate to the circuit’s output, or both. Due to this limitation in generating vectors, the coverage of targets using the three transformation methods may not always be 100% in circuits with reconvergent paths.

In this section, we evaluate our approach using selected ISCAS’85 and ISCAS’89 benchmark circuits, all of which contain reconvergent paths. First, we assume that a single instance of each transformation is applied to an arbitrary multi-input gate in the circuit. Thus, for every multi-input gate in the circuit, we assume two target scenarios: one is the transformation using De Morgan’s law, and the other is decomposition of the gate into two symmetric gates, e.g., a 4-input NAND is decomposed into two 2-input NAND gates and the following logic, since this gives the minimum difference in fanins between the original and decomposed gates and is the most difficult to detect. After each single instance of transformation, the transistors in the transformed part of the circuit are resized to match the delays. To generate vectors that apply desired vectors to a target gate and propagate them to the circuit’s output, we adopt the conditions for generating vectors from Section 3.5, which guarantee application of a transition at a target line (gate) and propagation of the transition to the circuit’s output.
Specifically, we generate vectors that apply the desired vectors, SIS-TNC, to all multi-input gates in the circuit and also satisfy the conditions in Section 3.5. In Table 13, we introduce the term ‘SIS-TNC coverage’, which refers to the percentage of multi-input gates that can be tested with more than one SIS-TNC. Among the generated SIS-TNC for each multi-input gate in the circuit, we choose the vectors that maximally invoke the difference, and count the number of multi-input gates whose delay difference is greater than the detection threshold, $\Delta_m$. Here, the delay difference is obtained using the sub-circuit analysis and the 65nm industrial technology. In our experiments, we used 1ps for the value of $\Delta_m$, where 1ps is the resolution of measurements claimed to be available in [68]. Finally, we compute the number of chips required for a given difference in delays by analyzing the corresponding sub-circuit.

Table 13: SIS-TNC coverage, and the average number of chips to be tested to detect each one of the target scenarios applied to each multi-input gate, in selected ISCAS’85 and 89 benchmark circuits.

Benchmark circuit   Number of multi-input gates   SIS-TNC coverage (%)   Average number of chips to be tested
s420                122                           53.28                  4268.5
s510                179                           65.36                  4422.5
s1238               428                           46.50                  3176.3
s5378               1004                          51.29                  3008.1
s9234               2027                          22.59                  1729.4
c880                294                           67.35                  4755.8
c2670               676                           32.25                  2469.0
c5315               1419                          32.77                  2481.8

The results in Table 13 show that in most benchmark circuits, the SIS-TNC coverage for detecting maximally-matched Trojans is relatively low compared to the coverage of minimally delay-invasive Trojans, for the following two reasons. First, we generated SIS-TNC vectors with the additional timing condition that we discussed in Section 3.5. Also, we defined the term ‘SIS-TNC coverage’ as the percentage of multi-input gates which can be tested by more than one SIS-TNC, where generating SIS-TNC that satisfy the above timing conditions for side-inputs is not guaranteed.
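The coverage metric defined above can be computed as in the following sketch. The per-gate vector counts and delay differences in the example are hypothetical illustration values, not data from the benchmark experiments, and the function names are likewise assumptions.

```python
def sis_tnc_coverage(vectors_per_gate):
    """Percentage of multi-input gates for which more than one SIS-TNC was generated."""
    if not vectors_per_gate:
        return 0.0
    covered = sum(1 for count in vectors_per_gate.values() if count > 1)
    return 100.0 * covered / len(vectors_per_gate)


def detectable_gates(max_delay_diff_ps, delta_m_ps=1.0):
    """Gates whose largest delay difference exceeds the detection threshold."""
    return sorted(g for g, d in max_delay_diff_ps.items() if d > delta_m_ps)


# Hypothetical example: number of SIS-TNC found per multi-input gate that also
# satisfy the propagation and timing conditions, and the resulting delay differences.
counts = {"g1": 4, "g2": 1, "g3": 0, "g4": 2}
diffs_ps = {"g1": 17.86, "g4": 0.6}

print(sis_tnc_coverage(counts))    # g1 and g4 have more than one vector
print(detectable_gates(diffs_ps))  # only g1 exceeds the 1 ps threshold
```

A gate with exactly one usable SIS-TNC does not count toward coverage under this definition, because at least two vectors are needed to expose a difference between delays.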
We require more than one vector that both satisfies this additional constraint on timing at the side-inputs of the path and is an SIS-TNC, and it is not always possible to generate such vectors for a target multi-input gate. Moreover, larger circuits tend to have lower SIS-TNC coverage due to the complexity of the circuit and the larger number of reconvergent paths. Since the timing conditions for generating vectors are tight and the existence of reconvergent paths impedes us from achieving high detection coverage, future extensions of our approach may consider applying different vectors which are easier to find than SIS-TNC, and also improving the testability of the circuit by adding special elements to it (more discussion can be found in Section 5.7). On the other hand, for the given value of $\Delta_m$, the average number of chips to be tested to detect each target scenario for every multi-input gate is shown in Table 13.

5.7 Extensions of the Proposed Approach

We have presented the theoretical foundations and approaches for detecting maximally-matched Trojans by applying vectors and measuring delays, for any non-reconvergent circuit. While we showed empirical results for detecting some circuit transformations applied to an arbitrary multi-input gate in general circuits, i.e., circuits with reconvergent paths, our ultimate goal is to develop a comprehensive list of targets, not only a few sets of targets using some circuit transformation methods, together with theoretical foundations which will enable us to generate vectors to detect an arbitrary target via measurements in an arbitrary circuit.

The main challenges in extending our approach are: (1) how to identify a comprehensive list of targets that is guaranteed to cover every possible circuit-level transformation, and (2) how to generate vectors that can (a) apply desired vectors to an arbitrary target in the circuit, and (b) propagate the resulting delay to any observable point of the circuit.
Based on the above challenges, we propose the following research tasks.

First, we will enumerate a rich set of instances and variants for the canonical models. We do so, for example, for the maximally-matched case by enumerating the different types of transformations that may be performed, the different locations at which they may be performed, and the various ways in which they may be performed. In terms of locations, we enumerate every gate/line where each can be applied; and so on. In this manner, we can expand the minimally-invasive and maximally-matched sets into a large number of specific transformations applied in specific ways at specific circuit lines.

Second, we will further develop the conditions for detection of each of the above instances of transformations. In this manner, we will derive a set of targets as the set of all the above conditions and their locations in the original circuit, and vectors for detecting each one of them.

Last, we will design and add special elements to the circuit, so that every target derived in the previous task can be tested by generating vectors. For example, we will consider adding circuit components that can break every reconvergent path, so that we can apply vectors and detect Trojans based on our theoretical foundations on detection of Trojans in any non-reconvergent circuit.

CHAPTER 6. FUTURE RESEARCH

We propose two future research tasks to extend our Trojan detection framework. First, we propose an approach to reduce the test cost of delay measurements for detecting Trojans even further, by reducing and estimating the effects of measurement noise on delays. Among the major sources of measurement noise, temperature affects the delays of the gates and interconnects. The temperature of the circuit increases due to the heat generated by devices, which is caused by the switching activities of the gates, and due to Joule-heating of the interconnects [65].
Second, fluctuations of supply voltage, including IR-drop and ground bounce during scan-in and application of test vectors, affect the amount of power/ground currents flowing through the devices and hence change the delays. The problem is how to reduce the effects of measurement noise on delays, and how to estimate the values of the delays in the presence of process variations and measurement noise at a given confidence level. Of the two major sources of measurement noise, temperature changes slowly due to the slow conduction of heat between gates [12][29][49][58][72][84]. To stabilize the circuit’s temperature, we propose to plan test application such that we apply vectors to detect defects first, so that the temperature of the circuit reaches a stable state before we apply vectors for delay measurements for Trojan detection. In addition, we can mitigate the effects of measurement noise by making multiple measurements on each chip, which will reduce the number of chips we would need to test compared to our original approach, which makes only one measurement per chip (Chapter 3). After making multiple measurements, we propose to use Kalman Filtering (KF), which recursively updates estimates by computing a weighted average of the measured values [90]. In our future work, we will develop approaches to estimate the number of measurements as well as the number of chips required for detecting a particular Trojan at a given level of confidence, for given levels of process variations and measurement noise.

In addition, we propose an approach for adding specifically-designed features to the layout of the original circuit, namely active fillers (for eliminating empty spaces within each logic block) and boundary cells (for restricting/estimating any change in the area of the logic block by completely surrounding the corresponding logic block), to prevent insertion of any Trojan at the layout level and to detect even a minimal alteration caused by such Trojans.
First, active fillers will limit the opportunity for a Trojan designer to resize gates by removing the empty space around every gate in the original circuit. Furthermore, measurements on the boundary cells, which will be interconnected to form a ring, will enable us to determine the existence of a Trojan by detecting any increase in the area of the module due to insertion of a Trojan or gate resizing. In our future research, we will develop an architecture for active fillers and boundary cells, and a method to insert these into the layout, while considering the following factors.

1) Controllability and sensitivity in measurements: Active fillers and boundary cells should be designed to be easily controlled and tested, and should be sensitive to changes in the area of the original circuit.
2) The effects of active fillers on the logic behavior, performance, and area of the original circuit: Neither active fillers nor boundary cells should affect the logic behavior of the original design. In addition, the performance of the original design should not be affected significantly by filler insertion.
3) Design-for-Manufacturability (DFM) issues: Active fillers and boundary cells should be designed and inserted in a manner that does not reduce the yield of the original circuit.

References

[1] M. Abramovici, “In-System Silicon Validation and Debug,” IEEE Design & Test of Computers, Vol. 25, Issue. 3, pp. 216-223, May-June. 2008.
[2] M. Abramovici and P. Bradley, “Integrated Circuit Security: New Threats and Solutions,” Cyber Security and Information Intelligence Research Workshop (CSIIRW), 2009, pp. 1362-1365.
[3] M. H. Abu-Rahma and M. Anis, “A Statistical Design-Oriented Delay Variation Model Accounting for Within-Die Variations,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 27, Issue. 11, pp. 1983-1995, Nov. 2008.
[4] D. Agrawal, S. Baktir, D. Karakoyunlu, P. Rohatgi, and B.
Sunar, “Trojan Detection using IC Fingerprinting,” IEEE Symposium on Security and Privacy (SP), 2007, pp. 296-310.
[5] M. Banga and M. S. Hsiao, “A Region based Approach for the Identification of Hardware Trojans,” Hardware-Oriented Security and Trust (HOST), 2008, pp. 40-47.
[6] M. Banga and M. S. Hsiao, “VITAMIN: Voltage Inversion Technique to Ascertain Malicious Insertions in ICs,” Hardware-Oriented Security and Trust (HOST), 2009, pp. 104-107.
[7] A. Baumgarten, M. Steffen, M. Clausman, and J. Zambreno, “A case study in hardware Trojan design and implementation,” International Journal of Information Security, Vol. 10, Issue. 1, pp. 1-14, Feb. 2010.
[8] M. Beaumont, B. Hopkins, and T. Newby, “Hardware Trojans - prevention, detection, countermeasures (a literature review),” Technical report, DTIC Document, 2011.
[9] G. T. Becker, F. Regazzoni, C. Paar, and W. P. Burleson, “Stealthy Dopant-Level Hardware Trojans,” Cryptographic Hardware and Embedded Systems (CHES), 2013, pp. 197-214.
[10] K. Bernstein et al., “High-Performance CMOS variability in the 65-nm regime and beyond,” IBM Journal of Research and Development, Vol. 50, Issue 4.5, pp. 433-449, 2006.
[11] M. Bhushan, A. Gattiker, M. B. Ketchen, and K. K. Das, “Ring oscillators for CMOS process tuning and variability control,” IEEE Transactions on Semiconductor Manufacturing, Vol. 19, Issue. 1, pp. 10-18, Feb. 2006.
[12] S. A. Bota, J. L. Rossello, C. D. Benito, A. Keshavarzi, and J. Segura, “Impact of thermal gradients on clock skew and testing,” IEEE Design & Test of Computers, Vol. 23, Issue. 5, pp. 414-424, May. 2006.
[13] K. A. Bowman, S. G. Duvall, and J. D. Meindl, “Impact of die-to-die and within-die parameter fluctuations on the maximum clock frequency distribution for gigascale integration,” IEEE Journal on Solid-State Circuits, Vol. 37, Issue. 2, pp. 183-190, Feb. 2002.
[14] Y.
Cao, et al., “Design Sensitivities to Variability: Extrapolations and assessments in nanometer VLSI,” IEEE International Conference on ASIC/SOC, 2002, pp. 411-415.
[15] B. Cha and S. K. Gupta, “A Resizing Method to Minimize Effects of Hardware Trojans,” Asian Test Symposium (ATS), 2014, pp. 192-199.
[16] B. Cha and S. K. Gupta, “Efficient Trojan detection via calibration of process variations,” Asian Test Symposium (ATS), 2012, pp. 355-361.
[17] B. Cha and S. K. Gupta, “Trojan detection via delay measurements: A new approach to select paths and vectors to maximize effectiveness and minimize cost,” Design, Automation and Test in Europe (DATE), 2013, pp. 1265-1270.
[18] H. Chang and S. Sapatnekar, “Statistical timing analysis under spatial correlations,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 24, Issue. 9, pp. 1467-1482, Sep. 2005.
[19] K-T. Cheng and H-C. Chen, “Classification and identification of nonrobust untestable path delay faults,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 15, Issue. 8, pp. 845-853, Aug. 1996.
[20] S. Das, et al., “RazorII: In Situ Error Detection and Correction for PVT and SER Tolerance,” IEEE Journal of Solid-State Circuits, Vol. 44, Issue. 1, pp. 32-48, Jan. 2009.
[21] P. Das and S.K. Gupta, “On Generating Vectors for Accurate Post-Silicon Delay Characterization,” Asian Test Symposium (ATS), 2011, pp. 251-260.
[22] “Defense Science Board Task Force Report on High Performance Microchip Supply,” Defense Science Board, US DoD. Available at http://www.cra.org/govaffairs/images/2005-02-HPMS_Report_Final.pdf.
[23] S. Deyati, B. J. Muldrey, A. Singh, and A. Chatterjee, “High Resolution Pulse Propagation Driven Trojan Detection in Digital Logic: Optimization Algorithms and Infrastructure,” Asian Test Symposium (ATS), 2014, pp. 200-205.
[24] C. Dunbar and G.
Qu, “Designing Trusted Embedded Systems from Finite State Machines,” ACM Transactions on Embedded Computing Systems (TECS), Vol. 13, Issue. 5s, Nov. 2014.
[25] “Embedded Systems Challenge (ESC): Cyber Security Awareness Week,” https://isis.poly.edu/esc/
[26] D. Ernst, et al., “Razor: circuit-level correction of timing errors for low power operation,” IEEE Micro, Vol. 24, Issue. 6, pp. 10-20, Nov. 2004.
[27] J. Fishburn and A. Dunlop, “TILOS: a posynomial programming approach to transistor sizing,” International Conference on Computer-Aided Design (ICCAD), 1985, pp. 326-328.
[28] D. Forte, C. Bao, and A. Srivastava, “Temperature Tracking: An Innovative Run-Time Approach for Hardware Trojan Detection,” International Conference on Computer-Aided Design (ICCAD), 2013, pp. 532-539.
[29] Z. Hassan, N. Allec, F. Yang, L. Shang, R. P. Dick, and X. Zeng, “Full-Spectrum Spatial–Temporal Dynamic Thermal Analysis for Nanometer-Scale Integrated Circuits,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 19, Issue. 12, pp. 2276-2289, Dec. 2011.
[30] R. V. Hogg and E. A. Tanis, Probability and Statistical Inference, Pearson Education, 2008, pp. 407-463.
[31] K. Hu, A. N. Nowroz, S. Reda, and F. Koushanfar, “High-Sensitivity Hardware Trojan Detection Using Multimodal Characterization,” Design, Automation and Test in Europe (DATE), 2013, pp. 1271-1276.
[32] K. Huang, J. M. Carulli, and Y. Makris, “Parametric Counterfeit IC Detection via Support Vector Machines,” Defect and Fault Tolerance of VLSI Systems (DFT), 2012, pp. 7-12.
[33] N. Jha and S. K. Gupta, Testing of Digital Systems, Cambridge, U.K.: Cambridge Univ. Press, 2003.
[34] S. Jha and S. K. Jha, “Randomization Based Probabilistic Approach to Detect Trojan Circuits,” High Assurance System Engineering Symposium (HASE), 2008, pp. 117-124.
[35] Y. Jin, N. Kupp and Y. Makris, “Experiences in Hardware Trojan Design and Implementation,” Hardware-Oriented Security and Trust (HOST), 2009, pp. 50-57.
[36] Y. Jin and Y. Makris, “Hardware Trojan detection using path delay fingerprint,” Hardware-Oriented Security and Trust (HOST), 2008, pp. 51-57.
[37] M. Kaneko and J. Li, “Post-silicon skew tuning algorithm utilizing setup and hold timing tests,” IEEE International Symposium on Circuits and Systems (ISCAS), 2012, pp. 125-128.
[38] R. Karri, J. Rajendran, K. Rosenfeld, and M. Tehranipoor, “Trustworthy Hardware: Identifying and Classifying Hardware Trojans,” IEEE Computer Magazine, Vol. 43, Issue 10, pp. 39-46, Oct. 2010.
[39] F. Kashfi, S. M. Fakhraie and S. Safari, “A 65nm 10GHz pipelined MAC structure,” IEEE International Symposium on Circuits and Systems (ISCAS), 2008, pp. 460-463.
[40] G. Keizer, “Update: Maxtor drives contain password-stealing Trojans,” Computerworld, November 12, 2007. Available at http://www.computerworld.com/s/article/9046424/.
[41] F. Koushanfar and A. Mirhoseini, “A Unified Framework for Multimodal Submodular Integrated Circuit Trojan Detection,” IEEE Transactions on Information Forensics and Security, Vol. 6, Issue. 1, pp. 17-32, Mar. 2011.
[42] C. Lamech and J. Plusquellic, “Trojan Detection based on Delay Variations Measured using a High-Precision, Low-Overhead Embedded Test Structure,” Hardware-Oriented Security and Trust (HOST), pp. 72-75, 2012.
[43] Y. Levendel and P.R. Menon, “Transition Faults in Combinational Circuits: Input Transition Test Generation and Fault Simulation,” International Symposium On Fault-Tolerant Computing, 1986, pp. 278-283.
[44] J. Li and J. Lach, “At-speed delay characterization for IC authentication and Trojan Horse detection,” Hardware-Oriented Security and Trust (HOST), 2008, pp. 8-14.
[45] J. Li and J. Lach, “Negative-Skewed Shadow Registers for At-Speed Delay Variation Characterization,” International Conference on Computer Design (ICCD), 2007, pp. 354-359.
[46] M. Li, A. Davoodi, and M.
Tehranipoor, “A Sensor-Assisted Self-Authentication Framework for Hardware Trojan Detection,” Design, Automation and Test in Europe (DATE), 2012, pp. 1331-1336.
[47] C. J. Lin and S. M. Reddy, “On delay fault testing in logic circuits,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 6, Issue. 5, pp. 694-703, Sep. 1987.
[48] F. Liu, “A General Framework for Spatial Correlation Modeling in VLSI Design,” Design Automation Conference (DAC), 2007, pp. 817-822.
[49] W. Liu, A. Calimera, A. Nannarelli, E. Macii, and M. Poncino, “On-chip thermal modeling based on SPICE simulation,” Integrated Circuit and System Design. Power and Timing Modeling, Optimization and Simulation (PATMOS), 2010, pp. 66-75.
[50] Y. Liu, K. Huang, and Y. Makris, “Hardware Trojan Detection through Golden Chip-Free Statistical Side-Channel Fingerprinting,” Design Automation Conference (DAC), 2014, pp. 1-6.
[51] J. Markoff, “Old Trick Threatens the Newest Weapons,” The New York Times, October 27, 2009. Available at http://www.nytimes.com/2009/10/27/science/27trojan.html?pagewanted=all&_r=0.
[52] G. S. May and C. J. Spanos, Fundamentals of Semiconductor Manufacturing and Process Control, Wiley-IEEE Press, 2006.
[53] “Modeling a New Threat: Embedded Malware,” KUITY, 2010, http://kuity.com/KUITY_ia%20whitepaper%202010%20Final.pdf.
[54] The Nangate Open Cell Library, Available at http://www.si2.org/openeda.si2.org/projects/nangatelib
[55] S. R. Nassif, “Modeling and Analysis of Manufacturing Variations,” IEEE conference on Custom Integrated Circuits, pp. 223-228, 2001.
[56] A. N. Nowroz, K. Hu, F. Koushanfar, and S. Reda, “Novel Techniques for High-Sensitivity Hardware Trojan Detection using Thermal and Power Maps,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 33, No. 12, pp. 1792-1805, Dec. 2014.
[57] S. Pei, H. Li, and X.
Li, "A high-precision on-chip path delay measurement architecture," IEEE Transaction on Very Large Scale Integration (VLSI) Systems, Vol. 20, Issue. 9, pp. 1565-1577, Sept. 2012. [58] E. Pop, S. Sinha, and K. E. Goodson, "Heat generation and transport in nanometer-scale transistors," Proceedings of the IEEE, Vol. 94, Issue. 8, pp. 1587-1601, Aug. 2006. [59] M. Potkonjak, A. Nahapetian, M. Nelson, and T. Massey, “Hardware Trojan Horse Detection Using Gate-Level Characterization,” Design Automation Conference. (DAC), 2009, pp. 688-693. [60] A. K. Pramanick and S. M. Reddy. "On the detection of delay faults," International Test Conference (ITC), 1998, pp. 845-856. [61] R. Rad, J. Plusquellic, and M. Tehranipoor, "A Sensitivity Analysis of Power Signal Methods for Detecting Hardware Trojans under Real Process and Environmental Conditions," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 18, Issue. 12, pp. 1735-1744, Dec. 2010. [62] D. Rai and J. Lach, "Performance of delay-based Trojan detection techniques under parameter variations," Hardware-Oriented Security and Trust (HOST), 2009, pp. 58-65. [63] J. Rajendran, E. Gavas, J. Jimenez, V. Padman, and R. Karri, “Towards a comprehensive and systematic classification of hardware Trojans,” IEEE International Symposium on Circuits and Systems (ISCAS), 2010, pp. 1871-1874. [64] S. M. Reddy, C. J. Lin, and S. Patil. "An automatic test pattern generator for the detection of path delay faults," International Conference on Computer-Aided Design (ICCAD), 1987, pp. 284-287. [65] J. L. Rosselló, V. Canals, S. A. Bota, A. Keshavarzi, and J. Segura, "A fast concurrent power-thermal model for sub-100 nm digital ICs," Design, Automation and Test in Europe (DATE), 2005, pp. 206-211. [66] M. Rostami, F. Koushanfar, J. Rajendran, and R. Karri, "Hardware Security: Threat Models and Metrics", International Conference on Computer-Aided Deisgn (ICCAD), 2013, pp. 819-823. [67] M. R. Rudra, N. A. Daniel, and D. H. K. 
Hoe, “Designing Stealthy Trojans with Sequential Logic: A Stream Cipher Case Study,” Design Automation Conference (DAC), 2014, pp. 1-4. [68] M. Saint-Laurent and M. Swaminathan, “A digitally adjustable resistor for path delay characterization in high frequency microprocessors,” Southwest Symposium on Mixed-Signal Design, 2001, pp. 61-64. [69] A. Saldanha, R. K. Brayton, and A. L. Sangiovanni-Vincentelli, "Equivalence of robust delay-fault and single stuck-fault test generation," Design Automation Conference (DAC), 1992, pp.173 -176. 158 [70] H. Salmani, M. Tehranipoor, and J. Plusquellic, “A Novel Technique for Improving Hardware Trojan Detection and Reducing Trojan Activation Time,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 20, Issue. 1, pp. 112-125, Jan. 2011. [71] K. Senwen and J. Dworak, “Triggering Trojans in SRAM Circuits with X-Propagation,” Defect and Fault Tolerance of VLSI Systems (DFT), 2014, pp. 1-8. [72] K. Skadron, M. R. Stan, W. Huang, S. Velusamy, K. Sankaranarayanan, and D. Tarjan, "Temperature-aware microarchitecture," ACM SIGARCH Computer Architecture News, Vol. 31, Issue. 2, pp. 2-13, May. 2003. [73] S. Skorobogatov and C. Woods, “Breakthrough Silicon Scanning Discovers Backdoor in Military Chip,” Cryptographic Hardware and Embedded Systems (CHES), 2012, pp. 23-40. [74] G. L. Smith, "Model for Delay Faults Based upon Paths," International Test Conference (ITC). 1985, pp. 342-351. [75] C. Sturton, M. Hicks, D. Wagner, and S. T. King, “Defeating UCI: Building stealthy and malicious hardware,” IEEE Symposium on Security and Privacy (SP), 2011, pp. 64-77. [76] D. Sullivan, J. Biggers, G. Zhu, S. Zhang, and Y. Jin, “FIGHT-Metric: Functional Identification of Gate-Level Hardware Trustworthiness,” Design Automation Conference. (DAC), 2014, pp. 1-4. [77] M. Tehranipoor and F. Koushanfar, "A Survey of Hardware Trojan Taxonomy and Detection," IEEE Design & Test of Computers, Vol. 27, Issue. 1, pp. 15-20, Feb. 2010. [78] M. 
Tehranipoor, H. Salmani, X. Zhang, X. Wang, R. Karri, J. Rajendran, and K. Rosenfeld, "Trustworthy Hardware: Trojan Detection Solutions and Design-for-Trust Challenges," IEEE Computer Magazine, Vol. 44, Issue 7, pp. 66-74, Jul. 2011. [79] R. Torrance and D. James, “The State-of-the-Art in IC Reverse Engineering,” Cryptographic Hardware and Embedded Systems (CHES), 2009, pp 363–381. [80] “Trusted Access Program Office”, National Security Agency. Available at http://www.nsa.gov/business/programs/tapo.shtml. [81] M-C. Tsai, C-H. Cheng, and C-M Yang, "An all-digital high-precision built-in delay time measurement circuit," VLSI Test Symposium (VTS), 2008, pp. 249-254. [82] N. G. Tsoutsos, C. Konstantinou, and M. Maniatakos, “Advanced Techniques for Designing Stealthy Hardware Trojans,” Design Automation Conference (DAC), 2014, pp. 1- 4. [83] J. A. Waicukauski et al., “Transition Fault Simulation by Parallel Pattern Single Fault Propagation,” International Test Conference (ITC), 1986, pp.542-549. [84] T-Y. Wang and CC-P. Chen, "SPICE-compatible thermal simulation with lumped circuit modeling for thermal reliability analysis based on modeling order reduction," International Symposium on Quality Electronic Design (ISQED), 2004, pp. 357-362. 159 [85] X. Wang, H. Salmani, M. Tehranipoor, and J. Plusquellic, ‘‘Hardware Trojan Detection and Isolation Using Current Integration and Localized Current Analysis,” Defect and Fault Tolerance of VLSI Systems (DFT), 2008, pp. 87-95. [86] X. Wang, M. Tehranipoor, and J. Plusquellic, “Detecting Malicious Inclusions in Secure Hareware: Challenges and Solutions,” Hardware-Oriented Security and Trust (HOST), 2008, pp. 15-19. [87] S. Wei, K. Li, F. Koushanfar, and M. Potkonjak, "Hardware Trojan Horse Benchmark via Optimal Creation and Placement of Malicious Circuitry", Design Automation Conference (DAC), 2012, pp. 90-95. [88] S. Wei, K. Li, F. Koushanfar, and M. 
Potkonjak, "Provably Complete Hardware Trojan Detection Using Test Point Insertion", International Conference on Computer-Aided Design (ICCAD), 2012, pp. 569-576. [89] S. Wei and M. Potkonjak, “Scalable Hardware Trojan Diagnosis,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 22, Issue. 6, pp. 1049-1057, Jun. 2012. [90] G. Welch and G. Bishop, "An introduction to the Kalman filter," 1995. [91] N. Weste and D. Harris, “CMOS VLSI Design: A Circuits and Systems Perspective,” Addison-Wesley Publishing Company, 2010, pp. 155-162. [92] L. William, “Mosfet models for spice simulations including BSIM3V3 and BSIM4,” New York : Wiley-Interscience Publication, 2001. [93] F. Wolff, C. Papachristou, S. Bhunia, and R. Chakraborty, “Towards Trojan-free trusted ICs: Problem analysis and detection scheme,” Design, Automation and Test in Europe (DATE), 2008, pp. 1362-1365. [94] K. Xiao, X. Zhang, and M. Tehranipoor, “A Clock Sweeping Technique for Detecting Hardware Trojans Impacting Circuits Delay,” IEEE Design & Test of Computers, Vol. 30, Issue. 2, pp. 26-34, Apr. 2013. [95] J. Zhang, B. Cha and S. K. Gupta, “Reduced-Complexity Trojan Detection Method via Delay Measurements,” IEEE workshop on Silicon Debug and Diagnosis (SDD), 2012. [96] J. Zhang and Q. Xu, “On Hardware Trojan Design and Implementation at Register- Transfer Level,” Hardware-Oriented Security and Trust (HOST), 2013, pp. 107-112. [97] X. Zhang and M. Tehranipoor, "RON: An On-chip Ring Oscillator Network for Hardware Trojan Detection," Design, Automation and Test in Europe (DATE), 2011, pp.1530-1591. [98] B. Zhou, W. Zhang, S. Thambipillai, and J. K. J. Teo, "A low cost acceleration method for hardware trojan detection based on fan-out cone analysis,” International Conference on Hardware/Software Codesign and System Synthesis (CODES), 2014, pp. 1-10. 
[99] http://www.cadence.com/products/cic/spectre_circuit [100] http://www.cadence.com/products/di/edi_system/pages/default.aspx 160 [101] http://www.darpa.mil/Our_Work/MTO/Programs/Trusted_Integrated_Circuits_(TRUST). aspx. [102] https://www.mosis.com/cgi-bin/cgiwrap/umosis/swp/params/ibm-90/v02d_9lprf_9m_lb- params.txt [103] https://www.mosis.com/vendors/view/tsmc/018 [104] http://www.trust-hub.org
Abstract
High cost differentials are causing many aspects of integrated circuit (IC) development, including IC design and fabrication, high-volume testing, and IC packaging, to move overseas. Consequently, it is increasingly common for a new IC's original designers to lose direct control of many design and fabrication steps. Designers therefore face the threat of hardware tampering during the manufacturing process, known as hardware Trojan insertion. Such Trojans are expected to be designed and inserted by an intelligent and resourceful adversary seeking unauthorized access to information or unauthorized control. Developing techniques to detect hardware Trojans is therefore becoming more important for ensuring the trustworthiness of digital ICs, especially when they are fabricated by untrusted vendors.
Detection of hardware Trojans is challenging because their specifics are unknown and difficult to predict. Furthermore, the most challenging types of Trojans do not change the logic behavior of the original circuit and cause only minimal deviation in the circuit's parameters, while the levels of process variations are high and continue to increase. Moreover, Trojan designers are expected to be well-financed and familiar with state-of-the-art detection techniques, and to keep improving their techniques to make their Trojans more sophisticated and undetectable.
In this context, this dissertation introduces a new framework for evaluating the trustworthiness of digital ICs, especially when they are fabricated by untrusted vendors, and addresses all of the above-mentioned challenges of Trojan detection.
First, our new framework comprehensively identifies and characterizes the changes caused by Trojans. Unlike traditional methods that enumerate many specific types of Trojans, we provide a systematic approach for developing a universal set of Trojans, based on our canonical models of deviation from the original design that span all possible circuit-level modifications. From the derived set of Trojans, we choose to study the conditions for Trojans that are maximally challenging for any parametric measurement method, namely minimally-invasive Trojans and maximally-matched Trojans, to make our detection approaches effective.
In addition, our approaches are designed to be maximally effective, in the sense that they maximally harness the information that can be gathered by applying selected vector sequences and measuring parameter values, and produce results with minimum cost at a given level of process variations, a given measurement noise, and a given level of confidence. In particular, we first evaluate the effectiveness of measuring delays for detecting Trojans, and show that even maximally challenging Trojans can be detected by delay measurements. We also propose methods to estimate the delay change caused by each model of Trojans when both the circuit and the Trojans are designed to give only minimal deviation in the circuit's delay. Furthermore, we develop approaches to reduce the cost of measurements for detecting Trojans, including methods for selecting paths, generating vectors, and calibrating the effects of process variations, which improve the effectiveness of delay measurements even further. Last, we develop techniques for adding specially-designed features to the design that will make it difficult (preferably, impossible) for untrusted vendors to insert Trojans.
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
A variation aware resilient framework for post-silicon delay validation of high performance circuits
Verification and testing of rapid single-flux-quantum (RSFQ) circuit for certifying logical correctness and performance
Variation-aware circuit and chip level power optimization in digital VLSI systems
Defect-tolerance framework for general purpose processors
Thermal analysis and multiobjective optimization for three dimensional integrated circuits
Advanced cell design and reconfigurable circuits for single flux quantum technology
Optimal defect-tolerant SRAM designs in terms of yield-per-area under constraints on soft-error resilience and performance
Development of electronic design automation tools for large-scale single flux quantum circuits
A logic partitioning framework and implementation optimizations for 3-dimensional integrated circuits
Graph machine learning for hardware security and security of graph machine learning: attacks and defenses
Timing-oriented approach for delay testing
Structural delay testing of latch-based high-speed circuits with time borrowing
Power efficient design of SRAM arrays and optimal design of signal and power distribution networks in VLSI circuits
Optimizing power delivery networks in VLSI platforms
High level design for yield via redundancy in low yield environments
Electronic design automation algorithms for physical design and optimization of single flux quantum logic circuits
Optimal redundancy design for CMOS and post-CMOS technologies
Understanding dynamics of cyber-physical systems: mathematical models, control algorithms and hardware incarnations
A joint framework of design, control, and applications of energy generation and energy storage systems
Energy efficient design and provisioning of hardware resources in modern computing systems
Asset Metadata
Creator
Cha, Byeongju (author)
Core Title
Trustworthiness of integrated circuits: a new testing framework for hardware Trojans
School
Viterbi School of Engineering
Degree
Doctor of Philosophy
Degree Program
Electrical Engineering
Publication Date
02/18/2015
Defense Date
01/22/2015
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
delay measurement, hardware Trojans, integrated circuits, OAI-PMH Harvest, parametric test, Security, testing
Format
application/pdf (imt)
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Gupta, Sandeep K. (committee chair), Nakano, Aiichiro (committee member), Pedram, Massoud (committee member)
Creator Email
bjcha74@gmail.com,byeongjc@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c3-534923
Unique identifier
UC11298343
Identifier
etd-ChaByeongj-3192.pdf (filename), usctheses-c3-534923 (legacy record id)
Legacy Identifier
etd-ChaByeongj-3192.pdf
Dmrecord
534923
Document Type
Dissertation
Rights
Cha, Byeongju
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA