THERMAL ANALYSIS AND MULTIOBJECTIVE
OPTIMIZATION FOR THREE DIMENSIONAL
INTEGRATED CIRCUITS
by
Fatemeh Kashfi
______________________________________________________________
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(ELECTRICAL ENGINEERING)
December 2013
Copyright 2013 Fatemeh Kashfi
To:
my beloved husband,
my parents, and my brothers.
Acknowledgement
The completion of this doctoral dissertation has been a long journey made
possible through the inspiration and support of a handful of people.
First and foremost, I especially would like to thank my Ph.D. advisor and
dissertation committee chair, Professor Jeff Draper, for his support, mentorship, and
enthusiasm throughout my PhD study and research. He showed me how to think, not
what to think. He will always be an inspiration throughout my personal and professional
life.
I would also like to extend my great appreciation to my dissertation and
qualification committees, including Prof. Sandeep Gupta, Prof. Alice Parker, Prof. Peter
Beerel, and Prof. Aiichiro Nakano for their time, guidance, and feedback. My great
respect and acknowledgment also go to Prof. Massoud Pedram for all of his support.
I also want to thank every teacher that I had during the 24 years of my education
from the first year of primary school to the last year of my graduate school. They are
always in my heart and thought. I have learned a lot from every one of them, and I am
who I am because of every single one of them.
I sincerely thank those who have in many ways contributed to the success of my
academic endeavors. They are the staff at the University of Southern California,
particularly Diane Demetras and Tim Boston, and the staff at the Information Sciences
Institute (ISI). Also my great appreciation goes to the members of Micro-Architecture
and Integrated (Marina) Circuit Group, especially Jeff Sondeen.
I wish to acknowledge Dr. Safar Hatami for his help and advice, and I offer my
regards and blessings to my friends who supported me in any respect during the
completion of this dissertation, especially Hanie Sedghi, Majid Janzamin, and Dr.
Mohammad Mirzaaghatabar.
My deepest gratitude goes to my parents, Dr. Tayebeh Tavassoli and Dr.
Abdolrasoul Kashfi, and my brothers, Ali and Hossein, for their unconditional love,
support, and devotion throughout my life and especially during my education.
Finally, yet importantly, I would like to mention that none of this could have happened
without the help, encouragement, and love of my husband, Dr. Hamed Abrishami. His
endless support during my PhD study played a substantial role in my success.
Table of Contents
Acknowledgement
List of Tables
List of Figures
Abstract
Chapter 1: Introduction
1.1 Introduction to 3DIC Technology
1.1.1 Why 3DIC?
1.1.2 Challenges Facing the 3DIC Technology
1.2 Our Research
1.2.1 Thermal Analysis of a 3DIC
1.2.2 Multiobjective Optimization of a 3DIC
1.3 Related Work
1.3.1 Thermal Analysis of a 3DIC
1.3.2 Multiobjective Optimization of a 3DIC
1.4 Research Contributions
1.5 The Dissertation Outline
Chapter 2: 3D Sensor Design
2.1 Study of Thermal Correlation Between layers in 3DICs
2.2 Application: Ring Oscillator Based 3D Thermal Sensor
2.3 Experimental results
Chapter 3: Sensor Allocation in 3DICs
3.1 3D Thermal Map Modeling for 3DICs
3.2 Proposed 3D Thermal Map Modeling for 3DICs
3.2.1 Effect of Distance from Heatsink on Each Layer’s Thermal Map
3.2.2 Modeling of Thermal Effects of Other Layers on a Specific Layer
3.2.3 Scaling Elements Calculation
3.2.4 3D thermal map modeling
3.3 Application: Thermal Sensor Distribution Algorithm for 3DICs
3.4 Experimental Results
Chapter 4: Multiobjective Optimization
4.1 Study of Multiobjective Optimization Techniques for VLSI Circuits
4.2 Power and Delay Modeling
4.2.1 Non-convex Modeling of Power and Delay
4.2.2 Convex Modeling of Power and Delay
4.3 Multiobjective Optimization Problem
4.3.1 Pareto Optimal Solution
4.3.2 Multiobjective Optimization Solution Methods
4.3.3 Proposed Approach
4.4 Experimental Results
4.4.1 Multiobjective Optimization of Power and Delay in a Circuit
4.4.2 Experimental Results
Chapter 5: Multiobjective Optimization in 3DICs
5.1 Conflicting Objectives
5.2 Problem Statement and Formulation
5.2.1 Variables of the MOP
5.2.2 Objectives of the MOP
5.3 Multiobjective Optimization Algorithm
5.3.1 Convexity of the Objectives
5.3.2 The Proposed Algorithm
5.4 Experimental Results
5.4.1 Simulation Setup
5.4.2 Single Objective Analysis
5.4.3 Effect of Number of Layers on Optimum Design of a 3DIC
5.4.4 Global Optimization and Comparison with Previous Works
Chapter 6: Summary
6.1 Conclusion
6.2 Main Contributions
6.3 Future Work
References
Alphabetized Bibliography
Appendix
List of Tables
Table 2.1. Thermal correlation between layers in a 4-layer 3DIC
Table 2.2. Number of sensors and maximum reading error
Table 3.1. Core processor configuration
Table 3.2. 3DIC material properties
Table 3.3. Power profile of the benchmarks
Table 3.4. Maximum 3D thermal modeling error and sensor reading error
Table 4.1. Simulation results for the adder circuit
Table 4.2. Simulation results for the Flip Flop circuit
Table 5.1. Conflicting objectives in a 3DIC
Table 5.2. 3DIC cost parameters
Table 5.3. Effect of number of layers on multiobjective optimization results
Table 5.4. Comparison with previous method
List of Figures
Figure 1.1. Flip-chip 3DIC structure
Figure 1.2. Bonding styles: (a) face-to-face (b) back-to-face (c) back-to-back
Figure 2.1. 3DIC packaging layers
Figure 2.2. Reflection of thermal map of a layer on other layers
Figure 2.3. Compact vs. sparse TSVs impact on the junction temperature
Figure 2.4. (a) planar hotspot (b) cross section of the spatial hotspot
Figure 2.5. Linearity evaluation of the ring oscillator’s output frequency
Figure 2.6. Output frequency vs. temperature consuming TSVs
Figure 2.7. (a) Power supply variation (b) Process variation
Figure 2.8. (a) TSV footprints (b) final 3D ring oscillator
Figure 2.9. Thermal sensor distribution in a 4-layer 3DIC
Figure 3.1. Thermal modeling of two layers in a 3DIC
Figure 3.2. Hotspot area around a heat source
Figure 3.3. Primary vs. secondary path of heat transfer in a 7-layer 3DIC
Figure 3.4. Vertical thermal scaling matrix elements calculation
Figure 3.5. 3D Thermal map modeling steps
Figure 3.6. Floorplan of a core processor
Figure 3.7. 3DIC stacked layers (Flip-chip configuration)
Figure 3.8. Histogram of 3D thermal map modeling error
Figure 3.9. Monitoring area of the sensors vs. number of sensors
Figure 3.10. Thermal sensor positions
Figure 4.1. Histogram of error in non-convex delay modeling
Figure 4.2. Pareto optimal set
Figure 4.3. Power and delay multiobjective optimization
Figure 4.4. 10-bit carry-lookahead adder
Figure 4.5. TSPC Flip Flop
Figure 4.6. Pareto set for non-convex modeling (a) WS method (b) CP method
Figure 4.7. Pareto set with analytical gradient (a) WS method (b) CP method
Figure 4.8. Pareto set for convex modeling (a) WS method (b) CP Method
Figure 4.9. (a) Non-convex power optimization (b) Convex power optimization
Figure 4.10. Degradation from the optimum values for the adder circuit
Figure 4.11. Degradation from the optimum values for Flip Flop circuit
Figure 5.1. Multiobjective optimization algorithm for 4-layer 3DIC design
Figure 5.2. Cost of 3DIC vs. area, number of layers and number of TSVs
Figure 5.3. Temperature of 3DIC vs. area, number of layers and number of TSVs
Figure 5.4. Optimum objectives (a) 16 building blocks. (b) 48 building blocks
Figure 5.5. Comparison of the optimization execution times
Abstract
Three Dimensional Integrated Circuit (3DIC) technology has been introduced to
address the interconnect issues in nanometer circuit design that limit performance
improvement and power reduction. However, stacking active layers of silicon leads to
increased power density and overall higher temperatures in a 3D chip implementation for
many designs. New techniques for thermal map modeling and for temperature measurement,
mitigation, and management should be introduced for this technology. In this dissertation
we study the thermal correlation between the stacked layers in 3DICs. We then propose a
fast and efficient 3D thermal map modeling based on scaled hotspot areas, depending on
the distance of a stacked layer from the heatsink and also thermal effects of the layers on
each other. The model is 53x faster than the existing compact thermal modeling
method. The efficiency of the proposed model is demonstrated through its use in a
thermal sensor distribution algorithm. We also show that the thermal sensor distribution
algorithm should be solved as a 3D problem. In this way for the same sensor reading
error the total number of needed sensors is reduced by 44%. We furthermore propose a
new 3D design for thermal sensor circuits to be shared between layers in a 3DIC. Using
3D thermal sensors that are shared between adjacent layers can reduce the total number
of needed sensors by half.
We also study different methods of multiobjective optimization to find the
optimum operating point of a VLSI circuit. We provide extensive mathematical analyses of
different multiobjective optimization techniques for this purpose. We also study the
difference between convex and non-convex modeling of the objectives in the multiobjective
optimization algorithms.
We apply our multiobjective optimization methods to optimize three conflicting
objectives of cost, performance and thermal reliability to find an optimum building block
placement in 3DICs. The variables for the optimization are the number of layers, area of
the 3DIC, position of the building blocks, number of TSVs and total wirelength. We used
our proposed fast 3D thermal map modeling to eliminate the thermal analysis bottleneck
in multiobjective optimization iterations. In comparison with a previous state-of-the-art
multiobjective optimization method which employs Simulated Annealing, a weighted
sum method of scalarization and compact modeling for thermal analysis, our method
reduces the peak temperature of a representative 3DIC by 4.3% and total wire length by
5.7% while it is more than 17x faster in optimization runtime. The execution runtime of
the proposed algorithm also scales linearly with problem size in contrast with the existing
heuristic method of Simulated Annealing, which scales poorly with problem size.
Chapter 1: Introduction
1.1 Introduction to 3DIC Technology
Three Dimensional Integrated Circuit (3DIC) technology is emerging to extend
Moore’s law along a vertical dimension. Device scaling is reaching a limitation because
of short-channel effects, quantum tunneling, increasing variability and power dissipation.
3DIC technology makes it possible to continue Moore’s law further not by scaling, but by
stacking active layers of transistors on top of each other [1].
Figure 1.1. Flip-chip 3DIC structure
In electronics a 3DIC is a chip in which two or more layers of active electronic
components are integrated vertically into a single system (Figure 1.1). The figure shows a
flip-chip in which the heat sink is located on top of the 3DIC and the connection to
external circuitry is provided by the solder balls located at the bottom of the chip. The
semiconductor industry is pursuing this promising technology in many different forms,
but it is not widely used yet (as of 2013).
There are different ways of manufacturing a 3DIC, including wafer-to-wafer, die-
to-wafer and die-to-die. In wafer-to-wafer technology, electronic components are built on
two or more semiconductor wafers, which are then aligned, bonded, and diced into 3DICs.
In die-to-wafer technology, electronic components are built on two semiconductor wafers.
One wafer is diced; the singulated dice are aligned and bonded onto die sites of the
second wafer. In die-to-die technology, electronic components are built on multiple dice,
which are then aligned and bonded. Through Silicon Vias (TSVs), which are very
high-performance vias that pass through the silicon wafer or die, are used for
communication between the stacked layers. Thinning of the stacked layers is necessary for
better communication and for reducing the vertical size of the chip.
Figure 1.2. Bonding styles: (a) face-to-face (b) back-to-face (c) back-to-back
In each manufacturing method thinning and TSV formation can be done before or
after the stacking.
Depending on which sides of the dice are bonded together, there exist three types
of bonding styles, namely, face-to-face, face-to-back, and back-to-back, in which the face
is defined as the metal side of the die. Face-to-face bonding does not utilize TSVs
because the connection between the dice is established using metal layers as shown in
Figure 1.2.
1.1.1 Why 3DIC?
3D stacking can potentially improve interconnect performance by exploiting a
third dimension to reduce the total wire length, and parasitic capacitance as a
consequence, which decreases both latency and power consumption. 3D integration
allows a large number of vertical vias between the layers. This allows construction of
low-latency wide-bandwidth buses between functional blocks in different layers. For
example, stacking memory and processor in a 3DIC is regarded as a promising solution
for addressing the memory wall problem¹ [2].
3D stacking also has the potential to reduce chip area. This technology makes it
possible to integrate dissimilar technologies in one 3DIC. Analog and digital circuits can
be placed in one 3DIC; also a 3DIC can be made of transistors of different technologies
(e.g., CMOS, SOI) and technology nodes (e.g., 65nm, 32nm) [3]. The vertical
dimension adds a higher order of connectivity and offers new design possibilities. It can
also reduce fabrication costs and improve yield by partitioning a large chip into smaller
dice that are later assembled through 3D stacking.
¹ The "memory wall" is the growing disparity of speed between CPU and memory outside the CPU chip. An important reason for
this disparity is the limited communication bandwidth beyond chip boundaries.
3DICs can also be designed in a way to be somewhat more secure than 2D ICs.
The stacked structure complicates attempts to reverse engineer the circuitry. Sensitive
circuits may also be divided among the layers in such a way as to obscure the function of
each layer.
1.1.2 Challenges Facing the 3DIC Technology
There are also some challenges for this technology to mature and be widely used.
Stacking and alignment of layers using TSVs (Through Silicon Vias) is a technology
challenge. TSVs are large compared to gates and impact the floorplan. Furthermore,
manufacturability demands landing pads and keep-out zones which further increase TSV
area footprints. Depending on the technology choices, TSVs block some subset of layout
resources [4].
Testing of the stacked layers, especially inside layers, and designing optimum
scan chains are problems to be solved in the test domain [5][6]. To achieve a reliable
3DIC, each circuit should be tested by itself and in operation with the rest of the 3DIC. A
circuit’s components can be located in different layers. Because of the tight integration
between the layers and significant amount of interconnects, use of conventional 2D
testing techniques is limited in 3DICs, and new methods should be developed.
Existing design and CAD tools need to be extended to support a third dimension
too [7]. New 3DIC standards should also be developed. Taking full advantage of 3D
integration also adds to the design complexity. Each extra manufacturing step adds a risk
of defect and may decrease the yield of a 3DIC.
Higher junction temperatures due to increased power density are a major
challenge that 3DIC designers are facing. Thermal issues get worse because of
limitations in using cooling channels between the layers, and because the layers which
are further from the heatsink may get overheated [8][9]. Heat energy is trapped inside the
3DIC, and leakage power thermal run-away exacerbates the problem [10].
A good understanding of the thermal behavior of 3DICs and providing fast and
efficient 3D thermal modeling is necessary for devising design and packaging techniques
and power saving methods to mitigate the thermal problems in 3DICs [11][12][13].
Thermal-aware partitioning, floorplanning, and placement techniques customized for
3DICs [14][15][16] are other fields that must be improved for an optimized structure of
the 3DIC. Additionally, innovative heat removal techniques like liquid cooling micro-
channels [17][18] and thermal via placement [19][20] must be introduced in order for
3DICs to be more practical in electronic products. Besides the thermal-aware design and
mitigation methods, new techniques for thermal measurement and management of the
3DIC during runtime should also be introduced [21][22]. These techniques include 3D-
aware thermal sensor design and allocation, runtime mitigation techniques like 3D-aware
task scheduling [23], and dynamic voltage and frequency scaling (DVFS) [24].
1.2 Our Research
In this dissertation we focus on two important problems in 3DIC technology. In
the first part we focus on thermal analysis and study the problem of 3DIC thermal map
modeling and temperature measurement. We propose a new 3DIC thermal map
modeling, a thermal sensor distribution algorithm customized for 3DIC, and a new design
for 3D thermal sensors for runtime thermal measurement inside a 3DIC. In the second
part, we focus on the problem of multiobjective optimization of performance, cost and
thermal reliability for optimum building block placement inside a 3DIC. We start by
studying the efficiency of different multiobjective optimization algorithms for VLSI
circuits. Each objective of performance, cost and thermal reliability of the 3DICs is then
formulated and studied in detail. We then propose a general framework for
multiobjective optimization of 3DICs. Our proposed 3D thermal map modeling
is embedded in the optimization algorithm to eliminate the thermal analysis bottleneck
during each optimization iteration.
1.2.1 Thermal Analysis of a 3DIC
In a 3DIC, the thermal maps of the stacked layers are highly correlated to each
other. Any active layer thermally affects other layers, and any planar hotspot is converted
to a spatial hotspot. This high correlation is studied through experiments. We approach
the problem of 3DIC thermal map modeling from a new angle. Our new proposed model
is based on common thermal scaling behaviors of 3DICs and can be used for both
transient and steady-state thermal analyses of a 3DIC. The method is based on scaled
hotspot areas, depending on the distance of a stacked layer from the heatsink and also
thermal correlations between the layers. We model each layer’s thermal map in a 3DIC as
a superposition of its own thermal map, after proper scaling based on its location in the
3DIC, and scaled reflections of other layers’ thermal maps on that specific layer. By
finding proper thermal scaling factors between layers, a characterization of the 3DIC
thermal map for all applications can be obtained. The model is accurate, very fast, and
eliminates the thermal analysis bottleneck common in many modern-day thermal
measurement techniques and optimization algorithms.
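The superposition idea described above can be sketched in a few lines. This is only an illustration under simplifying assumptions: it applies scalar amplitude factors to whole maps, whereas the proposed model scales hotspot areas, and the names `self_scale` and `cross_scale` as well as all numbers are hypothetical rather than taken from the dissertation.

```python
# Illustrative sketch only: names and numbers are hypothetical.
# Each layer's modeled map = its own map (scaled by its position in the stack)
# + scaled "reflections" of every other layer's map.

def scale_map(thermal_map, factor):
    """Multiply every cell of a 2D map (list of rows) by a scalar factor."""
    return [[factor * cell for cell in row] for row in thermal_map]


def add_maps(a, b):
    """Cell-by-cell sum of two 2D maps with identical dimensions."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def model_3d_thermal_map(standalone_maps, self_scale, cross_scale):
    """
    standalone_maps[i]: temperature-rise map of layer i considered alone.
    self_scale[i]:      assumed factor for layer i's own map (for example,
                        larger for layers farther from the heatsink).
    cross_scale[i][j]:  assumed factor for layer j's map reflected on layer i.
    """
    n = len(standalone_maps)
    modeled = []
    for i in range(n):
        layer = scale_map(standalone_maps[i], self_scale[i])
        for j in range(n):
            if j != i:
                reflection = scale_map(standalone_maps[j], cross_scale[i][j])
                layer = add_maps(layer, reflection)
        modeled.append(layer)
    return modeled


# Two-layer toy example: one hotspot per layer, in different grid cells.
maps = [[[1.0, 0.0]], [[0.0, 2.0]]]
result = model_3d_thermal_map(
    maps,
    self_scale=[1.0, 1.5],
    cross_scale=[[0.0, 0.5], [0.25, 0.0]],
)
print(result)
```

Because the scaling factors are fixed once per stack, evaluating a new power profile only requires these additions and multiplications, which is the kind of saving the fast model is aiming for.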
A new thermal sensor distribution method for 3DICs is also proposed in this
research. The method is thermal gradient-aware and employs a k-means clustering
algorithm in a 3D manner. Taking advantage of the high thermal correlation between
adjacent layers, we claim that any thermal sensor distribution algorithm should be solved
as a 3D problem, not for individual layers separately, to avoid excessive assignment of
sensors to the same spatial hotspot. The thermal sensor distribution algorithm employs
our fast 3D thermal map modeling.
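To make the 3D argument concrete, the following minimal sketch runs plain k-means on hotspot coordinates (x, y, layer). It is not the dissertation's thermal gradient-aware algorithm: the seeding rule, the choice of using the layer index as the z coordinate, and the sample hotspots are all assumptions made for illustration.

```python
import math

# Illustrative sketch only: plain Lloyd's k-means on (x, y, z) hotspot points.
# The z coordinate is the layer index here, which is an assumed scaling.

def kmeans_3d(points, k, iterations=50):
    """Cluster 3D points; the first k points seed the centroids."""
    centroids = [points[i] for i in range(k)]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, assignment


# Two spatial hotspots, each visible on two adjacent layers (hypothetical data).
hotspots = [(1.0, 1.0, 0.0), (1.0, 1.0, 1.0), (8.0, 8.0, 2.0), (8.0, 8.0, 3.0)]
sensor_sites, membership = kmeans_3d(hotspots, k=2)
print(sensor_sites, membership)
```

In this toy case two sensor sites cover four layer-level hotspots, whereas clustering each layer separately would place one sensor per layer per hotspot.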
The model is utilized to generate the 3D thermal map of a sample 4-layer stacked
3DIC, consisting of two layers of quad-core processors and one layer of L2 cache and
one layer of main memory. For different applications running on the processor the
proposed modeling yields a maximum error of less than 5.5%, which is quite acceptable
for the purpose of a sensor distribution algorithm. The 3D thermal sensor distribution is
based on the k-means clustering algorithm in 3D Euclidean space. With the proposed
method, the maximum sensor reading error for the temperature of the 4-layer stacked
3DIC was less than 4.4% for all analyzed applications. The algorithm
uses the proposed 3D thermal map modeling, which speeds up evaluation time by 53x for
six different applications run on the 3DIC, compared with the case in which
detailed 3D map modeling using HotSpot 5.0 is embedded in the algorithm. This speedup
becomes even more significant as the number of evaluation scenarios increases for more
complicated 3DICs and applications. Furthermore, as demonstrated, thermal sensor
distribution for 3DICs must be solved as a 3D problem, which results in 44% fewer
sensors, as compared with conventional 2D methods, while maintaining the same sensor
reading error tolerance.
A new design for a ring oscillator based 3D thermal sensor is introduced to be
used in thermal measurement and management in 3DICs. This 3D thermal sensor is
shared between layers, both to account for the thermal effects of TSVs, which act as strong
heat exchangers between layers in a 3DIC, and to reduce the total number of sensors needed for
the whole chip by almost half.
1.2.2 Multiobjective Optimization of a 3DIC
Given the often conflicting design constraints for 3DICs, it is useful to have a
framework for multiobjective optimization. We first study different multiobjective
optimization techniques that can be applied to find an optimum operation point of a VLSI
circuit with multiple conflicting objectives to be satisfied. A multiobjective optimization
problem (MOP) is the optimization of different objective functions simultaneously and
reaching a solution that is the best in regard to all of the objective functions.
Multiobjective optimization problems are present at different levels of VLSI circuit
optimization. During a conventional design flow, including front end design and backend
steps, circuit designers typically wish to optimize a circuit with respect to its performance,
power dissipation, and layout area at the same time. The optimization of a circuit for
speed and power is nearly always conflicting, i.e., higher speed leads to higher power
dissipation. If this optimization is done at the transistor level or through gate sizing of the
circuit, finding the best sizing vector for both speed and power is a challenging task. This
problem can also emerge in a circuit working under different supply voltage levels or
clock frequencies, or even different die temperatures. As discussed in the power and
speed trade-off case, multiobjective optimization techniques become important when
optimizations of different objective functions conflict with each other.
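For reference, the problem described in this paragraph can be written in a standard textbook form. The notation below (the design vector x, the feasible set X, and the k objective functions f_i) is introduced here only for illustration and is not taken from the dissertation.

```latex
% Multiobjective optimization problem with k conflicting objectives
\min_{x \in \mathcal{X}} \; \bigl( f_1(x),\, f_2(x),\, \dots,\, f_k(x) \bigr)

% A point x^* \in \mathcal{X} is Pareto optimal if no other feasible point
% improves one objective without worsening another, i.e. there is no
% x \in \mathcal{X} such that
f_i(x) \le f_i(x^*) \quad \text{for all } i \in \{1, \dots, k\}
\qquad \text{and} \qquad
f_j(x) < f_j(x^*) \quad \text{for some } j .
```

For the power and speed trade-off, k = 2, with f_1 the power dissipation and f_2 the delay, and x could be, for example, the gate sizing vector.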
To study the efficiency of different multiobjective optimization algorithms we
will focus on multiobjective optimization of power and delay, as an example of the two
conflicting objective functions for VLSI circuits. The proposed methods can be applied
to any other set of conflicting functions to find the best (Pareto optimal) operating
points. First we propose different non-convex and convex models of power and delay and
explain which ones are the most appropriate to be used in multiobjective optimization. In
particular, we present three methods for solving a multiobjective problem. The first one is
the Weighted Sum method, which is the most popular technique for solving
multiobjective optimization. Compromise Programming is the second method, which is
quite effective for convex functions. Finally a Satisficing Trade-off Method (STOM)
based algorithm is presented for optimization. STOM can be used to find the best point
when the designer has provided goals for the objective functions. We will describe step
by step how to find the best point of operation of a circuit in regard to multiple
conflicting objective functions.
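As a rough illustration of the Weighted Sum method, the sketch below scalarizes two toy objectives and sweeps the weight to trace a few Pareto points. The power and delay models (power proportional to a single sizing variable `w`, delay inversely proportional to it), the search range, and the golden-section minimizer are all assumptions chosen to keep the example self-contained; they are not the models or solver used in this work.

```python
import math

# Illustrative sketch only: toy objectives and a generic 1-D minimizer.

def power(w):
    """Toy model (assumption): power grows linearly with gate size w."""
    return w


def delay(w):
    """Toy model (assumption): delay falls inversely with gate size w."""
    return 1.0 / w


def golden_section_minimize(f, lo, hi, tol=1e-8):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2


def weighted_sum_point(alpha, lo=0.1, hi=10.0):
    """Minimize alpha*power + (1 - alpha)*delay; return (w, power, delay)."""
    w = golden_section_minimize(
        lambda s: alpha * power(s) + (1 - alpha) * delay(s), lo, hi
    )
    return w, power(w), delay(w)


# Sweeping the weight traces different trade-off points.
pareto_points = [weighted_sum_point(alpha) for alpha in (0.2, 0.5, 0.8)]
for w, p, d in pareto_points:
    print(f"w={w:.3f}  power={p:.3f}  delay={d:.3f}")
```

A larger weight on power yields a smaller gate size, lower power, and higher delay, so each weight selects a different compromise between the two objectives.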
After finding an effective multiobjective optimization algorithm for a typical
VLSI circuit, we will propose a general framework for multiobjective optimization on a
3DIC. We will target cost, thermal reliability and performance for optimum building
block placements and try to find the best operating point for a 3DIC. For cost we
formulate the total cost of a 3DIC as a function of number of layers, die area, number of
TSVs, and yield of each die including bonding yield. For thermal reliability we use our
proposed 3D thermal map modeling, and for performance we target routing length
between each two communicating blocks. The convexity of each objective is also studied.
Using our proposed multiobjective optimization algorithm for 3DICs, these objectives are
optimized simultaneously to find the best microarchitectural placement inside a 3DIC
chip.
Our proposed method searches in continuous space using a Quasi-Newton
analytical optimization method. Our approach also uses a scalarization method of
Compromise Programming in which the weighted distance of the objectives from their
minimum points is optimized. A fast 3DIC thermal map model is used for the
optimization algorithm to eliminate the thermal analysis bottleneck. In comparison with a
previous state-of-the-art multiobjective optimization method which uses Simulated
Annealing, a weighted sum method of scalarization and compact modeling for thermal
analysis, our method reduces the peak temperature of the 3DIC by 4.3% and total wire
length by 5.7% while it is more than 17x faster in optimization runtime.
1.3 Related Work
1.3.1 Thermal Analysis of a 3DIC
In recent years different methods for thermal map modeling of 3DICs have been
proposed. The methods are mainly divided into three categories. The Finite Difference
Method (FDM) solves the Partial Differential Equation (PDE) of heat conduction with
boundary constraints. The method is very accurate in modeling both steady-state and
transient thermal behavior in the cubic geometry of a 3DIC. The main disadvantage of
this method is its high computational cost [34][35]. Compact modeling divides a 3DIC
into thermal grids and based on electrical-thermal analogy equations, solves the thermal
relationship between neighboring cells. A transient temperature response is calculated
given the physical characteristics and power consumption of units on the die. Compact
modeling remains highly accurate while reducing the problem size and expediting thermal simulations
[11][12][13]. Fast thermal modeling methods sacrifice some accuracy to more quickly
reach a solution and can be used in thermal management algorithms to get insight into the
spatial thermal map of a 3DIC for all analyzed applications. In [36] a fast thermal
simulation for 3DICs is proposed using a Neural Network heuristic. Although the
speedup is considerable, using this model still needs a time consuming training phase for
random power inputs for each thermal cell.
Thermal sensors should be used for runtime monitoring and managing of the
3DIC thermal behavior. Different mechanisms for thermal sensor design including the
diode [25] and the ring oscillator have been used in conventional VLSI circuits. Ring
oscillator thermal sensors are widely used in cell-based ICs (in which only digital gates
are available) as there is no need for introducing other non-cell based technology in the
design [26][27][28]. The frequency of a ring oscillator decreases with increasing
temperature and can thus be an indicator of temperature changes. Peripheral circuits can
be added to make it immune to noise and other variations [29].
Thermal sensors must be allocated inside 3DICs for runtime thermal measurement
and management of the chip. The thermal sensor distribution problem has been widely
studied for conventional 2D chips. In [30] locations of the sensors are determined after
providing an analytical model that describes the maximum temperature difference
between a hotspot in the IC and a region of interest. In [31] uniform and non-uniform
sensor allocations are compared. The non-uniform allocation identifies an optimal
physical location for each sensor such that the sensor’s attraction toward steep thermal
gradients is maximized. In [32] after locating high energy regions of the chip, two
methods of energy-center sensor allocation and energy-cluster sensor allocation are
compared, while in the latter, sensor locations are determined using the k-means
clustering algorithm, which attempts to strike a balance between hotspot estimation and
full thermal characterization. These methods should be further developed and customized
for 3DICs, considering the special thermal behavior of this technology.
1.3.2 Multiobjective Optimization of a 3DIC
Multiobjective optimization is an important topic of research in science and
engineering. There are many tutorials, review papers, and even text books written on this
subject. Different aspects of nonlinear multiobjective optimization are defined in [37]. In
[38] advanced methods for solving a MOP, e.g., fuzzy methods, interactive methods, and
evolutionary algorithms, are explained in detail. Multiobjective optimization techniques
have been applied for designing analog and digital circuits. Coello in [39] emphasizes
Evolutionary Multiobjective Optimization and explains different methods using this
technique. The author claims that aggregating objective functions by simply doing a
weighted summation to produce a single function can be used for VLSI circuit
optimization. In particular, the author shows that Vector Evaluated Genetic Search is
effective in designing some combinational circuits and multiplier-less IIR filters. In the
field of digital circuit design, signal delay, chip area, and dynamic power dissipation are
optimized with a design tool, called Multiobjective Gate Level Optimization [40].
Multiobjective optimization for VLSI interconnects is discussed in [41], where the
objective functions to be simultaneously optimized are the metal widths, metal spaces,
and metal thicknesses with constraints on speed, area and power of the chip.
Recent literature also focuses on the problem of multiobjective optimization of a
3DIC subject to three objectives of cost, performance and temperature.
In [42] and [43], the authors provide system-level cost modeling for 3DICs. The
modeling considers chip area, number of layers, and bonding costs. It also models
different cooling method costs. While the model considers the effect of TSVs in the
footprint area of the 3DIC, it doesn’t model the TSV yield in bonding cost of the 3DIC.
In [44] the trade-offs between cost and temperature have been studied. It is also
shown how the number of layers, defect density and power consumption can affect the
manufacturing cost of the 3DIC. It presents a cost model for 3DICs while assuming the
number of TSVs is fixed and the area of cores and memory blocks are equal. Different
scenarios are studied but there is no holistic multiobjective optimization algorithm
provided to minimize cost and temperature.
Another approach is in [45] using a force-directed formulation to perform
floorplanning. This thermally-driven approach uses a three-stage flow, starting with
global optimization in the 3D space, to optimization in 2.5D space with layer assignment,
and ending with macro-block legalization. This paper shows improvements over [46] in
both quality of the result and runtime.
A multiobjective optimization on performance and thermal reliability for 3DIC
micro-architecture floorplanning is presented in [47]. Linear programming and Simulated
Annealing are employed for the optimization. This work does not consider area, number
of layers and effect of TSVs on the cost of a 3DIC. Compact thermal modeling is also
used for 3DIC thermal analysis, which is extremely time-consuming. A multiobjective
optimization on total wire length, 3DIC area, number of TSVs and temperature using a
Simulated Annealing method and compact thermal modeling is proposed in [15]. The
method expands current 2D floorplanning to the z-dimension. The min-cut placement
method and Simulated Annealing are combined to find the best thermal-aware placement
in [48]. All of these solutions use stochastic optimization techniques based on Simulated
Annealing, which generally results in a long runtime and scales poorly with problem size.
1.4 Research Contributions
The research proposed for this PhD program is motivated by the deficiencies of
related work highlighted above. Specifically the research contributions and impact of the
proposed research are as follows:
An efficient and fast 3D thermal map modeling is developed. In comparison with
existing models this model gives a broad insight into common thermal behaviors in
3DICs and is simple. The model can be developed using existing CAD tools for
thermal analysis of 2D ICs. With a one-time characterization of the chip, the model can be
used to re-evaluate the thermal map of the 3DIC for different workloads, in contrast to
existing models which must be recalibrated for every analysis. For each individual
application, using the same platform, the speedup in using our 3D thermal map modeling
is significant compared to existing methods. Given the vast number of scenarios that must
be evaluated for a complete thermal analysis of a typical 3DIC system with applications,
the multiplicative effect of repeated analyses makes the speedup even more impactful.
We developed a thermal sensor allocation algorithm for 3DICs. To the best of our
knowledge this is the first thermal sensor allocation algorithm which is customized for
3DICs. The model is based on the k-means clustering algorithm and is solved in the 3D
realm. Based on the physical adjacency and high thermal correlation between the layers
we show that any thermal sensor allocation problem should be solved as a three-
dimensional problem in order to avoid assigning an excessive number of sensors to the
same spatial hotspot. We show a considerable reduction in the number of needed sensors
when the problem is solved in the 3D space instead of solving it for each layer
individually.
A new ring oscillator based 3D thermal sensor is proposed in this research. To the
best of our knowledge this is the first design for 3D thermal sensors that can be shared
between the adjacent layers. Because of the physical adjacency and use of Through
Silicon Vias (TSVs) as thermal exchangers between the stacked layers, the thermal
profiles of the layers are highly correlated with each other. Any planar hotspot in a layer
in a 3DIC is converted to a volumetric spatial hotspot. Runtime thermal management in
3DICs requires proper monitoring and measurement of these spatial hotspots inside the
chip. The existence of spatial hotspots and the high thermal correlations between layers
are motivations for designing 3D thermal sensors. Use of this sensor design approach
reduces the total number of needed sensors to monitor a typical whole 3DIC by half with
the same reading error, as compared to the conventional 2D sensor distribution approach.
We address the problem of multiobjective optimization of VLSI circuits. We
compare three methods of multiobjective optimization and the efficiency of these
methods using convex and non-convex modeling of the objective functions. Existing
methods for multiobjective optimization of VLSI circuits use the simple model of
weighted sum while we show that interactive methods like Satisficing Trade-off Method
(STOM) can reach acceptable solutions without any need of exhaustive search for the
optimum operating point of the circuit or any need to find all Pareto optimal sets. This is
also the first work that shows the efficiency of using convex models of the objective
functions. Although the convex model sacrifices some accuracy, the benefit of finding
optimum solutions faster far outweighs the slight decrease in accuracy.
Our method of multiobjective optimization for building block placement in 3DICs
considers three objectives of cost, performance, and thermal reliability at the same time,
contrasted to prior work where at most only two of these objectives are considered for the
optimization. We employ a proper formulation for the cost as a function of three
variables of area, number of layers, and number of TSVs. Cost is also considered as one
of the objectives in the MOP. In previous methods, the variables of area, number of
layers, and number of TSVs are the MOP’s objectives, and final cost is not considered. In
our method, these three variables are in interaction with each other during the
optimization and determine the objective of cost, which is important. We provide a
technique in which the number of layers is one of the variables during the optimization,
unlike all previous methods in which the number of layers is predetermined. This
approach helps us find the globally optimum structure of the 3DIC. Another contribution
in this research is the utilization of a very fast 3DIC thermal map model in the
optimization algorithm that improves the execution time by eliminating the thermal
analysis bottleneck during optimization iterations.
1.5 The Dissertation Outline
In chapter 2 we discuss thermal correlation between layers in 3DICs, and we
propose a design for 3D thermal sensors that can be shared between adjacent layers. In
chapter 3 we propose a new, fast and very efficient thermal map modeling for 3DICs. We
also provide a 3D thermal sensor allocation algorithm which employs our 3D thermal
map modeling and is customized for 3DICs. In chapter 4 we present a study of the
effectiveness of different multiobjective optimization algorithms to be used in VLSI
circuits. In chapter 5 we focus on multiobjective optimization subject to cost, thermal
reliability and performance for 3DICs. Chapter 6 is dedicated to conclusion and
suggested future work.
Chapter 2: 3D Sensor Design
2.1 Study of Thermal Correlation Between layers in 3DICs
Thermal effects of active layers on each other in a 3DIC can be studied through
primary and secondary paths of heat transfer. The primary and secondary paths of heat
transfer in a flip-chip 3DIC are shown in Figure 2.1.
Figure 2.1. 3DIC packaging layers
The primary path of heat transfer is responsible for conducting the heat toward the
heatsink and removing the heat through the heatsink. The secondary path of heat is the
one toward the PCB. In a conventional 2D IC chip, the secondary path of heat transfer is
always insignificant, because there are no additional layers of circuitry between the
primary heat source and the PCB. This is not the case in a 3DIC, where
temperatures in adjacent layers are highly correlated: A high temperature in one layer can
cause high temperatures in both adjacent layers too. The secondary path of heat transfer
in 3DICs is even more problematic. If there is a source of heat in the middle of a 3DIC,
the layers between the heat source and the PCB are greatly affected. The path to the
heatsink has high thermal resistivity, trapping the heat in these layers.
Figure 2.2. Reflection of thermal map of a layer on other layers
The further a layer with higher temperature is located from the heatsink, the
greater the increase in the overall 3DIC temperature. A layer affects neighboring layers
significantly since there is no low-thermal-resistance path between the source of the heat
and the heatsink.
Through both the primary and secondary paths of heat transfer, a scaled replica of a
layer’s thermal map is reflected on other layers. Figure 2.2 shows how the thermal map
of a processor located in the middle layer of a 7-layer 3DIC is replicated on other layers
in primary and secondary paths for a sample 3DIC.
The thermal correlation of the stacked layers in a 3DIC is a function of the
thermal conductivity between the layers and a layer’s location relative to the heatsink.
The thermal conductivity on the other hand, depends on the stacking configuration (i.e.
face-to-face or face-to-back), layers’ thicknesses, layers’ materials, and TSV diameters,
pattern, and pitches. Among these, TSVs play a major role in the thermal correlation of
stacked layers in a 3DIC. By introducing more TSVs, the thermal conductivity between
the layers increases and the thermal resistivity between each layer and the heatsink is
reduced consequently.
Figure 2.3 compares the maximum junction temperature of each layer in both the
primary and secondary paths of heat transfer in a 3DIC when employing compact TSVs
and sparse TSVs in the IC structure. The 3DIC consists of 7 layers (each of them consists
of bulk, active Si, and metal sublayers, and layer 7 is the closest layer to the heatsink)
with the processor in the middle layer and 3 passive layers both above and below it. It is
shown that although compact TSVs reduce the overall temperature of the 3DIC
significantly, such a pattern leads to more correlation between the temperatures of the
stacked layers.
Figure 2.3. Compact vs. sparse TSVs impact on the junction temperature
Because of the high thermal correlation between the stacked layers in 3DICs, any
planar hotspot is converted to a 3D volumetric hotspot. Figure 2.4 shows how a planar
hotspot in the middle layer converts to a spatial hotspot in a 3DIC consisting of 7 layers
with an active layer in the middle and passive layers in all other layers.
Figure 2.4. (a) Planar hotspot (b) Cross section of the spatial hotspot
[Figure 2.3 plot: Temperature (ºC), 80 to 120, vs. sublayer number, 1 to 21; series: Sparse TSV, Compact TSV; the heat source is marked.]
Spatial hotspots in a 3DIC necessitate the design and implementation of 3D
sensors that can monitor temperatures and be shared between stacked layers. In the next
section we explain how to design a 3D thermal sensor to be used in runtime thermal
measurement and management of a 3DIC.
2.2 Application: Ring Oscillator Based 3D Thermal Sensor
Dynamic thermal measurement is crucial during 3DIC runtime. Spatial hotspots
in 3DICs necessitate a design of 3D thermal sensors to efficiently monitor any hotspot in
3DICs. Using 3D thermal sensors reduces the total number of needed sensors and
considers the thermal effect of the TSVs as substantial thermal exchangers between
layers in 3DICs.
Ring oscillator (RO) thermal sensors are widely used in cell-based ICs as they can
be implemented entirely using cells from the target library [26] [27] [28]. The delay of
each logic gate increases with temperature, and that results in lower frequency of
oscillation for higher temperature. Other types of thermal sensors, like diode-based,
suffer from non-linearity and are more sensitive to environmental effects. In this section
we describe how to design a 3D ring oscillator thermal sensor in 32nm CMOS
technology.
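The dependence of oscillation frequency on gate delay described above can be written with the standard first-order ring oscillator relation. This is a textbook expression rather than one taken from this dissertation; N denotes the (odd) number of inverter stages and t_d(T) the average per-stage delay at temperature T, both symbols introduced here for illustration.

```latex
f_{osc}(T) = \frac{1}{2 \, N \, t_d(T)}
```

Because t_d(T) grows with temperature, f_osc falls as the die heats up, and adding stages lowers the frequency at every temperature.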
ROs can be designed with differing numbers of stages and transistor sizing. The
most desirable configuration is the one that creates a circuit with the most linear
dependency of frequency to the temperature. By keeping all transistors minimally sized
we can isolate the effect of increasing the number of stages (inverters) in a RO on the
linear dependency of the output frequency on temperature. For each new RO
configuration, we sweep the temperature for the range of -50°C to 120°C with a step size
of 5°C and find the output frequency at each temperature. After linear curve fitting of the
output frequency vs. temperature, two factors to evaluate the linearity of the frequency
dependency on temperature are used. These two factors are maximum error and Sum of
Squares due to Error (SSE). SSE measures the total deviation of the response values from
the fitted response values and is formulated as:
$$ SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \tag{2.1} $$
in which y_i is the actual value and \hat{y}_i is the fitted value. A smaller SSE indicates
that the model has a smaller random error component, and that the fit is more useful for
prediction.
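The linearity evaluation described above can be sketched as follows. This is a minimal illustration, not the simulation flow used in this work: the frequency samples are hypothetical, and the maximum error is assumed to be the largest relative deviation between the measured and fitted values, expressed in percent.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def linearity_metrics(temps, freqs):
    """Return slope, intercept, SSE (Equation 2.1) and maximum error in percent."""
    slope, intercept = linear_fit(temps, freqs)
    fitted = [slope * t + intercept for t in temps]
    sse = sum((y - f) ** 2 for y, f in zip(freqs, fitted))
    # Assumed definition: largest relative deviation from the fitted line.
    max_err_pct = max(abs(y - f) / abs(y) * 100.0 for y, f in zip(freqs, fitted))
    return slope, intercept, sse, max_err_pct

# Temperature sweep from -50 C to 120 C with a 5 C step, as in the text.
temps = list(range(-50, 125, 5))
# Hypothetical samples in GHz: a linear trend plus a small curvature term.
freqs = [6.3 - 0.004 * t + 1e-6 * t * t for t in temps]

slope, intercept, sse, max_err_pct = linearity_metrics(temps, freqs)
```

Running the same function on the simulated frequencies of each RO configuration would give one SSE value and one maximum error per number of stages.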
Figure 2.5. Linearity evaluation of the ring oscillator’s output frequency
[Figure 2.5 plot: SSE (x 10^15) and Maximum Error (%) vs. number of stages in the ring oscillator (3 to 13).]
To quantify effects targeting a specific technology, we conducted simulation
experiments for ring oscillators implemented in IBM 32nm technology. As shown in
Figure 2.5 by increasing the number of stages based on the SSE factor we can attain
much better linearity.
However, by examining the maximum error there is not much benefit from
increasing the number of stages. On the other hand, as can be seen in the figure, based on
the maximum error all ROs have acceptable linearity error, and each of them is a good
candidate to be used as a thermal sensor. Changing the size of transistors does not
improve the linearity; therefore we use minimum-sized transistors in our design. We
selected the 7-stage RO for further study, although the analyses can be extended to ROs
with any number of stages.
Figure 2.6. Output frequency vs. temperature consuming TSVs
[Figure 2.6 plot: Frequency (GHz) vs. Temperature (C), -50 to 110; linear fit y = -0.0042x + 6.3059, R² = 0.997.]
The ring oscillator as a thermal sensor can incorporate TSVs to be shared between
layers. This kind of thermal sensor also includes the effect of the TSVs' thermal behavior on
the RO's frequency. Because they consist of highly thermally conductive material and are
relatively large, TSVs are great thermal exchangers between two adjacent layers. TSVs’
size and pitch affect the temperature of neighboring layers. By including TSVs in the
interconnect within a RO, these effects are included in the RO output frequency.
Figure 2.6 shows the temperature-frequency dependence of a 7-stage ring
oscillator passing through TSVs between two adjacent layers via a zig-zag routing
obtained from the physical design of the sensor. The figure shows that adding TSVs does
not affect linearity of the output frequency. It has the benefit of lowering the output
frequency, making it easier to measure without need of a frequency divider. The effect of
process-variations and power supply variation on deep sub-micron circuits is significant.
Accordingly, there is a change in the response of thermal sensors occupying different
process-corners which causes a shift in their calibration constants. Modern
microprocessors employ a single, 2-point hard calibration model in the form of slope and
intercept [64]. In the post-silicon stage, sensors must be calibrated based on their effective
power supply and the effect of process variation. Figure 2.7 shows the effect of a ±10%
change in VDD and ±σ variability in all process parameters on the frequency vs. temperature
response of the sensor.
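A two-point calibration in slope-intercept form, and its use for reading temperature from the measured frequency, can be sketched as below. The slope and intercept are the values of the linear fit reported with Figure 2.6 (y = -0.0042x + 6.3059, frequency in GHz and temperature in C). The reference temperatures of 0 C and 100 C and all function names are assumptions made for this example.

```python
# Linear fit reported with Figure 2.6 (frequency in GHz, temperature in C).
SLOPE_GHZ_PER_C = -0.0042
INTERCEPT_GHZ = 6.3059

def frequency_at(temp_c, slope=SLOPE_GHZ_PER_C, intercept=INTERCEPT_GHZ):
    """Sensor model: output frequency predicted by the linear fit."""
    return slope * temp_c + intercept

def two_point_calibration(t1, f1, t2, f2):
    """Derive slope and intercept from two (temperature, frequency) readings."""
    slope = (f2 - f1) / (t2 - t1)
    intercept = f1 - slope * t1
    return slope, intercept

def temperature_from_frequency(freq_ghz, slope, intercept):
    """Invert the calibrated line to estimate temperature."""
    return (freq_ghz - intercept) / slope

# Hypothetical calibration points at 0 C and 100 C.
slope, intercept = two_point_calibration(
    0.0, frequency_at(0.0), 100.0, frequency_at(100.0)
)
estimate_c = temperature_from_frequency(frequency_at(85.0), slope, intercept)
```

A sensor at a different process corner or supply voltage would yield different calibration constants from the same two measurements, which is why the calibration is performed per sensor after fabrication.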
Possible non-linearity effects can also be mitigated by compensating VLSI
techniques like controlling the frequency with current rather than voltage [28], using
decoupling capacitors in the supply [27], or using calibration methods like using Kalman
Filtering [29].
Figure 2.7. (a) Power supply variation (b) Process variation
[Figure 2.7(a) plot: Frequency (GHz) vs. Temperature (C) for VDD+10%, Normal, and VDD-10%; fitted lines shown: y = -0.0058x + 6.9111 (R² = 0.9994) and y = -0.0023x + 5.6248 (R² = 0.9821).]
[Figure 2.7(b) plot: Frequency (GHz) vs. Temperature (C) for Fast, Normal, and Slow; fitted lines shown: y = -0.0048x + 7.2361 (R² = 0.9962) and y = -0.0007x + 4.295 (R² = 0.8499).]
A 3D RO, whether used as a thermal sensor itself or as part of a more advanced thermal
sensor [28], should be designed optimally to minimize overhead. Especially for ROs with
large numbers of stages, optimum routing should be considered. Graph theory’s
Traveling Salesman problem [33] can be used to design an optimum RO based on both
area and routing issues. The problem is defined as: “Given a set of vertices and the
distances between them, determine the shortest path starting from a given vertex, passing
through all the other vertices and returning to the first vertex.” We choose the Nearest
Neighbor algorithm [33] as one of the well-known heuristics to solve the problem. The
complexity is O(n^2), which is acceptable because the number of vertices is limited and the
algorithm can be easily implemented and quickly executed.
The optimum design for a 7-stage RO thermal sensor is as follows: For any spot
that we decide to insert a 7-stage RO thermal sensor we find 6 TSVs which are located
close to each other that are also connecting the two layers that will share the RO. The
footprints of the 6 TSVs in one layer determine the vertices of the complete graph. The
distance of each vertex from all other vertices is the weight of the edge that connects
these two together. In this graph based on the Nearest Neighbor algorithm we determine
the cycle that passes through all the vertices with the least weight.
An example is shown in Figure 2.8(a). In this figure inverters with the same
color are located in the same layer. When designing a 3D RO, any time a TSV (a vertex) is
encountered, it should be used to pass through to the other layer. Note that one layer
should have two consecutive inverters, because an odd number of inverters is needed to
make an oscillator. The 7-stage 3D RO thermal sensor is shown in Figure 2.8(b). The
method can be applied to 3D ROs with any odd number of stages. For an m-stage RO,
m-1 TSVs should be selected which are located close to each other. Their footprints in one
layer will determine the vertices of the graph in which the Traveling Salesman problem
will be solved using the Nearest Neighbor algorithm.
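The Nearest Neighbor heuristic on the TSV footprints can be sketched as follows. This is a minimal illustration rather than the implementation used in this work: the 2 x 3 arrangement of the six TSVs on a 4 µm pitch is an assumed layout, and ties between equally distant TSVs are broken by list order.

```python
import math

def nearest_neighbor_tour(points, start=0):
    """Greedy tour: from each vertex, move to the closest unvisited vertex."""
    unvisited = [i for i in range(len(points)) if i != start]
    tour = [start]
    while unvisited:
        last = points[tour[-1]]
        nxt = min(unvisited, key=lambda i: math.dist(last, points[i]))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour

def tour_length(points, tour):
    """Length of the closed cycle, including the return to the first vertex."""
    return sum(
        math.dist(points[tour[k]], points[tour[(k + 1) % len(tour)]])
        for k in range(len(tour))
    )

# Assumed layout: six TSV footprints on a 2 x 3 grid with a 4 um pitch.
PITCH_UM = 4.0
tsvs = [(col * PITCH_UM, row * PITCH_UM) for row in range(2) for col in range(3)]

tour = nearest_neighbor_tour(tsvs)
length_um = tour_length(tsvs, tour)
```

Each step scans the remaining vertices once, which gives the O(n^2) complexity noted earlier. For this assumed layout the cycle visits every footprint along the grid perimeter.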
Figure 2.8. (a) TSV footprints (b) final 3D ring oscillator
In our experiments, TSVs are considered to be positioned in a homogenous
pattern. Based on current technology trends, TSVs’ diameters and pitches are considered
to be 2 µm and 4 µm, respectively. As any sensor needs 6 TSVs, using the nearest
neighbor algorithm the length of horizontal metal for routing all inverters is 24 µm. The
algorithm we provided can be applied to non-homogenous TSV patterns too. Based on
the statistics of the TSV locations and hotspot locations, the mean-min-max of metal
length for connecting the inverters can be calculated. This calculation is beyond the scope
of this paper. We also assumed that all the inverters are exposed to the same temperature
inside a sensor. Based on our experiments the thermal gradient inside a grid cell is
negligible. Even the maximum thermal difference between two adjacent grid cells in a layer
is less than 0.6% for all of our applications. There is also high thermal correlation
between two adjacent layers in 3DICs as shown in Table 2.1. The maximum thermal
difference between two adjacent grid cells in two adjacent layers is less than 1.5%, which
translates to less than ±2 degrees. This maximum temperature difference
happens only in layers far from the heatsink. This validates our assumption that all the
components of a sensor are exposed to almost the same temperature.
Sharing thermal sensors between layers reduces the total number of thermal
sensors needed to monitor the whole chip. However, there is a limitation on the number
of layers that share a RO. Increasing the number of layers not only increases the
complexity of designing an optimum RO (based on the area it occupies and the
routing problem in 3D) but also decreases its ability to detect small spatial hotspots. Hence, a
RO serving as a thermal sensor should be spread among at most two layers incorporating
TSV effects on its output frequency, as it is simple enough to be designed optimally
based on both area and routing. These ROs can also be used to monitor the functionality
of the TSVs they include. They have the ability to detect any process variation and ageing
effects of the TSVs. For 3DIC designs in which supply noise levels of the layers are
independent from each other [65], sharing thermal sensors between two layers can also
improve the linearity of the design. As a 3DIC often contains thermal TSVs [19] used as
thermal exchangers between layers and the heatsink, these thermal TSVs can be used for
3D ring oscillator thermal sensor design to eliminate any TSV overhead of incorporating
3D thermal sensors.
2.3 Experimental results
In this section we discuss the thermal correlation of stacked layers and advantage
of using 3D thermal sensors in a sample 3DIC. The flip-chip 4-layer 3DIC used
throughout the experiment includes an 8-core Alpha 21264 processor in 90nm. The top
two layers, next to the heatsink (layers 3 and 4), consist of four cores each, and the next layer
(layer 2) is dedicated to the L2 cache. The last layer, which is the closest to the PCB (layer 1),
contains memory.
The 3DIC layers are stacked face-to-back. 16MB of L2 cache is shared between
all cores and the size of main memory is 1024MB [53]. Four benchmarks from SPEC2000
(vortex, equake, gcc, and apsi) run on the 4 cores of each
layer [51]. The second layer is rotated 180 degrees to balance the chip power profile [53].
The power of the L2 cache and memory are assumed to be uniform. The Wattch
infrastructure [52] is used for architectural-level power modeling of the system, and
HotSpot 5.0 [11] is used for grid-based thermal simulation of the 3DIC. Each layer is
divided into 128×128 grids for thermal analysis.
Table 2.1 shows the high thermal correlation between each two layers in the 4-
layer 3DIC. The high thermal correlation between the layers gives us the opportunity to
share thermal sensors between the layers, which effectively reduces the number of thermal
sensors needed to monitor the thermal behavior of the 3DIC.
In order to show the efficiency of using our proposed 3D thermal sensor, the
thermal sensor distribution algorithm is conducted in three steps. In the first step, uniform
sensor allocation is applied on each layer. In the second step sensors are kept only for the
hotspot areas. Hotspots are grid cells that have a temperature greater than a specified
T_hotspot, which is 80ºC in our experiment.
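The first two steps can be sketched on a small example. The 16 x 16 thermal map, the 8 x 8 uniform sensor grid, and the rule that a sensor is kept when its own grid cell exceeds the threshold are assumptions made for illustration; the experiment itself uses 128 x 128 grids per layer.

```python
T_HOTSPOT_C = 80.0  # hotspot threshold used in the experiment

def uniform_sensor_cells(grid_size, sensors_per_side):
    """Step 1: place sensors at the centers of equally sized square regions."""
    step = grid_size // sensors_per_side
    offset = step // 2
    return [
        (offset + r * step, offset + c * step)
        for r in range(sensors_per_side)
        for c in range(sensors_per_side)
    ]

def keep_hotspot_sensors(temps, sensor_cells, threshold=T_HOTSPOT_C):
    """Step 2: keep only sensors whose grid cell is hotter than the threshold."""
    return [(r, c) for r, c in sensor_cells if temps[r][c] > threshold]

# Hypothetical thermal map: 70 C background with a 4 x 4 hot corner at 90 C.
GRID = 16
temps = [
    [90.0 if r < 4 and c < 4 else 70.0 for c in range(GRID)]
    for r in range(GRID)
]

step1 = uniform_sensor_cells(GRID, 8)        # 64 uniformly placed sensors
step2 = keep_hotspot_sensors(temps, step1)   # sensors remaining in hotspot cells
```

With this hypothetical map only the sensors inside the hot corner survive the second step, which mirrors how the per-layer sensor count drops when sensors are kept only in hotspot areas.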
Table 2.1. Thermal correlation between layers in a 4-layer 3DIC
Thermal correlation | layer 1 (External Memory) | layer 2 (L2 Cache) | layer 3 (Quad-core processor) | layer 4 (Quad-core processor)
layer 1 | 1.00 | 0.98 | 0.93 | 0.87
layer 2 | 0.98 | 1.00 | 0.97 | 0.92
layer 3 | 0.93 | 0.97 | 1.00 | 0.97
layer 4 | 0.87 | 0.92 | 0.97 | 1.00
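A correlation coefficient between the thermal maps of two layers can be computed as sketched below. The text does not state which correlation measure was used, so the Pearson coefficient over the flattened grid temperatures is an assumption here, and the two small maps are hypothetical.

```python
import math

def flatten(grid):
    """Turn a 2D thermal map into a flat list of cell temperatures."""
    return [t for row in grid for t in row]

def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical maps: layer_b is a scaled and shifted replica of layer_a.
layer_a = [[80.0, 85.0], [90.0, 100.0]]
layer_b = [[0.8 * t + 5.0 for t in row] for row in layer_a]

r = pearson(flatten(layer_a), flatten(layer_b))
```

A layer whose map is an exact scaled replica of another gives a coefficient of 1, so values close to 1 in Table 2.1 are consistent with the scaled-replica behavior described in Section 2.1.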
To reduce errors in reading temperatures in areas which are capable of reaching
much higher temperatures than T_hotspot, we used smaller grids and more sensors in those
areas. This method is called hybrid thermal gridding [54] and is shown in Figure 2.9.
Figure 2.9. Thermal sensor distribution in a 4-layer 3DIC
In the third step we pair each two layers and assign a 3D thermal sensor to them.
In each step a reduction in the number of sensors and maximum sensor reading error will
be studied. Table 2.2 compares the total number of sensors needed in each layer and
maximum sensor reading error for each step.
Table 2.2. Number of sensors and maximum reading error
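Experiment step | Layer | Number of sensors per layer | Total number of sensors | Maximum error
Step 1 | layer 1 | 64 | 256 | 10.6%
Step 1 | layer 2 | 64 | |
Step 1 | layer 3 | 64 | |
Step 1 | layer 4 | 64 | |
Step 2 | layer 1 | 8 | 135 | 5.6%
Step 2 | layer 2 | 6 | |
Step 2 | layer 3 | 62 | |
Step 2 | layer 4 | 59 | |
Step 3 | layers 1&2 | 8 | 70 | 5.8%
Step 3 | layers 3&4 | 62 | |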
As can be seen in the table, by allocating sensors only in the hotspot areas the
total number of sensors is reduced by 47% in comparison with a uniform sensor
allocation method. Employing hybrid thermal gridding reduces the maximum sensor
reading error from 10.6% to 5.6%. To compute this error, the maximum of differences between
the temperature of the 3D chip provided by HotSpot 5.0 grid-based thermal simulation
and the temperature detected by the sensors in critical areas of the 3DIC was obtained. In
the third step 3D thermal sensors are shared between each pair of two adjacent layers.
The total number of needed sensors is reduced by a further 48%, from 135 to 70. The slightly
higher error (5.8% vs. 5.6%) is because of the small nonlinearity of the sensor.
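The reductions quoted above follow directly from the sensor totals in Table 2.2, as the short check below shows. The function and variable names are chosen for this example only.

```python
def percent_reduction(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100.0

# Total number of sensors after each step (Table 2.2).
totals = {"step 1": 256, "step 2": 135, "step 3": 70}

step1_to_2 = percent_reduction(totals["step 1"], totals["step 2"])  # about 47%
step2_to_3 = percent_reduction(totals["step 2"], totals["step 3"])  # about 48%
overall = percent_reduction(totals["step 1"], totals["step 3"])     # about 73%
```

Note that the 48% figure is measured against the hotspot-only allocation of step 2, not against the uniform allocation of step 1.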
Chapter 3: Sensor Allocation in 3DICs
3.1 3D Thermal Map Modeling for 3DICs
Thermal measurement techniques and management methods in 3DICs require a
solid understanding of thermal behavior in 3DICs. A precise full 3D thermal analysis
requires solving of the Partial Differential Equation (PDE) of heat conduction [66]:
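$$ \rho c_p \frac{\partial T(\mathbf{r},t)}{\partial t} = \nabla \cdot \left[ k(\mathbf{r},T) \, \nabla T(\mathbf{r},t) \right] + g(\mathbf{r},t) \tag{3.1} $$
where ρ, c_p, and k are the density, heat capacity, and thermal conductivity of the material,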
respectively. Thermal conductivity is a function of location and current temperature. T is
the temperature, and r is the spatial coordinate of the point at which temperature is being
measured, g is the power density in point r, and t represents time. The equation calculates
the transient thermal response to the power being dissipated at point r, and it is dependent
on the characteristics of the material. A set of thermal boundary conditions forced by the
chip packaging macromodel and the location of the heatsink (HS) should also be
formulated and used in solving the PDE. The boundary condition is formulated as shown
in Equation (3.2):
\[ k(r,T)\, \frac{\partial T(r,t)}{\partial n_i} + h_i\, T(r,t) = f_i(r,t) \tag{3.2} \]
where $n_i$ is the outward direction normal to the boundary surface i, $h_i$ is the heat transfer
coefficient of the surface, and $f_i(r,t)$ is an arbitrary function on the boundary surface $s_i$.
A total of six boundary conditions are required to solve the 3DIC thermal behavior.
The Finite Difference Method (FDM) [66] is one way of solving the PDE in Equation (3.1)
subject to the boundary conditions in Equation (3.2). This method discretizes the entire chip
and relates the temperature distribution to the power distribution of the chip, resulting in
a large system of linear equations to be solved. For better insight into the problem, the FDM
formulation can be viewed as an equivalent thermal circuit based on an electrical analogy.
In this model, each block in the circuit is modeled as a thermal node. Neighboring nodes are
connected to each other by thermal resistances that model lateral and vertical heat conduction
paths. Each block dissipates power and is therefore a source of heat, which is modeled as a
current source. For simplicity it is assumed that only neighboring nodes have a thermal effect
on each other. The model is shown in Figure 3.1.
Figure 3.1. Thermal modeling of two layers in a 3DIC
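The electrical analogy can be illustrated with a minimal steady-state sketch. It is not HotSpot's implementation; the one-dimensional stack, the resistance values, the ambient temperature, and the function name `solve_stack_temperatures` are assumptions chosen only for illustration. Each node injects its power as a current, neighboring nodes are coupled through thermal resistances, and the node nearest the heatsink is tied to ambient through a heatsink resistance, so the temperatures follow from a linear system G·T = P.

```python
import numpy as np

def solve_stack_temperatures(power_w, r_between, r_heatsink, t_ambient=45.0):
    """Steady-state temperatures of a 1D chain of thermal nodes.

    power_w    : power dissipated at each node (W); node 0 is nearest the heatsink
    r_between  : thermal resistance between node i and node i+1 (K/W)
    r_heatsink : thermal resistance from node 0 to ambient (K/W)
    All numeric values used with this function are illustrative assumptions.
    """
    n = len(power_w)
    g = np.zeros((n, n))             # thermal conductance matrix
    g[0, 0] += 1.0 / r_heatsink      # path from node 0 to ambient
    for i, r in enumerate(r_between):
        c = 1.0 / r
        g[i, i] += c
        g[i + 1, i + 1] += c
        g[i, i + 1] -= c
        g[i + 1, i] -= c
    # Solve for the temperature rise above ambient, then add ambient back.
    rise = np.linalg.solve(g, np.asarray(power_w, dtype=float))
    return rise + t_ambient

temps = solve_stack_temperatures([10.0, 0.0, 5.0], [0.5, 0.5], 1.0)
```

In this toy stack the node farthest from the heatsink ends up the hottest, which matches the qualitative behavior discussed in this chapter.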
This method is used in HotSpot [11], an automated thermal model, which
calculates a transient temperature response given the physical characteristics and power
consumption of units on the die. The HotSpot model has been validated with real
temperature measurements from a commercial thermal testing chip [11]. Although
HotSpot gives a precise model for thermal behavior of a 3DIC, it is very time consuming
in calculating the temperature to be used in the algorithm of any thermal management
method.
In this section we propose an abstract method for 3D thermal map modeling of
3DICs. In our proposed method we rely on the facts that any hotspot area in a layer inside
the 3DIC increases if the layer gets further from the heatsink, and a scaled version of each
layer’s thermal map is reflected on other layers. This method helps us to apply a detailed
thermal analysis on each layer in a conventional 2D manner and apply a simplified scaling
method to attain the whole 3DIC spatial thermal map. The model provides tremendous
speedup in analysis time with very little sacrifice in accuracy.
3.2 Proposed 3D Thermal Map Modeling for 3DICs
In this section we propose a method for 3D thermal map modeling of 3DICs. In the
proposed method we rely on the facts that any hotspot area in a layer inside the 3DIC
increases if the layer gets further from the heatsink, and a scaled version of each layer’s
thermal map is reflected on other layers. This method helps us to apply a detailed thermal
analysis on each 2D layer using the existing IC thermal analysis CAD tools and apply a
simplified scaling method to attain the whole 3DIC spatial thermal map. The 3D thermal
map model is generated in three steps. In the first step each layer is assumed to be located
next to the heatsink, with no other active layer inside the 3DIC. Each layer’s thermal
map is then generated using conventional 2D thermal analysis CAD tools. This thermal
map is called the original-thermal-map of the layer. In the second step the effect of
distance from the heatsink (after placing the layer at its position inside the 3DIC) on each
layer’s thermal map is modeled. The resulting thermal map of each layer is called the
intermediate-thermal-map of the layer. In the third step the effect of the layers’ thermal
maps on each other is modeled. The final-thermal-map of each layer is then the superposition
of its own thermal map, after proper scaling based on the distance from the heatsink, and the
effects of the other layers’ thermal maps. Each of these steps is explained in detail as follows.
3.2.1 Effect of Distance from Heatsink on Each Layer’s Thermal Map
In the first step toward the accomplishment of the 3D thermal map model we study
how the thermal map of a single layer changes based on its distance from the heatsink.
Theorem (1): With the same source of heat, the hotspot area increases as a layer
gets further from the heatsink (the hotspot area is the area around a heat source that can
reach a temperature greater than $T_{hotspot}$).
Proof: Assume that a heat source is located in layer i and the hotspot area
around it has radius r. The border of the area has a temperature of $T_b$. The layer’s thermal
resistance to the heatsink is $R_{i-HS}$ (Figure 3.2).
Figure 3.2. Hotspot area around a heat source
By going to layer j, which is further from the heatsink, we attain $R_{j-HS} > R_{i-HS}$. To
have an area with the same temperature $T_b$ on its border, based on Equation (3.3), $R_l$
(the equivalent lateral thermal resistance from the heat source to the border) must increase,
which means that radius r must increase. Therefore, with the same source of heat, the area of
the hotspot in layer j is larger.
(3.3)
By finding the proper scaling factors, any layer’s thermal map can be obtained and
then scaled based on its location relative to the heatsink. It is also necessary to model the
thermal effects of other layers on a specific layer to obtain the complete solution. Thermal
effects of the layers on each other will be studied in the next section.
For now we assume that we have only one active layer in the 3DIC (located in
layer k), and its thermal map is solely obtained by a proper scaling of the thermal map of
the layer derived from the conventional 2D method that we call original-thermal-map of
the layer and show by the matrix O. The original-thermal-map of the layer is divided into
an n×n grid, and the average temperatures of the grid cell points form the n×n matrix O.
The scaling model is obtained by data interpolation of the temperature change of each grid
cell for each active layer location in the 3DIC. The active layer thermal map matrix in
layer k as a function of its original-thermal-map matrix can be formulated as Equation
(3.4). We call this matrix intermediate-thermal-map matrix and show it by I.
\[ I_k = \left( a_k\, O + b_k\, \Delta O + c_k \right) \circ D_k \tag{3.4} \]
Based on Equation (3.4), each element of the intermediate-thermal-map matrix in
layer k, $I_k$, is a function of the corresponding element in the original-thermal-map matrix,
$O$, the horizontal thermal gradient matrix of the original-thermal-map matrix, $\Delta O$,
and the vertical thermal scaling matrix, $D_k$, from the heatsink up to layer k. Here $\circ$
denotes element-wise multiplication, and $a_k$, $b_k$, and $c_k$ are scalar fitting factors
which are functions of lateral thermal conductivity and the location of layer k.
Each element in the thermal gradient matrix $\Delta O$ is calculated by taking the
difference between the temperature of grid cell (i,j) and its neighboring cells’ temperatures.
This factor arises from the fact that a cell with much hotter neighbor cells suffers more
from their temperatures as it gets further from the heatsink, and a larger scaling factor is
applied accordingly.
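The gradient term can be sketched as follows. The text does not specify the neighborhood or how the neighbor differences are combined, so this sketch assumes a 4-cell lateral neighborhood, averages the neighbors, and replicates edge values at the chip boundary. The function name `thermal_gradient` is hypothetical.

```python
import numpy as np

def thermal_gradient(o_map):
    """Mean temperature of the 4 lateral neighbors minus the cell's own temperature.

    Positive entries mark cells that are cooler than their surroundings,
    i.e. cells with hotter neighbors. Edge cells reuse their own value for
    missing neighbors (an assumption, not taken from the source).
    """
    o = np.asarray(o_map, dtype=float)
    padded = np.pad(o, 1, mode="edge")
    neighbor_mean = (
        padded[:-2, 1:-1] + padded[2:, 1:-1] +   # up, down
        padded[1:-1, :-2] + padded[1:-1, 2:]     # left, right
    ) / 4.0
    return neighbor_mean - o

delta_o = thermal_gradient([[50.0, 50.0, 50.0],
                            [50.0, 90.0, 50.0],
                            [50.0, 50.0, 50.0]])
```

With this sign convention, the cells next to the hot center cell receive a positive gradient value, so they would be scaled up more strongly as the layer moves away from the heatsink.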
Each element in the vertical thermal scaling matrix, $D_k$, is a function of an
equivalent vertical thermal conductivity from the corresponding element in the thermal
matrix of layer k to the heatsink.
Equation (3.5) shows how to calculate $D_k$.
\[ D_k = \prod_{(i,\,i+1)\ \text{between layers}\ m\ \text{and}\ k} D_{i,i+1} \tag{3.5} \]
In this equation layer m is the closest layer to the heatsink, the product is taken
element-wise over the adjacent layer pairs between layer m and layer k, and $D_{i,i+1}$ is an
n×n vertical scaling matrix between layer i and layer i+1. Each element in $D_{i,i+1}$ is a
function of the vertical thermal conductivity between corresponding grid cells in layer i and
i+1 and also the location of the layers in the 3DIC relative to the heatsink.
3.2.2 Modeling of Thermal Effects of Other Layers on a Specific Layer
The thermal effects of layers in a 3DIC on each other can be studied through
primary and secondary paths of heat transfer. The primary path of heat transfer is
responsible for conducting the heat toward the heatsink and removing the heat through the
heatsink. The secondary path of heat is the one toward the PCB (Printed Circuit Board). In
a conventional 2D IC chip, the secondary path of heat transfer is always insignificant,
because there are no additional layers of circuitry between the primary heat source and the
PCB. Clearly this is not the case in a 3DIC. The secondary path of heat transfer in a 3DIC
is more problematic.
If there is a source of heat in the middle of a 3DIC, the layers between the heat
source and the PCB are greatly affected because the path to the heatsink has high thermal
resistivity, practically trapping the heat in these layers. Figure 3.3 compares the maximum
junction temperature of each layer in both the primary and secondary paths of heat transfer
in a 3DIC. This sample chip consists of 7 layers (each of them consists of bulk, active Si,
and metal sublayers which comprise a total of 21 sublayers under study) with the
processor in the middle layer and 3 passive layers both above and below it. Thermal
simulation was performed using HotSpot 5.0 [11].
Figure 3.3. Primary vs. secondary path of heat transfer in a 7-layer 3DIC
As can be seen in the figure, the temperature drops dramatically for the layers
closer to the heatsink along the primary path of heat transfer, while it remains almost
constant for the layers between the source of heat and the PCB along the secondary path
of heat transfer.
Through both primary and secondary paths of heat transfer, a scaled replicate of a
layer’s thermal map will be reflected on other layers. Figure 2.2 shows how the thermal
map of the processor will be replicated on other layers in both primary and secondary
paths for a chip with a processor in the middle layer and 3 passive layers both above and
below it. This effect can be formulated as shown in Equation (3.6).
(Figure 3.3 plot: temperature in ºC, from 90 to 120, versus sublayer number, from 1 to 21, with the heat source mid-stack, the primary path toward the heatsink (HS), and the secondary path toward the PCB.)
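\[ I'_{kl} = \begin{cases}
\left( a'_l\, I_l + b'_l\, \Delta I_l + c'_l \right) \circ P_{kl}, & \text{layer } k \text{ between layer } l \text{ and the heatsink} \\
\left( a'_l\, I_l + b'_l\, \Delta I_l + c'_l \right) \circ S_{kl}, & \text{layer } k \text{ between layer } l \text{ and the PCB}
\end{cases} \]
\[ P_{kl} = \prod_{(i,\,i+1)\ \text{between layers}\ l\ \text{and}\ k} P_{i,i+1}, \qquad
S_{kl} = \prod_{(i,\,i+1)\ \text{between layers}\ l\ \text{and}\ k} S_{i,i+1} \tag{3.6} \]
where $I'_{kl}$ is a thermal map in layer k generated solely based on the thermal effect of
layer l, and $\circ$ denotes element-wise multiplication. If layer k is located between layer l
and the heatsink, the thermal map generated on it is through the primary path of heat transfer
from layer l, and if layer k is located between layer l and the PCB, the thermal map generated
on it is through the secondary path of heat transfer from layer l. Each element in the thermal
gradient matrix $\Delta I_l$ is calculated by taking the difference between the temperature of
grid cell (i,j) and its neighboring cells’ temperatures. This factor arises from the fact that a
cell with much hotter neighboring cells generates hotter grid cells on other layers. $a'_l$,
$b'_l$, and $c'_l$ are scalar fitting factors which are functions of lateral thermal conductivity
and the location of layer l.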
The scaling matrix on the primary path is denoted by $P_{kl}$ and on the secondary path
by $S_{kl}$. Each element of the scaling matrices is a function of the equivalent thermal
conductivity between corresponding elements in layers k and l and of the location of the layers.
Because the layers between the active layer l and the PCB have high thermal resistance to the
heatsink, for the same distance from the active layer l, elements in matrix $S_{kl}$ are greater
than elements in matrix $P_{kl}$, and layer l has a lower thermal effect on the layers between
itself and the heatsink. Each element in the scaling matrix between layers k and l is
obtained as the product of the corresponding elements in the scaling matrices of the adjacent
layers between layer l and layer k, as shown in Equation (3.6).
3.2.3 Scaling Elements Calculation
Each element in the scaling matrices D, P, and S depends on the equivalent
thermal conductivity of that grid cell and the location of the corresponding layer. To
compute the equivalent thermal conductivity the fraction of the grid cell being occupied
by TSVs must be considered as shown in Figure 3.4.
This fraction can be calculated using smaller grid cells. 1% accuracy yields
acceptable error tolerance.
Figure 3.4. Vertical thermal scaling matrix elements calculation
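The TSV occupancy fraction of a grid cell can be estimated by sampling the cell on a finer sub-grid. The sketch below is one possible way to do this, assuming circular TSVs given by center coordinates and a common radius. A 10×10 sub-grid gives a resolution of 1% per sub-cell. The function name and the argument layout are illustrative assumptions.

```python
import numpy as np

def tsv_fraction(cell_x, cell_y, cell_size, tsv_centers, tsv_radius, sub=10):
    """Fraction of one square grid cell covered by circular TSVs.

    The cell spans [cell_x, cell_x + cell_size) x [cell_y, cell_y + cell_size).
    It is sampled at the centers of sub x sub sub-cells, so the result is
    quantized in steps of 1 / sub**2 (1% for sub=10).
    """
    offsets = (np.arange(sub) + 0.5) * cell_size / sub
    xs, ys = np.meshgrid(cell_x + offsets, cell_y + offsets)
    covered = np.zeros_like(xs, dtype=bool)
    for cx, cy in tsv_centers:
        covered |= (xs - cx) ** 2 + (ys - cy) ** 2 <= tsv_radius ** 2
    return covered.mean()

# A TSV much larger than the cell covers it fully; a distant TSV does not touch it.
full = tsv_fraction(0.0, 0.0, 1.0, [(0.5, 0.5)], tsv_radius=5.0)
empty = tsv_fraction(0.0, 0.0, 1.0, [(10.0, 10.0)], tsv_radius=1.0)
# A TSV inscribed in the cell covers roughly pi/4 of it.
partial = tsv_fraction(0.0, 0.0, 1.0, [(0.5, 0.5)], tsv_radius=0.5)
```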
Each scaling element can be calculated as follows:
(3.7)
where f is the fraction of the grid cell occupied by TSVs and the R values are the thermal
resistivities of each layer as a function of their height, the size of the grid cells, and the
material of the layers. As the grid cells are assumed to be equal in size, the R values are
calculated only once. $\alpha_{i,i+1}$ and $\delta_{i,i+1}$ are scaling parameters which are
proportional to the distance of the layer from the heatsink. Parameters $\beta_{(i,i+1),l}$ and
$\gamma_{(i,i+1),l}$ are inversely proportional to the distance from the source layer l.
3.2.4 3D Thermal Map Modeling
Each layer’s thermal map is a combination of its scaled original thermal map and
the effects of other layers. By combining Equations (3.4) and (3.6), each layer’s final-
thermal-map shown by F can be calculated as follows:
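\[ F_k = \sum_{i \in \mathcal{P}_k} \left( a'_i\, I_i + b'_i\, \Delta I_i + c'_i \right) \circ P_{ki}
 + \sum_{i \in \mathcal{S}_k} \left( a'_i\, I_i + b'_i\, \Delta I_i + c'_i \right) \circ S_{ki} + I_k \]
\[ I_i = \left( a_i\, O_i + b_i\, \Delta O_i + c_i \right) \circ D_i, \qquad
I_k = \left( a_k\, O_k + b_k\, \Delta O_k + c_k \right) \circ D_k \tag{3.8} \]
Here the first sum runs over the layers i that have layer k on their primary path of heat
transfer ($\mathcal{P}_k$), the second sum runs over the layers i that have layer k on their
secondary path ($\mathcal{S}_k$), and $\circ$ denotes element-wise multiplication.
$P_{ki}$ models the thermal effect of layer i on layer k when layer k is located on the primary
path of heat transfer of layer i, while $S_{ki}$ models this effect when layer k is located on the
secondary path of heat transfer of layer i. (Layer m is the closest layer to the heatsink.)
$O_i$ and $O_k$ are the original-thermal-maps of layers i and k, and $I_i$ and $I_k$ are their
intermediate-thermal-maps. $F_k$ is the final-thermal-map of layer k.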
This model can be used for efficient 3D thermal map estimation of a 3DIC with
any number of active layers and any possible application set without requiring tedious,
detailed, time-consuming simulations for every application and configuration. The model
also gives a very clear insight into the thermal behavior of 3DICs. The model can be used
for both steady-state and transient modeling of 3DICs. For the 3DICs employing cooling
channels [49] the model can be extended after finding proper scaling factors
corresponding to the layer containing cooling channels. The 3D thermal modeling steps
are summarized in Figure 3.5.
Figure 3.5. 3D Thermal map modeling steps
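The three modeling steps can be sketched in code as follows. This is only an illustration under stated assumptions: it takes Equations (3.4), (3.6), and (3.8) to have the form (a·X + b·ΔX + c) multiplied element-wise by a scaling matrix, uses a 4-neighbor average for the gradient, indexes layers so that index 0 is nearest the heatsink, and uses made-up fitting factors and scaling matrices. None of the names below come from the original work.

```python
import numpy as np

def gradient(x):
    """Mean of the 4 lateral neighbors minus the cell value (assumed gradient form)."""
    p = np.pad(x, 1, mode="edge")
    return (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]) / 4.0 - x

def scaled(x, fit, scale):
    """(a*X + b*dX + c) multiplied element-wise by a scaling matrix."""
    a, b, c = fit
    return (a * x + b * gradient(x) + c) * scale

def final_maps(originals, fit, fit_prime, d_adj, p_adj, s_adj):
    """Final thermal map of every layer; layer 0 is assumed nearest the heatsink.

    originals           : list of n x n original-thermal-maps (step 1)
    d_adj, p_adj, s_adj : one n x n scaling matrix per adjacent pair (i, i+1)
    """
    n_layers = len(originals)
    ones = np.ones_like(originals[0])
    # Step 2: intermediate maps, using the element-wise product of D from layer 0 to k.
    inter = [scaled(o, fit, np.prod(d_adj[:k], axis=0) if k else ones)
             for k, o in enumerate(originals)]
    finals = []
    for k in range(n_layers):
        f = inter[k].copy()
        for l in range(n_layers):
            if l == k:
                continue
            lo, hi = min(k, l), max(k, l)
            # Step 3: k closer to the heatsink than l means primary path, else secondary.
            adj = p_adj if k < l else s_adj
            f += scaled(inter[l], fit_prime, np.prod(adj[lo:hi], axis=0))
        finals.append(f)
    return finals

# Two uniform 2x2 layers with made-up factors.
originals = [np.full((2, 2), 10.0), np.full((2, 2), 20.0)]
finals = final_maps(originals, fit=(1.0, 0.0, 0.0), fit_prime=(0.5, 0.0, 0.0),
                    d_adj=[np.full((2, 2), 2.0)],
                    p_adj=[np.full((2, 2), 1.0)],
                    s_adj=[np.full((2, 2), 3.0)])
```

Because the per-pair scaling matrices depend only on the stack and not on the workload, they can be computed once and reused for every application, which is where the speedup over re-running a full simulation would come from.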
Existing 3D thermal analysis tools like HotSpot [11] and 3D-ICE [12] conduct
very detailed thermal analyses by finding the thermal resistance and thermal capacitance
between thermal nodes to model lateral and vertical heat conduction paths. As the number
of active layers and the number of applications to be run on a 3DIC increase, this kind of
detailed thermal simulation is very time consuming, especially for the purpose of thermal
control algorithms like thermal sensor allocation which requires a global view of the
thermal maps of the 3DIC. Using the proposed method we can attain a significant savings
in simulation time especially when the number of active layers and applications increases.
The proposed model reduces the complexity of the problem of solving 3DIC thermal
modeling by decomposing the 3D problem into interactions among essential components
only in 2D layers.
3.3 Application: Thermal Sensor Distribution Algorithm for 3DICs
Efficient thermal sensor allocation is crucial for thermal monitoring of a 3DIC.
Overestimation of temperature has a negative impact on performance by excessive
triggering of thermal control mechanisms, and underestimation reduces reliability or can
even lead to thermal shutdown. An energy-based thermal sensor allocation approach aims
to place the sensors near critical areas such as hotspots. Clustering the hotspots properly
for the desired accuracy is an efficient way to determine the optimum number of sensors
and their positions. An energy-based k-means clustering method for
sensor allocation in conventional 2D ICs is proposed in [31]. In this work, however, we
show that optimum sensor allocation must consider the spatial thermal map of the whole 3DIC
instead of each individual active layer’s thermal map.
Definition (1): The k-means clustering algorithm is defined as follows: Given an
integer k and a set of n data points in an m-dimensional space,
$X = \{ x_i \mid x_i = (x_{i1}, x_{i2}, \ldots, x_{im}),\ i = 1, 2, \ldots, n \}$, determine k centers
such that the mean-square distance from each data point to its nearest center is minimized.
The set $C = \{ c_j \mid c_j = (c_{j1}, c_{j2}, \ldots, c_{jm}),\ j = 1, 2, \ldots, k \}$
consists of these center positions [31].
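Written as an optimization problem, Definition (1) corresponds to the standard k-means objective. The symbols $x_i$ for the data points and $c_j$ for the centers are notation chosen here for illustration:

```latex
\[
\min_{c_1, \ldots, c_k} \; \frac{1}{n} \sum_{i=1}^{n} \; \min_{1 \le j \le k} \lVert x_i - c_j \rVert^2
\]
```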
In the sensor allocation problem, n data points are n hotspot locations in the 3DIC,
and k centers are the optimum position of the k number of sensors for monitoring the
hotspots. Positions of the hotspots and sensors are three-dimensional in 3D space of the
3DIC. To find the hotspot locations, each layer in the 3DIC is divided into n×n grid cells
and the maximum temperature of each grid cell is derived using 3D thermal map
modeling. Because of the large set of applications, especially for multi-core processors,
thermal map modeling should be fast while accurate enough to give the location of all
possible hotspots. The 3D thermal map model proposed in Section 3.2 serves this purpose.
A hotspot is regarded as any grid cell that can reach a temperature higher than a
specified $T_{hotspot}$. We also define critical areas inside the 3DIC, which are the
grid cells with maximum temperature greater than $T_{critical}$. Note that
$T_{hotspot} \gg T_{critical}$.
Thermal sensors have 100% coverage if they monitor all critical areas with acceptable
reading error. No thermal sensor is allocated in non-critical areas of the chip.
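A sketch of how hotspot and critical cells might be extracted from a set of per-application thermal maps follows. The threshold values, the array shape, and the function name are assumptions for illustration, since the text does not give numeric thresholds.

```python
import numpy as np

def classify_cells(thermal_maps, t_hotspot, t_critical):
    """Return boolean masks of hotspot and critical grid cells.

    thermal_maps : array of shape (applications, layers, n, n) with cell temperatures
    A cell is flagged if its maximum temperature over all applications
    exceeds the corresponding threshold.
    """
    worst_case = np.max(np.asarray(thermal_maps, dtype=float), axis=0)
    hotspot = worst_case > t_hotspot
    critical = worst_case > t_critical
    return hotspot, critical

maps = np.array([
    [[[60.0, 95.0], [70.0, 80.0]]],   # application 1, one layer, 2x2 grid
    [[[65.0, 90.0], [88.0, 75.0]]],   # application 2
])
hotspot, critical = classify_cells(maps, t_hotspot=92.0, t_critical=85.0)
hotspot_coords = np.argwhere(hotspot)   # rows of (layer, row, column)
```

The coordinate rows returned by `np.argwhere` are in a form that can be passed directly to a clustering routine as three-dimensional data points.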
The k-means clustering algorithm is solved considering the relative position of the
hotspots in the 3D space of the 3DIC’s cube. To make the allocation thermally aware, the
maximum temperature of each hotspot is added as the fourth dimension of that hotspot. The
k-means clustering algorithm is then solved in a 4-dimensional space. A higher
temperature of a hotspot, as one of its dimensions, attracts a sensor more strongly toward it.
In this way a sensor is allocated closer to the hotspots with higher temperature. Although the
problem is solved in a 4-dimensional space using this approach, we call the algorithm
3D k-means clustering as it is solved in a 3D Euclidean space. We also solved the problem
considering each layer’s 2D thermal map individually, and we call that a 2D k-means
clustering algorithm.
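One way to realize this thermally aware clustering is sketched below using plain Lloyd iterations on four-dimensional points (x, y, layer, temperature). This is an illustrative sketch rather than the implementation used in this work or in [31]: the initialization (the first k points), the fixed iteration count, the temperature weight `t_weight`, and the function name are all assumptions.

```python
import numpy as np

def kmeans_4d(points_xyz, temps, k, t_weight=1.0, iters=50):
    """Lloyd's algorithm on (x, y, layer, weighted temperature) points.

    Returns the 3D sensor positions (temperature coordinate dropped)
    and the cluster label of every hotspot.
    """
    pts = np.column_stack([np.asarray(points_xyz, dtype=float),
                           t_weight * np.asarray(temps, dtype=float)])
    centers = pts[:k].copy()                       # simple deterministic start
    for _ in range(iters):
        # Squared distance from every point to every center.
        d2 = ((pts[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = pts[labels == j]
            if len(members):                       # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers[:, :3], labels

hotspots = [(0, 0, 0), (10, 10, 3), (1, 0, 1), (10, 9, 2)]
temps = [100.0, 101.0, 100.0, 101.0]
sensors, labels = kmeans_4d(hotspots, temps, k=2, t_weight=0.1)
```

The weight on the temperature coordinate controls how strongly temperature influences the grouping relative to physical distance, so the spatial coordinates and the weighted temperature should be brought to comparable scales before clustering.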
Because of the physical adjacency and use of high thermal conductive TSVs
between the layers, the thermal maps of adjacent layers are highly correlated to each other.
Therefore, any sensor’s temperature can also represent the temperature of its immediate
neighbor cells in the adjacent layers. The high thermal correlation between the layers
shows the necessity of monitoring spatial hotspots in 3DICs instead of planar hotspots in
each layer for further thermal control of the 3DIC. A hotspot in a layer sometimes needs to
be controlled
by applying thermal control techniques on its adjacent layers.
In the next section through experiments we show the efficiency of using k-means
clustering considering a 3D thermal map of the chip instead of solving thermal sensor
allocation for each individual active layer separately.
3.4 Experimental Results
This section provides experimental results for the proposed thermal modeling and
sensor allocation technique described in previous sections. In all experiments a modified
version of Alpha 21264 is used as a baseline processor [50]. The processor is scaled to
65nm technology node [68] and the die size is multiplied by 1.5 to provide space for our
homogeneous TSV pattern. TSV diameters and pitches are assumed to be 2 µm and
4 µm, respectively. SPEC2000 benchmarks are run on the
processors [51]. The Wattch infrastructure [52] is used for architectural-level power
modeling of the system, and HotSpot 5.0 [11] is used for grid-based thermal simulation of
the 3DIC. For the packaging parameters, we used HotSpot 5.0 defaults [11].
Figure 3.6. Floorplan of a core processor
The 4-layer 3DIC in our experiments as shown in Figure 3.7 consists of 8 cores of
Alpha 21264 in 90nm. The top two layers by the heatsink consist of 4 cores each, and the
next layer is dedicated to the L2 cache. The last layer which is the closest to the PCB
contains main memory. The second layer is rotated 180 degrees to balance the chip power
profile [53]. The floorplan of each core is shown in Figure 3.6. The 3DIC layers are stacked
face-to-back. 16MB of L2 cache is shared between all cores and the size of main memory is
1024MB. The Alpha 21264 configuration is shown in Table 3.1. The material properties
of the 3DIC in our experiments are shown in Table 3.2.
Figure 3.7. 3DIC stacked layers (Flip-chip configuration)
Table 3.1. Core processor configuration
Die size                4.65×4.65 mm²
Frequency and voltage   2GHz, 1.2V
Instruction queue       64 entries
Functional units        4 IXU, 2 FPU, 1 BPU
Branch predictor        1K local, 4K global
L1 DCache/core          32KB, 2-way, 64B blocks, 3 cycle lat
L1 ICache/core          64KB, 2-way, 64B blocks, 1 cycle lat
Shared L2 cache         16MB, 8-way LRU, 64B blocks, 25 cycle lat
Table 3.2. 3DIC material properties
3DIC layer      Specific heat capacity (J/m³K)   Thermal conductivity (W/m·K)   Thickness (µm)
Bulk (SiO₂)     1.96×10^6                        1.2                            100
Active (Si)     1.63×10^6                        100                            0.1
Metal (Cu)      3.45×10^6                        400                            10
Heat sink       3.55×10^6                        400                            6900
Heat spreader   3.55×10^6                        400                            1000
TIM             4×10^6                           4                              20
Each layer is divided into 128×128 grid cells. The elements of the 128×128 scaling
matrices D, S, and P for each pair of adjacent layers are obtained based on the percentage of
the rectangular prism, connecting two corresponding elements in two adjacent active layers
vertically, that is occupied by TSVs.
The data is then gathered in a lookup table to be used for any considered
application. For any two layers, the scaling factor is then obtained by combining
the scaling factors of each pair of adjacent layers between these two layers.
We evaluated six different applications run on the 3DIC’s 8-core processor. These
applications are combinations of the benchmarks shown in Table 3.3. With different
power profiles of the benchmarks we attained a wide variety of spatial thermal maps for
the 3DIC to examine our 3D thermal map modeling and the efficiency of the 3D k-means
clustering algorithm for sensor allocation in a 3DIC. The applications are shown in Table
3.4.
Table 3.3. Power profile of the benchmarks
                            Power distribution (%)
Benchmark   Avg power (W)   Branch pred   Icache   Dcache   Functional units   Clock   Others
apsi        33.5            2.4           13.4     43.8     7.2                22.5    10.7
equake      25.3            2.6           8.1      47       8.1                20.8    13.4
gcc         28.3            2             6.5      53.2     6.5                19.8    12
bzip        21              4.6           17       14       12.3               31.9    20.2
Using the 3D thermal modeling steps shown in Figure 3.5, we modeled the spatial
thermal map of the 3DIC for our six applications. The maximum of the modeling error
observed was in the layer closest to the PCB in application 4. Figure 3.8 shows the
histogram of the 3D thermal map modeling error for that layer. The layer is divided into
128×128 grid cells, and the error represents the difference between the temperatures of the
grid cells calculated with HotSpot 5.0 and the results provided by our proposed 3D
thermal map modeling. As shown in the figure the mean of the modeling error is 2.5%
while the maximum error is less than 5.5% which is quite acceptable for the purpose of
thermal sensor allocation algorithms.
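Error statistics of this kind can be computed as in the following sketch. The text does not state how the percentage is normalized, so this example assumes the error is the absolute temperature difference relative to the reference temperature in each grid cell. The arrays are placeholders, not measured data.

```python
import numpy as np

def modeling_error_percent(reference, model):
    """Per-cell error (%) of a modeled thermal map against a reference map."""
    reference = np.asarray(reference, dtype=float)
    model = np.asarray(model, dtype=float)
    return 100.0 * np.abs(model - reference) / reference

reference = np.array([[100.0, 80.0], [90.0, 50.0]])   # placeholder reference temperatures
model = np.array([[102.0, 80.0], [87.75, 51.0]])      # placeholder model output
err = modeling_error_percent(reference, model)
mean_err, max_err = err.mean(), err.max()
# Histogram with 1%-wide bins, in the style of an error histogram plot.
counts, edges = np.histogram(err, bins=np.arange(0.0, 12.0, 1.0))
```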
For each individual application, using the same platform, the speedup in using our
3D thermal map modeling is 53x compared to HotSpot 5.0 thermal modeling. HotSpot 5.0
would have to be run again for each application for the entire 3DIC, while our model
would just need to be re-evaluated but not recalibrated. This speedup is significant, given
the vast number of scenarios that must be evaluated for a complete thermal analysis of
typical 3DIC systems and applications.
Figure 3.8. Histogram of 3D thermal map modeling error
Because of the high thermal correlation between the layers in a 3DIC we contend
that the k-means clustering algorithm should be solved as a 3D problem instead of solving
it for each layer individually.
Table 2.1 shows the high thermal correlation of the layers tested for a large set of
applications.
To show the efficiency of using the k-means clustering algorithm in the 3D space
instead of solving the problem for each individual layer we conducted the following
experiment. Each layer in the 4-layer 3DIC is divided into 16×16 macro cells (each macro
cell consists of 8×8 grid cells). After deriving the spatial thermal map of the 3DIC and
pinpointing possible hotspots, we run both the 2D and 3D k-means clustering algorithms. In
both methods a sensor located in a macro cell represents the maximum temperature of that
macro cell. For a macro cell that contains no sensor, if its immediate lateral or vertical
macro cell neighbors contain sensors, its temperature is taken as the average temperature of
its neighbor sensors [30]. This approximation is reasonable because of the high vertical and
lateral thermal correlation between neighboring macro cells, but it is also evaluated in the
experimental results. If the temperature of a macro cell cannot be determined using this
method, we say that this macro cell is not in the coverage area. The goal is to cover the
critical areas, which are any macro cells containing possible hotspots, with acceptable sensor
reading error tolerance.
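The coverage rule can be sketched as follows. The data layout (boolean sensor and critical masks indexed by layer, row, and column of macro cells) and the function name are assumptions for illustration, and the sketch checks only whether a temperature can be determined, not the reading error.

```python
import numpy as np

def coverage(sensor_mask, critical_mask):
    """Fraction of critical macro cells whose temperature can be determined.

    A macro cell is covered if it holds a sensor or if at least one of its
    immediate lateral or vertical neighbors (6-neighborhood) holds a sensor.
    """
    s = np.asarray(sensor_mask, dtype=bool)
    covered = s.copy()
    for axis in range(3):                       # layer, row, column
        for shift in (1, -1):
            rolled = np.roll(s, shift, axis=axis)
            # np.roll wraps around, so blank the wrapped slice at the chip edge.
            index = [slice(None)] * 3
            index[axis] = 0 if shift == 1 else -1
            rolled[tuple(index)] = False
            covered |= rolled
    critical = np.asarray(critical_mask, dtype=bool)
    return (covered & critical).sum() / critical.sum()

sensors = np.zeros((2, 3, 3), dtype=bool)
sensors[0, 1, 1] = True                         # one sensor, center of layer 0
critical = np.zeros((2, 3, 3), dtype=bool)
critical[0, 1, 1] = critical[1, 1, 1] = critical[1, 0, 0] = True
fraction = coverage(sensors, critical)
```

In this toy case the single sensor covers its own cell and the cell directly above it in the adjacent layer, but not the critical corner cell, so two of the three critical cells are covered.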
Figure 3.9 shows percentage of critical area coverage for both 2D and 3D k-means
clustering as the number of sensors increases. The sensor reading error tolerance for both
methods is kept the same, which is less than 5%. To attain a much smaller error tolerance,
the size of macro cells must be set smaller.
Figure 3.9. Monitoring area of the sensors vs. number of sensors
(Figure 3.9 plot: sensors' monitoring area, from 0% to 100%, versus number of sensors, from 0 to 120, for 3D k-means clustering and 2D k-means clustering.)
We can see that with the same number of sensors and error tolerance, 3D k-means
clustering covers a much higher percentage of the critical macro cells than 2D k-means
clustering, due to the high thermal correlation between the layers. When 2D k-means
clustering is applied to each layer separately, the resulting allocation places sensors
vertically close to each other, because any planar hotspot creates hotspots on adjacent
layers too. With 3D k-means clustering, on the other hand, a spatial hotspot is considered
one data point, so fewer sensors are assigned to it. With 60 sensors we can monitor all
critical macro cells in the 3DIC with an error of less than 5%, whereas 108 sensors are
needed with the 2D k-means clustering algorithm. Figure 3.9 also shows that increasing the
number of sensors under 2D k-means clustering may not yield higher coverage, again because
an excessive number of unnecessary sensors is assigned to the same spatial hotspots.
The optimum thermal sensor positions using a 3D k-means clustering algorithm
for sensor allocation are shown in Figure 3.10. We can see that with a minimum number
of sensors, for 100% coverage of the critical area and an acceptable reading error of less
than 5%, thermal sensors are only located in middle layers and they also monitor the
temperatures of their adjacent layers. Note that the 3D thermal map shown in the figure is
for one application, while the sensor allocation considers all six thermal maps of the 3DIC.
The figure shows the macro cells that contain sensors.
Table 3.4 shows the combination of benchmarks for each application used to
create various 3D thermal maps for the 3DIC. The “Max Modeling Error” column shows
the maximum grid cell’s temperature error of our 3D thermal map modeling as compared
to the temperature calculated by HotSpot 5.0 grid-based thermal simulation.
Figure 3.10. Thermal sensor positions
The “Max Sensors Reading Error” column shows the maximum sensor reading
error for the critical areas that are covered by the proposed thermal sensor distribution.
The sensor reading temperature is determined based on the temperature of the grid cell in
which it is located and using our 3D thermal map modeling. The maximum temperature of
critical macro cells determined by the sensors is then compared with the actual maximum
temperature of those macro cells computed with HotSpot 5.0. The column shows the
maximum of these errors.
The small magnitude of the sensor reading errors demonstrates the efficiency of
our distribution method. The maximum sensor reading error in the critical areas is less
than the maximum 3D thermal map modeling error of the whole 3DIC, demonstrating that
the model can predict steep thermal gradients quite precisely. This shows another great
advantage of using this fast 3D thermal map modeling.
The table shows that the maximum 3D modeling error is 5.46%, and the maximum
sensor reading error is 4.40%. This small sacrifice in accuracy is impressive given that the
solution was attained with a 53x speedup compared to HotSpot 5.0 thermal modeling and
uses 44% fewer sensors than the number needed if using conventional 2D k-means
clustering.
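The 44% figure follows from the sensor counts reported for full coverage of the critical area, 60 sensors with 3D k-means clustering versus 108 with 2D k-means clustering:

```latex
\[
\frac{108 - 60}{108} \approx 44.4\%
\]
```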
Table 3.4. Maximum 3D thermal modeling error and sensor reading error
Application   Benchmarks running on core 1 through core 8   Max modeling error (%)   Max sensor reading error (%)
1             apsi/equake/gcc/bzip/bzip/gcc/equake/apsi      2.72                     2.95
2             apsi/equake/gcc/bzip/apsi/equake/gcc/bzip      2.35                     3.27
3             apsi on all cores                              2.13                     3.28
4             equake on all cores                            5.46                     4.40
5             gcc on all cores                               3.53                     4.08
6             bzip on all cores                              2.96                     3.63
Chapter 4: Multiobjective Optimization
4.1 Study of Multiobjective Optimization Techniques for VLSI Circuits
A multiobjective optimization problem (MOP) is the simultaneous optimization of several
objective functions, with the aim of reaching a solution that is the best in regard to all
of the objective functions. As a first step toward developing a multiobjective optimization
framework for 3DICs, we study the general problem domain of applying
multiobjective optimization to VLSI circuit design.
Multiobjective optimization problems are present at different levels of VLSI
circuit optimization. During chip design, circuit designers typically wish to optimize a
circuit with respect to its performance, power dissipation, and layout area at the same
time. Optimizing a circuit for speed and for power is nearly always conflicting, i.e.,
higher speed leads to higher power dissipation. If this optimization is done on the
transistor or gate sizing of the circuit, finding the best sizing vector for both speed and
power is a challenging task. This problem can also emerge in a circuit working under
different supply voltage levels or clock frequencies, or even different die temperatures.
As discussed in the power and speed trade-off case, multiobjective optimization
techniques become important when optimization of different objective functions conflicts
with each other.
This chapter focuses on multiobjective optimization of power and delay, as an
example of the two conflicting objective functions for VLSI circuits. The proposed
methods can be applied for any other set of disagreeing functions to find the best (Pareto
optimal) operating point in today’s multi-mode multi-corner problems. First we propose
different non-convex and convex models of power and delay and explain which ones are
the most appropriate to be used in multiobjective optimization. In particular, we present
three methods for solving a multiobjective problem. The first one is the Weighted Sum
method which is the most popular technique for solving multiobjective optimization.
Compromise Programming is the second method which is quite effective for convex
functions.
Finally a Satisficing Trade-off Method (STOM) based algorithm is presented for
the optimization (Satisficing is a portmanteau of satisfy and suffice; it is a strategy that
attempts to meet criteria for adequacy, rather than to try and find an optimal solution.)
STOM can be used to find the best point when the designer is provided some goals for
the objective functions. We will describe step by step how to find the best point of
operation of a circuit in regard to multiple conflicting objective functions [59]. After
providing a wide mathematical analysis of different multiobjective optimization
techniques in this chapter, the problem of multiobjective optimization in 3DICs will be
studied in the next chapter.
4.2 Power and Delay Modeling
In this section, two different methods are proposed to model delay and power of a
circuit. We will use these analytical models in the multiobjective optimization
algorithms.
4.2.1 Non-convex Modeling of Power and Delay
The first model is obtained by second-order polynomial
interpolation of sampled power and delay points. We specified upper and lower
bounds for the sizing vector elements, applied every combination of the possible values
of the sizing vector elements to a circuit simulator (HSPICE), and obtained the
corresponding delay and power dissipation values for the circuit. By interpolating
the sampling points, we then derived analytical models for the circuit power dissipation and
delay, which happen to be non-convex. The non-convex model of delay is formulated as
follows:
∑ ∑ ∑∑
⁄ (4.1)
where n signifies the non-convexity of the model, $x_i$ is the size of gate i, and the three
coefficients indexed by i are real-valued fitting coefficients. In the experimental results,
the maximum error of the delay macromodel equation is 6% over every possible sizing vector
value, while the mean and the variance are 1.5% and 1.2%, respectively. The histogram of the
modeling error is shown in Figure 4.1.
The non-convex model of power dissipation is as follows:
(4.2)
where $x_i$ is the size of gate i, and the four coefficients indexed by i are real-valued
fitting coefficients. The maximum error of this macromodel equation for power is 0.3% over
every possible sizing vector value.
Figure 4.1. Histogram of error in non-convex delay modeling
4.2.2 Convex Modeling of Power and Delay
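A multiobjective optimization problem is convex if all objective functions and the feasible
region are convex. Many algorithms can solve a convex MOP, but they have
difficulty solving a non-convex MOP [37].
Definition (2): A function $f: \mathbb{R}^n \to \mathbb{R}$ is convex if for all $x^1, x^2 \in \mathbb{R}^n$:
$$f(\beta x^1 + (1-\beta) x^2) \le \beta f(x^1) + (1-\beta) f(x^2), \quad 0 \le \beta \le 1 \tag{4.3}$$
A set $S \subset \mathbb{R}^n$ is convex if $x^1, x^2 \in S$ implies that
$\beta x^1 + (1-\beta) x^2 \in S$ for all $0 \le \beta \le 1$.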
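Definition (2) can be spot-checked numerically. The following is only a minimal sketch: the helper `is_convex_on_samples` and the two sample models are hypothetical and are not the macromodels of this chapter. Passing the check on a finite set of samples does not prove convexity; a single violation, however, proves non-convexity.

```python
import itertools

def is_convex_on_samples(f, points, betas=(0.0, 0.25, 0.5, 0.75, 1.0), tol=1e-9):
    """Return False as soon as one sampled pair violates the convexity inequality."""
    for x1, x2 in itertools.product(points, repeat=2):
        for b in betas:
            mix = b * x1 + (1.0 - b) * x2
            if f(mix) > b * f(x1) + (1.0 - b) * f(x2) + tol:
                return False
    return True

# Positive "gate sizes" from 0.5 to 4.0 in steps of 0.25 (assumed range).
samples = [0.5 + 0.25 * i for i in range(15)]

def convex_delay(x):
    # Hypothetical posynomial-like model: a/x + b*x, convex for x > 0.
    return 2.0 / x + 0.5 * x

def nonconvex_delay(x):
    # Hypothetical polynomial model with two separate minima (x = 1 and x = 3).
    return (x - 1.0) ** 2 * (x - 3.0) ** 2

convex_ok = is_convex_on_samples(convex_delay, samples)
nonconvex_ok = is_convex_on_samples(nonconvex_delay, samples)
```

Here `convex_ok` is true, while `nonconvex_ok` is false because the midpoint of the two minima lies above the chord joining them.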
[Figure 4.1 plot: "Histogram of Error"; x-axis: error bin (0 to 6); y-axis: frequency (%) (0 to 25)]
Delay and power dissipation can also be modeled as convex functions of the sizing vector
elements. We used posynomial functions (see footnote 2) to model the circuit power and delay
values. These functions are convex when the variables have positive values [55]. Again,
the model is obtained by interpolating the sampling points and finding the best fitting
coefficients. The convex delay modeling equation is as follows:
(4.4)
where $x_i$ is the size of gate i, and the four fitting coefficients are positive real numbers.
The maximum error of this macromodel equation over every possible sizing vector in the
specified range is 18%, while the mean and the variance are 3% and 5%, respectively.
The convex power modeling equation is as follows:
∗
(4.5)
where $x_i$ is the size of gate i, and the fitting coefficient is a positive real number.
The maximum error of this macromodel equation is 5%.
Although the convex models are less precise, they let us easily
find the global optimum of the function during optimization. After finding the global
optimum point using the convex models, we narrow the search to the neighborhood of that
point and use the more precise delay and power macromodel equations to find the
actual optimum point of the functions.
2 A posynomial is a function of the form
$f(x_1, \ldots, x_n) = \sum_{k=1}^{K} c_k \, x_1^{a_{1k}} \cdots x_n^{a_{nk}}$,
where all the coordinates $x_i$ and coefficients $c_k$ are positive
real numbers, and the exponents $a_{ik}$ are real numbers.
4.3 Multiobjective Optimization Problem
The multiobjective optimization problem (MOP) is formulated as follows:
$$\text{minimize} \ \{ f_1(x), f_2(x), \ldots, f_k(x) \}, \quad k \ge 2, \qquad \text{subject to} \ x \in S \tag{4.6}$$
where $f_i$ is an objective function and $f_i: \mathbb{R}^n \to \mathbb{R}$.
$x = (x_1, x_2, \ldots, x_n)$ is called the
decision vector upon which the optimization is performed. $S \subset \mathbb{R}^n$ is the feasible region
that is determined by the constraints on the MOP.
The goal is to minimize all objective functions simultaneously. We assume that
there is no single solution that is optimal with respect to every objective function. The
optimum single-objective solutions are at least partly conflicting with one another, and
they can also be incommensurable, i.e., they may be expressed in different units (µW of
power dissipation vs. ns of delay).
Figure 4.2. Pareto optimal set
4.3.1 Pareto Optimal Solution
If the objective functions are conflicting, there is not a single solution that
minimizes all the objective functions simultaneously. We are thus looking for a non-
dominated solution in the sense that if we try to optimize one of the objective functions
any further, the other objective function value(s) will degrade. This kind of optimality is
called Pareto optimality [37].
Definition (3): A decision vector $x^* \in S$ is Pareto optimal if there is no other
decision vector $x \in S$ such that $f_i(x) \le f_i(x^*)$ for all $i = 1, \ldots, k$ and
$f_j(x) < f_j(x^*)$ for at least one index j (Figure 4.2).
Mathematically, a MOP is solved when the Pareto optimal set is reached.
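Definition (3) translates directly into a dominance test. The sketch below is illustrative only: the function names and the (power, delay) pairs are hypothetical, and all objectives are assumed to be minimized.

```python
def dominates(a, b):
    """True if objective vector a dominates b: no worse anywhere, better somewhere."""
    no_worse = all(ai <= bi for ai, bi in zip(a, b))
    better = any(ai < bi for ai, bi in zip(a, b))
    return no_worse and better

def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (power, delay) pairs for five candidate sizing vectors.
candidates = [(48.0, 12.0), (48.5, 10.5), (49.0, 10.3), (48.7, 11.0), (49.2, 10.4)]
front = pareto_front(candidates)
```

In this example `front` keeps the first three candidates; (48.7, 11.0) is dominated by (48.5, 10.5), and (49.2, 10.4) is dominated by (49.0, 10.3).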
4.3.2 Multiobjective Optimization Solution Methods
A MOP is usually solved by scalarization, meaning that the objective functions are
combined so that in the end a single objective function is optimized. Because a single
objective function may be optimized only to a local optimum, solving a MOP may also
end in a locally Pareto optimal set. If a MOP is convex, then every locally Pareto optimal
solution is also a globally Pareto optimal solution.
Moving from one Pareto optimal decision vector to another requires a trade-off.
There is always a decision maker (the designer) with better insight into the problem
who decides which optimal decision vector is chosen. A function $U: \mathbb{R}^k \to \mathbb{R}$ that
represents the preference of the decision maker among the objective functions is called a
“Value Function”. In a MOP the value function is assumed to be implicitly known [37]. A
decision maker is needed to reach a single solution for the problem.
Based on the participation of the decision maker in the different phases of
problem solving, MOP methods fall into four categories. In no-preference
methods, the decision maker does not participate. In a posteriori methods, the decision maker
chooses the desired answer among the Pareto optimal solutions at the end. In a priori
methods, the preference and opinion of the decision maker are considered before the
problem is solved. In interactive methods, the decision maker is involved in every
iteration of the optimization and decides based on the new information [37]. There
are several methods for solving a MOP; here we explain three methods in detail.
4.3.2.1 Weighted Sum Method
In the Weighted Sum (WS) method the weighted sum of the objective functions is
minimized. The problem can be formulated as follows:
$$\text{minimize} \ \sum_{i=1}^{k} w_i f_i(x), \qquad \text{subject to} \ x \in S, \quad w_i \ge 0, \quad \sum_{i=1}^{k} w_i = 1 \tag{4.7}$$
where $w_i$ is the weight corresponding to objective function $f_i$. The weights are
nonnegative real numbers and are normalized. By perturbing the weights in the WS method we
can trace the Pareto surface, although some solutions may be missed for non-convex functions
[37]. WS is categorized as an a posteriori method: after finding the Pareto optimal set by
changing the weights, the designer selects the desired solution in that set.
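The weight sweep, and the way it can miss Pareto points on a non-convex front, can be illustrated with a toy example. This is a minimal sketch on three hypothetical, normalized (power, delay) points that are all Pareto optimal; it is not the optimizer used in the experiments.

```python
def weighted_sum_pick(points, w_power):
    """Return the point minimizing w_power*power + (1 - w_power)*delay."""
    w_delay = 1.0 - w_power
    return min(points, key=lambda p: w_power * p[0] + w_delay * p[1])

# Hypothetical normalized (power, delay) points; none dominates another.
# The middle point lies in a non-convex part of the front.
points = [(0.0, 1.0), (0.6, 0.6), (1.0, 0.0)]

# Sweep 101 normalized weight pairs (w, 1 - w), including the end points 0 and 1.
weights = [i / 100 for i in range(101)]
found = {weighted_sum_pick(points, w) for w in weights}
```

The middle point scores 0.6 for every weight, while one of the two end points always scores at most 0.5, so `found` contains only the two end points no matter how finely the weights are swept.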
4.3.2.2 Compromise Programming Method
In this work we also use an a priori method called the Compromise Programming
(CP) method for our MOP circuit optimization. CP is categorized as a goal programming
method. Any goal programming method for a MOP is an a priori method, as the designer sets
the goals for the final solution. In this method the distances between the reference points
and the corresponding objective functions are minimized. Consider k objective functions
$f_1, f_2, \ldots, f_k$ to be optimized simultaneously. We assign the design references
$f_1^*, f_2^*, \ldots, f_k^*$ to the set of objective functions. These values can be equal to the
minimum of each objective function. The problem is formulated as follows:
| ∗
|
∈
(4.8)
The vector w specifies how close each objective function needs to get to its
reference. Sometimes some objective functions are relatively more important and the
designer needs them to be optimized much more than the others. Assigning a larger
$w_i$ to them encourages them to get closer to their references than the
others. The weighting vector also determines the direction of the search toward the
optimum point in the feasible region.
This method is robust and can be used in multiobjective optimization of a digital
circuit. It is simple and straightforward, and it is very efficient in
practice. In circuit optimization the designer can determine the optimum point (reference
point) of each objective function (delay, power, etc.).
The preference of the decision maker is determined by the weights and the values
of the references. If these values are chosen appropriately, the Pareto optimal solution can
be obtained by equation (4.8). However, it is sometimes difficult to determine the best
values for them. Moreover, the solution cannot be better than the references, even if
the references were set pessimistically. Note that the desirable solution can be obtained
by adjusting the weights, and there is no positive correlation between the weight $w_i$ and
the corresponding objective function [56].
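As an illustration of the distance-to-reference idea, the sketch below scalarizes with a weighted L_p distance. The exponent `p`, the function names, and the three normalized (power, delay) points are assumptions made for this example; the exact metric of equation (4.8) should be taken from the equation itself.

```python
def cp_distance(f, ref, w, p=2):
    """Weighted L_p distance between objective vector f and the reference point ref."""
    return sum(wi * abs(fi - ri) ** p for fi, ri, wi in zip(f, ref, w)) ** (1.0 / p)

def compromise_pick(points, ref, w, p=2):
    """Return the candidate closest to the reference point under the weighted metric."""
    return min(points, key=lambda f: cp_distance(f, ref, w, p))

# Hypothetical normalized (power, delay) points, all Pareto optimal.
points = [(0.0, 1.0), (0.6, 0.6), (1.0, 0.0)]
ref = (0.0, 0.0)   # best value of each objective when optimized alone

balanced = compromise_pick(points, ref, w=(0.5, 0.5))      # equal importance
power_first = compromise_pick(points, ref, w=(0.9, 0.1))   # power matters more
```

With equal weights the middle point (0.6, 0.6) is selected, a point that a linear weighted sum over the same candidates never returns; with a larger power weight the lowest-power point is selected.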
4.3.2.3 Satisficing Trade-off Method (STOM)
The Satisficing Trade-off Method (STOM) is an interactive method for obtaining a
solution that the decision maker desires. After a Pareto optimal solution has been
obtained, it is presented to the decision maker. The objective functions are then classified
into three classes: those to be improved further, those that are accepted, and those
that can be relaxed. Based on this information, the aspiration levels (i.e., the
objective function values that are satisfactory to the decision maker) are specified. These
aspiration levels are revised interactively until the desired solution is obtained. In
the first step of this method, the range of each objective function is specified: for an
objective function $f_i$, its maximum and minimum values are determined. An
aspiration level is also specified for each objective function.
The problem formulation in the STOM method is as follows:
(4.9)
, 1,2,…, ,
∈, 1 ⁄
This is a modified version of the weighted Tchebycheff problem that employs an
augmentation term multiplied by a small positive constant. This term adds a slight
slope to the contour of the metric, so that weakly Pareto optimal solutions can be
avoided.
This problem is usually solved for a small value of that constant. If no solution
satisfying the problem is found, the decision maker is asked for another aspiration level [55][56].
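The sketch below shows one common way an augmented Tchebycheff scalarization with aspiration levels is written in the STOM literature. It is an assumption-laden illustration, not a transcription of equation (4.9): the weights are taken as 1/(aspiration - ideal), the constant `alpha` is an arbitrary small value, and the candidate (power, delay) points are hypothetical.

```python
def stom_weights(ideal, aspiration):
    """Weights assumed inversely proportional to the gap between aspiration and ideal."""
    return [1.0 / (a - i) for i, a in zip(ideal, aspiration)]

def stom_value(f, ideal, aspiration, alpha=1e-6):
    """Worst weighted deviation from the ideal point plus a small augmentation term."""
    w = stom_weights(ideal, aspiration)
    worst = max(wi * (fi - ii) for wi, fi, ii in zip(w, f, ideal))
    return worst + alpha * sum(wi * fi for wi, fi in zip(w, f))

def stom_pick(points, ideal, aspiration, alpha=1e-6):
    return min(points, key=lambda f: stom_value(f, ideal, aspiration, alpha))

# Hypothetical (power, delay) candidates and designer goals.
points = [(48.0, 12.0), (48.5, 10.5), (49.0, 10.3)]
ideal = (48.0, 10.3)        # best power and best delay taken separately
aspiration = (48.6, 10.6)   # goals the designer would accept

choice = stom_pick(points, ideal, aspiration)
meets_goals = all(c <= a for c, a in zip(choice, aspiration))
```

If `meets_goals` were false, the designer would relax one aspiration level and tighten another, then call `stom_pick` again; that loop is the interactive part of the method.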
4.3.3 Proposed Approach
We now summarize the steps toward finding the best point of operation with regard
to power and delay in VLSI circuits, which are also given in Figure 4.3.
We can use either non-convex or convex models of power and delay.
Non-convex models have less error, but the optimization has a high chance of getting
stuck in local optimum points. Convex models have more error but
are guaranteed to lead to the global optimum points during the optimization.
Furthermore, having generated analytical models for the circuit power and delay
parameters, we can also generate analytical equations for the gradients of the parameters
of interest. Providing analytical gradient equations to the optimizer helps it
reach the optimum point without suffering from the inaccuracies associated with
the numerical computation of the gradient values during the optimization process.
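The difference between an analytical gradient and a numerically estimated one can be seen on a small example. The two-variable model below is hypothetical (a posynomial-like expression in two gate sizes), and the central-difference step `h` is an arbitrary choice.

```python
def power_model(x):
    # Hypothetical posynomial-like power model in two gate sizes.
    return 3.0 * x[0] + 2.0 * x[1] + 1.0 / (x[0] * x[1])

def power_gradient(x):
    # Closed-form partial derivatives of power_model.
    inv = 1.0 / (x[0] * x[1])
    return [3.0 - inv / x[0], 2.0 - inv / x[1]]

def numerical_gradient(f, x, h=1e-3):
    # Central finite differences: the kind of estimate an optimizer falls back on
    # when no analytical gradient is supplied.
    grad = []
    for i in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2.0 * h))
    return grad

x0 = [1.0, 2.0]
exact = power_gradient(x0)
approx = numerical_gradient(power_model, x0)
error = max(abs(a - e) for a, e in zip(approx, exact))
```

The numerical estimate is close but not exact, and it costs two extra model evaluations per variable at every iteration, whereas the closed form is evaluated once.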
Figure 4.3. Power and delay multiobjective optimization
[Flowchart: (1) Power and delay analytical modeling. (2) Convert non-convex to convex
models. (3) Provide analytical gradient to the optimizer. (4) Are the goals provided? If yes,
use STOM to obtain the results. If no: are the weights provided? If yes, use WS or CP with
the weights to obtain the result; if no, use equal weights and WS or CP to obtain the results.]
The impact of the gradient equation will be explained in section 4.4.1.1. By
providing the analytical gradient to the optimizer, the points on the Pareto surface can be
reached more easily. Recall that these are the optimum points based on Definition
(3).
After providing proper models and the analytical gradient to the optimizer, we can
optimize the circuit using the WS or CP method. As will be explained in section 4.4.1.1,
when non-convex models are used the WS method cannot reach all of the possible points on
the Pareto surface; therefore, in this case CP has an advantage over the WS method
[56]. If circuit power and delay parameters are of the same “value” to the designer, we
should give them equal weights and perform multiobjective optimization to find the best
operating point of the circuit with regard to power and delay. If, on the other hand, the
designer has a preference for lower delay over higher power dissipation (or vice versa),
then the weights must be set so as to reflect this preference. Unfortunately, it is hard to
translate a designer’s preference for one or the other objective function into weights that
yield the desired trade-off behavior during the multiobjective optimization process. In
addition, sometimes there are objective function goals set by the designer. In this case we
suggest that the designer use the STOM method, which is an interactive multiobjective
programming technique based on aspiration levels. The aspiration levels are revised
during the optimization until the desired solution is obtained.
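The selection logic of Figure 4.3 can be written as a small helper. This is only a sketch; the function name `choose_method` and its return convention are hypothetical.

```python
def choose_method(goals=None, weights=None, n_objectives=2):
    """Mirror the decision flow of Figure 4.3; return (method, weights)."""
    if goals is not None:
        # Designer supplied objective goals: use the interactive STOM.
        return "STOM", None
    if weights is None:
        # No stated preference: treat all objectives as equally important.
        weights = [1.0 / n_objectives] * n_objectives
    return "WS or CP", list(weights)

with_goals = choose_method(goals=(48.6, 10.6))
with_weights = choose_method(weights=(0.7, 0.3))
default = choose_method()
```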
4.4 Experimental Results
We used a combinational and a sequential circuit to test the effectiveness of the
multiobjective optimization algorithms defined in the previous sections. The
combinational circuit was a Ladner-Fischer 10-bit carry-lookahead adder, shown in Figure
4.4 [57]. In the adder circuit, the propagation delay of the critical path and the power of the
circuit are modeled as functions of gate sizing. Therefore the objective functions are
delay and power, and the elements of the decision vector are the sizes of the gates in the
circuit.
The sequential circuit was a True Single-Phase Clock (TSPC) Flip Flop [58]. In
the TSPC FF, the Clk-to-Q delay and power dissipation are modeled as functions of
transistor sizing. Therefore the two objective functions are Clk-to-Q delay and power
dissipation and the elements of the decision vector are the sizing of the transistors in the
circuit. The schematic of the circuit is shown in Figure 4.5.
Figure 4.4. 10-bit carry-lookahead adder
Figure 4.5. TSPC Flip Flop
4.4.1 Multiobjective Optimization of Power and Delay in a Circuit
We now explain how to find the best operating point of the circuit in regard to
power and delay for the adder circuit. The method can be applied to other conflicting
objective functions (area, delay, routing cost, etc.) and to other circuits.
4.4.1.1 Finding the Pareto Surface
First we used non-convex (but more precise) models of power and delay and
applied the WS and CP methods to combine the objective functions for optimization. For
each weight combination (total of 100 weights) used to scalarize the power and delay
objective functions, we optimized the circuit for 100 initial sizing values. The simulation
results for finding the Pareto surfaces reached for the WS and CP methods for the non-
convex modeling of power and delay are shown in Figure 4.6.
We see that a one step optimization is not guaranteed to reach the global
optimum points, which lie on the frontier of the obtained graph (the lower leftmost points of
the scatter plots, i.e., the Pareto surface).
Figure 4.6. Pareto set for non-convex modeling (a) WS method (b) CP method
There are two main reasons why, for each weight, we do not reach a single point for
all of the associated initial sizing vectors. The first is that both delay and power are non-
convex functions, which have local optima besides their global optimum point. The
second is that the optimizer calculates the gradients numerically. Therefore, starting from
a given initial sizing solution, the optimization iterations may not converge to the
optimum point; rather, they may stop at a point near it.
By providing the analytical gradient to the optimizer, we directly reach the
Pareto surfaces for the WS and CP methods, as shown in Figure 4.7. Here again,
optimizations were done for 100 initial values for each weight (we used 100 weights).
For each weight the optimizer found almost the same point for the different initial
vectors provided. We see that, in both scatter plots, the points at the right end of the
Pareto surface have some spread, and the optimizer did not give us one point (as the
optimal point) for this weight. (The same behavior appears at the left end of the graph as
well, but it is less severe.)
Figure 4.7. Pareto set with analytical gradient (a) WS method (b) CP method
This is where power has a weight of 1 and delay has a weight of 0. It means that
the optimizer had to do single objective optimization. If we optimize the non-convex
power for different initial points, we obtain the graph depicted in Figure 4.9(a), in which
the x-axis corresponds to the number of the initial points (meaning, for example, that x=1
corresponds to the first set of sizing initial vector values supplied to the optimizer) and
the y-axis corresponds to the optimized circuit power reached for that initial point of the
optimization.
The figure shows the optimization of the non-convex power model, in which an
analytical equation for the gradient of the circuit power dissipation function is also
provided to the optimizer. We can see that the optimizer gets stuck in a local minimum
in almost half of the optimizations started from different initial points. Single
objective optimization of the non-convex delay function also gets stuck in local optima
for some of the initial values of the sizing vector.
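The dependence on the initial point can be reproduced with a toy non-convex function. The sketch below uses plain gradient descent with an exact gradient on a hypothetical one-variable model with two minima; it is not the optimizer or the macromodel used in the experiments.

```python
def model(x):
    # Hypothetical non-convex model: global minimum near x = -1.04,
    # local minimum near x = +0.96.
    return (x * x - 1.0) ** 2 + 0.3 * x

def model_grad(x):
    # Analytical derivative of model.
    return 4.0 * x * (x * x - 1.0) + 0.3

def descend(x, step=0.01, iters=5000):
    """Plain gradient descent using the analytical gradient."""
    for _ in range(iters):
        x -= step * model_grad(x)
    return x

starts = [-2.0, -0.5, 0.5, 2.0]            # different initial sizing values
finals = [descend(x0) for x0 in starts]
values = [model(x) for x in finals]
best = min(values)
stuck = sum(1 for v in values if v > best + 1e-6)   # runs that ended in the local minimum
```

Even with an exact gradient, two of the four runs end in the local minimum, which is why several initial points are needed for non-convex models.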
Figure 4.8. Pareto set for convex modeling (a) WS method (b) CP Method
Figure 4.7 also shows that the WS method has the drawback of not being able to
find some points on the Pareto surface for the non-convex models (even for all the
possible weights) while the CP method can find them.
4.4.1.2 Using Convex Modeling
If we try to optimize the convex circuit power and delay equations with analytical
gradient functions, we get the Pareto surfaces for the WS and CP methods depicted in
Figure 4.8.
We observe a much improved Pareto surface for the convex models,
because now each weight has only one point as the optimal point. Single objective power
optimization for convex modeling is shown in Figure 4.9(b). As we can see, there is only
one value (the global optimum point) for every random initial sizing vector. For
convex modeling, both WS and CP can find all points on the Pareto surface.
Figure 4.9. (a) Non-convex power optimization (b) Convex power optimization
4.4.1.3 STOM
In this step we used equation (4.9) for the scalarization of our objective functions.
The aspiration levels are the values of the objective functions that have an equal
percentage of degradation from their optimum values (obtained by single objective
optimization). If the desired result is not obtained after one optimization step, the
aspiration levels are revised by relaxing the objective function that has been over-
improved. With the proposed modeling of delay and power (either convex or non-
convex) and the convex shape of the Pareto surface obtained from the
multiobjective optimization, a change of up to 10% in the aspiration levels can lead to the
desired results.
4.4.2 Experimental Results
In Table 4.1 for the adder optimization, the effect of convex and non-convex
modeling and using of analytical gradient for single step multiobjective optimization are
shown for WS and CP methods.
Table 4.1. Simulation results for the adder circuit
(each cell lists power / delay)
Row | Method | Gradient | Single-obj. power optimization | Single-obj. delay optimization | Multiobjective optimization
1 | WS | w/o grad | 48.011 / 11.992 | 48.932 / 10.463 | 48.719 / 10.537
2 | WS | w grad | 47.924 / 12.533 | 48.908 / 10.279 | 48.657 / 10.389
3 | CP | w/o grad | 47.924 / 12.533 | 48.633 / 10.806 | 48.439 / 11.018
4 | CP | w grad | 48.019 / 12.808 | 48.908 / 10.279 | 48.148 / 11.317
5 | WS | w/o grad | 48.085 / 13.001 | 48.495 / 10.647 | 48.432 / 10.658
6 | WS | w grad | 47.809 / 13.642 | 48.451 / 10.306 | 48.335 / 10.381
7 | CP | w/o grad | 47.809 / 13.642 | 48.464 / 10.603 | 48.123 / 11.408
8 | CP | w grad | 47.809 / 13.642 | 48.451 / 10.306 | 48.102 / 10.791
Rows 1 to 4 share one model type and rows 5 to 8 the other (non-convex / convex).
In the table, single objective power optimization means optimizing
power and using the obtained sizing vector to calculate delay. A similar definition
applies to single objective delay optimization. For non-convex modeling without the
analytical gradient, single step optimization does not guarantee reaching
the best point of operation; we need to run the optimization algorithm for several initial
sizing vectors to reach the best operating point with regard to power and delay. By using
the analytical gradient and convex modeling, we can guarantee reaching the best point in a
single step optimization.
Figure 4.10 shows the percent degradation from the best operating point of
power and delay (which is obtained by single objective optimization with the
analytical gradient). The multiobjective optimizations (numbers 3, 6, 9, 12) have better
results than the single objective optimizations. As we can see in both graphs,
multiobjective optimization with the CP method and the analytical gradient is the best
solution with regard to both power and delay.
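The degradation metric plotted in Figure 4.10 can be reproduced from the tabulated numbers. The sketch below assumes that degradation is the relative increase over the best single-objective value, and it uses the second Power/Delay row pair of Table 4.1 as input; the function name is hypothetical.

```python
def percent_degradation(value, best):
    """Relative increase of value over the best achievable value, in percent."""
    return 100.0 * (value - best) / best

# Values copied from the second Power/Delay row pair of Table 4.1.
best_power = 47.924     # power reached by single-objective power optimization
best_delay = 10.279     # delay reached by single-objective delay optimization
delay_opt_power = 48.908                     # power paid when only delay is optimized
multi_power, multi_delay = 48.657, 10.389    # multiobjective result

power_loss = percent_degradation(multi_power, best_power)
delay_loss = percent_degradation(multi_delay, best_delay)
single_obj_power_loss = percent_degradation(delay_opt_power, best_power)
```

For this row the multiobjective point gives up roughly 1.5% in power and 1.1% in delay, whereas optimizing delay alone costs about 2.1% in power.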
Figure 4.10 also compares the results of the STOM method with the results
obtained with the single step optimizations shown in Table 4.1. Using STOM, the
designer is able to find the desired point at which power and delay both have almost
equal percent degradation from their optimum points (this point is the best with regard to
power and delay when equal weights are assigned to them).
STOM is much more time consuming than the aforesaid methods, since we need to
update the aspiration levels in each step of the optimization to find the best operating
point. For the True Single-Phase Clock Flip Flop (TSPC FF), if we optimize the non-convex
and convex models of power and delay simultaneously with the WS and CP methods, we
observe the same optimization behavior as for the adder, although we changed the CMOS
technology and the circuit topology and used transistor sizing instead of gate sizing.
Figure 4.10. Degradation from the optimum values for the adder circuit: (a) non-convex modeling, (b) convex modeling
Figure 4.11. Degradation from the optimum values for Flip Flop circuit: (a) non-convex modeling, (b) convex modeling
Table 4.2. Simulation results for the Flip Flop circuit
(each cell lists power / delay)
Row | Method | Gradient | Single-obj. power optimization | Single-obj. delay optimization | Multiobjective optimization
1 | WS | w/o grad | 72.709 / 53.858 | 79.596 / 15.192 | 78.37 / 15.21
2 | WS | w grad | 70.73 / 55.82 | 78.52 / 14.86 | 73.24 / 17.56
3 | CP | w/o grad | 72.709 / 53.858 | 79.17 / 15.11 | 75.118 / 16.4
4 | CP | w grad | 70.73 / 55.821 | 78.67 / 15 | 73.23 / 17.58
5 | WS | w/o grad | 70.709 / 56.858 | 77.677 / 14.6 | 74.571 / 16.549
6 | WS | w grad | 70.709 / 56.858 | 77.677 / 14.6 | 75.21 / 15.521
7 | CP | w/o grad | 71.898 / 53.548 | 78.634 / 14.998 | 74.411 / 17.01
8 | CP | w grad | 70.709 / 56.858 | 78.357 / 15 | 74.173 / 17.185
Rows 1 to 4 share one model type and rows 5 to 8 the other (non-convex / convex).
We conclude that for the proposed modeling of power and delay, we can reach an
almost convex Pareto surface, and we can generalize our analysis to other circuits. The
simulation results are gathered in Table 4.2 and Figure 4.11. Again we can see that
multiobjective optimization with the CP method and the analytical gradient is the best
solution with regard to both power and delay.
STOM, while more time consuming, can lead to the best results, in which the
obtained power and delay have only up to 30% degradation from the best power and best
delay of the circuit (in both the adder and the Flip Flop).
Chapter 5: Multiobjective Optimization in 3DICs
The complex structure of the 3DIC makes it difficult to find the optimum design
in regard to the objectives of cost, performance and thermal reliability. This problem is
challenging because if optimization is performed individually for cost, performance or
thermal reliability, the outcome for one objective conflicts with those for other objectives
with regard to the proper number of layers, chip area, position of communicating blocks
and number of TSVs in a 3DIC. Therefore, seeking an optimum design, in any stage of
design and manufacturing of the 3DIC, is challenging and requires complex analysis.
In this chapter we propose an analytical multiobjective optimization method for
3DIC building block placement. The method uses a Quasi-Newton optimization method
in continuous space. Compromise Programming is employed for the scalarization of
the objectives [60].
Although the method is proposed for 3DIC placement, it is also effective for
building block placement in conventional 2D chips. The main differences between
applying this method to 3DICs and to conventional 2D chips are:
1) In a 3DIC the placement is performed in a 3D space. We provide a technique for
finding the optimum number of layers for an optimum structure of the 3DIC. In 2D chips,
in contrast, the placement is accomplished in a plane.
2) A fundamental element of 3DICs is the TSV, which is used for communication between
the layers. In our 3DIC placement we also carefully choose the best places for the TSVs to
be inserted, in order to have optimum communication speed between the blocks and in order
for the TSVs to provide low resistance thermal paths to the heatsink. In 2D chips, no TSV
is involved, as the placement is only in one layer.
3) Thermal reliability is a challenge in a 3DIC. Vertical thermal correlation between the
layers in a 3DIC exacerbates the temperature problem. In our 3DIC placement method, the
hot building blocks are positioned so that they have the least vertical thermal interaction.
There is no vertical thermal interaction to consider in 2D placement.
4) In a 2D placement all building blocks have the same distance from the heatsink, but in a
3DIC the placement should be managed so that the hot blocks are allocated closer to the
heatsink and the cold blocks are placed further from it.
5) As TSVs enable high communication bandwidth between the building blocks, the placement
method in a 3DIC should cause highly connected blocks to be horizontally close to each
other if they are allocated in two different layers, to get the maximum benefit from using
TSVs. In a 2D placement, only planar connections between building blocks are considered, as
there is no TSV capability for high communication bandwidth in a third dimension.
6) The cost model for 3DIC placement should consider the cost of stacking and TSV
insertion, while these factors are not present in a 2D IC cost model.
5.1 Conflicting Objectives
The optimizations of the three objectives of cost, performance and thermal reliability
for 3DIC building block placement conflict with each other with regard to the proper
number of layers, chip area, position of communicating blocks and number of TSVs in a
3DIC, as shown in Table 5.1.
Considering the manufacturing cost, vertical stacking adds the cost of aligning,
thinning, and bonding of the layers to the packaging cost of the chip. TSV formation is an
extra cost for 3DICs, and any extra step in manufacturing affects the yield of the chip.
Increasing the area results in a lower number of dice produced from a wafer. Therefore,
an optimum structure with regard to the number of layers, chip area and number of TSVs
exists.
Table 5.1. Conflicting objectives in a 3DIC
Conflicting objectives | Number of layers ↑ | Number of TSVs ↑ | Area ↑ | Highly connected block stacking
Manufacturing cost advantage | ↓ | ↓ | ↓ | ↑
Thermal reliability | ↓ | ↑ | ↑ | ↓
Performance | ↑ | ↑ | ↓ | ↑
Tight integration of the layers, high thermal resistance paths to the heatsink, and
high thermal correlation between the layers increase the maximum temperature of the
3DIC. As a result, a lower number of stacked layers is favored with regard to the thermal
reliability. TSVs provide a low thermal resistance path to the heatsink, so a larger number
of them reduces the temperature of a 3DIC. Increasing the area reduces the number of
layers and temperature of the chip. Stacking high frequency blocks with high power
dissipation increases the temperature because of their thermal interaction in adjacent
layers. In order to reduce the temperature, high-temperature blocks should be located far
from each other in the 3DIC.
Similarly, a 3DIC enables us to stack two blocks with a high frequency of
communication on top of each other. The blocks can therefore exploit short, high
bandwidth communication routing through the TSVs between them. The clock frequency
can be higher and the speed of communication is increased in consequence. Accordingly,
for better performance we need more stacked layers (which results in smaller
area) and more TSVs.
The factors described above illustrate the conflicting issues involved in optimum
3DIC design. The three objectives of cost, thermal reliability and performance conflict with
each other in the optimum design of a 3DIC. Consequently, we seek a proper
multiobjective optimization method to find an optimum design for a 3DIC with
regard to these objectives.
5.2 Problem Statement and Formulation
In the following we present the formulation for the variables and the objectives
shown in Table 5.1 to be used in the multiobjective optimization algorithm for 3DIC
placement. The algorithm will be explained in the next section.
In each optimization iteration, the relative positions of the building blocks in the
continuous space of a 3DIC cube are determined by the optimizer. Based on these
relative positions, the number of layers is determined. After assigning the building blocks
to each layer, the area of the chip, the number of TSVs and the total wire length inside the
chip are calculated. Cost, temperature and performance are then computed as functions of
these variables.
5.2.1 Variables of the MOP
The decision vector of building block positions is defined to be continuous to be
used in the analytical Quasi-Newton optimization method [62]. The algorithm seeks the
optimum point in the continuous space by analyzing the neighbor points and finds the
point in which the gradient is approaching zero. Therefore discrete placement methods
are not acceptable in our approach.
In our method the vertical scattering of the building blocks defines the number of
layers in each optimization iteration. The minimum is one layer, and the maximum
is dictated by the optimization problem’s constraint. The number of layers is
calculated as follows:
,… ,
min ,… ,
∈ 1, 2, … ,
(5.1)
in which N is the number of layers and m is the total number of building blocks. The
equation uses the vertical coordinate of each block j’s position, a discrete number that
determines the number of layers in the 3DIC package, and a threshold border that
corresponds to that number.
After determining the number of layers and finding the relative position of the
blocks, the blocks will be assigned to predetermined slots using a minimum distortion
linear assignment [67]. Considering their relative positions, each block will be allocated
in a slot in which the square of its Euclidean distance from the center of that slot is
minimized. This assignment is necessary in each iteration in order to calculate the area,
number of TSVs, and total wire length inside the chip. The area, or footprint, of the 3DIC
is the maximum area of the stacked layers and is shown in equation (5.2).
$$A = \max(A_1, \dots, A_N) \tag{5.2}$$
where N is the number of layers and $A_i$ is the area of layer i, calculated based on
equation (5.9).
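The slot assignment described above can be sketched as follows. This is a minimal illustration, assuming only a handful of blocks so that brute-force enumeration is affordable; a practical implementation would use a linear assignment solver such as the Hungarian algorithm. The function name, the sample coordinates and the slot grid are hypothetical.

```python
from itertools import permutations


def assign_blocks_to_slots(block_positions, slot_centers):
    """Assign each block to a distinct slot, minimizing the total
    squared Euclidean distance between blocks and slot centers."""

    def squared_distance(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    best_cost, best_assignment = float("inf"), None
    # Enumerate every way of picking one distinct slot per block.
    for perm in permutations(range(len(slot_centers)), len(block_positions)):
        cost = sum(squared_distance(block_positions[b], slot_centers[s])
                   for b, s in enumerate(perm))
        if cost < best_cost:
            best_cost, best_assignment = cost, perm
    return list(best_assignment), best_cost


# Three blocks with continuous (x, y, z) positions, four slots on two layers.
blocks = [(0.9, 0.1, 0.0), (0.2, 0.2, 0.0), (0.8, 0.9, 1.1)]
slots = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 1.0), (1.0, 1.0, 1.0)]
assignment, cost = assign_blocks_to_slots(blocks, slots)
print(assignment)  # slot index chosen for each block
```

Once every block has a slot, the per-layer areas, the TSV count and the wire length can be evaluated on the discrete placement while the optimizer keeps working on the continuous positions.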
The number of TSVs to be used in the cost and temperature calculation is
formulated in equation (5.3).
$$N_{TSV} = \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} T_{i,j}\, |l_i - l_j| \tag{5.3}$$
where m is the total number of blocks. $l_i$ and $l_j$ represent the layer numbers associated
with blocks i and j, respectively. $T_{i,j}$ indicates the number of TSVs required for
communication between blocks i and j if they are allocated on adjacent layers, which is
based on the number of interconnects between i and j. The number of layers, area and
number of TSVs will be used to calculate the objectives of cost and temperature.
The total wire length, as an indicator of communication length between the
blocks, will be used as the objective of performance in the MOP. We use a Manhattan
distance method to calculate the total wire length, since the metal routing is assumed to
be grid-based on horizontal and/or vertical paths in each layer. Since the TSVs provide a
high bandwidth connection path between the blocks on different layers, making the
highly connected blocks close to each other horizontally improves performance
significantly. Total wire length is calculated as follows:
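$$WL = \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} w_{i,j} \left( |x_i - x_j| + |y_i - y_j| \right) \tag{5.4}$$
in which m is the total number of blocks, $x_i$ and $y_i$ are the horizontal coordinates of
block i, and $w_{i,j}$ is the interconnection weight between blocks i and j.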
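The weighted Manhattan wire length can be illustrated with a short sketch. It assumes a symmetric weight matrix, so that each pair of blocks is counted twice in the double sum and the result is halved; the function name and the sample data are hypothetical.

```python
def total_wire_length(xy, weights):
    """Weighted Manhattan wire length over all block pairs.

    xy[i] is the horizontal (x, y) position of block i and
    weights[i][j] is the interconnection weight between blocks i and j
    (assumed symmetric).
    """
    m = len(xy)
    total = 0.0
    for i in range(m):
        for j in range(m):
            manhattan = abs(xy[i][0] - xy[j][0]) + abs(xy[i][1] - xy[j][1])
            total += weights[i][j] * manhattan
    return total / 2.0  # every pair was visited twice


positions = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]
weights = [[0, 2, 0],
           [2, 0, 1],
           [0, 1, 0]]
print(total_wire_length(positions, weights))  # 19.0
```

Only horizontal coordinates enter the sum, which matches the assumption that vertical connections through TSVs are short compared with in-layer routing.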
5.2.2 Objectives of the MOP
5.2.2.1 Cost
The 3DIC cost consists of the cost of each individual die and the bonding cost of
the layers. The yield of each die and also TSV yield affect the cost. For an N-layer 3DIC
the cost is formulated as shown in Equation (5.5) [44].
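$$C_{3D} = \sum_{i=1}^{N} C_{die,i} + \sum_{i=1}^{N-1} C_{bonding}^{i,i+1} \tag{5.5}$$
in which $C_{die,i}$ is the cost of each individual die. $C_{bonding}^{i,i+1}$ is the cost of stacking each
two layers, which consists of TSV forming, wafer thinning, and bonding. TSV yields
affect the bonding cost. $C_{die}$ is formulated as:
$$C_{die} = \frac{P_{wafer} + C_{TSV}}{N_{die}} \tag{5.6}$$
where
$$N_{die} = Y_{die} \left( \frac{\pi\, d_{wafer}^2}{4 A_{die}} - \frac{\pi\, d_{wafer}}{\sqrt{2 A_{die}}} \right) \tag{5.7}$$
$$Y_{die} = Y(A_{die}, D_0) \tag{5.8}$$
$$A_{die} = (1 + \gamma) A_{cells} + A_{TSV} \tag{5.9}$$
$P_{wafer}$ is the price of each wafer and $C_{TSV}$ is the TSV manufacturing cost for each
wafer. $N_{die}$ is the number of good dice per wafer. Each die has the area $A_{die}$ and
$d_{wafer}$ is the wafer diameter. $A_{die}$ is a function of the area occupied by cells ($A_{cells}$),
the routing overhead, which is calculated by $\gamma$, and the area occupied by TSVs ($A_{TSV}$).
$Y_{die}$ is the yield of each die, which is a function of the area of the die and the defect
density $D_0$. $C_{bonding}^{i,i+1}$ is formulated as:
$$C_{bonding}^{i,i+1} = \frac{C_{stacking}}{Y_{stacking} \left( 1 - N_{TSV}\, F_{TSV} \right)} \tag{5.10}$$
$C_{stacking}$ is the stacking cost for a single tier, and $Y_{stacking}$ is the reliability of the
stacking. $N_{TSV}$ is the number of TSVs between each two layers and $F_{TSV}$ is the TSV
failure probability.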
In the cost model that we used, it is assumed that all layers in the 3DIC are fully
occupied by the building blocks. The areas of the layers in a 3DIC, determined by the
maximum area of the stacked layers, are equal to each other. Making the layers have
equal sizes may cause some layers’ areas to not be fully utilized just to accommodate a
layer or layers in the stack that are all fully occupied. A layer that is not fully occupied
will have a higher yield than one that is fully occupied. Therefore, the formulation for the
cost that is used in this research assumes the worst-case cost for the 3DIC, but other
formulae that take into account occupancy of the die area may more accurately capture
cost and can easily be substituted into the general framework of our model.
5.2.2.2 Thermal Reliability
The thermal analysis step is the main bottleneck during 3DIC optimization. We
use a very fast and efficient thermal analysis for 3DIC in our optimization algorithm [22].
The thermal modeling is based on two facts: any component’s temperature inside
the 3DIC increases if the layer it is located on gets further from the heatsink, and a scaled
version of each layer’s thermal map reflects on other layers. After each optimization step
the thermal map of each layer is determined by:
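[Equations (5.11) and (5.12), which give the thermal map of layer k and its intermediate scaled map, are not legible in this transcript; see [22].]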
in which the thermal map of layer k is represented as a matrix. Each layer is divided
into grid cells, and each element of the matrix is the average temperature of the
corresponding grid cell. The thermal map of layer k is composed of three components:
the thermal map of layer k after proper scaling based on its distance from the heatsink,
the effects of the layers that have layer k on their secondary path of heat transfer, and
the effects of the layers that have layer k on their primary path of heat transfer. Layer m
is the closest layer to the heatsink. One set of scaling matrices applies on the secondary
path and another on the primary path. Each element of these matrices is a function of the
distance between the two layers and the number of TSVs between the corresponding grid
cells in the two layers.
The intermediate thermal maps of layers l and k are obtained after proper scaling from
the heatsink. The intermediate map of layer k is a function of the original thermal map of
layer k, which is obtained by placing the layer next to the heatsink. Three fitting
coefficients, which are functions of lateral thermal conductivity, enter this scaling,
together with the horizontal thermal gradient matrix, which is calculated by taking the
difference of the temperature of grid cell (i,j) from its neighboring cells' temperatures.
Each element of the heatsink scaling matrix is a function of the distance of the layer
from the heatsink and the number of TSVs between the layer and the heatsink. The same
calculations also apply to layer l.
If layer k is located between layer l and the heatsink, the reflection of the thermal map
of layer l on it is through the primary path of heat transfer from layer l, and if layer k is
located between layer l and the PCB, it is through the secondary path of heat transfer
from layer l. Each element in the thermal gradient matrix of layer l is calculated by taking
the difference of the temperature of grid cell (i,j) in the intermediate map of layer l from
its neighboring cells' temperatures. This factor arises from the fact that a cell with much
hotter neighboring cells generates hotter grid cells on other layers. Three further scalar
fitting factors are functions of lateral thermal conductivity and the location of layer l.
Through the three scaling matrices (heatsink, primary path and secondary path), the
effect of the number of TSVs on the thermal map of the 3DIC is modeled during the
optimization (for more details on the thermal model, see [22]). Throughout the
multiobjective optimization simulation we minimize the maximum temperature of the 3DIC.
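The horizontal thermal gradient used by the model can be illustrated with a small sketch. It assumes a four-neighbor stencil and takes the mean of the neighbor-minus-cell differences, so that a cell surrounded by hotter cells gets a positive value; the exact stencil and sign convention of [22] are not reproduced here, and the function name is hypothetical.

```python
def horizontal_gradient(thermal_map):
    """Mean temperature difference between each grid cell and its
    in-layer neighbors (up, down, left, right)."""
    rows, cols = len(thermal_map), len(thermal_map[0])
    gradient = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            neighbors = [thermal_map[a][b]
                         for a, b in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                         if 0 <= a < rows and 0 <= b < cols]
            if neighbors:  # a 1x1 map has no neighbors
                diffs = [t - thermal_map[i][j] for t in neighbors]
                gradient[i][j] = sum(diffs) / len(diffs)
    return gradient


layer_map = [[60.0, 60.0, 60.0],
             [60.0, 50.0, 60.0],
             [60.0, 60.0, 60.0]]
grad = horizontal_gradient(layer_map)
print(grad[1][1])  # 10.0: the cool center cell has hotter neighbors
```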
5.2.2.3 Performance
To enhance the performance of the 3DIC we minimize the communication cost by
decreasing the Manhattan distance between every pair of blocks that communicate
directly. In addition, minimizing the average temperature of the chip also enhances the
performance. Throughout the multiobjective optimization, the total wire length, weighted
based on the routing width, is minimized.
5.3 Multiobjective Optimization Algorithm
In this section we present the method of Compromise Programming for the
scalarization of the objectives. We discuss the convexity of the objectives and propose
our multiobjective optimization algorithm’s flowchart.
Reference values $f_1^*$, $f_2^*$, $f_3^*$ are assigned to the objectives $f_1$, $f_2$, $f_3$.
These references can be the minimum of each objective or any optimistic
guess. We use Compromise Programming [37] to obtain the final problem formulation as
follows:
$$\min_{X} \; Q(X) = \left( \sum_{i=1}^{3} w_i \left| f_i(X) - f_i^* \right|^p \right)^{1/p}, \qquad X = (X_1, X_2, \dots, X_m), \qquad lb \le X \le ub \tag{5.13}$$
in which Q is the final objective function and $f_1$, $f_2$, and $f_3$ are the three
objectives of cost, maximum temperature and total wire length. We used p = 2 in our
MOP. The values of $w_i$ for each objective are determined in a way that all objectives have
the same order of magnitude and are treated equally by the optimizer. The weights can
also be adjusted for any preference. X is the decision vector of building block positions,
and lb and ub are the lower and upper bound limit vectors, respectively, of the elements
in X. m is the total number of blocks.
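The scalarization can be sketched in a few lines. The version below assumes the common weighted p-norm form of Compromise Programming, with the weight applied to each term before the p-th root is taken; the function name and the sample numbers are hypothetical.

```python
def compromise_objective(values, references, weights, p=2):
    """Weighted p-norm distance of the objective values from their
    reference (ideal) values."""
    total = sum(w * abs(f - f_ref) ** p
                for f, f_ref, w in zip(values, references, weights))
    return total ** (1.0 / p)


# Two objectives, both referenced to zero, with unequal weights.
q = compromise_objective(values=[3.0, 8.0],
                         references=[0.0, 0.0],
                         weights=[1.0, 0.25])
print(q)  # 5.0
```

Choosing the weights so that every term has a similar magnitude keeps a single large-valued objective, such as wire length in millimeters, from dominating the distance.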
The Quasi-Newton algorithm [62] is employed in the multiobjective optimization
algorithm.
5.3.1 Convexity of the Objectives
The two objectives of cost and performance are functions of the position of the
building blocks, while temperature is the function of position and power profile of the
building blocks.
Just as a single objective function may be optimized only to a local optimum, solving a
MOP can also end in a locally Pareto optimal set. If a MOP is convex then every locally
Pareto optimal solution is also a globally Pareto optimal solution. A multiobjective
optimization problem is convex if all objective functions and the feasible region are
convex. There are many algorithms that can solve a convex MOP, but solving a
non-convex MOP [37] is more challenging.
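Definition (4): A function $f: \mathbb{R}^n \to \mathbb{R}$ is convex if for all $x, y \in \mathbb{R}^n$:
$$f(\theta x + (1 - \theta) y) \le \theta f(x) + (1 - \theta) f(y), \qquad 0 \le \theta \le 1$$
A region $S \subset \mathbb{R}^n$ is convex if $x, y \in S$ implies that
$\theta x + (1 - \theta) y \in S$ for all $0 \le \theta \le 1$.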
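The definition lends itself to a simple numerical spot check. The sketch below samples points on a segment and reports whether the convexity inequality is violated; such sampling can only disprove convexity on the tested segment, never prove it. The function names and test functions are hypothetical examples.

```python
def breaks_convexity(f, x, y, samples=11, tol=1e-9):
    """Return True if f violates the convexity inequality at some
    sampled point of the segment between x and y."""
    fx, fy = f(x), f(y)
    for k in range(samples):
        theta = k / (samples - 1)
        point = [theta * a + (1.0 - theta) * b for a, b in zip(x, y)]
        if f(point) > theta * fx + (1.0 - theta) * fy + tol:
            return True
    return False


def manhattan_length(p):
    # Convex: a sum of absolute values.
    return abs(p[0]) + abs(p[1])


def negated_square(p):
    # Not convex: an upside-down parabola.
    return -(p[0] ** 2)


print(breaks_convexity(manhattan_length, [-1.0, 2.0], [3.0, -4.0]))  # False
print(breaks_convexity(negated_square, [-1.0], [1.0]))               # True
```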
Theorem (2): For a practical number of TSVs, number of layers and area [63],
cost is a convex function of the block positions.
Proof: For a practical number of TSVs, number of layers and area, cost is an
increasing function of these variables as shown in Figure 5.2. On the other hand an
increasing function of a convex function is convex (Please refer to Appendix )[61].
$f$ is convex if and only if $\nabla^2 f(x) \succeq 0$ for all $x \in \operatorname{dom} f$ [61]. The number of
TSVs (equation (5.3)), the number of layers (equation (5.1)) and the area (equation (5.2))
are each convex functions, because $\nabla^2 f(x) \succeq 0$ holds for them for all possible block
positions. Therefore cost is a convex function.
Theorem (3): Total wire length is a convex function of the block positions.
Proof: $f$ is convex if and only if $\nabla^2 f(x) \succeq 0$ for all $x \in \operatorname{dom} f$. Based on equation
(5.4), $\nabla^2 f(x) \succeq 0$ for all possible block positions; therefore the total wire length
objective is convex.
Theorem (4): Temperature is not a convex function of the building block
positions.
Proof: The maximum temperature of the chip is a function of building block
positions and their power profile. Accordingly the maximum temperature is not a convex
function of the building blocks’ positions because of the thermal interaction between
them. For a block with a specific power profile, positioning it close to hot neighbor
blocks increases the maximum temperature of the chip more than if it is positioned in a
cool neighborhood. This effect is not only dependent on the block’s position. Therefore
maximum temperature is not a convex function.
Both the cost and performance objectives lead the optimizer to their global
optimum, but as the temperature is not convex the MOP may end up in a local
optimum. In order to prevent this, several random initial solutions are produced, based
on the maximum allowable number of layers in our 3DIC design. In each initial random
solution the vertical sparseness of the relative building block positions is limited to one
of the threshold borders shown in equation (5.1). For instance, for a 3DIC with a
maximum of four layers, we run the optimization from four initial random solutions.
Since changing the number of layers results in a significant change in every objective, the
optimization algorithm tends to search for the best result after determining the
best number of layers; accordingly, this technique of choosing initial solutions helps
find the global optimum and keeps the optimizer from getting stuck in
local optima.
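The multi-start idea can be illustrated on a one-dimensional toy problem. In this sketch a simple pattern search stands in for the Quasi-Newton local optimizer, and the two-basin objective is invented for the example; only the strategy of running from several initial solutions and keeping the best result follows the text.

```python
def local_descent(f, x0, step=0.1, tol=1e-6):
    """Basic 1-D pattern search: move while the objective improves,
    otherwise halve the step. Stand-in for a local optimizer."""
    x = x0
    while step > tol:
        if f(x + step) < f(x):
            x += step
        elif f(x - step) < f(x):
            x -= step
        else:
            step /= 2.0
    return x


def multi_start(f, starts):
    """Run the local search from every start and keep the best result."""
    candidates = [local_descent(f, x0) for x0 in starts]
    return min(candidates, key=f)


def toy_objective(x):
    # Two basins: a local minimum near x = -2 and the global one near x = 3.
    return min((x + 2.0) ** 2 + 1.0, (x - 3.0) ** 2)


print(round(local_descent(toy_objective, -4.0), 3))                   # -2.0
print(round(multi_start(toy_objective, [-4.0, -1.0, 1.0, 4.0]), 3))   # 3.0
```

A single run started at -4 stops in the shallow basin, while the four-start run recovers the deeper one, which is the behavior the initial-solution strategy relies on.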
5.3.2 The Proposed Algorithm
Figure 5.1 shows the flowchart of the proposed optimization algorithm. The power
profiles and weighted connections of the Building Blocks (BB) are the inputs to the
algorithm. The algorithm is tested for four random initial placements (for a maximum of
4 layers in our 3DIC experiments). A minimum distortion linear assignment approach is
used to assign the BB’s to the predetermined slots based on their relative spatial
positions. After computing cost, maximum temperature and total wire length as the input
to the optimization algorithm, the optimization iterations are run in order to find an
optimal result. The outcome of any MOP is a Pareto surface based on the initial solutions
and the weights. The best result will then be chosen by the designer.
Figure 5.1. Multiobjective optimization algorithm for 4-layer 3DIC design
In all of our experiments the weights shown in equation (5.13) are determined in a
way that all of the objectives have the same order of magnitude and are treated equally
by the optimizer. Therefore, the optimum results occur when the objectives have the
same degradation from their own optimal values. In order for the Pareto surface to be
generated the experiments should be done for a range of weights. By generating the
Pareto surface for different weights the designer will be able to select an optimum
structure with regard to any objective by sacrificing the objectives that matter least to
the design. Assigning a high weight to one of the objectives will cause that
objective to get much closer to its optimum value while sacrificing other objectives. As
the main advantage of a 3DIC is its high performance because of the reduction in total
wire length, a designer can assign a high weight to the objective of total wire length in
order to reach a higher speed of operation. Other objectives like the mean and variance of
the interconnection lengths can also be introduced in equation (5.13). In this way all of
the interconnections are important to the optimizer, which prevents a skewed distribution
where a long critical path could occur.
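Selecting among candidate designs on a Pareto surface can be sketched as a non-dominated filter. The first four tuples below are the (cost, temperature, wire length) values reported for m = 16 in Table 5.3; the fifth design is a hypothetical, deliberately inferior point added to show the filtering. All three objectives are minimized.

```python
def dominates(a, b):
    """True if design a is no worse than b in every objective and
    strictly better in at least one (all objectives minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Keep only the designs that no other design dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


designs = [
    (1.02, 82.3, 1218),    # N=1, Table 5.3
    (3.07, 102.1, 982),    # N=2, Table 5.3
    (5.36, 125.2, 690),    # N=3, Table 5.3
    (7.62, 131.7, 557),    # N=4, Table 5.3
    (6.00, 130.0, 1000),   # hypothetical design, dominated by N=2
]
front = pareto_front(designs)
print(len(front))  # 4
```

The four reported designs trade cost and temperature against wire length, so none dominates another; the final choice among them is left to the weights or to the designer.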
5.4 Experimental Results
This section provides experimental results for the proposed multiobjective
optimization in 3DICs. After providing an analysis for each objective, we will compare
the optimum design for different numbers of layers in a 3DIC. We will also compare our
global optimization in a 3DIC building block placement with previous work which is
based on Simulated Annealing optimization.
5.4.1 Simulation Setup
In all experiments a modified version of Alpha 21264 is used as a baseline core
processor [50] with the configuration shown in Table 3.1.
SPEC2000 benchmarks are run on the cores to provide different power profiles
[51]. The Wattch infrastructure [52] is used for architectural-level power modeling of the
system. We assume homogeneous power profiles for the memory blocks. The 3DIC
structure uses flip-chip packaging and is shown in Figure 3.7 for a 4-layer configuration.
In a flip-chip structure the heat sink is located on top of the 3DIC and the
connection to external circuitry is provided by the solder balls located at the bottom of
the chip. The material and packaging properties for the 3DIC structure are based on [11]
and also shown in Table 3.2.
In order to compute cost based on equations (5.5) to (5.10) we used the values for
the parameters shown in Table 5.2 [44].
Table 5.2. 3DIC cost parameters
Parameter Description Value
Pwafer Single wafer price $3000
CTSV TSV cost per wafer $300
dwafer Wafer diameter 200mm
D0 Defect density 0.001/mm2
Cstacking Stacking cost for a single tier $2
Ystacking Yield of stacking 0.99
FTSV TSV failure probability 0.0001/No. of TSVs
γ Routing area overhead 0.1
DTSV TSV Dimension 2µm
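To make the role of the parameters in Table 5.2 concrete, the following sketch evaluates a stack cost from them. It is not the exact cost model of [44]: the Poisson die yield, the neglected TSV area, the linear TSV failure term and the use of one TSV count for every interface are assumptions made for this example, and the function names are hypothetical. Under these assumptions the sketch gives about $3.2 for a single 25 mm2 layer and about $5.8 for two 12.5 mm2 layers with 2600 TSVs, in line with the corresponding m=48 rows of Table 5.3.

```python
import math

# Parameter values copied from Table 5.2.
WAFER_PRICE = 3000.0      # $ per wafer
TSV_COST = 300.0          # $ per wafer
WAFER_DIAMETER = 200.0    # mm
DEFECT_DENSITY = 0.001    # defects per mm^2
STACKING_COST = 2.0       # $ per stacked tier
STACKING_YIELD = 0.99
TSV_FAILURE = 0.0001      # failure probability per TSV
ROUTING_OVERHEAD = 0.1


def dice_per_wafer(die_area):
    """Gross dice per wafer: wafer area over die area minus edge loss."""
    return (math.pi * WAFER_DIAMETER ** 2 / (4.0 * die_area)
            - math.pi * WAFER_DIAMETER / math.sqrt(2.0 * die_area))


def die_cost(cell_area):
    die_area = (1.0 + ROUTING_OVERHEAD) * cell_area   # TSV area neglected (assumption)
    die_yield = math.exp(-die_area * DEFECT_DENSITY)  # Poisson yield (assumption)
    return (WAFER_PRICE + TSV_COST) / (dice_per_wafer(die_area) * die_yield)


def bonding_cost(n_tsv):
    """Cost of one stacking step, degraded by stacking and TSV yield."""
    return STACKING_COST / (STACKING_YIELD * (1.0 - n_tsv * TSV_FAILURE))


def stack_cost(cell_area_per_layer, n_layers, n_tsv):
    return (n_layers * die_cost(cell_area_per_layer)
            + (n_layers - 1) * bonding_cost(n_tsv))


print(round(stack_cost(25.0, 1, 0), 2))
print(round(stack_cost(12.5, 2, 2600), 2))
```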
5.4.2 Single Objective Analysis
Based on the values shown in Table 5.2 and equations (5.5) to (5.10) cost is an
increasing function of area, number of layers and total number of TSVs, when these
variables are in the range of practical numbers [63], as shown in Figure 5.2.
Figure 5.2. Cost of 3DIC vs. area, number of layers and number of TSVs
[Figure 5.2 panels: normalized cost vs. area (mm2) for TSV=2000 and N=4; normalized cost vs. number of layers for TSV=2000 and A=25 mm2; normalized cost vs. number of TSVs for N=4 and A=25 mm2.]
The figure shows that the effect of the area and number of layers on the cost is
much larger than the effect of increasing the number of TSVs. Consequently, for the 3DIC
design, the number of layers and area can be determined for an acceptable amount of
cost. After that, the performance can be enhanced by reducing routing length between
communicating blocks and employing more TSVs to provide high bandwidth connection
between them.
Figure 5.3. Temperature of 3DIC vs. area, number of layers and number of TSVs
[Figure 5.3 panels: normalized temperature vs. area (mm2) for TSV=2000 and N=4; normalized temperature vs. number of layers for TSV=2000 and A=25 mm2; normalized temperature vs. number of TSVs for N=4 and A=25 mm2.]
The maximum temperature of the 3DIC, as a function of area, number of layers,
and number of TSVs, is shown in Figure 5.3. The experimental results are based on the
model presented in Section 5.2.2. A greater number of stacked layers increases the
maximum temperature of a 3DIC, due to the high thermal correlation between the layers
and the increase in the distance of active layers from the heatsink. More TSVs inside the
chip reduce the maximum temperature since TSVs provide highly thermally conductive
paths to the heatsink.
For the same number of blocks, increasing the area reduces the number of layers;
therefore, increasing the area has the same impact on the temperature as decreasing the
number of layers. Increasing the area by introducing blank parts inside the layers also
lowers the maximum temperature, as the layers are highly thermally correlated.
Less total wire length inside a 3DIC and more TSVs (to provide high bandwidth
communication) result in a closer horizontal distance between highly connected blocks
and improve the performance.
5.4.3 Effect of Number of Layers on Optimum Design of a 3DIC
In order to compare the effect of number of layers on the optimum design of a
3DIC in this section we provide the multiobjective optimization results by keeping the
number of layers constant. We compare the results for two different configurations. The
first example consists of 8 cores and 8 memory blocks (16 building blocks). In the second
example the number of cores is increased to 32 and memories to 16 (48 building blocks).
The maximum allowable number of layers is four, since more than four stacked layers
generate unacceptably high average temperatures of the chip designs in these
experiments. Table 5.3 shows the multiobjective optimization results for both
configurations and for each number of layers.
Table 5.3. Effect of number of layers on multiobjective optimization results
Number of blocks  Number of layers  Cost ($)  Temp (°C)  Wire length (mm)  Area (mm2)  TSVs  Final objective (Q)
m=16              N=1               1.02      82.3       1218              8.3         0     19
m=16              N=2               3.07      102.1      982               4.2         293   20.1
m=16              N=3               5.36      125.2      690               2.8         668   23.8
m=16              N=4               7.62      131.7      557               2.1         690   25.5
m=48              N=1               3.2       84.6       23350             25          0     58.7
m=48              N=2               5.82      122        16344             12.5        2600  53.1
m=48              N=3               8.91      166.9      10779             8.4         3051  43.3
m=48              N=4               12.59     191.3      4778              6.3         3536  38.3
The last column of the table is the final objective function (Q) based on equation
(5.13). As shown, the minimum of the final objective occurs for a 1-layer structure for 16
building blocks and 4-layer structure for 48 building blocks. This indicates for 8 cores
and 8 memory blocks a 1-layer structure gives optimum results, and for 32 cores and 16
memory blocks a 4-layer structure is best. The reason is the increase in the
communication wire length as the number of building blocks increases. Since we
assume a maximum of four allowable layers, an optimum
placement with four layers decreases the communication length significantly and
compensates for the increase in the temperature and cost in the MOP. In the first
configuration, because of a lower number of communication routings, the optimum result
is obtained for one layer. In the one-layer configuration, the temperature and cost are
minimized, compensating for the increase in the objective of wire length in the MOP. As
a result, the optimum structure highly depends on the final objective of the MOP, which
in our method is the point in which the weighted distance of each objective from its
minimum is minimized.
Figure 5.4 shows the optimum value of each objective in 1, 2, 3 and 4-layer 3DIC
structures in both configurations. As depicted in the figure, increasing the number of
layers generally improves the performance but increases the cost and maximum
temperature of the chip. The figure also shows that the optimum Q increases with the
number of layers in the configuration with fewer interconnects, and decreases in the
configuration with more interconnects.
After several experiments in a 3DIC placement these common trends are observed:
1) Hot building blocks are generally allocated in layers which are closer to the heatsink.
2) Highly connected blocks with low thermal interaction are allocated close to each other
in one layer or in two adjacent layers. 3) Building blocks with high thermal interaction
are allocated far from each other. 4) Optimum positions of TSVs are introduced to
produce low thermal resistance paths to the heatsink. 5) Configurations with a high
number of building blocks and interconnections derive more benefit from 3D placement
vs. 2D placement.
Figure 5.4. Optimum objectives (a) 16 building blocks. (b) 48 building blocks
[Figure 5.4 panels (a) and (b): normalized optimum objectives (cost, temperature, total wirelength, Q) vs. number of layers (1 to 4).]
5.4.4 Global Optimization and Comparison with Previous Works
In this section we consider the number of layers as one of the variables and
provide global multiobjective optimization results for different numbers of cores and
memory blocks. State-of-the-art previous methods of thermal-aware 3DIC placement
[15][44][46][47][48] employ a Simulated Annealing (SA) optimization method in order
to find the optimum structure of the chip. Simulated Annealing is complex and scales
poorly with problem size.
As discussed earlier, 3DIC thermal analysis is the main bottleneck during
optimization iterations. Detailed thermal analyses of compact modeling and Finite
Element Analysis (FEA) of Partial Differential Equation (PDE) of heat transfer are two
main methods used in previous papers. These methods have a long runtime especially for
3DICs with a larger number of layers. Since the number of layers is not fixed in our
approach, using a detailed thermal analysis approach increases the optimization time
significantly. Therefore, we use the fast 3DIC thermal map model explained in Section
5.2.2 in order to improve the optimization runtime.
In all previous methods, a weighted sum is used for scalarization of the objective
functions, while we use Compromise Programming (Section 5.3). Minimizing the
weighted sum of the objectives can make the optimizer keep focusing on an objective
that cannot be improved beyond its global optimum. In Compromise Programming, the
weighted distance of the objectives from their minimum is optimized. Employing this
technique gives a more balanced result and reduces optimization runtime.
None of [15][44][46][47][48] considers cost as one of the objectives; rather they
consider the number of TSVs and/or area as an objective or an optimization constraint.
We compare our results with the method proposed in [15], which used Simulated
Annealing, compact thermal modeling, and a weighted sum for scalarization of the four
objectives of temperature, wire length, area and number of TSVs in Table 5.4. Our
method is based on Quasi-Newton optimization in continuous space, 3DIC fast thermal
map modeling for thermal analysis, and Compromise Programming for scalarization of
three objectives of cost, performance and temperature.
In our proposed method, the analytical algorithm searches the optimum results in
the continuous space. As presented in Section 5.3.1, temperature is a non-convex function
and may lead the algorithm to a local optimum. We run the algorithm for four initial
random solutions as we assume there can be the maximum of four layers in the chip.
More than four stacked layers generate unacceptably high average temperatures of the
chip designs in these experiments. The best results based on the final objective function
are kept at the end. Table 5.4 shows that the proposed method yields a superior result for
different numbers of cores and memory blocks. On average our method reduces the peak
temperature of the 3DIC by 4.3% and total wire length by 5.7% and increases the cost by
only 0.9%. Area and number of layers are the main components that determine the final
cost as shown in Figure 5.2. Since these two variables are the same in the Simulated
Annealing and our proposed optimization results, the optimum costs are comparable in
both methods. Based on the final objective function (Q) the optimum number of layers is
one for the first two configurations and it is four for the last two. The reason is the
increase in the number of interconnects in the last two configurations as explained in
Section 5.4.3.
Table 5.4. Comparison with previous method
SA: Simulated Annealing and Compact Thermal Modeling. QN: Quasi-Newton and Fast Thermal Modeling. Values in parentheses are deviations from the Simulated Annealing solution.
Configuration    Method  WL (mm)       T (°C)          C ($)         Time (min)       TSVs         No. layers
8 core, 8 mem    SA      1289          85.2            1.02          13               0            1
8 core, 8 mem    QN      1218 (-5.8%)  82.3 (-3.5%)    1.02 (0.0%)   6.3 (-106.3%)    0 (0.0%)     1 (0.0%)
16 core, 8 mem   SA      3919          93.5            1.54          22               0            1
16 core, 8 mem   QN      3628 (-8.0%)  90.05 (-3.8%)   1.54 (0.0%)   7.8 (-182.1%)    0 (0.0%)     1 (0.0%)
16 core, 16 mem  SA      3852          176             10.45         262              2686         4
16 core, 16 mem  QN      3807 (-1.2%)  165.01 (-6.7%)  10.58 (1.2%)  10.1 (-2494.1%)  2795 (3.9%)  4 (0.0%)
32 core, 16 mem  SA      5141          197.6           12.28         340              3323         4
32 core, 16 mem  QN      4778 (-7.6%)  191.34 (-3.3%)  12.59 (2.5%)  12.2 (-2686.9%)  3536 (6.0%)  4 (0.0%)
Average deviation from Simulated Annealing solution: WL -5.7%, T -4.3%, C 0.9%, time -1367.3%, TSVs 2.5%, No. layers 0.0%
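The summary figures of Table 5.4 can be recomputed from its rows with a few lines of code. The percentages in the table appear to be computed relative to the Quasi-Newton value, so the sketch follows that convention; this convention is inferred from the numbers and is not stated in the text. The variable and function names are hypothetical.

```python
# (configuration, SA wire length, SA temp, SA time,
#  QN wire length, QN temp, QN time), values from Table 5.4
ROWS = [
    ("8 core, 8 mem", 1289, 85.2, 13, 1218, 82.3, 6.3),
    ("16 core, 8 mem", 3919, 93.5, 22, 3628, 90.05, 7.8),
    ("16 core, 16 mem", 3852, 176, 262, 3807, 165.01, 10.1),
    ("32 core, 16 mem", 5141, 197.6, 340, 4778, 191.34, 12.2),
]


def deviation_percent(proposed, baseline):
    """Deviation of the proposed result from the baseline, expressed
    relative to the proposed value (the convention inferred from the table)."""
    return 100.0 * (proposed - baseline) / proposed


wl_avg = sum(deviation_percent(r[4], r[1]) for r in ROWS) / len(ROWS)
temp_avg = sum(deviation_percent(r[5], r[2]) for r in ROWS) / len(ROWS)
# Speedup over the whole benchmark set: total SA time over total QN time.
total_speedup = sum(r[3] for r in ROWS) / sum(r[6] for r in ROWS)

print(round(wl_avg, 1), round(temp_avg, 1), round(total_speedup, 1))
# -5.7 -4.3 17.5
```

The aggregate speedup of 17.5 is dominated by the two largest configurations; for the two smallest ones the per-configuration ratio is between 2 and 3.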
The method of fast 3DIC thermal modeling models the direct thermal effect of
vertical neighbors on all other layers (through primary and secondary paths of heat
transfer) and horizontal neighbors on the same layer. Therefore, using this modeling
technique in the multiobjective optimization algorithm improves the thermal reliability by
guiding the high temperature blocks to be located as far from each other as possible.
For the sake of fair comparison, the temperature measurement is based on the
detailed compact thermal modeling in both methods shown in the table. On the other
hand, focusing on the cost as one of the objectives instead of number of TSVs or area,
improves the performance. The optimizer reduces the total communication wire length
inside the 3DIC by introducing more TSVs.
Since the number of TSVs has less effect on the final cost in comparison with
area and number of layers (as shown in Figure 5.2), this approach improves the
performance and keeps the cost as low as possible.
Figure 5.5. Comparison of the optimization execution times
The main advantage of our method is lower complexity and much faster
optimization runtime. Summed over the four configurations in Table 5.4, our technique is
more than 17x faster than the method using Simulated Annealing and compact thermal
modeling (36.4 min versus 637 min in total); the gain is about 2x to 3x for the two
smallest configurations and more than 25x for the two largest. Figure 5.5 indicates
that the Quasi-Newton analytical method with the fast thermal map model yields a
runtime that scales linearly with the problem size. Employing fast 3DIC thermal map
modeling, searching in continuous space and employing a Compromise Programming
method of scalarization improve the optimization runtime significantly.
[Figure 5.5 plots optimization time (min) against the number of building blocks (16, 24, 32, 48) for Simulated Annealing with Compact Thermal Modeling and for Quasi-Newton with Fast Thermal Modeling.]
Chapter 6: Summary
6.1 Conclusion
This dissertation focused on two main research topics in 3DIC technology:
thermal analysis and multiobjective optimization. To tackle the thermal analysis problem,
we provided investigations and experiments on thermal correlation between the layers,
primary and secondary paths of heat transfer, spatial hotspots, thermal map modeling,
thermal sensor design, and thermal sensor distribution for 3DICs. For the multiobjective
optimization problem, we started by providing a wide mathematical study for different
multiobjective optimization techniques effective for VLSI circuit optimization. We
focused on delay and power as two well-known conflicting objectives for this study. The
effect of convex and non-convex modeling of the objectives was also studied. For
the multiobjective optimization problem in 3DICs, we focused on three conflicting objectives
of cost, performance and thermal reliability. After studying each objective in detail, we
provided a general framework for multiobjective optimization of these three objectives in
3DIC building block placement. This section concludes this dissertation with a summary
of the contributions and impact of each of these research tasks.
In this dissertation we presented the analysis of the thermal correlation between
the stacked layers in 3DICs. Through primary and secondary paths of heat transfer a
scaled version of the thermal map of each layer is reflected on other layers, and that leads
to a high thermal correlation between the stacked layers. Similarly, any planar hotspot in
each layer is converted to a spatial hotspot. We also introduced a new 3D design for
thermal sensors to be utilized to monitor spatial hotspots in 3DICs. The sensor frequency
vs. temperature response is linear while passing through the TSVs in an optimal routing.
The 3D ring oscillator can also be utilized to monitor any process variation and ageing
effects of the TSVs in 3DICs. Its frequency can also be used as a factor to measure the
thermal correlation between adjacent layers. The use of 3D thermal sensors in the thermal
sensor allocation algorithm for 3DICs was shown to reduce the total number of needed
sensors in a sample 3DIC by almost half.
We also proposed a new thermal map modeling and sensor distribution technique
for 3DICs. The proposed 3D thermal map modeling relies on two factors: the scaled
hotspot area based on the distance of the stacked layer from the heatsink, and the thermal
effect of each active layer on other layers. The model is very fast and efficient and gives a
clear insight into the thermal behavior in 3DICs. The model is utilized to generate the 3D
thermal map of a sample 4-layer stacked 3DIC, consisting of two layers of quad-core
processors and one layer of L2 cache and one layer of main memory. For different
applications running on the processor the proposed modeling yields a maximum error of
less than 5.5%, which is quite acceptable for the purpose of a sensor distribution
algorithm. The 3D thermal sensor distribution is based on the k-means clustering
algorithm in 3D Euclidean space. With the proposed method for the 4-layer stacked 3DIC,
less than 4.4% error in maximum sensor reading of the temperature of the chip for all
analyzed applications was achieved. The algorithm uses the proposed 3D thermal map
modeling, which speeds up evaluation time by 53x for six different applications to be run
on the 3DIC, compared with the situation in which detailed 3D map modeling using
HotSpot 5.0 is embodied in the algorithm. This speedup becomes even more significant
as the number of evaluation scenarios increases for more complicated 3DICs and
applications. Furthermore, as demonstrated, thermal sensor distribution for 3DICs must
be solved as a 3D problem, which results in 44% fewer sensors, as compared with
conventional 2D methods, while maintaining the same sensor reading error tolerance.
We also studied multi-objective optimization in VLSI circuits. We presented
different ways to provide analytical models of power and delay to be used by the
optimizer. The approaches included convex and non-convex models for VLSI circuits.
While a convex model has more modeling error, it guarantees reaching the global
optimum in a single-step optimization. With a non-convex model, the optimizer is more
likely to find the global optimum if the analytical gradient of the model is supplied
instead of a numerical gradient, which is calculated point by point.
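The difference between the two kinds of gradient can be sketched on a toy example. The objective below is a hypothetical one-variable non-convex function, not one of the circuit models from this work; the sketch only shows that the analytical gradient is a single closed-form evaluation, while the numerical gradient is an approximation built from extra function evaluations around each point.

```python
def objective(x: float) -> float:
    """Toy non-convex objective (hypothetical, not a circuit model)."""
    return x**4 - 3.0 * x**2 + x

def analytic_gradient(x: float) -> float:
    """Closed-form derivative of `objective`: one evaluation, no step size."""
    return 4.0 * x**3 - 6.0 * x + 1.0

def numeric_gradient(func, x: float, h: float = 1e-5) -> float:
    """Central-difference estimate: two function evaluations per variable."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

x0 = 1.5
g_exact = analytic_gradient(x0)
g_approx = numeric_gradient(objective, x0)
print(g_exact, g_approx)
```

For a model with n design variables, the central-difference estimate costs 2n model evaluations per gradient and depends on the step size h, which is one reason an optimizer may behave better when the analytical gradient is available.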
Three methods for multi-objective optimization were discussed: Weighted Sum,
Compromise Programming, and STOM. Based on a wide range of experimental
results and mathematical analyses, we concluded that using convex models and an analytical
gradient with the Compromise Programming method provides the best result in a
single-step optimization. Weighted Sum is less effective for solving a multi-objective
optimization problem containing non-convex functions. If the designer has a specific desired
design point, the interactive STOM method is preferred. The proposed method can be
applied to any conflicting operational objectives in VLSI circuits.
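The contrast between Weighted Sum and Compromise Programming on a non-convex problem can be sketched as follows. The candidate set is a hypothetical two-objective Pareto front shaped like a quarter circle, and the Compromise Programming distance is taken here as a weighted Chebyshev (p = infinity) distance from the ideal point; both are assumptions made for illustration, and other values of p are possible.

```python
import math

# Hypothetical candidate designs on a non-convex Pareto front:
# f1 = t, f2 = sqrt(1 - t^2), both objectives to be minimized.
candidates = [(i / 10, math.sqrt(1 - (i / 10) ** 2)) for i in range(11)]
ideal = (0.0, 0.0)  # per-objective minima over the candidate set

def weighted_sum(f, w):
    """Weighted Sum scalarization: sum of w_i * f_i."""
    return sum(wi * fi for wi, fi in zip(w, f))

def compromise(f, w, p=math.inf):
    """Weighted L_p distance of the objectives from the ideal point."""
    terms = [wi * abs(fi - zi) for wi, fi, zi in zip(w, f, ideal)]
    if math.isinf(p):
        return max(terms)
    return sum(t ** p for t in terms) ** (1 / p)

weights = (0.6, 0.4)
ws_best = min(candidates, key=lambda f: weighted_sum(f, weights))
cp_best = min(candidates, key=lambda f: compromise(f, weights))
print("weighted sum picks:", ws_best)
print("compromise programming picks:", cp_best)
```

With these weights the Weighted Sum minimizer lands on an extreme point of the front, where one objective is at its worst value, while the distance-based scalarization selects an interior trade-off point. This matches the observation that Weighted Sum is less effective when the objectives are non-convex.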
These findings were then applied to the 3DIC realm to develop a multiobjective
optimization method to find an optimal placement in a 3DIC. The optima of cost, thermal
reliability and performance conflict with each other with regard to the position of the
building blocks, number of layers, area, total number of TSVs, and total wire length
inside the chip. Therefore, no single solution drives all of the objectives to
their individual optima. Our proposed method searches in continuous space using the
Quasi-Newton analytical optimization method. Our approach also uses the Compromise
Programming scalarization method, in which the weighted distance of the objectives
from their minimum points is minimized. A fast 3DIC thermal map model is used for the
optimization algorithm to eliminate the thermal analysis bottleneck. In comparison with a
previous state-of-the-art multiobjective optimization method which uses Simulated
Annealing, a weighted sum method of scalarization and compact modeling for thermal
analysis, our method reduces the peak temperature of the 3DIC by 4.3% and total wire
length by 5.7% while it is more than 17x faster in optimization runtime.
6.2 Main Contributions
In the research described in this dissertation, an efficient and fast 3D thermal map
model was developed. In comparison with existing models, this model is simple and
gives broad insight into common thermal behaviors in 3DICs. The model can be
developed using existing thermal CAD tools for thermal analysis of 2D ICs. With a
one-time characterization of the chip, the model can be used to re-evaluate the thermal map of
the 3DIC for different workloads, in contrast to existing models, which must be
recalibrated for every analysis. For each individual application on the same platform,
our 3D thermal map modeling is significantly faster than existing
methods. This speedup matters given the vast number of scenarios that must be
evaluated for a complete thermal analysis of a typical 3DIC system and its applications.
We developed a thermal sensor allocation algorithm for 3DICs. To the best of our
knowledge this is the first thermal sensor allocation algorithm which is customized for
3DICs. The algorithm is based on k-means clustering and is solved in the 3D
realm. Because of the physical adjacency and high thermal correlation between the layers,
we show that a thermal sensor allocation problem in a 3DIC should be solved as a
three-dimensional problem in order to avoid assigning an excessive number of sensors to the
same spatial hotspot. We show a considerable reduction in the number of needed sensors
when the problem is solved in the 3D space instead of solving it for each layer
individually.
A new ring oscillator based 3D thermal sensor is proposed in this research. To the
best of our knowledge this is the first design for 3D thermal sensors that can be shared
between adjacent layers. Because of the physical adjacency and use of Through Silicon
Vias (TSVs) as thermal exchangers between the stacked layers, the thermal profiles of the
layers are highly correlated with each other. Any planar hotspot in a layer in a 3DIC is
converted to a volumetric spatial hotspot. Runtime thermal management in 3DICs
requires proper monitoring and measurement of these spatial hotspots inside the chip.
The existence of spatial hotspots and the high thermal correlations between layers are
motivations for designing 3D thermal sensors. This sensor design approach
halves the total number of sensors needed to monitor a typical 3DIC at
the same reading error, as compared to a conventional 2D sensor distribution approach.
We address the problem of multiobjective optimization of VLSI circuits. We
compare three methods of multiobjective optimization and the efficiency of these
methods using convex and non-convex modeling of the objective functions. Existing
methods for multiobjective optimization of VLSI circuits use the simple weighted sum
model, while we show that interactive methods like the Satisficing Trade-off Method
(STOM) can reach acceptable solutions without an exhaustive search for the
optimum operating point of the circuit and without finding the entire Pareto optimal set. This is
also the first work that shows the efficiency of using convex models of the objective
functions. Although the convex model sacrifices some accuracy, the benefit of finding
optimum solutions faster far outweighs the slight decrease in accuracy.
Our method of multiobjective optimization for building block placement in 3DICs
considers the three objectives of cost, performance, and thermal reliability at the same time,
in contrast to prior work, in which at most two of these objectives are considered for the
optimization. We formulate the cost as a function of three
variables (area, number of layers, and number of TSVs) and treat it as one of the
objectives in the MOP. In previous methods, these variables are themselves the MOP’s objectives,
and the final cost is not considered. In our method, these three variables interact
with each other during the optimization and jointly determine the cost
objective. We provide a technique in which the number of layers is one of the variables
during the optimization, unlike all previous methods, in which the number of layers is
predetermined. This approach helps us find the globally optimum structure of the 3DIC.
Another contribution in this research is the utilization of a very fast 3DIC thermal map
model in the optimization algorithm that improves the execution time by eliminating the
thermal analysis bottleneck during optimization iterations.
6.3 Future Work
3DIC technology is a new trend in the semiconductor industry, and many challenging
problems must be addressed for this technology to mature and be widely used. In this
dissertation we provided new techniques for thermal map modeling, thermal sensor
design, and thermal sensor distribution in 3DICs. We also provided a
multiobjective optimization algorithm for optimum building block placement inside the
3DIC. However, we believe that our techniques can be further improved and can also be
used in other applications, as explained in the following:
High power density inside the 3DIC leads to high temperatures inside the chip.
New thermal mitigation techniques have been suggested to decrease this temperature. Liquid
channel cooling is one technique that is believed to be very effective. Our
3D thermal map modeling can be modified to also model the effect of liquid cooling
channels inside the chip. Another technique is using dummy thermal TSVs to create a
low-resistance thermal path to the heatsink. Our model can be used to find
the best positions for these thermal TSVs, as the model is based on lateral and vertical
thermal conductivities inside the 3DIC. We used our model in two applications,
thermal measurement and overall 3DIC structure optimization; however, the model can be
used in conjunction with any thermal management technique, such as DVFS or DTM.
We proposed a ring oscillator thermal sensor design that passes through the TSVs.
The sensor requires 2-point calibration of slope and intercept based on process variation
and effective supply voltage. The design can further be improved to be immune to
variations and noise. The sensor can also be used to monitor TSV reliability and device
aging. The method of using a k-means clustering algorithm to find an optimum position
of sensors inside the 3DIC can also be applied to thermal management techniques. For
example, the method can be used for the clustering of hotspots to define power/thermal
regions to be used in DVFS or DTM algorithms.
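The 2-point calibration of slope and intercept mentioned above can be sketched as a straight-line fit through two reference measurements. This assumes an approximately linear relation between ring oscillator frequency and temperature over the range of interest; the function names and the reference values below are hypothetical and are not measurements of the proposed sensor.

```python
def two_point_calibration(freq1, temp1, freq2, temp2):
    """Return (slope, intercept) of the line temp = slope * freq + intercept."""
    slope = (temp2 - temp1) / (freq2 - freq1)
    intercept = temp1 - slope * freq1
    return slope, intercept

def frequency_to_temperature(freq, slope, intercept):
    """Convert a measured oscillator frequency to an estimated temperature."""
    return slope * freq + intercept

# Hypothetical reference points: (frequency in MHz, temperature in degrees C).
slope, intercept = two_point_calibration(100.0, 25.0, 90.0, 85.0)
estimate = frequency_to_temperature(95.0, slope, intercept)
print(slope, intercept, estimate)
```

Because process variation and effective supply voltage shift both the slope and the intercept, each sensor instance would need its own pair of reference points under this scheme.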
We focused on the three objectives of temperature, cost, and performance in our
multiobjective optimization algorithm. These objectives can be extended to any
conflicting objectives that arise during chip design, such as noise immunity or optimum supply
power. The method can also be used to find an optimum design with regard to all
modes and corners of operation of a 3DIC that is targeted for multi-mode and
multi-corner operation. We focused on optimum building block placement; however, the
method can be used in any stage of design, from early front-end design to back-end
stages such as partitioning, floorplanning, routing, or even clock distribution. In any
of these design stages the method focuses on the three conflicting objectives of cost,
performance, and thermal reliability, and the variables can be modified to suit
that design stage.
References
[1] C. H. Yu, “The 3rd dimension-more life for Moore’s Law,” in Proc. IMPACT, pp.
1–6, 2006.
[2] G. Loh, “3D-stacked memory architectures for multi-core processors,” in Proc.
ISCA, pp. 453-464, 2008.
[3] L. Labrak and I. O’Connor, “Heterogeneous system design platform and
perspectives for 3D integration,” in Proc. ICM, pp. 161-164, 2009.
[4] P. Leduc, et al., “Challenges for 3D IC integration: bonding quality and thermal
management,” in Proc. IITC, pp. 210-212, 2007.
[5] H. Lee, and K. Chakrabarty, “Test challenges for 3D integrated circuits,” IEEE
Design & Test of Computers, vol. 26-5, pp. 26-35, 2009.
[6] X. Wu, P. Falkenstern, and Y. Xie, “Scan chain design for three-dimensional
integrated circuits (3D ICs),” in Proc. ICCD, pp. 208-214, 2007.
[7] P. Franzon, et al. “Design and CAD for 3D integrated circuits,” in Proc. DAC, pp.
668-673, 2008.
[8] Y. Xie, G. Loh, B. Black, and K. Bernstein, “Design space exploration for 3-D
architecture,” ACM JETC, vol. 2, no. 2, pp. 65–103, 2006.
[9] G. L. Loi, et al. “A thermally-aware performance analysis of vertically integrated (3-
D) processor-memory hierarchy,” in Proc. DAC, pp. 991–996, 2006.
[10] J. Da-Cheng, S. Garg, and D. Marculescu, “Statistical thermal evaluation and
mitigation techniques for 3D Chip-Multiprocessors in the presence of process
variations,” in Proc. DATE, pp. 1-6, 2011.
[11] W. Huang et al, “Hotspot: A compact thermal modeling methodology for early-stage
VLSI design,” IEEE TVLSI, vol. 14, no. 5, pp. 501–513, 2006.
[12] A. Sridhar, et al., “3D-ICE: Fast compact transient thermal modeling for 3D ICs
with inter-tier liquid cooling”, in Proc. ICCAD, pp. 463 - 470, 2010.
[13] H. Mizunuma, Y. Lu, and C. Yang, “Thermal modeling and analysis for 3-D ICs
with integrated microchannel cooling,” IEEE TCAD, vol. 30, no. 9, pp. 1293 – 1306,
2011.
[14] T. Yan, et al., “How does partitioning matter for 3D floorplanning?,” in Proc.
GLSVLSI, pp. 73-78, 2006.
[15] J. Cong, J. Wei, and Y. Zhang, “A thermal-driven floorplanning algorithm for 3D
ICs,” in Proc. ICCAD, pp. 306-313, 2004.
[16] J. Cong, and G. Luo, “A multilevel analytical placement for 3D ICs,” in Proc. ASP-
DAC, pp. 361-366, 2009.
[17] A. Coskun, et al., “Energy-efficient variable-flow liquid cooling in 3D stacked
architectures,” in Proc. DATE, pp. 111-116, 2010.
[18] A. Coskun, et al., “Modeling and dynamic management of 3D multicore systems
with liquid cooling,” in Proc. VLSI-SOC, pp. 35-40, 2009.
[19] B. Goplen, and S. Sapatnekar, “Thermal via placement in 3D ICs,” in Proc. ISPD,
pp. 167-174, 2005.
[20] J. Cong, and Y. Zhang, “Thermal via planning for 3-D ICs,” in Proc. ICCAD, pp.
745-752, 2005.
[21] F. Kashfi, J. Draper, “Thermal sensor design for 3D ICs,” in Proc. MWSCAS, pp.
482-485, 2012.
[22] F. Kashfi, J. Draper, “Thermal Sensor Distribution Method for 3D Integrated
Circuits Using Efficient Thermal Map Modeling,” in Proc. THERMINIC, pp.1-6,
2012.
[23] C. Lung, et al., “Thermal-aware on-line task allocation for 3D multi-core processor
throughput optimization,” in Proc. DATE, pp. 1-6, 2011.
[24] G. Semeraro, et al., “Energy-efficient processor design using multiple clock
domains with dynamic voltage and frequency scaling,” in Proc. HPCA, pp. 29-40,
2002.
[25] A. Naveh et al., “Power and thermal management in the Intel® Core™ Duo
Processor,” Intel Technology J., vol. 10(2), pp. 109–122, 2006.
[26] S. Bota et al., “Smart temperature sensor for thermal testing of cell-based ICs,” in
Proc. DATE, pp. 464–465, Mar. 2005.
[27] C.-K. Kim, et al., “CMOS temperature sensor with ring oscillator for mobile DRAM
self-refresh control,” Microelectronics J., vol. 38, no. 10–11, pp. 1042–1049, 2007.
[28] J. Clabes et al., “Design and implementation of the POWER5 microprocessor,” in
IEEE ISSCC Dig. Tech. Papers, pp. 56–57, 2004.
[29] S. Sharifi and T. Rosing, “Accurate direct and indirect on-chip temperature sensing
for efficient dynamic thermal management,” IEEE Trans. on CAD for ICs, vol. 29,
no. 10, pp. 1586– 1599, 2010.
[30] K. J. Lee and K. Skadron, “Analytical model for sensor placement on
microprocessors,” in Proc. ICCD, pp. 24-27, 2005.
[31] S. O. Memik et al., “Optimizing thermal sensor allocation for microprocessors,”
IEEE TCAD, 27(3), pp. 516–527, 2008.
[32] A. N. Nowroz, R. Cochran, and S. Reda, “Thermal Monitoring of Real Processors:
Techniques for Sensor Allocation and Full Characterization,” in Proc. DAC, pp. 56-
61, 2010.
[33] D. J. Rosenkrantz, R. E. Stearns, and P. M. Lewis, “An analysis of several heuristics
for the traveling salesman problem,” SIAM J. Computing, vol. 6, pp. 563–581, 1977.
[34] C. Xu et al., “Fast 3D Thermal Analysis of Complex Interconnect Structures Using
Electrical Modeling and Simulation Methodologies”, in Proc. IEEE/ACM ICCAD,
pp. 658 – 665, 2009.
[35] P. Li et al., “IC thermal simulation and modeling via efficient multi grid based
approaches”, IEEE TCAD, 25(9), pp.1763-1776, 2006.
[36] A. Vincenzi, A. Sridhar, M. Ruggiero, and D. Atienza, “Fast thermal simulation of
2D/3D integrated circuit exploiting neural networks and GPUs,” in Proc. ISLPED,
pp. 151 – 156, 2011.
[37] K. Miettinen, 1999: Nonlinear Multiobjective Optimization. Boston: Kluwer
Academic Publishers.
[38] M. Ehrgott and X. Gandibleux. Multiple Criteria Optimization. State of the Art
Annotated Bibliographic Surveys. Kluwer, 2002.
[39] C. A. Coello, “A short tutorial on evolutionary multiobjective optimization,” in Proc.
EMO, pp. 21–40, 2001.
[40] B. Hoppe, G. Neuendorf, D. Schmitt-Landsiedel, and W. Specks, “Optimization of
high-speed CMOS logic circuits with analytical models for signal delay, chip area,
and dynamic power dissipation,” IEEE TCAD, vol. 9, pp. 236–247, 1990.
[41] M. B. Anand, H. Shibata, and M. Kakumu, “Multiobjective optimization of VLSI
interconnect parameters,” IEEE TCAD, vol. 17, no. 12, p. 1252-1261, 1998.
[42] J. Zhao, X. Dong, and Y. Xie, "Cost-Aware Three-Dimensional (3D) Many-Core
Multiprocessor Design," in ACM/IEEE DAC, pp. 126-131, 2010.
[43] X. Dong and Y. Xie, “System-level cost analysis and design exploration for three-
dimensional integrated circuits (3D ICs),” in Proc. IEEE ASP-DAC, pp. 234–241,
2009.
[44] A. Coskun, A. Kahng, and T. Rosing, “Temperature- and Cost-Aware Design of 3D
Multiprocessor Architectures,” in Proc. DSD, pp. 183-190, 2009.
[45] P. Zhou, et al., “3D-STAF: Scalable temperature and leakage aware floorplanning
for three-dimensional integrated circuits,” in Proc. ICCAD, pp. 590–597, 2007.
[46] W. Hung, et al., “Interconnect and thermal-aware floorplanning for 3D
microprocessors,” in Proc. ISQED, pp. 104-109, 2006.
[47] M. Healy et al., “Multiobjective microarchitectural floorplanning for 2D and 3D
ICS,” IEEE TCAD, vol. 26, no. 1, pp. 38–52, 2007.
[48] K. Balakrishnan, et al., “Wire congestion and thermal aware 3D global placement,”
in Proc. ASP-DAC, pp.1131-1134, 2005.
[49] A. Coskun et al., “Energy-efficient variable-flow liquid cooling in 3D stacked
architectures,” in Proc. DATE, pp.111–116, 2010.
[50] R. E. Kessler, “The Alpha 21264 Microprocessor,” IEEE Micro, 19(2):24–36,
March-April 1999.
[51] SPEC-CPU2000, Standard performance evaluation council, performance evaluation
in the new millennium, version 1.1, Electronic Resource, 2000.
[52] D. Brooks, V. Tiwari, and M. Martonosi, “Wattch: A framework for architectural-
level power analysis and optimizations,” in Proc. ISCA, pp. 83–94, 2000.
[53] C. Zhu, et al., “Three-dimensional chip-multiprocessor run-time thermal
management,” IEEE TCAD, 27(8), pp. 1479–1492, 2008.
[54] L. He, W. Liao, and M. R. Stan, “System level leakage reduction considering the
interdependence of temperature and leakage,” in Proc. DAC, pp. 12–17, 2004.
[55] K. Kasamsetty, et al., “A new class of convex functions for delay modeling and
their application to the transistor sizing problem,” IEEE TCAD, vol. 19, no. 7, pp.
779–788, 2000.
[56] H. Nakayama, Y. Yun, M. Yoon. 2009: Sequential Approximate Multiobjective
Optimization Using Computational Intelligence. Springer-Verlag Berlin Heidelberg.
[57] R.E. Ladner and M.J. Fischer, “Parallel prefix computation”, Journal of ACM, Vol.
27, No. 4, pp.831-838, 1980.
[58] S. M. Kang and Y. Leblebici, CMOS Digital Integrated Circuits. New York:
McGraw-Hill, 2002.
[59] F. Kashfi, S. Hatami, and M. Pedram, "Multiobjective optimization techniques for
the VLSI circuits," in Proc. ISQED, pp 1-8, Mar. 2011.
[60] F. Kashfi, and J. Draper, “Multiobjective Optimization of Cost, Performance and
Thermal Reliability in 3DICs,” in Proc. DSD, 2013.
[61] S. Boyd, and L. Vandenberghe, 2004: Convex Optimization, Cambridge University
Press.
[62] D. F. Shanno, "Conditioning of Quasi-Newton Methods for Function Minimization,"
in Mathematics of Computation, Vol. 24, pp. 647-656, 1970.
[63] International Technology Roadmap for Semiconductors:
http://www.itrs.net/reports.html
[64] E. Rotem, et al., Temperature measurement in the Intel® Core™ Duo processor,
(2006).
[65] P. Jain, T. Kim, J. Keane, and C. Kim, “A multi-story power delivery technique for
3D integrated circuits,” in Proc. ISLPED, pp. 57-62, 2008.
[66] S. Sapatnekar, “Addressing thermal and power delivery bottlenecks in 3D circuits,”
in Proc. ASP-DAC, pp. 423-428, 2009.
[67] N. Quinn, and M. Breuer, “A forced directed component placement procedure for
printed circuit boards,” in IEEE TCAS, vol. 26(6), pp. 377-388, 1979.
[68] A. K. Coskun, et al., Evaluating the impact of job scheduling and power
management on processor lifetime for chip multiprocessors, ACM SIGMETRICS,
(2009) 169-180.
Alphabetized Bibliography
K. Athikulwongse, et al. “Stress-driven 3D-IC placement with TSV keep-out zone and
regularity study,” in Proc. ICCAD, 2010.
M. B. Anand, H. Shibata, and M. Kakumu, “Multi-objective optimization of VLSI
interconnect parameters,” IEEE TCAD, vol. 17, no. 12, p. 1252, 1998.
K. Balakrishnan, et al., “Wire congestion and thermal aware 3D global placement,” in
Proc. ASP-DAC, pp.1131-1134, 2005.
S. Bota et al., “Smart temperature sensor for thermal testing of cell-based ICs,” in Proc.
DATE, pp. 464–465, 2005.
S. Boyd, and L. Vandenberghe, 2004: Convex Optimization, Cambridge University
Press.
D. Brooks, V. Tiwari, and M. Martonosi, “Wattch: A framework for architectural-level
power analysis and optimizations,” in Proc. ISCA, pp. 83–94, 2000.
Y. Chen, et al., “Analysis and mitigation of lateral thermal blockage effect of through-
silicon-via in 3D IC designs,” in Proc. ISLPED, pp. 397-402, 2011.
Y. Chen, et al., “Cost-effective integration of three-dimensional (3D) ICs emphasizing
testing cost analysis,” in Proc. ICCAD, pp. 471-476, 2010.
T. Chiang, et al., “Thermal analysis of heterogeneous 3-D ICs with various integration
scenarios,” in Proc. IEDM, pp. 31.2.1 - 31.2.4, 2001.
J. Clabes et al., “Design and implementation of the POWER5 microprocessor,” in IEEE
ISSCC Dig. Tech. Papers, pp. 56–57, 2004.
C. A. Coello, “A short tutorial on evolutionary multi-objective optimization,” in Proc.
EMO, pp. 21–40, 2001.
J. Cong, et al., “Thermal-aware 3D IC placement via transformation,” in Proc. ASP-
DAC, pp 780-785, 2007.
J. Cong and G. Luo, “An analytical placer for mixed-size 3D placement,” in Proc. ISPD,
2010.
J. Cong, and G. Luo, “A multilevel analytical placement for 3D ICs,” in Proc. ASP-DAC,
pp. 361-366, 2009.
J. Cong, J. Wei, and Y. Zhang, “A thermal-driven floorplanning algorithm for 3D ICs,” in
Proc. ICCAD, pp. 306-313, 2004.
A. Coskun, A. Kahng, and T. Rosing, “Temperature- and Cost-Aware Design of 3D
Multiprocessor Architectures,” in Proc. DSD, pp. 183-190, 2009.
A. Coskun, et al., “Dynamic thermal management in 3D multicore architecture,” in Proc.
DATE, pp.1410-1415, 2009.
A. Coskun et al., “Energy-efficient variable-flow liquid cooling in 3D stacked
architectures,” in Proc. DATE, pp.111–116, 2010.
A. Coskun, et al., "Evaluating the impact of job scheduling and power management on
processor lifetime for chip multiprocessors", ACM SIGMETRICS, pp. 169-180, 2009.
A. Coskun, et al., “Modeling and dynamic management of 3D multicore systems with
liquid cooling,” in Proc. VLSI-SOC, pp. 35-40, 2009.
J. Da-Cheng, S. Garg, and D. Marculescu, “Statistical thermal evaluation and mitigation
techniques for 3D Chip-Multiprocessors in the presence of process variations,” in Proc.
DATE, pp. 1-6, 2011.
X. Dong and Y. Xie, “System-level cost analysis and design exploration for three-
dimensional integrated circuits (3D ICs),” in Proc. IEEE ASP-DAC, pp. 234–241, 2009.
M. Ehrgott and X. Gandibleux. Multiple Criteria Optimization. State of the Art
Annotated Bibliographic Surveys. Kluwer, 2002.
A. Fourmigue, et al., “A linear-time approach for the transient thermal simulation of
liquid-cooled 3D ICs,” in Proc. CODES+ISSS, pp. 197-205, 2011.
P. Franzon, et al. “Design and CAD for 3D integrated circuits,” in Proc. DAC, pp. 668-
673, 2008.
B. Goplen and S. Sapatnekar, “Efficient thermal placement of standard cells in 3D ICs
using a force directed approach,” In Proc. of ICCAD, 2003.
B. Goplen and S. S. Sapatnekar, “Thermal via placement in 3-D ICS,” in Proc. ACM
ISPD, pp. 167–174, 2005.
B. Goplen and S. Sapatnekar, “Placement of 3D ICs with thermal and interlayer via
considerations,” In Proc. of DAC, 2007.
C. H. Yu, The 3rd dimension-more life for Moore’s Law, IMPACT, pp. 1–6, 2006.
L. He, W. Liao, and M. R. Stan, “System level leakage reduction considering the
interdependence of temperature and leakage,” in Proc. DAC, pp. 12–17, 2004.
M. Healy et al., “Multi-objective microarchitectural floorplanning for 2D and 3D ICS,”
IEEE TCAD, vol. 26, no. 1, pp. 38–52, 2007.
B. Hoppe, G. Neuendorf, D. Schmitt-Landsiedel, and W. Specks, “Optimization of high-
speed CMOS logic circuits with analytical models for signal delay, chip area, and
dynamic power dissipation,” IEEE TCAD, vol. 9, pp. 236–247, 1990.
M. K. Hsu, Y. W. Chang, and V. Balabanov, “TSV-aware analytical placement for 3D IC
designs,” in Proc. DAC, pp. 664-669, 2011.
W. Huang, et al., “Compact thermal modeling for temperature-aware design,” in Proc.
DAC, pp. 878-883, 2004.
W. Huang et al, “Hotspot: A compact thermal modeling methodology for early-stage
VLSI design.” IEEE TVLSI, 14(5), pp. 501–513, 2006.
W. Huang et al., “Differentiating the roles of IR measurement and simulation for power
and temperature-aware design,” in Proc. ISPASS, pp. 1–10, 2009.
W. Hung, et al., “Interconnect and thermal-aware floorplanning for 3D microprocessors,”
in Proc. ISQED, pp. 104-109, 2006.
International Technology Roadmap for Semiconductors: http://www.itrs.net/reports.html
A. Jain, et al., "Thermal modeling and design of 3D integrated circuits", IEEE ITHERM,
pp. 1139-1145, 2008.
P. Jain, et al., “A multi-story power delivery technique for 3D integrated circuits,” in
Proc. ISLPED, pp. 57-62, 2008.
S. M. Kang and Y. Leblebici, CMOS Digital Integrated Circuits. New York: McGraw-
Hill, 2002.
K. Kasamsetty, et al., “A new class of convex functions for delay modeling and their
application to the transistor sizing problem,” IEEE TCAD, vol. 19, no. 7, pp. 779–788,
2000.
F. Kashfi, and J. Draper, "Thermal sensor design for 3D ICs", MWSCAS, pp. 482-485,
2012.
F. Kashfi, and J. Draper, “Thermal sensor distribution method for 3D Integrated Circuits
using efficient thermal map modeling,” in Proc. THERMINIC, pp. 1-6, 2012.
F. Kashfi, S. Hatami, and M. Pedram, "Multi-objective optimization techniques for the
VLSI circuits," in Proc. ISQED, pp. 1-8, 2011.
F. Kashfi, and J. Draper, “Multiobjective Optimization of Cost, Performance and
Thermal Reliability in 3DICs,” in Proc. DSD, 2013.
F. Kashfi, and J. Draper, “Thermal sensor allocation for 3DICs using three dimensional
thermal sensors,” Submitted to Microelectronics Journal, 2013.
F. Kashfi, and J. Draper, “Thermal sensor distribution method for 3D Integrated Circuits
using efficient thermal map modeling,” Presented in Ming Hsieh Department of
Electrical Engineering Annual Research Festival (Best Poster Honorable Mention
Award), May 2013.
R. E. Kessler, “The Alpha 21264 Microprocessor,” IEEE Micro, 19(2):24–36, March-
April 1999.
C. K. Kim, et al., “CMOS temperature sensor with ring oscillator for mobile DRAM
self-refresh control,” Microelectronics J., vol. 38, no. 10–11, pp. 1042–1049, 2007.
J. Kleinhans, et al., “GORDIAN: VLSI placement by quadratic programming and slicing
optimization,” IEEE TCAD, vol. 10, no. 3, pp. 356-365, 1991.
J. Knechtel, et al., “Multiobjective optimization of deadspace, a critical resource for 3D-
IC integration,” in Proc. ICCAD, 2012.
G. L. Loi, et al., A thermally-aware performance analysis of vertically integrated (3-D)
processor-memory hierarchy, DAC, pp. 991–996, 2006.
L. Labrak and I. O’Connor, “Heterogeneous system design platform and perspectives for
3D integration,” in Proc. ICM, pp. 161-164, 2009.
R. E. Ladner and M.J. Fischer, “Parallel prefix computation”, Journal of ACM, Vol. 27,
No. 4, pp.831-838, 1980.
J. H. Lau, and T. G. Yue, "Thermal management of 3D IC integration with TSV (through
silicon via)", IEEE ECTC, pp. 635-640, 2009.
J. H. Lau, “TSV Manufacturing Yield and Hidden Costs for 3D IC Integration”, in Proc.
ECTC, pp. 1031-1042, 2010.
P. Leduc, et al., “Challenges for 3D IC integration: bonding quality and thermal
management,” in Proc. IITC, pp. 210-212, 2007.
H. Lee, and K. Chakrabarty, “Test challenges for 3D integrated circuits,” IEEE DTC, vol.
26-5, pp. 26-35, 2009.
K. J. Lee and K. Skadron, “Analytical model for sensor placement on microprocessors,”
in Proc. ICCD, pp. 24-27, 2005.
Y. K. Lee, and S. K. Lim, “Timing analysis and optimization for 3D stacked multi-core
microprocessors,” in Proc. 3DIC, pp 1-7, 2010.
Y. J. Lee, et al., “Co-design of signal, power, and thermal distribution networks, for 3D
ICs,” in Proc. DATE, pp. 610-615, 2009.
S. Lee, and S. K. Dieter "3D IC architecture for high density memories." in Proc. IMW,
2010.
P. Li et al., “IC thermal simulation and modeling via efficient multi grid based
approaches”, IEEE TCAD, 25(9), pp.1763-1776, 2006.
X. Li, M. Yuchun, and H. Xianlong, “A novel thermal optimization flow using
incremental floorplanning for 3D ICs,” in Proc. ASP-DAC, pp. 347-352, 2009.
C. Liu, et al., "Bridging the processor-memory performance gap with 3D IC technology."
In Proc. DTC, pp. 556-564, 2005.
C. Liu, et al., “Full-chip TSV-to-TSV coupling analysis and optimization in 3D IC,” in
Proc. DAC, pp. 783-788, 2011.
G. Loh, “3D-stacked memory architectures for multi-core processors,” in Proc.
ISCA, pp. 453-464, 2008.
G. L. Loi, et al. “A thermally-aware performance analysis of vertically integrated (3-D)
processor-memory hierarchy,” in Proc. DAC, pp. 991–996, 2006.
I. Loi, et al., “A low-overhead fault tolerance scheme for TSV-based 3-D network on
chip links,” in Proc. ICCAD, pp. 598–602, 2008.
C. Lung, et al., “Thermal-aware on-line task allocation for 3D multi-core processor
throughput optimization,” in Proc. DATE, pp. 1-6, 2011.
S. Melamed, et al., “Junction-level thermal extraction and simulation of 3DICs,” in Proc.
3DIC, pp. 1-7, 2009.
S. O. Memik et al., “Optimizing thermal sensor allocation for microprocessors,” IEEE
TCAD, 27(3), pp. 516–527, 2008.
K. Miettinen, 1999: Nonlinear Multi-objective Optimization. Boston: Kluwer Academic
Publishers.
H. Mizunuma, et al., "Thermal modeling for 3D-ICs with integrated microchannel
cooling," IEEE/ACM ICCAD, pp. 256-263, 2009.
H. Mizunuma, Y. Lu, and C. Yang, “Thermal modeling and analysis for 3-D ICs with
integrated microchannel cooling,” IEEE TCAD., 30(9), pp. 1293 – 1306, 2011.
H. Nakayama, Y. Yun, M. Yoon. 2009: Sequential Approximate Multi-objective
Optimization Using Computational Intelligence. Springer-Verlag Berlin Heidelberg.
A. Naveh et al., “Power and thermal management in the Intel® Core™ Duo
Processor,” Intel Technology J., vol. 10(2), pp. 109–122, 2006.
A. N. Nowroz, R. Cochran, and S. Reda, “Thermal Monitoring of Real Processors:
Techniques for Sensor Allocation and Full Characterization,” in Proc. DAC, pp. 56-61,
2010.
S. O. Memik et al., Optimizing thermal sensor allocation for microprocessors, IEEE
Trans. CAD for ICs, 27(3), pp. 516–527, 2008.
K. Puttaswamy, and G. H. Loh, “Thermal analysis of a 3D die-stacked high-performance
microprocessor,” in Proc. GLSVLSI, pp. 19-24, 2006.
K. Puttaswamy, and G. H. Loh, “Thermal herding: microarchitecture techniques for
controlling hotspots in high-performance 3D-integrated processors,” in Proc. HPCA, pp.
193-204, 2007.
N. Quinn, and M. Breuer, “A forced directed component placement procedure for printed
circuit boards,” in IEEE TCAS, vol. 26(6), pp. 377-388, 1979.
D. J. Rosenkrantz, R. E. Stearns, and P. M. Lewis, “An analysis of several heuristics for
the traveling salesman problem,” SIAM J. Computing, vol. 6, pp. 563–581, 1977.
E. Rotem, et al., Temperature measurement in the Intel® Core™ Duo processor, (2006).
S. Sapatnekar, “Addressing thermal and power delivery bottlenecks in 3D circuits,” in
Proc. ASP-DAC, pp.423-428, 2009.
G. Semeraro, et al., “Energy-efficient processor design using multiple clock domains
with dynamic voltage and frequency scaling,” in Proc. HPCA, pp. 29-40, 2002.
D. F. Shanno, “Conditioning of Quasi-Newton Methods for Function Minimization,” in
Mathematics of Computation, Vol. 24, pp. 647-656, 1970.
S. Sharifi and T. Rosing, “Accurate direct and indirect on-chip temperature sensing for
efficient dynamic thermal management,” IEEE TCAD, vol. 29, no. 10, pp. 1586– 1599,
Oct. 2010.
SPEC-CPU2000, Standard performance evaluation council, performance evaluation in
the new millennium, version 1.1, Electronic Resource, (2000).
A. Sridhar, et al., “3D-ICE: Fast compact transient thermal modeling for 3D ICs with
inter-tier liquid cooling”, in Proc. IEEE/ACM ICCAD, pp. 463 - 470, 2010.
C. Sun, L. Shang, and R. Dick, “Three-dimensional multiprocessor system-on-chip
thermal optimization,” in Proc. CODES+ISSS, pp. 117-122, 2007.
D. Velenis, et al., “Impact of 3D design choices on manufacturing cost,” in Proc. 3DIC,
pp. 1-5, 2009.
A. Vincenzi, A. Sridhar, M. Ruggiero, and D. Atienza, “Fast thermal simulation of
2D/3D integrated circuit exploiting neural networks and GPUs,” in Proc. ISLPED, pp.
151 – 156, 2011.
T. Wang, Y. Lee, and C. Chen, “3D Thermal-ADI – an efficient chip-level transient
thermal simulator,” in Proc. ISPD, pp. 10-17, 2003.
P. Wilkerson, A. Raman, and M. Turowski, “Fast, automated thermal simulation of three-
dimensional integrated circuits,” in Proc. ITHERM, pp. 706-713, 2004.
X. Wu, P. Falkenstern, and Y. Xie, “Scan chain design for three-dimensional integrated
circuits (3D ICs),” in Proc. ICCD, pp. 208-214, 2007.
Y. Xie, G. Loh, B. Black, and K. Bernstein, “Design space exploration for 3-D
architecture,” ACM JETC, vol. 2, no. 2, pp. 65–103, 2006.
C. Xu et al., “Fast 3D Thermal Analysis of Complex Interconnect Structures Using
Electrical Modeling and Simulation Methodologies,” in Proc. IEEE/ACM ICCAD,
pp. 658-665, 2009.
H. Yan, Q. Zhou, and X. Hong, “Efficient Thermal Aware Placement Approach Integrated
with 3D DCT Placement Algorithm,” in Proc. ISQED, pp. 289-292, 2008.
T. Yan, et al., “How does partitioning matter for 3D floorplanning?,” in Proc. GLSVLSI,
pp. 73-78, 2006.
C. H. Yu, “The 3rd dimension-more life for Moore’s Law,” in Proc. IMPACT, pp. 1–6,
2006.
R. S. Zebulum, M. A. Pacheco, and M. Vellasco, “A multi-objective optimization
methodology applied to the synthesis of low-power operational amplifiers,” in Proc.
IMAPS, pp. 264–271, 1998.
J. Zhao, X. Dong, and Y. Xie, “Cost-Aware Three-Dimensional (3D) Many-Core
Multiprocessor Design,” in Proc. ACM/IEEE DAC, pp. 126-131, 2010.
P. Zhou, et al., “3D-STAF: Scalable temperature and leakage aware floorplanning for
three-dimensional integrated circuits,” in Proc. IEEE/ACM ICCAD, pp. 590–597, 2007.
X. Zhou, et al., “Thermal management for 3D processors via task scheduling,” in Proc.
ICPP, pp. 115-122, 2008.
C. Zhu et al., “Three-dimensional chip-multiprocessor run-time thermal management,”
IEEE TCAD, 27(8):1479–1492, 2008.
Appendix
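Theorem: An increasing convex function of a convex function is convex.
Proof:
Let $f = h \circ g$, i.e., $f(x) = h(g(x))$, where $h$ is an increasing convex function and $g$ is a
convex function. We have for $x, y \in \operatorname{dom} g$ and $0 \le \lambda \le 1$,
$$g(\lambda x + (1-\lambda) y) \le \lambda g(x) + (1-\lambda) g(y).$$
Since $h$ is increasing, applying $h$ to both sides preserves this inequality, and since $h$ is
convex, the right-hand side can be bounded further:
$$h\big(g(\lambda x + (1-\lambda) y)\big) \le h\big(\lambda g(x) + (1-\lambda) g(y)\big) \le \lambda h(g(x)) + (1-\lambda) h(g(y)),$$
i.e.,
$$f(\lambda x + (1-\lambda) y) \le \lambda f(x) + (1-\lambda) f(y).$$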
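As a quick numerical illustration of the composition result above, the following sketch checks the convexity inequality for one concrete pair: g(x) = x^2 (convex) and h(t) = e^t (increasing, and also convex). These two functions, the sampling interval, and all function names are assumptions chosen for illustration; they are not taken from the dissertation. A random check of this kind can only suggest convexity on the sampled interval, it does not prove it.

```python
import math
import random


def g(x):
    # Convex inner function (illustrative choice).
    return x * x


def h(t):
    # Increasing and convex outer function (illustrative choice).
    return math.exp(t)


def f(x):
    # Composition f = h(g(x)).
    return h(g(x))


def convexity_gap(func, x, y, lam):
    # lam*func(x) + (1-lam)*func(y) - func(lam*x + (1-lam)*y).
    # A nonnegative value means the convexity inequality holds at (x, y, lam).
    return lam * func(x) + (1 - lam) * func(y) - func(lam * x + (1 - lam) * y)


def check_convex(func, trials=10000, lo=-2.0, hi=2.0, seed=0):
    # Sample random points and weights; report False on the first violation.
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.uniform(lo, hi)
        y = rng.uniform(lo, hi)
        lam = rng.random()
        if convexity_gap(func, x, y, lam) < -1e-9:  # small float tolerance
            return False
    return True


if __name__ == "__main__":
    print("g passes sampled convexity check:", check_convex(g))
    print("f = h(g(x)) passes sampled convexity check:", check_convex(f))
```

The helper `convexity_gap` can be reused with any other candidate function to look for violations of the inequality at specific points.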
Abstract
Three Dimensional Integrated Circuit (3DIC) technology has been introduced to address the interconnect issues in nanometer circuit design that limit performance improvement and power reduction. However, for many designs, stacking active layers of silicon increases power density and raises overall temperatures in a 3D chip implementation. This technology therefore calls for new thermal map modeling and new temperature measurement, mitigation, and management techniques. In this dissertation we study the thermal correlation between the stacked layers in 3DICs. We then propose a fast and efficient 3D thermal map model based on scaled hotspot areas, where the scaling depends on the distance of a stacked layer from the heatsink and on the thermal effects of the layers on each other. The model is 53× faster than the existing method of compact thermal modeling. We demonstrate its efficiency by using it in a thermal sensor distribution algorithm. We also show that thermal sensor distribution should be solved as a 3D problem: for the same sensor reading error, the total number of sensors needed is reduced by 44%. We furthermore propose a new 3D design for thermal sensor circuits that are shared between layers in a 3DIC. Sharing 3D thermal sensors between adjacent layers can halve the total number of sensors needed.
We also study different methods of multiobjective optimization for finding the optimum operating point of a VLSI circuit. We provide broad mathematical analyses of different multiobjective optimization techniques for this purpose. We also study the difference between convex and non-convex modeling of the objectives in the multiobjective optimization algorithms.
We apply our multiobjective optimization methods to three conflicting objectives (cost, performance, and thermal reliability) to find an optimum building block placement in 3DICs. The optimization variables are the number of layers, the area of the 3DIC, the positions of the building blocks, the number of TSVs, and the total wirelength. We use our proposed fast 3D thermal map modeling to eliminate the thermal analysis bottleneck in the multiobjective optimization iterations. Compared with a previous state-of-the-art multiobjective optimization method, which employs Simulated Annealing, a weighted sum method of scalarization, and compact modeling for thermal analysis, our method reduces the peak temperature of a representative 3DIC by 4.3% and the total wirelength by 5.7%, while its optimization runtime is more than 17× faster. The runtime of the proposed algorithm also scales linearly with problem size, in contrast with the existing heuristic method of Simulated Annealing, which scales poorly with problem size.
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
A logic partitioning framework and implementation optimizations for 3-dimensional integrated circuits
Variation-aware circuit and chip level power optimization in digital VLSI systems
Optimizing power delivery networks in VLSI platforms
Optimal redundancy design for CMOS and post‐CMOS technologies
Stochastic dynamic power and thermal management techniques for multicore systems
Average-case performance analysis and optimization of conditional asynchronous circuits
Architectures and algorithms of charge management and thermal control for energy storage systems and mobile devices
Trustworthiness of integrated circuits: a new testing framework for hardware Trojans
Electronic design automation algorithms for physical design and optimization of single flux quantum logic circuits
Clustering and fanout optimizations of asynchronous circuits
Improving efficiency to advance resilient computing
Modeling and mitigation of radiation-induced charge sharing effects in advanced electronics
Hardware techniques for efficient communication in transactional systems
High level design for yield via redundancy in low yield environments
Power efficient design of SRAM arrays and optimal design of signal and power distribution networks in VLSI circuits
Defect-tolerance framework for general purpose processors
Synchronization and timing techniques based on statistical random sampling
A joint framework of design, control, and applications of energy generation and energy storage systems
SLA-based, energy-efficient resource management in cloud computing systems
Understanding dynamics of cyber-physical systems: mathematical models, control algorithms and hardware incarnations
Asset Metadata
Creator: Kashfi, Fatemeh (author)
Core Title: Thermal analysis and multiobjective optimization for three dimensional integrated circuits
School: Viterbi School of Engineering
Degree: Doctor of Philosophy
Degree Program: Electrical Engineering (VLSI Design)
Publication Date: 08/16/2013
Defense Date: 07/15/2013
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: 3DIC, Electronics, OAI-PMH Harvest, optimization, temperature, VLSI
Format: application/pdf (imt)
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Draper, Jeffrey (committee chair), Gupta, Sandeep K. (committee member), Nakano, Aiichiro (committee member)
Creator Email: fatemeh.kashfi@gmail.com, fkashfi@usc.edu
Permanent Link (DOI): https://doi.org/10.25549/usctheses-c3-325495
Unique identifier: UC11292804
Identifier: etd-KashfiFate-2005.pdf (filename), usctheses-c3-325495 (legacy record id)
Legacy Identifier: etd-KashfiFate-2005.pdf
Dmrecord: 325495
Document Type: Dissertation
Rights: Kashfi, Fatemeh
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Tags: 3DIC, optimization, temperature, VLSI