PARALLEL SIMULATION OF CHIP-MULTIPROCESSOR
by
Jianwei Chen
A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(COMPUTER ENGINEERING)
December 2009
Copyright 2009 Jianwei Chen
Object Description
| Title | Parallel simulation of chip-multiprocessor |
| Author | Chen, Jianwei |
| Author email | jianwei@usc.edu; jianwei.c@gmail.com |
| Degree | Doctor of Philosophy |
| Document type | Dissertation |
| Degree program | Computer Engineering |
| School | Viterbi School of Engineering |
| Date defended/completed | 2009-06-29 |
| Date submitted | 2009 |
| Restricted until | Unrestricted |
| Date published | 2009-09-05 |
| Advisor (committee chair) | Dubois, Michel |
| Advisor (committee member) | Annavaram, Murali; Nakano, Aiichiro |
| Abstract | Simulation is an indispensable tool for computer architecture research. However, as target systems become increasingly complex, designers of computer architecture simulators face several challenges, including runaway development cost, poor simulation speed, and compromised simulation accuracy. In this work, we investigate and demonstrate novel integration and parallelization techniques to address these issues, as part of an overarching vision to design a parallel full-system simulation infrastructure for future Chip Multiprocessors (CMPs) running on current CMPs. First, through integration, we combine two existing and widely used simulators. This cost-effective and practical approach not only enables detailed cycle-accurate microarchitectural modeling with full-system simulation, but also dramatically reduces development time and cost. The integrated tool is called SimWattch. After describing how we met the technical challenges in the design of SimWattch, we show the types of errors a user-level simulator can make because it omits operating system activities. Our results demonstrate that if operating system effects are omitted, performance and power will be overestimated while energy will be underestimated, sometimes by large amounts, in all OS-intensive workloads and in some SPEC workloads as well. Current trends signal an imminent crisis in the simulation of future CMPs. To address this challenge, we explore the paradigm of simulating each core of a target CMP in one thread and then spreading the threads across the hardware thread contexts of a host CMP. We start with cycle-by-cycle simulation and then relax the synchronization condition in various schemes, which we call slack simulations. In slack simulations, the threads simulating different target cores do not synchronize after each simulated cycle; instead, they are given some slack. The simulation slack is the difference, in cycles, between the simulated times of any two target cores. The maximum possible slack is called the slack bound. Small slacks, such as a few cycles, greatly improve the efficiency of parallel CMP simulations, with no or negligible simulation errors. Using POSIX Threads (Pthreads), we have developed a simulation framework called SlackSim to experiment with various slack simulation schemes. Unlike previous attempts to parallelize multiprocessor simulations on distributed memory machines, SlackSim takes advantage of the efficient sharing of data in the host CMP architecture. We demonstrate the efficiency and accuracy of some well-known slack simulation schemes and of some new ones on SlackSim running on a state-of-the-art CMP platform. Slack simulations may suffer from simulation violations, which affect the accuracy of target metrics. To control the number of simulation violations and the error rates in slack simulations, we propose adaptive slack simulation. Adaptive slack simulation paces the simulation by adaptively adjusting the simulation slack to meet a target simulation violation rate. Our experiments show that adaptive slack simulation can regulate simulation violations reliably and at low performance cost. Finally, an optimistic simulation technique called speculative slack simulation is developed and evaluated. In speculative slack simulations, the simulation is periodically checkpointed. When a violation is detected, the simulation rolls back to its nearest checkpoint and replays from that checkpoint in safe mode. Because of its complexity and large overheads, this solution is only acceptable for simulations with very low violation rates. |
| Keyword | chip-multiprocessor; computer architecture; parallel simulation |
| Language | English |
| Part of collection | University of Southern California dissertations and theses |
| Publisher (of the original version) | University of Southern California |
| Place of publication (of the original version) | Los Angeles, California |
| Publisher (of the digital version) | University of Southern California. Libraries |
| Provenance | Electronically uploaded by the author |
| Type | texts |
| Legacy record ID | usctheses-m2587 |
| Rights | Chen, Jianwei |
| Repository name | Libraries, University of Southern California |
| Repository address | Los Angeles, California |
| Repository email | http://www.usc.edu/isd/libraries/services/ask_a_librarian/email/ |
| Filename | etd-Chen-2936 |
| Archival file | uscthesesreloadpub_Volume32/etd-Chen-2936.pdf |
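
The abstract defines the simulation slack as the difference between the simulated times of any two target cores, limited by a slack bound. The sketch below illustrates that rule only; it is not SlackSim's implementation. It uses Python's `threading` module rather than Pthreads, and the names `SlackClock`, `advance`, and `run` are hypothetical. It also assumes a slack bound of at least one cycle, since a bound of zero (cycle-by-cycle simulation) would need a barrier instead.

```python
import threading


class SlackClock:
    """Hypothetical helper: tracks each core's simulated time under a slack bound."""

    def __init__(self, num_cores, slack_bound):
        if slack_bound < 1:
            # Assumption of this sketch: a bound of 0 needs a per-cycle barrier.
            raise ValueError("slack_bound must be at least 1 in this sketch")
        self.times = [0] * num_cores      # completed simulated cycles per core
        self.slack_bound = slack_bound
        self.max_observed_slack = 0       # largest slack seen during the run
        self.cond = threading.Condition()

    def advance(self, core):
        """Advance `core` by one cycle, waiting if that would exceed the bound."""
        with self.cond:
            # Wait while this step would put the core more than slack_bound
            # cycles ahead of the slowest core.
            while self.times[core] + 1 - min(self.times) > self.slack_bound:
                self.cond.wait()
            self.times[core] += 1
            slack = max(self.times) - min(self.times)
            self.max_observed_slack = max(self.max_observed_slack, slack)
            # Another core may now be allowed to proceed.
            self.cond.notify_all()


def run(num_cores=4, cycles=200, slack_bound=3):
    """Simulate `cycles` cycles on each core, one host thread per target core."""
    clock = SlackClock(num_cores, slack_bound)

    def worker(core):
        for _ in range(cycles):
            # A real simulator would model one cycle of the core here.
            clock.advance(core)

    threads = [threading.Thread(target=worker, args=(c,)) for c in range(num_cores)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return clock


if __name__ == "__main__":
    clock = run()
    print("final times:", clock.times)
    print("max observed slack:", clock.max_observed_slack)
```

The slowest core is always allowed to step, so the run cannot stall as long as every core simulates the same number of cycles. Because of Python's global interpreter lock, this sketch shows the synchronization rule but not the speedup a Pthreads implementation on a host CMP would aim for.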

