Efficient trace-driven simulation method for cache performance analysis
SIGMETRICS '90 Proceedings of the 1990 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Blocking: exploiting spatial locality for trace compaction
SIGMETRICS '90 Proceedings of the 1990 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Simulation of multiprocessors: accuracy and performance
Simulation of multiprocessors: accuracy and performance
Cache inclusion and processor sampling in multiprocessor simulations
SIGMETRICS '93 Proceedings of the 1993 ACM SIGMETRICS conference on Measurement and modeling of computer systems
The SPLASH-2 programs: characterization and methodological considerations
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Trace-driven memory simulation: a survey
ACM Computing Surveys (CSUR)
Lamport clocks: verifying a directory cache-coherence protocol
Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures
An analysis of database workload performance on simultaneous multithreaded processors
Proceedings of the 25th annual international symposium on Computer architecture
Measuring computer performance: a practitioner's guide
Measuring computer performance: a practitioner's guide
Automatically characterizing large scale program behavior
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Parallel simulation of chip-multiprocessor architectures
ACM Transactions on Modeling and Computer Simulation (TOMACS)
Can Trace-Driven Simulators Accurately Predict Superscalar Performance?
ICCD '96 Proceedings of the 1996 International Conference on Computer Design, VLSI in Computers and Processors
MINT: A Front End for Efficient Simulation of Shared-Memory Multiprocessors
MASCOTS '94 Proceedings of the Second International Workshop on Modeling, Analysis, and Simulation On Computer and Telecommunication Systems
Flexible reference trace reduction for VM simulations
ACM Transactions on Modeling and Computer Simulation (TOMACS)
SMARTS: accelerating microarchitecture simulation via rigorous statistical sampling
Proceedings of the 30th annual international symposium on Computer architecture
CQoS: a framework for enabling QoS in shared caches of CMP platforms
Proceedings of the 18th annual international conference on Supercomputing
A First-Order Superscalar Processor Model
Proceedings of the 31st annual international symposium on Computer architecture
Fair Cache Sharing and Partitioning in a Chip Multiprocessor Architecture
Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques
Victim Replication: Maximizing Capacity while Hiding Wire Delay in Tiled Chip Multiprocessors
Proceedings of the 32nd annual international symposium on Computer Architecture
Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset
ACM SIGARCH Computer Architecture News - Special issue: dasCMP'05
Cooperative Caching for Chip Multiprocessors
Proceedings of the 33rd annual international symposium on Computer Architecture
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Managing Distributed, Shared L2 Caches through OS-Level Page Allocation
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Interconnect-Centric Computing
HPCA '07 Proceedings of the 2007 IEEE 13th International Symposium on High Performance Computer Architecture
Accelerating Multiprocessor Simulation with a Memory Timestamp Record
ISPASS '05 Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2005
Enhancing Multiprocessor Architecture Simulation Speed Using Matched-Pair Comparison
ISPASS '05 Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2005
FPGA-Accelerated Simulation Technologies (FAST): Fast, Full-System, Cycle-Accurate Simulators
Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
A Framework for Providing Quality of Service in Chip Multi-Processors
Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
Proceedings of the 16th international ACM/SIGDA symposium on Field programmable gate arrays
An Adaptive Synchronization Technique for Parallel Simulation of Networked Clusters
ISPASS '08 Proceedings of the ISPASS 2008 - IEEE International Symposium on Performance Analysis of Systems and software
On the simulation of large-scale architectures using multiple application abstraction levels
ACM Transactions on Architecture and Code Optimization (TACO) - HIPEAC Papers
Accurately modeling superscalar processor performance with reduced trace
Journal of Parallel and Distributed Computing
A survey on cache tuning from a power/energy perspective
ACM Computing Surveys (CSUR)
Virtual platforms: breaking new grounds
DATE '12 Proceedings of the Conference on Design, Automation and Test in Europe
Hi-index | 0.00 |
Simulation is indispensable in computer architecture research. Researchers increasingly resort to detailed architecture simulators to identify performance bottlenecks, analyze interactions among different hardware and software components, and measure the impact of new design ideas on the system performance. However, the slow speed of conventional execution-driven architecture simulators is a serious impediment to obtaining desirable research productivity. This paper describes a novel fast multicore processor architecture simulation framework called Two-Phase Trace-driven Simulation (TPTS), which splits detailed timing simulation into a trace generation phase and a trace simulation phase. Much of the simulation overhead caused by uninteresting architectural events is only incurred once during the cycle-accurate simulation-based trace generation phase and can be omitted in the repeated trace-driven simulations. We report our experiences with tsim, an event-driven multicore processor architecture simulator that models detailed memory hierarchy, interconnect, and coherence protocol based on the TPTS framework. By applying aggressive event filtering, tsim achieves an impressive simulation speed of 146 millions of simulated instructions per second, when running 16-thread parallel applications. Copyright © 2010 John Wiley & Sons, Ltd.