Techniques for efficient inline tracing on a shared-memory multiprocessor
SIGMETRICS '90 Proceedings of the 1990 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Shade: a fast instruction-set simulator for execution profiling
SIGMETRICS '94 Proceedings of the 1994 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Embra: fast and flexible machine simulation
Proceedings of the 1996 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
DAC '96 Proceedings of the 33rd annual Design Automation Conference
Trace-driven memory simulation: a survey
ACM Computing Surveys (CSUR)
Fast out-of-order processor simulation using memoization
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Automatically characterizing large scale program behavior
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Complete Computer System Simulation: The SimOS Approach
IEEE Parallel & Distributed Technology: Systems & Technology
MINT: A Front End for Efficient Simulation of Shared-Memory Multiprocessors
MASCOTS '94 Proceedings of the Second International Workshop on Modeling, Analysis, and Simulation On Computer and Telecommunication Systems
SIGMETRICS '03 Proceedings of the 2003 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Multiprocessor Memory Reference Generation Using Cerberus
MASCOTS '99 Proceedings of the 7th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems
Efficient memory simulation in SimICS
SS '95 Proceedings of the 28th Annual Simulation Symposium
Shaman: A Distributed Simulator for Shared Memory Multiprocessors
MASCOTS '02 Proceedings of the 10th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunications Systems
Design and Implementation of a High Speed Microprocessor Simulator BurstScalar
MASCOTS '04 Proceedings of the The IEEE Computer Society's 12th Annual International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunications Systems
Automatic Synthesis of High-Speed Processor Simulators
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
An efficient approach for system-level timing simulation of compiler-optimized embedded software
Proceedings of the 46th Annual Design Automation Conference
Software performance simulation strategies for high-level embedded system design
Performance Evaluation
Hi-index | 0.00 |
This paper proposes a simple but efficient technique for instruction set simulators. Our simulator is made workload specific by a simple process to generate a set of C functions from a workload binary. It is as portable and retargetable as ordinary instruction emulators because the translation targets C code and works well with well-abstracted instruction definitions. The translation is also easy-to-implement without requiring any complicated analysis nor profiling. We also propose a set of simple optimization techniques for cache simulation which cooperates with the workload specific technique. A SimpleScalar-based implementation of these techniques results a significantly large performance improvement. Our evaluations with SPEC CPU95 exhibit that the maximum speedups over sim-fast, sim-cache and sim-outorder are 38-fold, 14-fold and 9.7-fold respectively, while the average numbers are 19-fold, 8.3-fold and 3.8-fold.