Accurate Low-Cost Methods for Performance Evaluation of Cache Memory Systems
IEEE Transactions on Computers
The PowerPC performance modeling methodology
Communications of the ACM
IBM Systems Journal
Accuracy and Speedup of Parallel Trace-Driven Architectural Simulation
IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
Representative Traces for Processor Models with Infinite Cache
HPCA '96 Proceedings of the 2nd IEEE Symposium on High-Performance Computer Architecture
Performance evaluation and validation of microprocessors
SIGMETRICS '99 Proceedings of the 1999 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Application-driven processor design exploration for power-performance trade-off analysis
Proceedings of the 2001 IEEE/ACM international conference on Computer-aided design
Accuracy and Speedup of Parallel Trace-Driven Architectural Simulation
IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
Proceedings of the 2002 ACM/IEEE conference on Supercomputing
SIGMETRICS '03 Proceedings of the 2003 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
EXPERT: expedited simulation exploiting program behavior repetition
Proceedings of the 18th annual international conference on Supercomputing
Efficient simulation of trace samples on parallel machines
Parallel Computing
Optimal sample length for efficient cache simulation
Journal of Systems Architecture: the EUROMICRO Journal
Simulation of Computer Architectures: Simulators, Benchmarks, Methodologies, and Recommendations
IEEE Transactions on Computers
SMA: a self-monitored adaptive cache warm-up scheme for microprocessor simulation
International Journal of Parallel Programming
Yet shorter warmup by combining no-state-loss and MRRL for sampled LRU cache simulation
Journal of Systems and Software - Special issue: Quality software
NSL-BLRL: Efficient CacheWarmup for Sampled Processor Simulation
ANSS '06 Proceedings of the 39th annual Symposium on Simulation
Memory Trace Compression and Replay for SPMD Systems using Extended PRSDs?
ACM SIGMETRICS Performance Evaluation Review - Special issue on the 1st international workshop on performance modeling, benchmarking and simulation of high performance computing systems (PMBS 10)
Efficient sampling startup for sampled processor simulation
HiPEAC'05 Proceedings of the First international conference on High Performance Embedded Architectures and Compilers
Hi-index | 0.00 |
Trace-driven simulation continues to be one of the main evaluation methods in the design of high performance processor-memory sub-systems. In this paper, we examine the varying speed-up opportunities available by processing a given trace in parallel on an IBM SP-2 machine. We also develop a simple, yet effective method of correcting for cold-start cache miss errors, by the use of overlapped trace chunks. We then report selected experimental results to validate our expectations. We show that it is possible to achieve near-perfect speed-up without loss of accuracy. Next, in order to achieve further reduction in simulation cost, we combine uniform sampling methods with parallel processing with a slight loss of accuracy for finite-cache timer runs. We then show that by using warm-start sequences from preceding trace chunks, it is possible to reduce the errors back to acceptable bounds.