The increasing complexity of modern superscalar microprocessors makes new designs and techniques much more difficult to evaluate. Fast and accurate methods for simulating program execution on realistic and hypothetical processor models are therefore of great interest to computer architects and compiler writers. Existing techniques range from profile-based runtime estimation to complete cycle-level simulation. Many researchers choose to sacrifice the speed of profiling for the accuracy obtainable from cycle-level simulators. This paper presents a technique that provides accurate performance predictions while avoiding the complexity of a complete processor emulator. The approach augments a fast in-order simulator with a time-stamping algorithm that estimates program running time. On the eight SPECint95 benchmarks, the algorithm's estimates are on average within 7.5% of those of a cycle-level out-of-order simulator, and it runs in approximately 41% of that simulator's running time.
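The abstract does not spell out the time-stamping algorithm, so the following is only a minimal sketch of the general idea, under loudly stated assumptions: instructions are visited once in program order (as an in-order simulator would), each register carries the cycle at which its value becomes available, and an instruction starts when it has been fetched and all of its sources are ready. The names `Instr`, `estimate_cycles`, and `fetch_width` are hypothetical, and the model assumes fixed latencies, unlimited functional units, and perfect branch prediction, none of which is taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class Instr:
    """One dynamic instruction (hypothetical format, not from the paper)."""
    dest: Optional[str]          # register written, or None
    srcs: Tuple[str, ...] = ()   # registers read
    latency: int = 1             # assumed fixed execution latency in cycles


def estimate_cycles(trace: Iterable[Instr], fetch_width: int = 4) -> int:
    """Estimate out-of-order running time in one in-order pass.

    Assumptions (illustrative only): unlimited functional units, perfect
    branch prediction, and `fetch_width` instructions fetched per cycle.
    """
    ready: Dict[str, int] = {}   # register -> cycle its value is available
    finish = 0                   # latest completion time seen so far
    for index, ins in enumerate(trace):
        fetch_time = index // fetch_width
        # An instruction can start once it is fetched and all sources are ready.
        start = max([fetch_time] + [ready.get(reg, 0) for reg in ins.srcs])
        done = start + ins.latency
        if ins.dest is not None:
            ready[ins.dest] = done   # stamp the destination register
        finish = max(finish, done)
    return finish


# Small usage example: r4 depends on the chain r1 -> r2 and on r3.
trace = [
    Instr("r1", (), latency=2),
    Instr("r2", ("r1",), latency=1),
    Instr("r3", (), latency=1),
    Instr("r4", ("r2", "r3"), latency=3),
]
print(estimate_cycles(trace))  # prints 6
```

The point of the sketch is that the estimate comes from propagating timestamps through data dependences rather than from modeling every pipeline structure cycle by cycle, which is what lets such a pass stay close to the cost of an in-order simulation.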