Analysis of benchmark characteristics and benchmark performance prediction
ACM Transactions on Computer Systems (TOCS)
Fast out-of-order processor simulation using memoization
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
A hardware-driven profiling scheme for identifying program hot spots to support runtime optimization
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
HLS: combining statistical and symbolic simulation to guide microprocessor designs
Proceedings of the 27th annual international symposium on Computer architecture
Application-driven processor design exploration for power-performance trade-off analysis
Proceedings of the 2001 IEEE/ACM international conference on Computer-aided design
Automatically characterizing large scale program behavior
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Reducing State Loss For Effective Trace Sampling of Superscalar Processors
ICCD '96 Proceedings of the 1996 International Conference on Computer Design, VLSI in Computers and Processors
Accuracy and Speedup of Parallel Trace-Driven Architectural Simulation
IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
Modeling Superscalar Processors via Statistical Simulation
Proceedings of the 2001 International Conference on Parallel Architectures and Compilation Techniques
SIGMETRICS '03 Proceedings of the 2003 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Predicting whole-program locality through reuse distance analysis
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
A Framework for Statistical Modeling of Superscalar Processor Performance
HPCA '97 Proceedings of the 3rd IEEE Symposium on High-Performance Computer Architecture
SMARTS: accelerating microarchitecture simulation via rigorous statistical sampling
Proceedings of the 30th annual international symposium on Computer architecture
Positional adaptation of processors: application to energy reduction
Proceedings of the 30th annual international symposium on Computer architecture
Minimal Subset Evaluation: Rapid Warm-Up for Simulated Hardware State
ICCD '01 Proceedings of the International Conference on Computer Design: VLSI in Computers & Processors
Characterizing and Predicting Program Behavior and its Variability
Proceedings of the 12th International Conference on Parallel Architectures and Compilation Techniques
MinneSPEC: A New SPEC Benchmark Workload for Simulation-Based Computer Architecture Research
IEEE Computer Architecture Letters
Performance analysis through synthetic trace generation
ISPASS '00 Proceedings of the 2000 IEEE International Symposium on Performance Analysis of Systems and Software
Memory reference reuse latency: Accelerated warmup for sampled microarchitecture simulation
ISPASS '03 Proceedings of the 2003 IEEE International Symposium on Performance Analysis of Systems and Software
Energy-aware fetch mechanism: trace cache and BTB customization
ISLPED '05 Proceedings of the 2005 international symposium on Low power electronics and design
Cross-Platform Performance Prediction of Parallel Applications Using Partial Execution
SC '05 Proceedings of the 2005 ACM/IEEE conference on Supercomputing
Simulation of Computer Architectures: Simulators, Benchmarks, Methodologies, and Recommendations
IEEE Transactions on Computers
Selecting Software Phase Markers with Code Structure Analysis
Proceedings of the International Symposium on Code Generation and Optimization
Complexity-based program phase analysis and classification
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
A Sampling Method Focusing on Practicality
IEEE Micro
Predicting locality phases for dynamic memory optimization
Journal of Parallel and Distributed Computing
Analysis of input-dependent program behavior using active profiling
Proceedings of the 2007 workshop on Experimental computer science
Analysis of input-dependent program behavior using active profiling
ecs'07 Experimental computer science on Experimental computer science
Applying Statistical Sampling for Fast and Efficient Simulation of Commercial Workloads
IEEE Transactions on Computers
A superscalar simulation employing poisson distributed stalls
Computers and Electrical Engineering
Program phase detection and exploitation
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Statistical sampling of microarchitecture simulation
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
BarrierWatch: characterizing multithreaded workloads across and within program-defined epochs
Proceedings of the 8th ACM International Conference on Computing Frontiers
Dynamic code region (DCR) based program phase tracking and prediction for dynamic optimizations
HiPEAC'05 Proceedings of the First international conference on High Performance Embedded Architectures and Compilers
Thermal-aware sampling in architectural simulation
Proceedings of the 2012 ACM/IEEE international symposium on Low power electronics and design
Hi-index | 0.01 |
Studying program behavior is a central component in architectural designs. In this paper, we study and exploit one aspect of program behavior, the behavior repetition, to expedite simulation. Detailed architectural simulation can be long and computationally expensive. Various alternatives are commonly used to simulate a much smaller instruction stream to evaluate design choices: using a reduced input set or simulating only a small window of the instruction stream. In this paper, we propose to reduce the amount of detailed simulation by avoiding simulating repeated code sections that demonstrate stable behavior. By characterizing program behavior repetition and use the information to select a subset of instructions for detailed simulation, we can significantly speed up the process without affecting the accuracy. In most cases, simulation time of full-length SPEC CPU2000 benchmarks is reduced from hundreds of hours to a few hours. The average error incurred is only about 1% or less for a range of metrics.