Efficient simulation of caches under optimal replacement with applications to miss characterization
SIGMETRICS '93 Proceedings of the 1993 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Complexity/performance tradeoffs with non-blocking loads
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
HLS: combining statistical and symbolic simulation to guide microprocessor designs
Proceedings of the 27th annual international symposium on Computer architecture
Wattch: a framework for architectural-level power analysis and optimizations
Proceedings of the 27th annual international symposium on Computer architecture
Automatically characterizing large scale program behavior
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Computer
Reducing State Loss For Effective Trace Sampling of Superscalar Processors
ICCD '96 Proceedings of the 1996 International Conference on Computer Design, VLSI in Computers and Processors
Proceedings of the 2001 International Conference on Parallel Architectures and Compilation Techniques
Modeling Superscalar Processors via Statistical Simulation
Proceedings of the 2001 International Conference on Parallel Architectures and Compilation Techniques
Lockup-free instruction fetch/prefetch cache organization
ISCA '81 Proceedings of the 8th annual symposium on Computer Architecture
Representative Traces for Processor Models with Infinite Cache
HPCA '96 Proceedings of the 2nd IEEE Symposium on High-Performance Computer Architecture
A Framework for Statistical Modeling of Superscalar Processor Performance
HPCA '97 Proceedings of the 3rd IEEE Symposium on High-Performance Computer Architecture
SMARTS: accelerating microarchitecture simulation via rigorous statistical sampling
Proceedings of the 30th annual international symposium on Computer architecture
Picking Statistically Valid and Early Simulation Points
Proceedings of the 12th International Conference on Parallel Architectures and Compilation Techniques
A First-Order Superscalar Processor Model
Proceedings of the 31st annual international symposium on Computer architecture
Control Flow Modeling in Statistical Simulation for Accurate and Efficient Processor Design Studies
Proceedings of the 31st annual international symposium on Computer architecture
Improved automatic testcase synthesis for performance model validation
Proceedings of the 19th annual international conference on Supercomputing
MinneSPEC: A New SPEC Benchmark Workload for Simulation-Based Computer Architecture Research
IEEE Computer Architecture Letters
Efficient design space exploration of high performance embedded out-of-order processors
Proceedings of the conference on Design, automation and test in Europe: Proceedings
Microprocessor power estimation using profile-driven program synthesis
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
IEEE Transactions on Computers
A survey on cache tuning from a power/energy perspective
ACM Computing Surveys (CSUR)
Hi-index | 0.02 |
Microprocessor design is a very complex and time-consuming activity. One of the primary reasons is the huge design space that needs to be explored in order to identify the optimal design given a number of constraints. Simulations are usually used to explore these huge design spaces, however, they are fairly slow. Several hundreds of billions of instructions need to be simulated per benchmark; and this needs to be done for every design point of interest.Recently, statistical simulation was proposed to efficiently cull a huge design space. The basic idea of statistical simulation is to collect a number of important program characteristics and to generate a synthetic trace from it. Simulating this synthetic trace is extremely fast as it contains a million instructions only.This paper improves the statistical simulation methodology by proposing accurate memory data flow models. We model (i) load forwarding, (ii) delayed cache hits, and (iii) correlation between cache misses based on path info. Our experiments using the SPEC CPU2000 benchmarks show a substantial improvement upon current state-of-the-art statistical simulation methods. For example, for our baseline configuration we reduce the average IPC prediction error from 10.7% to 2.3%. In addition, we show that performance trends are predicted very accurately, making statistical simulation enhanced with accurate data flow models a useful tool for efficient and accurate microprocessor design space explorations.