Hybrid analytical modeling of pending cache hits, data prefetching, and MSHRs

Authors: Xi E. Chen; Tor M. Aamodt
Affiliation: University of British Columbia, Vancouver, BC, Canada (both authors)
Venue: ACM Transactions on Architecture and Code Optimization (TACO)
Year: 2011

Abstract

This article proposes hybrid analytical models that predict the performance impact of pending cache hits, hardware data prefetching, and miss status holding register (MSHR) resources on superscalar microprocessors. The models focus on the timeliness of pending hits and prefetches and account for a limited number of MSHRs. They improve the modeling accuracy of pending hits by 3.9×. When modeling data prefetching, a limited number of MSHRs, or both, they yield average errors of 9.5% to 17.8%. The impact of non-uniform DRAM latency is shown to be well approximated by a moving average of memory access latency.
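
As a rough illustration of the moving-average idea mentioned in the abstract, the Python sketch below keeps a running mean over the most recent memory access latencies. The class name `MovingAverageLatency`, the window size, and the sample latencies are hypothetical. The abstract does not say how the article chooses its window or weights its samples, so this is a plain windowed mean under those assumptions, not the authors' implementation.

```python
from collections import deque


class MovingAverageLatency:
    """Mean of the most recent `window` memory access latencies (hypothetical helper)."""

    def __init__(self, window):
        if window <= 0:
            raise ValueError("window must be positive")
        # A bounded deque drops its oldest entry when a new one is appended at capacity.
        self._samples = deque(maxlen=window)
        self._total = 0

    def record(self, latency_cycles):
        # If the window is full, the append below evicts the oldest sample,
        # so remove its contribution from the running total first.
        if len(self._samples) == self._samples.maxlen:
            self._total -= self._samples[0]
        self._samples.append(latency_cycles)
        self._total += latency_cycles

    def average(self):
        if not self._samples:
            raise ValueError("no latency samples recorded")
        return self._total / len(self._samples)


# Usage with made-up latencies (in cycles) and an assumed window of 2 accesses.
tracker = MovingAverageLatency(window=2)
for latency in (100, 200, 400):
    tracker.record(latency)
estimated_latency = tracker.average()  # mean of the last two samples, 200 and 400
```

A model that assumes a single fixed memory latency could, under this sketch, read `tracker.average()` at each point of interest instead, so the latency estimate follows recent DRAM behavior rather than a global constant.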