The Decoupled-Style Prefetch Architecture (Research Note)

Authors:
Kevin D. Rich; Matthew K. Farrens
Venue:
Euro-Par '00 Proceedings from the 6th International Euro-Par Conference on Parallel Processing
Year:
2000

Abstract

Decoupled processing seeks to tolerate memory latency by dynamically scheduling memory accesses so that operands are prefetched. Because decoupled processors cannot issue memory operations speculatively, control-flow operations can significantly limit their ability to prefetch data. The prefetching architecture proposed here seeks to retain the dynamic scheduling benefits of decoupled processing while allowing memory operations to be invoked speculatively. The prefetching mechanism is evaluated using the SPEC95 benchmark suite; it achieves significant reductions in cache miss rate, resulting in speedups of over 40% of peak for most of the inputs.
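
As a rough illustration of the general idea of an access stream running ahead of the compute stream, the sketch below counts demand misses in a toy cache with and without run-ahead prefetching. It is not the mechanism proposed in the paper: the cache model, the block size, the run-ahead distance, the trace, and all names (`LRUCache`, `demand_misses`, `run_ahead`) are assumptions made for illustration, and memory latency, speculation, and control flow are not modelled.

```python
from collections import OrderedDict

BLOCK_SIZE = 32  # bytes per cache block (assumed value, not from the paper)


class LRUCache:
    """Toy fully associative cache of block addresses with LRU replacement."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.blocks = OrderedDict()

    def touch(self, addr):
        """Access addr; return True on a hit, False on a miss (block is filled)."""
        block = addr // BLOCK_SIZE
        hit = block in self.blocks
        if hit:
            self.blocks.move_to_end(block)  # mark as most recently used
        else:
            self.blocks[block] = True
            if len(self.blocks) > self.num_blocks:
                self.blocks.popitem(last=False)  # evict least recently used
        return hit


def demand_misses(trace, num_blocks, run_ahead):
    """Count misses seen by the compute stream.

    A hypothetical access stream touches the address `run_ahead` positions
    ahead of the compute stream, so the block is resident when it is needed.
    run_ahead=0 disables prefetching.
    """
    cache = LRUCache(num_blocks)
    misses = 0
    for i, addr in enumerate(trace):
        if run_ahead > 0 and i + run_ahead < len(trace):
            cache.touch(trace[i + run_ahead])  # prefetch, not counted as a miss
        if not cache.touch(addr):
            misses += 1
    return misses


# Assumed workload: 1024 sequential 8-byte loads.
trace = [8 * i for i in range(1024)]
baseline = demand_misses(trace, num_blocks=64, run_ahead=0)
prefetched = demand_misses(trace, num_blocks=64, run_ahead=16)
print(baseline, prefetched)
```

On this sequential trace the run-ahead stream removes every demand miss except those on the first few blocks, which are requested before any prefetch could have covered them. A trace whose addresses depend on branch outcomes would not be known in advance, which is the limitation of non-speculative decoupling that the abstract points to.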