A comparison of data prefetching on an access decoupled and superscalar machine

Authors:
G. P. Jones; N. P. Topham
Affiliations:
Dept. of Computer Science, Edinburgh University, Edinburgh, Scotland, U.K.; Dept. of Computer Science, Edinburgh University, Edinburgh, Scotland, U.K.
Venue:
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Year:
1997



Abstract

In this paper we investigate the behavior of data prefetching on an access decoupled machine and on a superscalar machine. We assess whether the decoupling paradigm offers any benefit, given that an out-of-order (o-o-o) superscalar architecture could in principle prefetch to the same degree as an access decoupled machine. We find that, for large issue widths, the access decoupled machine hides memory latency more effectively than an o-o-o superscalar architecture with a single instruction window. Our findings also show that an access decoupled machine reduces the complexity of the window issue logic.