Memory latency effects in decoupled architectures with a single data memory module

Authors:
Lizyamma Kurian;Paul T. Hulina;Lee D. Coraor
Venue:
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Year:
1992

Abstract

Decoupled computer architectures partition the memory access and execute functions in a computer program and achieve high performance by exploiting the fine-grain parallelism between the two. These architectures use an access processor to fetch data ahead of demand by the execute process, and hence are often less sensitive to memory access delays than conventional architectures. Past performance studies of decoupled computers used memory systems that are interleaved or pipelined. We undertake a simulation study of the latency effects in decoupled computers connected to a single, conventional, non-interleaved data memory module, so that the effect of decoupling is isolated from the improvement caused by interleaving. We compare decoupled computer performance to that of single processors with caches, study the memory latency sensitivity of the decoupled systems, and perform simulations to determine the significance of data caches in a decoupled computer architecture. The Lawrence Livermore Loops and two signal processing algorithms are used as the simulation benchmarks.
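The latency-hiding idea in the abstract can be illustrated with a toy timing model. This is only a sketch under simplifying assumptions and is not the simulator used in the paper: the function names, the fixed per-operand costs, and the parameters `mem_latency`, `exec_cycles`, and `queue_depth` are all hypothetical. It assumes a single memory module that serves one request at a time (no interleaving or pipelining), one load per execute step, and no caches.

```python
def coupled_time(n, mem_latency, exec_cycles):
    """Toy conventional processor: each operand is loaded, then consumed,
    with no overlap between the memory access and the computation."""
    return n * (mem_latency + exec_cycles)


def decoupled_time(n, mem_latency, exec_cycles, queue_depth):
    """Toy decoupled machine: an access processor issues loads ahead of
    demand into a bounded queue; an execute processor drains the queue.
    The single memory module handles only one request at a time."""
    done = []      # done[i] = time the execute side finishes operand i
    mem_free = 0   # time at which the memory module is next idle
    for i in range(n):
        issue = mem_free
        if i >= queue_depth:
            # A queue slot frees when the execute side dequeues
            # operand i - queue_depth, i.e. when it starts working on it.
            slot_free = done[i - queue_depth] - exec_cycles
            issue = max(issue, slot_free)
        arrival = issue + mem_latency
        mem_free = arrival  # non-interleaved: next load waits for this one
        exec_start = max(arrival, done[-1] if done else 0)
        done.append(exec_start + exec_cycles)
    return done[-1]


if __name__ == "__main__":
    # Made-up parameters, chosen only to show the overlap effect.
    print(coupled_time(100, 4, 2))       # 600
    print(decoupled_time(100, 4, 2, 4))  # 402
```

In this model the decoupled time approaches `n * max(mem_latency, exec_cycles)` rather than `n * (mem_latency + exec_cycles)`, which is the overlap the abstract describes. With a single non-interleaved module, the memory itself becomes the bottleneck once its latency exceeds the execute time per operand.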