High-Bandwidth Interleaved Memories for Vector Processors - A Simulation Study

Authors:
G. S. Sohi
Affiliations:
-
Venue:
IEEE Transactions on Computers
Year:
1993

Citing 9

On the effective bandwidth of interleaved memories in vector processor systems
IEEE Transactions on Computers

A Simulation Study of the CRAY X-MP Memory System
IEEE Transactions on Computers

Vector Computer Memory Bank Contention
IEEE Transactions on Computers

Vector access performance in parallel memories using skewed storage scheme
IEEE Transactions on Computers

An aperiodic storage scheme to reduce memory conflicts in vector processors
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture

On randomly interleaved memories
Proceedings of the 1990 ACM/IEEE conference on Supercomputing

Pseudo-randomly interleaved memory
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture

Increased Memory Performance During Vector Accesses Through the Use of Linear Address Transformations
IEEE Transactions on Computers

The CRAY-1 computer system
Communications of the ACM - Special issue on computer architecture

Cited 20

Accounting for memory bank contention and delay in high-bandwidth multiprocessors
Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures

Semi-linear and bi-base storage schemes classes: general overview and case study
ICS '95 Proceedings of the 9th international conference on Supercomputing

Minimization of Memory and Network Contention for Accessing Arbitrary Data Patterns in SIMD Systems
IEEE Transactions on Computers

A Heuristic Storage for Minimizing Access Time of Arbitrary Data Patterns
IEEE Transactions on Parallel and Distributed Systems

Accounting for Memory Bank Contention and Delay in High-Bandwidth Multiprocessors
IEEE Transactions on Parallel and Distributed Systems

Minimizing Conflicts Between Vector Streams in Interleaved Memory Systems
IEEE Transactions on Computers

Code transformations to improve memory parallelism
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture

Evaluation of Neural and Genetic Algorithms for Synthesizing Parallel Storage Schemes
International Journal of Parallel Programming

Buffered Banks in Multiprocessor Systems
IEEE Transactions on Computers

Array organization in parallel memories
International Journal of Parallel Programming

On Design of Parallel Memory Access Schemes for Video Coding
Journal of VLSI Signal Processing Systems

XOR-Based Hash Functions
IEEE Transactions on Computers

The design space of data-parallel memory systems
Proceedings of the 2006 ACM/IEEE conference on Supercomputing

Module Partitioning and Interlaced Data Placement Schemes to Reduce Conflicts in Interleaved Memories
ICPP '94 Proceedings of the 1994 International Conference on Parallel Processing - Volume 01

PSIM: Periodically Shifted Interleaved Memory System
ICPP '94 Proceedings of the 1994 International Conference on Parallel Processing - Volume 01

High-bandwidth Address Generation Unit
Journal of Signal Processing Systems

High-bandwidth address generation unit
SAMOS'07 Proceedings of the 7th international conference on Embedded computer systems: architectures, modeling, and simulation

Many-Thread Aware Prefetching Mechanisms for GPGPU Applications
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture

MEMS interleaving read operation of a holographic memory for optically reconfigurable gate arrays
ARC'11 Proceedings of the 7th international conference on Reconfigurable computing: architectures, tools and applications

A network congestion-aware memory subsystem for manycore
ACM Transactions on Embedded Computing Systems (TECS) - Special Section on Wireless Health Systems, On-Chip and Off-Chip Network Architectures

Abstract

This paper describes permutation-based interleaving schemes, a family of alternative interleaving schemes that improve memory bandwidth for a wide range of access patterns in high-performance vector processing systems. These schemes can be implemented with a small amount of additional hardware and minimal time overhead. The paper reviews the results of a detailed simulation analysis, which suggests that, with adequate buffering, permutation-based interleaving schemes similar to those studied can be used to implement a high-bandwidth memory system for vector processors. Compared with a memory system using standard interleaving, the resulting memory system sustains its bandwidth far better across a wide variety of access patterns and for large bank busy times.
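
The abstract does not spell out the address mappings themselves, so the following is only a minimal sketch of the general idea, not the schemes evaluated in the paper. It assumes a power-of-two number of banks and uses a simple XOR of two address bit fields as a stand-in permutation; the function names `standard_bank`, `permuted_bank`, and `banks_touched` are hypothetical. It compares how many distinct banks a strided vector access touches under standard low-order interleaving and under the permuted mapping.

```python
def standard_bank(addr: int, num_banks: int) -> int:
    """Standard low-order interleaving: bank = address mod number of banks."""
    return addr % num_banks


def permuted_bank(addr: int, num_banks: int) -> int:
    """Illustrative permutation (an assumption, not taken from the paper):
    XOR the low-order bank bits with the next-higher field of the same width.
    Within each group of num_banks consecutive addresses, this permutes the
    bank order, so every bank is still used exactly once per group.
    Assumes num_banks is a power of two."""
    bits = num_banks.bit_length() - 1          # log2(num_banks)
    low = addr & (num_banks - 1)               # standard bank field
    high = (addr >> bits) & (num_banks - 1)    # next-higher bit field
    return low ^ high


def banks_touched(bank_fn, stride: int, length: int, num_banks: int) -> int:
    """Count the distinct banks hit by a vector access starting at address 0."""
    return len({bank_fn(i * stride, num_banks) for i in range(length)})


if __name__ == "__main__":
    NUM_BANKS = 8
    for stride in (1, 8):
        std = banks_touched(standard_bank, stride, NUM_BANKS, NUM_BANKS)
        perm = banks_touched(permuted_bank, stride, NUM_BANKS, NUM_BANKS)
        print(f"stride={stride}: standard={std} banks, permuted={perm} banks")
```

With 8 banks, a stride-8 access lands entirely in one bank under standard interleaving, while this particular XOR mapping spreads the same eight elements over all eight banks; unit-stride accesses use all banks under both mappings. Other strides can still conflict under this simple mapping, which is consistent with the abstract's point that buffering matters alongside the choice of permutation.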