The synchronized and simultaneous access to several vectors that form a single stream is typical in SIMD vector multiprocessors as well as in MIMD superscalar multiprocessors with decoupled access. In this paper we propose a block-interleaved storage scheme and an out-of-order access mechanism that together allow conflict-free access to streams with an arbitrary initial address and a constant stride between elements. The memory system can have any degree of unmatchness, and we consider the use of either a crossbar or a multistage interconnection network. The scheme yields a maximal number of conflict-free stride families, including the most commonly used strides. We describe the hardware for address calculation and control and show that its additional cost is minimal compared with the cost of the hardware for in-order access. Finally, we evaluate the applicability of this technique to real loops from programs in the Perfect Club and SPEC suites.
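To make the notion of a constant-stride stream hitting memory modules concrete, the following is a minimal sketch of a generic block-interleaved address mapping. It is not the storage scheme or the out-of-order mechanism proposed in the paper; the function names, the `block_size` parameter, and the module count of 8 are assumptions chosen only for illustration. It counts how many distinct modules a stream touches, which indicates whether consecutive requests can be served in parallel or pile up on a few modules.

```python
def module_of(address, num_modules, block_size=1):
    """Map an address to a module under simple block interleaving.

    Hypothetical mapping for illustration: consecutive blocks of
    `block_size` words are assigned to consecutive modules.
    """
    return (address // block_size) % num_modules


def stream_modules(start, stride, length, num_modules, block_size=1):
    """Return the module visited by each element of a constant-stride stream."""
    return [
        module_of(start + i * stride, num_modules, block_size)
        for i in range(length)
    ]


def distinct_modules(start, stride, length, num_modules, block_size=1):
    """Count the distinct modules touched by the stream."""
    return len(set(stream_modules(start, stride, length, num_modules, block_size)))


if __name__ == "__main__":
    M = 8  # assumed number of memory modules
    for stride in (1, 2, 8):
        used = distinct_modules(0, stride, M, M)
        print(f"word-interleaved, stride {stride}: {used} of {M} modules used")
    # With 4-word blocks, a stride of 4 spreads over all modules again.
    used = distinct_modules(0, 4, M, M, block_size=4)
    print(f"block size 4, stride 4: {used} of {M} modules used")
```

Under these assumptions, a stride equal to the number of modules maps every element to the same module, while changing the block size shifts which strides spread evenly. This is the kind of stride sensitivity that a conflict-free storage scheme aims to remove.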