Interleaved parallel schemes: improving memory throughput on supercomputers

Authors:
André Seznec;Jacques Lenfant
Affiliations:
-;-
Venue:
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Year:
1992

Citing 4
Cited 11

An efficient routing control for the SIGMA network Σ(4)

ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
Performance evaluation of vector accesses in parallel memories using a skewed storage scheme

ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
A new interconnection network for SIMD computers: the sigma networks

IEEE Transactions on Computers
Vector access performance in parallel memories using skewed storage scheme

IEEE Transactions on Computers

Odd memory systems may be quite interesting

ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Scalable parallel memory architecture with a skew scheme

ICS '93 Proceedings of the 7th international conference on Supercomputing
Synchronized access to streams in SIMD vector multiprocessors

ICS '94 Proceedings of the 8th international conference on Supercomputing
Vector multiprocessors with arbitrated memory access

ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Semi-linear and bi-base storage schemes classes: general overview and case study

ICS '95 Proceedings of the 9th international conference on Supercomputing
A permutation-based page interleaving scheme to reduce row-buffer conflicts and exploit data locality

Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Increasing the effective bandwidth of complex memory systems in multivector processors

Supercomputing '96 Proceedings of the 1996 ACM/IEEE conference on Supercomputing
Conflict-Free Accesses to Strided Vectors on a Banked Cache

IEEE Transactions on Computers
Module Partitioning and Interlaced Data Placement Schemes to Reduce Conflicts in Interleaved Memories

ICPP '94 Proceedings of the 1994 International Conference on Parallel Processing - Volume 01
High-bandwidth Address Generation Unit

Journal of Signal Processing Systems
PPT: joint performance/power/thermal management of DRAM memory for multi-core systems

Proceedings of the 14th ACM/IEEE international symposium on Low power electronics and design

Abstract

On many commercial supercomputers, several vector register processors share a global highly interleaved memory in a MIMD mode. When all the processors are working on a single vector loop, a significant part of the potential memory throughput may be wasted due to the asynchronism of the processors. In order to limit this loss of memory throughput, a SIMD synchronization mode for vector accesses to memory may be used. But an important part of the memory bandwidth may be wasted when accessing vectors with an even stride. In this paper, we present IPS, an interleaved parallel scheme, which ensures an equitable distribution of elements on a highly interleaved memory for a wide range of vector strides. We show how to organize access to memory, such that unscrambling of vectors from memory to the vector register processors requires a minimum number of passes through the interconnection network.
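
The even-stride problem mentioned in the abstract can be illustrated with a small sketch. It does not implement IPS. It assumes plain low-order interleaving (bank = address mod number of banks) and a power-of-two bank count, neither of which the abstract states. The function names and the choice of 16 banks are hypothetical and used only for illustration.

```python
from math import gcd


def banks_touched(stride: int, num_banks: int, length: int) -> int:
    """Count the distinct banks hit by a strided vector access.

    Assumption (not from the abstract): low-order interleaving, where
    element i of the vector lives in bank (i * stride) mod num_banks.
    """
    return len({(i * stride) % num_banks for i in range(length)})


def reachable_banks(stride: int, num_banks: int) -> int:
    """Closed form for a long enough vector: num_banks / gcd(stride, num_banks)."""
    return num_banks // gcd(stride, num_banks)


if __name__ == "__main__":
    NUM_BANKS = 16   # hypothetical bank count (power of two)
    LENGTH = 64      # hypothetical vector length
    for stride in (1, 2, 3, 4, 8, 16):
        used = banks_touched(stride, NUM_BANKS, LENGTH)
        print(f"stride {stride:2d}: {used:2d} of {NUM_BANKS} banks used")
```

Under these assumptions, odd strides reach every bank, while an even stride reaches only a fraction of them, which is the kind of loss a skewed or interleaved parallel scheme aims to avoid.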