In multivector processors, cycles lost to conflicts between concurrent vector streams make the effective throughput lower than the peak throughput. Even when the aggregate request rate of the concurrent vector streams to every memory module is less than or equal to the service rate, conflicts appear because the streams reference memory modules in different orders. In addition, in a memory system where several memory modules are mapped onto every bus (a complex memory system), bus conflicts are added to memory module conflicts. This paper proposes an access order for the vector stream elements that reduces the average memory access time in vector processors with complex memory systems. When the request rate is greater than the service rate, the proposed order reduces the number of lost cycles and the effective throughput increases. In all other cases, the effective throughput reaches the peak throughput. The proposed order generates the memory references in such a way that the memory modules shared by the concurrent self-conflict-free vector streams, and the sections where memory modules are mapped, are referenced in the same order.
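The effect of reference order on lost cycles can be illustrated with a toy simulation. The sketch below is not the scheme proposed in the paper: it assumes a hypothetical system with 8 memory modules, a module busy time of 4 cycles, two streams that each issue at most one request per cycle, and fixed priority between streams. It does not model buses or sections. All names (`simulate`, `BUSY_TIME`, and so on) are illustrative. With these parameters, the request rate to every module equals its service rate.

```python
# Toy model (illustrative assumptions only, not the paper's scheme):
# each module stays busy for `busy_time` cycles after accepting a request,
# and a stream that finds its target module busy loses that cycle.

def simulate(orders, busy_time, cycles):
    """Return the number of lost cycles per stream.

    orders    -- one list of module indices per stream, visited cyclically
    busy_time -- cycles a module is occupied by one request
    cycles    -- number of cycles to simulate
    """
    free_at = {}                  # module -> first cycle at which it is free
    pos = [0] * len(orders)       # next element index for each stream
    lost = [0] * len(orders)      # lost cycles for each stream
    for t in range(cycles):
        # Streams are served in list order (fixed priority, an assumption).
        for s, order in enumerate(orders):
            module = order[pos[s] % len(order)]
            if free_at.get(module, 0) <= t:
                free_at[module] = t + busy_time   # request accepted
                pos[s] += 1
            else:
                lost[s] += 1                      # conflict: cycle lost
    return lost

NUM_MODULES = 8   # hypothetical number of memory modules
BUSY_TIME = 4     # hypothetical module busy time, in cycles
CYCLES = 64

ascending = list(range(NUM_MODULES))
descending = ascending[::-1]

# Both streams reference the modules in the same order.
same_order_lost = simulate([ascending, ascending], BUSY_TIME, CYCLES)
# The second stream references the modules in the opposite order.
mixed_order_lost = simulate([ascending, descending], BUSY_TIME, CYCLES)

print("same order: ", same_order_lost)
print("mixed order:", mixed_order_lost)
```

In this toy model, the same-order run loses only the four start-up cycles during which the second stream waits for the first module to become free; after that the two streams follow each other without conflict. The mixed-order run keeps losing cycles throughout, even though the request rate does not exceed the service rate.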