This paper presents and evaluates a scheme for reducing the average memory access time in a vector processing architecture. This scheme uses data skewing to distribute vectors among the modules of a parallel memory system in such a way that, for typical vector access patterns, the average number of memory conflicts is reduced. It also employs both address and data buffers in each module to smooth out the transient irregularities that occur in some vector access patterns.

Most previous data skewing techniques were developed to provide conflict-free access for a limited set of access strides. While the proposed scheme does not eliminate all conflicts, it improves the average performance over non-skewed parallel memories by significantly reducing the number of conflicts for a wide range of strides. Also, this effect is much less dependent on the number of memory modules than the skewing schemes used to obtain conflict-free access.
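To make the idea of data skewing concrete, the sketch below compares plain low-order interleaving with a simple row-skewed mapping and counts how many accesses of a strided vector land on the busiest module. The abstract does not specify the mapping function, so the skew used here (shift each row of addresses by one module) is a textbook illustration chosen as an assumption, not the scheme the paper evaluates. The function names and the parameter `skew` are hypothetical.

```python
from collections import Counter


def interleaved_module(addr, num_modules):
    """Non-skewed mapping: consecutive addresses go to consecutive modules."""
    return addr % num_modules


def skewed_module(addr, num_modules, skew=1):
    """Illustrative row skew (an assumption, not the paper's scheme):
    each row of num_modules addresses is rotated by `skew` more modules."""
    row = addr // num_modules
    return (addr + skew * row) % num_modules


def busiest_module_load(mapping, stride, length, num_modules, base=0):
    """Count accesses per module for a strided vector and return the maximum.
    A result of 1 means no two of the `length` accesses share a module."""
    counts = Counter(
        mapping(base + i * stride, num_modules) for i in range(length)
    )
    return max(counts.values())


if __name__ == "__main__":
    modules = 8
    for stride in (1, 2, 4, 8, 9):
        plain = busiest_module_load(interleaved_module, stride, modules, modules)
        skewed = busiest_module_load(skewed_module, stride, modules, modules)
        print(f"stride {stride}: interleaved={plain} skewed={skewed}")
```

With 8 modules and a stride of 8, plain interleaving sends every element to module 0, while this skew spreads the same 8 elements over 8 distinct modules. Other strides, such as 9, still collide under this particular skew, which is consistent with the abstract's point that skewing reduces conflicts on average rather than eliminating them.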