This paper describes a family of alternative interleaving schemes, called permutation-based interleaving schemes, that improve memory bandwidth across a wide range of access patterns in high-performance vector processing systems. These schemes can be implemented with a small amount of additional hardware and minimal time overhead. The results of a detailed simulation analysis are reviewed. The analysis suggests that, with adequate buffering, permutation-based interleaving schemes similar to those studied can be used to implement a high-bandwidth memory system for vector processors. Compared with standard interleaving, the resulting memory system sustains its bandwidth far better across a wide variety of access patterns and for large bank busy times.
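To make the contrast with standard interleaving concrete, the following is a minimal sketch, not the scheme studied in the paper. The abstract does not specify the mappings, so the XOR-based mapping below is an assumption chosen for illustration. The function names (`standard_bank`, `xor_permuted_bank`, `banks_touched`) and the 8-bank configuration are also hypothetical.

```python
def standard_bank(addr: int, bank_bits: int) -> int:
    """Standard low-order interleaving: bank = addr mod 2**bank_bits."""
    return addr & ((1 << bank_bits) - 1)


def xor_permuted_bank(addr: int, bank_bits: int) -> int:
    """Assumed illustrative mapping: XOR the low-order bank bits with the
    next group of address bits. For any fixed value of the higher bits this
    is a permutation of the banks, so each aligned block of 2**bank_bits
    consecutive addresses still covers every bank exactly once."""
    mask = (1 << bank_bits) - 1
    return (addr ^ (addr >> bank_bits)) & mask


def banks_touched(bank_fn, stride: int, length: int, bank_bits: int) -> int:
    """Count the distinct banks hit by a strided vector access from address 0."""
    return len({bank_fn(i * stride, bank_bits) for i in range(length)})


if __name__ == "__main__":
    BANK_BITS = 3  # 8 banks (hypothetical configuration)
    for stride in (1, 8):
        std = banks_touched(standard_bank, stride, 8, BANK_BITS)
        xor = banks_touched(xor_permuted_bank, stride, 8, BANK_BITS)
        print(f"stride={stride}: standard={std} banks, xor-permuted={xor} banks")
```

In this sketch, a stride equal to the number of banks sends every element to the same bank under standard interleaving, while the assumed XOR mapping spreads the same eight accesses over all eight banks. Other strides can still conflict under this particular mapping, and the sketch does not model the buffering or bank busy times that the abstract identifies as important.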