Block, Multistride Vector, and FFT Accesses in Parallel Memory Systems

Authors:
D. T. Harper, III
Venue:
IEEE Transactions on Parallel and Distributed Systems
Year:
1991

Citing 11

On the effective bandwidth of interleaved memories in vector processor systems
IEEE Transactions on Computers

A Simulation Study of the CRAY X-MP Memory System
IEEE Transactions on Computers

Vector Computer Memory Bank Contention
IEEE Transactions on Computers

Vector access performance in parallel memories using skewed storage scheme
IEEE Transactions on Computers

The Titan Graphics Supercomputer Architecture
Computer

Scrambled storage for parallel memory systems
ISCA '88 Proceedings of the 15th Annual International Symposium on Computer architecture

A dynamic storage scheme for conflict-free vector access
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture

Perfect Latin squares and parallel array access
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture

Conflict-Free Vector Access Using a Dynamic Storage Scheme
IEEE Transactions on Computers

Increased Memory Performance During Vector Accesses Through the Use of Linear Address Transformations
IEEE Transactions on Computers

DFT/FFT and Convolution Algorithms: Theory and Implementation
DFT/FFT and Convolution Algorithms: Theory and Implementation

Cited 28

Accurate modelling of interconnection networks in vector supercomputers
ICS '91 Proceedings of the 5th international conference on Supercomputing

A novel cache design for vector processing
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture

Increasing the number of strides for conflict-free vector access
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture

Conflict-free access of vectors with power-of-two strides
ICS '92 Proceedings of the 6th international conference on Supercomputing

On storage schemes for parallel array access
ICS '92 Proceedings of the 6th international conference on Supercomputing

Introducing a New Cache Design into Vector Computers
IEEE Transactions on Computers

Synchronized access to streams in SIMD vector multiprocessors
ICS '94 Proceedings of the 8th international conference on Supercomputing

A Memory Interference Model for Regularly Patterned Multiple Stream Vector Accesses
IEEE Transactions on Parallel and Distributed Systems

Minimization of Memory and Network Contention for Accessing Arbitrary Data Patterns in SIMD Systems
IEEE Transactions on Computers

Accounting for Memory Bank Contention and Delay in High-Bandwidth Multiprocessors
IEEE Transactions on Parallel and Distributed Systems

A Comparative Analysis of Cache Designs for Vector Processing
IEEE Transactions on Computers

Minimizing Conflicts Between Vector Streams in Interleaved Memory Systems
IEEE Transactions on Computers

Evaluation of Neural and Genetic Algorithms for Synthesizing Parallel Storage Schemes
International Journal of Parallel Programming

Design and analysis of static memory management policies for CC-NUMA Multiprocessors
Journal of Systems Architecture: the EUROMICRO Journal

Analytical Estimation of Vector Access Performance in Parallel Memory Architectures
IEEE Transactions on Computers

A Multiaccess Frame Buffer Architecture
IEEE Transactions on Computers

Conflict-Free Access for Streams in Multimodule Memories
IEEE Transactions on Computers

Configurable parallel memory architecture for multimedia computers
Journal of Systems Architecture: the EUROMICRO Journal

Multiaccess Memory System for Attached SIMD Computer
IEEE Transactions on Computers

Array organization in parallel memories
International Journal of Parallel Programming

On Design of Parallel Memory Access Schemes for Video Coding
Journal of VLSI Signal Processing Systems

Module Partitioning and Interlaced Data Placement Schemes to Reduce Conflicts in Interleaved Memories
ICPP '94 Proceedings of the 1994 International Conference on Parallel Processing - Volume 01

Sams: single-affiliation multiple-stride parallel memory scheme
Proceedings of the 2008 workshop on Memory access on future processors: a solved problem?

High-bandwidth Address Generation Unit
Journal of Signal Processing Systems

Partial access conflict-relieving programmable address shuffler for parallel memory system in multi-core processor
Microprocessors & Microsystems

SAMS multi-layout memory: providing multiple views of data to boost SIMD performance
Proceedings of the 24th ACM International Conference on Supercomputing

An Efficient Memory Organization for High-ILP Inner Modem Baseband SDR Processors
Journal of Signal Processing Systems

Elastic pipeline: addressing GPU on-chip shared memory bank conflicts
Proceedings of the 8th ACM International Conference on Computing Frontiers

Abstract

A discussion is presented of the use of dynamic storage schemes to improve parallel memory performance during three important classes of data accesses: vector accesses in which multiple strides are used to access a single vector, block accesses, and constant-geometry FFT accesses. The schemes investigated are based on linear address transformations, also known as XOR schemes. It has been shown that this class of schemes can be implemented more efficiently in hardware and has more flexibility than schemes based on row rotations or other techniques. Several analytical results are shown. These include: quantitative analysis of buffering effects in pipelined memory systems; design rules for storage schemes that provide conflict-free access using multiple strides, blocks, and FFT access patterns; and an analysis of the effects of memory bank cycle time on storage scheme capabilities.
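
As a rough illustration of what an XOR scheme does, the sketch below compares plain low-order interleaving with a bank mapping that XORs two address fields. This is not the scheme from the paper: the particular transformation, the function names (`interleaved_bank`, `xor_bank`, `banks_touched`, `is_conflict_free`), and the choice of 8 banks are all assumptions made only for this example.

```python
def interleaved_bank(addr: int, m: int) -> int:
    """Low-order interleaving: the bank is the low m address bits."""
    return addr & ((1 << m) - 1)


def xor_bank(addr: int, m: int) -> int:
    """Illustrative XOR mapping (an assumption, not the paper's scheme):
    the bank is the low m address bits XORed with the next m bits."""
    mask = (1 << m) - 1
    return (addr & mask) ^ ((addr >> m) & mask)


def banks_touched(bank_fn, start: int, stride: int, count: int, m: int) -> list:
    """Banks referenced by `count` vector elements at the given stride."""
    return [bank_fn(start + i * stride, m) for i in range(count)]


def is_conflict_free(banks: list) -> bool:
    """True when no bank appears more than once in the access sequence."""
    return len(set(banks)) == len(banks)


if __name__ == "__main__":
    m = 3                 # 2**3 = 8 memory banks (hypothetical size)
    n_banks = 1 << m
    for stride in (1, 2, 4, n_banks):
        plain = banks_touched(interleaved_bank, 0, stride, n_banks, m)
        xored = banks_touched(xor_bank, 0, stride, n_banks, m)
        print(f"stride {stride}: interleaved conflict-free="
              f"{is_conflict_free(plain)}, xor conflict-free="
              f"{is_conflict_free(xored)}")
```

With 8 banks and a vector starting at address 0, low-order interleaving sends every element of a stride-8 access to the same bank, while this XOR mapping spreads the same eight elements across all eight banks. The example covers only these few strides and this start address; it makes no claim about which strides are conflict-free in general.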