Increased Memory Performance During Vector Accesses Through the Use of Linear Address Transformations

Authors: D. T. Harper, III
Venue: IEEE Transactions on Computers
Year: 1992

Abstract

A technique for analyzing transformation matrices is presented. It is based on decomposing complex transformations into elementary transformations. Combined with a factorization of the access stride into two components, a power of 2 and an odd factor (one relatively prime to 2), the technique leads to an algorithmic synthesis of a conflict-free (CF) storage scheme. Additionally, because the arithmetic that maps an address to its storage location is performed modulo 2, the address transformation takes less time and requires less hardware than schemes based on row rotation.
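
As a rough illustration of the two ingredients the abstract names, the sketch below factors a stride into a power of 2 and an odd part, and computes a bank number as a matrix-vector product over GF(2), which is arithmetic modulo 2 built from AND and XOR. The 8-bank configuration, the function names, and the particular matrix rows are assumptions chosen for illustration; this is a generic XOR-based mapping, not the scheme synthesized in the paper.

```python
def factor_stride(stride: int) -> tuple[int, int]:
    """Split stride into 2**s * sigma with sigma odd; return (s, sigma)."""
    if stride <= 0:
        raise ValueError("stride must be a positive integer")
    s = 0
    while stride % 2 == 0:
        stride //= 2
        s += 1
    return s, stride


def parity(x: int) -> int:
    """Sum of the bits of x modulo 2."""
    return bin(x).count("1") & 1


def bank_of(address: int, rows: list[int]) -> int:
    """Map an address to a bank number with a linear transformation over GF(2).

    rows[i] is a bit mask selecting the address bits that are XORed
    together to form bit i of the bank number.
    """
    bank = 0
    for i, row in enumerate(rows):
        bank |= parity(address & row) << i
    return bank


# Hypothetical 8-bank example: bank bit i = address bit i XOR address bit i+3.
XOR_ROWS = [0b001001, 0b010010, 0b100100]
# Plain low-order interleaving for comparison: bank = address mod 8.
PLAIN_ROWS = [0b001, 0b010, 0b100]

stride = 8
addresses = [k * stride for k in range(8)]
xor_banks = [bank_of(a, XOR_ROWS) for a in addresses]
plain_banks = [bank_of(a, PLAIN_ROWS) for a in addresses]
print(factor_stride(stride), xor_banks, plain_banks)
```

With these assumed rows, eight consecutive stride-8 accesses fall in eight distinct banks, whereas plain interleaving sends all of them to bank 0. Because each bank bit is a parity of selected address bits, the mapping needs only XOR gates rather than the adders that a rotation-based scheme would use.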