Vector access performance in parallel memories using a skewed storage scheme

Authors:
D. T. Harper, III;J. R. Jump
Affiliations:
Univ. of Texas at Dallas, Richardson, TX;Rice Univ., Houston, TX
Venue:
IEEE Transactions on Computers
Year:
1987

Abstract

The degree to which high-speed vector processors approach their peak performance levels is closely tied to the amount of interference they encounter while accessing vectors in memory. In this paper we present an evaluation of a storage scheme that reduces the average memory access time in a vector-oriented architecture. A skewing scheme is used to map vector components into parallel memory modules such that, for most vector access patterns, the number of memory conflicts is lower than that observed in interleaved parallel memory systems. Address and data buffers are used locally in each module so that transient nonuniformities that occur in some access patterns do not degrade performance. Previous investigations into skewing techniques have attempted to provide conflict-free access for a limited subset of access patterns. The goal of this investigation is different. The skewing scheme evaluated here does not eliminate all memory conflicts, but it does improve the average performance of vector access over interleaved systems for a wide range of strides. It is shown that little extra hardware is required to implement the skewing scheme. Also, far fewer restrictions are placed on the number of memory modules in the system than are present in other proposed schemes.
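
To make the contrast between interleaved and skewed storage concrete, the following minimal sketch counts how many components of a strided vector land in the most heavily loaded module under each mapping. The abstract does not give the paper's exact skewing function, so the row-rotation mapping used here (`skewed_module`, with a hypothetical `skew` parameter) is only an assumed, textbook-style example and not necessarily the authors' scheme. The names `interleaved_module` and `max_module_load` are likewise illustrative. The count is a crude proxy for conflicts: it ignores module cycle time and the per-module address and data buffers mentioned in the abstract.

```python
from collections import Counter


def interleaved_module(addr: int, num_modules: int) -> int:
    """Low-order interleaving: consecutive addresses go to consecutive modules."""
    return addr % num_modules


def skewed_module(addr: int, num_modules: int, skew: int = 1) -> int:
    """Assumed row-rotation skew: each row of num_modules addresses is
    rotated by `skew` modules relative to the previous row."""
    row = addr // num_modules
    return (addr + skew * row) % num_modules


def max_module_load(mapping, base: int, stride: int, length: int,
                    num_modules: int) -> int:
    """Largest number of vector components mapped to any single module.

    A value of 1 means the access is spread over distinct modules; larger
    values mean several components compete for the same module.
    """
    counts = Counter(
        mapping(base + i * stride, num_modules) for i in range(length)
    )
    return max(counts.values())


if __name__ == "__main__":
    modules = 8
    for stride in (1, 8, 9):
        plain = max_module_load(interleaved_module, 0, stride, 8, modules)
        skewed = max_module_load(skewed_module, 0, stride, 8, modules)
        print(f"stride {stride}: interleaved={plain}, skewed={skewed}")
```

With 8 modules, a stride of 8 sends every component to the same module under plain interleaving but to 8 distinct modules under this skew, while a stride of 9 is spread evenly by interleaving and collides pairwise under the skew. This matches the abstract's point that skewing does not eliminate all conflicts but changes which strides suffer from them.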