Increasing the number of strides for conflict-free vector access

Authors:
Mateo Valero;Tomás Lang;José M. Llabería;Montse Peiron;Eduard Ayguadé;Juan J. Navarra
Affiliations:
-;-;-;-;-;-
Venue:
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Year:
1992

Citing 7
Cited 22

On the effective bandwidth of interleaved memories in vector processor systems

IEEE Transactions on Computers
Performance evaluation of vector accesses in parallel memories using a skewed storage scheme

ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
An aperiodic storage scheme to reduce memory conflicts in vector processors

ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
Analysis of vector access performance on skewed interleaved memory

ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
Conflict-Free Vector Access Using a Dynamic Storage Scheme

IEEE Transactions on Computers
Pseudo-randomly interleaved memory

ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
Block, Multistride Vector, and FFT Accesses in Parallel Memory Systems

IEEE Transactions on Parallel and Distributed Systems

Conflict-free access of vectors with power-of-two strides

ICS '92 Proceedings of the 6th international conference on Supercomputing
Synchronized access to streams in SIMD vector multiprocessors

ICS '94 Proceedings of the 8th international conference on Supercomputing
Vector multiprocessors with arbitrated memory access

ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Semi-linear and bi-base storage schemes classes: general overview and case study

ICS '95 Proceedings of the 9th international conference on Supercomputing
A data cache with multiple caching strategies tuned to different types of locality

ICS '95 Proceedings of the 9th international conference on Supercomputing
Eliminating cache conflict misses through XOR-based placement functions

ICS '97 Proceedings of the 11th international conference on Supercomputing
Minimizing Conflicts Between Vector Streams in Interleaved Memory Systems

IEEE Transactions on Computers
Algorithmic foundations for a parallel vector access memory system

Proceedings of the twelfth annual ACM symposium on Parallel algorithms and architectures
Increasing the effective bandwidth of complex memory systems in multivector processors

Supercomputing '96 Proceedings of the 1996 ACM/IEEE conference on Supercomputing
Dynamic Access Ordering for Streamed Computations

IEEE Transactions on Computers
Tarantula: a vector extension to the alpha architecture

ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Conflict-Free Access for Streams in Multimodule Memories

IEEE Transactions on Computers
A Memory Controller for Improved Performance of Streamed Computations on Symmetric Multiprocessors

IPPS '96 Proceedings of the 10th International Parallel Processing Symposium
Access ordering and memory-conscious cache utilization

HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Design and Implementation of High-Performance Memory Systems for Future Packet Buffers

Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Adaptive History-Based Memory Schedulers

Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
Adaptive History-Based Memory Schedulers for Modern Processors

IEEE Micro
A DRAM/SRAM Memory Scheme for Fast Packet Buffers

IEEE Transactions on Computers
Impulse: Memory system support for scientific applications

Scientific Programming
Module Partitioning and Interlaced Data Placement Schemes to Reduce Conflicts in Interleaved Memories

ICPP '94 Proceedings of the 1994 International Conference on Parallel Processing - Volume 01
Memory scheduling for modern microprocessors

ACM Transactions on Computer Systems (TOCS)
Self-Optimizing Memory Controllers: A Reinforcement Learning Approach

ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture

Quantified Score

Hi-index	0.01

Visualization

Abstract

Address transformation schemes, such as skewing and linear transformations, have been proposed to achieve conflict-free vector access for some strides in vector processors with multi-module memories. In this paper, we extend these schemes to achieve this conflict-free access for a larger number of strides. The basic idea is to perform an out-of-order access to vectors of fixed length, equal to that of the vector registers of the processor. Both matched and unmatched memories are considered: we show that the number of strides is even larger for the latter case. The hardware for address calculations and access control is described and shown to be of similar complexity as that required for access in order.