Architecture of an Array Processor Using a Nonlinear Skewing Scheme

Authors: De-Lei Lee
Venue: IEEE Transactions on Computers
Year: 1992

Abstract

This paper discusses the problem of constructing an array processor with N processing elements, N memories, and an interconnection network that provides conflict-free access to, and alignment of, various N-vectors of N × N arrays, including rows, columns, diagonals, contiguous blocks, and distributed blocks, where N is any even power of two. Linear skewing schemes offer no solution to this problem; the solution developed here uses a nonlinear skewing scheme and leads to a simple, efficient array processor architecture. In particular, the memory organization requires O(log N) gates to generate memory addresses for any of the N-vectors simultaneously in O(1) time. The interconnection structure aligns data for any of the N-vectors in a single pass through a network of O(N log N) gates. Because the system uses the minimum number of memories, both processing elements and memories can achieve the highest possible utilization.
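
The claim that linear skewing schemes cannot solve this problem can be checked by brute force for small N. The sketch below is an illustration under stated assumptions, not the construction from the paper. It assumes a linear skewing scheme has the form (a*i + b*j) mod N, that "diagonals" means the main diagonal and anti-diagonal, that contiguous blocks are aligned sqrt(N) × sqrt(N) subarrays, and that distributed blocks take every sqrt(N)-th row and column. The function names and the XOR mapping `xor_scheme_4` are hypothetical; the latter is just one nonlinear mapping for N = 4 that happens to pass these checks.

```python
from itertools import product
from math import isqrt


def access_patterns(n):
    """Yield (name, cells) for each N-vector checked in this sketch.

    Assumes n is an even power of two, so q = sqrt(n) is an integer.
    """
    q = isqrt(n)
    for k in range(n):
        yield f"row {k}", [(k, j) for j in range(n)]
        yield f"column {k}", [(i, k) for i in range(n)]
    yield "main diagonal", [(k, k) for k in range(n)]
    yield "anti-diagonal", [(k, n - 1 - k) for k in range(n)]
    offsets = list(product(range(q), repeat=2))
    # Assumption: contiguous blocks are q x q subarrays aligned to multiples of q.
    for r, c in product(range(0, n, q), repeat=2):
        yield f"contiguous block at ({r},{c})", [(r + di, c + dj) for di, dj in offsets]
    # Assumption: distributed blocks pick every q-th row and every q-th column.
    for r, c in offsets:
        yield f"distributed block ({r},{c})", [(r + q * di, c + q * dj) for di, dj in offsets]


def conflicts(n, module_of):
    """Return the names of patterns in which two elements share a memory module."""
    return [
        name
        for name, cells in access_patterns(n)
        if len({module_of(i, j) for i, j in cells}) < n
    ]


def conflict_free_linear_schemes(n):
    """Return all (a, b) for which (a*i + b*j) mod n has no conflicts."""
    return [
        (a, b)
        for a, b in product(range(n), repeat=2)
        if not conflicts(n, lambda i, j: (a * i + b * j) % n)
    ]


def xor_scheme_4(i, j):
    """Hypothetical nonlinear (XOR-based) mapping for N = 4, for illustration only."""
    i1, i0 = i >> 1, i & 1
    return j ^ (((i0 ^ i1) << 1) | i1)


if __name__ == "__main__":
    print(conflict_free_linear_schemes(4))   # linear schemes that pass every check
    print(conflicts(4, xor_scheme_4))        # patterns on which the XOR mapping conflicts
```

The search comes up empty for a simple reason: conflict-free rows require b to be odd, conflict-free columns require a to be odd, and the main diagonal then sees the multiplier a + b, which is even and therefore repeats modules when N is a power of two.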