Access and Alignment of Data in an Array Processor

Authors:
D. H. Lawrie
Affiliations:
Department of Computer Science, University of Illinois
Venue:
IEEE Transactions on Computers
Year:
1975

Citing 14
Cited 135

A Permutation Network

Journal of the ACM (JACM)
An Adaptation of the Fast Fourier Transform for Parallel Processing

Journal of the ACM (JACM)
A model for masking rotational latency by dynamic disk allocation

Communications of the ACM
Control structures in Illiac IV Fortran

Communications of the ACM
Parallelism exposure and exploitation in programs

Parallelism exposure and exploitation in programs
Parallel Processing with the Perfect Shuffle

IEEE Transactions on Computers
Cellular Interconnection Arrays

IEEE Transactions on Computers
Time and Parallel Processor Bounds for Linear Recurrence Systems

IEEE Transactions on Computers
A Fast Computer Method for Matrix Transposing

IEEE Transactions on Computers
ILLIAC IV Software and Application Programming

IEEE Transactions on Computers
The Organization and Use of Parallel Memories

IEEE Transactions on Computers
Sorting networks and their applications

AFIPS '68 (Spring) Proceedings of the April 30--May 2, 1968, spring joint computer conference
Programmable indexing networks

AFIPS '70 (Spring) Proceedings of the May 5-7, 1970, spring joint computer conference
STARAN parallel processor system hardware

AFIPS '74 Proceedings of the May 6-10, 1974, national computer conference and exposition

Determining an Optimal Secondary Storage Service Rate for the PASM Control System

IEEE Transactions on Computers
Fault-Tolerant Multiprocessors with Redundant-Path Interconnection Networks

IEEE Transactions on Computers - The MIT Press scientific computation series
Performance of unbuffered shuffle-exchange networks

IEEE Transactions on Computers - The MIT Press scientific computation series
A connecting network with fault tolerance capabilities

IEEE Transactions on Computers - The MIT Press scientific computation series
Finite State Model and Compatibility Theory: New Analysis Tools for Permutation Networks

IEEE Transactions on Computers
Permutations on Illiac IV-Type Networks

IEEE Transactions on Computers
An Efficient Memory System for Image Processing

IEEE Transactions on Computers
The Load-Sharing Banyan Network

IEEE Transactions on Computers
Vector Computer Memory Bank Contention

IEEE Transactions on Computers
A new interconnection network for SIMD computers: the sigma networks

IEEE Transactions on Computers
On the permutation capability of multistage interconnection networks

IEEE Transactions on Computers
On Linear Skewing Schemes and d-Ordered Vectors

IEEE Transactions on Computers
Distributing Hot-Spot Addressing in Large-Scale Multiprocessors

IEEE Transactions on Computers
The Effects of Problem Partitioning, Allocation, and Granularity on the Performance of Multiple-Processor Systems

IEEE Transactions on Computers
Parallelization and Performance Analysis of the Cooley-Tukey FFT Algorithm for Shared-Memory Architectures

IEEE Transactions on Computers
Performance analysis of the FFT algorithm on a shared-memory parallel architecture

IBM Journal of Research and Development
Best worst mappings for the omega network

IBM Journal of Research and Development
Applications considerations in the system design of highly concurrent multiprocessors

IEEE Transactions on Computers
An analytical model for a class of processor-memory interconnection networks

IEEE Transactions on Computers
Vector access performance in parallel memories using skewed storage scheme

IEEE Transactions on Computers
A Characterization and Analysis of Parallel Processor Interconnection Networks

IEEE Transactions on Computers
Discrete Optimization Problem in Local Networks and Data Alignment

IEEE Transactions on Computers
A New Benes Network Control Algorithm

IEEE Transactions on Computers
The periodic balanced sorting network

Journal of the ACM (JACM)
The SP2 high-performance switch

IBM Systems Journal
The shuffle/exchange-plus networks

ACM-SE 20 Proceedings of the 20th annual Southeast regional conference
On the Rearrangeability of Multistage Networks Employing Uniform Connection Patterns

ADVIS '00 Proceedings of the First International Conference on Advances in Information Systems
Job Scheduling for the BlueGene/L System

JSSPP '02 Revised Papers from the 8th International Workshop on Job Scheduling Strategies for Parallel Processing
A 2D Addressing Mode for Multimedia Applications

Embedded Processor Design Challenges: Systems, Architectures, Modeling, and Simulation - SAMOS
Reducing Cache Conflicts by a Parametrized Memory Mapping

ParNum '99 Proceedings of the 4th International ACPC Conference Including Special Tracks on Parallel Numerics and Parallel Computing in Image Processing, Video Processing, and Multimedia: Parallel Computation
A Reconfigurable Stochastic Model Simulator for Analysis of Parallel Systems

FPL '00 Proceedings of The Roadmap to Reconfigurable Computing, 10th International Workshop on Field-Programmable Logic and Applications
A Reconfigurable Stochastic Model Simulator for Analysis of Parallel Systems

FCCM '00 Proceedings of the 2000 IEEE Symposium on Field-Programmable Custom Computing Machines
Performing BMMC Permutations in Two Passes through the Expanded Delta Network and MasPar MP-2

FRONTIERS '96 Proceedings of the 6th Symposium on the Frontiers of Massively Parallel Computation
A journey into multicomputer routing algorithms

PAS '95 Proceedings of the First Aizu International Symposium on Parallel Algorithms/Architecture Synthesis
The Stereo Correspondence Problem on a Ring-based Network

PAS '97 Proceedings of the 2nd AIZU International Symposium on Parallel Algorithms / Architecture Synthesis
Performance of Multicasting Closed Interconnection Networks

INFOCOM '97 Proceedings of the INFOCOM '97. Sixteenth Annual Joint Conference of the IEEE Computer and Communications Societies. Driving the Information Revolution
Siamese-Twin: A Dynamically Fault-Tolerant Fat-Tree

IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Papers - Volume 01
Multistage Interconnection Networks with Multiple Outlets

ICPP '94 Proceedings of the 1994 International Conference on Parallel Processing - Volume 01
A High Throughput Packet-Switching Network with Neural Network Controlled Bypass Queueing and Multiplexing

ICPP '94 Proceedings of the 1994 International Conference on Parallel Processing - Volume 01
Performance and Reliability of the Multistage Bus Network

ICPP '94 Proceedings of the 1994 International Conference on Parallel Processing - Volume 01
A New Tag Scheme and Its Tree Representation for a Shuffle-Exchange Network

ICPP '94 Proceedings of the 1994 International Conference on Parallel Processing - Volume 01
On the Rearrangeability of Reverse Shuffle/Exchange Networks

ICPP '94 Proceedings of the 1994 International Conference on Parallel Processing - Volume 01
SNAIL: A Multiprocessor Based on the Simple Serial Synchronized Multistage Interconnection Network Architecture

ICPP '94 Proceedings of the 1994 International Conference on Parallel Processing - Volume 01
Bounded-Collision Memory-Mapping Schemes for Data Structures with Applications to Parallel Memories

IEEE Transactions on Parallel and Distributed Systems
Memory Systems for Image Processing

IEEE Transactions on Computers
The NYU Ultracomputer Designing an MIMD Shared Memory Parallel Computer

IEEE Transactions on Computers
Data Exchange Optimization in Reconfigurable

IEEE Transactions on Computers
Parallel Processing Approaches to Image Correlation

IEEE Transactions on Computers
Asynchronous and Clocked Control Structures for VLSI Based Interconnection Networks

IEEE Transactions on Computers
Notes on Shuffle/Exchange-Type Switching Networks

IEEE Transactions on Computers
Communication Structures for Large Networks of Microcomputers

IEEE Transactions on Computers
Analysis and Simulation of Buffered Delta Networks

IEEE Transactions on Computers
VLSI Performance Comparison of Banyan and Crossbar Communications Networks

IEEE Transactions on Computers
An Easily Controlled Network for Frequently Used Permutations

IEEE Transactions on Computers
On the Number of Permutations Performable by the Augmented Data Manipulator Network

IEEE Transactions on Computers
Optimal BPC Permutations on a Cube Connected SIMD Computer

IEEE Transactions on Computers
The Gamma Network

IEEE Transactions on Computers
Invariant Properties of the Shuffle-Exchange and a Simplified Cost-Effective Version of the Omega Network

IEEE Transactions on Computers
Supersystems: Technology and Architecture

IEEE Transactions on Computers
The Prime Memory System for Array Access

IEEE Transactions on Computers
The Extra Stage Cube: A Fault-Tolerant Interconnection Network for Supersystems

IEEE Transactions on Computers
The Universality of the Shuffle-Exchange Network

IEEE Transactions on Computers
A Self-Routing Benes Network and Parallel Permutation Algorithms

IEEE Transactions on Computers
Theoretical Limitations on the Efficient Use of Parallel Memories

IEEE Transactions on Computers
The Indirect Binary n-Cube Microprocessor Array

IEEE Transactions on Computers
On the Effective Bandwidth of Parallel Memories

IEEE Transactions on Computers
Parallel Permutations of Data: A Benes Network Control Algorithm for Frequently Used Permutations

IEEE Transactions on Computers
Graph Theoretical Analysis and Design of Multistage Interconnection Networks

IEEE Transactions on Computers
High-Speed Multiprocessors and Compilation Techniques

IEEE Transactions on Computers
A Uniform Representation of Single-and Multistage Interconnection Networks Used in SIMD Machines

IEEE Transactions on Computers
The Theory Underlying the Partitioning of Permutation Networks

IEEE Transactions on Computers
The Reverse-Exchange Interconnection Network

IEEE Transactions on Computers
Fast Random and Sequential Access to Dynamic Memories of Any Size

IEEE Transactions on Computers
On a Class of Multistage Interconnection Networks

IEEE Transactions on Computers
Fault-Diagnosis for a Class of Multistage Interconnection Networks

IEEE Transactions on Computers
Performance of Processor-Memory Interconnections for Multiprocessors

IEEE Transactions on Computers
Task Preloading Schemes for Reconfigurable Parallel Processing Systems

IEEE Transactions on Computers
Routing Algorithms for Cellular Interconnection Arrays

IEEE Transactions on Computers
PUMPS Architecture for Pattern Analysis and Image Database Management

IEEE Transactions on Computers
Minimization of Interprocessor Communication for Parallel Computation

IEEE Transactions on Computers
A Practical Algorithm for the Solution of Triangular Systems on a Parallel Processing System

IEEE Transactions on Computers
Pin Limitations and Partitioning of VLSI Interconnection Networks

IEEE Transactions on Computers
Efficient Internode Communications in Reconfigurable Binary Trees

IEEE Transactions on Computers
A Two-Level Microprogrammed Multiprocessor Computer with Nonnumeric Functions

IEEE Transactions on Computers
Routing Schemes for the Augmented Data Manipulator Network in an MIMD System

IEEE Transactions on Computers
Hypertree: A Multiprocessor Interconnection Topology

IEEE Transactions on Computers
PASM: A Partitionable SIMD/MIMD System for Image Processing and Pattern Recognition

IEEE Transactions on Computers
The Lens Interconnection Strategy

IEEE Transactions on Computers
A Layout for the Shuffle-Exchange Network with O(N^2/log^{3/2} N) Area

IEEE Transactions on Computers
Design and Performance of Generalized Interconnection Networks

IEEE Transactions on Computers
The Performance of Multistage Interconnection Networks for Multiprocessors

IEEE Transactions on Computers
Fault Tolerant Interleaved Switching Fabrics For Scalable High-Performance Routers

IEEE Transactions on Parallel and Distributed Systems
Efficient algorithms and methods to solve dynamic MINs stability problem using stable matching with complete ties

Journal of Discrete Algorithms
Memory organization with multi-pattern parallel accesses

Proceedings of the conference on Design, automation and test in Europe
Performance and reliability analysis of new fault-tolerant advance omega network

WSEAS Transactions on Computers
Reed-Solomon turbo product codes for optical communications: from code optimization to decoder design

EURASIP Journal on Wireless Communications and Networking - Advances in Error Control Coding Techniques
The architecture of MANIP: a parallel computer system for solving NP-complete problems

AFIPS '81 Proceedings of the May 4-7, 1981, national computer conference
Design and implementation of the banyan interconnection network in TRAC

AFIPS '80 Proceedings of the May 19-22, 1980, national computer conference
Conflict-free memory allocation for associative data files

AFIPS '83 Proceedings of the May 16-19, 1983, national computer conference
Distributed scheduling of resources on interconnection networks

AFIPS '82 Proceedings of the June 7-10, 1982, national computer conference
Permuting streaming data using RAMs

Journal of the ACM (JACM)
Reliability and performance analysis of new fault tolerant irregular network

WSEAS Transactions on Computer Research
A new min: fault-tolerant advance omega network

ICCOMP'08 Proceedings of the 12th WSEAS international conference on Computers
On Simplifying Placement and Routing by Extending Coarse-Grained Reconfigurable Arrays with Omega Networks

ARC '09 Proceedings of the 5th International Workshop on Reconfigurable Computing: Architectures, Tools and Applications
Evolutionary optimization of multistage interconnection networks performance

Proceedings of the 11th Annual conference on Genetic and evolutionary computation
High-throughput Block Turbo Decoding: From Full-parallel Architecture to FPGA Prototyping

Journal of Signal Processing Systems
On rearrangeability of tandem connection of banyan-type networks

IEEE Transactions on Communications
Dynamic memories with faster random and sequential access

IBM Journal of Research and Development
Implementation and evaluation of a (b,k)-adjacent error-correcting/detecting scheme for supercomputer systems

IBM Journal of Research and Development
The performance of SNAIL-2 (a SSS-MIN connected multiprocessor with cache coherent mechanism)

Parallel Computing
All-to-all personalized exchange in generalized shuffle-exchange networks

Theoretical Computer Science
Scalable mpNoC for massively parallel systems - Design and implementation on FPGA

Journal of Systems Architecture: the EUROMICRO Journal
Unified algebraic theory of sorting, routing, multicasting, and concentration networks

IEEE Transactions on Communications
Parallelism and Array Processing

IEEE Transactions on Computers
Memory Allocations for Multiprocessor Systems That Incorporate Content-Addressable Memories

IEEE Transactions on Computers
Shuffling with the Illiac and PM2I SIMD Networks

IEEE Transactions on Computers
A Comparative Study of Distributed Resource Sharing on Multiprocessors

IEEE Transactions on Computers
A Classification of Cube-Connected Networks with a Simple Control Scheme

IEEE Transactions on Computers
A theory of decomposition into prime factors of layered interconnection networks

Discrete Applied Mathematics
Analysis techniques for SIMD machine interconnection networks and the effects of processor address masks

IEEE Transactions on Computers - Special issue on parallel processors and processing
Performance study of multilayered multistage interconnection networks under hotspot traffic conditions

Journal of Computer Systems, Networks, and Communications
Reliability analysis of multi-path multi-stage interconnection network

ICCOMP'06 Proceedings of the 10th WSEAS international conference on Computers
Design and implementation of Multistage Interconnection Networks using Quantum-dot Cellular Automata

Microelectronics Journal
Design schemes and performance analysis of dynamic rerouting interconnection networks for tolerating faults and preventing collisions

ISPA'05 Proceedings of the Third international conference on Parallel and Distributed Processing and Applications
Designing a high performance and fault tolerant multistage interconnection network with easy dynamic rerouting

ISPA'04 Proceedings of the Second international conference on Parallel and Distributed Processing and Applications
Towards UML 2 extensions for compact modeling of regular complex topologies

MoDELS'05 Proceedings of the 8th international conference on Model Driven Engineering Languages and Systems
Analyzing permutation capability of multistage interconnection networks with colored Petri nets

Information Sciences: an International Journal
Computer generation of efficient software viterbi decoders

HiPEAC'10 Proceedings of the 5th international conference on High Performance Embedded Architectures and Compilers
Dynamic routing of data stream tuples among parallel query plan running on multi-core processors

Distributed and Parallel Databases
Research: MS4 - a high performance output buffering ATM switch

Computer Communications
A fault-tolerant 2*2 switching element for switching networks

Computer Communications
A high-performance ATM switch based on modified shuffle-exchange network

Computer Communications
A mathematical abstraction of the rearrangeability conjecture for shuffle-exchange networks

Operations Research Letters
An optimal parallel prefix-sums algorithm on the memory machine models for GPUs

ICA3PP'12 Proceedings of the 12th international conference on Algorithms and Architectures for Parallel Processing - Volume Part I
A method of batching conflict routings in shuffle-exchange networks

Theoretical Computer Science

Quantified Score

Hi-index	15.19

Abstract

This paper discusses the design of a primary memory system for an array processor that allows parallel, conflict-free access to various slices of data (e.g., rows, columns, and diagonals) and subsequent alignment of these data for processing. Memory access requirements for an array processor are discussed in general terms, and a set of common requirements is defined. The ability to meet these requirements is shown to depend on the number of independent memory units and on the mapping of the data in these memories. Next, the need to align these data for processing is demonstrated, and various alignment requirements are defined. Hardware that can perform this alignment function is discussed (e.g., permutation, indexing, switching, or sorting networks), and a network (the omega network) based on Stone's shuffle-exchange operation [1] is presented. Construction of this network is described, and many of its useful properties are proven. Finally, as an example of these ideas, an array processor is shown that allows conflict-free access and alignment of rows, columns, diagonals, backward diagonals, and square blocks in row- or column-major order, as well as certain other special operations.
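
The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the linear skewing rule (element (i, j) stored in module (a*i + b*j) mod M), the destination-tag routing rule, and the names `slice_is_conflict_free` and `omega_passes` are assumptions chosen for illustration. The first function checks whether a slice of array cells falls in distinct memory modules. The second simulates a network of log2(N) stages, each a perfect shuffle followed by 2x2 exchange switches, and reports whether a permutation passes in one pass without two items needing the same switch output.

```python
def slice_is_conflict_free(cells, num_modules, a, b):
    """Assumed linear skewing: cell (i, j) lives in module (a*i + b*j) % num_modules.

    Returns True when every cell of the slice maps to a different module,
    so the whole slice could be fetched in one parallel memory cycle.
    """
    modules = [(a * i + b * j) % num_modules for i, j in cells]
    return len(set(modules)) == len(modules)


def omega_passes(perm):
    """Simulate one pass through a shuffle-exchange (omega-style) network.

    perm[src] = dst must be a permutation of range(N), with N a power of two.
    Assumed routing rule: at stage k each switch sends an item to the output
    whose low bit equals the k-th most significant bit of its destination.
    Returns True if no two items ever need the same switch output.
    """
    n_lines = len(perm)
    n_stages = n_lines.bit_length() - 1
    if n_lines < 2 or (1 << n_stages) != n_lines:
        raise ValueError("number of lines must be a power of two >= 2")
    if sorted(perm) != list(range(n_lines)):
        raise ValueError("perm must be a permutation of 0..N-1")

    mask = n_lines - 1
    positions = list(range(n_lines))  # positions[src] = current line of item src
    for stage in range(n_stages):
        next_positions = []
        for src, pos in enumerate(positions):
            # Perfect shuffle: rotate the line number left by one bit.
            shuffled = ((pos << 1) | (pos >> (n_stages - 1))) & mask
            # Exchange: pick the switch output from the destination tag bit.
            dest_bit = (perm[src] >> (n_stages - 1 - stage)) & 1
            next_positions.append((shuffled & ~1) | dest_bit)
        if len(set(next_positions)) != n_lines:
            return False  # two items collided on one switch output
        positions = next_positions
    return True


if __name__ == "__main__":
    n = 4
    slices = {
        "row 0": [(0, j) for j in range(n)],
        "column 0": [(i, 0) for i in range(n)],
        "diagonal": [(i, i) for i in range(n)],
        "backward diagonal": [(i, n - 1 - i) for i in range(n)],
    }
    for name, cells in slices.items():
        print(name, slice_is_conflict_free(cells, 5, 1, 2))

    shift = [(s + 3) % 8 for s in range(8)]
    print("uniform shift passes:", omega_passes(shift))
```

In this sketch, a 4 x 4 array spread over five modules with a = 1 and b = 2 gives conflict-free rows, columns, and both diagonals, whereas four modules with a = b = 1 put two diagonal elements in the same module. That is one concrete instance of the abstract's point that conflict-free access depends on both the number of memory units and the data mapping.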