The Reconfigurable Ring of Processors: Fine-Grain Tree-Structured Computations

Authors:
Arnold L. Rosenberg;Vittorio Scarano;Ramesh K. Sitaraman
Venue:
IEEE Transactions on Computers
Year:
1997

Abstract

We study fine-grain computation on the Reconfigurable Ring of Processors ($\mathcal{RRP}$), a parallel architecture whose processing elements (PEs) are interconnected via a multiline reconfigurable bus, each of whose lines has one-packet width and can be configured, independently of other lines, to establish an arbitrary PE-to-PE connection. We present a "cooperative" message-passing protocol that will, in the presence of suitable implementation technology, endow an $\mathcal{RRP}$ with message latency that is logarithmic in the number of PEs a message passes over in transit. Our study focuses on the computational consequences of such latency in such an architecture. Our main results prove that: 1) an $N$-PE $\mathcal{RRP}$ can execute a sweep up or down an $N$-leaf complete binary tree in time proportional to $\log N \log\log N$; 2) a broad range of $N$-PE architectures, including $N$-PE $\mathcal{RRP}$s, require time proportional to $\log N \log\log N$ to perform such a sweep.
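To make the role of logarithmic latency concrete, the following minimal Python sketch simulates a naive upward sweep (a sum reduction) over a complete binary tree whose leaves sit on consecutive PEs. It is an illustration only, not the paper's algorithm: the latency formula `1 + ceil(log2(d))`, the leaf-to-PE layout, the assumption that all messages in a round travel in parallel on separate bus lines, and the names `latency` and `naive_upward_sweep` are all hypothetical choices made here. Under these assumptions the naive schedule takes roughly $(\log N)^2/2$ steps, which exceeds the $\log N \log\log N$ time stated in the abstract, so the paper's sweep presumably relies on a more careful schedule that the abstract does not describe.

```python
def latency(distance: int) -> int:
    """Hypothetical cost model: a message crossing `distance` PEs
    takes 1 + ceil(log2(distance)) time steps."""
    # (distance - 1).bit_length() equals ceil(log2(distance)) for distance >= 1
    return 1 + (distance - 1).bit_length()


def naive_upward_sweep(leaves):
    """Sum-reduce `leaves` up a complete binary tree, level by level.

    Assumed layout (not from the paper): leaf i lives on PE i, and in
    round k the PE at index i (a multiple of 2**(k+1)) receives the
    partial result held by PE i + 2**k. Messages within a round are
    assumed to travel in parallel, so a round costs latency(2**k).
    Returns (root_value, total_time).
    """
    n = len(leaves)
    if n == 0 or n & (n - 1):
        raise ValueError("number of leaves must be a power of two")
    values = list(leaves)
    total_time = 0
    stride = 1
    while stride < n:
        # Combine each pair of sibling subtrees at the current level.
        for i in range(0, n, 2 * stride):
            values[i] += values[i + stride]
        total_time += latency(stride)
        stride *= 2
    return values[0], total_time


if __name__ == "__main__":
    root, steps = naive_upward_sweep(range(16))
    print(root, steps)
```

For 16 leaves the four rounds cross 1, 2, 4, and 8 PEs, costing 1 + 2 + 3 + 4 = 10 steps in this model.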