We propose enhancing traditional broadcast buses with a new feature that we call shift switching. We show that on a linear array of processors enhanced with shift switching, the prefix sums of n bits can be computed in ⌈log(n+1)/log w⌉ broadcasts, each over n switches, assuming a global bus of width w. Next, our prefix sums algorithm is used in conjunction with broadcasting on short buses to obtain several efficient architectural designs for the following fundamental problems: 1) ranking linked lists, 2) counting the number of 1's in a sequence of n bits, and 3) sorting small sets. We see our main contribution as showing that the new bus feature leads to designs that are both theoretically interesting and practically relevant.
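To make the broadcast bound concrete, the following minimal sketch evaluates ⌈log(n+1)/log w⌉ with integer arithmetic and computes the prefix sums of a bit sequence sequentially in software. It is not the shift-switching hardware algorithm from the paper; the function names `broadcast_count` and `bit_prefix_sums` are hypothetical and chosen only for illustration. One plausible reading of the bound, stated here as an assumption, is that a prefix sum of n bits takes one of n+1 values (0 through n), so it needs that many base-w digits.

```python
from itertools import accumulate


def broadcast_count(n: int, w: int) -> int:
    """Return ceil(log(n + 1) / log(w)), the bound quoted in the abstract.

    Computed as the smallest k with w**k >= n + 1, which avoids
    floating-point rounding near exact powers of w.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    if w < 2:
        raise ValueError("bus width w must be at least 2")
    k, reach = 0, 1
    while reach < n + 1:
        reach *= w
        k += 1
    return k


def bit_prefix_sums(bits):
    """Sequential reference: the i-th output is the sum of bits[0..i]."""
    if any(b not in (0, 1) for b in bits):
        raise ValueError("input must contain only 0s and 1s")
    return list(accumulate(bits))


if __name__ == "__main__":
    bits = [1, 0, 1, 1, 0, 1, 1, 1]
    sums = bit_prefix_sums(bits)
    print("prefix sums:", sums)
    # The last prefix sum is the number of 1's in the sequence,
    # which is problem 2) in the abstract.
    print("number of 1's:", sums[-1])
    for w in (2, 4, 16):
        print(f"n={len(bits)}, w={w}: {broadcast_count(len(bits), w)} broadcasts")
```

The sequential `bit_prefix_sums` only serves as a reference result against which a parallel or hardware design could be checked; the count of broadcasts grows with n and shrinks as the bus width w increases.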