Two SIMD single-stage interconnection networks that have been proposed and studied in the literature are the Illiac type and the PM2I. The ability of the Illiac and PM2I networks to perform the shuffle data permutation in an SIMD machine with N processors is examined. Two algorithms are given for an SIMD or multiple-SIMD machine with the PM2I network to perform the shuffle. One algorithm is used when the SIMD machine is the same size (in number of processors) as the shuffle to be emulated. The other is used when the shuffle to be performed is smaller than the given machine with the PM2I network. It is proven that both algorithms require only one more network transfer than the previously published lower bound (which is log2 S for a shuffle on S elements). Using the PM2I algorithm as a basis, an algorithm for the Illiac to emulate the shuffle is given. Its performance is 2√N - 1 transfers, which is only three transfers more than the previously published lower bound of 2√N - 4.
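
For orientation, the three interconnection functions named above can be sketched in a few lines of Python. This is an illustrative sketch only, not the algorithms of the paper: it assumes the definitions commonly used for these networks (the shuffle as a one-bit left rotation of the processor address, PM2I as plus or minus 2^i modulo N, Illiac as plus or minus 1 or plus or minus √N modulo N), with N a power of two and, for the Illiac, a perfect square. The function names (`shuffle`, `pm2i`, `illiac`, `transfer_bounds`) are hypothetical, and `transfer_bounds` only restates the transfer counts quoted in the abstract for the case S = N.

```python
from math import isqrt


def shuffle(p: int, n: int) -> int:
    """Perfect shuffle on n = 2**m addresses (n >= 2): rotate the m-bit address left by one."""
    m = n.bit_length() - 1          # m = log2(n), assuming n is a power of two
    return ((p << 1) | (p >> (m - 1))) & (n - 1)


def pm2i(p: int, i: int, sign: int, n: int) -> int:
    """PM2I function: p + 2**i or p - 2**i modulo n, with sign = +1 or -1."""
    return (p + sign * (1 << i)) % n


def illiac(p: int, step: int, n: int) -> int:
    """Illiac function: step is one of +1, -1, +sqrt(n), -sqrt(n), taken modulo n."""
    return (p + step) % n


def transfer_bounds(n: int) -> dict:
    """Transfer counts quoted in the abstract, for a shuffle of size S = n."""
    log_n = n.bit_length() - 1      # log2(n)
    root = isqrt(n)                 # sqrt(n), assuming n is a perfect square
    return {
        "pm2i_lower_bound": log_n,
        "pm2i_algorithm": log_n + 1,
        "illiac_lower_bound": 2 * root - 4,
        "illiac_algorithm": 2 * root - 1,
    }


if __name__ == "__main__":
    print([shuffle(p, 8) for p in range(8)])
    print(transfer_bounds(64))
```

With N = 64, the sketch reports 7 transfers against a lower bound of 6 for the PM2I, and 15 against a lower bound of 12 for the Illiac, matching the gaps of one and three transfers stated above.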