The problem of efficiently permuting data stored in VLSI chips in accordance with a predetermined set of permutations is explored. It is shown that connecting chips with shared bus interconnections, as opposed to point-to-point interconnections, can often reduce the number of pins per chip. As an example, for infinitely many $n$, the authors exhibit permutation architectures that can realize any of the $n$ cyclic shifts on $n$ chips in one clock tick with at most $\lceil \sqrt{n} \rceil$ pins per chip. When the set of permutations forms a group with $p$ elements, any permutation in the group can be realized in one clock tick by an architecture with $O(\sqrt{p \lg p})$ pins per chip. When the permutation group is abelian, $O(\sqrt{p})$ pins suffice. These results are all derived from a mathematical characterization of uniform permutation architectures based on the combinatorial notion of a difference cover. The authors also consider uniform permutation architectures that realize permutations in several clock ticks instead of one, and show that further savings in the number of pins per chip can be obtained.
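To make the role of a difference cover concrete, the following Python sketch simulates cyclic shifts on a small bused architecture. It is an illustration under stated assumptions, not the authors' construction: it assumes $n$ buses, assumes that pin `a` of chip `i` is wired to bus `(i + d[a]) mod n`, and uses the set `{0, 1, 3}` modulo 7 as an example difference cover. The function names are hypothetical.

```python
from itertools import product


def is_difference_cover(d, n):
    """Return True if every residue mod n equals d[a] - d[b] for some a, b."""
    diffs = {(x - y) % n for x, y in product(d, repeat=2)}
    return len(diffs) == n


def pin_pair_for_shift(d, n, s):
    """Find pin indices (a, b) with d[a] - d[b] == s (mod n), or None."""
    for a, b in product(range(len(d)), repeat=2):
        if (d[a] - d[b]) % n == s % n:
            return a, b
    return None


def realize_shift(d, n, s, data):
    """Simulate one clock tick of a cyclic shift by s.

    Assumed wiring (hypothetical): pin a of chip i is attached to bus
    (i + d[a]) mod n. Every chip writes on pin a and reads on pin b.
    """
    pair = pin_pair_for_shift(d, n, s)
    if pair is None:
        raise ValueError(f"shift {s} is not covered by {d} mod {n}")
    a, b = pair
    # i -> (i + d[a]) mod n is a bijection, so each bus has exactly one writer.
    buses = {(i + d[a]) % n: data[i] for i in range(n)}
    # Chip j reads bus (j + d[b]) mod n, which was written by chip j - s.
    return [buses[(j + d[b]) % n] for j in range(n)]


if __name__ == "__main__":
    cover, n = [0, 1, 3], 7  # three pins per chip for seven chips
    print(is_difference_cover(cover, n))
    print(realize_shift(cover, n, 2, list("ABCDEFG")))
```

In this sketch each chip needs only `len(d)` pins, so a small difference cover translates directly into a low pin count, whereas a point-to-point design would need a separate connection for each distinct shift.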