This paper shows that a fat-pyramid of area Θ(A) requires only O(log A) slowdown to simulate any competing network of area A under very general conditions. The result holds regardless of the processor size (amount of attached memory) and the number of processors in the competing network, as long as the limit on total area is met. Furthermore, the result is valid regardless of the relationship between wire length and wire delay. We focus especially on eliminating the common simplifying assumption that unit time suffices to traverse a wire regardless of its length, since that assumption becomes increasingly untenable as the size of parallel systems grows. This paper concentrates on simulation using transmission lines (wires along which bits can be pipelined) with the message routing schedule set up off-line, but it also discusses the extension to on-line simulation. It also examines the capabilities of a fat-pyramid when matched against a substantially larger network and points out the surprising difficulty of making such a comparison without the unit wire delay assumption.
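
The main claim can be written out as a single bound. The notation here is an assumption introduced for illustration and is not taken from the paper: N stands for any competing network, FP for the fat-pyramid, T_N and T_FP for the times each takes on the same computation, and c for the unspecified constant hidden in the O-notation.

```latex
\[
  \operatorname{area}(N) \le A, \quad
  \operatorname{area}(\mathrm{FP}) = \Theta(A)
  \;\Longrightarrow\;
  T_{\mathrm{FP}} \le c \cdot \log A \cdot T_{N}
\]
% Illustrative arithmetic only: for A = 2^{20}, \log_2 A = 20,
% so the slowdown factor is at most 20c (the base of the logarithm is absorbed into c).
```

Read this way, the slowdown grows only logarithmically with the area budget, and by the abstract's statement it does not depend on how wire delay scales with wire length.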