Designing Clustered Multiprocessor Systems under Packaging and Technological Advancements

Authors:
Debashis Basak;Dhabaleswar K. Panda
Venue:
IEEE Transactions on Parallel and Distributed Systems
Year:
1996

Abstract

Clustered or hierarchical interconnections offer advantages in designing large-scale multiprocessor systems. Earlier studies in the literature have either focused only on flat interconnections or proposed hierarchical/clustered interconnections under limited packaging and demanded-performance constraints. Large systems require several levels of packaging, and packaging technologies impose physical constraints on the bisection bandwidth and channel width of a system. Earlier studies have ignored pinout technologies and the capacity of packaging modules, often leading to configurations that are not design-feasible. Similarly, the impact of processor and interconnect technologies on demanded performance has not been considered. In this paper, we propose a new supply-demand framework for multiprocessor system design that considers packaging, processor, and interconnect technologies in an integrated manner. The elegance of this framework lies in its parameterized representation of different technologies. For a given set of technological parameters, the framework derives the best configuration while considering practical design aspects such as maximum board area, maximum available pinout, fixed channel width, and scalability. To build a scalable parallel system with a given number of processors, the framework explores the design space of flat k-ary n-cube topologies and their clustered variations (k-ary n-cube cluster-c) to derive the design-feasible configurations with the best system performance. The study identifies processor board area, supported channel width, board pinout density, and router pinout as critical parameters and analyzes their impact on deriving design-feasible and best configurations. For a wide range of parameters, the best configurations are shown to be cluster-based systems with up to 8 processors per cluster and 3D-5D intercluster interconnection.
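
The design-space exploration described in the abstract can be illustrated with a minimal sketch. It is not the paper's supply-demand model: it only enumerates k-ary n-cube cluster-c shapes whose clusters exactly fill a given processor count and applies one simplified router-pinout check. The names Config, enumerate_configs, router_pins, and feasible, and the pin-count formula itself, are assumptions made for illustration.

```python
from typing import List, NamedTuple


class Config(NamedTuple):
    k: int  # radix: clusters along each dimension
    n: int  # dimensions of the intercluster network
    c: int  # processors per cluster (c == 1 is the flat case)


def integer_root(value: int, n: int) -> int:
    """Return k with k**n == value, or 0 if no such integer exists."""
    guess = round(value ** (1.0 / n))
    for cand in (guess - 1, guess, guess + 1):  # guard against float rounding
        if cand >= 1 and cand ** n == value:
            return cand
    return 0


def enumerate_configs(num_procs: int, max_dims: int, max_cluster: int) -> List[Config]:
    """List every (k, n, c) with k**n * c == num_procs and k >= 2."""
    configs = []
    for c in range(1, max_cluster + 1):
        if num_procs % c:
            continue
        clusters = num_procs // c
        for n in range(1, max_dims + 1):
            k = integer_root(clusters, n)
            if k >= 2:
                configs.append(Config(k, n, c))
    return configs


def router_pins(cfg: Config, channel_width: int) -> int:
    """Assumed pin demand: one input and one output channel per neighbor.

    A router has two neighbors per dimension, or one when k == 2.
    Control, power, and cluster-side pins are ignored in this sketch.
    """
    neighbors_per_dim = 1 if cfg.k == 2 else 2
    return neighbors_per_dim * cfg.n * 2 * channel_width


def feasible(cfg: Config, channel_width: int, max_router_pins: int) -> bool:
    """Keep a configuration only if it fits the assumed router pin budget."""
    return router_pins(cfg, channel_width) <= max_router_pins


if __name__ == "__main__":
    for cfg in enumerate_configs(num_procs=1024, max_dims=5, max_cluster=8):
        ok = feasible(cfg, channel_width=16, max_router_pins=256)
        print(cfg, "feasible" if ok else "exceeds router pinout")
```

A fuller treatment would add the other constraints the abstract names (board area, board pinout density, bisection bandwidth across packaging levels) and rank the surviving configurations by a performance model; none of that is attempted here.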