On the Performance of Parallel Matrix Factorisation on the Hypermesh

Authors:
A. Al-Ayyoub; M. Ould-Khaoua; K. Day
Affiliations:
Department of Computer Science and Information Systems, Jordan University of Science and Technology, P.O. Box 3030, Irbid 22110, Jordan (ayyoub@just.edu.jo)
Department of Computing Science, University of Glasgow, Glasgow G12 8QQ, UK (mohamed@dcs.gla.ac.uk)
Department of Computer Science, Sultan Qaboos University, Sultanate of Oman (kday@squ.edu.com)
Venue:
The Journal of Supercomputing
Year:
2001

Abstract

Most common multicomputer networks, e.g. d-ary h-cubes, are graph topologies in which an edge (channel) interconnects exactly two vertices (nodes). Hypergraphs generalise the graph model by allowing a channel to interconnect an arbitrary number of nodes. Previous studies have used synthetic workloads (e.g. statistical distributions) to demonstrate the superior performance characteristics of regular multi-dimensional hypergraphs, also known as hypermeshes, over d-ary h-cubes. However, hardly any study has considered real-world parallel applications. This paper contributes towards filling this gap with a comparative study of the performance of one of the most common numerical problems, namely matrix factorisation, on the hypermesh, hypercube and d-ary h-cube. To this end, the paper first introduces orthogonal networks as a unified model for describing both graph and hypergraph topologies. It then develops a generalised parallel algorithm for matrix factorisation and evaluates its performance on the three networks. The results reveal that the hypermesh supports matrix computation more efficiently, and therefore provide further evidence that the hypermesh is a viable network for future large-scale multicomputers.
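
The structural difference between the two network families can be illustrated with a small sketch. It is not taken from the paper: the function names are hypothetical, nodes are assumed to be addressed by h digits in the range 0..d-1, the d-ary h-cube is assumed to have wrap-around links, and a hypermesh channel is assumed to join all nodes that differ in exactly one digit. Hop counts are only a structural measure and say nothing about the paper's performance model.

```python
def torus_neighbours(node, d):
    """Nodes one hop away in a d-ary h-cube (assumed wrap-around links)."""
    result = set()
    for i, a in enumerate(node):
        for step in (-1, 1):
            other = list(node)
            other[i] = (a + step) % d
            if tuple(other) != node:
                result.add(tuple(other))
    return result


def hypermesh_neighbours(node, d):
    """Nodes one hop away in a hypermesh: every node that differs
    from `node` in exactly one digit shares a channel with it."""
    result = set()
    for i, a in enumerate(node):
        for b in range(d):
            if b != a:
                other = list(node)
                other[i] = b
                result.add(tuple(other))
    return result


def diameter(d, h, neighbours):
    """Largest hop count from node (0, ..., 0), found by breadth-first search.
    This equals the diameter because both topologies are vertex-symmetric."""
    start = (0,) * h
    dist = {start: 0}
    frontier = [start]
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in neighbours(u, d):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    next_frontier.append(v)
        frontier = next_frontier
    return max(dist.values())


if __name__ == "__main__":
    d, h = 4, 2  # 16 nodes arranged as a 4 x 4 grid of addresses
    origin = (0,) * h
    print("torus degree:", len(torus_neighbours(origin, d)))
    print("hypermesh degree:", len(hypermesh_neighbours(origin, d)))
    print("torus diameter:", diameter(d, h, torus_neighbours))
    print("hypermesh diameter:", diameter(d, h, hypermesh_neighbours))
```

With d = 2 both functions return the same neighbour sets, which matches the view of the binary hypercube as a special case of both families.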