In this work, the impact of physical/logical network topology on parallel matrix computation is studied on an Intel Touchstone DELTA mesh and three generations of hypercube multiprocessors. These machines are representative of the continued development of distributed-memory message-passing multiprocessors in the current decade. As processor architecture and network hardware continue to improve, it is important that the software being developed can easily take advantage of such improvements. The author shows that, for a collection of mathematical software fundamental to parallel matrix computation, this objective is accomplished by basing the parallel algorithms and the software development on a logical subcube-grid network topology. Since a hypercube can be configured as a subcube-grid, a logical subcube-grid is naturally supported by an identical physical subcube-grid embedded in a hypercube network. However, a mesh network has fewer connections than a subcube-grid, and the aspect ratios of the available physical mesh or submeshes cannot be altered. The analysis shows that the optimal aspect ratio of the physical mesh is independent of that of the desired logical subcube-grid. The author shows that, to achieve the best performance on the DELTA mesh, the aspect ratios for the physical mesh and the logical subcube-grid should be chosen independently at runtime. Numerical experiments were performed on the DELTA mesh and the hypercubes using the same software, and extensive timing results are provided to demonstrate this finding.
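To make the statement that a hypercube can be configured as a subcube-grid more concrete, the following minimal Python sketch shows one simple way to do it. It is an illustration under stated assumptions, not the author's software: the function names `subcube_grid` and `is_subcube` are hypothetical, and the mapping used here (high-order bits of the node id select the grid row, low-order bits select the column) is only one possible configuration. With this mapping, every row and every column of the logical grid is itself a subcube of the hypercube, and different values of `row_bits` give different aspect ratios on the same machine.

```python
def subcube_grid(dim, row_bits):
    """Arrange the 2**dim nodes of a hypercube as a logical grid.

    Assumption (illustrative only): the top `row_bits` bits of a node id
    select the grid row and the remaining bits select the grid column.
    Returns a list of rows, each a list of hypercube node ids.
    """
    if not 0 <= row_bits <= dim:
        raise ValueError("row_bits must lie between 0 and dim")
    col_bits = dim - row_bits
    return [
        [(r << col_bits) | c for c in range(1 << col_bits)]
        for r in range(1 << row_bits)
    ]


def is_subcube(nodes):
    """Return True if the node ids form a subcube of the hypercube.

    A set of nodes is a subcube when the ids vary in exactly k bit
    positions and all 2**k combinations of those bits are present.
    """
    nodes = list(nodes)
    varying = 0
    for n in nodes:
        varying |= n ^ nodes[0]  # collect every bit position that differs
    k = bin(varying).count("1")
    return len(set(nodes)) == (1 << k)


# A 16-node hypercube (dim = 4) configured with two different aspect ratios.
for row_bits in (1, 2):
    grid = subcube_grid(4, row_bits)
    columns = [list(col) for col in zip(*grid)]
    assert all(is_subcube(row) for row in grid)
    assert all(is_subcube(col) for col in columns)
    print(f"{len(grid)} x {len(grid[0])} grid: rows and columns are subcubes")
```

The same 16 nodes support both a 2 x 8 and a 4 x 4 logical grid, which is the flexibility in aspect ratio that a physical mesh does not offer.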