Hypercube algorithms are presented for distributed block-matrix operations. These algorithms are based entirely on an interconnection scheme that involves two orthogonal sets of binary trees. This switching topology makes use of all hypercube interconnection links in a synchronized manner.

An efficient, novel matrix-vector multiplication algorithm based on this technique is described. In addition, matrix transpose operations that move only pointers rather than actual data have been implemented for some applications by taking advantage of these tree structures. For cases where actual physical vector and matrix transposes are needed, possible techniques, including extensions of the above scheme, are discussed.

The algorithms support submatrix partitionings of the data instead of being limited to row and/or column partitionings. This allows efficient use of nodal vector processors as well as shorter interprocessor communication packets. It also produces a favorable data distribution for applications that involve near-neighbor operations, such as image processing. The algorithms are based on an interprocessor communication paradigm that involves variable-length, tagged block data transfers. They have been implemented on an Intel iPSC hypercube system with the support of the Hypercube Library developed at the Christian Michelsen Institute.
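As a rough illustration of the submatrix partitioning and tree-based combining described above, the following sketch simulates a block matrix-vector product on a single machine. It is not the authors' algorithm: it models only one set of binary trees (a reduction along each processor row), replaces message passing with list operations, and all names (`local_matvec`, `tree_reduce_row`, `block_matvec`, the grid sizes `p` and `q`) are assumptions introduced here for illustration.

```python
def local_matvec(block, x_part):
    """Multiply one submatrix block by the matching slice of x."""
    return [sum(a * b for a, b in zip(row, x_part)) for row in block]


def tree_reduce_row(partials):
    """Sum the partial vectors held by the nodes of one processor row.

    The node count must be a power of two. At each step, node j receives
    from node j + step, whose index differs from j in exactly one bit,
    so every transfer is between hypercube neighbours. The transfers
    together trace a binary tree rooted at node 0.
    """
    q = len(partials)
    assert q > 0 and q & (q - 1) == 0, "node count must be a power of two"
    vals = [list(p) for p in partials]  # copy so the inputs stay untouched
    step = 1
    while step < q:
        for j in range(0, q, 2 * step):
            sender = j + step
            vals[j] = [u + v for u, v in zip(vals[j], vals[sender])]
        step *= 2
    return vals[0]


def block_matvec(a, x, p, q):
    """Compute y = A x with A split into a p-by-q grid of submatrices."""
    n_rows, n_cols = len(a), len(x)
    assert n_rows % p == 0 and n_cols % q == 0, "grid must divide the matrix"
    br, bc = n_rows // p, n_cols // q  # block height and width
    y = []
    for i in range(p):
        partials = []
        for j in range(q):
            # Simulated node (i, j) holds one submatrix and one slice of x.
            block = [row[j * bc:(j + 1) * bc] for row in a[i * br:(i + 1) * br]]
            partials.append(local_matvec(block, x[j * bc:(j + 1) * bc]))
        y.extend(tree_reduce_row(partials))
    return y


a = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
x = [1, 2, 3, 4]
y = block_matvec(a, x, 2, 2)  # 2-by-2 grid of 2-by-2 submatrices
```

In this simulation each node works on a small dense block, which is the property the abstract credits with enabling nodal vector processing and shorter packets: a node in a `p`-by-`q` grid contributes a partial vector of length `n_rows / p` rather than a full-length row or column.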