We present fast and highly scalable parallel computations for a number of fundamental matrix problems on distributed memory systems (DMS). These problems include matrix multiplication; the matrix chain product; computing the powers, the inverse, the characteristic polynomial, the determinant, the rank, the Krylov matrix, and an LU-factorization and a QR-factorization of a matrix; and solving linear systems of equations. All of these computations are built on a highly scalable implementation, on DMS, of the fastest sequential matrix multiplication algorithm. We show that, relative to the best known parallel time complexities on the parallel random access machine (PRAM), the most powerful but unrealistic shared memory model of parallel computing, our parallel matrix computations achieve the same speed on distributed memory parallel computers (DMPC) and have an extra polylogarithmic factor in their time complexities on DMS with hypercubic networks. Furthermore, they are fully scalable on DMPC and highly scalable over a wide range of system sizes on DMS with hypercubic networks. Parallel matrix computations that are both this fast (in terms of parallel time complexity) and this scalable (in terms of our definition of scalability) have rarely been reported for any distributed memory system.
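To make concrete how one of the listed problems can rest on matrix multiplication alone, the following minimal sequential sketch computes a matrix power by repeated squaring, using O(log p) calls to a pluggable multiplication routine. It is an illustration only and not the algorithm of the paper: the names `matmul`, `identity`, and `matrix_power` are hypothetical, and the cubic-time `matmul` merely stands in for whatever fast or parallel multiplication routine is available.

```python
def matmul(a, b):
    """Plain cubic-time product of two square matrices stored as lists of rows.

    Placeholder for a faster (or parallel) multiplication routine.
    """
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    """Return the n-by-n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def matrix_power(a, p, multiply=matmul):
    """Compute a**p for an integer p >= 0 by repeated squaring.

    Every arithmetic step is a call to `multiply`, so any speedup in
    the multiplication routine carries over to the power computation.
    """
    if p < 0:
        raise ValueError("exponent must be non-negative")
    result = identity(len(a))
    base = a
    while p > 0:
        if p & 1:                       # current binary digit of p is 1
            result = multiply(result, base)
        p >>= 1
        if p:                           # skip the final, unused squaring
            base = multiply(base, base)
    return result
```

Because the multiplication routine is passed in as the `multiply` argument, the same driver could in principle be run on top of a different multiplication implementation without changing the power logic.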