Fast and highly scalable parallel computations for fundamental matrix problems on distributed memory systems

Authors:
Keqin Li
Affiliations:
Department of Computer Science, State University of New York, New Paltz, USA 12561
Venue:
The Journal of Supercomputing
Year:
2010

Abstract

We present fast and highly scalable parallel computations for a number of important and fundamental matrix problems on distributed memory systems (DMS). These problems include matrix multiplication, matrix chain product, computing the powers, the inverse, the characteristic polynomial, the determinant, the rank, the Krylov matrix, and an LU- and a QR-factorization of a matrix, and solving linear systems of equations. Our parallel computations for these problems are based on a highly scalable implementation of the fastest sequential matrix multiplication algorithm on DMS. We show that, compared with the best known parallel time complexities on parallel random access machines (PRAM), the most powerful but unrealistic shared memory model of parallel computing, our parallel matrix computations achieve the same speeds on distributed memory parallel computers (DMPC), and incur an extra polylogarithmic factor in the time complexities on DMS with hypercubic networks. Furthermore, our parallel matrix computations are fully scalable on DMPC and highly scalable over a wide range of system sizes on DMS with hypercubic networks. Such fast (in terms of parallel time complexity) and highly scalable (in terms of our definition of scalability) parallel matrix computations have rarely been seen before on any distributed memory system.
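
The abstract's central idea, that the other matrix problems are built on top of a matrix multiplication primitive, can be illustrated with a small sketch. The code below is not the paper's algorithm: it uses naive cubic multiplication rather than a fast one, simulates the processors sequentially, and models no communication cost. The names block_matmul, matrix_power, and the grid parameter q are hypothetical, chosen only for this illustration. It shows a result matrix partitioned into blocks, one per simulated processor, and matrix powers obtained from O(log e) multiplications by repeated squaring.

```python
def block_matmul(a, b, q):
    """Multiply n x n matrices a and b on a q x q grid of simulated processors.

    Hypothetical illustration: processor (r, c) owns block (r, c) of the
    result and computes it from block row r of a and block column c of b.
    Assumes q divides n. The processors are simulated one after another.
    """
    n = len(a)
    s = n // q  # side length of each block
    c = [[0] * n for _ in range(n)]
    for r in range(q):            # simulated processor row
        for col in range(q):      # simulated processor column
            for i in range(r * s, (r + 1) * s):
                for j in range(col * s, (col + 1) * s):
                    c[i][j] = sum(a[i][k] * b[k][j] for k in range(n))
    return c


def matrix_power(a, e, q):
    """Compute a**e by repeated squaring with O(log e) calls to block_matmul."""
    n = len(a)
    result = [[int(i == j) for j in range(n)] for i in range(n)]  # identity
    base = a
    while e > 0:
        if e & 1:                 # current bit of the exponent is set
            result = block_matmul(result, base, q)
        base = block_matmul(base, base, q)
        e >>= 1
    return result


if __name__ == "__main__":
    fib = [[1, 1], [1, 0]]
    print(matrix_power(fib, 10, 2))
```

In a real distributed memory setting each block would reside on a separate processor and the block rows and columns would have to be communicated over the network; that communication is what the time complexities discussed in the abstract account for, and it is left out of this sketch.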