Computing the MDMT decomposition

  • Authors:
  • Linda Kaufman

  • Affiliations:
  • AT&T Bell Labs, Murray Hill, NJ

  • Venue:
  • ACM Transactions on Mathematical Software (TOMS)
  • Year:
  • 1995

Quantified Score

Hi-index 0.00

Visualization

Abstract

The MDMT factorization of an n×n symmetric indefinite matrix A can be used to solve a linear system with A as the coefficient matrix. This factorization can be computed efficiently using an algorithm given in 1977 by Bunch and Kaufman. The LAPACK project has been implementing block versions of well-known algorithms for solving dense linear systems and eigenvalue problems. The block version of the MDMT decomposition algorithm in LAPACK requires the user to specify a block size b by supplying an n×b scratch array. It then makes (n/b−2)2/2 invocations of a matrix-matrix product subroutine with one matrix no larger than b×b, n2/(2b)−b invocations of a matrix-vector product routine with a matrix no larger than b×b, and between n−n/b and 2(n−n/b) invocations of a matrix-vector product routine with matrices with less than b columns. Because the user can query LAPACK about an optimal block size, our concern is focused on users who cannot change the amount of available scratch space or who neglect to use this facility and are unaware of a performance degradation with small block sizes. This article suggests two alternative algorithms. The first is a block algorithm requiring b×b scratch space and is about 5% slower than LAPACK's current block algorithm with large b. The user does not have to specify the block size. The second algorithm is a rejuvenation of an old implementation of the MDMT decomposition algorithm that requires n matrix-vector products. The performance of the various algorithms on a specific machine is dependent on the manufacturer's implementation of the different basic linear algebra subroutines that they each invoke. Our data indicate that on the Cray Y-MP, Alliant, and Convex the time for the rejuvenated algorithm is either less than or within 10% of that of LAPACK's block algorithm with large b.