On selecting a maximum volume sub-matrix of a matrix and related problems

  • Authors:
  • Ali Çivril; Malik Magdon-Ismail

  • Affiliations:
  • Rensselaer Polytechnic Institute, Computer Science Department, 110 8th Street, Troy, NY 12180, USA (both authors)

  • Venue:
  • Theoretical Computer Science
  • Year:
  • 2009

Abstract

Given a matrix $A \in \mathbb{R}^{m \times n}$ ($n$ vectors in $m$ dimensions), we consider the problem of selecting a subset of its columns such that its elements are as linearly independent as possible. This notion has turned out to be important in low-rank approximations to matrices and rank-revealing QR factorizations, which have been investigated in the linear algebra community, and it can be quantified in a few different ways. In this paper, from a complexity-theoretic point of view, we propose four related problems in which we try to find a sub-matrix $C \in \mathbb{R}^{m \times k}$ of a given matrix $A \in \mathbb{R}^{m \times n}$ such that (i) $\sigma_{\max}(C)$ (the largest singular value of $C$) is minimum, (ii) $\sigma_{\min}(C)$ (the smallest singular value of $C$) is maximum, (iii) $\kappa(C) = \sigma_{\max}(C)/\sigma_{\min}(C)$ (the condition number of $C$) is minimum, and (iv) the volume of the parallelepiped defined by the column vectors of $C$ is maximum. We establish the NP-hardness of these problems and further show that they do not admit a PTAS. We then study a natural Greedy heuristic for the maximum volume problem and show that it has approximation ratio $2^{-O(k \log k)}$. Our analysis of the Greedy heuristic is tight to within a logarithmic factor in the exponent, which we show by explicitly constructing an instance for which the Greedy heuristic is $2^{-\Omega(k)}$ from optimal. When $A$ has unit-norm columns, a related problem is to select the maximum number of vectors with a given volume. We show that if the optimal solution selects $k$ columns, then Greedy will select $\Omega(k/\log k)$ columns, providing a $\log k$ approximation.
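
The abstract does not spell out the Greedy heuristic, so the following is only a minimal sketch under an assumed rule: at each step, pick the column whose norm is largest after removing its components along the columns already chosen. The function name `greedy_max_volume` and the tolerance `1e-12` are illustrative choices, not taken from the paper. The product of the residual norms picked along the way equals the volume of the parallelepiped spanned by the chosen columns.

```python
import math


def greedy_max_volume(columns, k):
    """Greedily pick up to k column indices; return (indices, volume).

    columns: list of n vectors, each a list of m numbers.
    Assumed rule (not stated in the abstract): repeatedly take the column
    with the largest norm after projecting out the columns chosen so far.
    """
    residuals = [[float(x) for x in c] for c in columns]
    chosen = []
    volume = 1.0
    for _ in range(min(k, len(columns))):
        norms = [math.sqrt(sum(x * x for x in r)) for r in residuals]
        candidates = [i for i in range(len(residuals)) if i not in chosen]
        best = max(candidates, key=lambda i: norms[i])
        if norms[best] <= 1e-12:
            # Every remaining column lies in the span of the chosen ones.
            break
        chosen.append(best)
        volume *= norms[best]
        # Unit vector along the newly chosen residual direction.
        q = [x / norms[best] for x in residuals[best]]
        # Remove that direction from every residual (Gram-Schmidt step).
        for i, r in enumerate(residuals):
            dot = sum(a * b for a, b in zip(r, q))
            residuals[i] = [a - dot * b for a, b in zip(r, q)]
    return chosen, volume
```

If the loop stops early, fewer than `k` indices are returned and the reported volume is that of the columns actually chosen.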