Efficient Parallel Nonnegative Least Squares on Multicore Architectures

  • Authors:
  • Yuancheng Luo;Ramani Duraiswami

  • Affiliations:
  • yluo1@umd.edu and ramani@umiacs.umd.edu;-

  • Venue:
  • SIAM Journal on Scientific Computing
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

We parallelize a version of the active-set iterative algorithm derived from the original works of Lawson and Hanson [Solving Least Squares Problems, Prentice-Hall, 1974] on multicore architectures. This algorithm requires the solution of an unconstrained least squares problem in every step of the iteration for a matrix composed of the passive columns of the original system matrix. To achieve improved performance, we use parallelizable procedures to efficiently update and downdate the $QR$ factorization of the matrix at each iteration, to account for inserted and removed columns. We use a reordering strategy of the columns in the decomposition to reduce computation and memory access costs. We consider graphics processing units (GPUs) as a new mode for efficient parallel computations and compare our implementations to that of multicore CPUs. Both synthetic and nonsynthetic data are used in the experiments.