A divide-and-conquer approach for solving singular value decomposition on a heterogeneous system

  • Authors:
  • Ding Liu;Ruixuan Li;David J. Lilja;Weijun Xiao

  • Affiliations:
  • Huazhong University of Science and Technology, Wuhan, Hubei, P. R. China and University of Minnesota, Minneapolis, MN;Huazhong University of Science and Technology, Wuhan, Hubei, P. R. China;University of Minnesota, Minneapolis, MN;Virginia Commonwealth University, Richmond, VA

  • Venue:
  • Proceedings of the ACM International Conference on Computing Frontiers
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

Singular value decomposition (SVD) is a fundamental linear operation that has been used for many applications, such as pattern recognition and statistical information processing. In order to accelerate this time-consuming operation, this paper presents a new divide-and-conquer approach for solving SVD on a heterogeneous CPU-GPU system. We carefully design our algorithm to match the mathematical requirements of SVD to the unique characteristics of a heterogeneous computing platform. This includes a high-performanc solution to the secular equation with good numerical stability, overlapping the CPU and the GPU tasks, and leveraging the GPU bandwidth in a heterogeneous system. The experimental results show that our algorithm has better performance than MKL's divide-and-conquer routine [18] with four cores (eight hardware threads) when the size of the input matrix is larger than 3000. Furthermore, it is up to 33 times faster than LAPACK's divide-and-conquer routine [17], 3 times faster than MKL's divide-and-conquer routine with four cores, and 7 times faster than CULA on the same device, when the size of the matrix grows up to 14,000. Our algorithm is also much faster than previous SVD approaches on GPUs.