A mixed-precision algorithm for the solution of Lyapunov equations on hybrid CPU-GPU platforms

  • Authors:
  • Peter Benner; Pablo Ezzatti; Daniel Kressner; Enrique S. Quintana-Ortí; Alfredo Remón

  • Affiliations:
  • Max Planck Institute for Dynamics of Complex Technical Systems, Sandtorstr. 1, D-39106 Magdeburg, Germany; Centro de Cálculo-Instituto de la Computación, Universidad de la República, 11.300-Montevideo, Uruguay; Seminar für Angewandte Mathematik, ETHZ, CH-8092 Zürich, Switzerland; Dpto. de Ingeniería y Ciencia de Computadores, Universidad Jaime I, 12.071-Castellón, Spain; Dpto. de Ingeniería y Ciencia de Computadores, Universidad Jaime I, 12.071-Castellón, Spain

  • Venue:
  • Parallel Computing
  • Year:
  • 2011

Abstract

We describe a hybrid Lyapunov solver based on the matrix sign function, in which the compute-intensive parts are accelerated on a graphics processor (GPU) while the remaining operations execute on a general-purpose multi-core processor (CPU). The initial stage of the iteration operates in single-precision arithmetic and returns a low-rank factor of an approximate solution. As the main computation in this stage consists of explicit matrix inversions, we propose a hybrid implementation of Gauss-Jordan elimination that uses look-ahead to overlap computations on the GPU and the CPU. To improve the approximate solution, we introduce an iterative refinement procedure that cheaply recovers full double-precision accuracy. In contrast to earlier approaches to iterative refinement for Lyapunov equations, this approach retains the low-rank factorization structure of the approximate solution. The combination of the two stages results in a mixed-precision algorithm that exploits the capabilities of both general-purpose CPUs and many-core GPUs and overlaps critical computations. Numerical experiments using real-world data and a platform equipped with two Intel Xeon QuadCore processors and an Nvidia Tesla C1060 show a significant efficiency gain of the hybrid method compared to a classical CPU implementation.
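
The two-stage idea in the abstract can be illustrated with a small CPU-only NumPy sketch. It is not the authors' implementation and rests on several assumptions: the equation is taken in the form A X + X Aᵀ + Q = 0 with A stable, the sign function is computed with the plain (unscaled) Newton iteration, the inversion is an unblocked Gauss-Jordan elimination with partial pivoting (no GPU, no look-ahead), and the refinement works on the full matrix X rather than preserving a low-rank factor as the paper does. The names `gauss_jordan_inverse`, `sign_lyapunov`, and `refine` are hypothetical.

```python
import numpy as np

def gauss_jordan_inverse(A):
    """Invert A by unblocked Gauss-Jordan elimination with partial pivoting."""
    n = A.shape[0]
    W = np.hstack([A, np.eye(n, dtype=A.dtype)])  # augmented matrix [A | I]
    for k in range(n):
        p = k + np.argmax(np.abs(W[k:, k]))       # pivot row for column k
        W[[k, p]] = W[[p, k]]                     # swap rows k and p
        W[k] /= W[k, k]                           # normalize the pivot row
        others = np.arange(n) != k
        W[others] -= np.outer(W[others, k], W[k])  # clear column k elsewhere
    return W[:, n:]                               # right half now holds A^{-1}

def sign_lyapunov(A, Q, dtype=np.float32, max_iter=50):
    """Approximate X with A X + X A^T + Q = 0 (A stable) in precision `dtype`.

    Newton sign iteration: A_k converges to -I and Q_k to 2 X.
    """
    Ak, Qk = A.astype(dtype), Q.astype(dtype)
    n = Ak.shape[0]
    I = np.eye(n, dtype=dtype)
    tol = 10 * n * np.finfo(dtype).eps            # assumed stopping threshold
    for _ in range(max_iter):
        Ainv = gauss_jordan_inverse(Ak)
        Qk = 0.5 * (Qk + Ainv @ Qk @ Ainv.T)
        Ak = 0.5 * (Ak + Ainv)
        if np.linalg.norm(Ak + I, 1) <= tol:
            break
    return 0.5 * Qk

def refine(A, Q, X, steps=3):
    """Full-matrix iterative refinement (not the paper's low-rank variant).

    The residual is formed in double precision; the correction equation
    A E + E A^T + R = 0 is solved with the cheap single-precision solver.
    """
    for _ in range(steps):
        R = A @ X + X @ A.T + Q                   # double-precision residual
        E = sign_lyapunov(A, R, dtype=np.float32)
        X = X + E.astype(np.float64)
    return X
```

A typical call would first compute `X32 = sign_lyapunov(A, Q).astype(np.float64)` and then `X = refine(A, Q, X32)`. For a well-conditioned problem, the residual of `X` should drop from roughly single-precision level to near double-precision level after a few refinement steps.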