We describe a hybrid Lyapunov solver based on the matrix sign function, in which the compute-intensive parts are accelerated on a graphics processor (GPU) while the remaining operations execute on a general-purpose multi-core processor (CPU). The initial stage of the iteration runs in single-precision arithmetic and returns a low-rank factor of an approximate solution. As the main computation in this stage consists of explicit matrix inversions, we propose a hybrid implementation of Gauss-Jordan elimination that uses look-ahead to overlap computations on the GPU and the CPU. To improve the approximate solution, we introduce an iterative refinement procedure that cheaply recovers full double-precision accuracy. In contrast to earlier approaches to iterative refinement for Lyapunov equations, this procedure retains the low-rank factorization structure of the approximate solution. The combination of the two stages yields a mixed-precision algorithm that exploits the capabilities of both general-purpose CPUs and many-core GPUs while overlapping critical computations. Numerical experiments with real-world data on a platform equipped with two Intel Xeon QuadCore processors and an Nvidia Tesla C1060 show a significant efficiency gain of the hybrid method over a classical CPU implementation.
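To illustrate the overall mixed-precision scheme, the following sketch runs the sign-function Newton iteration for the Lyapunov equation A^T X + X A + Q = 0 (with A Hurwitz) in single precision, then applies one generic double-precision defect-correction step. This is a minimal CPU-only NumPy illustration, not the paper's implementation: the function names `lyap_sign` and `lyap_mixed` are hypothetical, the dense correction step stands in for the paper's structured low-rank refinement, and no GPU offload or look-ahead is shown.

```python
import numpy as np

def lyap_sign(A, Q, tol, maxit=60):
    """Solve A^T X + X A + Q = 0 for Hurwitz A via the sign-function
    Newton iteration A_{k+1} = (A_k + A_k^{-1})/2; the congruently
    updated Q_k converges to 2X."""
    Ak, Qk = A.copy(), Q.copy()
    for _ in range(maxit):
        Ainv = np.linalg.inv(Ak)          # explicit inverse, the kernel the
                                          # paper maps to Gauss-Jordan on GPU
        Qk = 0.5 * (Qk + Ainv.T @ Qk @ Ainv)
        Anext = 0.5 * (Ak + Ainv)
        if np.linalg.norm(Anext - Ak, 1) <= tol * np.linalg.norm(Anext, 1):
            Ak = Anext
            break
        Ak = Anext
    return 0.5 * Qk

def lyap_mixed(A, Q):
    """Cheap single-precision solve followed by one double-precision
    defect-correction step (a dense stand-in for the paper's
    structure-preserving low-rank refinement)."""
    X32 = lyap_sign(A.astype(np.float32), Q.astype(np.float32), 1e-4)
    X = X32.astype(np.float64)
    R = A.T @ X + X @ A + Q               # residual in double precision
    X += lyap_sign(A, R, 1e-12)           # correction: A^T dX + dX A + R = 0
    return X
```

Because the residual equation has the same coefficient matrix A, the correction reuses the same solver; the paper's refinement follows the same defect-correction idea while keeping the iterate in factored low-rank form.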