Recent multi-core and many-core CPUs and GPUs provide a well-suited parallel computing platform for accelerating the time-consuming analysis of radio-frequency/millimeter-wave (RF/MM) integrated circuits (ICs). This paper develops a structured shooting algorithm that takes full advantage of the parallelism available in periodic steady-state (PSS) analysis. By exploiting the periodic structure of the state matrix in RF/MM-IC simulation, a cyclic-block-structured shooting-Newton method is parallelized and mapped onto recent GPU platforms. We first present the formulation of the parallel cyclic-block-structured shooting-Newton algorithm, called the periodic Arnoldi shooting method. We then describe the details of its parallel implementation on the GPU. Results from several industrial examples show that the structured parallel shooting-Newton method running on a Tesla GPU achieves speedups of more than 20x over state-of-the-art implicit GMRES methods running on the CPU at the same accuracy.
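For readers unfamiliar with shooting-Newton PSS analysis, the basic idea can be sketched as follows. This is a minimal serial illustration only, not the paper's periodic Arnoldi shooting method or its GPU implementation: it assumes a hypothetical two-state nonlinear system with a period-1 drive (the functions `f` and `jac` are invented for the example), integrates with backward Euler, and solves the Newton update with a dense direct solve rather than a Krylov-subspace (GMRES/Arnoldi) iteration.

```python
import numpy as np

def f(x, t):
    # Hypothetical 2-state nonlinear system with a period-1 drive (an assumption,
    # not a circuit from the paper).
    return np.array([
        -x[0] - x[0] ** 3 + x[1] + np.cos(2.0 * np.pi * t),
        -2.0 * x[1] - x[0],
    ])

def jac(x, t):
    # Jacobian of f with respect to x.
    return np.array([
        [-1.0 - 3.0 * x[0] ** 2, 1.0],
        [-1.0, -2.0],
    ])

def integrate_period(x0, T=1.0, steps=200):
    """Backward Euler over one period.

    Returns x(T) and the sensitivity matrix M = dx(T)/dx(0) of the discrete map.
    """
    n = len(x0)
    h = T / steps
    I = np.eye(n)
    x = np.array(x0, dtype=float)
    M = np.eye(n)
    for k in range(steps):
        t_next = (k + 1) * h
        y = x.copy()
        for _ in range(20):  # inner Newton iteration for the implicit step
            r = y - x - h * f(y, t_next)
            dy = np.linalg.solve(I - h * jac(y, t_next), -r)
            y += dy
            if np.linalg.norm(dy) < 1e-13:
                break
        # Chain rule for one step: dy/dx = (I - h * J_f(y))^{-1}.
        M = np.linalg.solve(I - h * jac(y, t_next), M)
        x = y
    return x, M

def shooting_newton(x0, tol=1e-10, max_iter=20):
    """Find x(0) such that x(T) = x(0), using Newton on F(x0) = x(T; x0) - x0."""
    x = np.array(x0, dtype=float)
    for it in range(max_iter):
        xT, M = integrate_period(x)
        res = xT - x
        if np.linalg.norm(res) < tol:
            return x, it
        # Dense solve of (M - I) dx = -res; a matrix-free Krylov solver would
        # replace this step in a large-scale setting.
        x = x + np.linalg.solve(M - np.eye(len(x)), -res)
    raise RuntimeError("shooting-Newton did not converge")

if __name__ == "__main__":
    x_pss, iters = shooting_newton(np.zeros(2))
    print("periodic initial state:", x_pss, "Newton iterations:", iters)
```

Forming and factoring the dense sensitivity matrix is only practical for very small systems; avoiding that cost is the motivation for matrix-free Krylov-subspace solvers such as the GMRES baseline the abstract compares against.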