Architecting the finite element method pipeline for the GPU

Authors:
Zhisong Fu; T. James Lewis; Robert M. Kirby; Ross T. Whitaker
Venue:
Journal of Computational and Applied Mathematics
Year:
2014

Abstract

The finite element method (FEM) is a widely employed numerical technique for approximating the solutions of partial differential equations (PDEs) in science and engineering applications. Many of these applications benefit from fast execution of the FEM pipeline. One way to accelerate the pipeline is to exploit advances in modern computational hardware, such as many-core streaming processors like the graphics processing unit (GPU). In this paper, we present the algorithms and data structures necessary to move the entire FEM pipeline to the GPU. First, we propose an efficient GPU-based algorithm to generate local element information and to assemble the global linear system associated with the FEM discretization of an elliptic PDE. To solve the corresponding linear system efficiently on the GPU, we implement a conjugate gradient method preconditioned with a geometry-informed algebraic multigrid (AMG) preconditioner. We propose a new fine-grained parallelism strategy, a corresponding multigrid cycling stage, and an efficient data mapping to the many-core architecture of the GPU. Compared with a traditional serial implementation on the CPU, our on-GPU assembly achieves a speedup of up to 87x. Focusing on the linear system solver alone, we achieve a speedup of up to 51x over a comparable state-of-the-art serial CPU linear system solver. Furthermore, the method compares favorably with other GPU-based sparse linear solvers.
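
To make the two pipeline stages named in the abstract concrete, the following is a minimal serial sketch, not the paper's GPU implementation. It assembles the linear system for a 1D model problem (-u'' = 1 on (0, 1) with zero boundary values, linear elements) by looping over elements, then solves it with a preconditioned conjugate gradient iteration. The model problem, the function names (`assemble_1d_poisson`, `make_jacobi`, `pcg`), and the Jacobi (diagonal) preconditioner are all assumptions made for illustration; in particular, Jacobi stands in for the geometry-informed AMG preconditioner, which the abstract does not specify in enough detail to reproduce.

```python
# Illustrative serial sketch only: assumed 1D model problem and a Jacobi
# preconditioner standing in for the paper's AMG preconditioner.

def assemble_1d_poisson(n_elements):
    """Assemble K u = f for -u'' = 1 on (0, 1), u(0) = u(1) = 0, linear elements."""
    h = 1.0 / n_elements
    n = n_elements - 1                      # number of interior (unknown) nodes
    K = [[0.0] * n for _ in range(n)]
    f = [0.0] * n
    k_local = [[1.0 / h, -1.0 / h], [-1.0 / h, 1.0 / h]]  # element stiffness
    f_local = [h / 2.0, h / 2.0]                          # element load
    for e in range(n_elements):
        # Element e spans global nodes e and e + 1; interior index = global - 1.
        dofs = (e - 1, e)
        for a in range(2):
            i = dofs[a]
            if not 0 <= i < n:
                continue                    # boundary node: value is fixed at zero
            f[i] += f_local[a]
            for c in range(2):
                j = dofs[c]
                if 0 <= j < n:
                    K[i][j] += k_local[a][c]
    return K, f

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def matvec(A, v):
    return [dot(row, v) for row in A]

def make_jacobi(A):
    """Return a function applying the inverse diagonal of A (hypothetical helper)."""
    inv_diag = [1.0 / A[i][i] for i in range(len(A))]
    return lambda r: [d * ri for d, ri in zip(inv_diag, r)]

def pcg(A, b, apply_preconditioner, tol=1e-10, max_iter=1000):
    """Preconditioned conjugate gradient for a symmetric positive definite A."""
    x = [0.0] * len(b)
    r = list(b)                             # residual b - A x with x = 0
    z = apply_preconditioner(r)
    p = list(z)
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5 or 1.0
    for it in range(max_iter):
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, it
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = apply_preconditioner(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter

K, f = assemble_1d_poisson(8)
u, iterations = pcg(K, f, make_jacobi(K))
```

The element loop in the assembly and the matrix-vector product inside the solver are the parts a GPU version would parallelize; the sketch keeps them serial and dense so the structure of each stage is easy to follow.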