Discontinuous Galerkin (DG) methods for the numerical solution of partial differential equations have enjoyed considerable success because they are both flexible and robust: they allow arbitrary unstructured geometries and easy control of accuracy without compromising simulation stability. Lately, another property of DG has been growing in importance: the majority of a DG operator is applied in an element-local way, with weak penalty-based element-to-element coupling. The resulting locality in memory access is one of the factors that enable DG to run on off-the-shelf, massively parallel graphics processors (GPUs). In addition, DG's high-order nature means that it requires fewer data points per represented wavelength and hence fewer memory accesses, in exchange for higher arithmetic intensity. Both of these factors work significantly in favor of a GPU implementation of DG. Using a single US$400 Nvidia GTX 280 GPU, we accelerate a solver for Maxwell's equations on a general 3D unstructured grid by a factor of around 50 relative to a serial computation on a current-generation CPU. In many cases, our algorithms make full use of the device's available memory bandwidth. Example computations achieve and surpass 200 gigaflop/s of net application-level floating point work. In this article, we describe and derive the techniques used to reach this level of performance. In addition, we present comprehensive data on the accuracy and runtime behavior of the method.
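To make the element-local structure concrete, the following minimal Python sketch applies one dense local matrix to each element's nodal values independently and estimates the resulting arithmetic intensity. It is an illustration only, not the article's GPU implementation: the names `apply_element_local` and `arithmetic_intensity`, the single-precision value size, and the assumption that the local matrix stays in fast memory are all hypothetical choices made for this example.

```python
# Illustrative sketch only (hypothetical names, not the article's code):
# apply one dense local operator D (Np x Np) to every element independently.


def apply_element_local(D, u):
    """Return v with v[k] = D @ u[k] for each element k.

    D: list of Np rows, each of length Np, shared by all elements.
    u: list of K elements, each a list of Np nodal values.
    No element reads another element's data, so the outer loop
    could be distributed across parallel workers without coupling.
    """
    return [
        [sum(d_ij * u_j for d_ij, u_j in zip(row, u_k)) for row in D]
        for u_k in u
    ]


def arithmetic_intensity(Np, bytes_per_value=4):
    """Flops per byte of element data moved for one local matrix apply.

    Assumption: D is reused from fast memory, so only the Np input
    values and Np output values of each element count as traffic.
    """
    flops = 2 * Np * Np                  # one multiply and one add per entry of D
    bytes_moved = 2 * Np * bytes_per_value
    return flops / bytes_moved


if __name__ == "__main__":
    D = [[0.0, 1.0], [-1.0, 0.0]]        # toy 2x2 "operator"
    u = [[1.0, 2.0], [3.0, 4.0]]         # two elements, two nodes each
    print(apply_element_local(D, u))

    for N in (1, 2, 4):                  # polynomial degree
        Np = (N + 1) * (N + 2) * (N + 3) // 6   # nodes on a degree-N tetrahedron
        print(N, Np, arithmetic_intensity(Np))
```

Under these assumptions the flops-per-byte ratio grows linearly with the number of nodes per element, which is one way to read the abstract's remark that higher order trades memory accesses for arithmetic intensity.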