Vectorized OpenCL implementation of numerical integration for higher order finite elements

Authors:
Filip Krużel;Krzysztof Banaś
Affiliations:
-;-
Venue:
Computers & Mathematics with Applications
Year:
2013

Abstract

In this work we analyze the computational aspects of numerical integration in finite element calculations and consider an OpenCL implementation of the related algorithms for processors with wide vector registers. As a test platform we choose the PowerXCell processor, an example of the Cell Broadband Engine (CellBE) architecture. Although the processor is old by today's standards (its design dates back to 2001), we investigate its performance because of two features that it shares with the recent Xeon Phi family of co-processors: wide vector units and a relatively slow connection between the computing cores and main global memory. The analysis of parallelization options can also be used to design numerical integration algorithms for other processors with vector registers, such as contemporary x86 microprocessors. We consider higher order finite element approximations and implement the standard numerical integration algorithm for prismatic elements. The original contributions of the paper include an analysis of the data movement and vector operations performed during code execution. Several versions of the implementation are developed and tested in practice.
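To make the term "numerical integration" concrete, the following is a minimal sketch in plain C of the quadrature loop structure that finite element codes typically use to build an element matrix. It is not the paper's implementation: the paper targets higher order prismatic elements in OpenCL, while this sketch assumes a 1D element with two linear shape functions and a 2-point Gauss-Legendre rule, and computes a mass matrix. The names `shape_functions` and `element_mass_matrix` are hypothetical and chosen only for illustration.

```c
#include <assert.h>

#define NUM_SHAPE 2 /* assumption: two linear shape functions on a 1D element */
#define NUM_GAUSS 2 /* assumption: 2-point Gauss-Legendre quadrature */

/* Hypothetical helper: linear shape functions evaluated at xi in [0, 1]. */
static void shape_functions(double xi, double phi[NUM_SHAPE])
{
    phi[0] = 1.0 - xi;
    phi[1] = xi;
}

/* Hypothetical routine: mass matrix of a 1D element of length h,
 * mass[i][j] = integral over the element of phi_i * phi_j. */
void element_mass_matrix(double h, double mass[NUM_SHAPE][NUM_SHAPE])
{
    /* Gauss-Legendre points and weights mapped from [-1, 1] to [0, 1];
     * 0.2886... is 1 / (2 * sqrt(3)). */
    const double offset = 0.2886751345948129;
    const double points[NUM_GAUSS] = {0.5 - offset, 0.5 + offset};
    const double weights[NUM_GAUSS] = {0.5, 0.5};

    assert(h > 0.0);

    for (int i = 0; i < NUM_SHAPE; ++i)
        for (int j = 0; j < NUM_SHAPE; ++j)
            mass[i][j] = 0.0;

    /* Outer loop over integration points. */
    for (int q = 0; q < NUM_GAUSS; ++q) {
        double phi[NUM_SHAPE];
        shape_functions(points[q], phi);

        /* Quadrature weight times the Jacobian of the reference mapping
         * (for this 1D element the Jacobian is simply h). */
        const double vol = weights[q] * h;

        /* Double loop over shape functions: every entry receives the same
         * multiply-add pattern, which is the kind of regular arithmetic
         * that wide vector units can process several entries at a time. */
        for (int i = 0; i < NUM_SHAPE; ++i)
            for (int j = 0; j < NUM_SHAPE; ++j)
                mass[i][j] += vol * phi[i] * phi[j];
    }
}
```

For a unit-length element, calling `element_mass_matrix(1.0, m)` fills `m` with 1/3 on the diagonal and 1/6 off the diagonal, up to rounding. In a higher order setting the number of shape functions and integration points grows quickly, so the same loop nest dominates the cost, which is why the data movement and vector operations inside it are worth analyzing.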