Programming the Finite Element Method
Programming the Finite Element Method
Concurrent number cruncher: a GPU implementation of a general sparse linear solver
International Journal of Parallel, Emergent and Distributed Systems
Solving Sparse Linear Systems on NVIDIA Tesla GPUs
ICCS '09 Proceedings of the 9th International Conference on Computational Science: Part I
Fast sort on CPUs and GPUs: a case for bandwidth oblivious SIMD sort
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Intel Xeon Phi Coprocessor High Performance Programming
Intel Xeon Phi Coprocessor High Performance Programming
Hi-index | 0.00 |
Finite element method (FEM) is a popular approach to solving Differential equations [5]. Among its many attractive features is its ability to handle complex geometries. The domain is discretised using simple elements whose local contributions are assembled into a global system of equations. This is in contrast to the finite difference method (FDM) which can typically only handle regular geometries. However before solution is possible the system of equations of the FEM has to be assembled, a procedure which can be significant to the computational performance of the FEM solver, particularly when coupled with highly parallel execution [3]. In this work we outline a new algorithm for achieving a highly parallel assembler routine compatible with Intel® Xeon Phi and GPU architectures. We also present performance comparison and analysis of our algorithm and the globalNZ algorithm outlined by Cecka et al. in [2], as implemented on Intel® Xeon Phi architecture and compare these to the serial implementation of Hughes [5].