We present a computational method that couples average-interpolating wavelets with high-order finite volume schemes, together with its implementation on heterogeneous computer architectures, for the simulation of multiphase compressible flows. The implementation takes advantage of the parallel computing capabilities of emerging heterogeneous multicore/multi-GPU architectures. High parallel efficiency is achieved by introducing the concept of wavelet blocks, by exploiting task-based parallelism on the CPU cores, and by asynchronously managing an array of GPUs by means of OpenCL. We compare the accuracy of the GPU-based and CPU-based simulations and analyze their discrepancy for two-dimensional simulations of shock-bubble interaction and the Richtmyer-Meshkov instability. The results indicate that the accuracy of the heterogeneous GPU/CPU solver is competitive with that of a solver that uses the CPU cores exclusively. We report the performance improvements obtained with up to 12 cores and 6 GPUs relative to single-core execution. For the simulation of the shock-bubble interaction at Mach 3 with two million grid points, we observe a 100-fold speedup for the heterogeneous part and an overall speedup of 34.
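The two speedup figures above can be related by a back-of-the-envelope calculation. The sketch below assumes that Amdahl's law applies, i.e. that the single-core run time splits into a fraction p that is accelerated 100-fold and a remainder that is not accelerated at all. The abstract does not state this model, and the function names are hypothetical, so the result is only a rough illustration.

```python
def accelerated_fraction(overall: float, part: float) -> float:
    """Solve Amdahl's law, overall = 1 / ((1 - p) + p / part), for p.

    p is the share of single-core run time spent in the accelerated part.
    Assumption: the remaining (1 - p) share sees no speedup at all.
    """
    return (1.0 - 1.0 / overall) / (1.0 - 1.0 / part)


def overall_speedup(p: float, part: float) -> float:
    """Amdahl's law: overall speedup when a share p runs `part` times faster."""
    return 1.0 / ((1.0 - p) + p / part)


# Figures quoted in the abstract: 100-fold for the heterogeneous part, 34 overall.
p = accelerated_fraction(overall=34.0, part=100.0)
print(f"accelerated share of single-core time: {p:.3f}")
print(f"consistency check, overall speedup: {overall_speedup(p, 100.0):.1f}")
```

Under this assumption, roughly 98% of the single-core run time would lie in the heterogeneous part, and the remaining 2% or so would be what limits the overall speedup to 34.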