Comparison of different propagation steps for lattice Boltzmann methods

Authors:
Markus Wittmann;Thomas Zeiser;Georg Hager;Gerhard Wellein
Affiliations:
Erlangen Regional Computing Center (RRZE), University of Erlangen-Nuremberg, Germany;Erlangen Regional Computing Center (RRZE), University of Erlangen-Nuremberg, Germany;Erlangen Regional Computing Center (RRZE), University of Erlangen-Nuremberg, Germany;Department of Computer Science, University of Erlangen-Nuremberg, Germany
Venue:
Computers & Mathematics with Applications
Year:
2013

Abstract

Several possibilities exist to implement the propagation step of lattice Boltzmann methods. This paper describes common implementations and compares the number of memory transfer operations each requires per lattice node update. A performance model based on memory bandwidth is then used to estimate the maximum achievable performance on different machines. A subset of the discussed implementations of the propagation step is benchmarked on different Intel- and AMD-based compute nodes using the framework of an existing flow solver that is specially adapted to simulate flow in porous media, and the model is validated against the measurements. Advanced approaches for the propagation step, such as the "A-A pattern" or "Esoteric Twist", require more programming effort but often sustain significantly better performance than non-naive but straightforward implementations.
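
To illustrate the kind of bandwidth-based estimate the abstract refers to, the following is a minimal sketch, not the authors' model. It assumes a kernel that is purely limited by memory bandwidth, so the upper bound on performance is the sustained bandwidth divided by the bytes moved per lattice node update. The function names, the D3Q19 stencil (19 distribution values per node), double precision, the transfer counts per value, and the 40 GB/s bandwidth figure are all assumptions chosen for illustration. Index data for indirect addressing, as a porous-media solver might need, is ignored here.

```python
def bytes_per_update(q, transfers_per_value, bytes_per_value=8):
    """Bytes moved between memory and CPU for one lattice node update.

    q                   -- number of distribution values per node (assumed 19 for D3Q19)
    transfers_per_value -- memory transfers per value, e.g. 3 for load + store +
                           write-allocate, 2 for load + store only (assumed counts)
    bytes_per_value     -- 8 for double precision (assumption)
    """
    return q * transfers_per_value * bytes_per_value


def max_mlups(bandwidth_gb_s, bytes_per_node_update):
    """Upper performance bound in million lattice node updates per second,
    assuming the kernel is limited by memory bandwidth alone."""
    updates_per_second = bandwidth_gb_s * 1e9 / bytes_per_node_update
    return updates_per_second / 1e6


if __name__ == "__main__":
    bandwidth = 40.0  # GB/s, hypothetical sustained memory bandwidth
    # Hypothetical comparison: a scheme with write-allocate traffic (3 transfers
    # per value) versus an in-place scheme with 2 transfers per value.
    for label, transfers in (("three transfers per value", 3),
                             ("two transfers per value", 2)):
        volume = bytes_per_update(19, transfers)
        print(f"{label}: {volume} B/update, "
              f"at most {max_mlups(bandwidth, volume):.1f} MLUPS")
```

Under these assumptions, reducing the transfers per value from three to two raises the bound by a factor of 1.5, which is the sort of gain that could motivate the more elaborate propagation schemes.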