Parallel Lattice Boltzmann Flow Simulation on Emerging Multi-core Platforms

  • Authors:
  • Liu Peng;Ken-Ichi Nomura;Takehiro Oyakawa;Rajiv K. Kalia;Aiichiro Nakano;Priya Vashishta

  • Affiliations:
  • Collaboratory for Advanced Computing and Simulations, Department of Computer Science, Department of Physics & Astronomy, Department of Chemical Engineering & Materials Science, University of South ...;Collaboratory for Advanced Computing and Simulations, Department of Computer Science, Department of Physics & Astronomy, Department of Chemical Engineering & Materials Science, University of South ...;Collaboratory for Advanced Computing and Simulations, Department of Computer Science, Department of Physics & Astronomy, Department of Chemical Engineering & Materials Science, University of South ...;Collaboratory for Advanced Computing and Simulations, Department of Computer Science, Department of Physics & Astronomy, Department of Chemical Engineering & Materials Science, University of South ...;Collaboratory for Advanced Computing and Simulations, Department of Computer Science, Department of Physics & Astronomy, Department of Chemical Engineering & Materials Science, University of South ...;Collaboratory for Advanced Computing and Simulations, Department of Computer Science, Department of Physics & Astronomy, Department of Chemical Engineering & Materials Science, University of South ...

  • Venue:
  • Euro-Par '08 Proceedings of the 14th international Euro-Par conference on Parallel Processing
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

A parallel Lattice Boltzmann Method (pLBM), which is based on hierarchical spatial decomposition, is designed to perform large-scale flow simulations. The algorithm uses critical section-free, dual representation in order to expose maximal concurrency and data locality. Performances of emerging multi-core platforms--PlayStation3 (Cell Broadband Engine) and Compute Unified Device Architecture (CUDA)--are tested using the pLBM, which is implemented with multi-thread and message-passing programming. The results show that pLBM achieves good performance improvement, 11.02 for Cell over a traditional Xeon cluster and 8.76 for CUDA graphics processing unit (GPU) over a Sempron central processing unit (CPU). The results provide some insights into application design on future many-core platforms.