Exploring weak scalability for FEM calculations on a GPU-enhanced cluster

  • Authors:
  • Dominik Göddeke;Robert Strzodka;Jamaludin Mohd-Yusof;Patrick McCormick;Sven H. M. Buijssen;Matthias Grajewski;Stefan Turek

  • Affiliations:
  • Institute of Applied Mathematics, University of Dortmund, Vogelpothsweg 87, 44227 Dortmund, Germany;Stanford University, Max Planck Center, United States;Computer, Computational and Statistical Sciences Division, Los Alamos National Laboratory, United States;Computer, Computational and Statistical Sciences Division, Los Alamos National Laboratory, United States;Institute of Applied Mathematics, University of Dortmund, Vogelpothsweg 87, 44227 Dortmund, Germany;Institute of Applied Mathematics, University of Dortmund, Vogelpothsweg 87, 44227 Dortmund, Germany;Institute of Applied Mathematics, University of Dortmund, Vogelpothsweg 87, 44227 Dortmund, Germany

  • Venue:
  • Parallel Computing
  • Year:
  • 2007

Abstract

The first part of this paper surveys co-processor approaches for commodity-based clusters in general, not only with respect to raw performance but also in view of their system integration and power consumption. We then extend previous work on a small GPU cluster by exploring the heterogeneous hardware approach for a large-scale system with up to 160 nodes. Starting with a conventional commodity-based cluster, we leverage the high bandwidth of graphics processing units (GPUs) to increase the overall system bandwidth, which is the decisive performance factor in this scenario. Thus, even the addition of low-end, out-of-date GPUs leads to improvements in both performance- and power-related metrics.