Performance models and workload distribution algorithms for optimizing a hybrid CPU-GPU multifrontal solver

  • Authors:
  • Chenhan D. Yu; Weichung Wang

  • Venue:
  • Computers & Mathematics with Applications
  • Year:
  • 2014

Quantified Score

Hi-index 0.09

Abstract

Large, sparse linear systems are ubiquitous in scientific computing, and there is a strong need to accelerate their solution. Hybrid CPU-GPU systems have recently emerged as a popular platform with powerful computing capabilities. However, it is not clear how such systems can best accelerate the solvers. We study how to make the best use of the CPU and the GPU to minimize the total time required to solve symmetric positive definite systems using the multifrontal method. We analyze the computation and communication costs of the multifrontal method on such hybrid systems to build timing performance models. We propose workload distribution algorithms that determine whether each frontal matrix should be factored on the CPU or on the GPU so as to minimize the total execution time. We provide theoretical analyses and numerical results to illustrate the characteristics and efficiency of the proposed algorithms. Because the performance models and workload distribution algorithms adapt to different CPUs and GPUs, we expect the applicability and significance of these techniques to continue to grow as heterogeneous hardware and software evolve.
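
The idea of pairing a timing model with a per-front CPU/GPU decision can be illustrated with a minimal sketch. This is not the authors' model or algorithm: the function names, the device rates, the bandwidth, the latency, and the greedy front-by-front rule are all assumptions chosen for illustration. The only standard ingredient is the leading-order flop count for a partial dense Cholesky factorization that eliminates k pivots from a front of order n.

```python
# Illustrative sketch only: a toy timing model and a greedy per-front
# CPU/GPU choice. All rates and names below are hypothetical assumptions,
# not values or algorithms taken from the paper.

def partial_cholesky_flops(n, k):
    """Leading-order flops to eliminate k pivots from an n x n SPD front.

    k^3/3 (factor the pivot block) + k^2*m (triangular solve)
    + k*m^2 (Schur complement update), with m = n - k.
    """
    m = n - k
    return k**3 / 3 + k**2 * m + k * m**2


def cpu_time(n, k, cpu_rate=50e9):
    """Assumed CPU model: pure compute at a sustained rate (flop/s)."""
    return partial_cholesky_flops(n, k) / cpu_rate


def gpu_time(n, k, gpu_rate=500e9, bandwidth=6e9, latency=20e-6):
    """Assumed GPU model: fixed latency + transfer both ways + compute.

    The transfer term moves the full n x n front of 8-byte doubles to the
    GPU and back, which is a deliberately pessimistic assumption.
    """
    transfer_bytes = 2 * 8 * n * n
    return (latency
            + transfer_bytes / bandwidth
            + partial_cholesky_flops(n, k) / gpu_rate)


def choose_device(n, k):
    """Greedy rule: pick whichever device the model predicts is faster."""
    return "gpu" if gpu_time(n, k) < cpu_time(n, k) else "cpu"


if __name__ == "__main__":
    # Hypothetical fronts as (order n, pivots k).
    for n, k in [(16, 8), (256, 128), (4000, 2000)]:
        print(n, k, choose_device(n, k))
```

Under these assumed rates, small fronts stay on the CPU because latency and transfer dominate, while large fronts amortize the communication cost and move to the GPU. A front-by-front rule like this ignores the dependencies in the elimination tree and any overlap of transfers with computation, so it only conveys the trade-off the abstract describes.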