Factors impacting performance of multithreaded sparse triangular solve

  • Authors: Michael M. Wolf; Michael A. Heroux; Erik G. Boman

  • Affiliations: Scalable Algorithms Dept., Sandia National Laboratories, Albuquerque, NM (all authors)

  • Venue: VECPAR'10: Proceedings of the 9th International Conference on High Performance Computing for Computational Science
  • Year: 2010

Abstract

As computational science applications grow more parallel, with multi-core supercomputers having hundreds of thousands of computational cores, it will become increasingly difficult for solvers to scale. Our approach is to use hybrid MPI/threaded numerical algorithms in these solvers to reduce the number of MPI tasks and increase the parallel efficiency of the algorithm. This approach, however, requires efficient threaded numerical kernels on the multi-core nodes. In this paper, we focus on improving the performance of a multithreaded sparse triangular solver, an important kernel for preconditioning. We analyze three factors that affect the parallel performance of this threaded kernel and obtain good scalability on the multi-core nodes for a range of matrix sizes.
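
For readers unfamiliar with the kernel, the following is a minimal sketch of why a sparse triangular solve is hard to thread: each unknown depends on earlier unknowns, so only rows with no unresolved dependencies can be processed at the same time. The sketch groups rows into dependency levels and then solves level by level. It is an illustration only and is not taken from the paper; the abstract does not name the three factors studied or the scheduling scheme used. The function names `level_sets` and `solve_lower`, and the choice of CSR storage for a lower-triangular matrix with a nonzero diagonal, are assumptions made here.

```python
def level_sets(indptr, indices):
    """Group the rows of a lower-triangular CSR matrix into dependency levels.

    Row i is placed one level above the deepest row j < i that it references,
    so all rows in the same level are independent of each other.
    """
    n = len(indptr) - 1
    level = [0] * n
    for i in range(n):
        for k in range(indptr[i], indptr[i + 1]):
            j = indices[k]
            if j < i:  # x[i] needs x[j] to be computed first
                level[i] = max(level[i], level[j] + 1)
    groups = [[] for _ in range(max(level, default=-1) + 1)]
    for i, lv in enumerate(level):
        groups[lv].append(i)
    return groups


def solve_lower(indptr, indices, data, b):
    """Solve L x = b by forward substitution, visiting rows level by level.

    Assumes L is lower triangular in CSR form with a nonzero diagonal entry
    stored in every row. The inner loop over `rows` is serial here; a threaded
    kernel could split that loop across threads, with a barrier between levels.
    """
    x = [0.0] * len(b)
    for rows in level_sets(indptr, indices):
        for i in rows:
            s = b[i]
            diag = None
            for k in range(indptr[i], indptr[i + 1]):
                j = indices[k]
                if j == i:
                    diag = data[k]
                else:
                    s -= data[k] * x[j]
            x[i] = s / diag
    return x


# Example: L = [[2, 0, 0], [0, 4, 0], [1, 1, 1]], b = [2, 4, 5]
indptr = [0, 1, 2, 5]
indices = [0, 1, 0, 1, 2]
data = [2.0, 4.0, 1.0, 1.0, 1.0]
groups = level_sets(indptr, indices)
x = solve_lower(indptr, indices, data, [2.0, 4.0, 5.0])
```

In this example rows 0 and 1 fall in the first level and row 2 in the second, so at most two rows can be processed concurrently. The number of levels and the number of rows per level bound the parallelism available to any threaded version of the kernel.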