Optimization of geometric multigrid for emerging multi- and manycore processors

  • Authors:
  • Samuel Williams; Dhiraj D. Kalamkar; Amik Singh; Anand M. Deshpande; Brian Van Straalen; Mikhail Smelyanskiy; Ann Almgren; Pradeep Dubey; John Shalf; Leonid Oliker

  • Affiliations:
  • Lawrence Berkeley National Laboratory; Intel Corporation; University of California Berkeley; Intel Corporation; Lawrence Berkeley National Laboratory; Intel Corporation; Lawrence Berkeley National Laboratory; Intel Corporation; Lawrence Berkeley National Laboratory; Lawrence Berkeley National Laboratory

  • Venue:
  • SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
  • Year:
  • 2012

Abstract

Multigrid methods are widely used to accelerate the convergence of iterative solvers for linear systems in many application areas. In this paper, we explore optimization techniques for geometric multigrid on existing and emerging multicore systems, including the Opteron-based Cray XE6, InfiniBand clusters based on Intel® Xeon® E5-2670 and X5550 processors, and the new Intel® Xeon Phi™ coprocessor (Knights Corner). Our work examines a variety of novel techniques, including communication aggregation, threaded wavefront-based DRAM communication avoidance, dynamic threading decisions, SIMDization, and operator fusion. We quantify performance through each phase of the V-cycle for both single-node and distributed-memory experiments and provide detailed analysis for each class of optimization. Results show that our optimizations yield significant speedups across a variety of subdomain sizes while demonstrating the potential of multi- and manycore processors to dramatically accelerate single-node performance. However, our analysis also indicates that improvements in networks and communication will be essential to reap the potential of manycore processors in large-scale multigrid calculations.
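
For readers unfamiliar with the phases of a V-cycle mentioned in the abstract (smoothing, residual computation, restriction, coarse-grid solve, interpolation), the following is a minimal illustrative sketch. It is not the paper's implementation: it assumes a 1D Poisson problem with zero Dirichlet boundaries, a weighted Jacobi smoother, full-weighting restriction, and linear interpolation, and it uses none of the optimizations the paper studies. All function names (`residual`, `smooth`, `restrict`, `prolong`, `v_cycle`, `demo`) are hypothetical and chosen for illustration only.

```python
import math

def residual(u, f, h):
    """r = f - A u, where (A u)[i] = (-u[i-1] + 2 u[i] - u[i+1]) / h^2
    and values outside the grid are zero (Dirichlet boundaries)."""
    n = len(u)
    r = [0.0] * n
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        r[i] = f[i] - (2.0 * u[i] - left - right) / (h * h)
    return r

def smooth(u, f, h, sweeps, omega=2.0 / 3.0):
    """Weighted Jacobi: u <- u + omega * D^{-1} (f - A u), with D = 2 / h^2."""
    for _ in range(sweeps):
        r = residual(u, f, h)
        u = [ui + omega * (h * h / 2.0) * ri for ui, ri in zip(u, r)]
    return u

def restrict(r):
    """Full weighting; coarse point j coincides with fine index 2j + 1."""
    nc = (len(r) - 1) // 2
    return [0.25 * r[2 * j] + 0.5 * r[2 * j + 1] + 0.25 * r[2 * j + 2]
            for j in range(nc)]

def prolong(ec, n):
    """Linear interpolation of a coarse correction onto n fine points."""
    e = [0.0] * n
    for j, v in enumerate(ec):
        e[2 * j + 1] += v
        e[2 * j] += 0.5 * v
        e[2 * j + 2] += 0.5 * v
    return e

def v_cycle(u, f, h, pre=2, post=2):
    """One V-cycle on a grid of n = 2^k - 1 interior points."""
    n = len(u)
    if n == 1:
        return [f[0] * h * h / 2.0]           # exact solve on the coarsest grid
    u = smooth(u, f, h, pre)                   # pre-smoothing
    rc = restrict(residual(u, f, h))           # restrict the residual
    ec = v_cycle([0.0] * len(rc), rc, 2.0 * h, pre, post)  # coarse correction
    e = prolong(ec, n)                         # interpolate the correction
    u = [ui + ei for ui, ei in zip(u, e)]
    return smooth(u, f, h, post)               # post-smoothing

def demo(n=63, cycles=10):
    """Solve -u'' = pi^2 sin(pi x) and return the residual reduction factor."""
    h = 1.0 / (n + 1)
    f = [math.pi ** 2 * math.sin(math.pi * (i + 1) * h) for i in range(n)]
    u = [0.0] * n
    r0 = max(abs(x) for x in residual(u, f, h))
    for _ in range(cycles):
        u = v_cycle(u, f, h)
    return max(abs(x) for x in residual(u, f, h)) / r0
```

Calling `demo()` returns the ratio of the final to the initial residual norm, which should fall by several orders of magnitude in a handful of cycles. In a production 3D solver, each of these phases is a separate memory-bound kernel, which is why techniques such as operator fusion and communication avoidance target them.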