Performance analysis and optimization of MPI collective operations on multi-core clusters

  • Authors:
  • Bibo Tu;Jianping Fan;Jianfeng Zhan;Xiaofang Zhao

  • Affiliations:
  • Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China 100190;Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China 100190 and Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China 518067;Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China 100190;Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China 100190

  • Venue:
  • The Journal of Supercomputing
  • Year:
  • 2012

Abstract

The memory hierarchy of multi-core clusters has two dimensions: a vertical memory hierarchy and a horizontal memory hierarchy. This paper proposes a new parallel computation model that uniformly abstracts the memory hierarchy of multi-core clusters at both the vertical and horizontal levels. Experimental results show that the new model predicts communication costs for message passing on multi-core clusters more accurately than previous models, which incorporate only the vertical memory hierarchy. The new model provides the theoretical underpinning for the optimal design of MPI collective operations. Targeting the horizontal memory hierarchy, our methodology for optimizing collective operations on multi-core clusters focuses on a hierarchical virtual topology and cache-aware intra-node communication, both incorporated into the existing collective algorithms in MPICH2. As a case study, a multi-core-aware broadcast algorithm has been implemented and evaluated. The performance evaluation shows that this methodology for optimizing collective operations on multi-core clusters is efficient.
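To illustrate the idea of a hierarchical virtual topology for broadcast, the following is a minimal sketch in plain Python that simulates message schedules without any MPI library. It is not the paper's implementation: the function names (`binomial_bcast`, `hierarchical_bcast`, `count_inter_node`), the choice of a binomial tree at both levels, the leader-selection rule, and the block placement of ranks on nodes are all assumptions made for illustration. The sketch compares how many messages cross node boundaries in a topology-unaware binomial broadcast versus a two-phase broadcast that first reaches one leader per node and then fans out inside each node.

```python
from collections import defaultdict


def binomial_bcast(members, root):
    """Return the (sender, receiver) pairs of a binomial-tree broadcast
    over `members`, starting at `root`."""
    order = [root] + [m for m in members if m != root]
    n = len(order)
    sends = []
    step = 1
    while step < n:
        # Every position that already holds the data forwards it
        # `step` positions ahead; the set of holders doubles each round.
        for i in range(step):
            j = i + step
            if j < n:
                sends.append((order[i], order[j]))
        step *= 2
    return sends


def hierarchical_bcast(node_of, root):
    """Two-phase broadcast: inter-node among leaders, then intra-node.

    `node_of` maps each rank to a node id (hypothetical placement map).
    Returns (inter_phase_sends, intra_phase_sends).
    """
    groups = defaultdict(list)
    for rank, node in sorted(node_of.items()):
        groups[node].append(rank)
    # Assumed leader rule: the root leads its own node,
    # the lowest rank leads every other node.
    leaders = {
        node: (root if root in ranks else ranks[0])
        for node, ranks in groups.items()
    }
    inter = binomial_bcast(sorted(leaders.values()), root)
    intra = []
    for node, ranks in groups.items():
        intra += binomial_bcast(ranks, leaders[node])
    return inter, intra


def count_inter_node(sends, node_of):
    """Count the messages whose sender and receiver sit on different nodes."""
    return sum(1 for s, r in sends if node_of[s] != node_of[r])


# Example: 4 nodes with 4 cores each, ranks placed block-wise (assumed).
node_of = {rank: rank // 4 for rank in range(16)}
flat = binomial_bcast(list(range(16)), root=0)
inter, intra = hierarchical_bcast(node_of, root=0)
print("flat inter-node messages:", count_inter_node(flat, node_of))
print("hierarchical inter-node messages:", count_inter_node(inter + intra, node_of))
```

Under these assumptions both schedules send 15 messages in total, but the flat schedule sends 12 of them across nodes while the hierarchical one sends only 3, leaving the rest to intra-node communication, where cache-aware transfers can be applied.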