Global Tiling for Communication Minimal Parallelization on Distributed Memory Systems

  • Authors:
  • Lei Liu; Li Chen; Chengyong Wu; Xiao-Bing Feng

  • Affiliations:
  • Key Laboratory of Computer Architecture Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China 100080 and Graduate School of the Chinese Academy of Sciences, Beijing, China ...;Key Laboratory of Computer Architecture Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China 100080;Key Laboratory of Computer Architecture Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China 100080;Key Laboratory of Computer Architecture Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China 100080

  • Venue:
  • Euro-Par '08 Proceedings of the 14th international Euro-Par conference on Parallel Processing
  • Year:
  • 2008

Abstract

Most previous studies on tiling focus on partitioning the iteration space. On distributed memory parallel systems, however, the decomposition of computation and the distribution of data must be handled together in order to balance load and minimize data migration. In this paper, we formulate the problem of globally optimal tiling as a 0-1 integer linear program that minimizes total execution time. To simplify the selection of tiling parameters, we restrict tiles to semi-oblique shapes and present two effective approaches for deciding the tile shape in multi-dimensional semi-oblique tiling. We also present a tile-to-processor mapping scheme based on hyperplanes, which can express diverse forms of parallelism and achieves better performance than traditional methods. Experiments with the NPB2.3-serial SP and LU benchmarks on a Qsnet-connected cluster achieved average parallel efficiencies of 87% and 73%, respectively.
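
As a rough illustration of what a hyperplane-based tile-to-processor mapping can look like, the sketch below assigns every tile lying on the same hyperplane to the same processor and wraps the hyperplanes cyclically over the available processors. This is a generic textbook-style scheme written under our own assumptions, not the mapping defined in the paper: the function names `map_tile_to_processor` and `processor_loads`, the `normal` vector, and the cyclic wrap-around are all hypothetical choices made for this example.

```python
from itertools import product


def map_tile_to_processor(tile, normal, num_procs):
    """Return the processor index for a tile (hypothetical helper).

    Tiles whose coordinates give the same dot product with `normal`
    lie on the same hyperplane and are sent to the same processor.
    Hyperplanes are wrapped cyclically over `num_procs` processors.
    """
    return sum(n * t for n, t in zip(normal, tile)) % num_procs


def processor_loads(tile_counts, normal, num_procs):
    """Count how many tiles each processor receives (hypothetical helper).

    `tile_counts` gives the number of tiles along each dimension of
    the tiled iteration space.
    """
    loads = [0] * num_procs
    for tile in product(*(range(count) for count in tile_counts)):
        loads[map_tile_to_processor(tile, normal, num_procs)] += 1
    return loads


# A 4 x 4 grid of tiles on 4 processors.
# normal = (1, 0): each processor owns one row of tiles.
# normal = (1, 1): each processor owns a set of wrapped anti-diagonals.
row_loads = processor_loads((4, 4), (1, 0), 4)
diagonal_loads = processor_loads((4, 4), (1, 1), 4)
```

Changing `normal` changes which tiles share a processor without touching the tiling itself, which is one way a single mapping rule can describe different parallelization choices. In this small case both normals give every processor four tiles, so they differ in communication pattern rather than in load.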