(R) Polynomial-Time Nested Loop Fusion with Full Parallelism

  • Authors:
  • Chenhua Lang

  • Venue:
  • ICPP '96: Proceedings of the 1996 International Conference on Parallel Processing - Volume 3
  • Year:
  • 1996

Abstract

Data locality and synchronization overhead are two important factors that affect the performance of applications on multiprocessors. Loop fusion is an effective way of reducing synchronization and improving data locality. Traditional fusion techniques, however, either cannot handle nested loops that contain fusion-preventing dependences or cannot achieve good parallelism after fusion. This paper improves significantly on them by presenting several efficient polynomial-time algorithms that solve both problems. Combined with the retiming technique, these algorithms allow nested loop fusion in the presence of outermost loop-carried dependences as well as fusion-preventing dependences. Furthermore, the technique is proved to achieve fully parallel execution of the fused loops.
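
To make the terms in the abstract concrete, the following is a minimal illustrative sketch, not the paper's algorithms. It assumes a hypothetical pair of inner loops in which the second loop reads `a[i][j + 1]`, an element the first loop has not yet written when the two are fused naively (a fusion-preventing dependence). Delaying the second statement by one outer iteration, a hand-applied shift in the spirit of retiming, moves that dependence onto the outer loop, so the fused inner loop can run in any order. All function and variable names are assumptions made for this example.

```python
def unfused(b):
    """Reference: loop A completes for row i before loop B starts."""
    n, m = len(b), len(b[0])
    a = [[0] * m for _ in range(n)]
    c = [[0] * (m - 1) for _ in range(n)]
    for i in range(n):
        for j in range(m):            # loop A
            a[i][j] = b[i][j] + 1
        for j in range(m - 1):        # loop B reads the element to the right
            c[i][j] = a[i][j + 1] * 2
    return c


def naive_fused(b):
    """Illegal fusion: a[i][j + 1] is read before loop A has written it."""
    n, m = len(b), len(b[0])
    a = [[0] * m for _ in range(n)]
    c = [[0] * (m - 1) for _ in range(n)]
    for i in range(n):
        for j in range(m):
            a[i][j] = b[i][j] + 1
            if j < m - 1:
                c[i][j] = a[i][j + 1] * 2   # still holds the initial 0
    return c


def shifted_fused(b, reverse_inner=False):
    """Fusion after delaying statement B by one outer iteration.

    Iteration (i, j) writes a[i][j] and reads only row i - 1 of a, so no
    dependence remains between iterations of the inner loop.
    """
    n, m = len(b), len(b[0])
    a = [[0] * m for _ in range(n)]
    c = [[0] * (m - 1) for _ in range(n)]
    for i in range(n + 1):            # one extra iteration drains statement B
        cols = range(m - 1, -1, -1) if reverse_inner else range(m)
        for j in cols:                # order is irrelevant: no inner dependence
            if i < n:
                a[i][j] = b[i][j] + 1
            if i >= 1 and j < m - 1:
                c[i - 1][j] = a[i - 1][j + 1] * 2
    return c
```

For `b = [[1, 2, 3], [4, 5, 6]]`, `unfused(b)` and `shifted_fused(b)` both return `[[6, 8], [12, 14]]`, and so does `shifted_fused(b, reverse_inner=True)`, which is what independence of the inner iterations implies. `naive_fused(b)` returns `[[0, 0], [0, 0]]` because it reads values that have not been computed yet. The guards on `i` play the role of a prologue and epilogue; how the shift amounts are chosen in general is the subject of the paper and is not reproduced here.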