Optimizing scientific application loops on stream processors

  • Authors:
  • Li Wang;Xuejun Yang;Jingling Xue;Yu Deng;Xiaobo Yan;Tao Tang;Quan Hoang Nguyen

  • Affiliations:
  • NUDT, ChangSha, China;NUDT, ChangSha, China;UNSW, Sydney, Australia;NUDT, ChangSha, China;NUDT, ChangSha, China;NUDT, ChangSha, China;UNSW, Sydney, Australia

  • Venue:
  • Proceedings of the 2008 ACM SIGPLAN-SIGBED conference on Languages, compilers, and tools for embedded systems
  • Year:
  • 2008

Abstract

This paper describes a graph coloring compiler framework that allocates on-chip SRF (Stream Register File) storage to optimize scientific applications on stream processors. The framework first applies enabling optimizations such as loop unrolling to expose stream reuse and opportunities for maximizing parallelism, i.e., overlapping kernel execution and memory transfers. The three SRF management tasks are then solved in a unified manner via graph coloring: (1) placing streams in the SRF, (2) exploiting stream reuse, and (3) maximizing parallelism. We evaluate the performance of our compiler framework by running nine representative scientific computing kernels on our FT64 stream processor. Our preliminary results show that compiler management achieves an average speedup of 2.3x compared to First-Fit allocation. Compared with running the same benchmarks on an Itanium 2, we observe an average speedup of 2.1x.
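To illustrate the general idea of allocating stream storage by graph coloring, here is a minimal sketch. It is not the paper's algorithm: the abstract gives no details, so the live-range model, the equal-sized SRF buffers, the greedy coloring heuristic, and the names `build_interference` and `color_streams` are all assumptions made for illustration. Streams whose live ranges overlap are joined by an edge, and each color stands for one SRF buffer that non-interfering streams can share.

```python
from typing import Dict, Set, Tuple

# Hypothetical input: stream name -> (first step, last step), inclusive.
# A "step" is an assumed position in the kernel/memory-operation schedule.
LiveRanges = Dict[str, Tuple[int, int]]


def build_interference(ranges: LiveRanges) -> Dict[str, Set[str]]:
    """Connect two streams when their live ranges overlap."""
    graph: Dict[str, Set[str]] = {name: set() for name in ranges}
    names = list(ranges)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            a_start, a_end = ranges[a]
            b_start, b_end = ranges[b]
            if a_start <= b_end and b_start <= a_end:
                graph[a].add(b)
                graph[b].add(a)
    return graph


def color_streams(graph: Dict[str, Set[str]]) -> Dict[str, int]:
    """Greedy coloring: visit high-degree streams first and give each
    the lowest color not used by an already-colored neighbor."""
    order = sorted(graph, key=lambda s: (-len(graph[s]), s))
    colors: Dict[str, int] = {}
    for stream in order:
        taken = {colors[n] for n in graph[stream] if n in colors}
        color = 0
        while color in taken:
            color += 1
        colors[stream] = color
    return colors


# Assumed example: four streams with staggered live ranges.
ranges = {"a": (0, 2), "b": (1, 3), "c": (3, 5), "d": (4, 6)}
graph = build_interference(ranges)
buffers = color_streams(graph)  # stream -> SRF buffer index
```

In this assumed example the four streams fit in two buffers, because `a` and `c` are never live at the same time, and neither are `b` and `d`. A real SRF allocator would also have to handle streams of different sizes and the overlap of kernel execution with memory transfers, which this sketch ignores.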