Optimizing Overall Loop Schedules Using Prefetching and Partitioning

  • Authors: Fei Chen; Timothy W. O'Neil; Edwin H.-M. Sha

  • Affiliations: Univ. of Notre Dame, Notre Dame, IN (all three authors)

  • Venue: IEEE Transactions on Parallel and Distributed Systems

  • Year: 2000

Abstract

This paper proposes Partition Scheduling with Prefetching (PSP), a method that combines loop pipelining with data prefetching. PSP first divides the iteration space into regular partitions. It then produces a two-part schedule, consisting of an ALU part and a memory part, and balances the two parts to achieve high throughput. Because the two parts execute simultaneously, remote memory latencies are overlapped with computation. We study the optimal partition shape and size needed to obtain a well-balanced overall schedule. Experiments on DSP benchmarks show that the proposed methodology consistently produces optimal or near-optimal solutions.
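
The balancing idea in the abstract can be illustrated with a toy cost model. The sketch below is not the paper's algorithm: it assumes a rectangular partition of a two-dimensional iteration space, an ALU part whose length grows with the partition area, and a memory part whose length grows with the partition boundary. The names `partition_cost` and `best_partition` and the parameters `alu_cycles`, `fetch_cycles`, and `capacity` are all hypothetical, chosen only for illustration.

```python
def partition_cost(w, h, alu_cycles=2, fetch_cycles=10):
    """Toy cost of one w-by-h partition (assumed model, not from the paper).

    ALU part: every iteration in the partition costs alu_cycles.
    Memory part: prefetches are assumed proportional to the boundary (w + h).
    The two parts run simultaneously, so the partition takes the longer one.
    """
    alu_part = w * h * alu_cycles
    mem_part = (w + h) * fetch_cycles
    return alu_part, mem_part, max(alu_part, mem_part)


def best_partition(max_w, max_h, capacity, alu_cycles=2, fetch_cycles=10):
    """Exhaustively pick the partition with the lowest cycles per iteration.

    capacity is a hypothetical local-memory limit on iterations per partition.
    Ties are broken in favour of the most balanced ALU and memory parts.
    """
    best_key, best_shape = None, None
    for w in range(1, max_w + 1):
        for h in range(1, max_h + 1):
            if w * h > capacity:
                continue  # partition would not fit in local memory
            alu, mem, total = partition_cost(w, h, alu_cycles, fetch_cycles)
            key = (total / (w * h), abs(alu - mem))
            if best_key is None or key < best_key:
                best_key, best_shape = key, (w, h)
    return best_shape, best_key[0]


if __name__ == "__main__":
    print(partition_cost(4, 4))          # memory part dominates a small partition
    print(best_partition(10, 10, 100))   # search up to a 10-by-10 partition
```

Under these assumptions, a small partition is memory-bound because the boundary is large relative to the area, while a larger partition hides the prefetch latency entirely behind computation. This is the sense in which partition shape and size determine whether the overall schedule is well balanced.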