Compiler-based prefetching for recursive data structures

  • Authors:
  • Chi-Keung Luk;Todd C. Mowry

  • Affiliations:
  • Department of Computer Science, Department of Electrical and Computer Engineering, University of Toronto, Toronto, Canada M5S 3G4;Department of Computer Science, Department of Electrical and Computer Engineering, University of Toronto, Toronto, Canada M5S 3G4

  • Venue:
  • Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
  • Year:
  • 1996

Quantified Score

Hi-index 0.01

Visualization

Abstract

Software-controlled data prefetching offers the potential for bridging the ever-increasing speed gap between the memory subsystem and today's high-performance processors. While prefetching has enjoyed considerable success in array-based numeric codes, its potential in pointer-based applications has remained largely unexplored. This paper investigates compiler-based prefetching for pointer-based applications---in particular, those containing recursive data structures. We identify the fundamental problem in prefetching pointer-based data structures and propose a guideline for devising successful prefetching schemes. Based on this guideline, we design three prefetching schemes, we automate the most widely applicable scheme (greedy prefetching) in an optimizing research compiler, and we evaluate the performance of all three schemes on a modern superscalar processor similar to the MIPS R10000. Our results demonstrate that compiler-inserted prefetching can significantly improve the execution speed of pointer-based codes---as much as 45% for the applications we study. In addition, the more sophisticated algorithms (which we currently perform by hand, but which might be implemented in future compilers) can improve performance by as much as twofold. Compared with the only other compiler-based pointer prefetching scheme in the literature, our algorithms offer substantially better performance by avoiding unnecessary overhead and hiding more latency.