Interprocedural optimizations for improving data cache performance of array-intensive embedded applications

  • Authors: W. Zhang; G. Chen; M. Kandemir; M. Karakoy

  • Affiliations: The Pennsylvania State University, University Park, PA; The Pennsylvania State University, University Park, PA; The Pennsylvania State University, University Park, PA; Imperial College, London

  • Venue: Proceedings of the 40th annual Design Automation Conference
  • Year: 2003

Abstract

As the datasets processed by embedded processors grow in size and complexity, managing the higher levels of the memory hierarchy (e.g., caches) is becoming an important issue. A major limitation of most cache locality optimization techniques proposed in previous research is that they handle a single procedure at a time. This prevents compilers from capturing the data access interactions between procedures and may result in poor performance. In this paper, we look at loop and data transformations from a different angle and use them in an interprocedural optimization framework. Working on the call graph representation of a given application, the proposed technique visits each node of this graph twice and applies loop and data transformations systematically to optimize array layouts across the whole program. Our experimental results show that this interprocedural locality optimization strategy is much more effective than previous locality-oriented techniques that handle each procedure in isolation.
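The two-visit traversal described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm; the abstract does not say in which order the nodes are visited or how layouts are scored, so everything below is an assumption made for illustration: the toy call graph, the per-procedure layout preferences and their weights, the bottom-up then top-down visiting order, the weighted-vote rule, and all names (`CALL_GRAPH`, `LOCAL_PREFS`, `postorder`, `choose_layouts`). The first visit summarizes layout preferences from callees up to the root; the second visit imposes the chosen program-wide layouts and records where a procedure would need a loop transformation because its own access pattern disagrees.

```python
from collections import Counter

# Hypothetical toy program. Each procedure lists its callees and, per array,
# the layout its loops prefer together with an assumed access weight.
CALL_GRAPH = {
    "main": ["filter", "reduce"],
    "filter": ["reduce"],
    "reduce": [],
}
LOCAL_PREFS = {
    "main": {"A": ("row", 10)},
    "filter": {"A": ("col", 500), "B": ("row", 200)},
    "reduce": {"A": ("col", 300), "B": ("col", 50)},
}


def postorder(graph, root):
    """Return procedures with callees before callers (assumes no recursion)."""
    seen, order = set(), []

    def visit(node):
        if node in seen:
            return
        seen.add(node)
        for callee in graph[node]:
            visit(callee)
        order.append(node)

    visit(root)
    return order


def choose_layouts(graph, prefs, root="main"):
    order = postorder(graph, root)

    # Visit 1 (assumed bottom-up): each node's summary is its own weighted
    # preferences plus the summaries of its callees, so a procedure called
    # from two sites contributes twice.
    summary = {}
    for proc in order:
        total = Counter()
        for array, (layout, weight) in prefs[proc].items():
            total[(array, layout)] += weight
        for callee in graph[proc]:
            total.update(summary[callee])  # Counter.update adds the counts
        summary[proc] = total

    # Pick one layout per array from the root summary (highest weight wins).
    best = {}
    for (array, layout), weight in summary[root].items():
        if array not in best or weight > best[array][1]:
            best[array] = (layout, weight)
    layouts = {array: layout for array, (layout, _) in best.items()}

    # Visit 2 (assumed top-down): impose the chosen layouts and note the
    # arrays whose local preference conflicts, i.e. where the procedure's
    # loops would have to be transformed (for example, interchanged).
    needs_loop_transform = {}
    for proc in reversed(order):
        needs_loop_transform[proc] = sorted(
            array
            for array, (layout, _) in prefs[proc].items()
            if layout != layouts[array]
        )
    return layouts, needs_loop_transform


if __name__ == "__main__":
    layouts, todo = choose_layouts(CALL_GRAPH, LOCAL_PREFS)
    print(layouts)
    print(todo)
```

In this toy setup a per-procedure optimizer would give `main` a row-major `A` and `reduce` a column-major `B`, while the whole-program view settles on one layout per array and leaves the disagreeing procedures to adapt their loops instead. That is the kind of cross-procedure interaction the abstract says single-procedure techniques cannot capture.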