VHC: Quickly Building an Optimizer for Complex Embedded Architectures

  • Authors:
  • Michael Dupré; Nathalie Drach; Olivier Temam

  • Venue:
  • Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
  • Year:
  • 2004

Abstract

To meet the high demand for powerful embedded processors, VLIW architectures are increasingly complex (e.g., multiple clusters), and moreover, they now run increasingly sophisticated control-intensive applications. As a result, developing architecture-specific compiler optimizations is becoming both increasingly critical and complex, while time-to-market constraints remain very tight. In this article, we present a novel program optimization approach, called the Virtual Hardware Compiler (VHC), that can perform as well as static compiler optimizations, but which requires far less compiler development effort, even for complex VLIW architectures and complex target applications. The principle is to augment the target processor simulator with superscalar-like features, observe how the target program is dynamically optimized during execution, and deduce an optimized binary for the static VLIW architecture. Developing an architecture-specific optimizer then amounts to modifying the processor simulator, which is very fast compared to adapting static compiler optimizations to an architecture. We also show that a VHC-optimized binary trained on a number of data sets performs as well as a statically optimized binary on other test data sets. The only drawback of the approach is a substantially longer compilation time, which is often acceptable for embedded applications and devices. Using the Texas Instruments C62 VLIW processor and the associated compiler, we experimentally show that this approach performs as well as static compiler optimizations for a much lower research and development effort. Using a single-core C60 processor and a dual-core clustered C62 processor, we also show that the same approach can be used for efficiently retargeting binary programs within a family of processors.