Mining opportunities for code improvement in a just-in-time compiler

  • Authors:
  • Adam Jocksch;Marcel Mitran;Joran Siu;Nikola Grcevski;José Nelson Amaral

  • Affiliations:
  • Department of Computing Science, University of Alberta, Edmonton, Canada;IBM Toronto Software Laboratory, Toronto, Canada;IBM Toronto Software Laboratory, Toronto, Canada;IBM Toronto Software Laboratory, Toronto, Canada;Department of Computing Science, University of Alberta, Edmonton, Canada

  • Venue:
  • CC'10/ETAPS'10 Proceedings of the 19th joint European conference on Theory and Practice of Software, international conference on Compiler Construction
  • Year:
  • 2010

Abstract

The productivity of a compiler development team depends on its ability not only to design effective solutions to known code generation problems, but also to uncover potential code improvement opportunities. This paper describes a data mining tool that can be used to identify such opportunities based on a combination of hardware-profiling data and compiler-generated counters. This data is combined into an Execution Flow Graph (EFG), and FlowGSP, a new data mining algorithm, then finds sequences of attributes associated with subpaths of the EFG. Many examples of important opportunities for code improvement in the IBM® Testarossa compiler are described to illustrate the usefulness of this data mining technique. This mining tool is especially useful for programs whose execution is not dominated by a small set of frequently executed loops. Information about the amount of space and time required to run the mining tool is also provided. In comparison with manual search through the data, the mining tool saved a significant amount of compiler development time and effort.