Identifying potential parallelism via loop-centric profiling

  • Authors: Tipp Moseley; Daniel A. Connors; Dirk Grunwald; Ramesh Peri

  • Affiliations: University of Colorado at Boulder, Boulder, CO; University of Colorado at Boulder, Boulder, CO; University of Colorado at Boulder, Boulder, CO; Intel Corporation, Austin, TX

  • Venue: Proceedings of the 4th International Conference on Computing Frontiers
  • Year: 2007

Abstract

The transition to multithreaded, multi-core designs places greater responsibility on programmers and software for improving performance: thread-level parallelism (TLP) will be increasingly relied upon in addition to instruction-level parallelism (ILP) and increased clock frequency. Deciding where to parallelize code is difficult, especially for large, complex applications or those whose original developers have moved on. Outer loops are relatively easy targets for parallelization, but traditional profilers focus primarily on functions and hot inner loops. To aid programmers' parallelization efforts, we introduce the concept of loop-centric profiling, which provides a hierarchical view of how much time is spent in a loop and in the loops nested within it.

This paper introduces two techniques for loop profiling. First, we describe an instrumentation-based approach that gathers highly detailed and accurate information about loop behavior. Second, we present a sampling approach that achieves similar results with negligible overhead. The paper concludes with a case study evaluating the tool on several SPEC 2000 benchmarks.
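
As an illustration of the hierarchical view described in the abstract, the following is a minimal sketch of instrumentation-style loop profiling. It is not the authors' tool, whose mechanism the abstract does not detail. The class `LoopProfiler`, its `loop` and `report` methods, and the `work` function are hypothetical names introduced here, and the sketch assumes loops are wrapped by hand at the source level. Each loop is keyed by its nesting path, so time spent in an inner loop is also counted toward every loop that encloses it.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class LoopProfiler:
    """Hypothetical profiler that attributes time to loop nesting paths."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock                   # injectable clock (assumption, eases testing)
        self._stack = []                      # names of the loops currently executing
        self.inclusive = defaultdict(float)   # nesting path -> total inclusive time
        self.entries = defaultdict(int)       # nesting path -> number of times entered

    @contextmanager
    def loop(self, name):
        """Wrap one whole execution of a loop (all of its iterations)."""
        self._stack.append(name)
        path = tuple(self._stack)
        start = self._clock()
        try:
            yield
        finally:
            self.inclusive[path] += self._clock() - start
            self.entries[path] += 1
            self._stack.pop()

    def report(self):
        """Return (path, inclusive, self_time, entries) rows, outer loops first."""
        rows = []
        for path in sorted(self.inclusive):
            # Time in directly nested loops is subtracted to obtain self time.
            nested = sum(
                t for p, t in self.inclusive.items()
                if len(p) == len(path) + 1 and p[:len(path)] == path
            )
            total = self.inclusive[path]
            rows.append((path, total, total - nested, self.entries[path]))
        return rows


def work(profiler, n=3, m=4):
    """Toy workload with an outer loop and one nested inner loop."""
    total = 0
    with profiler.loop("outer"):
        for i in range(n):
            with profiler.loop("inner"):
                for j in range(m):
                    total += i * j
    return total


if __name__ == "__main__":
    profiler = LoopProfiler()
    work(profiler, n=200, m=2000)
    for path, inclusive, self_time, entries in profiler.report():
        print(" > ".join(path), f"{inclusive:.6f}s", f"{self_time:.6f}s", entries)
```

In this sketch an outer loop with a large inclusive time and a small self time indicates that most of its cost lies in nested loops, which is the kind of information a function-level profile does not expose directly. Wrapping every loop execution this way adds measurable overhead, which is the usual motivation for a sampling-based alternative.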