SM-prof: a tool to visualise and find cache coherence performance bottlenecks in multiprocessor programs

  • Authors: Mats Brorsson
  • Affiliations: Department of Computer Engineering, Lund University, P.O. Box 118, S-221 00 LUND, Sweden
  • Venue: Proceedings of the 1995 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
  • Year: 1995

Abstract

Cache misses due to coherence actions are often the major source of performance degradation in cache-coherent multiprocessors. It is often difficult for the programmer to take cache coherence into account when writing a program, since the resulting access pattern is not apparent until the program is executed.

SM-prof is a performance analysis tool that addresses this problem by visualising the shared data access pattern in a diagram with links to the source code lines that cause performance-degrading access patterns. The execution of a program is divided into time slots, and each data block is classified based on the accesses made to the block during a time slot. This enables the programmer to follow the execution over time and to track down the exact source position responsible for accesses that cause many coherence-related cache misses.

Matrix multiplication and the MP3D application from SPLASH are used to illustrate the use of SM-prof. For MP3D, SM-prof revealed performance limitations that resulted in a performance improvement of over 75%.

The current implementation is based on program-driven simulation in order to achieve non-intrusive profiling. If a small perturbation of the program execution is acceptable, it is also possible to use software tracing techniques, provided that a data address can be related to the originating instruction.
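
The time-slot classification described in the abstract can be illustrated with a minimal Python sketch. This is not SM-prof's implementation: the trace record layout, the function name `classify_blocks`, the default block size, and the three class labels are all assumptions made for illustration, since the abstract does not specify the tool's actual categories or data formats.

```python
from collections import defaultdict


def classify_blocks(trace, slot_length, block_size=32):
    """Classify each data block per time slot from a memory access trace.

    trace: iterable of (time, cpu, address, is_write, source_line) tuples.
    This record layout is hypothetical, not SM-prof's trace format.
    Returns {(slot, block): (label, sorted source lines)}.
    """
    # Per (slot, block): which CPUs read, which wrote, and from which lines.
    slots = defaultdict(
        lambda: {"readers": set(), "writers": set(), "lines": set()}
    )
    for time, cpu, address, is_write, source_line in trace:
        key = (time // slot_length, address // block_size)
        entry = slots[key]
        (entry["writers"] if is_write else entry["readers"]).add(cpu)
        entry["lines"].add(source_line)

    result = {}
    for key, entry in slots.items():
        cpus = entry["readers"] | entry["writers"]
        # Illustrative labels only; the paper's categories may differ.
        if len(cpus) == 1:
            label = "private"
        elif not entry["writers"]:
            label = "shared-read-only"
        else:
            label = "shared-read-write"
        result[key] = (label, sorted(entry["lines"]))
    return result
```

Blocks labelled `shared-read-write` in a slot are the ones most likely to incur coherence misses, and the recorded source lines give the link back to the code, which is the kind of navigation the abstract describes.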