Using the SMiLE Monitoring Infrastructure to Detect and Lower the Inefficiency of Parallel Applications

  • Authors:
  • Jie Tao; Wolfgang Karl; Martin Schulz

  • Venue:
  • HPCN Europe 2000: Proceedings of the 8th International Conference on High-Performance Computing and Networking
  • Year:
  • 2000

Abstract

High computational demands are one of the main reasons for using parallel architectures such as clusters of PCs. Many parallel programs, however, suffer from severe inefficiencies when executed on such loosely coupled architectures. The causes vary, but one of the most important is frequent access to remote memory. In this article, we present a hybrid event-driven monitoring system that uses a hardware monitor to observe all underlying network transactions and delivers information about the run-time behavior of parallel programs to performance analysis and debugging tools. The monitoring system targets cluster architectures with NUMA characteristics.