Performance analysis for teraflop computers: a distributed automatic approach

  • Authors:
  • Michael Gerndt;Andreas Schmidt;Martin Schulz;Roland Wismüller

  • Affiliations:
  • Institut für Informatik, LRR, Technische Universität München;Institut für Informatik, LRR, Technische Universität München;Institut für Informatik, LRR, Technische Universität München;Institut für Informatik, LRR, Technische Universität München

  • Venue:
  • EUROMICRO-PDP'02 Proceedings of the 10th Euromicro conference on Parallel, distributed and network-based processing
  • Year:
  • 2002

Abstract

Performance analysis for applications on teraflop computers requires a new combination of concepts: online processing, automation, and distribution. This article presents the design of a new analysis system that performs an automatic search for performance problems. This search is guided by a specification of performance properties based on the APART Specification Language. The system is being implemented as a network of analysis agents arranged in a hierarchy. Higher-level agents search for global performance problems, while lower-level agents search for local performance problems. Leaf agents request and receive performance data from the monitoring library linked to the application. Our online analysis also takes into account design patterns for parallel applications. These patterns make the analysis more effective and the output more application-related. The analysis is currently being implemented for the Hitachi SR8000 teraflop computer at the Leibniz-Rechenzentrum in Munich within the Peridot project.
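
The agent hierarchy described in the abstract can be illustrated with a minimal Python sketch. It is not the Peridot implementation: the class names (LeafAgent, InnerAgent), the metric keys (total_time, comm_time), the two example properties, the 0.2 threshold, and the dictionary standing in for the monitoring library are all assumptions made for illustration. Leaf agents check a local property on one process, and higher-level agents check a global property across everything below them.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Metrics = Dict[str, float]
Finding = Tuple[str, str, float]  # (scope, property name, severity)


@dataclass
class LeafAgent:
    """Searches for a local problem in the data of one process."""
    pid: int
    monitor: Callable[[int], Metrics]  # hypothetical monitoring interface
    threshold: float = 0.2             # assumed severity cut-off

    def analyze(self) -> Tuple[List[Finding], List[float]]:
        m = self.monitor(self.pid)     # request data from the monitor
        # Assumed local property: fraction of time spent communicating.
        severity = m["comm_time"] / m["total_time"]
        findings: List[Finding] = []
        if severity > self.threshold:
            findings.append(
                (f"process {self.pid}", "communication overhead", severity))
        return findings, [m["total_time"]]


@dataclass
class InnerAgent:
    """Collects results from child agents and searches for a global problem."""
    children: list
    threshold: float = 0.2

    def analyze(self) -> Tuple[List[Finding], List[float]]:
        findings: List[Finding] = []
        times: List[float] = []
        for child in self.children:
            child_findings, child_times = child.analyze()
            findings += child_findings
            times += child_times
        # Assumed global property: load imbalance across all processes
        # in this agent's subtree.
        mean = sum(times) / len(times)
        severity = (max(times) - mean) / max(times)
        if severity > self.threshold:
            findings.append(
                (f"{len(times)} processes", "load imbalance", severity))
        return findings, times


# Made-up measurements standing in for the monitoring library.
samples: Dict[int, Metrics] = {
    0: {"total_time": 10.0, "comm_time": 1.0},
    1: {"total_time": 10.0, "comm_time": 4.0},
    2: {"total_time": 20.0, "comm_time": 2.0},
    3: {"total_time": 10.0, "comm_time": 1.0},
}

leaves = [LeafAgent(pid, samples.__getitem__) for pid in sorted(samples)]
root = InnerAgent([InnerAgent(leaves[:2]), InnerAgent(leaves[2:])])
findings, _ = root.analyze()

for scope, name, severity in findings:
    print(f"{scope}: {name} (severity {severity:.3f})")
```

With these made-up numbers, the leaf agent for process 1 reports communication overhead, while the agent above processes 2 and 3 and the root agent each report load imbalance at their own scope. In a distributed setting each agent would run as a separate process and exchange these results over the network; the sketch keeps everything in one process for brevity.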