Interoperable Run-Time Tools for Distributed Systems—A Case Study

  • Authors:
  • Roland Wismüller;Thomas Ludwig

  • Affiliations:
  • Lehrstuhl für Rechnertechnik und Rechnerorganisation (LRR-TUM) Institut für Informatik, Technische Universität München, D-80290 München, Germany wismuell@in.tum.de ...;Lehrstuhl für Rechnertechnik und Rechnerorganisation (LRR-TUM) Institut für Informatik, Technische Universität München, D-80290 München, Germany ludwig@in.tum.de

  • Venue:
  • The Journal of Supercomputing
  • Year:
  • 2000

Abstract

Tools that observe and manipulate the run-time behavior of parallel and distributed systems are essential for developing and maintaining these systems. Users sometimes need to apply several tools at the same time to obtain greater functionality. Today, however, tools developed independently by different vendors cannot interoperate. Interoperability not only allows tools to be used concurrently but can also provide added value for the user. A debugger interoperating with a checkpointing system, for example, can provide a debugging environment in which the debugged program can be reset to any previous state, thus speeding up cyclic debugging of long-running programs. Using this example scenario, we derive requirements that the tools' software infrastructure should meet in order to enable interoperability. A review of existing infrastructures shows that these requirements are only partially met today. In an ongoing research effort, support for all of the requirements is being built into the OMIS-compliant on-line monitoring system OCM.