Extending the scope of the controlled logical clock

  • Authors:
  • Daniel Becker;Markus Geimer;Rolf Rabenseifner;Felix Wolf

  • Affiliations:
  • German Research School for Simulation Sciences, Aachen, Germany 52062;Jülich Supercomputing Centre, Jülich, Germany 52425;University of Stuttgart, Stuttgart, Germany 70550;German Research School for Simulation Sciences, Aachen, Germany 52062 and Jülich Supercomputing Centre, Jülich, Germany 52425 and RWTH Aachen University, Aachen, Germany 52056

  • Venue:
  • Cluster Computing
  • Year:
  • 2013

Abstract

Event traces are helpful in understanding the performance behavior of parallel applications because they allow in-depth analysis of communication and synchronization patterns. However, the absence of synchronized clocks on most cluster systems may render the analysis ineffective: inaccurate relative event timings may misrepresent the logical event order, lead to errors when quantifying the impact of certain behaviors, or confuse users of time-line visualization tools by showing messages flowing backward in time. In our earlier work, we developed a scalable algorithm called the controlled logical clock, which eliminates inconsistent inter-process timings postmortem in traces of pure MPI applications, potentially running on large processor configurations. In this paper, we first demonstrate that our algorithm also proves beneficial in computational grids, where a single application is executed using the combined computational power of several geographically dispersed clusters. Second, we present an extended version of the algorithm that, in addition to message-passing event semantics, also preserves and restores shared-memory event semantics, enabling the correction of traces from hybrid applications.
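
To make the notion of an inconsistent inter-process timing concrete, the following Python sketch checks a small trace for receives that are timestamped earlier than their matching sends plus a minimum latency, and then repairs them with a plain forward correction. It is an illustration only and not the controlled logical clock as published: the full algorithm is not specified in this abstract, and this simplified version merely shifts later events on the receiving process by the size of each jump. The `Event` record, the function names `find_violations` and `forward_correct`, and the `min_latency` parameter are all assumptions introduced for this example.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Event:
    """Hypothetical trace record; field names are assumptions for this sketch."""
    proc: int                  # rank of the process that recorded the event
    time: float                # local, possibly unsynchronized timestamp
    kind: str                  # "send", "recv", or "local"
    msg: Optional[int] = None  # id linking a send to its matching recv


Traces = Dict[int, List[Event]]  # per-process event lists in local order


def find_violations(traces: Traces, min_latency: float = 0.0) -> List[int]:
    """Return ids of messages whose receive precedes send + min_latency."""
    events = [e for trace in traces.values() for e in trace]
    send_time = {e.msg: e.time for e in events if e.kind == "send"}
    return [e.msg for e in events
            if e.kind == "recv" and e.time < send_time[e.msg] + min_latency]


def forward_correct(traces: Traces, min_latency: float) -> Traces:
    """Simplified forward correction (an assumption, not the published method).

    Each receive is moved to at least send + min_latency, and the original
    distance between consecutive events on the same process is preserved.
    """
    corrected: Traces = {p: [] for p in traces}
    cursor = {p: 0 for p in traces}
    send_time: Dict[int, float] = {}
    remaining = sum(len(t) for t in traces.values())
    while remaining:
        progressed = False
        for p, trace in traces.items():
            while cursor[p] < len(trace):
                e = trace[cursor[p]]
                if e.kind == "recv" and e.msg not in send_time:
                    break  # wait until the matching send has been corrected
                t = e.time
                if corrected[p]:
                    # keep the original interval to the previous local event
                    prev_orig = trace[cursor[p] - 1].time
                    prev_new = corrected[p][-1].time
                    t = max(t, prev_new + (e.time - prev_orig))
                if e.kind == "recv":
                    t = max(t, send_time[e.msg] + min_latency)
                if e.kind == "send":
                    send_time[e.msg] = t
                corrected[p].append(replace(e, time=t))
                cursor[p] += 1
                remaining -= 1
                progressed = True
        if not progressed:
            raise ValueError("unmatched receive or cyclic dependency in trace")
    return corrected


if __name__ == "__main__":
    traces = {
        0: [Event(0, 10.0, "send", 1), Event(0, 12.0, "local")],
        1: [Event(1, 9.0, "recv", 1), Event(1, 9.5, "local")],
    }
    print(find_violations(traces, min_latency=1.0))
    fixed = forward_correct(traces, min_latency=1.0)
    print([e.time for e in fixed[1]])
    print(find_violations(fixed, min_latency=1.0))
```

In this example the receive on process 1 is recorded at 9.0 although the send on process 0 happened at 10.0, so the message appears to travel backward in time. With an assumed minimum latency of 1.0, the sketch moves the receive to 11.0 and the following local event to 11.5, after which no violation remains. Shifting all later events by the full jump is the crudest possible choice and was made here only to keep the example short.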