Extending the scope of the controlled logical clock

Authors:
Daniel Becker;Markus Geimer;Rolf Rabenseifner;Felix Wolf
Affiliations:
German Research School for Simulation Sciences, Aachen, Germany 52062;Jülich Supercomputing Centre, Jülich, Germany 52425;University of Stuttgart, Stuttgart, Germany 70550;German Research School for Simulation Sciences, Aachen, Germany 52062 and Jülich Supercomputing Centre, Jülich, Germany 52425 and RWTH Aachen University, Aachen, Germany 52056
Venue:
Cluster Computing
Year:
2013

Abstract

Event traces help in understanding the performance behavior of parallel applications because they allow in-depth analysis of communication and synchronization patterns. However, most cluster systems lack synchronized clocks, which can render the analysis ineffective: inaccurate relative event timings may misrepresent the logical event order, introduce errors when quantifying the impact of certain behaviors, and confuse users of time-line visualization tools by showing messages flowing backward in time. In earlier work, we developed a scalable algorithm, the controlled logical clock, that eliminates inconsistent inter-process timings postmortem in traces of pure MPI applications, potentially running on large processor configurations. In this paper, we first demonstrate that the algorithm also proves beneficial in computational grids, where a single application is executed using the combined computational power of several geographically dispersed clusters. Second, we present an extended version of the algorithm that, in addition to message-passing event semantics, also preserves and restores shared-memory event semantics, enabling the correction of traces from hybrid applications.
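
To make the notion of "messages flowing backward in time" concrete, the following minimal sketch detects receive events that are timestamped earlier than their matching send and repairs them postmortem by shifting the receive, and every later event on the same process, forward. It is a plain Lamport-style forward correction written for illustration only and does not attempt to reproduce the controlled logical clock described in the paper. The `Event` record, the `msg` identifier, the `min_latency` parameter, and both function names are assumptions introduced here, and the sketch covers point-to-point messages only, not shared-memory semantics.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Event:
    proc: int                  # process (rank) that recorded the event
    time: float                # local, possibly unsynchronized timestamp
    kind: str                  # "send", "recv", or "local"
    msg: Optional[int] = None  # hypothetical id linking a send to its receive


Trace = Dict[int, List[Event]]  # per-process event lists in local order


def find_violations(trace: Trace, min_latency: float = 0.0) -> List[int]:
    """Return ids of messages received before send time + min_latency."""
    send_time = {e.msg: e.time
                 for evs in trace.values() for e in evs if e.kind == "send"}
    return [e.msg
            for evs in trace.values() for e in evs
            if e.kind == "recv" and e.time < send_time[e.msg] + min_latency]


def correct_forward(trace: Trace, min_latency: float = 0.0) -> Trace:
    """Replay the trace and push violating receives forward in time.

    Each process carries an accumulated shift, so the intervals between
    its own events are preserved after a correction.
    """
    corrected: Trace = {p: [] for p in trace}
    pos = {p: 0 for p in trace}        # next unprocessed event per process
    shift = {p: 0.0 for p in trace}    # accumulated forward shift per process
    send_time: Dict[int, float] = {}   # corrected send timestamps by message

    progress = True
    while progress:
        progress = False
        for p, evs in trace.items():
            while pos[p] < len(evs):
                e = evs[pos[p]]
                if e.kind == "recv" and e.msg not in send_time:
                    break  # wait until the matching send has been replayed
                t = e.time + shift[p]
                if e.kind == "recv":
                    earliest = send_time[e.msg] + min_latency
                    if t < earliest:
                        shift[p] += earliest - t
                        t = earliest
                elif e.kind == "send":
                    send_time[e.msg] = t
                corrected[p].append(replace(e, time=t))
                pos[p] += 1
                progress = True

    if any(pos[p] < len(trace[p]) for p in trace):
        raise ValueError("receive without a matching send in the trace")
    return corrected


# Example: process 1 records the receive 0.5 time units before the send.
example = {
    0: [Event(0, 1.0, "local"), Event(0, 2.0, "send", msg=1)],
    1: [Event(1, 1.5, "recv", msg=1), Event(1, 1.75, "local")],
}
fixed_example = correct_forward(example, min_latency=0.5)
```

In the example, `find_violations(example, 0.5)` reports message 1, and after correction the receive lands exactly `min_latency` after the send while the gap to the following local event on process 1 is unchanged. A shift that only ever grows distorts the intervals around each corrected receive, which is one reason a more careful scheme is needed in practice.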