Event traces help explain the performance behavior of parallel applications because they allow in-depth analysis of communication and synchronization patterns. However, most cluster systems lack synchronized clocks, which can render the analysis ineffective: inaccurate relative event timings may misrepresent the logical event order, introduce errors when quantifying the impact of certain behaviors, and confuse users of time-line visualization tools by showing messages flowing backward in time. In earlier work, we developed a scalable algorithm called the controlled logical clock, which eliminates inconsistent inter-process timings postmortem in traces of pure MPI applications, potentially running on large processor configurations. In this paper, we first demonstrate that the algorithm is also beneficial in computational grids, where a single application is executed using the combined computational power of several geographically dispersed clusters. Second, we present an extended version of the algorithm that, in addition to message-passing event semantics, also preserves and restores shared-memory event semantics, enabling the correction of traces from hybrid applications.
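
To make the notion of an inconsistent timing concrete, the sketch below checks a small trace for messages that appear to arrive before they were sent and then repairs them with a plain forward shift. It is only an illustration under stated assumptions: the `Event` record, the `msg` identifier, the `min_latency` parameter, and both function names are hypothetical, and the correction is a simplified Lamport-style shift rather than the controlled logical clock itself, which additionally controls how the shift is smoothed and, in its extended form, handles shared-memory event semantics.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Event:
    proc: int                  # process that recorded the event
    time: float                # timestamp from that process's local clock
    kind: str                  # "send", "recv", or "local"
    msg: Optional[int] = None  # hypothetical id linking a send to its recv


def find_violations(events, min_latency=0.0):
    """Return (send, recv) pairs whose receive is stamped too early."""
    sends = {e.msg: e for e in events if e.kind == "send"}
    return [(sends[e.msg], e) for e in events
            if e.kind == "recv" and e.time < sends[e.msg].time + min_latency]


def forward_correct(events, min_latency=0.0):
    """Shift events forward so every recv is >= its send + min_latency.

    Simplified illustration: each process keeps one accumulated shift
    that is applied to all of its later events and never reduced.
    """
    by_proc = {}
    for e in sorted(events, key=lambda e: e.time):
        by_proc.setdefault(e.proc, []).append(e)

    pos = {p: 0 for p in by_proc}      # next unprocessed event per process
    shift = {p: 0.0 for p in by_proc}  # accumulated forward shift per process
    send_time = {}                     # msg id -> corrected send time
    out = []

    progress = True
    while progress:
        progress = False
        for p, evs in by_proc.items():
            while pos[p] < len(evs):
                e = evs[pos[p]]
                if e.kind == "recv" and e.msg not in send_time:
                    break  # wait until the matching send has been corrected
                t = e.time + shift[p]
                if e.kind == "recv":
                    earliest = send_time[e.msg] + min_latency
                    if t < earliest:
                        shift[p] += earliest - t
                        t = earliest
                if e.kind == "send":
                    send_time[e.msg] = t
                out.append(replace(e, time=t))
                pos[p] += 1
                progress = True

    if len(out) != len(events):
        raise ValueError("a receive has no matching send in the trace")
    return out


trace = [
    Event(proc=0, time=10.0, kind="send", msg=1),
    Event(proc=1, time=9.0, kind="recv", msg=1),   # appears to arrive early
    Event(proc=1, time=12.0, kind="local"),
]
fixed = forward_correct(trace, min_latency=0.5)
```

In this toy trace the receive on process 1 is stamped before the send on process 0, so `find_violations(trace, 0.5)` reports one pair, while the corrected trace reports none. Because the shift is carried forward, the later local event on process 1 moves by the same amount, which keeps the interval between the two events on that process unchanged.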