The events occurring in the execution of a distributed or parallel application are related by a partial, rather than a total, order. We have developed prototype software that collects such events during program execution and produces a graphical display consistent with the partial order. Such a display can be very helpful in understanding and debugging distributed and parallel applications. However, using only partial-order information does not allow the performance characteristics of an application to be understood. Integrating real-time information with the partial order can provide a display that is useful for understanding both functional and performance aspects of the application. An algorithm is required to adjust the collected real-time information, to ensure that real times are consistent with the partial order. Lamport's clock algorithm provides such an adjustment, but can significantly distort the real-time values. It was necessary to develop a more complex algorithm, using the same basic principles, that minimises such distortions. We have extended existing prototype software for displaying event data, so that either a purely partial-order display or a real-time display may be obtained. The real-time facilities can be used in multiple target environments, such as OSF/DCE, Hermes, and SR.
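As an illustration of the kind of adjustment described above, the following is a minimal sketch of the basic Lamport-style rule applied to recorded real times: a receive event is pushed forward until it is later than its matching send, and the receiving process keeps that shift for all its later events. It is not the more complex algorithm the abstract refers to. The `Event` record, the `lamport_adjust` function, the integer microsecond timestamps, and the `min_delay` parameter are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Event:
    eid: str                    # hypothetical event identifier
    process: str                # process on which the event occurred
    time: int                   # recorded local real time (assumed microseconds)
    send: Optional[str] = None  # for a receive event: id of the matching send


def lamport_adjust(events: List[Event], min_delay: int = 1) -> Dict[str, int]:
    """Return adjusted times consistent with the send/receive partial order.

    Events of one process are assumed to appear in the list in their
    local order. Only forward adjustments are made (Lamport's rule).
    """
    by_process: Dict[str, List[Event]] = {}
    for ev in events:
        by_process.setdefault(ev.process, []).append(ev)

    next_idx = {p: 0 for p in by_process}  # next unprocessed event per process
    offset = {p: 0 for p in by_process}    # accumulated forward shift per process
    adjusted: Dict[str, int] = {}
    remaining = len(events)

    while remaining:
        progressed = False
        for p, evs in by_process.items():
            while next_idx[p] < len(evs):
                ev = evs[next_idx[p]]
                # A receive must wait until its send has been adjusted.
                if ev.send is not None and ev.send not in adjusted:
                    break
                t = ev.time + offset[p]
                if ev.send is not None:
                    earliest = adjusted[ev.send] + min_delay
                    if t < earliest:
                        # Set the process clock forward; the shift persists.
                        offset[p] += earliest - t
                        t = earliest
                adjusted[ev.eid] = t
                next_idx[p] += 1
                remaining -= 1
                progressed = True
        if not progressed:
            raise ValueError("receive without a matching send, or cyclic order")
    return adjusted


# P2's clock runs behind P1's, so the receive "b" is recorded before the send "a".
events = [
    Event("a", "P1", 1000),
    Event("b", "P2", 950, send="a"),
    Event("c", "P2", 980),
]
adjusted = lamport_adjust(events, min_delay=10)
print(adjusted)
```

In this example the receive `b` is moved from 950 to 1010, and the later event `c` on the same process is moved from 980 to 1040 even though it already respected the partial order. That persistent shift is one way a purely forward adjustment can distort recorded real times, which is the problem the abstract says a more careful algorithm is needed to minimise.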