Safe termination detection in an asynchronous distributed system when processes may crash and recover

Authors:
Neeraj Mittal; Kuppahalli L. Phaneesh; Felix C. Freiling
Affiliations:
Department of Computer Science, The University of Texas at Dallas, Richardson, TX 75080, USA; Department of Computer Science, The University of Texas at Dallas, Richardson, TX 75080, USA; Department of Computer Science, University of Mannheim, D-68131 Mannheim, Germany
Venue:
Theoretical Computer Science
Year:
2009

Abstract

The termination detection problem involves detecting whether an ongoing distributed computation has ceased all its activities. We investigate the termination detection problem in an asynchronous distributed system under the crash-recovery model. It has been shown that the problem is impossible to solve under the crash-recovery model in general. We identify two conditions under which the termination detection problem can be solved in a safe manner. We also propose algorithms to detect termination under the conditions identified.