A fundamental problem in developing distributed software is that no process has access to the global state. Thus, computing a global predicate or function, a need that arises frequently in distributed systems, typically requires significant programming effort. Being able to observe a distributed computation is useful for many fundamental problems in distributed software, such as debugging, testing, and fault tolerance. After a program is debugged and tested, it must be monitored for fault tolerance, which again requires a mechanism that observes the global state. Finally, the ability to observe global predicates generalizes algorithms for many previously studied problems, such as detecting program termination, token loss, and deadlock. Research on how to detect global predicates has yielded three sets of algorithms. In the global snapshot approach, global snapshots of the computation are computed repeatedly until the desired predicate becomes true. However, this approach works only for stable predicates such as deadlock and termination, which do not turn false once they become true. In the second set of algorithms, a lattice of global states is constructed. Unlike the global snapshot approach, this approach lets users detect unstable predicates. However, it can require exploring a prohibitive number of global states. This article surveys algorithms that use a third approach, which exploits the structure of the predicate but does not build a lattice. Instead, these algorithms examine the computation itself to deduce whether a predicate became true. They are computationally efficient and can be used to detect even unstable predicates.
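To make the third approach concrete, the following is a minimal sketch of one well-known instance of it: detecting whether a conjunction of local predicates could have held in some consistent global state. The abstract does not name a specific algorithm, so this choice is an assumption made for illustration. The sketch further assumes that each process reports, in order, the vector timestamps of the local states in which its own predicate held, and that one state happened before another exactly when its timestamp is componentwise less than or equal to the other's and the two differ. The names `happened_before` and `detect_conjunction` are hypothetical.

```python
from collections import deque
from typing import List, Optional, Sequence, Tuple

# Assumed representation: one vector timestamp per local state.
VectorClock = Tuple[int, ...]


def happened_before(u: VectorClock, v: VectorClock) -> bool:
    """Return True if the state stamped u causally precedes the state stamped v."""
    return u != v and all(a <= b for a, b in zip(u, v))


def detect_conjunction(
    candidates: Sequence[Sequence[VectorClock]],
) -> Optional[List[VectorClock]]:
    """Look for one candidate state per process such that all are pairwise concurrent.

    candidates[i] lists, in local order, the states of process i in which
    its local predicate held. Returns the first consistent selection found,
    or None if the conjunction never held in any consistent global state.
    """
    queues = [deque(states) for states in candidates]
    n = len(queues)
    while all(queues):  # stop as soon as any process runs out of candidates
        heads = [q[0] for q in queues]
        # A head that happened before another head also precedes every later
        # candidate on that other process, so it can never be part of a
        # consistent selection and is safe to discard.
        stale = next(
            (i for i in range(n) for j in range(n)
             if i != j and happened_before(heads[i], heads[j])),
            None,
        )
        if stale is None:
            return heads  # all heads are pairwise concurrent
        queues[stale].popleft()
    return None


# Hypothetical two-process trace: process 0 sends a message after state (1, 0),
# and process 1 enters state (1, 2) on receiving it.
print(detect_conjunction([[(1, 0), (2, 0)], [(1, 2)]]))
print(detect_conjunction([[(1, 0)], [(1, 2)]]))
```

Each iteration either succeeds or discards one candidate state, so the search only ever walks forward through the candidate lists and never enumerates the lattice of global states. Because candidates are recorded as the computation runs, the same search can report a predicate that later turns false again, which repeated snapshots could miss.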