We investigate whether fast failure detectors can be useful, and if so by how much, in the design of real-time fault-tolerant systems. Specifically, we show how fast failure detectors can speed up consensus and fault-tolerant broadcasts, by providing fast algorithms and deriving some matching lower bounds, for synchronous systems with crashes. These results show that a fast failure detector service (implemented using specialized hardware or expedited message delivery) can be an important tool in the design of real-time mission-critical systems.