Effective detection of failures is essential for reliable communication services. Traditionally, non-real-time computer networks have relied on behavior-based techniques for detecting communication failures. That is, each node uses heartbeats to detect the failure of its neighbors and the end-to-end transport protocol (e.g., TCP) achieves reliable communication by acknowledgment/retransmission. Recently, there has been a growing demand for reliable "real-time" communication, but little research has been done on the failure detection problem. In this paper, we present two behavior-based failure-detection schemes (neighbor detection and end-to-end detection) for reliable real-time communication services and experimentally evaluate their effectiveness. Specifically, we measure and analyze the coverage and latency of these detection schemes through fault-injection experiments. The experimental results have shown that nearly all failures can be detected very quickly by the neighbor detection scheme, while the end-to-end detection scheme uncovers the remaining failures with larger detection latencies.
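As a rough illustration of the two ideas named in the abstract, heartbeat-based neighbor detection and the coverage/latency metrics, the following is a minimal Python sketch. It is not the paper's implementation: the abstract gives no protocol details, so the class `NeighborDetector`, the parameters `period` and `max_missed`, and the helper `coverage_and_mean_latency` are all hypothetical names chosen for this example. Timestamps are passed in explicitly so the behavior is deterministic.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class NeighborDetector:
    """Suspects a neighbor once it has been silent for more than
    period * max_missed seconds (both values are assumptions)."""
    period: float                 # hypothetical heartbeat interval, seconds
    max_missed: int = 3           # hypothetical number of tolerated misses
    last_seen: Dict[str, float] = field(default_factory=dict)

    def on_heartbeat(self, neighbor: str, now: float) -> None:
        # Record the arrival time of the latest heartbeat from this neighbor.
        self.last_seen[neighbor] = now

    def suspected(self, now: float) -> List[str]:
        # A neighbor is suspected when its silence exceeds the timeout.
        timeout = self.period * self.max_missed
        return sorted(n for n, t in self.last_seen.items() if now - t > timeout)


def coverage_and_mean_latency(
    latencies: List[Optional[float]],
) -> Tuple[float, Optional[float]]:
    """Each entry is (detection time - injection time) for one injected
    fault, or None if that fault was never detected."""
    detected = [x for x in latencies if x is not None]
    coverage = len(detected) / len(latencies) if latencies else 0.0
    mean_latency = sum(detected) / len(detected) if detected else None
    return coverage, mean_latency
```

For instance, with `period=1.0` and `max_missed=3`, a neighbor last heard from at time 0.0 is suspected at time 3.5, while one heard from at time 3.0 is not. A set of four injected faults with latencies `[0.5, 1.5, None, 1.0]` gives a coverage of 0.75 and a mean latency of 1.0 over the detected faults.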