A new diagnostic message protocol is described that provides fault diagnosis capabilities for communication in a distributed system environment. The protocol is designed to operate in conjunction with a standard end-to-end communication protocol and uses special messages to determine the system fault state. A diagnosis message is represented using a test dependency model that is derived from the system topology. These messages are used by an adaptive strategy designed to achieve specific objectives, such as reduced testing cost. Using the test dependency model, a general-purpose algorithm is developed for generating these strategies based on an information-theoretic criterion. Specific properties of the protocol are discussed, and several examples of strategies for a distributed system topology are provided.
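
The abstract does not give the algorithm itself, so the following is only a minimal sketch of one common way to build an adaptive test strategy from a test dependency model: greedily pick the diagnostic message with the largest expected entropy reduction per unit cost. It is not the paper's method. The single-fault assumption, the deterministic test outcomes (a probe fails exactly when the faulty component is in its dependency set), the prior fault probabilities, and all names (`best_test`, `diagnose`, `probe_ab`, `link1`, and so on) are hypothetical and chosen for illustration.

```python
import math


def entropy(weights):
    """Shannon entropy in bits of a distribution given as unnormalized weights."""
    total = sum(weights)
    return -sum((w / total) * math.log2(w / total) for w in weights if w > 0)


def best_test(candidates, prior, depends_on, cost):
    """Return the test with the largest expected entropy reduction per unit cost."""
    total = sum(prior[c] for c in candidates)
    h_now = entropy([prior[c] for c in candidates])
    best, best_score = None, 0.0
    for test, covered in depends_on.items():
        fail = [prior[c] for c in candidates if c in covered]
        ok = [prior[c] for c in candidates if c not in covered]
        if not fail or not ok:
            continue  # this test cannot split the remaining suspects
        # Expected entropy after observing the outcome (assumed deterministic).
        h_after = (sum(fail) * entropy(fail) + sum(ok) * entropy(ok)) / total
        score = (h_now - h_after) / cost[test]
        if score > best_score:
            best, best_score = test, score
    return best


def diagnose(prior, depends_on, cost, run_test):
    """Adaptively run tests until one suspect remains or no test can help."""
    candidates = set(prior)
    used = []
    while len(candidates) > 1:
        test = best_test(candidates, prior, depends_on, cost)
        if test is None:
            break
        used.append(test)
        failed = run_test(test)  # True if the diagnostic message was lost
        covered = depends_on[test]
        # Keep only suspects consistent with the observed outcome.
        candidates = {c for c in candidates if (c in covered) == failed}
    return candidates, used


# Hypothetical chain topology: a -link1- b -link2- c -link3- d, probes sent from a.
prior = {"link1": 0.4, "node_b": 0.1, "link2": 0.3, "link3": 0.2}
depends_on = {
    "probe_ab": {"link1", "node_b"},
    "probe_ac": {"link1", "node_b", "link2"},
    "probe_ad": {"link1", "node_b", "link2", "link3"},
    "echo_link1": {"link1"},
}
cost = {test: 1.0 for test in depends_on}
faulty = "link2"  # ground truth, used only to simulate probe outcomes

suspects, used = diagnose(prior, depends_on, cost, lambda t: faulty in depends_on[t])
print(suspects, used)
```

Raising the cost of a probe makes the strategy prefer cheaper probes that are nearly as informative, which is one way an objective such as reduced testing cost could be expressed in a sketch like this.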