The practical application and implementation of online distributed system-level diagnosis theory are documented. Previously proven distributed diagnosis algorithms are shown to be impractical in real systems because of their high resource requirements. A distributed system-level diagnosis algorithm called Adaptive DSD is shown to minimize network resource usage and has resulted in a practical implementation. Adaptive DSD assumes a distributed network in which nodes can test other nodes and determine them to be faulty or fault-free. Each node issues its tests adaptively, depending on the current fault situation of the network. Test result reports are generated from the test results and forwarded between nodes in the network. Adaptive DSD is proven correct in that each fault-free node reaches an accurate, independent diagnosis of the fault conditions of the remaining nodes. No restriction is placed on the number of faulty nodes; any fault situation with any number of faulty nodes is diagnosed correctly. An implementation of the Adaptive DSD algorithm is described.