Management policies can be used to specify requirements on the desired behaviour of distributed systems. Violations of these policies (faults) can then be detected, isolated, located, and corrected by a policy-driven fault management system. Other work in this area has focused on network-level faults; we believe that in a distributed system it is more appropriate to focus on faults at the application level. Furthermore, that work has been largely domain-specific, and a generic, structured approach to the problem is needed. Our work addresses policy-driven fault management in distributed systems at the application level. In this paper, we define a generic architecture for policy-driven fault management and present a prototype system based on this architecture. We also discuss our experience to date in using and experimenting with the prototype.
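
To make the idea of a policy violation concrete, the following is a minimal sketch of the detection step only. It is not the architecture or prototype described in the paper. The names Policy, Violation, and detect_violations, and the metric key response_time_s, are hypothetical and chosen purely for illustration. A policy is modelled as a predicate over the metrics observed for one application component, and a fault is reported whenever that predicate does not hold.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Metrics = Dict[str, float]


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy: a named requirement on one application component."""
    name: str
    component: str
    is_satisfied: Callable[[Metrics], bool]


@dataclass(frozen=True)
class Violation:
    """A detected fault: the policy that was violated and where."""
    policy: str
    component: str


def detect_violations(
    policies: List[Policy],
    observations: Dict[str, Metrics],
) -> List[Violation]:
    """Return one Violation per policy whose predicate fails.

    Components with no observations are skipped rather than treated
    as faulty (an assumption of this sketch).
    """
    violations: List[Violation] = []
    for policy in policies:
        metrics = observations.get(policy.component)
        if metrics is None:
            continue
        if not policy.is_satisfied(metrics):
            violations.append(Violation(policy.name, policy.component))
    return violations
```

In this sketch, isolation, location, and correction would be separate stages that consume the returned Violation records; they are omitted because the abstract does not describe how they work.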