Managing the availability and performance of a distributed system involves monitoring the system's behavior, identifying problems, and correcting them. Each of these tasks requires expertise, such as an understanding of the mechanics of the underlying system components. As these systems grow in size and complexity, and as the number of distributed applications running on them increases, managing their availability and performance becomes more difficult. Little research has focused on embedding systems management expertise into a management application for a distributed system. In this paper we describe a rule-based management application for a commercially available distributed computing environment. The application monitors the distributed system, detects performance and availability problems related to system services, and generates corrective actions for them.
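As a rough illustration of the monitor, detect, and correct cycle described above, the following is a minimal sketch of how management expertise might be encoded as rules. It is not the application described in the paper: the rule names, metric keys (`heartbeat_age_s`, `response_ms`), thresholds, and action strings are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# One monitoring sample: metric name -> observed value (hypothetical keys).
Metrics = Dict[str, float]


@dataclass
class Rule:
    """A piece of management expertise: a condition and a corrective action."""
    name: str
    condition: Callable[[Metrics], bool]
    action: str  # label of the corrective action to generate


def evaluate(rules: List[Rule], metrics: Metrics) -> List[str]:
    """Return the corrective actions of every rule whose condition holds."""
    return [rule.action for rule in rules if rule.condition(metrics)]


# Hypothetical rules; the thresholds are assumptions, not values from the paper.
RULES = [
    # Availability: no heartbeat from the service for more than 30 seconds.
    Rule("service_down",
         lambda m: m.get("heartbeat_age_s", 0.0) > 30.0,
         "restart_service"),
    # Performance: response time above 500 milliseconds.
    Rule("slow_response",
         lambda m: m.get("response_ms", 0.0) > 500.0,
         "add_replica"),
]

if __name__ == "__main__":
    sample = {"heartbeat_age_s": 45.0, "response_ms": 120.0}
    print(evaluate(RULES, sample))
```

Keeping the rules as data separate from the evaluation loop means new expertise can be added by appending a rule, without changing the monitoring code.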