The increasing complexity of ubiquitous computing makes it challenging to manage systems automatically, that is, to identify problems accurately and resolve them. Many artificial intelligence techniques have been proposed to support problem determination. In this paper, we propose a mechanism for problem localization that analyzes real-time streams of system performance data for automated system management. We use a Bayesian network to build a compact model that supports both inductive and deductive inference through probabilistic dependency analysis across the network. We introduce an algorithm for extracting the factors most closely related to problems, which supports network learning in diverse domains. The approach enables us both to diagnose problems from the underlying system status and to predict potential problems at run time via probability propagation through the network. A demonstration focusing on system reliability in distributed system management illustrates the applicability of the proposed mechanism and shows how it contributes to self-managing capability.
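The two inference directions mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's mechanism or its learning algorithm. It is a hand-built three-node Bayesian network queried by brute-force enumeration, and the node names (`cpu`, `mem`, `slow`) and all probabilities are hypothetical values chosen only for illustration.

```python
from itertools import product

# Hypothetical network (illustrative only): cpu -> slow <- mem
# cpu  = CPU overload, mem = memory leak, slow = slow response time.
P_CPU = 0.10  # assumed prior P(cpu = True)
P_MEM = 0.05  # assumed prior P(mem = True)

# Assumed conditional table: P(slow = True | cpu, mem)
P_SLOW = {
    (True, True): 0.95,
    (True, False): 0.80,
    (False, True): 0.60,
    (False, False): 0.05,
}

NAMES = ("cpu", "mem", "slow")


def joint(cpu, mem, slow):
    """Joint probability of one full assignment, from the network factorization."""
    p = (P_CPU if cpu else 1 - P_CPU) * (P_MEM if mem else 1 - P_MEM)
    p_slow = P_SLOW[(cpu, mem)]
    return p * (p_slow if slow else 1 - p_slow)


def query(target, evidence):
    """Return P(target = True | evidence) by summing the joint over all worlds."""
    numerator = 0.0
    denominator = 0.0
    for values in product([True, False], repeat=len(NAMES)):
        world = dict(zip(NAMES, values))
        # Skip worlds that contradict the observed evidence.
        if any(world[name] != value for name, value in evidence.items()):
            continue
        p = joint(**world)
        denominator += p
        if world[target]:
            numerator += p
    return numerator / denominator


# Diagnosis (effect to cause): how likely is CPU overload given a slow response?
diagnosis = query("cpu", {"slow": True})

# Prediction (cause to effect): how likely is a slow response given a memory leak?
prediction = query("slow", {"mem": True})

print(f"P(cpu | slow) = {diagnosis:.3f}")
print(f"P(slow | mem) = {prediction:.3f}")
```

Enumeration is exponential in the number of nodes, so it only suits toy networks like this one; a realistic system would rely on a dedicated inference algorithm and on parameters learned from monitoring data rather than fixed by hand.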