Probabilistic fault diagnosis for IT services in noisy and dynamic environments

  • Authors:
  • Lu Cheng;Xue-song Qiu;Luoming Meng;Yan Qiao;Zhi-qing Li

  • Affiliations:
  • State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, P. R. China (all authors)

  • Venue:
  • IM'09: Proceedings of the 11th IFIP/IEEE International Symposium on Integrated Network Management
  • Year:
  • 2009

Abstract

Modern society has come to rely heavily on IT services. To improve the quality of IT services, it is important to detect and diagnose their faults quickly and accurately. Such faults usually appear as disruptions of a set of dependent logical services affected by the failed IT resources. The diagnosis task, which depends on observed symptoms and knowledge about the IT services, is always disturbed by noise and dynamic changes in the managed environment. We present a tool for analyzing IT service faults which, given a set of failed end-to-end services, identifies the underlying resources that are in a faulty state. We demonstrate empirically that it works in noisy and dynamically changing environments with bounded errors and high efficiency. We compare our algorithm with two prior approaches, Shrink and Maxcoverage, on two well-known types of network topologies. Experimental results show that our algorithm improves overall performance.