Structure from failure

Authors:
Ralf Herbrich;Thore Graepel;Brendan Murphy
Affiliations:
Microsoft Research Ltd., Cambridge, UK;Microsoft Research Ltd., Cambridge, UK;Microsoft Research Ltd., Cambridge, UK
Venue:
SYSML'07 Proceedings of the 2nd USENIX workshop on Tackling computer systems problems with machine learning techniques
Year:
2007

Abstract

We investigate the problem of learning the dependencies among servers in large networks based on failure patterns in their up-time behaviour. We model up-times in terms of exponential distributions whose inverse lifetime parameters λ may vary with the state of other servers. Based on a conjugate Gamma prior over inverse lifetimes, we identify the most likely network configuration given that any node has at most one parent. The method can be viewed as a special case of learning a continuous time Bayesian network. Our approach enables us to easily incorporate existing expert prior knowledge. Furthermore, our method enjoys advantages over a state-of-the-art rule-based approach. We validate the approach on synthetic data and apply it to five years of data for a set of over 500 servers at a server farm of a major Microsoft web site.
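The Gamma-exponential conjugacy mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a Gamma(a, b) prior (shape a, rate b) on a server's inverse lifetime λ, summarises the data as a failure count and a total observed up-time per parent state, and scores each candidate parent independently by its log marginal likelihood. The function names, the prior values a = b = 1, and the example counts are all hypothetical.

```python
from math import lgamma, log


def log_evidence(n_failures, total_uptime, a=1.0, b=1.0):
    """Log marginal likelihood of exponential up-times.

    The likelihood is lambda**n * exp(-lambda * T); integrating it against
    a Gamma(a, b) prior on lambda gives
    b**a / Gamma(a) * Gamma(a + n) / (b + T)**(a + n).
    """
    return (a * log(b) - lgamma(a)
            + lgamma(a + n_failures)
            - (a + n_failures) * log(b + total_uptime))


def posterior(n_failures, total_uptime, a=1.0, b=1.0):
    """Conjugate update: the posterior over lambda is Gamma(a + n, b + T)."""
    return a + n_failures, b + total_uptime


def best_parent(stats_by_parent, a=1.0, b=1.0):
    """Pick the single parent (or None) with the highest log evidence.

    stats_by_parent maps a candidate parent name (None = no parent) to a
    list of (n_failures, total_uptime) pairs, one pair per parent state.
    """
    scores = {
        parent: sum(log_evidence(n, t, a, b) for n, t in states)
        for parent, states in stats_by_parent.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical counts for one server: 10 failures in 1000 hours of up-time.
stats = {
    None: [(10, 1000.0)],             # one rate, no dependency
    "B": [(2, 950.0), (8, 50.0)],     # B up / B down: failures cluster when B is down
    "C": [(5, 500.0), (5, 500.0)],    # C up / C down: no visible effect
}
best, scores = best_parent(stats)
print(best, scores)
```

In this toy example the split by server B explains the failures much better than a single rate, while the split by server C only pays the cost of an extra parameter, so B is selected. Prior knowledge of the kind the abstract mentions could enter through the choice of a and b, although how the authors encode it is not stated here.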