Mixture approximations to Bayesian networks

Authors:
Volker Tresp;Michael Haft;Reimar Hofmann
Affiliations:
Siemens AG, Corporate Technology, Neural Computation, Dept. Information and Communications, Munich, Germany;Siemens AG, Corporate Technology, Neural Computation, Dept. Information and Communications, Munich, Germany;Siemens AG, Corporate Technology, Neural Computation, Dept. Information and Communications, Munich, Germany
Venue:
UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
Year:
1999


Abstract

Structure and parameters in a Bayesian network uniquely specify the probability distribution of the modeled domain. The locality of both structure and probabilistic information is the great benefit of Bayesian networks and requires the modeler to specify only local information. On the other hand, this locality of information might prevent the modeler, and even more so any other person, from obtaining a general overview of the important relationships within the domain. The goal of the work presented in this paper is to provide an "alternative" view on the knowledge encoded in a Bayesian network, which might sometimes be very helpful for providing insights into the underlying domain. The basic idea is to calculate a mixture approximation to the probability distribution represented by the Bayesian network. The mixture component densities can be thought of as representing typical scenarios implied by the Bayesian model, providing intuition about the basic relationships. As an additional benefit, performing inference in the approximate model is very simple and intuitive and can provide additional insights. The computational complexity of calculating the mixture approximations depends critically on the measure that defines the distance between the probability distribution represented by the Bayesian network and the approximate distribution. Both the KL-divergence and the backward KL-divergence lead to inefficient algorithms. Incidentally, the latter is used in recent work on mixtures of mean field solutions, to which the work presented here is closely related. We show, however, that using a mean squared error cost function leads to update equations which can be solved using the junction tree algorithm. We conclude that the mean squared error cost function can be used for Bayesian networks in which inference based on the junction tree is tractable. For large networks, however, one may have to rely on mean field approximations.
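
The three distance measures named in the abstract can be illustrated on a toy problem. The following is a minimal sketch, not the paper's algorithm: it assumes a hypothetical two-variable network A -> B with made-up probabilities, assumes the mixture components are products of independent Bernoulli distributions, and assumes the squared error is summed over all joint states (the abstract does not give the exact form). It evaluates each measure by brute-force enumeration, which is feasible only for tiny networks; the paper instead relies on the junction tree algorithm.

```python
import itertools
import math


def mixture_prob(x, weights, params):
    """Probability of the binary tuple x under a mixture of independent Bernoullis."""
    total = 0.0
    for w, theta in zip(weights, params):
        comp = 1.0
        for xi, t in zip(x, theta):
            comp *= t if xi == 1 else 1.0 - t
        total += w * comp
    return total


def distances(p, weights, params):
    """Return (KL(P||Q), backward KL(Q||P), squared error), summing over all states.

    Assumes P(x) > 0 for every state; states with Q(x) = 0 are not handled.
    """
    kl = back_kl = sq = 0.0
    for x, px in p.items():
        qx = mixture_prob(x, weights, params)
        kl += px * math.log(px / qx)
        back_kl += qx * math.log(qx / px)
        sq += (px - qx) ** 2
    return kl, back_kl, sq


# Hypothetical network A -> B with illustrative numbers (not from the paper).
p_a = {0: 0.7, 1: 0.3}
p_b_given_a = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}
p = {
    (a, b): p_a[a] * p_b_given_a[a][b]
    for a, b in itertools.product((0, 1), repeat=2)
}

# Two components, each readable as a "typical scenario":
# component 1: A is off and B is mostly off; component 2: A is on and B is mostly on.
two_weights = [0.7, 0.3]
two_params = [(0.0, 0.1), (1.0, 0.8)]

# A single fully factorized component built from the marginals P(A=1) and P(B=1).
one_weights = [1.0]
one_params = [(0.3, 0.7 * 0.1 + 0.3 * 0.8)]

two_result = distances(p, two_weights, two_params)
one_result = distances(p, one_weights, one_params)
print("two components:", two_result)
print("one component: ", one_result)
```

In this toy case the two-component mixture reproduces the joint distribution, so all three measures are (numerically) zero, while the single factorized component cannot capture the dependence between A and B and leaves a positive value under each measure.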