Multimodal nonlinear filtering using Gauss-Hermite quadrature

Authors:
Hannes P. Saal;Nicolas M. O. Heess;Sethu Vijayakumar
Affiliations:
School of Informatics, University of Edinburgh, Edinburgh, Scotland, UK;School of Informatics, University of Edinburgh, Edinburgh, Scotland, UK;School of Informatics, University of Edinburgh, Edinburgh, Scotland, UK
Venue:
ECML PKDD'11 Proceedings of the 2011 European conference on Machine learning and knowledge discovery in databases - Volume Part III
Year:
2011

Abstract

In many filtering problems the exact posterior state distribution is not tractable and is therefore approximated using simpler parametric forms, such as single Gaussian distributions. In nonlinear filtering problems, however, the posterior state distribution can take complex shapes and even become multimodal, so that single Gaussians are no longer sufficient. A standard solution to this problem is to use a bank of independent filters that individually represent the posterior with a single Gaussian and jointly form a mixture-of-Gaussians representation. Unfortunately, since the filters are optimized separately and interactions between the components are consequently not taken into account, the resulting representation is typically poor. As an alternative, we therefore propose to directly optimize the full approximating mixture distribution by minimizing the KL divergence to the true state posterior. For this purpose we describe a deterministic sampling approach that allows us to perform the intractable minimization approximately and at reasonable computational cost. We find that the proposed method models multimodal posterior distributions noticeably better than banks of independent filters, even when the latter are allowed many more mixture components. We demonstrate the importance of accurately representing the posterior with a tractable number of components in an active learning scenario, where we report faster convergence, both in terms of the number of observations processed and in terms of computation time, and more reliable convergence on problems of up to ten dimensions.
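
The abstract does not spell out the deterministic sampling scheme, so the following is only a minimal one-dimensional sketch of how Gauss-Hermite quadrature (named in the title) could be used to approximate a KL divergence involving a Gaussian mixture. Everything here is an assumption made for illustration: the function names, the restriction to one dimension, the quadrature order, and in particular the direction KL(q || p), with q the mixture and p the target. It is not the authors' implementation.

```python
import numpy as np


def gauss_hermite_expectation(f, mean, std, order=20):
    """Approximate E[f(x)] for x ~ N(mean, std**2) by Gauss-Hermite quadrature."""
    # Nodes and weights for the weight function exp(-t**2).
    nodes, weights = np.polynomial.hermite.hermgauss(order)
    # The substitution x = mean + sqrt(2) * std * t turns the Gaussian
    # density into exp(-t**2) / sqrt(pi).
    x = mean + np.sqrt(2.0) * std * nodes
    return np.sum(weights * f(x)) / np.sqrt(np.pi)


def mixture_logpdf(x, weights, means, stds):
    """Log-density of a 1-D Gaussian mixture, evaluated elementwise on x."""
    x = np.asarray(x, dtype=float)[..., None]
    log_comp = (np.log(weights) - 0.5 * np.log(2.0 * np.pi) - np.log(stds)
                - 0.5 * ((x - means) / stds) ** 2)
    # Log-sum-exp over components for numerical stability.
    m = log_comp.max(axis=-1, keepdims=True)
    return (m + np.log(np.exp(log_comp - m).sum(axis=-1, keepdims=True)))[..., 0]


def kl_mixture_to_target(weights, means, stds, target_logpdf, order=20):
    """Approximate KL(q || p) for a 1-D Gaussian mixture q (assumed direction).

    The expectation under q is split into one Gaussian expectation per
    component, and each of these is handled by its own quadrature rule.
    """
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)

    def integrand(x):
        return mixture_logpdf(x, weights, means, stds) - target_logpdf(x)

    return sum(w * gauss_hermite_expectation(integrand, m, s, order)
               for w, m, s in zip(weights, means, stds))


# Hypothetical usage: KL between N(0, 1) and N(1, 1), which is 0.5 analytically.
def target(x):
    return -0.5 * np.log(2.0 * np.pi) - 0.5 * (x - 1.0) ** 2


kl_value = kl_mixture_to_target([1.0], [0.0], [1.0], target)
```

Because the quadrature points are fixed functions of each component's mean and standard deviation, the estimate is deterministic and varies smoothly with the mixture parameters, which is what makes it usable inside an optimizer. In higher dimensions a tensor-product rule grows exponentially in the number of points, so a practical scheme would need something more economical; the sketch does not attempt that.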