Most existing sparse Gaussian process (g.p.) models seek computational advantages by basing their computations on a set of m basis functions that are the covariance function of the g.p. with one of its two inputs fixed. We generalise this for the case of the Gaussian covariance function, by basing our computations on m Gaussian basis functions with arbitrary diagonal covariance matrices (or length scales). For a fixed number of basis functions and any given criterion, this additional flexibility permits approximations no worse and typically better than was previously possible. We perform gradient-based optimisation of the marginal likelihood, which costs O(m²n) time where n is the number of data points, and compare the method to various other sparse g.p. methods. Although we focus on g.p. regression, the central idea is applicable to all kernel-based algorithms, and we also provide some results for the support vector machine (s.v.m.) and kernel ridge regression (k.r.r.). Our approach outperforms the other methods, particularly for the case of very few basis functions, i.e., a very high sparsity ratio.
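The two computational ingredients the abstract mentions, m Gaussian basis functions that each carry their own length scale, and an O(m²n) evaluation of the marginal likelihood, can be illustrated with a simplified weight-space sketch. This is not the paper's exact multiscale construction (which keeps the basis consistent with the full g.p. prior); the model, the function names `gaussian_basis` and `log_marginal_likelihood`, the parameters `noise_var` and `prior_var`, and the use of a single isotropic length scale per basis instead of a full diagonal covariance are all illustrative assumptions.

```python
import numpy as np

def gaussian_basis(X, centres, lengthscales):
    """Evaluate m Gaussian basis functions, each with its own length scale.

    X            : (n, d) inputs
    centres      : (m, d) basis-function centres
    lengthscales : (m,)   one isotropic length scale per basis function
                   (a simplification; the paper allows full diagonal covariances)
    Returns Phi  : (n, m) with Phi[j, i] = exp(-||x_j - c_i||^2 / (2 l_i^2))
    """
    sq_dists = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(-1)   # (n, m)
    return np.exp(-0.5 * sq_dists / lengthscales[None, :] ** 2)

def log_marginal_likelihood(X, y, centres, lengthscales, noise_var=0.1, prior_var=1.0):
    """Log marginal likelihood of the linear-in-the-weights model
    y = Phi w + eps,  w ~ N(0, prior_var * I),  eps ~ N(0, noise_var * I),
    evaluated with the Woodbury and matrix determinant lemmas so only
    m x m systems are solved -- O(m^2 n) time, no n x n matrix is formed."""
    n, m = X.shape[0], centres.shape[0]
    Phi = gaussian_basis(X, centres, lengthscales)           # O(nm)
    A = Phi.T @ Phi / noise_var + np.eye(m) / prior_var      # O(m^2 n), (m, m)
    L = np.linalg.cholesky(A)
    b = Phi.T @ y / noise_var
    alpha = np.linalg.solve(L, b)
    # log|K| and y^T K^{-1} y for K = prior_var * Phi Phi^T + noise_var * I
    logdet = (n * np.log(noise_var) + m * np.log(prior_var)
              + 2.0 * np.log(np.diag(L)).sum())
    quad = y @ y / noise_var - alpha @ alpha
    return -0.5 * (quad + logdet + n * np.log(2.0 * np.pi))

# Tiny usage example on synthetic 1-d data.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
centres = np.linspace(-3, 3, 10)[:, None]        # m = 10 basis functions
lengthscales = np.full(10, 0.8)                  # free per-basis parameters
print(log_marginal_likelihood(X, y, centres, lengthscales))
```

In this sketch the centres and per-basis length scales are the free quantities that gradient-based optimisation of the marginal likelihood would adjust; wrapping the evaluation above in a generic optimiser would mimic, in spirit, the procedure the abstract describes.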