We consider learning on graphs, guided by kernels that encode similarity between vertices. Our focus is on random walk kernels, the analogues of squared exponential kernels in Euclidean spaces. We show that on large, locally tree-like graphs these kernels have some counterintuitive properties, specifically in the limit of large kernel lengthscales. We consider using these kernels as covariance functions of Gaussian processes. In this setting one typically scales the prior globally, so that the prior variance averaged across vertices is normalised. We demonstrate that, in contrast to the Euclidean case, this generically leads to significant variation of the prior variance across vertices, which is undesirable from a probabilistic modelling point of view. We suggest instead that the random walk kernel should be normalised locally, so that each vertex has the same prior variance, and analyse the consequences of this choice by studying learning curves for Gaussian process regression. Numerical calculations, as well as novel theoretical predictions for the learning curves using belief propagation, show that one obtains distinctly different probabilistic models depending on the choice of normalisation. Our belief propagation method for predicting the learning curves is significantly more accurate than previous approximations and should become exact in the limit of large random graphs.
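To make the two normalisation schemes concrete, the following is a minimal sketch (not the authors' code) assuming the standard random walk kernel form K = (I - a^{-1} L)^p, where L = I - D^{-1/2} A D^{-1/2} is the normalised graph Laplacian of adjacency matrix A. The graph size, the parameters a and p, the noise level sigma2, and the use of networkx to build a random 3-regular graph are all illustrative assumptions. The snippet compares global normalisation (average prior variance equals one, but individual vertex variances spread out) with local normalisation (every vertex has unit prior variance), then traces a crude numerical learning curve via the average GP posterior variance.

    import numpy as np
    import networkx as nx

    rng = np.random.default_rng(0)

    # Random 3-regular graph as a stand-in for a large, locally tree-like graph.
    n = 500
    G = nx.random_regular_graph(3, n, seed=0)
    A = nx.to_numpy_array(G)
    d = A.sum(axis=1)
    L = np.eye(n) - A / np.sqrt(np.outer(d, d))      # normalised graph Laplacian

    # Random walk kernel K = (I - L/a)^p; a >= 2 keeps the kernel positive
    # semi-definite, larger p corresponds to a larger lengthscale.
    a, p = 2.0, 10
    K = np.linalg.matrix_power(np.eye(n) - L / a, p)

    # Global normalisation: rescale so the *average* prior variance is 1.
    # On a graph the individual variances K_ii still vary from vertex to vertex.
    K_global = K / np.diag(K).mean()
    print("global: prior variance range",
          np.diag(K_global).min(), "-", np.diag(K_global).max())

    # Local normalisation: rescale so *every* vertex has prior variance 1.
    s = 1.0 / np.sqrt(np.diag(K))
    K_local = K * np.outer(s, s)
    print("local:  all prior variances =", np.diag(K_local)[0])

    # Crude numerical learning curve: Bayes error estimated as the average
    # posterior variance after observing nu noisy examples at random vertices.
    sigma2 = 0.1                                      # assumed noise level
    for nu in (10, 50, 200):
        idx = rng.choice(n, size=nu, replace=False)
        Kxx = K_local[np.ix_(idx, idx)] + sigma2 * np.eye(nu)
        Kx = K_local[:, idx]
        post_var = np.diag(K_local) - np.einsum(
            "ij,jk,ik->i", Kx, np.linalg.inv(Kxx), Kx)
        print(f"nu={nu}: Bayes error ~ {post_var.mean():.3f}")

Swapping K_local for K_global in the learning-curve loop illustrates the abstract's central point: the two normalisations define distinctly different priors, and hence different probabilistic models, even though they differ only by a diagonal rescaling.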