Neural Computation
We present an approximate Bayesian method for regression and classification with models that are linear in the parameters. As in the Relevance Vector Machine (RVM), each parameter is associated with an expansion vector. Unlike the RVM, however, the number of expansion vectors is specified beforehand. We assume an overall Gaussian prior on the parameters and find, with a gradient-based procedure, the expansion vectors that (locally) maximize the evidence. This approach has lower computational demands than the RVM, and has the advantage that the expansion vectors need not belong to the training set; in principle, better vectors can therefore be found. Furthermore, other hyperparameters can be learned in the same smooth joint optimization. Experimental results show that the freedom of the expansion vectors to lie away from the training data causes overfitting problems. These problems are alleviated by including a hyperprior that penalizes expansion vectors located far from the input data.
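The regression case described in the abstract can be sketched as follows: with a design matrix built from the expansion vectors and an overall Gaussian prior on the weights, the evidence (marginal likelihood) is a Gaussian whose log can be maximized with respect to the expansion vector locations by gradient ascent. This is a minimal illustrative sketch, not the authors' implementation: the RBF basis, one-dimensional inputs, the names `phi`, `ell`, `alpha`, `sigma2`, and the finite-difference gradient with backtracking are all assumptions made here for clarity.

```python
import numpy as np

def phi(X, Z, ell=0.5):
    # Assumed RBF design matrix: Phi[i, j] = exp(-(x_i - z_j)^2 / (2 ell^2)),
    # where Z holds the expansion vectors (not restricted to the training set).
    d = X[:, None] - Z[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

def log_evidence(X, y, Z, alpha=1.0, sigma2=0.1):
    # With weight prior w ~ N(0, alpha^{-1} I) and Gaussian noise of variance
    # sigma2, the evidence is y ~ N(0, C) with C = sigma2 I + alpha^{-1} Phi Phi^T.
    P = phi(X, Z)
    n = len(X)
    C = sigma2 * np.eye(n) + (1.0 / alpha) * P @ P.T
    _, logdet = np.linalg.slogdet(C)
    return -0.5 * (n * np.log(2 * np.pi) + logdet + y @ np.linalg.solve(C, y))

def optimize_Z(X, y, Z0, lr=0.05, steps=100, eps=1e-5):
    # Crude gradient ascent on the log evidence w.r.t. the expansion vector
    # locations, using central finite differences; a step is accepted only if
    # it does not decrease the evidence, so the objective is non-decreasing.
    Z = Z0.copy()
    for _ in range(steps):
        g = np.zeros_like(Z)
        for j in range(len(Z)):
            Zp, Zm = Z.copy(), Z.copy()
            Zp[j] += eps
            Zm[j] -= eps
            g[j] = (log_evidence(X, y, Zp) - log_evidence(X, y, Zm)) / (2 * eps)
        Z_new = Z + lr * g
        if log_evidence(X, y, Z_new) >= log_evidence(X, y, Z):
            Z = Z_new
        else:
            lr *= 0.5  # backtrack on overshoot
    return Z
```

Because the expansion vectors are continuous variables here, the hyperprior mentioned in the abstract could be incorporated by simply subtracting a penalty (e.g. the distance from each vector to its nearest input) from the objective inside `optimize_Z`; that term is omitted to keep the sketch short.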