Neural networks and the bias/variance dilemma. Neural Computation.
A practical Bayesian framework for backpropagation networks. Neural Computation.
Machine Learning.
Nonlinear component analysis as a kernel eigenvalue problem. Neural Computation.
Choosing Multiple Parameters for Support Vector Machines. Machine Learning.
Sparse on-line Gaussian processes. Neural Computation.
A Fast Dual Algorithm for Kernel Logistic Regression. ICML '02 Proceedings of the Nineteenth International Conference on Machine Learning.
Ridge Regression Learning Algorithm in Dual Variables. ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning.
Adaptive Sparseness for Supervised Learning. IEEE Transactions on Pattern Analysis and Machine Intelligence.
Sparse Bayesian learning and the relevance vector machine. The Journal of Machine Learning Research.
Efficient SVM training using low-rank kernel representations. The Journal of Machine Learning Research.
The evidence framework applied to classification networks. Neural Computation.
The evidence framework applied to support vector machines. IEEE Transactions on Neural Networks.
Sequential Bayesian kernel modelling with non-Gaussian noise. Neural Networks.
A classifier for Bangla handwritten numeral recognition. Expert Systems with Applications: An International Journal.
We present a simple technique that simplifies the construction of Bayesian treatments of a variety of sparse kernel learning algorithms. An incomplete Cholesky factorisation is employed to transform the dual parameter space such that the Gaussian prior over the dual model parameters is whitened. The regularisation term then corresponds to the usual weight-decay regulariser, allowing the Bayesian analysis to proceed via the evidence framework of MacKay. The incomplete Cholesky factorisation algorithm has a useful by-product: it also identifies a subset of the training data that forms an approximate basis for the entire dataset in the kernel-induced feature space, resulting in a sparse model. Bayesian treatments of the kernel ridge regression (KRR) algorithm, with both constant and heteroscedastic (input-dependent) variance structures, and of kernel logistic regression (KLR) are provided as illustrative examples of the proposed method, which we hope will prove more widely applicable.
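The pivoted incomplete Cholesky step at the heart of this construction can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the function name, the tolerance-based stopping rule, and the greedy largest-residual pivoting strategy are our assumptions. It factorises a kernel matrix K ≈ GGᵀ while recording the pivot indices, which identify the training points forming the approximate basis in feature space (and hence the sparse model).

```python
import numpy as np

def incomplete_cholesky(K, tol=1e-6):
    """Pivoted incomplete Cholesky factorisation of a PSD kernel matrix K.

    Returns a factor G (n x m, m <= n) with K ~= G @ G.T, and the list of
    pivot indices: the training points forming an approximate basis for
    the data in the kernel-induced feature space.
    """
    n = K.shape[0]
    d = np.diag(K).astype(float).copy()   # residual diagonal of K - G G^T
    G = np.zeros((n, 0))
    pivots = []
    while d.max() > tol:
        i = int(np.argmax(d))             # greedy pivot: largest residual
        pivots.append(i)
        # New column of G, orthogonalised against the columns found so far.
        g = (K[:, i] - G @ G[i]) / np.sqrt(d[i])
        G = np.column_stack([G, g])
        d -= g ** 2                       # update residual diagonal
        d[pivots] = 0.0                   # guard against round-off re-selection
    return G, pivots
```

Once `G` is available, reparametrising the dual coefficients through it whitens the Gaussian prior, so the penalty reduces to an ordinary weight-decay term and the evidence framework applies directly; the number of pivots, rather than the full training-set size, then controls the model's sparsity.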