Sequential learning with LS-SVM for large-scale data sets

  • Authors:
  • Tobias Jung; Daniel Polani

  • Affiliations:
  • Department of Computer Science, University of Mainz, Germany; School of Computer Science, University of Hertfordshire, UK

  • Venue:
  • ICANN'06: Proceedings of the 16th International Conference on Artificial Neural Networks, Part II
  • Year:
  • 2006

Abstract

We present a subspace-based variant of LS-SVMs (i.e., regularization networks) that processes the data sequentially and is hence especially suited for online learning tasks. The algorithm works by selecting from the data set a small subset of basis functions that is subsequently used to approximate the full kernel at arbitrary points. This subset is identified online from the data stream. We improve upon existing approaches (in particular, the kernel recursive least squares algorithm) by proposing a new, supervised criterion for selecting the relevant basis functions, one that takes into account both the error incurred from approximating the kernel and the reduction of the cost in the original learning task. We use the large-scale 'forest' data set to compare the performance and efficiency of our algorithm with greedy batch selection of the basis functions via orthogonal least squares. Using the same number of basis functions, we achieve comparable error rates at much lower cost in both CPU time and memory.
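To make the online basis-selection idea concrete, the following is a minimal sketch of the *unsupervised* baseline the paper improves upon: the approximate linear dependence (ALD) test from the kernel recursive least squares algorithm, which admits a point into the basis dictionary only if the existing basis functions cannot approximate it well in feature space. The class name, the Gaussian kernel, and the threshold `nu` are illustrative assumptions; the paper's own supervised criterion additionally weighs the reduction of the cost in the original learning task, which is omitted here.

```python
import numpy as np

def rbf(x, y, gamma=1.0):
    # Gaussian RBF kernel (illustrative choice; the method works
    # with any positive-definite kernel)
    return np.exp(-gamma * np.sum((x - y) ** 2))

class OnlineBasisSelector:
    """Sketch of online dictionary growth via the ALD test used in
    kernel RLS. The paper's supervised criterion would replace the
    purely geometric test in consider() below."""

    def __init__(self, kernel=rbf, nu=1e-2):
        self.kernel = kernel
        self.nu = nu       # admission threshold on the approximation error
        self.dict = []     # selected basis centers
        self.K_inv = None  # inverse kernel matrix of the dictionary

    def consider(self, x):
        """Process one point from the stream; return True if it was
        added to the basis dictionary."""
        if not self.dict:
            self.dict.append(x)
            self.K_inv = np.array([[1.0 / self.kernel(x, x)]])
            return True
        k = np.array([self.kernel(z, x) for z in self.dict])
        a = self.K_inv @ k                 # best coefficients over the dictionary
        delta = self.kernel(x, x) - k @ a  # squared feature-space residual
        if delta > self.nu:
            # grow the dictionary; update the inverse kernel matrix
            # incrementally via the block (Schur-complement) formula,
            # avoiding a full re-inversion
            n = len(self.dict)
            K_new = np.zeros((n + 1, n + 1))
            K_new[:n, :n] = self.K_inv + np.outer(a, a) / delta
            K_new[:n, n] = -a / delta
            K_new[n, :n] = -a / delta
            K_new[n, n] = 1.0 / delta
            self.K_inv = K_new
            self.dict.append(x)
            return True
        return False
```

Because each incoming point costs only O(m^2) for a dictionary of size m (and a point already representable by the dictionary is rejected), the dictionary stays small relative to the stream, which is what makes the sequential scheme feasible on large data sets such as 'forest'.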