In this paper, we revisit the classical technique of Regularized Least Squares (RLS) for the classification of large-scale nonlinear data. Specifically, we focus on a low-rank formulation of RLS and show that, for problems with moderate feature dimension, its time complexity is linear in the data size and independent of the numbers of labels and features. This makes low-rank RLS particularly suitable for classification on large data sets. Moreover, we propose a general theorem giving closed-form solutions to the Leave-One-Out Cross Validation (LOOCV) estimation problem in empirical risk minimization, which encompasses all types of RLS classifiers as special cases. This eliminates the reliance on cross validation, a computationally expensive procedure for parameter selection, and greatly accelerates the training of RLS classifiers. Experimental results on real and synthetic large-scale benchmark data sets show that low-rank RLS achieves classification performance comparable to standard kernel SVM for nonlinear classification while being far more efficient, and the efficiency gain is more pronounced on higher-dimensional data sets.
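To make the two ingredients of the abstract concrete, the sketch below illustrates one plausible reading of them in Python: a Nyström-style low-rank approximation of the kernel, which reduces RLS to an m-dimensional linear system solvable in time linear in the number of samples n, combined with the standard hat-matrix shortcut that yields LOOCV residuals in closed form without refitting. The RBF kernel, the random landmark selection, and the specific ridge-regression LOOCV identity are our assumptions for illustration; the paper's actual formulation and its more general LOOCV theorem may differ.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.1):
    # Pairwise RBF kernel between the rows of A and the rows of B.
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * sq)

def low_rank_rls(X, y, landmarks, lam=1e-2, gamma=0.1):
    """Illustrative low-rank RLS (not the paper's exact algorithm).

    Nystroem approximation: K ~ Z Z^T with an explicit n x m feature map Z,
    so fitting costs O(n m^2 + m^3), i.e. linear in the sample size n.
    """
    Knm = rbf_kernel(X, landmarks, gamma)          # n x m cross-kernel
    Kmm = rbf_kernel(landmarks, landmarks, gamma)  # m x m landmark kernel
    # Symmetric inverse square root of Kmm (small jitter for stability).
    w, V = np.linalg.eigh(Kmm + 1e-8 * np.eye(len(landmarks)))
    Z = Knm @ (V / np.sqrt(w)) @ V.T               # explicit low-rank features
    A = Z.T @ Z + lam * np.eye(Z.shape[1])         # m x m regularized system
    coef = np.linalg.solve(A, Z.T @ y)
    fitted = Z @ coef
    # Closed-form LOOCV for ridge/RLS: with hat matrix H = Z A^{-1} Z^T,
    # the leave-one-out residual is e_i = (y_i - f_i) / (1 - H_ii),
    # so no model is ever refit when estimating generalization error.
    h = np.einsum('ij,ij->i', Z @ np.linalg.inv(A), Z)
    loo_mse = np.mean(((y - fitted) / (1.0 - h)) ** 2)
    return coef, loo_mse

# Hypothetical usage: pick m random landmarks, fit, score the LOOCV error.
rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 10))
y = np.sign(X[:, 0] * X[:, 1])                     # labels in {-1, +1}
landmarks = X[rng.choice(len(X), size=100, replace=False)]
coef, loo_mse = low_rank_rls(X, y, landmarks)
```

In a sketch like this, the closed-form LOOCV error could be evaluated for a grid of regularization parameters at essentially the cost of one fit per parameter, which is the acceleration the abstract attributes to eliminating explicit cross validation.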