Online algorithms that process one example at a time are advantageous when dealing with very large data sets or with data streams. Stochastic Gradient Descent (SGD) is such an algorithm, and it is an attractive choice for online Support Vector Machine (SVM) training due to its simplicity and effectiveness. When equipped with kernel functions, however, SGD is susceptible, like other kernel SVM learning algorithms, to the curse of kernelization: the model size and the per-update time grow linearly and without bound with the amount of data. This can render SGD inapplicable to large data sets. We address this issue by presenting a class of Budgeted SGD (BSGD) algorithms for large-scale kernel SVM training that have constant space and constant time complexity per update. Specifically, BSGD keeps the number of support vectors bounded during training through several budget maintenance strategies. We treat budget maintenance as a source of gradient error, and show that the gap between the BSGD and the optimal SVM solutions depends on the model degradation caused by budget maintenance. To minimize this gap, we study greedy budget maintenance methods based on removal, projection, and merging of support vectors. We propose budgeted versions of several popular online SVM algorithms that belong to the SGD family, and we further derive BSGD algorithms for multi-class SVM training. Comprehensive empirical results show that BSGD achieves higher accuracy than state-of-the-art budgeted online algorithms and accuracy comparable to that of non-budgeted algorithms, while being highly efficient in both time and space during training and prediction.
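To make the budgeted update concrete, below is a minimal sketch of one BSGD variant: a Pegasos-style kernelized SGD step combined with a removal-based budget maintenance strategy. The function and parameter names (`budgeted_kernel_sgd`, `budget`, `lam`, `gamma`) are hypothetical illustrations, and discarding the support vector with the smallest coefficient magnitude is just one simple removal heuristic; the projection and merging strategies studied in the paper are not shown.

```python
import numpy as np

def rbf_kernel(x, Z, gamma=1.0):
    """Gaussian RBF kernel between one example x and each row of Z."""
    D = Z - x
    return np.exp(-gamma * np.sum(D * D, axis=1))

def budgeted_kernel_sgd(X, y, budget=100, lam=1e-4, gamma=1.0, epochs=1, seed=0):
    """Pegasos-style kernel SGD with removal-based budget maintenance.

    X: (n, d) array of examples; y: labels in {-1, +1}.
    At most `budget` support vectors are kept: whenever the budget is
    exceeded, the SV whose coefficient has the smallest magnitude
    (a simple proxy for the least model degradation) is removed.
    """
    rng = np.random.default_rng(seed)
    sv, alpha = [], []   # support vectors and their coefficients
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            t += 1
            eta = 1.0 / (lam * t)
            # Margin of the current model on example i (before the update).
            f = float(np.dot(alpha, rbf_kernel(X[i], np.asarray(sv), gamma))) if sv else 0.0
            # Regularization shrinks every existing coefficient.
            alpha = [(1.0 - eta * lam) * a for a in alpha]
            if y[i] * f < 1.0:           # hinge loss is active: add a new SV
                sv.append(X[i])
                alpha.append(eta * y[i])
                if len(sv) > budget:     # budget maintenance by removal
                    j = int(np.argmin(np.abs(alpha)))
                    sv.pop(j)
                    alpha.pop(j)
    return np.asarray(sv), np.asarray(alpha)
```

A prediction for a new example x is then sign(sum_j alpha_j k(sv_j, x)). Because both the stored model and each update touch at most `budget` support vectors, training and prediction cost constant time and space per example, which is the property the abstract refers to.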