Machine Learning - Special issue on context sensitivity and concept drift
Generalization performance of support vector machines and other pattern classifiers
Advances in kernel methods
The robustness of the p-norm algorithms
COLT '99 Proceedings of the Twelfth Annual Conference on Computational Learning Theory
Large margin classification using the perceptron algorithm
Machine Learning - The Eleventh Annual Conference on Computational Learning Theory
General convergence results for linear discriminant updates
Machine Learning
The relaxed online maximum margin algorithm
Machine Learning
Learning additive models online with fast evaluating kernels
COLT '01/EuroCOLT '01 Proceedings of the 14th Annual Conference on Computational Learning Theory and 5th European Conference on Computational Learning Theory
Tracking linear-threshold concepts with Winnow
COLT '02 Proceedings of the 15th Annual Conference on Computational Learning Theory
Tracking the best linear predictor
The Journal of Machine Learning Research
A new approximate maximal margin classification algorithm
The Journal of Machine Learning Research
Online learning of linear classifiers
Advanced lectures on machine learning
Tracking linear-threshold concepts with Winnow
The Journal of Machine Learning Research
Solving large scale linear prediction problems using stochastic gradient descent algorithms
ICML '04 Proceedings of the Twenty-First International Conference on Machine Learning
Tracking a moving hypothesis for visual data with explicit switch detection
CISDA'09 Proceedings of the Second IEEE International Conference on Computational Intelligence for Security and Defense Applications
Online learning with multiple kernels: A review
Neural Computation
We consider using online large margin classification algorithms in a setting where the target classifier may change over time. The algorithms we consider are Gentile's ALMA and an algorithm we call NORMA, which performs a modified online gradient descent with respect to a regularised risk. The update rule of ALMA includes a projection-based regularisation step, whereas NORMA uses a weight-decay type of regularisation. For ALMA we can prove mistake bounds in terms of the total distance the target moves during the trial sequence. For NORMA we need the additional assumption that the movement rate stays sufficiently low uniformly over time. In addition to the movement of the target, the mistake bounds for both algorithms depend on the hinge loss of the target. Both algorithms use a margin parameter that can be tuned to make them mistake-driven (update only when a classification error occurs) or more aggressive (update whenever the confidence of the classification falls below the margin). We obtain similar mistake bounds for both the mistake-driven and a suitably aggressive tuning. Experiments on artificial data confirm that an aggressive tuning is often useful even when the goal is simply to minimise the number of mistakes.
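
The abstract pins down the form of NORMA's update closely enough to sketch a linear (non-kernelised) instantiation: at each trial the weight vector first shrinks by a weight-decay factor and then takes a hinge-loss gradient step whenever the prediction's confidence falls below the margin parameter. The Python sketch below is illustrative only; the step size eta, decay lam, margin value, and the rotating-target stream are hypothetical choices for demonstration, not the paper's algorithmic constants or experimental setup.

import numpy as np

def norma_linear(stream, eta=0.1, lam=0.01, margin=0.0):
    # Sketch of a NORMA-style update for a linear classifier
    # (illustrative parameters, not the paper's settings).
    # Each trial: (1) shrink w by the weight-decay factor (1 - eta*lam),
    # the regularisation step the abstract attributes to NORMA, then
    # (2) take a hinge-loss gradient step when the confidence y*<w, x>
    # falls below `margin`.
    w = None
    mistakes = 0
    for x, y in stream:                   # labels y in {-1, +1}
        if w is None:
            w = np.zeros_like(x)
        conf = y * np.dot(w, x)           # confidence of the prediction
        if conf <= 0:
            mistakes += 1                 # an actual classification error
        w = (1.0 - eta * lam) * w         # weight-decay regularisation
        if conf < margin:                 # margin violated -> gradient step
            w = w + eta * y * x
    return w, mistakes

def drifting_stream(T=2000, angle_per_step=1e-3, seed=0):
    # Toy 2-d stream whose target hyperplane rotates slowly, mimicking
    # a target that moves a small, bounded distance per trial.
    rng = np.random.default_rng(seed)
    theta = 0.0
    for _ in range(T):
        u = np.array([np.cos(theta), np.sin(theta)])  # current target
        x = rng.standard_normal(2)
        yield x, (1 if np.dot(u, x) >= 0 else -1)
        theta += angle_per_step                       # target drift

w, m = norma_linear(drifting_stream(), margin=0.5)
print(f"mistakes on the drifting stream: {m}")

Setting margin=0.0 makes the update mistake-driven (the gradient step fires only on errors), while margin > 0 gives the aggressive tuning that, per the abstract's experiments, often reduces the total mistake count even when the target drifts.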