The goal of a learner in standard online learning is to suffer cumulative loss not much larger than that of the best-performing prediction function from some fixed class. Numerous algorithms have been shown to drive this gap arbitrarily close to zero relative to the best function chosen offline. Nevertheless, many real-world applications (such as adaptive filtering) are non-stationary in nature, and the best prediction function may not be fixed but may drift over time. We introduce a new algorithm for regression that uses per-feature learning rates and provide a regret bound with respect to the best sequence of drifting functions. We show that as long as the cumulative drift is sub-linear in the length of the sequence, our algorithm suffers sub-linear regret as well. We also sketch an algorithm that achieves the best of both worlds: O(log T) regret in the stationary setting and sub-linear regret in the non-stationary setting. Simulations demonstrate the usefulness of our algorithm compared with other state-of-the-art approaches.
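To make the idea of per-feature learning rates concrete, here is a minimal sketch of an online least-squares learner in which each coordinate shrinks its own step size as that feature accumulates signal. This is an illustrative AdaGrad-style construction under our own assumptions, not the paper's exact algorithm or its regret-optimal tuning; `eps` and the update rule are choices made for the sketch.

```python
import numpy as np

def per_feature_ogd(X, y, eps=1.0):
    """Online regression with a separate learning rate per feature.

    Illustrative sketch (not the paper's algorithm): feature j keeps an
    accumulator s[j] of its squared inputs, and its effective step size
    decays as 1 / (eps + s[j]), so frequently active features adapt more
    slowly while rare features keep a large learning rate.
    Returns the sequence of per-round squared losses.
    """
    T, d = X.shape
    w = np.zeros(d)
    s = np.zeros(d)  # per-feature accumulated squared inputs
    losses = np.empty(T)
    for t in range(T):
        x_t, y_t = X[t], y[t]
        y_hat = w @ x_t
        losses[t] = (y_hat - y_t) ** 2
        s += x_t ** 2
        # gradient of (1/2)(y_hat - y_t)^2, scaled per-coordinate
        w -= (y_hat - y_t) * x_t / (eps + s)
    return losses
```

On stationary data the per-round loss decreases as the per-feature step sizes shrink; under drift, one would additionally keep the accumulators from growing without bound (e.g. by discounting), which is the kind of trade-off the regret analysis quantifies.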