In this paper we propose efficient algorithms for solving constrained online convex optimization problems. Our motivation stems from the observation that most algorithms for online convex optimization require a projection onto the convex set K from which the decisions are drawn. While this projection is straightforward for simple shapes (e.g., a Euclidean ball), for arbitrary complex sets it is the main computational bottleneck and may be inefficient in practice. We therefore consider an alternative formulation of online convex optimization: instead of requiring that the decision belong to K in every round, we only require that the constraints defining K be satisfied in the long run. By turning the problem into an online convex-concave optimization problem, we propose an efficient algorithm that achieves an O(√T) regret bound and an O(T^{3/4}) bound on the violation of constraints. We then modify the algorithm to guarantee that the constraints are satisfied in the long run; this gain comes at the price of an O(T^{3/4}) regret bound. Our second algorithm, based on the mirror prox method (Nemirovski, 2005) for solving variational inequalities, achieves an O(T^{2/3}) bound for both the regret and the violation of constraints when the domain K can be described by a finite number of linear constraints. Finally, we extend the results to the setting where we have only partial access to the convex set K, and propose a multipoint bandit feedback algorithm with the same bounds, in expectation, as our first algorithm.
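The convex-concave approach described above can be illustrated with a minimal sketch: run online gradient descent on the primal variable and gradient ascent on a dual variable of a Lagrangian-style function, so that only a projection onto a simple set (here a Euclidean ball) is ever needed, while the harder constraint defining K is enforced only in the long run. The linear losses, the single linear constraint, the step size, and the dual damping term below are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def primal_dual_oco(costs, a, b, radius=1.0):
    """Sketch of primal-dual online optimization with a long-term constraint.

    Assumed setup (hypothetical, for illustration):
      - linear losses f_t(x) = <costs[t], x>,
      - one linear constraint g(x) = <a, x> - b <= 0 defining K,
      - decisions kept in a Euclidean ball of given radius, whose
        projection is cheap (unlike a projection onto K itself).
    Performs descent on x and ascent on the multiplier lam for the
    damped Lagrangian L_t(x, lam) = f_t(x) + lam*g(x) - (delta/2)*lam^2.
    """
    T = len(costs)
    d = len(a)
    x = np.zeros(d)             # primal iterate
    lam = 0.0                   # dual variable for the constraint
    eta = 1.0 / np.sqrt(T)      # step size (illustrative choice)
    delta = 1.0 / np.sqrt(T)    # dual damping (illustrative choice)
    total_violation = 0.0
    for t in range(T):
        g = float(a @ x - b)
        total_violation += max(g, 0.0)   # accumulated constraint violation
        # Gradient step: descend in x, ascend in lam.
        x = x - eta * (costs[t] + lam * a)
        nx = np.linalg.norm(x)
        if nx > radius:                  # cheap projection onto the ball
            x = x * (radius / nx)
        lam = max(0.0, lam + eta * (g - delta * lam))
    return x, total_violation
```

The point of the sketch is structural: each round costs only a gradient step and a ball projection, and the analysis in the paper bounds both the regret and the accumulated violation returned here, rather than forcing every iterate into K.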