There is a growing body of learning problems for which it is natural to organize the parameters into a matrix. This makes it possible to impose sophisticated prior knowledge by regularizing the parameters under an appropriate matrix norm. This work describes and analyzes a systematic method for constructing such matrix-based regularization techniques. In particular, we focus on how the underlying statistical properties of a given problem can help us decide which regularization function is appropriate. Our methodology is based on a known duality phenomenon: a function is strongly convex with respect to some norm if and only if its conjugate function is strongly smooth with respect to the dual norm. This result has already proved to be a key component in deriving and analyzing several learning algorithms. We demonstrate the potential of this framework by deriving novel generalization and regret bounds for multi-task learning, multi-class learning, and multiple kernel learning.
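The duality invoked above can be stated precisely as follows (a sketch in assumed standard notation, where $f^{*}$ denotes the Fenchel conjugate of $f$ and $\|\cdot\|_{*}$ the dual norm of $\|\cdot\|$):

\[
f \ \text{is } \beta\text{-strongly convex w.r.t. } \|\cdot\|
\;\Longleftrightarrow\;
f^{*} \ \text{is } \tfrac{1}{\beta}\text{-strongly smooth w.r.t. } \|\cdot\|_{*},
\]

where $\beta$-strong convexity means $f(u) \ge f(v) + \langle \nabla f(v), u - v\rangle + \tfrac{\beta}{2}\|u - v\|^{2}$ (with subgradients in place of $\nabla f(v)$ when $f$ is not differentiable), and $\tfrac{1}{\beta}$-strong smoothness means $f^{*}(x) \le f^{*}(y) + \langle \nabla f^{*}(y), x - y\rangle + \tfrac{1}{2\beta}\|x - y\|_{*}^{2}$ for all points in the respective domains.

To illustrate how a strongly convex matrix regularizer yields an algorithm, the following is a minimal sketch, not the authors' code: online mirror descent with the squared group (2,q)-norm, $f(W) = \tfrac{1}{2}\|W\|_{2,q}^{2}$, a matrix regularizer of the kind the abstract alludes to for multi-task learning. The function names, the exponent q, and the step size eta are illustrative assumptions.

# Sketch of online mirror descent with the squared group (2,q)-norm regularizer
# f(W) = 0.5 * ||W||_{2,q}^2, where rows of W (features) are grouped across tasks.
# Its strong convexity w.r.t. the (2,q)-norm is what the stated duality would
# convert into a regret bound. Names and constants are illustrative assumptions.
import numpy as np

def grad_f(W, q):
    """Gradient (mirror map) of 0.5 * ||W||_{2,q}^2, with rows as groups."""
    r = np.linalg.norm(W, axis=1) + 1e-12            # row 2-norms (eps avoids 0/0)
    norm_2q = (r ** q).sum() ** (1.0 / q)            # ||W||_{2,q}
    return (norm_2q ** (2 - q)) * (r ** (q - 2))[:, None] * W

def grad_f_star(Theta, q):
    """Gradient of the conjugate 0.5 * ||.||_{2,p}^2: same map with the dual exponent."""
    p = q / (q - 1.0)                                # 1/p + 1/q = 1
    return grad_f(Theta, p)

def mirror_descent_step(W, G, q, eta):
    """One update: W_{t+1} = grad f*( grad f(W_t) - eta * G_t )."""
    return grad_f_star(grad_f(W, q) - eta * G, q)

# Toy usage: d features, k tasks, squared-loss gradients on random data.
rng = np.random.default_rng(0)
d, k, q, eta = 20, 5, 1.2, 0.1
W = np.zeros((d, k))
for _ in range(100):
    x, y = rng.normal(size=(d,)), rng.normal(size=(k,))
    G = np.outer(x, x @ W - y)                       # gradient of 0.5 * ||W^T x - y||^2
    W = mirror_descent_step(W, G, q, eta)

Taking q slightly above 1 couples the rows (features) across tasks and encourages shared feature selection; the strong convexity of this regularizer is exactly the property the duality above would turn into a regret bound for the update.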