Learning algorithms are typically based on samples drawn independently from an identical distribution (i.i.d.). In this paper we consider a different setting in which samples are drawn according to a non-identical sequence of probability distributions: each sample is drawn from a different distribution. In this setting we investigate a fully online learning algorithm associated with a general convex loss function and a reproducing kernel Hilbert space (RKHS). The error analysis is conducted under the assumption that the sequence of marginal distributions converges polynomially in the dual of a Hölder space. For regression with the least squares or ε-insensitive loss, learning rates are given in both the RKHS norm and the L2 norm. For classification with the hinge loss or the support vector machine q-norm loss, rates are stated explicitly in terms of the excess misclassification error.
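To make the kind of update analyzed here concrete, below is a minimal Python sketch of an online regularized kernel algorithm of this general type, specialized to the least squares loss and a Gaussian kernel. The class name OnlineKernelLS, the kernel choice, and the polynomially decaying schedules for the step size eta_t and the regularization parameter lambda_t are illustrative assumptions, not the paper's exact construction or rates.

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    # Gaussian (RBF) kernel inducing the RKHS (an assumed choice)
    return np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2))

class OnlineKernelLS:
    """Illustrative fully online regularized least squares in an RKHS.

    Hypothesis after t steps: f_t(x) = sum_i alpha_i K(x_i, x).
    Update rule (least squares loss):
        f_{t+1} = f_t - eta_t * ((f_t(x_t) - y_t) K(x_t, .) + lambda_t * f_t)
    The exponents p, q in the decaying schedules are placeholders, not
    the exponents derived in the paper's error analysis.
    """

    def __init__(self, sigma=1.0, eta0=0.5, lam0=0.1, p=0.5, q=0.25):
        self.sigma, self.eta0, self.lam0, self.p, self.q = sigma, eta0, lam0, p, q
        self.points, self.alphas = [], []
        self.t = 0

    def predict(self, x):
        # Evaluate the kernel expansion f_t(x)
        return sum(a * gaussian_kernel(xi, x, self.sigma)
                   for xi, a in zip(self.points, self.alphas))

    def update(self, x, y):
        self.t += 1
        eta = self.eta0 * self.t ** (-self.p)   # step size eta_t
        lam = self.lam0 * self.t ** (-self.q)   # per-step regularization lambda_t
        err = self.predict(x) - y
        # The -eta*lam*f_t term shrinks all past coefficients uniformly
        self.alphas = [(1 - eta * lam) * a for a in self.alphas]
        # The loss-gradient term adds one new kernel coefficient at x_t
        self.points.append(x)
        self.alphas.append(-eta * err)

# Usage: stream samples whose marginal distribution drifts toward a limit,
# mimicking the non-identical sampling setting (a toy simulation, not data
# from the paper).
rng = np.random.default_rng(0)
model = OnlineKernelLS()
for t in range(1, 201):
    x = rng.normal(loc=1.0 / t, scale=1.0, size=1)  # drifting marginal
    y = np.sin(x[0]) + 0.1 * rng.normal()
    model.update(x, y)
print(model.predict(np.array([0.5])))
```

Two features of the sketch mirror the setting described above: the regularization parameter lambda_t changes at every step rather than being fixed in advance, which is what "fully online" refers to in this line of work, and the multiplicative shrinkage of past coefficients is exactly how the regularization term acts on a kernel expansion.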