The paper studies convex stochastic optimization problems in a reproducing kernel Hilbert space (RKHS). The objective (risk) functional depends on functions from this RKHS and takes the form of a mathematical expectation (integral) of a nonnegative integrand (loss function) with respect to a probability measure. The problem is generally ill-posed, a difficulty that statistical learning addresses through Tikhonov regularization combined with Monte Carlo approximation of the integrals; this also makes it possible to reduce the problem to finite-dimensional (convex) quadratic optimization. The approximate solutions, referred to as kernel learning estimators, are expressed as linear combinations of kernels evaluated at the sample points; they are functional random variables that depend on the full sample. The paper studies the probabilistic convergence of these approximate solutions as the regularization parameter is gradually driven to zero with a growing number of observations. Its intended contribution is to derive novel nonasymptotic bounds on the minimization error and exponential bounds on the tail distribution of the errors, and to establish new sufficient conditions for uniform convergence of the kernel estimators to the true (normal) solution with probability one, together with a rule for decreasing the regularization parameter as the sample size increases. Applications to least squares, median, and quantile regression estimation, as well as to binary classification, are discussed.
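The reduction described above can be illustrated for the least-squares case. By the representer theorem, the Tikhonov-regularized empirical risk minimizer in an RKHS is a linear combination of kernels at the sample points, and the coefficients solve a finite-dimensional linear system. The following is a minimal sketch, not the paper's own implementation; the Gaussian kernel, the parameter names `lam` and `gamma`, and the toy data are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X1, X2, gamma=1.0):
    # Gaussian (RBF) kernel matrix: k(x, x') = exp(-gamma * ||x - x'||^2)
    sq_dists = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

def kernel_ridge_fit(X, y, lam=0.1, gamma=1.0):
    # Representer theorem: the regularized minimizer has the form
    # f(x) = sum_i alpha_i k(x_i, x), and for the squared loss the
    # coefficients solve the linear system (K + n*lam*I) alpha = y.
    n = len(X)
    K = rbf_kernel(X, X, gamma)
    alpha = np.linalg.solve(K + n * lam * np.eye(n), y)
    return alpha

def kernel_ridge_predict(X_train, alpha, X_new, gamma=1.0):
    # Evaluate the kernel expansion at new points
    return rbf_kernel(X_new, X_train, gamma) @ alpha

# Illustrative usage: noisy observations of a sine function
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(50, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(50)

alpha = kernel_ridge_fit(X, y, lam=0.01)
pred = kernel_ridge_predict(X, alpha, X)
```

The regularization weight `lam` plays the role of the paper's regularization parameter: the convergence results concern letting it shrink at a suitable rate as the sample size `n` grows.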