This article studies the problem of approximating functions belonging to a Hilbert space $\mathcal{H}_d$ with an isotropic or anisotropic translation-invariant (or stationary) reproducing kernel, with special attention given to the Gaussian kernel $K_d(\boldsymbol{x},\boldsymbol{t}) = \exp\big(-\sum_{\ell=1}^d\gamma_\ell^2(x_\ell-t_\ell)^2\big)$ for all $\boldsymbol{x},\boldsymbol{t}\in\mathbb{R}^d$. The isotropic (or radial) case corresponds to using the same shape parameter for all coordinates, i.e., $\gamma_\ell=\gamma>0$ for all $\ell$, whereas the anisotropic case corresponds to varying $\gamma_\ell$. The approximation error of the optimal algorithm, known as a meshfree or kriging method, is known to decay faster than any polynomial in $n^{-1}$ for fixed $d$, where $n$ is the number of data points. We are especially interested in moderate to large $d$, which arises in particular in the construction of surrogates for computer experiments. This article presents dimension-independent error bounds, i.e., the error is bounded by $Cn^{-p}$, where $C$ and $p$ are independent of both $d$ and $n$; this is equivalent to strong polynomial tractability. The pertinent error criterion is the worst case of such an algorithm over the unit ball in $\mathcal{H}_d$, with the error for a single function measured in the $\mathcal{L}_2$ norm whose weight is also a Gaussian, used to “localize” $\mathbb{R}^d$. We consider two classes of algorithms: (i) those using data generated by finitely many arbitrary linear functionals, and (ii) those using only finitely many function values. Provided that arbitrary linear functional data are available, we show that $p=1/2$ is possible for any translation-invariant positive definite kernel. We also consider the sequence of shape parameters $\gamma_\ell$ decaying to zero like $\ell^{-\omega}$ as $\ell$ tends to $\infty$. Note that for large $\omega$ this means that the function to be approximated is “essentially low-dimensional.” Then the largest $p$ is roughly $\max(1/2,\omega)$. If only function values are available, the dimension-independent convergence rates are somewhat worse. If the goal is to make the error smaller than $Cn^{-p}$ times the initial $(n=0)$ error, then the corresponding dimension-independent exponent $p$ is roughly $\omega$. In particular, in the isotropic case, when $\omega=0$, the error does not even decay polynomially in $n^{-1}$. In summary, excellent dimension-independent error decay rates are possible only when the sequence of shape parameters decays rapidly.
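For concreteness, the following is a minimal sketch (not the paper's optimal algorithm) of a kernel interpolant of the meshfree/kriging type built from finitely many function values, using the anisotropic Gaussian kernel defined above with shape parameters $\gamma_\ell = \ell^{-\omega}$. All names (`gauss_kernel`, `fit_interpolant`, the test function, the NumPy-based setup) are illustrative assumptions, and the Monte Carlo root-mean-square error at the end approximates the Gaussian-weighted $\mathcal{L}_2$ error for one test function only, not the worst-case error over the unit ball in $\mathcal{H}_d$.

```python
# Sketch of anisotropic Gaussian kernel interpolation from function values.
# Kernel: K_d(x, t) = exp(-sum_ell gamma_ell^2 (x_ell - t_ell)^2).
import numpy as np

def gauss_kernel(X, T, gamma):
    """Anisotropic Gaussian kernel matrix with entries K_d(X[i], T[j])."""
    diff = X[:, None, :] - T[None, :, :]            # shape (m, n, d)
    return np.exp(-np.sum((gamma * diff) ** 2, axis=-1))

def fit_interpolant(X, y, gamma, reg=1e-12):
    """Solve (K + reg*I) c = y; the tiny ridge guards against the severe
    ill-conditioning of Gaussian kernel matrices with small shape parameters."""
    K = gauss_kernel(X, X, gamma)
    c = np.linalg.solve(K + reg * np.eye(len(y)), y)
    return lambda Xnew: gauss_kernel(Xnew, X, gamma) @ c

# Illustrative setup: d = 10, shape parameters gamma_ell = ell^{-omega}.
rng = np.random.default_rng(0)
d, n, omega = 10, 200, 1.0
gamma = np.arange(1, d + 1, dtype=float) ** (-omega)
f = lambda X: np.exp(-np.sum((gamma * X) ** 2, axis=-1))   # smooth test function
X = rng.standard_normal((n, d))        # data sites drawn from the Gaussian weight
s = fit_interpolant(X, f(X), gamma)

# Monte Carlo estimate of the Gaussian-weighted L2 error for this one function.
Xtest = rng.standard_normal((1000, d))
print("rms error:", np.sqrt(np.mean((s(Xtest) - f(Xtest)) ** 2)))
```

With larger $\omega$ the higher coordinates contribute little (the "essentially low-dimensional" regime), which is when the abstract's dimension-independent rates are most favorable.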