On Dimension-independent Rates of Convergence for Function Approximation with Gaussian Kernels

  • Authors:
  • Gregory E. Fasshauer; Fred J. Hickernell; Henryk Woźniakowski

  • Affiliations:
  • fasshauer@iit.edu; hickernell@iit.edu; henryk@cs.columbia.edu

  • Venue:
  • SIAM Journal on Numerical Analysis
  • Year:
  • 2012


Abstract

This article studies the problem of approximating functions belonging to a Hilbert space $\mathcal{H}_d$ with an isotropic or anisotropic translation invariant (or stationary) reproducing kernel, with special attention given to the Gaussian kernel $K_d(\boldsymbol{x},\boldsymbol{t}) = \exp\big(-\sum_{\ell=1}^d\gamma_\ell^2(x_\ell-t_\ell)^2\big)$ for all $\boldsymbol{x},\boldsymbol{t}\in\mathbb{R}^d$. The isotropic (or radial) case corresponds to using the same shape parameter for all coordinates, i.e., $\gamma_\ell=\gamma>0$ for all $\ell$, whereas the anisotropic case corresponds to varying $\gamma_\ell$. The approximation error of the optimal approximation algorithm, known as a meshfree or kriging method, is known to decay faster than any polynomial in $n^{-1}$ for fixed $d$, where $n$ is the number of data points. We are especially interested in moderate to large $d$, which arises in particular in the construction of surrogates for computer experiments. This article presents dimension-independent error bounds, i.e., the error is bounded by $Cn^{-p}$, where $C$ and $p$ are independent of both $d$ and $n$. This is equivalent to strong polynomial tractability. The pertinent error criterion is the worst-case error of such an algorithm over the unit ball in $\mathcal{H}_d$, with the error for a single function measured in an $\mathcal{L}_2$ norm whose weight is also a Gaussian, which serves to “localize” $\mathbb{R}^d$. We consider two classes of algorithms: (i) those using data generated by finitely many arbitrary linear functionals, and (ii) those using only finitely many function values. Provided that arbitrary linear functional data are available, we show that $p=1/2$ is possible for any translation invariant positive definite kernel. We also consider the sequence of shape parameters $\gamma_d$ decaying to zero like $d^{-\omega}$ as $d$ tends to $\infty$; note that for large $\omega$ this means that the function to be approximated is “essentially low-dimensional.” Then the largest possible $p$ is roughly $\max(1/2,\omega)$. If only function values are available, the dimension-independent convergence rates are somewhat worse. If the goal is to make the error smaller than $Cn^{-p}$ times the initial ($n=0$) error, then the corresponding dimension-independent exponent $p$ is roughly $\omega$. In particular, for the isotropic case, where $\omega=0$, the error does not even decay polynomially in $n^{-1}$. In summary, excellent dimension-independent error decay rates are possible only when the sequence of shape parameters decays rapidly.
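
As a concrete illustration of the objects discussed in the abstract, the following Python sketch builds the anisotropic Gaussian kernel and the standard kriging/meshfree interpolant from function values. It is a minimal sketch under illustrative assumptions: the test function, the Gaussian design, the decaying shape parameters $\gamma_\ell = \ell^{-\omega}$, and the diagonal regularization are hypothetical choices for demonstration, not the paper's construction or analysis.

```python
import numpy as np

def gaussian_kernel(x, t, gamma):
    """Anisotropic Gaussian kernel K_d(x, t) = exp(-sum_l gamma_l^2 (x_l - t_l)^2).

    x: (n, d) array, t: (m, d) array, gamma: (d,) array of shape parameters
    (the isotropic case sets gamma_l = gamma > 0 for all l).
    """
    diff = x[:, None, :] - t[None, :, :]                   # (n, m, d)
    return np.exp(-np.sum((gamma * diff) ** 2, axis=-1))   # (n, m)

def kriging_approximation(x_data, y_data, gamma, reg=1e-10):
    """Return the kernel interpolant s(x) = sum_i c_i K(x, x_i) fitted to the data.

    This is the usual meshfree/kriging approximation from function values;
    `reg` is a small diagonal regularization added for numerical stability
    (an implementation convenience, not part of the paper's error analysis).
    """
    K = gaussian_kernel(x_data, x_data, gamma)
    c = np.linalg.solve(K + reg * np.eye(len(x_data)), y_data)
    return lambda x: gaussian_kernel(x, x_data, gamma) @ c

# Hypothetical example: d = 4 with shape parameters gamma_l = l^(-omega),
# mimicking the decaying sequence discussed in the abstract.
rng = np.random.default_rng(0)
d, n, omega = 4, 200, 1.0
gamma = np.arange(1, d + 1, dtype=float) ** (-omega)
f = lambda x: np.sin(x @ gamma)            # smooth test function (assumed)
x_data = rng.standard_normal((n, d))       # Gaussian design, echoing the Gaussian
                                           # L2 weight that "localizes" R^d
s = kriging_approximation(x_data, f(x_data), gamma)

# Monte Carlo proxy for the Gaussian-weighted L2 error of the interpolant.
x_test = rng.standard_normal((1000, d))
print("RMS error:", np.sqrt(np.mean((s(x_test) - f(x_test)) ** 2)))
```

Note that this sketch uses only function-value data, i.e., class (ii) above; the stronger rates for class (i) rely on arbitrary linear functional data, which has no such simple sampling-based implementation.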