A tree-based regressor that adapts to intrinsic dimension

  • Authors:
  • Samory Kpotufe; Sanjoy Dasgupta

  • Affiliations:
  • Max Planck Institute for Intelligent Systems, Germany; UCSD Computer Science and Engineering, United States

  • Venue:
  • Journal of Computer and System Sciences
  • Year:
  • 2012

Abstract

We consider the problem of nonparametric regression, consisting of learning an arbitrary mapping $f: X \to Y$ from a data set of $(x, y)$ pairs in which the $y$ values are corrupted by noise of mean zero. This statistical task is known to be subject to a severe curse of dimensionality: if $X \subset \mathbb{R}^D$, and if the only smoothness assumption on $f$ is that it satisfies a Lipschitz condition, it is known that any estimator based on $n$ data points will have an error rate (risk) of $\Omega(n^{-2/(2+D)})$. Here we present a tree-based regressor whose risk depends only on the doubling dimension of $X$, not on $D$. This notion of dimension generalizes two cases of contemporary interest: when $X$ is a low-dimensional manifold, and when $X$ is sparse. The tree is built using random hyperplanes as splitting criteria, building upon recent work of Dasgupta and Freund (2008) [5]; and we show that axis-parallel splits cannot achieve the same finite-sample rate of convergence.
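
To make the idea of splitting on random hyperplanes concrete, the following is a minimal Python sketch of a regression tree that partitions the data along random directions and predicts with the mean response in each leaf. It is an illustration only, not the authors' algorithm: the function names `build_tree` and `predict_one`, the exact-median split point, and the `min_leaf` stopping rule are all assumptions made for this example, and the abstract does not specify these details.

```python
import numpy as np

def build_tree(X, y, min_leaf=5, rng=None):
    """Recursively partition (X, y) with random hyperplanes.

    Illustrative assumptions: split at the median projection and
    stop once a cell holds at most 2 * min_leaf points.
    """
    rng = np.random.default_rng() if rng is None else rng
    n, D = X.shape
    if n <= 2 * min_leaf:
        return {"leaf": True, "value": float(y.mean())}

    # Draw a direction uniformly at random from the unit sphere in R^D.
    direction = rng.standard_normal(D)
    direction /= np.linalg.norm(direction)

    # Project the points onto the direction and split at the median.
    proj = X @ direction
    threshold = float(np.median(proj))
    left = proj <= threshold

    # Degenerate split (e.g. all projections equal): stop here.
    if left.all() or not left.any():
        return {"leaf": True, "value": float(y.mean())}

    return {
        "leaf": False,
        "direction": direction,
        "threshold": threshold,
        "left": build_tree(X[left], y[left], min_leaf, rng),
        "right": build_tree(X[~left], y[~left], min_leaf, rng),
    }

def predict_one(tree, x):
    """Route x down to its leaf and return the leaf's mean response."""
    while not tree["leaf"]:
        go_left = x @ tree["direction"] <= tree["threshold"]
        tree = tree["left"] if go_left else tree["right"]
    return tree["value"]
```

An axis-parallel variant would replace the random `direction` with a coordinate vector; the abstract's claim is that such splits cannot match the finite-sample rate obtained with random hyperplanes.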