Boosting nearest neighbors for the efficient estimation of posteriors

  • Authors:
  • Roberto D'Ambrosio; Richard Nock; Wafa Bel Haj Ali; Frank Nielsen; Michel Barlaud

  • Affiliations:
  • University Campus Bio-Medico of Rome, Rome, Italy / CNRS - U. Nice, France; CEREGMIA - Université Antilles-Guyane, Martinique, France; CNRS - U. Nice, France; Sony Computer Science Laboratories, Inc., Tokyo, Japan; CNRS - U. Nice, France / Institut Universitaire de France, France

  • Venue:
  • ECML PKDD'12: Proceedings of the 2012 European Conference on Machine Learning and Knowledge Discovery in Databases - Volume Part I
  • Year:
  • 2012


Abstract

It is well known that mainstream boosting algorithms such as AdaBoost do not perform well at estimating class conditional probabilities. In this paper, we analyze, in the light of this problem, a recent algorithm, UNN, which leverages nearest neighbors while minimizing a convex loss. Our contribution is threefold. First, we show that there exists a subclass of surrogate losses, elsewhere called balanced, whose minimization yields simple and statistically efficient estimators of Bayes posteriors. Second, we establish explicit convergence rates towards these estimators for UNN, for any such surrogate loss, under a Weak Learning Assumption which parallels that of classical boosting results. Third and last, we provide experiments and comparisons on synthetic and real datasets, including the challenging SUN computer vision database. Results clearly show that boosting nearest neighbors can provide highly accurate posterior estimators, sometimes more than a hundred times more accurate than those of other contenders such as support vector machines.
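
To make the setting concrete, below is a minimal sketch (Python/NumPy) of the general idea described in the abstract: each training example acts as a weak hypothesis whose leveraging coefficient is fit by boosting a convex surrogate loss, and posteriors are then read off through the loss's calibrated link. This is not the authors' UNN algorithm; it is a simplified illustration assuming the exponential surrogate loss with its standard link p(y=+1|x) = 1/(1 + exp(-2H(x))), round-robin prototype selection, and hypothetical names (`fit_unn_like`, `predict_posterior`, `k`, `n_rounds`).

```python
import numpy as np

def neighborhoods(X, k):
    """For each training example i, indices of its k nearest training examples."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)          # exclude each point from its own neighborhood
    return np.argsort(d, axis=1)[:, :k]

def fit_unn_like(X, y, k=5, n_rounds=200, eps=1e-12):
    """Labels y in {-1, +1}. Returns one leveraging coefficient per training example."""
    n = len(y)
    nbrs = neighborhoods(X, k)
    # reverse index: rev[j] = examples whose k-neighborhood contains prototype j
    rev = [np.where((nbrs == j).any(axis=1))[0] for j in range(n)]
    alpha = np.zeros(n)
    H = np.zeros(n)                       # current ensemble output on the training set
    for t in range(n_rounds):
        j = t % n                         # round-robin over prototypes (a simplification)
        idx = rev[j]
        if len(idx) == 0:
            continue
        w = np.exp(-y[idx] * H[idx])      # exponential-loss example weights
        agree = y[idx] == y[j]
        w_pos, w_neg = w[agree].sum(), w[~agree].sum()
        delta = 0.5 * np.log((w_pos + eps) / (w_neg + eps))
        alpha[j] += delta
        H[idx] += delta * y[j]            # prototype j only affects its reverse neighbors
    return alpha

def predict_posterior(X_train, y, alpha, x, k=5):
    """Estimate P(y=+1 | x) via the calibrated link of the exponential loss."""
    d = np.linalg.norm(X_train - x, axis=1)
    nn = np.argsort(d)[:k]
    Hx = np.sum(alpha[nn] * y[nn])
    return 1.0 / (1.0 + np.exp(-2.0 * Hx))

# Toy usage on a 1-D synthetic problem
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1)) + np.where(rng.random(200) < 0.5, -1.0, 1.0)[:, None]
y = np.where(X[:, 0] > 0, 1, -1)
alpha = fit_unn_like(X, y, k=5)
print(predict_posterior(X, y, alpha, np.array([1.5]), k=5))  # should be close to 1
```

A balanced surrogate other than the exponential loss (e.g., the logistic loss) would change both the example weights and the link used in `predict_posterior`; the paper's point is precisely that any loss in the balanced subclass yields a consistent posterior estimator of this form.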