Benchmarking local classification methods

  • Authors:
  • Bernd Bischl; Julia Schiffner; Claus Weihs

  • Affiliations:
  • Department of Statistics, TU Dortmund University, Dortmund, Germany 44221 (all authors)

  • Venue:
  • Computational Statistics
  • Year:
  • 2013

Abstract

In recent years, an increasing number of so-called local classification methods have been developed in statistics and machine learning. Local approaches to classification are not new, but they have recently become popular. Well-known examples are the $k$-nearest-neighbors method and classification trees. However, most publications on this topic use the term "local" without explaining its particular meaning. Little is known about the properties of local methods and the types of classification problems for which they may be beneficial. We explain the basic principles and introduce the most important variants of local methods. To our knowledge, very few extensive studies in the literature compare several types of local and global methods across many data sets. To assess their performance, we conduct a benchmark study on real-world and synthetic tasks. We cluster the data sets and the learning algorithms considered with regard to the resulting performance structures and try to relate our theoretical considerations and intuitions to these results. We also address some general issues of benchmark studies and cover some pitfalls, extensions and improvements.