Feature Extraction for Dynamic Integration of Classifiers

  • Authors: Mykola Pechenizkiy; Alexey Tsymbal; Seppo Puuronen; David Patterson

  • Affiliations:
      Department of Computer Science, Eindhoven University of Technology, Eindhoven, The Netherlands. E-mail: m.pechenizkiy@tue.nl
      Corporate Technology Division, Siemens AG, Erlangen, Germany. E-mail: alexey.tsymbal@siemens.com
      Dept. of Comp. Sc. and Information Systems, University of Jyväskylä, Jyväskylä, Finland. E-mail: sepi@cs.jyu.fi
      Northern Ireland Knowledge Engineering Laboratory, University of Ulster, Belfast, UK. E-mail: wd.patterson@ulster.ac.uk

  • Venue: Fundamenta Informaticae
  • Year: 2007

Abstract

Recent research has shown the integration of multiple classifiers to be one of the most important directions in machine learning and data mining. In this paper, we present an algorithm for the dynamic integration of classifiers in the space of extracted features (FEDIC). It is based on the technique of dynamic integration, in which local accuracy estimates are calculated for each base classifier of an ensemble in the neighborhood of a new instance to be processed. Generally, the whole space of original features is used to find this neighborhood. However, when dynamic integration takes place in high dimensions, the search for the neighborhood of a new instance is problematic, since most of the space is empty and neighbors can in fact be located far from each other. Furthermore, when noisy or irrelevant features are present, irrelevant neighbors are likely to be associated with a test instance. We propose to use feature extraction to cope with the curse of dimensionality in the dynamic integration of classifiers. We consider classical principal component analysis and two eigenvector-based class-conditional feature extraction methods that take class information into account. Experimental results show that, on some data sets, the use of FEDIC leads to significantly higher ensemble accuracies than the use of plain dynamic integration in the space of original features.
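To make the idea in the abstract concrete, the following is a minimal illustrative sketch, not the authors' FEDIC implementation. It assumes classical PCA as the feature extraction step and one simple form of dynamic integration: selecting the base classifier with the lowest local error among the k nearest training neighbors of a new instance, with neighbors found in the extracted space. The function names (`pca_fit`, `pca_transform`, `dynamic_selection`), the 0/1 `errors` matrix (assumed to come from something like cross-validation of the base classifiers), and the toy data are all hypothetical.

```python
import numpy as np


def pca_fit(X, n_components):
    """Classical PCA: return the feature means and the top principal axes."""
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, eigvecs[:, order]


def pca_transform(X, mean, components):
    """Project instances onto the principal axes found by pca_fit."""
    return (X - mean) @ components


def dynamic_selection(Z_train, errors, z_new, k=3):
    """Pick the base classifier with the lowest local error around z_new.

    Z_train : (n_train, d) training instances in the extracted space
    errors  : (n_classifiers, n_train) array, 1 where a classifier
              misclassified a training instance, 0 otherwise (assumed given)
    z_new   : (d,) new instance in the extracted space
    """
    dists = np.linalg.norm(Z_train - z_new, axis=1)
    neighbors = np.argsort(dists)[:k]
    local_error = errors[:, neighbors].mean(axis=1)
    return int(np.argmin(local_error)), local_error


# Toy data (hypothetical): feature 0 is informative, feature 1 is small noise.
X_train = np.array([[0.0, 0.1], [1.0, -0.1], [2.0, 0.1],
                    [10.0, -0.1], [11.0, 0.1], [12.0, -0.1]])
# Classifier 0 fails on the left cluster, classifier 1 on the right cluster.
errors = np.array([[1, 1, 1, 0, 0, 0],
                   [0, 0, 0, 1, 1, 1]])

mean, components = pca_fit(X_train, n_components=1)
Z_train = pca_transform(X_train, mean, components)
z_new = pca_transform(np.array([[1.0, 0.0]]), mean, components)[0]
best, local_error = dynamic_selection(Z_train, errors, z_new, k=3)
print(best, local_error)
```

In this toy setup the new instance falls in the left cluster, so the sketch selects the classifier that was locally accurate there. The class-conditional extraction methods mentioned in the abstract would replace `pca_fit` with a projection that uses class labels; they are not shown here.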