Representative prototype sets for data characterization and classification

  • Authors:
  • Ludwig Lausser; Christoph Müssel; Hans A. Kestler

  • Affiliations:
  • Research Group Bioinformatics and Systems Biology, Institute of Neural Information Processing, Ulm University, Ulm, Germany (all three authors)

  • Venue:
  • ANNPR'12: Proceedings of the 5th INNS IAPR TC 3 GIRPR Conference on Artificial Neural Networks in Pattern Recognition
  • Year:
  • 2012

Abstract

Common classifier models are designed to achieve high accuracy but often neglect interpretability. In particular, most classifiers do not allow conclusions to be drawn about the structure and quality of the underlying training data. Keeping the classifier model simple makes an intuitive interpretation of both the model and its training data possible. The lower accuracy of such simple models can be compensated for by combining the decisions of several classifiers. We propose an approach that is particularly suitable for high-dimensional data sets of low cardinality, such as data obtained from high-throughput biomolecular experiments. Here, simple base classifiers are obtained by choosing one data point from each class as a prototype for nearest-neighbour classification. Enumerating all such classifiers for a given data set yields a systematic description of the data structure in terms of class coherence. We also investigate the performance of these classifiers in cross-validation experiments, applying both stand-alone prototype classifiers and ensembles of selected prototype classifiers.
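
The enumeration described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy data, the use of Euclidean distance, and the scoring by accuracy on the full data set are all assumptions made here for illustration. Each base classifier consists of one sample index per class, and a query point receives the class of its nearest prototype.

```python
from itertools import product
from math import dist  # Euclidean distance (an assumption; the abstract names no metric)


def enumerate_prototype_sets(y):
    """Yield every prototype set as a dict {class label: sample index},
    picking exactly one sample from each class."""
    classes = sorted(set(y))
    per_class = [[i for i, label in enumerate(y) if label == c] for c in classes]
    for combo in product(*per_class):
        yield dict(zip(classes, combo))


def predict(X, prototypes, x):
    """Return the class whose prototype lies closest to x."""
    return min(prototypes, key=lambda c: dist(X[prototypes[c]], x))


def accuracy(X, y, prototypes):
    """Fraction of samples in (X, y) assigned to their own class."""
    hits = sum(predict(X, prototypes, x) == label for x, label in zip(X, y))
    return hits / len(y)


# Hypothetical toy data: two classes, with the last "b" sample lying near class "a".
X = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0), (1.0, 0.5)]
y = ["a", "a", "b", "b", "b"]

# One (prototype set, accuracy) pair per enumerated base classifier.
results = [(p, accuracy(X, y, p)) for p in enumerate_prototype_sets(y)]
for prototypes, acc in results:
    print(prototypes, acc)
```

The number of base classifiers equals the product of the class sizes (here 2 × 3 = 6), so exhaustive enumeration is feasible only when each class has few samples, which matches the low-cardinality setting the abstract targets. Comparing the scores across prototype sets gives a rough picture of how coherent each class is.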