Comparison of reuse strategies for case-based classification in bioinformatics

  • Authors:
  • Isabelle Bichindaritz

  • Affiliations:
  • Institute of Technology, University of Washington Tacoma, Tacoma, Washington

  • Venue:
  • ICCBR'11 Proceedings of the 19th international conference on Case-Based Reasoning Research and Development
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

Bioinformatics offers an interesting challenge for data mining algorithms given the high dimensionality of its data and the comparatively small set of samples. Case-based classification algorithms have been successfully applied to classify bioinformatics data and often serve as a reference for other algorithms. Therefore this paper proposes to study, on some of the most benchmarked datasets in bioinformatics, the performance of different reuse strategies in case-based classification in order to make methodological recommendations for applying these algorithms to this domain. In conclusion, k-nearest-neighbor (kNN) classifiers coupled with between-group to within-group sum of squares (BSS/WSS) feature selection can perform as well and even better than the best benchmarked algorithms to date. However the reuse strategy chosen played a major role to optimize the algorithms. In particular, the optimization of both the number k of neighbors and the number of features accounted was key to improving classification accuracy.