Data Mining for Case-Based Reasoning in High-Dimensional Biological Domains

  • Authors:
  • Niloofar Arshadi;Igor Jurisica

  • Affiliations:
  • -;-

  • Venue:
  • IEEE Transactions on Knowledge and Data Engineering
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

Case-based reasoning (CBR) is a suitable paradigm for class discovery in molecular biology, where the rules that define the domain knowledge are difficult to obtain and the number and the complexity of the rules affecting the problem are too large for formal knowledge representation. To extend the capabilities of CBR, we propose the mixture of experts for case-based reasoning (MOE4CBR), a method that combines an ensemble of CBR classifiers with spectral clustering and logistic regression. Our approach not only achieves higher prediction accuracy, but also leads to the selection of a subset of features that have meaningful relationships with their class labels. We evaluate MOE4CBR by applying the method to a CBR system called {TA3}驴a computational framework for CBR systems. For two ovarian mass spectrometry data sets, the prediction accuracy improves from 80 percent to 93 percent and from 90 percent to 98.4 percent, respectively. We also apply the method to leukemia and lung microarray data sets with prediction accuracy improving from 65 percent to 74 percent and from 60 percent to 70 percent, respectively. Finally, we compare our list of discovered biomarkers with the lists of selected biomarkers from other studies for the mass spectrometry data sets.