Feature Selection for Medical Data Mining: Comparisons of Expert Judgment and Automatic Approaches

  • Authors:
  • Tsang-Hsiang Cheng;Chih-Ping Wei;Vincent S. Tseng

  • Affiliations:
  • Southern Taiwan University of Technology, Taiwan, R.O.C.;National Tsing Hua University, Taiwan, R.O.C.;National Cheng Kung University, Taiwan, R.O.C.

  • Venue:
  • CBMS '06 Proceedings of the 19th IEEE Symposium on Computer-Based Medical Systems
  • Year:
  • 2006

Abstract

Data mining refers to the process of automatically extracting previously unknown, valid, and actionable patterns or knowledge from large databases for crucial decision support. Among different data mining techniques, classification analysis is widely adopted in healthcare applications to support medical diagnostic decisions, improve the quality of patient care, etc. If a training dataset contains irrelevant features (i.e., attributes), classification analysis may produce less accurate and less understandable results. Two commonly employed feature selection approaches are automatic feature selection mechanisms (i.e., data-driven) and expert judgment (i.e., knowledge-driven). Due to differences in their underlying processes, the two prevailing feature selection approaches may each have unique biases that possibly lead to dissimilar classification effectiveness. In this study, we empirically evaluate the classification effectiveness resulting from the two feature selection approaches on a cardiovascular disease risk prediction dataset. Our evaluation results suggest that the feature subsets selected by domain experts improve the sensitivity of a classifier, while the feature subsets selected by an automatic feature selection mechanism improve the predictive power of a classifier on the majority class (i.e., the specificity in this study).
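
The two measures contrasted in the abstract can be made concrete with a minimal sketch. It assumes the standard definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP)) and that the positive class is the minority (disease) class, so specificity measures performance on the majority class. The function name and the toy labels are hypothetical illustrations, not data or code from the study.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # positive case correctly flagged
            else:
                fn += 1  # positive case missed
        else:
            if pred == positive:
                fp += 1  # negative case wrongly flagged
            else:
                tn += 1  # negative case correctly cleared
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Made-up labels: 1 = at risk (minority class), 0 = not at risk (majority class).
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 1, 0]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Computing both measures for classifiers trained on each feature subset is one way to expose the kind of trade-off the abstract reports, since overall accuracy alone can hide poor performance on the minority class.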