Automatic feature selection for classification of health data

  • Authors:
  • Hongxing He;Huidong Jin;Jie Chen

  • Affiliations:
  • CSIRO Mathematical and Information Sciences, Canberra, ACT, Australia;CSIRO Mathematical and Information Sciences, Canberra, ACT, Australia;CSIRO Mathematical and Information Sciences, Canberra, ACT, Australia

  • Venue:
  • AI'05 Proceedings of the 18th Australian Joint Conference on Advances in Artificial Intelligence
  • Year:
  • 2005

Abstract

We propose FIEBIT (Feature Inclusion and Exclusion Based on Information Theory), a fast and accurate feature selection method for the classification of health data. FIEBIT includes the most relevant, non-redundant features using Conditional Mutual Information (CMI) and excludes irrelevant and redundant features according to comparisons between Individual Symmetrical Uncertainty (ISU) and Combined Symmetrical Uncertainty (CSU). Feature selection runs before classification and yields small feature subsets without compromising classification accuracy; the size of the subset is determined automatically. Our preliminary empirical results on health data with hundreds of features suggest that FIEBIT is efficient and effective in comparison with representative feature selection methods.
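
The abstract names its information-theoretic building blocks without defining them. As a rough illustration only, the sketch below computes the standard textbook forms of symmetrical uncertainty, SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)), and conditional mutual information, I(X; Y | Z) = H(X, Z) + H(Y, Z) - H(X, Y, Z) - H(Z), for discrete columns. It is not the FIEBIT algorithm: the abstract does not say how ISU and CSU are computed or compared, and the function names and toy data here are assumptions made for this example.

```python
from collections import Counter
from math import log2


def entropy(*columns):
    """Joint Shannon entropy, in bits, of one or more equal-length discrete columns."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum((count / n) * log2(count / n) for count in Counter(rows).values())


def mutual_information(x, y):
    """I(X; Y) = H(X) + H(Y) - H(X, Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both columns are constant, so there is nothing to share
    return 2.0 * mutual_information(x, y) / (hx + hy)


def conditional_mutual_information(x, y, z):
    """I(X; Y | Z) = H(X, Z) + H(Y, Z) - H(X, Y, Z) - H(Z)."""
    return entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)


# Hypothetical toy data: a class label and three candidate features.
label = [0, 0, 0, 0, 1, 1, 1, 1]
relevant = [0, 0, 0, 0, 1, 1, 1, 1]   # copies the label
duplicate = list(relevant)            # redundant copy of `relevant`
noise = [0, 1, 0, 1, 0, 1, 0, 1]      # independent of the label

print(symmetrical_uncertainty(relevant, label))
print(symmetrical_uncertainty(noise, label))
print(conditional_mutual_information(duplicate, label, relevant))
```

On this toy data the relevant feature scores an SU of 1 against the label, the noise feature scores 0, and the duplicate contributes zero conditional mutual information once the relevant feature is already selected. That is the intuition behind including relevant features while excluding irrelevant and redundant ones.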