Feature Selection Using f-Information Measures in Fuzzy Approximation Spaces

  • Authors: Pradipta Maji; Sankar K. Pal
  • Affiliations: Indian Statistical Institute, Kolkata (both authors)
  • Venue: IEEE Transactions on Knowledge and Data Engineering
  • Year: 2010

Abstract

The selection of nonredundant and relevant features of real-valued data sets is a highly challenging problem. A novel feature selection method based on fuzzy-rough sets is presented here; it maximizes the relevance and minimizes the redundancy of the selected features. By introducing the fuzzy equivalence partition matrix, a novel representation of Shannon's entropy for fuzzy approximation spaces is proposed to measure the relevance and redundancy of features in a way that suits real-valued data sets. The fuzzy equivalence partition matrix also offers an efficient way to calculate many more information measures, termed f-information measures. Several f-information measures are shown to be effective for selecting nonredundant and relevant features of real-valued data sets. This paper compares the performance of different f-information measures for feature selection in fuzzy approximation spaces. Some quantitative indexes based on fuzzy-rough sets are introduced for evaluating the performance of the proposed method. The effectiveness of the proposed method, along with a comparison with other methods, is demonstrated on a set of real-life data sets.
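
The abstract gives no formulas, so the following is only a minimal Python sketch of how entropy and one f-information measure might be computed from a fuzzy equivalence partition matrix. Everything in it is an assumption made for illustration: the matrix layout (one row per fuzzy equivalence class, one column per object), the use of `min` as the fuzzy intersection, the choice of V-information as the example f-information measure, the relevance-minus-mean-redundancy score, and all function names. It is not the authors' implementation.

```python
import math

def marginal(M):
    """Relative frequency of each fuzzy class: mean membership over all objects."""
    n = len(M[0])
    return [sum(row) / n for row in M]

def joint(MP, MQ):
    """Joint frequency of every class pair; min is the assumed fuzzy intersection."""
    n = len(MP[0])
    return [[sum(min(a, b) for a, b in zip(rp, rq)) / n for rq in MQ]
            for rp in MP]

def entropy(freqs):
    """Shannon entropy (natural log) of a list of frequencies, skipping zeros."""
    return -sum(p * math.log(p) for p in freqs if p > 0)

def mutual_information(MP, MQ):
    """I(P; Q) = H(P) + H(Q) - H(P, Q), all taken from the partition matrices."""
    flat = [p for row in joint(MP, MQ) for p in row]
    return entropy(marginal(MP)) + entropy(marginal(MQ)) - entropy(flat)

def v_information(MP, MQ):
    """V-information: total absolute gap between joint and product frequencies."""
    lp, lq, J = marginal(MP), marginal(MQ), joint(MP, MQ)
    return sum(abs(J[i][j] - lp[i] * lq[j])
               for i in range(len(lp)) for j in range(len(lq)))

def select(features, target, k, measure=mutual_information):
    """Greedy pick of k features: relevance to target minus mean redundancy."""
    chosen, remaining = [], list(features)
    while remaining and len(chosen) < k:
        def score(name):
            relevance = measure(features[name], target)
            if not chosen:
                return relevance
            redundancy = sum(measure(features[name], features[c])
                             for c in chosen) / len(chosen)
            return relevance - redundancy
        best = max(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)
    return chosen

# Toy data: crisp (0/1) memberships for 4 objects, used only to keep the
# arithmetic easy to check by hand. Real fuzzy memberships lie in [0, 1].
target = [[1, 1, 0, 0], [0, 0, 1, 1]]
features = {
    "a": [[1, 1, 0, 0], [0, 0, 1, 1]],  # same partition as the target
    "c": [[1, 0, 1, 0], [0, 1, 0, 1]],  # independent of the target
}
print(select(features, target, 1))
```

With crisp memberships the joint frequencies sum to one and the measures reduce to their classical forms; with genuinely fuzzy memberships and `min` as the intersection they need not, which is one reason this sketch should be read as an approximation of the idea rather than a reproduction of the paper's definitions.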