Feature selection with dynamic mutual information

  • Authors:
  • Huawen Liu; Jigui Sun; Lei Liu; Huijie Zhang

  • Affiliations:
  • College of Computer Science, Jilin University, Changchun 130012, China; College of Computer Science, Jilin University, Changchun 130012, China and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Changchun 130012, China; College of Computer Science, Jilin University, Changchun 130012, China and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Changchun 130012, China; College of Computer, Northeast Normal University, Changchun 130021, China

  • Venue:
  • Pattern Recognition
  • Year:
  • 2009

Abstract

Feature selection plays an important role in data mining and pattern recognition, especially for large-scale data. In recent years, various metrics have been proposed to measure the relevance between features. Because mutual information is nonlinear and can effectively represent the dependencies among features, it is one of the most widely used measures in feature selection. As a result, many promising feature selection algorithms based on mutual information, with different parameters, have been developed. In this paper, we first introduce a general criterion function for mutual information in feature selectors, which brings together most of the information measures used in previous algorithms. In traditional selectors, mutual information is estimated over the whole sample space. This, however, cannot exactly represent the relevance among features. To cope with this problem, the second purpose of this paper is to propose a new feature selection algorithm based on dynamic mutual information, which is estimated only on unlabeled instances. To verify the effectiveness of our method, experiments are carried out on sixteen UCI datasets using four typical classifiers. The experimental results indicate that our algorithm achieves better results than other methods in most cases.
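
The abstract does not spell out the procedure, so the following Python sketch is only one plausible reading of "dynamic mutual information", not the authors' algorithm. It assumes discrete features, a plug-in (frequency-count) estimate of mutual information, and a hypothetical coverage rule: an instance stops counting as "unlabeled" once the value of a selected feature maps to a single class among the remaining instances. The names `mutual_information` and `dynamic_mi_selection` and the toy data are illustrative assumptions.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits for two discrete sequences."""
    n = len(xs)
    if n == 0:
        return 0.0
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum p(x,y) * log2( p(x,y) / (p(x) p(y)) ) over observed pairs.
    return sum((c / n) * log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def dynamic_mi_selection(rows, labels, k):
    """Greedily pick up to k feature indices.

    ASSUMPTION (not from the source): after a feature is chosen, every
    instance whose value for that feature occurs with only one class
    among the remaining instances is treated as covered and dropped, so
    later scores are estimated on the still-uncovered instances only.
    """
    remaining = list(range(len(rows)))
    candidates = set(range(len(rows[0])))
    selected = []
    while candidates and remaining and len(selected) < k:
        ys = [labels[i] for i in remaining]
        scores = {f: mutual_information([rows[i][f] for i in remaining], ys)
                  for f in candidates}
        # Ties are broken in favour of the lowest feature index.
        best = max(sorted(candidates), key=lambda f: scores[f])
        if scores[best] <= 0:
            break  # no candidate is informative on what is left
        selected.append(best)
        candidates.remove(best)
        classes_by_value = {}
        for i in remaining:
            classes_by_value.setdefault(rows[i][best], set()).add(labels[i])
        remaining = [i for i in remaining
                     if len(classes_by_value[rows[i][best]]) > 1]
    return selected


# Toy data: feature 2 duplicates feature 0; feature 1 separates the
# two instances that feature 0 leaves ambiguous.
rows = [(0, 0, 0), (0, 1, 0), (1, 0, 1), (1, 1, 1), (2, 0, 2), (2, 1, 2)]
labels = [0, 0, 1, 1, 0, 1]
print(dynamic_mi_selection(rows, labels, 3))
```

On this toy data the sketch picks feature 0 first, then feature 1, and never selects the duplicate feature 2: once the instances explained by feature 0 are set aside, feature 2 carries no information about the remaining ones. A static estimate over the whole sample would instead score the redundant feature 2 as highly as feature 0.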