Higher order feature selection for text classification

  • Authors:
  • Jan Bakus;Mohamed S. Kamel

  • Affiliations:
  • Department of Systems Design Engineering, University of Waterloo, Waterloo, Ontario, Canada;Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, Ontario, Canada

  • Venue:
  • Knowledge and Information Systems
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper, we present the MIFS-C variant of the mutual information feature-selection algorithms. We present an algorithm to find the optimal value of the redundancy parameter, which is a key parameter in the MIFS-type algorithms. Furthermore, we present an algorithm that speeds up the execution time of all the MIFS variants. Overall, the presented MIFS-C has comparable classification accuracy (in some cases even better) compared with other MIFS algorithms, while its running time is faster. We compared this feature selector with other feature selectors, and found that it performs better in most cases. The MIFS-C performed especially well for the breakeven and F-measure because the algorithm can be tuned to optimise these evaluation measures.