Feature ranking fusion for text classifier

  • Authors:
  • Masoud Makrehchi;Mohamed S. Kamel

  • Affiliations:
  • Department of Electrical, Computer, and Software Engineering, University of Ontario Institute of Technology (UOIT), Oshawa, ON, Canada; Pattern Analysis and Machine Intelligence Lab, Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, ON, Canada

  • Venue:
  • Intelligent Data Analysis
  • Year:
  • 2012

Abstract

Feature ranking is widely used in text classification. One problem with feature ranking methods is that they are not robust across data sets: a method that performs well on one data set may perform poorly on another. The problem is compounded by the fact that the performance of a feature ranking method also depends heavily on the type of text classifier. In this paper, a new method based on combining feature rankings is proposed to find the best features among a set of feature rankings. The proposed method is applied to the text classification problem and evaluated on three well-known data sets using Support Vector Machine and Rocchio classifiers. Several combining methods are employed to aggregate the ranked lists of features. We show that combining methods can offer reliable results very close to the best solution without the need to use a classifier.
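
To make the idea of aggregating ranked feature lists concrete, here is a minimal sketch in Python. The abstract does not name the combining methods the authors used, so the Borda count below is an assumed stand-in, one common rank aggregation rule, and not necessarily the paper's method. The function name `borda_fuse`, the ranking names, and the toy feature lists are all hypothetical.

```python
from collections import defaultdict


def borda_fuse(rankings):
    """Fuse ranked feature lists (best feature first) with a Borda count.

    Assumption: Borda count is used here only as an illustrative
    combining rule; the source abstract does not specify one.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, feature in enumerate(ranking):
            # The top feature of a list of length n earns n points,
            # the next n - 1, and so on down to 1.
            scores[feature] += n - position
    # Highest total score first; ties are broken alphabetically.
    return sorted(scores, key=lambda f: (-scores[f], f))


# Hypothetical rankings from three different feature ranking measures.
ranking_a = ["ball", "goal", "team", "vote"]
ranking_b = ["goal", "ball", "vote", "team"]
ranking_c = ["goal", "team", "ball", "vote"]

fused = borda_fuse([ranking_a, ranking_b, ranking_c])
print(fused)  # ['goal', 'ball', 'team', 'vote']
```

Note that the fusion step uses only the ranked lists themselves, with no classifier involved, which matches the abstract's claim that combining can be done without training a classifier to pick the best ranking.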