A Feature Selection Framework for Text Filtering

  • Authors:
  • Zhaohui Zheng;Rohini Srihari;Sargur Srihari

  • Venue:
  • ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
  • Year:
  • 2003

Abstract

This paper presents a new framework for local feature selection in text filtering. In this framework, a feature set is constructed per category by first selecting a set of terms highly indicative of membership (positive set) and another set of terms highly indicative of non-membership (negative set), and then combining these two sets. This feature selection framework not only unifies several standard feature selection methods, but also facilitates the proposal of a new method that optimally combines the positive and negative sets. The experimental comparison between the proposed method and standard methods was conducted on six feature selection metrics: chi-square, correlation coefficient, odds ratio, GSS coefficient, and two proposed variants of odds ratio and GSS coefficient: OR-square and GSS-square, respectively. The results show that the proposed feature selection method improves text filtering performance.
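
The per-category construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `pos_ratio` parameter (the share of the feature set given to positive terms), and the choice of the correlation coefficient as the signed metric are assumptions made for illustration. The abstract does not say how the optimal combination is found, so here the split is simply passed in.

```python
import math


def correlation_coefficient(a, b, c, d):
    """Signed term/category association from a 2x2 contingency table.

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that lack the term
    d: documents outside the category that lack the term
    Positive values suggest membership, negative values non-membership.
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0
    return math.sqrt(n) * (a * d - b * c) / math.sqrt(denom)


def select_features(scores, size, pos_ratio):
    """Split a feature budget between positive and negative terms.

    scores: dict mapping term -> signed score for one category
    size: total number of features to keep (hypothetical parameter)
    pos_ratio: fraction of `size` reserved for positive terms (hypothetical)
    Returns (positive_terms, negative_terms), each ordered by strength.
    """
    n_pos = int(round(size * pos_ratio))
    n_neg = size - n_pos
    ranked = sorted(scores, key=scores.get, reverse=True)
    positive = [t for t in ranked if scores[t] > 0][:n_pos]
    negative = [t for t in reversed(ranked) if scores[t] < 0][:n_neg]
    return positive, negative


# Toy scores for a hypothetical "sports" category.
toy_scores = {
    "goal": 3.0, "match": 2.0, "team": 1.0,
    "stock": -2.5, "market": -1.5, "the": 0.0,
}
positive_set, negative_set = select_features(toy_scores, size=4, pos_ratio=0.5)
feature_set = positive_set + negative_set
```

Setting `pos_ratio=1.0` keeps only positive terms, while intermediate values mix in negative terms; in this sketch, choosing the value per category (for example on held-out data) is left to the caller.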