An Aggressive Feature Selection Method based on Rough Set Theory

Authors:
Fangtao Li;Tao Guan;Xian Zhang;Xiaoyan Zhu
Venue:
ICICIC '07 Proceedings of the Second International Conference on Innovative Computing, Information and Control
Year:
2007

Citing 0
Cited 2

A General Framework of Feature Selection for Text Categorization
MLDM '09 Proceedings of the 6th International Conference on Machine Learning and Data Mining in Pattern Recognition

The study on feature selection in customer churn prediction modeling
SMC'09 Proceedings of the 2009 IEEE international conference on Systems, Man and Cybernetics

Quantified Score

Hi-index	0.00

Abstract

Feature selection is an important component of text classification because it reduces data dimensionality. In this paper, we study a heuristic algorithm for rough set reduction and then propose an aggressive feature selection method for text categorization. The method combines the advantages of knowledge reduction in rough set (RS) theory with those of two conventional feature selection methods, information gain (IG) and document frequency (DF). This is the first time a rough set based feature selection method has been evaluated on the large-scale Reuters data set. The results show that the proposed method achieves higher categorization accuracy than IG and DF with far fewer features. In addition, compared with the original rough set reduction, the proposed method significantly reduces computation time. For the Reuters data set, several discretization widths are adopted; with our method, the number of features is reduced by 93.5% and 88.4%, with decreases in F1 measure of only 0.61% and 0.13%, respectively.
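
The abstract does not spell out the algorithm, so the following Python sketch is only an illustration of how the three ingredients it names could fit together, not the authors' method. It assumes discretized term weights stored as integer rows, a DF threshold and IG ranking used as a cheap pre-filter, and a generic greedy (QuickReduct-style) search over the rough set dependency degree. The function names, the `min_df` and `top_k` parameters, and the toy data are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def document_frequency(rows, j):
    """Number of documents in which feature j has a non-zero value."""
    return sum(1 for row in rows if row[j] > 0)


def information_gain(rows, labels, j):
    """Reduction in label entropy after splitting on the values of feature j."""
    groups = defaultdict(list)
    for row, y in zip(rows, labels):
        groups[row[j]].append(y)
    conditional = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def dependency(rows, labels, subset):
    """Rough set dependency degree: share of rows whose equivalence class
    (rows with identical values on `subset`) contains a single label."""
    classes = defaultdict(list)
    for row, y in zip(rows, labels):
        classes[tuple(row[j] for j in subset)].append(y)
    positive = sum(len(g) for g in classes.values() if len(set(g)) == 1)
    return positive / len(rows)


def quick_reduct(rows, labels, candidates):
    """Greedily add the feature that raises the dependency degree most,
    stopping once the subset matches the dependency of all candidates."""
    target = dependency(rows, labels, candidates)
    reduct, remaining = [], list(candidates)
    while dependency(rows, labels, reduct) < target:
        best = max(remaining, key=lambda j: dependency(rows, labels, reduct + [j]))
        reduct.append(best)
        remaining.remove(best)
    return reduct


def select_features(rows, labels, min_df, top_k):
    """Assumed pipeline: DF filter, then IG ranking, then rough set reduction."""
    n_features = len(rows[0])
    frequent = [j for j in range(n_features) if document_frequency(rows, j) >= min_df]
    ranked = sorted(frequent, key=lambda j: information_gain(rows, labels, j), reverse=True)
    return quick_reduct(rows, labels, ranked[:top_k])


# Toy data (hypothetical): 6 documents, 4 discretized term features.
rows = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 1],
]
labels = ["a", "a", "b", "b", "a", "b"]

selected = select_features(rows, labels, min_df=2, top_k=3)
print(selected)  # [0]
```

In this sketch the pre-filter shrinks the candidate set before the reduction step, which is where the repeated dependency computations become expensive; whether the paper combines the measures in this order is an assumption.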