Text Feature Extraction Based on Rough Set

  • Authors:
  • Yiyuan Cheng; Ruiling Zhang; Xiufeng Wang; Qiushuang Chen

  • Venue:
  • FSKD '08 Proceedings of the 2008 Fifth International Conference on Fuzzy Systems and Knowledge Discovery - Volume 02
  • Year:
  • 2008

Abstract

In this paper, a method for Text Feature Extraction based on Rough Set (TFERS) is proposed. First, a new formulation of attribute significance is presented, based on the classification capability of condition attributes; it avoids the recalculation of attribute significance at each iteration of the reduction procedure that conventional rough-set-based methods require. Second, attribute correlation analysis is incorporated, which helps to achieve a satisfactory reduction of text features. In the text preprocessing phase, the typical vector space representation is extended from the term level to the concept (‘synset’) level based on WordNet. In this way, the synonym problem is resolved and the dimension of the feature vector is markedly reduced. Simulation experiments and applications in text classification show that TFERS can significantly improve classification performance.
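
For orientation, the conventional rough-set notion of attribute significance that the abstract contrasts against can be sketched as follows. This is not the TFERS formulation, which the abstract does not spell out; it is the classical definition, in which the significance of a condition attribute a is the drop in the dependency degree gamma_C(D) = |POS_C(D)| / |U| when a is removed from the condition set C. The function names, the toy decision table, and its term columns are assumptions made purely for illustration.

```python
from collections import defaultdict

def dependency(rows, cond, dec):
    """Classical dependency degree gamma_C(D) = |POS_C(D)| / |U|."""
    if not rows:
        return 0.0
    # Group objects into equivalence classes by their values on `cond`.
    classes = defaultdict(list)
    for row in rows:
        classes[tuple(row[a] for a in cond)].append(row[dec])
    # A class belongs to the positive region if all its decisions agree.
    pos = sum(len(d) for d in classes.values() if len(set(d)) == 1)
    return pos / len(rows)

def significance(rows, cond, dec, attr):
    """Drop in dependency when `attr` is removed from `cond`."""
    rest = [a for a in cond if a != attr]
    return dependency(rows, cond, dec) - dependency(rows, rest, dec)

# Hypothetical decision table: binary term presence -> document class.
docs = [
    {"ball": 1, "vote": 0, "the": 1, "label": "sport"},
    {"ball": 1, "vote": 0, "the": 1, "label": "sport"},
    {"ball": 0, "vote": 1, "the": 1, "label": "politics"},
    {"ball": 0, "vote": 0, "the": 1, "label": "politics"},
]
terms = ["ball", "vote", "the"]

for t in terms:
    print(t, significance(docs, terms, "label", t))
```

In this toy table only "ball" has nonzero significance, so the other two terms are candidates for removal. Note that in an iterative reduction this quantity must be recomputed whenever the condition set changes, which is the cost the abstract says TFERS avoids.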