Discovering business intelligence from online product reviews: A rule-induction framework
Expert Systems with Applications: An International Journal
This paper proposes a method for Text Feature Extraction based on Rough Sets (TFERS). First, a new formulation of attribute significance is presented, based on the classification capability of the condition attributes. It avoids recalculating attribute significance at each iteration of the reduction procedure, as conventional rough-set-based methods require. Second, attribute correlation analysis is incorporated, which helps to achieve a satisfactory reduction of text features. In the text preprocessing phase, the typical vector space representation is extended from the term level to the concept ('synset') level using WordNet. In this way, synonymous terms are merged into a single feature and the dimension of the feature vector is substantially reduced. Simulation experiments and applications in text classification show that TFERS can significantly improve classification performance.
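To make the idea of scoring a condition attribute by its classification capability concrete, the following is a minimal sketch that uses the classical rough-set dependency degree (the fraction of documents lying in the positive region). It is not the TFERS formulation, which the abstract does not spell out. The function names, the toy documents, and the class labels are all assumptions made for illustration.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def dependency(rows, labels, attrs):
    """Classical dependency degree: share of rows in the positive region.

    A row is in the positive region when every row that is indiscernible
    from it on attrs carries the same decision label.
    """
    if not rows:
        return 0.0
    positive = 0
    for block in partition(rows, attrs):
        if len({labels[i] for i in block}) == 1:
            positive += len(block)
    return positive / len(rows)


def significance(rows, labels, attr):
    """Score one attribute on its own (hypothetical stand-in for TFERS)."""
    return dependency(rows, labels, [attr])


# Toy data (assumed): binary term presence in four documents.
rows = [
    {"goal": 1, "team": 1, "the": 1},
    {"goal": 1, "team": 0, "the": 1},
    {"goal": 0, "team": 0, "the": 1},
    {"goal": 0, "team": 0, "the": 1},
]
labels = ["sports", "sports", "tech", "tech"]

scores = {term: significance(rows, labels, term) for term in rows[0]}
```

Because each attribute is scored once against the decision labels, the scores do not depend on which attributes have already been selected, which is the property the abstract attributes to the new significance measure. In this toy table, "goal" separates the two classes completely, "team" does so only for one document, and "the" carries no information.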