Discovering business intelligence from online product reviews: A rule-induction framework
Expert Systems with Applications: An International Journal
Text categorization is a key problem in text mining. Although it has been studied extensively, most work focuses on classification into broad categories. Very little research addresses text categorization problems characterised by many redundant features. We call this kind of problem fine-text-categorization. In this paper, we present an algorithm based on modified chi-square (CHI) feature selection and rough set theory to solve this problem. The features of each category are selected in an aggressive manner, and the classification rules are extracted using rough set theory. Experiments on real-world corpora show that our algorithm clearly improves classification precision and is therefore promising.
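To make the feature selection step concrete, the following is a minimal sketch of the standard chi-square (CHI) term score for a single category. It is not the modified variant the abstract refers to, whose details are not given here. The function names `chi_square` and `select_features`, the document representation (a list of token lists), and the cutoff `k` are all assumptions made for illustration.

```python
from collections import Counter


def chi_square(a, b, c, d):
    """Standard chi-square score from a 2x2 term/category table.

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that lack the term
    d: documents outside the category that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # Term or category occurs in all or none of the documents.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def select_features(docs, labels, category, k):
    """Return the k terms with the highest chi-square score for `category`.

    Hypothetical helper: `docs` is a list of token lists and `labels`
    holds one category label per document.
    """
    n_docs = len(docs)
    n_in_cat = sum(1 for y in labels if y == category)

    df_total = Counter()   # document frequency over the whole corpus
    df_in_cat = Counter()  # document frequency inside the category
    for tokens, y in zip(docs, labels):
        for term in set(tokens):
            df_total[term] += 1
            if y == category:
                df_in_cat[term] += 1

    scores = {}
    for term, df in df_total.items():
        a = df_in_cat[term]
        b = df - a
        c = n_in_cat - a
        d = n_docs - n_in_cat - b
        scores[term] = chi_square(a, b, c, d)

    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k]
```

Choosing a small `k` corresponds to aggressive selection: only the few terms most strongly associated with the category survive. Note that the plain statistic also gives high scores to terms that are strongly associated with the absence of the category, since the score depends only on the squared difference.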