Automated learning of decision rules for text categorization
ACM Transactions on Information Systems (TOIS)
The nature of statistical learning theory
The nature of statistical learning theory
Wrappers for feature subset selection
Artificial Intelligence - Special issue on relevance
Machine learning in automated text categorization
ACM Computing Surveys (CSUR)
Introduction to Modern Information Retrieval
Introduction to Modern Information Retrieval
Text Categorization with Suport Vector Machines: Learning with Many Relevant Features
ECML '98 Proceedings of the 10th European Conference on Machine Learning
A Comparative Study on Feature Selection in Text Categorization
ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
Information Sciences: an International Journal
Hi-index | 0.00 |
Text Categorization (TC)-the assignment of predefined categories to documents of a corpus-plays an important role in a wide variety of information organization and management tasks of Information Retrieval (IR). It involves the management of a lot of information, but some of them could be noisy or irrelevant and hence, a previous feature reduction could improve the performance of the classification. In this paper we proposed a wrapper approach. This kind of approach is time-consuming and sometimes could be infeasible. But our wrapper explores a reduced number of feature subsets and also it uses Support Vector Machines (SVM) as the evaluation system; and this two properties make the wrapper fast enough to deal with large number of features present in text domains. Taking the Reuters-21578 corpus, we also compare this wrapper with the common approach for feature reduction widely applied in TC, which consists of filtering according to scoring measures.