Text Categorization (TC) is the process of assigning documents to a set of predefined categories. Because of the large amount of information now available, much research aims to automate this time-consuming task. Machine Learning (ML) algorithms have recently been applied for this purpose. In this paper, we compare the performance of two such algorithms (SVM and ARNI) on a collection whose documents are unevenly distributed across categories. Before classification, we apply feature reduction using both classical measures (information gain and term frequency) and three new measures that we propose here for the first time, and we also compare the performance of these measures.
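
As a rough illustration of the two classical measures named above, the following sketch scores terms by information gain and by term frequency for a single binary category. It is a minimal example under stated assumptions, not the procedure used in the paper: the function names, the representation of documents as token lists, and the toy data are all hypothetical, and the three proposed measures are not shown because the abstract does not define them.

```python
import math


def entropy(pos, total):
    """Binary entropy (in bits) of a set with `pos` positives out of `total`."""
    if total == 0 or pos == 0 or pos == total:
        return 0.0
    p = pos / total
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def information_gain(docs, labels, term):
    """Reduction in category entropy obtained by knowing whether `term` occurs.

    docs   : list of token lists (assumed representation)
    labels : list of 0/1 flags, 1 meaning the document belongs to the category
    """
    n = len(docs)
    with_term = [lab for doc, lab in zip(docs, labels) if term in doc]
    without_term = [lab for doc, lab in zip(docs, labels) if term not in doc]
    h_before = entropy(sum(labels), n)
    h_after = (
        len(with_term) / n * entropy(sum(with_term), len(with_term))
        + len(without_term) / n * entropy(sum(without_term), len(without_term))
    )
    return h_before - h_after


def term_frequency(docs, term):
    """Total number of occurrences of `term` in the collection."""
    return sum(doc.count(term) for doc in docs)


# Toy collection (hypothetical data): two documents in the category, two outside.
docs = [
    ["wheat", "price"],
    ["wheat", "export"],
    ["oil", "price"],
    ["oil", "export"],
]
labels = [1, 1, 0, 0]

vocabulary = sorted({tok for doc in docs for tok in doc})
ranked = sorted(vocabulary, key=lambda t: information_gain(docs, labels, t), reverse=True)
```

In this toy collection, "wheat" and "oil" separate the category perfectly and rank first, while "price" and "export" carry no information about it. Feature reduction would then keep only the top-ranked terms before training a classifier.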