Text Categorization (TC) is an important task within Information Retrieval (IR). Feature Selection (FS) is a crucial step, because irrelevant features degrade performance. FS is usually performed by selecting the features with the highest scores according to a given measure. The disadvantage of this approach is that the number of features to select must be fixed in advance; it is commonly expressed as the percentage of words removed, called the Filtering Level (FL). It is therefore usual to run a set of experiments manually over several FLs chosen to represent all possible ones. This process does not guarantee that any of the chosen FLs is optimal, or even close to optimal. This paper addresses this difficulty by proposing a method that automatically determines optimal FLs by solving a univariate maximization problem.
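To make the idea concrete, the following is a minimal Python sketch of treating the FL as a single variable to maximize over. It is not the method of the paper: the abstract does not say which optimization procedure is used, so golden-section search is an assumption here, and it is only valid if performance is unimodal in the FL. The names `select_features`, `golden_section_max`, and `toy_performance` are hypothetical, and `toy_performance` stands in for a real train-and-evaluate step.

```python
import math


def select_features(scores, filtering_level):
    """Keep the top-scoring features after removing `filtering_level` percent.

    `scores` maps each feature (word) to its score under some FS measure.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_keep = max(1, round(len(ranked) * (1 - filtering_level / 100.0)))
    return ranked[:n_keep]


def golden_section_max(f, lo=0.0, hi=100.0, tol=0.5):
    """Approximate the maximizer of f on [lo, hi].

    Assumption: f is unimodal on the interval. The paper may rely on a
    different univariate method.
    """
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc > fd:
            # The maximum lies in [a, d]; reuse c as the new upper probe.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # The maximum lies in [c, b]; reuse d as the new lower probe.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2


def toy_performance(filtering_level):
    """Hypothetical stand-in for classifier performance at a given FL."""
    return -(filtering_level - 80.0) ** 2


best_fl = golden_section_max(toy_performance)
kept = select_features({"a": 3.0, "b": 1.0, "c": 2.0, "d": 0.5}, 50)
```

In a real setting, `toy_performance` would be replaced by a function that selects features at the given FL, trains a classifier, and returns a validation score. Each call is then expensive, which is the motivation for a search that evaluates only a few FLs instead of a manually chosen grid.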