Combination of feature selection methods for text categorisation
ECIR'11 Proceedings of the 33rd European conference on Advances in information retrieval
Text categorisation relies heavily on feature selection, which is valued both for reducing dimensionality and for improving classification performance. A range of feature selection methods has been developed for text, each with its own properties and each selecting different features. However, it remains unclear which of these methods can usefully be combined and what benefit the combination brings. In this paper we present correlation methods for analysing feature rankings and evaluate combinations of features according to these metrics. We further report the results of an extensive study of feature selection approaches using a wide range of combination methods, with experiments on 19 test collections.
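To make the idea of comparing and combining feature rankings concrete, the following is a minimal Python sketch. The abstract names neither the correlation measure nor the combination rules used in the paper, so Spearman's rho and reciprocal rank fusion are swapped in here purely as familiar stand-ins. The function names, the toy rankings, and the constant `k = 60` are illustrative assumptions, not the authors' method.

```python
from typing import Dict, List, Sequence


def spearman_rho(ranking_a: Sequence[str], ranking_b: Sequence[str]) -> float:
    """Spearman rank correlation of two tie-free rankings of the same features.

    Assumption: the paper's own correlation measure is not stated in the
    abstract; Spearman's rho is used here only as an example.
    """
    n = len(ranking_a)
    if (len(ranking_b) != n or len(set(ranking_a)) != n
            or set(ranking_a) != set(ranking_b)):
        raise ValueError("rankings must be permutations of the same feature set")
    if n < 2:
        raise ValueError("need at least two features")
    pos_b = {feat: i for i, feat in enumerate(ranking_b)}
    # Sum of squared rank differences over all features.
    d2 = sum((i - pos_b[feat]) ** 2 for i, feat in enumerate(ranking_a))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several rankings by summing 1 / (k + rank) for every feature.

    Assumption: one possible combination rule among many; features missing
    from a ranking simply receive no contribution from it.
    """
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, feat in enumerate(ranking, start=1):
            scores[feat] = scores.get(feat, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda f: (-scores[f], f))


# Hypothetical rankings produced by three unnamed selection metrics.
by_metric_a = ["price", "free", "offer", "meeting", "the"]
by_metric_b = ["free", "price", "meeting", "offer", "the"]
by_metric_c = ["price", "offer", "free", "the", "meeting"]

print(spearman_rho(by_metric_a, by_metric_b))
print(reciprocal_rank_fusion([by_metric_a, by_metric_b, by_metric_c]))
```

A low correlation between two rankings suggests that the underlying metrics select different features, which is the situation in which combining them might plausibly add information. A correlation close to 1 suggests the combination would mostly reproduce either ranking alone.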