Most common feature selection techniques for document categorization are supervised and require large amounts of training data to accurately capture the descriptive and discriminative information of the defined categories. Because training sets are extremely small in many classification tasks, this paper explores the use of unsupervised extractive summarization as a feature selection technique for document categorization. Experiments with training sets of different sizes indicate that text summarization is a competitive approach to feature selection and that it is particularly suited to settings with small training sets, where it could clearly outperform the traditional information gain technique.
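To make the idea concrete, the following is a minimal sketch of summarization-based feature selection. The abstract does not say which summarizer the authors use, so everything here is an assumption for illustration. The sketch scores sentences by the average in-document frequency of their terms, and the names `summarize`, `select_features`, and `ratio` are hypothetical. The point it shows is that no class labels are consulted: each document is reduced to its top-ranked sentences, and the feature vocabulary is the set of terms that survive.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(text, ratio=0.5):
    """Return the top-scoring sentences of `text`, in original order.

    Assumption: a sentence's score is the mean frequency, within the
    document, of its tokens. This stands in for whatever extractive
    summarizer is actually used in the paper.
    """
    sentences = split_sentences(text)
    if not sentences:
        return []
    tf = Counter(tokenize(text))

    def score(sentence):
        tokens = tokenize(sentence)
        return sum(tf[t] for t in tokens) / len(tokens) if tokens else 0.0

    keep = max(1, int(len(sentences) * ratio))
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:keep])]


def select_features(documents, ratio=0.5):
    """Build the vocabulary from the summaries only; no labels are needed."""
    vocabulary = set()
    for doc in documents:
        for sentence in summarize(doc, ratio):
            vocabulary.update(tokenize(sentence))
    return vocabulary


docs = ["Cats chase mice. Cats sleep. The weather is nice today."]
features = select_features(docs, ratio=0.34)
```

In this toy run a single sentence is kept per document, so `features` contains only the terms of the highest-scoring sentence. A supervised criterion such as information gain would instead need labeled examples to estimate how each term is distributed across categories, which is what makes it fragile when training sets are small.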