This paper presents a study of whether and how automatically extracted keywords can be used to improve text categorization. In summary, we show that higher performance, as measured by micro-averaged F-measure on a standard text categorization collection, is achieved when the full-text representation is combined with the automatically extracted keywords. The combination is obtained by giving higher weights to words in the full texts that are also extracted as keywords. We also present results for experiments in which the keywords are the only input to the categorizer, either represented as unigrams or kept intact. Of these two representations, the unigrams perform better, although neither performs as well as using the headlines alone.
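As an illustration of the combination step, the following is a minimal sketch of how words in a full text that also occur in the extracted keywords might be given higher weights. The abstract does not specify the weighting scheme, so the multiplicative `boost` factor, its default value, and the function name `boosted_term_weights` are assumptions made for this example only.

```python
from collections import Counter


def boosted_term_weights(full_text_tokens, keywords, boost=2.0):
    """Return term weights for one document.

    Each weight is the raw term frequency, multiplied by `boost` when the
    term also appears in one of the extracted keywords. The multiplicative
    form and the value of `boost` are assumptions, not the paper's scheme.
    """
    # Split multi-word keywords into unigrams so they can be matched
    # against the individual tokens of the full text.
    keyword_unigrams = {w for kw in keywords for w in kw.lower().split()}

    # Raw term frequencies of the lowercased full text.
    counts = Counter(t.lower() for t in full_text_tokens)

    return {
        term: tf * boost if term in keyword_unigrams else float(tf)
        for term, tf in counts.items()
    }


# Hypothetical document and keyword list, for illustration only.
tokens = "the bank raised interest rates as interest grew".split()
weights = boosted_term_weights(tokens, ["interest rates"])
```

In this example, "interest" and "rates" receive boosted weights because they occur in the keyword "interest rates", while the remaining words keep their raw frequencies. The resulting dictionary could then be fed to any categorizer that accepts weighted bag-of-words features.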