Researchers have concentrated on topic-based text classification, while the genre of a document is rarely considered. In this article, we discuss automatic genre classification and its application. We argue that word-level and sentence-level features are two important kinds of measure, and that their counts vary across genres. Word-level features include word frequencies and part-of-speech (POS) tag statistics. Sentence-level features include grammar rules, whose use is strongly related to genre. Based on these two views of a document, we explore a robust approach in which the co-training method is employed to achieve effective genre classification.
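To make the two-view idea concrete, the following is a minimal sketch of a co-training loop, not the method evaluated in the article. It assumes each document has already been reduced to two small numeric vectors, one per view. The nearest-centroid classifier, the function names (`centroid_fit`, `centroid_predict`, `co_train`), the feature choices, and all toy values are hypothetical and were chosen only to keep the example short.

```python
import math


def centroid_fit(X, y):
    """Fit a nearest-centroid classifier: one mean vector per genre label."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in acc] for c, acc in sums.items()}


def centroid_predict(model, x):
    """Return (label, margin); margin is the gap between the two nearest
    centroids and serves as a crude confidence score. Assumes >= 2 classes."""
    dists = sorted((math.dist(x, c), label) for label, c in model.items())
    return dists[0][1], dists[1][0] - dists[0][0]


def co_train(view_a, view_b, labels, rounds=5, per_round=1):
    """Co-training sketch. labels[i] is a genre string, or None if unlabeled.
    In turn, each view's classifier labels the unlabeled documents it is most
    confident about, and those labels become training data for the other view."""
    labels = list(labels)
    for _ in range(rounds):
        for view in (view_a, view_b):
            labeled = [i for i, l in enumerate(labels) if l is not None]
            unlabeled = [i for i, l in enumerate(labels) if l is None]
            if not unlabeled:
                break
            model = centroid_fit([view[i] for i in labeled],
                                 [labels[i] for i in labeled])
            preds = [(centroid_predict(model, view[i]), i) for i in unlabeled]
            preds.sort(key=lambda p: p[0][1], reverse=True)
            for (label, _margin), i in preds[:per_round]:
                labels[i] = label
    # Final models are fit on whatever ended up labeled (seeds plus pseudo-labels).
    done = [i for i, l in enumerate(labels) if l is not None]
    model_a = centroid_fit([view_a[i] for i in done], [labels[i] for i in done])
    model_b = centroid_fit([view_b[i] for i in done], [labels[i] for i in done])
    return model_a, model_b, labels


# Hypothetical toy data: six documents, two of them labeled.
# View A (word level): [average word length, share of pronoun POS tags]
view_a = [[5.2, 0.03], [4.1, 0.12], [5.0, 0.04],
          [5.3, 0.02], [4.0, 0.13], [4.2, 0.11]]
# View B (sentence level): [average sentence length, share of sentences
# that use a given grammar rule]
view_b = [[24.0, 0.40], [12.0, 0.15], [23.0, 0.38],
          [25.0, 0.42], [11.0, 0.14], [13.0, 0.17]]
seed_labels = ["news", "fiction", None, None, None, None]

model_a, model_b, completed = co_train(view_a, view_b, seed_labels)
print(completed)
```

The design point the sketch tries to show is that the labeled pool is recomputed before each view is trained, so a document labeled confidently from word-level evidence is immediately available as a training example for the sentence-level classifier, and vice versa. In practice each view would use a stronger classifier and real feature extraction.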