Subject or propositional content has been the focus of most classification research. Genre or style, on the other hand, is a different and important property of text, and automatic text genre classification is becoming important for classification and retrieval purposes as well as for some natural language processing research. In this paper, we present a method for automatic genre classification that is based on statistically selected features obtained from both subject-classified and genre-classified training data. The experimental results show that the proposed method outperforms a direct application of a statistical learner often used for subject classification. We also observe that the deviation and discrimination formulas using document frequency ratios work as expected. We conjecture that this dual feature set approach can be generalized to improve the performance of subject classification as well.
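The abstract names a deviation formula and a discrimination formula built on document frequency ratios but does not state them. The Python sketch below is therefore only an illustration of the general idea, not the paper's method: it assumes a term reveals a genre when it occurs in a large share of that genre's documents compared with other genres (a stand-in "discrimination" score) and when that share varies little across subject classes within the genre (a stand-in "deviation" score). The function names, the document layout (`genre`, `subject`, `tokens` keys), and the way the two scores are combined are all assumptions made for this example.

```python
from collections import defaultdict
from statistics import pstdev


def df_ratios(docs, key):
    """Map each label under `key` to {term: share of that label's docs containing it}."""
    counts = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    for doc in docs:
        label = doc[key]
        totals[label] += 1
        for term in set(doc["tokens"]):  # count each term once per document
            counts[label][term] += 1
    return {
        label: {term: n / totals[label] for term, n in terms.items()}
        for label, terms in counts.items()
    }


def discrimination(term, genre, genre_ratios):
    """Assumed score: DF ratio in the target genre minus the highest ratio elsewhere."""
    own = genre_ratios[genre].get(term, 0.0)
    others = [r.get(term, 0.0) for g, r in genre_ratios.items() if g != genre]
    return own - max(others, default=0.0)


def deviation(term, genre, docs):
    """Assumed score: spread of the term's DF ratio across subjects inside one genre."""
    in_genre = [d for d in docs if d["genre"] == genre]
    by_subject = df_ratios(in_genre, "subject")
    return pstdev([r.get(term, 0.0) for r in by_subject.values()])


def select_genre_terms(docs, genre, top_k):
    """Rank terms by discrimination minus deviation (an illustrative combination)."""
    genre_ratios = df_ratios(docs, "genre")
    scores = {
        term: discrimination(term, genre, genre_ratios) - deviation(term, genre, docs)
        for term in genre_ratios[genre]
    }
    # Highest score first; ties are broken alphabetically for a stable result.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]


if __name__ == "__main__":
    # Tiny hand-made corpus, invented for this example only.
    toy_docs = [
        {"genre": "editorial", "subject": "sports", "tokens": ["we", "believe", "football"]},
        {"genre": "editorial", "subject": "politics", "tokens": ["we", "believe", "election"]},
        {"genre": "report", "subject": "sports", "tokens": ["football", "score", "said"]},
        {"genre": "report", "subject": "politics", "tokens": ["election", "vote", "said"]},
    ]
    print(select_genre_terms(toy_docs, "editorial", top_k=2))
```

In this toy corpus, subject words such as "football" score poorly as genre features because they appear in only one subject class within a genre, while words spread evenly across subjects within a single genre rank first. The selected terms would then feed whatever statistical learner is used for classification; that step is left out here because the abstract does not specify it.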