Automatic Turkish text categorization in terms of author, genre and gender

Authors:
M. Fatih Amasyalı;Banu Diri
Affiliations:
Computer Engineering Department, Yıldız Technical University, Beşiktaş, İstanbul, Turkey;Computer Engineering Department, Yıldız Technical University, Beşiktaş, İstanbul, Turkey
Venue:
NLDB'06 Proceedings of the 11th international conference on Applications of Natural Language to Information Systems
Year:
2006

Abstract

This study presents the first comprehensive text classification for Turkish using an n-gram model. We address three tasks: identifying the author of a Turkish document, classifying documents by genre, and identifying the gender of the author, all automatically. Naive Bayes, Support Vector Machine, C4.5 and Random Forest were used as classification methods, and their results are compared. The success rates obtained for determining the author of the text, the genre of the text and the gender of the author were 83%, 93% and 96%, respectively.
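
To make the general idea concrete, the following is a minimal sketch of n-gram based text classification with a Naive Bayes classifier. It is not the authors' implementation: the abstract does not state whether the n-grams are taken over characters or words, so character bigrams are assumed here, and the class name `NgramNaiveBayes`, the helper `char_ngrams`, the add-one smoothing and the four toy training sentences are all hypothetical choices made for illustration.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=2):
    """Return the overlapping character n-grams of a lowercased text."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


class NgramNaiveBayes:
    """Multinomial Naive Bayes over character n-gram counts.

    Assumption: add-one (Laplace) smoothing; the paper's exact
    feature set and smoothing are not given in the abstract.
    """

    def __init__(self, n=2):
        self.n = n
        self.class_docs = Counter()                # documents per class
        self.ngram_counts = defaultdict(Counter)   # n-gram counts per class
        self.vocab = set()                         # all n-grams seen in training

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            grams = char_ngrams(text, self.n)
            self.class_docs[label] += 1
            self.ngram_counts[label].update(grams)
            self.vocab.update(grams)
        return self

    def predict(self, text):
        grams = char_ngrams(text, self.n)
        total_docs = sum(self.class_docs.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_docs.items():
            counts = self.ngram_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # log prior plus the sum of smoothed log likelihoods
            score = math.log(doc_count / total_docs)
            for gram in grams:
                score += math.log((counts[gram] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy genre data (sports vs. economy), not from the paper.
train_texts = [
    "takım maçı kazandı ve gol attı",
    "futbol maçında iki gol vardı",
    "borsa bugün yükseldi ve dolar düştü",
    "merkez bankası faiz kararını açıkladı",
]
train_labels = ["spor", "spor", "ekonomi", "ekonomi"]

model = NgramNaiveBayes(n=2).fit(train_texts, train_labels)
print(model.predict("maçta gol atan takım kazandı"))
```

The same structure would apply to the author and gender tasks by swapping the labels. A real experiment would need a much larger corpus and a held-out evaluation; the other classifiers named in the abstract (Support Vector Machine, C4.5, Random Forest) could be trained on the same n-gram count vectors.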