The aim of this study is to identify the author of a document whose authorship is unknown. Ten different feature vectors are built from authorship attributes, n-grams, and various combinations of the two, all extracted from the documents whose authors are to be identified. The performance of each feature vector is compared using the Naïve Bayes, SVM, k-NN, RF, and MLP classification methods. The most successful classifiers are MLP and SVM. In document classification, n-grams give higher accuracy than authorship attributes. Nevertheless, using n-grams and authorship attributes together gives better results than using either alone.
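The idea of combining n-grams with authorship attributes in a single feature vector can be illustrated with a minimal sketch. It is not the study's implementation: the abstract does not say which n-grams or which attributes were used, so character bigrams, the three attributes in `style_attributes`, and the 1-nearest-neighbour rule with cosine similarity are all assumptions made for illustration, and every function name is hypothetical.

```python
from collections import Counter
import math
import re


def char_ngrams(text, n=2):
    """Count overlapping character n-grams in the lowercased text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def style_attributes(text):
    """Compute a few illustrative authorship attributes (assumed, not from the study)."""
    words = re.findall(r"\w+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = max(len(words), 1)
    return {
        "avg_word_len": sum(len(w) for w in words) / n_words,
        "avg_sent_len": len(words) / max(len(sentences), 1),
        "comma_rate": text.count(",") / n_words,
    }


def combined_features(text, n=2):
    """Merge relative n-gram frequencies and style attributes into one sparse vector."""
    grams = char_ngrams(text, n)
    total = max(sum(grams.values()), 1)
    features = {("ngram", g): c / total for g, c in grams.items()}
    features.update({("attr", k): v for k, v in style_attributes(text).items()})
    return features


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_author(train, text):
    """Return the author of the most similar training text (1-nearest neighbour).

    `train` is a list of (author, text) pairs.
    """
    query = combined_features(text)
    best = max(train, key=lambda pair: cosine(combined_features(pair[1]), query))
    return best[0]
```

In this sketch the attribute values are much larger than the n-gram frequencies, so they dominate the similarity. A realistic setup would scale each feature group before combining them and would use one of the classifiers named in the abstract in place of the nearest-neighbour rule.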