Authorship attribution with latent Dirichlet allocation

Authors:
Yanir Seroussi;Ingrid Zukerman;Fabian Bohnert
Affiliations:
Monash University, Clayton, Victoria, Australia;Monash University, Clayton, Victoria, Australia;Monash University, Clayton, Victoria, Australia
Venue:
CoNLL '11 Proceedings of the Fifteenth Conference on Computational Natural Language Learning
Year:
2011

Abstract

The problem of authorship attribution -- attributing texts to their original authors -- has been an active research area since the end of the 19th century, attracting increased interest in the last decade. Most work on authorship attribution focuses on scenarios with only a few candidate authors, but cases with tens to thousands of candidate authors, which have been considered only recently, were found to be much more challenging. In this paper, we propose ways of employing Latent Dirichlet Allocation in authorship attribution. We show that our approach yields state-of-the-art performance for both a few and many candidate authors, in cases where these authors wrote enough texts to be modelled effectively.