Adversarial stylometry: Circumventing authorship recognition to preserve privacy and anonymity

Authors: Michael Brennan; Sadia Afroz; Rachel Greenstadt
Affiliations: Drexel University; Drexel University; Drexel University
Venue: ACM Transactions on Information and System Security (TISSEC)
Year: 2012

Abstract

Stylometry, the recognition of authorship through purely linguistic means, has contributed to breakthroughs in literary, historical, and criminal investigations. Existing stylometry research assumes that authors have not attempted to disguise their writing style. We challenge this basic assumption and present a new area of research: adversarial stylometry. Adversaries have a devastating effect on the robustness of existing classification methods. Our work presents a framework for creating adversarial passages of three kinds: obfuscation, where a subject attempts to hide her identity; imitation, where a subject attempts to frame another subject by imitating his writing style; and translation, where original passages are obfuscated with machine translation services. This research demonstrates that manual circumvention methods work very well, while automated translation methods are not effective. Obfuscation reduces the techniques' effectiveness to the level of random guessing, and imitation attempts succeed up to 67% of the time, depending on the stylometry technique used. These results are all the more significant because the experimental subjects were unfamiliar with stylometry, were not professional writers, and spent little time on the attacks. This article also contributes to the field by using human subjects to empirically validate the claim of high accuracy for four current techniques (without adversaries). We have also compiled and released two corpora of adversarial stylometry texts, with a total of 57 unique authors, to promote research in this field. We argue that this field is important to a multidisciplinary approach to privacy, security, and anonymity.