Detecting Hoaxes, Frauds, and Deception in Writing Style Online

Authors:
Sadia Afroz;Michael Brennan;Rachel Greenstadt
Venue:
SP '12 Proceedings of the 2012 IEEE Symposium on Security and Privacy
Year:
2012

Citing 0
Cited 4

Use fewer instances of the letter "i": toward writing style anonymization
PETS'12 Proceedings of the 12th international conference on Privacy Enhancing Technologies

Adversarial stylometry: Circumventing authorship recognition to preserve privacy and anonymity
ACM Transactions on Information and System Security (TISSEC)

Detecting stylistic deception
EACL 2012 Proceedings of the Workshop on Computational Approaches to Deception Detection

k-subscription: privacy-preserving microblogging browsing through obfuscation
Proceedings of the 29th Annual Computer Security Applications Conference

Abstract

In digital forensics, questions often arise about the authors of documents: their identity, their demographic background, and whether they can be linked to other documents. The field of stylometry uses linguistic features and machine learning techniques to answer these questions. Stylometry techniques can identify authors with high accuracy in non-adversarial scenarios, but their accuracy drops to the level of random guessing when authors intentionally obfuscate their writing style or imitate that of another author. These results are good for privacy, but they raise concerns about fraud. We argue that some linguistic features change when people hide their writing style, and that stylistic deception can be recognized by identifying those features. The major contribution of this work is a method for detecting stylistic deception in written documents. We show that, using a large feature set, it is possible to distinguish regular documents from deceptive documents with 96.6% accuracy (F-measure). We also present an analysis of linguistic features that can be modified to hide writing style.
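
The general idea in the abstract, mapping each document to a vector of linguistic features and then training a classifier to separate regular from deceptive writing, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the seven-word `FUNCTION_WORDS` list, the three summary features, and the nearest-centroid classifier are hypothetical stand-ins, not the feature set or learning algorithm used in the paper.

```python
import math
import re

# Hypothetical, tiny function-word list; NOT the paper's feature set.
FUNCTION_WORDS = ("the", "of", "and", "to", "i", "that", "it")


def style_features(text):
    """Map a document to a small vector of writing-style features."""
    words = re.findall(r"[a-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = max(len(words), 1)  # avoid division by zero on empty input
    features = [
        sum(len(w) for w in words) / n_words,   # average word length
        len(set(words)) / n_words,              # type-token ratio
        len(words) / max(len(sentences), 1),    # average sentence length
    ]
    # Relative frequency of each function word.
    features += [words.count(fw) / n_words for fw in FUNCTION_WORDS]
    return features


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def train(labeled_docs):
    """Build one centroid per label from (text, label) pairs.

    Nearest-centroid is a stand-in chosen for brevity; it is an assumption,
    not the classifier reported in the paper.
    """
    by_label = {}
    for text, label in labeled_docs:
        by_label.setdefault(label, []).append(style_features(text))
    return {label: centroid(vectors) for label, vectors in by_label.items()}


def predict(model, text):
    """Return the label whose centroid is closest in Euclidean distance."""
    vector = style_features(text)
    return min(model, key=lambda label: math.dist(vector, model[label]))
```

In use, `train` would receive documents labeled, for example, `"regular"` or `"deceptive"`, and `predict` would label an unseen document. The features here are not normalized, so average sentence length tends to dominate the distance; a realistic setup would scale the features and use a far larger feature set, as the abstract indicates.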