On the use of homogenous sets of subjects in deceptive language analysis

Authors:
Tommaso Fornaciari; Massimo Poesio
Affiliations:
University of Trento; University of Essex and University of Trento
Venue:
EACL 2012 Proceedings of the Workshop on Computational Approaches to Deception Detection
Year:
2012

Abstract

Recent studies on deceptive language suggest that machine learning algorithms can classify texts as truthful or untruthful with good results. However, the models presented so far do not attempt to exploit the differences between subjects. In this paper, models are trained to classify statements made in court as false or not-false, using not only the whole corpus but also more homogeneous subsets of producers of deceptive language. The results suggest that the models are effective in recognizing false statements, and that their performance can be improved when homogeneous subsets of data are provided.
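
The contrast between a whole-corpus model and models trained on homogeneous subsets can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name the classifier or say how the subsets are identified, so a small multinomial Naive Bayes model stands in for the classifier, a hypothetical `group` field stands in for the subset assignment, and the four statements are invented toy data, not material from the corpus.

```python
import math
from collections import Counter, defaultdict


def train_nb(samples):
    """Train a multinomial Naive Bayes model with add-one smoothing.

    samples: list of (text, label) pairs.
    """
    word_counts = defaultdict(Counter)  # label -> token frequencies
    label_counts = Counter()            # label -> number of statements
    vocab = set()
    for text, label in samples:
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def predict_nb(model, text):
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in text.lower().split():
            if tok in vocab:  # tokens never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def train_per_group(records):
    """Train one model per group. records: list of (group, text, label)."""
    by_group = defaultdict(list)
    for group, text, label in records:
        by_group[group].append((text, label))
    return {group: train_nb(samples) for group, samples in by_group.items()}


# Hypothetical toy data: the cue word "honestly" signals opposite labels
# in the two invented groups of subjects, "A" and "B".
records = [
    ("A", "honestly i never saw him", "false"),
    ("A", "i saw him at noon", "not-false"),
    ("B", "honestly i saw him at noon", "not-false"),
    ("B", "i never saw him", "false"),
]

pooled_model = train_nb([(text, label) for _, text, label in records])
group_models = train_per_group(records)
```

In this toy setting, `predict_nb(group_models["A"], "honestly")` and `predict_nb(group_models["B"], "honestly")` return different labels because each model only sees how its own group uses the cue, whereas `pooled_model` has to give a single answer for both groups. This only illustrates why subject-specific subsets might help; it says nothing about the size of the effect reported in the paper.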