Improving a textual deception detection model

  • Authors:
  • S. Gupta; D. B. Skillicorn

  • Affiliations:
  • Queen's University; Queen's University

  • Venue:
  • CASCON '06 Proceedings of the 2006 conference of the Center for Advanced Studies on Collaborative research
  • Year:
  • 2006


Abstract

In intelligence, law enforcement, and, increasingly, organizational settings, there is interest in detecting deception, for example in intercepted phone calls, emails, and web sites. Humans are not naturally good at detecting deception, but recent work has shown that deception is readily detectable using markers that humans do not notice but that software can easily compute. Pennebaker's model suggests that deceptive communication is characterized by changes in the frequency of four kinds of words: first-person pronouns, exception words, negative-emotion words, and action words.

We investigate what can be learned about the deception model by applying it to a large corpus of Enron emails. We show that each of the four kinds of words in the Pennebaker model acts as a separate latent factor for deception, rather than their effects being mixed together.
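The marker-counting idea described above can be sketched as follows. This is an illustrative sketch, not the authors' code: the tiny word lists here are hypothetical stand-ins for the full LIWC-style dictionaries the Pennebaker model actually uses, and real analyses would score frequencies per document across a corpus.

```python
import re

# Hypothetical sample vocabularies for the four Pennebaker marker
# categories; the real model relies on much larger word dictionaries.
MARKERS = {
    "first_person": {"i", "me", "my", "mine", "myself"},
    "exception": {"but", "except", "without", "however"},
    "negative_emotion": {"hate", "worthless", "enemy", "angry"},
    "action": {"go", "carry", "run", "lead"},
}

def marker_frequencies(text):
    """Return each category's frequency: marker count / total word count."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words) or 1  # avoid division by zero on empty input
    return {cat: sum(w in vocab for w in words) / total
            for cat, vocab in MARKERS.items()}

# Example: 10 words, of which 3 are first-person pronouns (i, i, my).
freqs = marker_frequencies("I hate this, but I will go without my notes.")
```

Shifts in these four frequencies, relative to a baseline, are what the model treats as signals of deceptive text.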