An experimental comparison of real and artificial deception using a deception generation model

  • Authors:
  • Yanjuan Yang; Michael V. Mannino

  • Affiliations:
  • Automapath Inc., Santa Clara, CA 95050, United States; The Business School, University of Colorado Denver, United States

  • Venue:
  • Decision Support Systems
  • Year:
  • 2012

Abstract

Developing a data mining approach for a deception application can carry prohibitive data collection costs because both deceptive and truthful data must be collected. To reduce these costs, artificially generated deception data can be used, but the impact of using such data is not well understood. To study the relationship between artificial and real deception, this paper presents an experimental comparison using a novel deception generation model. The deception and truth data were collected from financial aid applications, a document-centric area with limited resources for verification. The data collection provided a unique data set containing truth, natural deception, and boosted deception. To simulate deception, the Application Deception Model was developed to generate artificial deception in different deception scenarios. To study differences between artificial and real deception, an experiment was performed using deception level and data generation method as factors and directed distance and outlier score as outcome variables. Our results provided evidence of a reasonable similarity between artificial and real deception, suggesting the possibility of using artificially generated deception to reduce the costs associated with obtaining training data.