An experimental comparison of real and artificial deception using a deception generation model

Authors:
Yanjuan Yang; Michael V. Mannino
Affiliations:
Automapath Inc., Santa Clara, CA 95050, United States; The Business School, University of Colorado Denver, United States
Venue:
Decision Support Systems
Year:
2012

Abstract

Developing a data mining approach for a deception application can involve prohibitive data collection costs because both deceptive and truthful data must be collected. Artificially generated deception data can reduce these costs, but the impact of using such data is not well understood. To study the relationship between artificial and real deception, this paper presents an experimental comparison using a novel deception generation model. The deception and truth data were collected from financial aid applications, a document-centric area with limited resources for verification. The data collection provided a unique data set containing truth, natural deception, and boosted deception. To simulate deception, the Application Deception Model was developed to generate artificial deception in different deception scenarios. To study differences between artificial and real deception, an experiment was performed using deception level and data generation method as factors and directed distance and outlier score as outcome variables. The results provide evidence of a reasonable similarity between artificial and real deception, suggesting that artificially generated deception could be used to reduce the cost of obtaining training data.
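
The abstract does not spell out the Application Deception Model or how the outcome variables are computed, so the following Python sketch is only an illustration of the general idea: generate artificial deception from truthful records at several deception levels, then score how far the result sits from the truthful data. Everything in it is an assumption made for illustration, not the paper's method: the two attributes (income, assets), the multiplicative distortion rule, the assumption that applicants understate both values, and the k-nearest-neighbour distance used as a stand-in outlier score.

```python
import math
import random


def make_truthful(n, rng):
    """Hypothetical truthful records: [income, assets] in arbitrary units."""
    return [[rng.gauss(50.0, 5.0), rng.gauss(20.0, 3.0)] for _ in range(n)]


def make_deceptive(record, level, direction, rng):
    """Shift each attribute in its favorable direction.

    `level` in [0, 1] scales the distortion; each attribute moves by a
    random 50-100% of that level. This rule is an illustrative assumption.
    """
    return [v * (1.0 + d * level * rng.uniform(0.5, 1.0))
            for v, d in zip(record, direction)]


def outlier_score(point, reference, k=3):
    """Mean Euclidean distance from `point` to its k nearest reference records."""
    nearest = sorted(math.dist(point, r) for r in reference)[:k]
    return sum(nearest) / len(nearest)


def mean_outlier_score(level, truthful, direction, rng):
    """Average outlier score of artificial deception generated at one level."""
    fakes = [make_deceptive(r, level, direction, rng) for r in truthful]
    return sum(outlier_score(f, truthful) for f in fakes) / len(fakes)


rng = random.Random(0)
truthful = make_truthful(40, rng)
understate = [-1, -1]  # assumed: lower reported values favor the applicant
for level in (0.1, 0.3, 0.5):
    print(level, round(mean_outlier_score(level, truthful, understate, rng), 2))
```

In a comparison like the one the abstract describes, the same score would also be computed for real deceptive records, so that the two generation methods can be compared at each deception level.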