Large Scale Detection of Irregularities in Accounting Data

Authors:
Stephen Bay; Krishna Kumaraswamy; Markus G. Anderle; Rohit Kumar; David M. Steier
Affiliations:
PricewaterhouseCoopers LLP, USA (all authors)
Venue:
ICDM '06 Proceedings of the Sixth International Conference on Data Mining
Year:
2006

Cited by (7):

Incremental pattern discovery on streams, graphs and tensors
ACM SIGKDD Explorations Newsletter

SNARE: a link analytic system for graph labeling and risk detection
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining

RADAR: rare category detection via computation of boundary degree
PAKDD'11 Proceedings of the 15th Pacific-Asia conference on Advances in knowledge discovery and data mining - Volume Part II

OddBall: spotting anomalies in weighted graphs
PAKDD'10 Proceedings of the 14th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining - Volume Part II

Metafraud: a meta-learning framework for detecting financial fraud
MIS Quarterly

Inside insider trading: patterns & discoveries from a large scale exploratory analysis
Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining

Rare category exploration
Expert Systems with Applications: An International Journal

Abstract

In recent years, there have been several large accounting frauds in which a company's financial results were intentionally misrepresented by billions of dollars. In response, regulatory bodies have mandated that auditors perform analytics on detailed financial data with the intent of discovering such misstatements. For a large auditing firm, this may mean analyzing millions of records from thousands of clients. This paper proposes techniques for automatic analysis of company general ledgers on such a large scale, identifying irregularities (which may indicate fraud or just honest errors) for additional review by auditors. These techniques have been implemented in a prototype system, called Sherlock, which combines aspects of both outlier detection and classification. In developing Sherlock, we faced three major challenges: developing an efficient process for obtaining data from many heterogeneous sources, training classifiers with only positive and unlabeled examples, and presenting information to auditors in an easily interpretable manner. In this paper, we describe how we addressed these challenges over the past two years and report on experiments evaluating Sherlock.