Discovery of financial fraud has profound social consequences. Loss of stockholder value, bankruptcy, and loss of confidence in the professional audit firms have resulted from failure to detect financial fraud. Previous studies that have attempted to discover fraud patterns from publicly available information have achieved only moderate levels of success. This study explores the capabilities of recently developed statistical learning and data mining methods in an attempt to advance fraud discovery performance to levels that have potential for proactive discovery or mitigation of financial fraud. The partially adaptive methods we test have achieved success in a number of complex problem domains and are easily interpretable. Ensemble methods, which combine predictions from multiple models via boosting, bagging, or related approaches, have emerged as among the most powerful data mining and machine learning methods. Our study includes random forests, stochastic gradient boosting, and rule ensembles. The results for ensemble models show marked improvement over past efforts, with accuracy approaching levels of practical potential. In particular, rule ensembles do so while maintaining a degree of interpretability absent in the other ensemble methods. © 2012 Wiley Periodicals, Inc.
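As a rough illustration of how an ensemble combines predictions from multiple models, the sketch below implements plain bagging with majority voting in pure Python. It is not the study's method or data: the one-split "stump" learner, the synthetic two-feature dataset, and all function names (`fit_stump`, `fit_bagging`, `predict_bagging`) are assumptions made for this example, and random forests, stochastic gradient boosting, and rule ensembles are considerably more elaborate.

```python
import random
from collections import Counter


def fit_stump(rows, labels):
    """Fit a one-split classifier: choose the (feature, threshold, polarity)
    with the fewest errors on the given sample."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            for polarity in (1, 0):
                preds = [polarity if r[f] > t else 1 - polarity for r in rows]
                errors = sum(p != y for p, y in zip(preds, labels))
                if best is None or errors < best[0]:
                    best = (errors, f, t, polarity)
    return best[1:]


def predict_stump(stump, row):
    f, t, polarity = stump
    return polarity if row[f] > t else 1 - polarity


def fit_bagging(rows, labels, n_models=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(rows)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx]))
    return models


def predict_bagging(models, row):
    """Combine the individual predictions by majority vote."""
    votes = Counter(predict_stump(m, row) for m in models)
    return votes.most_common(1)[0][0]


# Synthetic toy data (an assumption for this sketch, not real fraud data):
# the label is 1 when the first feature exceeds 0.5.
data_rng = random.Random(42)
rows = [(data_rng.random(), data_rng.random()) for _ in range(200)]
labels = [1 if r[0] > 0.5 else 0 for r in rows]

models = fit_bagging(rows, labels)
accuracy = sum(predict_bagging(models, r) == y
               for r, y in zip(rows, labels)) / len(rows)
print(f"training accuracy of the bagged stumps: {accuracy:.3f}")
```

An odd number of models is used so that a binary majority vote cannot tie. Accuracy here is measured on the training sample only, which overstates real performance; a held-out set would be needed for any meaningful evaluation.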