In this paper we present a hybrid method for identifying suspicious behavior in transactional data by combining techniques from outlier detection and subgroup discovery. Most existing outlier detection approaches identify single outlying records, rather than groups of them, and provide no description of the outliers they find. When searching for fraud, however, it is important to analyze data not at the level of single records but at a higher, group level, such as the sets of records belonging to a customer or a shop. Our method analyzes data at this higher level and additionally describes the groups of outliers it finds. The method involves three steps: scoring individual records with a newly proposed outlier measure computed with the help of random forests; identifying unusual groups of records with subgroup discovery techniques; and, finally, identifying the most deviating entities, such as shops or hospitals.
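The three steps can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the random-forest outlier measure, so a simple robust deviation score stands in for step 1, and the quality function in step 2 (square root of subgroup size times the difference in mean score) is one common choice assumed here. The toy records, attribute names, and function names are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt
from statistics import mean, median

# Hypothetical toy transactions; "shop" is the entity, the rest describe a record.
RECORDS = [
    {"shop": "shop_a", "category": "food", "payment": "card", "amount": 12.0},
    {"shop": "shop_a", "category": "food", "payment": "cash", "amount": 11.0},
    {"shop": "shop_a", "category": "toys", "payment": "card", "amount": 13.0},
    {"shop": "shop_b", "category": "food", "payment": "card", "amount": 12.5},
    {"shop": "shop_b", "category": "toys", "payment": "cash", "amount": 11.5},
    {"shop": "shop_b", "category": "food", "payment": "cash", "amount": 12.0},
    {"shop": "shop_c", "category": "toys", "payment": "cash", "amount": 48.0},
    {"shop": "shop_c", "category": "toys", "payment": "cash", "amount": 52.0},
    {"shop": "shop_c", "category": "food", "payment": "card", "amount": 12.0},
    {"shop": "shop_c", "category": "toys", "payment": "cash", "amount": 50.0},
]
ATTRS = ("category", "payment")  # attributes used to describe subgroups


def score_records(records):
    """Step 1 (stand-in, NOT the paper's measure): deviation from the median
    amount, scaled by the median absolute deviation."""
    amounts = [r["amount"] for r in records]
    med = median(amounts)
    mad = median([abs(a - med) for a in amounts]) or 1.0
    return [abs(a - med) / mad for a in amounts]


def covers(description, record):
    """True if the record satisfies every attribute=value condition."""
    return all(record[attr] == value for attr, value in description)


def best_subgroup(records, scores, attrs=ATTRS, max_depth=2):
    """Step 2: exhaustive search over conjunctions of up to max_depth
    conditions, ranked by sqrt(n) * (subgroup mean - overall mean)."""
    overall = mean(scores)
    conditions = sorted({(a, r[a]) for r in records for a in attrs})
    best, best_quality = None, float("-inf")
    for depth in range(1, max_depth + 1):
        for description in combinations(conditions, depth):
            if len({attr for attr, _ in description}) < depth:
                continue  # two conditions on the same attribute contradict
            covered = [s for r, s in zip(records, scores) if covers(description, r)]
            if not covered:
                continue
            quality = sqrt(len(covered)) * (mean(covered) - overall)
            if quality > best_quality:
                best, best_quality = description, quality
    return best, best_quality


def rank_entities(records, description, key="shop"):
    """Step 3: rank entities by the share of their records in the subgroup."""
    hits, totals = defaultdict(int), defaultdict(int)
    for r in records:
        totals[r[key]] += 1
        hits[r[key]] += covers(description, r)
    shares = [(entity, hits[entity] / totals[entity]) for entity in totals]
    return sorted(shares, key=lambda pair: pair[1], reverse=True)


scores = score_records(RECORDS)
subgroup, quality = best_subgroup(RECORDS, scores)
ranking = rank_entities(RECORDS, subgroup)
print("subgroup:", subgroup, "quality:", round(quality, 2))
print("entities:", ranking)
```

On this toy data the search settles on the conjunction of toy purchases paid in cash, and the shop with the largest share of its records in that subgroup is ranked first. The description of the subgroup is what gives an analyst a readable explanation of why the entity stands out.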