Social networks offer a vast amount of additional information that can enrich standard learning algorithms, but the most challenging part is extracting the relevant information from networked data. Fraudulent behavior is imperceptibly concealed in both local and relational data, making it even harder to define useful input for prediction models. Starting from expert knowledge, this paper efficiently incorporates social network effects to detect fraud for the Belgian governmental social security institution and to improve the performance of traditional non-relational fraud prediction tasks. Because there are many types of social security fraud, this paper concentrates on payment fraud, predicting which companies intentionally disobey their payment duties to the government. We introduce a new fraudulent structure, the so-called spider construction, which can easily be translated into social network terms and included in the learning algorithms. By focusing on the egonet of each company, the proposed method can handle large-scale networks. To address the skewed class distribution, the SMOTE approach is applied to rebalance the data. The models were trained on different timestamps and evaluated on varying time windows. Using techniques such as Random Forest, logistic regression, and Naive Bayes, this paper shows that the combined relational model improves the AUC score and the precision of the predictions compared with the base scenario in which only local variables are used.
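To make the egonet idea concrete, the following is a minimal sketch of how relational features might be derived from the one-hop neighborhood of each company. It is not the paper's implementation: the edge list, the `known_fraud` set, the function names, and the specific features (degree, fraudulent-neighbor ratio, links among neighbors) are all assumptions chosen for illustration.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (company_a, company_b) pairs."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def egonet_features(company, adjacency, known_fraud):
    """Compute hypothetical relational features from a company's one-hop egonet."""
    neighbors = adjacency.get(company, set())
    fraud_neighbors = neighbors & known_fraud
    # Count links between pairs of neighbors; `u < v` counts each link once.
    links_among_neighbors = sum(
        1
        for u in neighbors
        for v in adjacency[u]
        if v in neighbors and u < v
    )
    return {
        "degree": len(neighbors),
        "fraud_neighbors": len(fraud_neighbors),
        "fraud_ratio": len(fraud_neighbors) / len(neighbors) if neighbors else 0.0,
        "links_among_neighbors": links_among_neighbors,
    }


# Illustrative toy data (assumed, not taken from the paper).
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
known_fraud = {"B", "D"}
adjacency = build_adjacency(edges)
features_a = egonet_features("A", adjacency, known_fraud)
```

Because each company's features depend only on its direct neighbors, the computation never needs the full graph in one pass, which is one plausible reading of why an egonet-based approach scales to large networks. Features like these would then be appended to the local variables before rebalancing and training.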