Using social network knowledge for detecting spider constructions in social security fraud

Authors:
Véronique Van Vlasselaer; Jan Meskens; Dries Van Dromme; Bart Baesens
Affiliations:
Katholieke Universiteit Leuven, Leuven, Belgium; Research Center, Brussels, Belgium; Research Center, Brussels, Belgium; Katholieke Universiteit Leuven, Leuven, Belgium and University of Southampton, Highfield Southampton, United Kingdom
Venue:
Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
Year:
2013

Abstract

Social networks offer a vast amount of additional information to enrich standard learning algorithms, but the most challenging part is extracting relevant information from networked data. Fraudulent behavior is imperceptibly concealed in both local and relational data, making it even harder to define useful input for prediction models. Starting from expert knowledge, this paper efficiently incorporates social network effects to detect fraud for the Belgian governmental social security institution and to improve the performance of traditional non-relational fraud prediction tasks. As there are many types of social security fraud, this paper concentrates on payment fraud, predicting which companies intentionally disobey their payment duties to the government. We introduce a new fraudulent structure, the so-called spider construction, which can easily be translated into social network terms and included in the learning algorithms. By focusing on the egonet of each company, the proposed method can handle large-scale networks. To address the skewed class distribution, the SMOTE approach is applied to rebalance the data. The models were trained on different timestamps and evaluated on varying time windows. Using techniques such as Random Forest, logistic regression and Naive Bayes, this paper shows that the combined relational model improves the AUC score and the precision of the predictions compared to the base scenario in which only local variables are used.
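To make the egonet idea concrete, the following is a minimal sketch of how relational features might be derived from the direct neighborhood of each company. It is not the paper's implementation. It assumes a simple undirected company-to-company graph given as an edge list, and the names `build_adjacency`, `egonet_features`, and the feature keys are hypothetical, chosen for illustration only.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build undirected adjacency sets from (u, v) pairs, ignoring self-loops."""
    adj = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue
        adj[u].add(v)
        adj[v].add(u)
    return adj


def egonet_features(adj, node, fraudulent):
    """Compute illustrative features on the egonet of `node`.

    The egonet is taken to be the node, its direct neighbors, and all
    edges among them. `fraudulent` is a set of nodes already labeled
    as fraudulent (an assumption of this sketch).
    """
    neighbors = adj.get(node, set())
    ego_nodes = neighbors | {node}
    # Each undirected edge inside the egonet is seen from both endpoints,
    # so the sum is halved.
    ego_edges = sum(len(adj.get(n, set()) & ego_nodes) for n in ego_nodes) // 2
    fraud_neighbors = len(neighbors & fraudulent)
    return {
        "degree": len(neighbors),
        "ego_edges": ego_edges,
        "fraud_neighbors": fraud_neighbors,
        "fraud_ratio": fraud_neighbors / len(neighbors) if neighbors else 0.0,
    }


# Toy data: five companies, two of them known to be fraudulent.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]
fraudulent = {"B", "D"}
adj = build_adjacency(edges)
features_c = egonet_features(adj, "C", fraudulent)
features_e = egonet_features(adj, "E", fraudulent)
features_z = egonet_features(adj, "Z", fraudulent)  # company with no links
```

Because each feature vector depends only on a node and its direct neighbors, the cost per company is bounded by the size of its neighborhood rather than by the size of the whole network. Features of this kind could then be appended to the local variables before rebalancing and training a classifier.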