This paper proposes an innovative fraud detection method, built upon existing fraud detection research and Minority Report, to deal with the data mining problem of skewed data distributions. The method applies backpropagation (BP), naive Bayesian (NB), and C4.5 algorithms to data partitions derived from minority oversampling with replacement. Its originality lies in using a single meta-classifier (stacking) to choose the best base classifiers and then combining these base classifiers' predictions (bagging) to improve cost savings (stacking-bagging). Results on a publicly available automobile insurance fraud detection data set demonstrate that stacking-bagging performs slightly better, in terms of cost savings, than the best-performing bagged algorithm, C4.5, and its best classifier, C4.5 (2). Stacking-bagging also outperforms the technique commonly used in industry (BP with neither sampling nor partitioning). The paper then compares the new fraud detection method (the meta-learning approach) against C4.5 trained using undersampling, oversampling, and SMOTEing without partitioning (the sampling approach). Results show that, given a fixed decision threshold and cost matrix, partitioning combined with multiple algorithms achieves marginally higher cost savings than varying the class distribution of the entire training data set. The most interesting finding is that the combination of classifiers producing the best cost savings draws on all three algorithms.
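The partitioning, vote-combining, and cost-savings ideas mentioned above can be illustrated with a minimal Python sketch. It is not the paper's implementation. The function names, the reading of "oversampling with replacement" as reusing every minority row in each partition, the definition of cost savings as "cost of flagging nothing minus cost of the model's decisions", and the cost values are all assumptions made for illustration. The stacking step that selects base classifiers is omitted.

```python
import random
from collections import Counter


def make_partitions(majority, minority, n_partitions, seed=0):
    """Split majority rows into disjoint chunks and append ALL minority
    rows to each chunk (assumed reading of oversampling with replacement)."""
    rng = random.Random(seed)
    shuffled = majority[:]
    rng.shuffle(shuffled)
    chunks = [shuffled[i::n_partitions] for i in range(n_partitions)]
    return [chunk + minority for chunk in chunks]


def majority_vote(predictions):
    """Combine several classifiers' label lists by per-example majority vote."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def total_cost(y_true, y_pred, cost):
    """Sum the cost-matrix entry for each (true label, predicted label) pair."""
    return sum(cost[(t, p)] for t, p in zip(y_true, y_pred))


def cost_savings(y_true, y_pred, cost, negative_label=0):
    """Assumed definition: cost of never flagging fraud minus model cost."""
    no_action = total_cost(y_true, [negative_label] * len(y_true), cost)
    return no_action - total_cost(y_true, y_pred, cost)


if __name__ == "__main__":
    legal = [("legal", i) for i in range(6)]
    fraud = [("fraud", i) for i in range(2)]
    for part in make_partitions(legal, fraud, n_partitions=3):
        print(len(part), "rows,", sum(r[0] == "fraud" for r in part), "fraud")

    # Hypothetical predictions from three base classifiers (1 = fraud).
    y_true = [1, 0, 1, 0]
    preds = [[1, 0, 0, 0], [1, 1, 1, 0], [0, 0, 1, 0]]
    combined = majority_vote(preds)

    # Hypothetical cost matrix keyed by (true, predicted): an investigation
    # costs 10, a missed fraud costs 100.
    cost = {(1, 1): 10, (1, 0): 100, (0, 1): 10, (0, 0): 0}
    print(combined, cost_savings(y_true, combined, cost))
```

With an odd number of classifiers and binary labels the vote cannot tie. Under a fixed cost matrix like this one, different classifier combinations can be ranked directly by their cost savings on held-out data.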