Missing data in large insurance datasets affects the learning and classification accuracies in predictive modelling. Insurance datasets will continue to increase in size as more variables are added to aid in managing client risk, and will therefore become even more vulnerable to missing data. This paper proposes a hybrid multi-layered artificial immune system and genetic algorithm for partial imputation of missing data in datasets with numerous variables. The multi-layered artificial immune system creates and stores antibodies that bind to and annihilate an antigen. The genetic algorithm optimises the learning process of a stimulated antibody. The imputation is evaluated using the RIPPER, k-nearest neighbour, naive Bayes and logistic discriminant classifiers. The effect of the imputation on the classifiers is compared with that of the mean/mode and hot deck imputation methods. The results demonstrate that when missing data imputation is performed using the proposed hybrid method, classification improves and robustness to the amount of missing data increases relative to the mean/mode method for data missing completely at random (MCAR), missing at random (MAR), and not missing at random (NMAR). The imputation performance is similar to or marginally better than that of hot deck imputation.
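
For readers unfamiliar with the two baseline methods named above, the following is a minimal sketch of mean/mode imputation and a simple random hot deck, not the implementation evaluated in the paper. The function names, the use of `None` to mark a missing entry, the choice of a random (rather than nearest-donor) hot deck, and the toy insurance columns are all assumptions made for illustration.

```python
import random
from collections import Counter
from statistics import mean

MISSING = None  # assumption: missing entries are encoded as None


def mean_mode_impute(column):
    """Fill gaps with the mean (numeric column) or the mode (categorical column)."""
    observed = [v for v in column if v is not MISSING]
    if not observed:
        return list(column)  # no observed values to estimate from
    is_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in observed
    )
    if is_numeric:
        fill = mean(observed)
    else:
        fill = Counter(observed).most_common(1)[0][0]
    return [fill if v is MISSING else v for v in column]


def hot_deck_impute(column, rng):
    """Fill each gap with a value drawn at random from the observed donors."""
    donors = [v for v in column if v is not MISSING]
    if not donors:
        return list(column)
    return [rng.choice(donors) if v is MISSING else v for v in column]


# Hypothetical toy columns, invented for this example.
claim_amount = [1200.0, None, 800.0, 1000.0, None]
vehicle_type = ["sedan", "suv", None, "sedan", "sedan"]

imputed_amount = mean_mode_impute(claim_amount)
imputed_type = mean_mode_impute(vehicle_type)
hot_deck_amount = hot_deck_impute(claim_amount, random.Random(0))
```

Mean/mode imputation gives every gap in a column the same value, which shrinks the column's variance, whereas a hot deck reuses values that were actually observed and so preserves more of the original distribution.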