Objectives: Missing data imputation is an important task when it is crucial to use all available data rather than discard records with missing values. This work evaluates the performance of several statistical and machine learning imputation methods used to predict recurrence in patients in an extensive real breast cancer data set.

Materials and methods: Imputation methods based on statistical techniques (mean, hot-deck and multiple imputation) and on machine learning techniques (multi-layer perceptron (MLP), self-organising maps (SOM) and k-nearest neighbour (KNN)) were applied to data collected through the "El Alamo-I" project. The results were then compared with those obtained using listwise deletion (LD), which discards incomplete records instead of imputing them. The database includes demographic, therapeutic and recurrence-survival information from 3679 women with operable invasive breast cancer diagnosed in 32 different hospitals belonging to the Spanish Breast Cancer Research Group (GEICAM). Prediction accuracy for early cancer relapse was measured using artificial neural networks (ANNs), with a separate ANN estimated on each data set with imputed missing values.

Results: The imputation methods based on machine learning algorithms outperformed the statistical imputation methods in the prediction of patient outcome. Friedman's test revealed a significant difference (p=0.0091) in the observed area under the ROC curve (AUC) values, and the pairwise comparison test showed that the AUCs for MLP, KNN and SOM were significantly higher (p=0.0053, p=0.0048 and p=0.0071, respectively) than the AUC from the LD-based prognosis model.

Conclusion: The methods based on machine learning techniques were the most suited for the imputation of missing values and led to a significant enhancement of prognosis accuracy compared to imputation methods based on statistical procedures.
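To make the difference between the compared strategies concrete, the following minimal sketch contrasts listwise deletion, mean imputation and KNN imputation on a tiny invented table. It is not the study's implementation: the function names, the `None` encoding for missing entries, the choice of distance (Euclidean over the features both records observe) and the toy values are all assumptions made for illustration.

```python
import math

MISSING = None  # assumption: missing entries are encoded as None


def listwise_deletion(rows):
    """Keep only the records that have no missing values."""
    return [r for r in rows if all(v is not MISSING for v in r)]


def mean_impute(rows):
    """Replace each missing value with the mean of its column's observed values."""
    means = []
    for j in range(len(rows[0])):
        observed = [r[j] for r in rows if r[j] is not MISSING]
        means.append(sum(observed) / len(observed))
    return [[means[j] if v is MISSING else v for j, v in enumerate(r)]
            for r in rows]


def _distance(a, b):
    """Root-mean-square difference over the features observed in both records."""
    shared = [(x, y) for x, y in zip(a, b)
              if x is not MISSING and y is not MISSING]
    if not shared:
        return math.inf  # no common feature: treat as infinitely far
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))


def knn_impute(rows, k=2):
    """Fill each missing value with the mean of that feature over the
    k nearest records in which the feature is observed."""
    imputed = [list(r) for r in rows]
    for i, r in enumerate(rows):
        for j, v in enumerate(r):
            if v is not MISSING:
                continue
            donors = [(_distance(r, other), other[j])
                      for m, other in enumerate(rows)
                      if m != i and other[j] is not MISSING]
            donors.sort(key=lambda d: d[0])
            nearest = [value for _, value in donors[:k]]
            imputed[i][j] = sum(nearest) / len(nearest)
    return imputed


# Toy records (invented values): two features, one missing entry in record 1.
toy = [
    [1.0, 2.0],
    [1.1, MISSING],
    [0.9, 1.8],
    [5.0, 10.0],
    [5.2, 10.4],
]

complete_only = listwise_deletion(toy)   # drops record 1 entirely
mean_filled = mean_impute(toy)           # fills with the global column mean
knn_filled = knn_impute(toy, k=2)        # fills from the two most similar records
```

In this toy table the incomplete record resembles the first cluster, so the KNN estimate stays close to that cluster's values, whereas the column mean is pulled towards the second cluster and listwise deletion loses the record altogether. This only illustrates the mechanics; it says nothing about the relative performance reported in the study.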