Missing values imputation for cDNA microarray data using ranked covariance vectors
International Journal of Hybrid Intelligent Systems - Recent developments in Hybrid Intelligent Systems
Ameliorative missing value imputation for robust biological knowledge inference
Journal of Biomedical Informatics
AI'05 Proceedings of the 18th Australian Joint conference on Advances in Artificial Intelligence
Microarray data often contain multiple missing gene expression values, which degrade the performance of statistical and machine learning algorithms. This paper presents a K ranked diagonal covariance-based missing value estimation algorithm (KRCOV) that demonstrates significantly better performance than the more commonly used K-nearest neighbour (KNN) imputation algorithm when applied to estimate missing values in BRCA1, BRCA2 and Sporadic genetic mutation samples from ovarian cancer. Experimental results confirm that KRCOV outperformed both KNN and zero imputation in terms of classification accuracy when used to impute values missing at random at rates from 1% to 5%. The classifier used for this purpose was the Generalized Regression Neural Network. The paper also offers a hypothesis for why KRCOV performs better than KNN, not only for bioinformatics data but also for other data types with strongly correlated values.
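The difference between distance-based and covariance-based neighbour selection can be illustrated with a minimal sketch. This is not the paper's KRCOV algorithm, whose details are not given in the abstract. It assumes, for illustration only, that a missing entry is filled with the plain mean of that column over the k most similar rows, and that "similar" means either smallest mean squared difference (a KNN-style baseline) or largest sample covariance (a hypothetical stand-in for covariance ranking). The function names `impute`, `neg_mean_sq_diff` and `covariance` are invented for this example.

```python
import math

def _shared(a, b):
    """Value pairs at positions where both rows are observed (not NaN)."""
    return [(x, y) for x, y in zip(a, b)
            if not (math.isnan(x) or math.isnan(y))]

def neg_mean_sq_diff(a, b):
    """KNN-style similarity: negative mean squared difference."""
    p = _shared(a, b)
    if not p:
        return -math.inf
    return -sum((x - y) ** 2 for x, y in p) / len(p)

def covariance(a, b):
    """Assumed covariance-style similarity: signed sample covariance."""
    p = _shared(a, b)
    if len(p) < 2:
        return -math.inf
    mx = sum(x for x, _ in p) / len(p)
    my = sum(y for _, y in p) / len(p)
    return sum((x - mx) * (y - my) for x, y in p) / (len(p) - 1)

def impute(rows, k, similarity):
    """Return a copy of rows with each NaN replaced by the column mean
    over the k most similar rows that have that column observed."""
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if not math.isnan(value):
                continue
            # Score every other row that has column j observed.
            scored = [(similarity(row, other), idx)
                      for idx, other in enumerate(rows)
                      if idx != i and not math.isnan(other[j])]
            scored = [s for s in scored if s[0] > -math.inf]
            scored.sort(reverse=True)  # most similar first
            top = scored[:k]
            if top:
                filled[i][j] = sum(rows[idx][j] for _, idx in top) / len(top)
            else:
                filled[i][j] = 0.0  # fall back to zero imputation
    return filled

# Toy expression matrix: rows are genes, columns are samples.
nan = math.nan
rows = [
    [1.0, 2.0, 3.0, nan],
    [1.1, 2.1, 3.1, 4.0],
    [5.0, 1.0, 0.0, 9.0],
    [0.9, 1.9, 2.9, 3.8],
]
knn_filled = impute(rows, 2, neg_mean_sq_diff)
cov_filled = impute(rows, 2, covariance)
```

On this toy matrix both rankings select the second and fourth rows as neighbours of the first row, so both fill the gap with roughly 3.9. The two rankings diverge when rows co-vary strongly but sit at different expression levels, which is the kind of strongly correlated data the abstract refers to.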