HIS '04 Proceedings of the Fourth International Conference on Hybrid Intelligent Systems
Missing values imputation for cDNA microarray data using ranked covariance vectors
International Journal of Hybrid Intelligent Systems - Recent developments in Hybrid Intelligent Systems
AI'05 Proceedings of the 18th Australian Joint conference on Advances in Artificial Intelligence
PRIB '08 Proceedings of the Third IAPR International Conference on Pattern Recognition in Bioinformatics
Microarray data are used in a wide range of applications, from diagnosis to drug discovery. Such data, however, often contain multiple missing gene expression values, which are generally ignored, degrading the reliability of inferred results. This paper presents an innovative and robust imputation framework that estimates missing values more accurately, leading in turn to better gene selection and class prediction. To test this premise, several missing value techniques are analysed, including Collateral Missing Values Estimation (CMVE), Bayesian Principal Component Analysis (BPCA), Least Square Impute (LSImpute), k-Nearest Neighbour (KNN) and ZeroImpute. Gene selection is then performed with a combination of univariate and multiple-gene selection methods, namely Between Group to Within Group Sum of Squares and Weighted Partial Least Squares, before class prediction is applied using the Ridge Partial Least Squares method. Overall, CMVE consistently provided more accurate missing value estimates than the other algorithms examined, by exploiting local and global as well as positive and negative correlations between genes. All empirical results were corroborated by the two-sided Wilcoxon rank-sum test of statistical significance.
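To make the comparison of imputation techniques concrete, the following is a minimal sketch of the two simplest baselines named in the abstract, ZeroImpute and KNN, scored on artificially removed entries. It is not the paper's CMVE algorithm or its experimental setup. The function names, the toy expression matrix, the choice of `k=3`, and the error normalisation (RMS error divided by the RMS of the true values) are all assumptions made for illustration.

```python
import numpy as np

def zero_impute(X):
    """ZeroImpute baseline: replace every missing entry (NaN) with zero."""
    return np.where(np.isnan(X), 0.0, X)

def knn_impute(X, k=3):
    """Illustrative KNN imputation (hypothetical helper, not the paper's code).

    Rows are genes, columns are samples. Each missing entry is filled with the
    mean of that column over the k nearest genes that have the value observed.
    Distance is the RMS difference over columns observed in both genes.
    """
    X = np.asarray(X, dtype=float)
    filled = X.copy()
    for i in np.flatnonzero(np.isnan(X).any(axis=1)):
        diff = X - X[i]                      # NaN wherever either gene is missing
        shared = ~np.isnan(diff)
        counts = shared.sum(axis=1)
        sq = np.where(shared, diff ** 2, 0.0).sum(axis=1)
        dist = np.full(X.shape[0], np.inf)
        ok = counts > 0
        dist[ok] = np.sqrt(sq[ok] / counts[ok])
        dist[i] = np.inf                     # a gene is not its own neighbour
        order = np.argsort(dist)
        for j in np.flatnonzero(np.isnan(X[i])):
            cand = [r for r in order
                    if np.isfinite(dist[r]) and not np.isnan(X[r, j])][:k]
            # Fall back to zero if no neighbour has this column observed.
            filled[i, j] = X[cand, j].mean() if cand else 0.0
    return filled

def nrmse(truth, estimate, mask):
    """RMS error over the masked entries, normalised by the RMS of the true values
    (one possible normalisation, assumed here)."""
    err = truth[mask] - estimate[mask]
    return np.sqrt(np.mean(err ** 2)) / np.sqrt(np.mean(truth[mask] ** 2))

# Toy data: two groups of five genes, each a shifted copy of one profile.
p1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
p2 = p1[::-1].copy()
truth = np.vstack([p1 + 0.01 * r for r in range(5)] +
                  [p2 + 0.01 * r for r in range(5)])

# Remove three known entries so the estimates can be scored against the truth.
mask = np.zeros(truth.shape, dtype=bool)
mask[0, 2] = mask[3, 0] = mask[7, 4] = True
data = truth.copy()
data[mask] = np.nan

score_zero = nrmse(truth, zero_impute(data), mask)
score_knn = nrmse(truth, knn_impute(data, k=3), mask)
print("ZeroImpute NRMSE:", score_zero)
print("KNN NRMSE:", score_knn)
```

On this toy matrix the genes within each group are almost identical, so the neighbour-based estimate lands much closer to the removed values than zero does. Real microarray data are far noisier, which is the setting in which correlation-aware methods such as CMVE are compared.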