Missing values imputation for cDNA microarray data using ranked covariance vectors
International Journal of Hybrid Intelligent Systems - Recent developments in Hybrid Intelligent Systems
Ameliorative missing value imputation for robust biological knowledge inference
Journal of Biomedical Informatics
PRIB '08 Proceedings of the Third IAPR International Conference on Pattern Recognition in Bioinformatics
How to improve postgenomic knowledge discovery using imputation
EURASIP Journal on Bioinformatics and Systems Biology - Special issue on applications of signal processing techniques to bioinformatics, genomics, and proteomics
Comments on selected fundamental aspects of microarray analysis
Computational Biology and Chemistry
A survey of evolutionary algorithms for clustering
IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews
Discovery of gene regulatory networks in Aspergillus fumigatus
KDECB'06 Proceedings of the 1st international conference on Knowledge discovery and emergent complexity in bioinformatics
International Journal of Data Mining and Bioinformatics
Pattern recognition using boundary data of component distributions
Computers and Industrial Engineering
Iterative clustering analysis for grouping missing data in gene expression profiles
PAKDD'06 Proceedings of the 10th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining
Missing values estimation in microarray data with partial least squares regression
ICCS'06 Proceedings of the 6th international conference on Computational Science - Volume Part II
Mixtures of common factor analyzers for high-dimensional data with missing information
Journal of Multivariate Analysis
Motivation: In microarray experiments, missing entries arise from blemishes on the chips. In large-scale studies, virtually every chip contains some missing entries, and more than 90% of the genes are affected. Many analysis methods require a complete data set, so either the genes with missing entries are excluded or the missing entries are filled with estimates before the analysis. This study compares methods of missing value estimation.

Results: Two evaluation metrics of imputation accuracy are employed. First, the root mean squared error measures the difference between the true values and the imputed values. Second, the number of mis-clustered genes measures the difference between clustering with the true values and clustering with the imputed values; it examines the bias that imputation introduces into clustering. Gaussian mixture clustering with model averaging imputation is superior to all other imputation methods compared, according to both evaluation metrics, on both time-series (correlated) and non-time-series (uncorrelated) data sets.

Availability: MATLAB code is available on request from the authors.
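The first evaluation metric described in the Results, the root mean squared error between true and imputed values, can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration: the function names `row_mean_impute` and `imputation_rmse`, the use of `None` to mark a missing entry, and the toy matrix are hypothetical. Row-mean imputation is used only as a simple stand-in baseline; it is not the Gaussian mixture method of the study, and this is not the authors' code.

```python
import math


def row_mean_impute(matrix):
    """Fill None entries with the mean of the observed values in the same row (gene).

    A simple stand-in baseline, not the Gaussian mixture method from the abstract.
    """
    filled = []
    for row in matrix:
        observed = [v for v in row if v is not None]
        # Fall back to 0.0 if a gene has no observed values at all.
        mean = sum(observed) / len(observed) if observed else 0.0
        filled.append([mean if v is None else v for v in row])
    return filled


def imputation_rmse(true_matrix, masked_matrix, imputed_matrix):
    """RMSE computed only over the entries that are missing in masked_matrix."""
    squared_errors = [
        (t - i) ** 2
        for t_row, m_row, i_row in zip(true_matrix, masked_matrix, imputed_matrix)
        for t, m, i in zip(t_row, m_row, i_row)
        if m is None
    ]
    if not squared_errors:
        raise ValueError("no missing entries to evaluate")
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy example: rows are genes, columns are arrays (hypothetical values).
true_values = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
masked = [[1.0, None, 3.0], [None, 5.0, 6.0]]  # entries removed artificially

imputed = row_mean_impute(masked)
rmse = imputation_rmse(true_values, masked, imputed)
print(rmse)
```

Restricting the error to artificially removed entries is one common way to set up such a comparison, because the true values are then known; whether the study used exactly this protocol is not stated in the abstract.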