Missing Value Estimation For Microarray Data Based On Fuzzy C-means Clustering

  • Authors:
  • JiaWei Luo;Tao Yang;Yan Wang

  • Affiliations:
  • Hunan University, Changsha, China;Hunan University, Changsha, China;Hunan University, Changsha, China

  • Venue:
  • HPCASIA '05 Proceedings of the Eighth International Conference on High-Performance Computing in Asia-Pacific Region
  • Year:
  • 2005

Abstract

Microarray experiments can generate data sets with multiple missing expression values, normally due to various experimental problems. Unfortunately, many algorithms for gene expression analysis require a complete matrix of gene array values as input. Effective missing value estimation methods are therefore needed to minimize the effect of incomplete data sets on analysis and to increase the range of data sets to which these algorithms can be applied. In this paper, a new imputation method (FCMimpute) based on the fuzzy C-means clustering algorithm is proposed to estimate missing values in microarray data; it utilizes information in the cluster structures. The method imputes each missing value from the corresponding attribute of all cluster centers, which are obtained through a fuzzy C-means clustering algorithm applicable to incomplete data. We compare the estimation accuracy of our method with that of the widely used KNNimpute and of the SKNNimpute method on various microarray data sets with different percentages of missing entries. In our experiments, the proposed FCMimpute method shows better performance than the other methods in terms of root mean square error.
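The abstract does not give the update rules, so the following is only a minimal sketch of one way the idea could be realized, not the authors' FCMimpute implementation. It assumes a partial-distance variant of fuzzy C-means (distances and centers computed over observed entries only) and assumes that each missing entry is filled with the membership-weighted average of the cluster centers for that attribute. The names `fcm_impute` and `rmse`, and the parameters `n_clusters`, `m`, `n_iter`, `tol`, and `seed`, are hypothetical.

```python
import numpy as np


def fcm_impute(X, n_clusters=3, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Illustrative fuzzy C-means imputation; NaN marks a missing entry.

    Assumption: partial-distance FCM plus membership-weighted centers.
    This is a sketch, not the algorithm as specified in the paper.
    """
    X = np.asarray(X, dtype=float)
    observed = ~np.isnan(X)                  # True where a value is known
    X0 = np.where(observed, X, 0.0)          # zero-fill so sums skip missing cells
    n, d = X.shape

    rng = np.random.default_rng(seed)
    U = rng.random((n, n_clusters))          # random initial memberships
    U /= U.sum(axis=1, keepdims=True)        # each row sums to 1

    for _ in range(n_iter):
        W = U ** m                           # fuzzified weights, shape (n, c)
        # Center update: weighted mean over rows where the attribute is observed.
        V = (W.T @ X0) / np.maximum(W.T @ observed.astype(float), 1e-12)

        # Partial squared distance: observed attributes only, rescaled to d dims.
        diff = (X0[:, None, :] - V[None, :, :]) * observed[:, None, :]
        scale = d / np.maximum(observed.sum(axis=1), 1)
        D2 = np.maximum((diff ** 2).sum(axis=2) * scale[:, None], 1e-12)

        # Standard FCM membership update from the distances.
        inv = D2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break

    estimates = U @ V                        # membership-weighted centers, (n, d)
    return np.where(observed, X, estimates)  # keep observed values untouched


def rmse(truth, estimate, hidden):
    """Root mean square error over the entries that were hidden."""
    err = np.asarray(truth)[hidden] - np.asarray(estimate)[hidden]
    return float(np.sqrt(np.mean(err ** 2)))
```

To evaluate such a sketch in the way the abstract describes, one would hide a chosen percentage of entries in a complete expression matrix, call `fcm_impute` on the masked copy, and pass the original matrix, the imputed matrix, and the boolean mask of hidden entries to `rmse`.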