Missing value estimation of microarray data using similarity measurement

  • Authors:
  • Soumen Kumar Pati;Asit Kumar Das

  • Affiliations:
  • Department of Computer Science/Information Technology, St. Thomas' College of Engineering and Technology, Kolkata, India; Department of Computer Science and Technology, Bengal Engineering and Science University, Howrah, India

  • Venue:
  • SEMCCO'12 Proceedings of the Third International Conference on Swarm, Evolutionary, and Memetic Computing
  • Year:
  • 2012

Abstract

DNA gene expression profiling plays an important role in a wide range of areas in biological science, particularly in handling cancer. Data generated in microarray experiments often contain many missing expression values, which causes valuable information to be lost from the dataset. The proposed method first partitions the genes without missing values using a clustering algorithm, then measures the similarity between each gene with missing values and the centroid of each cluster, and finally estimates the missing values from the corresponding expression values of the centroid that gives the maximum similarity factor. The method depends explicitly on expression values to impute missing values, completing the input dataset with low error for data analysis and knowledge discovery. The method is compared with prominent approaches, such as zero-impute, row-average-impute and KNN-impute, in terms of "Normalized Root Mean Square Error" to support its claim of novelty.
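The three steps described in the abstract (cluster the complete genes, match each incomplete gene to its most similar centroid, copy the centroid's values into the gaps) can be illustrated with a minimal Python sketch. The abstract does not name the clustering algorithm, the similarity measure, or the exact form of the error metric, so everything specific here is an assumption made for illustration: plain k-means with a deterministic farthest-point initialization, a similarity of 1 / (1 + Euclidean distance) computed over the observed columns only, and NRMSE taken as RMS error divided by the standard deviation of the true values (one common definition). The function names `kmeans`, `similarity`, `impute` and `nrmse` are hypothetical and do not come from the paper.

```python
import math


def sqdist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, iters=50):
    """Plain k-means on complete rows (assumed clustering step)."""
    # Deterministic farthest-point initialization: start from the first row,
    # then repeatedly add the row farthest from the centroids chosen so far.
    centroids = [list(rows[0])]
    while len(centroids) < k:
        far = max(rows, key=lambda r: min(sqdist(r, c) for c in centroids))
        centroids.append(list(far))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for r in rows:
            nearest = min(range(k), key=lambda j: sqdist(r, centroids[j]))
            groups[nearest].append(r)
        # Recompute each centroid as the column-wise mean of its group;
        # an empty group keeps its previous centroid.
        updated = [
            [sum(col) / len(g) for col in zip(*g)] if g else centroids[i]
            for i, g in enumerate(groups)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def similarity(gene, centroid):
    """Assumed similarity: 1 / (1 + distance), using observed columns only."""
    d = math.sqrt(sum((g - c) ** 2 for g, c in zip(gene, centroid) if g is not None))
    return 1.0 / (1.0 + d)


def impute(data, k=2):
    """Fill None entries from the centroid with maximum similarity."""
    complete = [r for r in data if None not in r]
    centroids = kmeans(complete, k)
    filled = []
    for r in data:
        if None in r:
            best = max(centroids, key=lambda c: similarity(r, c))
            r = [b if v is None else v for v, b in zip(r, best)]
        filled.append(list(r))
    return filled


def nrmse(true_vals, est_vals):
    """Assumed NRMSE: RMS error divided by the std. dev. of the true values."""
    n = len(true_vals)
    mse = sum((t - e) ** 2 for t, e in zip(true_vals, est_vals)) / n
    mean = sum(true_vals) / n
    var = sum((t - mean) ** 2 for t in true_vals) / n
    return math.sqrt(mse / var)


# Toy expression matrix: rows are genes, columns are samples, None is missing.
toy = [
    [1.0, 1.0, 1.0],
    [1.2, 0.8, 1.0],
    [5.0, 5.0, 5.0],
    [5.2, 4.8, 5.0],
    [1.1, None, 0.9],
    [None, 5.1, 4.9],
]
completed = impute(toy, k=2)
```

In this toy matrix the two incomplete genes fall clearly into the low-expression and high-expression groups, so each gap is filled from the matching centroid. On real microarray data the number of clusters and the similarity measure would need to be chosen with care, and the baselines named in the abstract (zero-impute, row-average-impute, KNN-impute) would be scored with the same error function for comparison.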