K-Ranked Covariance Based Missing Values Estimation for Microarray Data Classification

  • Authors:
  • Muhammad Shoaib B. Sehgal;Iqbal Gondal;Laurence Dooley

  • Affiliations:
  • GSCIT, Monash University, Australia;GSCIT, Monash University, Australia;GSCIT, Monash University, Australia

  • Venue:
  • HIS '04 Proceedings of the Fourth International Conference on Hybrid Intelligent Systems
  • Year:
  • 2004

Abstract

Microarray data often contains multiple missing genetic expression values that degrade the performance of statistical and machine learning algorithms. This paper presents a K-ranked diagonal covariance-based missing value estimation algorithm (KRCOV) that has demonstrated significantly superior performance compared to the more commonly used K-nearest neighbour (KNN) imputation algorithm when applied to estimate missing values of BRCA1, BRCA2 and Sporadic genetic mutation samples present in ovarian cancer. Experimental results confirm that KRCOV outperformed both KNN and zero imputation techniques in terms of classification accuracy when used to impute randomly missing values at rates from 1% to 5%. The classifier used for this purpose was the Generalized Regression Neural Network. The paper also provides a hypothesis for why KRCOV performs better than KNN, not only for bioinformatics data but also for other data types with strongly correlated values.
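
The two baselines named in the abstract, zero imputation and KNN imputation, and the random-missingness protocol can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the synthetic expression matrix, the function names (`mask_random`, `zero_impute`, `knn_impute`, `rmse`), the distance measure, and the use of RMSE rather than the classification accuracy the paper reports. KRCOV itself is not sketched, since the abstract does not specify it.

```python
import math
import random


def mask_random(data, fraction, rng):
    """Return a copy of `data` with about `fraction` of entries set to None,
    plus the list of (row, column) positions that were hidden."""
    rows, cols = len(data), len(data[0])
    cells = [(i, j) for i in range(rows) for j in range(cols)]
    hidden = rng.sample(cells, max(1, round(fraction * len(cells))))
    masked = [row[:] for row in data]
    for i, j in hidden:
        masked[i][j] = None
    return masked, hidden


def zero_impute(masked):
    """Replace every missing entry with 0.0."""
    return [[0.0 if v is None else v for v in row] for row in masked]


def knn_impute(masked, k):
    """Fill each missing entry with the mean of that column over the k rows
    (genes) closest to the target row. Distance is the root-mean-square
    difference over columns observed in both rows (an assumed choice)."""
    filled = [row[:] for row in masked]
    for i, row in enumerate(masked):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for r, other in enumerate(masked):
                if r == i or other[j] is None:
                    continue
                shared = [(a - b) ** 2 for a, b in zip(row, other)
                          if a is not None and b is not None]
                if shared:
                    dist = math.sqrt(sum(shared) / len(shared))
                    candidates.append((dist, other[j]))
            candidates.sort(key=lambda pair: pair[0])
            nearest = [v for _, v in candidates[:k]]
            # Fall back to 0.0 when no usable neighbour exists.
            filled[i][j] = sum(nearest) / len(nearest) if nearest else 0.0
    return filled


def rmse(truth, filled, hidden):
    """Root-mean-square error over the hidden positions only."""
    total = sum((truth[i][j] - filled[i][j]) ** 2 for i, j in hidden)
    return math.sqrt(total / len(hidden))


# Synthetic, strongly correlated "expression" matrix: 20 genes x 8 samples.
# Each gene is a scaled copy of one shared profile (illustrative data only).
truth = [[(1 + 0.1 * g) * (5 + s) for s in range(8)] for g in range(20)]

rng = random.Random(0)
masked, hidden = mask_random(truth, 0.05, rng)  # hide 5% of the entries

zero_filled = zero_impute(masked)
knn_filled = knn_impute(masked, k=3)

print("zero imputation RMSE:", rmse(truth, zero_filled, hidden))
print("KNN imputation RMSE: ", rmse(truth, knn_filled, hidden))
```

In this sketch rows stand for genes and columns for samples, so KNN borrows values from genes with similar profiles. Repeating the run with `fraction` set between 0.01 and 0.05 mirrors the 1% to 5% range mentioned in the abstract.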