Combination of KNN-Based Feature Selection and KNN-Based Missing-Value Imputation of Microarray Data

  • Authors:
  • Phayung Meesad;Kairung Hengpraprohm

  • Venue:
  • ICICIC '08 Proceedings of the 2008 3rd International Conference on Innovative Computing Information and Control
  • Year:
  • 2008

Abstract

Microarrays are a useful biological resource for studying living organisms at the molecular level. Microarray data sets usually have only a few samples but high dimensionality and many missing values, which makes downstream analysis less efficient. This paper proposes a methodology for imputing missing values in microarray data. The proposed methodology combines KNN-based feature selection with KNN-based imputation (KNNFS Impute). KNNFS Impute comprises two main ideas: feature selection and estimation of new values. The proposed method is compared with traditional KNN and row-average imputation for estimating missing values on three microarray data sets: Lung Tumor, Colon Cancer, and ALL-AML Leukemia. Estimation quality is measured by the Normalized Root Mean Squared Error (NRMSE), with lower values indicating better estimates. The results show that the proposed method achieves a smaller NRMSE than the compared methods on all three data sets.
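
To make the comparison in the abstract concrete, the following is a minimal Python sketch of the two baseline methods it names (row-average and plain KNN imputation) and of an NRMSE score. It is not the authors' KNNFS Impute: the abstract does not specify the feature-selection criterion, so that step is left out. The function names, the toy matrix, the choice of k, the distance measure (root mean squared difference over co-observed columns), and the NRMSE normalisation (RMSE divided by the standard deviation of the true values) are all assumptions made for illustration.

```python
import numpy as np


def nrmse(true_vals, est_vals):
    """RMSE divided by the standard deviation of the true values.
    Assumed definition; the abstract does not spell out the normalisation."""
    true_vals = np.asarray(true_vals, dtype=float)
    est_vals = np.asarray(est_vals, dtype=float)
    rmse = np.sqrt(np.mean((est_vals - true_vals) ** 2))
    return rmse / np.std(true_vals)


def row_average_impute(X):
    """Replace each NaN with the mean of the observed values in its row."""
    X = np.array(X, dtype=float)
    row_means = np.nanmean(X, axis=1)
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = row_means[rows]
    return X


def knn_impute(X, k=3):
    """Plain KNN imputation (no feature selection): fill a missing entry with
    the mean of that column over the k nearest rows that observe it."""
    X = np.array(X, dtype=float)
    filled = X.copy()
    for i, j in zip(*np.where(np.isnan(X))):
        candidates = []
        for r in range(X.shape[0]):
            if r == i or np.isnan(X[r, j]):
                continue
            # Compare rows only on columns observed in both.
            shared = ~np.isnan(X[i]) & ~np.isnan(X[r])
            if not shared.any():
                continue
            dist = np.sqrt(np.mean((X[i, shared] - X[r, shared]) ** 2))
            candidates.append((dist, X[r, j]))
        if candidates:
            candidates.sort(key=lambda pair: pair[0])
            filled[i, j] = np.mean([value for _, value in candidates[:k]])
        else:
            # No usable neighbour: fall back to the row average.
            filled[i, j] = np.nanmean(X[i])
    return filled


# Hypothetical toy matrix (rows = genes, columns = samples), not real data.
complete = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, 3.1, 4.1],
    [10.0, 9.0, 8.0, 7.0],
    [10.2, 9.2, 8.2, 7.2],
])
missing = complete.copy()
missing[0, 3] = np.nan
missing[2, 1] = np.nan
mask = np.isnan(missing)

knn_filled = knn_impute(missing, k=1)
row_filled = row_average_impute(missing)

# Score only the entries that were artificially removed.
knn_score = nrmse(complete[mask], knn_filled[mask])
row_score = nrmse(complete[mask], row_filled[mask])
print("KNN NRMSE:", knn_score)
print("Row-average NRMSE:", row_score)
```

On this toy matrix the rows come in near-duplicate pairs, so the nearest neighbour recovers the removed entries more closely than the row mean does. That is the kind of gap the NRMSE comparison is meant to expose, though the toy result says nothing about the paper's actual data sets.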