Gene selection and sample classification on microarray data based on adaptive genetic algorithm/k-nearest neighbor method

  • Authors:
  • Chien-Pang Lee;Wen-Shin Lin;Yuh-Min Chen;Bo-Jein Kuo

  • Affiliations:
  • Biometry Division, Department of Agronomy, National Chung Hsing University, No. 250, Kuo Kuang Rd., Taichung 40227, Taiwan, ROC;Biometry Division, Department of Agronomy, National Chung Hsing University, No. 250, Kuo Kuang Rd., Taichung 40227, Taiwan, ROC;School of Nursing, China Medical University, No. 91, Hsueh Shih Rd., Taichung 40402, Taiwan, ROC;Biometry Division, Department of Agronomy, National Chung Hsing University, No. 250, Kuo Kuang Rd., Taichung 40227, Taiwan, ROC

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2011

Quantified Score

Hi-index 12.05

Abstract

Microarray technology has recently been widely used to study gene expression in cancer diagnosis. Its main distinguishing feature is that it can measure thousands of genes at the same time. In the past, researchers typically used parametric statistical methods to find significant genes. However, microarray data often violate some of the assumptions of parametric statistical methods, or the type I error may be inflated. Therefore, our aim is to establish a gene selection method that is free of such assumptions and reduces the dimension of the data set. In our study, an adaptive genetic algorithm/k-nearest neighbor (AGA/KNN) method was used to evolve gene subsets. We find that AGA/KNN reduces the dimension of the data set and that all test samples are classified correctly. In addition, the accuracy of AGA/KNN is higher than that of GA/KNN, and it takes only half the CPU time of GA/KNN. With the proposed method, biologists can efficiently identify the relevant genes from the selected gene subset and classify the test samples correctly.
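
The general idea of evolving gene subsets with a genetic algorithm and scoring them with a k-nearest neighbor classifier can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the adaptive scheme, so the rule in `adaptive_rate` (weaker chromosomes mutate more often, stronger ones less) is an assumption, as are the toy data generator, the leave-one-out fitness, the tournament selection, and all function names and parameter values.

```python
import random


def knn_predict(train_X, train_y, x, genes, k=3):
    """Majority vote among the k nearest training samples, using only `genes`."""
    dists = sorted(
        (sum((row[g] - x[g]) ** 2 for g in genes), label)
        for row, label in zip(train_X, train_y)
    )
    votes = [label for _, label in dists[:k]]
    return max(set(votes), key=votes.count)


def loo_accuracy(X, y, genes, k=3):
    """Leave-one-out KNN accuracy, used here as the fitness of a gene subset."""
    hits = 0
    for i in range(len(X)):
        rest_X, rest_y = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        hits += knn_predict(rest_X, rest_y, X[i], genes, k) == y[i]
    return hits / len(X)


def adaptive_rate(fit, f_avg, f_max, high=0.5, low=0.05):
    """Assumed adaptive rule: below-average chromosomes get the high mutation
    rate; above-average ones get a rate that shrinks toward `low` at f_max."""
    if fit < f_avg or f_max == f_avg:
        return high
    return low + (high - low) * (f_max - fit) / (f_max - f_avg)


def aga_knn(X, y, subset_size=3, pop_size=20, generations=15, seed=0):
    """Evolve fixed-size gene subsets; return (best_subset, best_fitness)."""
    rng = random.Random(seed)
    n_genes = len(X[0])
    pop = [rng.sample(range(n_genes), subset_size) for _ in range(pop_size)]
    best, best_fit = None, -1.0
    for _ in range(generations):
        fits = [loo_accuracy(X, y, chrom) for chrom in pop]
        f_avg, f_max = sum(fits) / len(fits), max(fits)
        i_best = fits.index(f_max)
        if f_max > best_fit:
            best, best_fit = sorted(pop[i_best]), f_max
        if best_fit == 1.0:
            break  # every sample is already classified correctly
        new_pop = [list(pop[i_best])]  # elitism: keep the current best
        while len(new_pop) < pop_size:
            a, b = rng.sample(range(pop_size), 2)  # binary tournament
            winner = a if fits[a] >= fits[b] else b
            child = list(pop[winner])
            p_m = adaptive_rate(fits[winner], f_avg, f_max)
            for j in range(subset_size):
                if rng.random() < p_m:
                    # swap in a gene that is not already in the subset
                    unused = [g for g in range(n_genes) if g not in child]
                    child[j] = rng.choice(unused)
            new_pop.append(child)
        pop = new_pop
    return best, best_fit


def make_toy_data(n_samples=20, n_genes=30, seed=1):
    """Synthetic stand-in for microarray data: only genes 0 and 1 carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(n_genes)]
        row[0] += 4 * label
        row[1] -= 4 * label
        X.append(row)
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    subset, fitness = aga_knn(X, y)
    print("selected genes:", subset, "leave-one-out accuracy:", fitness)
```

In this sketch, lowering the mutation rate for fitter chromosomes is one common way to make a genetic algorithm adaptive: good subsets are preserved while poor ones are disrupted more aggressively. Whether this matches the adaptation used in the paper cannot be determined from the abstract.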