Feature Selection for Cancer Classification on Microarray Expression Data

  • Authors:
  • Hui-Huang Hsu;Ming-Da Lu

  • Venue:
  • ISDA '08 Proceedings of the 2008 Eighth International Conference on Intelligent Systems Design and Applications - Volume 03
  • Year:
  • 2008

Abstract

Microarrays are an important tool in gene analysis research. They can help identify genes that might cause various cancers. In this paper, we use feature selection methods and the support vector machine (SVM) to search for disease-causing genes in microarray data for three different cancers. The feature selection methods are based on Euclidean distance (ED) and the Pearson correlation coefficient (PCC). We investigated how training the SVM with different numbers of features and different kinds of kernels affects the prediction results. The results show that the linear kernel is the best-suited kernel for this problem. Also, accuracy equal to or higher than that of the full feature set can be achieved with only 15 to 100 features selected from the 7129 or more features of the original data sets.
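
The abstract does not say how ED and PCC are turned into gene scores, so the following is only a minimal sketch under stated assumptions: each gene's expression column is compared against a binary class-label vector, a high absolute PCC or a small ED (after min-max scaling) is taken to mean a more relevant gene, and the top k genes are kept. The function names (`pearson`, `euclidean`, `min_max`, `rank_genes`) and the toy data are hypothetical and do not come from the paper. The SVM training step is left out to keep the example dependency-free.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant vector: treat as uninformative
    return cov / (sx * sy)

def euclidean(x, y):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def min_max(x):
    """Scale values to [0, 1] so they are comparable with 0/1 labels."""
    lo, hi = min(x), max(x)
    if hi == lo:
        return [0.0] * len(x)
    return [(v - lo) / (hi - lo) for v in x]

def rank_genes(samples, labels, k, method="pcc"):
    """Return indices of the k best-scoring genes.

    samples: list of rows, one row of expression values per sample.
    labels:  list of 0/1 class labels, one per sample.
    Assumption (not from the paper): genes are scored against the labels.
    """
    scored = []
    for g in range(len(samples[0])):
        column = [row[g] for row in samples]
        if method == "pcc":
            score = abs(pearson(column, labels))
        else:
            # Smaller distance means more relevant, so negate it.
            score = -euclidean(min_max(column), labels)
        scored.append((score, g))
    # Highest score first; ties broken by gene index.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [g for _, g in scored[:k]]

# Toy data: 4 samples, 3 genes; gene 0 tracks the class label closely.
toy_samples = [
    [1.0, 5.0, 0.2],
    [1.2, 4.0, 0.9],
    [3.0, 5.5, 0.1],
    [3.1, 4.2, 0.8],
]
toy_labels = [0, 0, 1, 1]

print(rank_genes(toy_samples, toy_labels, 2, method="pcc"))  # [0, 1]
print(rank_genes(toy_samples, toy_labels, 2, method="ed"))   # [0, 1]
```

In a full pipeline of the kind the abstract describes, the selected columns would then be passed to an SVM, and k would be varied (the abstract reports 15 to 100 features) to compare accuracy across kernels.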