Applying Data Mining Techniques for Cancer Classification from Gene Expression Data

  • Authors:
  • Jinn-Yi Yeh; Tai-Shi Wu; Min-Che Wu; Der-Ming Chang

  • Venue:
  • ICCIT '07 Proceedings of the 2007 International Conference on Convergence Information Technology
  • Year:
  • 2007

Abstract

Recent studies on molecular-level classification of tissues have produced remarkable results and indicate that gene expression assays could significantly aid the development of efficient cancer diagnosis and classification platforms. However, cancer classification based on DNA array data remains a difficult problem. The main challenge is the overwhelming number of genes relative to the number of training samples, which makes accurate classification harder. This paper applies a genetic algorithm (GA) whose initial solution is provided by t-statistics (t-GA) to select a group of relevant genes from cancer microarray data. A decision-tree-based cancer classifier is then built on the selected genes. The approach is evaluated against other gene selection methods on publicly available gene expression datasets. Experimental results indicate that t-GA achieves the highest accuracy rate among the methods compared. The Z-score figure also shows that the gene selection performed by t-GA is reproducible.
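
The t-GA idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the GA operators, parameters, or the exact form of the t-statistic, so everything below is an assumption made for illustration. The function names (t_scores, seed_population, loo_accuracy, t_ga), the Welch form of the t-statistic, truncation selection, one-point crossover, single bit-flip mutation, and the toy data are all hypothetical. The paper builds a decision tree on the selected genes; to keep the sketch dependency-free, a leave-one-out nearest-centroid classifier is swapped in as the fitness function.

```python
import math
import random
from statistics import mean, variance


def t_scores(X, y):
    """Absolute Welch t-statistic per gene; X is samples x genes, y holds 0/1 labels."""
    a = [row for row, label in zip(X, y) if label == 0]
    b = [row for row, label in zip(X, y) if label == 1]
    scores = []
    for g in range(len(X[0])):
        xa = [row[g] for row in a]
        xb = [row[g] for row in b]
        se = math.sqrt(variance(xa) / len(xa) + variance(xb) / len(xb))
        scores.append(abs(mean(xa) - mean(xb)) / se if se > 0 else 0.0)
    return scores


def seed_population(scores, k, pop_size, rng, flip_prob=0.05):
    """First chromosome marks the top-k genes by t-score; the rest are mutated copies."""
    top = set(sorted(range(len(scores)), key=lambda g: -scores[g])[:k])
    base = [1 if g in top else 0 for g in range(len(scores))]
    pop = [base]
    while len(pop) < pop_size:
        pop.append([bit ^ (rng.random() < flip_prob) for bit in base])
    return pop


def loo_accuracy(X, y, mask):
    """Stand-in fitness (assumption): leave-one-out nearest-centroid accuracy.

    Requires at least two samples per class so no centroid is ever empty.
    """
    genes = [g for g, bit in enumerate(mask) if bit]
    if not genes:
        return 0.0
    hits = 0
    for i in range(len(X)):
        dist = {}
        for label in set(y):
            rows = [X[j] for j in range(len(X)) if j != i and y[j] == label]
            centroid = [mean(r[g] for r in rows) for g in genes]
            dist[label] = sum((X[i][g] - c) ** 2 for g, c in zip(genes, centroid))
        hits += min(dist, key=dist.get) == y[i]
    return hits / len(X)


def t_ga(X, y, k=2, pop_size=8, generations=10, seed=0):
    """GA over gene bitmasks, seeded from the t-statistic ranking."""
    rng = random.Random(seed)
    pop = seed_population(t_scores(X, y), k, pop_size, rng)

    def fitness(mask):
        return loo_accuracy(X, y, mask)

    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]        # truncation selection keeps the best half
        children = []
        while len(parents) + len(children) < pop_size:
            p, q = rng.sample(parents, 2)
            cut = rng.randrange(1, len(p))    # one-point crossover
            child = p[:cut] + q[cut:]
            child[rng.randrange(len(child))] ^= 1  # flip one randomly chosen bit
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)


# Toy data (made up): 6 samples x 4 genes; only gene 0 separates the classes.
X = [
    [1.0, 5.0, 3.0, 2.0],
    [1.2, 4.0, 3.5, 2.5],
    [0.9, 6.0, 2.5, 1.5],
    [3.0, 5.5, 3.2, 2.2],
    [3.1, 4.5, 2.8, 1.8],
    [2.9, 5.0, 3.1, 2.4],
]
y = [0, 0, 0, 1, 1, 1]

best = t_ga(X, y)
print("selected gene mask:", best)
print("leave-one-out accuracy:", loo_accuracy(X, y, best))
```

Seeding the population from the t-statistic ranking starts the search near genes that already differ between classes, rather than from random subsets. With thousands of genes and few samples, that is the situation the abstract identifies as the main difficulty. A real experiment would replace the stand-in fitness with a decision tree evaluated by cross-validation.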