On the effectiveness of gene selection for microarray classification methods

  • Authors:
  • Zhongwei Zhang; Jiuyong Li; Hong Hu; Hong Zhou

  • Affiliations:
  • Department of Mathematics and Computing, University of Southern Queensland, QLD, Australia; School of Computer and Information Science, University of South Australia, Adelaide, SA, Australia; Planning and Quality Office, University of Southern Queensland, QLD, Australia; Faculty of Engineering, University of Southern Queensland, QLD, Australia

  • Venue:
  • ACIIDS'10 Proceedings of the Second international conference on Intelligent information and database systems: Part II
  • Year:
  • 2010

Abstract

Microarray data usually contain a high proportion of noisy gene data, including incorrect, noisy, and irrelevant genes. Before microarray data classification takes place, it is desirable to eliminate as much noisy data as possible. One approach to improving the accuracy and efficiency of microarray data classification is to select a small subset of genes from the large, high-dimensional gene expression dataset. Effective gene selection helps clean up the existing microarray data and thereby improves its quality. In this paper, we study the effectiveness of gene selection for microarray classification methods. We conducted experiments on the effect of gene selection on two benchmark classification algorithms, SVMs and C4.5. We observed that, in general, the performance of SVMs and C4.5 improves in both accuracy and efficiency when preprocessed datasets are used instead of the original datasets, but an inappropriate choice of genes can only be detrimental to predictive power. Our results also imply that, with preprocessing, the number of genes selected affects classification accuracy.
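The preprocessing step described above, selecting a small subset of genes before classification, can be illustrated with a minimal sketch. The abstract does not name the selection method used in the paper, so the signal-to-noise ranking below is an assumed stand-in, and the function names (`signal_to_noise`, `select_top_genes`, `reduce_samples`) and the toy data are hypothetical. The reduced matrix would then be passed to a classifier such as an SVM or a decision tree.

```python
from statistics import mean, pstdev


def signal_to_noise(values, labels):
    """Score one gene for a two-class problem (labels 0 and 1).

    Assumed scoring rule, not taken from the paper: absolute difference
    of the class means divided by the sum of the class standard deviations.
    """
    class_a = [v for v, lab in zip(values, labels) if lab == 0]
    class_b = [v for v, lab in zip(values, labels) if lab == 1]
    spread = pstdev(class_a) + pstdev(class_b)
    # The small constant avoids division by zero for constant genes.
    return abs(mean(class_a) - mean(class_b)) / (spread + 1e-12)


def select_top_genes(samples, labels, k):
    """Return the sorted column indices of the k highest-scoring genes."""
    n_genes = len(samples[0])
    scores = [
        signal_to_noise([row[g] for row in samples], labels)
        for g in range(n_genes)
    ]
    ranked = sorted(range(n_genes), key=lambda g: scores[g], reverse=True)
    return sorted(ranked[:k])


def reduce_samples(samples, gene_indices):
    """Keep only the selected gene columns in every sample."""
    return [[row[g] for g in gene_indices] for row in samples]


# Toy expression matrix: 4 samples (rows) x 3 genes (columns).
samples = [
    [1.0, 5.0, 0.2],
    [1.2, 4.0, 0.1],
    [3.0, 5.5, 0.3],
    [3.1, 4.5, 0.2],
]
labels = [0, 0, 1, 1]

selected = select_top_genes(samples, labels, k=1)
reduced = reduce_samples(samples, selected)
```

Varying `k` in this sketch corresponds to the abstract's observation that the number of genes selected affects classification accuracy: too few genes may discard informative ones, while too many reintroduce noise.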