Gene subset selection for cancer classification using statistical and rough set approach

  • Authors:
  • Asit Kumar Das;Soumen Kumar Pati

  • Affiliations:
  • Department of Computer Science and Technology, Bengal Engineering and Science University, Howrah, India;Department of Computer Science/Information Technology, St. Thomas' College of Engineering and Technology, Kolkata, India

  • Venue:
  • SEMCCO'12 Proceedings of the Third international conference on Swarm, Evolutionary, and Memetic Computing
  • Year:
  • 2012

Abstract

Microarray technology is very useful for measuring the expression levels of thousands of genes simultaneously. One of the challenges in classifying cancer from high-dimensional gene expression data is to select a minimal number of relevant genes that maximize classification accuracy. Because of the distinct characteristics inherent to specific cancerous gene expression profiles, selecting the most informative cancer-related genes from high-volume microarray gene expression data is an important and challenging bioinformatics research topic. In this paper, important genes are first identified based on a statistically computed rank, and rough set theory is then applied to the reduced gene set to select genes with high class-discrimination capability. The method constructs a relative discernibility matrix to find the core genes, which are essential for distinguishing normal from tumor samples, and iteratively adds highly ranked non-core genes, one at a time, to the core genes to maximize classification accuracy. The method is applied to several well-known cancer datasets to demonstrate its effectiveness.
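
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, because the abstract names neither the ranking statistic, the discretization scheme, nor the classifier. The sketch assumes a signal-to-noise score for ranking, binarization of each gene at its mean expression, and leave-one-out 1-nearest-neighbour accuracy. All function names and the toy data are hypothetical.

```python
from itertools import combinations
from statistics import mean, pstdev


def rank_genes(X, y):
    """Return gene indices sorted by a signal-to-noise score, best first.

    The score is an assumption; the abstract only says 'rank computed statistically'.
    """
    scores = []
    for g in range(len(X[0])):
        a = [row[g] for row, c in zip(X, y) if c == 0]
        b = [row[g] for row, c in zip(X, y) if c == 1]
        snr = abs(mean(a) - mean(b)) / (pstdev(a) + pstdev(b) + 1e-9)
        scores.append((snr, g))
    return [g for _, g in sorted(scores, key=lambda t: (-t[0], t[1]))]


def discretize(X):
    """Binarize each gene at its mean expression (an assumed scheme)."""
    means = [mean(col) for col in zip(*X)]
    return [[int(v > m) for v, m in zip(row, means)] for row in X]


def core_genes(D, y, genes):
    """Genes that form singleton entries of the relative discernibility matrix."""
    core = set()
    for i, j in combinations(range(len(D)), 2):
        if y[i] == y[j]:
            continue  # only sample pairs from different classes are compared
        diff = [g for g in genes if D[i][g] != D[j][g]]
        if len(diff) == 1:  # this gene alone separates the pair
            core.add(diff[0])
    return core


def loo_accuracy(X, y, genes):
    """Leave-one-out 1-nearest-neighbour accuracy (assumed classifier)."""
    if not genes:
        return 0.0
    hits = 0
    for i in range(len(X)):
        nearest = min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: sum((X[i][g] - X[j][g]) ** 2 for g in genes),
        )
        hits += y[nearest] == y[i]
    return hits / len(X)


def select_genes(X, y, top_k):
    """Keep the top_k ranked genes, find the core, then add non-core genes greedily."""
    ranked = rank_genes(X, y)[:top_k]
    core = core_genes(discretize(X), y, ranked)
    selected = [g for g in ranked if g in core]
    best = loo_accuracy(X, y, selected)
    for g in ranked:
        if g in core:
            continue
        acc = loo_accuracy(X, y, selected + [g])
        if acc > best:  # keep a non-core gene only if accuracy improves
            selected, best = selected + [g], acc
    return selected, best


# Toy data: 6 samples x 4 genes; label 0 = normal, 1 = tumor (made up for illustration).
X = [
    [1.0, 5.0, 3.0, 2.0],
    [1.2, 6.0, 7.0, 2.0],
    [0.9, 4.0, 3.0, 8.0],
    [9.0, 5.0, 3.0, 2.0],
    [9.1, 6.0, 7.0, 8.0],
    [8.8, 4.0, 7.0, 8.0],
]
y = [0, 0, 0, 1, 1, 1]

selected, acc = select_genes(X, y, top_k=3)
print(selected, acc)  # [0] 1.0
```

In this toy example gene 0 is the only core gene, since it alone distinguishes the first normal sample from the first tumor sample after discretization, and no non-core gene improves on the accuracy it already achieves.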