Tumor classification by combining PNN classifier ensemble with neighborhood rough set based gene reduction

  • Authors:
  • Shu-Lin Wang; Xueling Li; Shanwen Zhang; Jie Gui; De-Shuang Huang

  • Affiliations:
  • School of Computer and Communication, Hunan University, Changsha, Hunan 410082, China and Intelligent Computation Lab, Hefei Institute of Intelligent Machines, Chinese Academy of Sciences, Hefei, ...;Intelligent Computation Lab, Hefei Institute of Intelligent Machines, Chinese Academy of Sciences, Hefei, Anhui 230031, China;Intelligent Computation Lab, Hefei Institute of Intelligent Machines, Chinese Academy of Sciences, Hefei, Anhui 230031, China;Intelligent Computation Lab, Hefei Institute of Intelligent Machines, Chinese Academy of Sciences, Hefei, Anhui 230031, China and Department of Automation, University of Science and Technology of ...;Intelligent Computation Lab, Hefei Institute of Intelligent Machines, Chinese Academy of Sciences, Hefei, Anhui 230031, China

  • Venue:
  • Computers in Biology and Medicine
  • Year:
  • 2010

Abstract

Since Golub applied gene expression profiles (GEP) to the molecular classification of tumor subtypes for more accurate and reliable clinical diagnosis, a number of studies on GEP-based tumor classification have been carried out. However, the challenges posed by the high dimensionality and small sample size of tumor datasets remain. This paper presents a new tumor classification approach based on an ensemble of probabilistic neural networks (PNNs) and gene reduction based on the neighborhood rough set model. Informative genes were initially selected by gene ranking based on an iterative search margin algorithm and were then further refined by gene reduction to select many minimum gene subsets. Finally, the candidate base PNN classifiers, each trained on one of the selected gene subsets, were integrated by a majority voting strategy to construct an ensemble classifier. Experiments on tumor datasets showed that this approach obtains both high and stable classification performance, is not too sensitive to the number of initially selected genes, and is competitive with most existing methods. Additionally, the classification results can be cross-verified in a single biomedical experiment by the selected gene subsets, and biological experimental results also confirmed that the genes included in the selected gene subsets are functionally related to carcinogenesis, indicating that the performance obtained by the proposed method is convincing.
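
The final step described in the abstract, one base PNN per gene subset combined by majority voting, can be illustrated with a minimal sketch. This is not the authors' implementation: the `PNN` class is a generic Gaussian Parzen-window classifier, the smoothing parameter `sigma`, the helper `ensemble_predict`, the toy data, and the gene subsets are all assumptions made for illustration, and the gene ranking and neighborhood rough set reduction steps are not reproduced (the subsets are simply given as lists of column indices).

```python
import numpy as np
from collections import Counter


class PNN:
    """Generic probabilistic neural network (Gaussian Parzen-window classifier)."""

    def __init__(self, sigma=1.0):
        self.sigma = sigma  # smoothing parameter; value is an assumption

    def fit(self, X, y):
        # A PNN stores the training samples; there is no iterative training.
        self.X = np.asarray(X, dtype=float)
        self.y = np.asarray(y)
        self.classes = np.unique(self.y)
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        # Squared Euclidean distance from each test sample to each training sample.
        d2 = ((X[:, None, :] - self.X[None, :, :]) ** 2).sum(axis=2)
        k = np.exp(-d2 / (2.0 * self.sigma ** 2))
        # Summation layer: mean kernel activation per class.
        scores = np.stack(
            [k[:, self.y == c].mean(axis=1) for c in self.classes], axis=1
        )
        # Decision layer: class with the largest estimated density.
        return self.classes[scores.argmax(axis=1)]


def ensemble_predict(X_train, y_train, X_test, gene_subsets, sigma=1.0):
    """Train one PNN per gene subset and combine predictions by majority vote."""
    X_train = np.asarray(X_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)
    votes = np.array([
        PNN(sigma).fit(X_train[:, idx], y_train).predict(X_test[:, idx])
        for idx in gene_subsets
    ])  # shape: (n_subsets, n_test_samples)
    return np.array([Counter(col.tolist()).most_common(1)[0][0] for col in votes.T])


# Toy data (hypothetical): 4 samples, 2 "genes", 2 classes.
X_train = [[0, 0], [0, 1], [10, 10], [10, 11]]
y_train = [0, 0, 1, 1]
X_test = [[0, 0.5], [10, 10.5]]
gene_subsets = [[0], [1], [0, 1]]  # hypothetical subsets, given as column indices

pred = ensemble_predict(X_train, y_train, X_test, gene_subsets)
print(pred)
```

Using an odd number of base classifiers, as in the toy example, avoids tied votes in the two-class case; with ties, this sketch returns whichever label was counted first.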