A statistical and learning based oncogene detection and classification scheme using human cDNA expressions for ovarian carcinoma

  • Authors:
  • Meng-Hsiun Tsai; Ching-Hao Lai; Shyr-Shen Yu

  • Affiliations:
  • Department of Management Information Systems and Institute of Genomics and Bioinformatics, National Chung-Hsing University, 250, Kuo-Kuang Road, Taichung 402, Taiwan, ROC; Emerging Smart Technology Institute, Institute for Information Industry, 13F., No. 133, Sec. 4, Minsheng E. Rd., Taipei City 105, Taiwan, ROC; Department of Computer Science and Engineering, National Chung-Hsing University, 250, Kuo-Kuang Road, Taichung 402, Taiwan, ROC

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2011

Quantified Score

Hi-index 12.05

Abstract

In this paper, a human ovarian cDNA expression database is analyzed to detect oncogenes, and the selected oncogenes are then used to identify the pathological stages of ovarian carcinoma. The database contains 41 patient samples: 13 samples of normal ovarian tumors (OVT), six samples of borderline tumors (BOT), seven samples of stage I ovarian cancer (OVCA-I), and 15 samples of stage III ovarian cancer (OVCA-III). Each sample contains expression values for a large number of genes (9600), which makes oncogene analysis and discovery difficult. For this reason, a statistical test, the t-test, is used to cull most of the irrelevant genes in five different pathological stage classification cases. The expression values of the selected oncogenes are then used by an artificial neural network (ANN) in the same five classification cases to build a recognition system, which serves to demonstrate the effectiveness of the proposed classification scheme. In the experimental results, the highest and lowest accuracies across the five classification experiments are 100% and 89.47%, respectively. This paper also proposes a novel t-test strategy that selects more important oncogenes and raises the lowest classification accuracy to 94.74%. The proposed scheme can also serve as the basis for a bio-statistical or automatic diagnosis system with a graphical user interface (GUI) for gene expression analysis, to assist doctors and pathologists in analyzing and diagnosing ovarian cancer.
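The gene-culling step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which t-test variant, threshold, or ranking rule the authors use, so the code below makes assumptions throughout: it uses Welch's unequal-variance t statistic, it keeps a fixed number of top-ranked genes, and it runs on synthetic data. The names `welch_t`, `select_genes`, `make_sample`, and `SHIFTED` are hypothetical and do not come from the paper.

```python
import math
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch's two-sample t statistic for two lists of expression values."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def select_genes(group_a, group_b, top_k):
    """Rank genes by |t| between two sample groups; return the top_k gene indices.

    group_a and group_b are lists of samples; each sample is a list holding
    one expression value per gene, in the same gene order.
    """
    n_genes = len(group_a[0])
    scores = []
    for g in range(n_genes):
        a = [sample[g] for sample in group_a]
        b = [sample[g] for sample in group_b]
        scores.append((abs(welch_t(a, b)), g))
    scores.sort(reverse=True)  # largest |t| first
    return sorted(g for _, g in scores[:top_k])


# Synthetic toy data (NOT the paper's data): 200 genes, of which three are
# given a shifted mean in the second group so that the test has something to find.
random.seed(0)
N_GENES = 200
SHIFTED = {3, 50, 120}  # hypothetical "informative" gene indices


def make_sample(shift):
    """Draw one synthetic sample; only SHIFTED genes receive the mean shift."""
    return [random.gauss(shift if g in SHIFTED else 0.0, 1.0)
            for g in range(N_GENES)]


group_a = [make_sample(0.0) for _ in range(13)]  # sized like the OVT group
group_b = [make_sample(6.0) for _ in range(15)]  # sized like the OVCA-III group

selected = select_genes(group_a, group_b, top_k=3)
print(selected)
```

In a full pipeline, the columns indexed by `selected` would form the reduced input to a classifier such as the ANN mentioned in the abstract. That stage is omitted here because the abstract gives no details of the network architecture or training procedure.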