A hybrid classification system for cancer diagnosis with proteomic bio-markers

  • Authors:
  • Jung-Ja Kim;Young-Ho Kim;Yonggwan Won

  • Affiliations:
  • Department of Electronics and Computer Engineering, Chonnam National University, Kwangju, Republic of Korea;Doul Info. Technology, Gwang-ju, Republic of Korea;Department of Electronics and Computer Engineering, Chonnam National University, Kwangju, Republic of Korea

  • Venue:
  • EPIA'05 Proceedings of the 12th Portuguese conference on Progress in Artificial Intelligence
  • Year:
  • 2005

Abstract

A number of studies have applied artificial intelligence techniques to the prediction and classification of cancer-specific biomarkers for use in clinical diagnosis. Most biological data, such as those obtained from SELDI-TOF (Surface-Enhanced Laser Desorption/Ionization Time-Of-Flight) MS (Mass Spectrometry), are high dimensional and therefore require dimension reduction to limit computational complexity and cost. The DT (Decision Tree) is an algorithm that allows fast classification and effective dimension reduction of high-dimensional data. However, it does not guarantee the reliability of the features selected during dimension reduction. Another approach is the MLP (Multi-Layer Perceptron), which is often more accurate at classifying data but is not suitable for processing high-dimensional data. In this paper, we propose a novel approach that accurately classifies prostate cancer SELDI data into normal and abnormal classes and identifies potential biomarkers. In this approach, we first use the DT to select the features that have excellent discrimination power. These selected features constitute the potential biomarkers. Next, we use the MLP to classify samples into normal and abnormal categories based on the selected features; at this stage we repeatedly perform cross-validation to evaluate the suitability of the selected features. By hybridizing the two algorithms in this way, the proposed algorithm takes advantage of both the DT and the MLP. The experimental results demonstrate that the proposed algorithm identifies multiple potential biomarkers that enhance the confidence of diagnosis, and that it achieves better specificity, sensitivity, and learning error rates than other algorithms.
The proposed algorithm represents a promising approach to the identification of proteomic patterns in serum that can distinguish cancer from normal or benign conditions, and it is applicable to clinical diagnosis and prognosis.
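
The two-stage idea described in the abstract (DT for feature selection, then MLP for classification, checked by cross-validation) might be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn, uses synthetic data in place of SELDI-TOF spectra, and the function names, the entropy criterion, the hidden-layer size, and the choice to repeat feature selection inside each training fold are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def select_features_with_tree(X, y, random_state=0):
    """Return indices of the features a fitted decision tree actually split on.

    Hypothetical helper: features with non-zero importance stand in for the
    'potential biomarkers' chosen by the DT stage.
    """
    tree = DecisionTreeClassifier(criterion="entropy", random_state=random_state)
    tree.fit(X, y)
    return np.flatnonzero(tree.feature_importances_ > 0)


def hybrid_cross_validate(X, y, n_splits=5, random_state=0):
    """Cross-validate the DT -> MLP pipeline; return (sensitivity, specificity, selections)."""
    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=random_state)
    tp = tn = fp = fn = 0
    selected_per_fold = []
    for train_idx, test_idx in folds.split(X, y):
        # Stage 1: the tree picks features using training data only (assumption:
        # selection is redone per fold so the held-out fold stays unseen).
        selected = select_features_with_tree(X[train_idx], y[train_idx], random_state)
        selected_per_fold.append(selected)

        # Stage 2: a small MLP classifies samples using only the selected features.
        mlp = make_pipeline(
            StandardScaler(),
            MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=random_state),
        )
        mlp.fit(X[train_idx][:, selected], y[train_idx])
        pred = mlp.predict(X[test_idx][:, selected])
        truth = y[test_idx]

        # Accumulate confusion-matrix counts; label 1 is treated as "abnormal".
        tp += int(np.sum((pred == 1) & (truth == 1)))
        tn += int(np.sum((pred == 0) & (truth == 0)))
        fp += int(np.sum((pred == 1) & (truth == 0)))
        fn += int(np.sum((pred == 0) & (truth == 1)))

    sensitivity = tp / (tp + fn)  # fraction of abnormal samples detected
    specificity = tn / (tn + fp)  # fraction of normal samples correctly cleared
    return sensitivity, specificity, selected_per_fold


if __name__ == "__main__":
    # Synthetic stand-in for high-dimensional spectra: 200 samples, 300 features.
    X, y = make_classification(
        n_samples=200, n_features=300, n_informative=10, n_redundant=0, random_state=0
    )
    sens, spec, selections = hybrid_cross_validate(X, y)
    print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
    print("features selected per fold:", [len(s) for s in selections])
```

Features that recur across folds in `selections` would be the more stable biomarker candidates under this sketch, which loosely mirrors the abstract's use of repeated cross-validation to judge the selected features.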