Robust ensemble learning for cancer diagnosis based on microarray data classification

  • Authors:
  • Yonghong Peng

  • Affiliations:
  • Department of Computing, University of Bradford, West Yorkshire, U.K

  • Venue:
  • ADMA'05 Proceedings of the First international conference on Advanced Data Mining and Applications
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

DNA microarray technology has demonstrated to be an effective methodology for the diagnosis of cancers by means of microarray data classification. Although much research has been conducted during the recent years to apply machine learning techniques for microarray data classification, there are two important issues that prevent the use of conventional machine learning techniques, namely the limited availability of training samples and the existence of various uncertainties (e.g. biological variability and experiment variability). This paper presents a new ensemble machine learning approach to address these issues in order to achieve a robust microarray data classification. Ensemble learning combines a set of base classifiers as a committee to make appropriate decisions when classifying new data instances. In order to enhance the performance of the ensemble learning process, the approach presented includes a procedure to select optimal ensemble members that maximize the behavioural diversity. The proposed approach has been verified by three microarray datasets for cancer diagnosis. Experimental results have demonstrated that the classifier constructed by the proposed method outperforms not only the classifiers generated by the conventional machine learning techniques, but also the classifiers generated by two widely-used conventional Bagging and Boosting ensemble learning methods.