Tumor classification from gene expression data: a coding-based multiclass learning approach

  • Authors:
  • Alexander Hüntemann;José C. González;Elizabeth Tapia

  • Affiliations:
  • Katholieke Universiteit Leuven, Leuven, Belgium;E.T.S.I. Telecomunicación, Universidad Politécnica de Madrid, Madrid, Spain;Facultad de Ciencias Exactas, Ingeniería Agromesura, Escuela de Ingeniería Electrónica, Rosario, Argentina

  • Venue:
  • ISBMDA'05 Proceedings of the 6th International Conference on Biological and Medical Data Analysis
  • Year:
  • 2005

Abstract

The effectiveness of cancer treatment depends strongly on an accurate diagnosis. In this paper we propose a system for the automatic and precise diagnosis of a tumor's origin from genetic data. The system combines coding theory techniques with machine learning algorithms. Specifically, tumor classification is cast as a multiclass learning problem in which gene expression values are the features used to distinguish between tumor types. Since multiclass learning is intrinsically complex, the problem is decomposed into several binary (two-class) problems, whose outputs are combined by means of an error-correcting linear block code. The prediction becomes more robust because errors made by the base binary classifiers are corrected by the linear code. Promising results have been achieved, with a best-case precision of 72% when the system was tested on real data from cancer patients.
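
The decomposition described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the (7,4) Hamming code, the class count `N_CLASSES`, and all function names below are assumptions chosen for illustration, since the abstract does not state which linear block code or base classifiers were used. Each class is assigned a codeword, each codeword position defines one binary problem, and the bits predicted by the binary classifiers are decoded to the nearest codeword.

```python
# Generator matrix of the (7,4) Hamming code in systematic form.
# Assumption: an example code only; the paper's actual code is not given here.
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

N_CLASSES = 14  # hypothetical number of tumor types (at most 16 for this code)


def class_to_message(k):
    """Write class index k as a 4-bit message, least significant bit first."""
    return [(k >> i) & 1 for i in range(4)]


def encode(message):
    """Multiply the message by G over GF(2) to obtain a 7-bit codeword."""
    return [sum(m * g for m, g in zip(message, column)) % 2 for column in zip(*G)]


# One codeword per class.
CODEBOOK = {k: encode(class_to_message(k)) for k in range(N_CLASSES)}


def binary_labels(j):
    """Relabelling for binary problem j: maps each class to bit j of its codeword."""
    return {k: word[j] for k, word in CODEBOOK.items()}


def hamming_distance(a, b):
    """Number of positions in which two bit sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def decode(bits):
    """Return the class whose codeword is closest to the predicted bits."""
    return min(CODEBOOK, key=lambda k: hamming_distance(CODEBOOK[k], bits))


# Simulate the seven binary classifiers seeing a class-9 sample,
# with the third classifier making a mistake.
outputs = list(CODEBOOK[9])
outputs[2] ^= 1
print(decode(outputs))  # prints 9
```

Because the Hamming code has minimum distance 3, any single wrong binary prediction is still decoded to the correct class. In a real system each of the seven relabellings returned by `binary_labels` would be used to train one binary classifier on the gene expression data.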