Gene selection and cancer microarray data classification via mixed-integer optimization

  • Authors: Carlotta Orsenigo
  • Affiliations: Dip. di Scienze Economiche, Aziendali e Statistiche, Università di Milano, Italy
  • Venue: EvoBIO'08 Proceedings of the 6th European conference on Evolutionary computation, machine learning and data mining in bioinformatics
  • Year: 2008

Abstract

The growing availability of biological measurements at the molecular level has recently enhanced the role of machine learning methods in early cancer diagnosis, prognosis and treatment. These measurements consist of the expression levels of thousands of genes in normal and tumor tissue samples. In this paper we present a two-phase algorithm for gene expression data classification. In the first phase, a novel gene selection method based on mixed-integer optimization selects a small subset of cancer marker genes. In the second phase, a binary polyhedral classifier labels the gene expression data. Computational experiments on three benchmark datasets indicate the usefulness of the proposed framework, whose accuracy is competitive with the best classification accuracy achieved so far on each dataset. Moreover, the classification rules generated rely on very few genes which, in our computations, can be regarded as the most influential genes for tumor differentiation.
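
The two-phase structure described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the mixed-integer formulation or the polyhedral classifier, so both are replaced here by simple stand-ins: phase one enumerates every gene subset of size k (the binary choice a mixed-integer solver would make, feasible only on toy data), and phase two uses a nearest-centroid rule, which separates the two classes with a single hyperplane. All function names and the toy data are hypothetical and are not taken from the paper.

```python
from itertools import combinations

def centroid(rows, genes):
    # Mean expression level of each selected gene over the given samples.
    return [sum(r[g] for r in rows) / len(rows) for g in genes]

def predict(sample, genes, c_normal, c_tumor):
    # Stand-in classifier: label 1 (tumor) if the sample is closer to the
    # tumor centroid than to the normal centroid, otherwise 0 (normal).
    d_n = sum((sample[g] - c) ** 2 for g, c in zip(genes, c_normal))
    d_t = sum((sample[g] - c) ** 2 for g, c in zip(genes, c_tumor))
    return 1 if d_t < d_n else 0

def training_errors(X, y, genes):
    # Number of training samples misclassified when only `genes` are used.
    normal = [x for x, label in zip(X, y) if label == 0]
    tumor = [x for x, label in zip(X, y) if label == 1]
    c_n, c_t = centroid(normal, genes), centroid(tumor, genes)
    return sum(predict(x, genes, c_n, c_t) != label for x, label in zip(X, y))

def select_genes(X, y, k):
    # Phase 1 stand-in: exhaustive search over all subsets of k genes,
    # keeping the subset with the fewest training errors.
    n_genes = len(X[0])
    best = min(combinations(range(n_genes), k),
               key=lambda genes: training_errors(X, y, genes))
    return list(best)

def fit_classifier(X, y, genes):
    # Phase 2 stand-in: build the labelling rule on the selected genes only.
    normal = [x for x, label in zip(X, y) if label == 0]
    tumor = [x for x, label in zip(X, y) if label == 1]
    c_n, c_t = centroid(normal, genes), centroid(tumor, genes)
    return lambda sample: predict(sample, genes, c_n, c_t)

# Toy data (invented): 6 samples x 4 genes; labels 0 = normal, 1 = tumor.
X = [
    [5.1, 2.0, 0.9, 3.3],
    [4.8, 2.2, 1.1, 3.0],
    [5.0, 1.9, 1.0, 3.4],
    [4.9, 2.1, 3.0, 3.1],
    [5.2, 2.0, 3.2, 3.2],
    [5.0, 2.3, 2.9, 3.3],
]
y = [0, 0, 0, 1, 1, 1]

genes = select_genes(X, y, k=1)
classify = fit_classifier(X, y, genes)
print(genes, classify([5.0, 2.1, 3.1, 3.2]))
```

On this toy data only the third gene (index 2) separates the two classes, so it is the one selected. Exhaustive enumeration grows combinatorially with the number of genes, which is why a real microarray dataset with thousands of genes calls for an optimization model such as the one the paper proposes.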