Summarizing gene-expression-based classifiers by meta-mining comprehensible relational patterns

  • Authors:
  • Filip Železný;Olga Štěpánková;Jakub Tolar;Nada Lavrač

  • Affiliations:
  • Department of Cybernetics, Czech Technical Univ. in Prague, Praha, Czech Republic;Department of Cybernetics, Czech Technical Univ. in Prague, Praha, Czech Republic;Department of Pediatrics, Univ. of Minnesota Medical School, Minneapolis;Department of Knowledge Technologies, Jožef Stefan Institute, Ljubljana, Slovenia

  • Venue:
  • BioMed'06 Proceedings of the 24th IASTED international conference on Biomedical engineering
  • Year:
  • 2006

Abstract

We propose a methodology for predictive classification from gene expression data that combines the robustness of high-dimensional statistical classification methods with the comprehensibility and interpretability of simple logic-based models. We first construct a robust classifier that combines the contributions of a large number of gene expression values, and then meta-mine this classifier for compact summaries of the subgroups of genes it associates with a given class. The subgroups are described by relational logic features extracted from publicly available gene ontology information. The curse of dimensionality that affects gene-expression-based classification, caused by the large number of attributes (genes), becomes an advantage in the secondary, meta-mining task, because there the original attributes serve as learning examples. We cross-validate the proposed method on two classification problems: (i) acute lymphoblastic leukemia (ALL) vs. acute myeloid leukemia (AML), and (ii) seven subclasses of ALL.
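
The two-stage idea in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the expression values, gene names, and ontology terms are invented; stage 1 uses a simple signal-to-noise weighted-voting classifier as a stand-in for the robust classifier; and stage 2 scores single ontology terms by weighted relative accuracy (WRAcc), a common subgroup-discovery measure, rather than the relational logic features used in the paper. The function names `fit`, `predict`, and `summarize` are hypothetical.

```python
from statistics import mean, pstdev

# Toy data, invented for illustration: each row is (expression values, class).
GENES = ["g1", "g2", "g3", "g4", "g5", "g6"]
SAMPLES = [
    ([9.0, 8.5, 8.8, 2.0, 2.2, 5.0], "ALL"),
    ([8.7, 9.1, 8.4, 2.5, 1.9, 5.2], "ALL"),
    ([9.3, 8.8, 9.0, 2.1, 2.4, 4.9], "ALL"),
    ([2.1, 2.4, 2.0, 8.9, 9.2, 5.1], "AML"),
    ([2.5, 2.0, 2.3, 9.1, 8.7, 5.0], "AML"),
    ([1.9, 2.2, 2.6, 8.6, 9.0, 4.8], "AML"),
]

# Hypothetical ontology annotations (gene -> set of terms), also invented.
ANNOTATIONS = {
    "g1": {"lymphocyte_activation", "nucleus"},
    "g2": {"lymphocyte_activation"},
    "g3": {"lymphocyte_activation", "membrane"},
    "g4": {"myeloid_differentiation", "membrane"},
    "g5": {"myeloid_differentiation"},
    "g6": {"nucleus"},
}


def fit(samples, positive):
    """Stage 1: one (weight, midpoint) pair per gene.

    The weight is a signal-to-noise score; its sign tells which class
    high expression of the gene votes for.
    """
    model = []
    for j in range(len(samples[0][0])):
        pos = [x[j] for x, y in samples if y == positive]
        neg = [x[j] for x, y in samples if y != positive]
        weight = (mean(pos) - mean(neg)) / (pstdev(pos) + pstdev(neg) + 1e-9)
        midpoint = (mean(pos) + mean(neg)) / 2
        model.append((weight, midpoint))
    return model


def predict(model, x, positive, negative):
    """Classify one sample by summing the weighted votes of all genes."""
    score = sum(w * (v - m) for (w, m), v in zip(model, x))
    return positive if score > 0 else negative


def summarize(model, genes, annotations, threshold=1.0):
    """Stage 2: genes are now the learning examples.

    A gene is a positive example if the classifier weights it strongly
    toward the positive class. Each ontology term is scored by
    WRAcc = coverage * (precision - base_rate).
    """
    target = {g for g, (w, _) in zip(genes, model) if w > threshold}
    base_rate = len(target) / len(genes)
    terms = set().union(*(annotations.get(g, set()) for g in genes))
    scored = []
    for term in terms:
        covered = {g for g in genes if term in annotations.get(g, set())}
        if not covered:
            continue
        coverage = len(covered) / len(genes)
        precision = len(covered & target) / len(covered)
        scored.append((coverage * (precision - base_rate), term))
    return sorted(scored, reverse=True)


model = fit(SAMPLES, "ALL")
ranking = summarize(model, GENES, ANNOTATIONS)
for wracc, term in ranking:
    print(f"{term}: WRAcc = {wracc:.3f}")
```

Note how the roles flip between the stages: in `fit` the six genes are attributes of six samples, while in `summarize` the same six genes are the examples and the ontology terms are their attributes. With thousands of genes, the second stage would therefore have thousands of examples to learn from, which is the advantage the abstract refers to.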