Mining Rules for the Automatic Selection Process of Clustering Methods Applied to Cancer Gene Expression Data

  • Authors:
  • André C. Nascimento;Ricardo B. Prudêncio;Marcilio C. Souto;Ivan G. Costa

  • Affiliations:
  • Center of Informatics, Federal University of Pernambuco, Recife, Brazil;Center of Informatics, Federal University of Pernambuco, Recife, Brazil;Dept. of Informatics and Applied Mathematics, Fed. Univ. of Rio Grande do Norte, Natal, Brazil;Center of Informatics, Federal University of Pernambuco, Recife, Brazil

  • Venue:
  • ICANN '09 Proceedings of the 19th International Conference on Artificial Neural Networks: Part II
  • Year:
  • 2009

Abstract

Different algorithms have been proposed in the literature to cluster gene expression data; however, no single algorithm can be considered the best regardless of the data. In this work, we applied the concepts of Meta-Learning to relate features of gene expression data sets to the performance of clustering algorithms. In our context, each meta-example consists of descriptive features of a gene expression data set and a label indicating the clustering algorithm that performed best on that data. A set of such meta-examples is given as input to a learning technique (the meta-learner), which is responsible for acquiring knowledge relating the descriptive features to the best algorithms. We performed experiments on a case study in which a meta-learner was applied to discriminate among three competing algorithms for clustering cancer gene expression data. In this case study, a set of meta-examples was generated by applying the algorithms to 30 different cancer data sets. The knowledge extracted by the meta-learner was useful for understanding the suitability of each clustering algorithm for specific problems.
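
The meta-example and meta-learner setup described in the abstract can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration: the abstract does not name the descriptive features, the three clustering algorithms, or the learning technique, so the feature names (`n_samples`, `n_genes`), the labels (`algo_A`, `algo_B`), the toy values, and the single-threshold rule learner below are hypothetical placeholders, not the authors' method or data. For brevity the toy set uses only two labels.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class MetaExample:
    features: dict        # descriptive features of one data set (hypothetical names)
    best_algorithm: str   # label: clustering algorithm that performed best on it


def majority(labels):
    """Return the most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def learn_one_rule(examples):
    """Toy meta-learner: pick the single rule 'feature <= t' with the fewest
    training errors. This is an illustrative stand-in, not the paper's learner."""
    best = None
    for name in examples[0].features:
        values = sorted({ex.features[name] for ex in examples})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [ex.best_algorithm for ex in examples if ex.features[name] <= t]
            right = [ex.best_algorithm for ex in examples if ex.features[name] > t]
            left_label, right_label = majority(left), majority(right)
            errors = (sum(label != left_label for label in left)
                      + sum(label != right_label for label in right))
            if best is None or errors < best[0]:
                best = (errors, name, t, left_label, right_label)
    if best is None:
        raise ValueError("no feature takes more than one distinct value")
    return best[1:]  # (feature name, threshold, label if <=, label if >)


def recommend(rule, features):
    """Apply a learned rule to the descriptive features of a new data set."""
    name, t, left_label, right_label = rule
    return left_label if features[name] <= t else right_label


# Made-up meta-examples, for illustration only (not results from the paper).
meta_examples = [
    MetaExample({"n_samples": 40, "n_genes": 1000}, "algo_A"),
    MetaExample({"n_samples": 60, "n_genes": 2000}, "algo_A"),
    MetaExample({"n_samples": 150, "n_genes": 1500}, "algo_B"),
    MetaExample({"n_samples": 200, "n_genes": 1200}, "algo_B"),
]

rule = learn_one_rule(meta_examples)
suggestion = recommend(rule, {"n_samples": 80, "n_genes": 1800})
```

On this toy set the learned rule splits on `n_samples`, and the rule itself is the kind of readable knowledge the abstract refers to: it states which algorithm is suggested for which range of a descriptive feature. A single threshold can only separate two labels, so distinguishing three algorithms, as in the case study, would need a richer rule learner.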