Multi-objective design of hierarchical consensus functions for clustering ensembles via genetic programming

  • Authors:
  • André L. V. Coelho; Everlândio Fernandes; Katti Faceli

  • Affiliations:
  • Graduate Program in Applied Informatics, Center of Technological Sciences, University of Fortaleza, Av. Washington Soares, 1321/J30, 60811-905, Fortaleza, CE, Brazil;Graduate Program in Applied Informatics, Center of Technological Sciences, University of Fortaleza, Av. Washington Soares, 1321/J30, 60811-905, Fortaleza, CE, Brazil;Federal University of São Carlos, Sorocaba Campus, Rod. João Leme dos Santos, Km 110, Itinga, 18052-780, Sorocaba, SP, Brazil

  • Venue:
  • Decision Support Systems
  • Year:
  • 2011

Abstract

This paper investigates a genetic programming (GP) approach aimed at the multi-objective design of hierarchical consensus functions for clustering ensembles. By this means, data partitions obtained via different clustering techniques can be continuously refined (via selection and merging) by a population of fusion hierarchies having complementary validation indices as objective functions. To assess the potential of the novel framework in terms of efficiency and effectiveness, a series of systematic experiments, involving eleven variants of the proposed GP-based algorithm and a comparison with basic as well as advanced clustering methods (of which some are clustering ensembles and/or multi-objective in nature), have been conducted on a number of artificial, benchmark and bioinformatics datasets. Overall, the results corroborate the perspective that having fusion hierarchies operating on well-chosen subsets of data partitions is a fine strategy that may yield significant gains in terms of clustering robustness.