Genetic algorithm for finding cluster hierarchies

  • Authors:
  • Christian Böhm;Annahita Oswald;Christian Richter;Bianca Wackersreuther;Peter Wackersreuther

  • Affiliations:
  • Ludwig-Maximilians-University, Department for Informatics, Munich, Germany (all authors)

  • Venue:
  • DEXA'11 Proceedings of the 22nd international conference on Database and expert systems applications - Volume Part I
  • Year:
  • 2011

Abstract

Hierarchical clustering algorithms have been studied extensively in recent years. However, existing approaches suffer from several drawbacks. The representation of the results is often hard to interpret, particularly for large datasets. Many approaches are not robust to noise objects, or overcome this limitation only through parameter settings that are difficult to choose. Because many approaches depend heavily on their initialization, the resulting hierarchical clustering can get stuck in a local optimum. In this paper, we propose the novel genetic-based hierarchical clustering algorithm GACH (Genetic Algorithm for finding Cluster Hierarchies), which addresses these problems through a combination of genetic algorithms, information theory and model-based clustering. GACH is capable of finding the correct number of model parameters using the Minimum Description Length (MDL) principle, and it does not depend on the initialization because it uses a population-based stochastic search that ensures a thorough exploration of the search space. Moreover, outliers are handled by assigning them to appropriate inner nodes of the hierarchy or even to the root. An extensive evaluation of GACH on synthetic as well as real data demonstrates the superiority of our algorithm over several existing approaches.
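
To illustrate how an MDL criterion can select the number of clusters, the following minimal sketch scores candidate partitions of one-dimensional data by a two-part code length: the bits needed to encode the points under a Gaussian fitted to each cluster, plus the bits for cluster labels and model parameters. This is not the coding scheme used by GACH, which the abstract does not specify. The function names (`gaussian_code_length`, `mdl_cost`), the parameter cost of (p/2)·log2(n) bits per cluster, the variance floor, and the toy data are all assumptions made for illustration.

```python
import math

def gaussian_code_length(points):
    """Bits needed to encode the points under a Gaussian fitted to them.

    Uses the negative log2 density, so the value can be negative for
    tightly packed points (a known simplification for continuous data).
    """
    n = len(points)
    mean = sum(points) / n
    var = sum((x - mean) ** 2 for x in points) / n
    var = max(var, 1e-6)  # crude floor (assumption) to avoid log(0)
    log2e = math.log2(math.e)
    return sum(
        0.5 * math.log2(2 * math.pi * var) + (x - mean) ** 2 / (2 * var) * log2e
        for x in points
    )

def mdl_cost(clusters):
    """Two-part MDL cost of a partition: data bits + label bits + parameter bits."""
    n = sum(len(c) for c in clusters)
    params_per_cluster = 2  # mean and variance
    cost = 0.0
    for c in clusters:
        cost += gaussian_code_length(c)              # data given the model
        cost += len(c) * math.log2(n / len(c))       # cluster label per point
        cost += 0.5 * params_per_cluster * math.log2(n)  # model parameters
    return cost

# Toy data with two well-separated groups (hypothetical values).
low = [1.0, 1.2, 0.8, 1.1, 0.9]
high = [9.0, 9.2, 8.8, 9.1, 8.9]

one_cluster = mdl_cost([low + high])
two_clusters = mdl_cost([low, high])
four_clusters = mdl_cost([low[:3], low[3:], high[:3], high[3:]])

# The partition with the lowest cost is the one MDL prefers.
best = min(
    [("one", one_cluster), ("two", two_clusters), ("four", four_clusters)],
    key=lambda item: item[1],
)
print(best[0])
```

On this toy data the two-cluster partition has the lowest cost: a single cluster pays heavily for the large variance, while four clusters pay more for labels and parameters than they save on the data. In a genetic search of the kind the abstract describes, a cost like this could serve as the fitness that candidate solutions in the population are ranked by.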