Clustering of Leaf-Labelled Trees

  • Authors:
  • Jakub Koperwas;Krzysztof Walczak

  • Affiliations:
  • Institute of Computer Science, Warsaw University of Technology, Nowowiejska 15/19, 00-665 Warsaw, Poland (both authors)

  • Venue:
  • ICANNGA '07 Proceedings of the 8th international conference on Adaptive and Natural Computing Algorithms, Part I
  • Year:
  • 2007

Abstract

This paper introduces a novel methodology for clustering data represented as leaf-labelled trees on the same leaf set. We define an abstract notion, the representative tree, which can be realized by a variety of trees depending on the application. The quality of a tree clustering is measured by information gain: the increase in information contained in the representative trees of the resulting clusters compared with a single representative tree of the whole dataset. Finally, we propose the k-best algorithm, whose objective function is to maximize the information gain. We show how it can be constructed for two different representative trees that are well known in phylogenetic analysis. The developed algorithms yield very promising results.
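
To make the idea of information gain concrete, the following is a minimal sketch and not the authors' implementation. It rests on several assumptions that the abstract does not state: each tree is encoded as the set of its non-trivial clades (sets of leaf labels), the representative tree is taken to be the strict consensus (the clades shared by every tree in a cluster), the information of a representative tree is simply its number of clades, and cluster contributions are weighted by cluster size. The names strict_consensus, information and information_gain are hypothetical.

```python
from typing import FrozenSet, List, Set

Clade = FrozenSet[str]   # a group of leaf labels below one internal node
Tree = Set[Clade]        # assumption: a tree is encoded as its set of non-trivial clades


def strict_consensus(trees: List[Tree]) -> Tree:
    """Assumed representative tree: the clades present in every tree."""
    if not trees:
        return set()
    shared = set(trees[0])
    for tree in trees[1:]:
        shared &= tree
    return shared


def information(representative: Tree) -> int:
    """Assumed information measure: number of clades the representative resolves."""
    return len(representative)


def information_gain(clusters: List[List[Tree]]) -> float:
    """Size-weighted information of per-cluster representatives
    minus the information of one representative for the whole dataset."""
    all_trees = [tree for cluster in clusters for tree in cluster]
    total = len(all_trees)
    baseline = information(strict_consensus(all_trees))
    clustered = sum(
        len(cluster) / total * information(strict_consensus(cluster))
        for cluster in clusters
        if cluster  # empty clusters contribute nothing
    )
    return clustered - baseline
```

Under these assumptions, a clustering that groups mutually compatible trees together scores higher than one that mixes conflicting trees, because the mixed clusters have a poorly resolved consensus. A k-best style search would then look for the partition into k clusters with the largest such score.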