Clustering of Leaf-Labelled Trees

Authors:
Jakub Koperwas;Krzysztof Walczak
Affiliations:
Institute of Computer Science, Warsaw University of Technology, Nowowiejska 15/19, 00-665 Warsaw, Poland;Institute of Computer Science, Warsaw University of Technology, Nowowiejska 15/19, 00-665 Warsaw, Poland
Venue:
ICANNGA '07 Proceedings of the 8th international conference on Adaptive and Natural Computing Algorithms, Part I
Year:
2007

Abstract

This paper introduces a novel methodology for clustering data represented as leaf-labelled trees on the same leaf set. We define an abstract term, the representative tree, which can be realised by a variety of trees depending on the application. The quality of a tree clustering is based on information gain, which measures the increase in information contained in the representative trees of the resulting clusters compared with a single representative tree of the whole dataset. Finally, we propose the k-best algorithm, whose objective function maximizes the information gain. We show how it can be constructed for two different representative trees that are well known in phylogenetic analysis. The developed algorithms yield very promising results.
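
The idea of information gain can be illustrated with a minimal sketch. The abstract does not name the representative trees or define how information is counted, so everything below is an assumption made for illustration: the representative tree is taken to be the strict consensus tree (a common choice in phylogenetics), its information is taken to be the number of non-trivial clades it retains, and the gain is taken to be the size-weighted mean information of the cluster representatives minus the information of the whole-dataset representative. Trees are written as nested tuples of leaf labels, and all function names are hypothetical.

```python
def clades(tree):
    """Return the non-trivial clades of a rooted tree given as nested tuples.

    Each clade is a frozenset of the leaf labels below one internal node.
    The full leaf set (the root clade) is dropped because every tree has it.
    """
    found = set()

    def leaves(node):
        if not isinstance(node, tuple):      # a leaf label
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        found.add(below)
        return below

    root = leaves(tree)
    found.discard(root)
    return found


def strict_consensus_clades(trees):
    """Clades present in every tree (assumed representative: strict consensus)."""
    return set.intersection(*(clades(t) for t in trees))


def information(trees):
    """Assumed information measure: number of clades kept by the representative."""
    return len(strict_consensus_clades(trees))


def information_gain(clusters):
    """Assumed gain: weighted mean cluster information minus whole-set information."""
    whole = [t for cluster in clusters for t in cluster]
    weighted = sum(len(c) * information(c) for c in clusters) / len(whole)
    return weighted - information(whole)


# Four toy trees on the same leaf set {a, b, c, d}.
t1 = (("a", "b"), ("c", "d"))
t2 = ((("a", "b"), "c"), "d")
t3 = (("a", "c"), ("b", "d"))
t4 = ((("a", "c"), "b"), "d")

good = information_gain([[t1, t2], [t3, t4]])   # groups trees sharing a clade
poor = information_gain([[t1, t3], [t2, t4]])   # mixes conflicting trees
print(good, poor)
```

Under these assumptions the strict consensus of all four trees retains no clade, so any clustering whose representatives keep some structure has positive gain, and the grouping that keeps more shared clades scores higher. A clustering algorithm that maximizes this quantity would prefer the first grouping over the second.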