Sibling Distance for Rooted Labeled Trees

  • Authors:
  • Taku Aratsu; Kouichi Hirata; Tetsuji Kuboyama

  • Affiliations:
  • Graduate School of Computer Science and Systems Engineering; Department of Artificial Intelligence, Kyushu Institute of Technology, Iizuka, Japan 820-8502; Computer Center, Gakushuin University, Tokyo, Japan 171-8588

  • Venue:
  • New Frontiers in Applied Data Mining
  • Year:
  • 2009

Abstract

In this paper, we introduce a sibling distance Δ_s for rooted labeled trees, defined as the L_1-distance between their sibling histograms. A sibling histogram records the frequency of every pair consisting of the label of a node and the sequence of labels of its children. We then show that Δ_s gives a constant-factor lower bound on the tree edit distance Δ, namely Δ_s(T_1, T_2) ≤ 4Δ(T_1, T_2). Next, we design an algorithm that computes the sibling histogram in O(n) time for ordered trees and in O(gn) time for unordered trees, where n and g are the number of nodes and the degree of a tree, respectively. Finally, we give experimental results obtained by applying the sibling distance to glycan data.
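
The definition above can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the (label, children) tuple representation, the function names `sibling_histogram` and `sibling_distance`, the choice to count leaves with an empty child sequence, and the use of sorted child labels as the canonical form for unordered trees are all assumptions made for illustration. In particular, sorting costs O(g log g) per node and so does not reproduce the O(gn) bound stated in the abstract.

```python
from collections import Counter

# Assumed representation: a tree is a (label, children) tuple,
# where children is a list of trees of the same shape.

def sibling_histogram(tree, ordered=True):
    """Count each (node label, tuple of child labels) pair in the tree."""
    hist = Counter()
    stack = [tree]
    while stack:
        label, children = stack.pop()
        child_labels = [child[0] for child in children]
        if not ordered:
            # Assumption: for unordered trees, sorted labels serve as
            # a canonical form of the multiset of child labels.
            child_labels.sort()
        # Assumption: leaves are counted with an empty child sequence.
        hist[(label, tuple(child_labels))] += 1
        stack.extend(children)
    return hist

def sibling_distance(t1, t2, ordered=True):
    """L1 distance between the sibling histograms of t1 and t2."""
    h1 = sibling_histogram(t1, ordered)
    h2 = sibling_histogram(t2, ordered)
    # Counter returns 0 for missing keys, so the union of keys is safe.
    return sum(abs(h1[key] - h2[key]) for key in h1.keys() | h2.keys())

# Hypothetical example trees.
t1 = ("a", [("b", []), ("c", [])])
t2 = ("a", [("c", []), ("b", [])])  # same children, swapped order
t3 = ("a", [("b", []), ("d", [])])  # leaf c relabeled to d

print(sibling_distance(t1, t2))                 # ordered comparison
print(sibling_distance(t1, t2, ordered=False))  # unordered comparison
print(sibling_distance(t1, t3))                 # one relabeling
```

Under these assumptions, swapping two siblings changes the ordered histogram but not the unordered one, and a single relabeling of a leaf changes at most four histogram entries (two for the parent, two for the leaf), which is consistent with the factor of 4 in the bound.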