Indexing for subtree similarity-search using edit distance

  • Authors:
  • Sara Cohen

  • Affiliations:
  • The Hebrew University of Jerusalem, Jerusalem, Israel

  • Venue:
  • Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
  • Year:
  • 2013

Abstract

Given a tree Q and a large set of trees T = {T1, ..., Tn}, the subtree similarity-search problem is that of finding the subtrees of trees in T that are most similar to Q under the tree edit distance metric. Determining similarity using tree edit distance has proven useful in a variety of application areas. While subtree similarity-search has been studied in the past, previous solutions required a traversal of all of T, which poses a severe processing-time bottleneck as T grows larger. This paper proposes the first index structure for subtree similarity-search, provided that the unit cost function is used. Extensive experimentation and comparison to previous work show the huge improvement gained when using the proposed index structure and processing algorithm.
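
To make the problem statement concrete, the following is a minimal sketch of the index-free baseline the abstract refers to: it visits every subtree of every tree in T and scores it against Q. It is not the index structure proposed in the paper. It assumes ordered, labeled trees encoded as nested tuples `(label, children)`, unit costs for insert, delete, and rename, and a top-k reading of "most similar"; the function names and the parameter `k` are illustrative and do not come from the paper.

```python
from functools import lru_cache
import heapq

# Assumed encoding: a tree is (label, children), children is a tuple of trees.


def size(tree):
    """Number of nodes in a tree."""
    return 1 + sum(size(child) for child in tree[1])


@lru_cache(maxsize=None)
def forest_dist(f, g):
    """Unit-cost edit distance between two ordered forests (tuples of trees)."""
    if not f and not g:
        return 0
    if not f:
        return sum(size(t) for t in g)  # insert every node of g
    if not g:
        return sum(size(t) for t in f)  # delete every node of f
    (v_label, v_kids), (w_label, w_kids) = f[-1], g[-1]
    # Delete the rightmost root of f; its children take its place.
    delete = forest_dist(f[:-1] + v_kids, g) + 1
    # Insert the rightmost root of g.
    insert = forest_dist(f, g[:-1] + w_kids) + 1
    # Match the two rightmost roots, renaming if the labels differ.
    match = (forest_dist(v_kids, w_kids)
             + forest_dist(f[:-1], g[:-1])
             + (v_label != w_label))
    return min(delete, insert, match)


def tree_edit_distance(a, b):
    return forest_dist((a,), (b,))


def subtrees(tree):
    """Yield the tree itself and the subtree rooted at each descendant."""
    yield tree
    for child in tree[1]:
        yield from subtrees(child)


def top_k_subtrees(query, trees, k):
    """Brute force: score every subtree of every tree, keep the k closest."""
    scored = ((tree_edit_distance(query, s), i, s)
              for i, t in enumerate(trees)
              for s in subtrees(t))
    return heapq.nsmallest(k, scored, key=lambda c: c[0])


if __name__ == "__main__":
    q = ("a", (("b", ()), ("c", ())))
    t1 = ("r", (("a", (("b", ()), ("c", ()))), ("x", ())))
    t2 = ("a", (("b", ()), ("d", ())))
    for dist, tree_index, sub in top_k_subtrees(q, [t1, t2], 2):
        print(dist, tree_index, sub)
```

Because every subtree of every tree is scored, the running time grows with the total size of T, which is the bottleneck an index is meant to avoid.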