Indexing for subtree similarity-search using edit distance

Authors:
Sara Cohen
Affiliations:
The Hebrew University of Jerusalem, Jerusalem, Israel
Venue:
Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
Year:
2013

Abstract

Given a tree Q and a large set of trees T = {T1, ..., Tn}, the subtree similarity-search problem is to find the subtrees of trees in T that are most similar to Q under the tree edit distance metric. Tree edit distance has proven useful for determining similarity in a variety of application areas. Although subtree similarity-search has been studied before, previous solutions required traversing all of T, which becomes a severe processing-time bottleneck as T grows. This paper proposes the first index structure for subtree similarity-search, under the assumption that the unit cost function is used. Extensive experiments and comparison with previous work show a large improvement when the proposed index structure and processing algorithm are used.
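
To make the problem statement concrete, the following is a minimal sketch of the naive, index-free baseline that the abstract describes as the bottleneck: compute the unit-cost tree edit distance between Q and every subtree of every tree in T, then keep the k closest. It is not the index structure proposed in the paper. The tree encoding as `(label, children)` tuples, the function names, and the parameter `k` are all assumptions made for illustration, and the distance uses the textbook rightmost-root recursion for ordered labeled trees rather than an optimized algorithm.

```python
from functools import lru_cache
import heapq

# Assumed encoding (not from the paper): a tree is (label, children),
# where children is a tuple of trees. A forest is a tuple of trees.


def size(forest):
    """Number of nodes in a forest."""
    return sum(1 + size(children) for _, children in forest)


@lru_cache(maxsize=None)
def forest_dist(f, g):
    """Unit-cost edit distance between two ordered forests."""
    if not f and not g:
        return 0
    if not f:
        return size(g)  # insert every node of g
    if not g:
        return size(f)  # delete every node of f
    (v_label, v_kids), (w_label, w_kids) = f[-1], g[-1]
    # Delete the rightmost root of f; its children take its place.
    delete = forest_dist(f[:-1] + v_kids, g) + 1
    # Insert the rightmost root of g.
    insert = forest_dist(f, g[:-1] + w_kids) + 1
    # Match the two rightmost roots, renaming if the labels differ.
    rename = (forest_dist(f[:-1], g[:-1])
              + forest_dist(v_kids, w_kids)
              + (v_label != w_label))
    return min(delete, insert, rename)


def tree_edit_distance(t1, t2):
    return forest_dist((t1,), (t2,))


def subtrees(tree):
    """Yield the subtree rooted at every node of `tree`."""
    yield tree
    for child in tree[1]:
        yield from subtrees(child)


def naive_subtree_search(query, trees, k):
    """Return the k (distance, tree_index, subtree) triples closest to query.

    Scans every subtree of every tree, which is the full traversal
    that an index is meant to avoid.
    """
    scored = ((tree_edit_distance(query, s), i, s)
              for i, t in enumerate(trees)
              for s in subtrees(t))
    return heapq.nsmallest(k, scored, key=lambda item: item[0])


if __name__ == "__main__":
    leaf = lambda label: (label, ())
    query = ("a", (leaf("b"), leaf("c")))
    trees = [
        ("r", (("a", (leaf("b"), leaf("c"))), leaf("d"))),
        ("a", (leaf("b"), leaf("x"))),
    ]
    for dist, idx, sub in naive_subtree_search(query, trees, k=2):
        print(dist, idx, sub)
```

In this toy input, the first tree contains an exact copy of the query as a subtree (distance 0), and the second tree differs from the query by a single rename (distance 1). The cost of the scan grows with the total number of nodes in T, which is what motivates an index.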