Several recent papers argue for approximate lookups in hierarchical data and propose index structures that support approximate searches in large sets of hierarchical data. These index structures must be updated if the underlying data changes. Since the cost of a full index reconstruction is prohibitive, the index must be updated incrementally.

We propose a persistent and incrementally maintainable index for approximate lookups in hierarchical data. The index is based on small tree patterns, called pq-grams. It supports efficient updates in response to structure and value changes in hierarchical data and is based on the log of tree edit operations. We prove the correctness of the incremental maintenance for sequences of edit operations. Our algorithms identify a small set of pq-grams that must be updated to maintain the index. The experimental results with synthetic and real data confirm the scalability of our approach.
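
To make the notion of pq-grams concrete, the following is a minimal sketch in Python, not the incremental maintenance algorithm of this work. It assumes the usual pq-gram construction (a stem of p ancestors and a base of q consecutive children, padded with a dummy label), the defaults p = 2 and q = 3, and a tree encoded as (label, [children]) tuples. The names pq_profile, pq_distance, and profile_delta are hypothetical. The delta here is obtained by recomputing both profiles from scratch, whereas the index described above derives the affected pq-grams from the log of edit operations.

```python
from collections import Counter

DUMMY = "*"  # assumed placeholder label for missing ancestors and siblings


def pq_profile(tree, p=2, q=3):
    """Return the bag of pq-grams of `tree`, encoded as (label, [children])."""
    grams = Counter()

    def visit(node, ancestors):
        label, children = node
        stem = ancestors[1:] + (label,)  # shift the current node into the stem
        base = (DUMMY,) * q
        if not children:
            # A leaf contributes one pq-gram with an all-dummy base.
            grams[stem + base] += 1
            return
        for child in children:
            base = base[1:] + (child[0],)  # slide the window over the children
            grams[stem + base] += 1
            visit(child, stem)
        for _ in range(q - 1):
            base = base[1:] + (DUMMY,)  # pad on the right with dummies
            grams[stem + base] += 1

    visit(tree, (DUMMY,) * p)
    return grams


def pq_distance(g1, g2):
    """pq-gram distance: 1 - 2 * |bag intersection| / |bag union|."""
    shared = sum((g1 & g2).values())
    total = sum(g1.values()) + sum(g2.values())
    return 1 - 2 * shared / total


def profile_delta(old, new):
    """Return the pq-grams to remove from and to add to an index entry."""
    return old - new, new - old


# Renaming one leaf (c -> d) changes only the pq-grams that contain it.
before = ("a", [("b", []), ("c", [])])
after = ("a", [("b", []), ("d", [])])
old, new = pq_profile(before), pq_profile(after)
removed, added = profile_delta(old, new)
print(sum(old.values()), sum(removed.values()), sum(added.values()))
print(round(pq_distance(old, new), 3))
```

In this small example, the rename leaves untouched every pq-gram that does not contain the renamed node, which illustrates why only a small set of pq-grams needs updating after an edit operation.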