SGML and XML are having a profound impact on data modeling and processing. This paper presents an efficient algorithm for computing the differences between the old and new versions of an SGML/XML document. The difference between two versions can be expressed as an edit script that transforms one document tree into the other. The proposed algorithm is a hybrid of bottom-up and top-down methods: matching relationships between nodes in the two versions are produced bottom-up, and a top-down breadth-first search then computes the edit script. Matching is faster because the algorithm does not need to check every node for a possible match. Furthermore, it detects structurally meaningful changes, such as moving or copying a subtree, as well as simple changes to an individual node, such as insertion, deletion, and update.
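The abstract gives no implementation details, so the following is only a minimal sketch of the general idea and should not be read as the paper's algorithm. It assumes a hypothetical `Node` type, uses subtree hashes computed in post-order as a stand-in for the bottom-up matching phase, and then walks paired nodes breadth-first from the root to emit insert, delete, and update operations. It assumes the two roots correspond, and it does not attempt to detect moved or copied subtrees.

```python
from collections import deque
from dataclasses import dataclass, field
import hashlib


@dataclass(eq=False)  # eq=False: nodes compare and hash by identity
class Node:
    """Hypothetical document-tree node (not taken from the paper)."""
    label: str
    text: str = ""
    children: list = field(default_factory=list)


def subtree_hash(node, table):
    """Post-order (bottom-up) hashing: children are hashed before the parent."""
    child_hashes = [subtree_hash(c, table) for c in node.children]
    payload = "\x00".join([node.label, node.text, *child_hashes])
    table[node] = hashlib.sha1(payload.encode("utf-8")).hexdigest()
    return table[node]


def diff(old, new):
    """Return a list of edit operations turning `old` into `new`.

    Assumption: the two roots correspond to each other.
    """
    old_hash, new_hash = {}, {}
    subtree_hash(old, old_hash)
    subtree_hash(new, new_hash)

    script = []
    queue = deque([(old, new)])  # top-down breadth-first traversal
    while queue:
        o, n = queue.popleft()
        if old_hash[o] == new_hash[n]:
            continue  # identical subtrees: nothing below needs inspection
        if o.text != n.text:
            script.append(("update", n.label, o.text, n.text))

        remaining = list(o.children)
        unmatched = []
        # Pass 1: pair children whose whole subtrees are identical.
        for c in n.children:
            twin = next((r for r in remaining if old_hash[r] == new_hash[c]), None)
            if twin is not None:
                remaining.remove(twin)
            else:
                unmatched.append(c)
        # Pass 2: pair the rest by label and compare them on a later level.
        for c in unmatched:
            partner = next((r for r in remaining if r.label == c.label), None)
            if partner is not None:
                remaining.remove(partner)
                queue.append((partner, c))
            else:
                script.append(("insert", c.label, n.label))
        for r in remaining:
            script.append(("delete", r.label, o.label))
    return script


old_doc = Node("doc", children=[Node("title", "Old"), Node("p", "same"), Node("note", "gone")])
new_doc = Node("doc", children=[Node("title", "New"), Node("p", "same"), Node("p", "added")])
for op in diff(old_doc, new_doc):
    print(op)
```

Because equal hashes let the traversal skip an entire subtree, unchanged regions of the document are never visited node by node, which illustrates in a simplified way why not every node has to be examined for a match.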