KF-Diff+: Highly Efficient Change Detection Algorithm for XML Documents

Authors:
Haiyuan Xu;Quanyuan Wu;Huaimin Wang;Guogui Yang;Yan Jia
Venue:
On the Move to Meaningful Internet Systems, 2002 - DOA/CoopIS/ODBASE 2002 Confederated International Conferences DOA, CoopIS and ODBASE 2002
Year:
2002

Abstract

Most previous work on change detection in XML documents uses the ordered tree model, with a best known complexity of O(n log n), where n is the size of the document. The best algorithm we know of for the unordered model runs in polynomial time. In this paper, we propose a highly efficient algorithm named KF-Diff+. Its key property is that it transforms traditional tree-to-tree correction into a comparison of key trees, which are essentially labeled trees without duplicate paths, with a complexity of O(n), where n is the number of nodes in the trees. In addition, KF-Diff+ is tailored to both ordered and unordered trees. Experiments show that KF-Diff+ can process XML documents at extremely high speed.
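
To illustrate why the absence of duplicate paths makes linear-time comparison plausible, here is a minimal Python sketch. It is not the KF-Diff+ algorithm itself, whose details the abstract does not give. It assumes that every node carries a label (for example, an element name combined with a key value) that is unique among its siblings, so each node is identified by its path from the root and children can be matched by a hash lookup regardless of sibling order. The `Node` class, the `diff_key_trees` function, and the edit names `update`, `delete`, and `insert` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Assumption: `label` is unique among siblings, so no two nodes share a path.
    label: str
    value: str = ""
    children: list = field(default_factory=list)


def diff_key_trees(old, new, path=""):
    """Return a list of (operation, path) pairs describing how `new` differs from `old`.

    Assumes `old` and `new` are matched nodes with the same label.
    """
    here = f"{path}/{old.label}"
    edits = []
    if old.value != new.value:
        edits.append(("update", here))

    # Index children by label; uniqueness among siblings makes this lossless.
    old_kids = {child.label: child for child in old.children}
    new_kids = {child.label: child for child in new.children}

    for label, child in old_kids.items():
        if label in new_kids:
            # Same path in both trees: compare the subtrees recursively.
            edits.extend(diff_key_trees(child, new_kids[label], here))
        else:
            edits.append(("delete", f"{here}/{label}"))

    for label in new_kids:
        if label not in old_kids:
            edits.append(("insert", f"{here}/{label}"))

    return edits


old_doc = Node("db", children=[Node("book[id=1]", "XML"), Node("book[id=2]", "SQL")])
new_doc = Node("db", children=[Node("book[id=2]", "SQL"),
                               Node("book[id=1]", "XML 2e"),
                               Node("book[id=3]", "RDF")])
print(diff_key_trees(old_doc, new_doc))
```

Each node is visited once and matched with an expected constant-time dictionary lookup, so the number of node comparisons grows linearly with the number of nodes (building the readable path strings adds cost proportional to depth and could be dropped). Because matching is by label rather than position, reordering siblings produces no edits in this sketch; detecting moves in the ordered model would need an extra step not shown here.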