Fast and simple XML tree differencing by sequence alignment

  • Authors:
  • Tancred Lindholm;Jaakko Kangasharju;Sasu Tarkoma

  • Affiliations:
  • Helsinki Institute for Information Technology;Helsinki Institute for Information Technology;Helsinki Institute for Information Technology

  • Venue:
  • Proceedings of the 2006 ACM symposium on Document engineering
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

With the advent of XML we have seen a renewed interest in methods for computing the difference between trees. Methods that include heuristic elements play an important role in practical applications due to the inherent complexity of the problem. We present a method for differencing XML as ordered trees based on mapping the problem to the domain of sequence alignment, applying simple and efficient heuristics in this domain, and transforming back to the tree domain. Our approach provides a method to quickly compute changes that are meaningful transformations on the XML tree level, and includes subtree move as a primitive operation. We evaluate the feasibility of our approach and benchmark it against a selection of existing differencing tools. The results show our approach to be feasible and to have the potential to perform on par with tools of a more complex design in terms of both output size and execution time.