Building automatic mapping between XML documents using approximate tree matching

  • Authors:
  • Guangming Xing;Zhonghang Xia;Andrew Ernest

  • Affiliations:
  • Western Kentucky University, Bowling Green, KY;Western Kentucky University, Bowling Green, KY;Western Kentucky University, Bowling Green, KY

  • Venue:
  • Proceedings of the 2007 ACM symposium on Applied computing
  • Year:
  • 2007

Abstract

The eXtensible Markup Language (XML) is becoming the standard format for data exchange on the Internet, providing interoperability among Web applications. Because XML documents are ubiquitous on the Web, efficient algorithms and tools for manipulating them are important. In this paper, we present a novel system that automates the transformation of XML documents based on structural mapping, under the restriction that the leaf text is exactly the same in the source and target documents. First, a tree edit distance algorithm is used to find the mapping between a pair of source and target documents. Tree partitioning significantly improves the efficiency of this tree matching algorithm. Second, template rules for the transformation are inferred from the mapping by generalization. Third, a template matching component applies these rules to process new documents. Experimental studies suggest that the approach is promising for Web document cleaning, information filtering, and similar applications.
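To make the first step more concrete, the following is a minimal sketch of one simple form of tree edit distance between two XML documents. It is not the algorithm from the paper: it uses a restricted top-down (Selkow-style) variant with unit costs, it compares element tags only, and it omits the tree partitioning the abstract mentions. The function names `size` and `top_down_distance` and the cost model are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def size(node):
    """Count the element nodes in the subtree rooted at node."""
    return 1 + sum(size(child) for child in node)


def top_down_distance(a, b):
    """Top-down tree edit distance between two elements (illustrative only).

    Assumed cost model: relabeling a node costs 1, and inserting or
    deleting a whole subtree costs the number of nodes it contains.
    """
    relabel = 0 if a.tag == b.tag else 1
    xs, ys = list(a), list(b)

    # dp[i][j]: cost of turning the first i children of a
    # into the first j children of b.
    dp = [[0] * (len(ys) + 1) for _ in range(len(xs) + 1)]
    for i in range(1, len(xs) + 1):
        dp[i][0] = dp[i - 1][0] + size(xs[i - 1])  # delete subtree
    for j in range(1, len(ys) + 1):
        dp[0][j] = dp[0][j - 1] + size(ys[j - 1])  # insert subtree

    for i in range(1, len(xs) + 1):
        for j in range(1, len(ys) + 1):
            dp[i][j] = min(
                dp[i - 1][j] + size(xs[i - 1]),  # delete subtree
                dp[i][j - 1] + size(ys[j - 1]),  # insert subtree
                # match the two children and recurse
                dp[i - 1][j - 1] + top_down_distance(xs[i - 1], ys[j - 1]),
            )
    return relabel + dp[-1][-1]
```

For example, parsing `<book><title>XML</title><author>Xing</author></book>` and `<item><name>XML</name><by>Xing</by></item>` with `ET.fromstring` and passing the two roots to `top_down_distance` gives a distance of 3, one relabel for each of the three elements. A node mapping of the kind the abstract describes could in principle be recovered by backtracking through the `dp` table, which this sketch does not do.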