Building automatic mapping between XML documents using approximate tree matching

Authors:
Guangming Xing;Zhonghang Xia;Andrew Ernest
Affiliations:
Western Kentucky University, Bowling Green, KY;Western Kentucky University, Bowling Green, KY;Western Kentucky University, Bowling Green, KY
Venue:
Proceedings of the 2007 ACM symposium on Applied computing
Year:
2007

Abstract

The eXtensible Markup Language (XML) is becoming the standard format for data exchange on the Internet, providing interoperability among Web applications. Because XML documents are ubiquitous on the Web, efficient algorithms and tools for manipulating them are important. In this paper, we present a novel system for automating the transformation of XML documents based on structural mapping, with the restriction that the leaf text is exactly the same in the source and target documents. First, a tree edit distance algorithm is used to find the mapping between a pair of source and target documents; introducing tree partitioning significantly improves the efficiency of this tree matching step. Second, template rules for the transformation are inferred from the mapping by generalization. Third, a template matching component applies the inferred rules to new documents. Experimental studies indicate that our methods are promising and can be applied to Web document cleaning, information filtering, and other applications.
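
To make the idea of a structural mapping concrete, here is a minimal Python sketch. It is not the paper's tree edit distance algorithm. It is a much simpler stand-in that relies only on the stated restriction that leaf text is identical in the source and target, and pairs leaves by that text. The function names (`leaf_paths`, `leaf_mapping`) and the two sample documents are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET


def leaf_paths(root):
    """Return {leaf text: tag path} for every childless element with text."""
    paths = {}

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        text = (node.text or "").strip()
        if len(node) == 0 and text:
            # Assumption: each leaf text occurs once per document.
            # Repeated texts would overwrite each other here.
            paths[text] = path
        for child in node:
            walk(child, path)

    walk(root, "")
    return paths


def leaf_mapping(source_xml, target_xml):
    """Pair source and target leaf paths that carry identical text."""
    src = leaf_paths(ET.fromstring(source_xml))
    tgt = leaf_paths(ET.fromstring(target_xml))
    return {src[text]: tgt[text] for text in src if text in tgt}


# Hypothetical sample documents with the same leaf text but different structure.
SOURCE = "<book><name>XML Basics</name><writer>A. Author</writer></book>"
TARGET = (
    "<item><title>XML Basics</title>"
    "<meta><author>A. Author</author></meta></item>"
)

mapping = leaf_mapping(SOURCE, TARGET)
for src_path, tgt_path in mapping.items():
    print(src_path, "->", tgt_path)
```

Each resulting pair, such as `/book/name` mapped to `/item/title`, is the kind of correspondence from which a template rule could be generalized. A tree edit distance approach would additionally handle repeated leaf text and use the structure of internal nodes, which this sketch ignores.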