Approximate XML document matching

  • Authors:
  • E. Rodney Canfield; Guangming Xing

  • Affiliations:
  • University of Georgia, Athens, GA; Western Kentucky University, Bowling Green, KY

  • Venue:
  • Proceedings of the 2005 ACM symposium on Applied computing
  • Year:
  • 2005

Abstract

Regular hedge grammars are a formal method for specifying XML schemas. An XML document can be viewed as an ordered labeled tree. Computing a minimum-cost approximate matching between an XML document and a schema is of more than theoretical interest. The problem can be modeled as follows: given an ordered labeled tree F, treated as a forest, and a regular hedge grammar P, compute the minimum edit distance needed to transform F into a forest F' that is exactly matched by P. In this paper, by introducing the notion of a leaf forest, we give an algorithm that solves this problem in O(|F|^2 |P| (|F| + log |P|)) time, where |F| is the size of the forest and |P| is the size of the grammar. To the authors' knowledge, this is the first algorithm for transforming an XML document (an ordered labeled tree) so that it conforms to a schema (a tree grammar).
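
To make the notion of a document being "exactly matched" by a regular hedge grammar concrete, the following is a minimal sketch of exact matching only. It is not the paper's algorithm and does not compute an edit distance. The toy grammar, the rule encoding as (nonterminal, element label, regular expression over nonterminals), and the function names `derivers` and `matches` are all assumptions made for illustration; the brute-force search over child assignments is exponential and is used here only for brevity.

```python
import re
import xml.etree.ElementTree as ET
from itertools import product

# Hypothetical toy grammar (an assumption, not taken from the paper).
# Each rule is (nonterminal, element label, regex over nonterminals).
# Nonterminals are single characters so a content model is an ordinary regex.
RULES = [
    ("B", "book", "TA+"),  # book   -> one title followed by one or more authors
    ("T", "title", ""),    # title  -> no element children
    ("A", "author", ""),   # author -> no element children
]
START = {"B"}  # nonterminals allowed at the document root


def derivers(node):
    """Return the set of nonterminals that can derive the subtree at node."""
    child_sets = [derivers(child) for child in node]
    result = set()
    for nonterminal, label, content in RULES:
        if label != node.tag:
            continue
        # Brute force: try every way of picking one nonterminal per child
        # and test the resulting string against the rule's content model.
        for choice in product(*child_sets):
            if re.fullmatch(content, "".join(choice)):
                result.add(nonterminal)
                break
    return result


def matches(xml_text):
    """True if the document (an ordered labeled tree) is derived from START."""
    root = ET.fromstring(xml_text)
    return bool(derivers(root) & START)


if __name__ == "__main__":
    print(matches("<book><title/><author/><author/></book>"))
    print(matches("<book><author/><title/></book>"))
```

Under this toy grammar the first document is matched exactly, while the second is not because its children appear in the wrong order. The problem stated in the abstract asks for the cheapest sequence of edits that would turn such a non-matching document into a matching one.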