Fast approximate matching between XML documents and schemata

  • Authors: Guangming Xing

  • Affiliations: Department of Computer Science, Western Kentucky University, Bowling Green, KY

  • Venue: APWeb'06 Proceedings of the 8th Asia-Pacific Web conference on Frontiers of WWW Research and Development
  • Year: 2006

Abstract

XML has become the standard format for web publishing and data exchange on the Internet, and much research has addressed efficient access to the relevant information available on the Web. In this paper, we present an algorithm that finds a minimum-cost sequence of top-down edit operations transforming an XML document so that it conforms to a schema. The cost is measured by the tree edit distance restricted to top-down edit operations. We show that the algorithm runs in O(p × log p × n) time, where p is the size of the schema (grammar) and n is the size of the XML document (tree). Experimental studies also show that the running time is linear in the size of the XML document when a normalized regular hedge grammar is used to specify the schema.
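
To make the notion of a top-down edit distance concrete, the following is a minimal sketch of the simpler tree-to-tree case. It is not the paper's algorithm, which matches a document against a grammar rather than against a single tree. The sketch assumes unit costs for relabeling, inserting, and deleting a node, and it assumes the usual top-down restriction that a node may be inserted or deleted only as a leaf, so removing or adding a whole subtree costs as much as the number of nodes in it. The names Node, subtree_size, and top_down_distance are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """An ordered, labeled tree node (for example, one XML element)."""
    label: str
    children: List["Node"] = field(default_factory=list)


def subtree_size(t: Node) -> int:
    """Number of nodes in the subtree rooted at t."""
    return 1 + sum(subtree_size(c) for c in t.children)


def top_down_distance(a: Node, b: Node) -> int:
    """Unit-cost top-down edit distance between two ordered trees.

    Assumption: the roots are always matched (relabeled if they differ),
    and the child sequences are aligned like strings, where deleting or
    inserting a child costs the size of its whole subtree.
    """
    relabel = 0 if a.label == b.label else 1
    m, n = len(a.children), len(b.children)

    # d[i][j] = cost of turning the first i children of a
    #           into the first j children of b.
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + subtree_size(a.children[i - 1])
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + subtree_size(b.children[j - 1])

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(
                # delete the i-th subtree of a
                d[i - 1][j] + subtree_size(a.children[i - 1]),
                # insert the j-th subtree of b
                d[i][j - 1] + subtree_size(b.children[j - 1]),
                # match the two children and recurse
                d[i - 1][j - 1]
                + top_down_distance(a.children[i - 1], b.children[j - 1]),
            )
    return relabel + d[m][n]


# Usage: a document that is missing one leaf element of the target tree.
doc = Node("book", [Node("title"), Node("author")])
target = Node("book", [Node("title"), Node("author"), Node("year")])
distance = top_down_distance(doc, target)
```

Here a single leaf insertion (the year element) turns doc into target. Matching against a schema generalizes the target from one fixed tree to the set of all trees that the grammar accepts, which is the problem the abstract addresses.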