In this paper, we present an algorithm that finds a minimum-cost sequence of top-down edit operations transforming an XML document so that it conforms to a schema. We show that the algorithm runs in O(p × log p × n) time, where p is the size of the schema (grammar) and n is the size of the XML document (tree). We also show that the edit distance with restricted top-down edit operations can be computed in the same way, and how these edit distances can be used in document classification. Experimental studies show that our methods are effective in structure-oriented classification for both real and synthesized data sets.
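To make the classification idea concrete, here is a minimal Python sketch. It is not the algorithm of the paper: it assumes each class is represented by a single sample tree rather than a schema (grammar), and it uses a Selkow-style top-down distance between two ordered trees with unit costs as a stand-in for the document-to-schema distance. The names `size`, `top_down_distance`, `classify`, and the sample trees are hypothetical and chosen for illustration only.

```python
# A tree is a (label, children) pair; children is a tuple of trees.
# ASSUMPTION: unit cost for every node insertion, deletion, and relabeling.

def size(tree):
    """Number of nodes in the tree."""
    _, children = tree
    return 1 + sum(size(c) for c in children)

def top_down_distance(a, b):
    """Selkow-style top-down edit distance between two ordered trees.

    Roots are matched (relabeled if they differ); whole child subtrees
    may be inserted or deleted at a cost equal to their size.
    """
    (label_a, kids_a), (label_b, kids_b) = a, b
    m, n = len(kids_a), len(kids_b)
    # d[i][j]: cost of turning the first i children of a into the first j of b.
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + size(kids_a[i - 1])
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + size(kids_b[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(
                d[i - 1][j] + size(kids_a[i - 1]),          # delete subtree
                d[i][j - 1] + size(kids_b[j - 1]),          # insert subtree
                d[i - 1][j - 1]
                + top_down_distance(kids_a[i - 1], kids_b[j - 1]),  # match
            )
    return (0 if label_a == label_b else 1) + d[m][n]

def classify(doc, class_samples):
    """Return the class name whose sample tree is closest to doc."""
    return min(class_samples,
               key=lambda name: top_down_distance(doc, class_samples[name]))

# Hypothetical sample trees standing in for two document classes.
article = ("article", (("title", ()), ("body", (("p", ()), ("p", ())))))
recipe = ("recipe", (("title", ()),
                     ("ingredients", (("item", ()),)),
                     ("steps", ())))
doc = ("article", (("title", ()), ("body", (("p", ()),))))

label = classify(doc, {"article": article, "recipe": recipe})
```

In this sketch `doc` differs from the `article` sample by one missing `p` leaf, so it is assigned to that class. Replacing the sample trees with schemas, and the pairwise distance with a document-to-grammar distance, would bring the sketch closer to the setting described in the abstract.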