Identification of Syntactically Similar DTD Elements for Schema Matching

  • Authors:
  • Hong Su;Sriram Padmanabhan;Ming-Ling Lo

  • Venue:
  • WAIM '01 Proceedings of the Second International Conference on Advances in Web-Age Information Management
  • Year:
  • 2001

Abstract

An XML Document Type Definition (DTD) specifies the structure of XML documents. XML applications such as data translation, schema integration, and wrapper generation require DTD schema matching as a core procedure. While schema matching usually relies on a human arbiter, we aim for an automated system that gives the arbiter a starting point for designing the matching that best meets the requirements of the given application. We present an approach that identifies syntactically similar DTD elements, which are potential matching components. We first describe the DTD element graph, a data model for DTD elements. We then define the distance between two DTD element graphs and introduce the concepts of syntactically equivalent and syntactically similar graphs. Next, we describe an algorithm that detects both syntactically equivalent and syntactically similar DTD elements. We have implemented the matching detection algorithm together with several heuristics that improve performance. Our experimental results show that the algorithm achieves reasonable precision in recognizing correct matches.
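
The abstract does not give the paper's definitions of the DTD element graph or of the distance, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes a simplified tree model (`ElementNode`, a hypothetical name), a hypothetical distance that ignores element names and counts occurrence-indicator mismatches plus unmatched children, and an arbitrary similarity threshold of 2.

```python
from dataclasses import dataclass, field
from itertools import zip_longest
from typing import List, Optional


@dataclass
class ElementNode:
    """Hypothetical, simplified stand-in for a DTD element graph node."""
    name: str
    occurrence: str = ""  # "", "?", "*", or "+" as in DTD content models
    children: List["ElementNode"] = field(default_factory=list)


def size(node: ElementNode) -> int:
    """Number of nodes in the subtree rooted at `node`."""
    return 1 + sum(size(child) for child in node.children)


def distance(a: Optional[ElementNode], b: Optional[ElementNode]) -> int:
    """Assumed distance: names are ignored (syntactic comparison only).

    Costs 1 per occurrence-indicator mismatch and the full subtree size
    for every child that has no positional counterpart.
    """
    if a is None and b is None:
        return 0
    if a is None:
        return size(b)
    if b is None:
        return size(a)
    cost = 0 if a.occurrence == b.occurrence else 1
    # Children are paired by position; missing partners are passed as None.
    for child_a, child_b in zip_longest(a.children, b.children):
        cost += distance(child_a, child_b)
    return cost


def is_equivalent(a: ElementNode, b: ElementNode) -> bool:
    """Assumed reading of 'syntactically equivalent': distance zero."""
    return distance(a, b) == 0


def is_similar(a: ElementNode, b: ElementNode, threshold: int = 2) -> bool:
    """Assumed reading of 'syntactically similar': distance within a threshold."""
    return distance(a, b) <= threshold


# <!ELEMENT book (title, author+)>
book = ElementNode("book", children=[
    ElementNode("title"), ElementNode("author", "+")])
# <!ELEMENT publication (name, writer+)>
publication = ElementNode("publication", children=[
    ElementNode("name"), ElementNode("writer", "+")])
# <!ELEMENT article (title, author*, abstract?)>
article = ElementNode("article", children=[
    ElementNode("title"), ElementNode("author", "*"),
    ElementNode("abstract", "?")])
```

Under these assumptions, `book` and `publication` have distance 0 and count as equivalent despite their different names, while `book` and `article` have distance 2 (one occurrence mismatch, one unmatched child) and count as similar but not equivalent. A system of the kind the abstract describes would present such candidate pairs to the human arbiter as a starting point rather than as final matches.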