QMatch - Using paths to match XML schemas

  • Authors:
  • Naiyana Tansalarak; Kajal T. Claypool

  • Affiliations:
  • Department of Computer Science, University of Massachusetts, Lowell, MA 01854, United States; Department of Computer Science, University of Massachusetts, Lowell, MA 01854, United States

  • Venue:
  • Data & Knowledge Engineering
  • Year:
  • 2007

Abstract

Integrating multiple heterogeneous data sources remains a critical problem in many application domains and a challenge for researchers worldwide. With the growing popularity of the XML model and the proliferation of XML documents online, automated matching of XML documents and databases has become a critical issue. In this paper, we present QMatch, a hybrid schema match algorithm that provides a unique path-based framework for harnessing traditional structural and semantic information while exploiting the constraints inherent in XML documents, such as the order of XML elements, to improve the level of matching between two given XML schemata. QMatch is based on a unique quality of match metric, QoM, and a set of classifiers, which together provide not only an effective basis for developing a new schema match algorithm but also a useful tool for tuning existing schema match algorithms to produce output at desired levels of matching. We show via a set of experiments the benefits of the path-based QMatch over existing structural, linguistic, and hybrid algorithms such as Cupid, and provide an empirical measure of the accuracy of QMatch in terms of the true matches discovered by the algorithm.
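
To make the idea of comparing schemas by their paths concrete, the following is a minimal Python sketch. It is not the QMatch algorithm and does not compute the QoM metric, neither of which is specified in this abstract. It only illustrates, under simplifying assumptions, how root-to-leaf label paths can be enumerated in document order and compared with an order-sensitive score. The helper names (`leaf_paths`, `path_similarity`, `best_matches`) and the two toy documents are hypothetical, and the score is the generic `difflib.SequenceMatcher` ratio, used here as a stand-in.

```python
import xml.etree.ElementTree as ET
from difflib import SequenceMatcher

def leaf_paths(root):
    """Return root-to-leaf label paths as tuples, in document order."""
    paths = []

    def walk(node, prefix):
        path = prefix + [node.tag]
        children = list(node)
        if not children:
            paths.append(tuple(path))
        for child in children:
            walk(child, path)

    walk(root, [])
    return paths

def path_similarity(p, q):
    """Order-sensitive overlap of two label paths, in [0, 1].

    Stand-in score only (not QoM): SequenceMatcher compares the label
    sequences, so the order of labels along the path matters.
    """
    return SequenceMatcher(None, p, q).ratio()

def best_matches(source_paths, target_paths):
    """Map each source path to its highest-scoring target path."""
    return {
        p: max(target_paths, key=lambda q: path_similarity(p, q))
        for p in source_paths
    }

# Two toy documents standing in for the schemas to be matched (hypothetical).
SOURCE = "<order><customer><name/><phone/></customer><item/></order>"
TARGET = "<order><customer><name/><email/></customer><product/></order>"

src = leaf_paths(ET.fromstring(SOURCE))
tgt = leaf_paths(ET.fromstring(TARGET))
matches = best_matches(src, tgt)

for source_path, target_path in matches.items():
    print("/".join(source_path), "->", "/".join(target_path))
```

A purely structural score like this cannot separate `order/customer/phone` from its two equally similar candidates in the target, which is the kind of ambiguity that motivates combining structural information with linguistic and semantic evidence in a hybrid matcher.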