A schema matching-based approach to XML schema clustering

  • Authors:
  • Alsayed Algergawy;Eike Schallehn;Gunter Saake

  • Affiliations:
  • Magdeburg University, Magdeburg, Germany;Magdeburg University, Magdeburg, Germany;Magdeburg University, Magdeburg, Germany

  • Venue:
  • Proceedings of the 10th International Conference on Information Integration and Web-based Applications & Services
  • Year:
  • 2008

Abstract

The relationship between XML data clustering and schema matching is bidirectional. On one side, clustering techniques have been adopted to improve matching performance; on the other side, schema matching is the backbone of the clustering technique. This paper presents a new approach for clustering XML schemas based on schema matching. In particular, we develop and implement an XML schema matching system that determines semantic similarities between XML schemas based on the Prüfer sequence representation of schema trees. The proposed similarity computation algorithm makes use of the semantic meaning of XML elements as well as the hierarchical features of XML schemas. The computed similarities are then exploited by an agglomerative clustering algorithm to group similar schemas. Our experimental results show that the proposed approach is fast and accurate in clustering heterogeneous XML schemas.
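
The pipeline described in the abstract (schema tree, Prüfer sequence, pairwise similarity, agglomerative clustering) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The dict-based tree encoding, the function names (`postorder_number`, `prufer_sequences`, `toy_similarity`, `agglomerative`), the requirement that element names are unique within a schema, and the 0.5 threshold are all assumptions made for illustration. The `difflib`-based similarity is only a stand-in for the paper's semantic and structural measure, and average linkage is assumed because the abstract says only "agglomerative".

```python
from difflib import SequenceMatcher
from itertools import combinations

def postorder_number(tree, root):
    """Assign 1-based postorder numbers; return (number, parent) dicts."""
    number, parent = {}, {root: None}
    counter = 0
    def visit(node):
        nonlocal counter
        for child in tree.get(node, []):
            parent[child] = node
            visit(child)
        counter += 1
        number[node] = counter
    visit(root)
    return number, parent

def prufer_sequences(tree, root):
    """Return (number sequence, label sequence) for a rooted, ordered tree.

    Repeatedly delete the leaf with the smallest postorder number and record
    its parent's number and label, until only the root remains.
    """
    number, parent = postorder_number(tree, root)
    children_left = {n: len(tree.get(n, [])) for n in number}
    nps, lps = [], []
    for node in sorted(number, key=number.get)[:-1]:  # root is numbered last
        # With postorder numbering, the smallest remaining node is a leaf.
        assert children_left[node] == 0
        p = parent[node]
        nps.append(number[p])
        lps.append(p)
        children_left[p] -= 1
    return nps, lps

def toy_similarity(lps_a, lps_b):
    """Stand-in similarity in [0, 1]; NOT the paper's semantic measure."""
    return SequenceMatcher(None, lps_a, lps_b).ratio()

def agglomerative(sim, threshold):
    """Average-link clustering over a symmetric similarity matrix.

    Merge the most similar pair of clusters until no pair reaches threshold.
    """
    clusters = [[i] for i in range(len(sim))]
    def avg(a, b):
        return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))
    while len(clusters) > 1:
        x, y = max(combinations(range(len(clusters)), 2),
                   key=lambda p: avg(clusters[p[0]], clusters[p[1]]))
        if avg(clusters[x], clusters[y]) < threshold:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # safe: combinations guarantees x < y
    return clusters

# Hypothetical toy schemas: parent element -> ordered list of child elements.
s1 = {"book": ["title", "author"], "author": ["name", "email"]}
s2 = {"book": ["title", "author"], "author": ["name"]}
s3 = {"order": ["item", "customer"]}

label_seqs = [prufer_sequences(s1, "book")[1],
              prufer_sequences(s2, "book")[1],
              prufer_sequences(s3, "order")[1]]
sim = [[toy_similarity(a, b) for b in label_seqs] for a in label_seqs]
clusters = agglomerative(sim, threshold=0.5)
```

For the toy input, `prufer_sequences(s1, "book")` yields the number sequence `[5, 4, 4, 5]` and the label sequence `["book", "author", "author", "book"]`, and the two book schemas end up in one cluster while the order schema stays alone. A real system would replace `toy_similarity` with a measure that compares element names semantically rather than by exact equality.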