Approximate subtree identification in heterogeneous XML documents collections

  • Authors:
  • Ismael Sanz;Marco Mesiti;Giovanna Guerrini;Rafael Berlanga Llavori

  • Affiliations:
  • Universitat Jaume I, Castellón, Spain;Università di Milano, Italy;Università di Pisa, Italy;Universitat Jaume I, Castellón, Spain

  • Venue:
  • XSym'05 Proceedings of the Third international conference on Database and XML Technologies
  • Year:
  • 2005

Abstract

Because XML data in Internet applications is heterogeneous, exact matching of queries is often inadequate. The need arises to quickly identify subtrees of XML documents in a collection that are similar to a given pattern. In this paper we discuss different measures of similarity between a pattern and subtrees of documents in the collection. We then introduce an efficient algorithm, based on indexing structures, for identifying document subtrees that approximately conform to the pattern.
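
To make the idea of index-assisted approximate matching concrete, the following is a minimal sketch and not the algorithm or the similarity measures of the paper, which the abstract does not specify. It assumes a deliberately simple measure (the fraction of the pattern's element labels that occur in a candidate subtree) and a plain inverted index from element labels to their occurrences. The names build_label_index, label_overlap and find_similar_subtrees, the threshold parameter, and the sample documents are all hypothetical and introduced only for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_label_index(docs):
    """Inverted index: element label -> list of (doc_id, element) occurrences."""
    index = defaultdict(list)
    for doc_id, xml_text in docs.items():
        root = ET.fromstring(xml_text)
        for elem in root.iter():
            index[elem.tag].append((doc_id, elem))
    return index


def subtree_labels(elem):
    """Set of labels occurring in the subtree rooted at elem (elem included)."""
    return {node.tag for node in elem.iter()}


def label_overlap(pattern_labels, elem):
    """Assumed similarity: share of pattern labels present in the subtree."""
    return len(pattern_labels & subtree_labels(elem)) / len(pattern_labels)


def find_similar_subtrees(pattern_xml, index, threshold=0.5):
    """Return (doc_id, root_label, score) for candidate subtrees above threshold.

    Candidates are only those elements whose own label occurs in the pattern,
    so the index prunes every element that shares no root label with it.
    """
    pattern_labels = subtree_labels(ET.fromstring(pattern_xml))
    results = []
    for label in pattern_labels:
        for doc_id, elem in index.get(label, []):
            score = label_overlap(pattern_labels, elem)
            if score >= threshold:
                results.append((doc_id, elem.tag, score))
    # Best matches first; ties broken by document id and label for determinism.
    results.sort(key=lambda r: (-r[2], r[0], r[1]))
    return results


# Hypothetical two-document collection with differing vocabularies.
docs = {
    "d1": "<library><book><title>A</title><author>B</author></book></library>",
    "d2": "<catalog><item><title>C</title><price>9</price></item></catalog>",
}
pattern = "<book><title/><author/></book>"

index = build_label_index(docs)
matches = find_similar_subtrees(pattern, index, threshold=0.5)
```

This sketch ignores structure (ancestor and sibling relationships, label order) and synonymous labels, so a subtree rooted at item in the second document is never considered. A measure suited to heterogeneous collections would need to account for such differences.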