FiST: scalable XML document filtering by sequencing twig patterns

  • Authors:
  • Joonho Kwon;Praveen Rao;Bongki Moon;Sukho Lee

  • Affiliations:
  • Seoul National University, Seoul, Korea;University of Arizona, Tucson, AZ;University of Arizona, Tucson, AZ;Seoul National University, Seoul, Korea

  • Venue:
  • VLDB '05 Proceedings of the 31st international conference on Very large data bases
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

In recent years, publish-subscribe (pub-sub) systems based on XML document filtering have received much attention. In a typical pub-sub system, subscribed users specify their interest in profiles expressed in the XPath language, and each new content is matched against the user profiles so that the content is delivered to only the interested subscribers. As the number of subscribed users and their profiles can grow very large, the scalability of the system is critical to the success of pub-sub services. In this paper, we propose a novel scalable filtering system called FiST (Filtering by Sequencing Twigs) that transforms twig patterns expressed in XPath and XML documents into sequences using Prüfer's method. As a consequence, instead of matching linear paths of twig patterns individually and merging the matches during post-processing, FiST performs holistic matching of twig patterns with incoming documents. FiST organizes the sequences into a dynamic hash based index for efficient filtering. We demonstrate that our holistic matching approach yields lower filtering cost and good scalability under various situations.