Effective pruning for XML structural match queries

  • Authors:
  • Yefei Xin;Zhen He;Jinli Cao

  • Affiliations:
  • Department of Computer Science and Computer Engineering, La Trobe University, VIC 3086, Australia;Department of Computer Science and Computer Engineering, La Trobe University, VIC 3086, Australia;Department of Computer Science and Computer Engineering, La Trobe University, VIC 3086, Australia

  • Venue:
  • Data & Knowledge Engineering
  • Year:
  • 2010

Abstract

Extensible Markup Language (XML) is becoming the de facto standard for exchanging information over the Internet, and XML documents have proliferated as a result. This has drawn increased interest from the research community, and one of the main challenges is processing large collections of XML documents efficiently. Most current methods suffer from two drawbacks: they cannot complement one another to further improve query processing performance without modifying the existing query processing engine, and they cannot be customized for different structural and usage characteristics. This paper presents a new approach to structural query processing called the Property-Driven Pruning Algorithm (PDPA), which offers the twin features of structural query processing independence and plug-and-play properties to overcome both drawbacks. PDPA consists of two phases, offline and online. During the offline phase, a list of pruning properties is added to the original XML documents. During the online phase, input queries are modified with a list of carefully selected properties, which are used during query processing to quickly prune non-matching candidate documents. We propose two algorithms, an exhaustive algorithm and a greedy heuristic algorithm. Experimental results for both algorithms demonstrate that PDPA can improve XML query processing performance by up to twofold in a variety of situations.
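
The two-phase idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which pruning properties PDPA uses, so the two properties here (the set of tag names and the maximum nesting depth) are assumptions chosen for illustration, as are the function names `offline_properties`, `query_properties`, `may_match`, and `prune`. The sketch also keeps the properties in a side dictionary, whereas the abstract states that they are added to the documents and to the queries themselves.

```python
import xml.etree.ElementTree as ET


def offline_properties(xml_text):
    """Offline phase (illustrative): compute simple properties for one document.

    The properties (tag set, maximum depth) are assumptions for this sketch.
    """
    root = ET.fromstring(xml_text)
    tags = set()
    max_depth = 0
    stack = [(root, 1)]  # iterative traversal: (element, depth)
    while stack:
        node, depth = stack.pop()
        tags.add(node.tag)
        max_depth = max(max_depth, depth)
        for child in node:
            stack.append((child, depth + 1))
    return {"tags": frozenset(tags), "max_depth": max_depth}


def query_properties(path_steps):
    """Online phase (illustrative): properties any matching document must have.

    `path_steps` is a hypothetical query form: a list of tag names that must
    appear nested in this order, e.g. ["book", "author", "name"].
    """
    return {"tags": frozenset(path_steps), "min_depth": len(path_steps)}


def may_match(doc_props, q_props):
    """Necessary (not sufficient) condition for a document to match."""
    return (q_props["tags"] <= doc_props["tags"]
            and doc_props["max_depth"] >= q_props["min_depth"])


def prune(index, path_steps):
    """Return ids of documents that survive pruning and still need full matching."""
    q_props = query_properties(path_steps)
    return [doc_id for doc_id, props in index.items() if may_match(props, q_props)]


# Hypothetical three-document collection.
docs = {
    "d1": "<library><book><author><name>A</name></author></book></library>",
    "d2": "<library><journal><title>T</title></journal></library>",
    "d3": "<book><author/></book>",
}
index = {doc_id: offline_properties(text) for doc_id, text in docs.items()}
candidates = prune(index, ["book", "author", "name"])
```

In this sketch the property check is only a filter: documents in `candidates` would still be passed to the unmodified structural query engine, and documents outside it are skipped because they cannot match.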