ANDES: efficient evaluation of NOT-twig queries in relational databases

  • Authors:
  • Kheng Hong Soh;Ba Quan Truong;Sourav S. Bhowmick

  • Affiliations:
  • School of Computer Engineering, Nanyang Technological University, Singapore, Singapore;School of Computer Engineering, Nanyang Technological University, Singapore, Singapore;School of Computer Engineering, Nanyang Technological University, Singapore, Singapore

  • Venue:
  • The VLDB Journal — The International Journal on Very Large Data Bases
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

Despite a large body of work on XPath query processing in relational environment, systematic study of queries containing not-predicates have received little attention in the literature. Particularly, several xml supports of industrial-strength commercial rdbms fail to efficiently evaluate such queries. In this paper, we present an efficient and novel strategy to evaluate not -twig queries in a tree-unaware relational environment. not -twig queries are XPath queries with ancestor---descendant and parent---child axis and contain one or more not-predicates. We propose a novel Dewey-based encoding scheme called Andes (ANcestor Dewey-based Encoding Scheme), which enables us to efficiently filter out elements satisfying a not-predicate by comparing their ancestor group identifiers. In this approach, a set of elements under the same common ancestor at a specific level in the xml tree is assigned same ancestor group identifier. Based on this scheme, we propose a novel sql translation algorithm for not-twig query evaluation. Experiments carried out confirm that our proposed approach built on top of an off-the-shelf commercial rdbms significantly outperforms state-of-the-art relational and native approaches. We also explore the query plans selected by a commercial relational optimizer to evaluate our translated queries in different input cardinality. Such exploration further validates the performance benefits of Andes.