Query ranking in probabilistic XML data

  • Authors:
  • Lijun Chang; Jeffrey Xu Yu; Lu Qin

  • Affiliations:
  • The Chinese University of Hong Kong, Hong Kong, China; The Chinese University of Hong Kong, Hong Kong, China; The Chinese University of Hong Kong, Hong Kong, China

  • Venue:
  • Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
  • Year:
  • 2009

Abstract

Twig queries have been extensively studied as a major fragment of XPath for querying XML data. In this paper, we study the PXML-RANK query, (Q, k), which ranks the answers of a twig query Q in probabilistic XML (PXML) data by their top-k probabilities. A new research issue is how to compute the top-k probabilities of the answers of Q in PXML in the presence of containment (ancestor/descendant) relationships. With such relationships, the existing dynamic programming approaches for ranking top-k probabilities over a set of tuples cannot be applied directly, because any node or edge in PXML may affect the top-k probabilities of the answers. We propose new algorithms to compute PXML-RANK queries efficiently and give conditions under which a PXML-RANK query can be processed efficiently without enumerating all possible worlds. We conduct extensive performance studies using both real and large benchmark datasets, and confirm the efficiency of our algorithms.
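
To make the notion of a top-k probability concrete, the following is a minimal sketch of the naive possible-worlds baseline that the abstract says the proposed algorithms avoid. It is not the paper's method. Several things here are assumptions made only for illustration: the toy tree `TREE`, the per-answer scores in `SCORES`, the model in which each node exists independently given that its parent exists, and the reading of "top-k probability" as the total probability of the worlds in which an answer exists and ranks among the k highest-scoring answers present.

```python
from itertools import product

# Hypothetical toy PXML tree: node -> (parent, P(node exists | parent exists)).
# Parents are listed before their children.
TREE = {
    "root": (None, 1.0),
    "a": ("root", 0.5),
    "b": ("a", 0.8),   # b is a descendant of a, so b cannot exist without a
    "c": ("root", 0.6),
}

# Hypothetical answers of a twig query, with assumed ranking scores.
SCORES = {"b": 3.0, "c": 2.0, "a": 1.0}


def possible_worlds(tree):
    """Yield (set of present nodes, probability) for every possible world."""
    nodes = list(tree)
    for choices in product([True, False], repeat=len(nodes)):
        present, prob, valid = set(), 1.0, True
        for node, keep in zip(nodes, choices):
            parent, p = tree[node]
            if parent is not None and parent not in present:
                # A node whose parent is absent must be absent as well.
                if keep:
                    valid = False
                    break
                continue
            prob *= p if keep else (1.0 - p)
            if keep:
                present.add(node)
        if valid and prob > 0.0:
            yield present, prob


def topk_probabilities(tree, scores, k):
    """Sum, per answer, the probability of the worlds where it ranks in the top k."""
    result = {answer: 0.0 for answer in scores}
    for present, prob in possible_worlds(tree):
        ranked = sorted((a for a in scores if a in present),
                        key=lambda a: scores[a], reverse=True)
        for answer in ranked[:k]:
            result[answer] += prob
    return result


if __name__ == "__main__":
    print(topk_probabilities(TREE, SCORES, k=1))
```

In this toy tree the existence of `b` depends on its ancestor `a`, so the answers are not independent tuples; this is the kind of dependency the abstract says prevents tuple-based dynamic programming from being applied directly. The enumeration is exponential in the number of nodes, so it is usable only on very small inputs.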