Querying and ranking incomplete twigs in probabilistic XML

  • Authors:
  • Jian Liu;Z. M. Ma;Li Yan

  • Affiliations:
  • School of Information Science & Engineering, Northeastern University, Shenyang, China 110819;School of Information Science & Engineering, Northeastern University, Shenyang, China 110819;School of Software, Northeastern University, Shenyang, China 110819

  • Venue:
  • World Wide Web
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

As the next generation language of the Internet, XML has been the de-facto standard of information exchange over the web. A core operation for XML query processing is to find all the occurrences of a twig pattern in an XML database. In addition, the study of probabilistic data has become an emerging topic for various applications on the Web. Therefore, researching the combination of XML twig pattern and probabilistic data is quite significant. In prior work of probabilistic XML, the answers of a given twig query are always complete. However, complete answers with low probabilities may be deemed irrelevant while incomplete answers with high probabilities are of great significance because incomplete answers may be the potential answers that interest the users. Different from complete evaluation, evaluating incomplete twigs in probabilistic XML introduces some new challenges. On one hand, incomplete queries do not only obtain complete matches, but also return answers that contain considerable incomplete matches. On the other hand, the processing of incomplete evaluation is more complicated. It is obvious that a ranking approach should be adopted along with evaluating incomplete answers. In this paper, we propose an efficient algorithm to handle the problem of querying incomplete twigs over the probabilistic XML database. We also present a novel algorithm for ranking the incomplete answers. The experimental results show that our proposed algorithms can improve the performance of querying and ranking incomplete twigs significantly.