Matching top-k answers of twig patterns in probabilistic XML

Authors:
Bo Ning;Chengfei Liu;Jeffrey Xu Yu;Guoren Wang;Jianxin Li
Affiliations:
Swinburne University of Technology, Melbourne, Australia;Swinburne University of Technology, Melbourne, Australia;The Chinese University of Hong Kong, Hong Kong, China;Northeastern University, Shenyang, China;Swinburne University of Technology, Melbourne, Australia
Venue:
DASFAA'10 Proceedings of the 15th international conference on Database Systems for Advanced Applications - Volume Part I
Year:
2010

Abstract

The flexibility of the XML data model allows a more natural representation of uncertain data than the relational model does. Matching a twig pattern against probabilistic XML data and returning the top-k answers is an essential operation. Some classical twig pattern algorithms can be adapted to process probabilistic XML. However, when the goal is to find the answers with the top-k probabilities, these existing algorithms perform poorly, because they must process many unnecessary intermediate path results that have small probabilities. To cope with this problem, we propose a new encoding scheme for probabilistic XML, called PEDewey. Based on this encoding scheme, we then design two algorithms that find the answers with the top-k probabilities for twig queries. The first, ProTJFast, processes probabilistic XML data based on element streams in document order; the second, PTopKTwig, is based on element streams ordered by path probability values. Experiments have been conducted to study the performance of these algorithms.
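
The intuition behind ordering work by path probability can be illustrated with a small sketch. It is not PEDewey, ProTJFast, or PTopKTwig, whose details the abstract does not give. It assumes a simplified model in which every node exists independently with a given probability conditioned on its parent, so that an answer's probability is the product over all nodes on the root paths of its matched nodes and can never exceed the smallest path probability among those nodes. The toy tree, the twig a[b][c], the pre-enumerated answers, and all function names are hypothetical.

```python
import heapq
from math import prod

# Hypothetical toy document: node -> (parent, P(node exists | parent exists)).
TREE = {
    "r":  (None, 1.0),
    "a1": ("r", 0.9),
    "a2": ("r", 0.3),
    "b1": ("a1", 0.8),
    "b3": ("a1", 0.7),
    "c1": ("a1", 0.5),
    "b2": ("a2", 0.9),
    "c2": ("a2", 0.9),
}

def root_path(node):
    """Nodes from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = TREE[node][0]
    return path

def path_probability(node):
    """Probability that `node` and all of its ancestors exist."""
    return prod(TREE[n][1] for n in root_path(node))

def answer_probability(answer):
    """Probability that all matched nodes exist (shared ancestors counted once)."""
    needed = {n for node in answer for n in root_path(node)}
    return prod(TREE[n][1] for n in needed)

def upper_bound(answer):
    """An answer cannot be more probable than its least probable root path."""
    return min(path_probability(n) for n in answer)

def top_k(answers, k):
    """Return the k most probable answers and how many were fully evaluated."""
    if k <= 0:
        return [], 0
    heap = []        # min-heap of (probability, answer); heap[0] is the k-th best
    evaluated = 0
    ranked = sorted(((upper_bound(a), a) for a in answers), reverse=True)
    for bound, answer in ranked:
        # Every remaining answer has a bound no larger than this one,
        # so none of them can displace the current k-th best.
        if len(heap) == k and bound <= heap[0][0]:
            break
        evaluated += 1
        p = answer_probability(answer)
        if len(heap) < k:
            heapq.heappush(heap, (p, answer))
        elif p > heap[0][0]:
            heapq.heapreplace(heap, (p, answer))
    return sorted(heap, reverse=True), evaluated

# Matches of the hypothetical twig a[b][c], enumerated by hand.
ANSWERS = [("a1", "b1", "c1"), ("a1", "b3", "c1"), ("a2", "b2", "c2")]
best, evaluated = top_k(ANSWERS, k=2)
```

With k=2, the sketch returns the two answers under a1 and stops before computing the probability of the answer under a2, because that answer's bound already falls below the second-best probability found. A real system would avoid enumerating the candidate answers up front; the sketch only shows why low-probability paths can be skipped.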