The flexibility of the XML data model allows a more natural representation of uncertain data than the relational model does. Top-k matching of a twig pattern against probabilistic XML data is an essential operation. Some classical twig pattern algorithms can be adapted to process probabilistic XML. However, when the goal is to find the answers with the top-k probabilities, the existing algorithms perform poorly because they must process many unnecessary intermediate path results that have small probabilities. To address this problem, we propose a new encoding scheme for probabilistic XML called PEDewey. Based on this encoding scheme, we design two algorithms that find the answers with the top-k probabilities for twig queries. The first, ProTJFast, processes probabilistic XML data using element streams in document order; the second, PTopKTwig, uses element streams ordered by path probability values. Experiments have been conducted to study the performance of these algorithms.
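
To illustrate why ordering a stream by path probability can avoid processing low-probability intermediate results, the following is a minimal sketch and not the PTopKTwig algorithm itself. It assumes a single stream of candidate paths sorted by descending path probability, and it assumes that an answer's exact probability never exceeds the probability of the path it is built from. The names `top_k_answers` and `answer_prob`, and the sample data, are hypothetical.

```python
import heapq


def top_k_answers(paths, k, answer_prob):
    """Return the k most probable answers and the number of paths evaluated.

    paths: iterable of (path_id, path_prob), sorted by descending path_prob.
    answer_prob: hypothetical callback giving the exact answer probability
        for a path; assumed to be <= that path's path_prob.
    """
    heap = []       # min-heap of (answer probability, path_id), size <= k
    evaluated = 0   # how many paths needed a full evaluation
    for path_id, path_prob in paths:
        # Once k answers are held, a path whose upper bound cannot beat
        # the k-th best answer ends the scan: later paths are no better.
        if len(heap) == k and path_prob <= heap[0][0]:
            break
        evaluated += 1
        p = answer_prob(path_id)
        if len(heap) < k:
            heapq.heappush(heap, (p, path_id))
        elif p > heap[0][0]:
            heapq.heapreplace(heap, (p, path_id))
    return sorted(heap, reverse=True), evaluated


# Made-up data: path probabilities (upper bounds) and exact answer probabilities.
paths = [("a", 0.9), ("b", 0.8), ("c", 0.5), ("d", 0.3), ("e", 0.1)]
exact = {"a": 0.45, "b": 0.8, "c": 0.5, "d": 0.2, "e": 0.1}

answers, evaluated = top_k_answers(paths, 2, exact.get)
print(answers, evaluated)
```

In this toy run the scan stops at path `d`, so only three of the five paths are evaluated. A stream in document order offers no such stopping point, since a high-probability path may appear anywhere in it.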