Estimating alphanumeric selectivity in the presence of wildcards
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
Substring selectivity estimation
PODS '99 Proceedings of the eighteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Selectively estimation for Boolean queries
PODS '00 Proceedings of the nineteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Statistical synopses for graph-structured XML databases
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Estimating Answer Sizes for XML Queries
EDBT '02 Proceedings of the 8th International Conference on Extending Database Technology: Advances in Database Technology
Counting Twig Matches in a Tree
Proceedings of the 17th International Conference on Data Engineering
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Multi-Dimensional Substring Selectivity Estimation
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Estimating the Selectivity of XML Path Expressions for Internet Scale Applications
Proceedings of the 27th International Conference on Very Large Data Bases
Containment join size estimation: models and methods
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Selectivity Estimation for XML Twigs
ICDE '04 Proceedings of the 20th International Conference on Data Engineering
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
XPathLearner: an on-line self-tuning Markov histogram for XML path selectivity estimation
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Structure and value synopses for XML data graphs
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Bloom histogram: path selectivity estimation for XML data with updates
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Graph summaries for subgraph frequency estimation
ESWC'08 Proceedings of the 5th European semantic web conference on The semantic web: research and applications
Hi-index | 0.00 |
In this paper we present a novel approach for estimating the selectivity of XML twig queries. Such a technique is useful for answering approximate queries as well as for determining an optimal query plan for complex queries based on said estimates. Our approach relies on a summary structure that contains the occurrence statistics of small twigs. We rely on a novel probabilistic approach for decomposing larger twig queries into smaller ones. We then show how it can be used to estimate the selectivity of the larger query in conjunction with the summary information. We present and evaluate different strategies for decomposition and compare this work against a state-of-the-art selectivity estimation approach on synthetic and real datasets. The experimental results show that our proposed approach is very effective in estimating the selectivity of XML twig queries.