Processing aggregate relational queries with hard time constraints
SIGMOD '89 Proceedings of the 1989 ACM SIGMOD international conference on Management of data
Practical selectivity estimation through adaptive sampling
SIGMOD '90 Proceedings of the 1990 ACM SIGMOD international conference on Management of data
Fixed-precision estimation of join selectivity
PODS '93 Proceedings of the twelfth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Bifocal sampling for skew-resistant join size estimation
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Statistical estimators for relational algebra expressions
Proceedings of the seventh ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Holistic twig joins: optimal XML pattern matching
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Statistical synopses for graph-structured XML databases
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Counting Twig Matches in a Tree
Proceedings of the 17th International Conference on Data Engineering
Sampling-Based Estimation of the Number of Distinct Values of an Attribute
VLDB '95 Proceedings of the 21th International Conference on Very Large Data Bases
The XML benchmark project
Containment join size estimation: models and methods
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Selectivity Estimation for XML Twigs
ICDE '04 Proceedings of the 20th International Conference on Data Engineering
Efficient processing of XML twig queries with OR-predicates
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
On boosting holism in XML twig pattern matching using structural indexing techniques
Proceedings of the 2005 ACM SIGMOD international conference on Management of data
From region encoding to extended dewey: on efficient processing of XML twig pattern matching
VLDB '05 Proceedings of the 31st international conference on Very large data bases
Tree-pattern queries on a lightweight XML processor
VLDB '05 Proceedings of the 31st international conference on Very large data bases
XSEED: Accurate and Fast Cardinality Estimation for XPath Queries
ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
Estimating XML Structural Join Size Quickly and Economically
ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
Introduction to Probability Models, Ninth Edition
Introduction to Probability Models, Ninth Edition
Structure and value synopses for XML data graphs
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
The VLDB Journal — The International Journal on Very Large Data Bases
Hi-index | 0.00 |
As the Extensible Markup Language (XML) rapidly establishes itself as the de facto standard for presenting, storing, and exchanging data on the Internet, large volume of XML data and their supporting facilities start to surface. A fast and accurate selectivity estimation mechanism is of practical importance because selectivity estimation plays a fundamental role in XML query optimization. Recently proposed techniques are all based on some forms of structure synopses that could be time-consuming to build and not effective for summarizing complex structure relationships. In this research, we propose an innovative sampling method that can capture the tree structures and intricate relationships among nodes in a simple and effective way. The derived sample tree is stored as a synopsis for selectivity estimation. Extensive experimental results show that, in comparison with the state-of-the-art structure synopses, specifically the TreeSketch and Xseed synopses, our sample tree synopsis applies to a broader range of query types, requires several orders of magnitude less construction time, and generates estimates with considerably better precision for complex datasets.