Enabling XPath Optional Axes Cardinality Estimation Using Path Synopses

Authors:
Yury Soldak; Maxim Lukichev
Affiliations:
Department of Computer Science, University of Saint-Petersburg, Russian Federation; Department of Computer Science, University of Saint-Petersburg, Russian Federation
Venue:
ADBIS '08 Proceedings of the 12th East European conference on Advances in Databases and Information Systems
Year:
2008

Abstract

Effective support for XML query languages is becoming increasingly important with the emergence of new applications that access large volumes of XML data. Efficient query execution, especially in the distributed case, requires estimating the cardinalities of path expressions. In this paper, we propose two novel techniques for estimating the cardinality of simple path expressions with optional axes (following/preceding): document order grouping (DG) and neighborhood grouping (NG). Both techniques summarize the structure of the source XML data in compact graph structures (path synopses) and use these summaries for cardinality estimation. We experimentally evaluated the accuracy of the techniques and the size of the summaries, and studied the performance of the prototypes. A wide range of source data was used to study the behavior of the structures and the range of applicability of the techniques.
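
As a rough illustration of the general idea of a path synopsis, the following minimal Python sketch builds a simple label-path summary and uses it to look up the cardinality of child-axis paths. It is not the DG or NG technique proposed in the paper: the function names `build_path_synopsis` and `estimate_cardinality`, the sample document, and the choice of a flat path-to-count map are all assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def build_path_synopsis(xml_text):
    """Count how many elements share each root-to-node label path.

    Hypothetical helper: a plain label-path summary, not the paper's DG/NG.
    """
    root = ET.fromstring(xml_text)
    counts = Counter()
    # Iterative depth-first traversal; each stack entry is (element, label path).
    stack = [(root, "/" + root.tag)]
    while stack:
        node, path = stack.pop()
        counts[path] += 1
        for child in node:
            stack.append((child, path + "/" + child.tag))
    return counts


def estimate_cardinality(synopsis, path):
    """Return the stored count for a child-axis path, or 0 if it is absent."""
    return synopsis.get(path, 0)


# Illustrative document (an assumption, not data from the paper).
doc = "<lib><book><title/><author/><author/></book><book><title/></book></lib>"
synopsis = build_path_synopsis(doc)
authors = estimate_cardinality(synopsis, "/lib/book/author")
books = estimate_cardinality(synopsis, "/lib/book")
missing = estimate_cardinality(synopsis, "/lib/journal")
```

A summary of this kind stores no document-order information, so by itself it cannot answer steps along the following or preceding axes; that is the gap the two techniques in the abstract are aimed at.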