Query caching and optimization in distributed mediator systems
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
An adaptive query execution system for data integration
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Statistical synopses for graph-structured XML databases
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Counting Twig Matches in a Tree
Proceedings of the 17th International Conference on Data Engineering
Cost Models DO Matter: Providing Cost Information for Diverse Data Sources in a Federated System
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Proceedings of the 27th International Conference on Very Large Data Bases
Estimating the Selectivity of XML Path Expressions for Internet Scale Applications
Proceedings of the 27th International Conference on Very Large Data Bases
Leveraging Mediator Cost Models with Heterogeneous Data Sources
ICDE '98 Proceedings of the Fourteenth International Conference on Data Engineering
XPathLearner: an on-line self-tuning Markov histogram for XML path selectivity estimation
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Structure and value synopses for XML data graphs
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
XSKETCH synopses for XML data graphs
ACM Transactions on Database Systems (TODS)
Hi-index | 0.00 |
There have been several techniques proposed for building statistics for static XML data. However, very little work has been done in the area of building XML statistics for data sources that export XML views of data that is stored in relational or other databases. For such data sources, we need statistics that are built in an on-line manner, by observing the XML queries to the data sources and their results. In this paper, we present a technique for building on-line XML statistics by observing the XPath queries issued to a data source and their result sizes. These XPath queries select parts of the virtual XML document representing the XML view of the data at the data source. We convert these XPath queries to a more abstract and generalized form that we call annotated path expressions. We present a technique for storing these annotated path expressions and information about their selectivity for use in estimating the selectivity of future XPath queries. We also present an experimental evaluation of our proposed approach.