Developing cost models for query optimization is significantly harder for XML queries than for traditional relational queries, because XML query operators are much more complex than relational operators such as table scans and joins. In this paper, we propose a new approach, called COMET, to modeling the cost of XML operators; to our knowledge, COMET is the first method proposed for the XML query costing problem. As in relational cost estimation, COMET exploits a set of system catalog statistics that summarizes the XML data; the set of "simple path" statistics that we propose is new and is well suited to the XML setting. Unlike the traditional approach, COMET predicts the overall cost with a new statistical learning technique called "transform regression" rather than with detailed analytical models. Besides rendering the cost estimation problem tractable for XML queries, COMET has the further advantage of enabling the query optimizer to be self-tuning, automatically adapting to changes over time in the query workload and in the system environment. We demonstrate COMET's feasibility by developing a cost model for the recently proposed XNAV navigational operator. Empirical studies with synthetic, benchmark, and real-world data sets show that COMET can quickly obtain accurate cost estimates for a variety of XML queries and data sets.
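The general idea of learning an operator's cost from observed executions, rather than deriving it analytically, can be illustrated with a minimal sketch. This is not COMET's method: it substitutes ordinary least squares for transform regression, and the two features (nodes visited, result size), the training data, and all function names are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence


def solve_linear_system(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations update both sides together.
    m = [list(map(float, a[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: features are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_cost_model(features: Sequence[Sequence[float]], costs: Sequence[float]) -> List[float]:
    """Fit cost ~ w0 + w1*f1 + ... + wd*fd by least squares (normal equations)."""
    rows = [[1.0] + list(f) for f in features]  # leading 1.0 is the intercept term
    d = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(d)] for i in range(d)]
    xty = [sum(r[i] * c for r, c in zip(rows, costs)) for i in range(d)]
    return solve_linear_system(xtx, xty)


def predict_cost(weights: Sequence[float], feature_vector: Sequence[float]) -> float:
    """Estimate the cost of one operator invocation from its feature vector."""
    return weights[0] + sum(w * f for w, f in zip(weights[1:], feature_vector))


# Hypothetical observations: (nodes visited, result size) for past executions.
training_features = [(100, 10), (200, 10), (100, 50), (400, 80), (300, 20)]
# Synthetic "observed" costs generated from a known linear rule, so the fit is checkable.
training_costs = [5.0 + 0.02 * n + 0.1 * r for n, r in training_features]

weights = fit_cost_model(training_features, training_costs)
estimate = predict_cost(weights, (250, 40))
```

Because the synthetic costs here are exactly linear in the features, the fitted weights recover the generating rule and `estimate` comes out close to 14.0. Refitting as new (features, cost) observations arrive is one simple way to picture the self-tuning behavior described above, though a linear model would not capture the nonlinear cost behavior that motivates a richer learner.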