An integrated method for estimating selectivities in a multidatabase system

Authors:
Qiang Zhu
Affiliations:
University of Waterloo, Ontario, Canada
Venue:
CASCON '93 Proceedings of the 1993 conference of the Centre for Advanced Studies on Collaborative research: distributed computing - Volume 2
Year:
1993

References (16)

Equi-depth multidimensional histograms. SIGMOD '88 Proceedings of the 1988 ACM SIGMOD international conference on Management of data
Estimating the size of generalized transitive closures. VLDB '89 Proceedings of the 15th international conference on Very large data bases
Random sampling from B+ trees. VLDB '89 Proceedings of the 15th international conference on Very large data bases
Practical selectivity estimation through adaptive sampling. SIGMOD '90 Proceedings of the 1990 ACM SIGMOD international conference on Management of data
Error-constrained COUNT query evaluation in relational databases. SIGMOD '91 Proceedings of the 1991 ACM SIGMOD international conference on Management of data
Sequential sampling procedures for query size estimation. SIGMOD '92 Proceedings of the 1992 ACM SIGMOD international conference on Management of data
A Taxonomy and Current Issues in Multidatabase Systems. Computer
On global multidatabase query optimization. ACM SIGMOD Record
A supplement to sampling-based methods for query size estimation in a database system. ACM SIGMOD Record
Query size estimation by adaptive sampling (extended abstract). PODS '90 Proceedings of the ninth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Statistical estimators for relational algebra expressions. Proceedings of the seventh ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Access path selection in a relational database management system. SIGMOD '79 Proceedings of the 1979 ACM SIGMOD international conference on Management of data
Accurate estimation of the number of tuples satisfying a condition. SIGMOD '84 Proceedings of the 1984 ACM SIGMOD international conference on Management of data
Adaptive Techniques for Distributed Query Optimization. Proceedings of the Second International Conference on Data Engineering
Query Optimization in a Heterogeneous DBMS. VLDB '92 Proceedings of the 18th International Conference on Very Large Data Bases
Query optimization in multidatabase systems. CASCON '92 Proceedings of the 1992 conference of the Centre for Advanced Studies on Collaborative research - Volume 2

Cited by (6)

The CORDS multidatabase project. IBM Systems Journal
Solving Local Cost Estimation Problem for Global Query Optimization in Multidatabase Systems. Distributed and Parallel Databases
A piggyback method to collect statistics for query optimization in database management systems. CASCON '98 Proceedings of the 1998 conference of the Centre for Advanced Studies on Collaborative research
CORDS multidatabase project: research and prototype overview. CASCON '93 Proceedings of the 1993 conference of the Centre for Advanced Studies on Collaborative research: distributed computing - Volume 2
Query optimization using fuzzy set theory for multidatabase systems. CASCON '93 Proceedings of the 1993 conference of the Centre for Advanced Studies on Collaborative research: distributed computing - Volume 2
Query Size Estimation for Joins Using Systematic Sampling. Distributed and Parallel Databases

Abstract

A multidatabase system (MDBS) integrates information from autonomous local databases managed by different database management systems (DBMSs) in a distributed environment. Query optimization in such an MDBS raises a number of challenges. One of the major challenges is that some local optimization information may not be available at the global level. We recently proposed a query sampling method to derive cost estimation formulas for local databases in an MDBS [22]. To use the derived formulas to estimate the costs of queries, we need to know the selectivities of the qualifications of the queries. Unfortunately, existing methods for estimating selectivities cannot be used efficiently in an MDBS environment. This paper discusses the difficulties of estimating selectivities in an MDBS and, based on that discussion, presents an integrated method to estimate them. The method integrates and extends several existing methods so that they can be used efficiently in an MDBS. It extends Christodoulakis's parametric method so that estimation accuracy is improved and more types of queries can be handled. It extends Lipton and Naughton's adaptive sampling method so that both performance and accuracy are improved. Theoretical and experimental results show that the extended Lipton and Naughton method described in this paper can be many times faster than the original one. In addition, the integrated method uses a new piggyback approach to collect and maintain statistics, which can reduce the cost of maintaining statistics. The integrated method is designed for the MDBS in the CORDS project (CORDS-MDBS). Implementation considerations are also given in the paper.
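
To make the notion of sampling-based selectivity estimation concrete, the following is a minimal sketch of the general adaptive sampling idea: draw random tuples until enough qualifying tuples have been seen or a sampling budget is exhausted, then report the fraction that qualified. It is not the extended method described in the paper, whose details the abstract does not give. The function name `estimate_selectivity`, the parameters `hit_threshold` and `max_samples`, and the in-memory list standing in for a local database are all assumptions made for illustration; in an MDBS each sample would instead be obtained through the local DBMS's query interface.

```python
import random
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def estimate_selectivity(
    table: Sequence[T],
    predicate: Callable[[T], bool],
    hit_threshold: int = 50,      # hypothetical stopping rule: stop after this many hits
    max_samples: int = 2000,      # hypothetical budget: never draw more than this
    rng: Optional[random.Random] = None,
) -> float:
    """Estimate the fraction of tuples in `table` that satisfy `predicate`.

    Simplified illustration of adaptive sampling: the sample size is not
    fixed in advance but depends on how many qualifying tuples are observed.
    """
    if not table:
        return 0.0
    rng = rng or random.Random()
    hits = 0
    samples = 0
    # Sample with replacement until one of the two stopping conditions holds.
    while hits < hit_threshold and samples < max_samples:
        row = table[rng.randrange(len(table))]
        samples += 1
        if predicate(row):
            hits += 1
    return hits / samples


if __name__ == "__main__":
    # Toy "local relation": integers 0..9999; qualification: value < 2500.
    relation = list(range(10_000))
    estimate = estimate_selectivity(relation, lambda v: v < 2500)
    print(f"estimated selectivity: {estimate:.3f}")
```

The adaptive stopping rule means highly selective qualifications (few matches) consume more of the sampling budget than unselective ones, which is one reason sampling cost matters when every sample is a request to an autonomous local database.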