Solving Local Cost Estimation Problem for Global Query Optimization in Multidatabase Systems

  • Authors:
  • Qiang Zhu; Per-Åke Larson

  • Affiliations:
  • Department of Computer and Information Science, The University of Michigan - Dearborn, Dearborn, MI 48128, USA. Email: qzhu@umich.edu; Department of Computer Science, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada. Email: palarson@microsoft.com

  • Venue:
  • Distributed and Parallel Databases
  • Year:
  • 1998

Abstract

To meet users' growing need to access pre-existing heterogeneous databases, multidatabase systems (MDBSs), which integrate multiple databases, have attracted considerable research attention. A key feature of an MDBS is local autonomy. For a query retrieving data from multiple databases, global query optimization should be performed to achieve good system performance. Global query optimization in an MDBS poses a number of new challenges. A major one is that some local optimization information, such as local cost parameters, may not be available at the global level because of local autonomy. This makes it difficult to find a good decomposition of a global query during query optimization. To tackle this challenge, this paper proposes a new query sampling method. The idea is to group component queries into homogeneous classes, draw a sample of queries from each class, and use the observed costs of the sample queries to derive a cost formula for each class by multiple regression. The derived formulas can then be used to estimate the cost of a query during query optimization. Relevant issues, such as query classification rules, sampling procedures, and cost model development and validation, are explored. To verify the feasibility of the method, experiments were conducted on three commercial database management systems supported in an MDBS. The experimental results demonstrate that the proposed method is quite promising for estimating local cost parameters in an MDBS.
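
The pipeline described in the abstract (classify queries, sample each class, regress observed cost on explanatory variables) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the class labels, the two explanatory variables (operand size and result size, in thousands of tuples), the coefficient values, and the helper names fit_cost_formula, derive_class_formulas, and estimate_cost. The "observed" costs are synthetic and noise-free; a real study would measure them by running sample queries on the local database system.

```python
import random


def fit_cost_formula(samples):
    """Least-squares fit of cost = b0 + b1*x1 + ... + bk*xk for ONE query class.

    samples: list of (features, observed_cost) pairs.
    Returns the coefficient list [b0, b1, ..., bk].
    """
    rows = [[1.0] + list(features) for features, _ in samples]  # intercept column
    costs = [cost for _, cost in samples]
    k = len(rows[0])
    # Augmented normal equations: [X^T X | X^T y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(rows, costs))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def derive_class_formulas(observations):
    """Group (class_label, features, cost) triples by class and fit each class."""
    by_class = {}
    for label, features, cost in observations:
        by_class.setdefault(label, []).append((features, cost))
    return {label: fit_cost_formula(s) for label, s in by_class.items()}


def estimate_cost(coeffs, features):
    """Apply a derived cost formula to a new query's explanatory variables."""
    return coeffs[0] + sum(b * x for b, x in zip(coeffs[1:], features))


# Hypothetical "true" behaviour of a local DBMS, used only to generate data.
TRUE_FORMULAS = {
    "unary_seq_scan": [0.5, 0.020, 0.010],
    "unary_index_scan": [0.1, 0.002, 0.015],
}

rng = random.Random(0)
observations = []
for label, (b0, b1, b2) in TRUE_FORMULAS.items():
    for _ in range(30):  # sample of 30 queries per class
        operand_size = rng.uniform(1.0, 100.0)  # thousands of tuples
        result_size = operand_size * rng.uniform(0.0, 1.0)
        cost = b0 + b1 * operand_size + b2 * result_size  # noise-free for clarity
        observations.append((label, (operand_size, result_size), cost))

formulas = derive_class_formulas(observations)
estimate = estimate_cost(formulas["unary_seq_scan"], (40.0, 10.0))
print(round(estimate, 4))
```

Fitting one formula per class, rather than one formula for all queries, reflects the abstract's point that the classes should be homogeneous: queries likely to be executed in the same way are expected to share cost coefficients. With measured costs the fit would be inexact, which is why the abstract lists validation of the derived cost models as a separate issue.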