A piggyback method to collect statistics for query optimization in database management systems

Authors:
Qiang Zhu;Brian Dunkel;Nandit Soparkar;Suyun Chen;Berni Schiefer;Tony Lai
Affiliations:
Department of Computer and Information Science, The University of Michigan, Dearborn, MI;Department of Electrical Engineering and Computer Science, The University of Michigan, Ann Arbor, MI;Department of Electrical Engineering and Computer Science, The University of Michigan, Ann Arbor, MI;IBM Toronto Laboratory, North York, Ontario, M3C 1H7, Canada;IBM Toronto Laboratory, North York, Ontario, M3C 1H7, Canada;IBM Toronto Laboratory, North York, Ontario, M3C 1H7, Canada
Venue:
CASCON '98 Proceedings of the 1998 conference of the Centre for Advanced Studies on Collaborative research
Year:
1998

Abstract

A database management system (DBMS) performs query optimization based on statistical information about the data in the underlying database. Out-of-date statistics may lead to inefficient query processing. Existing solutions to this problem have drawbacks such as a heavy administrative burden, high system load, and tardy updates. To overcome these drawbacks, this paper proposes a new approach called the piggyback method. The key idea is to piggyback some additional retrievals onto the processing of a user query in order to collect more up-to-date statistics, which are then used to optimize subsequent queries. To specify piggybacked queries, basic piggybacking operators are defined. Using these operators, several types of piggybacking are introduced: vertical, horizontal, mixed vertical and horizontal, and multi-query piggybacking. The statistics that can be obtained from different access methods by applying piggyback analysis during query processing are also studied. To meet users' different requirements for the associated overhead, several piggybacking levels are suggested. Other related issues, including initial statistics, piggybacking time, and parallelism, are also discussed. Our analysis shows that the piggyback method is promising both for improving the quality of query optimization in a DBMS and for reducing the user's administrative burden of maintaining an efficient DBMS.
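
The key idea can be illustrated with a minimal sketch. It is not the paper's operators or algorithm: it only assumes a full sequential scan over an in-memory list of rows, and every name in it (`ColumnStats`, `scan_with_piggyback`, the `employees` sample data) is hypothetical. While the scan answers the user's query, it also inspects an extra column and records a row count, minimum, maximum, and distinct values for later use by an optimizer.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Iterable, List, Optional, Set, Tuple

Row = Dict[str, Any]


@dataclass
class ColumnStats:
    """Running statistics for one column (hypothetical structure)."""
    row_count: int = 0
    minimum: Optional[Any] = None
    maximum: Optional[Any] = None
    # An exact set is kept for simplicity; it grows with the data, so a real
    # system would likely bound this overhead (for example by sampling).
    distinct: Set[Any] = field(default_factory=set)

    def observe(self, value: Any) -> None:
        """Fold one scanned value into the running statistics."""
        self.row_count += 1
        if self.minimum is None or value < self.minimum:
            self.minimum = value
        if self.maximum is None or value > self.maximum:
            self.maximum = value
        self.distinct.add(value)


def scan_with_piggyback(
    rows: Iterable[Row],
    predicate: Callable[[Row], bool],
    piggyback_columns: List[str],
) -> Tuple[List[Row], Dict[str, ColumnStats]]:
    """Answer the user query and collect statistics in the same pass.

    Statistics cover every scanned row, not only the rows that satisfy the
    predicate, so they describe the whole table under a full scan.
    """
    stats = {col: ColumnStats() for col in piggyback_columns}
    result: List[Row] = []
    for row in rows:
        for col in piggyback_columns:
            stats[col].observe(row[col])  # the piggybacked work
        if predicate(row):                # the user's own query
            result.append(row)
    return result, stats


# Sample data, invented for illustration only.
employees = [
    {"id": 1, "dept": "A", "salary": 50},
    {"id": 2, "dept": "B", "salary": 70},
    {"id": 3, "dept": "A", "salary": 60},
]

# The user asks for department A; salary statistics are collected on the side.
result, stats = scan_with_piggyback(
    employees, lambda r: r["dept"] == "A", ["salary"]
)
print(len(result), stats["salary"].minimum, stats["salary"].maximum)
```

The cost of the extra work is one `observe` call per piggybacked column per scanned row, which is the kind of overhead the suggested piggybacking levels are meant to let users control.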