Index scans using a finite LRU buffer: a validated I/O model

Authors:
Lothar F. Mackert; Guy M. Lohman
Affiliations:
IBM European Networking Center, Heidelberg, W. Germany; IBM Almaden Research Center, San Jose, CA
Venue:
ACM Transactions on Database Systems (TODS)
Year:
1989

Abstract

Indexes are commonly employed to retrieve a portion of a file or to retrieve its records in a particular order. An accurate performance model of indexes is essential to the design, analysis, and tuning of file management and database systems, and particularly to database query optimization. Many previous studies have addressed the problem of estimating the number of disk page fetches when randomly accessing k records out of N given records stored on T disk pages. This paper generalizes these results, relaxing two assumptions that usually do not hold in practice: unlimited buffer and unique records for each key value. Experiments show that the performance of an index scan is very sensitive to buffer size limitations and multiple records per key value. A model for these more practical situations is presented and a formula derived for estimating the performance of an index scan. We also give a closed-form approximation that is easy to compute. The theoretical results are validated using the R* distributed relational database system. Although we use database terminology throughout the paper, the model is more generally applicable whenever random accesses are made using keys.
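
The abstract does not state the paper's formula, so the sketch below does not reproduce the authors' model. It only illustrates the problem setting. Two classic unlimited-buffer estimates (usually attributed to Cardenas and to Yao) give the expected number of pages touched when k of N records on T pages are accessed at random, and a small LRU simulation shows how a finite buffer changes the fetch count. All function names are hypothetical, and the simulation assumes uniformly random page references with N divisible by T.

```python
import random
from collections import OrderedDict
from math import comb


def cardenas_pages(k: int, T: int) -> float:
    """Expected distinct pages hit by k random accesses (with replacement)
    over T pages; assumes an unlimited buffer."""
    return T * (1.0 - (1.0 - 1.0 / T) ** k)


def yao_pages(k: int, N: int, T: int) -> float:
    """Expected distinct pages hit when k distinct records are drawn without
    replacement from N records stored N/T per page; unlimited buffer.
    Assumes T divides N evenly."""
    per_page = N // T
    # Probability that one given page holds none of the k chosen records.
    p_untouched = comb(N - per_page, k) / comb(N, k)
    return T * (1.0 - p_untouched)


def simulate_lru_fetches(page_refs, buffer_pages: int) -> int:
    """Count page fetches for a reference string under an LRU buffer
    holding at most buffer_pages pages."""
    buffer = OrderedDict()  # keys ordered from least to most recently used
    fetches = 0
    for page in page_refs:
        if page in buffer:
            buffer.move_to_end(page)  # hit: mark as most recently used
        else:
            fetches += 1  # miss: page must be read from disk
            buffer[page] = True
            if len(buffer) > buffer_pages:
                buffer.popitem(last=False)  # evict least recently used
    return fetches


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    T, k = 1000, 5000
    refs = [rng.randrange(T) for _ in range(k)]
    print("Cardenas estimate:", round(cardenas_pages(k, T), 1))
    for b in (T, T // 2, T // 10):
        print(f"buffer={b:5d} pages -> fetches={simulate_lru_fetches(refs, b)}")
```

With a buffer of T pages the simulated count equals the number of distinct pages referenced, which is what the unlimited-buffer estimates predict. As the buffer shrinks, pages are evicted and fetched again, so the count rises toward k. This is the sensitivity to buffer size that the abstract describes.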