Increasing buffer-locality for multiple index based scans through intelligent placement and index scan speed control

Authors:
Christian A. Lang;Bishwaranjan Bhattacharjee;Tim Malkemus;Kwai Wong
Affiliations:
IBM T.J. Watson Research Center, Hawthorne, NY;IBM T.J. Watson Research Center, Hawthorne, NY;IBM T.J. Watson Research Center, Hawthorne, NY;IBM Toronto Lab, Markham, ON
Venue:
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Year:
2007



Abstract

Decision support systems are characterized by large concurrent scan operations. A significant percentage of these scans are executed as index-based scans of the data. This is especially true when the data is physically clustered on the index columns using the various clustering schemes employed by database engines. Common database management systems have only a limited ability to reuse buffer content across multiple running queries because they treat queries in isolation. Previous attempts to coordinate scans for better buffer reuse were limited to table scans. Attempts at index-based scan sharing were either nonexistent or less than satisfactory because the scans drifted apart. In this paper, we describe a mechanism that keeps scans using the same index close together in scan position during scanning. This is achieved by intelligently placing index scans at scan start time based on their scan ranges and speeds, and is then augmented by adaptive throttling of scan speeds based on the index scans' runtime behavior during execution. We discuss the challenges of doing this for index scans in comparison to the more common table scan sharing. We show that this can be done with minimal changes to an existing database management system, as demonstrated in our DB2 UDB prototype. Our experiments show significant gains in end-to-end response times and disk I/O for TPC-H workloads.
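
The two ideas named in the abstract, placement at scan start and throttling during execution, can be illustrated with a toy model. The sketch below is not the paper's algorithm or the DB2 UDB prototype. The `Scan` class, `choose_start`, `throttled_step`, the "join the scan with the closest speed" rule, and the `max_gap` parameter are all hypothetical names and policies chosen for illustration, and positions are abstract index-page numbers.

```python
from dataclasses import dataclass


@dataclass
class Scan:
    name: str
    position: float  # current position in the index (hypothetical page units)
    speed: float     # unthrottled pages advanced per tick


def choose_start(lo, hi, speed, running):
    """Pick a start position for a new scan over the range [lo, hi).

    Assumed policy: if running scans are currently inside the range, start
    next to the one whose speed is closest to ours, so the two are likely
    to stay together. Otherwise start at the beginning of the range.
    A real scan would later wrap around to cover [lo, start); that part
    is not modeled here.
    """
    inside = [s for s in running if lo <= s.position < hi]
    if not inside:
        return lo
    partner = min(inside, key=lambda s: abs(s.speed - speed))
    return partner.position


def throttled_step(scans, max_gap):
    """Advance all scans by one tick, slowing down any scan that would end
    up more than max_gap ahead of the slowest-progressing scan."""
    # Position the trailing scan will reach after this tick.
    floor = min(s.position + s.speed for s in scans)
    for s in scans:
        target = min(s.position + s.speed, floor + max_gap)
        # A scan that is already too far ahead waits; it never moves back.
        s.position = max(s.position, target)


if __name__ == "__main__":
    slow = Scan("slow", position=0.0, speed=1.0)
    fast = Scan("fast", position=0.0, speed=3.0)
    for _ in range(10):
        throttled_step([slow, fast], max_gap=4.0)
    print(slow.position, fast.position)
```

In this toy model the slow scan always runs at full speed, while the fast scan is held back once it gets `max_gap` pages ahead, so both keep touching nearby pages that are likely still in the buffer pool. Without the cap, the distance between the two scans would grow by 2 pages on every tick.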