B-tree indexes, interpolation search, and skew

Authors:
Goetz Graefe
Affiliations:
Microsoft
Venue:
DaMoN '06 Proceedings of the 2nd international workshop on Data management on new hardware
Year:
2006

References (18):
Order-preserving key transformations. ACM Transactions on Database Systems (TODS)
The Escrow transactional method. ACM Transactions on Database Systems (TODS)
The SB-tree: an index-sequential structure for high-performance sequential access. Acta Informatica
AlphaSort: a RISC machine sort. SIGMOD '94 Proceedings of the 1994 ACM SIGMOD international conference on Management of data
The five-minute rule ten years later, and other computer storage rules of thumb. ACM SIGMOD Record
B-tree page size when caching is considered. ACM SIGMOD Record
Prefix B-trees. ACM Transactions on Database Systems (TODS)
Ubiquitous B-Tree. ACM Computing Surveys (CSUR)
Interpolation search—a log logN search. Communications of the ACM
Distribution-dependent hashing functions and their characteristics. SIGMOD '75 Proceedings of the 1975 ACM SIGMOD international conference on Management of data
The evolution of effective B-tree: page organization and techniques: a personal account. ACM SIGMOD Record
Order Preserving Compression. ICDE '96 Proceedings of the Twelfth International Conference on Data Engineering
The Bounded Disorder Access Method. Proceedings of the Second International Conference on Data Engineering
B-Tree Indexes and CPU Caches. Proceedings of the 17th International Conference on Data Engineering
Database Architecture Optimized for the New Bottleneck: Memory Access. VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
DBMSs on a Modern Processor: Where Does Time Go? VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Transaction support for indexed summary views. SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
C-store: a column-oriented DBMS. VLDB '05 Proceedings of the 31st international conference on Very large data bases

Cited by (5):
Traverse: Simplified Indexing on Large Map-Reduce-Merge Clusters. DASFAA '09 Proceedings of the 14th International Conference on Database Systems for Advanced Applications
Fast Loads and Fast Queries. DaWaK '09 Proceedings of the 11th International Conference on Data Warehousing and Knowledge Discovery
A survey of B-tree logging and recovery techniques. ACM Transactions on Database Systems (TODS)
New algorithms for join and grouping operations. Computer Science - Research and Development
Modern B-Tree Techniques. Foundations and Trends in Databases

Abstract

Recent performance improvements in storage hardware have benefited bandwidth much more than latency. Among other implications, this trend favors large B-tree pages. Recent performance improvements in processor hardware also have benefited processing bandwidth much more than memory latency. Among other implications, this trend favors adding calculations if they save cache faults.

With small calculations guiding the search directly to the desired key, interpolation search complements these trends much better than binary search. It performs well if the distribution of key values is perfectly uniform, but it can be useless and even wasteful otherwise. This paper collects and describes more than a dozen techniques for interpolation search in B-tree indexes. Most of them attempt to avoid skew or to detect skew very early and then to avoid its bad effects. Some of these methods are part of the folklore of B-tree search, whereas other techniques are new. The purpose of this survey is to encourage research into such techniques and their performance on modern hardware.
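
To make the idea in the abstract concrete, the following is a minimal sketch of interpolation search over a sorted array of integer keys, such as the keys within one B-tree page. It is an illustration only, not one of the techniques collected in the paper: the function name `interpolation_search`, the `max_probes` parameter, and the choice to fall back to binary search after a fixed number of probes are all assumptions made for this example, the last being one simple way to limit the cost of skew.

```python
from bisect import bisect_left


def interpolation_search(keys, target, max_probes=3):
    """Return an index of target in the sorted list keys, or -1 if absent.

    Assumes integer keys in ascending order. After max_probes interpolation
    steps (a hypothetical tuning knob), the search switches to binary search
    on the remaining range, which bounds the work under a skewed distribution.
    """
    lo, hi = 0, len(keys) - 1
    probes = 0
    while lo <= hi and keys[lo] <= target <= keys[hi]:
        if keys[lo] == keys[hi]:
            # Every key in the range is equal, so target equals keys[lo].
            return lo
        if probes >= max_probes:
            # Interpolation is not converging quickly: fall back to binary search.
            i = bisect_left(keys, target, lo, hi + 1)
            return i if i <= hi and keys[i] == target else -1
        # Estimate the position assuming keys are spread uniformly in [lo, hi].
        pos = lo + (target - keys[lo]) * (hi - lo) // (keys[hi] - keys[lo])
        probes += 1
        if keys[pos] == target:
            return pos
        if keys[pos] < target:
            lo = pos + 1
        else:
            hi = pos - 1
    return -1


uniform = list(range(0, 1000, 10))      # perfectly uniform keys
skewed = [1, 2, 3, 4, 5, 1_000_000]     # one outlier distorts every estimate
print(interpolation_search(uniform, 370))
print(interpolation_search(skewed, 5))
```

On the uniform list the first estimate lands directly on the key. On the skewed list each estimate advances by only one slot, which is the wasteful behavior the abstract warns about, and the probe limit hands the remaining range to binary search.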