Fractal prefetching B+-Trees: optimizing both cache and disk performance

Authors:
Shimin Chen; Phillip B. Gibbons; Todd C. Mowry; Gary Valentin
Affiliations:
Carnegie Mellon University, Pittsburgh, PA; Bell Laboratories, Murray Hill, NJ; Carnegie Mellon University, Pittsburgh, PA; IBM Toronto Lab, Markham, Ontario, Canada
Venue:
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Year:
2002

Abstract

B+-Trees have been traditionally optimized for I/O performance with disk pages as tree nodes. Recently, researchers have proposed new types of B+-Trees optimized for CPU cache performance in main memory environments, where the tree node sizes are one or a few cache lines. Unfortunately, due primarily to this large discrepancy in optimal node sizes, existing disk-optimized B+-Trees suffer from poor cache performance while cache-optimized B+-Trees exhibit poor disk performance. In this paper, we propose fractal prefetching B+-Trees (fpB+-Trees), which embed "cache-optimized" trees within "disk-optimized" trees, in order to optimize both cache and I/O performance. We design and evaluate two approaches to breaking disk pages into cache-optimized nodes: disk-first and cache-first. These approaches are somewhat biased in favor of maximizing disk and cache performance, respectively, as demonstrated by our results. Both implementations of fpB+-Trees achieve dramatically better cache performance than disk-optimized B+-Trees: a factor of 1.1-1.8 improvement for search, up to a factor of 4.2 improvement for range scans, and up to a 20-fold improvement for updates, all without significant degradation of I/O performance. In addition, fpB+-Trees accelerate I/O performance for range scans by using jump-pointer arrays to prefetch leaf pages, thereby achieving a speed-up of 2.5-5 on IBM's DB2 Universal Database.
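
The abstract mentions that range scans use jump-pointer arrays to prefetch leaf pages. The following is a minimal sketch of that idea only, not the authors' implementation: it assumes a jump-pointer array is simply an ordered list of leaf page IDs, and it models I/O with a toy in-memory page store. The names `SimulatedDisk`, `range_scan`, `first_keys`, and the `distance` parameter are hypothetical and introduced purely for illustration.

```python
from bisect import bisect_right


class SimulatedDisk:
    """Toy page store (an assumption for illustration, not a real I/O layer).

    It records which reads were preceded by a prefetch request, so we can
    count how many reads would have had to wait for the disk.
    """

    def __init__(self, pages):
        self.pages = pages        # page_id -> sorted list of keys
        self.in_flight = set()    # pages requested ahead of their use
        self.demand_misses = 0    # reads that were not prefetched

    def prefetch(self, page_id):
        self.in_flight.add(page_id)

    def read(self, page_id):
        if page_id in self.in_flight:
            self.in_flight.discard(page_id)
        else:
            self.demand_misses += 1
        return self.pages[page_id]


def range_scan(disk, jump_pointers, first_keys, lo, hi, distance=2):
    """Return all keys in [lo, hi], prefetching `distance` leaf pages ahead.

    jump_pointers: leaf page IDs in key order (the assumed jump-pointer array).
    first_keys:    smallest key of each leaf, in the same order.
    """
    # Leaves that may hold keys in [lo, hi]: the last leaf starting at or
    # before lo, through the last leaf starting at or before hi.
    start = max(bisect_right(first_keys, lo) - 1, 0)
    end = max(bisect_right(first_keys, hi) - 1, 0)

    result = []
    next_prefetch = start + 1
    for i in range(start, end + 1):
        # Never "prefetch" the page we are about to read right now.
        next_prefetch = max(next_prefetch, i + 1)
        # Keep up to `distance` leaf pages requested ahead of the scan.
        while next_prefetch <= min(i + distance, end):
            disk.prefetch(jump_pointers[next_prefetch])
            next_prefetch += 1
        keys = disk.read(jump_pointers[i])
        result.extend(k for k in keys if lo <= k <= hi)
    return result


def make_disk():
    """Four leaf pages with three keys each (made-up example data)."""
    return SimulatedDisk({10: [1, 2, 3], 11: [4, 5, 6],
                          12: [7, 8, 9], 13: [10, 11, 12]})


JUMP_POINTERS = [10, 11, 12, 13]
FIRST_KEYS = [1, 4, 7, 10]
```

In this toy model, only the first leaf of a scan is read without a preceding prefetch when `distance` is at least 1; with `distance=0` every leaf read counts as a demand miss. Without such an array, a scan would typically learn the next leaf's ID only after reading the current leaf, which prevents requesting several leaves ahead. The sketch says nothing about how the array is maintained under updates.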