VAST-Tree: a vector-advanced and compressed structure for massive data tree traversal

Authors:
Takeshi Yamamuro;Makoto Onizuka;Toshio Hitaka;Masashi Yamamuro
Affiliations:
NTT Cyber Space Laboratories, NTT corporation;NTT Cyber Space Laboratories, NTT corporation;NTT Cyber Space Laboratories, NTT corporation;NTT Cyber Space Laboratories, NTT corporation
Venue:
Proceedings of the 15th International Conference on Extending Database Technology
Year:
2012

Citing 26
Cited 1

Prefix B-trees

ACM Transactions on Database Systems (TODS)
Making B+- trees cache conscious in main memory

SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Improving index performance through prefetching

SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
B-Tree Indexes and CPU Caches

Proceedings of the 17th International Conference on Data Engineering
A Study of Index Structures for Main Memory Database Management Systems

VLDB '86 Proceedings of the 12th International Conference on Very Large Data Bases
Cache Conscious Indexing for Decision-Support in Main Memory

VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Database Architecture Optimized for the New Bottleneck: Memory Access

VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
DBMSs on a Modern Processor: Where Does Time Go?

VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Effect of node size on the performance of cache-conscious B+-trees

SIGMETRICS '03 Proceedings of the 2003 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Fast computation of database operations using graphics processors

SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Super-Scalar RAM-CPU Cache Compression

ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
GPUTeraSort: high performance graphics co-processor sorting for large database management

Proceedings of the 2006 ACM SIGMOD international conference on Management of data
Buffering accesses to memory-resident index structures

VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Efficient implementation of sorting on multi-core SIMD CPU architecture

Proceedings of the VLDB Endowment
k-ary search on modern processors

Proceedings of the Fifth International Workshop on Data Management on New Hardware
Compressing term positions in web indexes

Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Designing efficient sorting algorithms for manycore GPUs

IPDPS '09 Proceedings of the 2009 IEEE International Symposium on Parallel&Distributed Processing
Database architecture evolution: mammals flourished long before dinosaurs became extinct

Proceedings of the VLDB Endowment
FAST: fast architecture sensitive tree search on modern CPUs and GPUs

Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Fast sort on CPUs and GPUs: a case for bandwidth oblivious SIMD sort

Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Parallel search on video cards

HotPar'09 Proceedings of the First USENIX conference on Hot topics in parallelism
VSEncoding: efficient coding and fast decoding of integer lists via dynamic programming

CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
Database compression on graphics processors

Proceedings of the VLDB Endowment
High-throughput transaction executions on graphics processors

Proceedings of the VLDB Endowment
Designing fast architecture-sensitive tree search on modern multicore/many-core processors

ACM Transactions on Database Systems (TODS)
Modern B-Tree Techniques

Foundations and Trends in Databases

Automatic vectorization of tree traversals

PACT '13 Proceedings of the 22nd international conference on Parallel architectures and compilation techniques

Quantified Score

Hi-index	0.00

Visualization

Abstract

We propose a compact and efficient index structure for massive data sets. Several indexing techniques are widely-used and well-known such as binary trees and B+trees. Unfortunately, we find that these techniques suffer major two shortcomings when applied to massive sets; first, their indices are so large they could overflow regular main memory, and, second, they suffer from a variety of penalties (e.g., conditional branches, low cache hits, and TLB misses), which restricts the number of instructions executed per processor cycle. Our state-of-the-art index structure, called VAST-Tree, classifies branch nodes into multiple layers. It applies existing techniques such as cache-conscious, aligned, and branch-free structures to the top layers of branch nodes in trees. Next, it applies the adaptive compression technique to save space and harness data parallelism with SIMD instructions to the middle and bottom layers of branch nodes. Moreover, a processor-friendly compression technique is applied to leaf nodes. The end result is that trees are much more compact and traversal efficiency is high. We implement a prototype and show its resulting index size and performance as compared to binary trees, and the hardware-conscious technique called FAST which currently offers the highest performance. Compared to current alternatives, VAST-Tree compacts the branch nodes by more than 95%, and the overall index size by 47-84% given that there are 230 keys. With 228 keys, it has roughly 6.0-times and 1.24-times throughput and saves the memory consumption by more than 94.7% and 40.5% as compared to binary trees and FAST, respectively.