Order-preserving minimal perfect hash functions and information retrieval
ACM Transactions on Information Systems (TOIS) - Special issue on research and development in information retrieval
ACM Computing Surveys (CSUR)
ACM Transactions on Database Systems (TODS)
Making B+- trees cache conscious in main memory
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
ACM Computing Surveys (CSUR)
Main-memory index structures with fixed-size partial keys
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Improving index performance through prefetching
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Implementing database operations using SIMD instructions
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Fractal prefetching B+-Trees: optimizing both cache and disk performance
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
IEEE Transactions on Knowledge and Data Engineering
On Sort-Merge Algorithm for Band Joins
IEEE Transactions on Knowledge and Data Engineering
Compressing Relations and Indexes
ICDE '98 Proceedings of the Fourteenth International Conference on Data Engineering
Proceedings of the 17th International Conference on Data Engineering
A Study of Index Structures for Main Memory Database Management Systems
VLDB '86 Proceedings of the 12th International Conference on Very Large Data Bases
Cache Conscious Indexing for Decision-Support in Main Memory
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Data Compression Support in Databases
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Effect of node size on the performance of cache-conscious B+-trees
SIGMETRICS '03 Proceedings of the 2003 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Super-Scalar RAM-CPU Cache Compression
ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
Integrating compression and execution in column-oriented database systems
Proceedings of the 2006 ACM SIGMOD international conference on Management of data
Improving instruction cache performance in OLTP
ACM Transactions on Database Systems (TODS)
How to barter bits for chronons: compression and bandwidth trade offs for database scans
Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Buffering accesses to memory-resident index structures
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Adaptive aggregation on chip multiprocessors
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Larrabee: a many-core x86 architecture for visual computing
ACM SIGGRAPH 2008 papers
Efficient implementation of sorting on multi-core SIMD CPU architecture
Proceedings of the VLDB Endowment
Dictionary-based order-preserving string compression for main memory column stores
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
Cache-conscious buffering for database operators with state
Proceedings of the Fifth International Workshop on Data Management on New Hardware
k-ary search on modern processors
Proceedings of the Fifth International Workshop on Data Management on New Hardware
Designing efficient sorting algorithms for manycore GPUs
IPDPS '09 Proceedings of the 2009 IEEE International Symposium on Parallel&Distributed Processing
Real-time parallel hashing on the GPU
ACM SIGGRAPH Asia 2009 papers
Sort vs. Hash revisited: fast join implementation on modern multi-core CPUs
Proceedings of the VLDB Endowment
SIMD-scan: ultra fast in-memory table scan using on-chip vector processing units
Proceedings of the VLDB Endowment
FAST: fast architecture sensitive tree search on modern CPUs and GPUs
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Parallel search on video cards
HotPar'09 Proceedings of the First USENIX conference on Hot topics in parallelism
VAST-Tree: a vector-advanced and compressed structure for massive data tree traversal
Proceedings of the 15th International Conference on Extending Database Technology
TJJE: An efficient algorithm for top-k join on massive data
Information Sciences: an International Journal
Hi-index | 0.00 |
In-memory tree structured index search is a fundamental database operation. Modern processors provide tremendous computing power by integrating multiple cores, each with wide vector units. There has been much work to exploit modern processor architectures for database primitives like scan, sort, join, and aggregation. However, unlike other primitives, tree search presents significant challenges due to irregular and unpredictable data accesses in tree traversal. In this article, we present FAST, an extremely fast architecture-sensitive layout of the index tree. FAST is a binary tree logically organized to optimize for architecture features like page size, cache line size, and Single Instruction Multiple Data (SIMD) width of the underlying hardware. FAST eliminates the impact of memory latency, and exploits thread-level and data-level parallelism on both CPUs and GPUs to achieve 50 million (CPU) and 85 million (GPU) queries per second for large trees of 64M elements, with even better results on smaller trees. These are 5X (CPU) and 1.7X (GPU) faster than the best previously reported performance on the same architectures. We also evaluated FAST on the Intel$^\tiny\textregistered$ Many Integrated Core architecture (Intel$^\tiny\textregistered$ MIC), showing a speedup of 2.4X--3X over CPU and 1.8X--4.4X over GPU. FAST supports efficient bulk updates by rebuilding index trees in less than 0.1 seconds for datasets as large as 64M keys and naturally integrates compression techniques, overcoming the memory bandwidth bottleneck and achieving a 6X performance improvement over uncompressed index search for large keys on CPUs.