The art of computer programming, volume 3: (2nd ed.) sorting and searching
The art of computer programming, volume 3: (2nd ed.) sorting and searching
Conjunctive selection conditions in main memory
Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Implementing database operations using SIMD instructions
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
CellSort: high performance sorting on the cell processor
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Vectorized data processing on the cell broadband engine
DaMoN '07 Proceedings of the 3rd international workshop on Data management on new hardware
Efficient implementation of sorting on multi-core SIMD CPU architecture
Proceedings of the VLDB Endowment
Larrabee: a many-core Intel® architecture for visual computing
Proceedings of the 6th ACM conference on Computing frontiers
FAST: fast architecture sensitive tree search on modern CPUs and GPUs
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Designing fast architecture-sensitive tree search on modern multicore/many-core processors
ACM Transactions on Database Systems (TODS)
VAST-Tree: a vector-advanced and compressed structure for massive data tree traversal
Proceedings of the 15th International Conference on Extending Database Technology
Navigating big data with high-throughput, energy-efficient data partitioning
Proceedings of the 40th Annual International Symposium on Computer Architecture
Hi-index | 0.00 |
This paper presents novel tree-based search algorithms that exploit the SIMD instructions found in virtually all modern processors. The algorithms are a natural extension of binary search: While binary search performs one comparison at each iteration, thereby cutting the search space in two halves, our algorithms perform k comparisons at a time and thus cut the search space into k pieces. On traditional processors, this so-called k-ary search procedure is not beneficial because the cost increase per iteration offsets the cost reduction due to the reduced number of iterations. On modern processors, however, multiple scalar operations can be executed simultaneously, which makes k-ary search attractive. In this paper, we provide two different search algorithms that differ in terms of efficiency and memory access patterns. Both algorithms are first described in a platform independent way and then evaluated on various state-of-the-art processors. Our experiments suggest that k-ary search provides significant performance improvements (factor two and more) on most platforms.