k-ary search on modern processors

Authors:
Benjamin Schlegel;Rainer Gemulla;Wolfgang Lehner
Affiliations:
Technische Universität Dresden;IBM Almaden Research Center;Technische Universität Dresden
Venue:
Proceedings of the Fifth International Workshop on Data Management on New Hardware
Year:
2009

Citing 7
Cited 4

The art of computer programming, volume 3: (2nd ed.) sorting and searching

The art of computer programming, volume 3: (2nd ed.) sorting and searching
Conjunctive selection conditions in main memory

Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Implementing database operations using SIMD instructions

Proceedings of the 2002 ACM SIGMOD international conference on Management of data
CellSort: high performance sorting on the cell processor

VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Vectorized data processing on the cell broadband engine

DaMoN '07 Proceedings of the 3rd international workshop on Data management on new hardware
Efficient implementation of sorting on multi-core SIMD CPU architecture

Proceedings of the VLDB Endowment
Larrabee: a many-core Intel® architecture for visual computing

Proceedings of the 6th ACM conference on Computing frontiers

FAST: fast architecture sensitive tree search on modern CPUs and GPUs

Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Designing fast architecture-sensitive tree search on modern multicore/many-core processors

ACM Transactions on Database Systems (TODS)
VAST-Tree: a vector-advanced and compressed structure for massive data tree traversal

Proceedings of the 15th International Conference on Extending Database Technology
Navigating big data with high-throughput, energy-efficient data partitioning

Proceedings of the 40th Annual International Symposium on Computer Architecture

Quantified Score

Hi-index	0.00

Visualization

Abstract

This paper presents novel tree-based search algorithms that exploit the SIMD instructions found in virtually all modern processors. The algorithms are a natural extension of binary search: While binary search performs one comparison at each iteration, thereby cutting the search space in two halves, our algorithms perform k comparisons at a time and thus cut the search space into k pieces. On traditional processors, this so-called k-ary search procedure is not beneficial because the cost increase per iteration offsets the cost reduction due to the reduced number of iterations. On modern processors, however, multiple scalar operations can be executed simultaneously, which makes k-ary search attractive. In this paper, we provide two different search algorithms that differ in terms of efficiency and memory access patterns. Both algorithms are first described in a platform independent way and then evaluated on various state-of-the-art processors. Our experiments suggest that k-ary search provides significant performance improvements (factor two and more) on most platforms.