In-memory tree-structured index search is a fundamental database operation. Modern processors provide tremendous computing power by integrating multiple cores, each with wide vector units. There has been much work to exploit modern processor architectures for database primitives like scan, sort, join, and aggregation. However, unlike other primitives, tree search presents significant challenges due to irregular and unpredictable data accesses in tree traversal. In this paper, we present FAST, an extremely fast architecture-sensitive layout of the index tree. FAST is a binary tree logically organized to optimize for architecture features like page size, cache line size, and SIMD width of the underlying hardware. FAST eliminates the impact of memory latency and exploits thread-level and data-level parallelism on both CPUs and GPUs to achieve 50 million (CPU) and 85 million (GPU) queries per second, 5X (CPU) and 1.7X (GPU) faster than the best previously reported performance on the same architectures. FAST supports efficient bulk updates by rebuilding index trees in less than 0.1 seconds for datasets as large as 64M keys. It also naturally integrates compression techniques, overcoming the memory bandwidth bottleneck and achieving a 6X performance improvement over uncompressed index search for large keys on CPUs.
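To make the idea of a pointer-free, layout-driven index tree concrete, the following is a minimal sketch of a simpler, related technique: an implicit binary search tree stored breadth-first in a flat array. It is not the FAST layout itself, which additionally groups nodes hierarchically by SIMD width, cache line size, and page size. The function names `build_bfs_layout` and `lower_bound` are hypothetical and chosen only for illustration.

```python
def build_bfs_layout(sorted_keys):
    """Store sorted keys as an implicit binary search tree in breadth-first order.

    Assumption: this is a plain breadth-first layout, used here as a
    simplified stand-in for an architecture-sensitive layout.
    """
    n = len(sorted_keys)
    tree = [None] * n
    keys = iter(sorted_keys)

    def fill(node):
        # An in-order walk of the implicit tree visits slots in key order,
        # so handing out keys in sorted order yields a valid search tree.
        if node < n:
            fill(2 * node + 1)       # left child slot
            tree[node] = next(keys)
            fill(2 * node + 2)       # right child slot

    fill(0)
    return tree


def lower_bound(tree, query):
    """Return the smallest key >= query, or None if every key is smaller."""
    node = 0
    best = None
    while node < len(tree):
        if tree[node] >= query:
            best = tree[node]        # candidate answer; look for a smaller one
            node = 2 * node + 1
        else:
            node = 2 * node + 2
    return best


if __name__ == "__main__":
    tree = build_bfs_layout([1, 3, 5, 7, 9, 11, 13])
    print(tree)
    print(lower_bound(tree, 6))
```

Child positions are computed from the parent index rather than read from stored pointers, so a traversal touches only the key array. Rebuilding the whole array from sorted input, as `build_bfs_layout` does, is also the simplest way to apply a batch of updates to such a layout.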