SIMD-based decoding of posting lists

Authors:
Alexander A. Stepanov; Anil R. Gangolli; Daniel E. Rose; Ryan J. Ernst; Paramjit S. Oberoi
Affiliations:
A9.com, Palo Alto, CA, USA (all authors)
Venue:
Proceedings of the 20th ACM international conference on Information and knowledge management
Year:
2011

Abstract

Powerful SIMD instructions in modern processors offer an opportunity for greater search performance. In this paper, we apply these instructions to decoding search engine posting lists. We start by exploring variable-length integer encoding formats used to represent postings. We define two properties, byte-oriented and byte-preserving, that characterize many formats of interest. Based on their common structure, we define a taxonomy that classifies encodings along three dimensions, representing the way in which data bits are stored and additional bits are used to describe the data. Using this taxonomy, we discover new encoding formats, some of which are particularly amenable to SIMD-based decoding. We present generic SIMD algorithms for decoding these formats. We also extend these algorithms to the most common traditional encoding format. Our experiments demonstrate that SIMD-based decoding algorithms are up to 3 times faster than non-SIMD algorithms.
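
To make the idea of a byte-oriented format with separate descriptor bits concrete, the following is a minimal sketch in Python. It assumes a group layout in which one descriptor byte holds four 2-bit length fields, followed by the data bytes of four integers. This layout is an assumption chosen for illustration and is not taken from the abstract; the names encode_group, decode_group, and SHUFFLE are hypothetical. The decoder looks up a precomputed byte-placement mask by descriptor value and applies it in one step, which emulates in scalar code what a SIMD byte-shuffle instruction would do across a whole register.

```python
from typing import List, Tuple


def encode_group(values: List[int]) -> bytes:
    """Encode exactly four unsigned 32-bit integers.

    Output: one descriptor byte (2 bits per integer, storing length - 1),
    followed by 4 to 16 little-endian data bytes.
    """
    assert len(values) == 4
    descriptor = 0
    data = bytearray()
    for i, v in enumerate(values):
        assert 0 <= v < 2**32
        n = max(1, (v.bit_length() + 7) // 8)  # bytes needed: 1..4
        descriptor |= (n - 1) << (2 * i)
        data += v.to_bytes(n, "little")
    return bytes([descriptor]) + bytes(data)


def build_shuffle_table() -> List[Tuple[List[int], int]]:
    """For every descriptor value, precompute a 16-entry placement mask.

    mask[j] is the index of the input byte copied to output byte j,
    or -1 if output byte j must be zero. The second tuple element is
    the number of data bytes the group occupies.
    """
    table = []
    for d in range(256):
        mask: List[int] = []
        src = 0
        for i in range(4):
            n = ((d >> (2 * i)) & 3) + 1
            mask += [src + k for k in range(n)] + [-1] * (4 - n)
            src += n
        table.append((mask, src))
    return table


SHUFFLE = build_shuffle_table()


def decode_group(buf: bytes, pos: int = 0) -> Tuple[List[int], int]:
    """Decode one group starting at pos; return (values, next position)."""
    mask, length = SHUFFLE[buf[pos]]
    data = buf[pos + 1 : pos + 1 + length]
    # Scalar emulation of a byte shuffle: place each input byte, zero-fill gaps.
    out = bytes(data[m] if m >= 0 else 0 for m in mask)
    values = [int.from_bytes(out[4 * i : 4 * i + 4], "little") for i in range(4)]
    return values, pos + 1 + length


if __name__ == "__main__":
    encoded = encode_group([1, 300, 70000, 2**31])
    print(len(encoded), decode_group(encoded))
```

The point of the sketch is that decoding contains no per-byte branching: the descriptor alone selects the mask and the input length, which is the property that makes such formats a natural fit for SIMD shuffles. A real implementation would replace the generator expression with a single hardware shuffle and would typically also undo delta coding of document identifiers, which this sketch omits.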