SIMD-based decoding of posting lists

  • Authors:
  • Alexander A. Stepanov;Anil R. Gangolli;Daniel E. Rose;Ryan J. Ernst;Paramjit S. Oberoi

  • Affiliations:
  • A9.com, Palo Alto, CA, USA (all authors)

  • Venue:
  • Proceedings of the 20th ACM International Conference on Information and Knowledge Management
  • Year:
  • 2011

Abstract

Powerful SIMD instructions in modern processors offer an opportunity for greater search performance. In this paper, we apply these instructions to decoding search engine posting lists. We start by exploring variable-length integer encoding formats used to represent postings. We define two properties, byte-oriented and byte-preserving, that characterize many formats of interest. Based on their common structure, we define a taxonomy that classifies encodings along three dimensions, representing the way in which data bits are stored and additional bits are used to describe the data. Using this taxonomy, we discover new encoding formats, some of which are particularly amenable to SIMD-based decoding. We present generic SIMD algorithms for decoding these formats. We also extend these algorithms to the most common traditional encoding format. Our experiments demonstrate that SIMD-based decoding algorithms are up to 3 times faster than non-SIMD algorithms.
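
For readers unfamiliar with byte-oriented variable-length encodings, the following is a minimal scalar sketch in Python, not the authors' SIMD algorithm. It assumes the "traditional" format is the widely used variable-byte scheme: each byte holds 7 data bits, least-significant group first, and the high bit signals that another byte follows. It also assumes postings are stored as gaps between sorted document IDs. The function names are hypothetical and chosen only for illustration.

```python
from typing import Iterable, List


def encode_varint(value: int) -> bytes:
    """Encode one non-negative integer, 7 data bits per byte (assumed layout)."""
    if value < 0:
        raise ValueError("varint values must be non-negative")
    out = bytearray()
    while value >= 0x80:
        # Low 7 bits carry data; the high bit marks "more bytes follow".
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)  # final byte has the high bit clear
    return bytes(out)


def encode_postings(doc_ids: Iterable[int]) -> bytes:
    """Encode sorted document IDs as varint-coded gaps."""
    out = bytearray()
    previous = 0
    for doc_id in doc_ids:
        out += encode_varint(doc_id - previous)  # raises if input is unsorted
        previous = doc_id
    return bytes(out)


def decode_postings(data: bytes) -> List[int]:
    """Decode varint-coded gaps back into document IDs, one byte at a time."""
    doc_ids: List[int] = []
    current = 0  # running document ID (prefix sum of the gaps)
    value = 0    # gap being assembled
    shift = 0    # bit position of the next 7-bit group
    for byte in data:
        value |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7  # continuation bit set: keep reading this integer
        else:
            current += value
            doc_ids.append(current)
            value = 0
            shift = 0
    if shift != 0:
        raise ValueError("truncated varint at end of input")
    return doc_ids


if __name__ == "__main__":
    encoded = encode_postings([3, 130, 131, 20000])
    print(encoded.hex(), decode_postings(encoded))
```

The decoder branches on every byte's continuation bit. That per-byte, data-dependent control flow is the kind of work a SIMD decoder would aim to replace by handling several bytes per instruction.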