Improving the performance of list intersection

  • Authors:
  • Dimitris Tsirogiannis;Sudipto Guha;Nick Koudas

  • Affiliations:
  • University of Toronto;University of Pennsylvania;University of Toronto

  • Venue:
  • Proceedings of the VLDB Endowment
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

List intersection is a central operation, utilized excessively for query processing on text and databases. We present list intersection algorithms for an arbitrary number of sorted and unsorted lists tailored to the characteristics of modern hardware architectures. Two new list intersection algorithms are presented for sorted lists. The first algorithm, termed Dynamic Probes, dynamically decides the probing order on the lists exploiting information from previous probes at runtime. This information is utilized as a cache-resident microindex. The second algorithm, termed Quantile-based, deduces in advance a good probing order, thus avoiding the overhead of adaptivity and is based on detecting lists with non-uniform distribution of document identifiers. For unsorted lists, we present a novel hash-based algorithm that avoids the overhead of sorting. A detailed experimental evaluation is presented based on real and synthetic data using existing chip multiprocessor architectures with eight cores, validating the efficiency and efficacy of the proposed algorithms.