Greedy List Intersection

  • Authors:
  • Robert Krauthgamer;Aranyak Mehta;Vijayshankar Raman;Atri Rudra

  • Affiliations:
  • Weizmann Institute, Rehovot, Israel and IBM Almaden, San Jose, CA, USA. robert.krauthgamer@weizmann.ac.il;Google Inc., Mountain View, CA, USA. aranyak@google.com;IBM Almaden, San Jose, CA, USA. ravijay@us.ibm.com;University at Buffalo, State University of New York, Buffalo, NY, USA. atri@cse.buffalo.edu

  • Venue:
  • ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
  • Year:
  • 2008

Abstract

A common technique for processing conjunctive queries is to first match each predicate separately using an index lookup, and then compute the intersection of the resulting rowid lists via an AND-tree. The performance of this technique depends crucially on the order of lists in this tree: it is important to compute early the intersections that will produce small results. But this optimization is hard to do when the data or predicates are correlated. We present a new algorithm for ordering the lists in an AND-tree by sampling the intermediate intersection sizes. We prove that our algorithm is near-optimal and validate its effectiveness experimentally on datasets with a variety of distributions.
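
The abstract does not spell out the algorithm, so the following is only a minimal Python sketch of the general idea it describes: greedily choosing the next list to intersect by estimating, from a random sample, how small the resulting intermediate intersection would be. The function names, the `sample_size` parameter, the choice to start from the shortest list, and the use of in-memory sets are all assumptions made for illustration; this is not the authors' implementation and carries none of the paper's guarantees.

```python
import random


def estimate_intersection_size(current, candidate, sample_size, rng):
    """Estimate |current & candidate| by probing a random sample of `current`.

    Hypothetical helper: scales the sample hit rate up to the size of `current`.
    """
    if not current:
        return 0.0
    sample = rng.sample(sorted(current), min(sample_size, len(current)))
    hits = sum(1 for rowid in sample if rowid in candidate)
    return len(current) * hits / len(sample)


def greedy_intersect(lists, sample_size=64, seed=0):
    """Intersect rowid lists, picking the next list greedily by sampled estimate.

    Returns (intersection, order), where `order` holds the indexes of the
    input lists in the order they were chosen.
    """
    rng = random.Random(seed)
    rowid_sets = [set(rowids) for rowids in lists]
    if not rowid_sets:
        return set(), []

    pending = list(range(len(rowid_sets)))
    # Assumption: start from the shortest list, the cheapest one to scan.
    first = min(pending, key=lambda i: len(rowid_sets[i]))
    pending.remove(first)
    current = set(rowid_sets[first])
    order = [first]

    while pending and current:
        # Choose the list whose sampled intersection with `current` is smallest.
        best = min(
            pending,
            key=lambda i: estimate_intersection_size(
                current, rowid_sets[i], sample_size, rng
            ),
        )
        pending.remove(best)
        current &= rowid_sets[best]
        order.append(best)

    # Once the running result is empty, the order of leftover lists is irrelevant.
    order.extend(pending)
    return current, order


if __name__ == "__main__":
    lists = [range(0, 100), range(0, 100, 2), range(0, 100, 5), [10, 20, 30]]
    result, order = greedy_intersect(lists)
    print(sorted(result), order)
```

Sampling from the running result rather than from each list in isolation is what lets a sketch like this react to correlation between predicates: the estimate reflects the rows that have survived the earlier intersections, not per-list selectivities multiplied under an independence assumption.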