Adaptive algorithms for set containment joins

  • Authors:
  • Sergey Melnik; Hector Garcia-Molina

  • Affiliations:
  • Stanford University, Stanford, CA; Stanford University, Stanford, CA

  • Venue:
  • ACM Transactions on Database Systems (TODS)
  • Year:
  • 2003

Abstract

A set containment join is a join between set-valued attributes of two relations, whose join condition is specified using the subset (⊆) operator. Set containment joins are deployed in many database applications, even those that do not support set-valued attributes. In this article, we propose two novel partitioning algorithms, called the Adaptive Pick-and-Sweep Join (APSJ) and the Adaptive Divide-and-Conquer Join (ADCJ), which allow computing set containment joins efficiently. We show that APSJ outperforms previously suggested algorithms for many data sets, often by an order of magnitude. We present a detailed analysis of the algorithms and study their performance on real and synthetic data using an implemented testbed.
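
To make the join condition concrete, the following is a minimal sketch of a naive nested-loop set containment join. It is not APSJ or ADCJ, whose details the abstract does not give; it only illustrates the definition that such partitioning algorithms aim to compute more efficiently. The function name, the dictionary-based relation layout, and the sample data are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, Hashable, List, Tuple

# Hypothetical relation layout: tuple id -> set-valued attribute.
Relation = Dict[Hashable, FrozenSet[int]]


def naive_set_containment_join(r: Relation, s: Relation) -> List[Tuple[Hashable, Hashable]]:
    """Return every (r_id, s_id) pair whose r-set is a subset of the s-set.

    This compares all |R| * |S| pairs, which is the baseline cost that
    partitioning-based algorithms try to avoid.
    """
    result = []
    for r_id, r_set in r.items():
        for s_id, s_set in s.items():
            # Python's <= on sets is the subset test.
            if r_set <= s_set:
                result.append((r_id, s_id))
    return result


# Illustrative data only, not taken from the article.
R = {"r1": frozenset({1, 2}), "r2": frozenset({3}), "r3": frozenset()}
S = {"s1": frozenset({1, 2, 3}), "s2": frozenset({2, 3})}

pairs = naive_set_containment_join(R, S)
print(pairs)
```

In this sample, r1 joins only with s1 because s2 lacks element 1, while the empty set r3 joins with every tuple of S, since the empty set is a subset of every set.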