The effect of reading policy on early join result production

  • Authors:
  • Ramon Lawrence;Ralph P. Russo;Nariankadu D. Shyamalkumar

  • Affiliations:
  • Department of Computer Science, University of British Columbia Okanagan, Canada;Department of Statistics and Actuarial Science, The University of Iowa, United States;Department of Statistics and Actuarial Science, The University of Iowa, United States

  • Venue:
  • Information Sciences: an International Journal
  • Year:
  • 2007

Quantified Score

Hi-index 0.07

Visualization

Abstract

The ability to produce join results before having read an entire input (early) reduces query response time. This is especially important for interactive applications, and for joins in mediator systems that may have to wait on network delays when reading the inputs. Although several early join algorithms have been proposed, there has been no formal treatment of how different reading policies affect the number of results produced. In this work, we show that alternate reading is optimal among fixed reading policies, and we provide expressions for the expected number of results produced over time. Further, we analyze policies that adapt their execution to the tuples already read and to the distribution of the inputs. We present a greedy, adaptive algorithm that is optimal in that it outperforms all reading policies, on average. However, the greedy policy is shown to perform only marginally better than the alternating policy. Thus, the alternating policy emerges as a policy that is easy to implement, requires no knowledge of the input distributions, is optimal among fixed policies, and is nearly optimal among all policies.