Maximizing the output rate of multi-way join queries over streaming information sources

  • Authors:
  • Stratis D. Viglas;Jeffrey F. Naughton;Josef Burger

  • Affiliations:
  • University of Wisconsin-Madison, Department of Computer Sciences, Madison, WI;University of Wisconsin-Madison, Department of Computer Sciences, Madison, WI;University of Wisconsin-Madison, Department of Computer Sciences, Madison, WI

  • Venue:
  • VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
  • Year:
  • 2003

Abstract

Recently there has been a growing interest in join query evaluation for scenarios in which inputs arrive at highly variable and unpredictable rates. In such scenarios, the focus shifts from completing the computation as soon as possible to producing a prefix of the output as soon as possible. To handle this shift in focus, most solutions to date rely upon some combination of streaming binary operators and "on-the-fly" execution plan reorganization. In contrast, we consider the alternative of extending existing symmetric binary join operators to handle more than two inputs. Toward this end, we have completed a prototype implementation of a multi-way join operator, which we term the "MJoin" operator, and explored its performance. Our results show that in many instances the MJoin produces outputs sooner than any tree of binary operators. Additionally, since MJoins are completely symmetric with respect to their inputs, they can reduce the need for expensive runtime plan reorganization. This suggests that supporting multiway joins in a single, symmetric, streaming operator may be a useful addition to systems that support queries over input streams from remote sites.
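To make the idea of a single symmetric operator over more than two inputs concrete, here is a minimal sketch of a symmetric multi-way hash join. It is an illustration only, not the authors' MJoin implementation: the class name `MJoinSketch`, the `insert` method, and the simplifying assumption that all inputs join on one shared key are hypothetical choices made for this example. Each arriving tuple is stored in its own input's hash table and then used to probe the others, so results appear as soon as every input holds a match, whatever the arrival order.

```python
from collections import defaultdict
from itertools import product


class MJoinSketch:
    """Symmetric hash join over n inputs that share one join key (an assumption)."""

    def __init__(self, n_inputs):
        self.n_inputs = n_inputs
        # One hash table per input: join key -> list of rows seen so far.
        self.tables = [defaultdict(list) for _ in range(n_inputs)]

    def insert(self, source, key, row):
        """Store a new row from input `source`; return the results it completes."""
        self.tables[source][key].append(row)
        candidates = []
        for i in range(self.n_inputs):
            if i == source:
                # The new row is the only candidate for its own input.
                candidates.append([row])
                continue
            matches = self.tables[i].get(key)
            if not matches:
                # Some input has no matching row yet, so nothing can be emitted.
                return []
            candidates.append(matches)
        # Combine the new row with every stored match from the other inputs.
        return [tuple(combo) for combo in product(*candidates)]
```

With three inputs, rows for key `"k1"` arriving on inputs 0 and 2 produce nothing; the first matching row on input 1 then completes a result immediately. No input is treated as a fixed "build" or "probe" side, which is the sense in which the sketch is symmetric.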