RPJ: producing fast join results on streams through rate-based optimization

  • Authors:
  • Yufei Tao; Man Lung Yiu; Dimitris Papadias; Marios Hadjieleftheriou; Nikos Mamoulis

  • Affiliations:
  • City University of Hong Kong, Hong Kong; University of Hong Kong, Hong Kong; Hong Kong University of Science and Technology, Hong Kong; University of California, Riverside, CA; University of Hong Kong, Hong Kong

  • Venue:
  • Proceedings of the 2005 ACM SIGMOD international conference on Management of data
  • Year:
  • 2005

Abstract

We consider the problem of "progressively" joining relations whose records are continuously retrieved from remote sources through an unstable network that may incur temporary failures. The objectives are to (i) start reporting the first output tuples as soon as possible (before the participating relations are completely received), and (ii) produce the remaining results at a fast rate. We develop a new algorithm RPJ (Rate-based Progressive Join) based on solid theoretical analysis. RPJ maximizes the output rate by optimizing its execution according to the characteristics of the join relations (e.g., data distribution and tuple arrival pattern). Extensive experiments show that our technique delivers results significantly faster than the previous methods.
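
To make objective (i) concrete, the sketch below shows a generic symmetric hash join, a well-known progressive join technique. It is not RPJ: the abstract does not describe RPJ's internals, and this sketch does no rate-based optimization and assumes both inputs fit in memory. The function name `symmetric_hash_join`, the source labels `"R"` and `"S"`, and the `(source, key, payload)` tuple layout are assumptions made for illustration only.

```python
from collections import defaultdict


def symmetric_hash_join(arrivals):
    """Equi-join two streams progressively (illustrative, not RPJ).

    `arrivals` is an iterable of (source, key, payload) tuples in arrival
    order, where source is "R" or "S" (hypothetical labels).
    Yields (key, r_payload, s_payload) as soon as a matching pair exists.
    """
    # One in-memory hash table per input relation, keyed on the join key.
    tables = {"R": defaultdict(list), "S": defaultdict(list)}

    for source, key, payload in arrivals:
        other = "S" if source == "R" else "R"

        # Probe the opposite table first: every match can be reported
        # immediately, without waiting for the rest of either input.
        for match in tables[other].get(key, ()):
            if source == "R":
                yield (key, payload, match)
            else:
                yield (key, match, payload)

        # Then remember this tuple so later arrivals can join with it.
        tables[source][key].append(payload)


if __name__ == "__main__":
    stream = [
        ("R", 1, "r1"),
        ("S", 2, "s2"),
        ("S", 1, "s1"),   # first result is produced here
        ("R", 2, "r2"),
        ("R", 1, "r1b"),
    ]
    for row in symmetric_hash_join(stream):
        print(row)
```

Because the function is a generator, the first result is available after the third arrival rather than after both inputs are complete. Choosing what to do when memory is full or a source stalls is where a rate-based method such as RPJ would differ from this sketch.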