Window join approximation over data streams with importance semantics

  • Authors: Adegoke Ojewole; Qiang Zhu; Wen-Chi Hou

  • Affiliations: University of Michigan, Dearborn, Dearborn, MI; University of Michigan, Dearborn, Dearborn, MI; Southern Illinois University, Carbondale, IL

  • Venue: CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
  • Year: 2006

Abstract

Load shedding techniques generate approximate sliding window join results when memory constraints prevent exact computation. The previously proposed random load shedding method drops input tuples without consideration for the number of outputs created, while the recently proposed semantic load shedding technique aims to produce the largest possible result set. We consider a new model in which data stream tuples contain numerical importance values relevant to the query source and seek to maximize the importance of the approximate join result. We show that both random load shedding and semantic load shedding are sub-optimal in this situation, while the techniques presented in this paper satisfy the objective function by considering both tuple importance and join attribute distributions. We extend the existing offline semantic approximation technique to make it compatible with our objective function and show that it is less space and time efficient than our new optimal offline algorithm for small and large join memory allotments. We also introduce four efficient online algorithms, which are quite promising in maximizing the importance of the approximate join result without foreknowledge of input streams.