An optimal strategy for monitoring top-k queries in streaming windows

  • Authors:
  • Di Yang;Avani Shastri;Elke A. Rundensteiner;Matthew O. Ward

  • Affiliations:
  • Worcester Polytechnic Institute, Worcester, MA;Worcester Polytechnic Institute, Worcester, MA;Worcester Polytechnic Institute, Worcester, MA;Worcester Polytechnic Institute, Worcester, MA

  • Venue:
  • Proceedings of the 14th International Conference on Extending Database Technology
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

Continuous top-k queries, which report a certain number (k) of top preferred objects from data streams, are important for a broad class of real-time applications, ranging from financial analysis to network traffic monitoring. Existing solutions for tackling this problem aim to reduce the computational costs by incrementally updating the top-k results upon each window slide. However, they all suffer from the performance bottleneck of periodically requiring a complete recomputation of the top-k results from scratch. Such an operation is not only computationally expensive but also causes significant memory consumption, as it requires keeping all objects alive in the query window. To solve this problem, we identify the "Minimal Top-K candidate set" (MTK), namely the subset of stream objects that is both necessary and sufficient for continuous top-k monitoring. Based on this theoretical foundation, we design the MinTopk algorithm that elegantly maintains MTK and thus eliminates the need for recomputation. We prove the optimality of the MinTopk algorithm in both CPU and memory utilization for continuous top-k monitoring. Our experimental study shows that both the efficiency and scalability of our proposed algorithm is clearly superior to the state-of-the-art solutions.