Index tuning for parameterized streaming groupby queries

  • Authors: Luping Ding; Elke A. Rundensteiner

  • Affiliations: Worcester Polytechnic Institute, Worcester, MA (both authors)

  • Venue: SSPS '08 Proceedings of the 2nd international workshop on Scalable stream processing system
  • Year: 2008

Abstract

Similar groupby queries are common in many stream processing applications. We propose the concept of the parameterized streaming groupby query template (PSGB template) as an abstraction for representing a potentially infinite number of groupby queries that are instantiated at runtime and return customized results. To handle high-speed data streams and large numbers of PSGB queries, the IMP index is proposed for organizing the rapidly evolving PSGB operator state in support of query workloads. In this paper, we tackle the IMP index tuning problem. We propose the EPrune algorithm, which is guaranteed to find the optimal IMP index configuration for a given query workload. Dynamic stream environments, however, require frequent index tuning, so the efficiency of index selection becomes more important than guaranteed optimality. We therefore design a greedy index selection algorithm named RGreedy and equip it with three heuristics: OWL, PCL and Hybrid. Our experiments show that RGreedy finds the optimal IMP configuration in practically all of our extensive test cases. While EPrune takes hours to finish, RGreedy terminates within seconds.
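
To make the template idea concrete, the following is a minimal sketch of how one template could yield many runtime-instantiated groupby queries. It is an illustration only, not the paper's PSGB operator or IMP index: the class name GroupByTemplate, the choice of a sum aggregate, the assumption that the parameter is the set of grouping attributes, and the sample attribute names (src, port, bytes) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Mapping, Sequence, Tuple

Row = Mapping[str, object]
Result = Dict[Tuple[object, ...], int]


@dataclass(frozen=True)
class GroupByTemplate:
    """Hypothetical template: a fixed sum aggregate over `value_attr`,
    parameterized by which of `group_attrs` a query groups on."""
    value_attr: str
    group_attrs: Tuple[str, ...]

    def instantiate(self, chosen: Sequence[str]) -> Callable[[Iterable[Row]], Result]:
        # Reject grouping attributes the template does not offer.
        unknown = [a for a in chosen if a not in self.group_attrs]
        if unknown:
            raise ValueError(f"attributes not offered by the template: {unknown}")
        attrs = tuple(chosen)
        value_attr = self.value_attr

        def query(rows: Iterable[Row]) -> Result:
            # Sum value_attr per distinct combination of the chosen attributes.
            totals: Result = {}
            for row in rows:
                key = tuple(row[a] for a in attrs)
                totals[key] = totals.get(key, 0) + row[value_attr]
            return totals

        return query


# Assumed sample data; the attribute names are illustrative only.
template = GroupByTemplate(value_attr="bytes", group_attrs=("src", "port"))
stream = [
    {"src": "a", "port": 80, "bytes": 10},
    {"src": "a", "port": 443, "bytes": 5},
    {"src": "b", "port": 80, "bytes": 7},
]

# Two queries instantiated at runtime from the same template.
by_src = template.instantiate(["src"])
by_src_port = template.instantiate(["src", "port"])
```

In this sketch each instantiated query rescans the input independently. The abstract's point is that with many such queries a shared, indexed operator state becomes worthwhile, which is the role it assigns to the IMP index.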