Achieving high output quality under limited resources through structure-based spilling in XML streams

  • Authors:
  • Mingzhu Wei;Elke A. Rundensteiner;Murali Mani

  • Affiliations:
  • Worcester Polytechnic Institute;Worcester Polytechnic Institute;University of Michigan, Flint

  • Venue:
  • Proceedings of the VLDB Endowment
  • Year:
  • 2010

Abstract

Because of high volumes and unpredictable arrival rates, stream processing systems cannot always keep up with their input data, which results in buffer overflow and uncontrolled loss of data. To eventually produce complete results, relational stream engines commonly employ load spilling, which temporarily pushes a fraction of the data to disk. In this work, we introduce "structure-based spilling", a spilling technique customized for XML streams that considers the partial spillage of possibly complex XML elements. Structure-based spilling brings new challenges: when a path is spilled, multiple paths may be affected. We analyze the possible effects of spilling on the query paths and show how to execute the "reduced" query to produce partial results. To select the reduced query that maximizes output quality, we develop three optimization strategies, namely OptR, OptPrune and ToX. We also examine the clean-up stage, which guarantees that the entire result set is eventually generated by producing supplementary results. Our experimental study demonstrates that the proposed solutions consistently achieve higher-quality results than state-of-the-art techniques.
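
As a rough illustration of why spilling one path can affect several query paths, the following Python sketch splits a set of query paths into those that can still be evaluated in memory and those that must wait for the clean-up stage. It is not the paper's algorithm and does not implement OptR, OptPrune or ToX. The function names (`is_affected`, `reduce_query`), the example paths, and the rule that a query path is affected when it lies on the same root-to-leaf branch as the spilled path are all assumptions made for this example, and only simple child-axis paths such as `/a/b/c` are handled.

```python
def steps(path):
    """Split a simple child-axis path such as '/a/b/c' into its element names."""
    return [s for s in path.strip("/").split("/") if s]


def is_affected(query_path, spilled_path):
    """Assumed rule: a query path is affected when it is an ancestor of,
    equal to, or a descendant of the spilled path (one is a step-wise
    prefix of the other)."""
    q, s = steps(query_path), steps(spilled_path)
    n = min(len(q), len(s))
    return q[:n] == s[:n]


def reduce_query(query_paths, spilled_path):
    """Return (kept, deferred): paths the reduced query can still answer now,
    and paths whose results must be supplemented later from disk."""
    kept = [p for p in query_paths if not is_affected(p, spilled_path)]
    deferred = [p for p in query_paths if is_affected(p, spilled_path)]
    return kept, deferred


if __name__ == "__main__":
    # Hypothetical query paths over an auction-style stream.
    paths = [
        "/auction",
        "/auction/seller/name",
        "/auction/seller/profile",
        "/auction/bidder/name",
        "/auction/item",
    ]
    kept, deferred = reduce_query(paths, "/auction/seller")
    print("kept:", kept)
    print("deferred:", deferred)
```

In this toy setting, spilling `/auction/seller` defers both seller paths and also the `/auction` path, whose returned elements would be incomplete, while the bidder and item paths remain answerable. Choosing which path to spill so that output quality is maximized is the optimization problem the abstract refers to, and it is left out of this sketch.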