Semantic query optimization for processing XML streams with minimized memory footprint

  • Authors: Ming Li; Murali Mani; Elke A. Rundensteiner

  • Affiliations: Worcester Polytechnic Institute, Worcester, Massachusetts (all authors)

  • Venue: DataX '08: Proceedings of the 2008 EDBT workshop on Database technologies for handling XML information on the web

  • Year: 2008

Abstract

XQuery evaluation over XML streams requires the temporary buffering of XML elements. This paper presents a semantic query optimization solution that minimizes the memory footprint of XQuery evaluation by exploiting schema knowledge. We focus on one particular class of constraints, namely Pattern Non-Occurrence (PNO) constraints, for XML streams conforming to predefined DTDs. PNO constraints allow buffered data to be released early (early buffer release) or never stored at all (buffer avoidance), thus minimizing the memory footprint. We develop an automaton-based technique to detect PNO constraints at runtime. For a given query, we explore the opportunities for early buffer release and buffer avoidance that runtime PNO detection can trigger, and encode the resulting optimization decisions into the Raindrop algebraic plan. We implement our optimization technique within the Raindrop XQuery engine. Our experimental studies show that the proposed techniques significantly improve both memory and CPU usage with little overhead.
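
To make the idea of early buffer release concrete, the following is a minimal sketch and not the paper's automaton-based technique or the Raindrop plan encoding. It assumes a hypothetical, heavily simplified content model in which the children of a person element arrive in a fixed order (name, email, address, bio), each optional, and it evaluates the illustrative query "names of persons that have an email". The names PERSON_ORDER, pattern_cannot_occur, and names_of_persons_with_email, as well as the flattened event format, are assumptions made for this example only.

```python
from typing import Iterable, List, Optional, Tuple

# Assumed, simplified content model for <person>: children arrive in this
# order and each is optional. A real DTD content model is a regular expression.
PERSON_ORDER = ["name", "email", "address", "bio"]

# (kind, tag, text); kind is "start", "child", or "end". Child elements are
# flattened into a single event to keep the sketch short.
Event = Tuple[str, str, str]


def pattern_cannot_occur(seen_tag: str, pattern_tag: str) -> bool:
    """PNO check: after reading seen_tag, pattern_tag can no longer occur."""
    return PERSON_ORDER.index(seen_tag) > PERSON_ORDER.index(pattern_tag)


def names_of_persons_with_email(events: Iterable[Event]):
    """Return (results, log), where log records when a buffer was dropped."""
    results: List[str] = []
    log: List[str] = []
    buffered_name: Optional[str] = None
    for kind, tag, text in events:
        if kind == "start":
            buffered_name = None
        elif kind == "child":
            if tag == "name":
                # The filter outcome is unknown so far, so the name is buffered.
                buffered_name = text
            elif tag == "email" and buffered_name is not None:
                # Filter satisfied: emit the name and free the buffer.
                results.append(buffered_name)
                buffered_name = None
            elif buffered_name is not None and pattern_cannot_occur(tag, "email"):
                # PNO detected: no email can follow, so release the buffer early.
                log.append(f"released at <{tag}>")
                buffered_name = None
        elif kind == "end" and buffered_name is not None:
            # Fallback when no PNO fired: release only at the end tag.
            log.append("released at </person>")
            buffered_name = None
    return results, log
```

In this sketch, a person without an email has its buffered name dropped as soon as an address (or bio) child is read, instead of being held until the closing person tag. Buffer avoidance would be the analogous case in which the constraint is already known before the data arrives, so the data is never stored.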