Stream schema: providing and exploiting static metadata for data stream processing

  • Authors: Peter M. Fischer; Kyumars Sheykh Esmaili; Renée J. Miller
  • Affiliations: ETH Zurich, Switzerland; ETH Zurich, Switzerland; University of Toronto, Canada
  • Venue: Proceedings of the 13th International Conference on Extending Database Technology
  • Year: 2010

Abstract

Schemas, and more generally metadata specifying structural and semantic constraints, are invaluable in data management. They facilitate conceptual design and enable checking of data consistency. They also play an important role in permitting semantic query optimization, that is, optimization and processing strategies that are often highly effective but are correct only for data conforming to a given schema. While the use of metadata is well established in relational and XML databases, the same is not true for data streams. Existing work mostly focuses on the specification of dynamic information. In this paper, we consider the specification of static metadata for streams in a model called Stream Schema. We show how Stream Schema can be used to validate the consistency of streams. We also show that, when stream constraints are modeled explicitly, stream queries can be simplified by removing predicates or subqueries that check for consistency. This can greatly enhance the programmability of stream processing systems. We also present a set of semantic query optimization strategies that permit both compile-time checking of queries (for example, to detect empty queries) and new runtime processing options that would not have been possible without a Stream Schema specification. Case studies on two stream processing platforms (covering different applications and underlying stream models), along with an experimental evaluation, show the benefits of Stream Schema.
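As a rough illustration of the ideas summarized in the abstract, the sketch below shows how a static constraint on a stream could be used both to validate incoming items and to analyze a filter predicate before any data arrives. It is not the Stream Schema model from the paper: the `RangeConstraint` class, the `validate`, `filter_is_empty`, and `filter_is_redundant` functions, and the `temp` field are all hypothetical names chosen for this example, and the only constraint considered is a simple value range.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class RangeConstraint:
    """Hypothetical static constraint: `field` always lies in [low, high]."""
    field: str
    low: float
    high: float

    def holds_for(self, item: dict) -> bool:
        return self.low <= item[self.field] <= self.high


def validate(stream: Iterable[dict], constraint: RangeConstraint) -> Iterator[dict]:
    """Pass items through unchanged; raise on the first constraint violation."""
    for item in stream:
        if not constraint.holds_for(item):
            raise ValueError(f"schema violation: {item!r}")
        yield item


def filter_is_empty(constraint: RangeConstraint, field: str, threshold: float) -> bool:
    """Compile-time check: `field > threshold` can never be true under the constraint."""
    return field == constraint.field and threshold >= constraint.high


def filter_is_redundant(constraint: RangeConstraint, field: str, threshold: float) -> bool:
    """Compile-time check: `field > threshold` is always true, so it can be dropped."""
    return field == constraint.field and threshold < constraint.low


# Assumed example constraint: temperature readings stay within [-40, 60].
temp_range = RangeConstraint(field="temp", low=-40.0, high=60.0)
validated = list(validate([{"temp": 20.0}, {"temp": 35.5}], temp_range))
```

Under these assumptions, a query filtering on `temp > 100` would be flagged as empty before execution, while a filter on `temp > -50` would be flagged as redundant and could be removed from the query.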