From a calculus to an execution environment for stream processing

  • Authors:
  • Robert Soulé;Martin Hirzel;Buğra Gedik;Robert Grimm

  • Affiliations:
  • New York University;IBM Research;Bilkent University;New York University

  • Venue:
  • Proceedings of the 6th ACM International Conference on Distributed Event-Based Systems
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

At one level, this paper is about River, a virtual execution environment for stream processing. Stream processing is a paradigm well-suited for many modern data processing systems that ingest high-volume data streams from the real world, such as audio/video streaming, high-frequency trading, and security monitoring. One attractive property of stream processing is that it lends itself to parallelization on multicores, and even to distribution on clusters when extreme scale is required. Stream processing has been co-evolved by several communities, leading to diverse languages with similar core concepts. Providing a common execution environment reduces language development effort and increases portability. We designed River as a practical realization of Brooklet, a calculus for stream processing. So at another level, this paper is about a journey from theory (the calculus) to practice (the execution environment). The challenge is that, by definition, a calculus abstracts away all but the most central concepts. Hence, there are several research questions in concretizing the missing parts, not to mention a significant engineering effort in implementing them. But the effort is well worth it, because using a calculus as a foundation yields clear semantics and proven correctness results.