Massively-parallel stream processing under QoS constraints with Nephele

  • Authors:
  • Björn Lohrmann;Daniel Warneke;Odej Kao

  • Affiliations:
  • Technische Universität Berlin, Berlin, Germany;Technische Universität Berlin, Berlin, Germany;Technische Universität Berlin, Berlin, Germany

  • Venue:
  • Proceedings of the 21st International Symposium on High-Performance Parallel and Distributed Computing
  • Year:
  • 2012

Abstract

Today, a growing number of commodity devices, like mobile phones or smart meters, are equipped with rich sensors and capable of producing continuous data streams. The sheer number of these devices and the resulting overall data volumes of the streams raise new challenges with respect to the scalability of existing stream processing systems. At the same time, massively-parallel data processing systems like MapReduce have proven that they scale to large numbers of nodes and efficiently organize data transfers between them. Many of these systems also provide streaming capabilities. However, unlike traditional stream processors, these systems have so far disregarded the QoS requirements of prospective stream processing applications. In this paper we address this gap. First, we analyze common design principles of today's parallel data processing frameworks and identify those principles that provide degrees of freedom in trading off the QoS goals of latency and throughput. Second, we propose a scheme that allows these frameworks to detect violations of user-defined latency constraints and to optimize the job execution without manual interaction in order to meet these constraints while keeping the throughput as high as possible. As a proof of concept, we implemented our approach for our parallel data processing framework Nephele and evaluated its effectiveness through a comparison with Hadoop Online. For a multimedia streaming application, we demonstrate an improvement in processing latency by a factor of at least 15 while preserving high data throughput when needed.
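The abstract does not spell out the optimization mechanism; purely as an illustrative sketch (not the authors' actual implementation), the Java snippet below shows one way a runtime could react to violations of a user-defined latency constraint by adapting a channel's output buffer size: smaller buffers are flushed sooner and lower latency, while larger buffers amortize per-record transfer overhead and favor throughput. The class name, constants, and adjustment policy are hypothetical.

```java
// Illustrative sketch only: a hypothetical feedback controller that trades
// latency against throughput by resizing a channel's output buffer.
public class LatencyConstraintController {

    private final long latencyConstraintMillis; // user-defined latency upper bound
    private int outputBufferSizeBytes;          // current output buffer size of the channel

    private static final int MIN_BUFFER = 1024;       // 1 KiB lower bound (assumed)
    private static final int MAX_BUFFER = 64 * 1024;  // 64 KiB upper bound (assumed)

    public LatencyConstraintController(long latencyConstraintMillis, int initialBufferSizeBytes) {
        this.latencyConstraintMillis = latencyConstraintMillis;
        this.outputBufferSizeBytes = initialBufferSizeBytes;
    }

    /** Called periodically with the measured latency of the channel; returns the new buffer size. */
    public int adjust(long measuredLatencyMillis) {
        if (measuredLatencyMillis > latencyConstraintMillis) {
            // Constraint violated: halve the buffer so records are shipped sooner.
            outputBufferSizeBytes = Math.max(MIN_BUFFER, outputBufferSizeBytes / 2);
        } else if (measuredLatencyMillis < latencyConstraintMillis / 2) {
            // Ample slack: grow the buffer again to keep per-record overhead low
            // and preserve high data throughput.
            outputBufferSizeBytes = Math.min(MAX_BUFFER, outputBufferSizeBytes + MIN_BUFFER);
        }
        return outputBufferSizeBytes;
    }
}
```

Such a controller would run without manual interaction, tightening buffering only while a constraint is actually violated, which matches the abstract's stated goal of meeting latency bounds while keeping throughput as high as possible.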