Understanding and improving the cost of scaling distributed event processing

  • Authors:
  • Shoaib Akram;Manolis Marazakis;Angelos Bilas

  • Affiliations:
  • Foundation for Research and Technology - Hellas (FORTH), Vassilika Vouton, Heraklion, GR, Greece;Foundation for Research and Technology - Hellas (FORTH), Vassilika Vouton, Heraklion, GR, Greece;Foundation for Research and Technology - Hellas (FORTH), Vassilika Vouton, Heraklion, GR, Greece

  • Venue:
  • Proceedings of the 6th ACM International Conference on Distributed Event-Based Systems
  • Year:
  • 2012

Abstract

Building scalable back-end infrastructures for data-centric applications is becoming important. Applications used in data centres have complex, multilayer software stacks and are required to scale to a large number of nodes. Today, there is increased interest in improving the efficiency of such software stacks. In this paper, we examine the efficiency of such a stack used for distributed stream processing, an important application domain. We use a specific streaming system, Borealis [10], and extensively hand-tune the end-to-end data path. We focus on the parts of the stack related to intra- and inter-node communication and data exchange, a central component of many software stacks. We find that application-independent code in stream processing middleware employs communication operations that consume a significant number of CPU cycles and are not strictly necessary. We first categorize these operations based on the protocol function they support. We then remove them, producing a software stack that is functionally equivalent in terms of application processing. Our results show that restructuring the data path achieves up to 5x higher throughput, reduces energy consumption by up to 60%, and saves up to 40% in infrastructure cost. Finally, we project that with 1024-core processors per node, stream processing applications will demand up to 2 Tbit/s of networking throughput per node.