SODA: an optimizing scheduler for large-scale stream-based distributed computer systems

  • Authors:
  • Joel Wolf;Nikhil Bansal;Kirsten Hildrum;Sujay Parekh;Deepak Rajan;Rohit Wagle;Kun-Lung Wu;Lisa Fleischer

  • Affiliations:
  • IBM T.J. Watson Research Center, Hawthorne, NY;IBM T.J. Watson Research Center, Hawthorne, NY;IBM T.J. Watson Research Center, Hawthorne, NY;IBM T.J. Watson Research Center, Hawthorne, NY;IBM T.J. Watson Research Center, Hawthorne, NY;IBM T.J. Watson Research Center, Hawthorne, NY;IBM T.J. Watson Research Center, Hawthorne, NY;Dartmouth College, Hanover, NH

  • Venue:
  • Proceedings of the 9th ACM/IFIP/USENIX International Conference on Middleware
  • Year:
  • 2008

Abstract

This paper describes the SODA scheduler for System S, a highly scalable distributed stream processing system. Unlike traditional batch applications, streaming applications are open-ended, and the system typically cannot delay the processing of the data. The scheduler must therefore be able to shift resource allocation dynamically in response to changes in resource availability, job arrivals and departures, incoming data rates, and so on. The design assumptions of System S, in particular, pose additional scheduling challenges. SODA must deal with a highly complex optimization problem, which must be solved in real time while maintaining scalability. SODA relies on a careful problem decomposition and intelligent use of both heuristic and exact algorithms. We describe the design and functionality of SODA, outline the mathematical components, and describe experiments that show the performance of the scheduler.