Challenges and experience in prototyping a multi-modal stream analytic and monitoring application on System S

  • Authors:
  • Kun-Lung Wu;Kirsten W. Hildrum;Wei Fan;Philip S. Yu;Charu C. Aggarwal;David A. George;Buǧra Gedik;Eric Bouillet;Xiaohui Gu;Gang Luo;Haixun Wang

  • Affiliations:
  • IBM T. J. Watson Research Center, Hawthorne, NY;IBM T. J. Watson Research Center, Hawthorne, NY;IBM T. J. Watson Research Center, Hawthorne, NY;IBM T. J. Watson Research Center, Hawthorne, NY;IBM T. J. Watson Research Center, Hawthorne, NY;IBM T. J. Watson Research Center, Hawthorne, NY;IBM T. J. Watson Research Center, Hawthorne, NY;IBM T. J. Watson Research Center, Hawthorne, NY;IBM T. J. Watson Research Center, Hawthorne, NY;IBM T. J. Watson Research Center, Hawthorne, NY;IBM T. J. Watson Research Center, Hawthorne, NY

  • Venue:
  • VLDB '07 Proceedings of the 33rd international conference on Very large data bases
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper, we describe the challenges of prototyping a reference application on System S, a distributed stream processing middleware under development at IBM Research. With a large number of stream PEs (Processing Elements) implementing various stream analytic algorithms, running on a large-scale, distributed cluster of nodes, and collaboratively digesting several multi-modal source streams with vastly differing rates, prototyping a reference application on System S faces many challenges. Specifically, we focus on our experience in prototyping DAC (Disaster Assistance Claim monitoring), a reference application dealing with multi-modal stream analytic and monitoring. We describe three critical challenges: (1) How do we generate correlated, multi-modal source streams for DAC? (2) How do we design and implement a comprehensive stream application, like DAC, from many divergent stream analytic PEs? (3) How do we deploy DAC in light of source streams with extremely different rates? We report our experience in addressing these challenges, including modeling a disaster claim processing center to generate correlated source streams, constructing the PE flow graph, utilizing programming supports from System S, adopting parallelism, and exploiting resource-adaptive computation.