CLASP: collaborating, autonomous stream processing systems

  • Authors:
  • Michael Branson;Fred Douglis;Brad Fawcett;Zhen Liu;Anton Riabov;Fan Ye

  • Affiliations:
  • IBM Systems and Technology Group, Rochester, MN;IBM T.J. Watson Research Center, Hawthorne, NY;IBM Systems and Technology Group, Rochester, MN;IBM T.J. Watson Research Center, Hawthorne, NY;IBM T.J. Watson Research Center, Hawthorne, NY;IBM T.J. Watson Research Center, Hawthorne, NY

  • Venue:
  • Proceedings of the ACM/IFIP/USENIX 2007 International Conference on Middleware
  • Year:
  • 2007

Abstract

There are currently a number of streaming data analysis systems in research or commercial operation. These systems are generally large-scale distributed systems, but each system operates in isolation, under the control of one administrative authority. We are developing middleware that permits autonomous or semi-autonomous streaming analysis systems (called "sites") to interoperate, giving them opportunities for data access, performance improvements, and reliability far exceeding those available in a single system. Unique characteristics of our system include an architecture for managing multiple cooperation paradigms, depending on the degree of trust and the dependencies among the participating sites; a multisite planner that converts user-specified declarative queries into specifications of distributed jobs; and a mechanism for automatic recovery from site failures by redispatching failed pieces of a distributed job. We evaluate our architecture via experiments on a running prototype, and the results demonstrate the advantages of multisite cooperation: collaborative jobs that share resources, even across only a few sites, can produce results 50% faster than independent execution, and jobs on failed sites can be recovered within a few seconds.
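The recovery idea mentioned in the abstract, redispatching the pieces of a distributed job that were running on a failed site, can be illustrated with a minimal sketch. This is not the CLASP implementation or its API: the `Site` class, the `capacity` field, the `redispatch` function, and the "most free slots" placement rule are all hypothetical names and assumptions chosen only to make the idea concrete.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """Hypothetical stand-in for one autonomous stream processing site."""
    name: str
    capacity: int                      # assumed: max job pieces the site can host
    alive: bool = True
    pieces: list[str] = field(default_factory=list)

    def free_slots(self) -> int:
        # A failed site offers no capacity.
        return self.capacity - len(self.pieces) if self.alive else 0


def redispatch(sites: list[Site], failed: Site) -> dict[str, str]:
    """Move every piece hosted on `failed` to the live site with the most free slots.

    Returns a mapping from piece name to the name of its new host site.
    """
    failed.alive = False
    orphaned, failed.pieces = failed.pieces, []
    placement: dict[str, str] = {}
    for piece in orphaned:
        # Assumed placement rule: greedy, least-loaded live site first.
        target = max(sites, key=lambda s: s.free_slots())
        if target.free_slots() <= 0:
            raise RuntimeError(f"no capacity left to recover {piece!r}")
        target.pieces.append(piece)
        placement[piece] = target.name
    return placement


# Usage: site A fails while hosting two pieces of a three-piece job.
sites = [
    Site("A", capacity=3, pieces=["ingest", "filter"]),
    Site("B", capacity=3, pieces=["join"]),
    Site("C", capacity=3),
]
placement = redispatch(sites, failed=sites[0])
print(placement)  # {'ingest': 'C', 'filter': 'B'}
```

A real multisite system would also have to respect trust relationships between sites, re-establish stream connections, and handle state held by the failed pieces; the sketch covers only the placement decision.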