CX: A scalable, robust network for parallel computing

  • Authors:
  • Peter Cappello; Dimitrios Mourloukos

  • Affiliations:
  • Computer Science Department, University of California, Santa Barbara, CA 93106, USA. E-mail: {cappello, mourlouk}@cs.ucsb.edu

  • Venue:
  • Scientific Programming
  • Year:
  • 2002

Abstract

CX, a network-based computational exchange, is presented. The system's design integrates variations of ideas from other researchers, such as work stealing, non-blocking tasks, eager scheduling, and space-based coordination. The object-oriented API is simple and compact, and it cleanly separates application logic from the logic that supports interprocess communication and fault tolerance. Computations run to completion even when computational hosts join and leave the ongoing computation. Such hosts, or producers, use task caching and prefetching to overlap computation with interprocessor communication. To break a potential task server bottleneck, a network of task servers is presented. Even though task servers are envisioned as reliable, the self-organizing, scalable network of $n$ servers, described as a sibling-connected height-balanced fat tree, tolerates a sequence of $n-1$ server failures. Tasks are distributed throughout the server network via a simple "diffusion" process. CX is intended as a test bed for research on automated silent auctions, reputation services, authentication services, and bonding services. CX also provides a test bed for algorithm research into network-based parallel computation.
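
The abstract names eager scheduling as one of the ideas that lets a computation finish while hosts come and go. The following is a minimal sketch of that general idea only, not of CX's actual API: the class `EagerTaskServer` and its methods `take`, `put_result`, and `done` are hypothetical names chosen for illustration. The assumption is that a task stays available for reassignment until some producer returns its result, so a producer that leaves mid-task does not stall the computation.

```python
from collections import OrderedDict


class EagerTaskServer:
    """Toy task server (hypothetical): a task leaves the pool only when a result arrives."""

    def __init__(self, tasks):
        # task_id -> payload; the ordering tracks which task was offered least recently.
        self.pending = OrderedDict(tasks)
        self.results = {}

    def take(self):
        """Hand out the least recently offered unfinished task, or None if all are done."""
        if not self.pending:
            return None
        task_id, payload = next(iter(self.pending.items()))
        # Move it to the back so other unfinished tasks are offered first next time.
        self.pending.move_to_end(task_id)
        return task_id, payload

    def put_result(self, task_id, value):
        """Record the first result for a task; ignore late duplicates."""
        if task_id in self.pending:
            del self.pending[task_id]
            self.results[task_id] = value

    def done(self):
        return not self.pending


def run_demo():
    server = EagerTaskServer({i: i for i in range(4)})
    # Producer A takes a task and then leaves without reporting a result.
    lost_id, _ = server.take()
    # Producer B keeps working until nothing is pending, which
    # includes re-executing the task that producer A abandoned.
    while not server.done():
        task_id, x = server.take()
        server.put_result(task_id, x * x)
    return lost_id, server.results
```

In this sketch the abandoned task is simply offered again once the other tasks have been handed out. A real system would also need to handle concurrency and duplicate work, which are left out here.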