CX: A scalable, robust network for parallel computing

Authors:
Peter Cappello; Dimitrios Mourloukos
Affiliations:
Computer Science Department, University of California, Santa Barbara, CA 93106, USA. E-mail: {cappello, mourlouk}@cs.ucsb.edu
Venue:
Scientific Programming
Year:
2002

Abstract

CX, a network-based Computational eXchange, is presented. The system's design integrates variations of ideas from other researchers, such as work stealing, non-blocking tasks, eager scheduling, and space-based coordination. The object-oriented API is simple and compact, and it cleanly separates application logic from the logic that supports interprocess communication and fault tolerance. Computations run to completion even as computational hosts join and leave the ongoing computation. Such hosts, or producers, use task caching and prefetching to overlap computation with interprocessor communication. To break a potential task server bottleneck, a network of task servers is presented. Even though task servers are envisioned as reliable, the self-organizing, scalable network of $n$ servers, described as a sibling-connected height-balanced fat tree, tolerates a sequence of $n-1$ server failures. Tasks are distributed throughout the server network via a simple "diffusion" process. CX is intended as a test bed for research on automated silent auctions, reputation services, authentication services, and bonding services. CX also provides a test bed for algorithm research into network-based parallel computation.
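
To illustrate how eager scheduling lets a computation finish while producers come and go, here is a minimal Python sketch. It is not CX's API or implementation: the class `EagerTaskServer`, its `take` and `put` methods, and the `simulate` helper are hypothetical names chosen for this example, and it assumes only the general idea that a task stays eligible for reassignment until some producer reports a result.

```python
from collections import deque


class EagerTaskServer:
    """Toy task server (hypothetical, not CX's design).

    A task remains eligible for assignment until a result for it arrives,
    so work taken by a producer that later leaves is simply handed out again.
    """

    def __init__(self, tasks):
        self.pending = deque(tasks)  # task ids, cycled in round-robin order
        self.results = {}            # task id -> result

    def take(self):
        """Return the next uncompleted task, or None when all are done."""
        while self.pending:
            task = self.pending.popleft()
            if task not in self.results:
                self.pending.append(task)  # keep it eligible for reassignment
                return task
            # Completed tasks are dropped from the queue here.
        return None

    def put(self, task, result):
        """Record a result; duplicates from redundant assignments are ignored."""
        self.results.setdefault(task, result)


def simulate(num_tasks=6, abandoned=2):
    """Run a toy computation in which one producer leaves mid-computation."""
    server = EagerTaskServer(range(num_tasks))

    # A departing producer takes some tasks and never reports results.
    for _ in range(abandoned):
        server.take()

    # A second producer keeps working until the server has nothing left.
    while True:
        task = server.take()
        if task is None:
            break
        server.put(task, task * task)  # stand-in for real work

    return server.results
```

In this sketch the tasks taken by the departing producer are handed out again, so every task ends up with a result. The cost of the approach is possible redundant work, which is why `put` ignores duplicate results.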