The importance of adapting networks of workstations for use as parallel processing platforms is well established. However, current solutions do not always address important issues that exist in real networks. External factors such as resource sharing, unpredictable network behavior, and failures are present in multiuser networks and must be addressed. CALYPSO is a prototype software system for writing and executing parallel programs on non-dedicated platforms, based on COTS networked workstations, operating systems, and compilers. Notable properties of the system include: (1) a simple programming paradigm that incorporates shared-memory constructs and separates programming parallelism from execution parallelism, (2) transparent utilization of unreliable shared resources through dynamic load balancing and fault tolerance, and (3) effective performance for large classes of coarse-grained computations. We present the system and report our initial experiments and performance results in settings that closely resemble the dynamic behavior of a "real" network. Under varying workload conditions, resource availability, and process failures, the efficiency of the test program we present ranged from 84% to 94%, benchmarked against a sequential program.
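As an illustration of properties (1) and (2), the following minimal Python sketch shows one way a fixed number of logical tasks might be run on a varying number of workers, with tasks re-executed after a simulated worker failure. It is not CALYPSO's implementation or API: the names `run_parallel_step`, `WorkerFailure`, and `square_with_flaky_first_try` are hypothetical, a thread pool stands in for networked workstations, and tasks are assumed to be idempotent so that re-running them is safe.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


class WorkerFailure(Exception):
    """Simulated loss of a worker while it was running a task (hypothetical)."""


def run_parallel_step(task_fn, num_tasks, num_workers, max_attempts=3):
    """Run num_tasks logical tasks on num_workers workers, retrying failures.

    num_tasks is the parallelism expressed by the program; num_workers is the
    parallelism available at run time. The two are chosen independently.
    """
    results = {}
    attempts = {i: 0 for i in range(num_tasks)}
    pending = set(range(num_tasks))
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        while pending:
            # Idle workers pick up queued tasks, which balances load dynamically.
            futures = {}
            for i in pending:
                attempts[i] += 1
                futures[pool.submit(task_fn, i, attempts[i])] = i
            for fut in as_completed(futures):
                i = futures[fut]
                try:
                    results[i] = fut.result()
                    pending.discard(i)
                except WorkerFailure:
                    # Leave the task in `pending` so the next round re-runs it.
                    if attempts[i] >= max_attempts:
                        raise
    return [results[i] for i in range(num_tasks)]


def square_with_flaky_first_try(i, attempt):
    """Example task: every third task fails on its first attempt only."""
    if i % 3 == 0 and attempt == 1:
        raise WorkerFailure(f"worker lost while running task {i}")
    return i * i


if __name__ == "__main__":
    # Eight logical tasks on three workers; failed tasks are re-executed.
    print(run_parallel_step(square_with_flaky_first_try, 8, 3))
```

Calling `run_parallel_step` with the same task count but a different `num_workers` returns the same results, which is the sense in which the program's parallelism is decoupled from the resources that happen to be available.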