Javelin 3 is a software system for developing large-scale, fault-tolerant, adaptively parallel applications. When all or part of an application can be cast as a master-worker or branch-and-bound computation, Javelin 3 frees application developers from concerns about inter-processor communication and fault tolerance among networked hosts, allowing them to focus on the underlying application. The paper describes a fault-tolerant task scheduler and analyzes its performance. The task scheduler integrates work stealing with an advanced form of eager scheduling. It enables dynamic task decomposition, which improves host load balancing when tasks have non-uniform computational loads that become evident only at execution time. Speedup measurements are presented for runs on up to 1,000 hosts. We analyze the expected performance degradation due to unresponsive hosts and measure the actual degradation.
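The idea behind eager scheduling can be illustrated with a small round-based simulation: any task that has no result yet may be handed out again to another host, so a host that never answers cannot stall the computation. This is a minimal sketch and not Javelin 3's scheduler. It omits work stealing and dynamic task decomposition, and the function name `eager_schedule`, its parameters, and the round-based model are all assumptions made for illustration.

```python
from collections import deque


def eager_schedule(tasks, hosts, unresponsive, compute):
    """Simulate eager scheduling in synchronous rounds (illustrative only).

    tasks        -- list of hashable task identifiers
    hosts        -- list of host names
    unresponsive -- set of hosts that accept tasks but never return results
    compute      -- function mapping a task to its result

    Returns (results, assignments), where assignments[t] counts how many
    times task t was handed to some host.
    """
    if all(h in unresponsive for h in hosts):
        # Without at least one responsive host, no task can ever finish.
        raise RuntimeError("no responsive hosts")

    results = {}
    assignments = {t: 0 for t in tasks}
    pending = deque(tasks)  # tasks without a result, in round-robin order

    while pending:
        # Hand one unfinished task to every host. If fewer tasks than hosts
        # remain, the same task is assigned redundantly to several hosts.
        in_flight = []
        for host in hosts:
            task = pending[0]
            pending.rotate(-1)
            assignments[task] += 1
            in_flight.append((host, task))

        # Collect results; unresponsive hosts return nothing, and the first
        # result recorded for a task is kept.
        for host, task in in_flight:
            if host in unresponsive:
                continue
            if task not in results:
                results[task] = compute(task)

        # Tasks still lacking a result remain eligible for reassignment.
        pending = deque(t for t in pending if t not in results)

    return results, assignments
```

Calling `eager_schedule(list(range(4)), ["a", "b", "c"], {"b"}, lambda t: t * t)` completes every task even though host `b` never responds. The task first given to `b` is simply reassigned in later rounds, at the cost of some redundant assignments.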