Satin: A high-level and efficient grid programming model

Authors:
Rob V. Van Nieuwpoort;Gosia Wrzesińska;Ceriel J. H. Jacobs;Henri E. Bal
Affiliations:
Vrije Universiteit Amsterdam, The Netherlands;Vrije Universiteit Amsterdam, The Netherlands;Vrije Universiteit Amsterdam, The Netherlands;Vrije Universiteit Amsterdam, The Netherlands
Venue:
ACM Transactions on Programming Languages and Systems (TOPLAS)
Year:
2010

Abstract

Computational grids have an enormous potential to provide compute power. However, this power remains largely unexploited today for most applications, except trivially parallel programs. Developing parallel grid applications is simply too difficult. Grids introduce several problems not encountered before, mainly due to the highly heterogeneous and dynamic computing and networking environment. Furthermore, failures occur frequently, and resources may be claimed by higher-priority jobs at any time. In this article, we solve these problems for an important class of applications: divide-and-conquer. We introduce a system called Satin that simplifies the development of parallel grid applications by providing a rich high-level programming model that completely hides communication. All grid issues are handled transparently by the runtime system, not by the programmer. Satin's programming model is based on Java, features spawn-sync primitives and shared objects, and uses asynchronous exceptions and an abort mechanism to support speculative parallelism. To allow an efficient implementation, Satin consistently exploits the idea that grids are hierarchically structured. Dynamic load balancing is done with a novel cluster-aware scheduling algorithm that hides the long wide-area latencies by overlapping them with useful local work. Satin's shared object model lets the application define the consistency model it needs. If an application needs only loose consistency, it does not have to pay high performance penalties for wide-area communication and synchronization. We demonstrate how grid problems such as resource changes and failures can be handled transparently and efficiently. Finally, we show that adaptivity is important in grids. Satin can increase performance considerably by adding and removing compute resources automatically, based on the application's requirements and the utilization of the machines and networks in the grid.
Using an extensive evaluation on real grids with up to 960 cores, we demonstrate that it is possible to provide a simple high-level programming model for divide-and-conquer applications, while achieving excellent performance on grids. At the same time, we show that the divide-and-conquer model scales better on large systems than the master-worker approach, since it has no single central bottleneck.
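
The spawn-sync style of divide-and-conquer mentioned in the abstract can be illustrated with a minimal sketch. It does not use Satin's API, which this record does not show; it assumes the standard `java.util.concurrent` fork/join framework on a single JVM, where `fork()` plays the role of a spawn and `join()` the role of a sync. The class name `SpawnSyncSketch`, the `Fib` task, and the threshold value are hypothetical choices made for illustration. Nothing here models grids, cluster-aware scheduling, shared objects, or fault tolerance.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Illustrative only: single-JVM fork/join, not Satin's grid runtime.
public class SpawnSyncSketch {

    // A divide-and-conquer task: each invocation splits into two subproblems.
    static final class Fib extends RecursiveTask<Long> {
        // Hypothetical cutoff below which the work is done sequentially,
        // so that very small tasks do not pay scheduling overhead.
        private static final int SEQUENTIAL_THRESHOLD = 15;
        private final int n;

        Fib(int n) {
            this.n = n;
        }

        @Override
        protected Long compute() {
            if (n <= SEQUENTIAL_THRESHOLD) {
                return sequential(n);
            }
            Fib left = new Fib(n - 1);
            Fib right = new Fib(n - 2);
            left.fork();                     // "spawn": left may be stolen by an idle worker
            long rightResult = right.compute(); // keep doing useful local work meanwhile
            long leftResult = left.join();   // "sync": wait for the spawned subtask
            return leftResult + rightResult;
        }

        static long sequential(int n) {
            return n < 2 ? n : sequential(n - 1) + sequential(n - 2);
        }
    }

    // Runs the task on the JVM-wide common pool, which uses work stealing.
    static long fib(int n) {
        return ForkJoinPool.commonPool().invoke(new Fib(n));
    }

    public static void main(String[] args) {
        System.out.println(fib(30)); // prints 832040
    }
}
```

The programmer only marks where work may be split and where results are needed; the runtime decides which worker executes each subtask. That is the property the abstract attributes to Satin's model, there extended across clusters and wide-area links.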