The Manchester prototype dataflow computer
Communications of the ACM - Special section on computer architecture
MULTILISP: a language for concurrent symbolic computation
ACM Transactions on Programming Languages and Systems (TOPLAS)
Serial combinators: “optimal” grains of parallelism
Proc. of a conference on Functional programming languages and computer architecture
ORBIT: an optimizing compiler for scheme
SIGPLAN '86 Proceedings of the 1986 SIGPLAN symposium on Compiler construction
Annual review of computer science vol. 1, 1986
Computer
Managing stack frames in Smalltalk
SIGPLAN '87 Papers of the Symposium on Interpreters and interpretive techniques
Control of parallelism in the Manchester Dataflow Machine
Proc. of a conference on Functional programming languages and computer architecture
An assessment of multilisp: lessons from experience
International Journal of Parallel Programming
Preliminary results with the initial implementation of Qlisp
LFP '88 Proceedings of the 1988 ACM conference on LISP and functional programming
Workcrews: an abstraction for controlling parallelism
International Journal of Parallel Programming
Multiprocessor execution of functional programs
International Journal of Parallel Programming
Mul-T: a high-performance parallel Lisp
PLDI '89 Proceedings of the ACM SIGPLAN 1989 Conference on Programming language design and implementation
Connection Machine Lisp: fine-grained parallel symbolic processing
LFP '86 Proceedings of the 1986 ACM conference on LISP and functional programming
APRIL: a processor architecture for multiprocessing
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
Para-functional programming: a paradigm for programming multiprocessor systems
POPL '86 Proceedings of the 13th ACM SIGACT-SIGPLAN symposium on Principles of programming languages
Queue-based multi-processing LISP
LFP '84 Proceedings of the 1984 ACM Symposium on LISP and functional programming
Orbit: an optimizing compiler for scheme
Orbit: an optimizing compiler for scheme
Fast parallel implementation of lazy languages—the EQUALS experience
LFP '92 Proceedings of the 1992 ACM conference on LISP and functional programming
A foundation for an efficient multi-threaded scheme system
LFP '92 Proceedings of the 1992 ACM conference on LISP and functional programming
A customizable substrate for concurrent languages
PLDI '92 Proceedings of the ACM SIGPLAN 1992 conference on Programming language design and implementation
Heterogeneous parallel programming in Jade
Proceedings of the 1992 ACM/IEEE conference on Supercomputing
Static dependent costs for estimating execution time
LFP '94 Proceedings of the 1994 ACM conference on LISP and functional programming
Using the run-time sizes of data structures to guide parallel-thread creation
LFP '94 Proceedings of the 1994 ACM conference on LISP and functional programming
The semantics of future and its use in program optimization
POPL '95 Proceedings of the 22nd ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Commutativity analysis: a new analysis framework for parallelizing compilers
PLDI '96 Proceedings of the ACM SIGPLAN 1996 conference on Programming language design and implementation
The semantics of Scheme with future
Proceedings of the first ACM SIGPLAN international conference on Functional programming
Dynamic feedback: an effective technique for adaptive computing
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
PPOPP '97 Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming
Commutativity analysis: a new analysis technique for parallelizing compilers
ACM Transactions on Programming Languages and Systems (TOPLAS)
The design, implementation, and evaluation of Jade
ACM Transactions on Programming Languages and Systems (TOPLAS)
Automatic parallelization of divide and conquer algorithms
Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Transparent communication for distributed objects in Java
JAVA '99 Proceedings of the ACM 1999 conference on Java Grande
Eliminating synchronization bottlenecks in object-based programs using adaptive replication
ICS '99 Proceedings of the 13th international conference on Supercomputing
APRIL: a processor architecture for multiprocessing
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
ACM Transactions on Computer Systems (TOCS)
Efficient load balancing for wide-area divide-and-conquer applications
PPoPP '01 Proceedings of the eighth ACM SIGPLAN symposium on Principles and practices of parallel programming
A Syntactic Theory of Dynamic Binding
Higher-Order and Symbolic Computation
Lazy Task Creation: A Technique for Increasing the Granularity of Parallel Programs
IEEE Transactions on Parallel and Distributed Systems
Eliminating synchronization bottlenecks using adaptive replication
ACM Transactions on Programming Languages and Systems (TOPLAS)
Predicting Scalability of Parallel Garbage Collectors on Shared Memory Multiprocessors
IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium
Satin: Efficient Parallel Divide-and-Conquer in Java
Euro-Par '00 Proceedings from the 6th International Euro-Par Conference on Parallel Processing
Towards a Computational Model for UFO
PACT '94 Proceedings of the IFIP WG10.3 Working Conference on Parallel Architectures and Compilation Techniques
Commutativity Analysis: A Technique for Automatically Parallelizing Pointer-Based Computations
IPPS '96 Proceedings of the 10th International Parallel Processing Symposium
Exploiting Implicit Parallelism in Functional Programs with SLAM
IFL '00 Selected Papers from the 12th International Workshop on Implementation of Functional Languages
The semantics of future and an application
Journal of Functional Programming
OOPSLA '05 Proceedings of the 20th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
A unifying link abstraction for wireless sensor networks
Proceedings of the 3rd international conference on Embedded networked sensor systems
Adaptive scheduling with parallelism feedback
Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Adaptive work stealing with parallelism feedback
Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming
Manticore: a heterogeneous parallel language
Proceedings of the 2007 workshop on Declarative aspects of multicore programming
Carbon: architectural support for fine-grained parallelism on chip multiprocessors
Proceedings of the 34th annual international symposium on Computer architecture
Status report: the manticore project
ML '07 Proceedings of the 2007 workshop on Workshop on ML
Quasi-static scheduling for safe futures
Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming
Adaptive work-stealing with parallelism feedback
ACM Transactions on Computer Systems (TOCS)
A scheduling framework for general-purpose parallel languages
Proceedings of the 13th ACM SIGPLAN international conference on Functional programming
An adaptive cut-off for task parallelism
Proceedings of the 2008 ACM/IEEE conference on Supercomputing
Efficient, portable implementation of asynchronous multi-place programs
Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming
Evaluating OpenMP 3.0 Run Time Systems on Unbalanced Task Graphs
IWOMP '09 Proceedings of the 5th International Workshop on OpenMP: Evolving OpenMP in an Age of Extreme Parallelism
COORDINATION '09 Proceedings of the 11th International Conference on Coordination Models and Languages
Beyond nested parallelism: tight bounds on work-stealing overheads for parallel futures
Proceedings of the twenty-first annual symposium on Parallelism in algorithms and architectures
ClearPath: highly parallel collision avoidance for multi-agent simulation
Proceedings of the 2009 ACM SIGGRAPH/Eurographics Symposium on Computer Animation
Lazy binary-splitting: a run-time adaptive work-stealing scheduler
Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
Lightweight asynchrony using parasitic threads
Proceedings of the 5th ACM SIGPLAN workshop on Declarative aspects of multicore programming
Satin: A high-level and efficient grid programming model
ACM Transactions on Programming Languages and Systems (TOPLAS)
Evaluation of OpenMP task scheduling strategies
IWOMP'08 Proceedings of the 4th international conference on OpenMP in a new era of parallelism
Exploiting fine-grained parallelism on cell processors
Euro-Par'10 Proceedings of the 16th international Euro-Par conference on Parallel processing: Part II
Programming in Manticore, a heterogenous parallel functional language
CEFP'09 Proceedings of the Third summer school conference on Central European functional programming school
Implicitly threaded parallelism in manticore
Journal of Functional Programming
Dynamic workload balancing deques for branch and bound algorithms in the message passing interface
International Journal of High Performance Systems Architecture
Oracle scheduling: controlling granularity in implicitly parallel languages
Proceedings of the 2011 ACM international conference on Object oriented programming systems languages and applications
A work-stealing scheduler for X10's task parallelism with suspension
Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming
Dependence analysis for safe futures
Science of Computer Programming
On the granularity of divide-and-conquer parallelism
FP'95 Proceedings of the 1995 international conference on Functional Programming
Design and implementation of a customizable work stealing scheduler
Proceedings of the 3rd International Workshop on Runtime and Operating Systems for Supercomputers
Energy-efficient work-stealing language runtimes
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
Friendly barriers: efficient work-stealing with return barriers
Proceedings of the 10th ACM SIGPLAN/SIGOPS international conference on Virtual execution environments
Hi-index | 0.00 |
Many parallel algorithms are naturally expressed at a fine level of granularity, often finer than a MIMD parallel system can exploit efficiently. Most builders of parallel systems have looked to either the programmer or a parallelizing compiler to increase the granularity of such algorithms. In this paper we explore a third approach to the granularity problem by analyzing two strategies for combining parallel tasks dynamically at run-time. We reject the simpler load-based inlining method, where tasks are combined based on dynamic load level, in favor of the safer and more robust lazy task creation method, where tasks are created only retroactively as processing resources become available.These strategies grew out of work on Mul-T [14], an efficient parallel implementation of Scheme, but could be used with other applicative languages as well. We describe our Mul-T implementations of lazy task creation for two contrasting machines, and present performance statistics which show the method's effectiveness. Lazy task creation allows efficient execution of naturally expressed algorithms of a substantially finer grain than possible with previous parallel Lisp systems.