Classical application domains of parallel computing are dominated by the processing of large arrays of numerical data. Whereas most functional languages focus on lists and trees rather than arrays, SAC is tailor-made, in both design and implementation, for efficient high-level array processing. Advanced compiler optimizations yield performance levels that are often competitive with low-level imperative implementations. Based on SAC, we develop compilation techniques and runtime system support for the compiler-directed parallel execution of high-level functional array processing code on shared-memory architectures. Competitive sequential performance lets us exploit the conceptual advantages of the functional paradigm to achieve real performance gains over existing imperative implementations, and not merely relative to uniprocessor runtimes. While the design of SAC facilitates parallelization, high sequential performance poses a particular challenge: achieving satisfactory speedups through parallelization becomes substantially more difficult. We present an initial compilation scheme and multi-threaded execution model, which we refine stepwise to reduce organizational overhead and improve parallel performance. We close with a detailed analysis, based on a series of experiments, of how certain design decisions affect runtime performance.