A general approach for run-time specialization and its application to C
POPL '96 Proceedings of the 23rd ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Dynamic specialization in the Fabius system
ACM Computing Surveys (CSUR) - Special issue: electronic supplement to the September 1998 issue
An evaluation of staged run-time optimizations in DyC
Proceedings of the ACM SIGPLAN 1999 conference on Programming language design and implementation
Dynamo: a transparent dynamic optimization system
PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation
Implementing the NAS Benchmark MG in SAC
IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
A Compilation Scheme for a Hierarchy of Array Types
IFL '02 Selected Papers from the 13th International Workshop on Implementation of Functional Languages
An infrastructure for adaptive dynamic optimization
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Automatic, Template-Based Run-Time Specialization: Implementation and Experimental Study
ICCL '98 Proceedings of the 1998 International Conference on Computer Languages
Shared memory multiprocessor support for functional array processing in SAC
Journal of Functional Programming
SAC: a functional array language for efficient multi-threaded execution
International Journal of Parallel Programming
Merging compositions of array skeletons in SAC
Parallel Computing - Algorithmic skeletons
From Contracts Towards Dependent Types: Proofs by Partial Evaluation
Implementation and Application of Functional Languages
FlumeJava: easy, efficient data-parallel pipelines
PLDI '10 Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation
IFL'09 Proceedings of the 21st international conference on Implementation and application of functional languages
A practical method for quickly evaluating program optimizations
HiPEAC'05 Proceedings of the First international conference on High Performance Embedded Architectures and Compilers
Special Issue: Compilers for Parallel Computing (CPC 2010)
Concurrency and Computation: Practice & Experience
CEFP'11 Proceedings of the 4th Summer School conference on Central European Functional Programming School
Hi-index | 0.00 |
Programming productivity very much depends on the availability of basic building blocks that can be reused for a wide range of application scenarios and the ability to define rich abstraction hierarchies. Driven by the aim for increased reuse, such basic building blocks tend to become more and more generic in their specification; structural as well as behavioural properties are turned into parameters that are passed on to lower layers of abstraction where eventually a differentiation is being made. In the context of array programming, such properties are typically array ranks (number of axes/dimensions) and array shapes (number of elements along each axis/dimension). This allows for abstract definitions of operations such as element-wise additions, concatenations, rotations, and so on, which jointly enable a very high-level compositional style of programming, similar to, for instance, MATLAB. However, such a generic programming style generally comes at a price in terms of runtime overheads when compared against tailor-made low-level implementations. Additional layers of abstraction as well as the lack of hard-coded structural properties often inhibits optimisations that are obvious otherwise. Although complex static compiler analyses and transformations such as partial evaluations can ameliorate the situation to quite some extent, there are cases, where the required level of information is not available until runtime. In this paper, we propose to shift part of the optimisation process into the runtime of applications. Triggered by some runtime observation, the compiler asynchronously applies partial evaluation techniques to frequently used program parts and dynamically replaces initial program fragments by more specialised ones through dynamic re-linking. In contrast to many existing approaches, we suggest this optimisation to be done in a rather non-intrusive, decoupled way. We use a full-fledged compiler that is run on a separate core. This measure enables us to run the compiler on its highest optimisation-level, which requires non-negligible compilation times for our optimisations. We use the compiler's type system to identify the potential dynamic optimisations. And we use the host language's module system as a facilitator for the dynamic code modifications. We present the architecture and implementation of an adaptive compilation framework for Single Assignment C, a data-parallel array programming language. Single Assignment C advocates shape-generic and rank-generic programming with arrays. A sophisticated, highly optimising compiler technology nevertheless achieves competitive runtime performance. We demonstrate the suitability of our approach to achieve consistently high performance independent of the static availability of array properties by means of several experiments based on a highly generic formulation of rank-invariant convolution as a case study. Copyright © 2011 John Wiley & Sons, Ltd.