In search of a program generator to implement generic transformations for high-performance computing

Authors:
Albert Cohen; Sébastien Donadio; Maria-Jesus Garzaran; Christoph Herrmann; Oleg Kiselyov; David Padua
Affiliations:
ALCHEMY group, INRIA Futurs, Orsay, France; PRiSM, University of Versailles, France; DCS, University of Illinois at Urbana-Champaign, IL; FMI, University of Passau, Germany; FNMOC, Monterey, CA; DCS, University of Illinois at Urbana-Champaign, IL
Venue:
Science of Computer Programming - Special issue on the first MetaOCaml workshop 2004
Year:
2006

Abstract

The quality of compiler-optimized code for high-performance applications is far behind what optimization and domain experts can achieve by hand. Although it may seem surprising at first glance, the performance gap has been widening over time, due to the tremendous complexity increase in microprocessor and memory architectures, and to the rising level of abstraction of popular programming languages and styles. This paper explores in-between solutions, neither fully automatic nor fully manual ways to adapt a computationally intensive application to the target architecture. By mimicking complex sequences of transformations useful to optimize real codes, we show that generative programming is a practical means to implement architecture-aware optimizations for high-performance applications.

This work explores the promises of generative programming languages and techniques for the high-performance computing expert. We show that complex, architecture-specific optimizations can be implemented in a type-safe, purely generative framework. Peak performance is achievable through the careful combination of a high-level, multi-stage evaluation language, MetaOCaml, with low-level code generation techniques. Nevertheless, our results also show that generative approaches for high-performance computing do not come without technical caveats and implementation barriers concerning productivity and reuse. We describe these difficulties and identify ways to hide or overcome them, from abstract syntaxes to heterogeneous generators of code generators, combining high-level and type-safe multi-stage programming with a back-end generator of imperative code.