Composition and reuse with compiled domain-specific languages

Authors:
Arvind K. Sujeeth;Tiark Rompf;Kevin J. Brown;HyoukJoong Lee;Hassan Chafi;Victoria Popic;Michael Wu;Aleksandar Prokopec;Vojin Jovanovic;Martin Odersky;Kunle Olukotun
Affiliations:
Stanford University;École Polytechnique Fédérale de Lausanne (EPFL), Switzerland and Oracle Labs;Stanford University;Stanford University;Stanford University and Oracle Labs;Stanford University;Stanford University;École Polytechnique Fédérale de Lausanne (EPFL), Switzerland;École Polytechnique Fédérale de Lausanne (EPFL), Switzerland;École Polytechnique Fédérale de Lausanne (EPFL), Switzerland;Stanford University
Venue:
ECOOP'13 Proceedings of the 27th European conference on Object-Oriented Programming
Year:
2013

Abstract

Programmers who need high performance currently rely on low-level, architecture-specific programming models (e.g., OpenMP for CMPs, CUDA for GPUs, MPI for clusters). Performance optimization with these frameworks usually requires expertise in the specific programming model and a deep understanding of the target architecture. Domain-specific languages (DSLs) are a promising alternative, allowing compilers to map problem-specific abstractions directly to low-level architecture-specific programming models. However, developing DSLs is difficult, and using multiple DSLs together in a single application is even harder because existing compiled solutions do not compose. In this paper, we present four new performance-oriented DSLs developed with Delite, an extensible DSL compilation framework. We demonstrate new techniques to compose compiled DSLs embedded in a common backend within a single program and show that generic optimizations can be applied across the different DSL sections. Our new DSLs are implemented with a small number of reusable components (fewer than 9 parallel operators in total) and still achieve performance up to 125x better than library implementations and at worst within 30% of optimized stand-alone DSLs. The DSLs retain good performance when composed, and applying cross-DSL optimizations yields up to an additional 1.82x improvement.