Firepile: run-time compilation for GPUs in Scala

Authors:
Nathaniel Nystrom;Derek White;Kishen Das
Affiliations:
University of Lugano, Lugano, Switzerland;University of Texas at Arlington, Arlington, TX, USA;University of Texas at Arlington, Arlington, TX, USA
Venue:
Proceedings of the 10th ACM international conference on Generative programming and component engineering
Year:
2011

Abstract

Recent advances have enabled GPUs to be used as general-purpose parallel processors on commodity hardware at little cost. However, the ability to program these devices has not kept up with their performance. The GPU programming model imposes a number of restrictions that make these devices difficult to program. For example, software running on the GPU cannot perform dynamic memory allocation, so the programmer must pre-allocate all memory the GPU might use. To achieve good performance, GPU programmers must also be aware of how data is moved between host and GPU memory and between the different levels of the GPU memory hierarchy. We describe Firepile, a library for GPU programming in Scala. The library enables a subset of Scala to be executed on the GPU. Code trees can be created from run-time function values and then analyzed and transformed to generate GPU code. A key property of this mechanism is that it is modular: unlike with other meta-programming constructs, the use of code trees need not be exposed in the library interface. Code trees are general and can be used by library writers in other application domains. Our experiments show that Firepile users can achieve performance comparable to C code targeted to the GPU with shorter, simpler, and easier-to-understand code.
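
The idea of analyzing and transforming a code tree to generate GPU code can be illustrated with a small sketch. This is not Firepile's API: the abstract says Firepile obtains code trees from run-time function values, whereas the sketch below builds the tree by hand to stay self-contained. All names (`Expr`, `Param`, `Const`, `Add`, `Mul`, `KernelGen`) are hypothetical, and the choice of OpenCL-style C as the output is an assumption made for illustration.

```scala
// Hypothetical sketch; none of these names come from Firepile.
// A tiny code tree for element-wise arithmetic over float arrays.
sealed trait Expr
final case class Param(name: String) extends Expr           // input array, read at the work-item index
final case class Const(value: Float) extends Expr           // floating-point literal
final case class Add(left: Expr, right: Expr) extends Expr
final case class Mul(left: Expr, right: Expr) extends Expr

object KernelGen {
  // Analysis pass: collect the input arrays the tree reads, in first-use order.
  def params(e: Expr): List[String] = e match {
    case Param(n)  => List(n)
    case Const(_)  => Nil
    case Add(l, r) => (params(l) ++ params(r)).distinct
    case Mul(l, r) => (params(l) ++ params(r)).distinct
  }

  // Transformation pass: render the tree as a C-like expression indexed by i.
  def emit(e: Expr): String = e match {
    case Param(n)  => s"$n[i]"
    case Const(v)  => s"${v}f"
    case Add(l, r) => s"(${emit(l)} + ${emit(r)})"
    case Mul(l, r) => s"(${emit(l)} * ${emit(r)})"
  }

  // Wrap the expression in kernel source text (OpenCL-style C, assumed here).
  def kernel(name: String, body: Expr): String = {
    val inputs = params(body).map(p => s"__global const float* $p")
    val args   = (inputs :+ "__global float* out").mkString(", ")
    s"__kernel void $name($args) { int i = get_global_id(0); out[i] = ${emit(body)}; }"
  }
}
```

For example, `KernelGen.emit(Add(Mul(Const(2.0f), Param("x")), Param("y")))` yields the string `((2.0f * x[i]) + y[i])`, and `KernelGen.kernel` wraps that expression in a kernel whose parameter list is derived by the analysis pass. The sketch stops at producing source text; compiling and launching the kernel is outside its scope.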