Lambda lifting: transforming programs to recursive equations
Proc. of a conference on Functional programming languages and computer architecture
PLDI '88 Proceedings of the ACM SIGPLAN 1988 conference on Programming Language design and Implementation
Scans as Primitive Parallel Operations
IEEE Transactions on Computers
Implementation of a portable nested data-parallel language
PPOPP '93 Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming
FPCA '93 Proceedings of the conference on Functional programming languages and computer architecture
Programming parallel algorithms
Communications of the ACM
Basic Linear Algebra Subprograms for Fortran Usage
ACM Transactions on Mathematical Software (TOMS)
Guarded recursive datatype constructors
POPL '03 Proceedings of the 30th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
ISCOPE '98 Proceedings of the Second International Symposium on Computing in Object-Oriented Parallel Environments
Journal of Functional Programming
Brook for GPUs: stream computing on graphics hardware
ACM SIGGRAPH 2004 Papers
ACM SIGGRAPH 2004 Papers
Proceedings of the 2006 ACM/IEEE conference on Supercomputing
Scan primitives for GPU computing
Proceedings of the 22nd ACM SIGGRAPH/EUROGRAPHICS symposium on Graphics hardware
Stream fusion: from lists to streams to nothing at all
ICFP '07 Proceedings of the 12th ACM SIGPLAN international conference on Functional programming
Efficient computation of sum-products on GPUs through software-managed cache
Proceedings of the 22nd annual international conference on Supercomputing
AIEE-IRE '62 (Spring) Proceedings of the May 1-3, 1962, spring joint computer conference
Nikola: embedding compiled GPU functions in Haskell
Proceedings of the third ACM Haskell symposium on Haskell
Regular, shape-polymorphic, parallel arrays in Haskell
Proceedings of the 15th ACM SIGPLAN international conference on Functional programming
Implementing survey propagation on graphics processing units
SAT'06 Proceedings of the 9th international conference on Theory and Applications of Satisfiability Testing
Expressive array constructs in an embedded GPU kernel programming language
DAMP '12 Proceedings of the 7th workshop on Declarative aspects and applications of multicore programming
Nested data-parallelism on the gpu
Proceedings of the 17th ACM SIGPLAN international conference on Functional programming
Optimising purely functional GPU programs
Proceedings of the 18th ACM SIGPLAN international conference on Functional programming
Hi-index | 0.00 |
Graphics processors (GPUs) are highly parallel devices that promise high performance, and they are now flexible enough to be used for general-purpose computing. A programming language based on implicitly data-parallel collective array operations can permit high-level, effective programming of GPUs. I describe three optimizations for such a language: automatic use of GPU shared memory cache, array fusion, and hoisting of nested parallel constructs. These optimizations are simple to implement because of the design of the language to which they are applied but can result in large run-time speedups.