Current heterogeneous multi-core environments such as GPGPU architectures are hard to program with conventional imperative and object-oriented (OO) languages. Two basic problems must be tackled: (1) the available synchronization primitives make it too easy to program race conditions and deadlocks, and (2) these environments do not support, or support only inefficiently, the instructions required for efficient execution of OO programs, e.g., because function pointers and pointer arithmetic are lacking. We address both problems with a new language that combines functional programming (FP) and OO programming. We solve problem (1) by auto-parallelization in the functional core, where all loops and non-dependent calls can be executed in parallel; FP is used to write computationally intensive code with safe concurrent memory access. Problem (2) is solved by an alternative object model that uses neither pointer arithmetic nor function pointers, but instead smart pointers/proxies (to implement polymorphism) as well as mixins and templates (to implement OO-style code reuse). To cleanly integrate the two language cores, we propose a new integration model that even grants some restricted ways to access state from within FP mode. With the new language and a prototype compiler we transparently parallelize code, without annotations from the programmer, to target both CUDA and multi-core machines, and obtain good speedups.