We present an OpenMP framework for Java that can exploit an available graphics card as an application accelerator. Dynamic languages such as Java and C# pose a challenge here because of their write-once-run-everywhere approach, which makes it impossible to assume at compile time whether an accelerator or graphics card will be available at run time, or of which type. We present an execution model that analyzes the running environment dynamically to determine what hardware is attached. Based on the result, it rewrites the bytecode and generates the necessary GPGPU code on the fly. Furthermore, we solve two additional problems caused by the combination of Java and CUDA. First, CUDA-capable hardware usually has little memory compared to main memory. However, because Java is a pointer-free language, array data can be kept in main memory and buffered in GPU memory. Second, CUDA requires data to be copied to and from the graphics card's memory explicitly. Because modern languages use many small objects, a naive implementation would perform many copy operations. The problem is exacerbated because Java implements multi-dimensional arrays as arrays of arrays. A clever copying technique and two new array packages allow for more efficient use of CUDA.
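
To illustrate why arrays of arrays are costly to transfer, the following is a minimal sketch of the general idea of packing a nested Java array into one contiguous, row-major buffer, so that a single bulk copy could replace one copy per row. It is not the paper's array packages or copying technique: the class name `FlatArray2D` and all of its methods are hypothetical, and the sketch assumes rectangular `float` data and performs no GPU transfer itself.

```java
/**
 * Hypothetical helper (an assumption for illustration, not the paper's API):
 * stores a rectangular 2-D float array in one contiguous row-major buffer.
 */
public final class FlatArray2D {
    private final int rows;
    private final int cols;
    private final float[] data; // row-major: element (r, c) lives at r * cols + c

    public FlatArray2D(int rows, int cols) {
        if (rows < 0 || cols < 0) {
            throw new IllegalArgumentException("dimensions must be non-negative");
        }
        this.rows = rows;
        this.cols = cols;
        this.data = new float[rows * cols];
    }

    /** Copies a rectangular array of arrays into one contiguous buffer. */
    public static FlatArray2D pack(float[][] nested) {
        int rows = nested.length;
        int cols = rows == 0 ? 0 : nested[0].length;
        FlatArray2D flat = new FlatArray2D(rows, cols);
        for (int r = 0; r < rows; r++) {
            if (nested[r].length != cols) {
                throw new IllegalArgumentException("jagged arrays are not supported");
            }
            System.arraycopy(nested[r], 0, flat.data, r * cols, cols);
        }
        return flat;
    }

    /** Rebuilds an ordinary Java array of arrays from the flat buffer. */
    public float[][] unpack() {
        float[][] nested = new float[rows][cols];
        for (int r = 0; r < rows; r++) {
            System.arraycopy(data, r * cols, nested[r], 0, cols);
        }
        return nested;
    }

    public float get(int r, int c) {
        return data[r * cols + c];
    }

    public void set(int r, int c, float value) {
        data[r * cols + c] = value;
    }

    /** The single buffer that one bulk host-to-device copy would transfer. */
    public float[] buffer() {
        return data;
    }

    public int rows() {
        return rows;
    }

    public int cols() {
        return cols;
    }
}
```

With a layout like this, an `n`-row matrix occupies one buffer rather than `n` separately allocated row objects, which is the property that makes a single explicit transfer possible. How the framework described here actually arranges and buffers its data is not stated in the abstract.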