A recent trend in mainstream desktop systems is the use of general-purpose graphics processing units (GPGPUs) to obtain order-of-magnitude performance improvements. CUDA has emerged as a popular GPGPU programming model for C/C++ programmers. Given the widespread use of modern object-oriented languages with managed runtimes, such as Java and C#, it is natural to explore how CUDA-like capabilities can be made accessible to those programmers as well. In this paper, we present JCUDA, a programming interface that Java programmers can use to invoke CUDA kernels. With this interface, programmers write Java code that calls CUDA kernels directly, and the compiler takes responsibility for generating the Java-CUDA bridge code and the host-device data transfer calls. Our preliminary performance results show that this interface can deliver significant performance improvements to Java programmers. For future work, we plan to use the JCUDA interface as a target language for supporting higher-level parallel programming languages such as X10 and Habanero-Java.
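To illustrate the kind of work the abstract says is delegated to the compiler, the following sketch emulates in plain Java the three steps a Java-to-GPU bridge typically performs around a kernel call: copy inputs to the device, launch the kernel over a grid of threads, and copy results back. It is not the JCUDA interface or its generated code. The class and method names (`BridgeSketch`, `vectorAddThread`, `vectorAddViaBridge`), the vector-addition kernel, and the block size of 256 are all assumptions made for illustration, and the "device" is simulated with ordinary Java arrays so the example runs without a GPU.

```java
import java.util.Arrays;

// Hypothetical illustration only: a CPU emulation of the steps a
// Java-to-GPU bridge performs. None of these names come from JCUDA.
class BridgeSketch {

    // Stand-in for a CUDA kernel body: the work done by the single GPU
    // thread whose global index is i.
    static void vectorAddThread(int i, float[] a, float[] b, float[] c, int n) {
        if (i < n) {            // guard: the grid may contain more threads than elements
            c[i] = a[i] + b[i];
        }
    }

    // Stand-in for bridge code: wraps the kernel call with the data
    // transfers that a programmer would otherwise write by hand.
    static float[] vectorAddViaBridge(float[] hostA, float[] hostB) {
        int n = hostA.length;
        if (hostB.length != n) {
            throw new IllegalArgumentException("input arrays must have equal length");
        }

        // Step 1: host-to-device transfer (emulated by copying the arrays).
        float[] devA = Arrays.copyOf(hostA, n);
        float[] devB = Arrays.copyOf(hostB, n);
        float[] devC = new float[n];

        // Step 2: kernel launch over a 1-D grid of blocks (emulated sequentially).
        int threadsPerBlock = 256;  // assumed block size
        int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;
        for (int block = 0; block < blocks; block++) {
            for (int thread = 0; thread < threadsPerBlock; thread++) {
                int globalIndex = block * threadsPerBlock + thread;
                vectorAddThread(globalIndex, devA, devB, devC, n);
            }
        }

        // Step 3: device-to-host transfer of the result (emulated by copying).
        return Arrays.copyOf(devC, n);
    }

    public static void main(String[] args) {
        float[] c = vectorAddViaBridge(new float[] {1f, 2f, 3f},
                                       new float[] {10f, 20f, 30f});
        System.out.println(Arrays.toString(c)); // prints [11.0, 22.0, 33.0]
    }
}
```

In a real bridge, steps 1 and 3 would be device memory allocations and copies issued through native code, and step 2 would be an actual kernel launch. The caller sees only a method that takes and returns Java arrays, which is the separation of concerns the abstract describes.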