EXOCHI: architecture and programming environment for a heterogeneous multi-core multithreaded system
Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
Optimization principles and application performance evaluation of a multithreaded GPU using CUDA
Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming
Merge: a programming model for heterogeneous multi-core systems
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Pangaea: a tightly-coupled IA32 heterogeneous chip multiprocessor
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
Asynchronous Language and System of Numerical Algorithms Fragmented Programming
PaCT '09 Proceedings of the 10th International Conference on Parallel Computing Technologies
ACM Transactions on Architecture and Code Optimization (TACO)
memCUDA: map device memory to host memory on GPGPU platform
NPC'10 Proceedings of the 2010 IFIP international conference on Network and parallel computing
Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
A multi-core software API for embedded MPSoC environments
MTPP'10 Proceedings of the Second Russia-Taiwan conference on Methods and tools of parallel programming multicomputers
A domain-specific approach to heterogeneous parallelism
Proceedings of the 16th ACM symposium on Principles and practice of parallel programming
Bothnia: a dual-personality extension to the Intel integrated graphics driver
ACM SIGOPS Operating Systems Review
FPGA Acceleration of MultiFactor CDO Pricing
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
RapidMind: portability across architectures and its limitations
Facing the multicore-challenge
Case study: stereo vision experiments with multi-core software API on embedded MPSoC environments
The Journal of Supercomputing
The high-performance parallel processors in video accelerators (GPUs) can serve as numerical co-processors in a variety of applications. The RapidMind Development Platform is a software development system that lets developers use standard C++ to create high-performance, massively parallel applications that run on the GPU. Using the RapidMind platform, we compare the performance of FFT, BLAS dense matrix multiplication, and quasi-Monte Carlo option pricing benchmarks on the GPU against highly tuned CPU implementations. We discuss the advantages and limitations of GPU acceleration, as well as techniques for optimizing performance.