Understanding throughput-oriented architectures
Communications of the ACM
Copperhead: compiling an embedded data parallel language
Proceedings of the 16th ACM symposium on Principles and practice of parallel programming
Fast Mersenne prime testing on the GPU
Proceedings of the Fourth Workshop on General Purpose Processing on Graphics Processing Units
Real-time Eulerian water simulation using a restricted tall cell grid
ACM SIGGRAPH 2011 papers
Effect of the block occupancy in GPGPU over the performance of particle swarm algorithm
ICANNGA'11 Proceedings of the 10th international conference on Adaptive and natural computing algorithms - Volume Part I
Simulation of bevel gear cutting with GPGPUs--performance and productivity
Computer Science - Research and Development
Free surface flow simulations on GPGPUs using the LBM
Computers & Mathematics with Applications
Mesh deformations in X3D via CUDA with freeform deformation lattices
Proceedings of the 2011 international conference on Virtual and mixed reality: systems and applications - Volume Part II
On the GPGPU parallelization issues of finite element approximate inverse preconditioning
Journal of Computational and Applied Mathematics
Toward real-time simulation of cardiac dynamics
Proceedings of the 9th International Conference on Computational Methods in Systems Biology
The right balance: restructuring the parallel and scientific computing course
Journal of Computing Sciences in Colleges
Sisal 3.2 language features overview
PaCT'11 Proceedings of the 11th international conference on Parallel computing technologies
Chestnut: a GPU programming language for non-experts
Proceedings of the 2012 International Workshop on Programming Models and Applications for Multicores and Manycores
Real-Time interactive character animation by parallelization of genetic algorithms
MIG'11 Proceedings of the 4th international conference on Motion in Games
Speeding up a chaos-based image encryption algorithm using GPGPU
EUROCAST'11 Proceedings of the 13th international conference on Computer Aided Systems Theory - Volume Part I
A high-performance implementation of differential power analysis on graphics cards
CARDIS'11 Proceedings of the 10th IFIP WG 8.8/11.2 international conference on Smart Card Research and Advanced Applications
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
SCF: A Framework for Task-Level Coordination in Reconfigurable, Heterogeneous Systems
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Computers & Mathematics with Applications
Parallel terrain visibility calculation on the graphics processing unit
Concurrency and Computation: Practice & Experience
A fair comparison of modern CPUs and GPUs running the genetic algorithm under the knapsack benchmark
EvoApplications'12 Proceedings of the 2012 European conference on Applications of Evolutionary Computation
Proceedings of the 2012 Symposium on High Performance Computing
GPU accelerated computation of the longest common subsequence
Facing the Multicore-Challenge II
Swarm grid: a proposal for high performance of parallel particle swarm optimization using GPGPU
ICCSA'12 Proceedings of the 12th international conference on Computational Science and Its Applications - Volume Part I
C-DAC's efforts: application kernels on HPC cluster with GPU accelerators
Proceedings of the ATIP/A*CRC Workshop on Accelerator Technologies for High-Performance Computing: Does Asia Lead the Way?
Hierarchical fractional-step approximations and parallel kinetic Monte Carlo algorithms
Journal of Computational Physics
Parallel verlet neighbor list algorithm for GPU-optimized MD simulations
Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine
Central force optimization on a GPU: a case study in high performance metaheuristics
The Journal of Supercomputing
Three-dimensional thinning algorithms on graphics processing units and multicore CPUs
Concurrency and Computation: Practice & Experience
Euro-Par'12 Proceedings of the 18th international conference on Parallel Processing
Neural PCA and maximum likelihood hebbian learning on the GPU
ICANN'12 Proceedings of the 22nd international conference on Artificial Neural Networks and Machine Learning - Volume Part II
Computational optimization strategies for the simulation of random media and components
Computational Optimization and Applications
A fully parallel, high precision, N-body code running on hybrid computing platforms
Journal of Computational Physics
Parallel design for error-resilient entropy coding algorithm on GPU
Journal of Parallel and Distributed Computing
Parallel interval newton method on CUDA
PARA'12 Proceedings of the 11th international conference on Applied Parallel and Scientific Computing
Reversible simulations of elastic collisions
ACM Transactions on Modeling and Computer Simulation (TOMACS)
Development of a unified FDTD-FEM library for electromagnetic analysis with CPU and GPU computing
The Journal of Supercomputing
RMASBench: benchmarking dynamic multi-agent coordination in urban search and rescue
Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems
RMASBench: a benchmarking system for multi-agent coordination in urban search and rescue
Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems
A novel concurrent cache-friendly binary decision diagram construction for multi-core platforms
Proceedings of the Conference on Design, Automation and Test in Europe
Use of multiple GPUs on shared memory multiprocessors for ultrasound propagation simulations
AusPDC '12 Proceedings of the Tenth Australasian Symposium on Parallel and Distributed Computing - Volume 127
Accelerate MapReduce on GPUs with multi-level reduction
Proceedings of the 5th Asia-Pacific Symposium on Internetware
Proceedings of the 2nd ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data
Why it is time for a HyPE: a hybrid query processing engine for efficient GPU coprocessing in DBMS
Proceedings of the VLDB Endowment
Accelerating moderately stiff chemical kinetics in reactive-flow simulations using GPUs
Journal of Computational Physics
Software Transactional Memory for GPU Architectures
Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization
Mathematics and Computers in Simulation
Recent progress and challenges in exploiting graphics processors in computational fluid dynamics
The Journal of Supercomputing
Accelerating Single Iteration Performance of CUDA-Based 3D Reaction-Diffusion Simulations
International Journal of Parallel Programming
"This book is required reading for anyone working with accelerator-based computing systems."
From the Foreword by Jack Dongarra, University of Tennessee and Oak Ridge National Laboratory

CUDA is a computing architecture designed to facilitate the development of parallel programs. In conjunction with a comprehensive software platform, the CUDA Architecture enables programmers to draw on the immense power of graphics processing units (GPUs) when building high-performance applications. GPUs, of course, have long been available for demanding graphics and game applications. CUDA now brings this valuable resource to programmers working on applications in other domains, including science, engineering, and finance. No knowledge of graphics programming is required, just the ability to program in a modestly extended version of C.

CUDA by Example, written by two senior members of the CUDA software platform team, shows programmers how to employ this new technology. The authors introduce each area of CUDA development through working examples. After a concise introduction to the CUDA platform and architecture, as well as a quick-start guide to CUDA C, the book details the techniques and trade-offs associated with each key CUDA feature. You'll discover when to use each CUDA C extension and how to write CUDA software that delivers truly outstanding performance.

Major topics covered include parallel programming, thread cooperation, constant memory and events, texture memory, graphics interoperability, atomics, streams, CUDA C on multiple GPUs, advanced atomics, and additional CUDA resources.

All the CUDA software tools you'll need are freely available for download from NVIDIA.
http://developer.nvidia.com/object/cuda-by-example.html