GPU computing is at a tipping point, becoming more widely used in demanding consumer applications and high-performance computing. This article describes the rapid evolution of GPU architectures, from graphics processors to massively parallel many-core multiprocessors; recent developments in GPU computing architectures; and how the enthusiastic adoption of CPU+GPU coprocessing is accelerating parallel applications.