This article discusses the capabilities of state-of-the-art GPU-based high-throughput computing systems and considers the challenges to scaling single-chip parallel-computing systems, highlighting high-impact areas that the computing research community can address. Nvidia Research is investigating an architecture for a heterogeneous high-performance computing system that seeks to address these challenges.