Scalable heterogeneous computing systems, which are composed of a mix of compute devices such as commodity multicore processors, graphics processors, and reconfigurable processors, are gaining attention as one approach to continuing performance improvement while managing the new challenge of energy efficiency. As these systems become more common, it is important to be able to compare and contrast architectural designs and programming systems in a fair and open forum. To this end, we have designed the Scalable HeterOgeneous Computing benchmark suite (SHOC). SHOC's initial focus is on systems containing graphics processing units (GPUs) and multicore processors, and on the new OpenCL programming standard. SHOC is a spectrum of programs that test the performance and stability of these scalable heterogeneous computing systems. At the lowest level, SHOC uses microbenchmarks to assess architectural features of the system. At higher levels, SHOC uses application kernels to determine system-wide performance, including system features such as intranode and internode communication among devices. SHOC includes benchmark implementations in both OpenCL and CUDA in order to provide a comparison of these programming models.