General-purpose GPU computing (GPGPU) has taken off in the past few years, with great promise of increased desktop processing power due to the large number of fast computing cores on high-end graphics cards. Many publications have demonstrated phenomenal performance, reporting speedups of as much as 1000× over code running on multi-core CPUs. Other studies have claimed that well-tuned CPU code reduces the performance gap significantly. We demonstrate that this important discussion is missing a key aspect: where in the system the data resides, and the overhead of moving it to where it will be used, and back again if necessary. We have benchmarked a broad set of GPU kernels on a number of platforms with different GPUs, and our results show that when memory transfer times are included, running a kernel can easily take 2 to 50× longer than the GPU processing time alone. Therefore, it is necessary either to include memory transfer overhead when reporting GPU performance, or to explain why this is not relevant for the application in question. We suggest a taxonomy for future CPU/GPU comparisons, and we argue that this is not only germane for reporting performance, but is important to heterogeneous scheduling research in general.
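To illustrate the accounting issue the abstract describes, the sketch below uses hypothetical timings (the numbers are invented for illustration, not taken from the paper's benchmarks) to show how a kernel-only speedup shrinks once host-to-device and device-to-host transfer times are charged against the GPU.

```python
# Hypothetical timings, in seconds, for a single kernel launch.
# These values are illustrative assumptions, not measurements from the paper.
kernel_time = 0.010      # GPU compute time alone
h2d_transfer = 0.080     # host-to-device copy of the inputs
d2h_transfer = 0.060     # device-to-host copy of the results

# End-to-end GPU time charges the transfers to the kernel.
end_to_end = h2d_transfer + kernel_time + d2h_transfer

# How much longer the full operation takes than the kernel alone
# (the paper reports this factor ranging from roughly 2x to 50x).
overhead_factor = end_to_end / kernel_time

# Suppose a tuned CPU implementation takes 100x the kernel time.
cpu_time = 100 * kernel_time

kernel_only_speedup = cpu_time / kernel_time   # what is often reported
honest_speedup = cpu_time / end_to_end         # what the user observes

print(f"overhead factor:     {overhead_factor:.1f}x")
print(f"kernel-only speedup: {kernel_only_speedup:.1f}x")
print(f"end-to-end speedup:  {honest_speedup:.1f}x")
```

With these numbers the kernel-only comparison claims a 100× speedup, while the end-to-end comparison yields under 7×; this is the gap the proposed taxonomy is meant to make explicit, e.g. by stating whether inputs already reside in device memory and whether results must return to the host.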