OpenGL Programming Guide: The Official Guide to Learning OpenGL, Version 1.2
OpenGL Programming Guide: The Official Guide to Learning OpenGL, Version 1.2
Proceedings of the ACM SIGGRAPH/EUROGRAPHICS conference on Graphics hardware
GPU Gems 2: Programming Techniques for High-Performance Graphics and General-Purpose Computation (Gpu Gems)
Optimization principles and application performance evaluation of a multithreaded GPU using CUDA
Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming
Roofline: an insightful visual performance model for multicore architectures
Communications of the ACM - A Direct Path to Dependable Software
A characterization of the Rodinia benchmark suite with comparison to contemporary CMP workloads
IISWC '10 Proceedings of the IEEE International Symposium on Workload Characterization (IISWC'10)
Exploiting Memory Access Patterns to Improve Memory Performance in Data-Parallel Architectures
IEEE Transactions on Parallel and Distributed Systems
OpenCL Programming Guide
Faster GPS via the sparse fourier transform
Proceedings of the 18th annual international conference on Mobile computing and networking
Multi2Sim: a simulation framework for CPU-GPU computing
Proceedings of the 21st international conference on Parallel architectures and compilation techniques
Combining GPU data-parallel computing with OpenGL
ACM SIGGRAPH 2013 Courses
Analyzing Optimization Techniques for Power Efficiency on Heterogeneous Platforms
IPDPSW '13 Proceedings of the 2013 IEEE 27th International Symposium on Parallel and Distributed Processing Workshops and PhD Forum
Interactive particle dynamics using OpenCL and Kinect
International Journal of Parallel, Emergent and Distributed Systems
Hi-index | 0.00 |
Graphics Processing Units (GPUs) have gained recognition as the primary form of accelerators for graphics rendering in the gaming domain. They have also been widely accepted as the computing platform of choice in many scientific and high performance computing domains. The parallelism offered by the GPUs is used for simultaneous processing of compute and graphics by applications belonging to a range of domains. The availability of programming standards such as OpenCL and OpenGL has been leveraged to achieve the compute-graphics interoperability in the same application. However, given the increasing demands in both compute and graphics for emerging scientific visualization and immersive gaming applications, degradation in efficiency can be seen due to the continual switching between compute/graphics, swapping in and out of their associated runtime environments. We need to better understand how to tune this interoperable environment in order to allow compute and graphics to run both efficiently and simultaneously. Presently we evaluate each of these domains in isolation. In this paper, we evaluate the performance and efficiency of the OpenCL-OpenGL(CL-GL) interoperability(interop) mode. We explore different methods to improve the execution performance of the CL-GL interop-based applications. We propose a slot-based rendering mechanism for CL-GL interop to increase the efficiency of the application. To evaluate CL-GL and our slot-based scheme, we study five scientific applications using OpenCL and OpenGL for compute and graphics rendering. Our study covers two AMD Radeon discrete GPUs and one shared memory AMD APU as test platforms. We demonstrate that leveraging the CL-GL interop interface results in a 2.2X performance increase, and our slot-based rendering provides 60% increase in performance by providing a 24% improvement in L2 cache hit rate on GPUs and APUs.