High performance predictable histogramming on GPUs: exploring and evaluating algorithm trade-offs

Authors:
Cedric Nugteren;Gert-Jan van den Braak;Henk Corporaal;Bart Mesman
Affiliations:
Eindhoven University of Technology, The Netherlands;Eindhoven University of Technology, The Netherlands;Eindhoven University of Technology, The Netherlands;Eindhoven University of Technology, The Netherlands
Venue:
Proceedings of the Fourth Workshop on General Purpose Processing on Graphics Processing Units
Year:
2011

Abstract

Graphics Processing Units (GPUs) are well suited to highly data-parallel workloads such as image processing, owing to their massively parallel processing power. Many image processing applications use the histogramming algorithm, which fills a set of bins according to the frequency of occurrence of pixel values taken from an input image. Histogramming has been mapped onto GPUs prior to this work. Although significant research effort has been spent on optimizing the mapping, we show that both the performance and the performance predictability of existing methods can still be improved. In this paper, we present two novel histogramming methods, both of which achieve higher performance and predictability than existing methods. We discuss the performance limitations of both novel methods by exploring algorithm trade-offs. Both the novel and the existing histogramming methods are evaluated for performance. The first novel method gives an average performance increase of 33% over existing methods for non-synthetic benchmarks. The second novel method gives an average performance increase of 56% over existing methods and is guaranteed to be fully data independent. While the second method is specifically designed for newer GPU architectures, the first method is also suitable for older architectures.
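
As an illustration of the histogramming operation described in the abstract, the following is a minimal sequential sketch in Python. It is not either of the paper's GPU methods; the function name `histogram`, the parameter `num_bins`, and the sample pixel values are assumptions made for this example only.

```python
def histogram(pixels, num_bins=256):
    """Sequential reference histogram (illustrative only, not a GPU method).

    pixels   -- iterable of integer pixel values in the range [0, num_bins)
    num_bins -- number of bins; 256 is an assumed default for 8-bit images
    """
    bins = [0] * num_bins
    for value in pixels:
        if not 0 <= value < num_bins:
            raise ValueError(f"pixel value {value} is outside [0, {num_bins})")
        # Each pixel increments exactly one bin. Which bin is hit depends
        # on the input data, not on the pixel's position in the image.
        bins[value] += 1
    return bins


# Hypothetical 8-pixel "image" with values in [0, 4).
image = [0, 3, 3, 1, 3, 0, 2, 3]
counts = histogram(image, num_bins=4)
print(counts)  # [2, 1, 1, 4]
```

Because the bin updated by each pixel is determined by the pixel's value, a parallel version must deal with several threads updating the same bin at once, and how often that happens varies with the input image. The sketch sidesteps this by running sequentially.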