Graphics Processing Units (GPUs) are well suited to highly data-parallel algorithms such as image processing because of their massively parallel processing power. Many image processing applications use histogramming, which fills a set of bins according to how often each pixel value occurs in an input image. Histogramming has been mapped onto GPUs before this work. Although significant research effort has gone into optimizing that mapping, we show that both the performance and the performance predictability of existing methods can still be improved. In this paper, we present two novel histogramming methods, both of which achieve higher performance and better predictability than existing methods. We discuss the performance limitations of both novel methods by exploring algorithm trade-offs, and we evaluate the performance of the novel and the existing methods. The first novel method gives an average performance increase of 33% over existing methods on non-synthetic benchmarks. The second novel method gives an average performance increase of 56% over existing methods and is guaranteed to be fully data independent. While the second method is specifically designed for newer GPU architectures, the first method is also suitable for older architectures.
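As an illustration of the histogramming operation described in the abstract, the following minimal Python sketch counts pixel values into bins. It is not either of the paper's two methods. The second function emulates one generic way to parallelize the counting, in which each worker fills a private sub-histogram that is summed afterwards; this is an assumption chosen for illustration, and the names `histogram`, `histogram_replicated`, `num_bins`, and `num_workers` are hypothetical.

```python
def histogram(pixels, num_bins=256):
    """Sequential reference: count how often each pixel value occurs."""
    bins = [0] * num_bins
    for value in pixels:
        bins[value] += 1
    return bins


def histogram_replicated(pixels, num_bins=256, num_workers=4):
    """Emulate a parallel mapping with private sub-histograms.

    Assumption for illustration only: each worker owns one private copy
    of the bins, so workers never update the same counter. The copies
    are summed bin by bin at the end.
    """
    sub_histograms = [[0] * num_bins for _ in range(num_workers)]
    for index, value in enumerate(pixels):
        worker = index % num_workers  # interleaved assignment of pixels to workers
        sub_histograms[worker][value] += 1
    # Reduction step: add up the private copies for every bin.
    return [sum(sub[b] for sub in sub_histograms) for b in range(num_bins)]


image = [0, 3, 3, 1, 3, 0, 2, 3]  # toy "image" with pixel values in 0..3
print(histogram(image, num_bins=4))             # [2, 1, 1, 4]
print(histogram_replicated(image, num_bins=4))  # [2, 1, 1, 4]
```

With a single shared set of bins, many pixels of the same value would all update the same counter, which is one way a parallel histogram's running time can come to depend on the image content. Private copies avoid that contention at the cost of extra memory and a final reduction.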