The integral histogram for images is an efficient preprocessing method for speeding up diverse computer vision algorithms, including object detection, appearance-based tracking, recognition, and segmentation. Our proposed Graphics Processing Unit (GPU) implementation uses parallel prefix sums on row and column histograms in a cross-weave scan, with high GPU utilization and communication-aware data transfer between CPU and GPU memories. Two data structures and communication models were evaluated: a 3-D array that stores the binned histogram for each pixel, and an equivalent linearized 1-D array, each with a distinctive data movement pattern. Using the 3-D array with many kernel invocations and a low workload per kernel was inefficient, highlighting the need for careful mapping of sequential algorithms onto the GPU. The reorganized 1-D array, with a single data transfer to the GPU and high GPU utilization, was 60 times faster than the CPU version for a 1K × 1K image, reaching 49 frames/sec, and 21 times faster for 512 × 512 images, reaching 194 frames/sec. The integral histogram module is applied as part of the Likelihood of Features Tracking (LOFT) system for video object tracking using fusion of multiple cues.
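To make the cross-weave scan concrete, the following is a minimal sequential sketch in Python, not the GPU implementation described above. It bins each pixel, runs a prefix sum along every row of each bin plane (horizontal pass), then along every column (vertical pass), and stores the result in a linearized 1-D array. The function names, the `[bin][row][column]` memory layout, and the uniform binning rule are assumptions made for illustration; the abstract does not specify them.

```python
from itertools import accumulate


def integral_histogram(image, bins, max_value=256):
    """Build an integral histogram as a flat list (assumed layout: [bin][row][col]).

    image is a list of rows of integers in [0, max_value).
    """
    h, w = len(image), len(image[0])
    ih = [0] * (bins * h * w)

    # Binning: each pixel contributes a 1 to the plane of its bin.
    for y in range(h):
        for x in range(w):
            b = image[y][x] * bins // max_value
            ih[(b * h + y) * w + x] = 1

    # Horizontal pass: inclusive prefix sum along every row of every bin plane.
    for b in range(bins):
        for y in range(h):
            start = (b * h + y) * w
            ih[start:start + w] = accumulate(ih[start:start + w])

    # Vertical pass: inclusive prefix sum along every column of every bin plane.
    for b in range(bins):
        for x in range(w):
            for y in range(1, h):
                ih[(b * h + y) * w + x] += ih[(b * h + y - 1) * w + x]
    return ih


def region_histogram(ih, bins, h, w, top, left, bottom, right):
    """Histogram of the inclusive rectangle, using four lookups per bin."""
    def at(b, y, x):
        # Entries outside the image (index -1) count as zero.
        return ih[(b * h + y) * w + x] if y >= 0 and x >= 0 else 0

    return [
        at(b, bottom, right) - at(b, top - 1, right)
        - at(b, bottom, left - 1) + at(b, top - 1, left - 1)
        for b in range(bins)
    ]


# Small usage example on a 3 x 4 image with 4 bins.
img = [[0, 64, 128, 255],
       [10, 70, 130, 200],
       [0, 0, 255, 255]]
ih = integral_histogram(img, bins=4)
whole = region_histogram(ih, 4, 3, 4, 0, 0, 2, 3)
inner = region_histogram(ih, 4, 3, 4, 1, 1, 2, 2)
```

Once the table is built, the histogram of any rectangle costs four lookups per bin regardless of the rectangle's size. In this sketch the row and column scans are independent across bins, rows, and columns, which is the kind of independence a parallel prefix-sum implementation could exploit.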