A GPU-based high-throughput image retrieval algorithm

Authors:
Feiwen Zhu;Peng Chen;Donglei Yang;Weihua Zhang;Haibo Chen;Binyu Zang
Affiliations:
Fudan University, and Chinese Academy of Sciences;Fudan University;Fudan University;Fudan University;Fudan University;Fudan University
Venue:
Proceedings of the 5th Annual Workshop on General Purpose Processing with Graphics Processing Units
Year:
2012

Abstract

With the development of the Internet and cloud computing, multimedia data such as images and videos has become one of the most commonly processed data types. As the volume of multimedia data continues to grow, it is vitally important to extract useful information from it efficiently. However, due to the complexity of their core algorithms, multimedia retrieval applications are both data intensive and computationally intensive. Accelerating such applications to meet real-time requirements has therefore been a major challenge. Since the Graphics Processing Unit (GPU) entered the general-purpose computing domain (GPGPU), it has become one of the most popular accelerators for applications with real-time requirements. In this paper, we parallelize SURF, a widely used image retrieval algorithm that is at the core of many video and image retrieval applications, on GPGPU. We first analyze the parallelism within SURF to guarantee that enough tasks are mapped onto the large-scale computation resources of the GPGPU. We then exploit inherent GPGPU characteristics, such as 2D memory, to further boost performance. Finally, we optimize the cooperation between the CPU and the GPGPU, which is generally ignored in previous designs. Experimental results show that our parallelization and optimizations achieve a throughput of 340.5 frames/s on an NVIDIA GTX295 GPGPU, which is 15X faster than the maximally optimized CPU version. Compared to CUDA SURF, a state-of-the-art parallelization of SURF on GPGPU, our system achieves a 2.3X speedup.
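
The reported figures can be put in perspective with a little arithmetic. The short Python sketch below derives the baseline throughputs that the quoted speedups imply. These derived values are not stated in the abstract. The sketch assumes that "15X faster" and "2.3X" are plain throughput ratios, and the variable names are illustrative only.

```python
# Figures quoted in the abstract.
gpu_fps = 340.5              # reported throughput on the GTX295, frames/s
speedup_vs_cpu = 15.0        # assumed to be a plain throughput ratio
speedup_vs_cuda_surf = 2.3   # assumed to be a plain throughput ratio

# Implied baseline throughputs (derived here, not reported in the abstract).
cpu_fps = gpu_fps / speedup_vs_cpu
cuda_surf_fps = gpu_fps / speedup_vs_cuda_surf


def avg_ms_per_frame(fps: float) -> float:
    """Average processing time per frame, in milliseconds, at a given throughput."""
    return 1000.0 / fps


if __name__ == "__main__":
    for name, fps in [("this work", gpu_fps),
                      ("CUDA SURF (implied)", cuda_surf_fps),
                      ("optimized CPU (implied)", cpu_fps)]:
        print(f"{name:>24}: {fps:7.1f} frames/s, "
              f"{avg_ms_per_frame(fps):6.2f} ms/frame on average")
```

Under these assumptions the CPU baseline would run at roughly 23 frames/s and CUDA SURF at roughly 148 frames/s. The per-frame times are averages derived from throughput. In a pipelined CPU-GPU design the latency of an individual frame can be higher than this average.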