Empowering Visual Categorization With the GPU

Authors:
K. E.A. van de Sande;T. Gevers;C. G.M. Snoek
Affiliations:
Intell. Syst. Lab. Amsterdam, Univ. of Amsterdam, Amsterdam, Netherlands;-;-
Venue:
IEEE Transactions on Multimedia
Year:
2011

Citing 0
Cited 12

Assistive tagging: A survey of multimedia tagging with human-computer joint exploration
ACM Computing Surveys (CSUR)

High-performance dynamic quantum clustering on graphics processors
Journal of Computational Physics

Accelerating text mining workloads in a MapReduce-based distributed GPU environment
Journal of Parallel and Distributed Computing

Accelerating visual categorization with the GPU
ECCV'10 Proceedings of the 11th European conference on Trends and Topics in Computer Vision - Volume Part II

Accelerating satellite image based large-scale settlement detection with GPU
Proceedings of the 1st ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data

Enhanced representation and multi-task learning for image annotation
Computer Vision and Image Understanding

Recommendations for video event recognition using concept vocabularies
Proceedings of the 3rd ACM conference on International conference on multimedia retrieval

Large-scale visual concept detection with explicit kernel maps and power mean SVM
Proceedings of the 3rd ACM conference on International conference on multimedia retrieval

Exclusive visual descriptor quantization
ACCV'12 Proceedings of the 11th Asian conference on Computer Vision - Volume Part I

Scalable training with approximate incremental laplacian eigenmaps and PCA
Proceedings of the 21st ACM international conference on Multimedia

A GPU implementation of a structural-similarity-based aerial-image classification
The Journal of Supercomputing

Selective Search for Object Recognition
International Journal of Computer Vision

Abstract

Visual categorization is important for managing large collections of digital images and video, where textual metadata is often incomplete or simply unavailable. The bag-of-words model has become the most powerful method for visual categorization of images and video. Despite its high accuracy, a severe drawback of this model is its high computational cost. Because newer CPU and GPU architectures gain computational power mainly by increasing their level of parallelism, exploiting this parallelism becomes an important direction for handling the computational cost of the bag-of-words approach. When optimizing a system based on the bag-of-words approach, the goal is to minimize the time it takes to process batches of images. In this paper, we analyze the bag-of-words model for visual categorization in terms of computational cost and identify two major bottlenecks: the quantization step and the classification step. We address these two bottlenecks by proposing two efficient algorithms for quantization and classification that exploit the GPU hardware and the CUDA parallel programming model. The algorithms are designed to (1) keep categorization accuracy intact, (2) decompose the problem, and (3) give the same numerical results. In experiments on large-scale datasets, it is shown that, by using a parallel implementation on the GeForce GTX260 GPU, classifying unseen images is 4.8 times faster than a quad-core CPU version on the Core i7 920, while giving the exact same numerical results. In addition, we show how the algorithms can be generalized to other applications, such as text retrieval and video retrieval. Moreover, when the obtained speedup is used to process extra video frames in a video retrieval benchmark, the accuracy of visual categorization is improved by 29%.
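
To make the quantization bottleneck concrete, the following is a minimal illustrative sketch in plain Python and is not the authors' CUDA implementation. It assumes hard assignment of each descriptor to its nearest codebook entry under squared Euclidean distance, and it assumes one common way of decomposing that distance, ||d - c||^2 = ||d||^2 + ||c||^2 - 2 d.c, so that the dominant cost is a set of independent dot products that parallel hardware can evaluate as a matrix multiplication. The abstract does not state which decomposition the paper uses; the names `squared_norms` and `quantize` are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_norms(rows: Sequence[Vector]) -> List[float]:
    """Squared Euclidean norm of every row; computed once and reused."""
    return [sum(x * x for x in row) for row in rows]


def quantize(descriptors: Sequence[Vector], codebook: Sequence[Vector]) -> List[int]:
    """Count, per codebook entry, how many descriptors are nearest to it.

    Assumption (not taken from the abstract): distances are expanded as
    ||d||^2 + ||c||^2 - 2 * dot(d, c). Each dot product is independent of
    the others, which is what makes this step easy to parallelize.
    """
    d_norms = squared_norms(descriptors)
    c_norms = squared_norms(codebook)
    histogram = [0] * len(codebook)
    for d, d_norm in zip(descriptors, d_norms):
        best_index, best_dist = 0, float("inf")
        for j, (c, c_norm) in enumerate(zip(codebook, c_norms)):
            dot = sum(x * y for x, y in zip(d, c))
            dist = d_norm + c_norm - 2.0 * dot
            if dist < best_dist:
                best_index, best_dist = j, dist
        histogram[best_index] += 1
    return histogram


# Toy data: two codebook entries and four 2-D descriptors.
codebook = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 1.0], [9.0, 9.0], [0.0, 2.0], [11.0, 10.0]]
histogram = quantize(descriptors, codebook)
```

In this toy run two descriptors fall nearest to each codebook entry, so the resulting bag-of-words histogram has a count of two in both bins. A GPU version would replace the two nested loops with a single batched matrix product followed by a per-row minimum.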