A probabilistic representation for efficient large scale visual recognition tasks

Authors:
S. Bhattacharya; R. Sukthankar; Rong Jin; M. Shah
Affiliations:
Comput. Vision Lab., Univ. of Central Florida, Orlando, FL, USA; Intel Labs., Carnegie Mellon Univ., Pittsburgh, PA, USA; Michigan State Univ., East Lansing, MI, USA; Comput. Vision Lab., Univ. of Central Florida, Orlando, FL, USA
Venue:
CVPR '11 Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition
Year:
2011

Citing: 0
Cited by: 4

Human activities as stochastic Kronecker graphs
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part II

A survey of video datasets for human action and activity recognition
Computer Vision and Image Understanding

Exclusive visual descriptor quantization
ACCV'12 Proceedings of the 11th Asian conference on Computer Vision - Volume Part I

Recognition of complex events in open-source web-scale videos: a bottom up approach
Proceedings of the 21st ACM international conference on Multimedia

Abstract

In this paper, we present an efficient alternative to the traditional vocabulary-based bag-of-visual-words (BoW) representation used for visual classification tasks. Our representation is both conceptually and computationally superior to BoW: (1) we iteratively generate a maximum likelihood estimate of an image given a set of characteristic features, in contrast to BoW methods, where an image is represented as a histogram of visual words; (2) we randomly sample a set of characteristic features instead of employing the computation-intensive clustering algorithms used during the vocabulary generation step of BoW methods. Our performance compares favorably to the state of the art in experiments on three challenging human action datasets and a scene categorization dataset, demonstrating the universal applicability of our method.
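
The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the likelihood model, so the sketch assumes fixed isotropic Gaussian kernels centered on the sampled features and estimates only the mixture weights with EM-style updates. The function names, the bandwidth `sigma`, and the iteration count are hypothetical choices made for illustration.

```python
import numpy as np


def sample_characteristic_features(pool, k, rng):
    """Pick k descriptors uniformly at random from the pool (no clustering)."""
    idx = rng.choice(len(pool), size=k, replace=False)
    return pool[idx]


def ml_mixture_weights(descriptors, centers, sigma=1.0, n_iter=50):
    """Iterative maximum likelihood estimate of mixture weights for one image.

    Assumption (not from the abstract): each sampled feature is the center of
    a fixed isotropic Gaussian kernel, and only the weights are estimated.
    """
    # Squared Euclidean distances between descriptors and centers: (n, k).
    diff = descriptors[:, None, :] - centers[None, :, :]
    d2 = (diff ** 2).sum(axis=2)

    # Log-kernel values, shifted per row for numerical stability. A per-row
    # constant cancels when the responsibilities are normalized below.
    log_k = -d2 / (2.0 * sigma ** 2)
    log_k -= log_k.max(axis=1, keepdims=True)
    kernel = np.exp(log_k)

    k = centers.shape[0]
    weights = np.full(k, 1.0 / k)  # start from uniform weights
    for _ in range(n_iter):
        # E-step: soft assignment of each descriptor to each center.
        resp = kernel * weights
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: the new weight is the mean responsibility per center.
        weights = resp.mean(axis=0)
    return weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pool = rng.normal(size=(200, 8))    # synthetic stand-in for local descriptors
    centers = sample_characteristic_features(pool, k=16, rng=rng)
    image = rng.normal(size=(40, 8))    # descriptors of one synthetic image
    w = ml_mixture_weights(image, centers)
    print(w.shape, float(w.sum()))
```

Under these assumptions the resulting weight vector plays the role that the visual-word histogram plays in BoW: it is a fixed-length, non-negative vector that sums to one and can be passed to a classifier. The difference is that every descriptor contributes softly to all sampled features instead of being hard-assigned to a single cluster center.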