In this paper, we present an efficient alternative to the traditional bag-of-visual-words (BoW) vocabulary used for visual classification tasks. Our representation is both conceptually and computationally superior to BoW: (1) we iteratively compute a maximum likelihood estimate of an image given a set of characteristic features, whereas BoW methods represent an image as a histogram of visual words; (2) we randomly sample the set of characteristic features instead of running the computation-intensive clustering algorithms used in the vocabulary generation step of BoW methods. Our performance compares favorably to the state of the art in experiments on three challenging human action datasets and one scene categorization dataset, demonstrating the broad applicability of our method.
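The two ideas in the abstract can be illustrated with a minimal sketch. The abstract does not specify the likelihood model, so this sketch makes its own assumptions: each randomly sampled characteristic feature is treated as the center of a fixed isotropic Gaussian, and the image is represented by the mixture weights that maximize the likelihood of its local descriptors, estimated iteratively with EM. The function names (`sample_codebook`, `ml_weights`, `bow_histogram`) and the `sigma` parameter are hypothetical and not taken from the paper; the hard-assignment histogram is included only as the BoW baseline for comparison.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sample_codebook(descriptors, k, seed=0):
    """Pick k descriptors uniformly at random; no clustering is run."""
    rng = random.Random(seed)
    return rng.sample(descriptors, k)


def bow_histogram(descriptors, codebook):
    """BoW baseline: hard-assign each descriptor to its nearest codeword."""
    counts = [0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)),
                      key=lambda j: sq_dist(d, codebook[j]))
        counts[nearest] += 1
    return [c / len(descriptors) for c in counts]


def ml_weights(descriptors, codebook, sigma=1.0, iters=20):
    """Iterative ML estimate of mixture weights (EM, components held fixed).

    ASSUMPTION: isotropic Gaussian components with shared width `sigma`,
    centered on the sampled features. The paper's actual model may differ.
    """
    k = len(codebook)
    # Component likelihoods per descriptor, rescaled by the row maximum for
    # numerical stability (the scale cancels when responsibilities are
    # normalized in the E-step).
    lik = []
    for d in descriptors:
        logs = [-sq_dist(d, c) / (2.0 * sigma ** 2) for c in codebook]
        m = max(logs)
        lik.append([math.exp(v - m) for v in logs])

    w = [1.0 / k] * k  # start from uniform weights
    for _ in range(iters):
        acc = [0.0] * k
        for row in lik:
            post = [w[j] * row[j] for j in range(k)]  # E-step (unnormalized)
            z = sum(post)
            for j in range(k):
                acc[j] += post[j] / z
        w = [a / len(descriptors) for a in acc]  # M-step: mean responsibility
    return w
```

In this sketch, building the codebook costs a single random draw rather than a clustering pass over all descriptors, and `ml_weights` returns a soft, normalized representation with one entry per sampled feature that can be fed to a classifier in place of the BoW histogram.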