Topic models such as probabilistic Latent Semantic Analysis (pLSA) and Latent Dirichlet Allocation (LDA) have been used successfully to discover individual activities in a scene. However, these methods do not discover group activities, which are commonly observed in real-life videos of public places. In this paper, we address the problem of discovering activities and their associations as group activities in an unsupervised manner. We propose a method that uses a two-layer hierarchical latent structure to correlate individual activities in the lower layer with group activities in the higher layer. Our model considers each scene to be composed of a mixture of group activities. Each group activity is in turn modeled as a mixture of individual activities, represented as multinomial distributions. Each individual activity is represented as a distribution over local visual features. We use a Gibbs-sampling-based algorithm to infer these activities. Our method can summarize not only the individual activities but also the common group activities in a video. We demonstrate the strength of our method by mining activities, and the salient correlations among them, in real-life videos of crowded public scenes.
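The two-layer structure described above can be read as a generative process: a scene draws a group activity, the group activity draws an individual activity, and the individual activity draws a local visual feature. The following is a minimal sketch of that reading, not the authors' implementation. The function names, the symmetric Dirichlet priors, the treatment of a short video clip as the unit that holds a scene's mixture, and all sizes are assumptions made for illustration; the abstract specifies none of them. Gibbs-sampling inference is not shown.

```python
import random


def sample_dirichlet(rng, alpha, size):
    """Draw a probability vector from a symmetric Dirichlet(alpha) prior.

    The Dirichlet prior is an assumption; the abstract only says that the
    mixtures are multinomial distributions.
    """
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


def sample_index(rng, probs):
    """Draw one index from a categorical distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def build_model(rng, n_groups=2, n_activities=4, vocab_size=10, alpha=1.0):
    """Create hypothetical model parameters (sizes chosen for illustration)."""
    # Higher layer: each group activity is a mixture over individual activities.
    group_to_activity = [
        sample_dirichlet(rng, alpha, n_activities) for _ in range(n_groups)
    ]
    # Lower layer: each individual activity is a distribution over visual words.
    activity_to_word = [
        sample_dirichlet(rng, alpha, vocab_size) for _ in range(n_activities)
    ]
    return group_to_activity, activity_to_word


def generate_clip(rng, group_to_activity, activity_to_word, n_words, alpha=1.0):
    """Generate the visual words of one clip under the two-layer hierarchy."""
    # The clip (scene) is a mixture of group activities.
    clip_mixture = sample_dirichlet(rng, alpha, len(group_to_activity))
    tokens = []
    for _ in range(n_words):
        group = sample_index(rng, clip_mixture)
        activity = sample_index(rng, group_to_activity[group])
        word = sample_index(rng, activity_to_word[activity])
        tokens.append((group, activity, word))
    return clip_mixture, tokens


rng = random.Random(0)
group_to_activity, activity_to_word = build_model(rng)
clip_mixture, tokens = generate_clip(
    rng, group_to_activity, activity_to_word, n_words=20
)
```

Each token records the group activity, individual activity, and visual word that produced it. In an unsupervised setting only the visual words are observed, and the two latent assignments are what an inference procedure such as Gibbs sampling would have to recover.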