Language-motivated approaches to action recognition

  • Authors:
  • Manavender R. Malgireddy, Ifeoma Nwogu, Venu Govindaraju

  • Affiliations:
  • Department of Computer Science and Engineering, University at Buffalo, SUNY, Buffalo, NY (all authors)

  • Venue:
  • The Journal of Machine Learning Research
  • Year:
  • 2013

Abstract

We present language-motivated approaches to detecting, localizing, and classifying activities and gestures in videos. To gain statistical insight into the underlying motion patterns of activities, we develop a dynamic, hierarchical Bayesian model that connects low-level visual features in videos with poses, motion patterns, and classes of activities. This process is somewhat analogous to detecting topics or categories in documents based on their word content, except that our documents are dynamic. The proposed generative model harnesses both the temporal ordering power of dynamic Bayesian networks such as hidden Markov models (HMMs) and the automatic clustering power of hierarchical Bayesian models such as the latent Dirichlet allocation (LDA) model. We also introduce a probabilistic framework for detecting and localizing pre-specified activities (or gestures) in a video sequence, analogous to the use of filler models for keyword detection in speech processing. We demonstrate the robustness of our classification model and our spotting framework by recognizing activities in unconstrained real-life video sequences and by spotting gestures via a one-shot learning approach.
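To make the filler-model analogy concrete, below is a minimal sketch (not the paper's actual system) of HMM-based gesture spotting: a sliding window is scored under a gesture HMM and a generic background ("filler") HMM, and spans where the log-likelihood ratio favors the gesture model are flagged. The function names, the emission interface, and the window/threshold values are illustrative assumptions.

```python
import numpy as np

def log_forward(log_pi, log_A, log_B):
    """HMM forward algorithm in log space; returns log p(observations | model).

    log_pi: (K,)   log initial-state probabilities
    log_A:  (K, K) log transition probabilities
    log_B:  (T, K) per-frame, per-state emission log-likelihoods
    """
    alpha = log_pi + log_B[0]
    for t in range(1, log_B.shape[0]):
        # alpha_t(j) = logsumexp_i(alpha_{t-1}(i) + log_A[i, j]) + log_B[t, j]
        m = alpha.max()
        alpha = m + np.log(np.exp(alpha - m) @ np.exp(log_A)) + log_B[t]
    m = alpha.max()
    return m + np.log(np.sum(np.exp(alpha - m)))

def spot_gesture(frames, gesture_hmm, filler_hmm, win=30, step=5, thresh=0.0):
    """Slide a window over the frame features and flag spans where the
    gesture HMM out-scores the filler HMM by more than `thresh` nats.

    Each model is a tuple (log_pi, log_A, emis), where emis(window) returns
    a (T, K) array of emission log-likelihoods (a hypothetical interface;
    in practice these would come from, e.g., Gaussian mixtures over
    per-frame motion descriptors).
    """
    detections = []
    for s in range(0, len(frames) - win + 1, step):
        window = frames[s:s + win]
        llr = (log_forward(gesture_hmm[0], gesture_hmm[1], gesture_hmm[2](window))
               - log_forward(filler_hmm[0], filler_hmm[1], filler_hmm[2](window)))
        if llr > thresh:  # gesture model explains this span better than "filler"
            detections.append((s, s + win, llr))
    return detections
```

A per-window log-likelihood ratio against a background model is the standard keyword-spotting recipe the abstract alludes to; the paper's framework builds its gesture and filler models within the proposed hierarchical Bayesian setting rather than from plain Gaussian HMMs as assumed here.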