Action recognition based on learnt motion semantic vocabulary

  • Authors:
  • Qiong Zhao; Zhiwu Lu; Horace H. S. Ip

  • Affiliations:
  • Centre for Innovative Applications of Internet And Multimedia Technologies, Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong (all authors)

  • Venue:
  • PCM'10: Proceedings of the 11th Pacific Rim Conference on Advances in Multimedia Information Processing, Part I
  • Year:
  • 2010

Abstract

This paper presents a novel contextual spectral embedding (CSE) framework for human action recognition, which automatically learns high-level features (a motion semantic vocabulary) from a large vocabulary of mid-level features (i.e., visual words). Our novelty is to exploit the inter-video context between mid-level features for spectral embedding, where this context is captured by the Pearson product-moment correlation between mid-level features rather than by a Gaussian function computed over the point-wise information vectors used as the mid-level feature representation. Our goal is to embed the mid-level features into a low-dimensional semantic space and thereby learn a much more compact semantic vocabulary within the CSE framework. Experiments on two action datasets demonstrate that our approach achieves significantly improved results with respect to the state of the art.
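
The abstract gives no implementation details, but the core idea it describes (spectral embedding of mid-level visual words using a Pearson-correlation affinity, then grouping the embedded words into a compact semantic vocabulary) can be sketched roughly as below. This is an illustrative reconstruction, not the authors' code: the word-by-video occurrence-count input, the normalised-Laplacian embedding, the clipping of negative correlations, and the k-means grouping step are all assumptions made for the example.

```python
# Rough sketch (not the paper's implementation): build a compact "semantic
# vocabulary" by spectrally embedding visual words with a Pearson-correlation
# affinity and clustering the embedded words. All design choices here
# (inputs, normalisation, clustering) are assumptions for illustration.

import numpy as np
from scipy.linalg import eigh
from scipy.cluster.vq import kmeans2

def correlation_affinity(word_video_counts):
    """Affinity between visual words: Pearson correlation of their
    occurrence profiles across videos (rows = words, cols = videos)."""
    corr = np.corrcoef(word_video_counts)      # (n_words, n_words)
    corr = np.nan_to_num(corr)                 # guard against constant rows
    return np.clip(corr, 0.0, 1.0)             # assumption: drop negative weights

def spectral_embed(affinity, dim):
    """Embed words via the bottom eigenvectors of the normalised graph
    Laplacian (standard spectral-embedding recipe)."""
    d = affinity.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    lap = np.eye(len(d)) - d_inv_sqrt[:, None] * affinity * d_inv_sqrt[None, :]
    # smallest eigenvalues; skip the trivial first eigenvector
    _, vecs = eigh(lap, subset_by_index=[1, dim])
    return vecs                                # (n_words, dim)

def build_semantic_vocabulary(word_video_counts, dim=20, n_semantic_words=50):
    """Cluster embedded mid-level words into a compact semantic vocabulary."""
    affinity = correlation_affinity(word_video_counts)
    embedding = spectral_embed(affinity, dim)
    _, labels = kmeans2(embedding, n_semantic_words, minit="++", seed=0)
    return labels                              # visual word -> semantic-word id

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    counts = rng.poisson(2.0, size=(400, 120)).astype(float)  # 400 words, 120 videos
    print(build_semantic_vocabulary(counts)[:10])
```

With such a mapping in hand, each video's histogram over the large visual-word vocabulary could be pooled into a much shorter histogram over the learnt semantic words and passed to any standard classifier; whether the paper uses exactly this pooling step is not stated in the abstract.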