Video event detection using motion relativity and visual relatedness

  • Authors:
  • Feng Wang;Yu-Gang Jiang;Chong-Wah Ngo

  • Affiliations:
  • City University of Hong Kong, Hong Kong, Hong Kong;City University of Hong Kong, Hong Kong, Hong Kong;City University of Hong Kong, Hong Kong, Hong Kong

  • Venue:
  • MM '08 Proceedings of the 16th ACM international conference on Multimedia
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

Event detection plays an essential role in video content analysis. However, the existing features are still weak in event detection because: i) most features just capture what is involved in an event or how the event evolves separately, and thus cannot completely describe the event; ii) to capture event evolution information, only motion distribution over the whole frame is used which proves to be noisy in unconstrained videos; iii) the estimated object motion is usually distorted by camera movement. To cope with these problems, in this paper, we propose a new motion feature, namely Expanded Relative Motion Histogram of Bag-of-Visual-Words (ERMH-BoW) to employ motion relativity and visual relatedness for event detection. In ERMH-BoW, by representing what aspect of an event with Bag-of-Visual-Words (BoW), we construct relative motion histograms between visual words to depict the object activities or how aspect of the event. ERMH-BoW thus integrates both what and how aspects for a complete event description. Instead of motion distribution features, local motion of visual words is employed which is more discriminative in event detection. Meanwhile, we show that by employing relative motion, ERMH-BoW is able to honestly describe object activities in an event regardless of varying camera movement. Besides, to alleviate the visual word correlation problem in BoW, we propose a novel method to expand the relative motion histogram. The expansion is achieved by diffusing the relative motion among correlated visual words measured by visual relatedness. To validate the effectiveness of the proposed feature, ERMH-BoW is used to measure video clip similarity with Earth Mover's Distance (EMD) for event detection. We conduct experiments for detecting LSCOM events in TRECVID 2005 video corpus, and performance is improved by 74% and 24% compared with existing motion distribution feature and BoW feature respectively.