HMM based structuring of tennis videos using visual and audio cues

  • Authors:
  • E. Kijak;G. Gravier;P. Gros;L. Oisel;F. Bimbot

  • Affiliations:
  • Thomson Multimedia R&D, Cesson Sevigne, France;Nat. Inst. of Informatics, Tokyo, Japan;Nat. Inst. of Informatics, Tokyo, Japan;Nat. Inst. of Informatics, Tokyo, Japan;Perceptual Interfaces & Reality Lab., Maryland Univ., College Park, MD, USA

  • Venue:
  • ICME '03: Proceedings of the 2003 International Conference on Multimedia and Expo, Volume 3
  • Year:
  • 2003

Abstract

This paper focuses on the use of hidden Markov models (HMMs) for structure analysis of videos, and demonstrates how they can be efficiently applied to merge audio and visual cues. Our approach is validated in the particular domain of tennis videos. The basic temporal unit is the video shot. Visual features characterize the type of shot view, and audio features describe the audio events within a video shot. Video structure parsing relies on analyzing the temporal interleaving of video shots with respect to prior information about tennis content and editing rules. As a result, typical tennis scenes are identified. In addition, each shot is assigned to a level in a hierarchy described in terms of point, game and set.
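
To make the idea of parsing a shot sequence with an HMM more concrete, here is a minimal sketch of Viterbi decoding over per-shot audiovisual observations. It is not the authors' model: the scene labels, the view and audio categories, and every probability below are hypothetical values chosen for illustration. The sketch also assumes that the visual and audio cues are conditionally independent given the scene, and it is flat, so it does not model the point, game and set hierarchy.

```python
import math

# Hypothetical scene labels (hidden states); not taken from the paper.
STATES = ["rally", "replay", "break"]

# All probabilities below are invented for illustration only.
START = {"rally": 0.6, "replay": 0.1, "break": 0.3}
TRANS = {
    "rally":  {"rally": 0.5, "replay": 0.3, "break": 0.2},
    "replay": {"rally": 0.6, "replay": 0.2, "break": 0.2},
    "break":  {"rally": 0.4, "replay": 0.1, "break": 0.5},
}
# Visual cue: hypothetical shot view type (global court view vs. close-up).
VISUAL = {
    "rally":  {"global": 0.8, "close": 0.2},
    "replay": {"global": 0.3, "close": 0.7},
    "break":  {"global": 0.1, "close": 0.9},
}
# Audio cue: hypothetical dominant audio event within the shot.
AUDIO = {
    "rally":  {"ball_hits": 0.80, "applause": 0.15, "speech": 0.05},
    "replay": {"ball_hits": 0.20, "applause": 0.20, "speech": 0.60},
    "break":  {"ball_hits": 0.05, "applause": 0.45, "speech": 0.50},
}


def emission_logp(state, obs):
    """Log-probability of one shot's (view, audio) pair given the scene.

    Assumes the two cues are conditionally independent given the state,
    so their probabilities multiply (log-probabilities add).
    """
    view, sound = obs
    return math.log(VISUAL[state][view]) + math.log(AUDIO[state][sound])


def viterbi(observations):
    """Return the most likely scene label for each shot in the sequence."""
    if not observations:
        return []
    # Initialise with the start distribution and the first shot's emission.
    scores = {s: math.log(START[s]) + emission_logp(s, observations[0])
              for s in STATES}
    backpointers = []
    for obs in observations[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            # Best predecessor state for landing in state s at this shot.
            best_prev = max(STATES,
                            key=lambda p: scores[p] + math.log(TRANS[p][s]))
            new_scores[s] = (scores[best_prev]
                             + math.log(TRANS[best_prev][s])
                             + emission_logp(s, obs))
            pointers[s] = best_prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state to recover the full path.
    path = [max(STATES, key=lambda s: scores[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


if __name__ == "__main__":
    shots = [("global", "ball_hits"), ("close", "speech")]
    print(viterbi(shots))
```

Each shot contributes one (view, audio) pair, and the transition table plays the role of the prior knowledge about content and editing rules. In a real system these tables would be estimated from labeled video rather than set by hand.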