Multimodal Event Detection in User Generated Videos

  • Authors:
  • Francesco Cricri, Kostadin Dabov, Igor D. D. Curcio, Sujeet Mate, Moncef Gabbouj

  • Venue:
  • ISM '11: Proceedings of the 2011 IEEE International Symposium on Multimedia
  • Year:
  • 2011

Abstract

Most camera-enabled electronic devices now contain auxiliary sensors such as accelerometers, gyroscopes, compasses, and GPS receivers. These sensors are typically used during media acquisition to limit camera degradations such as shake, and to provide basic tagging information such as the location used in geo-tagging. Surprisingly, exploiting sensor recordings as a modality for high-level event detection has received rather limited research attention, and the existing work has been constrained to highly specialized acquisition setups. In this work, we show how these sensor modalities, alone or in combination with content-based analysis, allow inferring information about the video content. In addition, we consider a multi-camera scenario in which multiple user-generated recordings of a common scene (e.g., music concerts or other public events) are available. To understand some higher-level semantics of the recorded media, we jointly analyze the individual video recordings and sensor measurements of the multiple users. The detected semantics include generic interesting events as well as some more specific events. The detection exploits correlations in the camera motion and in the audio content across the multiple users. We show that the proposed multimodal analysis methods perform well on recordings obtained at real live music performances.
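
The abstract describes the approach only at a high level. As one hedged illustration of the kind of analysis it mentions (correlating camera motion across multiple users to find moments of shared interest), the Python sketch below flags time windows in which most cameras pan in a correlated way. It is not the authors' implementation: the signal source (pan rate derived from compass or gyroscope readings), the sampling rate, the window length, and the correlation threshold are all illustrative assumptions.

```python
import numpy as np

def detect_joint_events(pan_signals, fs=10.0, win_s=2.0, corr_thr=0.7):
    """Flag time windows where most cameras move in a correlated way.

    pan_signals: list of 1-D NumPy arrays, one per user, each holding a
    camera pan-rate signal (e.g., derived from compass or gyroscope data)
    sampled at a common rate fs after temporal alignment. Window length
    and threshold are illustrative choices, not values from the paper.
    Returns a list of (start_time, end_time) tuples in seconds.
    """
    win = int(win_s * fs)
    n = min(len(s) for s in pan_signals)
    events = []
    for start in range(0, n - win + 1, win):
        segs = [s[start:start + win] for s in pan_signals]
        # Pairwise Pearson correlation of camera motion within the window.
        corrs = []
        for i in range(len(segs)):
            for j in range(i + 1, len(segs)):
                a, b = segs[i], segs[j]
                if a.std() > 0 and b.std() > 0:
                    corrs.append(np.corrcoef(a, b)[0, 1])
        # If most camera pairs move consistently, mark the window as a
        # candidate "commonly interesting" event.
        if corrs and np.mean(corrs) > corr_thr:
            events.append((start / fs, (start + win) / fs))
    return events
```

A similar windowed-correlation scheme could in principle be applied to audio features of the parallel recordings; the paper's actual detectors and parameter choices are not specified in this abstract.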