Audio and video feature fusion for activity recognition in unconstrained videos

  • Authors:
  • José Lopes; Sameer Singh

  • Affiliations:
  • Research School of Informatics, Loughborough University, Loughborough, UK (both authors)

  • Venue:
  • IDEAL'06: Proceedings of the 7th International Conference on Intelligent Data Engineering and Automated Learning
  • Year:
  • 2006

Abstract

Combining audio and image processing for understanding video content has several benefits compared to using each modality on its own. For the task of context and activity recognition in video sequences, it is important to exploit both data streams to gather relevant information. In this paper we describe a video context and activity recognition model. Our work extracts a range of audio and visual features, followed by feature reduction and information fusion. We show that combining audio- and video-based decision making improves the quality of context and activity recognition in videos by 4% over audio data alone and 18% over image data alone.
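The decision-level fusion described in the abstract can be sketched as a weighted combination of per-class posteriors from independently trained audio and video classifiers. The class scores, weights, and function names below are illustrative assumptions, not values or an API from the paper:

```python
import numpy as np

# Hypothetical per-class posterior probabilities over four activity classes,
# as produced by two independently trained classifiers (illustrative values
# only; the paper does not publish its exact scores).
audio_probs = np.array([0.60, 0.20, 0.15, 0.05])
video_probs = np.array([0.30, 0.45, 0.15, 0.10])

def late_fusion(p_audio, p_video, w_audio=0.5):
    """Weighted-sum (decision-level) fusion of two posterior vectors."""
    fused = w_audio * p_audio + (1.0 - w_audio) * p_video
    return fused / fused.sum()  # renormalise to a probability vector

# Slightly favour the audio stream, mirroring the paper's finding that
# audio alone outperformed image alone (weight choice is an assumption).
fused = late_fusion(audio_probs, video_probs, w_audio=0.6)
predicted_class = int(np.argmax(fused))
```

With these example scores the fused vector is [0.48, 0.30, 0.15, 0.07], so the fused decision follows the audio classifier's top class; other fusion rules (product, max, or a trained meta-classifier) are common alternatives at this stage.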