E-LAMP: integration of innovative ideas for multimedia event detection

  • Authors:
  • Wei Tong, Yi Yang, Lu Jiang, Shoou-I Yu, Zhenzhong Lan, Zhigang Ma, Waito Sze, Ehsan Younessian, Alexander G. Hauptmann

  • Affiliations:
  • Language Technologies Institute, Carnegie Mellon University, Pittsburgh, USA (all authors except Zhigang Ma); Department of Information Engineering and Computer Science, University of Trento, Trento, Italy (Zhigang Ma)

  • Venue:
  • Machine Vision and Applications
  • Year:
  • 2014


Abstract

Detecting multimedia events in web videos is an emerging research area in multimedia and computer vision. In this paper, we introduce the core methods and technologies of the framework developed for our Event Labeling through Analytic Media Processing (E-LAMP) system, which addresses different aspects of the overall event detection problem. First, we have developed efficient feature extraction methods that allow us to handle large collections of video data comprising thousands of hours of video. Second, we represent the extracted raw features in a spatial bag-of-words model with more effective tilings, so that the spatial layout of different features and events is better captured and the overall detection performance improves. Third, unlike the widely used early and late fusion schemes, a novel algorithm is developed to learn a more robust and discriminative intermediate feature representation from multiple features, so that better event models can be built upon it. Finally, to tackle the additional challenge of event detection with only very few positive exemplars, we have developed a novel algorithm that effectively adapts knowledge learnt from auxiliary sources to assist event detection. Both our empirical results and the official evaluation results on TRECVID MED'11 and MED'12 demonstrate the excellent performance of the integration of these ideas.
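The spatial bag-of-words idea mentioned in the abstract can be illustrated with a minimal sketch: local descriptors are quantized to visual-word ids, and a separate word histogram is built per spatial cell of a tiling, then the cells are concatenated. This is only an assumed illustration of the general technique, not the authors' implementation; the function name, grid layout, and normalization are all hypothetical.

```python
import numpy as np

def spatial_bow(keypoints, word_ids, vocab_size, grid=(2, 2)):
    """Concatenated per-cell visual-word histograms (a generic spatial BoW).

    keypoints : (N, 2) array of (x, y) positions normalized to [0, 1).
    word_ids  : (N,) array of visual-word ids in [0, vocab_size).
    grid      : (rows, cols) spatial tiling of the frame (hypothetical choice).
    Returns a vector of length rows * cols * vocab_size.
    """
    rows, cols = grid
    hist = np.zeros((rows, cols, vocab_size))
    for (x, y), wid in zip(keypoints, word_ids):
        # Map the point to its spatial cell, clamping points on the far edge.
        r = min(int(y * rows), rows - 1)
        c = min(int(x * cols), cols - 1)
        hist[r, c, wid] += 1
    return hist.reshape(-1)

# Two keypoints in opposite corners with different visual words end up
# in different cells of the 2x2 tiling, so the layout is preserved.
kps = np.array([[0.1, 0.1], [0.9, 0.9]])
words = np.array([0, 1])
vec = spatial_bow(kps, words, vocab_size=2)
```

A plain (non-spatial) bag-of-words would collapse all cells into one histogram and lose where each word occurred; the tiling is what lets the model distinguish, say, an object in the top of the frame from the same object at the bottom.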