A human-centered multiple instance learning framework for semantic video retrieval

  • Authors:
  • Xin Chen;Chengcui Zhang;Shu-Ching Chen;Stuart Rubin

  • Affiliations:
  • Department of Computer and Information Sciences, University of Alabama at Birmingham, Birmingham, AL;Department of Computer and Information Sciences, University of Alabama at Birmingham, Birmingham, AL;School of Computing and Information Sciences, Florida International University, Miami, FL;SPAWAR Systems Center San Diego, Intelligence, Surveillance, and Reconnaissance Department, San Diego, CA

  • Venue:
  • IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews
  • Year:
  • 2009

Abstract

This paper proposes a human-centered interactive framework for automatically mining and retrieving semantic events in videos. After preprocessing, the object trajectories and event models are fed into the core components of the framework for learning and retrieval. As trajectories are spatiotemporal in nature, the learning component is designed to analyze time series data. Human feedback on the retrieval results provides progressive guidance for the retrieval component of the framework. For user convenience, the retrieval results are presented as video sequences rather than as the individual trajectories they contain. Consequently, the feedback labels whole video sequences and does not directly label the trajectories, as the training algorithm requires. To solve this problem, a mapping between semantic video retrieval and multiple instance learning (MIL) is established. The effectiveness of the algorithm is demonstrated by experiments on real-life transportation surveillance videos.
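
The mapping between video retrieval and MIL can be illustrated with a minimal Python sketch. It assumes the standard MIL setting, in which a bag is positive when at least one of its instances is positive: each video sequence is treated as a bag and each trajectory it contains as an instance, so a bag can be scored by its best-scoring trajectory. The names `VideoSequence`, `bag_score`, `rank_bags`, and the toy scorer `max_step` are hypothetical and chosen for illustration only; the abstract does not specify the paper's actual learner or features.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Assumption: a trajectory is a time-ordered list of (x, y) object positions.
Trajectory = Sequence[Tuple[float, float]]


@dataclass
class VideoSequence:
    """A MIL bag: one retrieved video sequence and the trajectories in it."""
    name: str
    trajectories: List[Trajectory]


def bag_score(bag: VideoSequence,
              instance_score: Callable[[Trajectory], float]) -> float:
    """Score a bag by its best instance (standard MIL assumption).

    An empty bag gets -inf so that it ranks last.
    """
    return max((instance_score(t) for t in bag.trajectories),
               default=float("-inf"))


def rank_bags(bags: List[VideoSequence],
              instance_score: Callable[[Trajectory], float]) -> List[VideoSequence]:
    """Return the video sequences ordered from most to least relevant."""
    return sorted(bags, key=lambda b: bag_score(b, instance_score), reverse=True)


def max_step(trajectory: Trajectory) -> float:
    """Toy instance scorer (hypothetical): largest frame-to-frame displacement."""
    steps = (abs(x2 - x1) + abs(y2 - y1)
             for (x1, y1), (x2, y2) in zip(trajectory, trajectory[1:]))
    return max(steps, default=0.0)


if __name__ == "__main__":
    smooth = [(0, 0), (1, 0), (2, 0)]
    abrupt = [(0, 0), (1, 0), (6, 0)]
    bags = [
        VideoSequence("clip_a", [smooth]),
        VideoSequence("clip_b", [smooth, abrupt]),
    ]
    for bag in rank_bags(bags, max_step):
        print(bag.name, bag_score(bag, max_step))
```

In this sketch the user would mark whole clips as relevant or irrelevant, and those bag-level labels would then drive re-training of the instance scorer; that feedback loop is omitted here because the abstract does not describe it in detail.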