Video summarization with visual and semantic features

  • Authors:
  • Pei Dong, Zhiyong Wang, Li Zhuo, Dagan Feng

  • Affiliations:
  • Pei Dong: School of Information Technologies, University of Sydney, Australia and Signal and Information Processing Laboratory, Beijing University of Technology, Beijing, China
  • Zhiyong Wang: School of Information Technologies, University of Sydney, Australia
  • Li Zhuo: Signal and Information Processing Laboratory, Beijing University of Technology, Beijing, China
  • Dagan Feng: School of Information Technologies, University of Sydney, Australia and Dept. of Electronic and Information Engineering, Hong Kong Polytechnic University, Hong Kong

  • Venue:
  • PCM'10: Proceedings of the 11th Pacific Rim Conference on Advances in Multimedia Information Processing: Part I
  • Year:
  • 2010

Abstract

Video summarization aims to provide a condensed yet informative version of the original footage to facilitate content comprehension, browsing, and delivery, and multi-modal features play an important role in differentiating the individual segments of a video. In this paper, we present a method that combines visual and semantic features. Rather than utilizing domain-specific or heuristic textual features as semantic features, we assign semantic concepts to video segments through automatic video annotation. The semantic coherence between the accompanying text and the high-level concepts of video segments is then exploited to characterize the importance of each segment. Visual features (e.g., motion and faces), which have been widely used in user attention model-based summarization, are integrated with the proposed semantic coherence to obtain the final summary. Experiments on a half-hour sample video from the TRECVID 2006 dataset demonstrate that semantic coherence is very helpful for video summarization when fused with different visual features.
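The abstract describes scoring each video segment by fusing visual attention features with a semantic coherence score and selecting the most important segments. The paper does not give the fusion formula here, so the following is a minimal sketch assuming a weighted linear fusion and a greedy time-budget selection; the `Segment` fields, weights, and helper names are illustrative, not the authors' implementation.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    start: float     # segment boundaries in seconds
    end: float
    visual: float    # visual attention score (e.g., motion, face), assumed in [0, 1]
    semantic: float  # semantic coherence score, assumed in [0, 1]

def fuse(seg: Segment, w_visual: float = 0.5, w_semantic: float = 0.5) -> float:
    """Weighted linear fusion of visual and semantic scores (an assumed rule)."""
    return w_visual * seg.visual + w_semantic * seg.semantic

def summarize(segments, budget_seconds: float,
              w_visual: float = 0.5, w_semantic: float = 0.5):
    """Greedily keep the highest-scoring segments until the time budget is spent."""
    ranked = sorted(segments, key=lambda s: fuse(s, w_visual, w_semantic),
                    reverse=True)
    summary, used = [], 0.0
    for seg in ranked:
        duration = seg.end - seg.start
        if used + duration <= budget_seconds:
            summary.append(seg)
            used += duration
    # Restore temporal order so the summary plays chronologically.
    return sorted(summary, key=lambda s: s.start)

segments = [
    Segment(0, 10, visual=0.9, semantic=0.2),   # visually salient
    Segment(10, 20, visual=0.3, semantic=0.9),  # semantically coherent with text
    Segment(20, 30, visual=0.1, semantic=0.1),  # low on both cues
]
summary = summarize(segments, budget_seconds=20)
```

With equal weights, the two segments that score well on either cue fill the 20-second budget while the uninformative one is dropped; shifting the weights trades off visual saliency against semantic coherence, which is the kind of fusion the experiments compare.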