Near-lossless semantic video summarization and its applications to video analysis

  • Authors:
  • Tao Mei; Lin-Xie Tang; Jinhui Tang; Xian-Sheng Hua

  • Affiliations:
  • Microsoft Research Asia, China; University of Science and Technology of China, China; Nanjing University of Science and Technology, China; Microsoft, USA

  • Venue:
  • ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP)
  • Year:
  • 2013

Abstract

The ever-increasing volume of video content on the Web has created profound challenges for developing efficient indexing and search techniques to manage video data. Conventional techniques such as video compression and summarization strive for two commonly conflicting goals: low storage and high visual and semantic fidelity. To balance video compression and summarization, this article presents a novel approach, called Near-Lossless Semantic Summarization (NLSS), that summarizes a video stream with minimal loss of high-level semantic information by using an extremely small piece of metadata. The summary consists of compressed image and audio streams, as well as metadata for temporal structure and motion information. Although it operates at a very low compression rate (around 1/40 of H.264 baseline, where traditional compression techniques can hardly preserve acceptable visual fidelity), the proposed NLSS can still be applied to many video-oriented tasks, such as visualization, indexing and browsing, duplicate detection, and concept detection. We evaluate NLSS on TRECVID and other video collections, and demonstrate that it is a powerful tool for significantly reducing storage consumption while preserving high-level semantic fidelity.
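As a rough illustration of the summary layout described in the abstract, the components (compressed image stream, compressed audio stream, and metadata for temporal structure and motion) could be grouped in a container like the one below. This is only a sketch under stated assumptions: the class names `ShotMetadata` and `SemanticSummary`, the per-shot fields, the fixed 32-byte metadata record, and the helper `size_ratio` are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ShotMetadata:
    """Hypothetical per-shot record: temporal structure plus coarse motion."""
    start_sec: float
    end_sec: float
    motion: Tuple[float, float] = (0.0, 0.0)  # assumed (dx, dy) motion summary


@dataclass
class SemanticSummary:
    """Hypothetical container for the components named in the abstract."""
    image_stream: bytes   # compressed image data
    audio_stream: bytes   # compressed audio data
    shots: List[ShotMetadata] = field(default_factory=list)

    def size_bytes(self, bytes_per_shot: int = 32) -> int:
        # Assumption: each metadata record is stored as four 8-byte floats.
        return (len(self.image_stream)
                + len(self.audio_stream)
                + bytes_per_shot * len(self.shots))


def size_ratio(summary: SemanticSummary, reference_bytes: int) -> float:
    """Size of the summary relative to a reference encoding of the same video."""
    if reference_bytes <= 0:
        raise ValueError("reference_bytes must be positive")
    return summary.size_bytes() / reference_bytes


# Toy numbers chosen so the ratio matches the "around 1/40" figure.
example = SemanticSummary(
    image_stream=bytes(900),
    audio_stream=bytes(68),
    shots=[ShotMetadata(start_sec=0.0, end_sec=4.2)],
)
ratio = size_ratio(example, reference_bytes=40_000)
```

With these toy sizes the summary occupies 1,000 bytes against a 40,000-byte reference, giving a ratio of 1/40; the real proportions of image, audio, and metadata in NLSS are not stated in the abstract.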