Accuracy vs. Speed Trade-Off in Detecting of Shots in Video Content for Abstracting Digital Video Libraries

  • Authors:
  • Mikolaj Leszczuk; Zdzislaw Papir

  • Venue:
  • IDMS/PROMS 2002 Proceedings of the Joint International Workshops on Interactive Distributed Multimedia Systems and Protocols for Multimedia Systems: Protocols and Systems for Interactive Distributed Multimedia
  • Year:
  • 2002

Abstract

Two basic requirements for a digital video library to be "browsable" are precisely indexed content and informative abstracts. Such solutions are currently uncommon in video search engines and generic digital video platforms; the authors therefore suggest developing computer applications that solve at least the problem of creating abstracts. Abstracts cannot be constructed without deep analysis of the video content, including low-level processing such as shot detection, which segments a video sequence into a series of "camera takes". The presented shot detection method uses the concept of a motion factor (of frame transitions). In its basic definition, the motion factor is regarded as a very sudden peak in the difference between two successive frames. In some specific areas, the intrashot motion factor may suppress the shot-boundary motion factor. To avoid confusing the two motion factors during shot detection, the concept of a differential motion factor was implemented. The full-resolution algorithm achieves an accuracy of up to 80%; however, it is very time-consuming. The shot detection accuracy measure takes into account true and false detected shots as well as the real shots whose boundaries were marked visually. The authors' study of a representative number of movies (from various categories) revealed that shot detection can be accelerated up to 500 times without any significant deterioration in shot recognition accuracy. The acceleration was achieved in a simple manner, by reducing the frame resolution (in pixels) in both dimensions.
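The ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give formulas, so the motion factor is assumed here to be the mean absolute pixel difference between successive grayscale frames, and the differential motion factor is assumed to be the rise of that value over the previous transition. The function names, the `step` parameter, and the threshold value are all hypothetical.

```python
def downscale(frame, step):
    """Keep every `step`-th pixel along both axes (two-dimensional reduction)."""
    return [row[::step] for row in frame[::step]]


def motion_factor(prev, curr):
    """Assumed definition: mean absolute difference of two equally sized frames."""
    total = sum(abs(a - b)
                for row_prev, row_curr in zip(prev, curr)
                for a, b in zip(row_prev, row_curr))
    return total / (len(prev) * len(prev[0]))


def detect_shots(frames, step=1, threshold=50.0):
    """Return indices of frames that start a new shot.

    Assumed differential form: a boundary is reported when the motion factor
    rises by more than `threshold` relative to the previous transition, so
    steady intrashot motion is not mistaken for a cut.
    """
    small = [downscale(f, step) for f in frames]
    factors = [motion_factor(a, b) for a, b in zip(small, small[1:])]
    boundaries = []
    for i, m in enumerate(factors):
        previous = factors[i - 1] if i > 0 else 0.0
        if m - previous > threshold:
            boundaries.append(i + 1)  # first frame of the new shot
    return boundaries


def solid(value, size=8):
    """Hypothetical helper: a size x size grayscale frame of one brightness."""
    return [[value] * size for _ in range(size)]


# Two synthetic "shots": three dark frames followed by three bright frames.
frames = [solid(10)] * 3 + [solid(200)] * 3
print(detect_shots(frames))          # full resolution -> [3]
print(detect_shots(frames, step=4))  # 8x8 reduced to 2x2 -> [3]
```

Keeping every `step`-th pixel along both axes divides the pixel count by `step` squared, which is why a two-dimensional reduction shrinks the per-frame work so quickly. As back-of-envelope arithmetic only (not a figure from the paper), if the cost were proportional to the pixel count, a reduction of about 22 per axis would give roughly a 484-fold decrease in work.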