Compact video description for copy detection with precise temporal alignment

  • Authors: Matthijs Douze; Hervé Jégou; Cordelia Schmid; Patrick Pérez

  • Affiliations: INRIA Grenoble, France; INRIA Rennes, France; INRIA Grenoble, France; Technicolor Rennes, France

  • Venue: ECCV'10 Proceedings of the 11th European conference on Computer vision: Part I

  • Year: 2010

Abstract

This paper introduces a very compact yet discriminative video description, which allows example-based search in a large number of frames corresponding to thousands of hours of video. Our description extracts one descriptor per indexed video frame by aggregating a set of local descriptors. These frame descriptors are encoded using a time-aware hierarchical indexing structure. A modified temporal Hough voting scheme is used to rank the retrieved database videos and estimate the segments in them that match the query. If we use a dense temporal description of the videos, matched video segments are localized with excellent precision. Experimental results on the TRECVID 2008 copy detection task and a set of 38,000 videos from YouTube show that our method offers an excellent trade-off between search accuracy, efficiency, and memory usage.
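
To illustrate the ranking and localization step mentioned in the abstract, here is a minimal sketch of plain temporal Hough voting. It is not the paper's modified scheme, whose details the abstract does not give. The function name `temporal_hough_vote`, the match tuple layout `(video_id, t_query, t_db, score)`, and the `bin_width` parameter are all assumptions made for this example. Each frame match votes for the time offset between the database frame and the query frame; the strongest offset bin per video gives that video's score, and the database timestamps in that bin give a rough matched segment.

```python
from collections import defaultdict


def temporal_hough_vote(matches, bin_width=1.0):
    """Rank database videos by their best-supported time offset.

    matches: iterable of (video_id, t_query, t_db, score) frame matches
             (hypothetical layout, assumed for this sketch).
    bin_width: width of an offset bin, in the same unit as the timestamps.
    """
    # votes[video_id][offset_bin] accumulates the scores of matches
    # that agree on the same quantized offset t_db - t_query.
    votes = defaultdict(lambda: defaultdict(float))
    # members keeps the database timestamps that voted for each bin,
    # so a matched segment can be estimated afterwards.
    members = defaultdict(list)

    for video_id, t_query, t_db, score in matches:
        offset_bin = round((t_db - t_query) / bin_width)
        votes[video_id][offset_bin] += score
        members[(video_id, offset_bin)].append(t_db)

    results = []
    for video_id, bins in votes.items():
        # The dominant offset bin is the hypothesis for this video.
        best_bin, best_score = max(bins.items(), key=lambda kv: kv[1])
        db_times = members[(video_id, best_bin)]
        results.append({
            "video_id": video_id,
            "score": best_score,
            "offset": best_bin * bin_width,
            "db_segment": (min(db_times), max(db_times)),
        })

    # Highest accumulated score first.
    results.sort(key=lambda r: r["score"], reverse=True)
    return results


# Toy data: video "a" has three matches consistent with a 10-unit offset,
# plus one outlier; video "b" has two inconsistent matches.
matches = [
    ("a", 0.0, 10.0, 1.0),
    ("a", 1.0, 11.0, 1.0),
    ("a", 2.0, 12.0, 1.0),
    ("a", 3.0, 40.0, 0.5),
    ("b", 0.0, 5.0, 0.8),
    ("b", 1.0, 30.0, 0.7),
]
ranked = temporal_hough_vote(matches)
```

In this toy input the temporally consistent matches of video "a" reinforce one offset bin, so it ranks above "b", whose matches scatter across bins. A denser temporal sampling of frames yields more votes per bin and therefore tighter segment boundaries, which is in line with the abstract's remark on dense temporal description.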