Compact video description for copy detection with precise temporal alignment

Authors:
Matthijs Douze;Hervé Jégou;Cordelia Schmid;Patrick Pérez
Affiliations:
INRIA Grenoble, France;INRIA Rennes, France;INRIA Grenoble, France;Technicolor Rennes, France
Venue:
ECCV'10 Proceedings of the 11th European conference on Computer vision: Part I
Year:
2010

Abstract

This paper introduces a very compact yet discriminative video description, which allows example-based search in a large number of frames corresponding to thousands of hours of video. Our method extracts one descriptor per indexed video frame by aggregating a set of local descriptors. These frame descriptors are encoded using a time-aware hierarchical indexing structure. A modified temporal Hough voting scheme is used to rank the retrieved database videos and to estimate the segments in them that match the query. If we use a dense temporal description of the videos, matched video segments are localized with excellent precision. Experimental results on the TRECVID 2008 copy detection task and a set of 38,000 videos from YouTube show that our method offers an excellent trade-off between search accuracy, efficiency and memory usage.
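
The temporal voting step mentioned in the abstract can be illustrated with a minimal sketch of plain temporal Hough voting. This is not the modified scheme of the paper, whose details are not given here. The sketch assumes frame-level matches are already available as (query time, video id, database time, score) tuples; the function names, the tuple layout, and the `bin_width` parameter are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def temporal_hough_vote(matches, bin_width=1.0):
    """Rank (video, time offset) hypotheses by accumulated match score.

    matches: iterable of (query_time, video_id, db_time, score) tuples
    (an assumed layout, not taken from the paper).
    """
    votes = defaultdict(float)
    for tq, vid, tb, score in matches:
        # Each frame match votes for the offset that would align the
        # query frame with the database frame.
        offset_bin = round((tb - tq) / bin_width)
        votes[(vid, offset_bin)] += score
    ranked = sorted(votes.items(), key=lambda kv: kv[1], reverse=True)
    return [(vid, b * bin_width, s) for (vid, b), s in ranked]


def matched_segment(matches, video_id, offset, bin_width=1.0):
    """Return the (start, end) query times of matches supporting a hypothesis."""
    target = round(offset / bin_width)
    times = [
        tq
        for tq, vid, tb, _ in matches
        if vid == video_id and round((tb - tq) / bin_width) == target
    ]
    return (min(times), max(times)) if times else None


# Hypothetical toy data: three consistent matches against video "a"
# at a 10-second offset, plus two outliers.
matches = [
    (0.0, "a", 10.0, 1.0),
    (1.0, "a", 11.0, 1.0),
    (2.0, "a", 12.0, 1.0),
    (1.0, "b", 5.0, 0.5),
    (2.0, "a", 30.0, 0.4),
]
ranked = temporal_hough_vote(matches)
best_video, best_offset, best_score = ranked[0]
segment = matched_segment(matches, best_video, best_offset)
```

In this toy setup, matches that agree on a single time offset reinforce one another, while isolated outliers spread their votes over other bins. The span of query times supporting the winning bin gives a rough estimate of the matched segment, and its precision depends on how densely the frames are sampled.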