Large-scale near-duplicate web video search: challenge and opportunity

Authors:
Wan-Lei Zhao; Song Tan; Chong-Wah Ngo
Affiliations:
Department of Computer Science, City University of Hong Kong; Department of Computer Science, City University of Hong Kong; Department of Computer Science, City University of Hong Kong
Venue:
ICME'09 Proceedings of the 2009 IEEE international conference on Multimedia and Expo
Year:
2009

Abstract

The massive number of duplicate and near-duplicate web videos presents both a challenge and an opportunity for multimedia computing. On one hand, browsing videos on the Internet becomes highly inefficient because users must repeatedly fast-forward through videos of similar content. On the other hand, the tremendous amount of near-duplicate content makes some traditionally difficult vision tasks simple. For example, annotating pictures can be as simple as recycling the tags of Internet images retrieved from image search engines. Such tasks, whether eliminating or recycling near-duplicates, can usually be accomplished by nearest neighbor search over Internet videos. The fundamental problem lies in the scalability of the search technique in the face of the intractable volume of videos that keep arriving on the web. In this paper, we investigate the scalability of several well-known features, including color signature and visual keywords, for web-based retrieval. We also present the indexing of these features with an embedding technique for scalable retrieval. On an Internet video dataset of more than 700 hours collected from 2006 to 2008, we offer some preliminary insights into the challenge of scalable retrieval.
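
As a rough illustration of the nearest neighbor search over a color signature mentioned in the abstract, the Python sketch below builds a coarse RGB histogram per video and ranks indexed videos by distance to a query. It is not the paper's method. The histogram quantization, the L2 distance, the brute-force scan, and the names `color_signature`, `l2_distance`, and `nearest_neighbors` are all assumptions made for illustration, and the embedding-based indexing described in the paper is not reproduced here.

```python
from math import sqrt


def color_signature(frames, bins=4):
    """Return a normalized RGB histogram pooled over all frames.

    Assumption: each frame is a list of (r, g, b) pixels with values 0..255.
    """
    hist = [0.0] * (bins ** 3)
    total = 0
    for frame in frames:
        for r, g, b in frame:
            # Quantize each channel into `bins` levels (0 .. bins - 1).
            rq = r * bins // 256
            gq = g * bins // 256
            bq = b * bins // 256
            hist[rq * bins * bins + gq * bins + bq] += 1.0
            total += 1
    if total == 0:
        return hist
    return [h / total for h in hist]


def l2_distance(a, b):
    """Euclidean distance between two equal-length signatures."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbors(query_sig, index, k=1):
    """Brute-force k-nearest-neighbor scan over a dict {video_id: signature}.

    A linear scan like this does not scale to web-sized collections; it only
    shows what the search computes.
    """
    scored = [(vid, l2_distance(query_sig, sig)) for vid, sig in index.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


if __name__ == "__main__":
    # Toy data: one reddish query, a slightly altered reddish copy, a blue video.
    query = color_signature([[(250, 5, 5)] * 4])
    index = {
        "red_copy": color_signature([[(240, 10, 10)] * 4]),
        "blue": color_signature([[(5, 5, 250)] * 4]),
    }
    print(nearest_neighbors(query, index, k=1))
```

The coarse quantization is what lets the slightly altered copy land in the same histogram bin as the query. Finer bins would separate the two, which is one reason a global color signature tends to trade precision for speed.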