The use of temporal, semantic and visual partitioning model for efficient near-duplicate keyframe detection in large scale news corpus

  • Authors:
  • Yan-Tao Zheng;Shi-Yong Neo;Tat-Seng Chua;Qi Tian

  • Affiliations:
  • National University of Singapore, Singapore;National University of Singapore, Singapore;National University of Singapore, Singapore;Institute for Infocomm Research, Singapore

  • Venue:
  • Proceedings of the 6th ACM international conference on Image and video retrieval
  • Year:
  • 2007

Abstract

Near-duplicate keyframes (NDKs) are important visual cues for linking news stories across different TV channels, times, languages, etc. However, the quadratic complexity of NDK detection renders it intractable for a large-scale news video corpus. To address this issue, we propose a temporal, semantic and visual partitioning model that divides the corpus into small overlapping partitions by exploiting domain knowledge and corpus characteristics. This enables us to detect NDKs efficiently in each partition separately and then link them together across partitions. We divide the corpus temporally into sequential partitions and semantically into news story genre groups; within each partition, we visually group potential NDKs by applying asymmetric hierarchical k-means clustering to our proposed semi-global image features. In each visual group, we detect NDK pairs using our proposed SIFT-based fast keypoint matching scheme, which relies on local color information of keypoints. Finally, the detected NDK groups in each partition are linked via transitivity propagation of NDKs shared by different partitions. Testing on the TRECVID 06 corpus with 62k keyframes shows that our proposed approach can yield a multifold increase in speed compared to the best reported approach and completes NDK detection in a manageable time with satisfactory accuracy.
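
The partition-then-link idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it covers only the temporal partitioning and the transitivity step, and it substitutes a union-find structure for whatever propagation mechanism the paper uses. The names `overlapping_partitions`, `link_ndk_groups`, `window`, `overlap` and the pairwise test `is_ndk` are all hypothetical; `is_ndk` stands in for the SIFT-based keypoint matching.

```python
from collections import defaultdict
from itertools import combinations


class UnionFind:
    """Disjoint-set structure used here to merge NDK pairs into groups."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra


def overlapping_partitions(keyframes, window, overlap):
    """Split a time-ordered list into windows sharing `overlap` items (assumed scheme)."""
    step = window - overlap
    if step <= 0:
        raise ValueError("window must be larger than overlap")
    parts = []
    for start in range(0, len(keyframes), step):
        parts.append(keyframes[start:start + window])
        if start + window >= len(keyframes):
            break
    return parts


def link_ndk_groups(partitions, is_ndk):
    """Compare pairs only inside each partition, then merge groups transitively.

    `is_ndk(a, b)` is a hypothetical pairwise detector. Keyframes shared by two
    partitions act as bridges between the groups found in each partition.
    Returns the NDK groups and the number of pairwise comparisons performed.
    """
    uf = UnionFind()
    comparisons = 0
    for part in partitions:
        for a, b in combinations(part, 2):
            comparisons += 1
            if is_ndk(a, b):
                uf.union(a, b)
    groups = defaultdict(set)
    for kf in uf.parent:
        groups[uf.find(kf)].add(kf)
    return [g for g in groups.values() if len(g) > 1], comparisons
```

With 10 keyframes, `window=4` and `overlap=1`, this sketch performs 18 pairwise comparisons instead of the 45 needed for an exhaustive all-pairs check. Keyframes that are never compared directly can still end up in the same group through a keyframe shared by adjacent partitions.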