Story-based retrieval by learning and measuring the concept-based and content-based similarity

  • Authors:
  • Yuxin Peng; Jianguo Xiao

  • Affiliations:
  • Institute of Computer Science and Technology, Peking University (both authors)

  • Venue:
  • MMM'10: Proceedings of the 16th International Conference on Advances in Multimedia Modeling
  • Year:
  • 2010

Abstract

This paper proposes a new idea and approach for story-based news video retrieval, i.e., clip-based retrieval. Generally speaking, clip-based retrieval can be divided into two phases: feature representation and similarity ranking. Existing methods adopt only content-based features and pairwise similarity measures. Their main deficiencies are: (1) in feature representation, concept-based features are not used to represent the content of a video clip; (2) in similarity ranking, learning-based methods are not considered for ranking the clips similar to the query. To address these issues, we jointly consider concept-based and content-based features to adequately represent a news story, and we jointly consider a learned classifier and a pairwise similarity measure to effectively rank the stories similar to the query. Both are the main novelty of this paper. Our focus is the construction of the learned classifier for story-based retrieval, which proceeds as follows: given a query story, all of its keyframes are used as positive examples of its topic, while the retrieval data set, in which most keyframes are irrelevant to the topic, serves as the pool of candidate negative examples. A multi-bag SVM is employed to compute a score for every keyframe in the data set, and the stories are then ranked by the average score of their keyframes, which reflects their similarity to the query story. We compare and evaluate the performance of our approach on 1334 stories from the TRECVID 2005 benchmark, and the results show that it achieves superior performance.
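
The ranking scheme described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes keyframes are already represented as fixed-length feature vectors, and the function names, bag sizes, and the RBF kernel are illustrative choices. The multi-bag step is approximated by training several SVMs, each on the query story's keyframes (positives) plus a different random bag of keyframes drawn from the data set (noisy negatives), and averaging their decision scores.

```python
# Hypothetical sketch of ranking stories by average keyframe score
# with a simplified multi-bag SVM; not the paper's actual code.
import numpy as np
from sklearn.svm import SVC

def rank_stories(query_keyframes, stories, n_bags=5, bag_size=50, seed=0):
    """Rank candidate stories by similarity to the query story.

    query_keyframes: (n_pos, d) array of the query story's keyframe features.
    stories: list of (n_i, d) arrays, one per candidate story.
    Returns story indices sorted from most to least similar.
    """
    rng = np.random.default_rng(seed)
    all_frames = np.vstack(stories)   # pooled candidate keyframes; most are
                                      # assumed irrelevant to the query topic,
                                      # so they act as candidate negatives
    scores = np.zeros(len(all_frames))
    for _ in range(n_bags):
        # draw one random "bag" of negatives and train an SVM against
        # the query story's keyframes
        neg = all_frames[rng.choice(len(all_frames), bag_size, replace=False)]
        X = np.vstack([query_keyframes, neg])
        y = np.r_[np.ones(len(query_keyframes)), np.zeros(len(neg))]
        clf = SVC(kernel="rbf", gamma="scale").fit(X, y)
        scores += clf.decision_function(all_frames)
    scores /= n_bags                  # average the scores over all bags

    # average each story's keyframe scores, then rank stories by that mean
    story_scores, start = [], 0
    for s in stories:
        story_scores.append(scores[start:start + len(s)].mean())
        start += len(s)
    return np.argsort(story_scores)[::-1]
```

Averaging over several negative bags hedges against the fact that the pool of negatives is noisy (a few keyframes in the data set may actually be relevant), which is the intuition behind using a multi-bag formulation rather than a single SVM.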